An online method for merging
time ranges on top of Scylla
Rohit Saboo
ML Engineering Lead, Nauto Inc.
Presenter bio
▪ Leading an ML engineering team at Nauto, working on
finding trips, driver identification, and
lossless sensor data compression.
▪ Previously a founding engineer at a search startup, and in
Google search and robotics.
▪ M.S. and Ph.D. in Computer Science from UNC Chapel Hill, and
B.Tech. in Computer Science from IIT Madras.
▪ I enjoy hiking, photography, swimming, and biking in my
spare time.
Nauto
▪ Identifies risky driving using
machine learning
▪ Enables fleet managers to
correlate internal driver
behaviour with external vehicle
movements
▪ Coaches drivers for safer roads
Trips and Identifying the Driver
▪ Record trip taken.
▪ Identify the driver taking the
trip.
Device takes
● snapshot & crops image
● GPS, speed ...
● vehicle state
[Diagram: N2 device, upload]
Merging Time Ranges
(Building Trips)
Data Flow
[Diagram: GPS, moving/stopped, and speed data flow from the N2 through Kafka to workers 1 … n, sharded by device. Each worker gets the neighboring ranges, saves the merged range, and deletes the old ones.]
Time Ranges (Trips)
[Timeline, 8:00 to 9:05: trips at 8:00am to 8:12am, 8:18am to 8:35am, and 8:45am to 8:55am]
Time ranges (trips) table
CREATE TABLE device_trips (
    version int,
    id text,
    bucket timestamp,
    end_ms timestamp,
    start_ms timestamp,
    details blob,
    PRIMARY KEY ((version, id, bucket), end_ms, start_ms)
) WITH CLUSTERING ORDER BY (end_ms DESC, start_ms DESC);
id: The id of the time series. In our case, the vehicle/device id.
bucket: For bucketing the time ranges into manageable partitions. Truncated from end_ms.
start_ms, end_ms: The start and end of the time range.
version: To simplify running multiple versions of the algorithms or experiments.
details: Application-specific.
For us, a marshaled protobuf containing GPS, speed, etc. along the trip.
(With a JSON type now available, it may be preferable to use JSON.)
Partitioned by time series id and bucketed to manage partition size.
Clustering order chosen to make the common case of data arriving in-order optimal.
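The bucketing described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation from the talk: the slides only say that bucket is truncated from end_ms, so the one-day bucket width and the helper name bucket_for are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed bucket width; the talk does not state the actual value.
BUCKET_WIDTH = timedelta(days=1)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_for(end_ms: datetime, width: timedelta = BUCKET_WIDTH) -> datetime:
    """Truncate a range's end time down to the start of its bucket."""
    elapsed = end_ms - EPOCH
    # Floor division of two timedeltas gives the whole number of buckets elapsed.
    return EPOCH + (elapsed // width) * width


end = datetime(2018, 11, 6, 8, 55, tzinfo=timezone.utc)  # arbitrary example date
print(bucket_for(end))  # 2018-11-06 00:00:00+00:00
```

Since bucket is part of the partition key, a neighbor lookup close to a bucket boundary would presumably have to read the adjacent bucket as well; the slides do not cover how that case is handled.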
Because merging runs on a worker pool sharded by device,
there are no races and no locks are needed.
As a result, a new time range will (almost) always have at most
2 neighboring time ranges:
one before and one after.
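One way to picture sharding by device is a stable hash from device id to worker index, so that every range for a given device is handled by the same worker and never concurrently. The sketch below is an assumption for illustration: the slides only state that workers are sharded by device, and the function name worker_for and the SHA-256 scheme are hypothetical.

```python
import hashlib


def worker_for(device_id: str, num_workers: int) -> int:
    """Map a device id to a stable worker index in [0, num_workers)."""
    # A digest is used instead of built-in hash(), which is salted per
    # process for strings and so would not be stable across restarts.
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_workers


# The same device always lands on the same worker.
print(worker_for("device-a", 4) == worker_for("device-a", 4))  # True
```

With a message queue in front of the workers, keying messages by device id would typically achieve the same routing without custom code.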
Merging
[Timeline animation, 8:00 to 9:05, summarized in order:]
▪ One range is stored: 8:00am to 8:12am.
▪ A new range, 8:40am to 8:55am, arrives. Query for neighbors with end_ms ≥ 8:35am and start_ms ≤ 9:00am. None match, so two ranges are now stored: 8:00am to 8:12am and 8:40am to 8:55am.
▪ A new range, 8:15am to 8:35am, arrives. Query for neighbors with end_ms ≥ 8:10am. Both stored ranges are neighbors.
▪ The three ranges are merged into one: 8:00am to 8:55am.
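The merging example above can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not the code from the talk: a Python list stands in for the device_trips partition, the name merge_range is hypothetical, and the 5-minute tolerance is inferred from the example queries (8:40am gives end_ms ≥ 8:35am, 8:55am gives start_ms ≤ 9:00am).

```python
from datetime import datetime, timedelta

# Assumed tolerance, inferred from the example queries in the slides.
MAX_GAP = timedelta(minutes=5)


def merge_range(stored, new, max_gap=MAX_GAP):
    """Merge `new` (start, end) into `stored`, a list of (start, end) tuples.

    Stands in for: get neighboring ranges, delete old, save merged range.
    """
    new_start, new_end = new
    # Neighbors satisfy end >= new_start - gap and start <= new_end + gap.
    neighbors = [
        (s, e) for (s, e) in stored
        if e >= new_start - max_gap and s <= new_end + max_gap
    ]
    merged = (
        min([new_start] + [s for s, _ in neighbors]),
        max([new_end] + [e for _, e in neighbors]),
    )
    kept = [r for r in stored if r not in neighbors]  # "delete old"
    return sorted(kept + [merged])                    # "save merged range"


def at(hour, minute):
    # Arbitrary date; only the time of day matters for the example.
    return datetime(2018, 11, 6, hour, minute)


trips = []
for rng in [(at(8, 0), at(8, 12)), (at(8, 40), at(8, 55)), (at(8, 15), at(8, 35))]:
    trips = merge_range(trips, rng)

print([(s.strftime("%H:%M"), e.strftime("%H:%M")) for s, e in trips])
# [('08:00', '08:55')]
```

Because each device is handled by a single worker, the read, delete, and write steps for a device cannot interleave with another merge for that device, which is why the sketch needs no locking.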
The Team
Rohit Saboo, Adam Sowinski, Christian Merkwirth
Yingyi Hu, Karol Kokoszka, Mykola Terelia
and many others
Thank You
Any questions?
Please stay in touch
rohit@nauto.com
@NAUTODriver

Scylla Summit 2018: Nauto - An Online Method for Merging Time Ranges on Top of Scylla
