100 Million Events
    Eric Lubow
    @elubow
    elubow@simplereach.com
Overview
•   SimpleReach
•   100 Million Events
•   Finding Patterns in Your Data
•   What Mistakes?
•   Questions


Socially Intelligent



Size
•   100m events
    recorded per day and
    growing
•   500m Pageviews per
    month and growing
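As a rough back-of-the-envelope check on the figures above, the daily and monthly totals can be turned into average rates. This is only a sketch: it assumes traffic is spread evenly and that a month has 30 days, neither of which the slides state, and real peaks would be higher than the average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_per_second(events_per_day: int) -> float:
    """Average event rate implied by a daily total (ignores peaks)."""
    return events_per_day / SECONDS_PER_DAY


def average_per_day(total_per_month: int, days_in_month: int = 30) -> float:
    """Average daily volume; the 30-day month is an assumption."""
    return total_per_month / days_in_month


events_rate = average_per_second(100_000_000)
pageviews_per_day = average_per_day(500_000_000)

print(f"{events_rate:,.0f} events/s on average")             # 1,157 events/s on average
print(f"{pageviews_per_day:,.0f} pageviews/day on average")  # 16,666,667 pageviews/day on average
```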




Right Tool For The Job




Why?
•   Heavier READ loads vs heavier write loads
•   Data relationships may be less important
•   Different aspects of a system have different requirements
•   Know your compromises




Cassandra
•   Large data volume ingestion
•   Really fast writes to many locations (eventual consistency)
•   Query by column groups within rows
•   Range queries in Hive (Slice predicate ranges)
•   Fault tolerant
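The "query by column groups" and "slice predicate ranges" bullets can be pictured with a toy model: a row whose columns are kept sorted by name, so a contiguous range of columns can be read in one pass. The sketch below is plain Python, not Cassandra's or Hive's API; `WideRow` and its methods are hypothetical names chosen for illustration.

```python
from bisect import bisect_left, bisect_right


class WideRow:
    """Toy model of one wide row: column names are kept in sorted order."""

    def __init__(self):
        self._names = []    # sorted column names
        self._values = {}   # column name -> value

    def insert(self, name, value):
        # Keep the name list sorted; overwrite the value if the column exists.
        if name not in self._values:
            self._names.insert(bisect_left(self._names, name), name)
        self._values[name] = value

    def slice(self, start, finish):
        """Return (name, value) pairs with start <= name <= finish."""
        lo = bisect_left(self._names, start)
        hi = bisect_right(self._names, finish)
        return [(n, self._values[n]) for n in self._names[lo:hi]]


row = WideRow()
for hour, views in [(3, 40), (1, 12), (2, 25), (5, 9)]:
    row.insert(hour, views)

hits = row.slice(2, 4)
print(hits)  # [(2, 25), (3, 40)]
```

Because the columns are already ordered, a range read touches only the requested span instead of scanning the whole row.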




What Mistakes?
•   Manage how many servers?
•   Re-inventing the wheel (Helenus)
•   Composites Rock
•   Snapshots before drop keyspace
•   How many experts does it take to run a cluster?
•   You can tune Cassandra?!?

Server Management
•   Hand tools - AWS, csshx
•   Configuration Management
•   Monitoring and Alerting Tools
•   Performance
•   Security




Helenus
•   Built Node.js driver for Cassandra
•   https://github.com/simplereach/helenus
•   CQL 2/3, Composite Column, Thrift Interface
•   Parallel querying (split up queries)
•   Fault tolerance and resilience
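The "parallel querying (split up queries)" idea can be sketched as splitting one large key list into chunks and issuing the chunks concurrently. The example below is written in Python rather than Node.js and does not use Helenus's API; `fetch_rows` is a hypothetical stand-in for a single multi-key read, and the chunk size and worker count are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(keys, size):
    """Split a list of keys into consecutive chunks of at most `size`."""
    return [keys[i:i + size] for i in range(0, len(keys), size)]


def fetch_rows(keys):
    """Hypothetical stand-in for one multi-key read against the cluster."""
    return {key: f"row-for-{key}" for key in keys}


def parallel_fetch(keys, chunk_size=2, workers=4):
    """Run one smaller query per chunk in parallel and merge the results."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(fetch_rows, chunked(keys, chunk_size)):
            results.update(partial)
    return results


rows = parallel_fetch(["a", "b", "c", "d", "e"])
print(sorted(rows))  # ['a', 'b', 'c', 'd', 'e']
```

Smaller queries also fail independently, so a real driver could retry one chunk without repeating the whole read.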


Data Patterns
•   Storage is cheap
•   Composites are WAY better than underscores
•   Beyond UTF8Type
•   Timestamps as LongType
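One way to see why composites beat underscore-joined names, and why timestamps are better stored as longs than as text, is to compare how the two forms sort. This is a plain-Python illustration, not Cassandra code: tuples stand in for composite columns with typed components, and the site IDs and millisecond timestamps are made up.

```python
def underscore_key(site_id: int, ts_ms: int) -> str:
    """Everything flattened into one string, compared character by character."""
    return f"{site_id}_{ts_ms}"


def composite_key(site_id: int, ts_ms: int) -> tuple:
    """Each component keeps its own type and is compared numerically."""
    return (site_id, ts_ms)


# (site_id, timestamp in milliseconds) pairs; values are invented.
events = [(10, 1_350_000_000_000), (9, 1_350_000_060_000), (9, 999_999_999_999)]

as_strings = sorted(underscore_key(s, t) for s, t in events)
as_composites = sorted(composite_key(s, t) for s, t in events)

print(as_strings)
# ['10_1350000000000', '9_1350000060000', '9_999999999999']
print(as_composites)
# [(9, 999999999999), (9, 1350000060000), (10, 1350000000000)]
```

The string form puts site 10 before site 9 and orders the two timestamps for site 9 backwards, so range scans over it return the wrong span. The typed form sorts as expected.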




Safety Mechanisms
•   Snapshots before dropping keyspaces
•   Authorization and authentication
•   (Limit) Direct access to the data store
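The "snapshot before dropping" rule can be enforced by a small wrapper that refuses to drop unless the snapshot step succeeded. This is a sketch under stated assumptions: `snapshot_then_drop` and the injected `run` callable are hypothetical, the commands assume `nodetool snapshot -t <tag> <keyspace>` and `cqlsh -e` are available in the reader's Cassandra version, and a snapshot taken this way covers only the node it runs on.

```python
def snapshot_then_drop(keyspace, run, tag=None):
    """Snapshot a keyspace, then drop it only if the snapshot succeeded.

    `run` is a caller-supplied function that executes a command (a list of
    strings) and returns its exit code; injecting it keeps this sketch testable.
    """
    if not keyspace.isidentifier():
        raise ValueError(f"suspicious keyspace name: {keyspace!r}")
    tag = tag or f"pre-drop-{keyspace}"

    # Assumed CLI shape; must be repeated on every node in a real cluster.
    snapshot_cmd = ["nodetool", "snapshot", "-t", tag, keyspace]
    if run(snapshot_cmd) != 0:
        raise RuntimeError("snapshot failed; keyspace was NOT dropped")

    run(["cqlsh", "-e", f"DROP KEYSPACE {keyspace};"])


# Dry run: record the commands instead of executing them.
issued = []
snapshot_then_drop("events", lambda cmd: issued.append(cmd) or 0)
for cmd in issued:
    print(" ".join(cmd))
```

Routing drops through one guarded path like this also fits the third bullet, since nobody needs direct access to the data store to run them.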




Expertise
•   What happens when you need help?
•   How do you become an expert?
•   What happens when you need more experts?




Tunables
•   Replication factor and read_repair_chance
•   Phi Convict and RPC timeout for AWS or DC separation
•   MAX_HEAP_SIZE and HEAP_NEWSIZE (Analytics vs Realtime)
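For the replication factor bullet, the usual arithmetic is worth writing down: a quorum is a majority of replicas, and a read is guaranteed to overlap the latest write only when the replicas read plus the replicas written exceed the replication factor. The helper names below are illustrative and the slides give no specific values, so RF=3 is just a common example.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_can_fail(replication_factor: int) -> int:
    """Replicas that may be down while quorum operations still succeed."""
    return replication_factor - quorum(replication_factor)


def reads_see_latest_write(rf: int, read_replicas: int, write_replicas: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    return read_replicas + write_replicas > rf


rf = 3
print(quorum(rf))                         # 2
print(replicas_can_fail(rf))              # 1
print(reads_see_latest_write(rf, 2, 2))   # True  (quorum reads and writes)
print(reads_see_latest_write(rf, 1, 1))   # False (may read a stale replica)
```

When reads and writes both touch a single replica, stale reads are possible, which is the gap that read repair is meant to narrow over time.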




Future
•   Priam
•   Asgard
•   Curator
•   Work for             ?
•   Hastur



Summary
•   Learn from others' mistakes
•   Tuning and data patterns
•   It’s ok to re-invent the wheel
•   Applications for/with Cassandra




We’re Hiring




Questions are guaranteed in life.
Answers aren’t.

               Eric Lubow
               @elubow
               elubow@simplereach.com


               Thank you.

100m Events

  • 1. 100 Million Events Eric Lubow @elubow elubow@simplereach.com
  • 2. Overview 100 Million Events Eric Lubow @elubow
  • 3. Overview • SimpleReach 100 Million Events Eric Lubow @elubow
  • 4. Overview • SimpleReach • 100 Million Events 100 Million Events Eric Lubow @elubow
  • 5. Overview • SimpleReach • 100 Million Events • Finding Patterns in Your Data 100 Million Events Eric Lubow @elubow
  • 6. Overview • SimpleReach • 100 Million Events • Finding Patterns in Your Data • What Mistakes? 100 Million Events Eric Lubow @elubow
  • 7. Overview • SimpleReach • 100 Million Events • Finding Patterns in Your Data • What Mistakes? • Questions 100 Million Events Eric Lubow @elubow
  • 8. Socially Intelligent 100 Million Events Eric Lubow @elubow
  • 9. Size 100 Million Events Eric Lubow @elubow
  • 10. Size • 100m events recorded per day and growing 100 Million Events Eric Lubow @elubow
  • 11. Size • 100m events recorded per day and growing • 500m Pageviews per month and growing 100 Million Events Eric Lubow @elubow
  • 12. Right Tool For The Job 100 Million Events Eric Lubow @elubow
  • 13. Why? 100 Million Events Eric Lubow @elubow
  • 14. Why? • Heavier READ loads vs heavier write loads 100 Million Events Eric Lubow @elubow
  • 15. Why? • Heavier READ loads vs heavier write loads • Data relationships may be less important 100 Million Events Eric Lubow @elubow
  • 16. Why? • Heavier READ loads vs heavier write loads • Data relationships may be less important • Different aspects of a system have different requirements 100 Million Events Eric Lubow @elubow
  • 17. Why? • Heavier READ loads vs heavier write loads • Data relationships may be less important • Different aspects of a system have different requirements • Know your compromises 100 Million Events Eric Lubow @elubow
  • 18. Cassandra 100 Million Events Eric Lubow @elubow
  • 19. Cassandra • Large data volume ingestion 100 Million Events Eric Lubow @elubow
  • 20. Cassandra • Large data volume ingestion • Really fast writes to many locations (eventual consistency) 100 Million Events Eric Lubow @elubow
  • 21. Cassandra • Large data volume ingestion • Really fast writes to many locations (eventual consistency) • Query by column groups within rows 100 Million Events Eric Lubow @elubow
  • 22. Cassandra • Large data volume ingestion • Really fast writes to many locations (eventual consistency) • Query by column groups within rows • Range queries in Hive (Slice predicate ranges) 100 Million Events Eric Lubow @elubow
  • 23. Cassandra • Large data volume ingestion • Really fast writes to many locations (eventual consistency) • Query by column groups within rows • Range queries in Hive (Slice predicate ranges) • Fault tolerant 100 Million Events Eric Lubow @elubow
  • 24. What Mistakes? 100 Million Events Eric Lubow @elubow
  • 25. What Mistakes? • Manage how many servers? 100 Million Events Eric Lubow @elubow
  • 26. What Mistakes? • Manage how many servers? • Re-inventing the wheel (Helenus) 100 Million Events Eric Lubow @elubow
  • 27. What Mistakes? • Manage how many servers? • Re-inventing the wheel (Helenus) • Composites Rock 100 Million Events Eric Lubow @elubow
  • 28. What Mistakes? • Manage how many servers? • Re-inventing the wheel (Helenus) • Composites Rock • Snapshots before drop keyspace 100 Million Events Eric Lubow @elubow
  • 29. What Mistakes? • Manage how many servers? • Re-inventing the wheel (Helenus) • Composites Rock • Snapshots before drop keyspace • How many experts does it take to run a cluster? 100 Million Events Eric Lubow @elubow
  • 30. What Mistakes? • Manage how many servers? • Re-inventing the wheel (Helenus) • Composites Rock • Snapshots before drop keyspace • How many experts does it take to run a cluster? • You can tune Cassandra?!? 100 Million Events Eric Lubow @elubow
  • 36. Server Management • Hand tools - AWS, csshx (Cluster SSH) • Configuration Management • Monitoring and Alerting Tools • Performance • Security
  • 42. Helenus • Built Node.js driver for Cassandra • https://github.com/simplereach/helenus • CQL 2/3, Composite Column, Thrift Interface • Parallel querying (split up queries) • Fault tolerance and resilience
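The "parallel querying (split up queries)" bullet can be read as: break one large range query into smaller sub-range queries, run them concurrently, and stitch the results back together in order. The sketch below shows that pattern only. It is not how Helenus implements it (Helenus is a Node.js driver; Python is used here purely for illustration), and `split_range`, `fetch_range`, and `parallel_query` are hypothetical names, with `fetch_range` standing in for a real driver call.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(start, end, parts):
    """Split the half-open range [start, end) into at most `parts` contiguous sub-ranges."""
    if end <= start or parts < 1:
        return []
    step = -(-(end - start) // parts)  # ceiling division
    return [(lo, min(lo + step, end)) for lo in range(start, end, step)]


def fetch_range(lo, hi):
    """Hypothetical stand-in for one driver query; returns fake rows for [lo, hi)."""
    return [("event", ts) for ts in range(lo, hi)]


def parallel_query(start, end, parts=4):
    """Run one sub-query per sub-range concurrently and merge the results in range order."""
    ranges = split_range(start, end, parts)
    if not ranges:
        return []
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        # map() yields results in input order, so the merged rows stay sorted by range.
        chunks = pool.map(lambda r: fetch_range(*r), ranges)
        return [row for chunk in chunks for row in chunk]


rows = parallel_query(0, 10, parts=3)
```

Smaller sub-queries also limit how much work is lost when a single request fails, since only that sub-range needs to be retried.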
  • 47. Data Patterns • Storage is cheap • Composites are WAY better than underscores • Beyond UTF8Type • Timestamps as LongType
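The points about composites, going "beyond UTF8Type", and storing timestamps as LongType come down to two problems with names built by joining strings with underscores: the parts cannot be split back apart reliably, and numbers stored as text sort lexicographically rather than numerically. The sketch below illustrates both, using Python tuples as a stand-in for composite column names; the helper and the sample values are hypothetical.

```python
def underscore_key(*parts):
    """Build a column name the 'underscore' way: everything becomes one string."""
    return "_".join(str(p) for p in parts)


# Problem 1: a component that itself contains "_" cannot be recovered.
name = underscore_key("page_views", 1355000000)
parts = name.split("_")  # three pieces come back, not the original two

# Problem 2: timestamps stored as text sort as text.
as_text = sorted(["9", "10", "100"])
as_long = sorted([9, 10, 100])

# A composite keeps each component separate and typed, so comparison is
# component by component: text first, then the number as a number.
composite_names = [("page_views", 10), ("page_views", 9), ("likes", 100)]
ordered = sorted(composite_names)
```

With typed components, a range read over one metric's timestamps returns them in numeric order, which is what time-series slices need.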
  • 51. Safety Mechanisms • Snapshots before dropping keyspaces • Authorization and authentication • (Limit) Direct access to the data store
  • 55. Expertise • What happens when you need help? • How do you become an expert? • What happens when you need more experts?
  • 59. Tunables • Replication factor and read_repair_chance • phi_convict_threshold and RPC timeout for AWS or DC separation • MAX_HEAP_SIZE and HEAP_NEWSIZE (Analytics vs Realtime)
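One reason the replication factor is worth tuning deliberately is that it fixes the quorum size and therefore how many replicas can be down before quorum reads and writes fail. The arithmetic can be sketched as follows; the function names are hypothetical, and the deck itself does not state which consistency levels were used.

```python
def quorum(replication_factor):
    """Replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor):
    """Replicas that can be down while QUORUM operations still succeed."""
    return replication_factor - quorum(replication_factor)


def reads_see_latest_write(replication_factor, read_replicas, write_replicas):
    """True when every read set overlaps every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


rf = 3
q = quorum(rf)
```

With a replication factor of 3, quorum is 2 and one replica can be lost; raising the factor to 4 raises quorum to 3 without tolerating any more failures.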
  • 60. Future • Priam • Asgard • Curator • Work for ? • Hastur
  • 65. Summary • Learn from others' mistakes • Tuning and data patterns • It’s ok to re-invent the wheel • Applications for/with Cassandra
  • 66. We’re Hiring
  • 67. Questions are guaranteed in life. Answers aren’t. Eric Lubow @elubow elubow@simplereach.com Thank you.

Editor's Notes

  7. SimpleReach is a social intelligence tool for content creators. We track every social action, on every major network, across the entire web in real time. That means every like, tweet, pin, stumble, and many more.