Big data
Kevin Cawley
A quick talk I gave at the meetup in Boulder, Colorado.
1.
2.
Cassandra – been actively using for 2+ years
3.
Hadoop – 1 yr experience, sort of
4.
5.
Options: mongodb, redis, cassandra, couch, hbase, riak, voldemort, dynamodb
6.
We're gonna get billions of responses – RDBMS is going to fall over
7.
We need nosql... what the hell is that??
8.
Problem For Today, cont.
Kevin, response=cassandra, kevin@foo.com
Emma, response=redis, emma@foo.com
Asher, response=cassandra, [email_address]
…
BILLIONS AND BILLIONS OF THESE!!!!
9.
10.
Key Value store
11.
Ring architecture w/ replication
2^127 tokens
Node 1, Node 2, Node 3, Node 4
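A minimal sketch of how a token ring with replication can place a key, assuming an MD5-based hash folded into a 2^127 token space (as in Cassandra's RandomPartitioner), four evenly spaced nodes, and a simple "next nodes clockwise" replica rule. The names `token_for` and `replicas_for`, the even token spacing, and the replication factor of 3 are illustrative assumptions, not taken from the talk.

```python
import bisect
import hashlib

TOKEN_SPACE = 2 ** 127  # assumed token range (RandomPartitioner-style)

# Hypothetical four-node ring with evenly spaced tokens.
NODES = ["Node 1", "Node 2", "Node 3", "Node 4"]
RING = sorted((i * TOKEN_SPACE // len(NODES), name) for i, name in enumerate(NODES))
TOKENS = [token for token, _ in RING]


def token_for(key):
    """Hash a row key onto the ring (MD5, folded into the token space)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % TOKEN_SPACE


def replicas_for(key, replication_factor=3):
    """Return the owning node plus the next nodes clockwise on the ring."""
    # The owner is the first node whose token is >= the key's token,
    # wrapping around to the start of the ring.
    start = bisect.bisect_left(TOKENS, token_for(key)) % len(RING)
    count = min(replication_factor, len(RING))
    return [RING[(start + i) % len(RING)][1] for i in range(count)]


print(replicas_for("kevin"))
```

Because placement depends only on the key's hash, any node can work out which replicas hold a row without consulting a central directory.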
12.
13.
Column Families – std, dynamic (mo better)
Standard:
       name            preference
100    kevin cawley    cassandra
101    asher cawley    cassandra
102    emma cawley     redis
Dynamic:
             202                   201
redis        ['joe','bob']         ['matthias']
cassandra    ['kevin', 'asher']    ['tom']
mongodb      ['holly']             ['dan']
assume User keys as utf8;
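The difference between the two layouts can be sketched with plain Python dictionaries. This is only an in-memory illustration, not Cassandra client code; it assumes 201 and 202 are row keys and the database names are column names, which the slide does not state outright. The helper `record_response` and the `riak`/`sam` entry are hypothetical.

```python
# Standard column family: every row uses the same, known column names.
users = {
    "100": {"name": "kevin cawley", "preference": "cassandra"},
    "101": {"name": "asher cawley", "preference": "cassandra"},
    "102": {"name": "emma cawley", "preference": "redis"},
}

# Dynamic column family: column names are data, so each row can
# grow new columns without any schema change.
responses = {
    "202": {"redis": ["joe", "bob"], "cassandra": ["kevin", "asher"], "mongodb": ["holly"]},
    "201": {"redis": ["matthias"], "cassandra": ["tom"], "mongodb": ["dan"]},
}


def record_response(row_key, choice, name):
    """Add a name under a column, creating the row or column on first use."""
    responses.setdefault(row_key, {}).setdefault(choice, []).append(name)


# Hypothetical new answer: a column that did not exist before.
record_response("201", "riak", "sam")
print(sorted(responses["201"]))
```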
14.
15.
Fanning – not the cool refreshing kind
16.
Getting phased out
redis
  202: {'joe' => 'joe@foo.com', 'bob' => 'bob@boo.com'}
  201: {'matthias' => 'matthias@foo.com', 'tom' => 'tom@boo.com'}
17.
18.
19.
20.
We built our own – now free
21.
Cassandra is eventually consistent, which makes this hard
22.
Be clever and you will win
23.
Demo 2
24.
Counters
Counter
cassandra    30333
redis        22098
mongodb      24567
couch        12340
...
25.
26.
27.
28.
Map - processes a key/value pair to generate a set of intermediate key/value pairs
29.
Reduce - function that merges all intermediate values associated with the same intermediate key
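The two definitions above can be sketched in a few lines of plain Python, using the three sample responses from the "Problem For Today" slide. This is a single-process illustration of the map, group, and reduce steps, not Hadoop code; the function names `map_response` and `reduce_votes` are made up for the example.

```python
from collections import defaultdict

# Sample responses in the shape shown on the "Problem For Today" slide.
responses = [
    ("Kevin", "cassandra"),
    ("Emma", "redis"),
    ("Asher", "cassandra"),
]


def map_response(name, choice):
    """Map: turn one (name, choice) pair into intermediate (choice, 1) pairs."""
    yield choice, 1


def reduce_votes(choice, counts):
    """Reduce: merge all intermediate values that share the same key."""
    return choice, sum(counts)


# Shuffle step: group the intermediate values by key.
grouped = defaultdict(list)
for name, choice in responses:
    for key, value in map_response(name, choice):
        grouped[key].append(value)

totals = dict(reduce_votes(key, values) for key, values in grouped.items())
winner = max(totals, key=totals.get)
print(totals)  # {'cassandra': 2, 'redis': 1}
print(winner)  # cassandra
```

In a real cluster the map calls run in parallel across nodes and the framework handles the grouping, which is what lets the same two functions scale to billions of responses.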
30.
31.
cassandra asher
32.
33.
Redis 1
AND the winner is cassandra w/ 2 votes!!!
34.
35.
36.
37.
38.
Dangerous if you don't know what you are doing
39.
Schemaless – ironically, modelling is extremely important
40.