STREAM PROCESSING
@ASHIC
HTTP://WWW.HEARTYSOFT.COM
BIG DATA
• What?
BIG DATA
• Hadoop
• Map-Reduce
• Spark
BIG DATA
• Optimisations
• Parquet, etc.
BIG DATA
• Problems?
STREAMING DATA
• What?
STREAMING DATA
• Cheaper?
• Timely results?
• Approximations?
STREAMING DATA
EXAMPLES
• Statistical Summaries
Hold n, sum, and sum of squares => Mean, Standard Deviation
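
The idea of holding only n, the sum, and the sum of squares can be sketched as below. This is a minimal illustration, not the speaker's code: the class name `RunningStats` and the sample values are assumptions, and it computes the population standard deviation. For long streams of large values this formula can lose precision, and a more numerically stable update (such as Welford's method) may be preferable.

```python
import math


class RunningStats:
    """Keeps only n, the sum, and the sum of squares of the items seen so far."""

    def __init__(self):
        self.n = 0
        self.total = 0.0
        self.total_sq = 0.0

    def add(self, x):
        # Constant-size state: nothing from the stream is stored.
        self.n += 1
        self.total += x
        self.total_sq += x * x

    def mean(self):
        return self.total / self.n

    def std(self):
        # Population standard deviation: sqrt(E[x^2] - E[x]^2).
        m = self.mean()
        variance = self.total_sq / self.n - m * m
        return math.sqrt(max(variance, 0.0))  # clamp tiny negative rounding error


stats = RunningStats()
for x in [2, 4, 4, 4, 5, 5, 7, 9]:  # hypothetical stream
    stats.add(x)
print(stats.mean(), stats.std())
```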
EXAMPLES
• Statistical Summaries
* Start with a value
* If item > value, add learning rate
* If item < value, subtract learning rate
=> Approximation of Median
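
The update rule above can be sketched as follows. The function name `approx_median`, the starting value, the learning rate, and the synthetic uniform data are all assumptions chosen for illustration; the estimate only hovers near the true median and its accuracy depends on the learning rate and on how long the stream is.

```python
import random
import statistics


def approx_median(stream, start=0.0, learning_rate=0.1):
    """Nudge a single running estimate up or down by a fixed step per item."""
    estimate = start
    for item in stream:
        if item > estimate:
            estimate += learning_rate
        elif item < estimate:
            estimate -= learning_rate
    return estimate


rng = random.Random(42)
data = [rng.uniform(0, 100) for _ in range(20000)]  # hypothetical stream
estimate = approx_median(data, start=0.0, learning_rate=0.1)
true_median = statistics.median(data)  # computed here only for comparison
print(estimate, true_median)
```

Only one number is kept in memory, which is the attraction: the exact median would need the whole stream.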
EXAMPLES
• Taking Representative Samples
- From weblogs (i.e. ip-timestamp tuples), approximate the percentage of users who have revisited.
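
One common way to get a representative sample for a per-user question like this is to sample users rather than individual log lines, so that every visit of a chosen user is kept. The sketch below assumes that approach; the function names, the choice of SHA-256, the 1-in-10 bucket ratio, and the synthetic log are illustrative assumptions, not taken from the talk.

```python
import hashlib


def in_sample(ip, buckets=10):
    """Keep a user if its hash falls into bucket 0 of `buckets` (about 1 in `buckets` users)."""
    digest = hashlib.sha256(ip.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets == 0


def revisit_fraction(log, buckets=10):
    """Estimate the fraction of users with more than one visit, using sampled users only."""
    visits = {}
    for ip, _timestamp in log:
        if in_sample(ip, buckets):
            visits[ip] = visits.get(ip, 0) + 1
    if not visits:
        return None
    return sum(1 for count in visits.values() if count > 1) / len(visits)


# Hypothetical log: 5000 users, every second one comes back once.
log = []
for i in range(5000):
    ip = f"10.0.{i // 256}.{i % 256}"
    log.append((ip, i))
    if i % 2 == 0:
        log.append((ip, i + 10000))

fraction = revisit_fraction(log)
print(fraction)
```

Sampling log lines at random instead would tend to keep only one of a returning user's visits and so understate revisits.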
EXAMPLES
• Filtering Streams
Filter Out (or In) Things That May Not Be Needed
EXAMPLES
• Filtering Streams
Bloom Filter
• Hash based on criterion
• Matching hash means entry may be in there
• Non-matching hash means it’s definitely not
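
A minimal Bloom filter along these lines might look like the sketch below. The class name, the bit-array size, the number of hash functions, and the use of seeded SHA-256 digests are assumptions for illustration; real implementations size the filter from the expected item count and the acceptable false-positive rate.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity over compactness

    def _positions(self, item):
        # Derive several hash functions by prefixing a seed to the item.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True means "maybe present"; False means "definitely not present".
        return all(self.bits[pos] for pos in self._positions(item))


seen = BloomFilter()
for address in ["a@example.com", "b@example.com"]:  # hypothetical allowed keys
    seen.add(address)
print(seen.might_contain("a@example.com"))
```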
EXAMPLES
How Many Distinct Things Did We Get?
EXAMPLES
• Approximate Distinct Elements
Flajolet-Martin Algorithm
• Hash element (or identifier) to longs using many hash functions. For each hash, count its trailing zeroes. Let it be r.
• Approximation for distinct elements = 2^R, where R = max(r) seen for that hash function
• Combine groups of hashes: Take average for each group, then take median of the averages.
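
The three steps above can be sketched as follows. The function names, the 64-bit hashes derived from seeded SHA-256 digests, and the 8 groups of 8 hash functions are assumptions for illustration. The result is a rough estimate (each per-hash value is a power of two), so it should be read as an order-of-magnitude figure.

```python
import hashlib
import statistics


def trailing_zeros(value):
    if value == 0:
        return 64  # all-zero 64-bit hash; the 64-bit width is an assumption of this sketch
    return (value & -value).bit_length() - 1


def hash64(item, seed):
    digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def fm_estimate(stream, num_groups=8, group_size=8):
    num_hashes = num_groups * group_size
    max_r = [0] * num_hashes  # R per hash function: the only state kept
    for item in stream:
        for seed in range(num_hashes):
            r = trailing_zeros(hash64(item, seed))
            if r > max_r[seed]:
                max_r[seed] = r
    estimates = [2 ** r for r in max_r]
    # Average within each group, then take the median across groups.
    group_means = [
        statistics.mean(estimates[g * group_size:(g + 1) * group_size])
        for g in range(num_groups)
    ]
    return statistics.median(group_means)


items = [f"user-{i}" for i in range(1000)]  # 1000 distinct hypothetical identifiers
estimate = fm_estimate(items * 3)           # repeats do not change the estimate
print(estimate)
```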
EXAMPLES
• Clustering
• Bradley, Fayyad, Reina (BFR) approach.
• BDMO algorithm.
BACK TO…
USEFUL TECHNOLOGY
• Apache Kafka
• Apache Cassandra
• Apache Spark
KAFKA
• Scale out, clustered, durable message broker.
• Fault tolerant, replicated.
• Uses topics, which have partitions.
• Messages within partitions have guaranteed ordering.
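
A rough model of why ordering holds within a partition: records with the same key are routed to the same partition, and each partition is an append-only log. The sketch below is plain Python, not the Kafka client API; `partition_for`, `append`, the partition count, and the CRC32 hash are stand-ins (Kafka's default partitioner uses a different hash).

```python
import zlib


def partition_for(key, num_partitions):
    """Stand-in for a key-based partitioner: same key, same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def append(log, key, value, num_partitions=3):
    """Append a record to the in-memory list that models its partition."""
    partition = partition_for(key, num_partitions)
    log.setdefault(partition, []).append((key, value))
    return partition


log = {}
events = [("user-a", 1), ("user-b", 1), ("user-a", 2), ("user-b", 2), ("user-a", 3)]
for key, value in events:
    append(log, key, value)

user_a_partition = partition_for("user-a", 3)
print([v for k, v in log[user_a_partition] if k == "user-a"])
```

Records for different keys may land in different partitions, and no ordering is implied across partitions.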
KAFKA
• Kafka Streams: Lightweight Kafka => [x] library
• Kafka Connect: Enables streaming large amounts of data reliably between Kafka and other systems
• Schema Registry: Well…registry for schemas
KAFKA - GOTCHAS
• Messages in a partition are ordered; message processing may not be.
• At-least-once delivery… downstream idempotence required.
• Disk.
• Rebalances.
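
One way to make a downstream consumer idempotent under at-least-once delivery is to remember the highest offset already applied per partition and skip anything at or below it. The sketch below is plain Python with hypothetical names (`IdempotentCounter`, `handle`) and does not use the Kafka client; in a real system the totals and the offsets would need to be stored together atomically.

```python
class IdempotentCounter:
    """Applies each (partition, offset) at most once, even if it is redelivered."""

    def __init__(self):
        self.totals = {}
        self.last_offset = {}  # highest offset applied, per partition

    def handle(self, partition, offset, key, amount):
        # Relies on offsets within a partition arriving in increasing order.
        if offset <= self.last_offset.get(partition, -1):
            return False  # duplicate delivery: already applied
        self.totals[key] = self.totals.get(key, 0) + amount
        self.last_offset[partition] = offset
        return True


counter = IdempotentCounter()
deliveries = [
    (0, 0, "a", 5),
    (0, 1, "b", 2),
    (0, 2, "a", 1),
    (0, 1, "b", 2),  # replayed after a hypothetical rebalance
    (0, 2, "a", 1),  # replayed
    (0, 3, "b", 4),
]
applied = [counter.handle(*delivery) for delivery in deliveries]
print(counter.totals, applied)
```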
CASSANDRA
• Partitioned row store.
• Fault tolerant, Masterless.
• Very fast writes, fast reads.
• Tunable consistency.
• Multi-datacentre aware.
• OLTP + OLAP (via Spark).
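
As a small worked example of tunable consistency: with replication factor RF, a read is guaranteed to overlap the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds RF. The helper names below are hypothetical, and the sketch only does the arithmetic; it does not talk to a cluster.

```python
def quorum(replication_factor):
    """Size of a majority of replicas."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor, write_replicas, read_replicas):
    """True when every read set must share at least one replica with every write set."""
    return write_replicas + read_replicas > replication_factor


rf = 3
print(quorum(rf))                                           # replicas in a QUORUM
print(reads_see_latest_write(rf, quorum(rf), quorum(rf)))   # QUORUM writes + QUORUM reads
print(reads_see_latest_write(rf, 1, 1))                     # ONE writes + ONE reads
```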
CASSANDRA - DATACENTRES
CASSANDRA – SCHEMA
• Collection Types
• User defined types
• Static Columns
• Materialised Views
CASSANDRA - CQL
CASSANDRA – DATA MODELLING
• NOT a relational database
• KNOW YOUR QUERIES
• Model for queries, not normalisation
• Consolidate to minimal number of tables that get the job done
• Unbounded partition growth will bring down nodes, then quorum
CASSANDRA + SPARK
SPARK
• General purpose data processing
• Ability to cache things in memory, and re-use across steps.
SPARK STREAMING
• Microbatches
• Similar API to non-streaming Spark
SPARK STREAMING WC
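
The micro-batch idea behind a streaming word count can be sketched in plain Python as below. This is not the Spark Streaming API: `micro_batches`, `word_counts`, the one-second batch length, and the sample events are assumptions used only to show lines being grouped into small batches and each batch being counted like an ordinary collection.

```python
from collections import Counter


def micro_batches(events, batch_seconds):
    """Group (timestamp, line) events into consecutive fixed-length batches."""
    batches = {}
    for timestamp, line in events:
        batches.setdefault(int(timestamp // batch_seconds), []).append(line)
    return [batches[index] for index in sorted(batches)]


def word_counts(lines):
    """The same word count that would be run on a non-streaming collection."""
    return Counter(word for line in lines for word in line.split())


events = [(0.2, "a b"), (0.9, "a"), (1.1, "b b")]  # hypothetical (seconds, text) input
per_batch = [word_counts(batch) for batch in micro_batches(events, batch_seconds=1)]
print(per_batch)
```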
SPARK + KAFKA
Kafka Direct Stream
SPARK + CASSANDRA
* rdd.saveToCassandra
* sc.cassandraTable
KAFKA + CASSANDRA
* Cassandra Sink
* Cassandra Connect
STREAM PROCESSING
• Lots of open problems
• RISELab (Real-time, Intelligent, and Secure Execution)
THANK YOU
@ashic
http://github/Heartysoft/cassy-up
