Big & Fast:
A quest for relevant and real-time analytics
Natalino Busa
@natalinobusa
Parallelism, Mathematics, Programming Languages, Machine Learning, Statistics,
Big Data, Algorithms, Cloud Computing
Natalino Busa
@natalinobusa
www.natalinobusa.com
Big and Fast.
Methodology, architecture, roles and organization
Conversion is the ultimate form of permission marketing.
Permission marketing is about the honour of being heard.
How to earn it?
Provide the right suggestions at the right time.
This is what makes data analysis valuable.
When do you really know your customer?
When you know about their last unique:
5 songs?
100 songs?
10,000 songs?
Old & new stuff.
We evolve slowly: our personality, our habits.
But events and trends can affect us at short notice.
How do you combine old with new?
The customer’s context
Complex on many dimensions:
Personal history: number of transactions ever done
Long-term interaction: how the user’s actions correlate with those of others
Real-time events: trends and recent events
The customer’s context
Context is related to time:
slow changing: the defining characteristics of a person
fast changing: events which influence our lives, trends
These require very different technology solutions!
Challenges
millions of billions of
Not much time to react: the window of opportunity is sometimes just a few seconds
Loads of information to process: you want to understand the user history well
Slow and fast
Slow: ranking and preference analysis, segmentation and clustering
10s of terabytes of data. This can take hours...
Fast: short-term trending topics, rule-based recommendations
100s of events per second. This must be fast...
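The fast side can be illustrated with a small sketch of short-term trending topics. This is not taken from the talk: it is a minimal Python example, assuming events arrive in timestamp order, and the names `TrendingTopics`, `record`, and `top` are hypothetical.

```python
from collections import Counter, deque


class TrendingTopics:
    """Count topic mentions inside a sliding time window (hypothetical helper)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = deque()    # (timestamp, topic) pairs, oldest first
        self._counts = Counter()  # topic -> mentions currently inside the window

    def _evict(self, now):
        # Drop events that are older than the window and undo their counts.
        while self._events and self._events[0][0] <= now - self.window_seconds:
            _, topic = self._events.popleft()
            self._counts[topic] -= 1
            if self._counts[topic] == 0:
                del self._counts[topic]

    def record(self, timestamp, topic):
        # Assumption: timestamps are non-decreasing.
        self._evict(timestamp)
        self._events.append((timestamp, topic))
        self._counts[topic] += 1

    def top(self, now, n=3):
        self._evict(now)
        return self._counts.most_common(n)


trends = TrendingTopics(window_seconds=60)
for ts, topic in [(0, "jazz"), (10, "rock"), (50, "rock"), (70, "pop"), (75, "rock")]:
    trends.record(ts, topic)
top_now = trends.top(now=80)  # only events from the last 60 seconds count
```

Each event is handled in amortized constant time, which is what makes this kind of counter suitable for the per-second event rates on the fast side; the hours-long analyses on the slow side need the full history instead.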
Hadoop: Distributed Data OS
Reliable: distributed, replicated file system
Low cost: ↓ cost vs ↑ performance/storage
Computing powerhouse: all of the cluster’s CPUs work in parallel to run queries
Scala / Akka / Spray: a reactive web API framework
[Diagram: Actor A, Actor B, and Actor C exchanging messages msg 1 to msg 4]
● it scales horizontally (can run in cluster mode)
● it makes maximum use of the available cores/memory
1. processing is non-blocking, threads are re-used
2. computing power can be parallelized across many actors
Very fast: 1000s of messages/sec
Very reliable: automatic recovery
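The actor idea above can be sketched in a few lines. This is not Akka code: it is a minimal Python `asyncio` stand-in, and `Actor`, `Forwarder`, `tell`, and the `None` stop sentinel are hypothetical names chosen for illustration. Three actors share one thread, each one draining its own mailbox one message at a time.

```python
import asyncio


class Actor:
    """A tiny actor: private state plus a mailbox drained one message at a time."""

    def __init__(self, name):
        self.name = name
        self.mailbox = asyncio.Queue()
        self.received = []

    def tell(self, message):
        # Fire-and-forget send: the caller never waits for processing.
        self.mailbox.put_nowait(message)

    async def run(self):
        while True:
            message = await self.mailbox.get()  # yields the thread while idle
            if message is None:  # sentinel used in this sketch to stop the actor
                break
            await self.receive(message)

    async def receive(self, message):
        self.received.append(message)


class Forwarder(Actor):
    """Records each message, then passes it on to another actor."""

    def __init__(self, name, target):
        super().__init__(name)
        self.target = target

    async def receive(self, message):
        await super().receive(message)
        self.target.tell(f"{message} via {self.name}")


async def main():
    c = Actor("C")
    b = Forwarder("B", target=c)
    a = Forwarder("A", target=b)
    actors = (a, b, c)
    tasks = [asyncio.create_task(actor.run()) for actor in actors]
    for i in range(1, 5):
        a.tell(f"msg {i}")
    # Stop the actors in pipeline order so no message is lost.
    for actor, task in zip(actors, tasks):
        actor.tell(None)
        await task
    return c.received


result = asyncio.run(main())
```

Because an actor only touches its own state and talks to others through messages, the same structure can be spread across cores or machines; a real actor framework adds supervision and recovery on top, which this sketch leaves out.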
Distributed computing: lambda architecture
[Diagram components: Data Warehouses, Messaging Buses, Batch Computing, Streaming Computing, In-Memory Distributed Database, HTTP RESTful API]
Lambda Architecture: Batch + Streaming
In-memory distributed DBs
Low-latency web API services
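A minimal sketch of how the batch and streaming sides can be combined at query time. This is an illustration in plain Python, not the stack from the slides: `build_batch_view`, `SpeedLayer`, and `query` are hypothetical names, and in-process counters stand in for the in-memory distributed database.

```python
from collections import Counter


def build_batch_view(historical_events):
    """Batch layer: recompute totals from the full history (slow, complete)."""
    return Counter(user for user, _song in historical_events)


class SpeedLayer:
    """Streaming layer: count only the events not yet covered by a batch run."""

    def __init__(self):
        self.realtime_view = Counter()

    def on_event(self, user, song):
        self.realtime_view[user] += 1

    def reset(self):
        # Call once a new batch view has absorbed these events.
        self.realtime_view.clear()


def query(batch_view, speed_layer, user):
    """Serving layer: merge both views at read time."""
    return batch_view[user] + speed_layer.realtime_view[user]


history = [("ann", "s1"), ("ann", "s2"), ("bob", "s1")]
batch_view = build_batch_view(history)

speed = SpeedLayer()
speed.on_event("ann", "s3")  # arrived after the last batch run
speed.on_event("cat", "s1")

plays = {user: query(batch_view, speed, user) for user in ("ann", "bob", "cat", "dan")}
```

The batch view can be rebuilt from scratch at any time, so errors in the streaming counts are short-lived: they disappear at the next batch run, when the speed layer is reset.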
Distributed computing: some techs
Hadoop
Cassandra
millions of billions of
λ (lambda) = conversions
All Things Distributed
Distributing computing and storage: more machines = more storage/computing
Open-source software solutions: mature enough for pragmatic adopters
Near-real-time + big data technologies: Hadoop, Scala, Akka, Spray, Cassandra
Science & Engineering
Science: Statistics, Data Science, Python, R, Visualization
Engineering: IT Infra, Big Data, Java, Scala, SQL
Hadoop: Big Data Infrastructure, Data Science on large datasets
Big Data and Fast Data require different profiles to achieve the best results.
Thanks !
Any questions?

More Related Content

What's hot

Final deck
Final deckFinal deck
Final deck
Steve Watt
 
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Shirshanka Das
 
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Data Con LA
 
Big Data LDN 2016: Out of the Data Warehouses, and into the Data Lakes and St...
Big Data LDN 2016: Out of the Data Warehouses, and into the Data Lakes and St...Big Data LDN 2016: Out of the Data Warehouses, and into the Data Lakes and St...
Big Data LDN 2016: Out of the Data Warehouses, and into the Data Lakes and St...
Matt Stubbs
 
Dataiku Flow and dctc - Berlin Buzzwords
Dataiku Flow and dctc - Berlin BuzzwordsDataiku Flow and dctc - Berlin Buzzwords
Dataiku Flow and dctc - Berlin Buzzwords
Dataiku
 
The of Operational Analytics Data Store
The of Operational Analytics Data StoreThe of Operational Analytics Data Store
The of Operational Analytics Data Store
Rommel Garcia
 
Big Data - Part I
Big Data - Part IBig Data - Part I
Big Data - Part I
Thanuja Seneviratne
 
What's next for Big Data? -- Apache Spark
What's next for Big Data? -- Apache SparkWhat's next for Big Data? -- Apache Spark
Data Analytics and Processing at Snap - Druid Meetup LA - September 2018
Data Analytics and Processing at Snap - Druid Meetup LA - September 2018Data Analytics and Processing at Snap - Druid Meetup LA - September 2018
Data Analytics and Processing at Snap - Druid Meetup LA - September 2018
Charles Allen
 
Self Service Analytics at Twitch
Self Service Analytics at TwitchSelf Service Analytics at Twitch
Self Service Analytics at Twitch
Imply
 
Clickstream & Social Media Analysis using Apache Spark
Clickstream & Social Media Analysis using Apache SparkClickstream & Social Media Analysis using Apache Spark
Apache Druid: The Foundation of Fortune 500 “Analytical Decision-Making"
Apache Druid: The Foundation of Fortune 500 “Analytical Decision-Making"Apache Druid: The Foundation of Fortune 500 “Analytical Decision-Making"
Apache Druid: The Foundation of Fortune 500 “Analytical Decision-Making"
Rommel Garcia
 
Microservices Live
Microservices LiveMicroservices Live
Microservices Live
Data Driven Innovation
 
OSCON 2015
OSCON 2015OSCON 2015
OSCON 2015
Charles Smith
 
Big Graph Analytics on Neo4j with Apache Spark
Big Graph Analytics on Neo4j with Apache SparkBig Graph Analytics on Neo4j with Apache Spark
Big Graph Analytics on Neo4j with Apache Spark
Kenny Bastani
 
Introduction_OF_Hadoop_and_BigData
Introduction_OF_Hadoop_and_BigDataIntroduction_OF_Hadoop_and_BigData
Introduction_OF_Hadoop_and_BigData
Nilay Mishra
 
Non-Relational Databases: This hurts. I like it.
Non-Relational Databases: This hurts. I like it.Non-Relational Databases: This hurts. I like it.
Non-Relational Databases: This hurts. I like it.
Onyxfish
 
Strata London 16: sightseeing, venues, and friends
Strata  London 16: sightseeing, venues, and friendsStrata  London 16: sightseeing, venues, and friends
Strata London 16: sightseeing, venues, and friends
Natalino Busa
 
Blue Pill/Red Pill: The Matrix of Thousands of Data Streams
Blue Pill/Red Pill: The Matrix of Thousands of Data StreamsBlue Pill/Red Pill: The Matrix of Thousands of Data Streams
Blue Pill/Red Pill: The Matrix of Thousands of Data Streams
Databricks
 
Strata Online_road_to_enterprise_data_2011
Strata Online_road_to_enterprise_data_2011Strata Online_road_to_enterprise_data_2011
Strata Online_road_to_enterprise_data_2011
Lynn Langit
 

What's hot (20)

Final deck
Final deckFinal deck
Final deck
 
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
 
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
 
Big Data LDN 2016: Out of the Data Warehouses, and into the Data Lakes and St...
Big Data LDN 2016: Out of the Data Warehouses, and into the Data Lakes and St...Big Data LDN 2016: Out of the Data Warehouses, and into the Data Lakes and St...
Big Data LDN 2016: Out of the Data Warehouses, and into the Data Lakes and St...
 
Dataiku Flow and dctc - Berlin Buzzwords
Dataiku Flow and dctc - Berlin BuzzwordsDataiku Flow and dctc - Berlin Buzzwords
Dataiku Flow and dctc - Berlin Buzzwords
 
The of Operational Analytics Data Store
The of Operational Analytics Data StoreThe of Operational Analytics Data Store
The of Operational Analytics Data Store
 
Big Data - Part I
Big Data - Part IBig Data - Part I
Big Data - Part I
 
What's next for Big Data? -- Apache Spark
What's next for Big Data? -- Apache SparkWhat's next for Big Data? -- Apache Spark
What's next for Big Data? -- Apache Spark
 
Data Analytics and Processing at Snap - Druid Meetup LA - September 2018
Data Analytics and Processing at Snap - Druid Meetup LA - September 2018Data Analytics and Processing at Snap - Druid Meetup LA - September 2018
Data Analytics and Processing at Snap - Druid Meetup LA - September 2018
 
Self Service Analytics at Twitch
Self Service Analytics at TwitchSelf Service Analytics at Twitch
Self Service Analytics at Twitch
 
Clickstream & Social Media Analysis using Apache Spark
Clickstream & Social Media Analysis using Apache SparkClickstream & Social Media Analysis using Apache Spark
Clickstream & Social Media Analysis using Apache Spark
 
Apache Druid: The Foundation of Fortune 500 “Analytical Decision-Making"
Apache Druid: The Foundation of Fortune 500 “Analytical Decision-Making"Apache Druid: The Foundation of Fortune 500 “Analytical Decision-Making"
Apache Druid: The Foundation of Fortune 500 “Analytical Decision-Making"
 
Microservices Live
Microservices LiveMicroservices Live
Microservices Live
 
OSCON 2015
OSCON 2015OSCON 2015
OSCON 2015
 
Big Graph Analytics on Neo4j with Apache Spark
Big Graph Analytics on Neo4j with Apache SparkBig Graph Analytics on Neo4j with Apache Spark
Big Graph Analytics on Neo4j with Apache Spark
 
Introduction_OF_Hadoop_and_BigData
Introduction_OF_Hadoop_and_BigDataIntroduction_OF_Hadoop_and_BigData
Introduction_OF_Hadoop_and_BigData
 
Non-Relational Databases: This hurts. I like it.
Non-Relational Databases: This hurts. I like it.Non-Relational Databases: This hurts. I like it.
Non-Relational Databases: This hurts. I like it.
 
Strata London 16: sightseeing, venues, and friends
Strata  London 16: sightseeing, venues, and friendsStrata  London 16: sightseeing, venues, and friends
Strata London 16: sightseeing, venues, and friends
 
Blue Pill/Red Pill: The Matrix of Thousands of Data Streams
Blue Pill/Red Pill: The Matrix of Thousands of Data StreamsBlue Pill/Red Pill: The Matrix of Thousands of Data Streams
Blue Pill/Red Pill: The Matrix of Thousands of Data Streams
 
Strata Online_road_to_enterprise_data_2011
Strata Online_road_to_enterprise_data_2011Strata Online_road_to_enterprise_data_2011
Strata Online_road_to_enterprise_data_2011
 

Viewers also liked

Lunar lander - The Game
Lunar lander - The GameLunar lander - The Game
Lunar lander - The Game
Thiago Santos
 
Harvard mark1
Harvard mark1Harvard mark1
Harvard mark1
Arthur Beelen
 
Historyofcomputers
HistoryofcomputersHistoryofcomputers
Historyofcomputers
ca999
 
Unix.part1.history
Unix.part1.historyUnix.part1.history
Unix.part1.history
Kseniya Aleinikova
 
software History
software Historysoftware History
software History
Avinash Avi
 
Software Testing: History, Trends, Perspectives - a Brief Overview
Software Testing: History, Trends, Perspectives - a Brief OverviewSoftware Testing: History, Trends, Perspectives - a Brief Overview
Software Testing: History, Trends, Perspectives - a Brief Overview
Softheme
 
Advanced API Design: how an awesome API can help you make friends, get rich, ...
Advanced API Design: how an awesome API can help you make friends, get rich, ...Advanced API Design: how an awesome API can help you make friends, get rich, ...
Advanced API Design: how an awesome API can help you make friends, get rich, ...
Jonathan Dahl
 
Streaming Api Design with Akka, Scala and Spray
Streaming Api Design with Akka, Scala and SprayStreaming Api Design with Akka, Scala and Spray
Streaming Api Design with Akka, Scala and Spray
Natalino Busa
 
Computer Software introduction
Computer  Software introductionComputer  Software introduction
Computer Software introduction
faisalahmed2017
 
Animated Visualization of Software History Using Software Evolution Storyboards
Animated Visualization of Software History Using Software Evolution StoryboardsAnimated Visualization of Software History Using Software Evolution Storyboards
Animated Visualization of Software History Using Software Evolution Storyboards
SAIL_QU
 
Operating Systems: Network Management
Operating Systems: Network ManagementOperating Systems: Network Management
Operating Systems: Network Management
Damian T. Gordon
 
Operating Systems: A History of Linux
Operating Systems: A History of LinuxOperating Systems: A History of Linux
Operating Systems: A History of Linux
Damian T. Gordon
 

Viewers also liked (12)

Lunar lander - The Game
Lunar lander - The GameLunar lander - The Game
Lunar lander - The Game
 
Harvard mark1
Harvard mark1Harvard mark1
Harvard mark1
 
Historyofcomputers
HistoryofcomputersHistoryofcomputers
Historyofcomputers
 
Unix.part1.history
Unix.part1.historyUnix.part1.history
Unix.part1.history
 
software History
software Historysoftware History
software History
 
Software Testing: History, Trends, Perspectives - a Brief Overview
Software Testing: History, Trends, Perspectives - a Brief OverviewSoftware Testing: History, Trends, Perspectives - a Brief Overview
Software Testing: History, Trends, Perspectives - a Brief Overview
 
Advanced API Design: how an awesome API can help you make friends, get rich, ...
Advanced API Design: how an awesome API can help you make friends, get rich, ...Advanced API Design: how an awesome API can help you make friends, get rich, ...
Advanced API Design: how an awesome API can help you make friends, get rich, ...
 
Streaming Api Design with Akka, Scala and Spray
Streaming Api Design with Akka, Scala and SprayStreaming Api Design with Akka, Scala and Spray
Streaming Api Design with Akka, Scala and Spray
 
Computer Software introduction
Computer  Software introductionComputer  Software introduction
Computer Software introduction
 
Animated Visualization of Software History Using Software Evolution Storyboards
Animated Visualization of Software History Using Software Evolution StoryboardsAnimated Visualization of Software History Using Software Evolution Storyboards
Animated Visualization of Software History Using Software Evolution Storyboards
 
Operating Systems: Network Management
Operating Systems: Network ManagementOperating Systems: Network Management
Operating Systems: Network Management
 
Operating Systems: A History of Linux
Operating Systems: A History of LinuxOperating Systems: A History of Linux
Operating Systems: A History of Linux
 

Similar to Big and fast a quest for relevant and real-time analytics

Traitement d'événements
Traitement d'événementsTraitement d'événements
Traitement d'événements
Amazon Web Services
 
Big Data Use Cases and Solutions in the AWS Cloud
Big Data Use Cases and Solutions in the AWS CloudBig Data Use Cases and Solutions in the AWS Cloud
Big Data Use Cases and Solutions in the AWS Cloud
Amazon Web Services
 
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Altan Khendup
 
Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...
Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...
Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...
NoSQLmatters
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathBig Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Yahoo Developer Network
 
Big Data Architectural Patterns
Big Data Architectural PatternsBig Data Architectural Patterns
Big Data Architectural Patterns
Amazon Web Services
 
Big data business case
Big data   business caseBig data   business case
Big data business case
Karthik Padmanabhan ( MLE℠)
 
AWS APAC Webinar Week - Big Data on AWS. RedShift, EMR, & IOT
AWS APAC Webinar Week - Big Data on AWS. RedShift, EMR, & IOTAWS APAC Webinar Week - Big Data on AWS. RedShift, EMR, & IOT
AWS APAC Webinar Week - Big Data on AWS. RedShift, EMR, & IOT
Amazon Web Services
 
UnConference for Georgia Southern Computer Science March 31, 2015
UnConference for Georgia Southern Computer Science March 31, 2015UnConference for Georgia Southern Computer Science March 31, 2015
UnConference for Georgia Southern Computer Science March 31, 2015
Christopher Curtin
 
GraphLab Conference 2014 Keynote - Carlos Guestrin
GraphLab Conference 2014 Keynote - Carlos GuestrinGraphLab Conference 2014 Keynote - Carlos Guestrin
GraphLab Conference 2014 Keynote - Carlos Guestrin
Turi, Inc.
 
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Yahoo Developer Network
 
(ARC311) Decoding The Genetic Blueprint Of Life On A Cloud Ecosystem
(ARC311) Decoding The Genetic Blueprint Of Life On A Cloud Ecosystem(ARC311) Decoding The Genetic Blueprint Of Life On A Cloud Ecosystem
(ARC311) Decoding The Genetic Blueprint Of Life On A Cloud Ecosystem
Amazon Web Services
 
Redis and Bloom Filters - Atlanta Java Users Group 9/2014
Redis and Bloom Filters - Atlanta Java Users Group 9/2014Redis and Bloom Filters - Atlanta Java Users Group 9/2014
Redis and Bloom Filters - Atlanta Java Users Group 9/2014
Christopher Curtin
 
Big Data & Analytics - Innovating at the Speed of Light
Big Data & Analytics - Innovating at the Speed of LightBig Data & Analytics - Innovating at the Speed of Light
Big Data & Analytics - Innovating at the Speed of Light
Amazon Web Services LATAM
 
Scaling APIs: Predict, Prepare for, Overcome the Challenges
Scaling APIs: Predict, Prepare for, Overcome the ChallengesScaling APIs: Predict, Prepare for, Overcome the Challenges
Scaling APIs: Predict, Prepare for, Overcome the Challenges
Apigee | Google Cloud
 
ConFoo 2017: Introduction to performance optimization of .NET web apps
ConFoo 2017: Introduction to performance optimization of .NET web appsConFoo 2017: Introduction to performance optimization of .NET web apps
ConFoo 2017: Introduction to performance optimization of .NET web apps
Pierre-Luc Maheu
 
Kellogg XML Holland Speech
Kellogg XML Holland SpeechKellogg XML Holland Speech
Kellogg XML Holland Speech
Dave Kellogg
 
Need for Time series Database
Need for Time series DatabaseNeed for Time series Database
Need for Time series Database
Pramit Choudhary
 
Database and Analytics on the AWS Cloud
Database and Analytics on the AWS CloudDatabase and Analytics on the AWS Cloud
Database and Analytics on the AWS Cloud
Amazon Web Services
 
Big Data LDN 2017: Delivering Instant Experience with Redid Enterprise
Big Data LDN 2017: Delivering Instant Experience with Redid EnterpriseBig Data LDN 2017: Delivering Instant Experience with Redid Enterprise
Big Data LDN 2017: Delivering Instant Experience with Redid Enterprise
Matt Stubbs
 

Similar to Big and fast a quest for relevant and real-time analytics (20)

Traitement d'événements
Traitement d'événementsTraitement d'événements
Traitement d'événements
 
Big Data Use Cases and Solutions in the AWS Cloud
Big Data Use Cases and Solutions in the AWS CloudBig Data Use Cases and Solutions in the AWS Cloud
Big Data Use Cases and Solutions in the AWS Cloud
 
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
 
Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...
Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...
Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathBig Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
 
Big Data Architectural Patterns
Big Data Architectural PatternsBig Data Architectural Patterns
Big Data Architectural Patterns
 
Big data business case
Big data   business caseBig data   business case
Big data business case
 
AWS APAC Webinar Week - Big Data on AWS. RedShift, EMR, & IOT
AWS APAC Webinar Week - Big Data on AWS. RedShift, EMR, & IOTAWS APAC Webinar Week - Big Data on AWS. RedShift, EMR, & IOT
AWS APAC Webinar Week - Big Data on AWS. RedShift, EMR, & IOT
 
UnConference for Georgia Southern Computer Science March 31, 2015
UnConference for Georgia Southern Computer Science March 31, 2015UnConference for Georgia Southern Computer Science March 31, 2015
UnConference for Georgia Southern Computer Science March 31, 2015
 
GraphLab Conference 2014 Keynote - Carlos Guestrin
GraphLab Conference 2014 Keynote - Carlos GuestrinGraphLab Conference 2014 Keynote - Carlos Guestrin
GraphLab Conference 2014 Keynote - Carlos Guestrin
 
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
 
(ARC311) Decoding The Genetic Blueprint Of Life On A Cloud Ecosystem
(ARC311) Decoding The Genetic Blueprint Of Life On A Cloud Ecosystem(ARC311) Decoding The Genetic Blueprint Of Life On A Cloud Ecosystem
(ARC311) Decoding The Genetic Blueprint Of Life On A Cloud Ecosystem
 
Redis and Bloom Filters - Atlanta Java Users Group 9/2014
Redis and Bloom Filters - Atlanta Java Users Group 9/2014Redis and Bloom Filters - Atlanta Java Users Group 9/2014
Redis and Bloom Filters - Atlanta Java Users Group 9/2014
 
Big Data & Analytics - Innovating at the Speed of Light
Big Data & Analytics - Innovating at the Speed of LightBig Data & Analytics - Innovating at the Speed of Light
Big Data & Analytics - Innovating at the Speed of Light
 
Scaling APIs: Predict, Prepare for, Overcome the Challenges
Scaling APIs: Predict, Prepare for, Overcome the ChallengesScaling APIs: Predict, Prepare for, Overcome the Challenges
Scaling APIs: Predict, Prepare for, Overcome the Challenges
 
ConFoo 2017: Introduction to performance optimization of .NET web apps
ConFoo 2017: Introduction to performance optimization of .NET web appsConFoo 2017: Introduction to performance optimization of .NET web apps
ConFoo 2017: Introduction to performance optimization of .NET web apps
 
Kellogg XML Holland Speech
Kellogg XML Holland SpeechKellogg XML Holland Speech
Kellogg XML Holland Speech
 
Need for Time series Database
Need for Time series DatabaseNeed for Time series Database
Need for Time series Database
 
Database and Analytics on the AWS Cloud
Database and Analytics on the AWS CloudDatabase and Analytics on the AWS Cloud
Database and Analytics on the AWS Cloud
 
Big Data LDN 2017: Delivering Instant Experience with Redid Enterprise
Big Data LDN 2017: Delivering Instant Experience with Redid EnterpriseBig Data LDN 2017: Delivering Instant Experience with Redid Enterprise
Big Data LDN 2017: Delivering Instant Experience with Redid Enterprise
 

More from Natalino Busa

Data Production Pipelines: Legacy, practices, and innovation
Data Production Pipelines: Legacy, practices, and innovationData Production Pipelines: Legacy, practices, and innovation
Data Production Pipelines: Legacy, practices, and innovation
Natalino Busa
 
Data science apps powered by Jupyter Notebooks
Data science apps powered by Jupyter NotebooksData science apps powered by Jupyter Notebooks
Data science apps powered by Jupyter Notebooks
Natalino Busa
 
7 steps for highly effective deep neural networks
7 steps for highly effective deep neural networks7 steps for highly effective deep neural networks
7 steps for highly effective deep neural networks
Natalino Busa
 
Data science apps: beyond notebooks
Data science apps: beyond notebooksData science apps: beyond notebooks
Data science apps: beyond notebooks
Natalino Busa
 
[Ai in finance] AI in regulatory compliance, risk management, and auditing
[Ai in finance] AI in regulatory compliance, risk management, and auditing[Ai in finance] AI in regulatory compliance, risk management, and auditing
[Ai in finance] AI in regulatory compliance, risk management, and auditing
Natalino Busa
 
Data in Action
Data in ActionData in Action
Data in Action
Natalino Busa
 
Real-Time Anomaly Detection with Spark MLlib, Akka and Cassandra
Real-Time Anomaly Detection  with Spark MLlib, Akka and  CassandraReal-Time Anomaly Detection  with Spark MLlib, Akka and  Cassandra
Real-Time Anomaly Detection with Spark MLlib, Akka and Cassandra
Natalino Busa
 
The evolution of data analytics
The evolution of data analyticsThe evolution of data analytics
The evolution of data analytics
Natalino Busa
 
Hadoop + Cassandra: Fast queries on data lakes, and wikipedia search tutorial.
Hadoop + Cassandra: Fast queries on data lakes, and  wikipedia search tutorial.Hadoop + Cassandra: Fast queries on data lakes, and  wikipedia search tutorial.
Hadoop + Cassandra: Fast queries on data lakes, and wikipedia search tutorial.
Natalino Busa
 
Awesome Banking API's
Awesome Banking API'sAwesome Banking API's
Awesome Banking API's
Natalino Busa
 
Yo. big data. understanding data science in the era of big data.
Yo. big data. understanding data science in the era of big data.Yo. big data. understanding data science in the era of big data.
Yo. big data. understanding data science in the era of big data.
Natalino Busa
 
Big Data and APIs - a recon tour on how to successfully do Big Data analytics
Big Data and APIs - a recon tour on how to successfully do Big Data analyticsBig Data and APIs - a recon tour on how to successfully do Big Data analytics
Big Data and APIs - a recon tour on how to successfully do Big Data analytics
Natalino Busa
 
Strata 2014: Data science and big data trending topics
Strata 2014: Data science and big data trending topicsStrata 2014: Data science and big data trending topics
Strata 2014: Data science and big data trending topics
Natalino Busa
 
Streaming computing: architectures, and tchnologies
Streaming computing: architectures, and tchnologiesStreaming computing: architectures, and tchnologies
Streaming computing: architectures, and tchnologies
Natalino Busa
 
Big data landscape
Big data landscapeBig data landscape
Big data landscape
Natalino Busa
 

More from Natalino Busa (15)

Data Production Pipelines: Legacy, practices, and innovation
Data Production Pipelines: Legacy, practices, and innovationData Production Pipelines: Legacy, practices, and innovation
Data Production Pipelines: Legacy, practices, and innovation
 
Data science apps powered by Jupyter Notebooks
Data science apps powered by Jupyter NotebooksData science apps powered by Jupyter Notebooks
Data science apps powered by Jupyter Notebooks
 
7 steps for highly effective deep neural networks
7 steps for highly effective deep neural networks7 steps for highly effective deep neural networks
7 steps for highly effective deep neural networks
 
Data science apps: beyond notebooks
Data science apps: beyond notebooksData science apps: beyond notebooks
Data science apps: beyond notebooks
 
[Ai in finance] AI in regulatory compliance, risk management, and auditing
[Ai in finance] AI in regulatory compliance, risk management, and auditing[Ai in finance] AI in regulatory compliance, risk management, and auditing
[Ai in finance] AI in regulatory compliance, risk management, and auditing
 
Data in Action
Data in ActionData in Action
Data in Action
 
Real-Time Anomaly Detection with Spark MLlib, Akka and Cassandra
Real-Time Anomaly Detection  with Spark MLlib, Akka and  CassandraReal-Time Anomaly Detection  with Spark MLlib, Akka and  Cassandra
Real-Time Anomaly Detection with Spark MLlib, Akka and Cassandra
 
The evolution of data analytics
The evolution of data analyticsThe evolution of data analytics
The evolution of data analytics
 
Hadoop + Cassandra: Fast queries on data lakes, and wikipedia search tutorial.
Hadoop + Cassandra: Fast queries on data lakes, and  wikipedia search tutorial.Hadoop + Cassandra: Fast queries on data lakes, and  wikipedia search tutorial.
Hadoop + Cassandra: Fast queries on data lakes, and wikipedia search tutorial.
 
Awesome Banking API's
Awesome Banking API'sAwesome Banking API's
Awesome Banking API's
 
Yo. big data. understanding data science in the era of big data.
Yo. big data. understanding data science in the era of big data.Yo. big data. understanding data science in the era of big data.
Yo. big data. understanding data science in the era of big data.
 
Big Data and APIs - a recon tour on how to successfully do Big Data analytics
Big Data and APIs - a recon tour on how to successfully do Big Data analyticsBig Data and APIs - a recon tour on how to successfully do Big Data analytics
Big Data and APIs - a recon tour on how to successfully do Big Data analytics
 
Strata 2014: Data science and big data trending topics
Strata 2014: Data science and big data trending topicsStrata 2014: Data science and big data trending topics
Strata 2014: Data science and big data trending topics
 
Streaming computing: architectures, and tchnologies
Streaming computing: architectures, and tchnologiesStreaming computing: architectures, and tchnologies
Streaming computing: architectures, and tchnologies
 
Big data landscape
Big data landscapeBig data landscape
Big data landscape
 



Big and Fast: A quest for relevant and real-time analytics

  • 1. Big & Fast: A quest for relevant and real-time analytics Natalino Busa @natalinobusa
  • 2. Parallelism Mathematics Programming Languages Machine Learning Statistics Big Data Algorithms Cloud Computing Natalino Busa @natalinobusa www.natalinobusa.com
  • 3. Big and Fast. Methodology Architecture Roles and organization
  • 4. Conversion is the ultimate form of permission marketing. Permission marketing is about the honour of being heard. How to earn it? Provide the right suggestions, at the right time. This is what makes data analysis valuable.
  • 5. When do you really know your customer? Know about the last unique: 5 songs? 100 songs? 10,000 songs?
  • 6. Old & New stuff. We evolve slowly: our personality, our habits. But events and trends can affect us at short notice. How do you combine old with new?
  • 7. The customer’s context. Complex on many dimensions. Personal history: number of transactions ever done. Long-term interaction: how the user's actions correlate with those of others. Real-time events: trends and recent events.
  • 8. The customer’s context. Context is related to time. Slow changing: the defining characteristics of a person. Fast changing: events which influence our lives, trends. These require very different technology solutions!
  • 9. Challenges. millions of billions of. Not much time to react: the window of opportunity is sometimes just a few seconds. Load of information to process: you want to understand the user history well.
  • 10. Slow and fast. Ranking and preference analysis; segmentation and clustering; short-term trending topics; rule-based recommendations. Tens of terabytes of data: this can take hours. Hundreds of events per second: this must be fast.
  • 11. Hadoop: Distributed Data OS. Reliable: distributed, replicated file system. Low cost: ↓ cost vs ↑ performance/storage. Computing powerhouse: all the cluster's CPUs working in parallel to run queries.
  • 12. Scala / Akka / Spray: a reactive web API framework. [Diagram: Actor A, Actor B and Actor C exchanging msg 1 to msg 4.] It scales horizontally (can run in cluster mode). Maximum use of the available cores/memory. 1. Processing is non-blocking, threads are re-used. 2. Can parallelize computing power across many actors. Very fast: 1000s of messages/sec. Very reliable: auto recovery.
  • 13. Distributed computing: lambda architecture. [Diagram labels: Batch Computing; Streaming Computing; Data Warehouses; Messaging Buses; In-Memory Distributed Database; HTTP RESTful API.] Lambda architecture: batch + streaming. In-memory distributed DBs. Low-latency web API services.
  • 14. Distributed computing: some techs. Hadoop, Cassandra. millions of billions of λ = conversions (lambda)
  • 15. All Things Distributed. Distributing computing and storage: more machines = more storage/computing. Open source software solutions: mature enough for pragmatic adopters. Near-real-time + big data technologies: Hadoop, Scala, Akka, Spray, Cassandra.
  • 16. Science & Engineering. Statistics, Data Science: Python, R, Visualization. IT Infra, Big Data: Java, Scala, SQL. Hadoop: Big Data infrastructure, Data Science on large datasets. Big Data and Fast Data require different profiles to achieve the best results.
  • 17. Parallelism Mathematics Programming Languages Machine Learning Statistics Big Data Algorithms Cloud Computing Natalino Busa @natalinobusa www.natalinobusa.com Thanks! Any questions?
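
The lambda architecture on slide 13 pairs a slow batch layer with a fast streaming layer behind a low-latency API, which is one way to "combine old with new" (slide 6). Below is a minimal Python sketch of that idea, not the speaker's implementation. The class `LambdaViews`, its method names and the song play-count example are all hypothetical, and plain in-memory counters stand in for Hadoop, the messaging bus and the in-memory distributed database.

```python
from collections import Counter


class LambdaViews:
    """Toy lambda architecture: a batch view plus a real-time (speed) view.

    All names here are illustrative assumptions, not part of the talk.
    """

    def __init__(self):
        self.master_log = []           # hypothetical append-only event log
        self.batch_view = Counter()    # rebuilt periodically from the full log
        self.speed_view = Counter()    # updated incrementally, per event

    def ingest(self, user, song):
        """Fast path: store the event and update the speed view right away."""
        self.master_log.append((user, song))
        self.speed_view[(user, song)] += 1

    def run_batch(self):
        """Slow path: recompute the batch view from the whole log, then
        clear the speed view, because the batch view now covers its events."""
        self.batch_view = Counter(self.master_log)
        self.speed_view.clear()

    def play_count(self, user, song):
        """Serving layer: merge the batch and speed views at query time."""
        key = (user, song)
        return self.batch_view[key] + self.speed_view[key]


views = LambdaViews()
views.ingest("alice", "song-1")
views.ingest("alice", "song-1")
views.run_batch()                  # batch view now holds both plays
views.ingest("alice", "song-2")    # visible immediately via the speed view
views.ingest("alice", "song-1")
print(views.play_count("alice", "song-1"))  # 2 from batch + 1 from speed = 3
print(views.play_count("alice", "song-2"))  # 0 from batch + 1 from speed = 1
```

In a real deployment the batch recomputation could take hours over terabytes of history, so the speed view is what keeps query results current in the meantime; the sketch only shows how the two views are merged when a query arrives.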