SlideShare a Scribd company logo
Lambda Architecture
in Real-time Big Data
● Concepts & Techniques “Thinking with Lambda”
● Case studies in Practice
Trieu Nguyen - http://nguyentantrieu.info/blog or @tantrieuf31
Lead Engineer at eClick Data Analytics team at FPT Online
All contents and thoughts in this slide are my subjective ideas and compiled from Open Source
Communities
Just a little introduction
● 2008 Java Developer, developed Social
Trading Network for a small startup (Yopco)
● 2011 worked at FPT Online, software engineer
in Banbe Project, Restful API for VnExpress
Mobile App
● 2012 joined Greengar Studio in 6 months,
scaling backend API mobile games (iOS, Android)
● 2013 back to FPT Online, R&D about Big Data
& Analytics, developing the new core
Analytics Platform (on JVM Platform)
Stupid questions
● Big Data means big logs storage ?
● I just installed Hadoop, and it works! Do we
really get a big data solution ?
● We have lots data, so let’s play with cool big
data technologies x,y, z! Do we get profits
from that ?
● We can hire or outsource a professional
team to build big data solution, but do they
answer what problem we get ?
Contents for this talk
● A little introduction about Lambda in history
● Trends of Now and the Future
● Why lambda architecture is correct solution
for big data?
● Lambda in Practice, case studies from
Greengar Studios and eClick
● Lessons
● Questions & Answers
History
The best way to predict the future is
looking at the past and now ?
Lambda is the symbol to denote:
● Half-life game ?
● Anonymous function, aka: Closure ?
● functional computation/programming?
● scalable system ?
http://en.wikipedia.org/wiki/Lambda
When I study “lambda” ?
I studied Haskell in 2007 with Dr.Peter Gammie http://peteg.org/ when
internship at DRD (a non-profit organization).
● Imperative programs will always be vulnerable to data races because
they contain mutable variables.
● There are no data races in purely functional languages because they
don't have mutable variables.
http://stackoverflow.com/questions/6087834/how-
scalable-is-mapreduce-in-the-original-functional-
languages
How did Google scale their search engine ?
How does Hadoop really work ?
The Closure in JavaScript,
running by billion websites
!
(Lambda) is everywhere !
Trends of Now and the Future
● Big Data
● Data Analytics
● Reactive Programming
● Functional Programming
● Streaming Computation
=> All just the special cases of Lambda
Question:
Is mobile app
generating more
data than
traditional web ?
Question:
Is the Open Source Big Data Solution like Hadoop, that makes big
data more popular to enterprises and startups ?
2009, a big-data startup, Cloudera was founded !
What is the λ
(Lambda)
Architecture ?
the Lambda Architecture:
● apply the (λ) Lambda philosophy in designing big data
system
● equation “query = function(all data)” which is the basis
of all data systems
● proposed by Nathan Marz (http://nathanmarz.com/), a
software engineer from Twitter in his “Big Data” book.
● is based on three main design principles:
○ human fault-tolerance – the system is unsusceptible to data loss or
data corruption because at scale it could be irreparable. (BUGS ?)
○ data immutability – store data in it’s rawest form immutable and for
perpetuity. (INSERT/ SELECT/DELETE but no UPDATE !)
○ recomputation – with the two principles above it is always possible to
(re)-compute results by running a function on the raw data.
“lambda architecture”
proposed by @nathanmarz
We, at FPT Online, have applied
the lambda architecture since
April, 2013
Lambda In Practice
2 case studies from my experiences
Case Study 1:
Greengar Studios
API Backend Monitor + Statistics
http://www.greengar.com/
Backend System at Greengar Studio
I applied
“Lambda”
here
The data and the size, not too big for a small
startup!
Where is the lambda ?
I used Groovy + GPars (Groovy Parallel Systems) + MongoDB for fast
parallel computation (actor model) on statistical data
http://gpars.codehaus.org/
The GPars framework offers Java developers intuitive and safe ways to handle
Java or Groovy tasks concurrently.
Support:
● Dataflow concurrency
● Actor programming model
● CSP
● Agent - an thread-safe reference to mutable state
● Concurrent collection processing
● Composable asynchronous functions
● Fork/Join
● STM (Software Transactional Memory)
Mobile Apps => Backend APIs =>
Statistics => Find the Trends & Insights?
Case Study 2:
eClick Ad-Network
● Real-time Data Analytics
● Monitoring Stream Data (Reactive)
http://eclick.vn
at eClick we have
30~40 GB Logs in Stream
10~20 GB Bandwidth
just for tracking user
actions (click,
impression,...)
in ONE day !
at eClick we must
check campaigns in
near-real-time
(seconds) !
at eClick we have many types of log (video, web,
mobile, system logs, ad-campaign, articles, … )
Our big-data system
Leverage Open Source Projects
● Netty (http://netty.io/) a framework using reactive programming
pattern for scaling HTTP system easier
● Kafka (http://kafka.apache.org/) a publish-subscribe messaging
rethought as a distributed commit log.
● Storm (http://storm-project.net/) a framework for distributed
realtime computation system.
● Redis (http://redis.io/) a advanced key-value in-memory NoSQL
database, all fast statistical computations in here.
● Groovy for scripting layer, dynamic query on Redis + RDBMSs
● Hadoop ecosystem: HDFS, Hive, HBase for batch processing
● RxJava https://github.com/Netflix/RxJava a library for
composing asynchronous and event-based programs
Some new ideas for the future:
Connecting the active functor pattern + reactive programming
+ stream computation + in-memory computing to make:
● real-time data analytics easier
● better recommendation system
● build more profitable big data solutions
More Information:
● http://activefunctor.blogspot.com/ (a special case of Lambda
that actively search best connections to form optimal
topology) - from ideas when internship at DRD with my
advisor.
● Can a function be persistent (stored as data), distributed in
a cluster (cloud), reactive to right data (best value in
network)?
We can't solve problems
by using the same kind of
thinking we used when we
created them.
Albert Einstein
Think more Lambda and Reactive
How could we see "user interest graph" in our user's database ?
● Social Graph
=> Keep the connection
● Interest Graph
=> Make new connection
=> recommendation
platform
Source: http://en.wikipedia.org/wiki/Interest_graph
Lessons
What I have learned from Lambda and Big
Data World
What I have learned
● Keep it as simple as possible, but no simpler !
● Ask right questions=> deep analytics=>Profit
● Reactive and Lambda for your data products
● Implement it! Just right tools for right jobs.
● Turn your data into the things everyone can
"look & feel"
How to build profitable big data solutions?
=> read these Behavioral Economics Books
http://www.goodreads.com/shelf/show/behavioral-economics
Stay focused, keep innovating
Big Data is not profitable if you do not know
what you want and ask right questions
“Logic will get you from A to Z;
imagination will get you
everywhere.” - Albert Einstein
Use your imaginationwith data analytics, not
just logic
Lambda architecture for real time big data

More Related Content

What's hot

Big data and Hadoop
Big data and HadoopBig data and Hadoop
Big data and Hadoop
Rahul Agarwal
 
Hadoop YARN
Hadoop YARNHadoop YARN
Hadoop YARN
Vigen Sahakyan
 
PPT on Hadoop
PPT on HadoopPPT on Hadoop
PPT on Hadoop
Shubham Parmar
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
Prashant Gupta
 
Introduction to PySpark
Introduction to PySparkIntroduction to PySpark
Introduction to PySpark
Russell Jurney
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
Saurav Haloi
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to ShardingMongoDB
 
Cloud File System with GFS and HDFS
Cloud File System with GFS and HDFS  Cloud File System with GFS and HDFS
Cloud File System with GFS and HDFS
Dr Neelesh Jain
 
Document Database
Document DatabaseDocument Database
Document Database
Heman Hosainpana
 
Fraud and Risk in Big Data
Fraud and Risk in Big DataFraud and Risk in Big Data
Fraud and Risk in Big Data
Umma Khatuna Jannat
 
Apache Zookeeper
Apache ZookeeperApache Zookeeper
Apache Zookeeper
Nguyen Quang
 
Introduction to hadoop
Introduction to hadoopIntroduction to hadoop
Introduction to hadoop
karthika karthi
 
Intro to HBase
Intro to HBaseIntro to HBase
Intro to HBase
alexbaranau
 
Unit-3_BDA.ppt
Unit-3_BDA.pptUnit-3_BDA.ppt
Unit-3_BDA.ppt
PoojaShah174393
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
Simplilearn
 
Présentation de Apache Zookeeper
Présentation de Apache ZookeeperPrésentation de Apache Zookeeper
Présentation de Apache Zookeeper
Michaël Morello
 
MongoDB- Crud Operation
MongoDB- Crud OperationMongoDB- Crud Operation
MongoDB- Crud OperationEdureka!
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
EMC
 

What's hot (20)

Hadoop Map Reduce
Hadoop Map ReduceHadoop Map Reduce
Hadoop Map Reduce
 
Big data and Hadoop
Big data and HadoopBig data and Hadoop
Big data and Hadoop
 
Hadoop YARN
Hadoop YARNHadoop YARN
Hadoop YARN
 
PPT on Hadoop
PPT on HadoopPPT on Hadoop
PPT on Hadoop
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Introduction to PySpark
Introduction to PySparkIntroduction to PySpark
Introduction to PySpark
 
Hadoop
HadoopHadoop
Hadoop
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
 
Introduction to Sharding
Introduction to ShardingIntroduction to Sharding
Introduction to Sharding
 
Cloud File System with GFS and HDFS
Cloud File System with GFS and HDFS  Cloud File System with GFS and HDFS
Cloud File System with GFS and HDFS
 
Document Database
Document DatabaseDocument Database
Document Database
 
Fraud and Risk in Big Data
Fraud and Risk in Big DataFraud and Risk in Big Data
Fraud and Risk in Big Data
 
Apache Zookeeper
Apache ZookeeperApache Zookeeper
Apache Zookeeper
 
Introduction to hadoop
Introduction to hadoopIntroduction to hadoop
Introduction to hadoop
 
Intro to HBase
Intro to HBaseIntro to HBase
Intro to HBase
 
Unit-3_BDA.ppt
Unit-3_BDA.pptUnit-3_BDA.ppt
Unit-3_BDA.ppt
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
 
Présentation de Apache Zookeeper
Présentation de Apache ZookeeperPrésentation de Apache Zookeeper
Présentation de Apache Zookeeper
 
MongoDB- Crud Operation
MongoDB- Crud OperationMongoDB- Crud Operation
MongoDB- Crud Operation
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
 

Viewers also liked

Big Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in ActionBig Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in Action
Guido Schmutz
 
Implementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache SparkImplementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache Spark
DataWorks Summit
 
Runaway complexity in Big Data... and a plan to stop it
Runaway complexity in Big Data... and a plan to stop itRunaway complexity in Big Data... and a plan to stop it
Runaway complexity in Big Data... and a plan to stop itnathanmarz
 
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Helena Edelson
 
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, ScalaLambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Helena Edelson
 
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...Nathan Bijnens
 
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
SoftServe
 
Webinar: Enterprise Data Management in the Era of MongoDB and Data Lakes
Webinar: Enterprise Data Management in the Era of MongoDB and Data LakesWebinar: Enterprise Data Management in the Era of MongoDB and Data Lakes
Webinar: Enterprise Data Management in the Era of MongoDB and Data Lakes
MongoDB
 
Lambda Architecture Using SQL
Lambda Architecture Using SQLLambda Architecture Using SQL
Lambda Architecture Using SQL
SATOSHI TAGOMORI
 
[USI] Lambda-Architecture : comment réconcilier BigData et temps-réel
[USI] Lambda-Architecture : comment réconcilier BigData et temps-réel[USI] Lambda-Architecture : comment réconcilier BigData et temps-réel
[USI] Lambda-Architecture : comment réconcilier BigData et temps-réel
Mathieu DESPRIEE
 
A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013Nathan Bijnens
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
Guido Schmutz
 
Lambda Architecture 2.0 for Reactive AB Testing
Lambda Architecture 2.0 for Reactive AB TestingLambda Architecture 2.0 for Reactive AB Testing
Lambda Architecture 2.0 for Reactive AB Testing
Trieu Nguyen
 
Big data real time architectures
Big data real time architecturesBig data real time architectures
Big data real time architectures
Daniel Marcous
 
Zeta Architecture: The Next Generation Big Data Architecture
Zeta Architecture: The Next Generation Big Data ArchitectureZeta Architecture: The Next Generation Big Data Architecture
Zeta Architecture: The Next Generation Big Data Architecture
MapR Technologies
 
Big Data & Analytics Architecture
Big Data & Analytics ArchitectureBig Data & Analytics Architecture
Big Data & Analytics Architecture
Arvind Sathi
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Hortonworks
 
Big Data Revolution: Are You Ready for the Data Overload?
Big Data Revolution: Are You Ready for the Data Overload?Big Data Revolution: Are You Ready for the Data Overload?
Big Data Revolution: Are You Ready for the Data Overload?
Aleah Radovich
 
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLA
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLARiot Games Scalable Data Warehouse Lecture at UCSB / UCLA
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLA
sean_seannery
 
5 Factors Impacting Your Big Data Project's Performance
5 Factors Impacting Your Big Data Project's Performance 5 Factors Impacting Your Big Data Project's Performance
5 Factors Impacting Your Big Data Project's Performance
Qubole
 

Viewers also liked (20)

Big Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in ActionBig Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in Action
 
Implementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache SparkImplementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache Spark
 
Runaway complexity in Big Data... and a plan to stop it
Runaway complexity in Big Data... and a plan to stop itRunaway complexity in Big Data... and a plan to stop it
Runaway complexity in Big Data... and a plan to stop it
 
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
 
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, ScalaLambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
 
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...
 
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
 
Webinar: Enterprise Data Management in the Era of MongoDB and Data Lakes
Webinar: Enterprise Data Management in the Era of MongoDB and Data LakesWebinar: Enterprise Data Management in the Era of MongoDB and Data Lakes
Webinar: Enterprise Data Management in the Era of MongoDB and Data Lakes
 
Lambda Architecture Using SQL
Lambda Architecture Using SQLLambda Architecture Using SQL
Lambda Architecture Using SQL
 
[USI] Lambda-Architecture : comment réconcilier BigData et temps-réel
[USI] Lambda-Architecture : comment réconcilier BigData et temps-réel[USI] Lambda-Architecture : comment réconcilier BigData et temps-réel
[USI] Lambda-Architecture : comment réconcilier BigData et temps-réel
 
A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
 
Lambda Architecture 2.0 for Reactive AB Testing
Lambda Architecture 2.0 for Reactive AB TestingLambda Architecture 2.0 for Reactive AB Testing
Lambda Architecture 2.0 for Reactive AB Testing
 
Big data real time architectures
Big data real time architecturesBig data real time architectures
Big data real time architectures
 
Zeta Architecture: The Next Generation Big Data Architecture
Zeta Architecture: The Next Generation Big Data ArchitectureZeta Architecture: The Next Generation Big Data Architecture
Zeta Architecture: The Next Generation Big Data Architecture
 
Big Data & Analytics Architecture
Big Data & Analytics ArchitectureBig Data & Analytics Architecture
Big Data & Analytics Architecture
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
 
Big Data Revolution: Are You Ready for the Data Overload?
Big Data Revolution: Are You Ready for the Data Overload?Big Data Revolution: Are You Ready for the Data Overload?
Big Data Revolution: Are You Ready for the Data Overload?
 
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLA
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLARiot Games Scalable Data Warehouse Lecture at UCSB / UCLA
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLA
 
5 Factors Impacting Your Big Data Project's Performance
5 Factors Impacting Your Big Data Project's Performance 5 Factors Impacting Your Big Data Project's Performance
5 Factors Impacting Your Big Data Project's Performance
 

Similar to Lambda architecture for real time big data

Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big data
Trieu Nguyen
 
Choosing the Right Database - Facebook DevC Malang Hackdays 2017
Choosing the Right Database - Facebook DevC Malang Hackdays 2017Choosing the Right Database - Facebook DevC Malang Hackdays 2017
Choosing the Right Database - Facebook DevC Malang Hackdays 2017
Rendy Bambang Junior
 
London atlassian meetup 31 jan 2016 jira metrics-extract slides
London atlassian meetup 31 jan 2016 jira metrics-extract slidesLondon atlassian meetup 31 jan 2016 jira metrics-extract slides
London atlassian meetup 31 jan 2016 jira metrics-extract slides
Rudiger Wolf
 
Building Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineBuilding Reactive Real-time Data Pipeline
Building Reactive Real-time Data Pipeline
Trieu Nguyen
 
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...
Imam Raza
 
Career opportunities in open source framework
Career opportunities in open source frameworkCareer opportunities in open source framework
Career opportunities in open source framework
edunextgen
 
Career opportunities in open source framework
Career opportunities in open source framework Career opportunities in open source framework
Career opportunities in open source framework
edunextgen
 
UX Analytics for Data-driven Product Development
UX Analytics for Data-driven Product DevelopmentUX Analytics for Data-driven Product Development
UX Analytics for Data-driven Product Development
Trieu Nguyen
 
Top 10 Data analytics tools to look for in 2021
Top 10 Data analytics tools to look for in 2021Top 10 Data analytics tools to look for in 2021
Top 10 Data analytics tools to look for in 2021
Mobcoder
 
2013 - Yhat - YC app.pdf
2013 - Yhat - YC app.pdf2013 - Yhat - YC app.pdf
2013 - Yhat - YC app.pdf
Austin Ogilvie
 
Serverless Computing with Google Cloud
Serverless Computing with Google CloudServerless Computing with Google Cloud
Serverless Computing with Google Cloud
wesley chun
 
Building Data Products with Python (Georgetown)
Building Data Products with Python (Georgetown)Building Data Products with Python (Georgetown)
Building Data Products with Python (Georgetown)
Benjamin Bengfort
 
Making Machine Learning Easy with H2O and WebFlux
Making Machine Learning Easy with H2O and WebFluxMaking Machine Learning Easy with H2O and WebFlux
Making Machine Learning Easy with H2O and WebFlux
Trayan Iliev
 
How to build and run a big data platform in the 21st century
How to build and run a big data platform in the 21st centuryHow to build and run a big data platform in the 21st century
How to build and run a big data platform in the 21st century
Ali Dasdan
 
Modern Thinking área digital MSKM 21/09/2017
Modern Thinking área digital MSKM 21/09/2017Modern Thinking área digital MSKM 21/09/2017
Modern Thinking área digital MSKM 21/09/2017
MSMK - Madrid School of Marketing
 
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
Trivadis
 
MongoDB World 2018: Building Intelligent Apps with MongoDB & Google Cloud
MongoDB World 2018: Building Intelligent Apps with MongoDB & Google CloudMongoDB World 2018: Building Intelligent Apps with MongoDB & Google Cloud
MongoDB World 2018: Building Intelligent Apps with MongoDB & Google Cloud
MongoDB
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
Ramiro Aduviri Velasco
 
Stop making tools! Nobody likes them anyway...
Stop making tools! Nobody likes them anyway...Stop making tools! Nobody likes them anyway...
Stop making tools! Nobody likes them anyway...
Christophe Guéret
 
System design for Web Application
System design for Web ApplicationSystem design for Web Application
System design for Web Application
Michael Choi
 

Similar to Lambda architecture for real time big data (20)

Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big data
 
Choosing the Right Database - Facebook DevC Malang Hackdays 2017
Choosing the Right Database - Facebook DevC Malang Hackdays 2017Choosing the Right Database - Facebook DevC Malang Hackdays 2017
Choosing the Right Database - Facebook DevC Malang Hackdays 2017
 
London atlassian meetup 31 jan 2016 jira metrics-extract slides
London atlassian meetup 31 jan 2016 jira metrics-extract slidesLondon atlassian meetup 31 jan 2016 jira metrics-extract slides
London atlassian meetup 31 jan 2016 jira metrics-extract slides
 
Building Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineBuilding Reactive Real-time Data Pipeline
Building Reactive Real-time Data Pipeline
 
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...
 
Career opportunities in open source framework
Career opportunities in open source frameworkCareer opportunities in open source framework
Career opportunities in open source framework
 
Career opportunities in open source framework
Career opportunities in open source framework Career opportunities in open source framework
Career opportunities in open source framework
 
UX Analytics for Data-driven Product Development
UX Analytics for Data-driven Product DevelopmentUX Analytics for Data-driven Product Development
UX Analytics for Data-driven Product Development
 
Top 10 Data analytics tools to look for in 2021
Top 10 Data analytics tools to look for in 2021Top 10 Data analytics tools to look for in 2021
Top 10 Data analytics tools to look for in 2021
 
2013 - Yhat - YC app.pdf
2013 - Yhat - YC app.pdf2013 - Yhat - YC app.pdf
2013 - Yhat - YC app.pdf
 
Serverless Computing with Google Cloud
Serverless Computing with Google CloudServerless Computing with Google Cloud
Serverless Computing with Google Cloud
 
Building Data Products with Python (Georgetown)
Building Data Products with Python (Georgetown)Building Data Products with Python (Georgetown)
Building Data Products with Python (Georgetown)
 
Making Machine Learning Easy with H2O and WebFlux
Making Machine Learning Easy with H2O and WebFluxMaking Machine Learning Easy with H2O and WebFlux
Making Machine Learning Easy with H2O and WebFlux
 
How to build and run a big data platform in the 21st century
How to build and run a big data platform in the 21st centuryHow to build and run a big data platform in the 21st century
How to build and run a big data platform in the 21st century
 
Modern Thinking área digital MSKM 21/09/2017
Modern Thinking área digital MSKM 21/09/2017Modern Thinking área digital MSKM 21/09/2017
Modern Thinking área digital MSKM 21/09/2017
 
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
 
MongoDB World 2018: Building Intelligent Apps with MongoDB & Google Cloud
MongoDB World 2018: Building Intelligent Apps with MongoDB & Google CloudMongoDB World 2018: Building Intelligent Apps with MongoDB & Google Cloud
MongoDB World 2018: Building Intelligent Apps with MongoDB & Google Cloud
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Stop making tools! Nobody likes them anyway...
Stop making tools! Nobody likes them anyway...Stop making tools! Nobody likes them anyway...
Stop making tools! Nobody likes them anyway...
 
System design for Web Application
System design for Web ApplicationSystem design for Web Application
System design for Web Application
 

More from Trieu Nguyen

Building Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdfBuilding Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Trieu Nguyen
 
Building Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel BusinessBuilding Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Trieu Nguyen
 
Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP
Trieu Nguyen
 
How to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDPHow to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDP
Trieu Nguyen
 
[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP
Trieu Nguyen
 
Leo CDP - Pitch Deck
Leo CDP - Pitch DeckLeo CDP - Pitch Deck
Leo CDP - Pitch Deck
Trieu Nguyen
 
LEO CDP - What's new in 2022
LEO CDP  - What's new in 2022LEO CDP  - What's new in 2022
LEO CDP - What's new in 2022
Trieu Nguyen
 
Lộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sảnLộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sản
Trieu Nguyen
 
Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?
Trieu Nguyen
 
From Dataism to Customer Data Platform
From Dataism to Customer Data PlatformFrom Dataism to Customer Data Platform
From Dataism to Customer Data Platform
Trieu Nguyen
 
Data collection, processing & organization with USPA framework
Data collection, processing & organization with USPA frameworkData collection, processing & organization with USPA framework
Data collection, processing & organization with USPA framework
Trieu Nguyen
 
Part 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technologyPart 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technology
Trieu Nguyen
 
Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?
Trieu Nguyen
 
How to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation PlatformHow to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation Platform
Trieu Nguyen
 
How to grow your business in the age of digital marketing 4.0
How to grow your business  in the age of digital marketing 4.0How to grow your business  in the age of digital marketing 4.0
How to grow your business in the age of digital marketing 4.0
Trieu Nguyen
 
Video Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big dataVideo Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big data
Trieu Nguyen
 
Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)
Trieu Nguyen
 
Open OTT - Video Content Platform
Open OTT - Video Content PlatformOpen OTT - Video Content Platform
Open OTT - Video Content Platform
Trieu Nguyen
 
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data AnalysisApache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Trieu Nguyen
 
Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)
Trieu Nguyen
 

More from Trieu Nguyen (20)

Building Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdfBuilding Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdf
 
Building Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel BusinessBuilding Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel Business
 
Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP
 
How to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDPHow to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDP
 
[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP
 
Leo CDP - Pitch Deck
Leo CDP - Pitch DeckLeo CDP - Pitch Deck
Leo CDP - Pitch Deck
 
LEO CDP - What's new in 2022
LEO CDP  - What's new in 2022LEO CDP  - What's new in 2022
LEO CDP - What's new in 2022
 
Lộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sảnLộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sản
 
Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?
 
From Dataism to Customer Data Platform
From Dataism to Customer Data PlatformFrom Dataism to Customer Data Platform
From Dataism to Customer Data Platform
 
Data collection, processing & organization with USPA framework
Data collection, processing & organization with USPA frameworkData collection, processing & organization with USPA framework
Data collection, processing & organization with USPA framework
 
Part 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technologyPart 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technology
 
Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?
 
How to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation PlatformHow to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation Platform
 
How to grow your business in the age of digital marketing 4.0
How to grow your business  in the age of digital marketing 4.0How to grow your business  in the age of digital marketing 4.0
How to grow your business in the age of digital marketing 4.0
 
Video Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big dataVideo Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big data
 
Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)
 
Open OTT - Video Content Platform
Open OTT - Video Content PlatformOpen OTT - Video Content Platform
Open OTT - Video Content Platform
 
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data AnalysisApache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
 
Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)
 

Recently uploaded

The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
g2nightmarescribd
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 

Recently uploaded (20)

The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 

Lambda architecture for real time big data

  • 1. Lambda Architecture in Real-time Big Data ● Concepts & Techniques “Thinking with Lambda” ● Case studies in Practice Trieu Nguyen - http://nguyentantrieu.info/blog or @tantrieuf31 Lead Engineer at eClick Data Analytics team at FPT Online All contents and thoughts in this slide are my subjective ideas and compiled from Open Source Communities
  • 2. Just a little introduction ● 2008 Java Developer, developed Social Trading Network for a small startup (Yopco) ● 2011 worked at FPT Online, software engineer in Banbe Project, Restful API for VnExpress Mobile App ● 2012 joined Greengar Studio in 6 months, scaling backend API mobile games (iOS, Android) ● 2013 back to FPT Online, R&D about Big Data & Analytics, developing the new core Analytics Platform (on JVM Platform)
  • 3. Stupid questions ● Big Data means big logs storage ? ● I just installed Hadoop, and it works! Do we really get a big data solution ? ● We have lots data, so let’s play with cool big data technologies x,y, z! Do we get profits from that ? ● We can hire or outsource a professional team to build big data solution, but do they answer what problem we get ?
  • 4. Contents for this talk ● A little introduction about Lambda in history ● Trends of Now and the Future ● Why lambda architecture is correct solution for big data? ● Lambda in Practice, case studies from Greengar Studios and eClick ● Lessons ● Questions & Answers
  • 5. History The best way to predict the future is looking at the past and now ?
  • 6. Lambda is the symbol to denote: ● Half-life game ? ● Anonymous function, aka: Closure ? ● functional computation/programming? ● scalable system ?
  • 7.
  • 9. When I study “lambda” ? I studied Haskell in 2007 with Dr.Peter Gammie http://peteg.org/ when internship at DRD (a non-profit organization). ● Imperative programs will always be vulnerable to data races because they contain mutable variables. ● There are no data races in purely functional languages because they don't have mutable variables.
  • 11. How did Google scale their search engine ? How does Hadoop really work ?
  • 12.
  • 13.
  • 14. The Closure in JavaScript, running by billion websites !
  • 16. Trends of Now and the Future ● Big Data ● Data Analytics ● Reactive Programming ● Functional Programming ● Streaming Computation => All just the special cases of Lambda
  • 17. Question: Is mobile app generating more data than traditional web ?
  • 18.
  • 19. Question: Is the Open Source Big Data Solution like Hadoop, that makes big data more popular to enterprises and startups ? 2009, a big-data startup, Cloudera was founded !
  • 20. What is the λ (Lambda) Architecture ?
  • 21. the Lambda Architecture: ● apply the (λ) Lambda philosophy in designing big data system ● equation “query = function(all data)” which is the basis of all data systems ● proposed by Nathan Marz (http://nathanmarz.com/), a software engineer from Twitter in his “Big Data” book. ● is based on three main design principles: ○ human fault-tolerance – the system is unsusceptible to data loss or data corruption because at scale it could be irreparable. (BUGS ?) ○ data immutability – store data in it’s rawest form immutable and for perpetuity. (INSERT/ SELECT/DELETE but no UPDATE !) ○ recomputation – with the two principles above it is always possible to (re)-compute results by running a function on the raw data.
  • 22.
  • 23. “lambda architecture” proposed by @nathanmarz We, at FPT Online, have applied the lambda architecture since April, 2013
  • 24. Lambda In Practice 2 case studies from my experiences
  • 25. Case Study 1: Greengar Studios API Backend Monitor + Statistics http://www.greengar.com/
  • 26. Backend System at Greengar Studio I applied “Lambda” here
  • 27. The data and the size, not too big for a small startup! Where is the lambda ? I used Groovy + GPars (Groovy Parallel Systems) + MongoDB for fast parallel computation (actor model) on statistical data http://gpars.codehaus.org/ The GPars framework offers Java developers intuitive and safe ways to handle Java or Groovy tasks concurrently. Support: ● Dataflow concurrency ● Actor programming model ● CSP ● Agent - an thread-safe reference to mutable state ● Concurrent collection processing ● Composable asynchronous functions ● Fork/Join ● STM (Software Transactional Memory)
  • 28. Mobile Apps => Backend APIs => Statistics => Find the Trends & Insights?
  • 29. Case Study 2: eClick Ad-Network ● Real-time Data Analytics ● Monitoring Stream Data (Reactive) http://eclick.vn
  • 30. at eClick we have 30~40 GB Logs in Stream 10~20 GB Bandwidth just for tracking user actions (click, impression,...) in ONE day ! at eClick we must check campaigns in near-real-time (seconds) ! at eClick we have many types of log (video, web, mobile, system logs, ad-campaign, articles, … )
  • 31. Our big-data system Leverage Open Source Projects ● Netty (http://netty.io/) a framework using reactive programming pattern for scaling HTTP system easier ● Kafka (http://kafka.apache.org/) a publish-subscribe messaging rethought as a distributed commit log. ● Storm (http://storm-project.net/) a framework for distributed realtime computation system. ● Redis (http://redis.io/) a advanced key-value in-memory NoSQL database, all fast statistical computations in here. ● Groovy for scripting layer, dynamic query on Redis + RDBMSs ● Hadoop ecosystem: HDFS, Hive, HBase for batch processing ● RxJava https://github.com/Netflix/RxJava a library for composing asynchronous and event-based programs
  • 32. Some new ideas for the future: Connecting the active functor pattern + reactive programming + stream computation + in-memory computing to make: ● real-time data analytics easier ● better recommendation system ● build more profitable big data solutions More Information: ● http://activefunctor.blogspot.com/ (a special case of Lambda that actively search best connections to form optimal topology) - from ideas when internship at DRD with my advisor. ● Can a function be persistent (stored as data), distributed in a cluster (cloud), reactive to right data (best value in network)?
  • 33. We can't solve problems by using the same kind of thinking we used when we created them. Albert Einstein Think more Lambda and Reactive
  • 34. How could we see "user interest graph" in our user's database ?
  • 35. ● Social Graph => Keep the connection ● Interest Graph => Make new connection => recommendation platform Source: http://en.wikipedia.org/wiki/Interest_graph
  • 36. Lessons What I have learned from Lambda and Big Data World
  • 37. What I have learned ● Keep it as simple as possible, but no simpler ! ● Ask right questions=> deep analytics=>Profit ● Reactive and Lambda for your data products ● Implement it! Just right tools for right jobs. ● Turn your data into the things everyone can "look & feel"
  • 38. How to build profitable big data solutions? => read these Behavioral Economics Books http://www.goodreads.com/shelf/show/behavioral-economics
  • 39. Stay focused, keep innovating Big Data is not profitable if you do not know what you want and ask right questions
  • 40.
  • 41. “Logic will get you from A to Z; imagination will get you everywhere.” - Albert Einstein Use your imaginationwith data analytics, not just logic