Apache LENS : Unified OLAP on
Realtime and Batch Data
HADOOP SUMMIT
San Jose 2015
Sharad Agarwal (Flipkart)
Jothi Padmanabhan (InMobi)
Motivation
Most enterprises have more than one Analytics Data System
Catering to diverse requirements
Flexible questions
Fast response
Fresh data
Different kinds of analysis
[Chart: Variety vs. No. of Queries, showing Operational and Exploratory analysis]
Example Scenarios
Example 1
Operational and Exploratory
Analyse user behaviour in Mobile Advertising domain
User Activity
timestamp
user-id
location-id
device-id
device-orientation
served-ad-id
clicks
downloads
revenue
User
user-id
age
gender
interests
Location
location-id
city
country
Device
device-id
manufacturer
os
model
Operational Analytics
User Activity (by demog, geo, device)
Click activity of users by city and age
Download activity by gender and country for iOS/Android
Exploratory Analytics
Download conversion by device orientation (landscape/portrait)
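The two kinds of questions above can be sketched against the example schema. This is a minimal in-memory illustration, not Lens code: the class and function names are hypothetical, only a few of the listed fields are modelled, and "download conversion" is assumed here to mean downloads divided by clicks.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: str
    age: int
    gender: str


@dataclass(frozen=True)
class Location:
    location_id: str
    city: str
    country: str


@dataclass(frozen=True)
class Activity:
    user_id: str
    location_id: str
    device_orientation: str  # "landscape" or "portrait"
    clicks: int
    downloads: int


def clicks_by_city_and_age(activities, users, locations):
    """Operational question: total clicks grouped by (city, age)."""
    user_by_id = {u.user_id: u for u in users}
    loc_by_id = {loc.location_id: loc for loc in locations}
    totals = defaultdict(int)
    for a in activities:
        key = (loc_by_id[a.location_id].city, user_by_id[a.user_id].age)
        totals[key] += a.clicks
    return dict(totals)


def download_conversion_by_orientation(activities):
    """Exploratory question: downloads per click (assumed definition), by orientation."""
    clicks = defaultdict(int)
    downloads = defaultdict(int)
    for a in activities:
        clicks[a.device_orientation] += a.clicks
        downloads[a.device_orientation] += a.downloads
    # Skip orientations with zero clicks to avoid division by zero.
    return {o: downloads[o] / clicks[o] for o in clicks if clicks[o] > 0}


# Made-up sample rows, for illustration only.
users = [User("u1", 25, "F"), User("u2", 31, "M")]
locations = [Location("l1", "Bangalore", "India"), Location("l2", "San Jose", "USA")]
activities = [
    Activity("u1", "l1", "portrait", 10, 2),
    Activity("u2", "l2", "landscape", 4, 2),
    Activity("u1", "l1", "portrait", 6, 2),
]
```

The operational question joins the fact rows with two dimensions, while the exploratory one needs a fact column (device orientation) that a pre-aggregated subset might not carry.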
[Diagram: Activity data flows through ETL into two stores]
DWH Store: subset of Activity, with User, Location, Device
Operational Analytics: frequently used data, low latency response
Batch Store (Hive): all Activity, with User, Location, Device
Exploratory Analytics: all data, high latency response
Example 2
Reacting to Realtime-Data
Sale Days in e-commerce
Look at current trends to mark related items for offers
Logistics decisions
[Diagram: Raw Data Streams (Orders, Offers, Logistics, Product) pass through streaming pipelines into a Realtime Store that feeds Realtime Dashboards]
Fresh data, only for recent time window
Solving varied scenarios leads to multiple disparate systems and complexity
Complexity is a silent killer!
Data Inconsistencies
High engineering and operation cost
Moving data across systems is non-trivial
Confusion among users
Multiple definitions of data
Different ways of access
Data Silos - Data Discovery
What is desired?
Easy and consistent mechanism to discover and query all data
Cost and performance trade-off knobs for different queries
Apache LENS
Cube abstraction
Common view across
Multiple Tiers
Multiple Storages
OLAP Data Model
CUBE: Fact Table
DIMENSION: Dimension Table
Tiered Data Layout - Fact
Raw Fact: Measures (mr), Dim-cuts (dr)
Aggr Fact1: mr1 <= mr, dr1 < dr
Aggr Fact2: mr2 <= mr1, dr2 < dr1
[Diagram axes: Query Interactivity, Query Flexibility]
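One way to read the tiered layout: each aggregate fact keeps a subset of the raw fact's measures and dim-cuts, so a query can be answered from the smallest table that still covers what it asks for. The sketch below illustrates that idea only. The table names, column sets, and the `cost` field are hypothetical, and this is not a description of how Lens itself chooses tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactTable:
    name: str
    measures: frozenset
    dimensions: frozenset
    cost: int  # hypothetical relative scan cost; lower is cheaper


# Hypothetical tiers: each aggregate keeps a subset of the tier above it.
RAW = FactTable(
    "raw_fact",
    frozenset({"clicks", "downloads", "revenue"}),
    frozenset({"user_id", "location_id", "device_id", "device_orientation"}),
    cost=100,
)
AGGR1 = FactTable(
    "aggr_fact1",
    frozenset({"clicks", "downloads", "revenue"}),
    frozenset({"location_id", "device_id"}),
    cost=10,
)
AGGR2 = FactTable(
    "aggr_fact2",
    frozenset({"clicks", "downloads"}),
    frozenset({"location_id"}),
    cost=1,
)


def pick_fact(tables, measures, dimensions):
    """Return the cheapest table covering every queried measure and dimension."""
    candidates = [
        t for t in tables
        if set(measures) <= t.measures and set(dimensions) <= t.dimensions
    ]
    if not candidates:
        raise ValueError("no fact table can answer this query")
    return min(candidates, key=lambda t: t.cost)
```

For example, `pick_fact([RAW, AGGR1, AGGR2], {"clicks"}, {"location_id"})` selects the smallest aggregate, while a query on `device_orientation` falls back to the raw fact: narrow queries stay fast, and unusual ones remain answerable.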
[Diagram: multiple tiers of storage for performance]
Activity data flows through ETL into the Batch Store (Hive): all Activity, with Location, User, Device
Weekly Rollup, Monthly Rollup
Unified View: LENS CUBE
[Diagram: unified view across stores]
Traditional DWH: frequently used data, low latency response
Realtime store: fresh data, low latency response
Batch store: all data, high latency response
Data feeds: Batch ETL, Streaming Aggregations
Unified View: LENS CUBE, serving Operational and Exploratory analytics
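A unified view over a realtime store (fresh data for a recent window only) and a batch store (all data) implies that a query's time range may need to be divided between them. The following is a minimal sketch of such a split, assuming a hypothetical fixed retention window for the realtime store. It is illustrative and does not reproduce Lens's planner.

```python
from datetime import datetime, timedelta


def split_time_range(start, end, now, realtime_window):
    """Split [start, end) into a batch part and a realtime part.

    Assumption: the realtime store holds only data newer than
    now - realtime_window, and the batch store holds everything older.
    """
    if start >= end:
        raise ValueError("start must be before end")
    cutoff = now - realtime_window
    plan = []
    if start < cutoff:
        # Older portion of the range is served from the batch store.
        plan.append(("batch", start, min(end, cutoff)))
    if end > cutoff:
        # Recent portion of the range is served from the realtime store.
        plan.append(("realtime", max(start, cutoff), end))
    return plan
```

A query that spans the cutoff yields two sub-ranges whose results would then be combined. A query entirely inside the recent window touches only the realtime store.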
Architecture
Lens Capabilities
OLAP Cube Abstraction
Data Discovery via single metadata layer
Query Life Cycle Management
Data Optimisation via Query Analytics
Fast workload-based experimentation with newer systems: Spark, Tez, AWS Redshift etc.
Integrates with Apache Zeppelin for data exploration
Current Status
Incubated in Apache in Nov 2014
Two releases - 2.1 being latest
Supports stores
Hive
JDBC compliant
Deployed at
InMobi
Flipkart
Lens Roadmap
Authorization
Scheduler service
Make it suitable to integrate with BI tools
Automatic Roll up suggestions
New Drivers: Elasticsearch, Spark SQL
Administrator console
Stay Involved
http://lens.incubator.apache.org
Mailing Lists
lens-dev@lens.incubator.apache.org
lens-user@lens.incubator.apache.org
