Welcome to today’s webinar How AOL Accelerates Ad Targeting Decisions with Hadoop and Membase Server Audio/Telephone:  +1 (805) 309-0021 Access Code:  670-793-134 Audio PIN: Shown after joining the Webinar Host: John Kreisa, VP of Marketing, Cloudera
Housekeeping Ask questions at any time using the Questions panel Problems? Use the Chat panel Recording will be available
About the Webinar How AOL Accelerates Ad Targeting Decisions with Hadoop and Membase Server
Speakers
Matt Aslett, Senior Analyst, Enterprise Software. Matt covers data management software for The 451 Group's Information Management practice, including relational and non-relational databases, data warehousing and data caching. Matthew is also an expert in open source software and contributes regularly to reports produced through the 451 Commercial Adoption of Open Source (CAOS) Research Service, as well as to the 451 CAOS Theory blog.
Pero Subasic, Chief Architect, AOL. Pero works on research and development in new technologies and contextual advertising at Aol Advertising in Palo Alto. Over the past 4 years he was the Chief Architect of the R&D distributed infrastructure, which today comprises more than 1000 nodes in multiple data centers. He also led large-scale contextual analysis and segmentation projects and a variety of machine learning efforts at Aol, Yahoo and Cadence Design Systems, and published patents and research papers in these areas.
NoSQL and Hadoop: Open source innovation and adoption drivers Matthew Aslett, senior analyst The 451 Group
Analyzing the business of Enterprise IT Innovation Unique Analysis of the Hosting, Managed Service, Third-Party Datacenter and Internet Infrastructure sectors The 451 Group The Uptime Institute is the leading independent think tank and research body serving the global datacenter industry.
Coverage areas
Commercial Adoption of Open Source (CAOS): adoption by enterprise, adoption by vendors
Information Management: database, data warehousing, data caching
Senior analyst, enterprise software
With The 451 Group since 2007
www.about.me/mattaslett
www.twitter.com/maslett
Open source database landscape 2008
30.4% using open source databases
Main usage areas
45% In-house-developed apps
41% Single-function apps
38% Development
36% Web apps
Relevant reports Warehouse Optimization Ten considerations for choosing/building a data warehouse Published September 2009 The role of open source and emergence of Hadoop sales@the451group.com
Open source database landscape 2009 Analytic Infobright InfiniDB MonetDB LucidDB Hadoop
Relevant reports Data Warehousing 2009-2013 Market Sizing, Landscape and Future Published August 2010 The potential impact of Hadoop sales@the451group.com
Open source database landscape 2011 Analytic Infobright (InfiniDB) MonetDB LucidDB Hadoop Hadoop Pig Hive ZooKeeper Mahout Avro
Open source database landscape 2011 Hadoop Hadoop Pig Hive ZooKeeper Mahout Avro Analytic Infobright (InfiniDB) MonetDB LucidDB Cassandra CouchDB MongoDB NoSQL HBase Membase Riak
Relevant reports “Database alternatives” Assessing the drivers behind the development and adoption of NoSQL and scalable SQL databases, as well as Hadoop Planned for April 2011 Role of open source in driving innovation COMING  APRIL 2011
SPRAINED RELATIONAL DATABASES Photo credit: Foxtongue on Flickr http://www.flickr.com/photos/foxtongue/4844016087/
SPRAINed relational databases
SPRAIN: “An injury to ligaments… caused by being stretched beyond normal capacity” (Wikipedia)
Six key drivers for NoSQL/Hadoop adoption:
 Scalability
 Performance – performance does not necessarily mean scalability
 Relaxed consistency – where scalability is a given
 Agility – flexible, schema-free data models and agile development
 Intricacy – complex relationships and data types
 Necessity
Scalability [diagram, built up over several slides: layers for users, application, database and hardware; the users, application and hardware layers multiply while a single database layer remains]
DATA – large volumes, structured and unstructured, real-time demands
BIG DATA – Volume, Variety and Velocity
Scalability Operational database Database Analytic database
Scalability: big audience, real-time transactional data management; Database; large scale data analysis, big data
Requirements
big audience: real-time transactional data management
Interactive application
 real-time
 low, predictable latency
 working set often < total data set
Data analysis
 batch processing
 analytics-optimized
 data locality model
big data: large scale data analysis
Requirements
big audience: Membase
Interactive application
 real-time
 low, predictable latency
 working set often < total data set
Data analysis
 batch processing
 analytics-optimized
 data locality model
big data: Cloudera’s Distribution for Apache Hadoop

Webinar: How AOL Accelerates Targeting Decisions with Hadoop and Membase Server

  • 56. Scalability [diagram: big audience, multiple Membase instances, multiple instances of Cloudera’s Distribution for Apache Hadoop, big data]
  • 58. real-time data collection, analysis
  • 59. shared data platform
  • 61. aggregation of mixed data sources
  • 62. structured and un/semi-structured data
  • 63. transform and load
      big data: Cloudera’s Distribution for Apache Hadoop
  • 64. Target markets
      Enterprise applications: event monitoring, sensor data, compliance and regulatory reporting, intelligence analysis, fraud detection
      Web applications: social games, SaaS, e-commerce systems, clickstream analysis, ad and offer targeting systems
      big audience: Membase
      big data: Cloudera’s Distribution for Apache Hadoop
  • 65. Necessity
      BigTable and MapReduce – Google
      Dynamo – Amazon
      Hadoop, Pig, HBase – Yahoo
      Cassandra, Hive – Facebook
      Voldemort – LinkedIn
      FlockDB – Twitter
      Hypertable – Zvents
      Neo4J – Windh Technologies
      Memcached – Danga Interactive
      MongoDB – DoubleClick
      Membase – Zynga
  • 70. Accelerating Ad Targeting Decisions with Hadoop and Membase. Pero Subasic, Aol, pero.subasic@teamaol.com
  • 71. Overview
      Online advertising overview
      Large-scale Analytics at Aol
      Current Architecture and Data Flows
      Hadoop+Membase current use cases
      RT Architecture Proposal
      RT Use Case: RT Contextual Segmentation of Users
      Conclusion
  • 73. CPC = Cost Per Click, e.g. $2 per click
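
As a small worked example of the CPC definition above, the sketch below computes what an advertiser would be charged for a number of clicks at a fixed cost per click. The function name `cpc_spend` and the click count are illustrative assumptions, not figures from the webinar; only the $2 rate comes from the slide.

```python
from decimal import Decimal


def cpc_spend(clicks, cost_per_click):
    """Total charge under a CPC model: number of clicks times the price per click."""
    if clicks < 0:
        raise ValueError("clicks must be non-negative")
    return clicks * cost_per_click


# Hypothetical campaign: 150 clicks at the slide's example rate of $2 per click.
spend = cpc_spend(150, Decimal("2.00"))
print(f"${spend}")
```

Decimal is used instead of float so that currency amounts stay exact.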
  • 78. Use Cases Today
      data set enrichment: given a field in a data set stored on HDFS, enrich by adding related fields; media -> campaign -> advertiser chain
      blackboard for inter-process/job communication: contextual segmentation pipelines; predictive modeling can load per-campaign models to be used for large-scale scoring
      larger map-side joins (where the Hadoop DistributedCache and the in-memory process/task cache are insufficient)
      aggregations with a large number of item lookups, e.g. user-level contextual profiles aggregated from visited-URL contextual profiles stored in memcache
      Flume integration for data flow reliability and recovery
      segment generation currently carried out through Hadoop pipelines and uploaded into server-side Membase for targeting
      but: a strong tendency to move closer to ad serving motivates thinking about new architectures to reduce segment generation time
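
The data set enrichment use case above (following the media -> campaign -> advertiser chain for each record) can be sketched roughly as follows. This is a minimal illustration, not Aol's implementation: plain dictionaries stand in for the key-value lookups that would go to Membase, and every name, key and record here is hypothetical.

```python
# Hypothetical stand-ins: plain dicts play the role of key-value lookups
# that a real job would send to a store such as Membase.
MEDIA_TO_CAMPAIGN = {"media-1": "campaign-10", "media-2": "campaign-20"}
CAMPAIGN_TO_ADVERTISER = {"campaign-10": "advertiser-A", "campaign-20": "advertiser-B"}


def enrich(record, media_to_campaign, campaign_to_advertiser):
    """Return a copy of record with campaign and advertiser fields added.

    A missing link in the chain leaves the later fields as None
    instead of raising, so one unknown media id does not fail the job.
    """
    campaign = media_to_campaign.get(record.get("media_id"))
    advertiser = campaign_to_advertiser.get(campaign) if campaign is not None else None
    enriched = dict(record)
    enriched["campaign_id"] = campaign
    enriched["advertiser_id"] = advertiser
    return enriched


def map_side_enrich(records, media_to_campaign, campaign_to_advertiser):
    """Map-style generator: emits one enriched record per input record."""
    for record in records:
        yield enrich(record, media_to_campaign, campaign_to_advertiser)


rows = [{"media_id": "media-1", "clicks": 3}, {"media_id": "media-9", "clicks": 1}]
enriched_rows = list(map_side_enrich(rows, MEDIA_TO_CAMPAIGN, CAMPAIGN_TO_ADVERTISER))
```

Because each record is enriched independently through lookups, the same pattern covers the larger map-side joins mentioned in the list: the lookup side lives in an external store rather than in each task's memory.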
  • 79. RT Framework: Capture, Compute and Forward [diagram. Stages: CAPTURE, COMPUTE, FORWARD. Components: Data Feeds, Flume Ingestion, Membase (back-end), Compute Cluster, Membase (front-end) and ad-serving logic, Hadoop, Big Data Loop]
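
One way to read the capture, compute and forward stages is as three small functions chained together. The sketch below is only an assumption about how the stages relate: the slide does not say which component performs which step, and the event fields, the per-user view count and the `views:` key prefix are all made up for illustration. A dictionary stands in for the front-end store.

```python
from collections import Counter


def capture(feed):
    """CAPTURE: accept incoming events, dropping ones missing required fields."""
    for event in feed:
        if "user" in event and "url" in event:
            yield event


def compute(events):
    """COMPUTE: aggregate captured events (here, page views per user)."""
    counts = Counter()
    for event in events:
        counts[event["user"]] += 1
    return counts


def forward(results, front_end_store):
    """FORWARD: publish computed values where serving logic can read them."""
    for user, views in results.items():
        front_end_store[f"views:{user}"] = views
    return front_end_store


def run_pipeline(feed):
    """Chain the three stages; the dict stands in for a front-end key-value store."""
    return forward(compute(capture(feed)), {})


feed = [
    {"user": "u1", "url": "/a"},
    {"user": "u2", "url": "/b"},
    {"url": "/c"},  # malformed: no user, dropped at capture
    {"user": "u1", "url": "/d"},
]
store = run_pipeline(feed)
```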
  • 80. Features Ahead
      bucket lifecycle management (creation, sizing, deposition)
      asynchronous stream with all mutation operators
      iteration through key space without knowing the keys in advance (making key space Ord?)
      regex or range-based key iteration for finer-grain key space control
      bucket drain to HDFS
      event-based synchronization between instances (TAP)
  • 81. RT Contextual Segmentation [diagram. Labels: Flume Ingestion, Data Feeds, User-ContentID Mapper, Membase Active Event Frame, Membase + ad-serving logic, User-Segment Mapper, UC Map, US Short-term Map, ContentID-Segment Map, Event-based updates, Daily Map updates, Hadoop, US Long-term Map]
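
The maps named in the diagram suggest a flow along these lines: each event updates a user-to-content map, content IDs are translated to segments, and the resulting short-term user segments are combined with a long-term map refreshed daily from Hadoop. The sketch below is a guess at that flow under stated assumptions, not the architecture as presented: the class name, the fixed-size window standing in for the active event frame, and the union of short-term and long-term segments are all hypothetical.

```python
from collections import defaultdict, deque


class ContextualSegmenter:
    """Toy model of event-driven user segmentation with in-memory maps."""

    def __init__(self, content_segments, long_term=None, frame_size=3):
        # ContentID-Segment map: content id -> set of segment names.
        self.content_segments = content_segments
        # US long-term map: assumed to be rebuilt by a daily batch job.
        self.long_term = long_term or {}
        # UC map: user -> most recent content ids (the assumed event frame).
        self.uc_map = defaultdict(lambda: deque(maxlen=frame_size))
        # US short-term map: user -> segments derived from the current frame.
        self.us_short_term = {}

    def on_event(self, user, content_id):
        """Event-based update: record the visit and recompute short-term segments."""
        self.uc_map[user].append(content_id)
        segments = set()
        for cid in self.uc_map[user]:
            segments |= self.content_segments.get(cid, set())
        self.us_short_term[user] = segments
        return segments

    def segments_for(self, user):
        """Segments visible to serving logic: short-term union long-term."""
        return self.us_short_term.get(user, set()) | self.long_term.get(user, set())


CONTENT_SEGMENTS = {"c1": {"sports"}, "c2": {"autos", "sports"}, "c3": {"travel"}}
segmenter = ContextualSegmenter(CONTENT_SEGMENTS, long_term={"u1": {"finance"}}, frame_size=2)
segmenter.on_event("u1", "c1")
segmenter.on_event("u1", "c3")
current = segmenter.segments_for("u1")
```

Bounding the frame keeps short-term segments tied to recent behavior, while the long-term map carries what batch processing has learned over a longer history.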
  • 83. Closing Remarks
      Exciting Times
      Need is real and recognized
      Technological capability is within reach
      Q/A
      Contact: pero.subasic@teamaol.com
  • 86. Frank Weigel, Director of Product Management, Couchbase (formerly Membase)