SlideShare a Scribd company logo
1 of 26
Download to read offline
The Last Mile:
Challenges and opportunities
in data tools
Strata Santa Clara 2014
Wes McKinney
@wesmckinn
• Former quant @ AQR (a hedge fund)
• Creator of pandas
• Author of 


Python for Data Analysis — O’Reilly

• Founder and CEO of DataPad

2

www.datapad.io
3

www.datapad.io
• http://datapad.io
• New web-based visual analytics
environment

• In private beta, join us!
• Hiring for engineering
4

www.datapad.io
Some Problems

•
• Statistics and ML
ETL
•
• Data Visualization
Workflows + Collaboration
•
Business Analytics

5

www.datapad.io
Data toolchains
Data Acquisition
Data Slinging /
Management

ETL
SQL / Tidy Form

Code-based Env
6

UI-based Env
www.datapad.io

Analysis
Data toolchains
Data Acquisition
Maybe

HDFS
ETL

ETL
Analytic DBMS

Code-based Env
7

ETL?
www.datapad.io

UI-based Env
Some Trends

•
• SQL-on-Hadoop
Spark / Spark ecosystem
•
New life in visual ETL / data prep
•
Better data manipulation libraries
•
Columnar / analytic databases

8

www.datapad.io
Crunching data with code

•
Python: pandas
•
Data frames in Scala, F#, Julia, …
•
Spark (Scala/Java)
•
R (+ data.table, dplyr)

9

www.datapad.io
Some Programmatic Tool
Problems

• Awkward / slow DB interactions
• In-process memory management
Reuse of intermediate results
•
Execution speed
•
• Evaluation semantics
10

www.datapad.io
dplyr (R library)

• By Hadley Wickham and Romain
Francois

•

Uniform R API, SQL and in-memory
backends

• Describe complex data

manipulation using “chaining”

11

www.datapad.io
dplyr (R library)
final
%.%
%.%
%.%
%.%
%.%
%.%

12

<- crime.by.state
filter(State=="New York", Year==2005)
arrange(desc(Count))
select(Type.of.Crime, Count)
mutate(Proportion=Count/sum(Count))
group_by(Type.of.Crime)
summarise(num.types = n(),
counts = sum(Count))

www.datapad.io
Apache Spark

• Broad set of primitive data ops
• Distributed in-memory model scales
naturally, high performance

•

Build complex computation graphs
for analytics

• Applications: Shark, GraphX, …
13

www.datapad.io
pandas (Python library)

• Broad traction
• Strong feature: time series analytics
User-friendly API and community
•
Being used in many unexpected
•
ways

14

www.datapad.io
badger (DataPad internal)

• A high performance in-memory
analytics engine for DataPad

• Addresses many performance and

memory management concerns in
pandas

• May become an OSS project someday
15

www.datapad.io
Standardized machine
learning toolkits

• scikit-learn
• PMML
• Mahout
Cloudera ML
•
16

www.datapad.io
Enterprise data workflows

•
• Apache Crunch
Pig
•

Cascading (+ Scalding, Cascalog)

17

www.datapad.io
Analytic databases

• Powering visual analytics tools on
big data

•
• MPP / in-memory execution model
Compressed columnar storage

18

www.datapad.io
Visual data tools

• Visual Analytics/BI gone mainstream
• New Data Prep products
• Drag-and-drop predictive analytics
• Proliferation of vertical SaaS solutions
19

www.datapad.io
Visual tool challenges

• Tend to be less flexible than code
• Multiple tools to get the job done
• Many still dependent on Excel
• Collaboration, versioning, provenance
20

www.datapad.io
Collaboration tools

• Discovery and reuse
• Cataloguing insights
• Analytics from ad-hoc to production
• Interesting projects: IPython Notebook,
Shiny, Pivotal Chorus

21

www.datapad.io
Some ideas

22

www.datapad.io
Abstract away the
execution model
(where possible)

23

www.datapad.io
More integrated
environments
24

www.datapad.io
Enhance collaboration

25

www.datapad.io
Thank you!

26

www.datapad.io

More Related Content

What's hot

Automating Data Quality Processes at Reckitt
Automating Data Quality Processes at ReckittAutomating Data Quality Processes at Reckitt
Automating Data Quality Processes at ReckittDatabricks
 
Accelerate Data Science Initiatives: Databricks & Privacera
Accelerate Data Science Initiatives: Databricks & PrivaceraAccelerate Data Science Initiatives: Databricks & Privacera
Accelerate Data Science Initiatives: Databricks & PrivaceraDatabricks
 
Databricks: A Tool That Empowers You To Do More With Data
Databricks: A Tool That Empowers You To Do More With DataDatabricks: A Tool That Empowers You To Do More With Data
Databricks: A Tool That Empowers You To Do More With DataDatabricks
 
Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...
 Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr... Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...
Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...Databricks
 
Presto: Fast SQL on Everything
Presto: Fast SQL on EverythingPresto: Fast SQL on Everything
Presto: Fast SQL on EverythingDavid Phillips
 
Nodes2020 | Graph of enterprise_metadata | NEO4J Conference
Nodes2020 | Graph of enterprise_metadata | NEO4J ConferenceNodes2020 | Graph of enterprise_metadata | NEO4J Conference
Nodes2020 | Graph of enterprise_metadata | NEO4J ConferenceDeepak Chandramouli
 
Introduction to Cortana Analytics
Introduction to Cortana AnalyticsIntroduction to Cortana Analytics
Introduction to Cortana AnalyticsChris Testa-O'Neill
 
A7 storytelling with_oracle_analytics_cloud
A7 storytelling with_oracle_analytics_cloudA7 storytelling with_oracle_analytics_cloud
A7 storytelling with_oracle_analytics_cloudDr. Wilfred Lin (Ph.D.)
 
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"Fwdays
 
Integration Monday - Analysing StackExchange data with Azure Data Lake
Integration Monday - Analysing StackExchange data with Azure Data LakeIntegration Monday - Analysing StackExchange data with Azure Data Lake
Integration Monday - Analysing StackExchange data with Azure Data LakeTom Kerkhove
 
Getting Started with Databricks SQL Analytics
Getting Started with Databricks SQL AnalyticsGetting Started with Databricks SQL Analytics
Getting Started with Databricks SQL AnalyticsDatabricks
 
Дмитрий Попович "How to build a data warehouse?"
Дмитрий Попович "How to build a data warehouse?"Дмитрий Попович "How to build a data warehouse?"
Дмитрий Попович "How to build a data warehouse?"Fwdays
 
The SAS Search Journey: Using AI to Move from Google to Lucidworks - Alex Fl...
The SAS Search Journey:  Using AI to Move from Google to Lucidworks - Alex Fl...The SAS Search Journey:  Using AI to Move from Google to Lucidworks - Alex Fl...
The SAS Search Journey: Using AI to Move from Google to Lucidworks - Alex Fl...Lucidworks
 
Democratizing Data Science in the Enterprise
Democratizing Data Science in the EnterpriseDemocratizing Data Science in the Enterprise
Democratizing Data Science in the EnterpriseJesus Rodriguez
 
Fighting Fraud with Apache Spark
Fighting Fraud with Apache SparkFighting Fraud with Apache Spark
Fighting Fraud with Apache SparkMiklos Christine
 
Transitioning to a BI Role
Transitioning to a BI RoleTransitioning to a BI Role
Transitioning to a BI RoleJames Serra
 
Delivering Insights from 20M+ Smart Homes with 500M+ Devices
Delivering Insights from 20M+ Smart Homes with 500M+ DevicesDelivering Insights from 20M+ Smart Homes with 500M+ Devices
Delivering Insights from 20M+ Smart Homes with 500M+ DevicesDatabricks
 

What's hot (20)

Automating Data Quality Processes at Reckitt
Automating Data Quality Processes at ReckittAutomating Data Quality Processes at Reckitt
Automating Data Quality Processes at Reckitt
 
Accelerate Data Science Initiatives: Databricks & Privacera
Accelerate Data Science Initiatives: Databricks & PrivaceraAccelerate Data Science Initiatives: Databricks & Privacera
Accelerate Data Science Initiatives: Databricks & Privacera
 
Databricks: A Tool That Empowers You To Do More With Data
Databricks: A Tool That Empowers You To Do More With DataDatabricks: A Tool That Empowers You To Do More With Data
Databricks: A Tool That Empowers You To Do More With Data
 
Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...
 Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr... Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...
Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...
 
Presto: Fast SQL on Everything
Presto: Fast SQL on EverythingPresto: Fast SQL on Everything
Presto: Fast SQL on Everything
 
Nodes2020 | Graph of enterprise_metadata | NEO4J Conference
Nodes2020 | Graph of enterprise_metadata | NEO4J ConferenceNodes2020 | Graph of enterprise_metadata | NEO4J Conference
Nodes2020 | Graph of enterprise_metadata | NEO4J Conference
 
Introduction to Cortana Analytics
Introduction to Cortana AnalyticsIntroduction to Cortana Analytics
Introduction to Cortana Analytics
 
A7 storytelling with_oracle_analytics_cloud
A7 storytelling with_oracle_analytics_cloudA7 storytelling with_oracle_analytics_cloud
A7 storytelling with_oracle_analytics_cloud
 
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
Дмитрий Лавриненко "Big & Fast Data for Identity & Telemetry services"
 
Integration Monday - Analysing StackExchange data with Azure Data Lake
Integration Monday - Analysing StackExchange data with Azure Data LakeIntegration Monday - Analysing StackExchange data with Azure Data Lake
Integration Monday - Analysing StackExchange data with Azure Data Lake
 
Getting Started with Databricks SQL Analytics
Getting Started with Databricks SQL AnalyticsGetting Started with Databricks SQL Analytics
Getting Started with Databricks SQL Analytics
 
Power BI: Tips and Tricks
Power BI: Tips and TricksPower BI: Tips and Tricks
Power BI: Tips and Tricks
 
Дмитрий Попович "How to build a data warehouse?"
Дмитрий Попович "How to build a data warehouse?"Дмитрий Попович "How to build a data warehouse?"
Дмитрий Попович "How to build a data warehouse?"
 
Scale By The Bay | 2020 | Gimel
Scale By The Bay | 2020 | GimelScale By The Bay | 2020 | Gimel
Scale By The Bay | 2020 | Gimel
 
Marketing vs Technology
Marketing vs TechnologyMarketing vs Technology
Marketing vs Technology
 
The SAS Search Journey: Using AI to Move from Google to Lucidworks - Alex Fl...
The SAS Search Journey:  Using AI to Move from Google to Lucidworks - Alex Fl...The SAS Search Journey:  Using AI to Move from Google to Lucidworks - Alex Fl...
The SAS Search Journey: Using AI to Move from Google to Lucidworks - Alex Fl...
 
Democratizing Data Science in the Enterprise
Democratizing Data Science in the EnterpriseDemocratizing Data Science in the Enterprise
Democratizing Data Science in the Enterprise
 
Fighting Fraud with Apache Spark
Fighting Fraud with Apache SparkFighting Fraud with Apache Spark
Fighting Fraud with Apache Spark
 
Transitioning to a BI Role
Transitioning to a BI RoleTransitioning to a BI Role
Transitioning to a BI Role
 
Delivering Insights from 20M+ Smart Homes with 500M+ Devices
Delivering Insights from 20M+ Smart Homes with 500M+ DevicesDelivering Insights from 20M+ Smart Homes with 500M+ Devices
Delivering Insights from 20M+ Smart Homes with 500M+ Devices
 

Viewers also liked

Happiness By Design - Nindo
Happiness By Design - NindoHappiness By Design - Nindo
Happiness By Design - NindoSelf Employed
 
Building Websites 101 (Online Portfolios) 112013
Building Websites 101 (Online Portfolios) 112013Building Websites 101 (Online Portfolios) 112013
Building Websites 101 (Online Portfolios) 112013Dioni Wise
 
Online 2003 bom presentation july 19 th 2014.pptx [autosaved] new
Online 2003 bom presentation july 19 th 2014.pptx [autosaved]  new Online 2003 bom presentation july 19 th 2014.pptx [autosaved]  new
Online 2003 bom presentation july 19 th 2014.pptx [autosaved] new balakigl
 
Human Resources Management
Human Resources ManagementHuman Resources Management
Human Resources Managementtarun chauhan
 
Learning 101: Critical Thinking
Learning 101: Critical ThinkingLearning 101: Critical Thinking
Learning 101: Critical ThinkingUNTLC
 
Learning 101: Test Taking
Learning 101: Test TakingLearning 101: Test Taking
Learning 101: Test TakingUNTLC
 
TABJ Resume/LinkedIn Workshop Presentation 110814
TABJ Resume/LinkedIn Workshop Presentation 110814TABJ Resume/LinkedIn Workshop Presentation 110814
TABJ Resume/LinkedIn Workshop Presentation 110814Dioni Wise
 
English Center resources
English Center resourcesEnglish Center resources
English Center resourcesPrincessJuliana
 
Social Media Branding (Bennett College Workshop 111113)
Social Media Branding (Bennett College Workshop 111113)Social Media Branding (Bennett College Workshop 111113)
Social Media Branding (Bennett College Workshop 111113)Dioni Wise
 
Learning 101: Presentation Skills
Learning 101: Presentation SkillsLearning 101: Presentation Skills
Learning 101: Presentation SkillsUNTLC
 
Learning 101: Managing Stress
Learning 101: Managing StressLearning 101: Managing Stress
Learning 101: Managing StressUNTLC
 
Learning 101: Note Taking
Learning 101: Note TakingLearning 101: Note Taking
Learning 101: Note TakingUNTLC
 
Learning 101: Time Mangement
Learning 101: Time MangementLearning 101: Time Mangement
Learning 101: Time MangementUNTLC
 
Coordination And Communication
Coordination And CommunicationCoordination And Communication
Coordination And Communicationtarun chauhan
 

Viewers also liked (15)

Happiness By Design - Nindo
Happiness By Design - NindoHappiness By Design - Nindo
Happiness By Design - Nindo
 
Building Websites 101 (Online Portfolios) 112013
Building Websites 101 (Online Portfolios) 112013Building Websites 101 (Online Portfolios) 112013
Building Websites 101 (Online Portfolios) 112013
 
Online 2003 bom presentation july 19 th 2014.pptx [autosaved] new
Online 2003 bom presentation july 19 th 2014.pptx [autosaved]  new Online 2003 bom presentation july 19 th 2014.pptx [autosaved]  new
Online 2003 bom presentation july 19 th 2014.pptx [autosaved] new
 
Human Resources Management
Human Resources ManagementHuman Resources Management
Human Resources Management
 
Learning 101: Critical Thinking
Learning 101: Critical ThinkingLearning 101: Critical Thinking
Learning 101: Critical Thinking
 
Learning 101: Test Taking
Learning 101: Test TakingLearning 101: Test Taking
Learning 101: Test Taking
 
TABJ Resume/LinkedIn Workshop Presentation 110814
TABJ Resume/LinkedIn Workshop Presentation 110814TABJ Resume/LinkedIn Workshop Presentation 110814
TABJ Resume/LinkedIn Workshop Presentation 110814
 
Murphy’s law
Murphy’s lawMurphy’s law
Murphy’s law
 
English Center resources
English Center resourcesEnglish Center resources
English Center resources
 
Social Media Branding (Bennett College Workshop 111113)
Social Media Branding (Bennett College Workshop 111113)Social Media Branding (Bennett College Workshop 111113)
Social Media Branding (Bennett College Workshop 111113)
 
Learning 101: Presentation Skills
Learning 101: Presentation SkillsLearning 101: Presentation Skills
Learning 101: Presentation Skills
 
Learning 101: Managing Stress
Learning 101: Managing StressLearning 101: Managing Stress
Learning 101: Managing Stress
 
Learning 101: Note Taking
Learning 101: Note TakingLearning 101: Note Taking
Learning 101: Note Taking
 
Learning 101: Time Mangement
Learning 101: Time MangementLearning 101: Time Mangement
Learning 101: Time Mangement
 
Coordination And Communication
Coordination And CommunicationCoordination And Communication
Coordination And Communication
 

Similar to The Last Mile: Challenges and Opportunities in Data Tools (Strata 2014)

Building Better Analytics Workflows (Strata-Hadoop World 2013)
Building Better Analytics Workflows (Strata-Hadoop World 2013)Building Better Analytics Workflows (Strata-Hadoop World 2013)
Building Better Analytics Workflows (Strata-Hadoop World 2013)Wes McKinney
 
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...Rittman Analytics
 
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...Rittman Analytics
 
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...Mark Rittman
 
Data Science Day New York: Data Science: A Personal History
Data Science Day New York: Data Science: A Personal HistoryData Science Day New York: Data Science: A Personal History
Data Science Day New York: Data Science: A Personal HistoryCloudera, Inc.
 
Pacemaker hadoop infrastructure and soft serve experience
Pacemaker   hadoop infrastructure and soft serve experiencePacemaker   hadoop infrastructure and soft serve experience
Pacemaker hadoop infrastructure and soft serve experienceVitaliy Bashun
 
Hadoop Infrastructure and SoftServe Experience by Vitaliy Bashun, Data Architect
Hadoop Infrastructure and SoftServe Experience by Vitaliy Bashun, Data ArchitectHadoop Infrastructure and SoftServe Experience by Vitaliy Bashun, Data Architect
Hadoop Infrastructure and SoftServe Experience by Vitaliy Bashun, Data ArchitectSoftServe
 
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...MongoDB
 
PyData: The Next Generation | Data Day Texas 2015
PyData: The Next Generation | Data Day Texas 2015PyData: The Next Generation | Data Day Texas 2015
PyData: The Next Generation | Data Day Texas 2015Cloudera, Inc.
 
Building a Big Data Pipeline
Building a Big Data PipelineBuilding a Big Data Pipeline
Building a Big Data PipelineJesus Rodriguez
 
Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...
Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...
Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...Wes McKinney
 
Bi on Big Data - Strata 2016 in London
Bi on Big Data - Strata 2016 in LondonBi on Big Data - Strata 2016 in London
Bi on Big Data - Strata 2016 in LondonDremio Corporation
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
20160331 sa introduction to big data pipelining berlin meetup 0.3
20160331 sa introduction to big data pipelining berlin meetup   0.320160331 sa introduction to big data pipelining berlin meetup   0.3
20160331 sa introduction to big data pipelining berlin meetup 0.3Simon Ambridge
 
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 How to use Big Data and Data Lake concept in business using Hadoop and Spark... How to use Big Data and Data Lake concept in business using Hadoop and Spark...
How to use Big Data and Data Lake concept in business using Hadoop and Spark...Institute of Contemporary Sciences
 
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...Mark Rittman
 
Advanced Analytics in Hadoop
Advanced Analytics in HadoopAdvanced Analytics in Hadoop
Advanced Analytics in HadoopAnalyticsWeek
 
Marc Schwering – Using Flink with MongoDB to enhance relevancy in personaliza...
Marc Schwering – Using Flink with MongoDB to enhance relevancy in personaliza...Marc Schwering – Using Flink with MongoDB to enhance relevancy in personaliza...
Marc Schwering – Using Flink with MongoDB to enhance relevancy in personaliza...Flink Forward
 

Similar to The Last Mile: Challenges and Opportunities in Data Tools (Strata 2014) (20)

Building Better Analytics Workflows (Strata-Hadoop World 2013)
Building Better Analytics Workflows (Strata-Hadoop World 2013)Building Better Analytics Workflows (Strata-Hadoop World 2013)
Building Better Analytics Workflows (Strata-Hadoop World 2013)
 
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
 
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
 
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...
 
Data Science Day New York: Data Science: A Personal History
Data Science Day New York: Data Science: A Personal HistoryData Science Day New York: Data Science: A Personal History
Data Science Day New York: Data Science: A Personal History
 
Pacemaker hadoop infrastructure and soft serve experience
Pacemaker   hadoop infrastructure and soft serve experiencePacemaker   hadoop infrastructure and soft serve experience
Pacemaker hadoop infrastructure and soft serve experience
 
Hadoop Infrastructure and SoftServe Experience by Vitaliy Bashun, Data Architect
Hadoop Infrastructure and SoftServe Experience by Vitaliy Bashun, Data ArchitectHadoop Infrastructure and SoftServe Experience by Vitaliy Bashun, Data Architect
Hadoop Infrastructure and SoftServe Experience by Vitaliy Bashun, Data Architect
 
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
 
Apache drill
Apache drillApache drill
Apache drill
 
PyData: The Next Generation | Data Day Texas 2015
PyData: The Next Generation | Data Day Texas 2015PyData: The Next Generation | Data Day Texas 2015
PyData: The Next Generation | Data Day Texas 2015
 
Building a Big Data Pipeline
Building a Big Data PipelineBuilding a Big Data Pipeline
Building a Big Data Pipeline
 
Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...
Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...
Practical Medium Data Analytics with Python (10 Things I Hate About pandas, P...
 
Bi on Big Data - Strata 2016 in London
Bi on Big Data - Strata 2016 in LondonBi on Big Data - Strata 2016 in London
Bi on Big Data - Strata 2016 in London
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
20160331 sa introduction to big data pipelining berlin meetup 0.3
20160331 sa introduction to big data pipelining berlin meetup   0.320160331 sa introduction to big data pipelining berlin meetup   0.3
20160331 sa introduction to big data pipelining berlin meetup 0.3
 
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 How to use Big Data and Data Lake concept in business using Hadoop and Spark... How to use Big Data and Data Lake concept in business using Hadoop and Spark...
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 
Architecting Your First Big Data Implementation
Architecting Your First Big Data ImplementationArchitecting Your First Big Data Implementation
Architecting Your First Big Data Implementation
 
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
 
Advanced Analytics in Hadoop
Advanced Analytics in HadoopAdvanced Analytics in Hadoop
Advanced Analytics in Hadoop
 
Marc Schwering – Using Flink with MongoDB to enhance relevancy in personaliza...
Marc Schwering – Using Flink with MongoDB to enhance relevancy in personaliza...Marc Schwering – Using Flink with MongoDB to enhance relevancy in personaliza...
Marc Schwering – Using Flink with MongoDB to enhance relevancy in personaliza...
 

Recently uploaded

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 

Recently uploaded (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 

The Last Mile: Challenges and Opportunities in Data Tools (Strata 2014)