4 Advice for your Big Data initiative
Jari Koister
Talk at IEEE Big Data/Cloud Conference, June 28th 2013
Complexity and Direction of Predictive Big Data
A few learnings that may increase your likelihood of success.
Infochimps about challenges….
Brownelles November 14, 2012
Complex Environment
Data Science
Big Data
Predictive Analysis
Machine Learning
Marketing Analytics
Sales Analytics
Columnar Databases
Data Cubes
Hadoop
Hive
Spark
Pig
Impala
ETL
Web Analytics
Churn
Segmentation
Clustering
Drill
Propensity
Uplift
Business Intelligence
Chief Intelligence Officer
Data Warehouse
Information Valuation
Entity Linkage
De-duplication
Immutable Store
Mesos
Supervised
Unsupervised
Non-parametric
Big Data
Gartner believes big data is neither a technology
nor a distinct and uniquely measured market of
products. We believe it is a phenomenon brought
about by rapid data growth, complex new data
types and parallel advancements in technology,
all combining to enable people to analyze
information in new ways to produce more useful
insights about the world around them.
Hype, Maturity, Potential…
Gartner Hype Cycle for Big Data, 2012
What is changing?
[Figure: chart with the axes Audience, Data Sources, and Algorithms. Labels shown: Experts, Intermediate, Beginners; A Few, Tens, Hundreds, Many; Experimental, Value Focused.]
Complexity and Direction of Predictive Big Data
A few learnings that may increase your likelihood of success.
Advice 1 of 4: Don’t get bogged down in technology.
[Figure: technologies positioned by Data Access (Query Expressiveness) versus Scale: HDFS, HBase, ParAccel, RedShift, Cassandra, CouchBase, Cascading, Riak, MySQL, Vertica, InfoBright, VectorWise, Spark, CitusData, WibiData, Phoenix, MSSQL, MSAS, Mahout, Map/Reduce, R, MatLab, SciPy, Snow, Hive, Impala, Drill, Pig.]
Advice 2 of 4: Find a DQE provider
Complex: entity linkage, fuzzy matching, external data, de-duplication
Repetitive & Scale: continuous, lots of data
Common: necessary but not unique
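To make the fuzzy matching and de-duplication items above more concrete, here is a minimal sketch using only the Python standard library. It is not the method of any particular provider: the customer names, the `deduplicate` helper, and the 0.85 similarity threshold are all hypothetical choices made for illustration, and the pairwise comparison shown here would not scale to the data volumes the slide refers to.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(name.lower().split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def deduplicate(records: list[str], threshold: float = 0.85) -> list[str]:
    """Keep the first record of each group of near-duplicates.

    Hypothetical helper: compares every record against the records kept so
    far, which is quadratic and only suitable for small inputs.
    """
    kept: list[str] = []
    for record in records:
        if not any(similarity(record, k) >= threshold for k in kept):
            kept.append(record)
    return kept


# Made-up customer names, for illustration only.
customers = ["Jane Smith", "jane  smith", "Jane Smyth", "Robert Jones"]
unique = deduplicate(customers)
print(unique)
```

Getting this right at scale typically needs blocking, better similarity measures, and external reference data, which is part of why the slide describes the work as complex, repetitive, and common rather than unique to one company.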
Advice 3 of 4: Be Realistic
[Figure labels: Narrow solution, Customized; Low Investment, High Investment. *Size indicates return.]
Advice 4 of 4: Scale is expensive, sample when you can.
http://www.agilone.com/email-marketing/what-you-shouldnt-need-to-know-about-big-data-and-machine-learning/
Relation
Sample      Simple      Complex   Noisy   Biased
Big Data    Overkill    ✓         ✓       N/A
Large       Overkill    ✓         ✓       ≈✓
Small       ✓           ✗         ✗       ✗
Data set of               Learning    Scoring
Propensity to buy         Sample      Complete
Customer clustering       Sample      Complete
Customer segmentation     Sample      Complete
U2P Recommendation        Sample      Complete
P2P Recommendations       Complete    Complete
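The "learn on a sample, score the complete data set" pattern in the table above can be sketched as follows. This is only an illustration under stated assumptions: the customer data is synthetic, the two segments and their purchase rates are made up, and the per-segment purchase rate stands in for a real propensity-to-buy model. The names `learn_rates` and `score_all` are hypothetical.

```python
import random


def learn_rates(sample: list[tuple[str, bool]]) -> dict[str, float]:
    """Learning step: estimate the purchase rate per segment from a sample."""
    counts: dict[str, int] = {}
    buys: dict[str, int] = {}
    for segment, bought in sample:
        counts[segment] = counts.get(segment, 0) + 1
        buys[segment] = buys.get(segment, 0) + int(bought)
    return {segment: buys[segment] / counts[segment] for segment in counts}


def score_all(customers: list[tuple[str, bool]],
              rates: dict[str, float],
              default: float = 0.0) -> list[float]:
    """Scoring step: assign a propensity score to every customer."""
    return [rates.get(segment, default) for segment, _ in customers]


rng = random.Random(42)

# Synthetic data, for illustration only: segment "a" buys about 30% of the
# time, segment "b" about 5% of the time.
customers = [
    ("a", rng.random() < 0.30) if i % 2 == 0 else ("b", rng.random() < 0.05)
    for i in range(100_000)
]

# Learn on a 2% random sample, then score the complete data set.
sample = rng.sample(customers, 2_000)
rates = learn_rates(sample)
scores = score_all(customers, rates)
```

The expensive step (learning) touches 2,000 rows, while the cheap step (a lookup per customer) touches all 100,000. The table lists P2P recommendations as the exception, where learning also uses the complete data set.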
Bonus Advice: Orchestration is a ….
[Figure: L, M, and S revenue impact across Batch, Real-time, Dead-line-time, Speed-of-thought, and Eventual. *Size indicates number of customers immediately impacted.]
Thank you for listening
jari@agilone.com


Talk at IEEE Big Data/Cloud conference in Santa Clara, June 28th, 2013.
