Talk at IEEE Big Data/Cloud conference in Santa Clara, June 28th, 2013.


Why it is hard to succeed with Big Data/predictive projects when it comes to productionizing them, and what you can do to reduce risk while taking steps in the right direction.



Transcript

  • 1. Four Pieces of Advice for Your Big Data Initiative. Jari Koister. Talk at IEEE Big Data/Cloud Conference, June 28th, 2013
  • 2. Complexity and Direction of Predictive Big Data: a few learnings that may increase your likelihood of success.
  • 3. Infochimps about challenges… Brownelles November 14, 2012
  • 4. Complex Environment: Data Science, Big Data, Predictive Analysis, Machine Learning, Marketing Analytics, Sales Analytics, Columnar Databases, Data Cubes, Hadoop, Hive, Spark, Pig, Impala, ETL, Web Analytics, Churn, Segmentation, Clustering, Drill, Propensity, Uplift, Business Intelligence, Chief Intelligence Officer, Data Warehouse, Information Valuation, Entity Linkage, De-duplication, Immutable Store, Mesos, Supervised, Unsupervised, Non-parametric
  • 5. Big Data: Gartner believes big data is neither a technology nor a distinct and uniquely measured market of products. We believe it is a phenomenon brought about by rapid data growth, complex new data types and parallel advancements in technology, all combining to enable people to analyze information in new ways to produce more useful insights about the world around them.
  • 6. Hype, Maturity, Potential… Gartner Hype Cycle for Big Data, 2012
  • 7. What is changing? Audience: Experts, Intermediate, Beginners. Data Sources: A Few, Tens, Hundreds, Many. Algorithms: Experimental, Value Focused.
  • 8. Complexity and Direction of Predictive Big Data: a few learnings that may increase your likelihood of success.
  • 9. Advice 1 (of 4): Don’t get bogged down in technology. Chart axes: Data Access (Query Expressiveness), Scale. Technologies shown: HDFS, HBase, ParAccel, RedShift, Cassandra, CouchBase, Cascading, Riak, MySQL, Vertica, InfoBright, VectorWise, Spark, CitusData, WibiData, Phoenix, MSSQL, MSAS, Mahout, Map/Reduce, R, MatLab, SciPy, Snow, Hive, Impala, Drill, Pig
  • 10. Advice 2 (of 4): Find a DQE provider. Complex: Entity linkage, Fuzzy matching, External data, De-duplication. Repetitive & Scale: Continuous, Lots of data. Common: Necessary but not unique.
  • 11. Advice 3 (of 4): Be realistic. Chart labels: Narrow solution, Customized, Low Investment, High Investment. *Size indicates return.
  • 12. Advice 4 (of 4): Scale is expensive, sample when you can. http://www.agilone.com/email-marketing/what-you-shouldnt-need-to-know-about-big-data-and-machine-learning/
                   Relation: Simple   Relation: Complex   Noisy   Biased Sample
      Big Data     Overkill           ✓                   ✓       N/A
      Large        Overkill           ✓                   ✓       ≈✓
      Small        ✓                  ✗                   ✗       ✗

      Data set of               Learning   Scoring
      Propensity to buy         Sample     Complete
      Customer clustering       Sample     Complete
      Customer segmentation     Sample     Complete
      U2P Recommendation        Sample     Complete
      P2P Recommendations       Complete   Complete
  • 13. Bonus Advice: Orchestration is a … Chart labels: Batch, Real-time, Deadline-time, Speed-of-thought, Eventual; L, M and S revenue impact. *Size indicates number of customers immediately impacted.
  • 14. Thank you for listening. jari@agilone.com
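
Slide 12 suggests learning on a sample while scoring the complete data set, for example for propensity to buy. The following is a minimal Python sketch of that split, not the speaker's implementation. The customer records, segment names, purchase rates, sample size and the function names `make_customers`, `learn_propensity` and `score_all` are all hypothetical and exist only for illustration. The "model" is simply a per-segment purchase rate.

```python
import random


def make_customers(n, seed=0):
    """Generate hypothetical customers; the purchase rates below are made up."""
    rng = random.Random(seed)
    true_rate = {"new": 0.05, "returning": 0.20, "loyal": 0.50}
    customers = []
    for i in range(n):
        segment = rng.choice(list(true_rate))
        bought = rng.random() < true_rate[segment]
        customers.append({"id": i, "segment": segment, "bought": bought})
    return customers


def learn_propensity(records):
    """Learning step: estimate the purchase rate per segment from the given records."""
    counts, buys = {}, {}
    for c in records:
        counts[c["segment"]] = counts.get(c["segment"], 0) + 1
        buys[c["segment"]] = buys.get(c["segment"], 0) + int(c["bought"])
    return {seg: buys[seg] / counts[seg] for seg in counts}


def score_all(customers, model, default=0.0):
    """Scoring step: assign a propensity score to every customer in the complete set."""
    return {c["id"]: model.get(c["segment"], default) for c in customers}


customers = make_customers(100_000)

# Learn on a 2% random sample instead of the complete data set.
sample = random.Random(1).sample(customers, 2_000)
model = learn_propensity(sample)

# Score the complete data set with the model learned from the sample.
scores = score_all(customers, model)

# For comparison only: the same estimate computed on all of the data.
full_model = learn_propensity(customers)
for seg in sorted(model):
    print(seg, round(model[seg], 3), round(full_model[seg], 3))
```

Comparing `model` with `full_model` shows how close the sampled estimate comes for a simple relation like this one. The slide's table suggests that more complex or noisier relations are where larger data sets start to pay off.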