Datameer - Stephen Groschupf - Hadoop World 2010


Multi Channel Behavioral Analytics


Stephen Groschupf
Datameer


    Presentation Transcript

    • “Hello World!”
    • My Path
    • What we do. Analyst, Administrator, Decision Maker. Datasource: User: Password: Submit. Import, ETL, Analytics, Dashboards, Export.
    • Buy a Laptop! Social, Click, Config, Call, Shop, Deliver, Register
    • Customer Behavior
    • Star Schema ?
    • Big Data
    • Cost to scale
      • Software Licenses: ? $0
      • ETL/Man Month: $220,000
      • Oracle Hardware: $1,148,908 *
      • * Slashing Data Warehouse Costs with the Vertica® Analytic Database: Server (list price): $450,108; Storage (list price): $600,000 (2x $300,000); Data center power & cooling: $43,200 / year; Data center space: $6,600 / year; Implementation: $49,000
    • Laws of Physics (Adam Jacobs, The Pathologies of Big Data)
      • Disk: random 316; sequential 53,200,000
      • SSD: random 1,924; sequential 42,200,000
      • Memory: random 36,700,000; sequential 358,200,000
    • Seq R/W Linear Scale Open Min. Admin Unstructured Learning No Tools Security Integration HR Batch
    • Hadoop Eco System Query/Serving Cascading Jaql S3 Execution Storage
    • Architecture API
    • Join Ref Url Cookie email/phone email email email
    • Click Event Stream N-Gram Frequency
    • Implementation effort: Learning/POC, Hardware, Integration, Analytics (bar chart, axis 0 to 20)
    • Hadoop Cost
      • Software Licenses: ? $0
      • Integration: $120,000
      • Hadoop Hardware: $50,000
      • Support: $100,000
      • Administration: $120,000
      • Analytics: $240,000
    • Lessons learned Growing Remains strong Great market research Generates recommendations
    • Pull vs Push
      • Push: very slow; local buffer = risk of lost data; monitor many agents; complicated
      • Pull: simple; pull as often as required; just one system to monitor; easier to secure
    • SQL vs. API (vs. Spreadsheets): SQL + UDF (20%) (80%)
    • www.datameer.com sg AT datameer.com @datameer
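
The figures on the "Cost to scale" and "Hadoop Cost" slides can be checked with a few lines of arithmetic. The sketch below sums the footnote items behind the Oracle hardware figure and the line items on the Hadoop slide. The dictionary keys are illustrative names, counting the recurring data center items for a single year is an assumption, and software licenses are left out because both slides mark them with a question mark.

```python
# Footnote items from the "Cost to scale" slide (USD).
# Assumption: the per-year items are counted for one year only.
oracle_hardware_items = {
    "server_list_price": 450_108,
    "storage_list_price": 2 * 300_000,
    "power_and_cooling_per_year": 43_200,
    "data_center_space_per_year": 6_600,
    "implementation": 49_000,
}
oracle_hardware_total = sum(oracle_hardware_items.values())

# Line items from the "Hadoop Cost" slide (USD).
# Software licenses are omitted: the slide shows them as "?".
hadoop_items = {
    "integration": 120_000,
    "hardware": 50_000,
    "support": 100_000,
    "administration": 120_000,
    "analytics": 240_000,
}
hadoop_total = sum(hadoop_items.values())

print(f"Oracle hardware (footnote sum): ${oracle_hardware_total:,}")
print(f"Hadoop line items (sum):        ${hadoop_total:,}")
```

The footnote sum comes to $1,148,908, which matches the Oracle hardware figure on the slide. The Hadoop line items add up to $630,000.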
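
The "Laws of Physics" slide contrasts random and sequential access on disk, SSD, and memory. As a rough illustration, the sketch below computes how far sequential access is ahead of random access for each medium. It assumes the flattened chart reads as (random, sequential) pairs per medium, and the function name is made up for this example.

```python
# Figures from the "Laws of Physics" slide, credited there to
# Adam Jacobs, "The Pathologies of Big Data".
# Assumption: each pair is (random, sequential) for that medium.
throughput = {
    "Disk": (316, 53_200_000),
    "SSD": (1_924, 42_200_000),
    "Memory": (36_700_000, 358_200_000),
}


def sequential_advantage(random_rate, sequential_rate):
    """Return how many times faster sequential access is than random access."""
    return sequential_rate / random_rate


ratios = {
    medium: sequential_advantage(random_rate, sequential_rate)
    for medium, (random_rate, sequential_rate) in throughput.items()
}

for medium, ratio in ratios.items():
    print(f"{medium}: sequential is about {ratio:,.0f}x random")
```

On these numbers, sequential disk access even exceeds random memory access (53,200,000 versus 36,700,000), which is one way to read the "Seq R/W" entry on the following slide.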
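
The "Click Event Stream N-Gram Frequency" slide gives only its title, so the following is one plausible reading rather than Datameer's implementation: treat each visitor's clicks as a sequence and count how often every run of n consecutive events occurs. The function name and the sample click stream are hypothetical.

```python
from collections import Counter


def ngram_frequencies(events, n):
    """Count every run of n consecutive events in one click stream."""
    # zip over n shifted copies of the list yields each window of length n.
    grams = zip(*(events[i:] for i in range(n)))
    return Counter(grams)


# Hypothetical click stream for a single visitor; the page names are made up.
clicks = ["home", "laptops", "config", "home", "laptops", "config", "cart"]

bigrams = ngram_frequencies(clicks, 2)
for gram, count in bigrams.most_common(3):
    print(" -> ".join(gram), count)
```

At scale the same counting would likely run as a distributed job over many visitors, but the per-stream logic stays the same.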