Mark Watkins Big Data Presentation


Published in: Technology, Education
  • How familiar are you with mobile? What mobile initiatives have you undertaken, and what are your overall corporate mobile goals?

    1. BIG DATA AT TELENAV: USING DATA TO IMPROVE YOUR LIFE
       Mark Watkins, general manager, entertainment content
       @viking2917
       2/21/2012 © 2012 Telenav, Proprietary and Confidential
    2. A PIONEER IN LOCATION SERVICES
       • Public company: $200M+ revenue, 11 years in business
       • Leader in Personalized Mobile Navigation: 30MM+ subscribers
       • Leader in Drive To Mobile Advertising: 750K local advertisers
       • Leader in Mobile Distribution Platforms: 900+ devices
       • Growing Global Carrier Audience Reach: 14 carriers in 29 countries
       OUR GPS NAVIGATION PARTNERS [partner logos not included in transcript]
    3. KEY PROBLEMS WE ARE WORKING ON
       • Traffic & Mapping
       • Local Search for businesses, events, points of interest
       • Lifestyle content & recommendation engine
       • Combination of “traditional” big data processing, machine learning and proprietary algorithms
       • People are drowning in information – use “big data” signals to condense to something manageable
    4. TRAFFIC & MAPS
       • Traffic-aware routing engine
         – Navigation is core competency
         – 1.3B routes/trips since 2007
       • Routes generate traffic/motion data
         – “probe data” from app (billions/month)
         – Anonymized & summarized to power routing
         – Persisted in aggregate form for historical traffic metrics
       • Used to augment Open Street Map
         – Turn restrictions, stop signs, road geometry
         – Deduced from probe patterns
       • Technology set
         – Hadoop + Hive
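
The "anonymized & summarized" step on the slide above can be pictured with a small sketch. This is not Telenav's pipeline (the deck names Hadoop + Hive but shows no code); it is a plain-Python illustration under assumed names: the record fields `segment_id`, `ts`, and `speed_kph`, the hour-of-day bucketing, and the `min_samples` threshold are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean


def aggregate_probes(probes, min_samples=3):
    """Summarize probe points into mean speed per (road segment, UTC hour).

    Assumed input: dicts with hypothetical keys 'segment_id', 'ts'
    (Unix seconds) and 'speed_kph'. No device or user identifier is
    read, so the output contains only aggregate values.
    """
    buckets = defaultdict(list)
    for p in probes:
        hour = datetime.fromtimestamp(p["ts"], tz=timezone.utc).hour
        buckets[(p["segment_id"], hour)].append(p["speed_kph"])

    # Drop thin buckets: a bucket with very few samples could describe
    # a single trip rather than general traffic (assumed policy).
    return {
        key: round(mean(speeds), 1)
        for key, speeds in buckets.items()
        if len(speeds) >= min_samples
    }


probes = [
    {"segment_id": "A", "ts": 0, "speed_kph": 50.0},
    {"segment_id": "A", "ts": 60, "speed_kph": 60.0},
    {"segment_id": "A", "ts": 120, "speed_kph": 70.0},
    {"segment_id": "B", "ts": 30, "speed_kph": 90.0},  # only one sample
]
historical = aggregate_probes(probes)
```

In a Hive setting the same idea would likely be a GROUP BY over segment and time bucket with a HAVING clause on the sample count; the Python form is used here only to keep the example self-contained.
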
    5. AUTOMATED DEVELOPMENT OF RICH LOCAL CONTENT (YOU MAY KNOW THIS AS GOBY)
       • Categorized to taxonomy (“blues”, “hiking trails”)
       • All entities geotagged
       • Venues automatically recognized; events mapped to venues
       OTHER FEATURES WORTH NOTING
       • automatic entity/place creation
       • aggregated ratings & reviews
       • proprietary result ranking formula
       • domain-specific metadata extraction
       • sorting by metadata (e.g. price, rating)
    6. AUTOMATED DEVELOPMENT OF RICH LOCAL DATA
       • Data space is large, but not immense
         – Tens or hundreds of millions (or smaller), not billions
       • But very complex
         – Thousands of data sources
         – Attribute space is 10,000 wide
         – E.g. how many holes in the golf course; how long is the hiking trail?
       • Generates a large, sparse matrix
         – Ambiguous, conflicting data
         – Unstructured or semi-structured data
         – Need to recognize entities & merge/dedup
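
A minimal sketch of the "recognize entities & merge/dedup" problem from the slide above, assuming a deliberately naive matching rule: two records are treated as the same entity when their normalized names and coordinates rounded to three decimals agree. The record layout (`name`, `lat`, `lon`, `attrs`) and the matching key are hypothetical; the deck does not describe the actual matching logic. Each merged entity is stored as a sparse row that holds only the attributes some source supplied, and conflicting values are kept side by side rather than resolved.

```python
import re
from collections import defaultdict


def entity_key(record, precision=3):
    """Build a naive match key: normalized name plus rounded coordinates."""
    name = re.sub(r"[^a-z0-9]+", " ", record["name"].lower()).strip()
    return (name, round(record["lat"], precision), round(record["lon"], precision))


def merge_records(records):
    """Merge records sharing a key into one sparse attribute row each.

    Every attribute maps to the list of distinct values seen across
    sources, so a list longer than one marks a conflict to resolve later.
    """
    merged = defaultdict(dict)
    for rec in records:
        row = merged[entity_key(rec)]
        for attr, value in rec.get("attrs", {}).items():
            values = row.setdefault(attr, [])
            if value not in values:
                values.append(value)
    return dict(merged)


# Three hypothetical source records describing the same golf course.
records = [
    {"name": "Example Golf Course", "lat": 42.3601, "lon": -71.0589,
     "attrs": {"holes": 18}},
    {"name": "example golf-course", "lat": 42.3603, "lon": -71.0591,
     "attrs": {"holes": 18, "par": 72}},
    {"name": "EXAMPLE GOLF COURSE", "lat": 42.3602, "lon": -71.0590,
     "attrs": {"holes": 9}},
]
entities = merge_records(records)
conflicts = {
    key: {a: v for a, v in row.items() if len(v) > 1}
    for key, row in entities.items()
}
```

Rounding coordinates is a crude blocking step: two nearby points that fall on opposite sides of a rounding boundary will not match, so a real system would need fuzzier comparison of both names and locations.
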
    7. SOME LEARNINGS
       • Lots of data sources / signals generate “goodness”
         – Ranking, confidence, importance, comprehensiveness
       • “Interesting” ≠ “Most Popular”
       [Chart: frequency of occurrence; labeled examples: Museum of Bad Art, The Middle East Nightclub, Fred’s dry cleaners, Museum of Science]
    8. COMPOSITE, STRUCTURED LOCAL DATA
    9. PERSONALIZED RECOMMENDATIONS
    10. RECOMMENDATIONS – WORK IN PROGRESS
        • Key signals
          – Personalized “interest graph”
          – “Drive to” data (where are people driving to?)
          – Entity-level “page rank”
          – Web/mobile clickstream data
        • Integrated with social media
          – Facebook actions influencing recommendations
        • Key technology enablers
          – Large amounts of user-generated data
          – Proprietary algorithms; machine learning / SVM
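
One way to picture how the four key signals on the slide above could be blended is a weighted score per candidate entity. The slide describes the real algorithms as proprietary and mentions machine learning / SVM, so the sketch below is only a stand-in: the linear formula, the weights, and every field name (`categories`, `drive_to_count`, `entity_rank`, `click_count`) are assumptions made for illustration.

```python
import math

# Hypothetical weights; the deck gives no actual values.
WEIGHTS = {"interest": 0.4, "drive_to": 0.3, "entity_rank": 0.2, "clicks": 0.1}


def recommend(entities, user_interests, k=3):
    """Rank entities by a weighted blend of four assumed signals.

    Counts are log-damped and scaled by the largest count among the
    candidates, so one very busy place cannot dominate the ranking.
    'entity_rank' is assumed to be pre-scaled to the range 0..1.
    """
    max_drive = max((e["drive_to_count"] for e in entities), default=0)
    max_click = max((e["click_count"] for e in entities), default=0)

    def norm(count, max_count):
        return math.log1p(count) / math.log1p(max_count) if max_count > 0 else 0.0

    def score(e):
        cats = e["categories"]
        # Share of the entity's categories found in the user's interest set.
        interest = len(user_interests & cats) / len(cats) if cats else 0.0
        return (
            WEIGHTS["interest"] * interest
            + WEIGHTS["drive_to"] * norm(e["drive_to_count"], max_drive)
            + WEIGHTS["entity_rank"] * e["entity_rank"]
            + WEIGHTS["clicks"] * norm(e["click_count"], max_click)
        )

    return sorted(entities, key=score, reverse=True)[:k]


candidates = [
    {"name": "art museum", "categories": {"museums", "art"},
     "drive_to_count": 50, "entity_rank": 0.5, "click_count": 20},
    {"name": "dry cleaner", "categories": {"services"},
     "drive_to_count": 500, "entity_rank": 0.1, "click_count": 5},
]
top = recommend(candidates, user_interests={"art"}, k=1)
```

In this toy data the dry cleaner has ten times the drive-to traffic, yet the museum ranks first for a user interested in art, which echoes the earlier point that "interesting" is not the same as "most popular".
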
    11. TELENAV.COM – SCOUT