Revenue Growth through Machine Learning

The greater promise of Big Data lies in doing new things that were previously not possible. One major class of new things is adding intelligence to large-scale systems. This session will present a survey of how machine learning can be applied to real-life situations without having to get a PhD in advanced mathematics. These systems can be built today from open source components to increase business revenues by understanding what customers need and want. Real-world examples of best practices and pitfalls in machine learning will be provided, including practical ways to build maintainable, high-performance systems.

Published in: Technology

  1. Revenue Growth Through Machine Learning
     Ted Dunning – March 21, 2013
  2. Agenda
     • Intelligence – Artificial or Reflected
     • Quick survey of machine learning
       – without a PhD
       – not all of it
     • Available components
     • What do customers really want
  3. Artificial Intelligence?
  4. Artificial Intelligence?
     • Turing and the intelligent machine
     • Rules?
     • Neural networks?
     • Logic?
  5. Reflected Intelligence!
     • Society is not just a million individuals
     • A web service with a million users is not the same as a million users each with a computer
     • Social computing emerges
  6. What is Machine Learning?
     • Statistics, but …
     • New focus on prediction rather than hypothesis testing
     • Prediction means held-out data, not just the future (now-casting)
  7. The Classics
     • Unsupervised
       – AKA clustering (but not what you think that is)
       – Mixture models, Markov models and more
       – Learn from unlabeled data, describe it predictively
     • Supervised
       – AKA classification
       – Learn from labeled data, guess labels for new data
     • Also semi-supervised and hundreds of variants
  8. Recent Insurgents
     • Collaborative learning
       – models that learn about you based on others
     • Meta-modeling
       – models that learn to reason about what other models say
     • Interactive systems
       – systems that pick what to learn from
  9. Techniques
     • Surprise and coincidence
     • Anomalous indicators
     • Non-textual search using textual tools
     • Dithering
     • Meta-learning
  10. Surprise and coincidence
      • What is accidental or uninteresting?
      • What is surprising and informative?
  11. A vice president of South Carolina Bank and Trust in Bamberg, Maxwell has served as a tireless champion for economic development in Bamberg County since 1999, welcoming industrial prospects to the county and working with existing industries in their expansion efforts. Maxwell served for many years as the president of the Bamberg County Chamber of Commerce and remains an active member today.
  12. The goal of learning is prediction. Learning falls into many categories, including supervised learning, unsupervised learning, online learning, and reinforcement learning. From the perspective of statistical learning theory, supervised learning is best understood.
  13. Surprise and Coincidence
      • Which words stand out in these examples?
      • Which are just there because these are in English?
      • The words “the” and “Bamberg” both occur 3 times in the Bamberg County article – which is the more interesting statistic? Why?
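
One way to make the question on slide 13 concrete is a log-likelihood ratio (LLR) test on a 2x2 table of counts, the test the deck later lists as a Mahout component. The sketch below is a minimal illustration, not Mahout's code. The article length and the background corpus counts are invented for the example; only the "3 occurrences each" comes from the slide.

```python
import math


def llr_2x2(k11, k12, k21, k22):
    """G^2 log-likelihood ratio statistic for a 2x2 table of counts.

    k11: word count in the document      k12: other words in the document
    k21: word count in the background    k22: other words in the background
    """
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    cells = ((k11, 0, 0), (k12, 0, 1), (k21, 1, 0), (k22, 1, 1))
    total = 0.0
    for k, i, j in cells:
        if k > 0:  # 0 * log(0) is treated as 0
            total += k * math.log(k * n / (rows[i] * cols[j]))
    return 2.0 * total


def word_surprise(in_doc, doc_len, in_background, background_len):
    """LLR score for one word: document counts versus background counts."""
    return llr_2x2(in_doc, doc_len - in_doc,
                   in_background, background_len - in_background)


# Hypothetical counts: a 60-word article and a 1,000,000-word background corpus.
the_score = word_surprise(3, 60, 60_000, 1_000_000)   # "the" is common everywhere
bamberg_score = word_surprise(3, 60, 10, 1_000_000)   # "Bamberg" is rare elsewhere

print(f"the: {the_score:.2f}  Bamberg: {bamberg_score:.2f}")
```

Under these assumed counts, "the" scores near zero because it appears about as often as English predicts, while "Bamberg" scores high because three occurrences are far more than the background rate would suggest. The raw count is the same; the surprise is not.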
  14. More Surprise
      • Anomalous indicators
        – Events that occur before other events
        – But occur anomalously often
      • Indicators are not causes
      • Nor certain
  15. Example #1 – Auto Insurance
      • Predict probability of attrition and loss for auto insurance customers
      • Transactional variables include
        – Claim history
        – Traffic violation history
        – Geographical code of residence(s)
        – Vehicles owned
      • Observed attrition and loss define past behavior
  16. Derived Variables
      • Split training data according to observable classes
      • Define log-likelihood ratio (LLR) variables for each class/variable combination
      • These 2 m v derived variables can be used for clustering (spectral, k-means, neural gas ...)
      • Proximities to clusters in LLR space are the new modeling variables
  17. Example #2 – Fraud Detection
      • Predict probability that an account is likely to result in charge-off due to fraud
      • Transactional variables include
        – Zip code
        – Recent payments and charges
        – Recent non-monetary transactions
      • Bad payments, charge-off, delinquency are observable behavioral outcomes
  18. Derived Variables
      • Split training data according to observable classes
      • Define LLR variables for each class/variable combination
      • These 2 m v derived variables can be used directly as model variables
  19. Search Abuse
      • Non-textual search using textual tools
        – A document can contain non-word tokens
        – These might be anomalous indicators of an event
      • Solr and similar engines can search for indicators
        – If we have a history of recent indicators, search finds possible follow-on events
  20. Introducing Noise
      • Dithering
        – add noise
        – less for high ranks, more for low ranks
      • Softens page boundary effects
      • Introduces more exploration
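
The dithering idea on slide 20 can be sketched in a few lines. The slide does not give a formula, so the scheme below is an assumption: re-sort results by log(rank) plus Gaussian noise, which perturbs top results little and deep results a lot. The function name `dither` and the `epsilon` parameter are hypothetical.

```python
import math
import random


def dither(ranked_items, epsilon=1.5, rng=None):
    """Reorder a ranked list by sorting on log(rank) + Gaussian noise.

    Assumed scheme, not taken from the slides: because the noise is added
    on a log scale, rank 1 vs 2 rarely swap while rank 50 vs 60 often do.
    epsilon = 1.0 gives zero noise and leaves the order unchanged.
    """
    rng = rng or random.Random()
    sigma = math.log(epsilon)
    scored = []
    for rank, item in enumerate(ranked_items, start=1):
        noisy_score = math.log(rank) + rng.gauss(0.0, sigma)
        scored.append((noisy_score, rank, item))  # rank breaks any ties
    scored.sort()
    return [item for _, _, item in scored]


results = [f"item{i}" for i in range(1, 21)]
print(dither(results, epsilon=2.0, rng=random.Random(42))[:10])
```

Each call with a fresh random state gives a slightly different page, so items just below the page boundary get occasional exposure and the system collects feedback on them.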
  21. Meta-learning
      • Which settings work best?
      • Which indicators?
      • A/B testing for the back-end
  22. Available components
      • Mahout
        – LLR test for anomaly
        – Cooccurrence computations
        – Baseline components of Bayesian Bandits
      • Solr
        – Ready to roll for search
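
"A/B testing for the back-end" with Bayesian bandits can be illustrated with Thompson sampling over Beta posteriors. This is a generic textbook sketch, not Mahout's implementation. The two "settings" and their true success rates are invented for the simulation.

```python
import random


def thompson_pick(successes, failures, rng):
    """Sample each arm's Beta(s + 1, f + 1) posterior and pick the best draw."""
    samples = [rng.betavariate(s + 1, f + 1)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def simulate(true_rates, rounds, seed=0):
    """Run a bandit over hypothetical settings; return pulls per setting."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    pulls = [0] * len(true_rates)
    for _ in range(rounds):
        arm = thompson_pick(successes, failures, rng)
        pulls[arm] += 1
        if rng.random() < true_rates[arm]:  # simulated user response
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls


# Two hypothetical back-end settings with made-up conversion rates.
pulls = simulate(true_rates=[0.1, 0.9], rounds=2000, seed=1)
print(pulls)
```

Unlike a fixed 50/50 split, the bandit shifts traffic toward the better setting as evidence accumulates, while still occasionally trying the other one.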
  23. History matrix
      One row per user
      One column per thing
  24. Recommendation based on cooccurrence
      Cooccurrence gives item-item mapping
      One row and column per thing
  25. Cooccurrence matrix can also be implemented as a search index
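
Slides 23 to 25 can be sketched with plain dictionaries: user histories stand in for the rows of the history matrix, and counting item pairs within each row gives the item-item cooccurrence mapping. The sketch uses raw counts for brevity; the deck's approach keeps only anomalous cooccurrences (via LLR), which is omitted here. All user and item names are invented.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(histories):
    """Count, for each pair of items, how many users have both.

    histories maps user id -> list of items (one row of the history matrix).
    The result is symmetric: counts[a][b] == counts[b][a].
    """
    counts = defaultdict(lambda: defaultdict(int))
    for items in histories.values():
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(counts, history, top_n=3):
    """Score unseen items by summed cooccurrence with the user's history."""
    scores = defaultdict(int)
    for item in history:
        for other, count in counts.get(item, {}).items():
            if other not in history:
                scores[other] += count
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


# Hypothetical histories, one entry per user.
histories = {
    "u1": ["guitar", "amp", "strings"],
    "u2": ["guitar", "strings"],
    "u3": ["amp", "cable"],
    "u4": ["guitar", "strings", "cable"],
}
counts = cooccurrence(histories)
print(recommend(counts, {"guitar"}))
```

Each row of `counts` is a list of items associated with one item, which is why it maps naturally onto a search index: the row becomes a field of indicator tokens on that item's document.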
  26. Input Data
      • User transactions
        – user id, merchant id
        – SIC code, amount
      • Offer transactions
        – user id, offer id
        – vendor id, merchant id’s,
        – offers, views, accepts
  27. Input Data
      • User transactions
        – user id, merchant id
        – SIC code, amount
      • Offer transactions
        – user id, offer id
        – vendor id, merchant id’s,
        – offers, views, accepts
      • Derived merchant data
        – local top40
        – SIC code
        – vendor code
        – amount distribution
      • Derived user data
        – merchant id’s
        – SIC codes
        – offer & vendor id’s
  28. Cross-recommendation
      • Per merchant indicators
        – merchant id’s
        – chain id’s
        – SIC codes
        – offer vendor id’s
      • Computed by finding anomalous (indicator => merchant) rates
  29. Search-based Recommendations
      • Sample document
        – Merchant Id
        – Field for text description
        – Phone
        – Address
        – Location
  30. Search-based Recommendations
      • Sample document
        – Merchant Id
        – Field for text description
        – Phone
        – Address
        – Location
        – Indicator merchant id’s
        – Indicator industry (SIC) id’s
        – Indicator offers
        – Indicator text
        – Local top40
  31. Search-based Recommendations
      • Sample document
        – Merchant Id
        – Field for text description
        – Phone
        – Address
        – Location
        – Indicator merchant id’s
        – Indicator industry (SIC) id’s
        – Indicator offers
        – Indicator text
        – Local top40
      • Sample query
        – Current location
        – Recent merchant descriptions
        – Recent merchant id’s
        – Recent SIC codes
        – Recent accepted offers
        – Local top40
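
The document/query pairing on slide 31 can be illustrated with a toy matcher: each merchant document carries indicator fields, the query carries the user's recent history, and the score is the number of overlapping tokens. This is a stand-in for what a search engine does, not Solr's API or its ranking formula. Field names, ids, and SIC codes below are invented.

```python
def score(document, query):
    """Count query tokens that match the document's indicator fields."""
    total = 0
    for field, recent_values in query.items():
        total += len(set(document.get(field, ())) & set(recent_values))
    return total


def search(documents, query, top_n=2):
    """Return merchant ids of the best-matching documents, best first."""
    scored = [(score(doc, query), doc["merchant_id"]) for doc in documents]
    hits = [(s, m) for s, m in scored if s > 0]
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [merchant for _, merchant in hits[:top_n]]


# Hypothetical merchant documents with indicator fields.
documents = [
    {"merchant_id": "m1", "indicator_merchant_ids": ["m7", "m9"],
     "indicator_sic_ids": ["5812"], "indicator_offers": ["o3"]},
    {"merchant_id": "m2", "indicator_merchant_ids": ["m7"],
     "indicator_sic_ids": ["5411"], "indicator_offers": []},
    {"merchant_id": "m3", "indicator_merchant_ids": ["m4"],
     "indicator_sic_ids": ["7011"], "indicator_offers": ["o8"]},
]

# The query is built from one user's recent behavior.
query = {
    "indicator_merchant_ids": ["m7", "m9"],  # recent merchant ids
    "indicator_sic_ids": ["5812"],           # recent SIC codes
    "indicator_offers": [],                  # recent accepted offers
}
print(search(documents, query))
```

A real engine would add term weighting, location filtering, and text matching on the description field; the point of the sketch is only that recommendation reduces to an ordinary query against indicator fields.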
  32. [Architecture diagram. Labels: complete history, cooccurrence (Mahout), Solr indexers, Solr indexing, item meta-data, index shards]
  33. [Architecture diagram. Labels: user history, web tier, Solr search, Solr indexers, item meta-data, index shards]
  34. Objective Results
      • At a very large credit card company
      • History is all transactions, all web interaction
      • Processing time cut from 20 hours per day to 3
      • Recommendation engine load time decreased from 8 hours to 3 minutes
  35. Platform Needs
      • Need to root web services and search system on the cluster
        – Copying negates unification
      • Legacy indexers are extremely fast … but they assume conventional file access
      • High performance search engines need high performance file I/O
      • Need coordinated process management
  36. Additional Opportunities
      • Cross recommend from search queries to documents
      • Result is semantic search engine
      • Uses reflected intelligence instead of artificial intelligence
  37. What do customers really want?
  38. Another Example
      • Users enter queries (A)
        – (actor = user, item=query)
      • Users view videos (B)
        – (actor = user, item=video)
      • A’A gives query recommendation
        – “did you mean to ask for”
      • B’A gives video recommendation
        – “you might like these videos”
  39. The punch-line
      • B’A recommends videos in response to a query
        – (isn’t that a search engine?)
        – (not quite, it doesn’t look at content or meta-data)
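
The B’A product from slides 38 and 39 can be sketched without any matrix library: for each (query, video) pair, count the users who both entered the query and viewed the video. The logs below are invented, and raw counts stand in for the anomaly-filtered scores the deck describes elsewhere.

```python
from collections import defaultdict


def cross_occurrence(query_log, view_log):
    """Compute the nonzero entries of B'A as counts[query][video].

    query_log: (user, query) pairs, the rows of A (users x queries)
    view_log:  (user, video) pairs, the rows of B (users x videos)
    """
    queries_by_user = defaultdict(set)
    for user, query in query_log:
        queries_by_user[user].add(query)
    videos_by_user = defaultdict(set)
    for user, video in view_log:
        videos_by_user[user].add(video)

    counts = defaultdict(lambda: defaultdict(int))
    for user, queries in queries_by_user.items():
        for query in queries:
            for video in videos_by_user.get(user, ()):
                counts[query][video] += 1
    return counts


def videos_for_query(counts, query, top_n=3):
    """Recommend videos for a query using behavior only, no content."""
    row = counts.get(query, {})
    return sorted(row, key=lambda video: (-row[video], video))[:top_n]


# Hypothetical logs.
query_log = [("u1", "paco de lucia"), ("u2", "paco de lucia"),
             ("u3", "van halen")]
view_log = [("u1", "flamenco guitar"), ("u2", "flamenco guitar"),
            ("u2", "spanish guitar"), ("u3", "eruption live")]

counts = cross_occurrence(query_log, view_log)
print(videos_for_query(counts, "paco de lucia"))
```

Nothing in the lookup inspects video titles or meta-data: the query string is just a token that users happened to type, which is why the result behaves like a search engine without being one.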
  40. Real-life example
      • Query: “Paco de Lucia”
      • Conventional meta-data search results:
        – “hombres del paco” times 400
        – not much else
      • Recommendation based search:
        – Flamenco guitar and dancers
        – Spanish and classical guitar
        – Van Halen doing a classical/flamenco riff
  41. Real-life example
  42. Hypothetical Example
      • Want a navigational ontology?
      • Just put labels on a web page with traffic
        – This gives A = users x label clicks
      • Remember viewing history
        – This gives B = users x items
      • Cross recommend
        – B’A = label to item mapping
      • After several users click, results are whatever users think they should be
  43. Next Steps
      • That is up to you
      • But I can help
        – platforms (Solr, MapR)
        – techniques (Mahout, math)
      tdunning@maprtech.com
      @ted_dunning
      @ApacheMahout
      http://slidesha.re/ZVOS40
