"The greater promise of Big Data lies not in doing old things in slightly new ways. Instead, it lies in doing new things that were previously not possible. One major class of new things is adding intelligence to large-scale systems. In this session I will present a survey of how machine learning can be applied to real-life situations without having to get a PhD in advanced mathematics. These systems can be built today from open source components to increase business revenues by understanding what customers need and want. I will provide real-world examples of best practices and pitfalls in machine learning, including practical ways to build maintainable, high-performance systems." - Ted Dunning
With the rise of Apache Hadoop, a next-generation enterprise data architecture is emerging that connects the systems powering business transactions and business intelligence. Hadoop is uniquely capable of storing, aggregating, and refining multi-structured data sources into formats that fuel new business insights. Apache Hadoop is fast becoming the de facto platform for processing Big Data. Hadoop started from a relatively humble beginning as a point solution for small search systems. Its growth into an important technology for the broader enterprise community dates back to Yahoo's 2006 decision to evolve Hadoop into a system for solving its internet-scale big data problems. Eric will discuss the current state of Hadoop and what is coming from a development standpoint as Hadoop evolves to meet more workloads.
Past, Present and Future of Data Processing in Apache Hadoop - DataWorks Summit
Apache Hadoop MapReduce has undergone a complete overhaul to emerge as Apache Hadoop YARN, a generic compute fabric to support MapReduce and other application paradigms. This really changes the game, recasting Hadoop as a much more powerful data-processing system. As a result, Hadoop looks very different from itself 12 months ago. Now, ever wonder what it might look like in 12 months, or 24 months, or longer? This talk will take you through some ideas for YARN itself and the myriad ways it's moving the needle for MapReduce, Pig, Hive, Cascading and other data-processing tools in the Hadoop ecosystem.
Splout SQL - Web latency SQL views for Hadoop - Datasalt
There are many Big Data problems whose output is also Big Data. In this presentation we will show Splout SQL, which allows serving an arbitrarily big dataset by partitioning it. Splout serves partitioned SQL views which are generated and indexed by Hadoop. Splout is to Hadoop + SQL what Voldemort or Elephant DB are to Hadoop + Key/Value. Hadoop is nowadays the de facto open-source solution for Big Data batch processing. When the output of a Hadoop process is big, there isn't a satisfying solution for serving it. Think of pre-computed recommendations, for example, where the whole dataset may vary from one day to another. Splout decouples database creation from database serving and makes it efficient and safe to deploy Hadoop-generated datasets. There are many databases that allow serving Big Data, such as NoSQL solutions, but they don't have a rich query language like SQL. You generally can't aggregate data in real time as you would with a GROUP BY clause. Because you can't precompute everything, SQL is a very convenient feature to have in a Big Data serving solution. Splout is not a "fast analytics" engine. Splout is made for demanding web or mobile applications where query performance is critical. Arbitrary real-time aggregations should complete in less than 200 milliseconds under high traffic load. On top of that, Splout is scalable, flexible, RESTful and open source.
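To make the serving model concrete, here is a toy sketch in Python with SQLite (an illustration of the idea only, not Splout's actual implementation; the table, dataset, and partition count are invented). Each query is routed to the single pre-built, indexed partition that owns the key, where full SQL, including GROUP BY aggregation, remains available:

```python
import sqlite3
import zlib

N_PARTITIONS = 4

def partition_for(key: str) -> int:
    """Deterministic key-to-partition mapping."""
    return zlib.crc32(key.encode()) % N_PARTITIONS

# "Generation" phase: build one indexed SQLite database per partition
# (Splout does this at scale inside a Hadoop job, then deploys the files).
partitions = [sqlite3.connect(":memory:") for _ in range(N_PARTITIONS)]
for db in partitions:
    db.execute("CREATE TABLE recs (user TEXT, item TEXT, score REAL)")
    db.execute("CREATE INDEX idx_user ON recs (user)")

rows = [("alice", "book-1", 0.9), ("alice", "book-2", 0.4),
        ("bob", "book-3", 0.7), ("alice", "book-1", 0.2)]
for user, item, score in rows:
    partitions[partition_for(user)].execute(
        "INSERT INTO recs VALUES (?, ?, ?)", (user, item, score))

# "Serving" phase: one partition answers each query, with real SQL
# aggregation (GROUP BY) rather than key/value lookups.
def query(user):
    db = partitions[partition_for(user)]
    return db.execute(
        "SELECT item, SUM(score) FROM recs WHERE user = ? "
        "GROUP BY item ORDER BY SUM(score) DESC", (user,)).fetchall()

print(query("alice"))   # book-1 ranks first after aggregation
```

The key design point is that the database files are immutable once generated, so a bad deployment can simply be rolled back to the previous set of partition files.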
"The greater promise of Big Data lies not in doing old things in slightly new ways. Instead, it lies in doing new things that were previously not possible. One major class of new things is adding intelligence to large-scale systems. In this session I will present a survey of how machine learning can be applied to real-life situations without having to get a PhD in advanced mathematics. These systems can be built today from open source components to increase business revenues by understanding what customers need and want. I will provide real world examples of best practices and pitfalls in machine learning including practical ways to build maintainable, high performance systems." - Ted Dunning
With the rise of Apache Hadoop, a next-generation enterprise data architecture is emerging that connects the systems powering business transactions and business intelligence. Hadoop is uniquely capable of storing, aggregating, and refining multi-structured data sources into formats that fuel new business insights. Apache Hadoop is fast becoming the defacto platform for processing Big Data. Hadoop started from a relatively humble beginning as a point solution for small search systems. Its growth into an important technology to the broader enterprise community dates back to Yahoo’s 2006 decision to evolve Hadoop into a system for solving its internet scale big data problems. Eric will discuss the current state of Hadoop and what is coming from a development standpoint as Hadoop evolves to meet more workloads.
Past Present and Future of Data Processing in Apache HadoopDataWorks Summit
Apache Hadoop MapReduce has undergone a complete re-haul to emerge as Apache Hadoop YARN, a generic compute fabric to support MapReduce and other application paradigms. This really changes the game to recast Hadoop as a much more powerful data-processing system. As a result Hadoop looks very different from itself 12 months ago. Now, ever wonder what it might look like in 12 months or 24 months or longer? This talk will take you through some ideas for YARN itself and the many myriad ways it`s really moving the needle for MapReduce, Pig, Hive, Cascading and other data-processing tools in the Hadoop ecosystem.
Splout SQL - Web latency SQL views for Hadoopdatasalt
There are many Big Data problems whose output is also Big Data. In this presentation we will show Splout SQL, which allows serving an arbitrarily big dataset by partitioning it. Splout serves partitioned SQL views which are generated and indexed by Hadoop. Splout is to Hadoop + SQL what Voldemort or Elephant DB are to Hadoop + Key/Value. Hadoop is nowadays the de-facto open-source solution for Big Data batch-processing. When the output of a Hadoop process is big, there isn`t a satisfying solution for serving it. Think of pre-computed recommendations, for example, where the whole dataset may vary from one day to another. Splout decouples database creation from database serving and makes it efficient and safe to deploy Hadoop-generated datasets. There are many databases that allow serving Big Data such as NoSQL solutions, but they don`t have a rich query language like SQL. You generally can`t aggregate data in real-time like you would do with a GROUP BY clause. Because you can`t precompute everything, SQL is a very convenient feature to have in a Big Data serving solution. Splout is not a “fast analytics” engine. Splout is made for demanding web or mobile applications where query performance is critical. Arbitrary real-time aggregations should be done in less than 200 milliseconds under high traffic load. On top of that, Splout is scalable, flexible, RESTful & open-source.
"The greater promise of Big Data lies not in doing old things in slightly new ways. Instead, it lies in doing new things that were previously not possible. One major class of new things is adding intelligence to large-scale systems. In this session I will present a survey of how machine learning can be applied to real-life situations without having to get a PhD in advanced mathematics. These systems can be built today from open source components to increase business revenues by understanding what customers need and want. I will provide real world examples of best practices and pitfalls in machine learning including practical ways to build maintainable, high performance systems." - Ted Dunning
WSO2Con USA 2017: Analytics Patterns for Your Digital Enterprise - WSO2
The WSO2 analytics platform provides a high-performance, lean, enterprise-ready streaming solution to solve data integration and analytics challenges faced by connected businesses. This platform offers real-time, interactive, machine learning and batch processing technologies that empower enterprises to build a digital business by connecting various enterprise data sources, enhancing understanding of the data and increasing internal productivity.
This session explores how to enable digital transformation by building a data analytics platform. It will discuss the following topics:
WSO2 Data Analytics Server architecture
Understanding streaming constructs
Architectural styles for data integration
Debugging and troubleshooting your integration
Deployment
Performance tuning
Production hardening
Slides from Enterprise Search & Analytics Meetup @ Cisco Systems - http://www.meetup.com/Enterprise-Search-and-Analytics-Meetup/events/220742081/
Relevancy and Search Quality Analysis - By Mark David and Avi Rappoport
The Manifold Path to Search Quality
To achieve accurate search results, we must come to an understanding of the three pillars involved.
1. Understand your data
2. Understand your customers’ intent
3. Understand your search engine
The first path passes through Data Analysis and Text Processing.
The second passes through Query Processing, Log Analysis, and Result Presentation.
Everything learned from those explorations feeds into the final path of Relevancy Ranking.
Search quality is focused on end users finding what they want: technical relevance is sometimes irrelevant! Working with the short head (very frequent queries) has the most return on investment for improving the search experience: tuning results, for example, to emphasize recent documents or de-emphasize archived ones, near-duplicate detection, exposing diverse results in ambiguous situations, using synonyms, and guiding search via best bets and auto-suggest. Long-tail analysis can reveal user intent by detecting patterns, discovering related terms, and identifying the most fruitful results through aggregated behavior. All of this feeds back into regression testing, which provides reliable metrics to evaluate the changes.
By merging these insights, you can improve the quality of the search overall, in a scalable and maintainable fashion.
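As a tiny, self-contained illustration of the short-head idea (the query log below is invented for the example), a few lines of analysis show how few distinct queries typically account for most of the traffic, and hence where tuning effort pays off first:

```python
from collections import Counter

# Hypothetical query log; in practice this comes from search analytics.
log = (["price list"] * 40 + ["returns"] * 25 + ["login"] * 15 +
       ["warranty xj-200", "blue widget manual", "error 403",
        "careers", "careers"])

counts = Counter(log)
total = sum(counts.values())

# Find the "short head": the fewest distinct queries covering 80% of traffic.
covered, head = 0, []
for q, n in counts.most_common():
    head.append(q)
    covered += n
    if covered / total >= 0.8:
        break

print(f"{len(head)} of {len(counts)} distinct queries cover "
      f"{covered / total:.0%} of all traffic: {head}")
```

Everything outside `head` is the long tail, where aggregate pattern analysis rather than per-query tuning is the practical approach.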
Exploratory Search upon Semantically Described Web Data Sources: Service regi... - Marco Brambilla
This presentation was given at the SSW workshop, collocated with VLDB 2012.
Exploratory search applications upon structured Web content are becoming one of the main information-seeking paradigms for users. This is mainly due to the move towards mobile and pervasive Web access and to the increasingly tight intertwining between everyday life and information seeking.
Structured data is typically distributed on the Web and accessible through a service-oriented paradigm. This paper proposes a vision of: (1) a semantically enabled service registration framework for conveniently describing Web data services; and (2) a design method for applications that exploit such a model using a design-pattern-based approach.
Webinar: Increase Conversion With Better Search - Lucidworks
Hear from IBM Product Line Manager Iris Yuan & Lucidworks VP of Partner Engineering Sarath Jarugula for a deep discussion into how improving ecommerce search can drive conversions and increase revenue.
AWS re:Invent 2016: Leveraging Amazon Machine Learning, Amazon Redshift, and ... - Amazon Web Services
The Howard Hughes Corporation partnered with 47Lining to develop a managed enterprise data lake based on Amazon S3. The purpose of the managed EDL is to fuse relevant on-premises and third-party data to enable Howard Hughes to answer its most valuable business questions. Their first analysis was a lead-scoring model that uses Amazon Machine Learning (Amazon ML) to predict propensity to purchase high-end real estate. The model is based on a combined set of public and private data sources, including all publicly recorded real estate transactions in the US for the past 35 years. By changing their business process for identifying and qualifying leads to use the results of data-driven analytics from their managed data lake in AWS, Howard Hughes increased the number of identified qualified leads in their pipeline by over 400% and reduced the acquisition cost per lead by more than 10 times. In this session, you will see a practical example of how to use Amazon ML to improve business results, how to architect a data lake with Amazon S3 that fuses on-premises, third-party, and public data sets, and how to train and run an Amazon ML model to attain predictive accuracy.
HacktoberFestPune - DSC MESCOE x DSC PVGCOET - TanyaRaina3
HacktoberFestPune is a beginner-friendly, all-inclusive event that is absolutely free of cost. Certificates will be issued by DSC MESCOE and DSC PVGCOET for everyone who can complete 4 successful Pull Requests by 13th October 10 AM! An evening filled with speaker sessions, interactions with fellow developers, and mini-games, we think you'll have a great time with everyone!
We introduce the idea that metadata, including project information, data labels, data characteristics and indications of valuable use, can be propagated through a data-processing lineage graph. Further, finding examples of significant co-occurrence of propagated and original metadata gives us the basis of an interesting kind of search engine, one that gives interesting recommendations of data given a problem statement, even in a near cold-start situation.
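A minimal sketch of the propagation step might look like the following (the dataset names, edges, and labels are invented for illustration; a real system would walk the actual lineage graph recorded by the data platform):

```python
from collections import defaultdict

# Hypothetical lineage graph: edges point from an input dataset
# to the datasets derived from it.
lineage = {
    "raw_clicks": ["sessionized", "fraud_features"],
    "user_profiles": ["fraud_features"],
    "sessionized": ["daily_report"],
}

# Original metadata labels attached directly to source datasets.
labels = defaultdict(set, {
    "raw_clicks": {"pii", "clickstream"},
    "user_profiles": {"pii"},
})

# Propagate labels downstream to a fixed point: a derived dataset
# inherits every label of every upstream input, transitively.
changed = True
while changed:
    changed = False
    for src, dsts in lineage.items():
        for dst in dsts:
            new = labels[src] - labels[dst]
            if new:
                labels[dst] |= new
                changed = True

print(sorted(labels["daily_report"]))   # labels inherited transitively
```

Once propagated and original labels coexist on the same nodes, significant co-occurrences between them can be scored to drive the recommendation side of the idea.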
The folk wisdom has always been that when running stateful applications inside containers, the only viable choice is to externalize the state so that the containers themselves are stateless or nearly so. Keeping large amounts of state inside containers is possible, but it’s considered a problem because stateful containers generally can’t preserve that state across restarts.
In practice, this complicates the management of large-scale Kubernetes-based infrastructure because these high-performance storage systems require separate management. In terms of overall system management, it would be ideal if we could run a software-defined storage system directly in containers managed by Kubernetes, but that has been hampered by lack of direct device access and difficult questions about what happens to the state on container restarts.
Ted Dunning describes recent developments that make it possible for Kubernetes to manage both compute and storage tiers in the same cluster. Container restarts can be handled gracefully without loss of data or a requirement to rebuild storage structures and access to storage from compute containers is extremely fast. In some environments, it’s even possible to implement elastic storage frameworks that can fold data onto just a few containers during quiescent periods or explode it in just a few seconds across a large number of machines when higher speed access is required.
The benefits of systems like this extend beyond management simplicity: applications can be more agile precisely because the storage layer is more stable and can be uniformly accessed from any container host. Even better, it makes it a snap to configure and deploy a full-scale compute and storage infrastructure.
Ellen Friedman and I spoke at the ACM meetup about how stream-first architecture can have a big impact and how the logistics of machine learning is a great example of that impact.
This is my half of the presentation.
Tensor Abuse - how to reuse machine learning frameworks - Ted Dunning
Tensors are a very useful tool for mathematical programming. Moreover, the optimization frameworks that are part of most machine learning frameworks have some very cool uses outside of the normal machine learning kinds of tasks.
The logistics of machine learning typically take waaay more effort than the machine learning itself. Moreover, machine learning systems aren't like normal software projects so continuous integration takes on new meaning.
You know that a single number isn't a good summary of a measurement. T-digest and other non-uniform histograms can make it easy to keep track of an entire distribution and can be combined in OLAP queries.
The latest t-digest is faster, more accurate and has hard bounds on size.
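The t-digest algorithm itself is more subtle, but the flavor of a mergeable, non-uniform histogram can be sketched with logarithmic buckets: constant relative error per bucket, and trivially combinable across shards or OLAP cells. The toy class below is an illustration of that idea, not the t-digest algorithm:

```python
import math

class LogHistogram:
    """Toy non-uniform histogram with logarithmically sized buckets.
    Counts can be merged across shards, OLAP-style, without loss."""

    def __init__(self, factor=1.1):
        self.factor = factor
        self.bins = {}           # bucket index -> count
        self.count = 0

    def add(self, x):
        b = int(math.log(x) / math.log(self.factor))
        self.bins[b] = self.bins.get(b, 0) + 1
        self.count += 1

    def merge(self, other):
        for b, n in other.bins.items():
            self.bins[b] = self.bins.get(b, 0) + n
        self.count += other.count

    def quantile(self, q):
        target = q * self.count
        seen = 0
        for b in sorted(self.bins):
            seen += self.bins[b]
            if seen >= target:
                return self.factor ** (b + 0.5)   # geometric bucket midpoint
        return None

# Two shards summarize their own data, then combine for a global quantile.
h1, h2 = LogHistogram(), LogHistogram()
for v in range(1, 501):
    h1.add(v)
for v in range(501, 1001):
    h2.add(v)
h1.merge(h2)
print(h1.quantile(0.5))   # roughly the true median of 1..1000
```

The same merge-then-query pattern is what makes distribution sketches useful in OLAP: each cell stores a small histogram, and any rollup is just a merge.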
This talk shows practical methods for finding changes in a variety of kinds of data, as well as giving real-world examples from finance, telecom, systems monitoring and natural language processing.
This was one of the talks that I gave at the Strata San Jose conference. I migrated my topic a bit, but here is the original abstract:
Application developers and architects today are interested in making their applications as real-time as possible. To make an application respond to events as they happen, developers need a reliable way to move data as it is generated across different systems, one event at a time. In other words, these applications need messaging.
Messaging solutions have existed for a long time. However, when compared to legacy systems, newer solutions like Apache Kafka offer higher performance, more scalability, and better integration with the Hadoop ecosystem. Kafka and similar systems are based on drastically different assumptions than legacy systems and have vastly different architectures. But do these benefits outweigh any tradeoffs in functionality? Ted Dunning dives into the architectural details and tradeoffs of both legacy and new messaging solutions to find the ideal messaging system for Hadoop.
Topics include:
* Queues versus logs
* Security issues like authentication, authorization, and encryption
* Scalability and performance
* Handling applications that span multiple data centers
* Multitenancy considerations
* APIs, integration points, and more
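As a toy contrast with queue semantics (purely illustrative, not Kafka's API), the log model keeps every message in order and lets each consumer track its own read offset, so many independent consumers can replay the same stream:

```python
class Log:
    """Minimal append-only log, Kafka-style: the broker retains messages;
    consumers, not the broker, remember how far they have read."""

    def __init__(self):
        self.messages = []

    def append(self, msg):
        self.messages.append(msg)
        return len(self.messages) - 1          # offset of the new message

    def read(self, offset, max_n=10):
        return self.messages[offset:offset + max_n]

log = Log()
for e in ["click:home", "click:cart", "purchase:42"]:
    log.append(e)

# Two consumers at different offsets see the stream independently:
# one replays everything, the other only reads what is new to it.
analytics_offset, billing_offset = 0, 2
print(log.read(analytics_offset))
print(log.read(billing_offset))
```

In a traditional queue, delivering a message removes it, so adding a second consumer requires a second queue; in the log model, adding a consumer is just adding an offset.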
This talk focuses on how larger data sets are not only enabling advanced techniques, but also increasing the number of problems within reach of relatively simple techniques, that is, "cheap learning".
These are the slides from my talk at FAR Con in Minneapolis recently. The topics are the implications of buried treasure hoards on data security, horror stories and new, simpler and provably secure methods for public data disclosure.
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time - Ted Dunning
This talk describes how indicator-based recommendations can be evolved in real time. Normally, indicator-based recommendations use a large off-line computation to understand the general structure of items to be recommended and then make recommendations in real-time to users based on a comparison of their recent history versus the large-scale product of the off-line computation.
In this talk, I show how the same components of the off-line computation that guarantee linear scalability in a batch setting also give strict real-time bounds on the cost of a practical real-time implementation of the indicator computation.
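One standard way to score candidate indicators in a co-occurrence analysis is the log-likelihood ratio (G²) test on a 2x2 contingency table; the sketch below shows the scoring function itself (the example counts are invented, and building the tables from user histories is the part the off-line computation handles at scale):

```python
import math

def entropy(*counts):
    """Unnormalized entropy term: sum of c * log(c / total)."""
    total = sum(counts)
    return sum(c * math.log(c / total) for c in counts if c > 0)

def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 co-occurrence table:
    k11 = users with both items, k12/k21 = one item only, k22 = neither."""
    return 2 * (entropy(k11, k12, k21, k22)
                - entropy(k11 + k12, k21 + k22)
                - entropy(k11 + k21, k12 + k22))

# Independent events score near zero; strong co-occurrence scores high,
# which is what flags an item pair as a useful indicator.
print(llr(10, 10, 10, 10))    # ~0: no association
print(llr(100, 5, 5, 1000))   # large positive: strong indicator
```

Because the score depends only on four counts, it can be updated incrementally as new interactions arrive, which is what makes a real-time variant of the computation tractable.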
How the Internet of Things is Turning the Internet Upside Down - Ted Dunning
This is a wide-ranging talk that goes into how the internet is architected, how that architecture is changing as a result of the internet of things, how the internet of things worked in the 19th century, big data, the open-source community, and how to build time-series databases to make this all possible.
Really.
Apache Kylin - OLAP Cubes for SQL on Hadoop - Ted Dunning
Apache Kylin (incubating) is a new project to bring OLAP cubes to Hadoop. I walk through the project and describe how it works and how users see the project.
Many statistics are impossible to compute precisely on streaming data. There are some very clever algorithms, however, which allow us to compute very good approximations of these values efficiently in terms of CPU and memory.
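A classic example is counting distinct items in a stream. The sketch below is a bare-bones HyperLogLog-style estimator (an illustration of the idea, not a production implementation): a hash of each item selects a register, and the register remembers the longest run of leading zero bits ever seen, from which the cardinality can be estimated in tiny, fixed memory:

```python
import hashlib
import math

class HyperLogLog:
    """Bare-bones HyperLogLog: approximate distinct count in 2^p registers."""

    def __init__(self, p=10):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                      # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)         # remaining bits
        rank = (64 - self.p) - rest.bit_length() + 1  # position of leftmost 1-bit
        self.registers[idx] = max(self.registers[idx], rank)

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:             # small-range correction
            return int(self.m * math.log(self.m / zeros))
        return int(raw)

hll = HyperLogLog()
for i in range(10000):
    hll.add(f"user-{i}")
    hll.add(f"user-{i}")        # duplicates never change the estimate
print(hll.estimate())           # close to 10000, from only 1024 registers
```

With p = 10 the expected relative error is roughly 3%, regardless of how many items flow through, which is the CPU/memory trade-off such algorithms make.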
Anomaly Detection - New York Machine Learning - Ted Dunning
Anomaly detection is the art of finding what you don't know how to ask for. In this talk, I walk through the why and how of building probabilistic models for a variety of problems including continuous signals and web traffic. This talk blends theory and practice in a highly approachable way.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... - BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf - 91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front-end. I have also often seen developers implement front-end features just by following the standard rules of a framework, assuming that this is enough to successfully launch the project, and then the project fails. How can this be prevented, and what approach should you choose? I have launched dozens of complex projects, and during the talk we will analyze which approaches have worked for me and which have not.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... - Ramesh Iyer
In today's fast-changing business world, companies that fail to adapt and embrace new ideas often struggle to keep up with the competition. Fostering a culture of innovation takes real work: it takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Search and Society: Reimagining Information Access for Radical Futures - Bhaskar Mitra
The field of information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build, inspired by diverse, explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies need to be explicitly articulated, and we need to develop theories of change in the context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
DevOps and Testing slides at DASA Connect - Kari Kakkonen
Slides by Rik Marselis and me from the 30.5.2024 DASA Connect conference. We discuss what testing is, then what agile testing is, and finally what testing in DevOps looks like. We closed with a lovely workshop in which participants tried to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Kubernetes & AI - Beauty and the Beast!? @ KCD Istanbul 2024 - Tobias Schneck
As AI technology pushes into IT, I wondered, as an "infrastructure container Kubernetes guy", how does this fancy AI technology get managed from an infrastructure operations view? Is it possible to apply our lovely cloud-native principles as well? What benefits could both technologies bring to each other?
Let me take these questions and provide a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premises strategy we may need to apply AI to our own infrastructure and make it work from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I already have working for real.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to release software to market, combined with traditionally slow and manual security checks, has caused gaps in continuous security, an important piece of the software supply chain. Today, organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their application supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerabilities and security breaches. This needs to be achieved with existing toolchains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
2. Agenda
• Intelligence – Artificial or Reflected
• Quick survey of machine learning
– without a PhD
– not all of it
• Available components
• What do customers really want
5. Reflected Intelligence!
• Society is not just a million individuals
• A web service with a million users is not the
same as a million users each with a computer
• Social computing emerges
6. What is Machine Learning?
• Statistics, but …
• New focus on prediction rather than
hypothesis testing
• Prediction means held-out data, not just the
future (now-casting)
7. The Classics
• Unsupervised
– AKA clustering (but not what you think that is)
– Mixture models, Markov models and more
– Learn from unlabeled data, describe it predictively
• Supervised
– AKA classification
– Learn from labeled data, guess labels for new data
• Also semi-supervised and hundreds of variants
8. Recent Insurgents
• Collaborative learning
– models that learn about you based on others
• Meta-modeling
– models that learn to reason about what other
models say
• Interactive systems
– systems that pick what to learn from
9. Techniques
• Surprise and coincidence
• Anomalous indicators
• Non-textual search using textual tools
• Dithering
• Meta-learning
11. A vice president of South Carolina Bank and Trust in Bamberg,
Maxwell has served as a tireless champion for economic
development in Bamberg County since 1999, welcoming
industrial prospects to the county and working with existing
industries in their expansion efforts. Maxwell served for many
years as the president of the Bamberg County Chamber of
Commerce and remains an active member today.
12. The goal of learning is prediction. Learning falls into many
categories, including supervised learning, unsupervised learning,
online learning, and reinforcement learning. From the
perspective of statistical learning theory, supervised learning is
best understood.
13. Surprise and Coincidence
• Which words stand out in these examples?
• Which are just there because these are in
English?
• The words “the” and “Bamberg” both occur 3
times in the second article
– which is the more interesting statistic? Why?
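The question the slide poses — why three occurrences of "Bamberg" are more interesting than three occurrences of "the" — is what the log-likelihood ratio (G²) test quantifies: it scores how far a word's count in one document departs from what its corpus-wide frequency predicts. A minimal pure-Python sketch (the helper names and example counts are mine, for illustration only):

```python
import math

def xlogx(x):
    # x * log(x) with the convention 0 * log(0) = 0
    return x * math.log(x) if x > 0 else 0.0

def g2(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 contingency table:
    k11 = occurrences of the word in this document
    k12 = occurrences of the word in the rest of the corpus
    k21 = occurrences of other words in this document
    k22 = occurrences of other words in the rest of the corpus"""
    total = xlogx(k11 + k12 + k21 + k22)
    rows = xlogx(k11 + k12) + xlogx(k21 + k22)
    cols = xlogx(k11 + k21) + xlogx(k12 + k22)
    cells = xlogx(k11) + xlogx(k12) + xlogx(k21) + xlogx(k22)
    return 2.0 * (cells + total - rows - cols)

# "the": 3 times here, but also everywhere else in the corpus
common = g2(3, 100000, 200, 2000000)
# "Bamberg": 3 times here, almost nowhere else
surprising = g2(3, 5, 200, 2100000)
assert surprising > common
```

The same score, implemented in Mahout's LogLikelihood class, drives the anomaly and cooccurrence computations this deck relies on.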
14. More Surprise
• Anomalous indicators
– Events that occur before other events
– But occur anomalously often
• Indicators are not causes
• Nor certain
15. Example #1- Auto Insurance
• Predict probability of attrition and loss for
auto insurance customers
• Transactional variables include
– Claim history
– Traffic violation history
– Geographical code of residence(s)
– Vehicles owned
• Observed attrition and loss define past
behavior
16. Derived Variables
• Split training data according to observable classes
• Define LLR variables for each class/variable combination
• These 2 × m × v derived variables can be used for clustering (spectral, k-means, neural gas, ...)
• Proximity in LLR space to clusters gives the new modeling variables
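The recipe above can be sketched end to end: one LLR score per class/value combination becomes the derived feature vector, and distances to cluster centroids in that space become the modeling variables. Everything here (function names, class labels, centroid shapes) is illustrative, not code from the talk:

```python
import math

def xlogx(x):
    return x * math.log(x) if x > 0 else 0.0

def llr(k11, k12, k21, k22):
    # G^2 score: does this value cooccur with this class anomalously often?
    total = xlogx(k11 + k12 + k21 + k22)
    rows = xlogx(k11 + k12) + xlogx(k21 + k22)
    cols = xlogx(k11 + k21) + xlogx(k12 + k22)
    cells = xlogx(k11) + xlogx(k12) + xlogx(k21) + xlogx(k22)
    return 2.0 * (cells + total - rows - cols)

def derived_features(value_counts, class_totals):
    """One LLR score per (class, value) combination.
    value_counts[c] = how often this customer's value appears in class c
    class_totals[c] = total observations in class c"""
    total = sum(class_totals.values())
    value_total = sum(value_counts.values())
    feats = []
    for c in class_totals:
        k11 = value_counts.get(c, 0)            # value within this class
        k12 = value_total - k11                 # value in the other classes
        k21 = class_totals[c] - k11             # other values, this class
        k22 = (total - value_total) - k21       # other values, other classes
        feats.append(llr(k11, k12, k21, k22))
    return feats

def distance_features(point, centroids):
    # proximity in LLR space to each cluster => the new modeling variables
    return [math.dist(point, c) for c in centroids]
```

The clustering itself (spectral, k-means, neural gas) would come from a library or from Mahout; only the feature derivation is sketched here.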
17. Example #2 – Fraud Detection
• Predict probability that an account is likely
to result in charge-off due to fraud
• Transactional variables include
– Zip code
– Recent payments and charges
– Recent non-monetary transactions
• Bad payments, charge-off, delinquency are
observable behavioral outcomes
18. Derived Variables
• Split training data according to observable classes
• Define LLR variables for each class/variable combination
• These 2 × m × v derived variables can be used directly as model variables
19. Search Abuse
• Non-textual search using textual tools
– A document can contain non-word tokens
– These might be anomalous indicators of an event
• Solr and similar engines can search for indicators
– If we have a history of recent indicators, search
finds possible follow-on events
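One way this "search abuse" might look in practice: recent indicator tokens become an OR query against an indexed field of non-word tokens. The `indicators` field name and token formats are assumptions, not a fixed Solr schema:

```python
def indicator_query(recent_indicators, field="indicators", max_terms=50):
    """Build a Lucene/Solr-style OR query from recent indicator tokens.
    Documents whose indicator tokens overlap the recent history score
    highest, surfacing possible follow-on events."""
    terms = recent_indicators[-max_terms:]   # cap at the most recent history
    return " OR ".join(f'{field}:"{t}"' for t in terms)

q = indicator_query(["evt_17", "evt_42", "mrc_9931"])
# q == 'indicators:"evt_17" OR indicators:"evt_42" OR indicators:"mrc_9931"'
```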
20. Introducing Noise
• Dithering
– add noise
– less for high ranks, more for low ranks
• Softens page boundary effects
• Introduces more exploration
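One common dithering recipe (my choice for illustration; the exact noise schedule is a tuning decision) perturbs log(rank) with fixed-size Gaussian noise, which automatically moves top results a little and deep results a lot:

```python
import math
import random

def dither(results, epsilon=0.5, rng=None):
    """Re-rank results by log(rank) + N(0, log(1 + epsilon)).
    Because ranks are compared on a log scale, a fixed noise level
    barely disturbs the head of the list but shuffles the tail,
    softening page-boundary effects and adding exploration."""
    rng = rng or random.Random()
    sd = math.log(1.0 + epsilon)
    scored = [(math.log(rank) + rng.gauss(0.0, sd), item)
              for rank, item in enumerate(results, start=1)]
    scored.sort(key=lambda pair: pair[0])
    return [item for _, item in scored]
```

Each impression draws a fresh dithered order, so repeated visitors see slightly different result pages and the system learns about items it would otherwise never show.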
22. Available components
• Mahout
– LLR test for anomaly
– Cooccurrence computations
– Baseline components of Bayesian Bandits
• Solr
– Ready to roll for search
26. Input Data
• User transactions
– user id, merchant id
– SIC code, amount
• Offer transactions
– user id, offer id
– vendor id, merchant id’s,
– offers, views, accepts
27. Input Data
• User transactions
– user id, merchant id
– SIC code, amount
• Offer transactions
– user id, offer id
– vendor id, merchant id’s,
– offers, views, accepts
• Derived merchant data
– merchant id's
– SIC codes
– offer & vendor id's
• Derived user data – local top40
– SIC code
– vendor code
– amount distribution
28. Cross-recommendation
• Per merchant indicators
– merchant id’s
– chain id’s
– SIC codes
– offer vendor id’s
• Computed by finding anomalous (indicator =>
merchant) rates
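The anomalous (indicator => merchant) computation can be sketched in memory: count how often two items share a user history, then keep the pairs whose cooccurrence beats what popularity alone predicts (G² above a threshold). At production scale this is a Mahout job; the threshold value and helper names below are illustrative:

```python
import math
from collections import Counter

def xlogx(x):
    return x * math.log(x) if x > 0 else 0.0

def g2(k11, k12, k21, k22):
    # log-likelihood ratio for a 2x2 cooccurrence table
    return 2.0 * (xlogx(k11) + xlogx(k12) + xlogx(k21) + xlogx(k22)
                  + xlogx(k11 + k12 + k21 + k22)
                  - xlogx(k11 + k12) - xlogx(k21 + k22)
                  - xlogx(k11 + k21) - xlogx(k12 + k22))

def indicators(histories, threshold=10.0):
    """Find anomalous (indicator => merchant) pairs.
    histories: per-user lists of item ids (merchant id's, SIC codes,
    offer vendor id's...). A pair is kept when the items cooccur in
    user histories far more often than popularity predicts."""
    item_users = Counter()
    pair_users = Counter()
    n_users = len(histories)
    for h in histories:
        items = set(h)
        item_users.update(items)
        for a in items:                    # O(len^2) per history; fine
            for b in items:                # for a sketch, not for scale
                if a != b:
                    pair_users[(a, b)] += 1
    result = {}
    for (a, b), k11 in pair_users.items():
        k12 = item_users[a] - k11          # a without b
        k21 = item_users[b] - k11          # b without a
        k22 = n_users - k11 - k12 - k21    # neither
        if g2(k11, k12, k21, k22) > threshold:
            result.setdefault(b, []).append(a)   # a indicates b
    return result
```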
30. Search-based Recommendations
• Sample document
– Merchant Id
– Field for text description
– Phone
– Address
– Location
– Indicator merchant id’s
– Indicator industry (SIC) id’s
– Indicator offers
– Indicator text
– Local top40
31. Search-based Recommendations
• Sample document
– Merchant Id
– Field for text description
– Phone
– Address
– Location
– Indicator merchant id's
– Indicator industry (SIC) id's
– Indicator offers
– Indicator text
– Local top40
• Sample query
– Current location
– Recent merchant descriptions
– Recent merchant id's
– Recent SIC codes
– Recent accepted offers
– Local top40
32. Solr
[Architecture diagram: complete user history feeds a cooccurrence computation (Mahout); the indexer combines its output with item meta-data and writes Solr index shards]
33. Solr
[Architecture diagram: the web tier sends user searches to the Solr index shards; the indexer keeps the shards up to date from history and item meta-data]
34. Objective Results
• At a very large credit card company
• History is all transactions, all web interaction
• Processing time cut from 20 hours per day to 3
• Recommendation engine load time decreased
from 8 hours to 3 minutes
35. Platform Needs
• Need to root web services and search system on the
cluster
– Copying negates unification
• Legacy indexers are extremely fast … but they assume
conventional file access
• High performance search engines need high
performance file I/O
• Need coordinated process management
36. Additional Opportunities
• Cross recommend from search queries to
documents
• Result is semantic search engine
• Uses reflected intelligence instead of artificial
intelligence
38. Another Example
• Users enter queries (A)
– (actor = user, item=query)
• Users view videos (B)
– (actor = user, item=video)
• A’A gives query recommendation
– “did you mean to ask for”
• B’B gives video recommendation
– “you might like these videos”
39. The punch-line
• B’A recommends videos in response to a
query
– (isn’t that a search engine?)
– (not quite, it doesn’t look at content or meta-data)
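The A'A / B'B / B'A algebra is just cooccurrence counting over two event logs. A toy sketch with raw counts (a real system would LLR-filter the counts, as with the other cooccurrence work in this deck; the event data is invented):

```python
from collections import Counter, defaultdict

def cross_recommend(queries, views):
    """B'A as cooccurrence counting.
    queries: (user, query) events -> matrix A
    views:   (user, video) events -> matrix B
    (B'A)[query][video] counts users who issued the query and watched
    the video; no content or meta-data is consulted."""
    a = defaultdict(set)   # user -> queries issued
    b = defaultdict(set)   # user -> videos viewed
    for user, q in queries:
        a[user].add(q)
    for user, v in views:
        b[user].add(v)
    bta = defaultdict(Counter)   # query -> video counts
    for user in a:
        for q in a[user]:
            for v in b.get(user, ()):
                bta[q][v] += 1
    return bta

recs = cross_recommend(
    [("u1", "flamenco"), ("u2", "flamenco"), ("u3", "rock")],
    [("u1", "paco_live"), ("u2", "paco_live"), ("u3", "van_halen")])
# recs["flamenco"].most_common() ranks videos for that query
```

Counting with `a[user]` against `a[user]` instead gives A'A (query-to-query, "did you mean"), and `b` against `b` gives B'B (video-to-video).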
40. Real-life example
• Query: “Paco de Lucia”
• Conventional meta-data search results:
– “hombres del paco” times 400
– not much else
• Recommendation based search:
– Flamenco guitar and dancers
– Spanish and classical guitar
– Van Halen doing a classical/flamenco riff
42. Hypothetical Example
• Want a navigational ontology?
• Just put labels on a web page with traffic
– This gives A = users x label clicks
• Remember viewing history
– This gives B = users x items
• Cross recommend
– B’A = label to item mapping
• After several users click, results are whatever
users think they should be
43. Next Steps
• That is up to you
• But I can help
– platforms (Solr, MapR)
– techniques (Mahout, math)
tdunning@maprtech.com
@ted_dunning
@ApacheMahout