ENTERPRISE IT 20 x 20
Data Mining as an Engine of
Personalization
Jonathan LeBlanc (@jcleblanc)
The Web is Becoming Personal
Premise
You can determine the personality
profile of a person based on their
browsing habits
Then I Read This…
Us & Them
The Science of Identity
By David Berreby
Different States of Knowledge
What a person knows
What a person knows they don’t know
What a person doesn’t know they don’t know
Technology was NOT the Solution
Identity and discovery are
NOT a technology solution
Our Subject Material
HTML content is poorly structured
There are some pretty bad web
practices on the interwebz
You can’t trust that anything
semantically valid will be present
The Basic Pieces
Page Data
Scrapey
Scrapey
Keywords
Without all
the fluff
Weighting
Word diets
FTW
Capture Raw Page Data
Semantic data on the web
is sucktastic
Assume 5 year olds built
the sites
Language is the key
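The slide above argues for ignoring unreliable markup and working from the language on the page. The following is a minimal sketch of that idea, not the speaker's implementation. It assumes Python's standard `html.parser` module; the names `TextExtractor` and `extract_text` are hypothetical. It tolerates unclosed tags and skips script and style contents.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped tag.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the page's visible text as one space-separated string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

The parser never assumes the markup is valid, which matches the "assume 5 year olds built the sites" advice.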
Extract Keywords
We now have a big jumble
of words. Let’s extract
Why is “and” a top word?
Stop words = sad panda
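To illustrate why "and" ends up as a top word and how stop-word filtering fixes it, here is a small sketch. The stop-word list is a tiny illustrative assumption (a real list would be far longer), and `extract_keywords` is a hypothetical helper name, not something from the talk.

```python
import re
from collections import Counter

# Illustrative stop-word list only; an assumption, not the talk's list.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "for", "on", "with"}


def extract_keywords(text, top_n=5):
    """Count words in text, ignoring stop words, and return the top_n."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 1)
    return counts.most_common(top_n)
```

Without the `STOP_WORDS` filter, common connecting words would dominate the counts on almost any page.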
Weight Keywords
All content is not created
equal
Pay special attention to
high value tags & content
location
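One way to act on "high value tags" is to score each word by the most valuable tag that encloses it. The sketch below assumes Python's standard `html.parser`; the weights in `TAG_WEIGHTS` are made-up illustrative numbers, and `weighted_keywords` is a hypothetical name. It covers tags only, not content location on the page.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical weights; these numbers are assumptions, not from the talk.
TAG_WEIGHTS = {"title": 5.0, "h1": 4.0, "h2": 3.0, "strong": 2.0, "b": 2.0}
DEFAULT_WEIGHT = 1.0
SKIP_TAGS = {"script", "style"}
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "for", "on", "with"}


class WeightedKeywordParser(HTMLParser):
    """Scores words by the highest-weighted tag currently open around them."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.scores = Counter()

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy nesting.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        if any(t in SKIP_TAGS for t in self.open_tags):
            return
        weight = max(
            (TAG_WEIGHTS.get(t, DEFAULT_WEIGHT) for t in self.open_tags),
            default=DEFAULT_WEIGHT,
        )
        for word in re.findall(r"[a-z']+", data.lower()):
            if word not in STOP_WORDS:
                self.scores[word] += weight


def weighted_keywords(html):
    """Return a Counter mapping each keyword to its weighted score."""
    parser = WeightedKeywordParser()
    parser.feed(html)
    parser.close()
    return parser.scores
```

A word in the page title counts five times as much as the same word in body text under these assumed weights; the right values would need tuning against real pages.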
Expanding to Phrases
2-3 adjacent words, making
up a direct relevant callout
Seems easy right? Just like
single words
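Phrases are harder than single words because adjacency matters: stripping stop words first, or crossing a sentence boundary, produces phrases nobody wrote. A rough sketch of one possible approach follows. The function name `extract_phrases`, the stop-word list, and the rule of dropping phrases that start or end with a stop word are all assumptions for illustration.

```python
import re
from collections import Counter

# Illustrative stop-word list only; an assumption, not the talk's list.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "for", "on", "with", "i"}


def extract_phrases(text, sizes=(2, 3)):
    """Count 2- and 3-word phrases made of adjacent words."""
    counts = Counter()
    # Split on punctuation so a phrase never spans a sentence or clause.
    for segment in re.split(r"[.,;:!?()\n]+", text.lower()):
        words = re.findall(r"[a-z']+", segment)
        for n in sizes:
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                # Drop phrases that begin or end with a stop word.
                if gram[0] in STOP_WORDS or gram[-1] in STOP_WORDS:
                    continue
                counts[" ".join(gram)] += 1
    return counts
```

Stop words are kept inside a three-word phrase (as in "mining is fun") so that the phrase still reflects words that were actually adjacent.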
Working with Unknown Users
The majority of users won’t
be immediately targetable
Tracking Emotional Change
You have to be aware of
personality changes
Tracking users as they
use your service
Using On Demand Tracking
Traits of the Bored
Distraction
Repetition
Tiredness
Reasons for Boredom
Lack of interest
Readiness
Adding in Time Interactions
Time and interaction need
to be accounted for
Gift buying seasons see
interest variations
Grouping Using Commonality
Interests
User A
Interests
User B
Interests
Common
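The overlap pictured on this slide can be expressed as a set intersection. The sketch below adds a Jaccard similarity score as one possible way to rank how alike two users are; that measure is a swapped-in assumption, not something the slides name, and both function names are hypothetical.

```python
def common_interests(a, b):
    """Return the interests shared by both users."""
    return set(a) & set(b)


def similarity(a, b):
    """Jaccard similarity: shared interests divided by all interests."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


user_a = {"cycling", "photography", "cooking"}
user_b = {"cooking", "photography", "gaming", "travel"}
shared = common_interests(user_a, user_b)
score = similarity(user_a, user_b)
```

Users with a high score could be grouped together, so that interests known for one may be suggested to the other.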
A Closing Thought
Just because you can do something,
doesn’t mean you should
People are no longer satisfied with flat, single-output websites that do not personalize to the needs and differences of each viewer. With the wealth of data and interaction mining techniques being employed in everything from online sites to brick and mortar stores, we are truly seeing a major industry shift towards automatic personalization.

This session will cover the concepts of long-term personalization and on-demand emotional state interaction, which in turn can be used as the architecture to drive commerce and personalization.

Published in: Technology, Education
  • The web is going towards personalization – no one wants a flat experience anymore
  • How we’ll capture the data: start with base linguistics, then extend with available extras
  • Language gets wonky without stop words
  • Use HTML5 LocalStorage & Cookie backup
  • Data Mining as an Engine of Personalization

    1. ENTERPRISE IT 20 x 20 Data Mining as an Engine of Personalization Jonathan LeBlanc (@jcleblanc)
    2. The Web is Becoming Personal
    3. Premise You can determine the personality profile of a person based on their browsing habits
    4. Then I Read This… Us & Them The Science of Identity By David Berreby
    5. Different States of Knowledge What a person knows What a person knows they don’t know What a person doesn’t know they don’t know
    6. Technology was NOT the Solution Identity and discovery are NOT a technology solution
    7. Our Subject Material
    8. HTML content is poorly structured There are some pretty bad web practices on the interwebz You can’t trust that anything semantically valid will be present
    9. The Basic Pieces Page Data Scrapey Scrapey Keywords Without all the fluff Weighting Word diets FTW
    10. Capture Raw Page Data Semantic data on the web is sucktastic Assume 5 year olds built the sites Language is the key
    11. Extract Keywords We now have a big jumble of words. Let’s extract Why is “and” a top word? Stop words = sad panda
    12. Weight Keywords All content is not created equal Pay special attention to high value tags & content location
    13. Expanding to Phrases 2-3 adjacent words, making up a direct relevant callout Seems easy right? Just like single words
    14. Working with Unknown Users The majority of users won’t be immediately targetable
    15. Tracking Emotional Change You have to be aware of personality changes Tracking users as they use your service
    16. Using On Demand Tracking Traits of the Bored Distraction Repetition Tiredness Reasons for Boredom Lack of interest Readiness
    17. Adding in Time Interactions Time and interaction need to be accounted for Gift buying seasons see interest variations
    18. Grouping Using Commonality Interests User A Interests User B Interests Common
    19. A Closing Thought Just because you can do something, doesn’t mean you should