THE SIDEKICK PATTERN:
USING SMALL DATA TO MULTIPLY
THE VALUE OF BIG DATA
@AbeGong
Data Scientist, Jawbone
Strata - February 2014
Wednesday, February 12, 14
DATA SIDEKICKS

EX: HIEROGLYPH
TRANSLATION

EX: CAMPAIGN TARGETING

EX: SLEEP CONTEXT

[DATA ART EXAMPLE]
SUB-TITLE

EXAMPLES, PLEASE:
WHICH DATA STREAMS GET BIG?
(...AND BESIDES SIZE, WHAT ELSE DO THEY HAVE IN COMMON?)

BIG, RICH, MESSY

BIG, RICH, MESSY CAREFULLY CURATED

TRANSMUTATION!

EX: HUFFPO MODERATION

WHEN SHOULD I USE THE
SIDEKICK PATTERN?
• To separate munging and cleaning from scaling.
• To bootstrap new data products.
• To leverage variety against volume.

EX: SLEEP RECOVERY

LEVELS OF ABSTRACTION

QUESTIONS? COMMENTS?

@AbeGong
Data Scientist, Jawbone
Strata - February 2014

Big
Rich
Messy
Sensory
User experience
External-facing

Abstract
Business logic
Internal-facing

“Qualitative”
Story-making

Small
Focused
Curated

“Quantitative”
Science-making

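TRANSMUTATION EXAMPLES
Example: Property
Rosetta stone: Synonyms/Comparability
Bridge cases in IRT scaling models: Relative ranking
Campaign targeting: Demographic categories
Sentiment analysis: Categories
Sleep context: Context
Pretty much all supervised learning: Categories/Scales
Instrumental variables: Causality
...
HuffPo moderation: Credibility
Sleep recovery: Clean examples
Economic mobility: Continuity
Crowdflower gold: Credibility
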
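RECOMMENDED READING
• Paco Nathan: http://www.slideshare.net/pacoid/using-cascalog-to-build-an-appbased-on-city-of-palo-alto-open-data
• Jay Kreps: http://engineering.linkedin.com/distributed-systems/log-what-everysoftware-engineer-should-know-about-real-time-datas-unifying
• Joseph Turian: http://files.meetup.com/1542972/20120202-more-data-samemodels-STUDY-SLIDES.pdf
• Pete Skomoroch: http://www.slideshare.net/pskomoroch/strataendorsements-16939466
• Me: http://blog.abegong.com/2014/02/wanted-good-examples-of-datasidekicks.html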

The Sidekick Pattern: Strata talk by Abe Gong

Slides from my Strata talk: http://strataconf.com/strata2014/public/schedule/speaker/163953

Abstract: Creating value from big, messy data sets can be a daunting task. The session introduces the Sidekick Pattern: using small, curated data to increase the value of Big Data. Drawing on lessons from data science for Jawbone’s UP fitness tracker, we will see how smart selection of data sidekicks can accelerate analysis, solve cold start problems, and simplify complicated data pipelines.
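
A minimal sketch of the pattern described in the abstract, assuming a hypothetical stream of free-text sleep notes and a small hand-curated lookup table. Neither comes from the talk; every name and value below is made up for illustration. The small table plays the sidekick: it maps messy raw values to clean categories, so the large stream can be summarized without cleaning each record by hand.

```python
from collections import Counter

# Hypothetical sidekick: a small, hand-curated map from messy notes to clean
# categories. In practice this is the piece a person reviews and maintains.
SIDEKICK = {
    "nap": "daytime_sleep",
    "napped": "daytime_sleep",
    "jetlag": "travel",
    "jet lag": "travel",
    "red eye": "travel",
    "newborn": "caregiving",
}


def normalize(text):
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def label_events(events, sidekick):
    """Yield (event, category) pairs; category is None if the sidekick has no entry."""
    for event in events:
        yield event, sidekick.get(normalize(event["note"]))


def summarize(events, sidekick):
    """Count events per clean category, tracking what the sidekick cannot label."""
    counts = Counter()
    for _, category in label_events(events, sidekick):
        counts[category or "unlabeled"] += 1
    return counts


if __name__ == "__main__":
    # Stand-in for the big, messy stream; a real one would be read lazily.
    stream = [{"note": " Jet  Lag "}, {"note": "NAP"}, {"note": "???"}]
    print(summarize(stream, SIDEKICK))
```

The "unlabeled" count is a rough signal for where the curated table needs more entries. Growing the table is cheap compared with reprocessing the stream.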

The Sidekick Pattern: Strata talk by Abe Gong

  1. THE SIDEKICK PATTERN: USING SMALL DATA TO MULTIPLY THE VALUE OF BIG DATA @AbeGong Data Scientist, Jawbone Strata - February 2014 Wednesday, February 12, 14
  2. (no text)
  3. (no text)
  4. (no text)
  5. (no text)
  6. (no text)
  7. (no text)
  8. DATA SIDEKICKS
  9. EX: HIEROGLYPH TRANSLATION
  10. EX: HIEROGLYPH TRANSLATION
  11. EX: HIEROGLYPH TRANSLATION
  12. EX: CAMPAIGN TARGETING
  13. EX: CAMPAIGN TARGETING
  14. EX: CAMPAIGN TARGETING
  15. EX: SLEEP CONTEXT
  16. EX: SLEEP CONTEXT
  17. EX: SLEEP CONTEXT
  18. [DATA ART EXAMPLE] SUB-TITLE
  19. (no text)
  20. EXAMPLES, PLEASE: WHICH DATA STREAMS GET BIG? (...AND BESIDES SIZE, WHAT ELSE DO THEY HAVE IN COMMON?)
  21. BIG, RICH, MESSY
  22. BIG, RICH, MESSY CAREFULLY CURATED
  23. TRANSMUTATION!
  24. EX: HUFFPO MODERATION
  25. (no text)
  26. (no text)
  27. EX: HUFFPO MODERATION
  28. EX: HUFFPO MODERATION
  29. WHEN SHOULD I USE THE SIDEKICK PATTERN?
  30. WHEN SHOULD I USE THE SIDEKICK PATTERN? • To separate munging and cleaning from scaling.
  31. WHEN SHOULD I USE THE SIDEKICK PATTERN? • To separate munging and cleaning from scaling. • To bootstrap new data products.
  32. WHEN SHOULD I USE THE SIDEKICK PATTERN? • To separate munging and cleaning from scaling. • To bootstrap new data products. • To leverage variety against volume.
  33. EX: SLEEP RECOVERY
  34. EX: SLEEP RECOVERY
  35. EX: SLEEP RECOVERY
  36. EX: SLEEP RECOVERY
  37. (no text)
  38. (no text)
  39. LEVELS OF ABSTRACTION
  40. LEVELS OF ABSTRACTION
  41. LEVELS OF ABSTRACTION
  42. QUESTIONS? COMMENTS? @AbeGong Data Scientist, Jawbone Strata - February 2014
  43. (no text)
  44. Big, Rich, Messy; Sensory, User experience, External-facing; Abstract, Business logic, Internal-facing; “Qualitative”, Story-making; Small, Focused, Curated; “Quantitative”, Science-making
  45. TRANSMUTATION EXAMPLES (Example: Property)
      Rosetta stone: Synonyms/Comparability
      Bridge cases in IRT scaling models: Relative ranking
      Campaign targeting: Demographic categories
      Sentiment analysis: Categories
      Sleep context: Context
      Pretty much all supervised learning: Categories/Scales
      Instrumental variables: Causality
      ...
      HuffPo moderation: Credibility
      Sleep recovery: Clean examples
      Economic mobility: Continuity
      Crowdflower gold: Credibility
  46. RECOMMENDED READING
      • Paco Nathan: http://www.slideshare.net/pacoid/using-cascalog-to-build-an-appbased-on-city-of-palo-alto-open-data
      • Jay Kreps: http://engineering.linkedin.com/distributed-systems/log-what-everysoftware-engineer-should-know-about-real-time-datas-unifying
      • Joseph Turian: http://files.meetup.com/1542972/20120202-more-data-samemodels-STUDY-SLIDES.pdf
      • Pete Skomoroch: http://www.slideshare.net/pskomoroch/strataendorsements-16939466
      • Me: http://blog.abegong.com/2014/02/wanted-good-examples-of-datasidekicks.html