Street Fighting Data Science

Practical problem solving with data involves more than just visualization or applying the latest machine learning techniques. Intuition, domain knowledge, and reasonable approximations can mean the difference between a successful model and a catastrophic failure. We’ll dive into some best practices I’ve extracted from solving real world problems like computing trending topics, cleaning election data, and ranking experts on social networks.

New analysts or engineers are often lost when textbook approaches fail on real-world data. Drawing inspiration from problem-solving techniques in mathematics and physics, we will walk through examples that illustrate how to come up with creative solutions and solve problems with big data.

Street Fighting Data Science: Presentation Transcript

  • Street Fighting Data Science. Pete Skomoroch (@peteskomoroch), O’Reilly Strata Conference, February 28, 2012
  • To solve hard problems:
  • Think like a street fighter
  • Analyze, Improvise, Anticipate, Adapt
  • How does this apply to Data Science?
  • Pricing model decreases profit in test stores by 30%
  • What went wrong?
  • • Ran complex “black box” model
    • Didn’t analyze the data first
    • Didn’t anticipate elasticity errors
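To make the "elasticity errors" point concrete, here is a minimal sketch of how a misestimated price elasticity can erase profit. The talk does not describe the actual pricing model, so everything here is an assumption for illustration: a constant-elasticity demand curve, a fixed unit cost, and the hypothetical function names `demand`, `profit`, `optimal_price`, and `profit_shortfall`.

```python
def demand(price, elasticity, scale=1.0):
    """Constant-elasticity demand: q = scale * price ** (-elasticity). Assumed form."""
    return scale * price ** (-elasticity)


def profit(price, unit_cost, elasticity, scale=1.0):
    """Profit at a given price under the assumed demand curve."""
    return (price - unit_cost) * demand(price, elasticity, scale)


def optimal_price(unit_cost, elasticity):
    """Profit-maximizing price for constant-elasticity demand (needs elasticity > 1)."""
    if elasticity <= 1:
        raise ValueError("constant-elasticity pricing needs elasticity > 1")
    return unit_cost * elasticity / (elasticity - 1)


def profit_shortfall(unit_cost, assumed_elasticity, true_elasticity):
    """Fraction of achievable profit lost by pricing with the wrong elasticity."""
    chosen = optimal_price(unit_cost, assumed_elasticity)
    best = optimal_price(unit_cost, true_elasticity)
    achieved = profit(chosen, unit_cost, true_elasticity)
    achievable = profit(best, unit_cost, true_elasticity)
    return 1.0 - achieved / achievable


# Hypothetical numbers: the model assumes elasticity 1.5, customers behave like 3.0.
print(round(profit_shortfall(1.0, 1.5, 3.0), 3))
```

Under these assumed numbers the mispriced store gives up half of the profit it could have earned, which is why it can pay to stress-test a pricing model against a range of plausible elasticities before rolling it out.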
  • How could this have been avoided?
  • The Men Who Stare at Charts
  • Look at your data
  • Raw Data: FEC Contributions
  • not employed              118672  retired                                 32938
    self employed             92973   self-employed                           25454
    information requested     17627   information requested per best efforts  1313
    refused                   728     homemaker                               4992
    unemployed                1493    the bank of new york                    65
    self-employed             5919    john mccain 2008                        57
    university of california  825     u.s. government                         121
    microsoft                 915     idt corp.                               54
    university of chicago     616     merrill lynch                           273
    harvard university        848     blank rome l.l.p.                       51
    google                    662     department of defense                   100
    stanford university       716     u.s. army                               90
    university of washington  614     us army                                 141
    ibm                       1016    none                                    642
    columbia university       782     greenberg traurig                       118
    university of michigan    514     northrop grumman                        105
    freelance                 372     at&t                                    141
    sa                        150     citigroup                               134
    sidley austin llp         509     bridgewater associates                  44
    na                        999     univision communications inc.           36
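The raw values above contain near-duplicates such as "self employed" / "self-employed" and "u.s. army" / "us army", plus several spellings of "no answer". A minimal sketch of one way to collapse them before counting follows. The normalization rules, the `MISSING` set, and the function names are assumptions for illustration, not the method used in the talk. The sample counts are copied from the slide, but whether those rows belong to a single tally is not stated, so the merged totals are illustrative only.

```python
import re
from collections import Counter

# Sample (value, count) pairs copied from the slide; pooling them is illustrative.
RAW_COUNTS = [
    ("self employed", 92973),
    ("self-employed", 5919),
    ("u.s. army", 90),
    ("us army", 141),
    ("na", 999),
    ("none", 642),
]

# Assumed set of strings that mean "no usable answer".
MISSING = {"na", "none", ""}


def normalize(value):
    """Lowercase, drop periods, and treat hyphens/underscores as spaces."""
    text = value.strip().lower()
    text = text.replace(".", "")            # "u.s. army" -> "us army"
    text = re.sub(r"[-_]+", " ", text)      # "self-employed" -> "self employed"
    text = re.sub(r"\s+", " ", text).strip()
    return "(missing)" if text in MISSING else text


def merge_counts(rows):
    """Sum counts for all raw values that share a normalized form."""
    totals = Counter()
    for value, count in rows:
        totals[normalize(value)] += count
    return totals


for value, count in merge_counts(RAW_COUNTS).most_common():
    print(f"{value:15s} {count}")
```

Simple string rules like these only catch the obvious variants; looking at the merged output is still the quickest way to spot the next class of duplicates.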
  • Katherine Alexandra
  • “Don’t indulge in any unnecessary, sophisticated moves. You’ll get clobbered if you do, and in a street fight you’ll have your shirt zipped off you.” - Bruce Lee
  • Look at your errors
  • • Sanity check row counts
    • Track errors over time
    • Find patterns in the error data
    • Add missing features to models
    • Replace models entirely
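The first two checks in the list above can be sketched in a few lines. This is a rough illustration under stated assumptions, not code from the talk: the function names, the "twice the average of earlier days" spike rule, and the example dates are all hypothetical.

```python
from collections import defaultdict


def check_row_counts(n_input, n_output, n_rejected):
    """Every input row should be either written out or explicitly rejected.

    Returns (ok, unaccounted), where unaccounted is the number of rows that vanished.
    """
    unaccounted = n_input - n_output - n_rejected
    return unaccounted == 0, unaccounted


def error_rate_by_day(records):
    """records: iterable of (day, is_error) pairs. Returns {day: error fraction}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for day, is_error in records:
        totals[day] += 1
        errors[day] += int(is_error)
    return {day: errors[day] / totals[day] for day in sorted(totals)}


def flag_spikes(rates, factor=2.0):
    """Flag days whose error rate exceeds `factor` times the mean of earlier days."""
    flagged, history = [], []
    for day, rate in rates.items():
        if history and rate > factor * (sum(history) / len(history)):
            flagged.append(day)
        history.append(rate)
    return flagged


# Hypothetical log: 10% errors on the first two days, 50% on the third.
records = (
    [("2012-02-01", i < 1) for i in range(10)]
    + [("2012-02-02", i < 1) for i in range(10)]
    + [("2012-02-03", i < 5) for i in range(10)]
)
print(check_row_counts(n_input=1000, n_output=990, n_rejected=7))
print(flag_spikes(error_rate_by_day(records)))
```

A flagged day is only a prompt to go read the failing rows; the later items in the list (finding patterns, adding features, replacing the model) depend on what that inspection turns up.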
  • Analyze, Improvise, Anticipate, Adapt
  • Think like a street fighter