Amy Langville, Associate Professor of Mathematics, The College of Charleston in South Carolina at MLconf ATL

My talk will cover four ranking and clustering projects that I consulted on this past year. The projects range from ranking Olympic athletes, mixed martial arts fighters, and cell phone carriers to clustering sentences to rank individuals by how much humility they evidence in their written language. For each project, I will address the particular data challenges and the solutions and techniques we proposed.

Published in: Technology

  1. 4 Consulting Projects from this past year. September 19, 2014. Machine Learning 2014. Amy Langville, Mathematics Department, College of Charleston, langvillea@cofc.edu
  2. 4 Consulting Projects from this past year. Tyler Perini, Mathematics Department, College of Charleston, perinita@g.cofc.edu. Amy Langville, Mathematics Department, College of Charleston, langvillea@cofc.edu
  3. 4 Consulting Projects from this past year. Tyler Perini, Mathematics Department, College of Charleston, perinita@g.cofc.edu. Amy Langville, Mathematics Department, College of Charleston, langvillea@cofc.edu
  4. Outline: 2 Books generate questions; US Olympic Projects; CageRank; Ranking Cell Phone Carriers; The Humility Project
  5. 2 Books generate questions. 1232-1315
  6. 2 Books generate questions. 1232-1315. "Chapter 7 talks about . . . but I need to . . . Any advice?"
  7. 2 Books generate questions. 1232-1315. "Chapter 7 talks about . . . but I need to . . . Any advice?" "I really enjoyed your book, but my problem is . . ., which you don’t mention. How do I solve it?"
  8. Project 1: from U.S. Olympic Committee
  9. Project 1: from U.S. Olympic Committee. Problem 1: "Your book talks a lot about ranking in head-to-head contests (and that was helpful), but we need to rank multi-competitor sports like downhill skiing and gymnastics."
  10. Project 1: from U.S. Olympic Committee. Problem 1: "Your book talks a lot about ranking in head-to-head contests (and that was helpful), but we need to rank multi-competitor sports like downhill skiing and gymnastics." Solution 1: μ = average skill, σ = uncertainty
  11. Project 1: from U.S. Olympic Committee. Problem 1. Solution 1: TrueSkill
  12. Project 1: from U.S. Olympic Committee. [Figure labels: 1st, 3rd, 2nd]
  13. Project 1: from U.S. Olympic Committee. [Figure labels: 1st, 3rd, 2nd]
  14. Project 1: from U.S. Olympic Committee. [Figure labels: 2nd, 3rd, 1st]
  15. Project 1: from U.S. Olympic Committee. Problem 2: "Your book talks a lot about ranking in head-to-head contests where there are multiple matches between competitors, but our data is sparse. Any advice?"
  16. Project 2: CageRank. Problem: "You talk a lot about ranking head-to-head contests, like ours [MMA fights], but our data is really sparse. How do we deal with that?"
  17. Project 2: CageRank. Problem: "You talk a lot about ranking head-to-head contests, like ours [MMA fights], but our data is really sparse. How do we deal with that?" Solution: to densify the graph
  18. UFC 163: Phil Davis, Lyoto Machida
  19. UFC 163: Phil Davis, Lyoto Machida. Had never fought each other
  20. College football vs. UFC
  21. UFC 163: Phil Davis, Lyoto Machida. Find 10 most similar fighters to each. Similar by? Fightmetric stats.

      Rank  Similar to Phil Davis   Similar to Lyoto Machida
      1     Ricardo Arona           Rashad Evans
      2     Jason Brilz             Ryan Bader
      3     Ryan Bader              Alexander Gustafsson
      4     Stephan Bonnar          Antonio Rogerio Nogueira
      5     Randy Couture           Quinton “Rampage” Jackson
      6     Trevor Prangley         Chael Sonnen
      7     Tito Ortiz              Matt Hamill
      8     Mark Coleman            James Te-Huna
      9     Ovince St. Preux        Dan Henderson
      10    Chael Sonnen            Vladimir Matyushenko

  22. UFC 163: Phil Davis, Lyoto Machida. Same two lists of 10 similar fighters as slide 21. [Figure label: 6]
  23. UFC 163: Phil Davis, Lyoto Machida. Same two lists of 10 similar fighters as slide 21. [Figure labels: 1, 2, 6] Question: is the goal to predict the winner or generate buzz?
  24. Project 3: Ranking Cell Phone Carriers. Problem: "Rather than individual games between carriers, we have a distribution of game scores for each carrier. How do we use this data to rank carriers?"
  25. Project 3: Ranking Cell Phone Carriers. Problem: "Rather than individual games between carriers, we have a distribution of game scores for each carrier. How do we use this data to rank carriers?" Solution: . . ., then rank aggregate by . . . (#carriers each carrier outranks).
  26. Project 3: Ranking Cell Phone Carriers. Problem: "Rather than individual games between carriers, we have a distribution of game scores for each carrier. How do we use this data to rank carriers?" Solution: simulate head-to-head games by random draws from data, then rank aggregate by Borda count (#carriers each carrier outranks). New Problem: data is loaded with ties!
  27. Project 3: Ranking Cell Phone Carriers. Question: what makes a model good? Stability in the face of small data changes. Explainability to public.
  28. Project 4: Humility Project. Problem: "We’re trying to analyze a person’s writing to predict his/her humility, but we lost our data guy. Can you help us?"
  29. Project 4: Humility Project. Problem: "We’re trying to analyze a person’s writing to predict his/her humility, but we lost our data guy. Can you help us?" Solution: nonnegative matrix factorization (NMF) to find hidden clusters in text.
  30. Project 4: Humility Project
  31. Project 4: Humility Project
  32. Project 4: Humility Project
  33. Project 4: Humility Project
  34. Project 4: Humility Project
  35. Conclusions. We need you. You open our eyes to problems we never would have thought about. Iterative Collaboration. Many exist. Some just need tweaking.
  36. Conclusions. We need you. You open our eyes to problems we never would have thought about. Iterative Collaboration. Many exist. Some just need tweaking. Future Work . . . (you tell me)
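
The Project 3 solution (simulate head-to-head games by random draws from the data, then rank aggregate by Borda count) might be sketched roughly as follows. This is only an illustration, not the speakers' code: the function name `simulate_borda`, the assumption that a higher score is better, and the half-point awarded for a tie are all choices made here, the last one as one simple way to cope with data that is "loaded with ties".

```python
import random


def simulate_borda(scores_by_carrier, n_rounds=1000, seed=0):
    """Rank carriers by their average Borda points over simulated contests.

    scores_by_carrier maps a carrier name to its list of observed scores.
    Assumptions (not from the slides): higher scores are better, and a tie
    gives each of the two carriers half a point.
    """
    rng = random.Random(seed)
    carriers = list(scores_by_carrier)
    totals = {c: 0.0 for c in carriers}

    for _ in range(n_rounds):
        # One simulated "game day": draw a single score for every carrier.
        draw = {c: rng.choice(scores_by_carrier[c]) for c in carriers}
        # Borda points: one point per carrier outscored in this draw.
        for a in carriers:
            for b in carriers:
                if a == b:
                    continue
                if draw[a] > draw[b]:
                    totals[a] += 1.0
                elif draw[a] == draw[b]:
                    totals[a] += 0.5

    averages = {c: totals[c] / n_rounds for c in carriers}
    ranking = sorted(carriers, key=lambda c: averages[c], reverse=True)
    return ranking, averages
```

Calling `simulate_borda({"A": [9, 10], "B": [5, 6], "C": [1, 2]})` with these made-up carriers returns the ranking together with each carrier's average Borda points. Rerunning with different seeds or slightly perturbed data is one way to probe the stability question raised on slide 27.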

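
For Project 2, the slides say only that each fighter's 10 most similar fighters were found from Fightmetric stats in order to densify the graph. One plausible reading is sketched below. The Euclidean distance on raw stat vectors, the function names `most_similar` and `proxy_fights`, and the idea of crediting a past fight between two look-alikes to the fighters they resemble are all assumptions made for illustration.

```python
import math


def most_similar(target, stats, k=10):
    """Return the k fighters whose stat vectors lie closest to the target's.

    stats maps a fighter name to a tuple of numeric statistics.
    Euclidean distance is an assumption; the slides do not name a metric.
    """
    others = [name for name in stats if name != target]
    others.sort(key=lambda name: math.dist(stats[target], stats[name]))
    return others[:k]


def proxy_fights(a, b, stats, past_fights, k=10):
    """Collect past fights that can stand in for an a-versus-b matchup.

    past_fights is a list of (winner, loser) pairs. A fight counts as a
    proxy when one participant resembles a and the other resembles b.
    Each result is (winner, loser, fighter_credited).
    """
    group_a = set(most_similar(a, stats, k)) | {a}
    group_b = set(most_similar(b, stats, k)) | {b}
    proxies = []
    for winner, loser in past_fights:
        if winner in group_a and loser in group_b:
            proxies.append((winner, loser, a))
        elif winner in group_b and loser in group_a:
            proxies.append((winner, loser, b))
    return proxies
```

A fighter can appear in both similarity groups, as Ryan Bader and Chael Sonnen do on slide 21; this sketch credits such a fight to the first fighter, which a real analysis would want to handle more carefully.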
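
The Project 4 slides name NMF as the tool for finding hidden clusters in text but show no details. A minimal sketch of the idea follows, assuming a small sentence-by-term count matrix and using the standard multiplicative update rules for the squared-error objective. The helper names, the random initialization, and the rule of assigning each sentence to its largest factor are illustrative choices, not the speakers' implementation.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V (sentences x terms) as V ~ W H."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), elementwise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h


def cluster_labels(w):
    """Assign each row (sentence) to the factor with the largest weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in w]
```

With a toy matrix such as `[[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 1, 2], [0, 0, 2, 1]]`, calling `nmf(v, rank=2)` and then `cluster_labels(w)` gives one cluster label per sentence. Because the entries of W and H stay nonnegative, each factor can be read as a group of terms that tend to occur together.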