quant skillz beyond wall st: deriving value from large, non-financial datasets


This presentation was prepared for a talk on 2014.08.06 at the NYC Algorithmic Trading meetup (http://www.meetup.com/NYC-Algorithmic-Trading/events/197749772/)

Regardless of whether you call it "data science", "business intelligence", "analytics", "statistics" or just plain old "math", we have many tried and true techniques for dealing with uncertainty (particularly in quantitative finance). But ambiguity—what problem do we need to solve in the first place?—is a separate matter and, at least in my experience, is the hardest part of creating value from data. During this talk, I'll discuss how we address ambiguity by giving a guided tour of some of our client projects, such as how to reduce legal e-discovery costs by 99% (hint: supervised binary classification of text documents) or how to assemble project teams on emerging R&D opportunities in a multinational organization (hint: unsupervised classification of employee expertise).
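The first hint above, supervised binary classification of text documents, can be sketched roughly as follows. This is a minimal illustration and not the method used in the client project: it assumes a multinomial naive Bayes model with add-one smoothing, and the function names (`train`, `log_odds`, `is_relevant`) and the four toy training documents are all invented for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(labeled_docs):
    """Count words per class from (text, is_relevant) pairs.

    Assumes at least one example of each class is present.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def log_odds(model, text):
    """Log odds that `text` is relevant; positive means 'relevant'."""
    word_counts, doc_counts, vocab = model
    totals = {label: sum(word_counts[label].values()) for label in (True, False)}
    score = math.log(doc_counts[True] / doc_counts[False])  # class prior
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        # Add-one (Laplace) smoothing avoids zero probabilities.
        p_true = (word_counts[True][word] + 1) / (totals[True] + len(vocab))
        p_false = (word_counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_true / p_false)
    return score


def is_relevant(model, text):
    return log_odds(model, text) > 0


# Hypothetical hand-labeled examples: True = relevant to the case.
TRAINING = [
    ("patent claims for the algorithm design", True),
    ("prior art and patent filing for the algorithm", True),
    ("fantasy football picks for this week", False),
    ("lunch and coffee at noon", False),
]

model = train(TRAINING)
print(is_relevant(model, "patent filing for the new algorithm"))
print(is_relevant(model, "coffee after fantasy football"))
```

In practice the labeled examples would come from human reviewers, and the score would more usefully be treated as a degree of confidence than as a hard yes/no.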


Transcript

  • 1. quant skillz beyond wall st: deriving value from large, non-financial datasets (@deanmalmgren, @DsAtweet; NYC Algorithmic Trading, August 2014)
  • 2. @deanmalmgren | bit.ly/design-data | data scientists thrive with ambiguity: solve for x; x = 5 + 2; project evolution
  • 3. data scientists thrive with ambiguity: solve for x; x = 5 + 2; project evolution; A x = b
  • 4. data scientists thrive with ambiguity: solve for x; x = 5 + 2; project evolution; A x = b; optimize A x = b subject to f(x) > 0
  • 5. data scientists thrive with ambiguity: solve for x; x = 5 + 2; project evolution; A x = b; optimize f(x); optimize A x = b subject to f(x) > 0
  • 6. data scientists thrive with ambiguity: solve for x; x = 5 + 2; project evolution; A x = b; optimize f(x); optimize A x = b subject to f(x) > 0; optimize “our profitability”
  • 7. origins of ambiguity: many feasible approaches
  • 8. origins of ambiguity: unclear problems; identify the best locations to plant new trees
  • 9. origins of ambiguity: unclear problems; identify the best locations to plant new trees; how many? what kinds of trees? move old trees? replace old trees?
  • 10. origins of ambiguity: unclear problems; identify the best locations to plant new trees; how many? what kinds of trees? move old trees? replace old trees? aesthetically pleasing? maximize growth? increase foliage? offset CO2 emissions?
  • 11. “design process” is used everywhere: generate hypotheses; build prototype; evaluate feedback; anticipate failure; 1-4 week iterations
  • 12. “design process” is used everywhere (human-centered design / lean startup / agile programming): generate hypotheses (personas, scenarios, use cases / business/product requirements / story/user cards); build prototype (build device prototypes / minimum viable product / write code); evaluate feedback (surveys, interviews, focus groups / split testing, A/B testing / QA; requirements churn); anticipate failure; 1-4 week iterations
  • 13. design and data science challenges in practice: generate hypotheses; build prototype; evaluate feedback; 1-4 week iterations
  • 14. design and data science challenges in practice: generate hypotheses; build prototype; evaluate feedback; problem lost in translation; 1-4 week iterations
  • 15. design and data science challenges in practice: generate hypotheses; build prototype; evaluate feedback; problem lost in translation; takes a long time to collect data, analyze, and build visualization; 1-4 week iterations
  • 16. design and data science challenges in practice: generate hypotheses; build prototype; evaluate feedback; proof is in the pudding; problem lost in translation; takes a long time to collect data, analyze, and build visualization; 1-4 week iterations
  • 17. how do projects start?
  • 22. informal conversation to stated goals: mostly bad ideas, but a few good ones
  • 23. mostly bad ideas, but a few good ones; informal conversation to stated goals
  • 24. mostly bad ideas, but a few good ones; “Lorem Ipsum: a narrative about blankets. Author: Charlie Brown. Date: 31 Jan 2012. Lorem Ipsum is a dummy text used when typesetting or marking up documents. It has a long history starting from the 1500s and is still used in digital millennium for typesetting electronic documents, page designs, etc. In itself, the original text of Lorem Ipsum might have been taken from an ancient Latin book that was written about 50 BC. Nevertheless, Lorem Ipsum’s words have been changed so they don’t read as a proper text. Naturally, page designs that are made for text documents must contain some text rather than placeholder dots or something else. However, should they contain proper English words and sentences almost every reader will deliberately try to interpret it eventually, missing the design itself. However, a placeholder text must have a natural distribution of letters and punctuation or otherwise the markup will look strange and unnatural. That’s what Lorem Ipsum helps to achieve. I would like to thank Peppermint Patty for her support on studying Lorem Ipsum as well as the infinite wisdom of Linus van Pelt and his willingness to use his blanket in my experiments.”; informal conversation to stated goals
  • 27. concept sketch comparisons: qualitative a/b testing
  • 42. concept sketch comparisons: qualitative a/b testing; search engine with relevance metrics; demographics; human readable expertise summary
  • 43. from sketch to blueprint to prototype: add detail to get feedback (while building)
  • 47. data-driven consumer feedback (motorola)
  • 48. data-driven consumer feedback (motorola): new product announcement
  • 49. data-driven consumer feedback (motorola): new product announcement; first versions from manufacturer
  • 50. data-driven consumer feedback (motorola): new product announcement; first versions from manufacturer; available in stores
  • 51. data-driven consumer feedback (motorola): new product announcement; first versions from manufacturer; available in stores; next generation to manufacturer
  • 52. data-driven consumer feedback (motorola): new product announcement; first versions from manufacturer; available in stores; next generation to manufacturer; product defects from consumers
  • 57. data-driven e-discovery (daegis)
  • 60. data-driven e-discovery (daegis): about patent; not about patent
  • 61. data-driven e-discovery (daegis): about patent; not about patent; turn over to plaintiff; don’t turn over to plaintiff; adverse inference
  • 62. data-driven e-discovery (daegis): about patent; not about patent; turn over to plaintiff; don’t turn over to plaintiff; adverse inference; give away trade secrets
  • 64. data-driven e-discovery (daegis): turn over to plaintiff; don’t turn over to plaintiff
  • 67. data-driven e-discovery (daegis): algorithm design patents
  • 68. data-driven e-discovery (daegis): algorithm design patents fantasy football lunch coffee
  • 69. data-driven e-discovery (daegis): algorithm design patents marketing finances fantasy football lunch coffee
  • 70. data-driven e-discovery (daegis): create a “document map”; algorithm design patents marketing finances fantasy football lunch coffee
  • 71. data-driven e-discovery (daegis): create a “document map”; fantasy football algorithm design patents lunch marketing finances coffee
  • 78. data-driven e-discovery (daegis): create a “document map”; fantasy football algorithm design patents lunch marketing finances coffee; review away shades of grey
  • 79. data-driven e-discovery (daegis): create a “document map”; fantasy football algorithm design patents lunch marketing finances coffee; review away shades of grey; reduce reviews by 90-99%
  • 81. data-driven e-discovery (daegis): awesome!
  • 82. data-driven e-discovery (daegis): who cares? awesome!
  • 83. data-driven e-discovery (daegis): who cares? awesome! <lots of iteration/>
  • 85. quant skillz to data science? bit.ly/metis-ds; generate hypotheses; build prototype; evaluate feedback; 1-4 week iterations
  • 86. quant skillz to data science? bit.ly/metis-ds
  • 87. solve ambiguous problems with quantitative, iterative approach | http://bit.ly/design-data | http://bit.ly/metis-ds | @deanmalmgren | dean.malmgren@datascopeanalytics.com
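
The e-discovery slides mention reviewing away the "shades of grey" and reducing reviews by 90-99%. One plausible reading, offered here as an assumption rather than as the approach from the talk, is that a classifier's confident predictions are accepted automatically and only the uncertain middle is sent to human reviewers. The sketch below illustrates that idea; the thresholds, bucket names, and scores are made up for the example and are not figures from the project.

```python
def triage(scores, low=0.1, high=0.9):
    """Split documents into three buckets by predicted relevance.

    `scores` maps a document id to an estimated probability of relevance.
    The `low` and `high` cutoffs are illustrative assumptions.
    """
    buckets = {"produce": [], "review": [], "withhold": []}
    for doc_id, probability in scores.items():
        if probability >= high:
            buckets["produce"].append(doc_id)   # confidently relevant
        elif probability <= low:
            buckets["withhold"].append(doc_id)  # confidently not relevant
        else:
            buckets["review"].append(doc_id)    # shades of grey: ask a human
    return buckets


def review_fraction(buckets):
    """Fraction of all documents that still need human review."""
    total = sum(len(doc_ids) for doc_ids in buckets.values())
    return len(buckets["review"]) / total if total else 0.0


# Hypothetical classifier outputs for ten documents.
SCORES = {
    "doc01": 0.98, "doc02": 0.95, "doc03": 0.02, "doc04": 0.01, "doc05": 0.03,
    "doc06": 0.55, "doc07": 0.04, "doc08": 0.97, "doc09": 0.05, "doc10": 0.01,
}

buckets = triage(SCORES)
print(buckets["review"])
print(review_fraction(buckets))
```

Where the cutoffs sit is a judgment call: the slides name a cost for each kind of mistake (adverse inference, giving away trade secrets), so tighter thresholds trade more human review for fewer automatic errors.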