Mind the Gap - Data Science Meets Software Engineering

Talk given at the Vienna Semantic Web Meetup in 2016

  1. Data Science meets Software Engineering. Vienna Semantic Web Meetup, 2016-03-01. Bernhard Haslhofer
  2. Who am I?
     • Scientist at AIT’s Digital Insight Lab
     • Specialization
       • Network Analytics
       • Machine Learning
       • Text Mining
     • PhD in Computer Science
  3. Plan for tonight
     • Build an example service
     • Approach problem from
       • software engineering perspective
       • data science perspective
     • Look at gap & propose solution
  4. Example Service: Text Classification API (categories shown: sports, politics, art, business)
  5. Approaching the Problem: Software Engineering
  6. Steps
     • Identify use cases / features
     • Choose framework
     • Implement functionality
     • Ensure quality: test functionality, scalability, etc.
     • Deploy service
  7. Ensure quality

         public classify(Document document) { … }

         @Test(timeout=100)
         public test_classify(…) {
             d = new Document(…)
             c = classifier.classify(d)
             assertNotNull(c)
             assert(c in [sports, politics, …])
         }
  8. Result / Quality Expectation
     • A service
       • implementing defined use case(s)
       • passing all tests (unit, integration, functional)
       • fulfilling scalability needs
  9. Approaching the Problem: Data Science
  10. Steps
      • Define problem / hypothesis
      • Collect data
      • Design approach / model
      • Ensure quality: evaluate model, compare
      • Prototype algorithm (in R, Matlab, Octave, etc.)
  11. Ensure quality
      • Split dataset into training / test / cross-validation dataset
      • Train model using training dataset
      • Evaluate using test (and cross-validation) dataset
      • Report and investigate metrics
        • precision, recall, F1, …
  12. What ???

      |                   | Software Engineering                               | Data Science                                      |
      |-------------------|----------------------------------------------------|---------------------------------------------------|
      | Overall Goal      | Build the service                                  | Build the service                                 |
      | Technical Goal    | Implement software features, deploy working service | Find the right model features, get the model right |
      | Quality assurance | Unit, functional, integration tests                | Evaluate model, report metrics, redesign model    |
  13. What ???
      • The overall (business) goal can be the same
      • Different technical approach
        • language issues (what is a “feature”!?)
        • lack of understanding of each other's differences and necessities
      • Different quality assurance
        • notion of “testing” is different
        • different “success factors” (passing tests vs. metrics)
  14. Possible solution: Metrics Driven Software Engineering
      • Define Goal
      • Collect Ground Truth
      • Implement Model and Functions
      • Test & Evaluate
      • Analyze Errors
      • Deploy Service
  15. Tool support

          @Test(precision >= 0.8)
          @Test(timeout=100)
          public test_classify(…) {
              d = new Document(…)
              c = classifier.classify(d)
              assertNotNull(c)
              assert(c in [sports, politics, …])
          }
  16. Thank You! bernhard.haslhofer@ait.ac.at
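
The metrics named on slide 11 (precision, recall, F1) can be computed per category from two parallel lists of gold and predicted labels. The following is a minimal sketch in plain Java, not code from the talk: the class name `CategoryMetrics`, the method names, and the four sample labels are all hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not from the talk): per-category precision, recall and F1
// computed from parallel lists of gold and predicted labels.
final class CategoryMetrics {
    private CategoryMetrics() {}

    // Precision = correct predictions of `category` / all predictions of `category`.
    static double precision(List<String> gold, List<String> predicted, String category) {
        int truePositives = 0;
        int predictedPositives = 0;
        for (int i = 0; i < gold.size(); i++) {
            if (predicted.get(i).equals(category)) {
                predictedPositives++;
                if (gold.get(i).equals(category)) {
                    truePositives++;
                }
            }
        }
        return predictedPositives == 0 ? 0.0 : (double) truePositives / predictedPositives;
    }

    // Recall = correct predictions of `category` / all gold occurrences of `category`.
    static double recall(List<String> gold, List<String> predicted, String category) {
        int truePositives = 0;
        int actualPositives = 0;
        for (int i = 0; i < gold.size(); i++) {
            if (gold.get(i).equals(category)) {
                actualPositives++;
                if (predicted.get(i).equals(category)) {
                    truePositives++;
                }
            }
        }
        return actualPositives == 0 ? 0.0 : (double) truePositives / actualPositives;
    }

    // F1 = harmonic mean of precision and recall (0 when both are 0).
    static double f1(List<String> gold, List<String> predicted, String category) {
        double p = precision(gold, predicted, category);
        double r = recall(gold, predicted, category);
        return (p + r) == 0.0 ? 0.0 : 2 * p * r / (p + r);
    }

    public static void main(String[] args) {
        // Made-up labels for four test documents.
        List<String> gold = Arrays.asList("sports", "sports", "politics", "art");
        List<String> predicted = Arrays.asList("sports", "politics", "politics", "sports");
        for (String category : Arrays.asList("sports", "politics", "art")) {
            System.out.printf("%s: precision=%.2f recall=%.2f f1=%.2f%n",
                    category,
                    precision(gold, predicted, category),
                    recall(gold, predicted, category),
                    f1(gold, predicted, category));
        }
    }
}
```

Returning 0.0 when a category is never predicted (or never occurs) is one convention among several; an evaluation report may prefer to flag such categories as undefined instead.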

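Slide 15 sketches a test that passes only if precision reaches 0.8. The `@Test(precision >= 0.8)` annotation reads as proposed tooling rather than an existing JUnit option, so the sketch below approximates the idea in plain Java without any test framework. Everything in it is an assumption made for illustration: the class `PrecisionGate`, the keyword-based `toyClassify` stand-in for a real classifier, and the five made-up ground-truth documents.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch (not from the talk): a quality gate that fails, like a
// unit test would, when per-category precision on ground truth is too low.
final class PrecisionGate {
    private PrecisionGate() {}

    // Toy stand-in for a real classifier: simple keyword rules only.
    static String toyClassify(String text) {
        if (text.contains("goal")) return "sports";
        if (text.contains("election")) return "politics";
        return "business";
    }

    // Made-up ground truth: document text -> expected category.
    static Map<String, String> sampleGroundTruth() {
        Map<String, String> truth = new LinkedHashMap<>();
        truth.put("late goal wins the cup", "sports");
        truth.put("election results announced", "politics");
        truth.put("goal of the new tax policy", "politics");
        truth.put("quarterly earnings rise", "business");
        truth.put("striker scores again", "sports");
        return truth;
    }

    // Precision for one category: correct predictions / all predictions of it.
    static double precisionFor(Function<String, String> classifier,
                               Map<String, String> groundTruth,
                               String category) {
        int truePositives = 0;
        int predictedPositives = 0;
        for (Map.Entry<String, String> entry : groundTruth.entrySet()) {
            if (classifier.apply(entry.getKey()).equals(category)) {
                predictedPositives++;
                if (entry.getValue().equals(category)) {
                    truePositives++;
                }
            }
        }
        return predictedPositives == 0 ? 0.0 : (double) truePositives / predictedPositives;
    }

    // Throws AssertionError when precision is below the threshold.
    static void assertPrecisionAtLeast(double threshold,
                                       Function<String, String> classifier,
                                       Map<String, String> groundTruth,
                                       String category) {
        double precision = precisionFor(classifier, groundTruth, category);
        if (precision < threshold) {
            throw new AssertionError("precision for " + category + " is "
                    + precision + ", expected >= " + threshold);
        }
    }

    public static void main(String[] args) {
        Map<String, String> truth = sampleGroundTruth();
        // Passes: the only document predicted as politics really is politics.
        assertPrecisionAtLeast(0.8, PrecisionGate::toyClassify, truth, "politics");
        // Throws: one of the two documents predicted as sports is about tax policy.
        assertPrecisionAtLeast(0.8, PrecisionGate::toyClassify, truth, "sports");
    }
}
```

A gate like this turns a model metric into a pass/fail signal, which is the bridge between the two quality-assurance styles compared on slides 12 and 13. In practice the threshold and the ground-truth set would come from the "Define Goal" and "Collect Ground Truth" steps on slide 14.
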