Life of a data scientist (pub)

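The Reality of Data Scientist
Most of the overall analysis process is spent collecting and processing data.
And when it comes to applying data to an application, testing matters most.
Because this deck was prepared for an audience of ergonomics (human factors) majors, it focuses on 'testing (online testing)' rather than 'data collection and cleansing'.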

Life of a data scientist (pub)

  1. Life of Data Scientist: myths and reality. Jeong, Buhwan Ph.D, Data Hacker / Kakao Corp.
  2. Data scientists are big data wranglers. They take an enormous mass of messy data points and use their formidable skills in math, statistics and programming to clean, massage and organize them. Then they apply all their analytic powers and domain knowledge to uncover hidden solutions to business challenges. Script (modified) from http://www.mastersindatascience.org/careers/data-scientist/
  3. A data scientist is someone who is better at statistics than any software engineer and better at software engineering than any statistician. + domain knowledge, business understanding. http://www.mastersindatascience.org/careers/data-scientist/
  4. Diagram from https://www.quora.com/What-is-a-data-scientist-3
  5. Image from http://paper4pc.com/superman-logosuperman.html
  6. http://www.sintetia.com/wp-content/uploads/2014/05/Data-Scientist-What-I-really-do.png
  7. DB Log SQL Data TXT / EXL Visualization Implement Test & Deploy [KR] Algorithm - Regression - Classification - Clustering [Insight]
  8. Big Data?
  9. Volume Variety Velocity Value
  10. Value BIG & FAST SMART Count & Trend Predictive Technical Meaningful Analytics Volume Variety Velocity Engineering Science
  11. Data Science Scientific Method Proved by Theory Verified with Experiment Algorithm (Equations) Testing (Evidence)
  12. Experiment & Test
  13. Hypothesis Experiment Graduation
  14. Observation Deployment Test (Comparison)
  15. Observation Off-line Test Deployment On-line Test
  16. Test Deploy Modeling Test set Observe A (Treatment) B (Control) ← Offline : Online → Solve
  17. Netflix's Weekly Test & Deployment: M T W T F S S. Code Release, Off-line Test, On-line Test, Deployment, Monitoring & Improvement
  18. On-line A/B Test. Image from https://vwo.com/ab-testing/
  19. Image from https://vwo.com/ab-testing/
  20. From Yahoo! (Creative Best Practices: Native Ads)
  21. From Yahoo! (Creative Best Practices: Native Ads)
  22. A/B Test Configuration. Traffic-driven: for every incoming request, if random() < 0.1, assign it to the treatment group (10%); otherwise, assign it to the control group (90%). User-driven: for every requester whose userId ends with 'NN', if 'NN' is in '00' ~ '09', assign the requester to the treatment group; otherwise, assign the requester to the control group.
  23. A/B Test: Random, Control Group, Treatment (A). Multivariate Test: Random, Control Group, Treatment A, Treatment B, Treatment C
  24. Multivariate test: https://www.optimizely.com/resources/multivariate-testing/
  25. Red Daum vs Blue Daum
  26. Data over Algorithm
  27. Forbes.com: http://goo.gl/bauDHw
  28. DB Log SQL Data Implement Test & Deploy [KR] Algorithm - Regression - Classification - Clustering [Insight] 20 60 15 5
  29. Forbes.com: http://goo.gl/bauDHw
  30. Hacking Data for business goals - Right data - Right algorithm - Right evaluation
  31. Good UI/UX is defined by User Adoption
  32. Human Hacker. Image from https://goo.gl/vClux5
  33. Enjoy your Jeju
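
The two A/B test configurations on slide 22 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the speaker's implementation: the function names `assign_by_traffic` and `assign_by_user`, the constant names, and the example user IDs are hypothetical. The 10% threshold and the '00' ~ '09' suffix rule come from the slide.

```python
import random

# From slide 22: 10% of traffic goes to the treatment group.
TREATMENT_SHARE = 0.1

# From slide 22: userIds ending in '00' ~ '09' go to the treatment group.
TREATMENT_SUFFIXES = {f"{i:02d}" for i in range(10)}


def assign_by_traffic(rng=random):
    """Traffic-driven: draw a fresh random number for every request.

    The same user may land in different groups on different requests.
    """
    return "treatment" if rng.random() < TREATMENT_SHARE else "control"


def assign_by_user(user_id):
    """User-driven: decide from the last two characters of the userId.

    The same user always lands in the same group.
    """
    return "treatment" if user_id[-2:] in TREATMENT_SUFFIXES else "control"


if __name__ == "__main__":
    # Hypothetical user IDs, for illustration only.
    for uid in ["user1203", "user1210", "user9999"]:
        print(uid, assign_by_user(uid))

    # Over many requests the treatment share should be close to 10%.
    rng = random.Random(0)
    n = 10_000
    hits = sum(assign_by_traffic(rng) == "treatment" for _ in range(n))
    print("treatment share:", hits / n)
```

The practical difference is consistency: traffic-driven assignment re-rolls on every request, while user-driven assignment keeps each user in one group for the whole test. The user-driven rule also yields a 10% split only if the last two digits of the IDs are roughly uniformly distributed.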
