A survey of "2013 Data Science Salary Survey"

I'm Japanese. I wrote a survey of O'Reilly's "2013 Data Science Salary Survey" and of the differences between the US and Japan.

Published in: Engineering, Technology, Business

  1. A survey of "2013 Data Science Salary Survey". 2014/04/26, Tokyo Webmining / @showyou. 1/32
  2. Abstract: 2013 Data Science Salary Survey http://strata.oreilly.com/2014/01/2013-data-science-salary-survey.html + my comments. Most figures come from this survey.
  3. Agenda: About me; Survey; Comments / my opinions
  4. About me: Data mining engineer. Hadoop (Pig, Hive, Cloudera Hue); BI tools, JIRA, Confluence, git; Python, machine learning, NLP (Lang), (R, JS, Highcharts, C++, Java). Status: looking for a new job. Previous: Hikarie. Not a job consultant or recruiter. http://about.me/showyou41
  5. Summary of this paper: OSS (Python, R) > traditional tools (SAS, Excel). Traditional tools are used in relative isolation. The wider the variety of tools, the higher the salary. Big data = higher salary.
  6. Respondents: Attendees of two Strata conferences (New York 2012 and Santa Clara 2013). Members and age range in the US -> most respondents are in their 30s or 40s.
  7. The jobs of respondents (1): Top 10 industries -> startups: 1/5. Median salary: startup > public > private > gov.
  8. The jobs of respondents (2): Most respondents (56%) describe themselves as data scientists/analysts.
  9. Tool usage
  10. Tool usage: SQL/RDB is top; R, Python > Excel
  11. Tool correlations: Orange: group “Hadoop”; Blue: group “SQL/Excel”; Red: neither
  12. Tools (Hadoop)
  13. Tools (SQL/Excel): not correlated
  14. Median salary vs tools
  15. Salary vs Hadoop or SQL/Excel
  16. Salary & tools
  17. Comments / my opinions
  18. Questions about the categorization: Orange vs Blue seems correct, but Red is doubtful, e.g. JavaScript vs D3.js, VBA vs C#, Python vs Ruby, Pentaho vs Tableau, ...
  19. What is a data scientist? What are the differences between a data scientist and an analyst? The definitions of data scientist in the U.S. and JP are different. U.S.: O’Reilly http://radar.oreilly.com/2010/06/what-is-data-science.html JP: Nikkei http://itpro.nikkeibp.co.jp/article/Keyword/20130614/485142/ The Japanese definition often drops the engineering side.
  20. Indeed search (US), San Francisco Bay Area
      Keyword          Low        High        Mean
      Hadoop           $60,000+   $140,000+   $81,300
      Hive             $60,000+   $140,000+   $80,400
      SAS              $50,000+   $130,000+   $72,100
      Data scientist   $50,000+   $130,000+   $72,000
      Excel            $30,000+   $110,000+   $51,200
      Strata survey: over 50% are tech leads or executives.
  21. Indeed search (Tokyo, JP), in million yen
      Keyword          Low     High     Mean
      Hadoop           5.00+   13.00+   6.27
      Hive             5.00+   13.00+   6.68
      SAS              4.00+   12.00+   6.13
      Data scientist   4.00+   12.00+   5.81
      Excel            3.00+   11.00+   4.40
  22. Salary US vs JP ($1 = 102.5 yen)
      Keyword          US ($)   JP (m yen)   JP ($)   US/JP
      Hadoop           81,300   6.27         61,200   1.33
      Hive             80,400   6.68         65,200   1.23
      SAS              72,100   6.13         59,800   1.21
      Data scientist   72,000   5.81         56,700   1.27
      Excel            51,200   4.40         42,900   1.19
  23. Costs US vs JP. U.S.: housing (California, Bay Area, 1 bedroom, Sep 2013) $2,192 to $2,800; food $8 to $12+, plus 15% tip. JP: housing (Tokyo, 1 room under 30 m^2, Apr 2014) 20k to 150k yen; food 500 to 1,500 yen. US = JP * 1.2 or 1.5
  24. References: http://strata.oreilly.com/2014/01/2013-data-science-salary-survey.html http://radar.oreilly.com/2010/06/what-is-data-science.html http://www.datascientist.or.jp/ http://priceonomics.com/the-rise-of-bay-area-rent-prices/ http://www.indeed.com/ http://jp.indeed.com/
  25. Appendix
  26. Tool usage at Tokyo Webmining #35: Everyone uses Excel (but I don’t know whether for data mining or not). JavaScript / SAS / SPSS usage is higher and Hadoop is lower. A few Hadoop developers joined Tokyo Webmining (they often join Hadoop Code Reading).
  27. Hive http://hive.apache.org/ SQL-like language for Hadoop. Converts HiveQL to MapReduce when you execute a Hive query. http://www.cloudera.com/content/cloudera/en/products-and-services/cloudera-live.html
  28. R language: R is a free software environment for statistical computing and graphics. http://www.r-project.org/ e.g. $ R, then > demo(graphics)
  29. Tableau http://www.tableausoftware.com/ BI tool (commercial)
  30. Pentaho http://www.pentaho.com/ http://www.pentaho-partner.jp/ BI tool (free/commercial)
  31. SAS http://www.sas.com/en_us/software/sas9.html Analytics tool, cf. SPSS http://www.sas.com/offices/NA/canada/en/resources/screenshot/sas-marketing-optimization-2-full.jpg
  32. D3(.js) http://d3js.org/ http://ja.d3js.node.ws/ Rendering library for JavaScript. backbone.js
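
The US vs JP comparison on slide 22 is simple arithmetic, and a minimal Python sketch can reproduce it. The salary means and the exchange rate come from slides 20 to 22. The names `YEN_PER_USD`, `MEAN_SALARIES`, `compare` and `ROWS` are hypothetical. Rounding the converted salary to the nearest $100 is an assumption inferred from the slide's figures, not something the deck states.

```python
# Exchange rate stated on slide 22 (1 USD = 102.5 JPY).
YEN_PER_USD = 102.5

# Mean salaries from the Indeed slides:
# keyword -> (US mean in USD, Tokyo mean in million yen).
MEAN_SALARIES = {
    "Hadoop": (81_300, 6.27),
    "Hive": (80_400, 6.68),
    "SAS": (72_100, 6.13),
    "Data scientist": (72_000, 5.81),
    "Excel": (51_200, 4.40),
}


def compare(us_usd, jp_million_yen, yen_per_usd=YEN_PER_USD):
    """Return (JP salary in USD, US/JP ratio).

    Assumption: the JP figure is rounded to the nearest 100 USD and the
    ratio to two decimals, which matches the numbers shown on the slide.
    """
    jp_usd = round(jp_million_yen * 1_000_000 / yen_per_usd, -2)
    return jp_usd, round(us_usd / jp_usd, 2)


# One (JP $, US/JP) pair per keyword.
ROWS = {key: compare(us, jp) for key, (us, jp) in MEAN_SALARIES.items()}

if __name__ == "__main__":
    for keyword, (jp_usd, ratio) in ROWS.items():
        print(f"{keyword:15s} {jp_usd:>8,.0f} {ratio:.2f}")
```

Under these assumptions the ratios fall between 1.19 and 1.33, the same range as the salary table on slide 22.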