
So, What Does a Data Scientist do?

What a Data Scientist does in the music industry, and my thoughts on what a data scientist is. Presented at the March 2012 Data Science London meetup

Published in: Technology, Education

Slide notes
  • http://jasyed.com/datascience/
  • http://meetup.com/big-data-london/
  • Long infographic is long: http://www.musicmetric.com/musicmetric-south-by-south-west-infographic/
  • As of this writing there is no "Data Scientist" entry in Wikipedia, although there is one for http://en.wikipedia.org/wiki/Big_data
  • Microarray image from http://en.wikipedia.org/wiki/DNA_microarray
  • https://twitter.com/#!/DEVOPS_BORAT/status/174602033872109569
  • Sewing a quilt probably doesn’t involve knitting

    1. So, What Does a Data Scientist do? A Data Scientist in the Music Industry. Dr Jameel Syed, March 2012. http://jasyed.com/datascience/
    2. Overview
       – Musicmetric CTO
       – InforSense founding member
         • PhD in Workflows for Life Sciences Analysis
       – Co-organiser Big Data London meetup
    3. Some questions...
    4. Music has moved online
       • The world has changed
         – Do you buy vinyl/tapes/CDs of music?
         – Do you buy music downloads?
         – Do you download illegal content from bittorrent?
         – Do you listen to music on YouTube?
         – Do you “like” bands on Facebook?
         – Do you subscribe to Spotify?
         – Do you listen on the radio to the weekly charts on a Sunday afternoon?
       • What’s happening online?
    5. How popular am I?
    6. Who are my fans?
    7. Where are my fans?
    8. What is the press saying?
    9. Who is popular?
    10. A Data Scientist in the Music Industry
       • Raw Data -> Derived Data -> Insight
         – Who is popular right now/in the immediate future?
         – What was the effect of appearing at a festival?
         – Which artists are (becoming) popular with listeners with certain demographics (in a region)?
       • Data processing, machine learning & statistical methods
         – Sentiment analysis
         – Named Entity Recognition
         – Ranking
         – Segmentation
       • One-offs
         – Infographics and microsites for events
         – Brand alignment via demographics
         – Music Hack Days
       • Product
         – Daily charts
         – Sentiment scoring web crawled reviews
    11. What is a Data Scientist?
    12. Have we been here before?
       • Statistician
       • Data Analyst
       • Quantitative analyst
       • Bioinformatician
       • Data Miner
       • Business Intelligence consultant
       • Computational physicist
    13. A Life Sciences digression...
    14. What’s new?
       • Data provides the opportunity
         – Old: Collect and store data presupposing how it will be used
         – New: Collect raw data & explore which derivations are interesting; integrating data from multiple online sources.
         – Big Data technology to cope with data volume
       • Programming is essential
         – APIs
         – Heterogeneous environment(s)
       • Method of presentation
         – Infographics
         – Interactive (web) applications
         – (Raw data)
    15. Data Scientist
       • “Jack of all trades”
         – “Hacker” mentality: learn new technology and approaches for a project on short notice
         – Creative self-starters
         – Work alongside other experts (data, domain, software engineering)
    16. A Data Scientist is good at knitting?
       • Not building from scratch, knitting together pre-existing parts
       • Data
         – Databases (relational/NoSQL)
         – Files
         – APIs
       • Algorithms
         – Open source libraries
         – Off the shelf tools
       • Compute
         – Linux
         – AWS?
       • Languages
         – Many, especially “scripting” languages
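
The "Raw Data -> Derived Data -> Insight" flow on slide 10 can be sketched in a scripting language along the following lines. This is only an illustrative sketch, not Musicmetric's actual pipeline: the event tuples, artist names, and the functions `derive_daily_counts`, `daily_chart`, and `biggest_riser` are all hypothetical, and a real system would draw on many online sources rather than a hard-coded list.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical raw data: one (day, artist, source) tuple per observed event.
RAW_EVENTS = [
    (date(2012, 3, 1), "Artist A", "youtube_play"),
    (date(2012, 3, 1), "Artist A", "facebook_like"),
    (date(2012, 3, 1), "Artist A", "download"),
    (date(2012, 3, 1), "Artist B", "youtube_play"),
    (date(2012, 3, 2), "Artist A", "youtube_play"),
    (date(2012, 3, 2), "Artist A", "download"),
    (date(2012, 3, 2), "Artist B", "youtube_play"),
    (date(2012, 3, 2), "Artist B", "facebook_like"),
    (date(2012, 3, 2), "Artist B", "download"),
    (date(2012, 3, 2), "Artist C", "youtube_play"),
]


def derive_daily_counts(events):
    """Raw data -> derived data: count events per artist for each day."""
    counts = defaultdict(Counter)
    for day, artist, _source in events:
        counts[day][artist] += 1
    return counts


def daily_chart(counts, day, top_n=3):
    """Derived data -> insight: rank artists by activity on one day."""
    # Sort by descending count, breaking ties alphabetically by artist name.
    ranked = sorted(counts[day].items(), key=lambda kv: (-kv[1], kv[0]))
    return [artist for artist, _count in ranked[:top_n]]


def biggest_riser(counts, previous_day, day):
    """Derived data -> insight: artist whose activity grew most between two days."""
    artists = sorted(set(counts[previous_day]) | set(counts[day]))
    return max(artists, key=lambda a: counts[day][a] - counts[previous_day][a])


if __name__ == "__main__":
    counts = derive_daily_counts(RAW_EVENTS)
    print(daily_chart(counts, date(2012, 3, 2)))
    print(biggest_riser(counts, date(2012, 3, 1), date(2012, 3, 2)))
```

Keeping the raw events and deriving counts on demand mirrors the "New" approach on slide 14: the same raw data could later be re-derived by source or by region without having been collected with that use in mind.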
