So, What Does a Data Scientist do?
    A Data Scientist in the Music Industry

              Dr Jameel Syed
                      March 2012
            http://jasyed.com/datascience/
Overview
– Musicmetric CTO
– InforSense founding member
  • PhD in Workflows for Life Sciences Analysis
– Co-organiser Big Data London meetup
Some questions...
Music has moved online
• The world has changed
  –   Do you buy vinyl/tapes/CDs of music?
  –   Do you buy music downloads?
  –   Do you download illegal content from BitTorrent?
  –   Do you listen to music on YouTube?
  –   Do you “like” bands on Facebook?
  –   Do you subscribe to Spotify?
  –   Do you listen on the radio to the weekly charts on a
      Sunday afternoon?
• What’s happening online?
How popular am I?
Who are my fans?
Where are my fans?
What is the press saying?
Who is popular?
A Data Scientist in the Music Industry
•   Raw Data -> Derived Data -> Insight
     – Who is popular right now/in the immediate future?
     – What was the effect of appearing at a festival?
     – Which artists are (becoming) popular with listeners
       with certain demographics (in a region)?
•   Data processing, machine learning & statistical
    methods
     –   Sentiment analysis
     –   Named Entity Recognition
     –   Ranking
     –   Segmentation
•   One-offs
     – Infographics and microsites for events
     – Brand alignment via demographics
     – Music Hack Days
•   Product
     – Daily charts
     – Sentiment scoring web crawled reviews
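
The "Raw Data -> Derived Data -> Insight" flow and the "Daily charts" product above can be sketched in a few lines. This is only an illustration, not Musicmetric's actual method: the event tuples, the artist names, and the functions `daily_counts` and `daily_chart` are all hypothetical, and ranking by raw event count is an assumed stand-in for whatever ranking the real charts use.

```python
from collections import Counter
from datetime import date

# Hypothetical raw data: (day, artist, event type) tuples, e.g. gathered
# from online sources. The values are made up for illustration.
RAW_EVENTS = [
    (date(2012, 3, 1), "Artist A", "play"),
    (date(2012, 3, 1), "Artist B", "play"),
    (date(2012, 3, 1), "Artist A", "like"),
    (date(2012, 3, 2), "Artist B", "play"),
    (date(2012, 3, 2), "Artist B", "like"),
    (date(2012, 3, 2), "Artist C", "play"),
]


def daily_counts(events):
    """Derived data: number of events per artist for each day."""
    counts = {}
    for day, artist, _kind in events:
        counts.setdefault(day, Counter())[artist] += 1
    return counts


def daily_chart(events, day, top_n=10):
    """Insight: artists ranked by activity on one day (ties broken by name)."""
    counts = daily_counts(events).get(day, Counter())
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [artist for artist, _count in ranked[:top_n]]


if __name__ == "__main__":
    for day in sorted(daily_counts(RAW_EVENTS)):
        print(day.isoformat(), daily_chart(RAW_EVENTS, day))
```

Weighting event types differently (a "like" versus a "play") or comparing one day against the previous one would be natural next steps toward the "who is becoming popular" question.
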
What is a Data Scientist?
Have we been here before?
•   Statistician
•   Data Analyst
•   Quantitative analyst
•   Bioinformatician
•   Data Miner
•   Business Intelligence consultant
•   Computational physicist
A Life Sciences digression...
What’s new?
• Data provides the opportunity
   – Old: Collect and store data presupposing how it will be used
   – New: Collect raw data & explore which derivations are
     interesting; integrating data from multiple online sources.
   – Big Data technology to cope with data volume
• Programming is essential
   – APIs
   – Heterogeneous environment(s)
• Method of presentation
   – Infographics
   – Interactive (web) applications
   – (Raw data)
Data Scientist
• “Jack of all trades”
  – “Hacker” mentality: learn new technology and
    approaches for a project on short notice
  – Creative self-starters
  – Work alongside other experts
    (data, domain, software engineering)
A Data Scientist is good at knitting?
• Not building from scratch, knitting together pre-existing parts

• Data
    – Databases (relational/NoSQL)
    – Files
    – APIs
• Algorithms
    – Open source libraries
    – Off the shelf tools
• Compute
    – Linux
    – AWS?
• Languages
    – Many, especially “scripting” languages
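
As a rough sketch of "knitting together pre-existing parts", the snippet below combines the three kinds of data source listed above (a database, a file, and an API) using only the Python standard library. Everything in it is assumed for illustration: the table layout, the CSV columns, the JSON shape standing in for an API response, and the function names are hypothetical, and no real service is called.

```python
import csv
import io
import json
import sqlite3

# Hypothetical stand-ins for a file on disk and an API response body.
CSV_TEXT = "artist,region\nArtist A,UK\nArtist B,US\n"
API_RESPONSE = '{"Artist A": {"fans": 5000}, "Artist B": {"fans": 1200}}'


def load_from_database():
    """Database source: play counts from an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE plays (artist TEXT, plays INTEGER)")
    conn.executemany(
        "INSERT INTO plays VALUES (?, ?)",
        [("Artist A", 120), ("Artist B", 80)],
    )
    rows = dict(conn.execute("SELECT artist, plays FROM plays"))
    conn.close()
    return rows


def load_from_file(text=CSV_TEXT):
    """File source: artist regions parsed from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["artist"]: row["region"] for row in reader}


def load_from_api(body=API_RESPONSE):
    """API source: fan counts decoded from a JSON response body."""
    return {artist: info["fans"] for artist, info in json.loads(body).items()}


def knit():
    """Join the three sources on artist name into one record per artist."""
    plays = load_from_database()
    regions = load_from_file()
    fans = load_from_api()
    return {
        artist: {
            "plays": plays[artist],
            "region": regions.get(artist),
            "fans": fans.get(artist),
        }
        for artist in plays
    }


if __name__ == "__main__":
    for artist, record in sorted(knit().items()):
        print(artist, record)
```

None of the parts is built from scratch; the only new code is the join that stitches the existing pieces together.
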

Editor's Notes

  1. http://jasyed.com/datascience/
  2. http://meetup.com/big-data-london/
  3. Long infographic is long: http://www.musicmetric.com/musicmetric-south-by-south-west-infographic/
  4. As of this writing there is no "Data Scientist" entry in Wikipedia, although there is one for Big data: http://en.wikipedia.org/wiki/Big_data
  5. Microarray image from http://en.wikipedia.org/wiki/DNA_microarray
  6. https://twitter.com/#!/DEVOPS_BORAT/status/174602033872109569
  7. Sewing a quilt probably doesn’t involve knitting