So, What Does a Data Scientist do?
    A Data Scientist in the Music Industry

              Dr Jameel Syed
                      March 2012
            http://jasyed.com/datascience/
Overview
– Musicmetric CTO
– InforSense founding member
  • PhD in Workflows for Life Sciences Analysis
– Co-organiser Big Data London meetup
Some questions...
Music has moved online
• The world has changed
  –   Do you buy vinyl/tapes/CDs of music?
  –   Do you buy music downloads?
  –   Do you download illegal content from BitTorrent?
  –   Do you listen to music on YouTube?
  –   Do you “like” bands on Facebook?
  –   Do you subscribe to Spotify?
  –   Do you listen on the radio to the weekly charts on a
      Sunday afternoon?
• What’s happening online?
How popular am I?
Who are my fans?
Where are my fans?
What is the press saying?
Who is popular?
A Data Scientist in the Music Industry
•   Raw Data -> Derived Data -> Insight
     – Who is popular right now/in the immediate future?
     – What was the effect of appearing at a festival?
     – Which artists are (becoming) popular with listeners
       with certain demographics (in a region)?
•   Data processing, machine learning & statistical
    methods
     –   Sentiment analysis
     –   Named Entity Recognition
     –   Ranking
     –   Segmentation
•   One-offs
     – Infographics and microsites for events
     – Brand alignment via demographics
     – Music Hack Days
•   Product
     – Daily charts
     – Sentiment scoring of web-crawled reviews
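The "Raw Data -> Derived Data -> Insight" flow and the "Daily charts" product above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the event tuples, the source labels, and the function names daily_counts and daily_chart are hypothetical, and counting events is only a stand-in for whatever ranking method is actually used.

```python
from collections import Counter
from datetime import date

# Hypothetical raw events: (day, artist, source). Illustrative values only.
RAW_EVENTS = [
    (date(2012, 3, 1), "Artist A", "youtube_play"),
    (date(2012, 3, 1), "Artist A", "facebook_like"),
    (date(2012, 3, 1), "Artist B", "youtube_play"),
    (date(2012, 3, 1), "Artist C", "torrent_download"),
    (date(2012, 3, 1), "Artist B", "youtube_play"),
    (date(2012, 3, 1), "Artist B", "facebook_like"),
    (date(2012, 3, 2), "Artist A", "youtube_play"),
]


def daily_counts(events):
    """Derived data: number of events per artist, grouped by day."""
    counts = {}
    for day, artist, _source in events:
        counts.setdefault(day, Counter())[artist] += 1
    return counts


def daily_chart(events, day, top_n=3):
    """Insight: artists ranked by activity on a single day."""
    counts = daily_counts(events).get(day, Counter())
    # Sort by count (descending), then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


if __name__ == "__main__":
    for position, (artist, n) in enumerate(
        daily_chart(RAW_EVENTS, date(2012, 3, 1)), start=1
    ):
        print(position, artist, n)
```

A real chart would probably weight sources differently (a purchase is not a "like"), but the shape stays the same: raw events in, a derived table in the middle, a ranked answer out.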
What is a Data Scientist?
Have we been here before?
•   Statistician
•   Data Analyst
•   Quantitative analyst
•   Bioinformatician
•   Data Miner
•   Business Intelligence consultant
•   Computational physicist
A Life Sciences digression...
What’s new?
• Data provides the opportunity
   – Old: Collect and store data presupposing how it will be used
   – New: Collect raw data & explore which derivations are
     interesting; integrating data from multiple online sources.
   – Big Data technology to cope with data volume
• Programming is essential
   – APIs
   – Heterogeneous environment(s)
• Method of presentation
   – Infographics
   – Interactive (web) applications
   – (Raw data)
Data Scientist
• “Jack of all trades”
  – “Hacker” mentality: learn new technology and
    approaches for a project on short notice
  – Creative self-starters
  – Work alongside other experts
    (data, domain, software engineering)
A Data Scientist is good at knitting?
• Not building from scratch, knitting together pre-existing parts

• Data
    – Databases (relational/NoSQL)
    – Files
    – APIs
• Algorithms
    – Open source libraries
    – Off the shelf tools
• Compute
    – Linux
    – AWS?
• Languages
    – Many, especially “scripting” languages
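As a rough sketch of the "knitting" idea, the snippet below joins three pre-existing kinds of source (a relational database, a file, and an API response) using only the Python standard library. All of the data, the table name artist_region, and the function names are hypothetical; the CSV text and JSON string stand in for a real file and a real API call.

```python
import csv
import io
import json
import sqlite3

# Stand-ins for a file on disk and an API response (hypothetical values).
CSV_FILE = "artist,plays\nArtist A,120\nArtist B,300\n"
API_RESPONSE = '{"Artist A": {"fans": 40}, "Artist B": {"fans": 10}}'


def load_db_regions():
    """Database: read artist -> region from an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE artist_region (artist TEXT, region TEXT)")
    conn.executemany(
        "INSERT INTO artist_region VALUES (?, ?)",
        [("Artist A", "London"), ("Artist B", "Berlin")],
    )
    rows = conn.execute("SELECT artist, region FROM artist_region").fetchall()
    conn.close()
    return dict(rows)


def knit():
    """Join the three sources on artist name into one record per artist."""
    regions = load_db_regions()
    reader = csv.DictReader(io.StringIO(CSV_FILE))
    plays = {row["artist"]: int(row["plays"]) for row in reader}
    fans = {name: info["fans"] for name, info in json.loads(API_RESPONSE).items()}
    return {
        artist: {
            "region": regions.get(artist),   # None if the database lacks it
            "plays": plays[artist],
            "fans": fans.get(artist, 0),     # 0 if the API has no entry
        }
        for artist in plays
    }


if __name__ == "__main__":
    for artist, record in sorted(knit().items()):
        print(artist, record)
```

None of the parts is built from scratch: the database engine, the CSV parser and the JSON parser already exist, and the only new code is the join that ties them together.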

Editor's Notes

  1. http://jasyed.com/datascience/
  2. http://meetup.com/big-data-london/
  3. Long infographic is long: http://www.musicmetric.com/musicmetric-south-by-south-west-infographic/
  4. As of this writing there is no "Data Scientist" entry in Wikipedia, although there is one for http://en.wikipedia.org/wiki/Big_data
  5. Microarray image from http://en.wikipedia.org/wiki/DNA_microarray
  6. https://twitter.com/#!/DEVOPS_BORAT/status/174602033872109569
  7. Sewing a quilt probably doesn’t involve knitting