Part of my guest lecture on Data Driven Business Models at Stockholm School of Entrepreneurship. I spoke about how Data is core to the Spotify business and it drives Spotify forward.
Presented at the Machine Learning class at Chalmers, Gothenburg.
http://www.cse.chalmers.se/research/lab/courses.php?coid=9
Trying to connect their theoretical machine learning class with industry examples.
Slides from Metadata for Artists session at Music Business Conference in Nashville, TN, May 13, 2015. Panelists included Bill Wilson (Music Biz), Bryan Calhoun (Blueprint Music Group), Alison Booth (Sony Nashville) and Kristin Thomson (Future of Music Coalition. #musicbiz2015
Part of my guest lecture on Data Driven Business Models at Stockholm School of Entrepreneurship. I spoke about how Data is core to the Spotify business and it drives Spotify forward.
Presented at the Machine Learning class at Chalmers, Gothenburg.
http://www.cse.chalmers.se/research/lab/courses.php?coid=9
Trying to connect their theoretical machine learning class with industry examples.
Slides from Metadata for Artists session at Music Business Conference in Nashville, TN, May 13, 2015. Panelists included Bill Wilson (Music Biz), Bryan Calhoun (Blueprint Music Group), Alison Booth (Sony Nashville) and Kristin Thomson (Future of Music Coalition. #musicbiz2015
Spotify Discover Weekly: The machine learning behind your music recommendationsSophia Ciocca
In this presentation, I give an overview of the machine learning algorithms behind Spotify’s extraordinarily popular Discover Weekly playlist. I provide a brief introduction to what the playlist is, explain how music recommendation engines have evolved over time, then break down the three main algorithm types powering Spotify’s recommendations: (1) collaborative filtering, (2) Natural Language Processing (NLP), and (3) Raw audio analysis.
Video of the presentation can be found here: https://www.youtube.com/watch?v=PUtYNjInopA
Metadata for musicians: discovery, attribution and paymentKristin Thomson
Part 2: Metadata for Musicians workshop at University of the Arts, Philadelphia, PA on Feb 3, 3016. Hosted by the Corzo Center for the Creative Economy. http://corzocenter.uarts.edu/
Machine Learning and Big Data for Music Discovery at SpotifyChing-Wei Chen
Spotify is the world’s largest on-demand music streaming company, with over 100 million active users who generate around 2TB of interaction data every day. With over 30 million songs to choose from, discovery and personalization play an essential role in helping users discover the best music for them. In this talk, given at the newly opened Galvanize space in NYC in March 2017, we’ll explain how Spotify uses Latent Space Models and Deep Learning to power features such as Discover Weekly and Release Radar.
I share ideas for what Spotify (my favorite app) could be doing better as far as new product features and partnerships along with demographic data (see appendix). This deck is for fun and relates to personal opinions as a user along with real data. Enjoy it!
Deezer - Big data as a streaming serviceJulie Knibbe
40 million songs, albums and artists available - how nice? Streaming allows you to get a grasp at the biggest music collections in the world. The only thing is that you would need centuries to listen to all of it.
Getting access doesn’t mean knowing what to do with it. How are we making music discovery more & more efficient at Deezer?
Workshop on the Last.fm API given at Music Hack Day Stockholm (Jan 30, 2010). WIth contributions from Matthew Ogle, David Singleton, and Jonty Wareing.
Apologies for some text flowing off the slides, Slideshare doesn't seem to like the typeface we use. :-/
Machine learning for creative AI applications in music (2018 nov)Yi-Hsuan Yang
An up-to-date overview of our recent research on music/audio and AI. It contains four parts:
* AI Listener: source separation (ICMLA'18a) and sound event detection (IJCAI'18)
* AI DJ: music thumbnailing (TISMIR'18) and music sequencing (AAAI'18a)
* AI Composer: melody generation (ISMIR'17), lead sheet generation (ICMLA'18b), multitrack pianoroll generation (AAAI'18b), and instrumentation generation (arxiv)
* AI Performer: CNN-based score-to-audio generation (AAAI'19)
L’APGEF interviendra au VIIe Forum Economique Européen organisé par la ville ...APGEF
L’APGEF est invitée à présenter sa construction, son fonctionnement et ses réalisations dans le cadre du VIIeme Forum Economique Européen organisé par la ville de Lódz.
Spotify Discover Weekly: The machine learning behind your music recommendationsSophia Ciocca
In this presentation, I give an overview of the machine learning algorithms behind Spotify’s extraordinarily popular Discover Weekly playlist. I provide a brief introduction to what the playlist is, explain how music recommendation engines have evolved over time, then break down the three main algorithm types powering Spotify’s recommendations: (1) collaborative filtering, (2) Natural Language Processing (NLP), and (3) Raw audio analysis.
Video of the presentation can be found here: https://www.youtube.com/watch?v=PUtYNjInopA
Metadata for musicians: discovery, attribution and paymentKristin Thomson
Part 2: Metadata for Musicians workshop at University of the Arts, Philadelphia, PA on Feb 3, 3016. Hosted by the Corzo Center for the Creative Economy. http://corzocenter.uarts.edu/
Machine Learning and Big Data for Music Discovery at SpotifyChing-Wei Chen
Spotify is the world’s largest on-demand music streaming company, with over 100 million active users who generate around 2TB of interaction data every day. With over 30 million songs to choose from, discovery and personalization play an essential role in helping users discover the best music for them. In this talk, given at the newly opened Galvanize space in NYC in March 2017, we’ll explain how Spotify uses Latent Space Models and Deep Learning to power features such as Discover Weekly and Release Radar.
I share ideas for what Spotify (my favorite app) could be doing better as far as new product features and partnerships along with demographic data (see appendix). This deck is for fun and relates to personal opinions as a user along with real data. Enjoy it!
Deezer - Big data as a streaming serviceJulie Knibbe
40 million songs, albums and artists available - how nice? Streaming allows you to get a grasp at the biggest music collections in the world. The only thing is that you would need centuries to listen to all of it.
Getting access doesn’t mean knowing what to do with it. How are we making music discovery more & more efficient at Deezer?
Workshop on the Last.fm API given at Music Hack Day Stockholm (Jan 30, 2010). WIth contributions from Matthew Ogle, David Singleton, and Jonty Wareing.
Apologies for some text flowing off the slides, Slideshare doesn't seem to like the typeface we use. :-/
Machine learning for creative AI applications in music (2018 nov)Yi-Hsuan Yang
An up-to-date overview of our recent research on music/audio and AI. It contains four parts:
* AI Listener: source separation (ICMLA'18a) and sound event detection (IJCAI'18)
* AI DJ: music thumbnailing (TISMIR'18) and music sequencing (AAAI'18a)
* AI Composer: melody generation (ISMIR'17), lead sheet generation (ICMLA'18b), multitrack pianoroll generation (AAAI'18b), and instrumentation generation (arxiv)
* AI Performer: CNN-based score-to-audio generation (AAAI'19)
L’APGEF interviendra au VIIe Forum Economique Européen organisé par la ville ...APGEF
L’APGEF est invitée à présenter sa construction, son fonctionnement et ses réalisations dans le cadre du VIIeme Forum Economique Européen organisé par la ville de Lódz.
Early thoughts on open source music industry activity tracking...
There's way too much unnecessary chaff and noise in the day to day efforts of album and artist promotion. We are developing an open-source, web-based tracking system to solve that and other issues.
Program 01 For this program you will complete the program in.pdfaddtechglobalmarketi
Program 01
For this program you will complete the program in the a05.c file. Find the places in the code
marked with the ToDo label and complete the work. You should not change any other part of the
code. This includes do not add any global variables or any other function. The program declares a
struct musicRepository with elements for songs ID, songs name, singers name, genre of the song,
and songs year. You will be completing a program that creates a list of songs (array of structs). It
is a menu-driven program where the user is given the following options:
Add a song to the repository - always inserted in the next available space in the array. The next
available array might not be next to the last song added, it might be somewhere in the middle.
Check Delete function below for details. If there is no space to add a new song, the function
should return -1, and 1 otherwise.
Search song by name - given the name of the song, return the struct with the data of the song.
Print a song - given a struct print its content. Please note that the Genre is an enum type;
however, the print function should print a string, not a number for this field.
Edit a song - given the name of the song, find it (using the Search function above), and modify
any of the fields of the song except ID. ID cannot be edited, and all fields should have a valid
value.
Delete a song - given the name of the song, find it (using the Search function above), and modify
the fields of the song (using the Edit function above), making ID, Year, and Genre 0, and name
and author an empty string. Return -1 if the song was not found, 1 if it was deleted.
Print the whole list of songs. Print the list of songs including all the fields of the song except for
ID.
// CSE240 SPRING 2023 Assignment 05
// Enter your name here
// You are given a partially completed program that creates a list of songs, like a
music repository.
// Each song has this information: song's id, song's name, singer's name, genre of
the song, and its published year.
// The struct 'musicRepository' holds information of one song. Genre is enum type.
// An array of structs called 'list' is made to hold the list of songs.
// To begin, you should trace through the given code and understand how it works.
// Please read the instructions above each resquired function and follow the
directions carefully.
// You should not modify any of the given code, the return types, or the
parameters, you will risk of getting compilation error.
// You are not allowed to modify main ().
// You can use string library functions.
// WRITE COMMENTS FOR IMPORTANT STEPS IN YOUR CODE.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#pragma warning(disable : 4996) // for Visual Studio Only
#define MAX_SONGS 20
#define MAX_SONG_NAME_LENGTH 40
#define MAX_SINGER_NAME_LENGTH 40
typedef enum
{
unclassified = 0,
Pop,
Rock,
Reggae,
Country,
Blues,
Balad,
} genreType; // enum type
struct musicRepository
{
// struct for song details
unsigned int ID;
c.
Learn how to use Deezer and Spotify platform to highlight your music content. How to create Deezer and Spotify Apps?
What are the 5 points to take into account for a good label application?
With Spotify being one of the most popular music promotion services to date, it does make sense for aspiring musicians to get featured.
However, there’s a good number of musicians who struggle each day and end up getting rejected by the curators.
Fret not! We’ve got you covered. In this post, we will let you know about a handful of full-proof ways to get on a Spotify playlist.
Visit - https://playlistpumppr.agency/
Movingfans is digital music platform 2.0 which provide direct transaction to musician, classify musician by network based reputation system and focus to build D.I.Y. ecosystem for musicians
Movingfans is digital music platform 2.0 which provide direct transaction to musician, classify musician by network based reputation system and focus to build D.I.Y. ecosystem for musicians
How to Discover New Music in 2023 5 Best AppsDarylMitchell9
Spotify is the newest and greatest way to find entirely new and offbeat music. This music streaming service is so efficient at doing so because of a special recommendation engine, Bart, that uses an AI system to recommend music based on a listener’s taste.https://audiospeaks.com/how-to-discover-new-music/
Maintaining a community based music website with open source softwareAndrew Goodwin
The Newcastle Music Directory is a community based music website that covers the music scene in and around Newcastle, NSW. This covers a region roughly from the
Hawkesbury River in the South to Taree in the North, and West to Merriwa in the upper Hunter Valley. The site publishes over 1500 local events per month on its gig guide and lists details of over 4000 local bands and artists and over 650 venues from the region. The site has constantly had to be upgraded and updated to reflect the changing internet technology over its 10 year history.
Applying geospatial analytics and social media to understand audience behaviour. This presentations show how anonymous geospatial tracking of smartphone data can help understand audience behaviour. It also shows examples of advanced text analytics using IBM Watson to understand the psychographic profile of artists. The same techniques can be applied for customers or citizens.
1. WOULD THEY
PLAY?
B Y C H R I S T I N E O S A Z U W A
Using Python and music APIs to predict the likelihood of an
artist or band playing a music festival based on artist &
song attributes.
2. The Problem
Every year music festivals are announced with
dozens to hundreds of artists. Does your
favorite band fit in? Could the company’s
curating their list automate the process?
Given music data from the top music tech
companies in the world, can you predict the
likelihood of an artist playing a popular
festival?
3. LIST OF ALL BANDS HISTORICALLY THAT PLAYED EACH FESTIVAL
FOR 5-10 YEARS
HISTORICAL DATA ON THE BANDS THAT PLAYED AT THE TIME
THEY PLAYED INCLUDING:
SALES VOLUME, POPULARITY OF ARTIST, POPULARITY OF TOP
TRACKS, LISTENERSHIP, RECORD LABEL, SOCIAL MEDIA METRICS,
RADIO PLAYS, AVERAGE CONCERT TICKET PRICE, AWARDS WON,
APPEARANCE ON BILLBOARD CHARTS, ATTRIBUTES OF TOP SONGS,
GOOGLE TRENDS
EVERGREEN DATA ABOUT THE BANDS:
NUMBER OF BAND MEMBERS, PLACE OF ORIGIN, SIMILARITIES TO
OTHER ARTISTS, START YEAR, BREAK UP YEAR, GENRE, IF ARTIST HAVE
PLAYED THE FESTIVAL BEFORE
4. Acquiring & Cleaning the
Data
While all the data exists for the artists, the
challenge was finding and connecting data for
various artists together by scraping the web &
using public APIs.
5. Original:
Get 5-10 years of historical data for 10-15 large world-wide
festivals.
Revised:
As I started the process, I realized it would be challenging
to get all the festival data I wanted, so for the sake of time I
choose to limit to one festival to start:
Vans Warped Tour—a summer long tour music festival
usually played by pop punk, punk, hardcore & emo bands.
6. A special issue I ran into with Warped Tour data is that the artist change daily for the festival. One area of consistency I found is that Warped Tour releases a
compilation album each year with a song from most of the artists that played a significant amount of dates for the festival, so I choose to use that compilation as my
baseline.
Used “ItemSearch” function to find the
compilation albums, then iterated
through results to get track listings
from each year. Missing 2013-2015
Library/API: bottlenose
Used “search_album” function to find
compilation albums that were missing.
Then iterated though results to get track
listings from each year. Missing 2014.
Library/API: python-itunes
Wikia is a crowdsourced site with
sections dedicated to specific topics. I
used BS4 to pull the data for the artists
for the missing date of 2014.
No Library/API, used BeautifulSoup
7. Used spotipy’s “search” function to
locate each artist in the Warped Tour
compilation album list, then appended
their Spotify ID. From there, I was able
to use spotipy’s “artist” feature to
append artist popularity, artist genre.
Finally, I used spotipy’s
“artist_top_tracks” to find the artists
top 10 tracks by popularity on Spotify
and append an average of those
song’s attributes.
Music Streaming Service
Library/API: spotipy
Used “search” function to locate each
artist. Appended the
Gracenote/MusicBrainz ID for each
artist and also returned a list of band
members. From there, I did a count of
number of members and appended to
my data list.
Library/API: musicbrainzngs
Used the ID I got from MusicBrainz to
use pylast’s function
“get_artist_by_mbid”. From there, I
appended the artist play count on
LastFM and their overall listener count.
API: pylast
Used pygn’s “search” function to
search for the artist by name and then
append the decade that the artist
started in.
Library/API: pygn
This involved a lot of moving parts but thankfully, these API’s play well together.
Crowd Sourced Music
Database
Music Plays Tracker Music Database
8. These sites/APIs had data I wanted but it was too challenging to get.
Amazon: Lack of consistent data among the artists.
Google Trends: No official API, limited to 10 queries a minute. Too slow for
mass searching.
Pollstar: API only for artists. Used BeautifulSoup to parse the site for concert
ticket cost data, but very limited amount, resulted in too many missing values.
Billboard: Attempted to find the frequency in which an artist was on a
Billboard chart but API lacked the relevant charts.
Gracenote: Attempted to get artist origin data but rate limits too low for mass
searching on multiple queries, so I prioritized “era”.
Ultimate Music Database: Incredibly robust site, however; no API and very
dated, so even parsing with BeautifulSoup proved challenging.
9. Instead of manually cleaning the data, I opted to add in
exceptions to errors. Instead of having missing values to clean, I
structure my functions with “try”/“except”, so that in the case of
an error, the function continues, and would pass through a ‘0’.
I didn’t really have time…
10. band: str, name of band/artist
spotify_id: str, alpha numeric string assign to artist
(provided by Spotify)
musicbrainz_id: str, alpha numeric string assign to
artist (provided by MusicBrainz)
member_count: int, count of number of people
associated with or in the given group/artists
(provided by MusicBrainz)
spotify_popularity: int, popularity of artist based on
Spotify’s algorithm range 0-100 (provided by
Spotify)
spotify_genre: str, the genre assigned the artist
(provided by Spotify)
duration_ms: int, length of the top 10 songs (as
defined by Spotify) of the given artist averaged
(provided by Spotify/EchoNest)
acousticness, danceability, energy,
instrumentalness, key, liveness, loudness, mode,
speechiness, tempo, time_signature, valence:
int, attributes of the top 10 songs (as defined by
Spotify) of the given artist averaged (provided by
Spotify/EchoNest)
lastfm_play_count: int, the number of times an
artists song has been counted on LastFM (provided
by LastFM)
lastfm_listener_count: int, the number of people
that have listened to an artist on LastFM (provided
by LastFM)
Data Explained
Resulting
Dataset
After gathering data from
various websites, I was able
to acquire and successfully
attribute a small portion of
the initial data I sought out
for this project.
11. Thanks to the Songkick API (and python-songkick
library), I was able to pull data from Lollapalooza
(an indie music festival that takes place in Chicago
every year). This supplemented the Warped Tour
dataset & allow for a split, train test.
14. The Models
Since acquiring the right data took so long, I
did some minimal modeling using Linear
Regression, Logistic Regression, Decision Trees
& K-Nearest Neighbors
15. Used pandas throughout to import and export CSV
files in order to store data & manipulate.
pandas
z
From sklearn, imported linear_model, feature_selection,
import cross_validation, metrics, LinearRegression,
cross_val_score, export_graphviz, KNeighborsClassifier,
LogisticRegression, DecisionTreeClassifier in order to
run Logistic Regression, Linear Regression, Decision
Trees & KNN. From statsmodel I used to run a Linear
Regression model.
sklearn + statsmodel
m
Imported numpy in order to do basic math: mean,
square root, etc.
numpy
Imported seaborn for data visualizing the correlation
heatmap.
seaborn
Imported pygn, spotipy, billboard, pylast, songkick,
discogs_client, pytrends, musicbrainzngs, pytrends
Music APIs
5
Imported requests, sys, os, itertools, re, bs4, json, time,
string, collections
Others
Did the basics here to make sure everything worked!
16. Did the basics here to make sure everything worked!
accuracy confusion matrix
train/test scores
17. Linear
Regression
Results
For the sake of simplicity, I
opted to use this Linear
Regression model in my
function though the overall R2
is not very strong.
18. I needed to automate getting the data for each artist, for
each festival, so I created a function to run that process.
Given a list of bands, the function can iterate through that
list, append all the data necessary and output a pandas
data frame.
Getting New Festival Data
I created the ability to input an artist and then have a
function run that:
1. Use the above function that collects all the data for
the artist needed to run the model and places it in a
pandas data frame.
2. Runs the model
3. Outputs a plain text answer to the artist’s likelihood of
playing that festival
Predicting Band Likelihood to Play
21. Gathering data from the internet is time & memory intensive. In addition, restrictions & rate limits,
as well as continuously learning new libraries were strong barriers. Using “sleep” is important!
APIs & Websites
In order to have a large enough data set, I looked at bands that have played the festival across
several years, however; I don’t have the data to say how that band was doing within the year
they played. This automatically skews my data, thus finding more artist attributed data would
help the modeling process.
Lack of Historical Data
These commands were instrumental to making my functions work properly without having to have
completely clean data.
Understanding yield, return, except & try
When you have just a few lines of code it’s easy to remember what you’re doing but when you
approach hundreds, it’s important to make note of what you’re doing and use variables that
make sense. Also, important to know when to use a function & when to just use a variable.
Importance of Labels & Conventions
22. Next Steps
During this process I requested access to
several APIs such as Next Big Sound that
have yet to be granted, which would
provide me with additional artist data. In
addition, I came across the
Discography.com data late in my process &
hope to incorporate some of that data as
well.
Getting More
Data
Now that I have the functionality created
for one festival, I hope to replicate amongst
several festivals to eventually allow a user
to enter an artist name and it will query
data from multiple festivals and provide the
likelihood of them playing each one.
Adding More
Festivals
Within the scope of this project I focused
primarily on data acquisition and creating a
proof of concept, so given more time, I
intend to refine and improve upon my
modeling.
Improving The
Model
Ultimately, I’d like for this to be web based
so users can simply input an artists &
search, then it will display the results. In
order for this to happen and deliver the
results in a timely fashion, I may need to
start exploring cache & memory.
Creating a User
Interface