Making Music with Machine Learning
Tyler Doll
Jan 29 · 9 min read
Image from https://www.maxpixel.net/Circle-Structure-Music-Points-Clef-Pattern-Heart-1790837
Music is not just an art; it is an expression of the human condition.
When an artist is making a song you can often hear the emotions,
experiences, and energy they have in that moment. Music connects people
all over the world and is shared across cultures. So there is no way a
computer could possibly compete with this, right? That’s the question my
group and I asked when we chose our semester project for our Machine
Learning class. Our goal was to create something that would make the
listener believe that what they were listening to was created by a human.
Personally, I think we succeeded, but I will let you be the judge (see the
results towards the bottom of this post).
Approach
In order to create music, we needed some way to learn the patterns and
behaviors of existing songs so that we could reproduce something that
sounded like actual music. All of us had been interested in deep learning, so
we saw this as a perfect opportunity to explore this technology. To begin we
researched existing solutions to this problem and came across a great
tutorial from Sigurður Skúli on how to generate music using Keras. After
reading their tutorial, we had a pretty good idea of what we wanted to do.
The file format is important because it determines how we approach
the problem. The tutorial used MIDI files, so we followed suit because they
are easy to parse and learn from (you can learn more about them here).
Using MIDI files gave us a couple of advantages: we could easily detect
both the pitch and the duration of a note.
But before we dove in and began building our network, we needed some
more information on how music is structured and the patterns to consider.
For this we went to a good friend of mine, Mitch Burdick. He helped us
determine a few things about our approach and gave us a crash course on
simple music theory.
After our conversation we realized that the time step and sequence length
would be two important factors for our network. The time step determined
when we analyzed and produced each note, while the sequence length
determined how we learned patterns in a song. For our solution we chose a
time step of 0.25 seconds and 8 notes per time step. This corresponded to a
time signature of 4/4, which for us meant 8 different sequences of 4 notes.
By learning these sequences and repeating them, we could generate a
pattern that sounded like actual music and build from there. As a starting
point we used the code mentioned in Skúli’s tutorial; however, in the end
our implementation differed from the original in several ways:
Network architecture
Restricted to single key
Use of variable length notes and rests
Use of the structure/patterns of a song
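To make the role of the sequence length concrete, here is a minimal sketch of the sliding-window idea commonly used for this kind of training: each window of sequence_len tokens is paired with the token that follows it. The function name make_sequences, the value SEQUENCE_LEN = 4, and the tiny token list are assumptions for illustration only, not code or settings taken from the project.

```python
def make_sequences(tokens, sequence_len):
    """Pair every window of `sequence_len` tokens with the token that follows it."""
    inputs, targets = [], []
    for i in range(len(tokens) - sequence_len):
        inputs.append(tokens[i : i + sequence_len])
        targets.append(tokens[i + sequence_len])
    return inputs, targets


# Illustrative values only (hypothetical, chosen to keep the example small).
SEQUENCE_LEN = 4
tokens = ["C", "D", "E", "F", "G", "A"]

inputs, targets = make_sequences(tokens, SEQUENCE_LEN)
for window, target in zip(inputs, targets):
    print(window, "->", target)
```

A longer sequence length gives the network more context for each prediction, at the cost of fewer training windows per song.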
Network Architecture
For our architecture we decided to lean heavily on Bidirectional Long Short-
Term Memory (BLSTM) layers. Below is the Keras code we used:
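from keras.layers import LSTM, Activation, Bidirectional, Dense, Dropout
from keras.models import Sequential

# Two stacked bidirectional LSTM layers followed by a softmax over the
# vocabulary of n_vocab unique pitch/duration tokens.
model = Sequential()
model.add(
    Bidirectional(
        LSTM(512, return_sequences=True),
        input_shape=(network_input.shape[1], network_input.shape[2]),
    )
)
model.add(Dropout(0.3))
model.add(Bidirectional(LSTM(512)))
model.add(Dense(n_vocab))
model.add(Activation("softmax"))
model.compile(loss="categorical_crossentropy", optimizer="rmsprop")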
Our thinking was that by using the notes before and after a
particular spot in a song, we could generate melodies that sounded as if
a human had written them. Often when listening to music, what came before
helps the listener predict what is next. There have been many times when
I’ve been listening to a song and I can bob along to a particular beat
because I can predict what will come next. This is exactly what happens
when building up to a drop in a song. The song gets more and more intense,
which builds tension in the listener in anticipation of the drop and
creates that moment of relief and excitement when it finally hits. By
taking advantage of this we were able to produce beats that would sound
natural and bring forth the same emotions that we have become accustomed
to expecting in modern music.
For the number of nodes in our BLSTM layers we chose 512, as that was
what Skúli used. We did experiment with this a little, but due to
time constraints we ended up sticking with the original number. The same
goes for the dropout rate of 30% (read more about dropout rates here). For
the activation function we chose softmax and for our loss function we chose
categorical cross-entropy, as they work well for multi-class classification
problems such as note prediction (you can read more about both of them
here). Lastly, we chose RMSprop for our optimizer, as this was recommended
by Keras for RNNs.
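For readers who have not met softmax and categorical cross-entropy before, the following is a small plain-Python sketch of what they compute for a single prediction. It is a simplified illustration rather than what Keras runs internally; the three-token vocabulary and the logits values are made up for the example.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probs, true_index):
    """Loss for one sample: negative log-probability of the correct class."""
    return -math.log(probs[true_index])


# Hypothetical raw scores for a vocabulary of three note tokens.
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)

loss_if_first_is_correct = categorical_cross_entropy(probs, 0)
loss_if_last_is_correct = categorical_cross_entropy(probs, 2)
print(probs, loss_if_first_is_correct, loss_if_last_is_correct)
```

The loss is small when the network puts high probability on the note that actually came next and large when it does not, which is what pushes training toward better next-note predictions.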
Key Restriction
An important assumption we made was that we would only use songs in
the same key: C major/A minor. The reason is that by keeping every song
we produced in the same key, our output would sound more song-like, as the
network would never learn notes that would cause a song to go off key. To
do this we used a script we found here from Nick Kelly. This part was really
simple but gave us a huge improvement in our results.
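The arithmetic behind transposing a song to C major/A minor can be sketched in a few lines. This is not Nick Kelly's script; it is a simplified illustration that assumes the song's key (tonic and mode) is already known and that notes are given as MIDI pitch numbers. The names NOTE_TO_PITCH_CLASS, semitones_to_target, and transpose are hypothetical.

```python
# Pitch classes for the twelve semitones, using sharps only for simplicity.
NOTE_TO_PITCH_CLASS = {
    "C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11,
}


def semitones_to_target(tonic, mode):
    """Shift that moves a major key to C major or a minor key to A minor."""
    target = 0 if mode == "major" else 9  # C for major keys, A for minor keys
    shift = (target - NOTE_TO_PITCH_CLASS[tonic]) % 12
    # Prefer the smaller move, so the result lies between -5 and +6 semitones.
    return shift - 12 if shift > 6 else shift


def transpose(midi_pitches, shift):
    """Move every MIDI pitch by the same number of semitones."""
    return [pitch + shift for pitch in midi_pitches]


# A G major triad (G4, B4, D5) becomes a C major triad (C5, E5, G5).
shift = semitones_to_target("G", "major")
print(shift, transpose([67, 71, 74], shift))
```

Because every note moves by the same interval, the melody and harmony keep their shape; only the key changes.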
Variable Length Notes and Rests
An important part of music is the dynamic and creative use of variable
length notes and rests. That one long note struck by the guitarist followed
by a peaceful pause can send a wave of emotion to the listener as we hear
the heart and soul of the player spilled out into the world. To capture this
we looked into ways of introducing long notes, short notes, and rests so that
we could create different emotions throughout the song.
In order to implement this we looked at the pitch and duration of a note and
treated each combination as a distinct value we could input into our
network. This meant that a C# played for 0.5 seconds and a C# played for
1 second would be treated as different values by the network. This allowed
us to learn which pitches were played longer or shorter than others and
enabled us to combine notes to produce something that sounded natural and
fitting for that part of the song.
Of course rests cannot be forgotten as they are crucial for guiding the
listener to a place of anticipation or excitement. A slow note and a pause
followed by a burst of quick firing notes can create a different emotion than
several long notes with long pauses between. We felt this was important in
order to replicate the experience the listener has when listening to a
relaxing Sunday afternoon song or a Friday night party anthem.
To achieve these goals we had to focus on our preprocessing. Again here we
started with the code from Skúli’s tutorial and adapted it to fit our needs.
from music21 import chord, note

for element in notes_to_parse:
    if (isinstance(element, note.Note) or
            isinstance(element, chord.Chord)):
        duration = element.duration.quarterLength
        if isinstance(element, note.Note):
            name = element.pitch
        elif isinstance(element, chord.Chord):
            # Chords are encoded as their normal-order pitch classes
            name = ".".join(str(n) for n in element.normalOrder)
        notes.append(f"{name}${duration}")

        # Fill the gap since the previous note with rest tokens
        rest_notes = int((element.offset - prev_offset) / TIMESTEP - 1)
        for _ in range(0, rest_notes):
            notes.append("NULL")
        prev_offset = element.offset
To elaborate on the code above, we create notes by combining their pitch
and duration with a “$” to feed into our network. For example “A$1.0”,
“A$0.75”, “B$0.25”, etc. would all be encoded separately for use by our
network (inputs are encoded by mapping each unique note/duration to an
integer then dividing all of the integers by the number of unique
combinations thus encoding each one as a floating point number between 0
and 1). The more interesting part is calculating how many rests to insert.
We look at the offset of the current note and compare it to the offset of the
last note we looked at. We take this gap and divide it by our time step to
calculate how many rest notes we can fit (minus 1 because really this
calculates how many notes fit in the gap, but one of them is our actual next
note so we don’t want to double count it). An example would be if one note
started at 0.5s and the next didn’t start till 1.0s. With a time step of 0.25
(each note is played in 0.25s intervals), this would mean we need one rest
note to fill the gap.
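The two calculations described above, counting rests and encoding tokens as numbers between 0 and 1, can be tried in isolation with the sketch below. The helper names count_rests and encode_tokens are hypothetical, the vocabulary is sorted only to make the mapping deterministic, and the sketch ignores the special handling of the "NULL" token, so treat it as an illustration of the idea rather than the project's exact preprocessing.

```python
TIMESTEP = 0.25  # time step from the post


def count_rests(prev_offset, offset, timestep=TIMESTEP):
    """Number of rest tokens that fit between two note onsets."""
    # Minus 1 because one slot in the gap belongs to the next note itself.
    return max(0, int((offset - prev_offset) / timestep - 1))


def encode_tokens(tokens):
    """Map each unique pitch$duration token to an integer, then scale to [0, 1)."""
    vocab = sorted(set(tokens))
    note_to_int = {token: number for number, token in enumerate(vocab)}
    n_vocab = len(vocab)
    encoded = [note_to_int[token] / n_vocab for token in tokens]
    return encoded, note_to_int


# The example from the text: notes starting at 0.5 and 1.0 need one rest.
print(count_rests(0.5, 1.0))

tokens = ["A$1.0", "A$0.75", "B$0.25", "A$1.0"]
encoded, note_to_int = encode_tokens(tokens)
print(note_to_int)
print(encoded)
```

Note that the same pitch with a different duration gets its own integer, which is why the vocabulary grows quickly with the variety of note lengths in the training songs.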
Song Structure
Lastly, one of the most important parts of writing a song is the structure,
and this is one of the things we found lacking in existing solutions. From
what I have seen, most researchers are hoping for their network to learn
this on its own, and I don’t think that is a misguided approach. However,
I think this introduces complexity to the problem and leads to further
difficulty. This could be a source of improvement upon our solution,
though, as we take a more manual approach and assume a constant pattern.
One of the key assumptions we made is that we would only produce songs
that follow the specific pattern ABCBDB where:
A is the first verse
B is the chorus
C is the second verse
and D is the bridge
Initially we tried ABABCB but this felt too formulaic. To resolve this we
decided to introduce a second verse that was different than the first but still
related. We generated the first verse from a random note and then
generated the second verse based on the first. Effectively this is generating a
single section that is twice as long and splitting it in half. The thought
process here was that if we create one verse the second should still fit the
same vibe, and by using the first as a reference we could achieve this.
import numpy


def generate_notes(self, model, network_input, pitchnames, n_vocab):
    """ Generate notes from the neural network based on a sequence
        of notes """
    int_to_note = dict(
        (
            number + 1,
            note
        ) for number, note in enumerate(pitchnames)
    )
    int_to_note[0] = "NULL"

    def get_start():
        # pick a random sequence from the input as a starting point for
        # the prediction
        start = numpy.random.randint(0, len(network_input) - 1)
        pattern = network_input[start]
        prediction_output = []
        return pattern, prediction_output

    # generate verse 1
    verse1_pattern, verse1_prediction_output = get_start()
    for note_index in range(4 * SEQUENCE_LEN):
        prediction_input = numpy.reshape(
            verse1_pattern, (1, len(verse1_pattern), 1)
        )
        prediction_input = prediction_input / float(n_vocab)

        prediction = model.predict(prediction_input, verbose=0)

        index = numpy.argmax(prediction)
        result = int_to_note[index]
        verse1_prediction_output.append(result)

        verse1_pattern.append(index)
        verse1_pattern = verse1_pattern[1 : len(verse1_pattern)]

    # generate verse 2 (continues from where verse 1 left off)
    verse2_pattern = verse1_pattern
    verse2_prediction_output = []
    for note_index in range(4 * SEQUENCE_LEN):
        prediction_input = numpy.reshape(
            verse2_pattern, (1, len(verse2_pattern), 1)
        )
        prediction_input = prediction_input / float(n_vocab)

        prediction = model.predict(prediction_input, verbose=0)

        index = numpy.argmax(prediction)
        result = int_to_note[index]
        verse2_prediction_output.append(result)

        verse2_pattern.append(index)
        verse2_pattern = verse2_pattern[1 : len(verse2_pattern)]

    # generate chorus
    chorus_pattern, chorus_prediction_output = get_start()
    for note_index in range(4 * SEQUENCE_LEN):
        prediction_input = numpy.reshape(
            chorus_pattern, (1, len(chorus_pattern), 1)
        )
        prediction_input = prediction_input / float(n_vocab)

        prediction = model.predict(prediction_input, verbose=0)

        index = numpy.argmax(prediction)
        result = int_to_note[index]
        chorus_prediction_output.append(result)

        chorus_pattern.append(index)
        chorus_pattern = chorus_pattern[1 : len(chorus_pattern)]

    # generate bridge
    bridge_pattern, bridge_prediction_output = get_start()
    for note_index in range(4 * SEQUENCE_LEN):
        prediction_input = numpy.reshape(
            bridge_pattern, (1, len(bridge_pattern), 1)
        )
        prediction_input = prediction_input / float(n_vocab)

        prediction = model.predict(prediction_input, verbose=0)

        index = numpy.argmax(prediction)
        result = int_to_note[index]
        bridge_prediction_output.append(result)

        bridge_pattern.append(index)
        bridge_pattern = bridge_pattern[1 : len(bridge_pattern)]

    # assemble the song in the order A B C B D B
    return (
        verse1_prediction_output
        + chorus_prediction_output
        + verse2_prediction_output
        + chorus_prediction_output
        + bridge_prediction_output
        + chorus_prediction_output
    )
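The four generation loops above share the same shape, so the ABCBDB assembly can be illustrated more compactly with a single helper. The sketch below is a hypothetical refactoring, not the project's code: generate_section, assemble_song, and fake_predict are made-up names, and fake_predict is a toy stand-in for the model.predict plus argmax step so that the example runs without a trained network.

```python
def generate_section(predict_next, pattern, n_notes):
    """Generate n_notes indices, sliding the input window forward each step.

    Returns the generated indices and the final window, so that a later
    section (such as verse 2) can continue from where this one stopped.
    """
    pattern = list(pattern)  # copy so the caller's seed is not modified
    output = []
    for _ in range(n_notes):
        index = predict_next(pattern)
        output.append(index)
        pattern = pattern[1:] + [index]
    return output, pattern


def assemble_song(verse1, chorus, verse2, bridge):
    """Arrange the sections in the fixed ABCBDB pattern."""
    return verse1 + chorus + verse2 + chorus + bridge + chorus


def fake_predict(pattern):
    """Toy stand-in for the network: next index is last index plus one, mod 5."""
    return (pattern[-1] + 1) % 5


verse1, carry = generate_section(fake_predict, [0, 1, 2], 4)
verse2, _ = generate_section(fake_predict, carry, 4)  # seeded by verse 1
chorus, _ = generate_section(fake_predict, [2, 2, 2], 4)
bridge, _ = generate_section(fake_predict, [4, 4, 4], 4)

song = assemble_song(verse1, chorus, verse2, bridge)
print(song)
```

Seeding verse 2 with the window left over from verse 1 is what makes the two verses behave like one longer section split in half, while the chorus and bridge start from independent seeds.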
Results
We were able to achieve surprising results from this approach. We could
consistently generate unique songs that fell into the proper genre that we
trained the respective networks on. Below are some example outputs from
our various networks.
Ragtime
Christmas
Rap
Conclusion
Music generation by machines is indeed possible. Is it better, or could it
be better, than music generated by humans? Only time will tell. From these
results, though, I would say that it’s definitely possible.
Future Work
Several improvements could be made that would bring this even closer to
true music. Some possible ideas/experiments include:
Learn patterns in songs rather than manually piecing together parts
Take note duration as a separate input to the network rather than
treating each pitch/duration separately
Expand to multiple instruments
Move away from MIDI files and produce/learn from actual MP3s
Learn the time step, sequence length, and time signature
Introduce randomness to emulate “human error/experimentation”
Allow for multiple keys
Learn how to use intros and outros
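As one possible way to approach the randomness idea in the list above, the next note could be sampled from the network's output distribution instead of always taking the most likely one. The sketch below shows temperature sampling in plain Python; it was not part of the project, and the function name sample_with_temperature and the example probabilities are assumptions for illustration.

```python
import math
import random


def sample_with_temperature(probs, temperature=1.0, rng=random):
    """Sample an index from probs; lower temperature means safer choices."""
    # Rescale log-probabilities; the small constant avoids log(0).
    logits = [math.log(p + 1e-12) / temperature for p in probs]
    largest = max(logits)
    weights = [math.exp(value - largest) for value in logits]
    return rng.choices(range(len(probs)), weights=weights, k=1)[0]


# Hypothetical softmax output over a three-token vocabulary.
probs = [0.1, 0.7, 0.2]
rng = random.Random(0)  # seeded so repeated runs give the same draws

adventurous = [sample_with_temperature(probs, 1.5, rng) for _ in range(10)]
cautious = [sample_with_temperature(probs, 0.01, rng) for _ in range(10)]
print(adventurous)
print(cautious)
```

With a temperature close to zero the sampler behaves almost exactly like argmax, while higher temperatures let less likely notes through more often.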
Acknowledgments
I would like to thank my teammates Izaak Sulka and Jeff Greene for their
help on this project as well as my friend Mitch Burdick for his expertise on
music that enabled us to get these great results. And of course we would like
to thank Sigurður Skúli for their tutorial as it gave us a great starting point
and something to reference. Last but not least I would like to thank Nick
Kelly for his script to transpose songs to C major.
The code for this project can be found here:
https://github.com/tylerdoll/music-generator
Disclaimer: the music used in our project does not belong to us and was
sourced from various public websites.

 
NPPE STUDY GUIDE - NOV2021_study_104040.pdf
NPPE STUDY GUIDE - NOV2021_study_104040.pdfNPPE STUDY GUIDE - NOV2021_study_104040.pdf
NPPE STUDY GUIDE - NOV2021_study_104040.pdf
 

Making music with machine learning towards data science

Making Music with Machine Learning
Tyler Doll · Jan 29 · 9 min read

Image from https://www.maxpixel.net/Circle-Structure-Music-Points-Clef-Pattern-Heart-1790837

Music is not just an art; it is an expression of the human condition. When an artist is making a song, you can often hear the emotions, experiences, and energy they have in that moment. Music connects people all over the world and is shared across cultures. So there is no way a computer could possibly compete with this, right? That’s the question my group and I asked when we chose our semester project for our Machine Learning class. Our goal was to create something that would make the listener believe that what they were listening to was created by a human. Personally, I think we succeeded, but I will let you be the judge (see the results towards the bottom of this post).

Approach

In order to create music, we needed some way to learn the patterns and behaviors of existing songs so that we could reproduce something that sounded like actual music. All of us had been interested in deep learning, so we saw this as a perfect opportunity to explore this technology. To begin, we researched existing solutions to this problem and came across a great tutorial from Sigurður Skúli on how to generate music using Keras. After reading their tutorial, we had a pretty good idea of what we wanted to do.

File format is important, as it decides how we would approach the problem. The tutorial used midi files, so we followed suit because they were easy to parse and learn from (you can learn more about them here). Using midi files gave us a couple of advantages, because we could easily detect the pitch of a note as well as its duration. But before we dove in and began building our network, we needed some more information on how music is structured and the patterns to consider. For this we went to a good friend of mine, Mitch Burdick.
He helped us determine a few things about our approach and gave us a crash course on simple music theory. After our conversation we realized that the time step and sequence length would be two important factors for our network. The time step determined when we analyzed and produced each note, while the sequence length determined how we learned patterns in a song. For our solution we chose a time step of 0.25 seconds and 8 notes per time step. This corresponded to a time signature of 4/4, which for us meant 8 different sequences of 4 notes. By learning these sequences and repeating them, we could generate a pattern that sounded like actual music and build from there.

As a starting point we used the code mentioned in Skúli’s tutorial; however, in the end our implementation differed from the original in several ways:

  • Network architecture
  • Restricted to single key
  • Use of variable length notes and rests
  • Use of the structure/patterns of a song

Network Architecture

For our architecture we decided to lean heavily on Bidirectional Long Short-Term Memory (BLSTM) layers. Below is the Keras code we used:

    from keras.layers import LSTM, Activation, Bidirectional, Dense, Dropout
    from keras.models import Sequential

    # network_input and n_vocab come from the preprocessing step
    model = Sequential()
    model.add(
        Bidirectional(
            LSTM(512, return_sequences=True),
            input_shape=(network_input.shape[1], network_input.shape[2]),
        )
    )
    model.add(Dropout(0.3))
    model.add(Bidirectional(LSTM(512)))
    model.add(Dense(n_vocab))
    model.add(Activation("softmax"))
    model.compile(loss="categorical_crossentropy", optimizer="rmsprop")

Our thinking was that by using the notes before and after a particular spot in a song, we could generate melodies that sounded like they were written by a human. Often when listening to music, what came before helps the listener predict what is next. There have been many times when I’ve been listening to a song and could bob along to a particular beat because I could predict what would come next. This is exactly what happens when building up to a drop in a song.
The song gets more and more intense, which builds tension in the listener in anticipation of the drop and creates that moment of relief and excitement when it finally hits. By taking advantage of this we were able to produce beats that would sound natural and bring forth the same emotions that we have become accustomed to expecting in modern music.

For the number of nodes in our BLSTM layers we chose 512, as that was what Skúli used. We did experiment with this a little, but due to time constraints we ended up sticking with the original number. The same goes for the dropout rate of 30% (read more about dropout rates here). For the activation function we chose softmax and for our loss function we chose categorical cross-entropy, as they work well for multi-class classification problems such as note prediction (you can read more about both of them here). Lastly, we chose RMSprop for our optimizer as this was recommended by Keras for RNNs.

Key Restriction

An important assumption we made was that we would only use songs from the same key: C major/A minor. The reason is that by keeping every song we produced in the same key, our output would sound more song-like, as the network would never learn notes that would cause a song to go off key. To do this we used a script we found here from Nick Kelly. This part was really simple but gave us a huge improvement in our results.

Variable Length Notes and Rests

An important part of music is the dynamic and creative use of variable length notes and rests. That one long note struck by the guitarist followed by a peaceful pause can send a wave of emotion to the listener as we hear the heart and soul of the player spilled out into the world. To capture this we looked into ways of introducing long notes, short notes, and rests so that we could create different emotions throughout the song.
In order to implement this we looked at the pitch and duration of a note together and treated each combination as a separate value we could input into our network. This meant that a C# played for 0.5 seconds and a C# played for 1 second would be treated as different values by the network. This allowed us to learn which pitches were played longer or shorter than others and enabled us to combine notes to produce something that sounded natural and fitting for that part of the song.

Of course rests cannot be forgotten, as they are crucial for guiding the listener to a place of anticipation or excitement. A slow note and a pause followed by a burst of quick-firing notes can create a different emotion than several long notes with long pauses between. We felt this was important in order to replicate the experience the listener has when listening to a relaxing Sunday afternoon song or a Friday night party anthem.

To achieve these goals we had to focus on our preprocessing. Again we started with the code from Skúli’s tutorial and adapted it to fit our needs.

    from music21 import chord, note

    # notes_to_parse, notes, prev_offset and TIMESTEP are defined earlier
    for element in notes_to_parse:
        if isinstance(element, (note.Note, chord.Chord)):
            duration = element.duration.quarterLength
            if isinstance(element, note.Note):
                name = element.pitch
            elif isinstance(element, chord.Chord):
                name = ".".join(str(n) for n in element.normalOrder)
            # encode pitch and duration together as one token
            notes.append(f"{name}${duration}")

            # fill the gap since the previous note with rest tokens
            rest_notes = int((element.offset - prev_offset) / TIMESTEP - 1)
            for _ in range(0, rest_notes):
                notes.append("NULL")

            prev_offset = element.offset

To elaborate on the code above, we create notes by combining their pitch and duration with a “$” to feed into our network. For example “A$1.0”, “A$0.75”, “B$0.25”, etc. would all be encoded separately for use by our network (inputs are encoded by mapping each unique note/duration to an integer, then dividing all of the integers by the number of unique combinations, thus encoding each one as a floating point number between 0 and 1).
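The encoding described in the parenthetical above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the project's actual code: the helper name `encode_notes` and the sample tokens are made up for the example, and assigning integers in sorted order starting at 0 is an assumption (the project's generation code appears to reserve index 0 for "NULL", which this sketch does not do).

```python
def encode_notes(notes):
    """Map each unique "pitch$duration" token to an integer, then scale
    by the vocabulary size so every value lies in [0, 1)."""
    vocab = sorted(set(notes))  # unique note/duration combinations
    note_to_int = {token: i for i, token in enumerate(vocab)}
    n_vocab = len(vocab)
    encoded = [note_to_int[token] / float(n_vocab) for token in notes]
    return encoded, note_to_int


# Hypothetical sample sequence in the "pitch$duration" format, with one rest.
sample = ["A$1.0", "A$0.75", "NULL", "B$0.25", "A$1.0"]
encoded, note_to_int = encode_notes(sample)
print(note_to_int)
print(encoded)
```

Note that the same pitch with two different durations ("A$1.0" and "A$0.75") ends up as two unrelated numbers, which is exactly the behavior described above.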
The more interesting part is calculating how many rests to insert. We look at the offset of the current note and compare it to the offset of the last note we looked at. We take this gap and divide it by our time step to calculate how many rest notes we can fit (minus 1, because this really calculates how many notes fit in the gap, and one of them is our actual next note, so we don’t want to double count it). For example, if one note started at 0.5s and the next didn’t start until 1.0s, then with a time step of 0.25 (each note is played in 0.25s intervals) we would need one rest note to fill the gap, since (1.0 - 0.5) / 0.25 - 1 = 1.

Song Structure

Lastly, one of the most important parts of writing a song is the structure, and this is one of the things we found lacking in existing solutions. From what I have seen, most researchers are hoping for their network to learn this on its own, and I don’t think that is a misguided approach. However, I think this introduces complexity to the problem and leads to further difficulty. This could be a source of improvement upon our solution though, as we take a more manual approach and assume a constant pattern.

One of the key assumptions we made is that we would only produce songs that follow the specific pattern ABCBDB where:

  • A is the first verse
  • B is the chorus
  • C is the second verse
  • D is the bridge

Initially we tried ABABCB, but this felt too formulaic. To resolve this we decided to introduce a second verse that was different from the first but still related. We generated the first verse from a random note and then generated the second verse based on the first. Effectively this is generating a single section that is twice as long and splitting it in half. The thought process here was that if we create one verse, the second should still fit the same vibe, and by using the first as a reference we could achieve this.
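The sliding-window generation and the "twice as long, split in half" idea can be illustrated without a trained model. In the sketch below, `toy_predict` is a made-up stand-in for the network's prediction (it simply returns the sum of the window modulo the vocabulary size), and `generate_section` is a hypothetical helper; neither comes from the project.

```python
def toy_predict(window, n_vocab):
    """Stand-in for the model: deterministically picks the next index."""
    return sum(window) % n_vocab


def generate_section(pattern, length, n_vocab):
    """Generate `length` indices, sliding the window forward each step.

    Returns the generated indices and the final window, so that a later
    section can continue from where this one stopped.
    """
    pattern = list(pattern)  # copy so the caller's seed is not mutated
    output = []
    for _ in range(length):
        index = toy_predict(pattern, n_vocab)
        output.append(index)
        pattern = pattern[1:] + [index]  # drop the oldest, append the newest
    return output, pattern


seed = [1, 2, 3, 4]
n_vocab = 10

# Verse 1 starts from the seed; verse 2 starts from verse 1's final window.
verse1, window = generate_section(seed, 4, n_vocab)
verse2, _ = generate_section(window, 4, n_vocab)

# One section of double length, split in half, gives the same two verses.
both, _ = generate_section(seed, 8, n_vocab)
print(verse1, verse2, both == verse1 + verse2)
```

With a deterministic predictor, continuing from the first verse's final window is the same as generating one long section and cutting it in two, which is why the second verse stays related to the first.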
    def generate_notes(self, model, network_input, pitchnames, n_vocab):
        """ Generate notes from the neural network based on a sequence
            of notes """
        # index 0 is reserved for rests; real notes start at 1
        int_to_note = dict(
            (number + 1, note) for number, note in enumerate(pitchnames)
        )
        int_to_note[0] = "NULL"

        def get_start():
            # pick a random sequence from the input as a starting point
            # for the prediction
            start = numpy.random.randint(0, len(network_input) - 1)
            pattern = network_input[start]
            prediction_output = []
            return pattern, prediction_output

        # generate verse 1
        verse1_pattern, verse1_prediction_output = get_start()
        for note_index in range(4 * SEQUENCE_LEN):
            prediction_input = numpy.reshape(
                verse1_pattern, (1, len(verse1_pattern), 1)
            )
            prediction_input = prediction_input / float(n_vocab)

            prediction = model.predict(prediction_input, verbose=0)

            index = numpy.argmax(prediction)
            result = int_to_note[index]
            verse1_prediction_output.append(result)

            verse1_pattern.append(index)
            verse1_pattern = verse1_pattern[1 : len(verse1_pattern)]

        # generate verse 2 (continues from where verse 1 left off)
        verse2_pattern = verse1_pattern
        verse2_prediction_output = []
        for note_index in range(4 * SEQUENCE_LEN):
            prediction_input = numpy.reshape(
                verse2_pattern, (1, len(verse2_pattern), 1)
            )
            prediction_input = prediction_input / float(n_vocab)

            prediction = model.predict(prediction_input, verbose=0)

            index = numpy.argmax(prediction)
            result = int_to_note[index]
            verse2_prediction_output.append(result)

            verse2_pattern.append(index)
            verse2_pattern = verse2_pattern[1 : len(verse2_pattern)]

        # generate chorus
        chorus_pattern, chorus_prediction_output = get_start()
        for note_index in range(4 * SEQUENCE_LEN):
            prediction_input = numpy.reshape(
                chorus_pattern, (1, len(chorus_pattern), 1)
            )
            prediction_input = prediction_input / float(n_vocab)

            prediction = model.predict(prediction_input, verbose=0)

            index = numpy.argmax(prediction)
            result = int_to_note[index]
            chorus_prediction_output.append(result)

            chorus_pattern.append(index)
            chorus_pattern = chorus_pattern[1 : len(chorus_pattern)]

        # generate bridge
        bridge_pattern, bridge_prediction_output = get_start()
        for note_index in range(4 * SEQUENCE_LEN):
            prediction_input = numpy.reshape(
                bridge_pattern, (1, len(bridge_pattern), 1)
            )
            prediction_input = prediction_input / float(n_vocab)

            prediction = model.predict(prediction_input, verbose=0)

            index = numpy.argmax(prediction)
            result = int_to_note[index]
            bridge_prediction_output.append(result)

            bridge_pattern.append(index)
            bridge_pattern = bridge_pattern[1 : len(bridge_pattern)]

        # assemble the song in the ABCBDB pattern
        return (
            verse1_prediction_output
            + chorus_prediction_output
            + verse2_prediction_output
            + chorus_prediction_output
            + bridge_prediction_output
            + chorus_prediction_output
        )

Results

We were able to achieve surprising results from this approach. We could consistently generate unique songs that fell into the genre that we trained the respective network on. Below are some example outputs from our various networks.

  • Ragtime
  • Christmas
  • Rap

Conclusion

Music generation by machines is indeed possible. Is it better, or could it be better, than music generated by humans? Only time will tell. From these results though, I would say that it’s definitely possible.

Future Work

Several improvements could be made that would bring this even closer to true music. Some possible ideas/experiments include:

  • Learn patterns in songs rather than manually piecing together parts
  • Take note duration as a separate input to the network rather than treating each pitch/duration separately
  • Expand to multiple instruments
  • Move away from midi files and produce/learn from actual MP3s
  • Learn the time step, sequence length, and time signature
  • Introduce randomness to emulate “human error/experimentation”
  • Allow for multiple keys
  • Learn how to use intros and outros

Acknowledgments

I would like to thank my teammates Izaak Sulka and Jeff Greene for their help on this project, as well as my friend Mitch Burdick for his expertise on music that enabled us to get these great results.
And of course we would like to thank Sigurður Skúli for their tutorial as it gave us a great starting point and something to reference. Last but not least I would like to thank Nick Kelly for his script to transpose songs to C major.

The code for this project can be found here: https://github.com/tylerdoll/music-generator

Disclaimer: the music used in our project does not belong to us and was sourced from various public websites.