The following slides were presented at PyCon 2017 in Portland, USA. They are an improved version of the project presented at PyCon India 2016.
The talk highlights the ability of machines to generate something as beautiful as music using a neural network model called LSTM. It also covers some of the challenges and trade-offs we faced during the project.
12. We’ll go with RNN...
* http://colah.github.io/posts/2015-08-Understanding-LSTMs/
13. Can we make it remember for a longer period of time?
Yes we can! By using LSTM networks.
* http://colah.github.io/posts/2015-08-Understanding-LSTMs/
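To make the "remembering" idea concrete, here is a minimal NumPy sketch of a single LSTM time step, following the standard gate formulation described in the post linked above. It is an illustration only, not the code used in this project: the function name `lstm_step`, the stacked weight layout, and the toy sizes are all assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step (hypothetical helper, not from the project).

    W maps the concatenated [h_prev, x] to four stacked gate pre-activations.
    """
    n = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    f = sigmoid(z[0:n])            # forget gate: how much old memory to keep
    i = sigmoid(z[n:2 * n])        # input gate: how much new information to write
    o = sigmoid(z[2 * n:3 * n])    # output gate: how much memory to expose
    g = np.tanh(z[3 * n:4 * n])    # candidate values to write
    c = f * c_prev + i * g         # cell state carries the long-term memory
    h = o * np.tanh(c)             # hidden state is the step's output
    return h, c

# Run a few steps on random input (toy sizes chosen for illustration).
rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4
W = rng.normal(scale=0.1, size=(4 * hidden_size, hidden_size + input_size))
b = np.zeros(4 * hidden_size)
h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
for x in rng.normal(size=(5, input_size)):
    h, c = lstm_step(x, h, c, W, b)

# Memory demo: with the forget gate near 1 and the input gate near 0,
# the cell state passes through a step essentially unchanged.
b_keep = np.zeros(4 * hidden_size)
b_keep[0:hidden_size] = 100.0
b_keep[hidden_size:2 * hidden_size] = -100.0
c_start = np.array([0.5, -0.3, 0.8, 0.1])
_, c_kept = lstm_step(np.ones(input_size), np.zeros(hidden_size), c_start,
                      np.zeros_like(W), b_keep)
```

The last few lines show why the cell state helps with long-range memory: when the forget gate stays open and the input gate stays closed, information is carried forward without being overwritten.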
22. How to do it with Python?
from keras.layers.recurrent import LSTM  # import LSTM network
from keras.models import Sequential  # import Sequential model
model = Sequential()  # instantiate model
# add LSTM layer to the model
model.add(LSTM(input_dim=num_hidden_dimensions,
               output_dim=num_hidden_dimensions, return_sequences=True))
24. Train the model
cur_iter = 0  # number of epochs trained so far
while cur_iter < num_iters:
    # Iterate over the training data in batches
    history = model.fit(X_train, y_train,
                        batch_size=batch_size, nb_epoch=epochs_per_iter,
                        verbose=1, validation_split=0.0)
    cur_iter += epochs_per_iter
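Because the counter advances by `epochs_per_iter` on every pass, `num_iters` in this loop is effectively a total epoch budget rather than the number of `fit` calls. The sketch below counts the calls for the parameter values shown later in the deck; `count_fit_calls` is a hypothetical helper written for this illustration, and the `fit` call itself is replaced by a counter.

```python
def count_fit_calls(num_iters, epochs_per_iter):
    """Replay the loop's counter logic without training anything."""
    cur_iter = 0
    calls = 0
    while cur_iter < num_iters:
        calls += 1                   # stands in for model.fit(...)
        cur_iter += epochs_per_iter  # each call trains epochs_per_iter epochs
    return calls, cur_iter

short_run = count_fit_calls(100, 25)   # (4, 100)
long_run = count_fit_calls(2000, 25)   # (80, 2000)
print(short_run, long_run)
```

So, as the loop is written, the 100-iteration setting amounts to 4 calls of 25 epochs each, and the 2000-iteration setting to 80 such calls.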
26. Generate the music
# We take the first chunk of the training data as a seed sequence
seed_seq = seed_generator.generate_copy_seed_sequence(seed_length=seed_len,
                                                      training_data=X_train)
# This defines how long the final song is. Total song length in samples = max_seq_len * example_len
output = sequence_generator.generate_from_seed(model=model, seed=seed_seq,
                                               sequence_length=max_seq_len)
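One plausible way to generate from a seed is an autoregressive loop: predict the next block, append it to the sequence, and feed the longer sequence back in. The sketch below illustrates that idea with a stand-in model; it is not the repository's `generate_from_seed`, and `EchoModel` and `generate_from_seed_sketch` are hypothetical names introduced only for this example.

```python
import numpy as np

class EchoModel:
    """Stand-in for a trained network: each predicted block is half its input block."""
    def predict(self, batch):
        return 0.5 * batch

def generate_from_seed_sketch(model, seed, sequence_length):
    # seed has shape (1, seed_steps, example_len)
    seq = seed.copy()
    output = []
    for _ in range(sequence_length):
        pred = model.predict(seq)        # shape (1, steps, example_len)
        next_block = pred[:, -1:, :]     # keep only the newest time step
        output.append(next_block[0, 0])
        # feed the prediction back in as part of the next input
        seq = np.concatenate([seq, next_block], axis=1)
    return np.array(output)              # shape (sequence_length, example_len)

seed = np.ones((1, 2, 4))                # toy seed: 2 steps of 4 samples each
song = generate_from_seed_sketch(EchoModel(), seed, sequence_length=3)
samples = song.reshape(-1)               # flatten blocks into one sample stream
```

With 3 generated blocks of 4 samples each, `samples` holds 12 values, which matches the slide's formula: total length in samples = max_seq_len * example_len.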
31. How long does it take?
After about 100 iterations over 20 different music files.
# Parameters
num_iters = 100
epochs_per_iter = 25
batch_size = 5
32. After 2000 iterations over 20 different music files.
Well, it takes time…
# Parameters
num_iters = 2000
epochs_per_iter = 25
batch_size = 5
33. Python libraries used
• LAME and SoX to convert mp3 files into other formats such as wav.
• NumPy and SciPy for mathematical computations on tensors.
• Matplotlib for visualizing the input.
• Keras version 0.1.0 with Theano as the backend.
34. The entire code is on GitHub
https://github.com/unnati-xyz/music-generation
35. Key notes for using this code to generate your own music
Step 1: Convert the given mp3 files into NumPy tensors.
python convert_directory.py  # converts mp3 files to numpy tensors
Step 2: Train the model.
python train.py  # creates LSTM model
Step 3: Generate the music.
python generate.py  # writes the generated music to a file named generated_song.wav
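To give a feel for what "converting audio into tensors" in Step 1 can involve, here is a minimal sketch that splits a stream of 16-bit samples into fixed-size blocks and builds an input/target pair shifted by one block. This is an assumption-laden illustration, not what `convert_directory.py` does; the actual script may apply further processing. The helper names `samples_to_blocks` and `make_training_pair`, the block size, and the synthetic audio are all made up for the example.

```python
import numpy as np

def samples_to_blocks(samples, block_size):
    """Scale int16 samples to [-1, 1] and cut them into equal blocks."""
    samples = samples.astype(np.float32) / 32767.0
    pad = (-len(samples)) % block_size           # zero-pad the final block
    padded = np.concatenate([samples, np.zeros(pad, dtype=np.float32)])
    return padded.reshape(-1, block_size)

def make_training_pair(blocks):
    """Input is every block but the last; target is the same sequence shifted by one."""
    X = blocks[:-1][np.newaxis, ...]             # shape (1, steps, block_size)
    y = blocks[1:][np.newaxis, ...]
    return X, y

# Synthetic stand-in for decoded audio (10 samples of a sine wave).
t = np.arange(10)
fake_audio = (np.sin(t) * 32767).astype(np.int16)
blocks = samples_to_blocks(fake_audio, block_size=4)
X, y = make_training_pair(blocks)
```

Training on pairs shaped like this teaches the network to predict the next block of audio from the blocks before it, which is the property the generation step relies on.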