26. Convolutional
Neural
Networks
Keunwoo.Choi
@qmul.ac.uk
Overview
CNNs vs DNNs
CNN structures
Inside CNNs
CNN use-cases
References
Hierarchical features
Hierarchical feature learning
Each layer learns features at a different level of the hierarchy
High-level features are built on low-level features
E.g.
Layer 1: Edges (low-level, concrete)
Layer 2: Simple shapes
Layer 3: Complex shapes
Layer 4: More complex shapes
Layer 5: Shapes of target objects (high-level, abstract)
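One way to see why deeper layers can represent larger, more abstract structures is that each unit in a deeper layer depends on a wider patch of the input (its receptive field). The sketch below computes that width for a stack of layers. The five-block stack of 3-wide convolutions and 2-wide poolings is an assumed example, not the architecture used in these slides, and `receptive_fields` is a hypothetical helper name.

```python
def receptive_fields(layers):
    """layers: list of (kernel_size, stride) pairs, first layer first.

    Returns the receptive field width (in input samples, along one axis)
    at the output of each layer.
    """
    r = 1      # receptive field of a single input sample
    jump = 1   # distance in input samples between neighbouring outputs
    sizes = []
    for kernel, stride in layers:
        r += (kernel - 1) * jump  # each extra kernel tap reaches `jump` samples further
        jump *= stride            # striding spreads later outputs further apart
        sizes.append(r)
    return sizes


# Assumed stack: five blocks of (3-wide convolution, 2-wide pooling with stride 2).
stack = [(3, 1), (2, 2)] * 5
sizes = receptive_fields(stack)
conv_sizes = sizes[0::2]  # receptive field at the output of each convolution
print(conv_sizes)         # [3, 8, 18, 38, 78]
```

Under these assumptions a layer-1 unit sees only 3 input samples (enough for an edge), while a layer-5 unit sees 78, which is consistent with the low-level to high-level progression listed above.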
What is learned in CNNs?
1/2 in music genre classification task
Layer 1/5
Figure: spectrograms of four clips (Bach, Dream, Toy, Eminem), one panel group per row below
Original
[Feature 1-9], Crude onset detector
[Feature 1-27], Onset detector
[2]
blog demo
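To make "onset detector" concrete, here is a minimal sketch of how a single convolution kernel can respond to note onsets in a spectrogram. The kernel is hand-written for illustration and is not one of the learned features from [2]; the toy spectrogram and the helper names `correlate2d_valid` and `relu` are also assumptions.

```python
def correlate2d_valid(image, kernel):
    """Plain 2D cross-correlation without padding (the operation a conv layer applies)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses, set negative ones to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Toy spectrogram: rows are frequency bins, columns are time frames.
# Energy appears in every bin at frame 3, i.e. a vertical edge (an onset).
spectrogram = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(4)]

# Hand-written kernel: energy in the next frame minus energy in the current
# frame, summed over three neighbouring frequency bins.
onset_kernel = [[-1.0, 1.0] for _ in range(3)]

activation = relu(correlate2d_valid(spectrogram, onset_kernel))
print(activation)
# [[0.0, 0.0, 3.0, 0.0, 0.0], [0.0, 0.0, 3.0, 0.0, 0.0]]
```

The feature map is non-zero only at the transition into frame 3. A note ending (energy dropping) would give a negative response, which the ReLU removes.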
What is learned in CNNs?
1/2 in music genre classification task
Layer 2/5
Figure: spectrograms of four clips (Bach, Dream, Toy, Eminem), one panel group per row below
Original
[Feature 2-0], Good onset detector
[Feature 2-1], Bass note selector
[Feature 2-10], Harmonic selector
[Feature 2-48], Melody (large energy)
[2]
What is learned in CNNs?
1/2 in music genre classification task
Layer 3/5
Figure: spectrograms of four clips (Bach, Dream, Toy, Eminem), one panel group per row below
Original
[Feature 3-1], Better onset detector
[Feature 3-7], Melody (top note)
[Feature 3-38], Kick drum extractor
[Feature 3-40], Percussive eraser
[2]
What is learned in CNNs?
1/2 in music genre classification task
Layer 4/5
Figure: spectrograms of four clips (Bach, Dream, Toy, Eminem), one panel group per row below
Original
[Feature 4-5], Lowest notes selector
[Feature 4-11], Vertical line eraser
[Feature 4-30], Long horizontal line selector
[2]
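In a spectrogram, sustained notes show up as horizontal lines and percussive hits as vertical lines. The sketch below shows, under stated assumptions, how one kernel can keep the former and erase the latter. The kernel is a hand-written horizontal line detector, not a learned feature from [2], and the toy spectrogram and helper names are hypothetical.

```python
def correlate2d_valid(image, kernel):
    """Plain 2D cross-correlation without padding."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(len(image[0]) - kw + 1)
        ]
        for i in range(len(image) - kh + 1)
    ]


def relu(feature_map):
    """Keep positive responses, set negative ones to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Toy spectrogram (5 frequency bins x 6 frames): a sustained tone in bin 2
# plus a broadband click at frame 4. Energies add where they overlap.
spectrogram = [
    [(1.0 if f == 2 else 0.0) + (1.0 if t == 4 else 0.0) for t in range(6)]
    for f in range(5)
]

# Hand-written kernel: rewards energy in the centre bin across three frames
# and penalises energy in the bins directly above and below.
horizontal_kernel = [
    [-1.0, -1.0, -1.0],
    [2.0, 2.0, 2.0],
    [-1.0, -1.0, -1.0],
]

activation = relu(correlate2d_valid(spectrogram, horizontal_kernel))
for row in activation:
    print(row)
# [0.0, 0.0, 0.0, 0.0]
# [6.0, 6.0, 6.0, 6.0]
# [0.0, 0.0, 0.0, 0.0]
```

Each column of the kernel sums to zero, so the click (equal energy in every bin of one frame) contributes nothing, while the sustained tone gives a constant response along its own frequency bin.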
What is learned in CNNs?
1/2 in music genre classification task
Layer 5/5
Figure: spectrograms of four clips (Bach, Dream, Toy, Eminem), one panel group per row below
Original
[Feature 5-11], texture 1
[Feature 5-15], texture 2
[Feature 5-56], Harmo-Rhythmic structure
[Feature 5-33], texture 3
[2]
References I
[1] Choi, K., Fazekas, G., Sandler, M.: Automatic tagging using deep convolutional neural networks. In: Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR 2016), New York, USA (2016)
[2] Choi, K., Fazekas, G., Sandler, M.: Explaining convolutional neural networks on music classification (submitted). In: IEEE Conference on Machine Learning and Signal Processing (2016)
[3] Dieleman, S., Schrauwen, B.: End-to-end learning for music audio. In: Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. pp. 6964–6968. IEEE (2014)
References II
[4] Gatys, L.A., Ecker, A.S., Bethge, M.: A neural algorithm of artistic style. arXiv preprint arXiv:1508.06576 (2015)
[5] Humphrey, E.J., Bello, J.P.: From music audio to chord tablature: Teaching deep convolutional networks to play guitar. In: Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. pp. 6974–6978. IEEE (2014)
[6] LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998)
[7] Li, P., Qian, J., Wang, T.: Automatic instrument recognition in polyphonic music using convolutional neural networks. arXiv preprint arXiv:1511.05520 (2015)
References III
[8] Park, T., Lee, T.: Music-noise segmentation in spectrotemporal domain using convolutional neural networks. ISMIR late-breaking session (2015)
[9] Schlüter, J., Böck, S.: Improved musical onset detection with convolutional neural networks. In: International Conference on Acoustics, Speech and Signal Processing. IEEE (2014)
[10] Ullrich, K., Schlüter, J., Grill, T.: Boundary detection in music structure analysis using convolutional neural networks. In: Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014), Taipei, Taiwan (2014)
References IV
[11] Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Computer Vision–ECCV 2014, pp. 818–833. Springer (2014)
[12] Zheng, S., Jayasumana, S., Romera-Paredes, B., Vineet, V., Su, Z., Du, D., Huang, C., Torr, P.H.: Conditional random fields as recurrent neural networks. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 1529–1537 (2015)