This document summarizes research on implementing musical instrument recognition using convolutional neural networks (CNNs) and support vector machines (SVMs). The researchers aim to preprocess audio excerpts into images and use CNNs to achieve high accuracy in instrument classification. They will then combine CNN and SVM classifications and take a weighted average to achieve even higher accuracy. The document reviews several related works that used features like MFCCs and classifiers like SVMs, GMMs, and neural networks for instrument recognition. The researchers intend to use mel spectrograms and MFCCs to represent audio as images for CNN classification and improve music information retrieval and organization.