ScoReader
DAVID LE | OLIVER LO | DEREK LY | DEREK NGO | CAMERON MOUSIGHI | CHASE RIGGINS
Faculty Mentor: Dr. Christine Julien
Senior Design Open House - April 29th, 2015 - The University of Texas at Austin
Background
Musicians are often expected to learn new music quickly.
Hearing the actual pitches of an unfamiliar piece drastically
reduces the time a player needs before performing it.
With our application, musicians can hear a synthesized
version of a song that clears up questions of pitch and
rhythm, even for songs without easily accessible
professional recordings.
Summary of Design & Block Diagram
Our application captures an image of sheet music, sends
that image to a remote server for processing, and then plays
the returned audio file back to the user. This process is
illustrated in Figure 1 below.
Figure 1: System Block Diagram
Optical Music Recognition is computationally expensive,
which is why we chose to offload processing to a server. We
chose a Google Cloud Platform server with the following
specifications:
Component   Quantity    Details
CPU         2 vCPUs     2.6 GHz Intel Xeon E5
Memory      13 GB RAM   6.50 GB per virtual core
OMR PIPELINE (Figure 2 below)
1. Capture image and correct for skew
2. Convert image from RGB to grayscale
3. Remove all staff lines
4. Decompose sheet music into individual staff lines
5. Further decompose each staff line into musical symbols
6. Compare segmented images to database of musical symbols using template matching
7. Musically interpret these symbols in the order they appear
Figure 2: OMR Pipeline in Detail (panels: 1. Original Image, 2. Thresholding, 3. Line Removal, 4. Parsed Line, 5. Parsed Symbols)
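The grayscale, thresholding, staff-line removal, and symbol segmentation stages of the OMR pipeline above can be illustrated with a minimal sketch. This is not the team's implementation: the function names, the luminance weights, the cutoff of 128, the 80% row-fill rule for spotting staff lines, and the "keep a pixel if ink touches it from above or below" heuristic are all assumptions chosen to keep the example small and dependency-free.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luminance values."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]


def threshold(gray, cutoff=128):
    """Binarize the image: 1 = ink (dark pixel), 0 = paper."""
    return [[1 if px < cutoff else 0 for px in row] for row in gray]


def find_staff_rows(binary, min_fill=0.8):
    """Treat any row that is at least min_fill ink as part of a staff line."""
    width = len(binary[0])
    return [y for y, row in enumerate(binary) if sum(row) >= min_fill * width]


def remove_staff_lines(binary, staff_rows):
    """Erase staff-line pixels, keeping those where a symbol crosses the line."""
    staff = set(staff_rows)
    height = len(binary)
    cleaned = [row[:] for row in binary]
    for y in staff_rows:
        for x, px in enumerate(binary[y]):
            if not px:
                continue
            # Ink directly above or below (outside the staff line) suggests
            # a symbol passes through this pixel, so it is preserved.
            above = y > 0 and (y - 1) not in staff and binary[y - 1][x]
            below = y < height - 1 and (y + 1) not in staff and binary[y + 1][x]
            if not (above or below):
                cleaned[y][x] = 0
    return cleaned


def split_symbols(binary):
    """Return (start, end) column spans that contain ink; end is exclusive."""
    width = len(binary[0])
    has_ink = [any(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


# Tiny synthetic "score": one staff line (row 3) crossed by a 3x2 note blob.
WHITE, BLACK = (255, 255, 255), (0, 0, 0)
image = [[WHITE] * 10 for _ in range(7)]
image[3] = [BLACK] * 10
for y in (2, 3, 4):
    for x in (4, 5):
        image[y][x] = BLACK

binary = threshold(to_grayscale(image))
staff_rows = find_staff_rows(binary)
cleaned = remove_staff_lines(binary, staff_rows)
symbols = split_symbols(cleaned)
```

On this synthetic image the staff line is erased everywhere except where the note blob crosses it, and the column projection then isolates the blob as a single span. Each span would be the input to the template-matching step.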
Problem Definition
PRIMARY PROBLEMS
• Image capture
    • Image normalization
    • Image verification
    • Image stabilization
• Optical Music Recognition
    • Real-time processing speed
    • Accessibility
    • Object recognition & musical interpretation
REQUIREMENTS
• Processing time per page: 30 seconds
• Read note lengths: 64th to double whole
• Read note pitches: ±3 staff ledger lines
• Read common key signatures & time signatures
CONSTRAINTS
• Environment - Visibility: indoors, well-lit area, low glare
• Environment - Noise: ≤ 55 dB
• User cannot tilt camera more than 35° from horizontal
• User must have access to Wi-Fi
• Music limited to 2 instruments
• Music must be printed clearly in a standard font
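To make the pitch and note-length requirements above concrete, the sketch below maps a vertical staff position to a MIDI note number and lists the supported note lengths in quarter-note beats. It assumes a treble clef, natural notes only (no key signature or accidentals), and that "±3 ledger lines" means up to the third ledger line itself. The names treble_step_to_midi, pitch_name, and NOTE_BEATS are illustrative and do not come from the project.

```python
from fractions import Fraction

# Semitone offsets of C, D, E, F, G, A, B above the C of the same octave.
SEMITONES = [0, 2, 4, 5, 7, 9, 11]
LETTERS = "CDEFGAB"

# Step 0 is the bottom staff line (E4 in treble clef); each +1 moves up one
# line or space. The third ledger line below is step -6, above is step 14.
LOWEST_STEP, HIGHEST_STEP = -6, 14


def _octave_and_letter(step):
    if not LOWEST_STEP <= step <= HIGHEST_STEP:
        raise ValueError("position lies outside the ±3 ledger line range")
    # E4 is letter index 2 (C=0, D=1, E=2) in octave 4.
    return divmod(4 * 7 + 2 + step, 7)


def treble_step_to_midi(step):
    """MIDI note number for a natural note at the given staff step."""
    octave, letter = _octave_and_letter(step)
    return 12 * (octave + 1) + SEMITONES[letter]


def pitch_name(step):
    """Scientific pitch name such as 'E4' for the given staff step."""
    octave, letter = _octave_and_letter(step)
    return f"{LETTERS[letter]}{octave}"


# Supported note lengths, expressed in quarter-note beats.
NOTE_BEATS = {
    "64th": Fraction(1, 16),
    "32nd": Fraction(1, 8),
    "16th": Fraction(1, 4),
    "eighth": Fraction(1, 2),
    "quarter": Fraction(1),
    "half": Fraction(2),
    "whole": Fraction(4),
    "double whole": Fraction(8),
}
```

Under these assumptions the readable treble range runs from F3 (third ledger line below the staff) to E6 (third ledger line above), and the longest supported note is 128 times the length of the shortest.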
Abstract
ScoReader is an Optical Music Recognition (OMR)
application built for Android and Google Glass. We wanted
to advance and refine current OMR implementations while
developing on an accessible platform. Our project is unique
in utilizing Glass to provide musicians with a hands-free
experience.
Testing & Evaluation Results
Ninety percent of pitches and rhythms in the samples
we supplied were correctly identified and met our stated
requirements. We organized our testing as follows:
• Glass UI (Subsystem)
    • Image normalization/verification
    • Music playback
• Server (Subsystem)
    • Line removal & staff/object segmentation
    • Object recognition/musical interpretation
Figure 3: Original Twinkle Twinkle Little Star Sheet Music
Figure 4: Conversion Accuracy by Pitch & Rhythm
In retrospect, we would have liked to get an earlier start on the
image processing modules, since their complexity made them hard
to debug. Additionally, creating our app solely for Android
would have allowed us to focus more on the functionality of the
app. If we were to continue working on the project, we would
extend our app to read stylistic markings, like dynamics,
tempo, accents, and instrument types.
To evaluate ScoReader, we compared the pitches and rhythms
produced by the output file to the pitches and rhythms in the
sheet music. For each note, we considered the conversion a
100% success if both the pitch and rhythm were accurate. If
only one of these was correct, we counted it as a 50% success,
and if both were missed, it was a failure. The final score is the
average of the scores of all notes on the page.
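The per-note scoring rule described above can be written out as a short sketch. It assumes each note is a (pitch, rhythm) pair and that the expected and recognized notes are already aligned one-to-one; the poster does not say how missing or extra notes were handled, so this example simply rejects pages of unequal length. The function names are illustrative.

```python
def note_score(expected, actual):
    """Return 1.0 if pitch and rhythm both match, 0.5 if one matches, else 0.0."""
    pitch_ok = expected[0] == actual[0]
    rhythm_ok = expected[1] == actual[1]
    return (pitch_ok + rhythm_ok) / 2


def page_score(expected_notes, actual_notes):
    """Average the per-note scores over every note on the page."""
    if len(expected_notes) != len(actual_notes):
        raise ValueError("this sketch assumes the notes are aligned one-to-one")
    if not expected_notes:
        raise ValueError("page contains no notes")
    scores = [note_score(e, a) for e, a in zip(expected_notes, actual_notes)]
    return sum(scores) / len(scores)


# Hypothetical four-note example: (pitch name, length in quarter-note beats).
expected = [("C4", 1.0), ("D4", 1.0), ("E4", 2.0), ("F4", 0.5)]
actual = [("C4", 1.0), ("D4", 0.5), ("G4", 1.0), ("F4", 0.5)]
score = page_score(expected, actual)
```

In this made-up example the four notes score 1.0, 0.5, 0.0, and 1.0, so the page score is 0.625.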
ScoReader: A Mobile Computer Vision System for Optical Music Recognition