ScoReader
DAVID LE | OLIVER LO | DEREK LY | DEREK NGO | CAMERON MOUSIGHI | CHASE RIGGINS
Faculty Mentor: Dr. Christine Julien
Senior Design Open House - April 29th, 2015 - The University of Texas at Austin
Background
Musicians are often expected to learn new music quickly.
Hearing the actual pitches of an unfamiliar piece drastically
reduces the time required for a player to perform said
piece. With our application, musicians will be able to hear
a synthesized version of a song that will clear up issues of
pitch and rhythm, even for songs without easily accessible
professional recordings.
Figure 2: OMR Pipeline in Detail
Panels: 1. Original Image, 2. Thresholding, 3. Line Removal, 4. Parsed Line, 5. Parsed Symbols
OMR PIPELINE (Figure 2 Below)
1. Capture image and correct for skew
2. Convert image from RGB to grayscale
3. Remove all staff lines
4. Decompose sheet music into individual staff lines
5. Further decompose each staff line into musical symbols
6. Compare segmented images to a database of musical
   symbols using template matching
7. Musically interpret these symbols in the order they appear
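Steps 2, 3, and 6 of the pipeline above can be sketched in a few lines of plain Python. This is a minimal illustration, not the team's implementation: the function names are hypothetical, the staff-line detector is a simple row-projection heuristic, and the pixel-agreement score is a simplified stand-in for template matching, since the poster does not say which matching metric was used. It assumes the image has already been corrected for skew and that staff lines are only one or two pixels thick.

```python
from typing import Dict, List, Tuple

Binary = List[List[int]]  # rows of pixels after thresholding: 1 = ink, 0 = background


def to_grayscale(rgb: List[List[Tuple[int, int, int]]]) -> List[List[int]]:
    """Step 2: convert RGB pixels to 0-255 luminance (ITU-R BT.601 weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]


def threshold(gray: List[List[int]], cutoff: int = 128) -> Binary:
    """Mark dark pixels as ink (1) and light pixels as background (0)."""
    return [[1 if p < cutoff else 0 for p in row] for row in gray]


def find_staff_rows(binary: Binary, min_fill: float = 0.8) -> List[int]:
    """Treat any row that is at least min_fill ink as part of a staff line."""
    width = len(binary[0])
    return [y for y, row in enumerate(binary) if sum(row) >= min_fill * width]


def remove_staff_lines(binary: Binary, min_fill: float = 0.8) -> Binary:
    """Step 3: erase staff-line pixels, keeping those where a symbol crosses.

    A staff pixel is kept when there is ink directly above or below it in a
    non-staff row. Assumption: lines are thin (one or two pixels).
    """
    staff = set(find_staff_rows(binary, min_fill))
    out = [row[:] for row in binary]
    height = len(binary)
    for y in staff:
        for x in range(len(binary[y])):
            above = y > 0 and (y - 1) not in staff and binary[y - 1][x]
            below = y + 1 < height and (y + 1) not in staff and binary[y + 1][x]
            if not (above or below):
                out[y][x] = 0
    return out


def match_score(symbol: Binary, template: Binary) -> float:
    """Fraction of pixels that agree between two equal-sized binary images."""
    total = len(symbol) * len(symbol[0])
    same = sum(s == t
               for srow, trow in zip(symbol, template)
               for s, t in zip(srow, trow))
    return same / total


def best_template(symbol: Binary, templates: Dict[str, Binary]) -> str:
    """Step 6: return the name of the template that agrees with the symbol most."""
    return max(templates, key=lambda name: match_score(symbol, templates[name]))
```

Keeping a staff pixel only when ink touches it from a non-staff row is one common way to avoid cutting stems and note heads in half when the lines are erased; a production system would likely need something more robust for thick or slightly curved lines.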
Optical Music Recognition is computationally expensive,
which is why we chose to offload processing to a server. We
chose a Google Cloud Platform Server with specifications:
Component   Quantity    Details
CPU         2 vCPUs     2.6 GHz Intel Xeon E5
Memory      13 GB RAM   6.50 GB per virtual core
Summary of Design & Block Diagram
Our application captures an image of sheet music, sends
that image to a remote server for processing, and then plays
the returned audio file back to the user. This process is
illustrated in Figure 1 below.
Figure 1: System Block Diagram
Problem Definition
PRIMARY PROBLEMS
• Image capture
	• Image normalization
	• Image verification
	• Image stabilization
• Optical Music Recognition
	• Real-time processing speed
	• Accessibility
	• Object recognition & musical interpretation
REQUIREMENTS
• Processing time per page: 30 seconds
• Read note lengths: 64th to double whole
• Read note pitches: ±3 staff ledger lines
• Read common key signatures & time signatures
CONSTRAINTS
• Environment - Visibility: indoors, well-lit area, low glare
• Environment - Noise: ≤ 55 dB
• User cannot tilt camera more than 35° from horizontal
• User must have access to Wi-Fi
• Music limited to 2 instruments
• Music must be printed clearly in a standard font
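To make the note-length requirement above concrete, the sketch below lists the supported range (64th note through double whole note) as fractions of a whole note and converts a note value to seconds. The table and function names are hypothetical, and the default tempo of 120 quarter-note beats per minute is an assumption for illustration; the poster does not state how tempo was chosen.

```python
from fractions import Fraction

# Length of each supported note value as a fraction of a whole note.
NOTE_LENGTHS = {
    "double whole": Fraction(2),
    "whole": Fraction(1),
    "half": Fraction(1, 2),
    "quarter": Fraction(1, 4),
    "eighth": Fraction(1, 8),
    "16th": Fraction(1, 16),
    "32nd": Fraction(1, 32),
    "64th": Fraction(1, 64),
}


def duration_seconds(note: str, bpm: int = 120) -> float:
    """Seconds a note lasts when a quarter note gets one beat at `bpm`."""
    beats = NOTE_LENGTHS[note] / Fraction(1, 4)  # quarter note = 1 beat
    return float(beats * 60 / bpm)
```

At the assumed tempo, a double whole note lasts 4 seconds and a 64th note lasts 0.03125 seconds, a 128:1 ratio that the interpretation step has to represent.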
Abstract
ScoReader is an Optical Music Recognition (OMR)
application built for Android and Google Glass. We wanted
to advance and refine current OMR implementations while
developing on an accessible platform. Our project is unique
in utilizing Glass to provide musicians with a hands-free
experience.
Testing & Evaluation Results
Figure 3: Original Twinkle Twinkle Little Star Sheet Music
Figure 4: Conversion Accuracy by Pitch & Rhythm
Ninety percent of pitches and rhythms in the samples
we supplied were correctly identified and met our stated
requirements. We organized our testing as follows:
• Glass UI (Subsystem)
	• Image normalization/verification
	• Music playback
• Server (Subsystem)
	• Line removal & staff/object segmentation
	• Object recognition/musical interpretation
In retrospect, we would have liked to get an earlier start on the
image processing modules, since their complexity made them hard
to debug. Additionally, creating our app solely for Android
would have allowed us to focus more on the functionality of the
app. If we were to continue working on the project, we would
extend our app to read stylistic markings such as dynamics,
tempo, accents, and instrument types.
To evaluate ScoReader, we compared the pitches and rhythms
in the output file to the pitches and rhythms in the
sheet music. For each note, we considered the conversion a
100% success if both the pitch and rhythm were accurate. If
only one of these was correct, we counted it as a 50% success,
and if both were missed, it was a failure. The final score is the
average of the per-note scores on the page.
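The per-note scoring rule described above can be written out as a short sketch. The (pitch, rhythm) tuple encoding and the function names are assumptions made for illustration, as is the requirement that the expected and recognized notes line up one-to-one; the poster does not say how missing or extra notes were handled.

```python
from typing import List, Tuple

# (pitch, rhythm), e.g. ("C4", "quarter"); this encoding is an assumption.
Note = Tuple[str, str]


def note_score(expected: Note, actual: Note) -> float:
    """1.0 if pitch and rhythm both match, 0.5 if only one does, else 0.0."""
    pitch_ok = expected[0] == actual[0]
    rhythm_ok = expected[1] == actual[1]
    return (int(pitch_ok) + int(rhythm_ok)) / 2


def page_score(expected: List[Note], actual: List[Note]) -> float:
    """Average of the per-note scores for one page of music."""
    if not expected or len(expected) != len(actual):
        raise ValueError("note lists must be non-empty and the same length")
    total = sum(note_score(e, a) for e, a in zip(expected, actual))
    return total / len(expected)
```

For example, if four notes are expected and the output gets one fully right, one with the wrong rhythm, one with both wrong, and one fully right, the per-note scores are 1.0, 0.5, 0.0, and 1.0, so the page score is 0.625.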

ScoReader: A Mobile Computer Vision System for Optical Music Recognition
