The document discusses pitch detection of singing voice in tabla accompaniment. It proposes a two-way mismatch procedure (TWM) for pitch detection that is more robust to harmonic and single partial interference from tabla strokes compared to autocorrelation. Post-processing including dynamic programming-based smoothing and pitch correction is shown to reduce errors by up to 50% compared to the raw TWM output, especially for songs with slowly varying pitch contours. However, some errors remain in regions with fast pitch variations during tabla strokes. Future work to combine TWM with autocorrelation and classify frames by presence of tabla is suggested.
Pitch detection in Tabla accompaniment
1. CSE, Indian Institute of Technology Bombay
Pitch Detection of Singing Voice in Tabla
Accompaniment
Ashutosh Bapat
(03305010)
2. CSE, Indian Institute of Technology Bombay
Outline
Motivation
Music transcription
Pitch & pitch detection
Signal characteristics
Two-way mismatch procedure
Post processing
DP based smoothing
Pitch correction
Experimental evaluation
Conclusion
3. CSE, Indian Institute of Technology Bombay
Automatic Music Transcription (AMT) system
Converts acoustic musical signal to symbolic
representation
Documents musical attributes
Pitch
Timbre
Rhythm
Pitch is the most salient
Melody = pitch contour
4. CSE, Indian Institute of Technology Bombay
AMT for Indian classical music
Components of Indian classical and semi-classical
music
Melody line sung by a single main voice
o Gamakas are described by the detailed pitch contour
Accompaniment of tabla and tanpura
Musicological and pedagogical applications
Rich archives of audio recordings => need for a
reliable pitch detection algorithm (PDA) for singing voice
pitch tracking in tabla accompaniment
5. CSE, Indian Institute of Technology Bombay
Pitch
Many definitions exist
Pitch of a signal is defined as the fundamental
frequency of an approximately harmonic pattern in
spectral representation of signal.
Pitch period is defined as average length of several
periods of the signal.
$F_0 = \frac{1}{T_0}$
6. CSE, Indian Institute of Technology Bombay
Pitch detection
• Input: musical signal
• Output: pitch contour
[Figure panels: song waveform, pitch contour]
7. CSE, Indian Institute of Technology Bombay
Pitch Detection Algorithm
Preprocessor: Data reduction and enhancement
Commonly used method: Filtering
Basic Extractor: Estimates single/multiple pitch
candidates per frame
Commonly used method: ACF
Post processing:
Measure of reliability of each candidate
Smoothness of pitch contour
o Commonly used method: Dynamic programming
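The basic extractor stage named above can be sketched with a plain autocorrelation (ACF) pitch estimator. This is a minimal illustration, not the implementation used in the slides: the function name `acf_pitch`, the 100 to 800 Hz search range, and the 24 kHz sampling rate in the usage lines are all assumptions chosen for the example.

```python
import numpy as np

def acf_pitch(frame, fs, fmin=100.0, fmax=800.0):
    """Estimate the pitch of one frame from the highest ACF peak (hypothetical helper)."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()                      # remove DC offset
    # Unnormalised autocorrelation for lags 0 .. len(frame) - 1
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(fs / fmax)                          # shortest period searched
    lag_max = int(fs / fmin)                          # longest period searched
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return fs / lag                                   # pitch in Hz

# Usage: a complex tone of 300 Hz with five equal-amplitude harmonics
fs = 24000                                            # assumed sampling rate
t = np.arange(2400) / fs                              # 100 ms frame
frame = sum(np.sin(2 * np.pi * 300 * k * t) for k in range(1, 6))
estimate = acf_pitch(frame, fs)
```

For a clean harmonic tone the strongest ACF peak falls at the pitch period. The later slides show how a tabla stroke can move that peak to a lag unrelated to the voice.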
8. CSE, Indian Institute of Technology Bombay
Singing voice
• Pitch evolves
continuously
• Shows inflexions
such as bends, stresses,
oscillations, etc.
9. CSE, Indian Institute of Technology Bombay
Classification of tabla strokes
Number of drums
Simple strokes: Na, Ge, Ke, Tit, Tun etc.
Complex strokes: Dha, Dhin, Dhun
Harmonicity
Harmonic: Na, Tin, Tun, Ge
Inharmonic: Ke, Tit
Rate of Decay:
Slowly decaying: Na, Ge, Tin, Tun
Fast decaying: Ke, Tit
13. CSE, Indian Institute of Technology Bombay
Pitch detection of mixed song
Waveform of song mixed with
stroke Na
Pitch contour by ACF
14. CSE, Indian Institute of Technology Bombay
Two-way mismatch procedure
TWM error, F0 = 300 Hz
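The slides do not spell out the TWM error itself. The sketch below follows the commonly cited two-way mismatch formulation of Maher and Beauchamp, in which the harmonics predicted for a trial F0 are matched to the measured spectral peaks and vice versa. The parameter values `p=0.5`, `q=1.4`, `r=0.5`, `rho=0.33` are the defaults usually quoted for that formulation and are assumptions here, as are the function names and the choice to predict harmonics up to the highest measured peak.

```python
import numpy as np

def twm_error(f0, peak_freqs, peak_amps, p=0.5, q=1.4, r=0.5, rho=0.33):
    """Two-way mismatch error of trial pitch f0 against measured peaks (sketch)."""
    peak_freqs = np.asarray(peak_freqs, dtype=float)
    peak_amps = np.asarray(peak_amps, dtype=float)
    a_max = peak_amps.max()
    # Predicted harmonics up to the highest measured peak (assumption)
    n_harm = int(np.ceil(peak_freqs.max() / f0))
    predicted = f0 * np.arange(1, n_harm + 1)
    dist = np.abs(predicted[:, None] - peak_freqs[None, :])

    # Predicted-to-measured mismatch: nearest measured peak for each harmonic
    nearest = dist.argmin(axis=1)
    w = dist[np.arange(n_harm), nearest] * predicted ** (-p)
    err_pm = np.sum(w + (peak_amps[nearest] / a_max) * (q * w - r))

    # Measured-to-predicted mismatch: nearest harmonic for each measured peak
    w = dist.min(axis=0) * peak_freqs ** (-p)
    err_mp = np.sum(w + (peak_amps / a_max) * (q * w - r))

    return err_pm / n_harm + rho * err_mp / len(peak_freqs)

def twm_pitch(candidates, peak_freqs, peak_amps):
    """Return the candidate F0 with the smallest TWM error."""
    errors = [twm_error(f0, peak_freqs, peak_amps) for f0 in candidates]
    return float(candidates[int(np.argmin(errors))])

# Usage: spectral peaks of a 300 Hz complex tone with five equal partials
peaks = np.array([300.0, 600.0, 900.0, 1200.0, 1500.0])
amps = np.ones(5)
candidates = np.arange(100.0, 601.0, 5.0)
best = twm_pitch(candidates, peaks, amps)
```

Because both mismatch directions are penalised, the subharmonic (150 Hz) and the octave (600 Hz) score worse than the true pitch in this example.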
15. CSE, Indian Institute of Technology Bombay
TWM and ACF: harmonic interference
• Complex tone of 450 Hz + signal
simulating Na
• In TWM error we can see
minimum at correct pitch
• In ACF all peaks are at lags
corresponding to 790 Hz
[Figure panels: TWM error, ACF, magnitude plot]
16. CSE, Indian Institute of Technology Bombay
TWM and ACF: single partial interference
• Complex tone of 300 Hz mixed with a single partial with
amplitude varied from 0 to 100.
• TWM is more robust than ACF
[Figure panels: 400 Hz, 450 Hz]
17. CSE, Indian Institute of Technology Bombay
TWM and ACF correlograms
• Correlograms of complex tone of 300 Hz mixed with stroke
Na
• Notice the horizontal line at 300 Hz in the TWM error correlogram
• The ACF correlogram gives no clue at lag 73 (which corresponds to 300 Hz)
[Figure panels: ACF correlogram, TWM error correlogram]
18. CSE, Indian Institute of Technology Bombay
TWM pitch contour
• Pitch contour of song mixed with stroke Na
• Notice large pitch artifacts during strokes
20. CSE, Indian Institute of Technology Bombay
DP based smoothing
Smoothing based on
Measure of reliability of pitch candidates
Smoothness of pitch contour
Measurement cost: $E(p, j)$
Smoothness cost: $W(p, p')$
Local transition cost: $T(p(j), p(j-1), j) = E(p(j), j) + W(p(j-1), p(j))$
Global transition cost: $S(p(1), \ldots, p(j), \ldots, p(N)) = \sum_{j=1}^{N} T(p(j), p(j-1), j)$
21. CSE, Indian Institute of Technology Bombay
Smoothness cost
• The width of the bell varies
in proportion to pitch
• Pitch variation at high
pitches is expected to be
more than that at low
pitches
• Saturates at high values
$W(p, p') = 1 - e^{-\frac{(p - p')^2}{\sigma^2}}, \quad \sigma = c \cdot p$
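The DP smoothing described on the last two slides can be sketched as a shortest-path search over per-frame pitch candidates, with the measurement cost E added per frame and the smoothness cost W added per transition. This is only an illustration under stated assumptions: the bell-shaped cost is written as 1 - exp(-(p - p')^2 / sigma^2) with sigma proportional to the previous pitch, the constant `c=0.1` is invented for the example, and the names `smoothness_cost` and `dp_smooth` are hypothetical.

```python
import numpy as np

def smoothness_cost(p_prev, p, c=0.1):
    """Inverted bell: 0 for no pitch change, saturating at 1 for large jumps."""
    sigma = c * p_prev                      # bell width grows with pitch (assumed form)
    return 1.0 - np.exp(-((p - p_prev) ** 2) / sigma ** 2)

def dp_smooth(candidates, meas_cost):
    """Pick one candidate per frame minimising the sum of E and W along the path.

    candidates[j] holds the pitch candidates (Hz) of frame j,
    meas_cost[j] holds the measurement cost E of each of those candidates.
    """
    total = [np.asarray(meas_cost[0], dtype=float)]
    back = []
    for j in range(1, len(candidates)):
        prev = np.asarray(candidates[j - 1], dtype=float)[:, None]
        cur = np.asarray(candidates[j], dtype=float)[None, :]
        trans = total[-1][:, None] + smoothness_cost(prev, cur)
        back.append(trans.argmin(axis=0))   # best predecessor of each candidate
        total.append(trans.min(axis=0) + np.asarray(meas_cost[j], dtype=float))
    idx = int(total[-1].argmin())
    path = [idx]
    for b in reversed(back):                # backtrack from the last frame
        idx = int(b[idx])
        path.append(idx)
    path.reverse()
    return [float(candidates[j][i]) for j, i in enumerate(path)]

# Usage: in the middle frame a stroke makes the wrong candidate look better
cands = [[300.0, 790.0], [300.0, 790.0], [300.0, 790.0]]
costs = [[0.1, 0.5], [0.5, 0.3], [0.1, 0.5]]
contour = dp_smooth(cands, costs)
```

Picking the cheapest candidate frame by frame would give 300, 790, 300 Hz. The transition cost makes the two jumps more expensive than accepting the slightly worse measurement cost at 300 Hz, so the smoothed contour stays at 300 Hz.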
22. CSE, Indian Institute of Technology Bombay
Pitch contour after applying DP
• Smoothed pitch contour
• Suppresses fast pitch variations
• May introduce errors where tabla is absent
23. CSE, Indian Institute of Technology Bombay
Pitch correction
• Searches for the deepest local minimum of the TWM error within a 6% range
of the pitch estimated by DP
• Corrects most of the fine errors
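The correction step can be sketched as a constrained search on a sampled error curve. This assumes the curve being searched is the TWM error over a grid of trial pitches and that "6% range" means within plus or minus 6% of the DP estimate. The function name `correct_pitch` and the fallback to the DP value when no local minimum is found are illustrative choices, not taken from the slides.

```python
import numpy as np

def correct_pitch(freq_grid, error, p_dp, rel_range=0.06):
    """Return the deepest local minimum of `error` within rel_range of p_dp."""
    freq_grid = np.asarray(freq_grid, dtype=float)
    error = np.asarray(error, dtype=float)
    in_range = np.abs(freq_grid - p_dp) <= rel_range * p_dp
    # A grid point is a local minimum if it lies below both neighbours
    is_min = np.zeros_like(in_range)
    is_min[1:-1] = (error[1:-1] < error[:-2]) & (error[1:-1] < error[2:])
    idx = np.flatnonzero(in_range & is_min)
    if idx.size == 0:
        return float(p_dp)                  # nothing to correct towards
    return float(freq_grid[idx[error[idx].argmin()]])

# Usage: synthetic error curve with a dip at 304 Hz and a deeper one at 340 Hz
grid = np.arange(250.0, 351.0, 1.0)
err = np.minimum((grid - 304.0) ** 2, (grid - 340.0) ** 2 - 5.0)
corrected = correct_pitch(grid, err, p_dp=300.0)
```

The deeper dip at 340 Hz is ignored because it lies outside the 6% window around the DP estimate, so the correction only repairs small deviations and cannot reintroduce a gross error.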
24. CSE, Indian Institute of Technology Bombay
Experimental evaluation
Test samples
Samples produced by digitally adding the tabla strokes Na,
Ge and Ke to pure song waveforms sung with the syllables /la/
and /aa/
Algorithms
TWM: TWM alone
TDP: TWM + DP
TDC: TWM + DP + PC (pitch correction)
Errors
Fine error: error magnitude between 3% and 6%
Gross error: error magnitude above 6%
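The two error classes can be computed per frame from an estimated and a reference pitch contour. The sketch below is an assumption about the bookkeeping, not the evaluation script behind the slides: the function name `error_rates` is hypothetical, errors are taken relative to the reference pitch, and the 3% and 6% boundaries are treated as inclusive for the fine class.

```python
import numpy as np

def error_rates(estimated, reference):
    """Return (fine %, gross %) of frames, using the 3% and 6% thresholds."""
    estimated = np.asarray(estimated, dtype=float)
    reference = np.asarray(reference, dtype=float)
    rel = np.abs(estimated - reference) / reference
    fine = (rel >= 0.03) & (rel <= 0.06)    # small deviation
    gross = rel > 0.06                      # large deviation
    return 100.0 * fine.mean(), 100.0 * gross.mean()

# Usage: four frames with a 300 Hz reference pitch
fine_pct, gross_pct = error_rates([300.0, 312.0, 330.0, 301.0], [300.0] * 4)
```

In this toy contour the 312 Hz frame (4% off) counts as a fine error and the 330 Hz frame (10% off) as a gross error. Deviations below 3% are not counted as errors.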
25. CSE, Indian Institute of Technology Bombay
Results
• DP decreases the number of gross errors while increasing the number
of fine errors
• PC has decreased number of fine errors
• Better performance in case of songs with slowly varying pitch
contours
Error rates in percentage (F = fine errors, G = gross errors)

Song with many fast variations of pitch
        TWM           TDP           TDC
        F     G       F     G       F     G
Na      0.0   49.3    4.6   13.4    2.1   14.8
Ge      0.0   20.9    3.9   2.1     3.4   2.5
Ke      0.0   25.7    4.9   5.1     0.2   5.1

Song with slowly varying pitch contour
        TWM           TDP           TDC
        F     G       F     G       F     G
Na      4.7   14.7    11.6  1.6     5.3   2.1
Ge      0.0   22.5    8.1   4.4     1.5   4.1
Ke      0.0   17.9    7.2   2.5     0.0   2.5
26. CSE, Indian Institute of Technology Bombay
Errors after application of DP + PC
• Errors remaining after application of DP and pitch correction
are found in regions with fast variations in pitch
27. CSE, Indian Institute of Technology Bombay
Conclusion
Importance of music transcription
Characteristics of tabla strokes
Two-way mismatch PDA
Results showing improvements by application of DP
smoothing and pitch correction
Applications in building pitch detector for Indian
classical and semi classical music
28. CSE, Indian Institute of Technology Bombay
Future work
Combination of ACF and TWM to take advantage of
Lesser computational complexity of ACF
ACF’s robustness to noise, thus better results in Ke
Classification of frames by presence/ absence of
tabla strokes
Use pitch estimated by DP and pitch correction only in
frames containing tabla stroke
Application of advanced techniques:
adaptive windowing, peak selection, selective search
Pitch tracking in case of complex strokes like Dha
and words like TiReKiTa
Pitch is most salient. Melody = pitch contour (evolution of pitch over time)
Add a piece of real song with tabla and tanpura
Mention tanpura is not considered after point 1.2
Pedagogical – visualisation and reproduction
Since a musical signal is dynamic, it is divided into small frames, and the pitch of each frame is estimated according to the definition seen earlier
No one system solves all problems, so each block is chosen and adapted according to the application
One of the partials is stronger than the others
Explain ACF
Just after onset Na
Add vertical lines at 450 Hz and 263 Hz as well as lags 28, 49, 56
Why not to discuss now noisy interference
Notes are equispaced on a logarithmic scale rather than on a linear scale
A semitone is a frequency ratio of 2^(1/12) (about 6%), so half a semitone is about 3%; an octave is divided into 12 notes
Pitch estimates obtained from pure songs by applying TWM were assumed to be correct