This document discusses a computational framework for sound segregation in music signals. It begins with acknowledgments of collaborators on the work. It then provides an overview of the research project, which involves developing an auditory scene analysis framework for sound segregation in polyphonic music signals. The document outlines the problem statement, main challenges, current state of research, related research areas, and the main contributions and proposed approach of the framework. It involves applying ideas from computational auditory scene analysis to define perceptual grouping cues and implement a flexible and efficient sound segregation system based on these cues.
IJERA (International Journal of Engineering Research and Applications) is an international online, ... peer-reviewed journal. For more details or to submit your article, please visit www.ijera.com
Episode 33 : Project Execution Part (4)
complete quotations for spare and wear parts
instructions for maintenance and repair of the whole plant
lists of lubricants in consideration of the selected lubricant suppliers
coding of the operating resources and equipment
establishment of special single data sheets for the entire equipment
establishment of a complete plant index or plant database (plant registration system)
SAJJAD KHUDHUR ABBAS
CEO, Founder & Head of SHacademy
Chemical Engineering, Al-Muthanna University, Iraq
Oil & Gas Safety and Health Professional – OSHACADEMY
Trainer of Trainers (TOT) - Canadian Center of Human Development
Episode 54 : CAPE Problem Formulations
Computer Aided Process Engineering
Lecture 2: CAPE Problem Formulations
* Four types of CAPE problems
Flowsheeting; Specification (Design); Optimization (Design); Synthesis (& Design)
Episode 45 : 4 Stages Of Solid Liquid Separations
Cost of S/L separation relates directly to the volume of material
Pressurized equipment is more expensive to operate than a thickener
Other techniques are classified according to what they act upon, namely the liquid, the solid particles, the solids concentration and the solid-liquid interaction
Episode 51 : Integrated Process Simulation
Why Integration?
* Consider aspects of control, environmental impact, energy, etc., early during process design
* Prevent potential problems rather than cure them (which may not be possible)
Episode 36 : What is Powder Technology?
All the technology which concerns itself with the handling or processing of powders, or materials in particulate form
- production, storage, transportation, mixing, dusting, characterization, packing, crushing and milling
Important role in medicines, foodstuffs, plastics, metals, fertilizer, cement, etc.
A prominent academic discipline
The roots of powder technology
- in the areas of material handling and processing.
Episode 44 : 4 Stages Of Solid Liquid Separations
Cost of S/L separation relates directly to the volume of material
Pressurized equipment is more expensive to operate than a thickener
Other techniques are classified according to what they act upon, namely the liquid, the solid particles, the solids concentration and the solid-liquid interaction
Episode 35 : Design Approach to Dilute Phase Pneumatic Conveying
For many years gases have been used successfully in industry to transport a wide range of particulate solids - from wheat flour to wheat grain and plastic chips to coal.
The pneumatic transport of particulate solids is broadly classified into two flow regimes: dilute (or lean) phase and dense phase
Episode 52 : Flowsheeting Case Study
A Standard Test Problem for Flowsheeting
The Cavett Problem
* A typical flowsheeting problem from the petroleum industry
* The flowsheet consists of mixers and TP-flash units
* The mixture consists of ethane, propane, i-butane, n-butane, i-pentane, n-pentane
* The problem is interesting because tear-stream convergence is not easy and the process is very sensitive to changes in operating conditions
Episode 48 : Computer Aided Process Engineering Simulation Problem
* Identify partitions
* Identify recycle-loops
* Determine tear-streams
* Determine calculation order
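The four steps above can be sketched on a small flowsheet graph. This is only an illustrative sketch, not the method used in the slides: the unit names (`feed`, `mixer`, `reactor`, `flash`, `product`) and the function names are hypothetical, partitions are found with Tarjan's strongly-connected-components algorithm, and the tear streams are simply the back edges of a depth-first search, which is a valid but not necessarily minimal tear set.

```python
def partitions(graph):
    """Tarjan's algorithm: strongly connected components of the flowsheet.

    Components are returned in reverse topological order of the
    condensed (loop-free) graph.
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    result = []
    counter = [0]

    def visit(u):
        index[u] = low[u] = counter[0]
        counter[0] += 1
        stack.append(u)
        on_stack.add(u)
        for v in graph.get(u, []):
            if v not in index:
                visit(v)
                low[u] = min(low[u], low[v])
            elif v in on_stack:
                low[u] = min(low[u], index[v])
        if low[u] == index[u]:          # u is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == u:
                    break
            result.append(comp)

    for u in graph:
        if u not in index:
            visit(u)
    return result


def tear_streams(graph):
    """DFS back edges; cutting them leaves an acyclic flowsheet."""
    state = {}                          # 1 = being visited, 2 = finished
    tears = []

    def visit(u):
        state[u] = 1
        for v in graph.get(u, []):
            if state.get(v) == 1:       # edge closes a recycle loop
                tears.append((u, v))
            elif v not in state:
                visit(v)
        state[u] = 2

    for u in graph:
        if u not in state:
            visit(u)
    return tears


def calculation_order(graph):
    """Partitions in the order in which they can be solved."""
    return partitions(graph)[::-1]


# Hypothetical flowsheet: the flash recycles part of its outlet to the mixer.
flowsheet = {
    "feed": ["mixer"],
    "mixer": ["reactor"],
    "reactor": ["flash"],
    "flash": ["product", "mixer"],
    "product": [],
}

order = calculation_order(flowsheet)
tears = tear_streams(flowsheet)
print(order)
print(tears)
```

A partition with more than one unit contains at least one recycle loop and has to be converged iteratively on its tear streams; single-unit partitions without a self-recycle can be calculated once, in the order returned.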
Episode 47 : CONCEPTUAL DESIGN OF CHEMICAL PROCESSES
Chemical process design is the application of chemical engineering knowledge (chemical, physical and/or biological transformations of raw materials into products) and economics in conceiving a chemical process plant that profitably manufactures chemicals in a reliable and safe manner without unduly harming the environment and society
Chemical process plants are by nature large capital investment projects that:
* are expensive to build and operate
* have very long lifetimes and
* manufacture specific chemicals
Chemical process plants must be designed well to avoid large financial losses over long periods of time due to inefficient processes or poor operations
Episode 40 : DESIGN EXAMPLE – DILUTE PHASE PNEUMATIC CONVEYING (Part 2)
DESIGN EXAMPLE – DILUTE PHASE PNEUMATIC CONVEYING
A plastics production plant wants to increase the capacity through an existing conveying system. The existing system has 6 inch ID pipes and is configured as shown in the diagram below.
The High Density Polyethylene (HDPE) particles have an average size of 4 mm. The conveying gas is at 68 °F. The existing blower can produce 1375 SCFM.
The desired capacity increase is from 20,000 lbm/hr to 30,000 lbm/hr. Can the existing blower and pipe system meet this increase in capacity?
Assume the pressure drop across the cyclone is 5 inches of water. The pressure drop across the blower inlet pipe and silencers is 0.3 psi. The pipe bends have R/D = 6. Pipe roughness is k = 0.00015 ft. The particles have density ρp = 59 lbm/ft³. Terminal velocity of the particles is 30.6 ft/s.
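As a first orientation for this problem, the given data already fix two quantities: the superficial gas velocity in the 6 inch pipe and the solids loading ratio at both capacities. The sketch below computes only these; it does not answer whether the blower is sufficient, which requires the full pressure-drop calculation. The air density of 0.075 lbm/ft³ is an assumption (a common value for standard air), and the function names are illustrative.

```python
import math

PIPE_ID_FT = 6.0 / 12.0       # 6 inch ID pipe, from the problem statement
BLOWER_SCFM = 1375.0          # blower capacity, standard ft^3/min
AIR_DENSITY_STD = 0.075       # lbm/ft^3, ASSUMED value for standard air


def pipe_area_ft2(diameter_ft):
    """Cross-sectional area of a circular pipe."""
    return math.pi * diameter_ft ** 2 / 4.0


def gas_velocity_ft_s(scfm, diameter_ft):
    """Superficial gas velocity if the gas were at standard conditions."""
    return scfm / 60.0 / pipe_area_ft2(diameter_ft)


def solids_loading(solids_lbm_hr, scfm, gas_density):
    """Mass of solids conveyed per unit mass of gas."""
    gas_lbm_hr = scfm * gas_density * 60.0
    return solids_lbm_hr / gas_lbm_hr


velocity = gas_velocity_ft_s(BLOWER_SCFM, PIPE_ID_FT)
loading_now = solids_loading(20_000.0, BLOWER_SCFM, AIR_DENSITY_STD)
loading_new = solids_loading(30_000.0, BLOWER_SCFM, AIR_DENSITY_STD)

print(f"gas velocity at standard conditions: {velocity:.1f} ft/s")
print(f"solids loading at 20,000 lbm/hr: {loading_now:.2f}")
print(f"solids loading at 30,000 lbm/hr: {loading_new:.2f}")
```

Under these assumptions the gas velocity is about 116.7 ft/s and the loading ratio rises from about 3.23 to about 4.85. The actual velocity along the line differs because the gas density changes with the local pressure.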
Episode 53 : Computer Aided Process Engineering
Lecture notes and reading material
* A lecture note covering all the lectures has been prepared (see course home-page)
* Supplementary text-books are listed
* A course home-page has been created
* All lecture and tutorial material can be downloaded from the home-page
http://www.capec.kt.dtu.dk/Courses/MSc-level-Courses/
THE INTELLIGENCE OF MACHINES AND THE BRANCH OF COMPUTER SCIENCE THAT AIMS TO CREATE IT: “The study and design of intelligent agents”.
An area of computer science that deals with giving machines the ability to seem as if they have human intelligence.
Performance Comparison of Musical Instrument Family Classification Using Soft... - Waqas Tariq
An automatic and effective classification algorithm for musical instruments has become essential. Automatic classification of musical instruments works by extracting relevant features from the audio samples; a classification algorithm then uses these features to identify which of a set of classes the sound sample most likely belongs to. The aim of this paper is to demonstrate the viability of soft sets for audio signal classification. A dataset of 104 pieces (single monophonic notes) of traditional Pakistani musical instruments was compiled. Feature extraction is done using two feature sets, namely perception-based features and mel-frequency cepstral coefficients (MFCCs). Two different classification techniques are then applied to the classification task: soft set (comparison table) and fuzzy soft set (similarity measurement). Experimental results show that both classifiers can perform well on numerical data. However, soft set achieved accuracy of up to 94.26% with the best generated dataset. These promising results provide new possibilities for soft sets in classifying musical instrument sounds. Based on the analysis of the results, this study offers a new view on automatic instrument classification.
Shoichi Koyama, Naoki Murata, and Hiroshi Saruwatari. "Super-resolution in sound field recording and reproduction based on sparse representation"
presented at 5th Joint Meeting Acoustical Society of America and Acoustical Society of Japan (28 Nov. - 2 Dec. 2016, Honolulu, USA)
CONTENT BASED AUDIO CLASSIFIER & FEATURE EXTRACTION USING ANN TECHNIQUES - AM Publications
Audio signals, which include speech, music and environmental sounds, are important types of media. The problem of classifying audio signals into these different audio types is thus becoming increasingly significant. A human listener can easily distinguish between different audio types by just listening to a short segment of an audio signal. However, solving this problem using computers has proven to be very difficult. Nevertheless, many systems with modest accuracy could still be implemented. The experimental results demonstrate the effectiveness of our classification system. The complete system is developed using ANN techniques with an autonomic computing system.
Streaming Audio Using MPEG–7 Audio Spectrum Envelope to Enable Self-similarit... - TELKOMNIKA JOURNAL
Traditional packet-level Forward Error Correction approaches can limit errors for small sporadic network losses, but when large portions drop out, listening quality becomes an issue. Services such as audio-on-demand drastically increase the load on networks; therefore new, robust and highly efficient coding algorithms are necessary. One method overlooked to date, which can work alongside existing audio compression schemes, takes account of the semantics and natural repetition of music through meta-data tagging. Similarity detection within polyphonic audio has presented problematic challenges within the field of Music Information Retrieval. We present a system which works at the content level, thus rendering it applicable in existing streaming services. The MPEG–7 Audio Spectrum Envelope (ASE) provides the features for extraction and, combined with k-means clustering, enables self-similarity detection to be performed within polyphonic audio.
Human Perception and Recognition of Musical Instruments: A Review - Editor IJCATR
The musical instrument is the soul of music. The instrument and the player are the two fundamental components of music. The past decade has seen the growth of a new research field, known as Music Information Retrieval, targeting musical instrument identification, retrieval, classification and recognition, and the management of large sets of music. An attempt is made to review the methods, features and databases.
A talk about Artificial Intelligence and its impacts, and how it relates to Creativity: can artificial intelligence be creative? Does it have a sense of ethics or morals? Is it all simply a simulation?
Slides I used in a Research Methodology seminar I gave in 2010 for the Interactive Art PhD at School of Arts of the Portuguese Catholic University, Porto, Portugal (http://artes.ucp.pt)
A brief introduction to Pattern Recognition. Slides were used for a Seminar at the Interactive Art PhD at School of Arts of the UCP, Porto, Portugal (http://artes.ucp.pt)
Pushing the limits of ePRTC: 100ns holdover for 100 days - Adtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Climate Impact of Software Testing at Nordic Testing Days - Kari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact and sustainability of software testing are discussed in the talk. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Sustainability can be added to the quality characteristics and then measured continuously. Test environments can be used less, at smaller scale and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
The Art of the Pitch: WordPress Relationships and Sales - Laura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
A tale of scale & speed: How the US Navy is enabling software delivery from l... - sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATOs (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
PHP Frameworks: I want to break free (IPC Berlin 2024) - Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024 - Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor... - Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Removing Uninteresting Bytes in Software Fuzzing - Aftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speed up fuzzing campaigns by pinpointing and eliminating the uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux tools: Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns, and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 - Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
A Computational Framework for Sound Segregation in Music Signals using Marsyas
1. A Computational Framework for Sound Segregation in Music Signals
Luís Gustavo Martins
CITAR / Escola das Artes da UCP
lmartins@porto.ucp.pt
Porto, Portugal
Auditory Modeling Workshop
Google, Mountain View, CA, USA
19.11.2010
2. Acknowledgments
} This work is the result of the collaboration with:
} University of Victoria, BC, Canada
} George Tzanetakis, Mathieu Lagrange, Jennifer Murdock
} All the Marsyas team
} INESC Porto
} Luis Filipe Teixeira
} Jaime Cardoso
} Fabien Gouyon
} Technical University of Berlin, Germany
} Juan José Burred
} FEUP PhD Advisor Professor
} Aníbal Ferreira
} Supporting entities
} Fundação para a Ciência e a Tecnologia - FCT
} Fundação Calouste Gulbenkian
} VISNET II, NoE European Project
3. Research Project
} FCT R&D Project (APPROVED FOR FUNDING)
} A Computational Auditory Scene Analysis Framework for Sound Segregation in Music Signals
} 3-year project (starting Jan. 2011)
} Partners:
} CITAR (Porto, Portugal)
Luís Gustavo Martins (PI), Álvaro Barbosa, Daniela Coimbra
} INESC Porto (Porto, Portugal)
Fabien Gouyon
} UVic (Victoria, BC, Canada)
George Tzanetakis
} IRCAM (Paris, France)
Mathieu Lagrange
} Consultants
} FEUP (Porto, Portugal)
Prof. Aníbal Ferreira, Prof. Jaime Cardoso
} McGill University / CIRMMT (Montreal, QC, Canada)
Prof. Stephen McAdams
4. Summary
} Problem Statement
} The Main Challenges
} Current State
} Related Research Areas
} Main Contributions
} Proposed Approach
} Results
} Software Implementation
} Conclusions and Future Work
5. Problem Statement
} Propose a computational sound segregation framework
} Focused on music signals
} But not necessarily limited to music signals
} Perceptually inspired
} So it can build upon the current knowledge of how listeners perceive sound events in music signals
} Causal
} So it mimics the human auditory system and allows online processing of sounds
} Flexible
} So it can accommodate different perceptually inspired grouping cues
} Generic
} So it can be used in different audio and MIR application scenarios
} Effective
} So it can improve the extraction of perceptually relevant information from musical mixtures
} Efficient
} So it can find practical use in audio processing and MIR tasks
6. The Main Challenges
[Figure 2: The main types of auditory processing and their interactions (adapted from [McAdams and Bigand, 1993]). Blocks: transduction, auditory grouping processes, extraction of attributes, event structure processing, abstract knowledge structures, attentional processes, mental representation of the sound environment.]
} Human listeners are able to perceive individual sound events in complex mixtures
} Even if listening to:
} Monaural music recordings
} Unknown sounds, timbres or instruments
} Perception is influenced by several complex factors
} Listener’s prior knowledge, context, attention, …
} Based on both low-level and high-level cues
} Difficult to replicate computationally…
7. The Main Challenges
} Why Music Signals?
} Music sound is, in some senses, more challenging to analyse than non-musical sounds
} High time-frequency overlap of sources and sound events
Music composition and orchestration
Sources often play simultaneously (polyphony)
Consonant pitch intervals are favored
Sound sources are highly correlated
} High variety of spectral and temporal characteristics
Musical instruments present a wide range of sound production mechanisms
} Techniques traditionally used for monophonic, non-musical or speech signals perform poorly
} Yet, music signals are usually well organized and structured
8. Current State
} Typical systems in MIR
} Statistically represent the entire sound mixture
} Analysis and retrieval performance reached a “glass ceiling” [Aucouturier and Pachet, 2004]
} New Paradigm
} Attempt to individually characterize the different sound events in a sound mixture
} Performance still quite limited when compared to the human auditory system
} But already provides alternative and improved approaches to common sound analysis and MIR tasks
9. Applications
} “Holy grail” applications
} “The Listening Machine”
} “The Robotic Ear”
} “Down to earth” applications
} Sound and Music Description
} Sound Manipulation
} Robust Speech and Speaker Recognition
} Object-based Audio Coding
} Automatic Music Transcription
} Audio and Music Information Retrieval
} Auditory Scene Reconstruction
} Hearing Prostheses
} Up-mixing
} …
10. Related Research Areas
} Sound and Music Computing (SMC) [Serra et al., 2007]
} Computational Auditory Scene Analysis (CASA) [Wang and Brown, 2006]
} Perception Research
} Psychoacoustics [Stevens, 1957]
} Auditory Scene Analysis (ASA) [Bregman, 1990]
} Digital Signal Processing [Oppenheim and Schafer, 1975]
} Music Information Retrieval (MIR) [Downie, 2003]
} Machine Learning [Duda et al., 2000]
} Computer Vision [Marr, 1982]
11. Related Areas
} Auditory Scene Analysis (ASA) [Bregman, 1990]
} How do humans “understand” sound mixtures?
} Find packages of acoustic evidence such that each package has arisen from a single sound source
} Grouping Cues
} Integration
Simultaneous vs. Sequential
Primitive vs. schema-based
} Cues
Common amplitude, frequency, fate
Harmonicity
Time continuity
…
12. Related Areas
} Computational Auditory Scene Analysis (CASA) [Wang and Brown, 2006]
} “Field of computational study that aims to achieve human performance in ASA by using one or two microphone recordings of the acoustic scene.” [Wang and Brown, 2006]
[Figure 3: System architecture of a typical CASA system. Blocks: acoustic mixture, analysis front-end, mid-level representation, scene organization (grouping cues, source models), stream resynthesis, segregated signals.]
13. Main Contributions
} Proposal and experimental validation of a flexible and efficient framework for sound segregation
} Focused on “real-world” polyphonic music
} Inspired by ideas from CASA
} Causal and data-driven
} Definition of a novel harmonicity cue
} Termed Harmonically Wrapped Peak Similarity (HWPS)
} Experimentally shown to be a good grouping criterion
} Software implementation of the proposed sound segregation framework
} Modular, extensible and efficient
} Made available as free and open source software (FOSS)
} Based on the MARSYAS framework
14. Proposed Approach
} Assumptions
} Perception primarily depends on the use of low-level sensory information
} Does not necessarily require prior knowledge (i.e. training)
} Still able to perform primitive identification and segregation of sound events in a sound mixture
} Prior knowledge and high-level information can still be used
} To assign additional meaning to the primitive observations
} To consolidate primitive observations as relevant sound events
} To modify the listener’s focus of attention
15. Proposed Approach
} System overview
[Block diagram: Sinusoidal Analysis (spectral peaks per 46 ms frame), Texture Window (spectral peaks over a 150 ms texture window), Similarity Computation, Normalized Cut, Cluster Selection, Sinusoidal Synthesis.]
16. Analysis Front-end
} Sinusoidal Modeling
} Sum of the highest-amplitude sinusoids at each frame (peaks)
} Maximum of 20 peaks/frame
} Window = 46 ms; hop = 11 ms
} Parametric model: estimate amplitude, frequency and phase of each peak
[Figure: sinusoidal analysis of a 46 ms frame into spectral peaks along the frequency axis.]
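The peak-picking step listed above can be sketched as follows. This is only a minimal illustration under stated assumptions: it applies a naive DFT to a toy frame, keeps the strongest local maxima of the magnitude spectrum, and leaves out the amplitude, frequency and phase refinement of a full sinusoidal model. The function names, the Hann window and the toy sampling rate are hypothetical choices, not taken from the original system.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes (bins 0..N/2) of a Hann-windowed frame."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2 + 1)]


def pick_peaks(mags, sample_rate, n_fft, max_peaks=20):
    """Return up to max_peaks (frequency_hz, magnitude) local maxima, strongest first."""
    peaks = [(k * sample_rate / n_fft, mags[k])
             for k in range(1, len(mags) - 1)
             if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]
    peaks.sort(key=lambda p: p[1], reverse=True)
    return peaks[:max_peaks]


# Toy frame (assumed values): sinusoids at 1000 Hz and 2500 Hz, 8 kHz sampling, 64 samples.
sr, n = 8000, 64
frame = [math.sin(2 * math.pi * 1000 * i / sr) + 0.5 * math.sin(2 * math.pi * 2500 * i / sr)
         for i in range(n)]
peaks = pick_peaks(magnitude_spectrum(frame), sr, n, max_peaks=2)
```

With the slide's settings, `max_peaks` would be 20 and the frame would span 46 ms with an 11 ms hop; the toy frame is kept short so that the naive DFT stays readable.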
17. Time Segmentation
} Texture Windows
} Construct a graph over a texture window of the sound mixture
} Provides time integration
Approaches partial tracking and source separation jointly
Traditionally two separate, consecutive stages
[Figure: spectral peaks from sinusoidal analysis over time and frequency, grouped into a texture window.]
18. Time Segmentation
} Fixed length texture windows
} E.g. 150 ms
} Dynamically adjusted texture windows
} Onset detector
} Perceptually more relevant
} 50 ms ~ 300 ms
[Figure: amplitude, frequency and spectral flux plotted over time (0 to 1.6 secs), with detected onsets delimiting texture windows 1 to 7.]
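One way to picture the dynamically adjusted texture windows is the sketch below, which places window boundaries at spectral-flux peaks while keeping each window between 50 ms and 300 ms. It is an illustration only: the half-wave rectified flux, the fixed threshold and the function names are assumptions, and the onset detector actually used in the system may differ.

```python
def spectral_flux(frames):
    """Half-wave rectified flux between consecutive magnitude spectra."""
    flux = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        flux.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return flux


def texture_window_bounds(flux, hop_ms=11.0, min_ms=50.0, max_ms=300.0, threshold=0.5):
    """Frame indices where a new texture window starts.

    A boundary is placed at a flux peak above `threshold` (hypothetical value)
    once the current window is at least min_ms long, or when it reaches max_ms.
    """
    bounds = [0]
    last = len(flux) - 1
    for i in range(1, len(flux)):
        length_ms = (i - bounds[-1]) * hop_ms
        is_peak = (flux[i] > threshold and flux[i] >= flux[i - 1]
                   and (i == last or flux[i] > flux[i + 1]))
        if (is_peak and length_ms >= min_ms) or length_ms >= max_ms:
            bounds.append(i)
    return bounds


# Toy input: a new spectral component appears at frame 10.
frames = [[1.0, 0.0, 0.0]] * 10 + [[1.0, 1.0, 0.0]] * 30
bounds = texture_window_bounds(spectral_flux(frames))
```

In this toy case the first boundary follows the onset at frame 10, and a second one is forced later because no further onset occurs before the 300 ms limit.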
19. Perceptual Cues as Similarity Functions
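[Block diagram: the spectral peaks over a texture window (150 ms) feed the Similarity Computation block, where the individual cues (amplitude similarity, frequency similarity, harmonic similarity (HWPS), azimuth proximity, common onset/offset, source models, ...) are merged by a combiner into the overall similarity matrix, which is passed to the Normalized Cut.]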
20. Perceptual Cues as Similarity Functions
} Grouping Cues (inspired from ASA)
} Similarity between time-frequency components in a texture window
} Frequency proximity
} Amplitude proximity
} Harmonicity proximity (HWPS)
} …
} Encode topological knowledge into a similarity graph/matrix
} Simultaneous integration (peaks within the same frame)
} Sequential integration over the texture window
[Figure: similarity matrix over the peaks A0, A1, A2, A3, B3/A4, B0, B1, B2, B4, shown next to the equivalent graph with nodes x_i, x_j, x_k, x_l, x_p, x_q and symmetric edge weights w_ij = w_ji.]
21. Perceptual Cues as Similarity Functions
} Defining a Generic Similarity Function
} Fully connected graphs
} Gaussian similarity function
How to define neighborhood width (σ)?
Local statistics from data in a Texture Window
Use prior knowledge (e.g. JNDs)
Use σ as weights (after normalizing the Sim. Fun. to [0,1])
[Figure: Gaussian similarity w_ij as a function of the distance d(x_i, x_j) (0 to 2.5), plotted for σ = 0.4, σ = 1.0 and σ = 1.2.]
$$ w_{ij} = e^{-\left(\frac{d(x_i, x_j)}{\sigma}\right)^2}, \qquad w_{ij} = w_{ji} $$
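The Gaussian similarity function above can be sketched in a few lines. The example assumes scalar features and an absolute-difference distance; the function names and the sample values are hypothetical.

```python
import math


def gaussian_similarity(distance, sigma):
    """w = exp(-(d / sigma)^2): 1 at zero distance, decaying faster for small sigma."""
    return math.exp(-((distance / sigma) ** 2))


def similarity_matrix(points, sigma):
    """Fully connected, symmetric similarity matrix for scalar features."""
    n = len(points)
    return [[gaussian_similarity(abs(points[i] - points[j]), sigma) for j in range(n)]
            for i in range(n)]


# Toy features (assumed values), using the narrowest width plotted on the slide.
W = similarity_matrix([0.0, 0.4, 2.0], sigma=0.4)
```

A smaller σ makes the neighborhood tighter: two points at distance 1.0 are far more similar with σ = 1.2 than with σ = 0.4.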
22. Perceptual Cues as Similarity Functions
} Amplitude and Frequency Similarity
} Amplitude
} Gaussian function of the Euclidean distances
In dB (more perceptually relevant)
} Frequency
} Gaussian function of the Euclidean distances
In Bark (more perceptually relevant)
} Not sufficient to segregate harmonic events
} Nevertheless, they are important to group peaks from:
Inharmonic or noisy frequency components in harmonic sounds
Non-harmonic sounds (unpitched sounds)
Two of the most basic similarities explored by the auditory system are […] frequency and amplitude features of the sound components in a sound mixture (see Section 2.3.1).
Accordingly, the edge weight connecting two peaks $p_l^k$ and $p_m^{k+n}$ will […] frequency and amplitude proximities. Following the generic considerations […] the definition of a similarity function for spectral clustering in Section […], the amplitude and frequency similarities, $W_a$ and $W_f$ respectively, are defined as follows:
$$ W_a(p_l^k, p_m^{k+n}) = e^{-\left(\frac{a_l^k - a_m^{k+n}}{\sigma_a}\right)^2} $$
$$ W_f(p_l^k, p_m^{k+n}) = e^{-\left(\frac{f_l^k - f_m^{k+n}}{\sigma_f}\right)^2} $$
where the Euclidean distances are modeled as two Gaussian functions, as defined in Equation 8. The amplitudes are measured in decibels (dB) and the frequencies are measured in Barks (a frequency scale approximately linear below 500 Hz and logarithmic above), since these scales have been shown to better model the sensitivity of the human ear [Hartmann, 1998].
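As a rough illustration of the two definitions above, the sketch below computes $W_a$ on amplitudes in dB and $W_f$ on frequencies in Bark. The source does not say which Hz-to-Bark conversion is used; the Zwicker and Terhardt approximation is assumed here, and the σ values and function names are hypothetical.

```python
import math


def hz_to_bark(f_hz):
    """Zwicker and Terhardt approximation of the Bark scale (an assumed choice)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def amp_to_db(amplitude):
    """Linear amplitude to decibels."""
    return 20.0 * math.log10(amplitude)


def amplitude_similarity(a_l, a_m, sigma_a):
    """W_a: Gaussian of the amplitude difference measured in dB."""
    return math.exp(-(((amp_to_db(a_l) - amp_to_db(a_m)) / sigma_a) ** 2))


def frequency_similarity(f_l, f_m, sigma_f):
    """W_f: Gaussian of the frequency difference measured in Bark."""
    return math.exp(-(((hz_to_bark(f_l) - hz_to_bark(f_m)) / sigma_f) ** 2))


# Toy values: the same 100 Hz gap counts for more at low than at high frequencies.
low_pair = frequency_similarity(200.0, 300.0, sigma_f=1.0)
high_pair = frequency_similarity(5000.0, 5100.0, sigma_f=1.0)
halved = amplitude_similarity(1.0, 0.5, sigma_a=6.0)
```

Working in Bark is what makes `low_pair` smaller than `high_pair`: a fixed gap in Hz spans more critical bands at low frequencies.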
23. Perceptual Cues as Similarity Functions
} Harmonically Wrapped Peak Similarity (HWPS)
} Harmonicity is one of the most powerful ASA cues [Wang and Brown, 2006]
} Proposal of a novel harmonicity similarity function
} Does not rely on the prior knowledge of f0 in the signal
} Takes into account spectral information in a global manner (spectral patterns)
For peaks in the same frame or in different frames in a Texture Window
Takes into consideration the amplitudes of the spectral peaks
} 3 step algorithm
Shifted Spectral Pattern
Wrapped Frequency Space Histogram computation
Discrete Cosine Similarity [0,1]
STEP 3 – Discrete Cosine Similarity
The last step is now to correlate the two shifted and harmonically wrapped spectral patterns ($\hat{F}_l^k$ and $\hat{F}_m^{k+n}$) to obtain the HWPS measure between the two corresponding peaks. This correlation can be done using an algorithmic approach as proposed in [Lagrange and Marchand, 2006], but this was found not to be reliable or robust in practice. Alternatively, the proposal is to discretize each shifted and harmonically wrapped spectral pattern into an amplitude weighted histogram, $H_l^k$, corresponding to each spectral pattern $\hat{F}_l^k$. The contribution of each peak to the histogram is equal to its amplitude and the range between 0 and 1 of the Harmonically-Wrapped Frequency is divided into 20 equal-size bins (a 12 or a 24 bin histogram would provide a more musically meaningful chroma-based representation, but preliminary and empirical tests have shown better results when using 20 bin histograms).
In addition, the harmonically wrapped spectral patterns are also folded into an octave to form a pitch-invariant “chroma” profile. For example, in Figure 19, the energy of the spectral pattern in wrapped frequency 1 (all integer multiples of the wrapping frequency) is mapped to histogram bin 0.
The HWPS similarity between the peaks $p_l^k$ and $p_m^{k+n}$ is then defined based on the cosine distance between the two corresponding discretized histograms as follows:
$$ W_h(p_l^k, p_m^{k+n}) = \mathrm{HWPS}(p_l^k, p_m^{k+n}) = e^{-\left(1 - \frac{c(H_l^k, H_m^{k+n})}{\sqrt{c(H_l^k, H_l^k) \times c(H_m^{k+n}, H_m^{k+n})}}\right)^2} \qquad (28) $$
where
$$ c(H_a^b, H_c^d) = \sum_i H_a^b(i) \times H_c^d(i). \qquad (29) $$
One may notice that due to the wrapping operation of Equation 25, the size of the histograms can be relatively small (e.g. 20 bins), thus being computationally efficient. A Gaussian function is also used for controlling the neighborhood width of the harmonicity cue, where $\sigma_h = 1$ is implicitly used in the current system implementation.
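The three HWPS steps can be sketched as below. This is an illustration under assumptions rather than the original implementation: the shift is taken as subtracting the peak's own frequency from every peak in its frame, the wrap as dividing by h and keeping the fractional part (Equation 25 is not reproduced in the slides), and h as the smaller of the two peak frequencies. The octave folding mentioned in the text is left out, and all names and toy values are hypothetical.

```python
import math


def wrapped_histogram(freqs, amps, peak_freq, h, n_bins=20):
    """Shift by the peak frequency, wrap modulo h, accumulate amplitudes per bin."""
    hist = [0.0] * n_bins
    for f, a in zip(freqs, amps):
        wrapped = ((f - peak_freq) / h) % 1.0  # value in [0, 1)
        hist[min(int(wrapped * n_bins), n_bins - 1)] += a
    return hist


def corr(h1, h2):
    """Eq. 29: sum of bin-wise products."""
    return sum(x * y for x, y in zip(h1, h2))


def hwps(frame_l, peak_l, frame_m, peak_m, n_bins=20):
    """Eq. 28 for two peaks; each frame is a (frequencies, amplitudes) pair."""
    h = min(peak_l, peak_m)  # assumed wrapping frequency: smaller peak frequency
    hist_l = wrapped_histogram(frame_l[0], frame_l[1], peak_l, h, n_bins)
    hist_m = wrapped_histogram(frame_m[0], frame_m[1], peak_m, h, n_bins)
    cosine = corr(hist_l, hist_m) / math.sqrt(corr(hist_l, hist_l) * corr(hist_m, hist_m))
    return math.exp(-((1.0 - cosine) ** 2))


# Toy frame: source A (f0 = 220 Hz) mixed with source B (f0 = 300 Hz), unit amplitudes.
freqs = [220.0, 440.0, 660.0, 880.0, 300.0, 600.0, 900.0]
frame = (freqs, [1.0] * len(freqs))
same_source = hwps(frame, 440.0, frame, 220.0)  # peaks A1 and A0
diff_source = hwps(frame, 440.0, frame, 300.0)  # peaks A1 and B0
```

Because Equation 28 applies a Gaussian with $\sigma_h = 1$ to a value between 0 and 1, the similarity of unrelated peaks is reduced rather than driven to zero; only the ordering `same_source > diff_source` is relied on here.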
24. Perceptual Cues as Similarity Functions
} HWPS
} Between peaks of the same harmonic “source”
} In the same frame
High similarity (~1.0)
[Figure: peaks A1 (2·f0A) and A0 (f0A) of source A in frame k, mixed with source B; after shifting and harmonic wrapping with h = f0A, the two wrapped patterns coincide, giving a high HWPS(A1, A0).]
25. Perceptual Cues as Similarity Functions
} HWPS
} Between peaks of different harmonic “sources”
} In the same frame
Low similarity (~0.0)
[Figure: peak A1 (2·f0A) of source A and peak B0 (f0B) of source B in frame k; after shifting and harmonic wrapping with h = f0A, the two wrapped patterns differ, giving a low HWPS(A1, B0).]
26. Perceptual Cues as Similarity Functions
} HWPS
} Between peaks of the same harmonic “source”
} In different frames
Mid-High similarity
Interfering spectral content may be different
Degrades HWPS…
Only consider bin 0?
[Figure: peak A1 (2·f0A) in frame k, where source A is mixed with source B, and peak A0 (f0A) in frame k + n, where source A is mixed with source C; after shifting and harmonic wrapping with h = f0A, the wrapped patterns match only partially, giving a mid-high HWPS.]
27. Perceptual Cues as Similarity Functions
} HWPS
} Impact of f0 estimates (h’)
} Ideal
} Min peak frequency
} Highest amplitude peak
} Histogram-based f0 estimates (pitch estimates == nr. of sources?)
[…] wrapping operation would be perfect with the prior knowledge of the fundamental frequency. With this knowledge it would be possible to parametrize the wrapping operation h as:
$$ h = \min(f_{0_l}^{k}, f_{0_m}^{k+n}) \qquad (26) $$
where $f_{0_l}^{k}$ is the fundamental frequency of the source of the peak $p_l^k$. Without such prior, a conservative approach $h'$ is considered instead, although it will tend to overestimate the fundamental frequency:
$$ h' = \min(f_l^k, f_m^{k+n}) \qquad (27) $$
Notice that the value of the wrapping frequency function h is the same for both patterns corresponding to the peaks under consideration. Therefore the resulting shifted and wrapped frequency pattern will be more similar if the peaks belong to the same harmonic “source”. The resulting shifted and wrapped patterns are pitch invariant and can be seen in the middle plot of Figures 19 and 20.
Different approaches could have been taken for the definition of the fundamental frequency estimation function h. One possibility would be to select the highest amplitude peak in the union of the two spectral patterns under consideration as the f0 estimate (i.e. $h = \{f_i \mid i = \operatorname{argmax}_i(A_i), \forall i \in [1, \#A]\}$, where $A = A_l^k \cup A_m^{k+n}$, $\#A$ is its number of elements and $A_l^k$ is the set of amplitudes corresponding to the spectral pattern $F_l^k$). The motivation for this approach is the fact that the highest amplitude partial in musical signals often corresponds to the fundamental frequency of the most prominent harmonic “source” active in that frame, although this assumption will not always hold.
A more robust approach, though more computationally expensive, would be to calculate all the frequency differences between all peaks in each spectral pattern and compute a histogram. The peaks in these histograms would be good candidates for the fundamental frequencies in each frame (in order to avoid octave ambiguities, a second histogram with the differences between all the candidate f0 values could be again computed, where the highest peaks would be selected as the final f0 candidates). The HWPS could then be iteratively calculated using each f0 candidate in this short list, and select the one with the best value as the final choice. In fact, this technique could prove an interesting way to robustly estimate the number of harmonic “sources” in each frame, including their pitches, but experimental evaluations are still required to validate these approaches.
[Figure: spectrum with the peaks A0 to A4 and B0 to B4 of two harmonic sources (A4 and B3 overlapping); amplitude (0 to 1) versus frequency (0 to 3000 Hz).]
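The first three options for the wrapping frequency discussed above can be compared with a small sketch. The function names and the toy frame are hypothetical; the histogram-based estimate is not covered.

```python
def h_ideal(f0_l, f0_m):
    """Eq. 26: requires the true fundamental of each peak's source."""
    return min(f0_l, f0_m)


def h_conservative(f_l, f_m):
    """Eq. 27: uses only the two peak frequencies."""
    return min(f_l, f_m)


def h_highest_amplitude(pattern_l, pattern_m):
    """Frequency of the strongest peak in the union of two (freq, amp) patterns."""
    return max(pattern_l + pattern_m, key=lambda peak: peak[1])[0]


# Toy frame: one source with f0 = 220 Hz whose second partial is the loudest.
frame = [(220.0, 0.5), (440.0, 1.0), (660.0, 0.3)]
ideal = h_ideal(220.0, 220.0)
conservative = h_conservative(440.0, 660.0)  # comparing the 2nd and 3rd partials
loudest = h_highest_amplitude(frame, frame)
```

Here both the conservative choice and the loudest-peak choice return 440 Hz, twice the true fundamental, which is the overestimation the text warns about.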
28. Similarity Combination
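[Block diagram: the Similarity Computation block, where the individual cues (amplitude similarity, frequency similarity, harmonic similarity (HWPS), azimuth proximity, common onset/offset, source models, ...) computed on the spectral peaks over a texture window (150 ms) are merged by a combiner into the overall similarity matrix passed to the Normalized Cut.]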
29. Similarity Combination
} Combining cues
} Product operator [Shi and Malik, 2000]
High overall similarity only if all cues are high…
} More expressive operators?
[…] to represent the different sound events in a complex mixture. Therefore, the combination of different similarity cues could make the best use of their isolated grouping abilities towards a more meaningful segregation of a sound mixture.
Following the work of Shi and Malik [Shi and Malik, 2000], who proposed to compute the overall similarity function as the product of the individual similarity cues used for image segmentation, the current system combines the amplitude, frequency and HWPS grouping cues presented in the previous sections into a combined similarity function W as follows:
$$ W(p_l, p_m) = W_{afh}(p_l, p_m) = W_a(p_l, p_m) \times W_f(p_l, p_m) \times W_h(p_l, p_m) \qquad (30) $$
Plots g in Figures 15 and 16 show the histogram of the values resulting from the combined similarity functions for the two sound examples, Tones A+B and Jazz1, respectively.
Footnote 5: Audio clips of the signals plotted in Figures 17 and 18 are available at http://www.inescporto.pt/~lmartins/Research/Phd/Phd.htmXXX
Example of a more expressive combination: $W_{afh} = [(W_f \wedge W_a) \vee W_h] \wedge W_s$
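The product combination of Equation 30 amounts to an element-wise product of the individual similarity matrices, as in the sketch below. The function name and the toy matrices are hypothetical.

```python
def combine_product(*matrices):
    """Element-wise product of equally sized square similarity matrices."""
    n = len(matrices[0])
    combined = [[1.0] * n for _ in range(n)]
    for matrix in matrices:
        for i in range(n):
            for j in range(n):
                combined[i][j] *= matrix[i][j]
    return combined


# Toy cues for two peaks: close in amplitude and frequency, but not harmonically related.
Wa = [[1.0, 0.9], [0.9, 1.0]]
Wf = [[1.0, 0.8], [0.8, 1.0]]
Wh = [[1.0, 0.1], [0.1, 1.0]]
W = combine_product(Wa, Wf, Wh)
```

The result illustrates the limitation noted on the slide: a single low cue (here the harmonicity) pulls the overall similarity below every individual cue.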
30. Segregating Sound Events
} Segregation task
} Carried out by clustering components that are close in the similarity space
} Novel method based on Spectral Clustering
} Normalized Cut (Ncut) criterion
Originally proposed for Computer Vision
Takes cues as pair-wise similarities
Cluster the peaks into groups taking into account simultaneously all cues
[Block diagram: the overall similarity matrix produced by the Similarity Computation block (amplitude, frequency, HWPS, azimuth proximity, common onset/offset, source models, ..., merged by a combiner) over the spectral peaks of a texture window (150 ms) is passed to the Normalized Cut.]
31. Segregating Sound Events
} Segregation Task
} Normalized Cut criterion
} Achieves a balanced clustering of elements
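} Relies on the eigenstructure of a similarity matrix to partition points into disjoint clusters
Points in the same cluster: high similarity
Points in different clusters: low similarity
[Figure: graph with nodes x_i, x_j, x_k, x_l, x_p, x_q and symmetric edge weights w_ij = w_ji, contrasting the minimum cut (mincut) with a better, more balanced cut.]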
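The difference between the minimum cut and the Normalized Cut criterion can be checked numerically with the sketch below, which uses the definition of Shi and Malik, Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V). It only evaluates given partitions and does not perform the eigenvector-based optimization. The toy graph and the function names are assumptions.

```python
def cut(W, a, b):
    """Total weight of the edges going from node set a to node set b."""
    return sum(W[i][j] for i in a for j in b)


def ncut(W, a, b):
    """Normalized cut value of the two-way partition (a, b)."""
    nodes = a + b
    c = cut(W, a, b)
    return c / cut(W, a, nodes) + c / cut(W, b, nodes)


def toy_graph():
    """Two tight groups {0,1,2} and {3,4,5}, plus a weakly connected node 6."""
    n = 7
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-similarity in this toy example
            if 6 in (i, j):
                W[i][j] = 0.05
            elif (i < 3) == (j < 3):
                W[i][j] = 0.9
            else:
                W[i][j] = 0.1
    return W


W = toy_graph()
isolate = ([6], [0, 1, 2, 3, 4, 5])
balanced = ([0, 1, 2], [3, 4, 5, 6])
```

The plain cut is smallest when the weakly connected node is isolated, whereas the normalized value favors the balanced partition, which is the behavior the figure contrasts.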
32. Segregating Sound Events
} Spectral Clustering
} Alternative to the traditional EM and k-means algorithms:
} Does not assume a convex shaped data representation
} Does not assume Gaussian distribution of data
} Does not present multiple minima in log-likelihood
Avoids multiple restarts of the iterative process
} Correctly handles complex and unknown shapes
} Usual in audio signals [Bach and Jordan 2004]
33. Segregating Sound Events
} Divisive clustering approach
} Recursive two-way cut
} Hierarchical partition of the data
Recursively partitions the data into two sets
Until pre-defined number of clusters is reached (requires prior knowledge!)
Until a stopping criterion is met
} Current implementation
Requires definition of number of clusters [Martins et al., 2007]
Or alternatively partitions data into 5 clusters and selects the 2 “denser” ones
Segregation of the dominant clusters in the mixture [Lagrange et al., 2008a]
34. Segregation Results
[Figure: segregation results for the Jazz1 and Tones A+B examples, plotted as frequency (Hz) versus time (secs). Panels for each example: a) input signal; b), c) amplitude similarity, clusters 1 and 2; d), e) frequency similarity, clusters 1 and 2; f), g) HWPS similarity, clusters 1 and 2; h), i) combined similarities, clusters 1 and 2. The Tones A+B panels label the partials A0 to A3, B0 to B2 and the overlapping A4 + B3.]
[Figure: spectrum with the peaks A0 to A4 and B0 to B4 (A4 and B3 overlapping); amplitude (0 to 1) versus frequency (0 to 3000 Hz).]
35. Results
} Predominant Melodic Source Segregation
} Dataset of real-world polyphonic music recordings
} Availability of the original isolated tracks (ground truth)
} Results (the higher the better)
HWPS improves results
When combined with other similarity features
When compared with other state-of-the-art harmonicity features [Srinivasan and Kankanhalli, 2003] [Virtanen and Klapuri, 2000]
[Bar chart: mean SDR (dB, axis from 0 to 7) for a 10 song dataset, comparing A+F+HWPS, A+F+rHWPS, A+F+HV, A+F+HS and A+F.]
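For readers unfamiliar with the metric, a signal-to-distortion ratio can be sketched as below. The slides do not state which SDR definition was used; the basic energy-ratio form, 10·log10 of the reference energy over the error energy, is assumed here, and the function name and toy signals are hypothetical.

```python
import math


def sdr_db(reference, estimate):
    """Basic SDR in dB: reference energy over the energy of (reference - estimate)."""
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_energy / error_energy)


# Toy signals: the estimate is the reference scaled by 0.9.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -0.9, 0.9, -0.9]
value = sdr_db(reference, estimate)
```

A higher value means the segregated signal is closer to the isolated ground-truth track, which is why the results are reported as "the higher the better".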
36. Results
} Predominant Melodic Source Segregation
} On the use of Dynamic Texture Windows
} Results (the higher the better)
Smaller improvement (0.15 dB) than expected
Probably due to the cluster selection approach being used…
More computationally intensive (for longer texture windows)
37. Results
} Main Melody Pitch Estimation
} Resynthesize the segregated main voice clusters
} Perform pitch estimation using a well-known monophonic pitch estimation technique (Praat)
} Comparison with two techniques:
} Monophonic pitch estimation applied to the mixture audio (from Praat)
} State-of-the-art multi-pitch and main melody estimation algorithm applied to the mixture audio [Klapuri, 2006]
} Results (the lower the better)
38. Results
} Voicing Detection
} Identifying portions of a music file containing vocals
} Evaluated three feature sets:
MFCC features extracted from the polyphonic signal
MFCC features extracted from the segregated main voice
Cluster Peak Ratio (CPR) feature extracted from the segregated main voice clusters
39. Results
} Timbre Identification in polyphonic music signals [Martins et al., 2007]
} Polyphonic, multi-instrumental audio signals
} Artificial mixtures of 2-, 3- and 4-notes from real instruments
} Automatic separation of the sound sources
} Sound sources and events are reasonably captured, corresponding in most cases to played notes
} Matching of the separated events to a collection of 6 timbre models
Figure: block diagram of the timbre identification system. Blocks: Sinusoidal Analysis, Peak Picking, Sound Source Formation (outputs note 1 ... note n), Matching against Timbre Models (outputs note 1 / inst 1 ... note n / inst i).
40. Results
} Timbre Identification in polyphonic music signals [Martins et al., 2007]
} Sound sources and events are reasonably captured, corresponding in most cases to played notes
41. Results
} Timbre Identification in polyphonic music signals [Martins et al., 2007]
} 6 instruments modeled [Burred et al., 2006]:
} Piano, violin, oboe, clarinet, trumpet and alto sax
} Modeled as a set of time-frequency templates
Describe the typical evolution in time of the spectral envelope of a note
Match the salient peaks of the spectrum
Figure: time-frequency templates for PIANO and OBOE. Axes: frequency (Hz, 2000 to 10000), time (normalized, 0 to 1), amplitude (dB, -80 to 0).
42. Results
} Timbre Identification in polyphonic music signals [Martins et al., 2007]
} Instrument presence detection in mixtures of notes
} 56% of instrument occurrences correctly detected, with a precision of 64% [Martins et al., 2007]
Figure: weak matching (alto sax cluster against piano prototype) and strong matching (piano cluster against piano prototype).
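The two detection figures above can be combined into a single score. The sketch below assumes that "56% of instrument occurrences correctly detected" is the recall of the detector, and computes the balanced F-measure from it and the stated 64% precision. The function name is an illustrative assumption.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta is 1)."""
    if precision <= 0.0 and recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)

# Values from the slide; reading the 56% figure as recall is an assumption.
recall = 0.56
precision = 0.64
f1 = f_measure(precision, recall)
print(f"F1 = {f1:.3f}")
```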
43. Software Implementation
} Modular, flexible and efficient software implementation
} Based on Marsyas
} Free and Open Source framework for audio analysis and processing
http://marsyas.sourceforge.net
peakClustering myAudio.wav
44. Software Implementation
} Marsyas
} peakClustering Overview
Figure: peakClustering network overview. Series/mainNet contains Accumulator/textWinNet (detail 1), PeakConvert/conv, FlowThru/clustNet (detail 2), PeakLabeler/labeler, Shredder/synthNet (detail 3) and PeakViewSink/peSink. Controls shown: frameMaxNumPeaks, totalNumPeaks, nTimes, peakLabels, innerOut. Labeled data points: A, B.
45. Software Implementation
} Marsyas
} Sinusoidal analysis front-end
Figure: sinusoidal analysis front-end network (detail 1). Accumulator/textWinNet wraps Series/analysisNet and Series/peakExtract. Modules shown:
FanOutIn/mixer (+): Series/oriNet (SoundFileSource/src, Gain/oriGain) and Series/mixSeries (SoundFileSource/src, Delay/noiseDelay, Gain/noiseGain)
ShiftInput/si
Fanout/stereoFo: Series/spectrumNet (Stereo2Mono/s2m, Shifter/sh, Windowing/wi, Parallel/par with Spectrum/spk1 and Spectrum/spk2) and Series/stereoSpkNet (Parallel/LRnet with Series/spkL and Series/spkR, each holding Windowing/win and Spectrum/spk; EnhADRessStereoSpectrum/stereoSpk; EnhADRess/ADRess)
FlowThru/onsetdetector (detail 1a): Windowing/wi, Spectrum/spk, PowerSpectrum/pspk, Flux/flux, ShiftInput/sif, Filter/filt1, Filter/filt2, Reverse/rev1, Reverse/rev2, PeakerOnset/peaker
Controls shown: onsetDetected, flush. Labeled data points: I, S, A.
46. Software Implementation
} Marsyas
} Onset detection
Figure: the sinusoidal analysis front-end network with the onset detector highlighted. FlowThru/onsetdetector (detail 1a) holds Windowing/wi, Spectrum/spk, PowerSpectrum/pspk, Flux/flux, ShiftInput/sif, Filter/filt1, Filter/filt2, Reverse/rev1, Reverse/rev2 and PeakerOnset/peaker. Controls shown: onsetDetected, flush. The remaining modules (FanOutIn/mixer, ShiftInput/si, Series/stereoSpkNet, Spectrum/spk2) are the same as in the front-end diagram.
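The onset detector chain above (windowing, spectrum, power spectrum, flux, filtering, peak picking) can be illustrated outside Marsyas. The following is a minimal NumPy sketch, not the Marsyas implementation: it assumes a half-wave rectified spectral flux, reads the Filter/Reverse pairs as forward-backward (zero-phase) smoothing, and uses a simple relative threshold for peak picking. All function names, parameter values and the test signal are illustrative assumptions.

```python
import numpy as np

def spectral_flux(signal, frame_size=1024, hop=512):
    """Sum of positive power-spectrum increases between consecutive frames."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    prev_power = np.zeros(frame_size // 2 + 1)
    flux = np.zeros(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop:i * hop + frame_size] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        flux[i] = np.sum(np.maximum(power - prev_power, 0.0))
        prev_power = power
    return flux

def smooth_zero_phase(x, alpha=0.5):
    """One-pole low-pass applied forward, then backward on the reversed signal."""
    def one_pole(values):
        out = np.zeros(len(values))
        state = 0.0
        for n, value in enumerate(values):
            state = alpha * value + (1.0 - alpha) * state
            out[n] = state
        return out
    return one_pole(one_pole(x)[::-1])[::-1]

def pick_onsets(flux, threshold_ratio=0.3):
    """Indices of local maxima above a fraction of the global maximum."""
    peak = np.max(flux)
    if peak <= 0.0:
        return []
    threshold = threshold_ratio * peak
    return [i for i in range(1, len(flux) - 1)
            if flux[i] > threshold
            and flux[i] >= flux[i - 1] and flux[i] > flux[i + 1]]

# Synthetic illustration: half a second of silence, then a 440 Hz tone.
sample_rate = 8000
signal = np.zeros(sample_rate)
n = np.arange(sample_rate // 2)
signal[sample_rate // 2:] = np.sin(2.0 * np.pi * 440.0 * n / sample_rate)
onsets = pick_onsets(smooth_zero_phase(spectral_flux(signal)))
print(onsets)
```

In the slide's network the detected onset sets the onsetDetected control; in this sketch the returned frame indices play that role.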
47. Software Implementation
} Marsyas
} Similarity matrix computation and Clustering
Figure: similarity matrix computation and clustering network (detail 2). FlowThru/clustNet follows PeakConvert/conv (data point B) and contains:
FanOutIn/simNet (x), combining four branches:
Series/freqSim: PeakFeatureSelect/FREQfeatSelect, SimilarityMatrix/FREQsimMat (Metric/FreqL2Norm), RBF/FREQrbf
Series/ampSim: PeakFeatureSelect/AMPfeatSelect, SimilarityMatrix/AMPsimMat (Metric/AmpL2Norm), RBF/AMPrbf
Series/HWPSim: PeakFeatureSelect/HWPSfeatSelect, SimilarityMatrix/HWPSsimMat (HWPS/hwps), RBF/HWPSrbf
Series/panSim: PeakFeatureSelect/PANfeatSelect, SimilarityMatrix/PANsimMat (Metric/PanL2Norm), RBF/PANrbf
Series/NCutNet: Fanout/stack (NormCut/NCut, Gain/ID)
PeakClusterSelect/clusterSelect
The output (innerOut, labels) goes to PeakLabeler/labeler. Controls shown: frameMaxNumPeaks, totalNumPeaks. Labeled data points: B, C1, C2, C3, D, E, F.
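To make the data flow of this network concrete, here is a minimal NumPy sketch under stated assumptions: each cue is turned into a similarity matrix with a Gaussian (RBF) function of the pairwise distance, the per-cue matrices are multiplied element-wise (the "x" in FanOutIn/simNet), and the peaks are split in two with a normalized cut computed from the second-smallest eigenvector of the normalized Laplacian. The function names, the σ values and the peak data are made up for illustration and are not the Marsyas code or its defaults.

```python
import numpy as np

def rbf_similarity(values, sigma):
    """Gaussian similarity exp(-(d / sigma)^2) of pairwise distances d."""
    values = np.asarray(values, dtype=float)
    dist = np.abs(values[:, None] - values[None, :])
    return np.exp(-(dist / sigma) ** 2)

def normalized_cut_bipartition(W):
    """Two-way normalized cut from the second-smallest eigenvector of the
    symmetric normalized Laplacian I - D^(-1/2) W D^(-1/2)."""
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    laplacian = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    indicator = eigvecs[:, 1] * d_inv_sqrt  # back to the generalized problem
    return (indicator > 0).astype(int)

# Six hypothetical peaks: three low and loud, three high and quiet.
freqs = np.array([200.0, 400.0, 600.0, 2000.0, 2200.0, 2400.0])  # Hz
amps = np.array([0.90, 0.80, 0.85, 0.20, 0.25, 0.20])            # linear

# Element-wise product of the per-cue similarity matrices.
W = rbf_similarity(freqs, sigma=1000.0) * rbf_similarity(amps, sigma=0.3)
labels = normalized_cut_bipartition(W)
print(labels)
```

Which group receives label 0 or 1 depends on the sign of the eigenvector, so only the grouping itself is meaningful. The width σ controls how quickly similarity decays with distance for each cue.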
48. Software Implementation
} Marsyas
} More flexible Similarity expression
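Figure: nested similarity network. FanOutIn/simNet (.*) combines Series/panSim with FanOutIn/ORnet (max); ORnet combines FanOutIn/ANDnet (.*) with Series/HWPSim; ANDnet combines Series/freqSim and Series/ampSim. Each branch holds a PeakFeatureSelect module, a SimilarityMatrix module and an RBF module:
Series/panSim: PeakFeatureSelect/PANfeatSelect, SimilarityMatrix/PANsimMat (Metric/PanL2Norm), RBF/PANrbf
Series/freqSim: PeakFeatureSelect/FREQfeatSelect, SimilarityMatrix/FREQsimMat (Metric/FreqL2Norm), RBF/FREQrbf
Series/ampSim: PeakFeatureSelect/AMPfeatSelect, SimilarityMatrix/AMPsimMat (Metric/AmpL2Norm), RBF/AMPrbf
Series/HWPSim: PeakFeatureSelect/HWPSfeatSelect, SimilarityMatrix/HWPSsimMat (HWPS/hwps), RBF/HWPSrbf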
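The nesting in this diagram can be read as a logical expression over similarity matrices, with the element-wise product (.*) acting as AND and the element-wise maximum (max) acting as OR. The sketch below evaluates one such reading, ((frequency AND amplitude) OR HWPS) AND panning. The helper names and the 2x2 matrices are illustrative assumptions.

```python
import numpy as np

def sim_and(*matrices):
    """Element-wise product: two peaks must be similar under every cue."""
    out = np.ones_like(matrices[0])
    for m in matrices:
        out = out * m
    return out

def sim_or(*matrices):
    """Element-wise maximum: similarity under any one cue is enough."""
    return np.max(np.stack(matrices), axis=0)

# Hypothetical similarity matrices for two peaks.
W_freq = np.array([[1.0, 0.2], [0.2, 1.0]])
W_amp = np.array([[1.0, 0.5], [0.5, 1.0]])
W_hwps = np.array([[1.0, 0.9], [0.9, 1.0]])
W_pan = np.array([[1.0, 0.5], [0.5, 1.0]])

W = sim_and(sim_or(sim_and(W_freq, W_amp), W_hwps), W_pan)
print(W)
```

Here the two peaks are far apart in frequency and amplitude but strongly harmonically related, so the OR keeps their similarity high before the panning cue scales it.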
49. Software Implementation
} Marsyas
} Cluster Resynthesis
Figure: cluster resynthesis network (detail 3). Shredder/synthNet wraps Series/postNet, which holds PeakSynthOsc/pso, Windowing/wiSyn, OverlapAdd/ov, Gain/outGain and SoundFileSink/dest. Input data point: B.
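The resynthesis chain above (oscillator bank, synthesis window, overlap-add) can be sketched in a few lines of NumPy. This is an illustration of the general technique, not the Marsyas modules: it assumes each frame carries (frequency, amplitude, phase) triples for the peaks of the selected cluster, a periodic Hann synthesis window and a hop of half a frame. Names and values are illustrative assumptions.

```python
import numpy as np

def resynthesize(frames, frame_size=512, sample_rate=8000.0):
    """Overlap-add resynthesis of sinusoidal peaks.

    frames: one list per hop of (freq_hz, amplitude, phase_rad) triples.
    """
    hop = frame_size // 2
    n = np.arange(frame_size)
    # Periodic Hann window: copies shifted by half a frame sum to one.
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_size)
    out = np.zeros(hop * (len(frames) - 1) + frame_size)
    for i, peaks in enumerate(frames):
        frame = np.zeros(frame_size)
        for freq, amp, phase in peaks:
            frame += amp * np.cos(2.0 * np.pi * freq * n / sample_rate + phase)
        out[i * hop:i * hop + frame_size] += window * frame
    return out

# One steady 440 Hz peak; the phase advances by one hop per frame so that
# consecutive frames line up.
frame_size, sample_rate, n_frames = 512, 8000.0, 20
hop = frame_size // 2
frames = [[(440.0, 0.5, 2.0 * np.pi * 440.0 * i * hop / sample_rate)]
          for i in range(n_frames)]
audio = resynthesize(frames, frame_size, sample_rate)
print(len(audio))
```

Away from the first and last half frame, the overlapping windows sum to one and the output is the original sinusoid.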
50. Software Implementation
} Marsyas
} Data structures
Figure: data structures at the labeled points of the networks.
I: audio frame (N+1 samples), with Ch1 samples and Ch2 samples
S: analysis window (N samples), holding the audio samples and the shifted audio samples
A: for each texture window frame, complex spectrum 1 (N points), complex spectrum 2 (N points) and stereo spectrum (N/2+1 points). Each complex spectrum holds Re(0), Re(N/2), Re(1), Im(1), ..., Re(N/2-1), Im(N/2-1); the stereo spectrum holds Pan(0), Pan(1), ..., Pan(N/2)
B: for each texture window frame, peaks FREQUENCY, peaks AMPLITUDE, peaks PHASE, peaks GROUP and peaks TRACK, each with frameMaxNumPeaks entries
C1: peaks' frequency (f1 ... f6), one entry per peak (total number of peaks)
D: similarity matrix, with the total number of peaks in the texture window on each side
51. Software Implementation
} Marsyas
} Data structures
Figure: data structures (continued).
C1: peaks' frequency (f1 ... f6), one entry per peak (total number of peaks in texture window)
C2: peaks' amplitude (a1 ... a6), one entry per peak (total number of peaks in texture window)
C3: peak spectral pattern, one column per peak (total number of peaks in texture window), with NumPeaks in frame entries per column
D: similarity matrix, with the total number of peaks in the texture window on each side
E: similarity matrix with the NCUT indicator appended, one cluster label per peak (example values: 3 2 2 1 1 3)
F: cluster selection indicator, one value per peak (example values: 3 -1 -1 1 1 3)
The diagram also repeats structures A, B and S.
52. Conclusions
} Proposal of a framework for sound source segregation
} Inspired by ideas of CASA
} Focused on “real-world” music signals
} Designed to be causal and efficient
} Data-driven
} Does not require any training or prior knowledge about audio signals under analysis
} Approaches partial tracking and source separation jointly
} Flexible enough to include new perceptually motivated auditory cues
} Based on a Spectral Clustering technique
} Shows good potential for applications
} Source segregation/separation,
} Monophonic or polyphonic instrument classification,
} Main melody estimation
} Pre-processing for polyphonic transcription, ...
53. Conclusions
} Definition of a novel harmonicity cue
} Termed Harmonically Wrapped Peak Similarity (HWPS)
} Experimentally shown to:
} Be a good grouping criterion for sound segregation in polyphonic music signals
} Compare favorably to other state-of-the-art harmonicity cues
} Software development of the sound segregation framework
} Used for validation and evaluation
} Made available as Free and Open Source Software (FOSS)
} Based on Marsyas
} Free for everyone to try, evaluate, modify and improve
54. Future Work
} Analysis front-end
} Evaluate alternative analysis front-ends
} Perceptually-informed filterbanks
} Sinusoid+transient representations
} A different auditory front-end (as long as it is invertible), …
} Evaluate alternative frequency estimation methods for spectral peaks
} Parabolic interpolation
} Subspace methods
} …
} Use of a beat-synchronous approach
} Based on the use of onset detectors and beat estimators for dynamic adjustment of texture windows
} Perceptually motivated
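Of the alternative frequency estimation methods listed above, parabolic interpolation is the simplest to illustrate. The sketch below fits a parabola through the log-magnitude of a spectral peak bin and its two neighbours and returns the fractional bin position of the vertex. It is a generic textbook formulation, not code from the framework; the function name, window choice and test tone are illustrative assumptions.

```python
import numpy as np

def parabolic_peak(mag_db, k):
    """Refine the peak at bin k from the magnitudes at k-1, k and k+1.

    Returns (fractional_bin, interpolated_height).
    """
    alpha, beta, gamma = mag_db[k - 1], mag_db[k], mag_db[k + 1]
    denom = alpha - 2.0 * beta + gamma
    if denom == 0.0:  # flat neighbourhood, nothing to refine
        return float(k), float(beta)
    offset = 0.5 * (alpha - gamma) / denom
    height = beta - 0.25 * (alpha - gamma) * offset
    return k + offset, height

# Illustration: a 1003 Hz tone that falls between two FFT bins.
sample_rate, n_fft, true_freq = 8000.0, 1024, 1003.0
t = np.arange(n_fft) / sample_rate
frame = np.sin(2.0 * np.pi * true_freq * t) * np.hanning(n_fft)
mag_db = 20.0 * np.log10(np.abs(np.fft.rfft(frame)) + 1e-12)

k = int(np.argmax(mag_db))
bin_est, _ = parabolic_peak(mag_db, k)
coarse_hz = k * sample_rate / n_fft
refined_hz = bin_est * sample_rate / n_fft
print(coarse_hz, round(refined_hz, 2))
```

The coarse estimate is limited to the bin spacing (about 7.8 Hz here), while the interpolated estimate lands much closer to the true frequency.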
55. Future Work
} Grouping Cues
} Improve HWPS
} Better f0 candidate estimation
} Reduce negative impact of sound events in different audio frames
} Inclusion of new perceptually motivated auditory cues
} Time and frequency masking
} Stereo placement of spectral components (for stereo signals)
} Timbre models as a priori information
} Peak tracking as a pre- and post-processing step
} Common fate (onsets, offsets, modulation)
56. Future Work
} Implement Sequential integration
} between texture windows
} Cluster segregated clusters?
} Timbre similarity [Martins et al. 2007]
Figure: two segregated clusters, labeled Cluster 1 and Cluster 2.
57. Future Work
} Clustering
} Definition of the neighborhood width (σ) in similarity functions
} JNDs?
} Define and evaluate more expressive combinations of similarity functions
} Automatic estimation of the number of clusters in each texture window
} Extraction of new descriptors directly from segregated cluster parameters (e.g., CPR):
} Pitch, spectral features, frequency tracks, timing information
58. Future Work
} Creation of a sound/music evaluation dataset
} Simple and synthetic sound examples
} For preliminary testing, fine tuning, validation
} “Real-world” polyphonic recordings
} More complex signals, for final stress-test evaluations
} To be made publicly available
} Software Framework
} Analysis and processing framework based on Marsyas
} FOSS, C++, multi-platform, real-time
} Feature-rich software visualization and sonification tools
59. Related Publications
} PhD Thesis:
} Martins, L. G. (2009). A Computational Framework for Sound Segregation in Music Signals. PhD thesis, FEUP.
} Book:
} Martins, L. G. (2009). A Computational Framework for Sound Segregation in Music Signals – An Auditory Scene Analysis Approach for Modeling Perceptual Grouping in Music Listening. Lambert Academic Publishing.
} Book Chapter:
} Martins, L. G., Lagrange, M., and Tzanetakis, G. (2010). Modeling grouping cues for auditory scene analysis using a spectral clustering formulation. Machine Audition: Principles, Algorithms and Systems. IGI Global.
60. Related Publications
} Lagrange, M., Martins, L. G., Murdoch, J., and Tzanetakis, G. (2008). Normalized cuts for predominant melodic source separation. IEEE Transactions on Audio, Speech, and Language Processing, 16(2). Special Issue on MIR.
} Martins, L. G., Burred, J. J., Tzanetakis, G., and Lagrange, M. (2007). Polyphonic instrument recognition using spectral clustering. In Proc. International Conference on Music Information Retrieval (ISMIR), Vienna, Austria.
} Lagrange, M., Martins, L. G., and Tzanetakis, G. (2008). A computationally efficient scheme for dominant harmonic source separation. In Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Las Vegas, Nevada, USA.
} Tzanetakis, G., Martins, L. G., Teixeira, L. F., Castillo, C., Jones, R., and Lagrange, M. (2008). Interoperability and the Marsyas 0.2 runtime. In Proc. International Computer Music Conference (ICMC), Belfast, Northern Ireland.
} Lagrange, M., Martins, L. G., and Tzanetakis, G. (2007). Semi-automatic mono to stereo up-mixing using sound source formation. In Proc. 112th Convention of the Audio Engineering Society, Vienna, Austria.
61. Thank you
Questions?
lmartins@porto.ucp.pt
http://www.artes.ucp.pt/citar/