• Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
    Be the first to like this
No Downloads

Views

Total Views
688
On Slideshare
0
From Embeds
0
Number of Embeds
0

Actions

Shares
Downloads
6
Comments
0
Likes
0

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. Digital Music & Music Processing George Tzanetakis PostDoctoral Fellow Computer Science Department Carnegie Mellon University [email_address] http://www.cs.cmu.edu/~gtzan
  • 2. Overview
    • Music Information Retrieval (MIR) and Computer Audition
      • Motivation
      • Techniques
      • Applications
    • Computer Music and Sound Synthesis
      • Examples, demos
  • 3. MIR Music History 9000 B.C 1000 1700 1877 1960 2002
  • 4. Music
    • 4 million recorded CD tracks
    • 4000 CDs / month
    • Mp3 bandwidth %
    • Global
    • Pervasive
    • Persistent
    • Why ?
  • 5. The future of MIR
    • Library of all recorded music
    • Tasks: organize, search, retrieve, classify, recommend, browse, listen, annotate
    • Examples:
  • 6. Audio MIR Pipeline Signal Processing Machine Learning Human Computer Interaction Hearing Representation Understanding Analysis Reacting Interaction
  • 7. Traditional Music Representations
  • 8. Time domain waveform time pressure Decompose into building blocks time frequency
  • 9. MIDI
    • Musical Instrument Digital Interface
      • Hardware interface
      • File format
    • Note events
      • Duration, discrete pitch, “instrument”
    • Extensions
      • General MIDI
      • Notation, OMR, continuous pitch
  • 10. Symbolic vs Audio MIR Audio Polyphonic Transcription Symbolic Representation (MIDI) MIR Audio Computer Audition MIR Machine Learning Models
  • 11. Feature extraction
  • 12. Timbral Texture Timbre = differentiate sounds of same loudness, pitch Timbral Texture = differentiate mixtures of sounds Global, statistical and fuzzy properties
  • 13. Spectrum t t+1 M M
  • 14. Fourier Transform P=1/f
  • 15. Short Time Fourier Transform STFT Filterbank interpretation Filters Oscillators Amplitude Frequency output
  • 16. Short Time Fourier Transform II t t+1 M M
  • 17. Formants From “Real Time Synthesis for Interactive Applications” P.Cook, A.K Peters Press, used by permission
  • 18. Linear Prediction Coefficients Impulses @ f0 White Noise Source Filter Speech Lossless tubes
  • 19. MPEG Audio Coding (mp3) Analysis Filterbank Psychoacoustic Model Available bits 32 linearly spaced bands Encoder: Slower, Complicated Decoder: Faster, Simpler Perceptual Audio Coding
  • 20. Spectral Shape t M Centroid Rolloff Flux RMS Moments …
  • 21. Summary of Timbral Texture Features
    • Time-Frequency analysis
    • Signal Processing (STFT, DWT)
    • Source-filter (LPC)
    • Perceptual (MP3)
    • Spectral Shape to feature vector
  • 22. Pitch Content
    • Harmony-melody = pitch concepts
    • Music theory score = music
    • Bridge to symbolic MIR
    • Automatic music transcription
    • Non-transcriptive arguments
  • 23. Automatic Pitch Detection P=1/f Time-domain Frequency-domain Perceptual Zerocrossings Autocorrelation analysis = peaks of function correspond to dominant pitches
  • 24. Pitch Histograms Jazz Irish Chroma - folded Height - unfolded
  • 25. Automatic Music Transcription Original Transcribed Mixture signal Noise suppresion Predominant Pitch Estimation Remove detected sound Estimate # of voices
  • 26. Rhythm
    • Movement in time
    • Origins in Poetry (iambic, trochaic)
    • Foot tapping definition
    • Hierarchical semi-periodic structure at multiple levels of detail
    • Links to motion, dance
    • Running vs global
  • 27. Self similarity DWT Autocorrelation Peak Picking Beat Histograms Envelope Extraction
  • 28. Beat Histograms
  • 29. Analysis
    • Classification
    • Segmentation
    • Similarity Retrieval
    • Clustering
    • Thumbnailing
    • Fingerprinting
  • 30. Analysis Overview Musical Piece Trajectory Point
  • 31. Query-by-example Content-based Retrieval Ranked list of k nearest neighbors
  • 32. QBE examples Rock: Beatles Jazz: Bobby Hutserson Funk: Mano negra World: Tibetan singer Computer Music: Paul Lansky Query Match
  • 33. Automatic Musical Genre Classification
    • Categorical music descriptions created by humans
      • Fuzzy boundaries
    • Statistical properties
      • Timbral texture, rhythmic structure, harmonic content
    • Automatic musical genre classification
      • Evaluate musical content features
      • Structure audio collections
  • 34. Genregram demo Dynamic real-time visualization for classification of radio signals
  • 35. Audio segmentation Detect changes of audio texture
  • 36. Multifeature automatic segmenation methodology
    • Time series of feature vector v(t)
    • Detect abrupt changes in trajectory
  • 37. Context & Content Aware User Interfaces
    • Automatic results not perfect
    • Music listening is personal and subjective
    • Browsing vs retrieval
    • “ Overview, zoom and filter, details on demand”, Shneiderman mantra
    • Adapt UI to music content and context
      • Computer Audition
      • Visualization
  • 38. Content and Context
    • Content ~ file
      • Genre, male voice, saxophone
    • Content ~ file, collection
      • Similarity
      • Slow-fast
    • Multiple visualizations
      • Same content
      • Different context
  • 39. Timbregrams Content & Context Similarity + Time Structure Principal Component Analysis Map feature vectors to color
  • 40. Timbrespaces
  • 41. Islands of Music
  • 42. Auditory Scene Analysis