Extracting Tones of Gamelan with STFT

Extracting Tones of
Gamelan Musical Instrument “Saron of Balungan”
with Short Time Fourier Transform Method

Arief Mubarok, Eric Jansen, Hendra Tri Sadewa
arief.mubarok10@mhs.ee.its.ac.id, eric10@mhs.ee.its.ac.id, hendra.tri.sadewa10@mhs.ee.its.ac.id
Adviser:
Dr. Ir. Yoyon Kusnendar Suprapto, M.Sc
yoyonsuprapto@ee.its.ac.id

July 7, 2011

Abstract
Gamelan is a musical ensemble from Indonesia. In general, gamelan instruments are made from bronze and brass.
Today, gamelan music is rarely heard on ordinary occasions; a gamelan orchestra is held only at special events,
e.g. a puppet show (wayang kulit). Because of the high cost of the instruments and the influence of modern
music, gamelan, as one of Indonesia's characteristic kinds of music, is attracting fewer enthusiasts. In contrast
to its declining role in Indonesia, the number of foreigners dedicating themselves to gamelan is increasing
considerably and indicates great enthusiasm.
This research aims to determine the tones of a balungan(1) musical instrument, particularly the saron, using the
Short Time Fourier Transform (STFT) in order to analyze signals in the frequency and time domains.

1 Preface

The saron typically consists of seven bronze bars on top of a resonating frame (rancak) and plays the melody along with the slenthem(2). It is usually about 20 cm (8 in) high and is played on the floor by a seated performer. In the slendro(3) scale, the bars are 6-1-2-3-5-6-1; this can vary from gamelan to gamelan, or even among instruments in the same gamelan. Slendro(3) instruments commonly have only six keys. The saron provides the core melody (balungan(1)) in the gamelan orchestra. Sarons typically come in a number of sizes, from smallest to largest: saron panerus (also: peking), saron barung (sometimes just saron) and saron demung (often just called demung). Each one of those is pitched an octave below the previous. The slenthem(2) or slentho performs a similar function to the sarons, one octave below the demung.

To define tones according to frequency, the Short Time Fourier Transform (STFT) produces its output from the sound as filtered by a window. The STFT builds on the Fast Fourier Transform (FFT): the algorithm captures the input signal over a time interval t and then generates results in both the time and frequency domains.

Basically, the STFT has the same definition as the Fourier Transform; the difference between the two is the window function. With the window function, the STFT slices the signal in the time domain and transforms each slice to the frequency domain. The window function helps in recognizing notes in gamelan, e.g. by slicing the signal within frequency constraints. A gamelan has several instruments, each with its own typical tone. This research focuses on one specific instrument: the saron of balungan(1). The saron of balungan(1) has five tones: ji, ro, lu, ma and nem. Each tone has its own frequency. The range of each note is defined in table 1.

Tones   Range of Frequencies
Ji      504 - 539 Hz
Ro      574 - 610 Hz
Lu      688 - 703 Hz
Ma      792 - 799 Hz
Nem     909 - 926 Hz

Table 1. Range of Frequencies in Gamelan Slendro(3)

Any interference from other balungan instruments, e.g. demung and bonang, is separated out so that only the saron tones result.

(1) The balungan (Javanese: skeleton, frame) is sometimes called the “core melody” of a Javanese gamelan composition. This corresponds to the view that gamelan music is heterophonic: the balungan is then the melody which is being elaborated.
(2) The slenthem frequently plays the same basic melody as that of the saron. Occasionally, it does have its own important part to play. It is low in pitch, and its sound sustains for a relatively long period of time because of the tubular resonators below each bar.
(3) Slendro, or salendro as it is called by the Sundanese, is a pentatonic scale, one of the two most common scales (laras) used in Indonesian gamelan music, the other being pélog.
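
The frequency ranges in Table 1 lend themselves to a simple lookup. The sketch below is an illustration only, not the authors' implementation: the range values are copied from Table 1, while the names SARON_TONE_RANGES and classify_tone are hypothetical. A frequency that falls outside every range is reported as None.

```python
# Frequency ranges in Hz, copied from Table 1 (saron, slendro scale).
SARON_TONE_RANGES = {
    "ji": (504.0, 539.0),
    "ro": (574.0, 610.0),
    "lu": (688.0, 703.0),
    "ma": (792.0, 799.0),
    "nem": (909.0, 926.0),
}


def classify_tone(frequency_hz):
    """Return the tone whose Table 1 range contains frequency_hz, else None.

    Hypothetical helper: the paper does not describe how a measured
    frequency is mapped to a tone name.
    """
    for tone, (low, high) in SARON_TONE_RANGES.items():
        if low <= frequency_hz <= high:
            return tone
    return None


if __name__ == "__main__":
    # 750 Hz lies between the lu and ma ranges, so it is not classified.
    for f in (520.0, 600.0, 750.0, 915.0):
        print(f, classify_tone(f))
```

Keeping the ranges in one table makes it easy to swap in values measured on a different gamelan, since the abstract notes that tuning varies from set to set.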


2 Theoretical Methods

The analysis of notes in the form of signals is described in figure 1.

Input notation signal
  -> Signal rendered into time and/or frequency domains
  -> Define fundamental frequency
  -> Analyze frequency of the rendered tones
  -> Determine occurring notes in time domain

Figure 1. Signal Processing System

2.1 Fast Fourier Transform

The term Fast Fourier Transform (FFT) refers to an efficient implementation of the Discrete Fourier Transform (DFT). The FFT is commonly used in signal analysis, e.g. filtering, correlation analysis and spectrum analysis. It is mainly used to transform a signal from the time domain to the frequency domain, and the method is also applied in spectral subtraction. The Fourier transform on which the FFT is based is defined as follows:

\[ H(\omega) = \int_{-\infty}^{\infty} h(t)\, e^{-j\omega t}\, dt \tag{1} \]

where

\[ \omega = 2\pi \frac{f}{f_s} = 2\pi f T \tag{2} \]

In discrete form, this becomes:

\[ H(k\omega_0) = \sum_{n=0}^{N-1} h(nT)\, e^{-jk\omega_0 nT} \tag{3} \]

Simplifying with T = 1 and \omega_0 = 2\pi/N, so that N time samples give N frequency samples indexed by k, results in:

\[ H(k) = \sum_{n=0}^{N-1} h(n)\, e^{-j 2\pi k n / N}, \quad k = 0, 1, 2, \ldots, N-1 \tag{4} \]

2.2 Discrete Fourier Transform

In general, the Discrete Fourier Transform (DFT) is similar to the ordinary Fourier transform. The distinction is that the DFT requires a discrete signal as its input function. Theoretically, the DFT is defined as follows:

\[ X_k = \sum_{n=0}^{N-1} x[n]\, \omega_N^{kn}, \quad k = 0, 1, 2, \ldots, N-1 \tag{5} \]

\[ X(\omega_k) \triangleq \sum_{n=0}^{N-1} x(t_n)\, e^{-j\omega_k t_n}, \quad k = 0, 1, 2, \ldots, N-1 \tag{6} \]

where \omega_N = e^{-j 2\pi / N}, so that (5) agrees with (6), ’≜’ means “is defined as” or “equals by definition”, and

\sum_{n=0}^{N-1} f(n) ≜ f(0) + f(1) + ... + f(N − 1)
x(t_n) ≜ input signal amplitude (real or complex) at time t_n
t_n ≜ nT = nth sampling instant (sec), n an integer ≥ 0
T ≜ sampling interval (sec)
X(ω_k) ≜ spectrum of x, at frequency ω_k
ω_k ≜ kΩ = kth frequency sample (radians per second)
Ω ≜ 2π/(NT) = radian-frequency sampling interval (rad/sec)
f_s ≜ 1/T = sampling rate (Hz)
N = number of time samples (integer).

2.3 Short Time Fourier Transform

The Short Time Fourier Transform (STFT), or short-term Fourier Transform, is a powerful general-purpose tool for audio signal processing. It defines a particularly useful class of time-frequency distributions which specify complex amplitude versus time and frequency for any signal. The STFT parameters can be tuned for the following applications:

1. Approximating the time-frequency analysis performed by the ear for purposes of spectral display.
2. Measuring model parameters in a short-time spectrum.

The definition of the continuous-time STFT is:

\[ \mathrm{STFT}\{x(t)\}(\tau, \omega) = X(\tau, \omega) = \int_{-\infty}^{\infty} x(t)\, w(t - \tau)\, e^{-j\omega t}\, dt \tag{7} \]

where

w(t) = window function
x(t) = the signal to be transformed
X(τ, ω) = essentially the Fourier Transform of x(t)w(t − τ)
τ = time axis
ω = frequency axis

The usual mathematical definition of the discrete-time STFT is:

\[ X_m(\omega) = \sum_{n=-\infty}^{\infty} x(n)\, w(n - mR)\, e^{-j\omega n} \tag{8} \]

\[ \phantom{X_m(\omega)} = \mathrm{DTFT}_\omega\big(x \cdot \mathrm{SHIFT}_{mR}(w)\big), \tag{9} \]

where

x(n) = input signal at time n
w(n) = length M window function (e.g. Hamming)
X_m(ω) = DTFT of windowed data centered about time mR
R = hop size, in samples, between successive DTFTs.

In the continuous-time definition (7), τ is the time parameter, ω is the frequency parameter, x(t) is the analyzed signal, w(t − τ) is the window function and e^{−jωt} is the kernel inherited from the Fourier Transform.

The analysis of the signal is described in the following diagram.

Input signals -> Sampling -> Frame blocking -> Windowing -> FFT -> Notes

Figure 2. Signal Block Diagram

2.3.1 Sampling Process

The sound signal is analog, i.e. continuous, and is defined over an infinite time interval. As an object of observation, the sound is partitioned into slices within time constraints and is therefore treated over a finite time interval.

According to the Nyquist theorem, the sampling frequency must be at least twice the signal frequency:

\[ F_{\mathrm{sampling}} \geq 2 \times F_{\mathrm{signal}} \tag{10} \]

Shannon's theorem gives the same bound: the minimum sampling frequency is twice the signal frequency. Under this condition, sampling preserves the original shape of the signal. A higher sampling frequency is better, as it represents the signal more faithfully.

Figure 3 shows the sampling process for the analog and digital signals.

Figure 3. Sampling process

2.3.2 Frame Blocking

Frame blocking is a method of dividing the sound signal into several frames, each consisting of several samples. The number of samples captured depends on the duration of the sound in seconds and on the sampling frequency. This is illustrated in figure 4.

Figure 4. Signal in Frame-blocking

2.3.3 Windowing

The sliced signal in each frame is prone to errors when its Fourier transform is computed. Windowing is therefore necessary to reduce the discontinuity effects at the edges of the slices. In a simple calculation for a continuous signal, the transformation is carried out by multiplying every short-time signal by the window function over the period of time.

In this phase, the frequency scale is determined only by the length of the window. The output of the previous phases, produced by the STFT function in frequency and time together with windowing, generates the visual content of the tones.

3 Analysis and Testing

Performance and robustness are tested with three sound files: SaronTok.wav, consisting of notes of saron; SaronDemung.wav, consisting of notes of saron and demung; and SaronBonang.wav, consisting of notes of saron and bonang. Three window lengths are inspected: 2048, 4096 and 8192.

Extracting tones of saron in SaronTok.wav with length of window 2048 is shown in figure 5.
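
As a rough illustration of the frame blocking, windowing and FFT stages of section 2.3 applied with the three window lengths above, the following Python sketch may help. It is not the authors' code: the 44100 Hz sampling rate, the 50% hop, the Hamming window and the synthetic 520 Hz test tone (inside the ji range of Table 1) are all assumptions, and stft_peak_frequencies is a hypothetical helper built on NumPy.

```python
import numpy as np


def stft_peak_frequencies(signal, fs, window_length, hop=None):
    """Frame-block the signal, window each frame, FFT it, and return
    the frequency (Hz) of the strongest bin in every frame.

    Hypothetical helper; the hop size and window type are assumptions.
    """
    if hop is None:
        hop = window_length // 2  # assumed 50% overlap, not stated in the paper
    window = np.hamming(window_length)
    # Centre frequency of each rfft bin, spaced fs / window_length apart.
    freqs = np.fft.rfftfreq(window_length, d=1.0 / fs)
    peaks = []
    for start in range(0, len(signal) - window_length + 1, hop):
        frame = signal[start:start + window_length] * window  # windowing
        spectrum = np.abs(np.fft.rfft(frame))                  # FFT magnitude
        peaks.append(freqs[np.argmax(spectrum)])
    return np.array(peaks)


if __name__ == "__main__":
    fs = 44100                      # assumed sampling rate in Hz
    t = np.arange(fs) / fs          # one second of samples
    tone = np.sin(2 * np.pi * 520.0 * t)  # synthetic stand-in for a saron note
    for n in (2048, 4096, 8192):
        peaks = stft_peak_frequencies(tone, fs, n)
        # window length, bin spacing (Hz), frame duration (s), mean peak (Hz)
        print(n, fs / n, n / fs, peaks.mean())
```

For each window length N the script prints the bin spacing fs/N in Hz, the frame duration N/fs in seconds, and the mean detected peak. A longer window gives finer frequency bins at the cost of coarser timing, which is the trade-off examined in this section.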

Figure 5. Extracting tones of saron with length of window 2048

Extracting tones of saron in SaronTok.wav with length of window 4096 is shown in figure 6.

Figure 6. Extracting tones of saron with length of window 4096

Extracting tones of saron in SaronTok.wav with length of window 8192 is shown in figure 7.

Figure 7. Extracting tones of saron with length of window 8192

In testing the signals in SaronBonang.wav, the saron and bonang tones interfere where they intersect, as shown in figure 8.

Figure 8. Intersections between saron and bonang

The method cannot fully analyze signals composed of notes with similar frequencies. The success percentages for SaronTok.wav, SaronDemung.wav and SaronBonang.wav are given in table 2.

Window   Saron   Saron+Demung   Saron+Bonang
2048     100%    93.75%         42%
4096     100%    93.75%         50%
8192     100%    100%           57%

Table 2. Results of testing saron tones with several window lengths

4 Conclusion

From the testing and system analysis carried out on the determination of gamelan notation, the following can be concluded:

1. The width of the window affects the accuracy of the analysis. The larger the window width, the smaller the spacing between frequencies on the scale, so the frequency reading is finer and the signal graph is smoother. In the time domain the opposite occurs: the greater the window width, the coarser the time resolution, so the timing of notes tends to overlap. On the amplitude axis, the greater the window width, the more precise the value.

2. For blended notes of saron and demung, the extraction still works well because the notes of the two instruments are dissimilar.

3. For blended notes of saron and bonang, it is difficult to achieve a high percentage of success because of their similarity in frequency.

4. The highest peak frequency of the saron tones strongly influences the accuracy of the analysis used to determine the notation.

5. Theoretically, this research is useful as a guide for determining other notes in other gamelan instruments.

5 Students Profile

Eric Jansen is a student of Computer Engineering and Telematics, Department of Electrical Engineering, Faculty of Industrial Technology, Institut Teknologi Sepuluh Nopember, as a continuation of a Three-Year Diploma. He graduated in 2002 as Ahli Madya (Intermediate Expert, A.Md) from Institut Teknologi Sepuluh Nopember. He has more than a decade of experience in C/C++ programming and specializes in parallel and distributed computation with message passing interface and grid computing. He has contributed to and participated in artificial intelligence projects in facial recognition and biomedicine, and has been developing for years in open source projects. He works professionally with UNIX-variant operating systems such as FreeBSD, Solaris and Linux. He has worked for more than 5 years as an application developer, website developer and network engineer in Indonesia and in European countries: The Netherlands and Spain.
