雜訊消除
組員:許家愷、楊志璿、涂哲誠
Introduction
Decode EncodeReadfile FFT/IFFT Noise
Introduction
Decode EncodeReadfile FFT/IFFT Noise
Schedule
4 6 7 11 12 13 15 (weeks)
Decode
FFT
Learn ML
& TF API
ML find
people sound
Filter Encode
Combine
& Debug
Method
https://newt.phys.unsw.edu.au/jw/voice.html
Method
https://newt.phys.unsw.edu.au/jw/voice.html
Method
https://newt.phys.unsw.edu.au/jw/voice.html
Decode EncodeProcess
Number
of
channel
Unichannel LeftChannel
RightChannel
FFT
IFFT
Algorithm
Unichannel
LeftChannel
RightChannel
Encoder
Output.wav
input.wav ReadFile
Decode EncodeProcess
Number
of
channel
Unichannel LeftChannel
RightChannel
FFT
IFFT
Algorithm
Unichannel
LeftChannel
RightChannel
Encoder
Output.wav
input.wav ReadFile
Decode
Unichannel LeftChannel
RightChannel
input.wav
Number
of
channel
Decode
import wave # python module
f = wave.open(filename, 'rb’) # in binary mode
params = f.getparams() # 取得音檔前面的參數
nchannels, sampwidth, framerate, nframes = params[:4]
strData = f.readframes(nframes) # 讀取整首歌
waveData = np.fromstring(strData, dtype=np.int16)
# np.fromstring 會自動把兩個bytes 翻轉過來 並轉成int
Decode EncodeProcess
Number
of
channel
Unichannel LeftChannel
RightChannel
FFT
IFFT
Algorithm
Unichannel
LeftChannel
RightChannel
Encoder
Output.wav
input.wav ReadFile
Encode
Unichannel
LeftChannel
RightChannel
Encoder
Output.wav
Encode
for l,r in zip(L_IFFT, R_IFFT): # 把兩個陣列合併
change = int(l)
if change < 0 :
change = change+65536 # 補數關係,小於0要加65536
lastE = change%128 #切割兩個byte
firstE = change>>8
try: #預防寫入的資料過大
writeFile.writeframes((lastE).to_bytes(1, byteorder='big')) #先寫入後面的byte
writeFile.writeframes((firstE).to_bytes(1, byteorder='big')) #再寫入前面的byte
except:
print(change) #把過大的數字印出
Decode EncodeProcess
Number
of
channel
Unichannel LeftChannel
RightChannel
FFT
IFFT
Algorithm
Unichannel
LeftChannel
RightChannel
Encoder
Output.wav
input.wav ReadFile
Process
FFT
IFFT
Algorithm
ReadFile
Algorithm – Filter & subtraction
def subtraction(rawFreqData, mode=0,
toFindSuitableNoise=False):
filtedData = freqFilter(rawFreqData)
if toFindSuitableNoise == True:
findSuitableNoise(filtedData, mode)
return [a-b for a,b in zip(filtedData, SuitableNoise)]
def freqFilter(freqTable):
for i in range(1 << num):
for j in len(lowerBound):
if ((not (i > lowerBound[j]
and i < upperBound[j]))):
freqTable[i] = 0
return freqTable
Imagine Filter
Imagine Filter
Imagine subtraction
Algorithm – Find suitable noise
def findSuitableNoise(filtedData, mode=0):
'''
filtedData is the frequency graph which had been filtered
mode = 0 is abslute
= 1, 1.4, 2... etc. is power
(Real number not complex)
‘‘’
processFunction = testingPower if mode else testingAbs
success, times = processFunction(filtedData, mode)
SuitableNoise = [x * times for x in success]
Algorithm – Find suitable noise
def testingAbs(rawData, mode):
rawDataSum = 0
result = []
for low, high in zip(lowerBound, upperBound):
rawDataSum += sum(rawData[low:high + 1])
for i in range(len(noiseData)):
subtract = 0
print(len(noiseData[i]))
print(len(rawData))
for j in range(len(noiseData[i])):
tmp = noiseData[i][j] * (rawDataSum / sN[i])
subtract += abs(rawData[j] - tmp)
result.append((subtract, i))
bestNoise = min(result)
return bestNoise, rawDataSum / sN[bestNoise[1]]
Algorithm – Find suitable noise
def findSuitableNoise(filtedData, mode=0):
'''
filtedData is the frequency graph which had been filtered
mode = 0 is abslute
= 1, 1.4, 2... etc. is power
(Real number not complex)
‘‘’
processFunction = testingPower if mode else testingAbs
success, times = processFunction(filtedData, mode)
SuitableNoise = [x * times for x in success]
The better way
• Time domain subtraction
• Smoothly generate
Problems-1
Problems-1
• FFT dtype=int64
def fft_T(x):
# must x size must be 2^N
x = np.asarray(x, dtype=np.complex64)
n = x.shape[0]
if n < 8:
numberList = np.arange(n)
i = numberList.reshape((n, 1)) # turn to column
M = np.exp(-2j * pi * i * numberList / n)
return np.dot(x, M) # matrix dot product
X_even = fft_T(x[::2])
X_odd = fft_T(x[1::2])
twiddleFactor = np.exp(-1j * pi * np.arange(n >> 1) / (n >> 1)) # half
return np.concatenate([X_even + twiddleFactor * X_odd, X_even - twiddleFactor * X_odd])
Problems-2
Problems-2
• Our filter is equivalent to multiply window functions
• F(ω) ∙ G(ω) = f(t) Convolution g(t)
def freqFilter(freqTable):
# 1<<num is sample rate
# so num is same as fft.py
global num
global filterTable
After_filter = [0]*(1<<num)
for run in range(1<<num):
After_filter[run] = filterTable[run] * freqTable[run]
Problems-2
• Our filter is equivalent to multiply window functions
• F(ω) ∙ G(ω) = f(t) Convolution g(t)
src: https://www.twblogs.net/a/5b8cf3582b71771883374f58
def freqFilter(freqTable):
# 1<<num is sample rate
# so num is same as fft.py
global num
global filterTable
After_filter = [0]*(1<<num)
for run in range(1<<num):
After_filter[run] = filterTable[run] * freqTable[run]
Problems-2
filterTable=[0]*(1<<num)
lowerBound = [90, 2000, 6000] #90, 2000, 6000
upperBound = [1300, 5000, 10000] #1300, 5000,10000
for l in range( len(lowerBound) ):
lowerBound[l] = round( lowerBound[l]/( 44100/(1<<num) ) )
upperBound += [ (1<<num) - lowerBound[l] ]
for u in range( len(upperBound) ):
upperBound[u] = round( upperBound[u]/( 44100/(1<<num) ) )
lowerBound += [ (1<<num) - upperBound[u] ]
lowerBound.sort()
upperBound.sort()
expand = 1000
def gaussGenerate():
for l,u in zip(lowerBound, upperBound):
sig = signal.general_gaussian( (u-l) + expand*2, sig = (u-l)//2, p = 3)
for run in range( (u-l) + expand*2 ):
cac = run – expand + l
if(cac >= 0 and cac < 131072):
filterTable[cac] = sig[run]
gaussGenerate()
New filter
Problems-2
filterTable=[0]*(1<<num)
lowerBound = [90, 2000, 6000] #90, 2000, 6000
upperBound = [1300, 5000, 10000] #1300, 5000,10000
for l in range( len(lowerBound) ):
lowerBound[l] = round( lowerBound[l]/( 44100/(1<<num) ) )
upperBound += [ (1<<num) - lowerBound[l] ]
for u in range( len(upperBound) ):
upperBound[u] = round( upperBound[u]/( 44100/(1<<num) ) )
lowerBound += [ (1<<num) - upperBound[u] ]
lowerBound.sort()
upperBound.sort()
expand = 1000
def gaussGenerate():
for l,u in zip(lowerBound, upperBound):
sig = signal.general_gaussian( (u-l) + expand*2, sig = (u-l)//2, p = 3)
for run in range( (u-l) + expand*2 ):
cac = run – expand + l
if(cac >= 0 and cac < 131072):
filterTable[cac] = sig[run]
gaussGenerate()
New filter
Problems-2
Origin
Guass
Problems-3
#numpy.fft.ifft()
def ifft(a, n=None, axis=-1, norm=None):
a = asarray(a)
if n is None:
n = a.shape[axis]
fct = 1/n
if norm is not None and _unitary(norm):
fct = 1/sqrt(n)
output = _raw_fft(a, n, axis, False, False, fct)
return output
def ifft_T(x):
x = np.array(x, dtype=complex)
n = len(x)half = n//2
x = np.concatenate((x[half:],x[:half]),axis =None) #shifting
output = fft_T(x)output = list(reversed(output))
output.insert(0, output[-1])
del output[-1]
return [element / n for element in output]
#Our ifft_T()
Problems-3
#numpy.fft.ifft()
def ifft(a, n=None, axis=-1, norm=None):
a = asarray(a)
if n is None:
n = a.shape[axis]
fct = 1/n
if norm is not None and _unitary(norm):
fct = 1/sqrt(n)
output = _raw_fft(a, n, axis, False, False, fct)
return output
#Our ifft_T()
>>> a = range(1<<5)
>>> b = fft_T(b)
>>> c = ifft_T(b)
>>> c
[0j
(31.00000000649298+1.9984014443252818e-15j)
(30.000000085901572+3.730689515421092e-15j)
(29.000000170765194+5.10702591327572e-15j)
…
(16.99999994318653+1.9984014443252818e-15j)
(16+0j)
(15.000000056813466-6.661338147750939e-16j)
…
(2.9999998292348096-8.881784197001252e-16j)
(1.9999999140984315-5.2850017498963115e-15j)
(0.9999999935070285-7.327471962526033e-15j)]
>>>
def ifft_T(x):
x = np.array(x, dtype=complex)
n = len(x)half = n//2
x = np.concatenate((x[half:],x[:half]),axis =None) #shifting
output = fft_T(x)output = list(reversed(output))
output.insert(0, output[-1])
del output[-1]
return [element / n for element in output]
Problems-3
def ifft_T(x):
x = np.array(x, dtype=complex)
n = len(x)
half = n//2
x = np.concatenate((x[half:], x[:half]), axis = None) #shifting
output = fft_T(x)
output = list(reversed(output))
output.insert(0, output[-1])
del output[-1]
return [element / n for element in output]
Problems-3
def ifft_T(x):
x = np.array(x, dtype=complex)
n = len(x)
half = n//2
x = np.concatenate((x[half:], x[:half]), axis = None) #shifting
output = fft_T(x)
output = list(reversed(output))
output.insert(0, output[-1])
del output[-1]
return [element / n for element in output]
Problems-4
Before process
After process
Project result

Fourier project presentation

  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
    Schedule 4 6 711 12 13 15 (weeks) Decode FFT Learn ML & TF API ML find people sound Filter Encode Combine & Debug
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
    Decode import wave #python module f = wave.open(filename, 'rb’) # in binary mode params = f.getparams() # 取得音檔前面的參數 nchannels, sampwidth, framerate, nframes = params[:4] strData = f.readframes(nframes) # 讀取整首歌 waveData = np.fromstring(strData, dtype=np.int16) # np.fromstring 會自動把兩個bytes 翻轉過來 並轉成int
  • 12.
  • 13.
    Encode for l,r inzip(L_IFFT, R_IFFT): # 把兩個陣列合併 change = int(l) if change < 0 : change = change+65536 # 補數關係,小於0要加65536 lastE = change%128 #切割兩個byte firstE = change>>8 try: #預防寫入的資料過大 writeFile.writeframes((lastE).to_bytes(1, byteorder='big')) #先寫入後面的byte writeFile.writeframes((firstE).to_bytes(1, byteorder='big')) #再寫入前面的byte except: print(change) #把過大的數字印出
  • 14.
  • 15.
    Algorithm – Filter& subtraction def subtraction(rawFreqData, mode=0, toFindSuitableNoise=False): filtedData = freqFilter(rawFreqData) if toFindSuitableNoise == True: findSuitableNoise(filtedData, mode) return [a-b for a,b in zip(filtedData, SuitableNoise)] def freqFilter(freqTable): for i in range(1 << num): for j in len(lowerBound): if ((not (i > lowerBound[j] and i < upperBound[j]))): freqTable[i] = 0 return freqTable
  • 16.
  • 17.
  • 18.
  • 20.
    Algorithm – Findsuitable noise def findSuitableNoise(filtedData, mode=0): ''' filtedData is the frequency graph which had been filtered mode = 0 is abslute = 1, 1.4, 2... etc. is power (Real number not complex) ‘‘’ processFunction = testingPower if mode else testingAbs success, times = processFunction(filtedData, mode) SuitableNoise = [x * times for x in success]
  • 21.
    Algorithm – Findsuitable noise def testingAbs(rawData, mode): rawDataSum = 0 result = [] for low, high in zip(lowerBound, upperBound): rawDataSum += sum(rawData[low:high + 1]) for i in range(len(noiseData)): subtract = 0 print(len(noiseData[i])) print(len(rawData)) for j in range(len(noiseData[i])): tmp = noiseData[i][j] * (rawDataSum / sN[i]) subtract += abs(rawData[j] - tmp) result.append((subtract, i)) bestNoise = min(result) return bestNoise, rawDataSum / sN[bestNoise[1]]
  • 22.
    Algorithm – Findsuitable noise def findSuitableNoise(filtedData, mode=0): ''' filtedData is the frequency graph which had been filtered mode = 0 is abslute = 1, 1.4, 2... etc. is power (Real number not complex) ‘‘’ processFunction = testingPower if mode else testingAbs success, times = processFunction(filtedData, mode) SuitableNoise = [x * times for x in success]
  • 23.
    The better way •Time domain subtraction • Smoothly generate
  • 24.
  • 25.
    Problems-1 • FFT dtype=int64 deffft_T(x): # must x size must be 2^N x = np.asarray(x, dtype=np.complex64) n = x.shape[0] if n < 8: numberList = np.arange(n) i = numberList.reshape((n, 1)) # turn to column M = np.exp(-2j * pi * i * numberList / n) return np.dot(x, M) # matrix dot product X_even = fft_T(x[::2]) X_odd = fft_T(x[1::2]) twiddleFactor = np.exp(-1j * pi * np.arange(n >> 1) / (n >> 1)) # half return np.concatenate([X_even + twiddleFactor * X_odd, X_even - twiddleFactor * X_odd])
  • 26.
  • 27.
    Problems-2 • Our filteris equivalent to multiply window functions • F(ω) ∙ G(ω) = f(t) Convolution g(t) def freqFilter(freqTable): # 1<<num is sample rate # so num is same as fft.py global num global filterTable After_filter = [0]*(1<<num) for run in range(1<<num): After_filter[run] = filterTable[run] * freqTable[run]
  • 28.
    Problems-2 • Our filteris equivalent to multiply window functions • F(ω) ∙ G(ω) = f(t) Convolution g(t) src: https://www.twblogs.net/a/5b8cf3582b71771883374f58 def freqFilter(freqTable): # 1<<num is sample rate # so num is same as fft.py global num global filterTable After_filter = [0]*(1<<num) for run in range(1<<num): After_filter[run] = filterTable[run] * freqTable[run]
  • 29.
    Problems-2 filterTable=[0]*(1<<num) lowerBound = [90,2000, 6000] #90, 2000, 6000 upperBound = [1300, 5000, 10000] #1300, 5000,10000 for l in range( len(lowerBound) ): lowerBound[l] = round( lowerBound[l]/( 44100/(1<<num) ) ) upperBound += [ (1<<num) - lowerBound[l] ] for u in range( len(upperBound) ): upperBound[u] = round( upperBound[u]/( 44100/(1<<num) ) ) lowerBound += [ (1<<num) - upperBound[u] ] lowerBound.sort() upperBound.sort() expand = 1000 def gaussGenerate(): for l,u in zip(lowerBound, upperBound): sig = signal.general_gaussian( (u-l) + expand*2, sig = (u-l)//2, p = 3) for run in range( (u-l) + expand*2 ): cac = run – expand + l if(cac >= 0 and cac < 131072): filterTable[cac] = sig[run] gaussGenerate() New filter
  • 30.
    Problems-2 filterTable=[0]*(1<<num) lowerBound = [90,2000, 6000] #90, 2000, 6000 upperBound = [1300, 5000, 10000] #1300, 5000,10000 for l in range( len(lowerBound) ): lowerBound[l] = round( lowerBound[l]/( 44100/(1<<num) ) ) upperBound += [ (1<<num) - lowerBound[l] ] for u in range( len(upperBound) ): upperBound[u] = round( upperBound[u]/( 44100/(1<<num) ) ) lowerBound += [ (1<<num) - upperBound[u] ] lowerBound.sort() upperBound.sort() expand = 1000 def gaussGenerate(): for l,u in zip(lowerBound, upperBound): sig = signal.general_gaussian( (u-l) + expand*2, sig = (u-l)//2, p = 3) for run in range( (u-l) + expand*2 ): cac = run – expand + l if(cac >= 0 and cac < 131072): filterTable[cac] = sig[run] gaussGenerate() New filter
  • 31.
  • 32.
    Problems-3 #numpy.fft.ifft() def ifft(a, n=None,axis=-1, norm=None): a = asarray(a) if n is None: n = a.shape[axis] fct = 1/n if norm is not None and _unitary(norm): fct = 1/sqrt(n) output = _raw_fft(a, n, axis, False, False, fct) return output def ifft_T(x): x = np.array(x, dtype=complex) n = len(x)half = n//2 x = np.concatenate((x[half:],x[:half]),axis =None) #shifting output = fft_T(x)output = list(reversed(output)) output.insert(0, output[-1]) del output[-1] return [element / n for element in output] #Our ifft_T()
  • 33.
    Problems-3 #numpy.fft.ifft() def ifft(a, n=None,axis=-1, norm=None): a = asarray(a) if n is None: n = a.shape[axis] fct = 1/n if norm is not None and _unitary(norm): fct = 1/sqrt(n) output = _raw_fft(a, n, axis, False, False, fct) return output #Our ifft_T() >>> a = range(1<<5) >>> b = fft_T(b) >>> c = ifft_T(b) >>> c [0j (31.00000000649298+1.9984014443252818e-15j) (30.000000085901572+3.730689515421092e-15j) (29.000000170765194+5.10702591327572e-15j) … (16.99999994318653+1.9984014443252818e-15j) (16+0j) (15.000000056813466-6.661338147750939e-16j) … (2.9999998292348096-8.881784197001252e-16j) (1.9999999140984315-5.2850017498963115e-15j) (0.9999999935070285-7.327471962526033e-15j)] >>> def ifft_T(x): x = np.array(x, dtype=complex) n = len(x)half = n//2 x = np.concatenate((x[half:],x[:half]),axis =None) #shifting output = fft_T(x)output = list(reversed(output)) output.insert(0, output[-1]) del output[-1] return [element / n for element in output]
  • 34.
    Problems-3 def ifft_T(x): x =np.array(x, dtype=complex) n = len(x) half = n//2 x = np.concatenate((x[half:], x[:half]), axis = None) #shifting output = fft_T(x) output = list(reversed(output)) output.insert(0, output[-1]) del output[-1] return [element / n for element in output]
  • 35.
    Problems-3 def ifft_T(x): x =np.array(x, dtype=complex) n = len(x) half = n//2 x = np.concatenate((x[half:], x[:half]), axis = None) #shifting output = fft_T(x) output = list(reversed(output)) output.insert(0, output[-1]) del output[-1] return [element / n for element in output]
  • 36.
  • 37.

Editor's Notes

  • #4 目標:讓含有噪音的人聲檔案 經過我們的程式將雜音給消除 最後輸出一個純人聲的檔案
  • #6 1.需要一個可以解析音檔的程式(時間對頻率)(預計2周) 2-1.實作一個FFT(預計2周)    2-2學習ML(Tensor Flow API)(預計3周) 3.製作ML找出人聲的頻率位置(預計4周) 4.製作非人聲範圍濾鏡(預計1周) 5.製作可輸出音檔的程式(預計1周) 6.整合所有的模組並debug(預計2周)
  • #14 -32768~32767 轉成 0~65535,負的在上面
  • #16 作為替代ML的方案,我們決定出了兩種方式來取代,螢幕上的兩個函式是我們主題 : 雜訊消除的核心與構思,在左邊的函式是我們的頻率濾波函式,主要是用來將人聲頻段從原始檔案中過濾出來;右邊的相減函式是我們將原始檔案減去噪音的樣本以達到降低噪音的效果
  • #17 以下我們函式的概念圖,在濾波函式裡我們界定範圍在90~1300和6000~10000赫茲這兩個區間,請注意X軸單位並非赫茲,在濾波函式中我們將原始檔案乘上一個窗函數,
  • #18 期望得到一個濾波後的頻譜;
  • #19 再來我們將頻譜減去我們預設的噪音樣本,
  • #20 期望得到一個降噪後的頻譜
  • #21 在進入這個執行相減函式以前,我們讓使用者可以選擇使用兩種模式尋找最佳噪音樣本,一種是取絕對值,另一種是取次方
  • #22 舉"絕對值模式"為例,我們將每一個噪音頻率的值乘上原始檔案的頻率值總和除以當前噪音樣本的頻率值總和以達到增幅與還原當時環境的效果,當樣本噪音跟原始檔案相減後的絕對值越小,即代表越相近,我們則挑選此樣本用來與原始檔案進行相減
  • #23 因此,SuitableNoise會得到來自abs函式或者power函式回傳的最佳噪音和最佳增幅倍數
  • #24 以上是我們原先對於雜訊消除的想法,在多方請教過後,我們知道了其實有更好的方式來進行處理,例如我們直接在時域的將檔案和樣本進行相減,或是"增加採樣點數"以取到更細的內容來進行處理,接下來將說明我們在實現程式的過程中所遇到的問題
  • #25 左邊的圖,是我們原本自己寫的 FFT 右邊的圖,是 numpy 的 FFT 很明點,我們做錯了(接下頁)
  • #26 那是因為,我們一開始把 dtype 設成 int64 ,而我們de這個bug De了兩週 而用complex64就對了
  • #27 前面說我們的filter 是 90~1300、6000~10000 可是做出來長這樣不合理,因為這張圖中(去比出來), 2K~4.3K都還有雜訊
  • #28 我們推測是:我們在使用 filter 的方式, 就是乘以一個窗型函數,因為在 freq domain 乘以一個函數 根據 Convolution 特性(接下頁)
  • #29 在我們的這一行,就是乘以 filter 他就會造成如右邊,逆轉換出來會在比較其他地方震盪 的結果 所以他在其他區域(2k~4.3k)會產生雜音
  • #30 經過我們詢問柯正雯教授之後,教授有提供這個 高斯function 這個想法。(下一頁)
  • #31 因此我們這樣產生了一張這樣的 filter去做處理。
  • #32 如此產生這樣的結果。 我們來做一下比較
  • #33 這一個問題是出現在 ifft 上 基本上,我們的 function 跟 numpy 的code 差不多,numpy 的 axis 跟 norm 一般我們都不會去預設 所以結果應該會跟我們想法做出來的 ifft 相同(接下頁)
  • #34 但是,我們code 跑出來的結果卻是長這樣 看到右邊,他除了第零個的內容是正確的以外,其餘是倒敘
  • #35 因此我們這樣做
  • #36 It works, but I still don’t know why!!