Terabyte scale Sensor Network data analysis using MapReduce/ Hadoop


1-2. Convolution
• Amount of overlap between functions as they are translated
• In practice: a (sampled) function in the time domain, the signal s, is convolved with a short response function r (kernel)
(Figure labels: convolution, signal, kernel)
Background text visible on the slide (Numerical Recipes, around Figure 13.1.2, "Convolution of discretely sampled functions"):
Figure 13.1.2 note: the response function for negative times is wrapped around and stored at the extreme right end of the array $r_k$.
Continuous convolution: $(g * h)(t) \equiv \int_{-\infty}^{\infty} g(\tau)\, h(t - \tau)\, d\tau$
Convolution theorem: $g * h \Longleftrightarrow G(f)\, H(f)$, that is, the Fourier transform of the convolution is the product of the individual Fourier transforms.
Correlation: $\mathrm{Corr}(g, h) \equiv \int_{-\infty}^{\infty} g(\tau + t)\, h(\tau)\, d\tau$
Discrete convolution with a response function of finite duration $M$: $(r * s)_j \equiv \sum_{k=-M/2+1}^{M/2} s_{j-k}\, r_k$
Examples: a response function with $r_0 = 1$ and all other $r_k$ equal to zero is the identity filter. A response function with $r_{14} = 1.5$ and all other $r_k$ equal to zero produces the input signal multiplied by 1.5 and delayed by 14 sample intervals.
If a discrete response function is nonzero only in some range $-M/2 < k \le M/2$, where $M$ is a sufficiently large even integer, it is called a finite impulse response (FIR), and its duration is $M$.
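The discrete convolution sum $(r * s)_j = \sum_k s_{j-k}\, r_k$ can be written out directly. The following is a minimal sketch in plain Python, not code from the talk: the function name `convolve_direct` and the `k_min` parameter (the index of the first kernel entry) are hypothetical, and samples outside the signal are assumed to be zero rather than wrapped around.

```python
def convolve_direct(signal, kernel, k_min):
    """Direct discrete convolution: out[j] = sum_k signal[j - k] * r_k.

    kernel[i] holds r_k for k = k_min + i (k_min is an illustrative
    parameter, not part of the slides). Samples outside the signal
    are treated as zero, so nothing wraps around.
    """
    n = len(signal)
    out = [0.0] * n
    for j in range(n):
        acc = 0.0
        for i, r in enumerate(kernel):
            idx = j - (k_min + i)  # index of s_{j-k}
            if 0 <= idx < n:
                acc += signal[idx] * r
        out[j] = acc
    return out


signal = [1.0, 2.0, 3.0, 4.0]
identity = convolve_direct(signal, [1.0], 0)           # r_0 = 1
delayed = convolve_direct(signal, [0.0, 0.0, 1.5], 0)  # r_2 = 1.5
```

With `r_0 = 1` the output equals the input (identity filter). With `r_2 = 1.5` the output is the input scaled by 1.5 and delayed by two samples, the same pattern as the `r_14 = 1.5` example with a shorter delay.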
3-6. Convolution
• Width of kernel defines smoothing strength
• Quite fast, O(N*M) for N signal samples and M kernel samples, but not fast enough
(Figure labels: signal, kernel 1, kernel 2, convolution 1, convolution 2)
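To illustrate how kernel width controls smoothing strength, here is a small sketch that smooths an alternating 0/1 signal with box (moving-average) kernels of two widths. The box kernel and the helper names `box_kernel`, `smooth_valid`, and `spread` are assumptions for illustration; the slides do not say which kernels were used.

```python
def box_kernel(width):
    """Moving-average kernel: `width` equal weights that sum to 1."""
    return [1.0 / width] * width


def smooth_valid(signal, kernel):
    """Convolve, keeping only positions where the kernel fits entirely.

    Produces N - M + 1 outputs, each costing M multiply-adds,
    which is the O(N*M) cost of direct convolution.
    """
    m = len(kernel)
    return [
        sum(signal[j + i] * kernel[m - 1 - i] for i in range(m))
        for j in range(len(signal) - m + 1)
    ]


def spread(values):
    """Peak-to-peak range, used here as a crude roughness measure."""
    return max(values) - min(values)


signal = [float(i % 2) for i in range(20)]  # 0, 1, 0, 1, ...
narrow = smooth_valid(signal, box_kernel(3))
wide = smooth_valid(signal, box_kernel(5))
```

The raw signal has a spread of 1. The width-3 kernel reduces it to about 1/3 and the width-5 kernel to about 1/5, so the wider kernel smooths more strongly at a proportionally higher cost per output sample.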
7-17. MapReduce
(Diagram, built up step by step: five parallel Map tasks, a Shuffle phase, and five parallel Reduce tasks.)
• Map: build windows
• Shuffle
• Reduce: convolute
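The three phases in the diagram can be mimicked in memory to show the data flow. This is a plain-Python stand-in, not the Hadoop API: `run_mapreduce`, `build_window`, `process_window`, and the window length `WINDOW` are hypothetical names, and the reducer computes a per-window mean only as a placeholder for the convolution step.

```python
from collections import defaultdict

WINDOW = 4  # hypothetical window length, in samples


def run_mapreduce(records, mapper, reducer):
    """Run map, shuffle (group by key), and reduce in a single process."""
    groups = defaultdict(list)
    for record in records:               # map phase
        for key, value in mapper(record):
            groups[key].append(value)    # shuffle: group values by key
    return {key: reducer(key, groups[key]) for key in sorted(groups)}


def build_window(record):
    """Mapper: assign each (timestamp, value) sample to a window key."""
    timestamp, value = record
    yield timestamp // WINDOW, (timestamp, value)


def process_window(key, values):
    """Reducer placeholder: order the window by time, return its mean."""
    ordered = [v for _, v in sorted(values)]
    return sum(ordered) / len(ordered)


records = [(t, float(t)) for t in range(8)]
results = run_mapreduce(records, build_window, process_window)
```

Each reducer sees one complete window regardless of which mapper produced its samples, which is what allows the per-window work to run in parallel.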
18-19. Convolution in Hadoop
• Wrap-around problem
• Ignore spoiled regions
• Mirror the sequence (works well in our case)
• Zero-padding
Figure 13.1.3 (Numerical Recipes, "nr3", page 644, Chapter 13, Fourier and Spectral Applications). The wraparound problem in convolving finite segments of a function. Not only must the response function be viewed as cyclic, but so must the sampled original function. Therefore, a portion at each end of the original function is erroneously wrapped around by convolution with the response function.
(Figure labels: response function with extents m+ and m−; sample of original function; convolution with regions marked spoiled, unspoiled, spoiled.)
(Second figure, zero padding: original function followed by zero padding; convolution regions marked "not spoiled because zero", "unspoiled", and "spoiled but irrelevant".)
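Two of the listed edge treatments, zero-padding and mirroring, can be compared on a constant signal. This is a sketch under assumptions: the helper names are hypothetical, the kernel is a box kernel of odd width, and mirroring here reflects about the edge sample without repeating it, which is only one of several possible conventions.

```python
def pad_zero(signal, half):
    """Extend the signal with `half` zeros on each side."""
    return [0.0] * half + list(signal) + [0.0] * half


def pad_mirror(signal, half):
    """Extend by reflecting about each edge sample (edge not repeated).

    Assumes len(signal) > half.
    """
    left = list(signal[1:half + 1])[::-1]
    right = list(signal[-half - 1:-1])[::-1]
    return left + list(signal) + right


def smooth_same(signal, kernel, pad):
    """Pad, then convolve so the output has the same length as the input.

    Assumes an odd kernel length.
    """
    m = len(kernel)
    padded = pad(signal, m // 2)
    return [
        sum(padded[j + i] * kernel[m - 1 - i] for i in range(m))
        for j in range(len(signal))
    ]


box5 = [1.0 / 5] * 5
flat = [2.0] * 10
with_zeros = smooth_same(flat, box5, pad_zero)
with_mirror = smooth_same(flat, box5, pad_mirror)
```

With zero-padding the first output drops to about 1.2 because two of the five kernel taps see padding. With mirroring every output stays at about 2.0, which is consistent with the slide's remark that mirroring works well, at least for signals that vary slowly near the edges.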
20-24. Convolution in Hadoop
• Data split problem: windowing
• 'Overlap-convolute'
(Diagram: in the Map(window) step, Mapper1 emits to windows 1 and 2, Mapper2 to windows 1, 2 and 3, and Mapper3 to windows 2 and 3. Windows are keyed timestamp1, timestamp2, timestamp3. In the Reduce(convolute) step, each window is convolved and only unpolluted data is emitted.)
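One way to read the 'overlap-convolute' diagram is sketched below in plain Python. It is an interpretation, not the authors' Hadoop job: each mapper copies the samples near its block boundaries into the neighbouring windows, and each reducer convolves its window but emits only outputs whose full kernel support is present. All names are hypothetical, and the sketch assumes an odd kernel length and a block size of at least half the kernel length.

```python
from collections import defaultdict


def map_window(block_id, start, samples, half, n_blocks):
    """Mapper: emit (window_id, (global_index, value)) pairs.

    Samples within `half` of a block boundary are also sent to the
    neighbouring window, so windows overlap.
    """
    for offset, value in enumerate(samples):
        i = start + offset
        yield block_id, (i, value)
        if offset < half and block_id > 0:
            yield block_id - 1, (i, value)
        if offset >= len(samples) - half and block_id < n_blocks - 1:
            yield block_id + 1, (i, value)


def reduce_convolve(window_id, items, kernel, block_size):
    """Reducer: convolve one window, emit only unpolluted outputs."""
    half = len(kernel) // 2
    values = dict(items)  # global index -> sample value
    core_start = window_id * block_size
    for j in range(core_start, core_start + block_size):
        if all(i in values for i in range(j - half, j + half + 1)):
            # kernel[half + k] holds r_k for k = -half .. half
            yield j, sum(values[j - k] * kernel[half + k]
                         for k in range(-half, half + 1))


def overlap_convolve(signal, kernel, block_size):
    """Drive map, shuffle, and reduce in memory."""
    half = len(kernel) // 2
    n_blocks = (len(signal) + block_size - 1) // block_size
    groups = defaultdict(list)  # shuffle: group by window id
    for b in range(n_blocks):
        start = b * block_size
        block = signal[start:start + block_size]
        for key, item in map_window(b, start, block, half, n_blocks):
            groups[key].append(item)
    out = {}
    for key in sorted(groups):
        out.update(reduce_convolve(key, groups[key], kernel, block_size))
    return [out[j] for j in sorted(out)]


signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0, 8.0]
kernel = [0.5, 0.3, 0.2]
result = overlap_convolve(signal, kernel, block_size=4)
```

With three blocks of four samples, the mappers send data to windows (0, 1), (0, 1, 2) and (1, 2), mirroring the 1 2 / 1 2 3 / 2 3 pattern on the slide. The result matches a single-pass convolution of the whole signal, except for the first and last sample, which are dropped because their kernel support runs off the ends of the data.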