How to turbocharge your data transfers with WebHDFS
Andy Done, Data Platform Lead
andy.done@king.com
Last time…
Since then…
Hadoop: 100, 40
Storage: 1, 0.5
Events: 15, 10
ExaSol: 10, 4
Load times: 2.5, 6
Problem
WebHDFS
Old way
hadoop fs -cat /some/path/* | bulk_load my_table
WebHDFS way
IMPORT INTO TABLE my_table FROM
FILE 'http://namenode/webhdfs/v1/some/path/file_1'
FILE 'http://namenode/webhdfs/v1/some/path/file_2'
…
FILE 'http://namenode/webhdfs/v1/some/path/file_n'
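The per-file URL list above could be generated rather than written by hand. A minimal sketch in Python, assuming the directory listing comes from the WebHDFS LISTSTATUS operation and reusing the simplified IMPORT form from the slide. The function name build_import and the sample listing are hypothetical, and a real statement would likely need further clauses that the slide omits (file format, the ?op=OPEN query on each URL).

```python
import json


def build_import(table, namenode, directory, liststatus_json):
    """Build the slide's simplified IMPORT statement from a LISTSTATUS reply."""
    statuses = json.loads(liststatus_json)["FileStatuses"]["FileStatus"]
    # Keep regular files only; sub-directories are reported as "DIRECTORY".
    names = sorted(s["pathSuffix"] for s in statuses if s["type"] == "FILE")
    base = f"http://{namenode}/webhdfs/v1{directory.rstrip('/')}"
    lines = [f"IMPORT INTO TABLE {table} FROM"]
    lines += [f"FILE '{base}/{name}'" for name in names]
    return "\n".join(lines)


# Hypothetical listing, shaped like a WebHDFS LISTSTATUS response.
sample = json.dumps({"FileStatuses": {"FileStatus": [
    {"pathSuffix": "file_1", "type": "FILE"},
    {"pathSuffix": "file_2", "type": "FILE"},
    {"pathSuffix": "tmp", "type": "DIRECTORY"},
]}})

print(build_import("my_table", "namenode", "/some/path", sample))
```

Generating the statement keeps the file list in step with whatever is actually in the directory at load time.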
WebHDFS benefits
•  Simple
•  Efficient
•  Ubiquitous
•  Parallelisable
•  Bidirectional
•  Fast
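To illustrate the "Parallelisable" point: because every file is an ordinary HTTP resource, any client can read several at once. A rough sketch in Python using only the standard library, assuming the WebHDFS OPEN operation for reads; the helper names open_url, fetch and fetch_all and the worker count are hypothetical, and the host name is the placeholder from the slides.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def open_url(namenode, path):
    # op=OPEN is the WebHDFS read operation; the namenode answers with a
    # redirect to a datanode that holds the data.
    return f"http://{namenode}/webhdfs/v1{path}?op=OPEN"


def fetch(url):
    # urlopen follows the redirect and returns the file contents as bytes.
    with urlopen(url) as resp:
        return resp.read()


def fetch_all(namenode, paths, fetch_fn=fetch, workers=4):
    """Read several HDFS files concurrently, one HTTP request per file."""
    urls = [open_url(namenode, p) for p in paths]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input paths.
        return list(pool.map(fetch_fn, urls))
```

Calling fetch_all("namenode", ["/some/path/file_1", "/some/path/file_2"]) would issue both reads at the same time; fetch_fn is injectable so the function can be exercised without a cluster.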
Conclusion
Thank you
We're hiring!

WebHDFS at King - May 2014 Hadoop MeetUp