Ruby for the soul of BigData Nerds

Published in: Technology, Business

  1. Ruby for the soul of BigData Nerds
  2. Who Am I?
     ● Engineering Team Lead, Analytics & Data Platforms @ Viki.com
     ● Founder of http://BigData.SG
     ● Contributor to fluentd, pfeed, cartographer, watir
  3. BigData & Its Challenges
     "big data" is when the size of the data itself becomes part of the problem - Mike Loukides
     ● Twitter produces over 230 million tweets per day
     ● Wal-Mart is logging one million transactions per hour
     ● Facebook creates over 30 billion pieces of content, ranging from web links, news, blogs, photos
  4. Everyone has a big data problem
  5. Evolving Trends
     Batch Processing: Hadoop, HPCC, Google BigQuery
     Stream Processing: STORM (Twitter) & S4 (Yahoo)
  6. Common Engineering Challenges
     ● Data Collection
     ● Filtering / Segmentation
     ● Data Storage
     ● Analysis
     ● Visualization
     ● Prediction / Extrapolation
  7. Data Collection + Filtering / Segmentation
     http://fluentd.org/
  8. Data Collection + Filtering / Segmentation
     You send events as:
       http://domain:8080/namespace?key1=value1&key2=value2
     Fluent forwards the data as:
       <timestamp> <namespace> {key1:value1,key2:value2}
     http://fluentd.org/
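
The mapping on slide 8, from a request URL to a timestamped, namespaced record, can be sketched in plain Ruby. This is only an illustration of the shape of the data, not Fluentd's implementation; the helper names `event_url` and `forwarded_line` are made up for this sketch, and the host and port are the placeholders from the slide.

```ruby
require "uri"
require "json"

# Build an event URL in the shape shown on the slide.
# host, port and namespace are placeholders, not real endpoints.
def event_url(host, port, namespace, params)
  query = URI.encode_www_form(params)
  "http://#{host}:#{port}/#{namespace}?#{query}"
end

# Mimic the forwarded form: <timestamp> <namespace> {record}.
# Hypothetical helper: it only shows how the URL path becomes the
# namespace and the query string becomes the key/value record.
def forwarded_line(url, time = Time.now)
  uri = URI.parse(url)
  namespace = uri.path.sub(%r{\A/}, "")
  record = URI.decode_www_form(uri.query.to_s).to_h
  "#{time.to_i} #{namespace} #{JSON.generate(record)}"
end
```

Calling `forwarded_line(event_url("domain", 8080, "namespace", { "key1" => "value1" }))` yields one line with the current Unix time, the namespace, and the record serialized as JSON.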
  9. Screencast: http://www.bigdata.sg/videos/fluentd/
  10. Storage
      Hadoop HDFS
      OpenTSDB (http://opentsdb.net)
      SciDB (DMAS)
  11. Analysis
      Hadoop Streaming (Ruby)
      Hadoop Hive (Using rbhive)
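
Hadoop Streaming runs any executable as a mapper or reducer, passing records on standard input and reading tab-separated key/value lines from standard output. A minimal Ruby sketch of that contract is below. The input layout (tab-separated lines whose first field is the key to count) and the function names `map_lines` and `reduce_lines` are assumptions for illustration, not taken from the slides.

```ruby
# Mapper step: emit "key<TAB>1" for every non-empty input record,
# keyed by the first tab-separated field (assumed input layout).
def map_lines(input)
  input.each_line.map do |line|
    key = line.split("\t").first.to_s.strip
    key.empty? ? nil : "#{key}\t1"
  end.compact
end

# Reducer step: sum the counts per key. Hadoop delivers mapper output
# sorted by key; this sketch uses a hash, so it does not depend on that
# ordering (at the cost of holding all keys in memory).
def reduce_lines(input)
  totals = Hash.new(0)
  input.each_line do |line|
    key, count = line.chomp.split("\t", 2)
    next if key.nil? || key.empty?
    totals[key] += count.to_i
  end
  totals.map { |key, total| "#{key}\t#{total}" }
end
```

In a real job each function would live in its own script that reads `STDIN` and prints the result with `puts`, and the two scripts would be passed to the streaming jar as the mapper and the reducer.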
  12. Visualization
      Custom Dashboard (Rails + Google Charts / d3.js)
      Some Hosted Services: tableaupublic.com, geckoboard.com, splunkstorm.com
  13. Stream Computing
  14. What is STORM?
  15. STORM terminology
      ● Streams
      ● Spouts
      ● Bolts
      ● Topologies
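
The four terms on slide 15 fit together roughly as follows: a stream is a sequence of tuples, a spout is a source that emits them, a bolt consumes tuples and may emit new ones, and a topology wires spouts and bolts together. The toy model below shows that relationship in plain, single-process Ruby. It is not the Storm or RedStorm API; every class name here is invented for the sketch, and a real topology is a distributed graph rather than a linear chain.

```ruby
# Spout: a source of tuples. This one simply replays an array.
class ArraySpout
  def initialize(tuples)
    @tuples = tuples
  end

  def each_tuple(&block)
    @tuples.each(&block)
  end
end

# Bolt: takes one tuple and returns the tuples it emits downstream.
class SplitBolt
  def process(tuple)
    tuple.split
  end
end

# Terminal bolt: keeps running counts and emits nothing further.
class CountBolt
  attr_reader :counts

  def initialize
    @counts = Hash.new(0)
  end

  def process(tuple)
    @counts[tuple] += 1
    []
  end
end

# Topology: here just one spout feeding a linear chain of bolts.
class ToyTopology
  def initialize(spout, bolts)
    @spout = spout
    @bolts = bolts
  end

  def run
    @spout.each_tuple do |tuple|
      stream = [tuple]
      @bolts.each do |bolt|
        stream = stream.flat_map { |t| bolt.process(t) }
      end
    end
  end
end
```

Running `ToyTopology.new(ArraySpout.new(["a b", "b c"]), [SplitBolt.new, counter]).run` with `counter = CountBolt.new` leaves the per-word counts in `counter.counts`.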
  16. RedStorm (https://github.com/colinsurprenant/redstorm)
      $ rvm use jruby-1.6.3
      $ bundle install redstorm
      $ bundle exec redstorm install
  17. Visualizing average bandwidth experienced by users while watching videos on viki.com across the globe.
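
The aggregation behind a visualization like the one on slide 17 can be sketched as a per-country average. The slides do not show the actual data model, so the sample shape (hashes with `:country` and `:kbps` keys) and the function name are assumptions made for this example.

```ruby
# Average the measured bandwidth per country.
# samples: array of hashes such as { country: "SG", kbps: 1200 }
# (hypothetical field names, not taken from the slides).
def average_bandwidth_by_country(samples)
  samples.group_by { |s| s[:country] }.transform_values do |group|
    group.sum { |s| s[:kbps] }.to_f / group.size
  end
end
```

The resulting hash of country to average kbps is the kind of structure a dashboard could hand to a charting library to shade a world map.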
  18. Thank you! Let's stay in touch :)
      ● Sign up for my newsletter at http://parolkar.com
      ● Visit the BigData.SG Meetup in Singapore.
