Ruby for soul of BigData Nerds
Published in: Technology, Business
Transcript

  • 1. Ruby for the soul of BigData Nerds
  • 2. Who Am I?
      ● Engineering Team Lead, Analytics & Data Platforms @ Viki.com
      ● Founder of http://BigData.SG
      ● Contributor to fluentd, pfeed, cartographer, watir
  • 3. BigData & Its Challenges
      "big data" is when the size of the data itself becomes part of the problem - Mike Loukides
      ● Twitter produces over 230 million tweets per day
      ● Wal-Mart is logging one million transactions per hour
      ● Facebook creates over 30 billion pieces of content ranging from web links, news, blogs, photo
  • 4. Everyone has a big data problem
  • 5. Evolving Trends
      ● Batch Processing: Hadoop, HPCC, Google BigQuery
      ● Stream Processing: STORM (Twitter) & S4 (Yahoo)
  • 6. Common Engineering Challenges
      ● Data Collection
      ● Filtering / Segmentation
      ● Data Storage
      ● Analysis
      ● Visualization
      ● Prediction / Extrapolation
  • 7. Data Collection + Filtering / Segmentation
      http://fluentd.org/
  • 8. Data Collection + Filtering / Segmentation
      You send events as: http://domain:8080/namespace?key1=value1&key2=value2
      Fluent forwards the data as: <timestamp> <namespace> {key1:value1,key2:value2}
      http://fluentd.org/
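The mapping on slide 8, from an event URL to a timestamped, namespaced record, can be sketched in plain Ruby. This is only an illustration of the transformation, not fluentd's own code: `event_from_url` and `format_event` are hypothetical helper names, and rendering the record as JSON with a Unix timestamp is an assumption.

```ruby
require "uri"
require "json"

# Hypothetical helper (not part of fluentd): split an event URL into the
# [timestamp, namespace, record] triple described on the slide.
def event_from_url(url, now: Time.now)
  uri = URI.parse(url)
  namespace = uri.path.sub(%r{\A/}, "")                   # "/namespace" -> "namespace"
  record = URI.decode_www_form(uri.query.to_s).to_h       # query string -> Hash
  [now.to_i, namespace, record]
end

# Hypothetical helper: render the triple as one line of text.
# Using JSON for the record and epoch seconds for the time is an assumption.
def format_event(event)
  timestamp, namespace, record = event
  "#{timestamp} #{namespace} #{JSON.generate(record)}"
end
```

Calling `format_event(event_from_url("http://domain:8080/namespace?key1=value1&key2=value2"))` yields a single line carrying the current time, the namespace, and both key/value pairs.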
  • 9. Screencast: http://www.bigdata.sg/videos/fluentd/
  • 10. Storage
      ● Hadoop HDFS
      ● OpenTSDB (http://opentsdb.net)
      ● SciDB (DMAS)
  • 11. Analysis
      ● Hadoop Streaming (Ruby)
      ● Hadoop Hive (using rbhive)
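Hadoop Streaming runs any executable as a mapper or reducer: each reads lines on standard input and writes tab-separated key/value lines on standard output, and the reducer receives its input sorted by key. A minimal Ruby sketch of that contract follows. The input layout (`<timestamp>`, `<namespace>`, `<record>` separated by tabs) and the method names `map_events` and `reduce_counts` are assumptions made for illustration, not something taken from the talk.

```ruby
# Mapper: emit "<namespace>\t1" for every event line.
# Assumed input line layout: "<timestamp>\t<namespace>\t<record>".
def map_events(input, output)
  input.each_line do |line|
    _timestamp, namespace, _record = line.chomp.split("\t", 3)
    next if namespace.nil? || namespace.empty?   # skip malformed lines
    output.puts "#{namespace}\t1"
  end
end

# Reducer: input arrives sorted by key, so all lines for one key are
# adjacent; sum the counts and emit one line per key.
def reduce_counts(input, output)
  current_key = nil
  total = 0
  input.each_line do |line|
    key, value = line.chomp.split("\t", 2)
    if key != current_key
      output.puts "#{current_key}\t#{total}" unless current_key.nil?
      current_key = key
      total = 0
    end
    total += value.to_i
  end
  output.puts "#{current_key}\t#{total}" unless current_key.nil?
end
```

In a real job each method would live in its own script and be called with `$stdin` and `$stdout`; locally the same pipeline can be approximated by piping the mapper output through `sort` into the reducer.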
  • 12. Visualization
      ● Custom Dashboard (Rails + Google Charts / d3.js)
      ● Some Hosted Services: tableaupublic.com, geckoboard.com, splunkstorm.com
  • 13. Stream Computing
  • 14. What is STORM?
  • 15. STORM terminology
      ● Streams
      ● Spouts
      ● Bolts
      ● Topologies
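The four terms can be made concrete with a toy, single-process model in plain Ruby: a stream is a sequence of tuples, a spout is a source of a stream, a bolt consumes tuples and may emit new ones, and a topology wires spouts and bolts together. The sketch below does not use the Storm or RedStorm API. Every class name and the `[country, kbps]` tuple shape are hypothetical, and a real topology runs distributed across a cluster rather than in one loop.

```ruby
# Spout: the source of a stream. Here it replays a fixed array of tuples.
class ArraySpout
  def initialize(tuples)
    @tuples = tuples
  end

  def each_tuple(&block)
    @tuples.each(&block)
  end
end

# Bolt: consumes one tuple at a time and emits a new tuple.
# This one keeps a running average per key (assumed tuple: [country, kbps]).
class RunningAverageBolt
  def initialize
    @sums = Hash.new(0.0)
    @counts = Hash.new(0)
  end

  def execute(tuple)
    country, kbps = tuple
    @sums[country] += kbps
    @counts[country] += 1
    [country, @sums[country] / @counts[country]]
  end
end

# Topology: wires a spout to a chain of bolts and pushes every tuple through.
class ToyTopology
  def initialize(spout, bolts)
    @spout = spout
    @bolts = bolts
  end

  # Returns the tuples emitted by the last bolt, in arrival order.
  def run
    emitted = []
    @spout.each_tuple do |tuple|
      emitted << @bolts.reduce(tuple) { |t, bolt| bolt.execute(t) }
    end
    emitted
  end
end
```

Running `ToyTopology.new(ArraySpout.new([["SG", 1000], ["SG", 3000]]), [RunningAverageBolt.new]).run` returns one output tuple per input tuple, each carrying the average seen so far for that country.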
  • 16. RedStorm (https://github.com/colinsurprenant/redstorm)
      $ rvm use jruby-1.6.3
      $ bundle install redstorm
      $ bundle exec redstorm install
  • 17. Visualizing average bandwidth experienced by users while watching videos on viki.com across the globe.
  • 18. Thank you! Let's stay in touch :)
      ● Sign up for my newsletter at http://parolkar.com
      ● Visit the BigData.SG Meetup in Singapore.
