RHadoop, R meets Hadoop

by on Mar 06, 2012

  • 59,594 views

(Presented by Antonio Piccolboni at the Strata 2012 Conference, Feb 29 2012.)

RHadoop is an open source project spearheaded by Revolution Analytics to give data scientists access to Hadoop’s scalability from their favorite language, R. RHadoop consists of three packages:

- rhdfs provides file-level manipulation for HDFS, the Hadoop file system
- rhbase provides access to HBase, the Hadoop database
- rmr lets you write mapreduce programs in R

rmr lets R developers program in the mapreduce framework and gives all developers an alternative way to implement mapreduce programs that strikes a delicate compromise between power and usability. It lets you write general mapreduce programs, offering the full power and ecosystem of an existing, established programming language. It doesn’t force you to replace the R interpreter with a special run-time; it is just a library. You can write logistic regression in half a page and even understand it. It feels and behaves almost like the usual R iteration and aggregation primitives. It consists of a handful of functions with a modest number of arguments and sensible defaults that combine in many useful ways. But there is no way to prove that an API works: one can only show examples of what it lets you do, and we will do that, covering a few from machine learning and statistics. Finally, we will discuss how to get involved.
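To make the "iteration and aggregation" idea concrete, here is a minimal in-memory sketch of the mapreduce programming model. It is written in Python rather than R, it is not rmr and does not use rmr's API, and it runs on a single machine with no Hadoop involved. The names `map_reduce`, `parity_mapper` and `sum_reducer` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Toy single-process mapreduce: map, group by key, then reduce.

    mapper(record) yields (key, value) pairs.
    reducer(key, values) returns one aggregated result per key.
    """
    groups = defaultdict(list)
    # Map phase: every input record may emit any number of key/value pairs.
    for record in records:
        for key, value in mapper(record):
            # "Shuffle" phase: collect values that share a key.
            groups[key].append(value)
    # Reduce phase: aggregate the values of each key independently.
    return {key: reducer(key, values) for key, values in groups.items()}


def parity_mapper(n):
    # Emit the square of n, keyed by whether n is odd or even.
    yield ("even" if n % 2 == 0 else "odd", n * n)


def sum_reducer(key, values):
    # Aggregate all values for one key into a single number.
    return sum(values)


result = map_reduce(range(1, 11), parity_mapper, sum_reducer)
print(result)
```

The mapper plays the role of an element-wise iteration and the reducer the role of a per-group aggregation. A real mapreduce system distributes both phases across a cluster, which this sketch does not attempt.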


Upload Details

Uploaded via SlideShare as Apple Keynote

Usage Rights

© All Rights Reserved


  • halueda Haruyasu Ueda, 主任研究員/Senior researcher at Fujitsu: Hmmm. I'm not sure of the difference between RHadoop and RHive. RHive has mapreduce functionality even though its name is Hive. It also has an HDFS adapter. 1 year ago
  • sildershare2010 学峰 司: RHadoop step by step 1 year ago

RHadoop, R meets Hadoop Presentation Transcript