11.
Language Designers?
Statisticians?
The best thing about R is that it was developed by statisticians. The worst thing about R is that... it was developed by statisticians.
- Bo Cowgill, Google
12.
Why R, then?
Open Source
De-facto standard (in statistical research)
“It’s a DSL posing as general purpose language”
Interactive data exploration
13.
Why not R, then?
Slow
Memory Bound
(Did I mention it’s a quirky language?)
Try googling for R...
14.
Why not R, then?
‘If you are using R and you think you’re in hell, this is a map for you.’
- The R Inferno
Slow
Memory Bound
(Did I mention it’s a quirky language?)
Try googling for R...
27.
Titanic Competition:
Machine Learning from Disaster
28.
Titanic Competition:
Machine Learning from Disaster
29.
Titanic Competition:
Machine Learning from Disaster
Decision Tree
[Figure: a decision tree with splits on Sex == Female, Age > 50, Age > 16 and Fare > 100; leaves labeled T or F]
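As a rough illustration of the decision-tree idea on this slide, the sketch below hard-codes a tiny tree in Java. The four split conditions (Sex, Age > 50, Age > 16, Fare > 100) come from the slide; how they nest and which leaf predicts survival cannot be recovered from the figure, so both are assumptions made up for illustration, as is the `Passenger` class.

```java
// Hypothetical passenger type; the field names are illustrative, not from the slides.
final class Passenger {
    final boolean female;
    final double age;
    final double fare;

    Passenger(boolean female, double age, double fare) {
        this.female = female;
        this.age = age;
        this.fare = fare;
    }
}

final class TitanicTree {
    // Walks a hand-written tree. The nesting and the leaf labels are assumed.
    static boolean predictSurvived(Passenger p) {
        if (p.female) {
            // assumed branch: women over 50 are predicted not to survive
            return !(p.age > 50);
        }
        if (p.age > 16) {
            // assumed branch: adult men are predicted to survive only with an expensive ticket
            return p.fare > 100;
        }
        // assumed leaf: boys aged 16 or younger are predicted to survive
        return true;
    }
}
```

A learned tree would choose its splits and leaf labels from the training data; here they are fixed by hand only to show how a prediction walks from the root to a leaf.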
30.
Titanic Competition:
Machine Learning from Disaster
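Decision Tree
[Figure: a decision tree with splits on Sex == Female, Age > 50, Age > 16 and Fare > 100; leaves labeled T or F]
Random Forest
[Figure: several decision trees side by side, each with leaves labeled T or F]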
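The step from one tree to a forest can be sketched as a majority vote over several trees. The class below is a minimal illustration of that voting step only, assuming each tree is already available as a `Predicate`; the name `MajorityVoteForest` is hypothetical, and the training side of a random forest (bootstrap samples, random feature subsets) is left out.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper: combines already-built trees by majority vote.
final class MajorityVoteForest<T> {
    private final List<Predicate<T>> trees;

    MajorityVoteForest(List<Predicate<T>> trees) {
        this.trees = trees;
    }

    // Returns true when strictly more than half of the trees vote true.
    boolean predict(T sample) {
        int yes = 0;
        for (Predicate<T> tree : trees) {
            if (tree.test(sample)) {
                yes++;
            }
        }
        return 2 * yes > trees.size();
    }
}
```

Using an odd number of trees avoids ties; with an even number, this sketch treats a tie as a false prediction.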
33.
Agenda
Data Science
The R language
Gimme some Java!
34.
Bridging R and Java
Integrate
Assimilate
Replace
35.
rJava & Java/R interface
Integrate
Two way native interface
- JNI: libjri
- or TCP to Rserve
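import org.rosuda.JRI.REXP;
import org.rosuda.JRI.RVector;
import org.rosuda.JRI.Rengine;

// start an embedded R engine without running R's main loop
Rengine re = new Rengine(new String[] {}, false, null);
// wait until engine is ready
if (!re.waitForR()) {
    throw new IllegalStateException("Can't load R engine");
}
// load the built-in cars data set without converting the result
re.eval("data(cars)", false);
REXP cars = re.eval("cars");
RVector carsVector = cars.asVector();
// dissect carsVector...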
36.
Assimilate
Reimplementation of R on JVM
Fast & lean
Parallelized
Just-another-lib
... not production ready yet...
37.
Assimilate
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

// create a script engine manager
ScriptEngineManager factory = new ScriptEngineManager();
// create an R engine
ScriptEngine engine = factory.getEngineByName("Renjin");
// load package from classpath
engine.eval("library(survey)");
// evaluate R code from String
engine.eval("print('Hello from R')");
Reimplementation of R on JVM
Fast & lean
Parallelized
Just-another-lib
... not production ready yet...
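One practical detail when going through `javax.script`: `getEngineByName` returns `null` when no engine with that name is registered, for example when the Renjin jar is missing from the classpath. A minimal sketch of a guarded lookup follows; the class name `RenjinLookup` and the error message are illustrative assumptions, not part of Renjin.

```java
import java.util.Optional;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical helper around the standard javax.script lookup.
final class RenjinLookup {
    // Looks up a script engine by name; empty when none is registered.
    static Optional<ScriptEngine> find(String name) {
        return Optional.ofNullable(new ScriptEngineManager().getEngineByName(name));
    }

    public static void main(String[] args) throws ScriptException {
        Optional<ScriptEngine> engine = find("Renjin");
        if (engine.isPresent()) {
            // evaluate R code from a String, as on the slide
            engine.get().eval("print('Hello from R')");
        } else {
            System.err.println("Renjin not found on the classpath");
        }
    }
}
```

Failing early with a clear message is usually easier to diagnose than a `NullPointerException` on the first `eval` call.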
38.
Assimilate
Reimplementation of R on JVM
Share data:
Integer[] data = {1, 2, 3};
engine.put("data", data);
engine.eval("print(sum(data))");
39.
Assimilate
Reimplementation of R on JVM
Share data:
Integer[] data = {1, 2, 3};
engine.put("data", data);
engine.eval("print(sum(data))");
Use Java from Renjin:
import(com.foo.User)
# instantiate Java beans
tim <- User$new(name='Tim', age=23)
tom <- User$new(name='Tom', age=45)
# invoke setter
tim$name <- "Timmy"