- 1. Data Science With R ~ for Java Developers ~ @Sander_Mak
- 2. Agenda: Data Science; The R language; Gimme some Java!
- 3. 90% of the world’s data was produced in the last 2 years (SINTEF/ScienceDaily, June 2013). We need more than just CRUD!
- 4. Stand back. I know Data Science!
- 5. Data Science Venn diagram. Circles: Hacking Skills; Math & Statistics; Domain Expertise. Overlaps: Machine Learning; Operations Research; Danger! Perl ahead!; Data Science (center).
- 6. Data Science Venn diagram. Circles: Hacking Skills; Math & Statistics; Domain Expertise. Overlaps: Machine Learning; Operations Research; Danger! Perl ahead!; Data Science (center).
- 7. Data Science: Achievement Unlocked
- 8. Data Science: Achievement Unlocked. Today: R, R-Studio
- 9. Agenda: Data Science; The R language; Gimme some Java!
- 10. Language Designers? Statisticians?
- 11. Language Designers? Statisticians? “The best thing about R is that it was developed by statisticians. The worst thing about R is that... it was developed by statisticians.” (Bo Cowgill, Google)
- 12. Why R, then? De-facto standard (in statistical research); Open Source; Interactive data exploration. “It’s a DSL posing as general purpose language”
- 13. Why not R, then? Slow; Memory Bound; Try googling for R... (Did I mention it’s a quirky language?)
- 14. Why not R, then? Slow; Memory Bound; Try googling for R... (Did I mention it’s a quirky language?) ‘If you are using R and you think you’re in hell, this is a map for you.’ (The R Inferno)
- 15. Apparently, statisticians aren’t designers, either...
- 16. VS
- 17. R: Functional/OO/Procedural; Dynamic (eval); Interpreted. Java: OO; Static types; Compiled.
- 18. R: numeric, character, Factor. Java: Integer/Double/..., String, Enum.
- 19. R: vector, list, dataframe; numeric, character, Factor. Java: Integer/Double/..., String, Enum.
- 20. R: 1-based indexing (1 2 3 4). Java: 0-based indexing (0 1 2 3).
- 21. R: higher-order functions, e.g. sapply(vec, function(elm) { elm + 1; }); 1-based indexing (1 2 3 4). Java: for-loops; 0-based indexing (0 1 2 3).
- 22. Studio
- 23. Studio Comprehensive R Archive Network Central
- 24. Coding time!
- 25. Titanic Competition: Machine Learning from Disaster
- 26. Titanic Competition: Machine Learning from Disaster Survived?
- 27. Titanic Competition: Machine Learning from Disaster. Decision Tree (figure): splits on Sex == Female, Age > 16, Age > 50, Fare > 100; leaves labeled T or F.
- 28. Titanic Competition: Machine Learning from Disaster. Decision Tree vs. Random Forest (figure): the single tree splitting on Sex == Female, Age > 16, Age > 50, Fare > 100, next to a collection of many such trees with T/F leaves.
- 29. Demo time!
- 30. . . . . . .
- 31. Agenda: Data Science; The R language; Gimme some Java!
- 32. Bridging R and Java: Integrate; Assimilate; Replace
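- 33. Integrate: rJava & Java/R Interface (JRI). Two-way native interface via JNI (libjri), or TCP to Rserve.

  ```java
  import org.rosuda.JRI.REXP;
  import org.rosuda.JRI.RVector;
  import org.rosuda.JRI.Rengine;

  // start an embedded R engine: no R arguments, no main loop, no callbacks
  Rengine re = new Rengine(new String[] {}, false, null);
  // wait until engine is ready
  if (!re.waitForR()) {
      throw new IllegalStateException("Can't load R engine");
  }
  // load the built-in cars dataset, skipping conversion of the result
  re.eval("data(cars)", false);
  REXP cars = re.eval("cars");
  RVector carsVector = cars.asVector();
  // dissect carsVector...
  ```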
- 34. Assimilate: Reimplementation of R on JVM. Fast & lean; Parallelized; Just-another-lib; ... not production ready yet...
- 35. Assimilate: Reimplementation of R on JVM. Fast & lean; Parallelized; Just-another-lib; ... not production ready yet...

  ```java
  import javax.script.ScriptEngine;
  import javax.script.ScriptEngineManager;

  // create a script engine manager
  ScriptEngineManager factory = new ScriptEngineManager();
  // create an R engine
  ScriptEngine engine = factory.getEngineByName("Renjin");
  // load package from classpath
  engine.eval("library(survey)");
  // evaluate R code from String
  engine.eval("print('Hello from R')");
  ```
- 36. Big Data?
- 37. Replace: JVM Libraries/platforms
- 38. Replace: Scalable R distributions (non-JVM): Revolution Analytics; Oracle R Enterprise
- 39. Wrap-up: Data Science; The R language; Gimme some Java!
- 40. Sanitize Explore Model Predict Scale
- 41. Next steps: Install R; Read; Computing for Data Analysis starts Jan. 6th 2014
- 42. Questions? Data Science; The R language; Gimme some Java! @Sander_Mak, branchandbound.net
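
Slides 20 and 21 contrast R's higher-order functions and 1-based indexing with Java's for-loops and 0-based indexing. As a rough Java-side illustration of the slide's sapply(vec, function(elm) { elm + 1; }) call, the sketch below does the same increment twice: once with an explicit loop and once with a stream. The class and method names (SapplySketch, incrementWithLoop, incrementWithStream) are made up for this example and do not come from the slides.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class SapplySketch {
    // Explicit for-loop: the style slide 21 associates with Java.
    static int[] incrementWithLoop(int[] vec) {
        int[] out = new int[vec.length];
        for (int i = 0; i < vec.length; i++) { // Java arrays start at index 0
            out[i] = vec[i] + 1;
        }
        return out;
    }

    // Stream with a lambda: closer in spirit to R's sapply with an anonymous function.
    static int[] incrementWithStream(int[] vec) {
        return IntStream.of(vec).map(elm -> elm + 1).toArray();
    }

    public static void main(String[] args) {
        int[] vec = {1, 2, 3, 4};
        System.out.println(Arrays.toString(incrementWithLoop(vec)));   // [2, 3, 4, 5]
        System.out.println(Arrays.toString(incrementWithStream(vec))); // [2, 3, 4, 5]
        // First element: vec[0] in Java, whereas R would write vec[1].
        System.out.println(vec[0]); // 1
    }
}
```

Both methods return a new array and leave the input untouched, which mirrors how sapply returns a new vector in R.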

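Slides 27 and 28 show a decision tree and a random forest for the Titanic "Survived?" prediction. The sketch below illustrates only the voting idea: several small trees each predict survival and the forest takes the majority. The split conditions (sex, age 16, age 50, fare 100) are taken from the slides, but how they are arranged into trees is an assumption, since the figure did not survive extraction. The Passenger class and all method names are hypothetical. A real random forest learns its trees from bootstrap samples of the training data rather than using hand-written rules.

```java
import java.util.List;
import java.util.function.Predicate;

class TitanicForestSketch {
    // Hypothetical passenger type; the field names are illustrative only.
    static final class Passenger {
        final String sex;
        final double age;
        final double fare;

        Passenger(String sex, double age, double fare) {
            this.sex = sex;
            this.age = age;
            this.fare = fare;
        }
    }

    // Hand-written trees; the arrangement of the splits is assumed, not taken from the slides.
    static boolean treeA(Passenger p) {
        if (p.sex.equals("female")) {
            return p.age > 50 ? p.fare > 100 : true;
        }
        return p.age <= 16;
    }

    static boolean treeB(Passenger p) {
        return p.fare > 100 || p.sex.equals("female");
    }

    static boolean treeC(Passenger p) {
        return p.age > 16 ? p.sex.equals("female") : true;
    }

    static final List<Predicate<Passenger>> TREES = List.of(
            TitanicForestSketch::treeA,
            TitanicForestSketch::treeB,
            TitanicForestSketch::treeC);

    // The forest predicts "survived" when a strict majority of trees says so.
    static boolean forestPredict(List<Predicate<Passenger>> trees, Passenger p) {
        long votes = trees.stream().filter(t -> t.test(p)).count();
        return votes * 2 > trees.size();
    }

    public static void main(String[] args) {
        System.out.println(forestPredict(TREES, new Passenger("female", 30, 20))); // true
        System.out.println(forestPredict(TREES, new Passenger("male", 40, 150)));  // false
    }
}
```

In the second call only treeB votes for survival, so the forest overrules it two to one. That is the practical difference between a single tree and a forest on slide 28.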