RUBY AND R


Chang Sau Sheong
Director, Applied Research, HP Labs Singapore


1   © Copyright 2010 Hewlett-Packard Development Company, L.P.
About HP Labs



HP LABS
– Exploratory and advanced research group for Hewlett-Packard
– Global organization that tackles complex challenges facing our customers and society over the next decade
– Pushes the frontiers of fundamental science
– HQ Palo Alto



HP LABS AROUND THE WORLD

– Palo Alto
– Bristol
– St. Petersburg
– Beijing
– Bangalore
– Haifa
– Singapore




HP LABS SINGAPORE
– Set up in February 2010
– Focus on Cloud Computing
– Research
     • Exploratory research
     • Researchers
     • Change the state of the art
     • Working closely with the academic community
– Applied Research
     • Applied research
     • Innovators
     • Take the research to the next stage
     • Work closely with customers and business units



Ruby and R



R: a programming language and platform for statistical computing, licensed under the GPL


Strengths in statistical processing and data visualization

Extensive library of statistical computing packages (CRAN) written by statisticians



Statistics is not just for statisticians


– Recommendation engine
– Speech recognition
– Fingerprint identification
– Spam detection
– Card fraud detection
– Financial forecasting
– Face recognition
– Data mining
– OCR
– Credit scoring

CRAN
– Almost 2000 packages, mostly created by statisticians
     • BiodiversityR – GUI for biodiversity and community ecology analysis
     • emu – analyze speech patterns
     • GenABEL – study human genome
     • quantmod – quantitative financial modeling framework
     • fTrading – technical trading analysis
     • cyclones – cyclone identification
     • DOSim – disease analysis toolkit for gene set
     • agricolae – statistical procedures for agricultural research


EXAMPLE R CODE
– EPL data from football-data.co.uk
– Show home/away goals distribution for 2011 season
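For comparison, the same tally can be sketched in plain Ruby. This is an illustration rather than the R code from the talk: it assumes the football-data.co.uk column names FTHG and FTAG (full-time home and away goals), the helper name `goal_distribution` is hypothetical, and the sample rows are made up.

```ruby
# Made-up sample rows in the football-data.co.uk column layout (assumed).
SAMPLE_CSV = <<~CSV
  HomeTeam,AwayTeam,FTHG,FTAG
  Team A,Team B,2,0
  Team C,Team D,1,1
  Team E,Team F,0,0
  Team G,Team H,2,1
CSV

# Count how many matches ended with each number of goals in one column.
def goal_distribution(csv_text, column)
  header, *rows = csv_text.lines.map { |line| line.chomp.split(",") }
  idx = header.index(column)
  rows.each_with_object(Hash.new(0)) { |row, dist| dist[row[idx].to_i] += 1 }
      .sort.to_h
end

# Print a small text histogram for home and away goals.
%w[FTHG FTAG].each do |column|
  puts column
  goal_distribution(SAMPLE_CSV, column).each do |goals, matches|
    puts "  #{goals} goals: #{'#' * matches}"
  end
end
```

R does this kind of tabulation and plotting in a line or two, which is part of the case for calling it from Ruby.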




Why Ruby and R?



Stand on shoulders of giants


–Ruby
     • Human-focused programming!
     • Better general-purpose programming capabilities
     • Great frameworks!
     • Great libraries (20,000+ gems in RubyGems)
–R
     • Focus on statistical computing/crunching
     • Lots of packages written by domain experts/statisticians
     • Great graphing libraries

Ruby and R integration


RINRUBY
– 100% Ruby
– Uses pipes to send commands and evals
– Uses TCP/IP Sockets to send and retrieve data
– Pros:
     •   Doesn't require anything but R
     •   Works flawlessly on Windows
     •   Works with Ruby 1.8, 1.9 and JRuby 1.5
     •   All API tested

– Cons:
     •   VERY SLOW in assigning
     •   Very limited datatypes: only Vector and Matrix
     •   Not released since 2009
     •   Poor documentation
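
The pipe-based approach can be pictured roughly as follows. This is a stand-in, not RinRuby itself: a child Ruby process plays the role of the R interpreter so the sketch runs without R installed, and `with_interpreter` is a hypothetical helper name.

```ruby
require "rbconfig"

# Script run by the child process: read one command per line, evaluate it,
# and print the result. In RinRuby the child would be an R process instead.
CHILD_SCRIPT = <<~'RUBY'
  $stdout.sync = true
  while (line = $stdin.gets)
    puts eval(line).inspect
  end
RUBY

# Open a two-way pipe to the child and yield a lambda that sends one
# command and returns the child's one-line reply.
def with_interpreter
  IO.popen([RbConfig.ruby, "-e", CHILD_SCRIPT], "r+") do |io|
    run = lambda do |command|
      io.puts(command)
      io.flush
      io.gets.chomp
    end
    yield run
  end
end

with_interpreter do |run|
  puts run.call("[1, 2, 3, 4].sum / 4.0")
  puts run.call("'ruby and r'.upcase")
end
```

Every command and every result crosses the process boundary as text, which hints at why assigning large data this way is slow.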


RSRUBY
– C Extension for Ruby, linked to R's shared library
– Pros:
     •   Blazing speed! 5-10 times faster than Rserve and 100-1000 times faster than RinRuby.
     •   Seamless integration with Ruby. Every method and object is treated like a Ruby object

– Cons:
     •   Transformation between R and Ruby types isn't trivial
     •   Dependent on operating system, Ruby implementation and R version
     •   Not available for alternative implementations of Ruby (e.g. JRuby)
     •   Not released since 2009
     •   Poor documentation




RSERVE
– 100% Ruby
– Uses TCP/IP sockets to interchange data and commands
– Requires Rserve installed on the server machine
– Access from Ruby uses the Rserve-Ruby-Client library
– Pros:
     •   Works with Ruby 1.8, 1.9 and JRuby 1.5.
     •   Sessions allow data to be processed asynchronously
     •   Fast: 5-10 times faster than RinRuby
     •   Most recently updated (Jan 2011)

– Cons:
     •   Requires Rserve
     •   Limited features on Windows
     •   Poor documentation



RAPACHE/RRACK
– Web service based
– Run R scripts as web services, consumed by Ruby front-end apps
– Pros:
     •   Modular and separate (no direct integration)
     •   Can be scalable, ‘cloud’-ready

– Cons:
     •   Requires rApache/rRack
     •   rRack is very new (not accepted by CRAN yet, as of today!) and requires R 2.13 (just released a few weeks ago)
     •   rApache is specific to the Apache web server
     •   Communications overhead for smaller integrations
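
A minimal sketch of the front-end side of this pattern, under loud assumptions: the "R service" below is a stand-in written in Ruby so the example runs on its own, and the `/mean` endpoint, the `x` query parameter and both helper names are hypothetical. A real setup would have rApache or rRack answering the request with an R script.

```ruby
require "socket"
require "net/http"
require "json"

# Stand-in for an R script exposed over HTTP (hypothetical endpoint).
# Replies with the mean of the comma-separated numbers in "x" as JSON.
def start_fake_r_service
  server = TCPServer.new("127.0.0.1", 0) # port 0: the OS picks a free port
  Thread.new do
    loop do
      client = server.accept
      request_line = client.gets.to_s # e.g. "GET /mean?x=1,2,3 HTTP/1.1"
      loop do                         # skip the request headers
        header = client.gets
        break if header.nil? || header == "\r\n"
      end
      values = request_line[/\?x=([^ ]*)/, 1].to_s.split(",").map(&:to_f)
      mean = values.empty? ? nil : values.sum / values.size
      body = JSON.generate({ "mean" => mean })
      client.write "HTTP/1.1 200 OK\r\n" \
                   "Content-Type: application/json\r\n" \
                   "Content-Length: #{body.bytesize}\r\n" \
                   "Connection: close\r\n\r\n#{body}"
      client.close
    end
  end
  server.addr[1] # the port number actually bound
end

# Ruby front-end: call the web service and decode its JSON reply.
def remote_mean(port, numbers)
  http = Net::HTTP.new("127.0.0.1", port, nil) # nil: bypass any HTTP proxy
  response = http.get("/mean?x=#{numbers.join(',')}")
  JSON.parse(response.body)["mean"]
end

port = start_fake_r_service
puts remote_mean(port, [1, 2, 3, 6])
```

The Ruby side only needs an HTTP client and a JSON parser, which is what keeps the two halves modular; the cost is a network round trip per call.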




Let’s look at some code! (I’m going to use Rserve)




Text classification



TEXT CLASSIFICATION
–Automatically sorting a set of documents into different categories from a predefined set
–Classic uses:
     • Spam filtering
     • Email prioritization
–Diagram: training data and test data go into the classifier, which outputs a category


TEXT CLASSIFIER CODE

 Prepare




Train classifier by counting frequency of
each word in the document




Get word count
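
A minimal sketch of such a word count, assuming a simple approach: lowercase the text, split it into alphanumeric tokens, drop stop words and count what remains. The stop-word list and the `word_count` name are hypothetical, and stemming (which would turn "google" into "googl") is left out for brevity.

```ruby
# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = %w[a an and the of to in is it that for on with as by].freeze

# Return a hash mapping each remaining word to its number of occurrences.
def word_count(text)
  words = text.downcase.scan(/[a-z0-9]+/)
  words.each_with_object(Hash.new(0)) do |word, counts|
    counts[word] += 1 unless STOP_WORDS.include?(word)
  end
end

p word_count("Google search flagged the site; search users saw the error.")
```

The result is an ordinary Ruby hash of word to count, one per document.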




What you get
     {"check"=>1, "result"=>3, "marissa"=>1, "experi"=>1,
     "click"=>1, "engin"=>1, "simpli"=>1, "mistakenli"=>1,
     "pick"=>1, "prevent"=>1, "40"=>1, "regularli"=>1, "place"=>1,
     "user"=>5, "prefer"=>1, "malevol"=>1, "access"=>1,
     "robust"=>1, "servic"=>1, "fault"=>1, "malici"=>1, "list"=>2,
     "hand"=>1, "internet"=>1, "attribut"=>1, "instal"=>1,
     "file"=>1, "unabl"=>1, "vice"=>1, "stopbadwareorg"=>2,
     "merit"=>1, "decid"=>1, "flag"=>2, "saturdai"=>2, "hit"=>2,
     "offici"=>1, "error"=>3, "work"=>1, "site"=>5, "happen"=>2,
     "incid"=>1, "technic"=>1, "advis"=>1, "put"=>1, "human"=>3,
     "harm"=>2, "softwar"=>1, "ms"=>1, "affect"=>1, "carefulli"=>1,
     "product"=>1, "presid"=>1, "complaint"=>1, "potenti"=>2,
     "googl"=>6, "comput"=>2, "peopl"=>1, "investig"=>2,
     "consum"=>1, "danger"=>2, "period"=>1, "wrote"=>2,
     "search"=>7, "ascertain"=>1, "blog"=>1, "warn"=>2,
     "problem"=>1, "updat"=>2, "minut"=>1, "mayer"=>2}




Generate training data for prediction




Training data



category,googl,report,search,user,review,court,mckinnon,year,internet,microsoft,site,softwar,warn,browser,oper,expert,rise,lawyer,digit,extradit,sharpli,error,group,result,system,rebel,econom,presid,crisi,find,year,accus,global,obama,china,civilian,shrink,hous,wall,street,quarter,white,heavi,lehman,economi,session,ey,time,davo,human
not_interesting,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
not_interesting,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,5,0,2,0,0,0,3,0,0,0,3,1,0,0,0,0,0,3,0,0,0,0,0,0,2
not_interesting,0,1,0,0,0,0,0,2,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,3,0,3,1,2,0,2,0,0,0,0,0,0,0,0,0,0,3,1,3,1,0,2,0
not_interesting,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
not_interesting,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,2,0,0,1,2,1,4,0,0,2,0,0,0,2,0,0,0,0,2,0,1,0
not_interesting,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,1,0,0,0,0,3,3,0,0,0,0,0,0,0,2,0,0
not_interesting,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,2,0,0,2,0,0,2,1,0,0,2,1,0,0,2,0,0,1,0,0
interesting,6,0,7,5,0,0,0,0,1,0,5,1,2,0,0,0,0,0,0,0,0,3,0,3,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3
interesting,0,7,0,0,2,0,0,0,0,0,0,0,1,0,0,1,0,0,3,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
interesting,0,1,0,0,0,0,0,3,3,1,0,1,1,1,0,3,3,0,1,0,3,0,1,0,2,0,1,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,3,0
interesting,0,0,0,0,3,5,5,0,0,0,0,0,0,0,0,0,1,4,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
interesting,6,0,1,1,0,0,0,0,0,0,0,1,0,0,4,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
interesting,0,0,0,2,0,0,0,2,1,4,0,2,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0

– Columns: the top 25 most frequent words in the training dataset



– Each line represents one training document



– Categories are set when the classifier is created


– Each number is the number of times the word appears in that document
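
A table in this shape could be built from per-document word counts along the following lines. This is a sketch under assumptions: the vocabulary is taken to be the top N words of each category, concatenated without de-duplication (which would explain a word such as "year" appearing twice in the header), and the helper names and sample counts are hypothetical.

```ruby
# Each document is a [category, word_counts] pair (illustrative counts).
DOCS = [
  ["interesting",     { "googl" => 6, "search" => 7, "user" => 5 }],
  ["not_interesting", { "obama" => 3, "economi" => 2, "search" => 1 }]
].freeze

# The n most frequent words across a group of documents
# (ties broken alphabetically so the output is deterministic).
def top_words(docs, n)
  totals = Hash.new(0)
  docs.each { |_category, counts| counts.each { |word, c| totals[word] += c } }
  totals.sort_by { |word, c| [-c, word] }.first(n).map(&:first)
end

# Header row plus one row per document: category, then one count per word.
def training_table(docs, n)
  vocab = docs.group_by(&:first).flat_map { |_category, group| top_words(group, n) }
  header = ["category", *vocab]
  rows = docs.map do |category, counts|
    [category, *vocab.map { |word| counts.fetch(word, 0) }]
  end
  [header, *rows]
end

training_table(DOCS, 2).each { |row| puts row.join(",") }
```

With N = 25 and two categories this gives the 50 word columns seen in the training data.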


Test data



category,googl,report,search,user,review,court,mckinnon,year,internet,microsoft,site,softwar,warn,browser,oper,expert,rise,lawyer,digit,extradit,sharpli,error,group,result,system,rebel,econom,presid,crisi,find,year,accus,global,obama,china,civilian,shrink,hous,wall,street,quarter,white,heavi,lehman,economi,session,ey,time,davo,human
category,0,0,0,2,0,0,0,2,1,4,0,2,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0

Using different classification models


NAÏVE BAYES
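
To make the idea concrete, here is a small multinomial naive Bayes classifier over word counts, written in plain Ruby. It is an illustrative stand-in, not the R model used in the talk: the class name, the add-one smoothing and the sample counts are all assumptions.

```ruby
# Multinomial naive Bayes over word-count hashes, with add-one smoothing.
class NaiveBayesSketch
  def initialize
    @word_totals = Hash.new { |h, k| h[k] = Hash.new(0) } # category => word => count
    @doc_counts = Hash.new(0)                             # category => documents seen
    @vocab = {}                                           # every word seen in training
  end

  # Add one labelled document, given as a hash of word => count.
  def train(category, counts)
    @doc_counts[category] += 1
    counts.each do |word, n|
      @word_totals[category][word] += n
      @vocab[word] = true
    end
  end

  # Return the category with the highest log posterior for the document.
  def classify(counts)
    total_docs = @doc_counts.values.sum
    scores = @doc_counts.keys.map do |category|
      category_total = @word_totals[category].values.sum
      log_prior = Math.log(@doc_counts[category].to_f / total_docs)
      log_likelihood = counts.sum do |word, n|
        p_word = (@word_totals[category][word] + 1.0) / (category_total + @vocab.size)
        n * Math.log(p_word)
      end
      [category, log_prior + log_likelihood]
    end
    scores.max_by { |_category, score| score }.first
  end
end

model = NaiveBayesSketch.new
model.train("interesting",     { "googl" => 6, "search" => 7, "user" => 5 })
model.train("interesting",     { "microsoft" => 4, "browser" => 3, "user" => 2 })
model.train("not_interesting", { "obama" => 3, "economi" => 2 })
model.train("not_interesting", { "lehman" => 3, "wall" => 3, "street" => 3 })

puts model.classify({ "user" => 2, "microsoft" => 4, "browser" => 3 })
```

Working in log space avoids underflow when many word probabilities are multiplied together.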




SVM




RANDOM FOREST




NEURAL NETWORKS




Using the classifier



RESOURCES
– HP Labs Worldwide
http://www.hpl.hp.com/
– R Project
http://www.r-project.org/
– RsRuby
https://github.com/alexgutteridge/rsruby
– RinRuby
http://rinruby.ddahl.org/
– Rserve
http://www.rforge.net/Rserve/
– Rserve-Ruby-Client
https://github.com/clbustos/Rserve-Ruby-client
– rApache
http://rapache.net/index.html
– rRack
https://github.com/jeffreyhorner/rRack/


Thank you

 sausheong@hp.com
 http://twitter.com/sausheong
 http://blog.saush.com

More Related Content

Similar to Ruby and R

Python course in hyderabad
Python course in hyderabadPython course in hyderabad
Python course in hyderabadRevathiUppala
 
Introduction to pig
Introduction to pigIntroduction to pig
Introduction to pigRavi Mutyala
 
HP Helion Webinar #1 - Introduction to HP Helion OpenStack w/Christian Frank
HP Helion Webinar #1 - Introduction to HP Helion OpenStack w/Christian FrankHP Helion Webinar #1 - Introduction to HP Helion OpenStack w/Christian Frank
HP Helion Webinar #1 - Introduction to HP Helion OpenStack w/Christian FrankBeMyApp
 
Are You Ready for Big Data Big Analytics?
Are You Ready for Big Data Big Analytics? Are You Ready for Big Data Big Analytics?
Are You Ready for Big Data Big Analytics? Revolution Analytics
 
A modern, flexible approach to Hadoop implementation incorporating innovation...
A modern, flexible approach to Hadoop implementation incorporating innovation...A modern, flexible approach to Hadoop implementation incorporating innovation...
A modern, flexible approach to Hadoop implementation incorporating innovation...DataWorks Summit
 
Mrinal devadas, Hortonworks Making Sense Of Big Data
Mrinal devadas, Hortonworks Making Sense Of Big DataMrinal devadas, Hortonworks Making Sense Of Big Data
Mrinal devadas, Hortonworks Making Sense Of Big DataPatrickCrompton
 
Pilot Project Highlights: Ruby on Rails - November 2006
Pilot Project Highlights: Ruby on Rails - November 2006Pilot Project Highlights: Ruby on Rails - November 2006
Pilot Project Highlights: Ruby on Rails - November 2006juliannacole
 
Helion meetup-2014
Helion meetup-2014Helion meetup-2014
Helion meetup-2014Bruno Cornec
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack EuropeHortonworks
 
Big Data & SQL: The On-Ramp to Hadoop
Big Data & SQL: The On-Ramp to Hadoop Big Data & SQL: The On-Ramp to Hadoop
Big Data & SQL: The On-Ramp to Hadoop Inside Analysis
 
Trafodion – an enterprise class sql based on hadoop
Trafodion – an enterprise class sql based on hadoopTrafodion – an enterprise class sql based on hadoop
Trafodion – an enterprise class sql based on hadoopKrishna-Kumar
 
2019 DSA 105 Introduction to Data Science Week 4
2019 DSA 105 Introduction to Data Science Week 42019 DSA 105 Introduction to Data Science Week 4
2019 DSA 105 Introduction to Data Science Week 4Ferdin Joe John Joseph PhD
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopHortonworks
 
Storm Demo Talk - Colorado Springs May 2015
Storm Demo Talk - Colorado Springs May 2015Storm Demo Talk - Colorado Springs May 2015
Storm Demo Talk - Colorado Springs May 2015Mac Moore
 

Similar to Ruby and R (20)

Evented programming
Evented programmingEvented programming
Evented programming
 
Python course in hyderabad
Python course in hyderabadPython course in hyderabad
Python course in hyderabad
 
Introduction to pig
Introduction to pigIntroduction to pig
Introduction to pig
 
HP Helion Webinar #1 - Introduction to HP Helion OpenStack w/Christian Frank
HP Helion Webinar #1 - Introduction to HP Helion OpenStack w/Christian FrankHP Helion Webinar #1 - Introduction to HP Helion OpenStack w/Christian Frank
HP Helion Webinar #1 - Introduction to HP Helion OpenStack w/Christian Frank
 
Are You Ready for Big Data Big Analytics?
Are You Ready for Big Data Big Analytics? Are You Ready for Big Data Big Analytics?
Are You Ready for Big Data Big Analytics?
 
Revolution Analytics Podcast
Revolution Analytics PodcastRevolution Analytics Podcast
Revolution Analytics Podcast
 
A modern, flexible approach to Hadoop implementation incorporating innovation...
A modern, flexible approach to Hadoop implementation incorporating innovation...A modern, flexible approach to Hadoop implementation incorporating innovation...
A modern, flexible approach to Hadoop implementation incorporating innovation...
 
Reason To learn & use r
Reason To learn & use rReason To learn & use r
Reason To learn & use r
 
Mrinal devadas, Hortonworks Making Sense Of Big Data
Mrinal devadas, Hortonworks Making Sense Of Big DataMrinal devadas, Hortonworks Making Sense Of Big Data
Mrinal devadas, Hortonworks Making Sense Of Big Data
 
iKariera 2015
iKariera 2015iKariera 2015
iKariera 2015
 
Pilot Project Highlights: Ruby on Rails - November 2006
Pilot Project Highlights: Ruby on Rails - November 2006Pilot Project Highlights: Ruby on Rails - November 2006
Pilot Project Highlights: Ruby on Rails - November 2006
 
Helion meetup-2014
Helion meetup-2014Helion meetup-2014
Helion meetup-2014
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack Europe
 
Big Data & SQL: The On-Ramp to Hadoop
Big Data & SQL: The On-Ramp to Hadoop Big Data & SQL: The On-Ramp to Hadoop
Big Data & SQL: The On-Ramp to Hadoop
 
Trafodion – an enterprise class sql based on hadoop
Trafodion – an enterprise class sql based on hadoopTrafodion – an enterprise class sql based on hadoop
Trafodion – an enterprise class sql based on hadoop
 
2019 DSA 105 Introduction to Data Science Week 4
2019 DSA 105 Introduction to Data Science Week 42019 DSA 105 Introduction to Data Science Week 4
2019 DSA 105 Introduction to Data Science Week 4
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache Hadoop
 
Pig programming is fun
Pig programming is funPig programming is fun
Pig programming is fun
 
Storm Demo Talk - Colorado Springs May 2015
Storm Demo Talk - Colorado Springs May 2015Storm Demo Talk - Colorado Springs May 2015
Storm Demo Talk - Colorado Springs May 2015
 
HP and linux
HP and linuxHP and linux
HP and linux
 

Recently uploaded

Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Ruby and R