Matlab/R Dictionary

Talk given at R Rosetta Stone meetup in NYC on 1/7/2010 about MATLAB and R.

Co-authored with Harlan Harris.

Video of the talk available at:
http://www.vcasmo.com/video/drewconway/7211

1 Comment
  • A MATLAB struct can hold fields of different types and sizes, for example:
    patient(1).name = 'John Doe';
    patient(1).billing = 127.00;
    patient(1).test = [79, 75, 73; 180, 178, 177.5; 220, 210, 205];

    An R data.frame cannot hold that kind of data as ordinary columns, because every column must have the same number of rows.
    An R list can.
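
The commenter's point can be sketched in R. This is only an illustration: the variable names `patient` and `patients` are assumptions chosen to mirror the MATLAB struct in the comment, and a list of records is one of several ways to model it.

```r
# Hypothetical record mirroring the MATLAB struct from the comment.
# A list may mix a string, a scalar and a 3 x 3 matrix.
patient <- list(
  name    = "John Doe",
  billing = 127.00,
  test    = matrix(c(79, 75, 73,
                     180, 178, 177.5,
                     220, 210, 205),
                   nrow = 3, byrow = TRUE)
)

# A list of such records plays the role of the struct array patient(1), patient(2), ...
patients <- list(patient)

patients[[1]]$name          # field access, like patient(1).name
patients[[1]]$test[2, 3]    # row 2, column 3 of the test matrix
```

Each field keeps its own type and shape, which is what a data.frame column cannot do without resorting to list columns.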

Matlab/R Dictionary

  1. MATLAB/R Dictionary
     R meetup NYC, January 7, 2010
     Harlan Harris, harlan@harris.name, @HarlanH
     Marck Vaisman, marck@vaisman.us, @wahalulu
     MATLAB and the MATLAB logo are registered trademarks of The Mathworks.
  2. About MATLAB
     What is MATLAB
     • Commercial numerical programming language, simulation and visualization
     • One million users (engineers, scientists, academics)
     • MATrix LABoratory: specializes in matrix operations
     • Mathworks: base & add-ons
     • Open-source Octave project
     MATLAB History
     • Developed by Cleve Moler (Math/CS Prof at UNM) in the 1970's as a higher-level numerical programming language (vs. Fortran LINPACK)
     • Adopted by engineers for signal processing, control modeling
     • Multipurpose programming language
  3. Notes
     • Today's focus: Compare MATLAB & R for data analysis, contrast as programming languages
     • MATLAB is Base plus many toolboxes
       - Base includes: descriptive stats, covariance and correlation, linear and nonlinear regression
       - Statistics toolbox adds: dataset and category arrays (like data.frames and factors), more visualizations, distributions, ANOVA, multivariate regression, hypothesis tests
  4. MATLAB -> R
     • Interactive programming: Scripts and Read-Evaluate-Print Loop
     • Similar representations of data
       - Both use vectors/arrays as the primary data structures
         • MATLAB is based on 2-D matrices; R is based on 1-D vectors
       - Both prefer vectorized functions to for loops
       - Variables are declared dynamically
     • Can do most MATLAB functionality in R; can do most R functionality in MATLAB.
  5. The basics: vectors, matrices and indexing
     • Create a row vector
       MATLAB: v = [1 2 3 4]
       R:      v <- c(1,2,3,4)
     • Create a column vector
       MATLAB: v = [1;2;3;4] or v = [1 2 3 4]'
       R:      v <- c(1,2,3,4)
       Note: R does not distinguish between row and column vectors
     • Enter a matrix A
       MATLAB: A = [1 2 3; 4 5 6]
       R:      Enter values by row: A <- matrix(c(1,2,3,4,5,6), nrow=2, byrow=TRUE)
               Enter values by column: A <- matrix(c(1,4,2,5,3,6), nrow=2)
     • Access third element of vector v
       MATLAB: v(3)
       R:      v[3] or v[[3]]
     • Access element of matrix A
       MATLAB: A(2,3)
       R:      A[2,3]
     • "Glue" two matrices a1 and a2, same number of rows, side by side
       MATLAB: A = [a1 a2]
       R:      A <- cbind(a1,a2)
     • "Stack" two matrices a1 and a2, same number of columns
       MATLAB: A = [a1;a2]
       R:      A <- rbind(a1,a2)
     • Reshape matrix A, making it an m x n matrix with elements taken columnwise from A
       MATLAB: A = reshape(A,m,n)
       R:      dim(A) <- c(m,n)
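
A short R session makes the entries on this slide concrete. It is a sketch using base R only; the variable names (`A_by_row`, `wide`, `tall`, `B`) are illustrative assumptions, not part of the talk.

```r
# A plain R vector has no dim attribute, so "row" vs "column" does not apply.
v <- c(1, 2, 3, 4)
is.null(dim(v))

# The same 2 x 3 matrix entered by row and by column.
A_by_row <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)
A_by_col <- matrix(c(1, 4, 2, 5, 3, 6), nrow = 2)
identical(A_by_row, A_by_col)

A <- A_by_row
A[2, 3]                 # element in row 2, column 3

# Glue side by side, then stack vertically.
wide <- cbind(A, A)     # 2 x 6
tall <- rbind(A, A)     # 4 x 3

# Reshape to 3 x 2; elements are taken columnwise, as with MATLAB's reshape.
B <- A
dim(B) <- c(3, 2)
B
```

Because both languages store matrices column by column, `dim(B) <- c(3, 2)` and MATLAB's `reshape(A,3,2)` place the elements in the same positions.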
  6. Operators
     • Assignment
       MATLAB: =
       R:      <- or =
     • Whole-matrix operations
       - Multiplication:       MATLAB: A*B    R: A %*% B
       - Square the matrix:    MATLAB: A^2    R: A %*% A
       - Raise to power k:     MATLAB: A^k    R: A %*% A %*% A …
     • Element-by-element operations
       MATLAB: A.*B   A./B   A.^k
       R:      A*B    A/B    A^k
     • Compute A^(-1) B
       MATLAB: A\B
       R:      solve(A, B), or equivalently solve(A) %*% B
     • Sums
       - Columns of matrix:    MATLAB: sum(A)      R: colSums(A)
       - Rows of matrix:       MATLAB: sum(A,2)    R: rowSums(A)
     • Logical operators (element-by-element on vectors/matrices)
       - Comparison:  MATLAB: a < b, a > b, a <= b, a >= b    R: a < b, a > b, a <= b, a >= b
       - Equality:    MATLAB: a == b    R: a == b
       - Inequality:  MATLAB: a ~= b    R: a != b
       - AND:         MATLAB: a && b    R: a && b (short-circuit), a & b (element-wise)
       - OR:          MATLAB: a || b    R: a || b (short-circuit), a | b (element-wise)
       - XOR:         MATLAB: xor(a,b)  R: xor(a,b)
       - NOT:         MATLAB: ~a        R: !a
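
Base R has no matrix-power operator, so the "raise to power k" row is usually written as a loop. The sketch below uses a hypothetical helper, `mat_pow`, which is an assumption for illustration and not a base R function, and then exercises the other operators from this slide on two small example matrices.

```r
# mat_pow is a hypothetical helper (not part of base R): repeated %*%.
mat_pow <- function(A, k) {
  stopifnot(nrow(A) == ncol(A), k >= 0)
  result <- diag(nrow(A))                  # identity matrix
  for (i in seq_len(k)) result <- result %*% A
  result
}

A <- matrix(c(2, 1, 1, 3), nrow = 2)       # filled columnwise
B <- matrix(c(1, 0, 4, 2), nrow = 2)

A %*% B          # whole-matrix product (MATLAB A*B)
A * B            # element-by-element product (MATLAB A.*B)
mat_pow(A, 3)    # same as A %*% A %*% A (MATLAB A^3)

# MATLAB's A\B: solve A X = B without forming the inverse explicitly.
X <- solve(A, B)
all.equal(A %*% X, B)

colSums(A)       # MATLAB sum(A)
rowSums(A)       # MATLAB sum(A,2)

c(TRUE, FALSE) & c(TRUE, TRUE)   # element-wise AND
```

`solve(A, B)` is generally preferable to `solve(A) %*% B` because it avoids computing the inverse, although both give the same answer for a well-conditioned `A`.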
  7. Working with data structures
     • Build a structure v of length n, capable of containing different data types in different elements.
       MATLAB: cell array. R: list.
       MATLAB: v = cell(1,n)
               In general, cell(m,n) makes an m × n cell array. Then you can do e.g.:
               v{1} = 12
               v{2} = 'hi there'
               v{3} = rand(3)
       R:      v <- vector('list',n)
               Then you can do e.g.:
               v[[1]] <- 12
               v[[2]] <- 'hi there'
               v[[3]] <- matrix(runif(9),3)
     • Create a matrix-like object with different named columns.
       MATLAB: struct array. R: data.frame.
       MATLAB: avals = 2*ones(1,6);
               yyvals = 6:-1:1; v = [1 5 3 2 3 7];
               d = struct('a', avals, 'yy', yyvals, 'fac', v);
       R:      v <- c(1,5,3,2,3,7)
               d <- data.frame(cbind(a=2, yy=6:1), v)
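
The R column of this slide runs as written; the sketch below repeats it with a few inspection calls added. Naming the third column `fac` (to match the MATLAB field name) and building the data.frame without `cbind` are choices made here for illustration, not something the slide prescribes.

```r
# A list of length n whose elements have different types.
n <- 3
v <- vector("list", n)
v[[1]] <- 12
v[[2]] <- "hi there"
v[[3]] <- matrix(runif(9), 3)

# First class of each element: numeric, character, matrix.
vapply(v, function(x) class(x)[1], character(1))

# A data.frame with named columns; the scalar a = 2 is recycled to 6 rows.
fac <- c(1, 5, 3, 2, 3, 7)
d <- data.frame(a = 2, yy = 6:1, fac = fac)
str(d)
d$yy[2]        # second value of column yy
```

Passing the columns directly to `data.frame()` keeps each column's own type, whereas `cbind()` first coerces everything into one numeric matrix.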
  8. Conditionals, control structures, loops
     • for loops over values in vector v
       MATLAB:
           for i=v
             command1
             command2
           end
       R, if only one command:
           for (i in v)
             command
       R, if multiple commands:
           for (i in v) {
             command1
             command2
           }
     • If/else statement
       MATLAB:
           if cond
             command1
             command2
           else
             command3
             command4
           end
       MATLAB also has the elseif statement.
       R:
           if (cond) {
             command1
             command2
           } else {
             command3
             command4
           }
       R uses chained "else if" statements.
     • ifelse() function (R)
           > print(ifelse(c(T,F), 2, 3))
           [1] 2 3
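
To connect the loop syntax with the earlier remark that both languages prefer vectorized functions, here is the same small computation written twice in R. The thresholds and the vector `v` are made-up values for illustration.

```r
v <- c(3, 8, 5)

# Loop version with a chained "else if".
total <- 0
for (x in v) {
  if (x > 6) {
    total <- total + 2 * x    # large values count double
  } else if (x > 4) {
    total <- total + x        # medium values count once
  } else {
    total <- total + 0        # small values are ignored
  }
}

# Vectorized version: nested ifelse() applies the same rule element-wise.
total_vec <- sum(ifelse(v > 6, 2 * v, ifelse(v > 4, v, 0)))

total == total_vec
```

`if` tests a single condition, while `ifelse()` evaluates a whole logical vector at once, which is why it replaces the loop here.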
  9. Help!
     • Get help on a function
       MATLAB: help fminsearch
       R:      help(pmin) or ?pmin
     • Search the help for a word
       MATLAB: lookfor inverse
       R:      ??inverse
     • Describe a variable
       MATLAB: class(a)
       R:      class(a), str(a)
     • Show variables in environment
       MATLAB: who
       R:      ls()
     • Underlying type of variable
       MATLAB: whos('a')
       R:      typeof(a)
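
The difference between "describe a variable" and "underlying type" is easiest to see on a matrix. A minimal sketch in base R, with `a` as an arbitrary example variable:

```r
a <- matrix(1:6, nrow = 2)

class(a)[1]    # how R treats the object: "matrix"
typeof(a)      # how the values are stored: "integer"
str(a)         # compact description of structure and contents
"a" %in% ls()  # ls() lists the names defined in the current environment
```

`class()` drives method dispatch, while `typeof()` reports the storage mode, so the two answers often differ.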
  10. Example: k-means clustering of Fisher Iris data
      Fisher Iris Dataset
          sepal_length,sepal_width,petal_length,petal_width,species
          5.1,3.5,1.4,0.2,setosa
          4.9,3.0,1.4,0.2,setosa
          4.7,3.2,1.3,0.2,setosa
          4.6,3.1,1.5,0.2,setosa
          …
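
The transcript does not include the code shown for this example, so the following is only a sketch of how such a clustering might look in R, not the authors' script. It assumes the built-in `iris` data frame (whose columns are named `Sepal.Length` etc. rather than the CSV headers above), three clusters to match the three species, and an arbitrary seed.

```r
set.seed(42)   # arbitrary seed so the random starts are reproducible

# Keep only the four numeric measurements; the species label is held out.
features <- iris[, c("Sepal.Length", "Sepal.Width",
                     "Petal.Length", "Petal.Width")]

# k-means with 3 centers (an assumption) and several random restarts.
fit <- kmeans(features, centers = 3, nstart = 10)

# Cross-tabulate cluster assignments against the known species.
table(cluster = fit$cluster, species = iris$Species)
```

The cross-tabulation shows how well the unsupervised clusters line up with the species labels; cluster numbers themselves are arbitrary.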
  11. MATLAB and R as programming languages
      • MATLAB: Scripting, real-time analysis
        R:      Scripting, real-time analysis
      • MATLAB: File-based environments
        R:      Files unimportant
      • MATLAB: Imperative programming style
        R:      Functional programming style (impure)
      • MATLAB: Statically scoped
        R:      Lexically scoped, with functions as closures
      • MATLAB: Functions with multiple return values
        R:      Functions with named arguments, lazy evaluation
      • MATLAB: Evolving OOP system
        R:      Multiple competing OOP systems
      • MATLAB: Can be compiled
        R:      Cannot be compiled
      • MATLAB: Large library of functions, professionally developed, costs money
        R:      Large library of functions, varying quality and support
      • MATLAB: Can embed (in) many other languages
        R:      Can embed (in) many other languages
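
Two of the R traits listed on this slide, named arguments and lazy evaluation, can be shown in a few lines. The function `scale_by` and its arguments are hypothetical, invented only for this sketch.

```r
# scale_by is a hypothetical function used only for illustration.
# The default for 'unused' would raise an error if it were ever evaluated.
scale_by <- function(x, factor = 2, unused = stop("never evaluated")) {
  x * factor     # 'unused' is never touched, so its default never runs
}

scale_by(factor = 10, x = 1:3)   # named arguments may come in any order
scale_by(1:3)                    # positional call, default factor

# Functional style: pass an anonymous function to sapply().
squares <- sapply(1:3, function(i) i^2)
squares
```

Arguments in R are promises that are evaluated only when first used, which is why the `stop()` default is harmless here.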
  12. Functions
      MATLAB:
          function [a, b] = minmax(z)
            % one function per .m file!
            % assign to formal return names
            a = min(z);
            b = max(z);
          end

          % if minmax.m in path
          [smallest, largest] = ...
            minmax([1 30 3])
      R:
          minmax <- function(z, opt=12) {
            # functions are assigned to
            # variables
            ret <- list(min = min(z),
                        max = max(z))
            ret   # last statement is
                  # return value
          }

          # if minmax was created in current
          # environment
          x <- minmax(c(1, 30, 3))
          smallest <- x$min
  13. Object-Oriented Programming
      MATLAB:
      • Formerly: objects were defined by a directory tree, with one method per file
      • As of 2008: new classdef syntax resembles other languages
      R:
      • S3 classes: attributes + syntax
        - class(object)
        - plot.lm()
      • S4 classes: definitions + methods
      • R.oo, proto, etc…
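
"Attributes + syntax" for S3 means that a class is just a `class` attribute on an object, and a method is just a function named `generic.class`, as in `plot.lm()`. A minimal sketch follows; the class name `temperature` and its field are hypothetical.

```r
# 'temperature' is a hypothetical S3 class: a list with a class attribute.
temp <- structure(list(celsius = 21.5), class = "temperature")

# Methods are ordinary functions named <generic>.<class>.
format.temperature <- function(x, ...) {
  paste0(x$celsius, " C")
}
print.temperature <- function(x, ...) {
  cat("Temperature:", format(x), "\n")
  invisible(x)
}

class(temp)     # the attribute that drives dispatch
format(temp)    # dispatches to format.temperature
print(temp)     # dispatches to print.temperature
```

No class definition is declared anywhere, which is the main contrast with S4 and with MATLAB's classdef syntax.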
  14. Other notes
      • R.matlab package
      • Graphics
        - MATLAB has much better 3-d/interactive graphics support
        - R has ggplot2 and much better statistical graphics
  15. Additional Resources
      • Will Dwinell, Data Mining in MATLAB
      • Computerworld article on Cleve Moler
      • Mathworks
      • Matlabcentral
      • Comparison of Data Analysis packages (http://anyall.org/blog/2009/02/comparison-of-data-analysis-packages-r-matlab-scipy-excel-sas-spss-stata/)
      • R.matlab package
      • stackoverflow
  16. References used for this talk
      • David Hiebeler MATLAB/R Reference document: http://www.math.umaine.edu/~hiebeler/comp/matlabR.html
      • http://www.cyclismo.org/tutorial/R/index.html
      • http://www.stat.berkeley.edu/~spector/R.pdf
      • MATLAB documentation
      • http://www.r-cookbook.com/node/23
  17. Thank You!
