Szilard Pafka – Los Angeles area R users group meeting – November 17, 2010

Software tools for data analysis: (size related to surveyed usage)

C, C++, Fortran, Java + libraries...
Perl, Python, Ruby, Unix shell
Lisp, Clojure
R, Matlab, Octave, Maple, Mathematica
SPSS, Stata, Statistica, SAS, JMP
Excel
SAS EM, SPSS Clementine, RapidMiner, Weka, Mahout
MySQL, SQL Server, NoSQL stores
Hadoop, CUDA
support: editors, code versioning, cloud computing

Possible talks:

December:
1. C, interfaces with R (both ways) / something else?
2. SAS: performance, R interface ready?
3. RExcel

January:
1. Python & R – a comparison
2. numpy, scipy
3. Python vs Unix shell / NLTK / networkX

Other talks (March-):
1. data storage (SQL and some noSQL), access from R
2. data mining platforms
3. Hadoop
4. gpu
5. Java
6. Clojure
...

Criteria for talks:

usefulness (for data analysis!), including a comparison with R

paradigm/philosophy, main usage domain, performance, ease of learning, speed of programming, libraries

break down by:
- part of the data analysis process (pre-processing, exploration (e.g. visualization), modeling etc.)
- nature of data (e.g. numeric, categorical, unstructured text, networks/links etc.)
- size of data

things that increase functionality: libraries, 3rd party extensions...

does tool X have an R to X and/or X to R interface?

how these tools can be combined to support the whole process of data analysis