SlideShare a Scribd company logo
1 of 4
Download to read offline
I NVITED S ECTION : T HE F UTURE OF R                                                                                5



Facets of R
Special invited paper on ā€œThe Future of Rā€                      2. interactive, hands-on in real time;

by John M. Chambers                                             3. functional in its model of programming;
                                                                4. object-oriented, ā€œeverything is an objectā€;
We are seeing today a widespread, and welcome,
tendency for non-computer-specialists among statis-             5. modular, built from standardized pieces; and,
ticians and others to write collections of R functions
                                                                6. collaborative, a world-wide, open-source effort.
that organize and communicate their work. Along
with the ļ¬‚ood of software sometimes comes an atti-           None of these facets is original to R, but while many
tude that one need only learn, or teach, a sort of basic     systems emphasize one or two of them, R continues
how-to-write-the-function level of R programming,            to reļ¬‚ect them all, resulting in a programming model
beyond which most of the detail is unimportant or            that is indeed rich but sometimes messy.
can be absorbed without much discussion. As delu-                The thousands of R packages available from
sions go, this one is not very objectionable if it en-       CRAN, BioConductor, R-Forge, and other reposi-
courages participation. Nevertheless, a delusion it is.      tories, the uncounted other software contributions
In fact, functions are only one of a variety of impor-       from groups and individuals, and the many citations
tant facets that R has acquired by intent or circum-         in the scientiļ¬c literature all testify to the useful com-
stance during the three-plus decades of the history of       putations created within the R model. Understand-
the software and of its predecessor S. To create valu-       ing how the various facets arose and how they can
able and trustworthy software using R often requires         work together may help us improve and extend that
an understanding of some of these facets and their           software.
interrelations. This paper identiļ¬es six facets, dis-            We introduce the facets more or less chronolog-
cussing where they came from, how they support or            ically, in three pairs that entered into the software
conļ¬‚ict with each other, and what implications they          during successive intervals of roughly a decade. To
have for the future of programming with R.                   provide some context, here is a brief chronology,
                                                             with some references. The S project began in our
                                                             statistics research group at Bell Labs in 1976, evolved
Facets                                                       into a generally licensed system through the 1980s
                                                             and continues in the S+ software, currently owned by
Any software system that has endured and retained            TIBCO Software Inc. The history of S is summarized
a reasonably happy user community will likely                in the Appendix to Chambers (2008); the standard
have some distinguishing characteristics that form           books introducing still-relevant versions of S include
its style. The characteristics of different systems are      Becker et al. (1988) (no longer in print), Chambers
usually not totally unrelated, at least among systems        and Hastie (1992), and Chambers (1998). R was an-
serving roughly similar goalsā€”in computing as in             nounced to the world in Ihaka and Gentleman (1996)
other ļ¬elds, the set of truly distinct concepts is not       and evolved fairly soon to be developed and man-
that large. But in the mix of different characteris-         aged by the R-core group and other contributors. Its
tics and in the details of how they work together lies       history is harder to pin down, partly because the
much of the ļ¬‚avor of a particular system.                    story is very much still happening and partly be-
     Understanding such characteristics, including a         cause, as a collaborative open-source project, indi-
bit of the historical background, can be helpful in          vidual responsibility is sometimes hard to attribute;
making better use of the software. It can also guide         in addition, not all the important new ideas have
thinking about directions for future improvements of         been described separately by their authors. The
the system itself.                                           http://www.r-project.org site points to the on-line
     The R software and the S software that preceded         manuals, as well as a list of over 70 related books and
it reļ¬‚ect a rather large range of characteristics, result-   other material. The manuals are invaluable as a tech-
ing in software that might be termed rich or messy           nical resource, but short on background information.
according to oneā€™s taste. Since R has become a very          Articles in R News, the predecessor of this journal,
widely used environment for applications and re-             give descriptions of some of the key contributions,
search in data analysis, understanding something of          such as namespaces (Tierney, 2003), and internation-
these characteristics may be helpful to the commu-           alization (Ripley, 2005).
nity.
     This paper considers six characteristics, which we
will call facets. They characterize R as:                    An interactive interface
   1. an interface to computational procedures of            The goals for the ļ¬rst version of S in 1976 already
      many kinds;                                            combined two facets of computing usually kept


The R Journal Vol. 1/1, May 2009                                                                     ISSN 2073-4859
6                                                                           I NVITED S ECTION : T HE F UTURE OF R


apart, interactive computing and the development of         Functional and object-oriented pro-
procedural software for scientiļ¬c applications.
                                                            gramming
    One goal was a better interface to new computa-
tional procedures for scientiļ¬c data analysis, usually      During the 1980s the topics of functional programming
implemented then as Fortran subroutines. Bell Labs          and object-oriented programming stimulated growing
statistics research had collected and written a variety     interest in the computer science literature. At the
of these while at the same time numerical libraries         same time, I was exploring possibilities for the next
and published algorithm collections were establish-         S, perhaps ā€œafter Sā€. These ideas were eventually
ing many of the essential computational techniques.         blended with other work to produce the ā€œnew Sā€,
These procedures were the essence of the actual data        later called Version 3 of S, or S3, (Becker et al., 1988).
analysis; the initial goal was essentially a better in-         As before, user expressions were still interpreted,
terface to procedural computation. A graphic from           but were now parsed into objects representing the
the ļ¬rst plans for S (Chambers, 2008, Appendix, page        language elements, mainly function calls. The func-
476) illustrated this.                                      tions were also objects, the result of parsing expres-
                                                            sions in the language that deļ¬ned the function. As
    But that interface was to be interactive, speciļ¬cally   the motto expressed it, ā€œEverything is an objectā€. The
via a language used by the data analyst communicat-         procedural interface facet was not abandoned, how-
ing in real time with the software. At this time, a few     ever, but from the userā€™s perspective it was now de-
systems had pointed the way for interactive scientiļ¬c       ļ¬ned by functions that provided an interface to a C
computing, but most of them were highly specialized         or Fortran subroutine.
with limited programming facilities; that is, a limited         The new programming model was functional in
ability for the user to express novel computations.         two senses. It largely followed the functional pro-
The most striking exception was APL, created by             gramming paradigm, which deļ¬ned programming
Kenneth Iverson (1962), an operator-based interac-          as the evaluation of function calls, with the value
tive language with heavy emphasis on general mul-           uniquely determined by the objects supplied as ar-
tiway arrays. APL introduced conventions that now           guments and with no external side-effects. Not all
seem obvious; for example, the result of evaluating         functions followed this strictly, however.
a userā€™s expression was either an assignment or a               The new version was functional also in the sense
printed display of the computed object (rather than         that nearly everything that happened in evaluation
expecting users to type a print command). But APL           was a function call. Evaluation in R is even more
at the time had no notion of an interface to proce-         functional in that language components such as as-
dures, which along with a limited, if exotic, syntax        signment and loops are formally function calls inter-
caused us to reject it, although S adopted a number         nally, reļ¬‚ecting in part the inļ¬‚uence of Lisp on the
of features from it, such as its approach to general        initial implementation of R.
multidimensional arrays.                                        Subsequently, the programing model was ex-
                                                            tended to include classes of objects and methods
    From its ļ¬rst implementation, the S software            that specialized functions according to the classes
combined these two facets: users carried out data           of their arguments. That extension itself came in
analysis interactively by writing expressions in the        two stages. First, a thin layer of additional code
S language, but much of the programming to extend           implemented classes and methods by two simple
the system was via new Fortran subroutines accom-           mechanisms: objects could have an attribute deļ¬n-
panied by interfaces to call them from S. The inter-        ing their class as a character string or a vector of
faces were written in a special language that was           multiple strings; and an explicitly invoked method
compiled into Fortran.                                      dispatch function would match the class of the func-
   The result was already a mixed system that had           tionā€™s ļ¬rst argument against methods, which were
much of the tension between human and machine               simply saved function objects with a class string ap-
efļ¬ciency that still stimulates debate among R users        pended to the objectā€™s name. In the second stage, the
and programmers. At the same time, the ļ¬‚exibility           version of the system known as S4 provided formal
from that same mixture helped the software to grow          class deļ¬nitions, stored as a special metadata object,
and adapt with a growing user community.                    and similarly formal method deļ¬nitions associated
                                                            with generic functions, also deļ¬ned via object classes
    Later versions of the system replaced the inter-        (Chambers, 2008, Chapters 9 and 10, for the R ver-
face language with individual functions providing           sion).
interfaces to C or Fortran, but the combination of              The functional and object-oriented facets were
an interactive language and an interface to com-            combined in S and R, but they appear more fre-
piled procedures remains a key feature of R. Increas-       quently in separate, incompatible languages. Typi-
ingly, interfaces to other languages and systems have       cal object-oriented languages such as C++ or Java are
been added as well, for example to database systems,        not functional; instead, their programming model is
spreadsheets and user interface software.                   of methods associated with a class rather than with a


The R Journal Vol. 1/1, May 2009                                                                    ISSN 2073-4859
I NVITED S ECTION : T HE F UTURE OF R                                                                             7


function and invoked on an object. Because the ob-        essential tools provided in R to create, develop, test,
ject is treated as a reference, the methods can usu-      and distribute packages. Among many beneļ¬ts for
ally modify the object, again quite a different model     users, these tools help programmers create software
from that leading to R. The functional use of meth-       that can be installed and used on the three main op-
ods is more complicated, but richer and indeed nec-       erating systems for computing with dataā€”Windows,
essary to support the essential role of functions. A      Linux, and Mac OS X.
few other languages do combine functional structure
with methods, such as Dylan, (Shalit, 1996, chapters          The sixth facet for our discussion is that R is a col-
5 and 6), and comparisons have proved useful for R,       laborative enterprise, which a large group of people
but came after the initial design was in place.           share in and contribute to. Central to the collabo-
                                                          rative facet is, of course, that R is a freely available,
                                                          open-source system, both the core R software and
Modular design and collaborative                          most of the packages built upon it. As most read-
                                                          ers of this journal will be aware, the combination of
support                                                   the collaborative efforts and the package mechanism
                                                          have enormously increased the availability of tech-
Our third pair of facets arrives with R. The paper by
                                                          niques for data analysis, especially the results of new
Ihaka and Gentleman (1996) introduced a statistical
                                                          research in statistics and its applications. The pack-
software system, characterized later as ā€œnot unlike
                                                          age management tools, in fact, illustrate the essential
Sā€, but distinguished by some ideas intended to im-
                                                          role of collaboration. Another highly collaborative
prove on the earlier system. A facet of R related to
                                                          contribution has facilitated use of R internationally
some of those ideas and to important later develop-
                                                          (Ripley, 2005), both by the use of locales in compu-
ments is its approach to modularity. In a number of
                                                          tations and by the contribution of translations for R
ļ¬elds, including architecture and computer science,
                                                          documentation and messages.
modules are units designed to be useful in them-
selves and to ļ¬t together easily to create a desired          While the collaborative facet of R is the least tech-
result, such as a building or a software system. R has    nical, it may eventually have the most profound im-
two very important levels of modules: functions and       plications. The growth of the collaborative commu-
packages.                                                 nity involved is unprecedented. For example, a plot
    Functions are an obvious modular unit, espe-          in Fox (2008) shows that the number of packages
cially functions as objects. R extended the modu-         in the CRAN repository exhibited literal exponen-
larity of functions by providing them with an envi-       tial growth from 2001 to 2008. The current size of
ronment, usually the environment in which the func-       this and other repositories implies at a conservative
tion object was created. Objects assigned in this en-     estimate that several hundreds of contributors have
vironment can be accessed by name in a call to the        mastered quite a bit of technical expertise and satis-
function, and even modiļ¬ed, by stepping outside           ļ¬ed some considerable formal requirements to offer
the strict functional programming model. Functions        their efforts freely to the worldwide scientiļ¬c com-
sharing an environment can thus be used together          munity.
as a unit. The programming technique resulting is
usually called programming with closures in R; for            This facet does have some technical implications.
a discussion and comparison with other techniques,        One is that, being an open-source system, R is gener-
see (Chambers, 2008, section 5.4). Because the eval-      ally able to incorporate other open-source software
uator searches for external objects in the functionā€™s     and does indeed do so in many areas. In contrast,
environment, dependencies on external objects can         proprietary systems are more likely to build exten-
be controlled through that environment. The names-        sions internally, either to maintain competitive ad-
pace mechanism (Tierney, 2003), an important con-         vantage or to avoid the free distribution require-
tribution to trustworthy programming with R, uses         ments of some open-source licenses. Looking for
this technique to help ensure consistent behavior of      quality open-source software to extend the system
functions in a package.                                   should continue to be a favored strategy for R de-
    The larger module introduced by R is probably         velopment.
the most important for the systemā€™s growing use, the
package. As it has evolved, and continues to evolve,          On the downside, a large collaborative enterprise
an R package is a module combining in most cases          with a general practice of making collective decisions
R functions, their documentation, and possibly data       has a natural tendency towards conservatism. Rad-
objects and/or code in other languages.                   ical changes do threaten to break currently working
    While S always had the idea of collections of soft-   features. The future beneļ¬ts they might bring will of-
ware for distribution, somewhat formalized in S4          ten not be sufļ¬ciently persuasive. The very success
as chapters, the R package greatly extends and im-        of R, combined with its collaborative facet, poses a
proves the ability to share computing results as ef-      challenge to cultivate the ā€œnext big stepā€ in software
fective modules. The mechanism beneļ¬ts from some          for data analysis.


The R Journal Vol. 1/1, May 2009                                                                  ISSN 2073-4859
8                                                                         I NVITED S ECTION : T HE F UTURE OF R


Implications                                               ā€œworkaroundsā€ are a good thing or not. One may
                                                           feel that the total effort devoted to multiple approx-
A number of speciļ¬c implications of the facets of R,       imate solutions would have been more productive if
and of their interaction, have been noted in previous      coordinated and applied to improving the core im-
sections. Further implications are suggested in con-       plementation. In practice, however, the nature of the
sidering the current state of R and its future possibil-   collaborative effort supporting R makes it much eas-
ities.                                                     ier to obtain a modest effort from one or a few indi-
    Many reasons can be suggested for Rā€™s increasing       viduals than to organize and execute a larger project.
popularity. Certainly part of the picture is a produc-     Perhaps the growth of R will encourage the com-
tive interaction among the interface facet, the modu-      munity to push for such projects, with a new view
larity of packages, and the collaborative, open-source     of funding and organization. Meanwhile, we can
approach, all in an interactive mode. From the very        hope that the growing communication facilities in
beginning, S was not viewed as a set of procedures         the community will assess and improve on the ex-
(the model for SAS, for example) but as an interactive     isting efforts. Particularly helpful facilities in this
system to support the implementation of new pro-           context include the CRAN task views (http://cran.
cedures. The growth in packages noted previously           r-project.org/web/views/), the mailing lists and
reļ¬‚ects the same view.                                     publications such as this journal.
    The emphasis on interfaces has never diminished
and the range of possible procedures across the in-
terface has expanded greatly. That Rā€™s growth has          Bibliography
been especially strong in academic and other re-
search environments is at least partly due to the abil-    R. A. Becker, J. M. Chambers, and A. R. Wilks. The
ity to write interfaces to existing and new software         New S Language. Chapman & Hall, London, 1988.
of many forms. The evolution of the package as             J. M. Chambers. Programming with Data. Springer,
an effective module and of the repositories for con-          New York, 1998. URL http://cm.bell-labs.
tributed software from the community have resulted            com/cm/ms/departments/sia/Sbook/. ISBN 0-387-
in many new interfaces.                                       98503-4.
    The facets also have implications for the possible
future of R. The hope is that R can continue to sup-       J. M. Chambers. Software for Data Analysis: Program-
port the implementation and communication of im-              ming with R. Springer, New York, 2008. ISBN 978-
portant techniques for data analysis, contributed and         0-387-75935-7.
used by a growing community. At the same time,
                                                           J. M. Chambers and T. J. Hastie. Statistical Mod-
those interested in new directions for statistical com-
                                                              els in S. Chapman & Hall, London, 1992. ISBN
puting will hope that R or some future alternative
                                                              9780412830402.
engages a variety of challenges for computing with
data. The collaborative community seems the only           J. Fox. Editorial. R News, 8(2):1ā€“2, October 2008. URL
likely generator of the new features required but, as         http://CRAN.R-project.org/doc/Rnews/.
noted in the previous section, combining the new
with maintenance of a popular current collaborative        R. Ihaka and R. Gentleman. R: A language for data
system will not be easy.                                     analysis and graphics. Journal of Computational and
    The encouragement for new packages generally             Graphical Statistics, 5:299ā€“314, 1996.
and for interfaces to other software in particular are     K. E. Iverson. A Programming Language. Wiley, 1962.
relevant for this question also. Changes that might
be difļ¬cult if applied internally to the core R im-        B. D. Ripley. Internationalization features of R 2.1.0.
plementation can often be supplied, or at least ap-          R News, 5(1):2ā€“7, May 2005. URL http://CRAN.
proximated, by packages. Two examples of enhance-            R-project.org/doc/Rnews/.
ments considered in discussions are an optional ob-        A. Shalit. The Dylan Reference Manual. Addison-
ject model using mutable references and computa-             Wesley Developers Press, Reading, Mass., 1996.
tions using multiple threads. In both cases, it can          ISBN 0-201-44211-6.
be argued that adding the facility to R could extend
its applicability and performance. But a substantial       L. Tierney. Name space management for R. R News,
programming effort and knowledge of the current               3(1):2ā€“6, June 2003. URL http://CRAN.R-project.
implementation would be required to modify the in-            org/doc/Rnews/.
ternal object model or to make the core code thread-
safe. Issues of back compatibility are likely to arise
                                                           John M. Chambers
as well. Meanwhile, less ambitious approaches in
                                                           Department of Statistics
the form of packages that approximate the desired
                                                           Stanford University
behavior have been written. For the second exam-
                                                           USA
ple especially, these may interface to other software
                                                           jmc@r-project.org
designed to provide this facility.
    Opinions may reasonably differ on whether these
The R Journal Vol. 1/1, May 2009                                                                 ISSN 2073-4859

More Related Content

Similar to R Journal 2009 1 Chambers

Financial Risk Mgt - Lec 4 by Dr. Syed Muhammad Ali Tirmizi
Financial Risk Mgt - Lec 4 by Dr. Syed Muhammad Ali TirmiziFinancial Risk Mgt - Lec 4 by Dr. Syed Muhammad Ali Tirmizi
Financial Risk Mgt - Lec 4 by Dr. Syed Muhammad Ali TirmiziDr. Muhammad Ali Tirmizi., Ph.D.
Ā 
Switching from lispstat to r
Switching from lispstat to rSwitching from lispstat to r
Switching from lispstat to rAjay Ohri
Ā 
effectivegraphsmro1
effectivegraphsmro1effectivegraphsmro1
effectivegraphsmro1Joyce Robbins
Ā 
GoOpen 2010: Roger Bivand
GoOpen 2010: Roger BivandGoOpen 2010: Roger Bivand
GoOpen 2010: Roger BivandFriprogsenteret
Ā 
OBJECT ORIENTED PROGRAMMING.docx
OBJECT ORIENTED PROGRAMMING.docxOBJECT ORIENTED PROGRAMMING.docx
OBJECT ORIENTED PROGRAMMING.docxAleKi2
Ā 
Analysis Report
 Analysis Report  Analysis Report
Analysis Report Dhaarna Singh
Ā 
statistical computation using R- report
statistical computation using R- reportstatistical computation using R- report
statistical computation using R- reportKamarudheen KV
Ā 
2 it unit-1 start learning r
2 it   unit-1 start learning r2 it   unit-1 start learning r
2 it unit-1 start learning rNetaji Gandi
Ā 
Statistical Packages SPSS, R, Python - Business Statistics & Research Methods...
Statistical Packages SPSS, R, Python - Business Statistics & Research Methods...Statistical Packages SPSS, R, Python - Business Statistics & Research Methods...
Statistical Packages SPSS, R, Python - Business Statistics & Research Methods...Sundar B N
Ā 
R for data analytics
R for data analyticsR for data analytics
R for data analyticsVijayMohan Vasu
Ā 
Realization of natural language interfaces using
Realization of natural language interfaces usingRealization of natural language interfaces using
Realization of natural language interfaces usingunyil96
Ā 
R programming for psychometrics
R programming for psychometricsR programming for psychometrics
R programming for psychometricsDiane Talley
Ā 
The Statistical Significance of "R"
The Statistical Significance of "R"The Statistical Significance of "R"
The Statistical Significance of "R"ppvora
Ā 
Learn a language : LISP
Learn a language : LISPLearn a language : LISP
Learn a language : LISPDevnology
Ā 
R vs SPSS: Which One is The Best Statistical Language
R vs SPSS: Which One is The Best Statistical LanguageR vs SPSS: Which One is The Best Statistical Language
R vs SPSS: Which One is The Best Statistical LanguageStat Analytica
Ā 
R programming language
R programming languageR programming language
R programming languageKeerti Verma
Ā 

Similar to R Journal 2009 1 Chambers (20)

Modest Formalization of Software Design Patterns
Modest Formalization of Software Design PatternsModest Formalization of Software Design Patterns
Modest Formalization of Software Design Patterns
Ā 
Financial Risk Mgt - Lec 4 by Dr. Syed Muhammad Ali Tirmizi
Financial Risk Mgt - Lec 4 by Dr. Syed Muhammad Ali TirmiziFinancial Risk Mgt - Lec 4 by Dr. Syed Muhammad Ali Tirmizi
Financial Risk Mgt - Lec 4 by Dr. Syed Muhammad Ali Tirmizi
Ā 
Switching from lispstat to r
Switching from lispstat to rSwitching from lispstat to r
Switching from lispstat to r
Ā 
R_L1-Aug-2022.pptx
R_L1-Aug-2022.pptxR_L1-Aug-2022.pptx
R_L1-Aug-2022.pptx
Ā 
effectivegraphsmro1
effectivegraphsmro1effectivegraphsmro1
effectivegraphsmro1
Ā 
GoOpen 2010: Roger Bivand
GoOpen 2010: Roger BivandGoOpen 2010: Roger Bivand
GoOpen 2010: Roger Bivand
Ā 
OBJECT ORIENTED PROGRAMMING.docx
OBJECT ORIENTED PROGRAMMING.docxOBJECT ORIENTED PROGRAMMING.docx
OBJECT ORIENTED PROGRAMMING.docx
Ā 
Analysis Report
 Analysis Report  Analysis Report
Analysis Report
Ā 
statistical computation using R- report
statistical computation using R- reportstatistical computation using R- report
statistical computation using R- report
Ā 
2 it unit-1 start learning r
2 it   unit-1 start learning r2 it   unit-1 start learning r
2 it unit-1 start learning r
Ā 
UNIT-1 Start Learning R.pdf
UNIT-1 Start Learning R.pdfUNIT-1 Start Learning R.pdf
UNIT-1 Start Learning R.pdf
Ā 
Statistical Packages SPSS, R, Python - Business Statistics & Research Methods...
Statistical Packages SPSS, R, Python - Business Statistics & Research Methods...Statistical Packages SPSS, R, Python - Business Statistics & Research Methods...
Statistical Packages SPSS, R, Python - Business Statistics & Research Methods...
Ā 
R for data analytics
R for data analyticsR for data analytics
R for data analytics
Ā 
Realization of natural language interfaces using
Realization of natural language interfaces usingRealization of natural language interfaces using
Realization of natural language interfaces using
Ā 
History Of C Essay
History Of C EssayHistory Of C Essay
History Of C Essay
Ā 
R programming for psychometrics
R programming for psychometricsR programming for psychometrics
R programming for psychometrics
Ā 
The Statistical Significance of "R"
The Statistical Significance of "R"The Statistical Significance of "R"
The Statistical Significance of "R"
Ā 
Learn a language : LISP
Learn a language : LISPLearn a language : LISP
Learn a language : LISP
Ā 
R vs SPSS: Which One is The Best Statistical Language
R vs SPSS: Which One is The Best Statistical LanguageR vs SPSS: Which One is The Best Statistical Language
R vs SPSS: Which One is The Best Statistical Language
Ā 
R programming language
R programming languageR programming language
R programming language
Ā 

More from Ajay Ohri

Introduction to R ajay Ohri
Introduction to R ajay OhriIntroduction to R ajay Ohri
Introduction to R ajay OhriAjay Ohri
Ā 
Introduction to R
Introduction to RIntroduction to R
Introduction to RAjay Ohri
Ā 
Social Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 ElectionSocial Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 ElectionAjay Ohri
Ā 
Pyspark
PysparkPyspark
PysparkAjay Ohri
Ā 
Download Python for R Users pdf for free
Download Python for R Users pdf for freeDownload Python for R Users pdf for free
Download Python for R Users pdf for freeAjay Ohri
Ā 
Install spark on_windows10
Install spark on_windows10Install spark on_windows10
Install spark on_windows10Ajay Ohri
Ā 
Ajay ohri Resume
Ajay ohri ResumeAjay ohri Resume
Ajay ohri ResumeAjay Ohri
Ā 
Statistics for data scientists
Statistics for  data scientistsStatistics for  data scientists
Statistics for data scientistsAjay Ohri
Ā 
National seminar on emergence of internet of things (io t) trends and challe...
National seminar on emergence of internet of things (io t)  trends and challe...National seminar on emergence of internet of things (io t)  trends and challe...
National seminar on emergence of internet of things (io t) trends and challe...Ajay Ohri
Ā 
Tools and techniques for data science
Tools and techniques for data scienceTools and techniques for data science
Tools and techniques for data scienceAjay Ohri
Ā 
Training in Analytics and Data Science
Training in Analytics and Data ScienceTraining in Analytics and Data Science
Training in Analytics and Data ScienceAjay Ohri
Ā 
Tradecraft
Tradecraft   Tradecraft
Tradecraft Ajay Ohri
Ā 
Software Testing for Data Scientists
Software Testing for Data ScientistsSoftware Testing for Data Scientists
Software Testing for Data ScientistsAjay Ohri
Ā 
A Data Science Tutorial in Python
A Data Science Tutorial in PythonA Data Science Tutorial in Python
A Data Science Tutorial in PythonAjay Ohri
Ā 
How does cryptography work? by Jeroen Ooms
How does cryptography work?  by Jeroen OomsHow does cryptography work?  by Jeroen Ooms
How does cryptography work? by Jeroen OomsAjay Ohri
Ā 
Using R for Social Media and Sports Analytics
Using R for Social Media and Sports AnalyticsUsing R for Social Media and Sports Analytics
Using R for Social Media and Sports AnalyticsAjay Ohri
Ā 
Kush stats alpha
Kush stats alpha Kush stats alpha
Kush stats alpha Ajay Ohri
Ā 
Analyze this
Analyze thisAnalyze this
Analyze thisAjay Ohri
Ā 
Summer school python in spanish
Summer school python in spanishSummer school python in spanish
Summer school python in spanishAjay Ohri
Ā 

More from Ajay Ohri (20)

Introduction to R ajay Ohri
Introduction to R ajay OhriIntroduction to R ajay Ohri
Introduction to R ajay Ohri
Ā 
Introduction to R
Introduction to RIntroduction to R
Introduction to R
Ā 
Social Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 ElectionSocial Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 Election
Ā 
Pyspark
PysparkPyspark
Pyspark
Ā 
Download Python for R Users pdf for free
Download Python for R Users pdf for freeDownload Python for R Users pdf for free
Download Python for R Users pdf for free
Ā 
Install spark on_windows10
Install spark on_windows10Install spark on_windows10
Install spark on_windows10
Ā 
Ajay ohri Resume
Ajay ohri ResumeAjay ohri Resume
Ajay ohri Resume
Ā 
Statistics for data scientists
Statistics for  data scientistsStatistics for  data scientists
Statistics for data scientists
Ā 
National seminar on emergence of internet of things (io t) trends and challe...
National seminar on emergence of internet of things (io t)  trends and challe...National seminar on emergence of internet of things (io t)  trends and challe...
National seminar on emergence of internet of things (io t) trends and challe...
Ā 
Tools and techniques for data science
Tools and techniques for data scienceTools and techniques for data science
Tools and techniques for data science
Ā 
Training in Analytics and Data Science
Training in Analytics and Data ScienceTraining in Analytics and Data Science
Training in Analytics and Data Science
Ā 
Tradecraft
Tradecraft   Tradecraft
Tradecraft
Ā 
Software Testing for Data Scientists
Software Testing for Data ScientistsSoftware Testing for Data Scientists
Software Testing for Data Scientists
Ā 
Craps
CrapsCraps
Craps
Ā 
A Data Science Tutorial in Python
A Data Science Tutorial in PythonA Data Science Tutorial in Python
A Data Science Tutorial in Python
Ā 
How does cryptography work? by Jeroen Ooms
How does cryptography work?  by Jeroen OomsHow does cryptography work?  by Jeroen Ooms
How does cryptography work? by Jeroen Ooms
Ā 
Using R for Social Media and Sports Analytics
Using R for Social Media and Sports AnalyticsUsing R for Social Media and Sports Analytics
Using R for Social Media and Sports Analytics
Ā 
Kush stats alpha
Kush stats alpha Kush stats alpha
Kush stats alpha
Ā 
Analyze this
Analyze thisAnalyze this
Analyze this
Ā 
Summer school python in spanish
Summer school python in spanishSummer school python in spanish
Summer school python in spanish
Ā 

Recently uploaded

šŸ¬ The future of MySQL is Postgres šŸ˜
šŸ¬  The future of MySQL is Postgres   šŸ˜šŸ¬  The future of MySQL is Postgres   šŸ˜
šŸ¬ The future of MySQL is Postgres šŸ˜RTylerCroy
Ā 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
Ā 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
Ā 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
Ā 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
Ā 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
Ā 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
Ā 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
Ā 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
Ā 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
Ā 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
Ā 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
Ā 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
Ā 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
Ā 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
Ā 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
Ā 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
Ā 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
Ā 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
Ā 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
Ā 

Recently uploaded (20)

šŸ¬ The future of MySQL is Postgres šŸ˜
šŸ¬  The future of MySQL is Postgres   šŸ˜šŸ¬  The future of MySQL is Postgres   šŸ˜
šŸ¬ The future of MySQL is Postgres šŸ˜
Ā 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
Ā 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
Ā 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
Ā 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
Ā 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
Ā 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
Ā 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
Ā 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Ā 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Ā 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
Ā 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
Ā 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Ā 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Ā 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Ā 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
Ā 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Ā 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
Ā 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
Ā 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
Ā 

R Journal 2009 1 Chambers

  • 1. I NVITED S ECTION : T HE F UTURE OF R 5 Facets of R Special invited paper on ā€œThe Future of Rā€ 2. interactive, hands-on in real time; by John M. Chambers 3. functional in its model of programming; 4. object-oriented, ā€œeverything is an objectā€; We are seeing today a widespread, and welcome, tendency for non-computer-specialists among statis- 5. modular, built from standardized pieces; and, ticians and others to write collections of R functions 6. collaborative, a world-wide, open-source effort. that organize and communicate their work. Along with the ļ¬‚ood of software sometimes comes an atti- None of these facets is original to R, but while many tude that one need only learn, or teach, a sort of basic systems emphasize one or two of them, R continues how-to-write-the-function level of R programming, to reļ¬‚ect them all, resulting in a programming model beyond which most of the detail is unimportant or that is indeed rich but sometimes messy. can be absorbed without much discussion. As delu- The thousands of R packages available from sions go, this one is not very objectionable if it en- CRAN, BioConductor, R-Forge, and other reposi- courages participation. Nevertheless, a delusion it is. tories, the uncounted other software contributions In fact, functions are only one of a variety of impor- from groups and individuals, and the many citations tant facets that R has acquired by intent or circum- in the scientiļ¬c literature all testify to the useful com- stance during the three-plus decades of the history of putations created within the R model. Understand- the software and of its predecessor S. To create valu- ing how the various facets arose and how they can able and trustworthy software using R often requires work together may help us improve and extend that an understanding of some of these facets and their software. interrelations. This paper identiļ¬es six facets, dis- We introduce the facets more or less chronolog- cussing where they came from, how they support or ically, in three pairs that entered into the software conļ¬‚ict with each other, and what implications they during successive intervals of roughly a decade. To have for the future of programming with R. provide some context, here is a brief chronology, with some references. The S project began in our statistics research group at Bell Labs in 1976, evolved Facets into a generally licensed system through the 1980s and continues in the S+ software, currently owned by Any software system that has endured and retained TIBCO Software Inc. The history of S is summarized a reasonably happy user community will likely in the Appendix to Chambers (2008); the standard have some distinguishing characteristics that form books introducing still-relevant versions of S include its style. The characteristics of different systems are Becker et al. (1988) (no longer in print), Chambers usually not totally unrelated, at least among systems and Hastie (1992), and Chambers (1998). R was an- serving roughly similar goalsā€”in computing as in nounced to the world in Ihaka and Gentleman (1996) other ļ¬elds, the set of truly distinct concepts is not and evolved fairly soon to be developed and man- that large. But in the mix of different characteris- aged by the R-core group and other contributors. Its tics and in the details of how they work together lies history is harder to pin down, partly because the much of the ļ¬‚avor of a particular system. story is very much still happening and partly be- Understanding such characteristics, including a cause, as a collaborative open-source project, indi- bit of the historical background, can be helpful in vidual responsibility is sometimes hard to attribute; making better use of the software. It can also guide in addition, not all the important new ideas have thinking about directions for future improvements of been described separately by their authors. The the system itself. http://www.r-project.org site points to the on-line The R software and the S software that preceded manuals, as well as a list of over 70 related books and it reļ¬‚ect a rather large range of characteristics, result- other material. The manuals are invaluable as a tech- ing in software that might be termed rich or messy nical resource, but short on background information. according to oneā€™s taste. Since R has become a very Articles in R News, the predecessor of this journal, widely used environment for applications and re- give descriptions of some of the key contributions, search in data analysis, understanding something of such as namespaces (Tierney, 2003), and internation- these characteristics may be helpful to the commu- alization (Ripley, 2005). nity. This paper considers six characteristics, which we will call facets. They characterize R as: An interactive interface 1. an interface to computational procedures of The goals for the ļ¬rst version of S in 1976 already many kinds; combined two facets of computing usually kept The R Journal Vol. 1/1, May 2009 ISSN 2073-4859
  • 2. 6 I NVITED S ECTION : T HE F UTURE OF R apart, interactive computing and the development of Functional and object-oriented pro- procedural software for scientiļ¬c applications. gramming One goal was a better interface to new computa- tional procedures for scientiļ¬c data analysis, usually During the 1980s the topics of functional programming implemented then as Fortran subroutines. Bell Labs and object-oriented programming stimulated growing statistics research had collected and written a variety interest in the computer science literature. At the of these while at the same time numerical libraries same time, I was exploring possibilities for the next and published algorithm collections were establish- S, perhaps ā€œafter Sā€. These ideas were eventually ing many of the essential computational techniques. blended with other work to produce the ā€œnew Sā€, These procedures were the essence of the actual data later called Version 3 of S, or S3, (Becker et al., 1988). analysis; the initial goal was essentially a better in- As before, user expressions were still interpreted, terface to procedural computation. A graphic from but were now parsed into objects representing the the ļ¬rst plans for S (Chambers, 2008, Appendix, page language elements, mainly function calls. The func- 476) illustrated this. tions were also objects, the result of parsing expres- sions in the language that deļ¬ned the function. As But that interface was to be interactive, speciļ¬cally the motto expressed it, ā€œEverything is an objectā€. The via a language used by the data analyst communicat- procedural interface facet was not abandoned, how- ing in real time with the software. At this time, a few ever, but from the userā€™s perspective it was now de- systems had pointed the way for interactive scientiļ¬c ļ¬ned by functions that provided an interface to a C computing, but most of them were highly specialized or Fortran subroutine. with limited programming facilities; that is, a limited The new programming model was functional in ability for the user to express novel computations. two senses. It largely followed the functional pro- The most striking exception was APL, created by gramming paradigm, which deļ¬ned programming Kenneth Iverson (1962), an operator-based interac- as the evaluation of function calls, with the value tive language with heavy emphasis on general mul- uniquely determined by the objects supplied as ar- tiway arrays. APL introduced conventions that now guments and with no external side-effects. Not all seem obvious; for example, the result of evaluating functions followed this strictly, however. a userā€™s expression was either an assignment or a The new version was functional also in the sense printed display of the computed object (rather than that nearly everything that happened in evaluation expecting users to type a print command). But APL was a function call. Evaluation in R is even more at the time had no notion of an interface to proce- functional in that language components such as as- dures, which along with a limited, if exotic, syntax signment and loops are formally function calls inter- caused us to reject it, although S adopted a number nally, reļ¬‚ecting in part the inļ¬‚uence of Lisp on the of features from it, such as its approach to general initial implementation of R. multidimensional arrays. Subsequently, the programing model was ex- tended to include classes of objects and methods From its ļ¬rst implementation, the S software that specialized functions according to the classes combined these two facets: users carried out data of their arguments. That extension itself came in analysis interactively by writing expressions in the two stages. First, a thin layer of additional code S language, but much of the programming to extend implemented classes and methods by two simple the system was via new Fortran subroutines accom- mechanisms: objects could have an attribute deļ¬n- panied by interfaces to call them from S. The inter- ing their class as a character string or a vector of faces were written in a special language that was multiple strings; and an explicitly invoked method compiled into Fortran. dispatch function would match the class of the func- The result was already a mixed system that had tionā€™s ļ¬rst argument against methods, which were much of the tension between human and machine simply saved function objects with a class string ap- efļ¬ciency that still stimulates debate among R users pended to the objectā€™s name. In the second stage, the and programmers. At the same time, the ļ¬‚exibility version of the system known as S4 provided formal from that same mixture helped the software to grow class deļ¬nitions, stored as a special metadata object, and adapt with a growing user community. and similarly formal method deļ¬nitions associated with generic functions, also deļ¬ned via object classes Later versions of the system replaced the inter- (Chambers, 2008, Chapters 9 and 10, for the R ver- face language with individual functions providing sion). interfaces to C or Fortran, but the combination of The functional and object-oriented facets were an interactive language and an interface to com- combined in S and R, but they appear more fre- piled procedures remains a key feature of R. Increas- quently in separate, incompatible languages. Typi- ingly, interfaces to other languages and systems have cal object-oriented languages such as C++ or Java are been added as well, for example to database systems, not functional; instead, their programming model is spreadsheets and user interface software. of methods associated with a class rather than with a The R Journal Vol. 1/1, May 2009 ISSN 2073-4859
  • 3. I NVITED S ECTION : T HE F UTURE OF R 7 function and invoked on an object. Because the ob- essential tools provided in R to create, develop, test, ject is treated as a reference, the methods can usu- and distribute packages. Among many beneļ¬ts for ally modify the object, again quite a different model users, these tools help programmers create software from that leading to R. The functional use of meth- that can be installed and used on the three main op- ods is more complicated, but richer and indeed nec- erating systems for computing with dataā€”Windows, essary to support the essential role of functions. A Linux, and Mac OS X. few other languages do combine functional structure with methods, such as Dylan, (Shalit, 1996, chapters The sixth facet for our discussion is that R is a col- 5 and 6), and comparisons have proved useful for R, laborative enterprise, which a large group of people but came after the initial design was in place. share in and contribute to. Central to the collabo- rative facet is, of course, that R is a freely available, open-source system, both the core R software and Modular design and collaborative most of the packages built upon it. As most read- ers of this journal will be aware, the combination of support the collaborative efforts and the package mechanism have enormously increased the availability of tech- Our third pair of facets arrives with R. The paper by niques for data analysis, especially the results of new Ihaka and Gentleman (1996) introduced a statistical research in statistics and its applications. The pack- software system, characterized later as ā€œnot unlike age management tools, in fact, illustrate the essential Sā€, but distinguished by some ideas intended to im- role of collaboration. Another highly collaborative prove on the earlier system. A facet of R related to contribution has facilitated use of R internationally some of those ideas and to important later develop- (Ripley, 2005), both by the use of locales in compu- ments is its approach to modularity. In a number of tations and by the contribution of translations for R ļ¬elds, including architecture and computer science, documentation and messages. modules are units designed to be useful in them- selves and to ļ¬t together easily to create a desired While the collaborative facet of R is the least tech- result, such as a building or a software system. R has nical, it may eventually have the most profound im- two very important levels of modules: functions and plications. The growth of the collaborative commu- packages. nity involved is unprecedented. For example, a plot Functions are an obvious modular unit, espe- in Fox (2008) shows that the number of packages cially functions as objects. R extended the modu- in the CRAN repository exhibited literal exponen- larity of functions by providing them with an envi- tial growth from 2001 to 2008. The current size of ronment, usually the environment in which the func- this and other repositories implies at a conservative tion object was created. Objects assigned in this en- estimate that several hundreds of contributors have vironment can be accessed by name in a call to the mastered quite a bit of technical expertise and satis- function, and even modiļ¬ed, by stepping outside ļ¬ed some considerable formal requirements to offer the strict functional programming model. Functions their efforts freely to the worldwide scientiļ¬c com- sharing an environment can thus be used together munity. as a unit. The programming technique resulting is usually called programming with closures in R; for This facet does have some technical implications. a discussion and comparison with other techniques, One is that, being an open-source system, R is gener- see (Chambers, 2008, section 5.4). Because the eval- ally able to incorporate other open-source software uator searches for external objects in the functionā€™s and does indeed do so in many areas. In contrast, environment, dependencies on external objects can proprietary systems are more likely to build exten- be controlled through that environment. The names- sions internally, either to maintain competitive ad- pace mechanism (Tierney, 2003), an important con- vantage or to avoid the free distribution require- tribution to trustworthy programming with R, uses ments of some open-source licenses. Looking for this technique to help ensure consistent behavior of quality open-source software to extend the system functions in a package. should continue to be a favored strategy for R de- The larger module introduced by R is probably velopment. the most important for the systemā€™s growing use, the package. As it has evolved, and continues to evolve, On the downside, a large collaborative enterprise an R package is a module combining in most cases with a general practice of making collective decisions R functions, their documentation, and possibly data has a natural tendency towards conservatism. Rad- objects and/or code in other languages. ical changes do threaten to break currently working While S always had the idea of collections of soft- features. The future beneļ¬ts they might bring will of- ware for distribution, somewhat formalized in S4 ten not be sufļ¬ciently persuasive. The very success as chapters, the R package greatly extends and im- of R, combined with its collaborative facet, poses a proves the ability to share computing results as ef- challenge to cultivate the ā€œnext big stepā€ in software fective modules. The mechanism beneļ¬ts from some for data analysis. The R Journal Vol. 1/1, May 2009 ISSN 2073-4859
  • 4. 8 I NVITED S ECTION : T HE F UTURE OF R Implications ā€œworkaroundsā€ are a good thing or not. One may feel that the total effort devoted to multiple approx- A number of speciļ¬c implications of the facets of R, imate solutions would have been more productive if and of their interaction, have been noted in previous coordinated and applied to improving the core im- sections. Further implications are suggested in con- plementation. In practice, however, the nature of the sidering the current state of R and its future possibil- collaborative effort supporting R makes it much eas- ities. ier to obtain a modest effort from one or a few indi- Many reasons can be suggested for Rā€™s increasing viduals than to organize and execute a larger project. popularity. Certainly part of the picture is a produc- Perhaps the growth of R will encourage the com- tive interaction among the interface facet, the modu- munity to push for such projects, with a new view larity of packages, and the collaborative, open-source of funding and organization. Meanwhile, we can approach, all in an interactive mode. From the very hope that the growing communication facilities in beginning, S was not viewed as a set of procedures the community will assess and improve on the ex- (the model for SAS, for example) but as an interactive isting efforts. Particularly helpful facilities in this system to support the implementation of new pro- context include the CRAN task views (http://cran. cedures. The growth in packages noted previously r-project.org/web/views/), the mailing lists and reļ¬‚ects the same view. publications such as this journal. The emphasis on interfaces has never diminished and the range of possible procedures across the in- terface has expanded greatly. That Rā€™s growth has Bibliography been especially strong in academic and other re- search environments is at least partly due to the abil- R. A. Becker, J. M. Chambers, and A. R. Wilks. The ity to write interfaces to existing and new software New S Language. Chapman & Hall, London, 1988. of many forms. The evolution of the package as J. M. Chambers. Programming with Data. Springer, an effective module and of the repositories for con- New York, 1998. URL http://cm.bell-labs. tributed software from the community have resulted com/cm/ms/departments/sia/Sbook/. ISBN 0-387- in many new interfaces. 98503-4. The facets also have implications for the possible future of R. The hope is that R can continue to sup- J. M. Chambers. Software for Data Analysis: Program- port the implementation and communication of im- ming with R. Springer, New York, 2008. ISBN 978- portant techniques for data analysis, contributed and 0-387-75935-7. used by a growing community. At the same time, J. M. Chambers and T. J. Hastie. Statistical Mod- those interested in new directions for statistical com- els in S. Chapman & Hall, London, 1992. ISBN puting will hope that R or some future alternative 9780412830402. engages a variety of challenges for computing with data. The collaborative community seems the only J. Fox. Editorial. R News, 8(2):1ā€“2, October 2008. URL likely generator of the new features required but, as http://CRAN.R-project.org/doc/Rnews/. noted in the previous section, combining the new with maintenance of a popular current collaborative R. Ihaka and R. Gentleman. R: A language for data system will not be easy. analysis and graphics. Journal of Computational and The encouragement for new packages generally Graphical Statistics, 5:299ā€“314, 1996. and for interfaces to other software in particular are K. E. Iverson. A Programming Language. Wiley, 1962. relevant for this question also. Changes that might be difļ¬cult if applied internally to the core R im- B. D. Ripley. Internationalization features of R 2.1.0. plementation can often be supplied, or at least ap- R News, 5(1):2ā€“7, May 2005. URL http://CRAN. proximated, by packages. Two examples of enhance- R-project.org/doc/Rnews/. ments considered in discussions are an optional ob- A. Shalit. The Dylan Reference Manual. Addison- ject model using mutable references and computa- Wesley Developers Press, Reading, Mass., 1996. tions using multiple threads. In both cases, it can ISBN 0-201-44211-6. be argued that adding the facility to R could extend its applicability and performance. But a substantial L. Tierney. Name space management for R. R News, programming effort and knowledge of the current 3(1):2ā€“6, June 2003. URL http://CRAN.R-project. implementation would be required to modify the in- org/doc/Rnews/. ternal object model or to make the core code thread- safe. Issues of back compatibility are likely to arise John M. Chambers as well. Meanwhile, less ambitious approaches in Department of Statistics the form of packages that approximate the desired Stanford University behavior have been written. For the second exam- USA ple especially, these may interface to other software jmc@r-project.org designed to provide this facility. Opinions may reasonably differ on whether these The R Journal Vol. 1/1, May 2009 ISSN 2073-4859