Contributing to SAS® By Writing Your Very Own Package
Hunter Glanz, Statistics Department, California Polytechnic State University, San Luis Obispo,
California
Emily Johnson, Statistics Department, California Polytechnic State University, San Luis Obispo,
California
ABSTRACT
One of the biggest reasons for the explosive growth of R statistical software in recent years is the massive collection
of user-developed packages. Each package consists of a number of functions centered around a particular theme or
task, not previously addressed (well) within the software. While SAS® continues to advance on its own, SAS® users
can now contribute packages to the broader SAS® community. Creating and contributing a package is simple and
straightforward, empowering SAS® users immensely to grow the software themselves. There is a lot of potential to
increase the general applicability of SAS® to tasks beyond statistics and data management, and it's up to you!
INTRODUCTION
The attraction to the path of least resistance, when it comes to programming and working with data, remains
unavoidably strong. Practically speaking, this means that people often reach out to software or methods that they
have experience with or those that they know excel at the task at hand. Popular software packages like R, SAS,
Python, and Microsoft Excel each maintain their own strengths and weaknesses. People “grow up” learning how to
accomplish different things with different tools. This can happen because a tool is particularly good at something, or
because they have inherited the preferences of a teacher. To avoid professional suicide, one should
definitely maintain knowledge and skills in a variety of programming languages and tools, but being forced into a
particular software tool because of its niche ability to handle a particular problem remains undesirable. Thankfully,
this problem is becoming extinct!
The R and Python languages both allow for the construction and contribution of additional, user-developed functions
and tools in the form of packages for R and libraries for Python. In fact, a huge reason for the explosive growth in the
use of R for statistics and data science is the corresponding growth in the number of packages developed.
The arsenal of data management and analytic tools that SAS possesses continues to grow and stagger most users.
In fact, it is not uncommon for a given SAS user to be unaware of whole collections of tools that SAS has at its
disposal. Despite this, one long-standing complaint about SAS was that, unlike R and Python, it could not
accommodate user-contributed content and functionality. The ability to create packages for SAS/IML has officially
dispelled this issue.
WORKING WITH SAS/IML PACKAGES
The acronym IML stands for “interactive matrix language” and describes the two key features of the language: 1) it is
interactive and 2) its fundamental object is a data matrix. Those new to SAS/IML should read through the user’s
guide at support.sas.com for an overview of an extremely powerful dimension of SAS.
Those accustomed to working in R or Python will likely feel more at home using IML than base SAS. Especially nice
is that IML is not a stand-apart component of the SAS system: you can leverage base SAS functionality from within
it whenever you like. So, current SAS users should not feel alienated either!
In addition to numerous other things, IML allows for the creation of custom modules (think functions) which can be
stored and re-used at later times. A SAS/IML package consists of, most importantly, a collection of modules that
accomplish what base IML does not.
For the remainder of this article we will use a collection of modules for simulating Blackjack to demonstrate how to
work with a SAS/IML package. Suppose we are interested in simulating the game of Blackjack. Let us start by
creating the deck of cards and a module for shuffling that deck:
proc iml;
deck = 1:10;
deck = deck || {10 10 10};
deck = repeat(deck, 4);
create deck var {deck};
append;
close deck;
quit;
/* creating a deck of cards */
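proc iml;
start shuffle_deck(deck, n);
   use deck;
   read all into deck;
   close deck;
   cards = repeat(deck, n);
   /* "WOR" permutes the cards without replacement; the SAMPLE default resamples with replacement */
   shuffle = sample(cards, nrow(cards)*ncol(cards), "WOR");
   return(shuffle);
finish shuffle_deck;
store module = shuffle_deck;
shuffled_deck = shuffle_deck(deck, 2);
print shuffled_deck;
quit;
/* shuffles n decks */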
In the first PROC IML step the deck data set is built and stored. The second PROC IML step creates and stores the module
shuffle_deck, which randomizes the order of the cards in any number of decks. The “shoe” contains the shuffled cards
to be dealt. Shoes typically contain more than one deck and the decks are reshuffled often, so our shoe module will
draw m cards from the shoe with replacement:
proc iml;
start shoe(m);
   load module = shuffle_deck;
   use deck;
   read all into deck;
   close deck;
   deck = shuffle_deck(deck, 2);
   shoe = sample(deck, m, "Replace");
   return(shoe);
finish shoe;
store module = shoe;
/* TEST */
print(shoe(2));
quit;
/* Draws m cards from the shoe with replacement. */
Now we need a module for dealing a new hand and placing a bet for that hand. We will call this module new_hand:
proc iml;
start new_hand(bet, m);
   load module = shoe;
   cards = shoe(m);
   hand = cards || bet;
   return(hand);
finish new_hand;
store module = new_hand;
/* TEST */
print(new_hand(1,2));
quit;
/* m is the number of cards to be drawn by the shoe module. Draw 2
cards when starting a game. */
/* The bet amount is always the last number in the hand matrix. */
Notice that the new_hand module makes use of the shoe module. Modules created within SAS/IML can reference
each other quite smoothly! Our blackjack package includes the above modules as well as modules to:
Compute the value of a hand
Compute the winnings of a particular game
Print a particular hand
Hit on a turn for a particular hand
Stand on a turn for a particular hand
Double down on a turn for a particular hand
Split on a turn for a particular hand
Compute the value of the dealer’s hand
Generate the dealer’s hand
Play a game with a dealer and one player
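As an illustration of what one of these modules might look like, here is a minimal sketch of a module that computes the value of a hand. It is not the authors' actual code: the name hand_value is hypothetical, and the sketch assumes the conventions used above (aces are stored as 1, face cards as 10, and the bet is the last element of the hand matrix).

```sas
proc iml;
/* hand_value is a hypothetical name; this is a sketch, not the package's module */
start hand_value(hand);
   k = ncol(hand);
   cards = hand[1:(k-1)];      /* the last element is the bet, so drop it */
   total = sum(cards);         /* aces are stored as 1 in the deck */
   /* count one ace as 11 instead of 1 when that does not bust the hand */
   if any(cards = 1) & (total + 10 <= 21) then total = total + 10;
   return(total);
finish hand_value;

/* TEST: an ace and a ten with a bet of 5 */
hand = {1 10 5};
v = hand_value(hand);
print v;
quit;
```

Treating at most one ace as 11 is enough, because counting two aces as 11 each would already exceed 21.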
Our list of modules is not prohibitively long, but it definitely contains more than you would want to include at the top of
your SAS script every time you want to use any of them. Ideally you could run something like
proc iml;
package install "C:\packages\blackjack.zip";
quit;
proc iml;
package load blackjack;
to install and load the blackjack package (i.e., all of the modules described above). You can! Instead of running a
long list of PROC IML steps to create and store modules in every SAS script, you can package the modules up into one
central location with documentation and sample programs. The package only needs to be installed once, but it must be
loaded each time you want to use any part of it.
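Once a package is installed, the PACKAGE statement can also be used to inspect it. The following sketch assumes that a package named blackjack has already been installed as described above; what each statement displays depends on the information and help files the package author supplied.

```sas
proc iml;
package list;             /* list the packages that are currently installed */
package info blackjack;   /* display descriptive information about the package */
package help blackjack;   /* display the help text that ships with the package */
quit;
```

Checking the help text first is a reasonable habit, since user-contributed packages are documented only by their authors.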
A SAS/IML package includes more than just a collection of modules. Similar to R packages and Python libraries,
SAS/IML packages include custom modules (functions), sample data sets, sample programs, and
documentation/help on all of these things! The SAS/IML File Exchange hosts the current collection of SAS/IML
packages. These are user-contributed and not verified by anyone at SAS. The accuracy and quality of the packages
are the responsibility of the authors. So, you will need to contact them if you have questions or concerns.
LEVERAGING R IN SAS/IML
As all of the module and package writing takes place in SAS/IML, the ability of IML to work with the R programming
language should not go unmentioned. Earlier we established that programmers likely reach for the tools they know best or
know work best. There is even less reason to leave SAS now that R code can be incorporated and executed directly
in SAS/IML. As a simple example, suppose we want to simulate the sampling distribution of the sample mean of a
random sample of size 10 from the standard uniform distribution, and then compute the overall mean of these
simulated sample means. One way to do this in R consists of
xbars <- matrix(colMeans(replicate(10000, {runif(10)})), ncol = 1)
allmean <- mean(xbars)
The variable allmean contains our mean of the sample means.
This same R code could be run in PROC IML as easily as
proc iml;
submit / R;
   xbars <- matrix(colMeans(replicate(10000, {runif(10)})), ncol = 1)
   allmean <- mean(xbars)
endsubmit;
run ImportMatrixFromR(allmean, "allmean");
print(allmean);
quit;
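For comparison, the same simulation can be sketched in native SAS/IML without calling R at all. This is only one possible translation, not code from the original modules; the seed value and the variable names nsamp and x are arbitrary choices made for this illustration.

```sas
proc iml;
call randseed(2016);           /* arbitrary seed, chosen only for reproducibility */
nsamp = 10000;                 /* number of simulated samples */
n = 10;                        /* size of each sample */
x = j(n, nsamp);               /* allocate a matrix; each column will hold one sample */
call randgen(x, "Uniform");    /* fill the matrix with standard uniform draws */
xbars = mean(x);               /* column means: a 1 x nsamp row vector of sample means */
allmean = xbars[:];            /* mean of the sample means */
print allmean;
quit;
```

In either language the result should be close to 0.5, the mean of the standard uniform distribution.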
If we wanted to create a new module to perform this task and store it, we could follow the same steps as we did with
the blackjack modules by merely wrapping the necessary code within a start and finish statement.
proc iml;
start rdatasim(n);
   run ExportMatrixToR(n, "size");
   submit / R;
      xbars <- matrix(colMeans(replicate(10000, {runif(size)})), ncol = 1)
      allmean <- mean(xbars)
   endsubmit;
   run ImportMatrixFromR(allmean, "allmean");
   return(allmean);
finish rdatasim;
store module = rdatasim;
quit;
/* Test R module */
proc iml;
load module = rdatasim;
print(rdatasim(10));
print(rdatasim(10));
quit;
In the above code the argument, n, to the rdatasim module dictates the size of the samples of standard uniform
variables. The seamless communication and passing of values between SAS and R makes this IML implementation
an incredibly powerful tool. In case you were thinking the use of R within SAS/IML was limited to base R, think again!
R code submitted from PROC IML can load and use R packages as well. As another,
separate example, consider the stringi package in R, which includes a number of functions for handily working with
character data.
proc iml;
submit / R;
library(stringi)
x <- "This is a sentence"
y <- stri_split_boundaries(x, type = "word")
ans <- y[[1]][1]
endsubmit;
run ImportMatrixFromR(answer, "ans");
print(answer);
quit;
The above code takes the character value “This is a sentence”, breaks it into words using the stri_split_boundaries
function in the stringi package, and then prints the first word of the sentence, “This.” While this is a very simple
example, the possibilities here are endless. No longer does a user’s preference for R for certain tasks necessitate
tedious exporting of data or porting of code from SAS, or vice versa. The best parts of both SAS and R can be utilized
all from within SAS. In this way, we can actually write SAS/IML packages that contain modules full of R code if we
wanted to.
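As a sketch of what such a module could look like, the stringi example can be wrapped in a module in the same way as rdatasim. The module name first_word and the R variable names are hypothetical, and the sketch assumes that R and the stringi package are installed and that SAS is configured to call R.

```sas
proc iml;
/* first_word is a hypothetical module name used only for illustration */
start first_word(s);
   run ExportMatrixToR(s, "x");        /* send the character value to R as x */
   submit / R;
      library(stringi)
      ans <- stri_split_boundaries(x, type = "word")[[1]][1]
   endsubmit;
   run ImportMatrixFromR(w, "ans");    /* bring the first word back into IML */
   return(w);
finish first_word;
store module = first_word;

/* TEST */
w = first_word("This is a sentence");
print w;
quit;
```

A module like this could then be stored alongside pure SAS/IML modules in a package, so that users of the package never need to write the R code themselves.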
Bridging the gap between SAS and R, two monumentally popular tools for statistical analysis, in this way helps
increase users' comfort and ability with both tools, as well as the portability of the work done in each.
CONCLUSION
Rick Wicklin of SAS Institute Inc. wrote a wonderful paper for SAS Global Forum 2016 in which he detailed the
steps for creating, sharing, and using a SAS/IML package. The vastly increased ability of SAS users to share their
work via these packages opens the door to a whole new level of SAS use. If you ever cannot find functions, routines,
or procedures in SAS for doing something, be sure to check out the collection of packages at the SAS/IML File
Exchange. And if you still don’t find a solution among the posted packages, consider creating and contributing a
SAS/IML package yourself!
Additionally, your tireless searches need not be constrained to SAS documentation and help any longer. If the
functionality exists for R then porting the code or packages is simple and straightforward using SAS/IML. The R code
can be run stand-alone or packaged up into a new SAS/IML package.
REFERENCES
Wicklin, Rick (2016), "Writing Packages: A New Way to Distribute and Use SAS/IML
Programs", Proceedings of the SAS Global Forum 2016 Conference.
ACKNOWLEDGMENTS
The blackjack modules were adapted to SAS from the R functions created in Hadley Wickham’s Simulating
Blackjack case study in “Data Science in R: A Case Studies Approach to Computational Reasoning and Problem
Solving” by Deborah Nolan and Duncan Temple Lang.
CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:
Name: Hunter Glanz
Enterprise: Statistics Department, California Polytechnic State University
Address: 1 Grand Avenue
City, State ZIP: San Luis Obispo, CA 93407
Work Phone: 805-756-2792
E-mail: hglanz@calpoly.edu
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS
Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.