HTTP://WWW.ANALYTICS-MAGAZINE.ORG
JULY/AUGUST 2014
DRIVING BETTER BUSINESS DECISIONS
WHY ANALYTICS
PROJECTS
FAIL
ALSO INSIDE:
• Dark side of digital world
• Real-time text analytics
• Data scientists’ time to shine
• The future of forecasting
Key considerations
for deep analytics
on big data,
learning and
insights
Executive Edge
Hewlett-Packard
V. P. Rohit Tandon:
Six ways of
value creation via
E-commerce analytics
What I learned today
INSIDE STORY
One of the advantages of editing
Analytics (as well as OR/MS Today, the
membership magazine of INFORMS) is I
learn something new every day, thanks to
the wide array of contributed articles we
receive. For example, just in preparing
this issue, I learned:
• Nearly 20 years ago, Amazon founder
Jeff Bezos said that Amazon intended
to sell books at or near cost as a way
of gathering data on affluent, educated
shoppers, as reported by George Packer
in The New Yorker. The implication: The
data, once analyzed, had more value
than the loss-leader books, which proved
absolutely correct when Amazon began
selling everything under the sun to well-
targeted consumers.
Drawing on Packer’s article, as well
as a couple of books (“Who Owns the
Future?” and “The Ethics of Big Data”),
Vijay Mehrotra explores the dark side
of technology, big data and analytics –
and the perceived and/or potential threat
it poses – in his Analyze This! column.
Don’t miss it.
• A Formula 1 pit crew, working in an
optimized, well-coordinated fashion, can
change a set of four tires in less than two
seconds. That means that unless you’re
Evelyn Wood, that crew can change
12 tires in the time it takes you to read
this sentence. For the story behind the
motorsports magic, check out Andy
Boyd’s Forum column. Seeing is believing,
so don’t miss the amazing videos
referenced at the end of the article.
• We all know the digital/technical
world will come to a wordy end without
acronyms, but do you know what MOOC
stands for? I do (“massive open online
course”), thanks to an interview I did with
executive search honcho Linda Burtch
regarding the red-hot analytics job market.
• Finally, I also learned from Linda
that in today’s dynamic world, young
people should plan on three or four careers
during their lifetime. “It’s not good
to specialize in one thing and try to stick
with one company or one industry or one
vertical application for your entire career,”
she says in the Q&A. “It’s incredibly
dangerous, and it likely won’t carry you
through a 35-year career. You need to be
continuously learning something new.”
I got that last part going for me,
every day.
– PETER HORNER, EDITOR
peter.horner@mail.informs.org
CONTENTS
FEATURES
REAL-TIME TEXT ANALYTICS
By Aveek Mukhopadhyay and Roger Barga
How a cloud-based analytical engine yields instant insight using
unstructured social media data.
WHY DO ANALYTICS PROJECTS FAIL?
By Haluk Demirkan and Bulent Dal
Not just another IT project: Key considerations for deep analytics
on big data, learning and insights.
‘IT’S THEIR TIME TO SHINE’
By Peter Horner
Job prospects for data scientists and elite analytics professionals
have never been better – and the future is even brighter.
ANALYTICS TRANSFORMS A ‘DINOSAUR’
By Brenda Dietrich, Emily Plachy and Maureen Norton
The story of how industry giant IBM not only survived but
thrived by realizing business value from big data.
THE FUTURE OF FORECASTING
By Jack Yurkiewicz
Making predictions from hard and fast data: Biennial survey
of popular software for analytics professionals.
REGISTER FOR A FREE SUBSCRIPTION:
http://analytics.informs.org
INFORMS BOARD OF DIRECTORS
President: Stephen M. Robinson, University of Wisconsin-Madison
President-Elect: L. Robin Keller, University of California, Irvine
Past President: Anne G. Robinson, Verizon Wireless
Secretary: Brian Denton, University of Michigan
Treasurer: Nicholas G. Hall, Ohio State University
Vice President-Meetings: William “Bill” Klimack, Chevron
Vice President-Publications: Eric Johnson, Dartmouth College
Vice President-Sections and Societies: Paul Messinger, CAP, University of Alberta
Vice President-Information Technology: Bjarni Kristjansson, Maximal Software
Vice President-Practice Activities: Jonathan Owen, CAP, General Motors
Vice President-International Activities: Grace Lin, Institute for Information Industry
Vice President-Membership and Professional Recognition: Ozlem Ergun, Georgia Tech
Vice President-Education: Joel Sokol, Georgia Tech
Vice President-Marketing, Communications and Outreach: E. Andrew “Andy” Boyd, University of Houston
Vice President-Chapters/Fora: David Hunt, Oliver Wyman
INFORMS OFFICES
www.informs.org • Tel: 1-800-4INFORMS
	
Executive Director: Melissa Moore
Meetings Director: Laura Payne
Marketing Director: Gary Bennett
Communications Director: Barry List

Headquarters: INFORMS (Maryland)
5521 Research Park Drive, Suite 200
Catonsville, MD 21228
Tel.: 443.757.3500
E-mail: informs@informs.org
ANALYTICS EDITORIAL AND ADVERTISING
Lionheart Publishing Inc., 506 Roswell Street, Suite 220, Marietta, GA 30060 USA
Tel.: 770.431.0867 • Fax: 770.432.6969
President & Advertising Sales: John Llewellyn
john.llewellyn@mail.informs.org
Tel.: 770.431.0867, ext. 209
Editor: Peter R. Horner
peter.horner@mail.informs.org
Tel.: 770.587.3172
Assistant Editor: Donna Brooks
donna.brooks@mail.informs.org
Art Director: Jim McDonald
jim.mcdonald@mail.informs.org
Tel.: 770.431.0867, ext. 223
Advertising Sales: Sharon Baker
sharon.baker@mail.informs.org
Tel.: 813.852.9942
Analytics (ISSN 1938-1697) is published six times a year by the
Institute for Operations Research and the Management Sciences
(INFORMS), the largest membership society in the world dedicated
to the analytics profession. For a free subscription, register at
http://analytics.informs.org. Address other correspondence to
the editor, Peter Horner, peter.horner@mail.informs.org. The
opinions expressed in Analytics are those of the authors, and
do not necessarily reflect the opinions of INFORMS, its officers,
Lionheart Publishing Inc. or the editorial staff of Analytics.
Analytics copyright ©2014 by the Institute for Operations
Research and the Management Sciences. All rights reserved.
DEPARTMENTS
Inside Story
Executive Edge
Analyze This!
Healthcare Analytics
INFORMS Initiatives
Forum
Conference Preview
Five-Minute Analyst
Thinking Analytically
EXECUTIVE EDGE
Six ways of value creation through analytics in e-commerce
BY ROHIT TANDON AND SHRUTI UPADHYAY
The increasing popularity of the Internet, and ever-wider
access to it, has changed the way marketers interact with
customers. These customers are smart, well informed
and empowered, as Internet connectivity is available
to them at their fingertips and on the go. It has therefore
become imperative for organizations to be on the
customers’ online radar with respect to new products or
services and to be able to influence their choices.
Not surprisingly, according to one study, 34 percent
of marketers are generating leads through Twitter.
India’s online retail market grew at a staggering 88 percent
in 2013 to $16 billion and continues to grow. These
examples are a testimony to the growth of e-commerce.
The Internet deluge has opened an assortment of opportunities.
Customers are able to buy high-end fashion
and designer shoes, book hotels, buy movie tickets and
you-name-it.
Therefore, an opportunity exists for business research
to capture, compile, churn and store colossal
volumes of information about customers, suppliers
and operations. This is what we call the age of “big
data.” We believe that this age is a natural progression
in online business and is here to stay. We are already
seeing a surge in adoption of digital channels
such as social media, e-mail marketing and display
ads in e-commerce. Imagine the amount of data this
has created for marketers to lay their hands on for
analysis. Despite that, in the race to utilize the online
space, marketers may be focusing more on advertising
and less on analysis of the data that could
potentially increase sales.
In our opinion, understanding customer
behavior becomes more complex in business-to-consumer
companies, and more so in a 24/7 e-commerce
business that sells technology products in an
increasingly commoditized industry. A strong analytics
foundation may make e-commerce a thriving and
successful channel of sales. Businesses, therefore,
are increasingly creating customizable campaigns
for their installed-base customers and improving
sales effectiveness through e-commerce.
For example, pricing and merchandising decisions
need to be made in real time, and the need to
have real-time insights is ever-increasing. To make
these decisions faster and better, marketers
need to quickly analyze their digital marketing strategies
by mining data exhaustively and cost effectively
through advanced analytics.
KEY DRIVERS OF INCREASED REVENUES
An organization’s ability to achieve its goal of
increased revenues and margins depends
heavily on its ability to improve three key drivers: 1)
volume of customer traffic to the online store (number
of visits); 2) customer conversion (percentage of
visits that convert); and 3) basket size (average revenue
per order). Analytics has a very important role
to play in this value chain. So while organizations
may have the best talent with an analytical mindset
and eagerness to apply it, we need to equip data
scientists in organizations with the right
tools and insights.
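To make the arithmetic behind the three drivers concrete, here is a minimal Python sketch. It assumes that online revenue is simply the product of visits, conversion rate and average order value, which is a simplification. The function names and every figure in it are hypothetical illustrations, not HP data.

```python
def online_revenue(visits: float, conversion_rate: float, avg_order_value: float) -> float:
    """Revenue as the product of the three drivers (an assumed, simplified model)."""
    return visits * conversion_rate * avg_order_value


def lift_from(driver: str, lift: float, visits: float,
              conversion_rate: float, avg_order_value: float) -> float:
    """Revenue after improving exactly one driver by the given fractional lift."""
    drivers = {
        "visits": visits,
        "conversion_rate": conversion_rate,
        "avg_order_value": avg_order_value,
    }
    if driver not in drivers:
        raise ValueError(f"unknown driver: {driver}")
    drivers[driver] *= 1 + lift
    return online_revenue(**drivers)


if __name__ == "__main__":
    # Hypothetical baseline: 100,000 visits, 2% conversion, $80 average order.
    base = dict(visits=100_000, conversion_rate=0.02, avg_order_value=80.0)
    print("baseline revenue:", online_revenue(**base))
    for name in base:
        print(f"{name} +10%:", lift_from(name, 0.10, **base))
```

Because the assumed model is multiplicative, the same percentage lift in any one driver yields the same percentage lift in revenue, so the practical question becomes which driver is cheapest to move.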
Conversations with analytics professionals
reinforce our belief in some of the
following must-haves that will elevate an
organization’s e-commerce agenda to the
next level:
1. Development of best-in-class
tools and techniques is a must to
build scalable solutions and tackle the
optimization of key drivers.
Over the years various products such
as SAS have provided excellent development
environments, but every data
scientist had to start from scratch and
depend on their “personal” techniques to
tackle new problems. In recent
years, however, data scientists and organizations
have been moving toward using templates
and building packaged models and solutions
to reuse and replicate technologies
with ease.
One of the first such pilot solutions within
HP was developed for HPDirect.com’s
demand generation function, where global
analytics developed V.1 of a series of demand
generation models. These models
also paved the way for the development of
customer targeting models. In most organizations,
such initiatives, if implemented, have the potential to
lay the foundation for similar opportunities with other
business functions such as planning, store operations
and category management. When an organization
reaches such a stage of maturity, that’s when
true “return on data” (ROD) is possible.
2. The three Ws: whom, what, when. Traditionally,
marketers have used a uni-dimensional approach
to target customers. However, results show
that this can be sub-optimal and might have an
adverse effect on customer loyalty and brand image.
Answering questions such as whom to target, what
to offer and when to offer it brings a paradigm shift in
garnering customer interest and loyalty. The answers help
rank customers on their propensity to re-purchase,
lead to preferential treatment of the right customers
with the right product portfolio, and allow marketers
to understand when to offer discounts.
Effective tools and modeling will also pick up clues
on the probability of customers picking one product over
another, or on repeat customer behaviors. This brings
us back to the importance of using effective, proven
analytics tools and techniques.
3. Automate and innovate. Creating and
applying big data algorithms will help organizations
take appropriate actions. Many of these algorithms
can be automated, which saves time and allows
better decisions faster. Creating a robust tool-based
ecosystem that allows creation of funnels that track
visitors, bounce rates, conversions, etc., is vital to
a successful Web analytics initiative.
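As a rough illustration of what such a funnel computes, the Python sketch below counts how many sessions reach each stage in order and derives a bounce rate and a conversion rate. The stage names, the session format (a list of stage names per visit) and the definition of a bounce as a single-stage session are all assumptions made for the example, not a description of any particular Web analytics product.

```python
from collections import Counter

# Hypothetical funnel stages, in the order a buyer would pass through them.
FUNNEL_STEPS = ["landing", "product", "cart", "checkout", "purchase"]


def funnel_report(sessions):
    """Summarize a list of sessions, each a list of stage names visited."""
    reached = Counter()
    for stages in sessions:
        seen = set(stages)
        # A session counts toward a step only if it also hit every earlier step.
        for step in FUNNEL_STEPS:
            if step not in seen:
                break
            reached[step] += 1
    total = len(sessions)
    bounces = sum(1 for stages in sessions if len(stages) == 1)
    return {
        "visitors": total,
        "bounce_rate": bounces / total if total else 0.0,
        "conversion_rate": reached["purchase"] / total if total else 0.0,
        "reached": {step: reached[step] for step in FUNNEL_STEPS},
    }


if __name__ == "__main__":
    # Made-up sessions for demonstration only.
    demo = [
        ["landing"],
        ["landing", "product"],
        ["landing", "product", "cart"],
        ["landing", "product", "cart", "checkout", "purchase"],
    ]
    print(funnel_report(demo))
```

The per-stage counts show where visitors drop out, which is usually more actionable than the overall conversion rate alone.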
4. Site search analytics. Tracking
site search is a very useful way to
learn what your visitors are
looking for on your website. Is the search
engine directing the customer to your website
or redirecting them to the next best option
in the absence of the product? Keeping
tabs on this will help companies increase
customer loyalty and sales.
Another application of site search analytics
is to understand what is
being searched for on your website. By understanding
this, marketers can influence the
site layout and design so that visitors are
able to easily locate answers to common
queries or the most searched products.
5. Marketing spend optimization.
HP’s online store uses a mix of marketing
vehicles to reach different customer segments
with different communication and
buying preferences. Optimizing spend on
various marketing vehicles is critical to
optimizing demand generation efforts as
well. However, determining which marketing
mix is most beneficial to the business
is not an easy process; it requires not only
a scientific approach to analyzing spend
and revenue, but also a test-learn-optimize
culture. For example, ongoing analysis
of the response to different types of
marketing vehicles helps in identifying the
best fit for a particular type of message.
Based on such analysis, one can decide
whether a banner would work better than a
customized landing page, or whether an
e-mail campaign would be the best option.
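One simple way to frame the “best fit for a particular type of message” analysis is to compare revenue per dollar of spend across vehicles within each message type. The Python sketch below does only that. The field names (`message_type`, `vehicle`, `spend`, `revenue`) and the sample campaigns are hypothetical, and a real analysis would also need to handle attribution and statistical significance, which this sketch ignores.

```python
from collections import defaultdict


def best_vehicle_by_message(campaigns):
    """Return {message_type: (vehicle, revenue_per_dollar)} for the top vehicle.

    campaigns: iterable of dicts with the assumed keys
    message_type, vehicle, spend and revenue.
    """
    # (message_type, vehicle) -> [total spend, total revenue]
    totals = defaultdict(lambda: [0.0, 0.0])
    for c in campaigns:
        key = (c["message_type"], c["vehicle"])
        totals[key][0] += c["spend"]
        totals[key][1] += c["revenue"]

    best = {}
    for (message_type, vehicle), (spend, revenue) in totals.items():
        if spend <= 0:
            continue  # cannot compute a return without spend
        ratio = revenue / spend
        if message_type not in best or ratio > best[message_type][1]:
            best[message_type] = (vehicle, ratio)
    return best


if __name__ == "__main__":
    # Made-up campaign results for demonstration only.
    demo = [
        {"message_type": "promo", "vehicle": "banner", "spend": 100, "revenue": 300},
        {"message_type": "promo", "vehicle": "email", "spend": 50, "revenue": 250},
        {"message_type": "promo", "vehicle": "email", "spend": 50, "revenue": 150},
        {"message_type": "launch", "vehicle": "landing_page", "spend": 200, "revenue": 1000},
        {"message_type": "launch", "vehicle": "banner", "spend": 100, "revenue": 200},
    ]
    for message_type, (vehicle, ratio) in best_vehicle_by_message(demo).items():
        print(message_type, "->", vehicle, ratio)
```

In a test-learn-optimize culture, a table like this is a starting point for the next experiment rather than a final answer.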
6. Connect marketing with warehousing.
In large supply chain environments,
the orders that get shipped out of the warehouse on
a daily basis can be forecast accurately using predictive
analytics methodologies, enabling
accurate warehouse space and staffing
allocation in order to meet aggressive
shipping timelines.
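The article does not say which forecasting method is used. As one illustrative possibility, the Python sketch below applies simple exponential smoothing to a history of daily order counts and converts the next-day forecast into a staffing number. The smoothing constant, the orders-per-worker figure and the sample history are all hypothetical.

```python
import math


def smoothed_forecast(daily_orders, alpha=0.3):
    """One-step-ahead forecast via simple exponential smoothing.

    alpha (0 < alpha <= 1) weights recent days more heavily as it grows.
    """
    if not daily_orders:
        raise ValueError("need at least one day of order history")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = daily_orders[0]
    for orders in daily_orders[1:]:
        level = alpha * orders + (1 - alpha) * level
    return level


def staff_needed(forecast_orders, orders_per_worker_per_day):
    """Round up so the forecast volume can be shipped within the day."""
    return math.ceil(forecast_orders / orders_per_worker_per_day)


if __name__ == "__main__":
    # Made-up order history for demonstration only.
    history = [950, 1020, 980, 1100, 1230, 1180, 1300]
    forecast = smoothed_forecast(history, alpha=0.3)
    print("forecast orders:", round(forecast))
    print("workers needed:", staff_needed(forecast, orders_per_worker_per_day=120))
```

Simple smoothing ignores weekly seasonality and promotions, so in practice the marketing calendar would be a natural input to a richer model, which is the connection this section argues for.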
In conclusion, marketers can apply
data mining and advanced analytical skills
to derive key insights, better understand the
drivers of Web traffic and produce reasonably accurate
traffic forecasts for use in business
planning. We sense that if companies use
data accurately, they can readily achieve
three- to five-fold growth of the online
business and make analytics easily
replicable across different functions of the
organization.
Rohit Tandon is vice president of corporate strategy
and worldwide head of Global Analytics at Hewlett-
Packard. As part of HP’s corporate strategy team, he
helps drive the analytics ecosystem to support HP’s
vision and priorities through delivery of cutting-edge
analytical capabilities across sales, marketing, supply
chain, finance and HR domains. He was recently
named one of the top-10 most influential analytics
leaders in India for 2014 by Analytics India Magazine.
Shruti Upadhyay is a manager with HP Global
Analytics.
ANALYZE THIS!
Dark side of the digital world
Big data, unintended consequences: What Amazon’s domination of the
book publishing industry could portend.
BY VIJAY MEHROTRA
Given my love of books, it is perhaps not surprising
that Amazon.com – where, thanks to the digital
technologies of today, a plethora of books can immediately
be found about nearly any idea that pops into
my head and be delivered (free with Amazon Prime
membership!) to my doorstep with remarkable speed
– is a website that I love deeply. Like many avid readers,
I purport to do my best to support my local independent
booksellers, but too often there is simply no
denying the powerful pull of the super convenient,
instantly gratifying, highly personalized Amazon.com
experience.
Thanks to my bi-monthly book club, I recently read
“Who Owns the Future?” by Jaron Lanier, a celebrated
technologist and MacArthur “genius” award winner
best known for his contributions to the field of virtual
reality. Lanier is known as a big thinker, and in this
book – at once rambling, provocative and thoughtful
– he once again shows why.
“WOTF” begins with a bleak assessment of where
digital technology is leading us all. The main thrust of
Lanier’s argument is as follows:
• Technology makes it very easy to give away for
free a lot of things that people find valuable – just
think about the search engine. Being
human, we are conditioned to love the
chance to get something for nothing,
and we have gratefully grabbed at it with
both hands.
• However, the value that technology
grants us is not actually free. In
exchange, we tacitly give up information
about ourselves, which is then stored
as data.
• Thanks largely to analytics
professionals, this data is then pooled
and analyzed to create a variety of
commercial opportunities that would not
otherwise exist.
• This commercial wealth confers
extraordinary power upon those who
own the technologies that capture and
analyze this data (Lanier calls them
“Siren Servers”).
• This power in turn enables the
owners of the Siren Servers to have a
huge impact on the society that we live
in, including employment, government,
culture and ideas.
• Taken to their logical conclusions,
all of this ultimately dooms the human
species to a very sad and cataclysmic
ending.
Along the way, Lanier also wanders
off into pleasantly intense digressions on
a broad variety of somewhat related topics,
including Aristotle, the tenure system,
biodiversity and the concept of local optima.
He too clearly loves to read.
IMPACT ON PUBLISHING
While still digesting this thought-provoking
book, I came across George
Packer’s recent article entitled “Is
Amazon good for books?” Taking a long
hard look at Amazon.com, the website
that perhaps most fully embodies Lanier’s
concept of a Siren Server, Packer finds
that many of Lanier’s more dire predictions
are already playing out there.
Packer’s particular focus is Amazon’s
impact on the publishing industry, and he
believes that the stakes here are incredibly
high: “In the book business the prospect
of a single owner of both the means
of production and the modes of distribution
is especially worrisome; it would give
Amazon more control over the exchange
of ideas than any company in U.S. history.
Even in the iPhone age, books remain
central to American intellectual life, and
perhaps to democracy.”
I wholeheartedly agree.
Just as Lanier predicts, suppliers
and consumers alike originally
rushed to embrace Amazon, for like so
many technologies it seemed to magically
(that is, without cost) provide all parties
with something for which they hungered.
As Packer writes, “When Amazon
emerged, publishers in New York suddenly
had a new buyer that paid quickly,
sold their backlist as well as new titles,
and, unlike traditional bookstores, made
very few returns” – generating fresh revenues
for publishers with little incremental
investment. Meanwhile, we readers
flocked to Amazon in droves for its convenience,
its variety, and its low prices.
Amazon.com today accounts for
more than 40 percent of all printed books
purchased as well as 65 percent of all
eBooks, so it is probably fair to say that
book buyers by and large still love Amazon.
For us as readers, this is fortunate,
since the number of independent bookstores
in business has declined by more
than 50 percent since Amazon’s founding.
However, as its share of overall book
sales has ballooned, Amazon has taken
advantage of its market power to aggressively
push the terms of its agreements
with book publishers dramatically in its
own favor, often through tactics reflecting
Amazon’s famously secretive and
opaque corporate culture. Meanwhile,
Packer reports, the many publishers large
and small whose businesses are now
dependent on Amazon for much of their distribution
and revenues are learning firsthand
that, as Lanier sharply points out, “information
supremacy for one company becomes, as a
matter of course, a form of behavior modification
for the rest of the world.”
Packer’s article also describes an Amazon
culture that places a very low value on the human
beings who are involved with the development, promotion
and distribution of books, placing its faith
in algorithms rather than editors and relying on
volunteer (that is, free) reviewers to take the
place of staff writers. All of this serves as a real
illustration of Lanier’s premise that as more and
more aspects of the enterprise are mediated by
software, those in the business of carefully creating
content (rather than digitally distributing it)
will be increasingly devalued and many forms
of employment that have long-term value to our
culture will subsequently perish.
ELIMINATING THE GATEKEEPERS
While Amazon’s efforts at actually serving
as a publisher have so far failed, it is clear
that we can expect the company to continue to pursue
the holy grail of “eliminating the gatekeepers”
from the world of publishing by
producing its own original content. Indeed,
one comes away from Packer’s article with
the feeling that if Amazon’s founder and CEO
Jeff Bezos could eliminate the need for authors
and publishers by replacing them with
automated content-generating software, he
would not hesitate for an instant.
In fact, book distribution has from the
outset been only a small part of Bezos’
vision. The real prize for Bezos has been
the access to reams of consumer data
and the ability to analyze this data for fun
and profit. According to Packer, as early
as 1995, Bezos had publicly stated that
“Amazon intended to sell books as a way
of gathering data on affluent, educated
shoppers.” Indeed, today the $5.25 billion
in book sales makes up only 7 percent
of Amazon’s total revenues. This too is
just as Lanier predicts in “WOTF,” which
may be why it was somehow not available
directly from Amazon.com when I looked
for it the other day (it has since been
restored).
One book that I was able to find on
Amazon.com was “Ethics of Big Data,”
in which author Kord Davis asks a number
of more fundamental questions
about data and its place in the business
world. As a longtime software/IT professional
with a deep grounding in philosophy
and the history of technology,
Davis is equally comfortable discussing
topics as diverse as digital strategy, supply
chain optimization, application development
and values-based management. As such, he
has a unique perspective that motivates him
to take these important – and very thorny –
questions seriously. As he writes in the book’s
Preface, “nobody in history has ever had the
opportunity to innovate, or been faced with
the risks of unintended consequences, that
big data now provides.”
In particular, Davis identifies four
major aspects of any serious data ethics
discussion:
• 	Identity: In the digital world, who we
are is tacitly defined by the data we leave
behind and indeed our own sense of self
is often tightly intertwined with our online
activities. Davis points out that capturing
and analyzing our digital trail “provides
others the ability to quite easily summarize,
aggregate or correlate various aspects of
our identity – without our participation or
consent.”
• 	Privacy: Does your decision to
engage in a digital interaction confer
upon other entities the right to utilize data
captured in the course of that specific
interaction, and to link it to other sources
of data that may correspond to you? As
Davis asks, “Does privacy mean the same
thing in both online and offline worlds?…
should individuals have a legitimate ability
to control data about themselves, and to
what degree?”
• 	Ownership: Digital technology,
data and analytics have given some
companies the ability to turn individual
users’ data into saleable assets and
many others the capacity for improved
decision-making and increased
profitability. Intelligently utilizing
data is something that we typically
celebrate in our profession, but
Davis again challenges this view by
asking some very fundamental and
thought-provoking questions: “Does
our existence itself constitute a
creative act, over which we have
copyrights or other rights associated
with creation? If it does, then how
do those offline rights and privileges,
sanctified by everything from the
Constitution to local, state and federal
laws, apply to the online presence of
that same information?”
• 	Reputation: Davis hits the nail
on the head when he points out
that, thanks to the ability of data to
be combined and analyzed to drive
inferential and predictive judgments,
“the number of people who can form
an opinion about what kind of person
you are is exponentially larger and
farther removed…” And while these
online reputations are stubbornly
persistent, the accuracy of this
reputational assessment is too
often an afterthought.
CALL FOR ACTION
Unsatisfied with merely admiring the
problem, both Lanier and Davis also call
for action. Lanier proposes a technological
and marketplace solution to the otherwise
inevitable destiny toward which he believes
digital technology, user data and business
analytics are rapidly leading us, a destiny
whose problems are vividly illustrated
by the case of Amazon. He suggests an
elaborate (though high-level) framework
in which all personal data and creative
works are tagged so as to enable their
owner/creators to capture micropayments
whenever and however their data/works
are utilized. While his proposed remedy
is at this stage sketchy at best, from my
perspective he is to be commended for
engaging us all in a conversation about a
technology-enabled solution to a complex
set of problems that few others are even
willing to acknowledge.
Davis, like Lanier, is a technologist
rather than a Luddite (as he quite rightly
points out, “whereas big data is ethical-
ly neutral, the use of big data is not”). In
“Ethics of Big Data,” he strongly encour-
ages organizations that use data exten-
sively (as well as the policy-makers who
attempt to make judgments in support of
social good) to have meaningful discus-
sions about how and why we use data
and what the ethical implications are
of those actions. In his call for serious
ethical inquiry, Davis asserts that “Or-
ganizations realize that information has
value that can be extracted and turned
into new products…the ethical impact is
highly context-dependent. But to ignore
that there is an ethical impact is to court
an imbalance between the benefits of in-
novation and the detriment of risk.”
Especially, as Lanier would be quick
to add, “with technology itself enabling
the risk to be pushed off onto many, while
the benefits are captured by an ever
smaller few.”
As Packer reports, Amazon has giv-
en very little thought to the near-term
ethics or the long-term implications of
the way in which it has used its custom-
ers’ data to obtain its current level of
market power. But as Amazon’s current
battle [1] with publisher Hachette rages
on, with publishers, governments and
erstwhile business partners sure to fol-
low, it is clear that this particular story is
far from over.
For us as analytics professionals, neither is
ours. We have a significant stake in the
outcomes of these conversations about
ethics and the future. As such, we would
be wise to actively participate in those
conversations. At this particular moment,
we have considerable leverage to advo-
cate for a digital future that reflects our
own values.
The world of digital business – our
own personalized Siren Server – has
provided us with a massive, lucrative,
and free channel for our products and
services. Today’s digital enterprise de-
pends so much on our ever-expanding
ability to capture, transmit, store, inte-
grate and organize data, and our deep
capacity to use this data to summarize,
analyze, correlate, predict and optimize.
Through no fault of our own, we have
been bestowed with The Sexiest Job
of the 21st Century [2], and it is indeed
tempting to believe that we are an inte-
gral and indispensable part of the world
in which we live and work, and that we
always will be.
Turns out this is exactly what the pub-
lishers thought when Amazon first ap-
peared on the scene too. Beware: There
is no free lunch.
Vijay Mehrotra (vmehrotra@usfca.edu) is a
professor in the Department of Business Analytics
and Information Systems at the University of San
Francisco’s School of Management. He is also a
longtime member of INFORMS.
REFERENCES
1. For more on this, see
http://www.nytimes.com/2014/06/21/business/booksellers-score-some-points-in-amazons-standoff-with-hachette.html
and
http://www.latimes.com/books/jacketcopy/la-et-jc-amazon-and-hachette-explained-20140602-story.html#page=1.
2. http://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/ar/1
HEALTHCARE ANALYTICS
Will Apple, Google usher in new era in healthcare analytics?
BY RAJIB GHOSH
2014 is turning out to be an interesting year for
the healthcare industry. On the healthcare technology
front, this year has spurred 16 acquisitions since Jan.
1. State and federal government health insurance
exchanges finally started to operate at scale, offering
affordable health insurance coverage to millions.
Twenty-six states and Washington, D.C., expanded
their Medicaid program as of May 2014, making a
large number of patients eligible for the safety net.
These are all good things that add to the success
of the Affordable Care Act (ACA), also known as
Obamacare.
At the same time we are just beginning to
see the impact of the new patient inflow on our
health system in the form of emergency room
overcrowding [1]. Opponents of the ACA argue that the
expansion of coverage without expanding the
primary care physician network across the nation
will lead to disaster. It remains to be seen which
way the pendulum will swing.
APPLE’S BIG SPLASH WITH HEALTHKIT
Meanwhile, Apple has released its HealthKit product
that connects multiple devices and apps. It has
shown promise to become the health data repository
for consumers. In essence this was the promise
of the personal health record, or PHR, a promise
that rose to the peak of inflated expectation a few
years back and then fell to the trough of disillusion-
ment quite quickly [2]. But with Apple’s foray into
the space, this time it could be different.
The key promise is the fusion of
data from multiple sources and the use of analytics to
generate user-facing insights. The latter, however,
is not there yet. In my last column I argued that
the true empowerment of the patient consumer
is waiting on data fusion and analytics to
become mainstream. Consumers do not want
just a data repository like a PHR. They want
actionable information that a PHR does not provide.
Apple’s announcement and subsequent ac-
tion may expedite the health data movement in
the right direction, but I am somewhat skeptical
regarding data liquidity in Apple’s “walled garden”
approach. Now that Apple has taken the lead,
how far behind can Google be? Recently, Forbes
reported that Google is planning its own version
of a health platform. By the time this column goes
live, we will know what Google has up
its sleeve. These two giants have all the technology,
talent and financial firepower needed to
drive analytics into the consumer health space
by enabling a platform play for various data
generating devices and apps.
Insights for the consumer, however, will come
at a price. As the insights with actionable consum-
er guidance increase, so too will the level of FDA
scrutiny, including requirement for mandatory FDA
approval. It is unclear how quickly Apple or Google
will go for that, since it is unknown territory for
both companies. Having spent a decade in the
medical device industry, I know firsthand the pain
points of manufacturers when their products
come under the FDA’s purview.
APPLE-EPIC PARTNERSHIP
Apple is also partnering with Epic Systems,
the giant electronic medical record (EMR)
company that controls close to 20 percent of the
enterprise EMR market and covers 51 percent
of the patients in the United States. This is a
smart move by Apple. The ability to send user-
generated data to a healthcare professional’s
EMR system has always been a key requirement
for providers. This “end-to-end” data channel
establishes a continuum of care, which acts as
the building block for analytics-driven population
health management (PHM) initiatives.
Since the introduction of the iPhone, Apple
products have enjoyed widespread adoption
among healthcare professionals. A 2013 study by
Black Book Rankings found that among physicians
who use medical apps on their smartphones,
68 percent used iPhones while 31 percent used
Android devices. Also, 59 percent of physicians
accessed apps from their tablet, and most of those
users preferred the iPad. Among U.S. consumers, Apple
has lost some ground recently to its key competitor,
Google Android, but still commands a large con-
sumer following.
When a system enjoys large market share
both among patients and providers and the sys-
tem connects with the largest EMR company in
the country, we can expect seamless
bi-directional data flow to reach criti-
cal mass. This is a prerequisite to build
a cloud-based analytics solution that
can leverage data hubs at both ends of
the flow.
This is why Apple’s HealthKit
introduction is a key development,
although it does not do much in its early
incarnation. If Google wants to become
a serious player in the healthcare field
beyond fitness lovers, it has to think
in the same direction as well. Once that
happens, imagine what sort of revolution
the rivalry of these technology companies
can usher in!
The health data acquisition market is
still fragmented, and as a result EMR com-
panies have not shown much interest in
opening up their data repository to those
players. If Apple and Google can now turn
the tables and make this a true platform
play using their controlling stakes in the
mobile device market, then it becomes
meaningful for the EMR companies to
forge powerful partnerships with one or
both of them. In turn that will create the
unification of episodic data and continu-
ous user-generated data – the Holy Grail!
Interoperability standards will be
firmed up and data security solutions will
emerge. Most importantly, patients and
providers will both benefit from the ana-
lytics solutions that will get a shot in the
arm from a data rich holistic picture of
the patient.
So far IBM is the lone warrior creat-
ing an ecosystem around its “Watson in
the cloud” analytics solution. It still lacks
the health data source. So what can
Apple, Google, IBM and Epic do together
to shake up healthcare? I’m getting goose
bumps just thinking about the possibilities.
Rajib Ghosh (rghosh@hotmail.com) is an
independent consultant and business advisor
with 20 years of technology experience in various
industry verticals where he had senior level
management roles in software engineering, program
management, product management and business
and strategy development. Ghosh spent a decade
in the U.S. healthcare industry as part of a global
ecosystem of medical device manufacturers, medical
software companies and telehealth and telemedicine
solution providers. He’s held senior positions at
Hill-Rom, Solta Medical and Bosch Healthcare. His
recent work interest includes public health and the
field of IT-enabled sustainable healthcare delivery
in the United States as well as emerging nations.
Follow Ghosh on Twitter @ghosh_r.
REFERENCES
1. Laura Ungar, “More patients flocking to ERs under Obamacare,”
http://www.courier-journal.com/story/news/2014/06/07/patients-flocking-emergency-rooms-obamacare/10181349/
2. “Hype Cycle for Healthcare Provider Applications, Analytics and Systems,” 2013, Gartner,
http://www.healthcatalyst.com/health-data-analytics-hype-cycle
INFORMS INITIATIVES
CAP exam, continuing education, analytics conference cluster
The Institute for Operations Research and the
Management Sciences (INFORMS), the largest
professional society in the world for professionals
in the fields of analytics, operations research (O.R.)
and management science and the publishers of
Analytics magazine, announced that its Certified
Analytics Professional (CAP®) exam will now be
given at hundreds of computer-based testing centers
worldwide through an agreement with Kryterion,
the full-service provider of customizable assessment
and certification products and services.
Candidates for the CAP certification exam can
choose from Kryterion’s global network of online
secured testing locations to schedule their exam at a
convenient time and place. INFORMS’ online testing
center partner Kryterion, through strategic partnerships
with colleges and universities, as well as
testing and training companies, provides over 700
testing locations in more than 100 countries. In the
United States alone, more than 400 testing centers
are available. CAP exams can now be scheduled almost
any day of the week and at a time and location
that best suits the candidate.
Candidates can apply at
www.informs.org/applyforcertification. Upon
acceptance into the program, candidates
receive an online voucher to present on
the Kryterion site.
Exam locations can be found at
http://www.kryteriononline.com/host_locations/.
Introduced in the spring of 2013, the
CAP program was created by subject
matter experts, many of whom are IN-
FORMS members. The CAP credential
is designed for general analytics pro-
fessionals in early- to mid-career and
is based on a rigorous job task analy-
sis and is vendor- and software-neutral.
Benefits of analytics certification include
gaining the ability to advance one’s ca-
reer by setting a professional with CAP
apart from the competition and obtain-
ing the structure to make continuing pro-
fessional development an integral part
of one’s job performance. The CAP pro-
gram assists hiring managers in finding
competent analytics talent and shows
that an organization hiring CAP profes-
sionals follows best analytics practice.
NEW INFORMS CONTINUING
EDUCATION COURSES
The INFORMS Continuing Education
program is offering two new courses this
fall: “Introduction to Monte Carlo and
Discrete-Event Simulation” and “Foun-
dations of Modern Predictive Analytics.”
The intensive, two-day, in-person
courses, like the program’s popular
current courses “Essential Practice
Skills for Analytics Professionals” and
“Data Exploration & Visualization,” provide
real take-away value to implement
immediately at work. Once you leave
the classroom, you will be able to ap-
ply the real skills, tools and methods
of analytics. The courses will give par-
ticipants hands-on practice in handling
real data types, real business problems
and practical methods for delivering
business-useful results.
In the course “Introduction to
Monte Carlo and Discrete-Event
Simulation,” taught by Barry Lawson,
University of Richmond, and Lawrence
Leemis, College of William and
Mary, participants will learn the
basics of Monte Carlo and discrete-
event simulation and how to identify
real-world problem types appropriate
for simulation. They’ll also develop
skills and intuition for applying
Monte Carlo and discrete-event
simulation techniques.
Topic areas covered include Monte
Carlo modeling, sensitivity analysis,
input modeling and output analysis.
The course will be held at the
INFORMS office, Catonsville (Baltimore
area), Md., Sept. 12-13, and Chicago,
Oct. 16-17.
The second new course, “Foundations
of Modern Predictive Analytics,” will
be taught by James Drew, Worcester
Polytechnic Institute, Verizon (ret.).
Modern predictive analytics, the
science of discovering and exploiting
complex data relationships, has rapidly
changed in recent years, especially in
today’s businesses. This course will
give participants hands-on practice in
handling real data types, real business
problems and practical methods for de-
livering business-useful results.
Some of the topic areas to be covered
in this course are: linear regression, re-
gression trees, logistic regression and
CART (classification and regression
trees).
The course will be held in Washington,
D.C., Sept. 15-16, and San Francisco,
Nov. 7-8.
Learn more about these courses
including course outlines, instructor
biographies, program objectives and
how to register at: www.informs.org/
continuinged.
ANALYTICS CLUSTER SET FOR
INFORMS ANNUAL MEETING IN S.F.
The Analytics Section of INFORMS
will present the analytics cluster of ses-
sions and presentations at the INFORMS
Annual Meeting in San Francisco
Nov. 9-12. The cluster encompasses
20 sessions featuring renowned
analytics practitioners and leaders. Nine
additional sessions will be jointly organized
in collaboration with the Health
Applications Society (HAS), CPMS
(the Practice Section of INFORMS)
and the Section on O.R. in Sports
(SpORts).
The sessions/presentations within
the cluster cover such topics as:
•	Successful application of analytics in
multiple industries such as healthcare,
transportation, defense and sports
•	Analytics focus areas such as big data,
spreadsheets and predictive analytics
•	Panel discussions on understanding
the connection between O.R. and
analytics, building analytics programs to
support organizations’ needs and business
analytics in the healthcare industry
•	Winners of the Innovative Applications
in Analytics Award and the SAS Student
Paper Competition
•	Why’s, how’s and what’s of analytics
certification
More information about the conference
can be found at
http://meetings2.informs.org/sanfrancisco2014/.
FORUM
Pit stop analytics
BY E. ANDREW BOYD
Quick stop: Optimized F1 pit teams can change four tires in two seconds.
Magic shows are fun because we get to experience
the impossible. Still, we know there’s trickery
afoot. But what about those times when the magic
isn’t magic? When we witness something that’s seemingly
impossible but proves all too real? Not only real,
but the result of optimization?
Such is the case in the Formula 1 race car pit. If
you follow F1 racing, it comes as no surprise that pit
stops have been reduced to two seconds. But if you
aren’t an F1 devotee, the idea of lifting a car, changing
four tires and sending it on its way in a mere two
seconds stretches the imagination.
The role of the pit has changed dramatically over
the years. For much of racing history it was assumed
cars would only stop in the event of problems. Scheduled
tire changes or fuel stops weren’t part of the
equation. This orthodoxy was challenged
in 1982 when an analytically minded race
team from the United Kingdom focused in
on two important facts. First, softer tires
stuck to the track better during turns than
their harder cousins, though they wore
out more quickly. Second, less gas in the
tank translated into a lighter, and there-
fore faster, car. Calculations showed
that time spent changing tires and re-
filling the tank was more than offset by
the improved performance of the car on
the track. It’s a calculation any analytics
practitioner would be proud of.
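The arithmetic behind that trade-off can be sketched in a few lines of Python. Every number below (lap count, lap times, stop cost) is hypothetical and chosen only to show the shape of the calculation; none of it comes from the 1982 team’s actual analysis.

```python
def race_time(laps, lap_time_s, stops=0, stop_cost_s=0.0):
    """Total race time in seconds: time on track plus time lost to pit stops."""
    return laps * lap_time_s + stops * stop_cost_s

# Hypothetical figures for illustration only.
LAPS = 60
no_stop = race_time(LAPS, lap_time_s=92.0)  # harder tires, full tank, no stop
one_stop = race_time(LAPS, lap_time_s=91.0, stops=1, stop_cost_s=25.0)  # softer tires, lighter car

saving = no_stop - one_stop
print(f"no stop: {no_stop:.0f} s, one stop: {one_stop:.0f} s, saving: {saving:.0f} s")
```

In this toy setup a stop pays off whenever the per-lap gain multiplied by the number of laps exceeds the time lost in the pit.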
The idea quickly caught on, making
pit stops – and their efficient execution –
an integral part of racing. Refueling was
banned in 1984 out of safety concerns,
but reinstated in 1994. During that 10-year
period pit crews refined their tire chang-
ing skills to the point where the fastest pit
stops took a little over four seconds. When
refueling was again instituted, the impetus
for faster tire changes disappeared since
refueling was the bottleneck. That changed
in 2010 when F1 racing again reverted to
a no-refueling policy, setting the stage for
lightning-fast tire changes.
Achieving a two-second tire change
required optimizing the entire process.
Engineers took a look at everything from
the design of the wheel nuts (one per
wheel on F1 cars) to the special, self-
positioning pneumatic guns that remove
and tighten each nut. They then turned
their attention to the pit crews.
Teams of three work on each wheel:
one to remove the old tire, one to position
the new tire and one to operate the gun.
Their moves aren’t left to chance, but are
choreographed down to the position of
their hands and feet from start to finish.
It’s not hard to imagine Frank and Lillian
Gilbreth – progenitors of industrial engi-
neering and pioneers of time and motion
studies – standing nearby, stopwatches
in hand. They’d certainly be smiling in ap-
proval. With two jack operators and scat-
tered observers, as many as 20 people
crowd around a car during a pit stop – for
two seconds of work.
Optimization brings to mind models
and mathematical programs. But some-
times optimization is smart without being
sophisticated. And in the F1 pit, it works
like magic.
Andrew Boyd, INFORMS Fellow and INFORMS
VP of Marketing, Communications and Outreach,
served as executive and chief scientist at an
analytics firm for many years. He can be reached
at e.a.boyd@earthlink.net.
NOTES & REFERENCES
1. Gray, W., “Tech Talk: Can F1 Pit Stops Get Even Quicker?” Eurosport, April 9, 2013. See also:
https://uk.eurosport.yahoo.com/blogs/will-gray/gray-matter-f1-stops-even-quicker-101951154.html.
Accessed May 24, 2014.
2. Examples of fast pit stops can be found at:
https://www.youtube.com/watch?v=aHSUp7msCIE
https://www.youtube.com/watch?v=Xvu0GlMa3xQ
CUSTOMER RELATIONSHIPS
Real-Time Text Analytics
Cloud-based analytical engine yields instant insight
using unstructured social media data.
BY AVEEK MUKHOPADHYAY AND ROGER BARGA
Information is generated in today’s world more rapidly
than ever before, and it will keep growing at an
exponential rate. The rise of social media
combined with increased Internet penetration
has led to a significant increase
in user-generated content in the form
of product reviews and feedback, blogs,
independent news articles, and Twitter
and Facebook updates. The crux of
leveraging such data lies in identifying
patterns in it and using the data to
generate actionable insights in real time.
This article proposes a cloud-based
analytical engine that analyzes comments,
reviews and opinions generated
by customers to understand the main
underlying themes and the general sentiment
so that actionable insights can
be generated in real time. Algorithms
such as latent Dirichlet allocation for
topic modeling and the holistic lexicon-based
approach for sentiment mining
have been operationalized using a multi-agent
framework deployed in a cloud
environment. This process meets computational
demands as it allows users
to run virtual machines within managed
data centers, freeing them from worrying
about acquisition of new hardware
and networks.
UNSTRUCTURED SOCIAL MEDIA DATA
According to a study by International
Data Corporation (IDC), mankind created
an estimated 150 exabytes of data in 2005
(1 exabyte is 1 billion gigabytes), a number
that jumped to 1,200 exabytes in 2010. A
more recent study by IDC and EMC put
the amount of data created in 2011 at 1.8
zettabytes (1 zettabyte is 1 followed by 21
zeroes, in bytes), a
number the study researchers expected
to double every two years.
Only 5 percent of this data is structured
(comes in a standard format that
can be read by computers). The remaining
95 percent is unstructured (photos,
phone calls and free-flow texts). A large
chunk of such unstructured data is in
text format. Although it poses challenges owing to
its sheer volume, depth and complexity,
such data holds immense
potential for organizations. The key lies
in identifying patterns from the data and
gaining relevant insights.
REAL-TIME ANALYTICS
Not long ago, analyzing data and
generating business intelligence reports
depended on the time-intensive ETL process
(extract, transform, load). Depending
upon the system and data complexity,
analytics could be delayed by hours, days
or even weeks while data management
put it all together.
In today’s business landscape, minimizing
the lag between acquiring data
and generating actionable insight has become
the key differentiator. Acting in real
time to respond to an event can result in
huge profits and improved customer relationships
for a firm.
Real-time analytics can deliver benefits in
multiple business scenarios, including:
•	 High-frequency trading (sophisticated
algorithms to rapidly trade securities)
•	 Real-time detection of fraudulent
transactions
•	 Real-time price adjustment based on
competitor information
•	 Real-time feedback from social
media for a product firm about its
new launch
•	 Real-time recommendations by retail
stores based on customer’s location
•	 Real-time traffic routing based on
information about vehicle frequency,
direction, etc.
Social media content comes from
users without any vested interest, thus
their opinions beget more trust. Organizations
whose products and services
are mentioned in such media need to
remain current on relevant discussions
and be able to track the sentiment of ev-
ery employee, customer and investor. To
address this challenge, a cloud-based
real-time ecosystem was created for ana-
lyzing comments, reviews and opinions
mined from Twitter. In addition, tracking
trending themes in the customer space
and the evolution of these trends over
time was incorporated.
TEXT MINING ALGORITHMS
Topic modeling. Topic models are
statistical techniques that analyze words/
phrases in textual data to understand
the main themes running through them.
The model used here is based on LDA
(latent Dirichlet allocation) and uses the
observed words in tweets (extracted from
Twitter) to infer the hidden topic structure.
LDA is more easily understood by its
generative process. This generative pro-
cess defines a joint probability distribution
over the observed (the words) and hidden
(the topics) random variables. This joint
distribution is used to compute the condi-
tional distribution of the hidden variables
given the observed variables. This con-
ditional distribution is called the posterior
distribution.
A topic is assumed to be a collec-
tion of words with different probabilities
of occurrence. An individual tweet can
be assumed to be generated from multiple
topics in different proportions. Every
word in a tweet can then be randomly
chosen in a two-step process:
• 	 First, a topic is randomly selected
from the distribution of topics.
• 	 Second, the chosen word is randomly
selected from the distribution of
words over that topic.
So, the joint probability distribution of word W and topic T is:
Probability (W, T) = Probability (T) * Probability (W | T)
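The two-step process and the joint probability above can be sketched in Python as follows. This is a minimal illustration, not the authors’ implementation: the two topics, the four-word vocabulary and all probabilities are hypothetical, and the per-tweet topic proportions are fixed by hand, whereas full LDA draws them (and the topic-word distributions) from Dirichlet priors.

```python
import random

# Hypothetical toy model: two topics over a four-word vocabulary.
topic_probs = {"battery": 0.6, "camera": 0.4}  # Probability(T) for one tweet
word_probs = {                                 # Probability(W | T)
    "battery": {"charge": 0.5, "life": 0.4, "photo": 0.05, "lens": 0.05},
    "camera": {"charge": 0.05, "life": 0.05, "photo": 0.5, "lens": 0.4},
}

def draw(dist, rng):
    """Draw one outcome from a {outcome: probability} mapping."""
    outcomes = list(dist)
    return rng.choices(outcomes, weights=[dist[o] for o in outcomes], k=1)[0]

def generate_word(rng):
    topic = draw(topic_probs, rng)       # step 1: pick a topic
    word = draw(word_probs[topic], rng)  # step 2: pick a word from that topic
    return topic, word

def joint(word, topic):
    """Probability(W, T) = Probability(T) * Probability(W | T)."""
    return topic_probs[topic] * word_probs[topic][word]

rng = random.Random(0)
sample = [generate_word(rng) for _ in range(5)]
print(sample)
print(joint("photo", "camera"))
```

Summing `joint` over every word and topic pair gives 1, which is a quick check that the toy distributions are consistent.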
Now when the individual probability of
occurrence of a word is known (because it
has already occurred in the tweet), the pos-
terior distribution is calculated as follows:
Probability (T | W) = Probability (W, T)
/ Probability (W)
Given the probabilities of observed
words, latent information like the vocabu-
lary distribution of a topic and the distri-
bution of topics over the tweet are thus
inferred.
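As a minimal numeric sketch of the posterior calculation, the Python below applies Bayes' rule to one observed word. The two topics and all probabilities are hypothetical values chosen for illustration. In full LDA the topic and word distributions are themselves unknown and are estimated iteratively; this sketch shows only the single Bayes step.

```python
# Hypothetical Probability (T) for one tweet (assumed values).
p_topic = {"battery": 0.7, "design": 0.3}

# Hypothetical Probability (W | T) for the observed word "short" (assumed values).
p_word_given_topic = {"battery": 0.30, "design": 0.05}


def posterior(p_t, p_w_given_t):
    """Return Probability (T | W) for every topic T via Bayes' rule."""
    # Joint: Probability (W, T) = Probability (T) * Probability (W | T)
    joint = {t: p_t[t] * p_w_given_t[t] for t in p_t}
    # Marginal: Probability (W) is the joint summed over all topics
    p_w = sum(joint.values())
    return {t: joint[t] / p_w for t in joint}


post = posterior(p_topic, p_word_given_topic)
print(post)
```

With these assumed numbers, observing "short" shifts most of the probability mass to the "battery" topic.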
Sentiment analysis. A holistic lexi-
con-based algorithm is used to analyze
individual feature-level sentiments as well
as cumulative sentiments over tweets.
Aggregating opinions for a feature:
The algorithm parses one tweet at a time
identifying the features present. A set of
opinion words for each feature is identi-
fied using a lexicon. An orientation score
for each feature in the sentence is then
calculated by summing up the feature-
opinion scores for that sentence. (Each
feature-opinion score is obtained from
the sentiment polarity of the opinion
word and a multiplicative inverse of the
distance between the feature and opinion
word. Opinion words farther from
the feature are assumed to be less
strongly associated with the feature than
nearer words.)
For example: “The phone is useful and
a great work of art.”
Let the feature here be “phone” and the
opinion words be “useful” and “great.”
Semantic orientation of useful = 1
Semantic orientation of great = 1
Distance between the words useful
and phone = 2
Distance between the words great
and phone = 5
score(f) = (1 × 1/2) + (1 × 1/5) = 0.7
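The feature-level calculation can be sketched in Python as follows. The function name, the whitespace tokenization and the two-word lexicon are assumptions made for this illustration; a production lexicon would be far larger.

```python
def feature_score(tokens, feature, lexicon):
    """Sum polarity / distance over all opinion words around one feature."""
    f_idx = tokens.index(feature)
    score = 0.0
    for i, tok in enumerate(tokens):
        if tok in lexicon and i != f_idx:
            distance = abs(i - f_idx)         # distance in word positions
            score += lexicon[tok] / distance  # polarity * (1 / distance)
    return score


# Hypothetical lexicon: +1 for positive words, -1 for negative words.
lexicon = {"useful": 1, "great": 1}
tokens = "the phone is useful and a great work of art".split()
print(feature_score(tokens, "phone", lexicon))  # 1/2 + 1/5
```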
Aggregating opinions for tweets: The
sentiment score for a tweet is the sum-
mation of the scores for all opinion words
present in the tweet.
For example, “The phone is useful
and a great work of art.”
The opinion words in the sentence are
“useful,” “great”
Semantic orientation of useful = 1
Semantic orientation of great = 1
score(t) = 1 + 1 = 2
Negation-rule: This identifies the ne-
gation word (which can be 1 or 2 places
before the opinion word) and reverses
the opinion expressed in a sentence.
For example, “The phone is not good.”
Here phone gets negative orientation.
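A minimal Python sketch of the tweet-level score with the negation rule applied is shown below. The list of negation words and the small lexicon are assumptions for illustration only.

```python
NEGATIONS = {"not", "no", "never"}  # assumed negation words


def tweet_score(tokens, lexicon):
    """Sum opinion-word polarities, flipping any that follow a negation."""
    score = 0
    for i, tok in enumerate(tokens):
        if tok not in lexicon:
            continue
        polarity = lexicon[tok]
        # Look one or two places before the opinion word for a negation.
        window = tokens[max(0, i - 2):i]
        if any(w in NEGATIONS for w in window):
            polarity = -polarity
        score += polarity
    return score


# Hypothetical lexicon: +1 for positive words.
lexicon = {"useful": 1, "great": 1, "good": 1}
print(tweet_score("the phone is useful and a great work of art".split(), lexicon))
print(tweet_score("the phone is not good".split(), lexicon))
```

The first call reproduces score(t) = 2 from the example above; the second returns a negative score because "not" sits one place before "good."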
Context-dependent rules: For features
for which no opinion words are found,
context-dependent constructs are used to
determine the orientation score.
For example, “The phone is good but
battery-life is short.”
The only opinion word in the sentence
is “good” (“short” is a context-dependent
word).
Phone gets positive orientation be-
cause of “good.”
Battery-life gets negative orientation
because of the word “but” being present
between good and battery-life.
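One way to sketch the “but” rule in Python is shown below. It is a simplification under stated assumptions: the sentence is split into clauses at the word “but,” and a clause with no opinion word takes the opposite orientation of the preceding clause. The function name and inputs are hypothetical.

```python
def orient_features(tokens, features, lexicon):
    """Assign an orientation score to each feature using a simple 'but' rule."""
    # Split the sentence into clauses at the word "but".
    clauses, current = [], []
    for tok in tokens:
        if tok == "but":
            clauses.append(current)
            current = []
        else:
            current.append(tok)
    clauses.append(current)

    result = {}
    previous = 0  # orientation of the preceding clause
    for clause in clauses:
        direct = sum(lexicon.get(tok, 0) for tok in clause)
        # No opinion word in this clause: take the opposite of the
        # clause on the other side of "but".
        orientation = direct if direct != 0 else -previous
        for tok in clause:
            if tok in features:
                result[tok] = orientation
        previous = orientation
    return result


tokens = "the phone is good but battery-life is short".split()
print(orient_features(tokens, {"phone", "battery-life"}, {"good": 1}))
```

Here “short” is deliberately absent from the lexicon, so the negative orientation of battery-life comes only from the contrast signaled by “but.”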
Topic Evolution. The next step after
topic modeling is to understand how topics
and trends develop, evolve and go viral
over time.
The algorithm maintains a fixed num-
ber of topic streams and their statistics.
Each tweet is processed as it comes in
and is assigned to the “closest” topic
stream (the topic stream most similar to
it). If no topic stream is close enough,
then a new stream is created and a stale
stream is killed to maintain a fixed number
of topic streams. Streams are constantly
monitored for the rate of arrival of tweets.
Whenever there is a burst of tweets in a
particular topic stream, an alert for the
trending topic is generated.
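The stream-maintenance logic can be sketched in Python as follows. Several details are assumptions made for illustration, since the article does not specify them: similarity is measured with Jaccard overlap of word sets, the “stale” stream is the one updated least recently, and a burst means a set number of tweets arriving within a time window.

```python
from collections import Counter, deque


def jaccard(a, b):
    """Jaccard similarity between two sets of words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class TopicStreams:
    def __init__(self, max_streams=3, min_similarity=0.2,
                 burst_count=3, burst_window=60.0):
        self.max_streams = max_streams
        self.min_similarity = min_similarity
        self.burst_count = burst_count
        self.burst_window = burst_window
        self.streams = []  # each stream: {"words": Counter, "times": deque}

    def add(self, words, timestamp):
        """Assign a tweet to a stream; return True if that stream is bursting."""
        words = set(words)
        best, best_sim = None, 0.0
        for stream in self.streams:
            sim = jaccard(words, set(stream["words"]))
            if sim > best_sim:
                best, best_sim = stream, sim
        if best is None or best_sim < self.min_similarity:
            if len(self.streams) >= self.max_streams:
                # Kill the stalest stream: the one updated least recently.
                stale = min(self.streams, key=lambda s: s["times"][-1])
                self.streams.remove(stale)
            best = {"words": Counter(), "times": deque()}
            self.streams.append(best)
        best["words"].update(words)
        best["times"].append(timestamp)
        # Keep only arrivals inside the burst window.
        while timestamp - best["times"][0] > self.burst_window:
            best["times"].popleft()
        return len(best["times"]) >= self.burst_count


ts = TopicStreams(max_streams=2)
for t, tweet in enumerate(["phone battery", "phone battery life", "battery phone"]):
    print(ts.add(tweet.split(), timestamp=10.0 * t))
```

In this run the third tweet triggers the alert, because three similar tweets reach the same stream within the 60-second window.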
THE REAL-TIME EDGE
A multi-agent distributed framework
enables the processing of real-time data
and facilitates decision-making by al-
lowing for easy deployment of analyti-
cal tasks in the form of process flows. In
this multi-agent paradigm, an agent is a
software program designed to carry out
one or more tasks and can communicate
with other agents in the system using
an agent communication language. Thus, an
analytical task can be written as an agent,
and the analytical process flow can be es-
tablished by wiring together a set of com-
municating agents (an agency) that can
run in sequence or in parallel.
These agents were written using R to
offer the analyst the benefits of a powerful
and flexible statistical modeling language.
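The wiring of agents into a process flow can be sketched as below. The agents described above were written in R; this illustration uses Python instead, and it substitutes simple in-process queues for the agent communication language. The class and function names are hypothetical.

```python
import queue
import threading


class Agent(threading.Thread):
    """Reads messages from an inbox, applies one task, forwards the result."""

    def __init__(self, task, inbox, outbox):
        super().__init__(daemon=True)
        self.task, self.inbox, self.outbox = task, inbox, outbox

    def run(self):
        while True:
            message = self.inbox.get()
            if message is None:        # sentinel: pass it on and stop
                self.outbox.put(None)
                break
            self.outbox.put(self.task(message))


def run_agency(tasks, messages):
    """Wire one agent per task in sequence and push messages through."""
    queues = [queue.Queue() for _ in range(len(tasks) + 1)]
    agents = [Agent(t, queues[i], queues[i + 1]) for i, t in enumerate(tasks)]
    for agent in agents:
        agent.start()
    for message in messages:
        queues[0].put(message)
    queues[0].put(None)
    results = []
    while (item := queues[-1].get()) is not None:
        results.append(item)
    return results


# Two toy analytical tasks: tokenize a tweet, then count its words.
print(run_agency([str.split, len], ["the phone is good", "great battery"]))
```

Each agent runs in its own thread, so later stages can work on one message while earlier stages process the next.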
OPERATIONALIZATION IN THE
CLOUD
The entire real-time platform was then
deployed on a cloud ecosystem to allow
for the following processes:
Efficient resource management: The
cloud platform provides the necessary virtual
machine, network bandwidth and other
infrastructure resources. Even when a
machine goes down because of an unexpected
failure, a new virtual machine is
allocated for the application automatically.
Figure 1: Real-time text mining agency.
Dynamic scaling and load balanc-
ing: The cloud solution allows scaling
out as well as scaling back an appli-
cation depending on resource require-
ments. Multiple services running in
tandem make the whole system com-
putationally resource intensive. As re-
source demands increase, new role
instances can be provisioned to handle
the load. When demand decreases,
these instances can be removed so that
payment for unnecessary computing
power is not required.
Availability & durability: The cloud
storage services replicate data on three
different servers, guaranteeing it can be
accessed at all times, even if a server
shuts down unexpectedly.
Better mobility: The application can
be accessed from any place, as long as
there is an Internet connection. There is
no tight coupling with any physical server
or machine.
RESULTS
Figure 2 shows a snapshot of the topic
treemap generated in one run of the topic
modeling algorithm (different topics are
represented by different colors, with the
areas representing occurrence frequency).
Figure 2: Topic modeling treemap.
Incoming tweets over a time period
were captured in a stream graph visual-
ization as shown in the Figure 3 screen-
shot. Each topic is represented by a
stream in the visualization and is charac-
terized by the top words in that topic. At
any point of time, the top words in each
topic are displayed in a topic treemap
below the stream graph. The keyword
treemap can also be retrieved for any past
point in time.
Successive runs of the sentiment
analysis algorithm for batches of tweets
are represented by the visual in Figure 4.
Each bar captures the sentiment
for that feature in a particular batch
of tweets. The height of the bar rep-
resents the number of opinion words
for the feature in that batch. The col-
or of each bar represents the overall
sentiment level expressed in a batch of
data, ranging from extremely negative
(dark red) to extremely positive (dark
green). The change in color of the bars
across various batches can be used
to identify stimuli that are driving the
change.
Selection of a particular bar provides
a deeper analysis of that batch. The size
of a bubble indicates the number of ref-
erences of a particular opinion word, and
the color shows the overall sentiment
score for the particular opinion word.
Both the size and color are indicators of
which opinion words drive the sentiment
for a feature in a batch.
Figure 3: Trends stream graph.
CLOSING THOUGHTS
Trending topics represent the popular
“topics of conversation,” and when de-
tected in real time, these hot topics are
the social pulses that are usually ahead
of any standard news media. Data ana-
lyzed via managed data centers can pro-
vide key insights into the evolving nature
and patterns of social information and
opinion and the general sentiment pre-
vailing over such subjects.
Aveek Mukhopadhyay is an associate manager
at Mu Sigma where he works with the Innovation
& Development Team with a core focus on driving
the adoption of advanced analytical platforms
and techniques both internally and externally. He
has interests in the fields of text mining, machine
learning and analytics automation.
Roger Barga, Ph.D., is group program manager
for the CloudML team at Microsoft Corporation
where his team is building machine learning as
a service in the cloud. Barga is also a lecturer
in the Data Science program at the University
of Washington. He joined Microsoft in 1997 as a
researcher in the Database Group of Microsoft
Research (MSR), where he was involved in a
number of systems research projects and product
incubation efforts, before joining the Cloud and
Enterprise Division of Microsoft in 2011.
Figure 4: Sentiment analysis.
NOTES & REFERENCES
1. The Economist (Feb. 25, 2010), “The Data Deluge” (http://www.economist.com/node/15579717).
2. David M. Blei, “Probabilistic Topic Models,” Communications of the ACM, April 2012, Vol. 55, No. 4 (http://www.cs.princeton.edu/~blei/papers/Blei2012.pdf).
3. Xiaowen Ding, Bing Liu and Philip S. Yu, “A Holistic Lexicon-Based Approach to Opinion Mining” (http://www.cs.uic.edu/~liub/FBS/opinion-mining-final-WSDM.pdf).
THE DATA ECONOMY
Why do so many analytics projects fail?
Key considerations for deep analytics on big data,
learning and insights.
BY HALUK DEMIRKAN AND BULENT DAL
What is big data? Big data,
which means many things
to many people, is not a
new technological fad. In
addition to providing innovative solutions
and operational insights to enduring
challenges and opportunities, big
data with deep analytics instigates new
ways to transform processes, organizations,
entire industries and even society.
Pushing the boundaries of deep
data analytics uncovers new insights
and opportunities, and “big” depends on
where you start and how you proceed.
Big data is not just “big.” The exponentially
growing volume of data is only
one of many characteristics that are often
associated with big data, such as
variety, velocity, veracity and others (the
six Vs; see box).
According to Gartner Research,
analytics will remain the top focus for CIOs
through 2017 [1]. Also according to Gartner,
more than half of all analytics projects
fail because they aren’t completed
within budget or on schedule, or be-
cause they fail to deliver the features
and benefits that are optimistically
agreed on at their outset.
Today, an abundance of knowledge
and experience exists to have success-
ful data and analytics-enabled decision
support systems. So why do so many
of these projects fail, and why are so
many executives and users still so un-
happy? While there are many reasons
for the high failure rate, the biggest rea-
son is that companies still treat these
projects as just another IT project. Big
data analytics is neither a product nor a
computer system. Instead, it should be
considered a constantly evolving strat-
egy, vision and architecture that contin-
uously seeks to align an organization’s
operations and direction with its strate-
gic business goals and tactical and op-
erational decisions. Table 1 includes a
list of common mistakes that can doom
analytics projects.
The six Vs of big data
• Volume (data at rest): terabytes, petabytes, exabytes and zettabytes of data
• Velocity (data in motion): streaming data, milliseconds to seconds; how fast data is being produced and how fast the data must be processed to meet the need or demand
• Variety (data in many forms): structured, unstructured, text, multimedia, video, audio, sensor data, meter data, html, e-mails, etc.
• Veracity (data in doubt): uncertainty due to data inconsistency and incompleteness, ambiguities, latency, deception, model approximations, accuracy, quality, truthfulness or trustworthiness
• Variability (data in change): the differing ways in which the data may be interpreted; different questions require different interpretations
• Value (data for co-creation and deep learning): the relative importance of different complex data from distributed locations. Big data with deep analytics means greater insight and better decisions, something that every organization needs.
WHY PROJECTS FAIL
KEY CONSIDERATIONS FOR DEEP
ANALYTICS
We live in an era of big data. Whether
you work in financial services, consumer
goods, travel, transportation, health-
care, education, supply chain, logistics
or industrial products and professional
services, analytics are becoming a com-
petitive necessity for your organization.
But having big data – and even people
who can manipulate it successfully – is
not enough. Companies need managers
who can partner effectively with analysts
to ensure that their work yields better
strategic and tactical decisions.
Big data with deep analytics is a jour-
ney that helps organizations solve key
business issues and opportunities by
converting data into insights to influence
business actions and drive critical busi-
ness outcomes. As organizations try to
take advantage of the big data opportuni-
ty, they need not be overwhelmed by the
various challenges that might await them.
Going Deep & Wide on big data with deep analytics for deep learning
Table 1: Common mistakes for analytics projects.
• Failing to build the need for big data within the organization
• Islands of analytics with “Excel culture”
• Data quality and reliability related issues
• Not enough investigation of vendor products, and blindly taking the path of least resistance
• Departmental thinking rather than looking at the big picture
• Considering this as a one-time implementation rather than a living ecosystem
• Developing silo dashboards to answer a few questions rather than strategic, tactical and operational dashboards
• Not establishing company ontology and definitions for “single version of truth” culture
• Lack of vision and not having a strategy; not having a clear organizational communications plan
• Lack of upfront planning; overlooking the development of governance and program oversight
• Failure to reorganize for big data
• Not establishing a formal training program
• Ignoring the need to sell success and market the big data program
• Not having the adequate architecture for data integration
• Forgetting rapidly increasing complexities with … volume, velocity, variety, veracity and many more
Managers will need to start their
journey by [2]:
Identifying clear business need and
value. Almost everything needs to be a
business rather than a technology solution.
Before companies start collecting big
data, they should have a clear idea of what
they want to do with it from a business
perspective. Here’s what you need to consider:
Turn over part or all of big data
solution delivery to business leaders.
Project management and ownership
by the business (not IT) is the key to
success in big data solutions. At the same
time, make sure to have clear alignment
between business and IT.
Partner with business peers to
identify opportunities and solutions.
If we talk about big data, the impact of
these projects should also be “big.” Cre-
ate a cross-organization team and in-
volve all stakeholders early in the game.
Co-creation of value with
customers. The overall business objective
should always be about customers. If
one of the initiatives is about a big marketing
outcome, then it should be about how
to set up customer-centric marketing,
how to provide targeted dynamic advertising,
how to engage customers and
how to manage personalized shopping.
Start small – with an eye to scale
quickly. While big data solutions may
be quite advanced, everything else sur-
rounding it – best practices, methodolo-
gies, org structures, etc. – is nascent.
No one has all the answers, at least
not yet. Understand why traditional
business intelligence and data ware-
housing projects can’t solve a problem.
Small, simple and scalable. When
launching big data initiatives, avoid 1) get-
ting too complicated too fast, and 2) not
being prepared to scale once a solution
catches on. Big data solutions can quickly
grow out of control since discovering val-
ue from data prompts wanting more data.
Identify what part of the business
would benefit from quick wins. Look
for opportunities that will show quick
wins within no more than three months.
Success brings more people to the table.
This is not a one-time implementation.
Understand that this is a living and
evolving organism that will grow exponentially.
It is a culture change in the way the
company collects and uses data and
makes outcome-based decisions.
Develop a minimal set of big data
governance directives upfront. Big
data governance is a chicken-and-egg
problem – you can’t govern or secure
what you haven’t explored. However,
exploring vast data sets without gover-
nance and security introduces risk.
New processes to manage open
source risks. Most big data solutions
are being built on open source software,
but open source has both legal and skill
implications as firms are: 1) exposed to
risk due to intellectual property issues
and complex licensing agreements; 2)
concerned about liability if systems built
on open source fail; and 3) required to
use technology that is often early re-
lease and not enterprise-class.
New agile processes for solution
delivery. Successful firms will embrace
agile practices that allow end users of
big data solutions to provide highly in-
teractive inputs throughout the imple-
mentation process.
Integrate structured and unstructured
data from multiple sources. Integration
of data is one of the most important,
and most complex, processes in support of
efficient and effective decision-making.
The data includes machine data,
sensor data, videos, audio, documents,
enterprise content in call centers, e-mail
messages, wikis and, indeed, larger volumes
of transactional and application data.
Data sharing is key. In order for a
company to build a big data ecosystem
that drives business action, organiza-
tions have to share data.
Build a strong data infrastructure
to host and manage data. Make sure
to have a secure and reliable in-house
and/or hosted (e.g., cloud) data and
information management infrastructure.
Ask: What information do I
collect today … and what analytics should
I perform that can benefit me and others?
New security and compliance
procedures to protect extreme-scale
data. In order to succeed with big data,
new processes must be developed that
recognize and protect the special nature
of extreme-scale data that may be large-
ly unexplored.
Be ready to support rapid growth.
Big data solutions can grow fast and exponentially.
They can start as a pilot with
a few terabytes of data, then become
a petabyte very quickly. Since the same
data can be used in different ways and reanalyzed
for new insights easily, nothing
ever gets deleted.
Funding must move out of IT for
big data success. Funding for these
projects should come from outside of the
CIO organization and move to a market-
ing or sales organization, for instance,
so that the business has a vested stake
in the game.
Create a road map that gradually
builds the skills of your organization.
It’s important to create a road map that
allows you to gradually build the required
skills within your staff, minimize risk and
capitalize on previous successes to gain
more support. In the organization, there
will be new roles and responsibilities such
as the data scientist, who possesses a
blend of skills that includes statistics, ap-
plied mathematics and computer science.
This is different than any current
decision support solution. With big
data, organizations should look for new
capabilities, such as: using advanced
analytics to uncover patterns previously
hidden; visualization and exploration to
help the business find more complete
answers, with new types and greater
volumes of data to best represent the
data to the user and highlight important
patterns to the human eye; enable oper-
ational decision-making with on-demand
stream data by making floor employees
into analytic consumers; and turn insight
into action to drive a decision – either
with a manual step or an automated process.
And most important, be ready for
rapidly increasing benefits and complexities
from the six Vs.
WHAT IS NEXT IN THE DATA
ECONOMY?
Organizations have access to a
wealth of information, but they can’t get
value out of it because it is sitting in its
most raw form or in a semi-structured
or unstructured format [3]. As a result,
they don’t even know whether it’s worth
keeping.
So where is deep analytics for
deep learning headed in the next few
years? The exciting news is that many
organizations are already realizing the
value of big data analytics today. Insight-
driven, information-centric initiatives will
be deployed where the ability to capital-
ize on the six Vs of information will cre-
ate new opportunities for organizations
to exploit. By combining and integrating
deep analytics, local rules, scoring, opti-
mization techniques and machine learn-
ing with cognitive science into business
processes and systems, decision man-
agement helps deliver decisions that are
consistently optimized and aligned with
the organization’s desired outcomes.
Social analytics will ensure busi-
nesses know how, when and where to
creatively engage with individual con-
sumers and social communities to fos-
ter trusted, one-to-one relationships and
better understand and manage the way
their companies are perceived. Integrat-
ing demographic and transactional data
with what can be learned about attitudes
and opinions allows organizations to
truly understand the motivations and
intents of their constituents to better serve
them at the right time and place.
Deep analytics will help organiza-
tions uncover previously hidden patterns,
identify classifications, associations and
segmentations, and make highly accu-
rate predictions from structured and un-
structured information. Organizations will
use real-time analysis of current activity
to anticipate what will happen and iden-
tify drivers of various business outcomes
so they can address the issues and challenges
before they occur. Many decisions
will be made automatically by computers
that also have deep-learning capabilities.
When you are starting
a big data journey, consider this question:
What should our big data with deep
analytics roadmap look like to achieve
our objectives?
Haluk Demirkan (haluk@uw.edu) is a professor
of Service Innovation and Business Analytics, and
the founder and executive director of Center for
Information Based Management at the Milgard
School of Business, University of Washington-
Tacoma. He has a Ph.D. in information systems
and operations management from the University of
Florida. He is a longtime member of INFORMS.
Bulent Dal (bulent.dal@obase.com) is a co-founder
and general manager of Obase Analytical Solutions
(http://www.obase.com/index.php/en/obase),
Istanbul, Turkey. His expertise is in scientific retail
analytical solutions. He has a Ph.D. in computer
sciences engineering from Istanbul University.
Acknowledgement
Part of this article is excerpted with permission
of the publisher, HBR Turkey, from Demirkan,
H. and Dal, B., “Big Data, Big Opportunities, Big
Decisions,” Harvard Business Review Turkish
Edition (published in Turkish), March 2014.
REFERENCES
1. Gartner, Inc., 2013, “Gartner Predicts Business Intelligence and Analytics Will Remain Top Focus for CIOs Through 2017,” Dec. 16, 2013, http://www.gartner.com/newsroom/id/2637615.
2. Demirkan, H. and Dal, B., “Big Data, Big Opportunities, Big Decisions,” Harvard Business Review Turkish Edition (published in Turkish), March 2014, pp. 28-30.
3. Davenport, T., 2013, “Analytics 3.0,” Harvard Business Review, December.
DATA SCIENTISTS IN DEMAND
‘It’s their time to shine’
According to executive search firm head Linda Burtch,
the job prospects for data scientists and other elite
analytics professionals have never been better – and
the future is even brighter.
BY PETER HORNER
In April, the executive search
firm Burtch Works released
the results of its first-of-its-kind
salary and demographics
survey of data scientists, a follow-up
to a survey of big data professionals
conducted a year earlier. Among other findings,
the 2014 survey found that data
scientists are well paid, relatively young
and overwhelmingly male, and that almost half
(43 percent) are employed on the West
Coast.
Linda Burtch, managing partner of
Burtch Works, has been involved in the
recruitment and placement of high-end
analytics talent for 30 years. She started
her career with Smith-Hanley before
founding her own company five years
ago. Analytics magazine editor Peter
Horner interviewed Burtch in April, not
long after the survey of data scientists
was released. Following are excerpts
from the interview.
What did you find that surprised
you the most from the salary and
demographics survey of data scientists?
First of all, I find it funny that everyone
is interested in salaries and what
data scientists and big data professionals
make, but it’s such a taboo subject to
actually talk about. Not to me. I talk about
salaries all the time. That’s my business.
What surprised me? That’s an interesting
question. It actually turned out
the way I thought it would – a lot of the
candidates living out on
the West Coast and a
higher predominance
of Ph.D.s among data
scientists than the general
analytics population
or the big data professionals,
as I call them.
It all pretty much made
sense to me. It was interesting
because it was
actually quantified.
Linda Burtch, founder and managing
partner of Burtch Works.
Weren’t you a little
surprised by the extent
of the concentration of
data scientists – nearly 50 percent –
on the West Coast?
That’s for the moment, for now, but
watch and see what happens. Analytics
has been around for a long time, yet
some people still ask me, “Are you sure
this isn’t a fad?” It’s not.
Analytics has become a hugely profitable
specialty area within organizations
as they try to optimize their operations,
or target their marketing or look at return
on investment issues, and that has
been around for years and years.
I would argue that those issues are
sort of the humdrum stuff of analytics.
Data-driven decision-making is really
going to explode, and that’s what we are
seeing with this whole area going toward
data scientists. Data
storage has become so
much cheaper, computing
power has become
much faster, nanotechnology
and sensors are
now becoming ubiquitous.
Self-driving cars,
traffic sensors, the energy
grid. The list goes
on and on and on.
Right now the obvious
stuff is happening
with understanding
digital streams of data
in applications related
to social media. That’s pretty straightforward
stuff, but wait until it hits the
healthcare industry, for example. Self-driving
cars are going to be a huge,
huge deal. While a lot of it is being done
out in California now, over the next five
years we are going to see it scattered
all over the United States.
When it comes to recruiting candidates
and job placement, who are
you talking to?
I recruit in analytics – people who
have master’s degrees in statistics, operations
research, econometrics, people
who are out there working in business
applications, solving problems related
to marketing spend or credit worthiness
or target marketing. More recently I’ve
gotten into data science. That’s a huge
umbrella description.
You mentioned operations research,
the heart and soul of INFORMS.
It is. When I started out in recruit-
ing more than 30 years ago, I focused
on operations research candidates. It’s
grown pretty dramatically since then.
They have a very fond place in my heart
because that’s how I got started. It’s one
of those things that I’ve really been in-
volved with – the INFORMS group back
in New York when I was living there,
and I’m really excited now because the
INFORMS group in Chicago is getting
re-energized. It’s really exciting to watch.
When looking at the job market-
place, do you distinguish between,
say, a data scientist and other analyt-
ics professionals?
Let me back up a little bit. Last sum-
mer, when I was putting together the big
data salary study, I saw that data scien-
tists were a breed apart, and that they
had higher compensation levels. So I
made the decision to take them out of the
general big data study and hold them for
later because it’s such an emerging field
that’s so different. They are working with
what I would call unstructured data. You
could get into a lot more detail over how a
data scientist is different from a big data
professional, but the primary distinguish-
ing feature, in my opinion, is that data
scientists are working with data that’s un-
structured. It’s something that’s going to
grow as sensors become more and more
prevalent and data streams become continuous
in so many application areas.
How would you describe the current
job market for quants, for lack of a
better word?
It’s hot. A couple of months ago we did
a flash survey in which we simply asked
how often are you contacted about a
new job opportunity through LinkedIn. We
had 400 responses; 89 percent of the re-
spondents said they were contacted at
least monthly, and 25 percent said that they
were contacted at least weekly. I’m working
with elite data scientists, and they’re telling
me that they get calls once or twice a day
from recruiters, so it’s just crazy.
Our candidates are seeing a 14 per-
cent increase in salary when they change
jobs, so there’s a lot of churn out there.
If they stay with their existing company,
they might see an annual increase of be-
tween 2 percent and 3 percent, so the
14 percent is a nice bounce if they de-
cide to make a change. One of my data
scientists in Boston said he received 30
calls in one week after he left a job and
went on the job hunt.
It’s amazing. Competing offers are
another sign that the market is really
hot. Sign-on bonuses are another thing
that has become very commonplace in
the analytics job market. Another sign
that is important to note is that academic
institutions have really stepped up, with
many of them developing master’s
programs in analytics, predictive
analytics and the like, so that’s something
that is very new in the last two or
three years.
In an interview with the New York
Times, you said in reference to MBAs,
and I quote, “In 15 years, if you don’t
have a solid quant background, you
might have a permanent pink slip.”
That’s a little rough, isn’t it?
I know, I’ve become the harbinger of
the permanent pink slip. Seriously, I have
seen many MBAs, your general MBA,
look around and say, whoa, this is a little
bit scary, because they are seeing this
trend toward analytical decision-making
becoming so predominant in business.
Personally, I think within 10 or 15 years
if MBAs don’t have a quantitative foun-
dation, they will be prevented from pro-
motion. We’ll see. I always said back
when I was working with the operations
research people that my guys are so
smart, they are the ones who should be
running these companies. Now I’m see-
ing it come true.
In an episode of the TV show “Mad
Men,” the ad agency employees, cir-
ca late 1960s, were concerned that a
new computer the size of a confer-
ence room would make them expend-
able. Your quote reminded me of that.
Right. A lot of people ask me about
that. There is going to be a disruption.
There already has been. Just yesterday,
the Times had a visual display of analyt-
ics and quants and how it was disrupting
things and what jobs were going to be
eliminated, including truck drivers and
airplane pilots.
Self-driving cars, robots, analytics,
algorithms and all this stuff is here to
stay, and it’s only going to get bigger,
but it’s not going to replace the ability to
read, write and think critically. While I’m
a big proponent of analytics, communi-
cation will continue to be really impor-
tant; human-to-human contact can’t be
replaced, ever.
Just how important are commu-
nication skills to a data scientist?
INFORMS, for example, now routinely
holds “soft skills” workshops aimed
at helping analysts explain their work
to non-technical audiences in order
to garner corporate buy-in.
Yes. That’s absolutely critical. The
other piece that goes hand in hand with
that is having the ability to understand
the business at hand. Business acumen
is really important. You have to have
that gut check; does it make sense and
how can I best monetize the situation
to benefit a client or employer? It’s re-
ally important for people to understand
not only what’s interesting – what a lot
of quantitative people tend to gravitate
toward – but also what’s important.
If a company is just starting out on
the analytics journey and has no in-
house expertise in this area, how can
they judge a candidate’s technical
abilities?
That’s an interesting problem. When
I’m talking to a client, especially in this
data science area that is so new, they
will call me and sometimes they will have
it down. They are talking the right lan-
guage, they are thinking about the right
things, they are asking the right ques-
tions. Other clients are floundering; they
are still exploring.
I think it’s very important that they
make sure they understand where their
needs are before they actually bring in
somebody because it’s not inexpensive
to apply analytics in an organization.
You really need to think very carefully
about what the goals are, what the road map
is going to look like and so on. I can cer-
tainly help with that, and I can give the
names of consultants who can help a
company really understand what their
plan should be before they jump in and
make hires.
On the other side of that coin,
what’s the best advice you can give
an analytics candidate who is testing
the job market?
Another flash survey we did focused
on understanding what motivates peo-
ple to make a job change. The number
one motivation is money, but it’s quickly
followed by challenging work and the op-
portunity to grow within an organization.
Money is important to everyone,
but candidates shouldn’t make deci-
sions regarding changing jobs based on
salary alone because money isn’t going to be the
factor that’s going to change their life. Rather, it’s
the kind of work you will do and how engaged
you will be. It’s really important to understand
the challenge and the growth opportunity within
whatever it is you are looking to jump into.
The third thing I think is important to analyze
for any quantitative person when they’re talking
to a potential new employer is to understand if
analytics has a seat at the corporate table. You
have to make sure that there is buy-in within the
organization and the stakeholders are really ac-
tively involved and engaged in conversations
about how analytics can and should be used or
embedded within any organization. That’s a huge
factor in understanding how happy you will be
in your job and how successful you can be as a
quantitative professional.
Getting back to the plight of the quant-
poor MBA, how can a candidate boost ana-
lytical skills mid-career? Many colleges and
universities are now offering analytics pro-
grams, often online, through their business
schools, and INFORMS, for example, holds
continuing education courses in the analyt-
ics area, as well as a certification program.
I get that question a lot: “I’m really interested
in beefing up my analytical skills so what should
I do?” As you noted, there are more opportunities
than ever to do that. In addition to the formal edu-
cation programs, there are plenty of good books
on the topic. I get the question all the time: What
books should I be looking at?
Another way that you can jump into
this is through Kaggle competitions,
which I recommend to people if they are
interested in understanding data science
and who else is out there doing this kind
of work and what they are doing. There
are many tools out there. Certainly what
INFORMS is doing is terrific.
It’s important to keep your skills fresh
and make sure you continue to learn.
When it comes to giving general career
advice, especially to younger candidates,
my advice is this: prepare for three or four
careers during your lifetime. In today’s
world, it’s not good to specialize in one
thing and try to stick with one company
or one industry or one vertical applica-
tion for your entire career. It’s incredibly
dangerous, and it likely won’t carry you
through a 35-year career. You need to
be continuously learning something new.
People should keep that in mind.
INFORMS offers an analytics cer-
tification program (CAP). Is that a dif-
ferentiator in the job marketplace?
No two candidates are ever equal,
but it can certainly help once there are
enough employers out there who under-
stand what it means to be CAP certified.
I’m seeing people put various MOOCs
(massive open online courses) on their
resumes now, along with Kaggle com-
petition results. I have a candidate who
actually got his job because of a Kaggle
competition. The first couple of times
he submitted his solution it was totally
rejected, but as he continued to study
the problem and resubmitted, he
climbed up the leaderboard. Then he
started getting calls and job opportuni-
ties because of his Kaggle rank.
From your perspective, what does
the future hold for data scientists and
other analytics professionals?
In my 30 years of experience, I have
never seen anything like this. The oppor-
tunities for elite analytics candidates have
never been better, and I think what we’re
seeing now is just the tip of the iceberg.
As I said earlier, I really think that my
quantitative candidates are going to be
running companies one day. Certainly
the CMO (chief marketing officer) is go-
ing to be coming up through the analyt-
ics ranks. Now there’s all this talk about
CAOs (chief analytics officers). I think the
candidates I’m working with have a very
strong chance – if they have leadership
ability and the ambition – to advance up
the ranks and continue to climb and run
organizations at some point. Their quan-
titative skills are going to be unique and
absolutely required to be a successful
businessperson. It’s their time to shine.
Peter Horner (peter.horner@mail.informs.org) is the
editor of Analytics and OR/MS Today magazines.
ANALYTICS ACROSS THE ENTERPRISE
Analytics transforms a ‘dinosaur’
The story of how IBM not only survived but thrived
by realizing business value from big data.
BY BRENDA DIETRICH, EMILY PLACHY AND MAUREEN NORTON
This is the story of how an iconic company founded more than a
century ago, and once deemed a “dinosaur” that would not be able to
survive the 1990s, has learned lesson after lesson about survival and
transformation. The use of analytics to bring more science into the
business decision process is a key underpinning of this survival and
transformation. Now for the first time, the inside story of how
analytics is being used across the IBM enterprise is being told.
According to Ginni Rometty, chairman, president and chief executive
officer, IBM Corporation, “Analytics is forming the silver thread
through the future of everything we do.”
What is analytics? In simple terms, analytics is any mathematical or
scientific method that augments data with the intent of providing new
insight. With the nearly 1 trillion connected objects and devices
generating an estimated 2.5 billion gigabytes of new data each day,
analytics can help discover insights in the data. That insight creates
competitive advantage when used to inform actions and decisions.
This article is adapted from the book, “Analytics Across the
Enterprise: How IBM Realizes Business Value from Big Data and
Analytics.”
Data is becoming the world’s new natural resource, and learning how
to use that resource is a game changer. …
Analytics is not just a technology; it is a way of doing business.
Through the use of analytics, insights from data can be created to
augment the gut feelings and intuition that many decisions are based
on today. Analytics does not replace human judgment or diminish the
creative, innovative spirit but rather informs it with new insights to
be weighed in the decision process. …
Analytics for the sake of analytics will not get you far. To drive the
most value, analytics should be applied to solving your most important
business challenges and deployed widely. Analytics is a means, not an
end. It is a way of thinking that leads to fact-based
decision-making. …
BIG DATA AND ANALYTICS DEMYSTIFIED
If analytics is any mathematical or scientific method that augments
data with the intent of providing new insight, aren’t all data queries
analytics? No. Analytics is often thought of as answering questions
using data, but it involves more than simple data (or database)
queries. Analytics involves the use of mathematical or scientific
methods to generate insight from the data.
Analytics should be thought of as a progression of capabilities,
starting with the well-known methods of business intelligence, and
extending through more complex methods involving significant amounts
of both mathematical modeling and computation.
Reporting is the most widely used analytic capability. Reporting
gathers data from multiple sources, such as business automation, and
creates standard summarizations of the data. Visualizations are
created to bring the data to life and make it easy to interpret.
As a generic example, consider store sales data from a retail chain.
The data is generated through the point of sale system by reading the
product bar codes at checkout. Daily reports might include total store
revenue for each store, revenue by department for each region, and
national revenue for each stock-keeping unit (SKU). Weekly reports
might include
the same metrics, as well as comparisons
to the previous week and comparisons
to the same week in the previous calen-
dar year. Many reporting systems also
allow for expanding the summarized data
into its component parts. This is particu-
larly useful in understanding changes in
the sums.
For example, a regional store man-
ager might want to examine the store-
level detail that resulted in an increase
in revenue from the home entertainment
department. She would be interested in
knowing whether sales increased at most
of the stores in the region, or whether the
increase in total sales resulted from a sig-
nificant sales jump in just a few stores.
She might also look at whether the in-
crease could be traced back to just a
few SKUs, such as an unusually popular
movie or video game. If a likely cause of
the sales increase can be identified, she
might alert the store managers to moni-
tor inventory of the popular products, re-
position the products within a store, or
even reallocate inventory of the products
across stores in her region. …
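The drill-down described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, region names, store names and SKU codes are invented, and the helper names `summarize` and `drill_down` are hypothetical rather than part of any reporting product.

```python
from collections import defaultdict

# Hypothetical point-of-sale records: (region, store, department, sku, revenue).
# Every name and figure here is invented for illustration.
SALES = [
    ("East", "Store 1", "Home Entertainment", "SKU-100", 1200.0),
    ("East", "Store 1", "Home Entertainment", "SKU-200", 300.0),
    ("East", "Store 2", "Home Entertainment", "SKU-100", 1500.0),
    ("East", "Store 2", "Grocery", "SKU-300", 800.0),
    ("West", "Store 3", "Home Entertainment", "SKU-100", 400.0),
]


def summarize(records, key_index):
    """Sum revenue grouped by the field at position key_index."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key_index]] += record[-1]
    return dict(totals)


def drill_down(records, region, department, key_index):
    """Expand one region/department total into its component parts."""
    subset = [r for r in records if r[0] == region and r[2] == department]
    return summarize(subset, key_index)


# Standard report: total revenue for each store.
revenue_by_store = summarize(SALES, 1)

# Drill-down: did the regional increase come from most stores or a few SKUs?
east_home_by_store = drill_down(SALES, "East", "Home Entertainment", 1)
east_home_by_sku = drill_down(SALES, "East", "Home Entertainment", 3)
```

In this made-up data, the store-level view shows the two East stores contributing equally, while the SKU-level view shows one product accounting for most of the department's revenue, which is the kind of signal the regional manager in the example would act on.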
WHY ANALYTICS MATTERS
Quite simply, analytics matters be-
cause it works. You can be overwhelmed
with data and the value of it may be unat-
tainable until you apply analytics to create
the insights. Human brains were not built
to process the amounts of data that are
today being generated through social me-
dia, sensors, and more. While gut instinct
is often the basis for decisions, analyti-
cally informed intuition is what wins going
forward.
Several studies have highlighted the
value of analytics. Companies that use
predictive analytics are outperforming
those that do not by a factor of five. In
a 2012 joint survey by the IBM Institute
for Business Value and the Saïd Business
School at the University of Oxford of
more than 1,000 professionals around the
world, 63 percent of respondents reported
that the use of information (including big
data and analytics) is creating a competi-
tive advantage for their organizations. IBM
depends on analytics to meet its business
objectives and provide shareholder value.
The bottom line is that analytics helps the
bottom line. Your competition will not be
waiting to take advantage of the new in-
sights from big data. Should you?
IBM has approached the use of ana-
lytics with a spirit of innovation and a be-
lief that analytics will illuminate insights
in data that can help improve outcomes.
The company hasn’t been afraid to make
mistakes or redesign programs that
haven’t worked as planned. Unlike tra-
ditional IT projects, most analytics proj-
ects are exploratory. For example, the
Development Expense Baseline Project
explored innovative ways to determine
development expense at a detailed level,
thereby addressing a problem that many
thought was impossible to solve. IBM
analytic teams haven’t waited for perfect
data to get started; rather, they have re-
fined and improved their data along the
way. …
The key is to put a stake in the ground
with a commitment that analytics will be
woven into your strategy. That’s how IBM
does it. This approach is also effective
with big data. Rather than postpone the
leveraging of big data, you should em-
brace it, establish a link between your
business priorities and your information
agenda, and apply analytics to become a
smarter enterprise. …
PROVEN APPROACHES
Staying focused on solving business
problems was the pragmatic start, and
the other crucial element was having very
high-level executive support from the be-
ginning. From a governance perspective,
those are two key levers to drive value:
focus on actions and decisions that will
generate value and have high-level ex-
ecutive sponsorship.
The ideal team to do analytics is a
collaboration between an experienced
data scientist, a person steeped in the
area of the business where the challenge
needs to be solved, and an IT person
with expertise in the data in that particu-
lar area of the business.
A joint study by MIT Sloan and the
IBM Institute for Business Value devel-
oped several recommendations. The first
is that you start with your biggest and
highest-value business challenge. The
next recommendation is to ask a lot of
questions about that challenge in order to
understand what’s going on or what could
be going on. Then you go out and look for
what data you might have that’s relevant
to that challenge. Finally, you determine
which analytic technique can be used to
analyze the data and solve the problem.
Because most companies have con-
straints on the amount of money and
skills available for projects, estimating
the ROI can provide a better differentiator
for selecting the project with the highest
potential impact than relying on instincts.
Estimating an analytics project’s ROI in-
volves both capturing the project costs
and measuring the value. …
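As a rough illustration of using ROI to choose among projects, the Python sketch below ranks candidates by a simple ratio of net gain to cost. The formula is one common definition of ROI, not necessarily the one used at IBM, and the project names and dollar figures are hypothetical.

```python
def estimate_roi(value, cost):
    """Return ROI as a fraction: net gain divided by cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (value - cost) / cost


# Hypothetical candidate projects: (name, estimated value, estimated cost).
CANDIDATES = [
    ("Churn model", 500_000.0, 200_000.0),
    ("Pricing optimizer", 900_000.0, 600_000.0),
    ("Inventory forecast", 300_000.0, 100_000.0),
]

# Rank projects from highest to lowest estimated ROI.
ranked = sorted(CANDIDATES, key=lambda p: estimate_roi(p[1], p[2]), reverse=True)
ranked_names = [name for name, _, _ in ranked]
```

Note that the project with the largest absolute value in this invented list ranks last on ROI, which is why an explicit estimate can differ from instinct.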
EMERGING THEMES
Relationships inferred from data
today may not be present in data col-
lected tomorrow. The relationships that
you infer from data about the past do
not necessarily hold in data that you col-
lect tomorrow. You cannot analyze data
once and then make decisions forever
based on old analysis. It’s important to
continually analyze data to verify that pre-
viously detected relationships are still val-
id and to discover new ones. Fortunately,
major discontinuities with data do not hap-
pen very often, so change generally hap-
pens gradually. Social media sentiment,
however, has a much shorter half-life than
most data.
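One simple way to re-verify a previously detected relationship is to recompute a correlation on newly collected data and compare it with the earlier value. The Python sketch below assumes a Pearson correlation and an arbitrary tolerance of 0.3; both choices, the function names and the sample numbers are illustrative assumptions, not a method described in the book.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # No variation, so no measurable linear relationship.
    return cov / math.sqrt(var_x * var_y)


def relationship_still_holds(old_x, old_y, new_x, new_y, tolerance=0.3):
    """True if the correlation in new data is within tolerance of the old one."""
    return abs(pearson(old_x, old_y) - pearson(new_x, new_y)) <= tolerance


# Invented numbers: the relationship persists in one new sample and reverses in another.
old_x, old_y = [1, 2, 3, 4, 5], [2, 4, 6, 8, 10]
stable = relationship_still_holds(old_x, old_y, [1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
reversed_trend = relationship_still_holds(old_x, old_y, [1, 2, 3, 4, 5], [10, 8, 6, 4, 2])
```

A check like this would be rerun on a schedule suited to the data; fast-moving sources such as social media sentiment would call for more frequent checks than slowly changing operational data.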
Using relationships derived from past
data has been repeatedly demonstrated
to work better than assuming that no re-
lationships exist. The relationships that
have been detected are likely correlation
rather than causality. However, these re-
lationships, if detected and acted upon
quickly, may provide at least a temporary
business advantage.
You don’t have to understand ana-
lytics technology to derive value from it.
For a long time, many business leaders
expressed the opinion that mathemat-
ics should be used by only those who
understood the details of the computa-
tions. However, in recent years this view
has been changing, and analytics is be-
ing treated like other technologies. You
must learn how to use it effectively, but
it is not necessary to understand the in-
ner workings in order to apply analytics
to business decisions. You have to apply
analytics methods in the context of the
problem that is being solved and make
the results accessible to the end user. But
just as the user of a car navigation system
does not need to understand the details
of the routing algorithm, the end user of
analytics does not have to understand the
details of the math.
Typically, making the results accessi-
ble to the end user involves wrapping the
math in the language and the process of
the end user. Also, the analytics can be
embedded deep inside things so that the
user does not see it, like in supply chain
operations. Analytics should be usable by
anyone, not just those with Ph.D.s in sta-
tistics or operations research. Some us-
ers will want to understand the algorithms
and inner workings of an analytics model
in order to trust the results prior to adop-
tion, but they are the exception.
Fast, cheap processors and cheap
storage make analysis on big data pos-
sible. Moore’s law has resulted in vast
increases in computing power and vast
decreases in the cost of storing and ac-
cessing data. With readily available and
inexpensive computing, we can do what-
if calculations often and test a number of
variables in big data for correlation.
Doing things fast is almost always
better than doing things perfectly.
Often inexact but fast approaches pro-
duce enormous gains because they re-
sult in better choices than humans would
have made without the use of analytics.
Over time, the approximate analytics
methods can be refined and improved to
achieve additional gains. However, for
many business processes, there is even-
tually a point of diminishing returns: The
calculations may become more detailed
and precise, but the end results are no
more accurate or valuable.
Using analytics leads to better
auditability and accountability. With
the use of analytics, the decision-making
process becomes more structured and
repeatable, and a decision becomes less
dependent on the individual making the
decision. When you change which peo-
ple are in various positions, things still
happen in the same way. You can often
go back and find out what analysis was
used and why a decision was made. …
Dr. Brenda L. Dietrich is an IBM Fellow and
vice president. She joined IBM in 1984, and
during her career she has worked with almost
every IBM business unit and applied analytics to
numerous IBM decision processes. She currently
leads the emerging technologies team in the IBM
Watson group. For more than a decade, she led
the Mathematical Sciences function in the IBM
Research division, where she was responsible for
both basic research on computational mathematics
and for the development of novel applications of
mathematics for both IBM and its clients.
In addition to her work within IBM, she has been
the president of INFORMS, the world’s largest
professional society for operations research and
management sciences. An INFORMS Fellow,
she has received multiple service awards from
INFORMS.
Dr. Emily C. Plachy is a distinguished engineer
in Business Analytics Transformation at IBM,
where she is responsible for leading an increased
use of analytics across IBM. Since joining IBM
in 1982, she has integrated data analysis into
her work and has held a number of technical
leadership roles including CTO, process,
methods, and tools in IBM Global Business
Services.
In 1992, Emily was elected to the IBM
Academy of Technology, a body of approximately
1,000 of IBM’s top technical leaders, and she
served as its president from 2009 to 2011. She is
a member of INFORMS.
Maureen Fitzgerald Norton, MBA, JD, is a
distinguished market intelligence professional
and executive program manager in Business
Analytics Transformation, responsible for driving
the widespread use of analytics across IBM. In
her previous role, she led project teams applying
analytics to IBM Smarter Planet initiatives in
public safety, global social services, commerce
and merchandising.
Norton became the first woman in IBM to
earn the designation of Distinguished Market
Intelligence Professional for developing
innovative approaches to solving business issues
and knowledge gaps through analysis.
Note: This article is adapted from the book,
“Analytics Across the Enterprise: How IBM
Realizes Business Value from Big Data and
Analytics,” authored by Brenda L. Dietrich, Emily
C. Plachy and Maureen F. Norton, published by
Pearson/IBM Press, May 2014, ISBN 978-0-
13-383303-4, ©2014 by International Business
Machines Corporation. For more information,
visit: ibmpressbooks.com.
SOFTWARE SURVEY
The future of forecasting
Making predictions from hard and fast data.
BY JACK YURKIEWICZ
Here is an easy forecast to make: Forecasting will be part of our
information flow for the foreseeable future. Forecasting is also a key
topic in my “Decision Modeling for Management” course. In preparing
the midterm exam for this past spring term, I wanted the students to
analyze the enrollment figures for the Affordable Care Act and make
some forecasts. The media has been talking about these enrollment
figures since the rollout, and politicians have been making
projections about them as well. In the course we covered various
forecasting methodologies, including trend analysis. Thus, my plan for
a midterm problem was to give the students the enrollment data and
have them make a forecast for the May 1 enrollment deadline. Getting
those enrollment numbers became obstacle number one.
Figure 1: http://www.cnn.com/interactive/2013/09/health/map-obamacare/.
Figure 2: http://www.whitehouse.gov/the-press-office/2014/04/17/fact-sheet-affordable-care-act-numbers.
Figures 1 and 2 show some typical results of an Internet search. I
found graphs, some better, more of them worse (look at the markers on
the x-axis of the graph in Figure 1), lots of opinion articles with
forecasts, but no data. I punted and decided to present the class a
similar but far less pressing problem. On March 31, the day of the
midterm exam, I asked students to make forecasts for the cumulative
domestic box-office gross for the recently released movie “Non-Stop.”
The action film starring Liam Neeson had opened on Feb. 28, and I gave
the students the daily domestic box-office gross values from opening
day through
March 16, or 17 days of data. The stu-
dents were asked to make a time plot
of these box-office figures (see Figure
3) and, after examining various trend
models, get a forecast for the cumu-
lative domestic box-office gross for a
target date, midterm day, March 31.
I knew that two days later (after I had
graded their exams and returned them),
Universal Studios would give the actual
cumulative domestic gross of the film
as of March 31. It was $85.39 million.
Of the various trend models we cov-
ered, the Weibull curve yielded the most
accurate forecast, $86.11 million; another
model was reasonably close, and the
others that we discussed and the students
tried were way off.
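To give a feel for how a Weibull-type trend curve can be fit and extrapolated, here is a minimal Python sketch. It rests on several assumptions: the curve is parametrized as ceiling × (1 − exp(−(day/scale)^shape)), which may differ from the form used in the course; the fit is a plain grid search rather than whatever the students' software did; and the input is synthetic data generated from a known curve, not the actual "Non-Stop" box-office figures.

```python
import math


def weibull_cumulative(day, ceiling, scale, shape):
    """Cumulative total by `day`: ceiling * (1 - exp(-(day / scale) ** shape))."""
    return ceiling * (1.0 - math.exp(-((day / scale) ** shape)))


def fit_weibull(days, totals, scales, shapes):
    """Grid-search scale and shape; solve the ceiling by least squares for each pair."""
    best = None
    for scale in scales:
        for shape in shapes:
            basis = [1.0 - math.exp(-((d / scale) ** shape)) for d in days]
            # For fixed scale and shape the model is linear in the ceiling.
            ceiling = sum(b * t for b, t in zip(basis, totals)) / sum(b * b for b in basis)
            error = sum((ceiling * b - t) ** 2 for b, t in zip(basis, totals))
            if best is None or error < best[0]:
                best = (error, ceiling, scale, shape)
    return best[1:]


# Synthetic data only: 17 days generated from a known curve (illustrative values).
TRUE_PARAMS = (90.0, 12.0, 1.0)  # ceiling, scale, shape
days = list(range(1, 18))
totals = [weibull_cumulative(d, *TRUE_PARAMS) for d in days]

scales = [s / 2 for s in range(10, 41)]  # candidate scales 5.0 to 20.0
shapes = [k / 10 for k in range(5, 21)]  # candidate shapes 0.5 to 2.0
ceiling, scale, shape = fit_weibull(days, totals, scales, shapes)

# Day 32 corresponds to March 31 when Feb. 28 is counted as day 1.
forecast_day_32 = weibull_cumulative(32, ceiling, scale, shape)
```

Because the synthetic data is noise-free and the true parameters lie on the grid, the search recovers them; with real daily grosses the fit would be approximate, and a proper nonlinear least-squares routine would be the more usual tool.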
CATEGORIZING THE FORECAST
SOFTWARE
Commercial forecasting software is available in two broad categories. Using the nomenclature from previous OR/MS Today forecasting surveys, the first category is called dedicated software. A dedicated product offers only forecasting capabilities, such as Box-Jenkins, exponential smoothing, trend analysis, regression and other procedures. The second category is called general statistical software. Here, forecasting techniques are a subset of the many statistical procedures the product can do. Thus, a product that can do ANOVA, factor analysis, etc., as well as Box-Jenkins techniques would fall into this group. In recent years, the number of products in the second category has been growing, as statistical software firms have been adding more, and more sophisticated, forecasting methodologies to their lists of features and capabilities. However, some dedicated software manufacturers offer specific capabilities and features (e.g., transfer functions, econometric models, etc.) that general statistical programs may not have.
Figure 3: Initial daily domestic box-office gross of the motion picture (“Non-Stop”).
In both software categories, products vary in the degree to which they can find the appropriate model and the optimal parameters of that model. For example, Winters’ method requires values for three smoothing constants, and Box-Jenkins models have to be specified with various parameters, such as ARIMA(1,0,1)x(0,1,2). Forecasting software varies in its ability to find these parameters.
For the purposes of this and previous surveys, software is characterized by its ability to find the optimal model and parameters for the data. Software is labeled as automatic if it both recommends the appropriate model to use on a particular data set and finds the optimal parameters for that model. Automatic software typically asks the user to specify some criterion to minimize (e.g., Akaike Information Criterion (AIC), Schwarz Bayesian Information Criterion (SBIC), RMSE, etc.) and then recommends a forecast model for the data, gives the model’s optimal parameters, calculates forecasts for a user-specified number of future periods, and gives various summary statistics and graphs. The user can manually overrule the recommended model and choose another, and the software finds the optimal parameters, forecasts, etc., for that one.
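To make the "automatic" idea concrete, here is a minimal Python sketch of model recommendation by minimizing AIC. It is not how any surveyed product works internally: the two candidate models (a constant mean and a straight-line trend), the least-squares form of AIC, and both data series are assumptions chosen only for illustration.

```python
import math

def fit_mean(ys):
    """Constant-level model: one parameter (the mean)."""
    level = sum(ys) / len(ys)
    return [level] * len(ys), 1

def fit_linear(ys):
    """Least-squares line through (0, y0), (1, y1), ...: two parameters."""
    n = len(ys)
    ts = range(n)
    t_bar = sum(ts) / n
    y_bar = sum(ys) / n
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    return [intercept + slope * t for t in ts], 2

def aic(ys, fitted, n_params):
    """Least-squares form of AIC: n * ln(SSE / n) + 2k (assumes SSE > 0)."""
    n = len(ys)
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return n * math.log(sse / n) + 2 * n_params

def recommend_model(ys):
    """Return the candidate with the lowest AIC, plus all scores."""
    candidates = {"mean": fit_mean, "linear trend": fit_linear}
    scores = {name: aic(ys, *fit(ys)) for name, fit in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Made-up illustrative series, not data from the article.
trending = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
flat = [5.1, 4.9, 5.0, 5.2, 4.8, 5.0]

print(recommend_model(trending)[0])
print(recommend_model(flat)[0])
```

The penalty term 2k is what keeps the extra slope parameter from being chosen when it barely reduces the error, which is the reason such criteria are offered alongside plain RMSE.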
The second category is called semi-
automatic. Such software asks the user to
pick a forecasting model from a menu and
some statistic to minimize, and the pro-
gram then finds the optimal parameters
for that model, the forecasts, and various
graphs and statistics.
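The semi-automatic case can be sketched the same way: the user fixes the model, and the program searches for the parameter value that minimizes the chosen statistic. The Python example below assumes simple exponential smoothing as the user's choice, one-step-ahead RMSE as the statistic, and a coarse grid of smoothing constants; the series is made up for illustration.

```python
import math

def ses_rmse(ys, alpha):
    """One-step-ahead RMSE of simple exponential smoothing with constant alpha."""
    forecast = ys[0]  # initialize the forecast with the first observation
    sq_errors = []
    for y in ys[1:]:
        sq_errors.append((y - forecast) ** 2)
        forecast = alpha * y + (1 - alpha) * forecast
    return math.sqrt(sum(sq_errors) / len(sq_errors))

def best_alpha(ys, grid=None):
    """Grid search over smoothing constants 0.1, 0.2, ..., 1.0 by default."""
    grid = grid or [i / 10 for i in range(1, 11)]
    return min(grid, key=lambda a: ses_rmse(ys, a))

# Made-up illustrative series, not data from the article.
noisy_level = [10, 12, 9, 11, 10, 13, 9, 11]
alpha = best_alpha(noisy_level)
print(alpha, round(ses_rmse(noisy_level, alpha), 3))
```

A commercial package would typically use a finer search or a numerical optimizer, but the division of labor is the same: the user picks the method, the software picks the constants.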
The third category is called manual
software. Here the user must specify both
the model that should be used and the
corresponding parameters. The software
then finds the forecasts, summary statis-
tics and charts. If you frequently need to
make forecasts of different types of time
series, using manual software could be a
tedious choice. Unfortunately, that broad
advice may not be apropos for some
software. Some products fall into two
categories. For example, if you choose
a Box-Jenkins model, the software may
find the optimal parameters for that mod-
el, but if you specify that Winters’ method
be used, the product may require that
you manually enter the three smoothing
constants.
When it comes to analyzing trends, most of the products I tried fall into the semi-automatic group. That is, I need to choose a trend curve, and the software finds the appropriate parameters for that model and gives forecasts, summary statistics and graphs.
WORKING WITH A SAMPLE OF
PRODUCTS
In my class, students use StatTools,
part of the Palisade Software Suite that
comes with their textbook. Its forecasting
capabilities are regression, exponential
smoothing (Brown, Holt and Winters’) and
moving averages. If data followed some
nonlinear function, the students could
make mathematical transformations to
make the data linear and then use ordi-
nary linear regression on it, and do the
inverse transformation to get the forecast.
They also have several Excel templates
I developed (Gompertz, Pearl-Reed,
Weibull, etc.) for the course. For this ar-
ticle, I tried a small sample of professional
products from different categories, spe-
cifically Minitab, IBM SPSS and NCSS
on the “Non-Stop” movie data. IBM SPSS
falls into the automatic forecasting catego-
ry; Minitab and NCSS are semiautomatic
products. A caveat: This is not meant to be
a critical review of any product mentioned.
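The transform-then-regress approach described above for the students' work can be sketched in a few lines of Python. The example assumes an exponential trend y = a·exp(b·t), which becomes linear after taking logarithms; the data are synthetic, generated from known parameters so the recovery can be checked, and the function names are illustrative.

```python
import math

def fit_exponential_trend(ts, ys):
    """Fit y = a * exp(b * t) by least squares on ln(y) versus t (needs y > 0)."""
    log_ys = [math.log(y) for y in ys]  # transformation that linearizes the data
    n = len(ts)
    t_bar = sum(ts) / n
    z_bar = sum(log_ys) / n
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxz = sum((t - t_bar) * (z - z_bar) for t, z in zip(ts, log_ys))
    b = sxz / sxx
    a = math.exp(z_bar - b * t_bar)  # inverse transformation of the intercept
    return a, b

def forecast(a, b, t):
    """Inverse-transformed forecast on the original scale."""
    return a * math.exp(b * t)

# Synthetic data from a = 2.0, b = 0.3 (not data from the article).
ts = list(range(8))
ys = [2.0 * math.exp(0.3 * t) for t in ts]

a, b = fit_exponential_trend(ts, ys)
print(round(a, 3), round(b, 3))
print(round(forecast(a, b, 10), 2))
```

One caveat with this shortcut is that least squares on the log scale minimizes relative rather than absolute errors, so the result can differ from a direct nonlinear fit on noisy data.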
I let IBM SPSS first do the analysis of
the movie data via its automatic mode,
called “Expert Modeler” (i.e., choose the
model and its parameters and get the
forecasts). Figure 4 shows superimposed
screen shots of IBM SPSS’ worksheet,
showing the “Non-Stop” daily domestic
box-office gross and the menu system
to start the automatic forecasting proce-
dure. The program then gave its recom-
mended model, Brown’s method for data
with linear trend, which uses one smooth-
ing constant to estimate the intercept and
slope of the fitted line (as compared to
Holt’s method, which uses two inde-
pendent smoothing constants) [1]. IBM
SPSS’ accompanying statistics, forecast
plot and additional output are shown in
Figure 5.
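For readers who have not met Brown's method, the following Python sketch shows the textbook form of double exponential smoothing with a single constant. It is not SPSS code; the initialization (both smoothed series start at the first observation) and the example series are assumptions made for illustration.

```python
def brown_forecast(ys, alpha, horizon=1):
    """Brown's linear-trend smoothing with one constant alpha (0 < alpha < 1)."""
    s1 = s2 = ys[0]  # assumed initialization: both smoothed series start at y0
    for y in ys[1:]:
        s1 = alpha * y + (1 - alpha) * s1    # smooth the data
        s2 = alpha * s1 + (1 - alpha) * s2   # smooth the smoothed series
    level = 2 * s1 - s2                      # estimated intercept at the last period
    trend = alpha / (1 - alpha) * (s1 - s2)  # estimated slope
    return level + trend * horizon

# Made-up series with a steady upward trend.
series = [float(t) for t in range(60)]
print(brown_forecast(series, alpha=0.5, horizon=1))
```

Both the level and the slope are driven by the same alpha here, which is the contrast with Holt's method, where the slope gets its own smoothing constant.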
Figure 4: IBM SPSS input worksheet (showing the “Non-Stop” movie daily box-office returns).
Figure 5: IBM SPSS’ results of “automatic” forecasting of the “Non-Stop” data.
IBM SPSS does have a curve fitting feature, so I utilized it and specified three possible models to be examined – the linear, growth and logistic curves. Figures 6 and 7 give the resulting output and plots for these choices.
NCSS has, in addition to the standard forecasting procedures (Box-Jenkins and exponential smoothing models), an extensive list of more than 20 nonlinear curve models under its menu label “Growth and Other Models.” The user chooses a model, and NCSS finds the appropriate parameters for the particular data set. I chose, for the “Non-Stop” data, the “Logistic(4)” model [i.e., a logistic curve with four parameters; there is a Logistic(3) model available as well], and Figure 8 shows NCSS’ output.
Minitab is a hybrid of a semi-automatic and manual forecasting product. If you specify that a Box-Jenkins model be used, the software finds the appropriate parameters for the model. However, if you choose Winters’ method, Minitab requires that you manually enter values for the three smoothing constants. Minitab also has, under the Time Series choice on the main menu, a Trend Analysis option. Choosing that gives the user four possible curves (linear, quadratic, exponential and Pearl-Reed logistic). Figure 9 gives the results of my choice for the “Non-Stop” data, the Pearl-Reed curve (Minitab calls it the S-Curve Trend Model).
Figure 6: IBM SPSS’ fitted models for three specified growth curves.
Figure 7: IBM SPSS’ plot of the data and growth curves.
Figure 8: NCSS’ output. I chose the “Logistic(4)” from NCSS’ list of “Growth and Other Models.”
Figure 9: Minitab’s output for the Pearl-Reed logistic growth model for the Non-Stop data.
Finally, Figure 10 shows the results of one of my Excel templates, which uses the four-parameter Weibull trend curve and Solver’s nonlinear programming capability to find the parameters that minimize the root mean square error for the entered data.
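The template's idea, choosing curve parameters to minimize RMSE, can be imitated outside Excel. The Python sketch below is not the author's template: the functional form a − (a − b)·exp(−(c·t)^d) is one common four-parameter Weibull growth curve and is assumed here, a plain compass (pattern) search stands in for Solver's nonlinear optimizer, and the data are synthetic.

```python
import math

def weibull_curve(t, a, b, c, d):
    """Assumed four-parameter Weibull growth form: a - (a - b) * exp(-(c*t)**d)."""
    return a - (a - b) * math.exp(-((c * t) ** d))

def rmse(params, ts, ys):
    """Root mean square error of the curve; infinite for invalid parameters."""
    a, b, c, d = params
    if c <= 0 or d <= 0:
        return math.inf
    try:
        sq = [(y - weibull_curve(t, a, b, c, d)) ** 2 for t, y in zip(ts, ys)]
    except OverflowError:
        return math.inf
    return math.sqrt(sum(sq) / len(sq))

def compass_search(start, ts, ys, step=1.0, tol=1e-6, max_iter=5000):
    """Try +/- step on each parameter; halve the step when nothing improves."""
    best = list(start)
    best_err = rmse(best, ts, ys)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = best[:]
                trial[i] += delta
                err = rmse(trial, ts, ys)
                if err < best_err:
                    best, best_err, improved = trial, err, True
        if not improved:
            step /= 2
    return best, best_err

# Synthetic daily values generated from known parameters (not the movie data).
true_params = (90.0, 5.0, 0.1, 1.5)
ts = list(range(1, 18))
ys = [weibull_curve(t, *true_params) for t in ts]

start = [80.0, 0.0, 0.2, 1.0]
start_err = rmse(start, ts, ys)
fitted, final_err = compass_search(start, ts, ys)
print([round(p, 3) for p in fitted], round(final_err, 4))
```

A search this simple only ever accepts improvements, so the final RMSE cannot exceed the starting one, but it can stall in a poor local minimum; a production optimizer would scale the parameters and use gradient information.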
THE SURVEY
We e-mailed the vendors and asked them to complete our online questionnaire so readers could see the features and capabilities of the software. The purpose of the survey is to inform the reader of a program’s forecasting capabilities and features. We tried to identify as many forecasting vendors and products as possible and contacted all the vendors that we identified and/or that responded to the last survey in 2012. For those who did not respond, we tried gentle reminders (several e-mails and some phone calls). In addition to the features and capabilities of the software, we wanted to know what techniques or enhancements have been added to the software since our previous survey. The information comes from the vendors, and we made no attempt to verify what they gave us.
Figure 10: The four-parameter Weibull curve fit for the Non-Stop data.
If you use data to make forecasts,
what should you look for in a vendor and
the product? First, find out the capabilities
of the software. Specifically, what fore-
casting methodologies can the product
do? Does it find the optimal parameters
of the procedure for your particular data
set or must you manually enter those val-
ues? How extensive, useful and clear is
the output?
Most, but not all, vendors allow you to download a time-trial version of the software that typically expires in anywhere from a week to a month. Ideally, the trial version should allow you to work with your own data and not just “canned” data that the vendor bundles with the trial software. Check whether the trial version limits the size of the data set and, if so, whether those limits are overly restrictive.
Ask about technical support, updat-
ing to a newer version when it is released
and differences (if any) depending on the
operating system you are using. Contact
the vendor with your specific questions.
Users tell me, and I have independently
found, that most vendors have good and
helpful technical support before and after
you buy.
Jack Yurkiewicz (yurk@optonline.net) is a
professor of management science in the MBA
program at the Lubin School of Business, Pace
University, New York. He teaches data analysis,
management science and operations management.
His current interests include developing and
assessing the effectiveness of distance-learning
courses for these topics. He is a longtime member
of INFORMS.
SURVEY DATA & DIRECTORY
To view the survey results as well as a directory
of vendors who participated in the survey,
click here.
CONFERENCE PREVIEW
S.F. conference set to capture hearts & minds
BY CANDACE “CANDI” YANO
Tony Bennett sang that he “left his heart in San Francisco” – and at the 2014 INFORMS Annual Meeting in San Francisco, you will begin to understand why as you take advantage of the opportunity to fill both your heart and your mind. To fill your mind, you can attend special presentations:
• Alvin Roth, professor of economics at Stanford University and professor of economics and business administration at Harvard University, who was awarded the 2012 Nobel Prize in Economics for his work in the area of Game Theory, will talk about his work.
• Richard Cottle, emeritus professor at Stanford University, will offer a commemorative and historical perspective on George Dantzig in honor of Dantzig’s 100th birthday.
• Jonathan Caulkins, professor at the Heinz School of Public Policy at Carnegie Mellon University, will discuss his work on health and drug-related policy issues.
• Anthony Levandowski of Google will talk about the Google Driverless Car project, offering his perspective as both a developer and a user of the technology.
• A panel of experts from within the INFORMS community will discuss their experience with, and offer advice on, massively open online courses (MOOCs).
Some of San Francisco’s many landmarks are mobile.
If this is not enough, there will be
more than 4,000 technical presentations
by experts from industry, academia and
government. Topics will be wide-rang-
ing, covering the full breadth of the field,
from leading-edge advancements in op-
erations research methodologies and
analytics, to applications in healthcare,
energy, critical infrastructure manage-
ment, environmental management and
supply chain management.
If you are not already overwhelmed
while filling your mind, you will have ample
opportunity to fill your heart – and stom-
ach. San Francisco is regarded as one of
the most beautiful cities in the world and
offers world-class cuisine from almost
every ethnic heritage. The meeting will
take place in two adjacent hotels, the Hil-
ton San Francisco Union Square and the
Parc 55 Wyndham. The location is in close
proximity to the city’s prime shopping dis-
trict and near the boarding point for cable
cars to Fisherman’s Wharf – famous for
fresh seafood and Pier 39 – where you
can see dozens of sea lions and walk to
ferries that offer everything from simple
rides across San Francisco Bay to amaz-
ingly scenic tours, as well as Ghirardelli
Square, known for Ghirardelli chocolate.
Venturing into other parts of San Fran-
cisco, you can visit world-class muse-
ums, including the Palace of the Legion of
Honor, DeYoung Museum, Asian Art Mu-
seum and California Academy of Sci-
ences. The performing arts, including the
symphony, ballet, opera, jazz, theater and
concerts, are all within easy reach. If you
prefer the outdoors, you can take a trip
to the former prison on Alcatraz (a limited
number of tickets will be available to con-
ferees for purchase), see the redwoods in
Muir Woods, hike in the Marin Headlands
with an unobstructed view of the Golden
Gate Bridge, sign up to play a round of golf
with other conferees at TPC Harding Golf
Course the day before the conference, or
simply wander through the haunts of the
hippies in Haight-Ashbury or the Beat po-
ets in North Beach. Just a bit further from
the city are the wine regions of Napa and
Sonoma, only an hour’s drive away.
Both the meeting and the venue will
have much to offer in many dimensions.
We look forward to seeing you there.
Candace “Candi” Yano is general chair of the
2014 INFORMS Annual Meeting in San Francisco.
She is a longtime member of INFORMS.
FIVE-MINUTE ANALYST
Probabilistic parking problems
BY HARRISON SCHRAMM, CAP
Few things make me more conflicted than parking lots. On a personal level, I loathe the whole parking activity. It brings out what I think are the worst behaviors of humankind: hoarding, brinksmanship, scarcity mentality, irrational objective functions… and now you see why as an O.R. professional I love parking lots: because they are so interesting to study.
At the corner of Hades Street and Styx Ave. is (at least to me) the world’s worst parking lot. Here’s the set-up: There is an upper level with metered parking. The meter has a two-hour limit at a rate of $1.25/hour, but pressing a silver button on the meter sets the time to 60 minutes if the meter is currently at less than 60 (see Figure 1). This makes parking here free to most visitors. The lower level is a standard parking garage, which has a flat $2 per hour fee that can be validated by the two “anchor” stores, making it essentially free for most patrons as well. While this column is light and exploratory, there is serious work going on with parking problems [1].
Figure 1: A “smart meter” in a parking lot. This meter has a button next to the coin slot that may be pressed for a free hour of parking. Coins may be added for additional time, up to two hours.
In the sterile world of figures and
mathematics, this sounds like a reason-
able way to run a parking lot, and pa-
trons who miss the upstairs free parking
will simply renege and take the lower
level free parking. In reality, people
“mob” the upstairs portion in search of
“free parking.” My assistant and I had
observed this behavior over a num-
ber of weeks, and we were interested
in learning about the time parked cars
spent in the lot, with an eye for simple
metrics such as expected wait time for a
parking spot or the expected number of
cars “trolling” for a slot. This interest be-
came action (the key for any analysis),
and we chose 6:30 p.m. on a Thursday
evening – a time that we knew the park-
ing lot would be full – to collect data
from the meters, which is displayed for
anyone who wishes to see.
What we found was surprising.
We expected to see uncorrelated
parking lot data. We did not expect to
find many over-time parking spots. I
hoped that the data would be exponen-
tial – which would lead to nice, clean
analysis. What we discovered was, well,
a mess.
Of the 100 parking spots surveyed,
25 percent were “flashing” or over-time
(violation). Of the parking spots that
were not over-time, six showed times
over one hour, implying that the persons
parked there had in fact put money in the
meter. We are completely discarding the
possibility that someone would park in a
spot that had been previously occupied
but was not vacated, i.e., showing up
with 30 minutes remaining on meter and
not pressing the button/inserting coins. I
had hoped that the sojourn times would
be exponentially distributed, but that is a
case that is pretty difficult to make with
this dataset (see Figure 2).
Now, we don’t actually know how many patrons have paid, or how many have simply run over. However, there are 100 parking spots considered, and of these, six currently have clocks over one hour. We can (crudely) estimate [2] the true number of paid parking spots by realizing that we are observing the last hour of what may be a two-hour process. Therefore, we think approximately 12 parking spots have been paid for at any given time.
Figure 2: Histogram of raw parking meter data. Note the tri-modal nature of the data. “Overtime,” i.e., flashing parking meters are represented by -1 in the red-shaded oval and constitute the large bar at the origin of the graph. Known paid parking meters are at the right and have a blue oval.
YES, BUT WHAT DOES IT ALL MEAN?
So in one sense, the distributions of the data are irrelevant; there are 100 parking spots on average, and the average time that a parking spot is occupied is some time greater than 27 minutes. If we make the (not bad!) assumption that the parking spots that run over are occupied for 90 minutes, then the average occupancy is 43 minutes. In a lot with 100 spots, this means that on average, one spot comes open every 30 seconds. This doesn’t sound so bad. If we treat the system as a queue, and use the (observed) steady-state number of three cars waiting, we can place a rough lower estimate [3] that a new car arrives every 30 seconds looking for a parking spot, and that they have between a 15 percent and 25 percent chance of finding an open spot. These crude estimates, however, do not agree very well with observation, because they neglect the “blocking” effect of other cars waiting for spots to open up. A better analysis of this parking lot would involve simulation, which would go beyond our intent.
Figure 3: Histogram of parking time remaining, less than 60 minutes. Approximately six of these data points are actually spill over from “paying” customers.
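The back-of-the-envelope numbers above can be reproduced with a few lines of Python. This is one possible reading, not the author's worksheet: it assumes the 27-minute figure applies to the 75 percent of spots that were not over-time, treats the whole lot as a single server that "serves" one car each time a spot opens, and shows the M/M/1 result both when the three observed cars are taken as the number in the system and when they are taken as the number in the queue.

```python
import math

spots = 100
overtime_share = 0.25     # 25 percent of meters were flashing (from the article)
overtime_minutes = 90.0   # the article's assumption for over-time parkers
other_minutes = 27.0      # assumed here to apply to the remaining 75 percent

# Weighted average occupancy and the implied time between spots opening up.
avg_occupancy = overtime_share * overtime_minutes + (1 - overtime_share) * other_minutes
seconds_per_opening = avg_occupancy * 60 / spots
print(avg_occupancy)        # 42.75, which the article rounds to 43 minutes
print(seconds_per_opening)  # roughly 26 seconds, quoted as "every 30 seconds"

def rho_from_number_in_system(L):
    """M/M/1: L = rho / (1 - rho), so rho = L / (1 + L)."""
    return L / (1 + L)

def rho_from_number_in_queue(Lq):
    """M/M/1: Lq = rho**2 / (1 - rho); take the positive root of the quadratic."""
    return (-Lq + math.sqrt(Lq * Lq + 4 * Lq)) / 2

rho_system = rho_from_number_in_system(3)
rho_queue = rho_from_number_in_queue(3)

# In M/M/1 the chance an arrival finds the server idle is 1 - rho.
print(1 - rho_system, 1 - rho_queue)

# Implied time between arriving cars if one spot opens every seconds_per_opening.
print(seconds_per_opening / rho_system)
```

Under these assumptions both readings land inside the article's 15 to 25 percent range for finding an open spot, and the implied gap between arriving cars is on the order of half a minute.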
THE WORLD’S WORST PARKING LOT?
Because of the behavior of the driv-
ers while trolling for a parking spot, it
might be considered the world’s worst
parking lot. Enforcement of the park-
ing policy might help because it would
decrease the sojourn times of the cars
parked in the lot, but there is no guar-
antee, and – more importantly – no di-
rect incentive for the parking lot owners
to do so. This is because the number of
“free” parking spots is fixed, and once
they are filled, they are filled, regard-
less of by whom. From the lot man-
ager’s point of view, it doesn’t matter
if they are “long” or “short” parkers.
In fact, the rate structure is such that
short parkers are slightly more lucrative
for the parking lot owner than parking
above ground.
In conclusion, it’s probably a bit of literary hyperbole to imply that this is the world’s worst parking lot; I’m sure there are others that are much worse. It feels that way to me because I like to make short trips to this area and visit the locations that don’t validate parking, and I really don’t like the risky behaviors aggressive parkers participate in. On the upside, there’s time to write 12 articles in a single push of the button!
I’d be interested in hearing real
contenders for the “World’s Worst
Parking” lot.
Update: Between the original draft of
this article and its publication, the park-
ing lot in question began installing an
electronic system to help customers de-
termine how many spots were available
before entering the parking “queue.” It
has yet to be determined if it will change
the behaviors of the parking lot. Look
forward to an update in a future column!
Harrison Schramm (harrison.schramm@gmail.com) is an operations research professional in the Washington, D.C., area. He is a member of INFORMS and a Certified Analytics Professional (CAP).
NOTES & REFERENCES
1. Fabusuyi, Hampshire, Hill and Sasanuma, 2014, “Decision Analytics for Parking Availability in Downtown Pittsburgh,” Interfaces, INFORMS, Hanover, Md.
2. This is just an estimate. More delicate techniques
may be applied.
3. Using the M/M/1 queuing model to find the “lower”
or optimistic estimates, and the M/G/1 queuing model
to find the upper estimate.
Frog and fly
BY JOHN TOCZEK
THINKING ANALYTICALLY
A frog is looking to catch his next meal
just as a fly wanders into his pond. The frog
jumps randomly from one lily pad to the next in
hopes of catching the fly. The fly is unaware of
the frog and is moving randomly from one red
flower to another.
The frog can only move on the lily pads
and the fly can only move on the flowers.
The interval at which both the frog and the
fly move to a new space is one second. They
never sit still and always move away from the
space they are currently on. Both the frog
and the fly have an equal chance of moving
to any nearby space including diagonals. For
example, if the frog were on space A1, he
would have a one-in-three chance each of moving
to A2, B2 and B1.
The frog will capture the fly when he lands on the
same space as the fly.
QUESTION: On which space is the frog most likely to catch the fly?
Send your answer to puzzlor@gmail.com by
Aug. 15. The winner, chosen randomly from correct
answers, will receive a $25 Amazon Gift Card. Past
questions can be found at puzzlor.com.
Figure 1: Where will the frog dine on the fly?
John Toczek is the senior director
of Decision Support and Analytics for
ARAMARK Corporation in the Global
Operational Excellence group. He
earned a bachelor of science degree
in chemical engineering at Drexel
University (1996) and a master’s
degree in operations research from
Virginia Commonwealth University
(2005). He is a member of INFORMS.

WHY DO SO MANY ANALYTICS PROJECTS STILL FAIL?

  • 1.
    H T TP : / / W W W. A N A LY T I C S - M A G A Z I N E . O R G JULY/AUGUST 2014DRIVING BETTER BUSINESS DECISIONS BROUGHT TO YOU BY: WHY ANALYTICS PROJECTS FAIL ALSO INSIDE: • Dark side of digital world • Real-time text analytics • Data scientists’ time to shine • The future of forecasting Key considerations for deep analytics on big data, learning and insights Executive Edge Hewlett-Packard V. P. Rohit Tandon: Six ways of value creation via E-commerce analytics
  • 2.
    W W W.I N F O R M S . O R G2 | A N A LY T I C S - M AGA Z I N E . O R G What I learned today INSIDE STORY One of the advantages of editing Analytics (as well as OR/MS Today, the membership magazine of INFORMS) is I learn something new every day, thanks to the wide array of contributed articles we receive. For example, just in preparing this issue, I learned: • Nearly 20 years ago, Amazon found- er Jeff Bezos said that Amazon intended to sell books at or near cost as a way of gathering data on affluent, educated shoppers, as reported by George Packer in The New Yorker. The implication: The data, once analyzed, had more value than the loss-leader books, which proved absolutely correct when Amazon began selling everything under the sun to well- targeted consumers. Drawing on Packer’s article, as well as a couple of books (“Who Owns the Future?” and “The Ethics of Big Data”), Vijay Mehrotra explores the dark side of technology, big data and analytics – and the perceived and/or potential threat it poses – in his Analyze This! column. Don’t miss it. • A Formula 1 pit crew, working in an optimized, well-coordinated fashion, can change a set of four tires in less than two seconds. That means that unless you’re Evelyn Wood, that crew can change 12 tires in the time it takes you to read this sentence. For the story behind the motorsports magic, check out Andy Boyd’s Forum column. Seeing is be- lieving, so don’t miss the amazing videos referenced at the end of the article. • We all know the digital/technical world will come to a wordy end without acronyms, but do you know what MOOC stands for? I do (“massively open online course”), thanks to an interview I did with executive search honcho Linda Burtch regarding the red-hot analytics job market. • Finally, I also learned from Linda that in today’s dynamic world, young people should plan on three or four ca- reers during their lifetime. 
“It’s not good to specialize in one thing and try to stick with one company or one industry or one vertical application for your entire career,” she says in the Q&A. “It’s incredibly dangerous, and it likely won’t carry you through a 35-year career. You need to be continuously learning something new.” I got that last part going for me, every day.
– PETER HORNER, EDITOR, peter.horner@mail.informs.org
  • 4.
    CONTENTS, JULY/AUGUST 2014
    FEATURES
    • REAL-TIME TEXT ANALYTICS, by Aveek Mukhopadhyay and Roger Barga. How a cloud-based analytical engine yields instant insight using unstructured social media data.
    • WHY DO ANALYTICS PROJECTS FAIL?, by Haluk Demirkan and Bulent Dal. Not just another IT project: Key considerations for deep analytics on big data, learning and insights.
    • ‘IT’S THEIR TIME TO SHINE’, by Peter Horner. Job prospects for data scientists and elite analytics professionals have never been better – and the future is even brighter.
    • ANALYTICS TRANSFORMS A ‘DINOSAUR’, by Brenda Dietrich, Emily Plachy and Maureen Norton. The story of how industry giant IBM not only survived but thrived by realizing business value from big data.
    • THE FUTURE OF FORECASTING, by Jack Yurkiewicz. Making predictions from hard and fast data: Biennial survey of popular software for analytics professionals.
  • 6.
    REGISTER FOR A FREE SUBSCRIPTION: http://analytics.informs.org
    INFORMS BOARD OF DIRECTORS
    President: Stephen M. Robinson, University of Wisconsin-Madison
    President-Elect: L. Robin Keller, University of California, Irvine
    Past President: Anne G. Robinson, Verizon Wireless
    Secretary: Brian Denton, University of Michigan
    Treasurer: Nicholas G. Hall, Ohio State University
    Vice President-Meetings: William “Bill” Klimack, Chevron
    Vice President-Publications: Eric Johnson, Dartmouth College
    Vice President-Sections and Societies: Paul Messinger, CAP, University of Alberta
    Vice President-Information Technology: Bjarni Kristjansson, Maximal Software
    Vice President-Practice Activities: Jonathan Owen, CAP, General Motors
    Vice President-International Activities: Grace Lin, Institute for Information Industry
    Vice President-Membership and Professional Recognition: Ozlem Ergun, Georgia Tech
    Vice President-Education: Joel Sokol, Georgia Tech
    Vice President-Marketing, Communications and Outreach: E. Andrew “Andy” Boyd, University of Houston
    Vice President-Chapters/Fora: David Hunt, Oliver Wyman
    INFORMS OFFICES
    www.informs.org • Tel: 1-800-4INFORMS
    Executive Director: Melissa Moore
    Meetings Director: Laura Payne
    Marketing Director: Gary Bennett
    Communications Director: Barry List
    Headquarters: INFORMS (Maryland), 5521 Research Park Drive, Suite 200, Catonsville, MD 21228. Tel.: 443.757.3500. E-mail: informs@informs.org
    ANALYTICS EDITORIAL AND ADVERTISING
    Lionheart Publishing Inc., 506 Roswell Street, Suite 220, Marietta, GA 30060 USA. Tel.: 770.431.0867 • Fax: 770.432.6969
    President, Advertising Sales: John Llewellyn, john.llewellyn@mail.informs.org, Tel.: 770.431.0867, ext. 209
    Editor: Peter R. Horner, peter.horner@mail.informs.org, Tel.: 770.587.3172
    Assistant Editor: Donna Brooks, donna.brooks@mail.informs.org
    Art Director: Jim McDonald, jim.mcdonald@mail.informs.org, Tel.: 770.431.0867, ext. 223
    Advertising Sales: Sharon Baker, sharon.baker@mail.informs.org, Tel.: 813.852.9942
    Analytics (ISSN 1938-1697) is published six times a year by the Institute for Operations Research and the Management Sciences (INFORMS), the largest membership society in the world dedicated to the analytics profession. For a free subscription, register at http://analytics.informs.org. Address other correspondence to the editor, Peter Horner, peter.horner@mail.informs.org. The opinions expressed in Analytics are those of the authors, and do not necessarily reflect the opinions of INFORMS, its officers, Lionheart Publishing Inc. or the editorial staff of Analytics. Analytics copyright ©2014 by the Institute for Operations Research and the Management Sciences. All rights reserved.
    DEPARTMENTS
    Inside Story • Executive Edge • Analyze This! • Healthcare Analytics • INFORMS Initiatives • Forum • Conference Preview • Five-Minute Analyst • Thinking Analytically
  • 8.
    EXECUTIVE EDGE
    Six ways of value-creation through analytics in E-commerce
    BY ROHIT TANDON AND SHRUTI UPADHYAY
    The increasing popularity of, and access to, the Internet has changed the way marketers interact with customers. These customers are smart, well informed and empowered, as Internet connectivity is available to them at their fingertips and on the go. It has therefore become imperative for organizations to be on the customers’ online radar with respect to new products or services and to be able to influence their choices.
    Not surprisingly, according to one study, 34 percent of marketers are generating leads through Twitter. India’s online retail market grew at a staggering 88 percent in 2013 to $16 billion and continues to grow. These examples are a testimony to the growth of e-commerce. The Internet deluge has opened an assortment of opportunities. Customers are able to buy high-end fashion and designer shoes, book hotels, buy movie tickets and you-name-it.
    Therefore, an opportunity exists for business research to capture, compile, churn and store colossal amounts of information about customers, suppliers and operations. This is what we call the age of “big data.” We believe that this age is a natural progression in online business and is here to stay. We are already seeing a surge in adoption of digital channels such as social media, e-mail marketing and display ads in e-commerce. Imagine the amount of data this
  • 10.
    has created for marketers to lay their hands on for analysis. Despite that, in the race to utilize the online space, marketers may be focusing more on advertising and less on analysis of the data that could potentially increase sales.
    In our opinion, understanding customer behavior becomes more complex in business-to-consumer companies, and more so in a 24/7 e-commerce business that sells technology products in an increasingly commoditized industry. A strong analytics foundation may make e-commerce a thriving and successful sales channel. Businesses, therefore, are increasingly creating customizable campaigns for their installed-base customers and improving sales effectiveness through e-commerce.
    For example, pricing and merchandising decisions need to be made in real time, and the need for real-time insights is ever-increasing. To make these decisions faster and better, marketers need to quickly analyze their digital marketing strategies by mining data exhaustively and cost-effectively through advanced analytics.
    KEY DRIVERS OF INCREASED REVENUES
    An organization’s ability to achieve its goal of increased revenues and margins depends heavily on its ability to improve three key drivers: 1) volume of customer traffic to the online store (number of visits); 2) customer conversion (percentage of visits that convert to orders); and 3) basket size (average revenue per order). Analytics has a very important role to play in this value chain. So while organizations may have the best talent with an analytical mindset and eagerness to apply it, we need to equip data
  • 11.
    scientists in organizations with the right tools and insights. Conversations with analytics professionals reinforce our belief in the following must-haves that will elevate an organization’s e-commerce agenda to the next level:
    1. Development of best-in-class tools and techniques is a must to build scalable solutions and tackle the optimization of key drivers. Over the years various products such as SAS have provided excellent development environments, but every data scientist had to start from scratch and depend on “personal” techniques to tackle new problems. In recent years, however, data scientists and organizations have been moving toward using templates and building packaged models and solutions to reuse and replicate technologies with ease. One of the first such pilot solutions within HP was developed for HPDirect.com’s demand generation function, where global analytics developed V.1 of a series of demand generation models. These models also paved the way for the development of
  • 12.
    customer targeting models. In most organizations, such initiatives, if implemented, have the potential to lay the foundation for similar opportunities with other business functions such as planning, store operations and category management. When an organization reaches such a stage of maturity, that’s when true “return on data” (ROD) is possible.
    2. The three Ws … whom, what, when. Traditionally, marketers have used a uni-dimensional approach to target customers. However, results show that such approaches can be sub-optimal and might have an adverse effect on customer loyalty and brand image. Answering questions such as whom to target, what to offer and when to offer it brings a paradigm shift in garnering customer interest and loyalty. The answers help rank customers on their propensity to re-purchase, lead to preferential treatment of the right customers with the right product portfolio, and allow marketers to understand when to offer discounts. Effective tools and modeling will also provide clues on the probability of customers picking one product over another and on repeat customer behaviors. This brings us back to the importance of using effective, proven analytics tools and techniques.
    3. Automate and innovate. Creating and applying big data algorithms will help organizations take appropriate actions. Many of these algorithms run automatically, saving time and enabling better, faster decisions. Creating a robust tool-based ecosystem that allows creation of funnels that track visitors, bounce rates, conversions, etc., is vital to a successful Web analytics initiative.
  • 13.
    4. Site search analytics. Tracking site search is a very useful way to learn what your visitors are looking for on your website. Is the search engine directing the customer to your website or redirecting them to the next best option in the absence of the product? Keeping tabs on this will help companies increase customer loyalty and sales. Another application of site search analytics allows you to understand what is being searched on your website. By understanding this, marketers can influence the site layout and design so that visitors are able to easily locate answers to common queries or the most searched products.
    5. Marketing spend optimization. HP’s online store uses a mix of marketing vehicles to reach different customer segments with different communication and buying preferences. Optimizing spend on various marketing vehicles is critical to optimizing demand generation efforts as well. However, determining which marketing mix is most beneficial to the business is not an easy process; it requires not only a scientific approach to analyzing spend and revenue, but also a test-learn-optimize culture. For example, ongoing analysis of the response to different types of marketing vehicles helps in identifying the best fit for a particular type of message. Based on such analysis, one can decide whether a banner, a customized landing page or an e-mail campaign would work best.
    6. Connect marketing with warehousing. In large supply chain environments, predictive analytics methodologies can be used to forecast accurately the orders that ship out of the warehouse each day, enabling accurate warehouse space and staffing allocation in order to meet aggressive shipping timelines.
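    The three revenue drivers named earlier (traffic, conversion and basket size) multiply together, and a short sketch can make that relationship concrete. The Python below is illustrative only: the class `StoreMetrics`, the functions `revenue` and `uplift`, and every number are hypothetical and are not drawn from HP's models or data.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StoreMetrics:
    """Hypothetical container for the three drivers; names are illustrative."""
    visits: float            # customer traffic: number of visits in the period
    conversion_rate: float   # fraction of visits that end in an order
    avg_order_value: float   # basket size: average revenue per order


def revenue(m: StoreMetrics) -> float:
    """Revenue as the product of traffic, conversion and basket size."""
    return m.visits * m.conversion_rate * m.avg_order_value


def uplift(m: StoreMetrics, driver: str, pct: float) -> float:
    """Extra revenue from raising one driver by pct (0.10 means +10%)."""
    improved = replace(m, **{driver: getattr(m, driver) * (1 + pct)})
    return revenue(improved) - revenue(m)


# Made-up numbers, for illustration only.
baseline = StoreMetrics(visits=100_000, conversion_rate=0.02, avg_order_value=150.0)
for name in ("visits", "conversion_rate", "avg_order_value"):
    print(f"+10% {name}: {uplift(baseline, name, 0.10):,.0f} extra revenue")
```

    Because the model is a simple product, the same percentage improvement in any single driver produces the same revenue gain in this sketch. Under that assumption, the practical question is which driver is cheapest to move, which is where the targeting, funnel and spend analyses in the list above would come in.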
In conclusion, marketers can apply data mining and advanced analytical skills to derive key insights, better understand the drivers of Web traffic and produce reasonably accurate traffic forecasts for use in business planning. We sense that companies that use data accurately can achieve three- to five-fold growth in their online business and can make analytics easily replicable across different functions of the organization.
Rohit Tandon is vice president of corporate strategy and worldwide head of Global Analytics at Hewlett-Packard. As part of HP’s corporate strategy team, he helps drive the analytics ecosystem to support HP’s vision and priorities through delivery of cutting-edge analytical capabilities across sales, marketing, supply chain, finance and HR domains. He was recently named one of the top-10 most influential analytics leaders in India for 2014 by Analytics India Magazine. Shruti Upadhyay is a manager with HP Global Analytics.
  • 14.
    ANALYZE THIS!
    Dark side of the digital world
    Big data, unintended consequences: What Amazon’s domination of the book publishing industry could portend.
    BY VIJAY MEHROTRA
    Given my love of books, it is perhaps not surprising that Amazon.com – where, thanks to the digital technologies of today, a plethora of books can immediately be found about nearly any idea that pops into my head and be delivered (free with Amazon Prime membership!) to my doorstep with remarkable speed – is a website that I love deeply. Like many avid readers, I purport to do my best to support my local independent booksellers, but too often there is simply no denying the powerful pull of the super convenient, instantly gratifying, highly personalized Amazon.com experience.
    Thanks to my bi-monthly book club, I recently read “Who Owns the Future?” by Jaron Lanier, a celebrated technologist and MacArthur “genius” award winner best known for his contributions to the field of virtual reality. Lanier is known as a big thinker, and in this book – at once rambling, provocative and thoughtful – he once again shows why.
    “WOTF” begins with a bleak assessment of where digital technology is leading us all. The main thrust of Lanier’s argument is as follows:
    • Technology makes it very easy to give away for free a lot of things that people find valuable – just
  • 15.
    think about the search engine. Being human, we are conditioned to love the chance to get something for nothing, and we have gratefully grabbed at it with both hands.
    • However, the value that technology grants us is not actually free. In exchange, we tacitly give up information about ourselves, which is then stored as data.
    • Thanks largely to analytics professionals, this data is then pooled and analyzed to create a variety of commercial opportunities that would not otherwise exist.
    • This commercial wealth confers extraordinary power upon those who own the technologies that capture and analyze this data (Lanier calls them “Siren Servers”).
    • This power in turn enables the owners of the Siren Servers to have a huge impact on the society that we live in, including employment, government, culture and ideas.
    • Taken to their logical conclusions,
  • 16.
    all of this ultimately dooms the human species to a very sad and cataclysmic ending.
    Along the way, Lanier also wanders off into pleasantly intense digressions on a broad variety of somewhat related topics, including Aristotle, the tenure system, biodiversity and the concept of local optima. He too clearly loves to read.
    IMPACT ON PUBLISHING
    While still digesting this thought-provoking book, I came across George Packer’s recent article entitled “Is Amazon good for books?” Taking a long hard look at Amazon.com, the website that perhaps most fully embodies Lanier’s concept of a Siren Server, Packer finds that many of Lanier’s more dire predictions are already playing out there.
    Packer’s particular focus is Amazon’s impact on the publishing industry, and he believes that the stakes here are incredibly high: “In the book business the prospect of a single owner of both the means of production and the modes of distribution is especially worrisome; it would give Amazon more control over the exchange of ideas than any company in U.S. history. Even in the iPhone age, books remain central to American intellectual life, and perhaps to democracy.” I wholeheartedly agree.
    Just as Lanier predicts, suppliers and consumers alike originally rushed to embrace Amazon, for like so many technologies it seemed to magically (that is, without cost) provide all parties with something for which they hungered. As Packer writes, “When Amazon emerged, publishers in New York suddenly had a new buyer that paid quickly, sold their backlist as well as new titles, and, unlike traditional bookstores, made very few returns” – generating fresh revenues for publishers with little incremental investment. Meanwhile, we readers flocked to Amazon in droves for its convenience, its variety, and its low prices.
Amazon.com today accounts for more than 40 percent of all printed books purchased as well as 65 percent of all eBooks, so it is probably fair to say that book buyers by and large still love Amazon. For us as readers, this is fortunate, since the number of independent bookstores in business has declined by more than 50 percent since Amazon’s founding. However, as its share of overall book sales has ballooned, Amazon has taken advantage of its market power to aggressively push the terms of its agreements with book publishers dramatically in its own favor, often through tactics reflecting Amazon’s famously secretive and opaque corporate culture. Meanwhile, Packer reports, the many publishers large and small whose businesses are now
  • 18.
    dependent on Amazon for much of their distribution and revenues are learning firsthand that, as Lanier sharply points out, “information supremacy for one company becomes, as a matter of course, a form of behavior modification for the rest of the world.”
    Packer’s article also describes an Amazon culture that places a very low value on the human beings who are involved with the development, promotion and distribution of books, placing its faith in algorithms rather than editors and relying on volunteer (that is, free) reviewers to take the place of staff writers. All of this serves as a real illustration of Lanier’s premise that as more and more aspects of the enterprise are mediated by software, those in the business of carefully creating content (rather than digitally distributing it) will be increasingly devalued, and many forms of employment that have long-term value to our culture will subsequently perish.
    ELIMINATING THE GATEKEEPERS
    While Amazon’s efforts at actually serving as a publisher have so far failed, it is clear that we can expect the company to continue to pursue the holy grail of “eliminating the gatekeepers” from the world of publishing by producing its own original content. Indeed, one comes away from Packer’s article with the feeling that if Amazon’s founder and CEO Jeff Bezos could eliminate the need for authors and publishers by replacing them with automated content-generating software, he would not hesitate for an instant.
  • 19.
    In fact, book distribution has from the outset been only a small part of Bezos’ vision. The real prize for Bezos has been the access to reams of consumer data and the ability to analyze this data for fun and profit. According to Packer, as early as 1995, Bezos had publicly stated that “Amazon intended to sell books as a way of gathering data on affluent, educated shoppers.” Indeed, today the $5.25 billion in book sales makes up only 7 percent of Amazon’s total revenues.
    This too is just as Lanier predicts in “WOTF,” which may be why it was somehow not available directly from Amazon.com when I looked for it the other day (it has since been restored).
    One book that I was able to find on Amazon.com was “Ethics of Big Data,” in which author Kord Davis asks a number of more fundamental questions about data and its place in the business world. As a longtime software/IT professional with a deep grounding in philosophy and the history of technology, Davis is equally comfortable discussing
  • 20.
    topics as diverse as digital strategy, supply chain optimization, application development and values-based management. As such, he has a unique perspective that motivates him to take these important – and very thorny – questions seriously. As he writes in the book’s Preface, “nobody in history has ever had the opportunity to innovate, or been faced with the risks of unintended consequences, that big data now provides.”
    In particular, Davis identifies four major aspects of any serious data ethics discussion:
    • Identity: In the digital world, who we are is tacitly defined by the data we leave behind, and indeed our own sense of self is often tightly intertwined with our online activities. Davis points out that capturing and analyzing our digital trail “provides others the ability to quite easily summarize, aggregate or correlate various aspects of our identity – without our participation or consent.”
    • Privacy: Does your decision to engage in a digital interaction confer upon other entities the right to utilize data captured in the course of that specific interaction, and to link it to other sources of data that may correspond to you? As Davis asks, “Does privacy mean the same thing in both online and offline worlds?… should individuals have a legitimate ability to control data about themselves, and to what degree?”
  • 22.
    • Ownership: Digital technology, data and analytics have given some companies the ability to turn individual users’ data into saleable assets and many others the capacity for improved decision-making and increased profitability. Intelligently utilizing data is something that we typically celebrate in our profession, but Davis again challenges this view by asking some very fundamental and thought-provoking questions: “Does our existence itself constitute a creative act, over which we have copyrights or other rights associated with creation? If it does, then how do those offline rights and privileges, sanctified by everything from the Constitution to local, state and federal laws, apply to the online presence of that same information?”
    • Reputation: Davis hits the nail on the head when he points out that, thanks to the ability of data to be combined and analyzed to drive inferential and predictive judgments, “the number of people who can form an opinion about what kind of person you are is exponentially larger and farther removed…” And while these online reputations are stubbornly persistent, the accuracy of this reputational assessment is too often an afterthought.
    CALL FOR ACTION
    Unsatisfied with merely admiring the problem, both Lanier and Davis also call for action. Lanier proposes a technological and marketplace solution to the otherwise inevitable destiny toward which he believes digital technology, user data and business analytics are rapidly leading us, a destiny whose problems are vividly illustrated by the case of Amazon. He suggests an elaborate (though high-level) framework in which all personal data and creative works are tagged so as to enable their owner/creators to capture micropayments whenever and however their data/works are utilized.
While his proposed remedy is at this stage sketchy at best, from my perspective he is to be commended for engaging us all in a conversation about a technology-enabled solution to a complex set of problems that few others are even willing to acknowledge. Davis, like Lanier, is a technologist rather than a Luddite (as he quite rightly points out, “whereas big data is ethical- ly neutral, the use of big data is not”). In “Ethics of Big Data,” he strongly encour- ages organizations that use data exten- sively (as well as the policy-makers who attempt to make judgments in support of social good) to have meaningful discus- sions about how and why we use data and what the ethical implications are
  • 23.
    of those actions. In his call for serious ethical inquiry, Davis asserts that “Organizations realize that information has value that can be extracted and turned into new products…the ethical impact is highly context-dependent. But to ignore that there is an ethical impact is to court an imbalance between the benefits of innovation and the detriment of risk.” Especially, as Lanier would be quick to add, “with technology itself enabling the risk to be pushed off onto many, while the benefits are captured by an ever smaller few.”
    As Packer reports, Amazon has given very little thought to the near-term ethics or the long-term implications of the way in which it has used its customers’ data to obtain its current level of market power. But as Amazon’s current battle [1] with publisher Hachette rages on, with publishers, governments and erstwhile business partners sure to follow, it is clear that this particular story is far from over.
    As analytics professionals, neither is ours. We have a significant stake in the outcomes of these conversations about ethics and the future. As such, we would be wise to actively participate in those conversations.
    At this particular moment, we have considerable leverage to advocate for a digital future that reflects our own values. The world of digital business – our own personalized Siren Server – has provided us with a massive, lucrative, and free channel for our products and services. Today’s digital enterprise depends so much on our ever-expanding ability to capture, transmit, store, integrate and organize data, and our deep capacity to use this data to summarize, analyze, correlate, predict and optimize. Through no fault of our own, we have been bestowed with The Sexiest Job of the 21st Century [2], and it is indeed tempting to believe that we are an integral and indispensable part of the world in which we live and work, and that we always will be.
    Turns out this is exactly what the publishers thought when Amazon first appeared on the scene too. Beware: There is no free lunch.
    Vijay Mehrotra (vmehrotra@usfca.edu) is a professor in the Department of Business Analytics and Information Systems at the University of San Francisco’s School of Management. He is also a longtime member of INFORMS.
    REFERENCES
    1. For more on this, see http://www.nytimes.com/2014/06/21/business/booksellers-score-some-points-in-amazons-standoff-with-hachette.html and http://www.latimes.com/books/jacketcopy/la-et-jc-amazon-and-hachette-explained-20140602-story.html#page=1.
    2. http://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/ar/1
  • 24.
    HEALTHCARE ANALYTICS
    Will Apple, Google usher in new era in healthcare analytics?
    BY RAJIB GHOSH
    2014 is turning out to be an interesting year for the healthcare industry. On the healthcare technology front, this year has spurred 16 acquisitions since Jan. 1. State and federal government health insurance exchanges finally started to operate at scale, offering affordable health insurance coverage to millions. Twenty-six states and Washington, D.C., expanded their Medicaid program as of May 2014, making a large number of patients eligible for the safety net. These are all good things that add to the success of the Affordable Care Act (ACA), also known as Obamacare. At the same time we are just beginning to see the impact of the new patient inflow on our health system in the form of emergency room overcrowding [1]. Opponents of the ACA argue that the expansion of coverage without expanding the primary care physician network across the nation will lead to disaster. It remains to be seen which way the pendulum will swing.
    APPLE’S BIG SPLASH WITH HEALTHKIT
    Meanwhile, Apple has released its HealthKit product that connects multiple devices and apps. It has shown promise to become the health data repository
  • 25.
    for consumers. In essence this was the promise of the personal health record, or PHR, a promise that rose to the peak of inflated expectation a few years back and then fell to the trough of disillusionment quite quickly [2]. But with Apple’s foray into the space, this time it could be different. The key promise, however, is the fusion of data from multiple sources and use of analytics to generate user-facing insights. The latter, however, is not there yet. In my last column I argued that the true empowerment of the patient consumer is waiting on the data fusion and analytics to become mainstream. Consumers do not want just a data repository like a PHR. They want actionable information that a PHR does not provide. Apple’s announcement and subsequent action may expedite the health data movement in the right direction, but I am somewhat skeptical regarding data liquidity in Apple’s “walled garden” approach.
    Now that Apple has taken the lead, how far behind can Google be? Recently, Forbes reported that Google is planning its own version of a health platform. By the time this column goes live we will know what Google has up its sleeve. These two giants have all the technology, talent and financial firepower needed to drive analytics into the consumer health space by enabling a platform play for various data generating devices and apps.
    Insights for the consumer, however, will come at a price. As the insights with actionable consumer guidance increase, so too will the level of FDA scrutiny, including a requirement for mandatory FDA approval. It is unclear how quickly Apple or Google
  • 26.
    will go for that since it is unknown territory for both companies. Having spent a decade in the medical device industry, I know firsthand the pain points of the manufacturers when their products come under FDA’s purview.
    APPLE-EPIC PARTNERSHIP
    Apple is also partnering with Epic Systems, the giant electronic medical record (EMR) company that controls close to 20 percent of the enterprise EMR market and covers 51 percent of the patients in the United States. This is a smart move by Apple. The ability to send user-generated data to a healthcare professional’s EMR system has always been a key requirement for providers. This “end-to-end” data channel establishes a continuum of care, which acts as the building block for analytics-driven population health management (PHM) initiatives.
    Since the introduction of the iPhone, Apple products have enjoyed widespread adoption among healthcare professionals. A 2013 study by the Black Book Rankings found that among physicians who use medical apps on their smartphones, 68 percent used iPhones while 31 percent used Android devices. Also, 59 percent of physicians accessed apps from their tablet, and most of those users prefer the iPad. Among U.S. consumers, Apple has lost some ground recently to its key competitor, Google Android, but still commands a large consumer following.
    When a system enjoys large market share both among patients and providers and the system connects with the largest EMR company in
  • 27.
    the country, we can expect seamless bi-directional data flow to reach critical mass. This is a prerequisite for building a cloud-based analytics solution that can leverage data hubs at both ends of the flow. This is the reason why Apple’s HealthKit introduction is a key phenomenon, even though it does not do much in its early incarnation. If Google wants to become a serious player in the healthcare field beyond fitness lovers, it has to think in the same direction as well. Once that happens, imagine what sort of revolution the rivalry of these technology companies can usher in!
    The health data acquisition market is still fragmented, and as a result EMR companies have not shown much interest in opening up their data repository to those players. If Apple and Google can now turn the tables and make this a true platform play using their controlling stakes in the mobile device market, then it becomes meaningful for the EMR companies to forge powerful partnerships with one or both of them. In turn that will create the unification of episodic data and continuous user-generated data – the Holy Grail! Interoperability standards will be firmed up and data security solutions will emerge. Most importantly, patients and providers will both benefit from the analytics solutions that will get a shot in the arm from a data-rich, holistic picture of the patient.
    So far IBM is the lone warrior creating an ecosystem around its “Watson in the cloud” analytics solution. It still lacks the health data source. So what can Apple, Google, IBM and Epic do together to shake up healthcare? I’m getting goose bumps just thinking about the possibilities.
    Rajib Ghosh (rghosh@hotmail.com) is an independent consultant and business advisor with 20 years of technology experience in various industry verticals where he had senior level management roles in software engineering, program management, product management and business and strategy development. Ghosh spent a decade in the U.S. healthcare industry as part of a global ecosystem of medical device manufacturers, medical software companies and telehealth and telemedicine solution providers. He’s held senior positions at Hill-Rom, Solta Medical and Bosch Healthcare. His recent work interests include public health and the field of IT-enabled sustainable healthcare delivery in the United States as well as emerging nations. Follow Ghosh on Twitter @ghosh_r.
    REFERENCES
    1. Laura Ungar, “More patients flocking to ERs under Obamacare,” http://www.courier-journal.com/story/news/2014/06/07/patients-flocking-emergency-rooms-obamacare/10181349/
    2. “Hype Cycle for Healthcare Provider Applications, Analytics and Systems,” 2013, Gartner, http://www.healthcatalyst.com/health-data-analytics-hype-cycle
  • 28.
    INFORMS INITIATIVES
    CAP exam, continuing education, analytics conference cluster
    The Institute for Operations Research and the Management Sciences (INFORMS), the largest professional society in the world for professionals in the fields of analytics, operations research (O.R.) and management science and the publisher of Analytics magazine, announced that its Certified Analytics Professional (CAP®) exam will now be given at hundreds of computer-based testing centers worldwide through an agreement with Kryterion, the full-service provider of customizable assessment and certification products and services.
    Candidates for the CAP certification exam can choose from Kryterion’s global network of online secured testing locations to schedule their exam at a convenient time and place. INFORMS’ online testing center partner Kryterion, through strategic partnerships with colleges and universities, as well as testing and training companies, provides over 700 testing locations in more than 100 countries. In the United States alone, more than 400 testing centers are available. CAP exams can now be scheduled almost any day of the week and at a time and location that best suits the candidate.
  • 29.
    Candidates can apply at www.informs.org/applyforcertification. Upon acceptance into the program, candidates receive an online voucher to present on the Kryterion site. Exam locations can be found at http://www.kryteriononline.com/host_locations/.
    Introduced in the spring of 2013, the CAP program was created by subject matter experts, many of whom are INFORMS members. The CAP credential is designed for general analytics professionals in early- to mid-career, is based on a rigorous job task analysis and is vendor- and software-neutral. Benefits of analytics certification include gaining the ability to advance one’s career by setting a professional with CAP apart from the competition and obtaining the structure to make continuing professional development an integral part of one’s job performance. The CAP program assists hiring managers in finding competent analytics talent and shows that an organization hiring CAP professionals follows best analytics practice.
    NEW INFORMS CONTINUING EDUCATION COURSES
    The INFORMS Continuing Education program is offering two new courses this fall: “Introduction to Monte Carlo and Discrete-Event Simulation” and “Foundations of Modern Predictive Analytics.”
    The intensive, two-day, in-person courses, like the program’s popular current courses “Essential Practice Skills for Analytics Professionals” and “Data Exploration & Visualization,” provide real take-away value to implement immediately at work. Once you leave the classroom, you will be able to apply the real skills, tools and methods of analytics. The courses will give participants hands-on practice in handling real data types, real business problems and practical methods for delivering business-useful results.
    In the course “Introduction to Monte Carlo and Discrete-Event Simulation,” taught by Barry Lawson, University of Richmond, and Lawrence Leemis, College of William and Mary, participants will learn the basics of Monte Carlo and discrete-event simulation and how to identify real-world problem types appropriate for simulation. They’ll also develop skills and intuition for applying Monte Carlo and discrete-event simulation techniques. Topic areas covered include Monte Carlo modeling, sensitivity analysis, input modeling and output analysis. The course will be held at the INFORMS office, Catonsville (Baltimore area), Md., Sept. 12-13, and Chicago, Oct. 16-17.
  • 30.
    The second new course, “Foundations of Modern Predictive Analytics,” will be taught by James Drew, Worcester Polytechnic Institute, Verizon (ret.). Modern predictive analytics, the science of discovering and exploiting complex data relationships, has rapidly changed in recent years, especially in today’s businesses. This course will give participants hands-on practice in handling real data types, real business problems and practical methods for delivering business-useful results. Some of the topic areas to be covered in this course are: linear regression, regression trees, logistic regression and CART (classification and regression trees). The course will be held in Washington, D.C., Sept. 15-16, and San Francisco, Nov. 7-8.
    Learn more about these courses including course outlines, instructor biographies, program objectives and how to register at: www.informs.org/continuinged.
    ANALYTICS CLUSTER SET FOR INFORMS ANNUAL MEETING IN S.F.
    The Analytics Section of INFORMS will present the analytics cluster of sessions and presentations at the INFORMS Annual Meeting in San Francisco Nov. 9-12. The cluster encompasses 20 sessions featuring renowned analytics practitioners and leaders. Nine additional sessions will be jointly organized in collaboration with the Health Applications Society (HAS), CPMS (the Practice Section of INFORMS) and the Section on O.R. in Sports (SpORts).
    The sessions/presentations within the cluster cover such topics as:
    • Successful application of analytics in multiple industries such as healthcare, transportation, defense and sports
    • Analytics focus areas such as big data, spreadsheets and predictive analytics
    • Panel discussions on understanding the connection between O.R. and analytics, building analytics programs to support organizations’ needs and business analytics in the healthcare industry
    • Winners of the Innovative Applications in Analytics Award and the SAS Student Paper Competition
    • Why’s, how’s and what’s of analytics certification
    More information about the conference can be found at http://meetings2.informs.org/sanfrancisco2014/.
  • 31.
  • 32.
    FORUM
    Pit stop analytics
    BY E. ANDREW BOYD
    Quick stop: Optimized F1 pit teams can change four tires in two seconds.
    Magic shows are fun because we get to experience the impossible. Still, we know there’s trickery afoot. But what about those times when the magic isn’t magic? When we witness something that’s seemingly impossible but proves all too real? Not only real, but the result of optimization?
    Such is the case in the Formula 1 race car pit. If you follow F1 racing, it comes as no surprise that pit stops have been reduced to two seconds. But if you aren’t an F1 devotee, the idea of lifting a car, changing four tires and sending it on its way in a mere two seconds stretches the imagination.
    The role of the pit has changed dramatically over the years. For much of racing history it was assumed cars would only stop in the event of problems. Scheduled tire changes or fuel stops weren’t part of the
  • 33.
    equation. This orthodoxy was challenged in 1982 when an analytically minded race team from the United Kingdom focused on two important facts. First, softer tires stuck to the track better during turns than their harder cousins, though they wore out more quickly. Second, less gas in the tank translated into a lighter, and therefore faster, car. Calculations showed that time spent changing tires and refilling the tank was more than offset by the improved performance of the car on the track. It’s a calculation any analytics practitioner would be proud of.
    The idea quickly caught on, making pit stops – and their efficient execution – an integral part of racing. Refueling was banned in 1984 out of safety concerns, but reinstated in 1994. During that 10-year period pit crews refined their tire changing skills to the point where the fastest pit stops took a little over four seconds. When refueling was again instituted, the impetus for faster tire changes disappeared since refueling was the bottleneck. That changed in 2010 when F1 racing again reverted to a no refueling policy, setting the stage for lightning-fast tire changes.
    Achieving a two-second tire change required optimizing the entire process. Engineers took a look at everything from the design of the wheel nuts (one per wheel on F1 cars) to the special, self-positioning pneumatic guns that remove and tighten each nut. They then turned their attention to the pit crews. Teams of three work on each wheel: one to remove the old tire, one to position the new tire and one to operate the gun. Their moves aren’t left to chance, but are choreographed down to the position of their hands and feet from start to finish. It’s not hard to imagine John and Lillian Gilbreth – progenitors of industrial engineering and pioneers of time and motion studies – standing nearby, stopwatches in hand. They’d certainly be smiling in approval. With two jack operators and scattered observers, as many as 20 people crowd around a car during a pit stop – for two seconds of work.
    Optimization brings to mind models and mathematical programs. But sometimes optimization is smart without being sophisticated. And in the F1 pit, it works like magic.
    Andrew Boyd, INFORMS Fellow and INFORMS VP of Marketing, Communications and Outreach, served as executive and chief scientist at an analytics firm for many years. He can be reached at e.a.boyd@earthlink.net.
    NOTES & REFERENCES
    1. Gray, W., “Tech Talk: Can F1 Pit Stops Get Even Quicker?” Eurosport, April 9, 2013. See also: https://uk.eurosport.yahoo.com/blogs/will-gray/gray-matter-f1-stops-even-quicker-101951154.html. Accessed May 24, 2014.
    2. Examples of fast pit stops can be found at: https://www.youtube.com/watch?v=aHSUp7msCIE and https://www.youtube.com/watch?v=Xvu0GlMa3xQ
  • 34.
    CUSTOMER RELATIONSHIPS
    Real-Time Text Analytics
    Cloud-based analytical engine yields instant insight using unstructured social media data.
    BY AVEEK MUKHOPADHYAY AND ROGER BARGA
    Information is generated in today’s world more rapidly than ever before, and it will keep growing at an exponential rate. The rise of social media combined with increased Internet penetration has led to a significant increase in user-generated content in the form of product reviews and feedback, blogs, independent news articles, Twitter and Facebook updates. The crux of leveraging such data lies in identifying patterns from it and using the data to generate actionable insights in real time.
    This article proposes a cloud-based analytical engine that analyzes comments, reviews and opinions generated by customers to understand the main underlying themes and the general sentiment so that actionable insights can be generated in real time. Algorithms such as latent Dirichlet allocation for topic modeling and the holistic lexicon-based approach for sentiment mining have been operationalized using a multi-agent framework deployed in a cloud
  • 35.
    environment. This process meets computational demands as it allows users to run virtual machines within managed data centers, freeing them from worrying about acquisition of new hardware and networks.
    UNSTRUCTURED SOCIAL MEDIA DATA
    According to a study by International Data Corporation (IDC), mankind created an estimated 150 exabytes of data in 2005 (1 exabyte is 1 billion gigabytes), a number that jumped to 1,200 exabytes in 2010. A more recent study by IDC and EMC put the amount of data created in 2011 at 1.8 zettabytes (1 zettabyte is 1 followed by 21 zeroes, in bytes), a number the study researchers expected to double every two years.
    Only 5 percent of this data is structured (comes in a standard format that can be read by computers). The remaining 95 percent is unstructured (photos, phone calls and free-flow texts). A large chunk of such unstructured data is in text format. Although it poses challenges owing to its sheer volume, depth and complexity, such data holds immense potential for organizations. The key lies in identifying patterns from the data and gaining relevant insights.
    REAL-TIME ANALYTICS
    Not long ago, analyzing data and generating business intelligence reports depended on the time-intensive ETL process (extract, transform, load). Depending upon the system and data complexity, analytics could be delayed by hours, days or even weeks while data management put it all together.
    In today’s business landscape, minimizing the lag between acquiring data and generating actionable insight has become the key differentiator. Acting in real time to respond to an event can result in huge profits and improved customer relationships for a firm. Real-time analytics can bring benefits in multiple business scenarios, including:
    • High-frequency trading (sophisticated algorithms to rapidly trade securities)
    • Real-time detection of fraudulent transactions
    • Real-time price adjustment based on competitor information
    • Real-time feedback from social media for a product firm about its new launch
    • Real-time recommendations by retail stores based on customer’s location
    • Real-time traffic routing based on information about vehicle frequency, direction, etc.
    Social media content comes from users without any vested interest, so their opinions earn more trust. Organizations whose products and services
  • 36.
    are mentioned in such media need to remain current on relevant discussions and be able to track the sentiment of every employee, customer and investor. To address this challenge, a cloud-based real-time ecosystem was created for analyzing comments, reviews and opinions mined from Twitter. In addition, tracking trending themes in the customer space and the evolution of these trends over time was incorporated.
    TEXT MINING ALGORITHMS
    Topic modeling. Topic models are statistical techniques that analyze words/phrases in textual data to understand the main themes running through them. The topic model used here is based on LDA (latent Dirichlet allocation) and uses the observed words in tweets (extracted from Twitter) to infer the hidden topic structure.
    LDA is more easily understood by its generative process. This generative process defines a joint probability distribution over the observed (the words) and hidden (the topics) random variables. This joint distribution is used to compute the conditional distribution of the hidden variables given the observed variables. This conditional distribution is called the posterior distribution.
    A topic is assumed to be a collection of words with different probabilities of occurrence. An individual tweet can be assumed to be generated from multiple topics in different proportions. Every word generated in a tweet can then be randomly chosen in a two-step process:
    • First, a topic is randomly selected from the distribution of topics.
    • Second, the chosen word is randomly selected from the distribution of words over that topic.
    So, the joint probability distribution of word W and topic T is:
    Probability(W, T) = Probability(T) * Probability(W | T).
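    As a rough illustration of the two-step generative process and the joint probability above, the following Python sketch uses a hypothetical two-topic model. The topic names, words and probabilities are invented for illustration; a real LDA model learns these distributions from the tweets and also draws a separate topic mixture for each tweet, which this toy version leaves out. The posterior function applies Bayes' rule to the same tables.

```python
import random

# Hypothetical toy model (assumption, not from the article).
TOPIC_PROBS = {"battery": 0.4, "camera": 0.6}           # Probability(T)
WORD_PROBS = {                                          # Probability(W | T)
    "battery": {"charge": 0.7, "photo": 0.1, "great": 0.2},
    "camera": {"charge": 0.1, "photo": 0.6, "great": 0.3},
}


def sample_word(rng):
    """Two-step generation: pick a topic, then a word from that topic."""
    topic = rng.choices(list(TOPIC_PROBS), weights=list(TOPIC_PROBS.values()))[0]
    dist = WORD_PROBS[topic]
    word = rng.choices(list(dist), weights=list(dist.values()))[0]
    return topic, word


def joint(word, topic):
    """Probability(W, T) = Probability(T) * Probability(W | T)."""
    return TOPIC_PROBS[topic] * WORD_PROBS[topic][word]


def marginal(word):
    """Probability(W), summing the joint over all topics."""
    return sum(joint(word, t) for t in TOPIC_PROBS)


def posterior(topic, word):
    """Probability(T | W) = Probability(W, T) / Probability(W)."""
    return joint(word, topic) / marginal(word)


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_word(rng))
    print(posterior("camera", "photo"))
```

    With these made-up numbers, seeing the word "photo" makes the "camera" topic far more likely than its prior of 0.6, which is the kind of inference the posterior is used for.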
    Now when the individual probability of occurrence of a word is known (because it has already occurred in the tweet), the posterior distribution is calculated as follows:
    Probability(T | W) = Probability(W, T) / Probability(W)
    Given the probabilities of observed words, latent information like the vocabulary distribution of a topic and the distribution of topics over the tweet are thus inferred.
    Sentiment analysis. A holistic lexicon-based algorithm is used to analyze individual feature-level sentiments as well as cumulative sentiments over tweets.
    Aggregating opinions for a feature: The algorithm parses one tweet at a time identifying the features present. A set of opinion words for each feature is identified using a lexicon. An orientation score
  • 37.
  • 38.
    for each feature in the sentence is then calculated by summing up the feature-opinion scores for that sentence. (Each feature-opinion score is obtained from the sentiment polarity of the opinion word and a multiplicative inverse of the distance between the feature and opinion word. Opinion words at a distance from the feature are assumed to be less associated with the feature compared to the nearer words.)
    For example, “The phone is useful and a great work of art.” Let the feature here be “phone” and the opinion words be “useful” and “great.”
    Semantic orientation of useful = 1
    Semantic orientation of great = 1
    Distance between the words useful and phone = 2
    Distance between the words great and phone = 5
    score(f) = 1/2 + 1/5 = 0.7
    Aggregating opinions for tweets: The sentiment score for a tweet is the summation of the scores for all opinion words present in the tweet.
    For example, “The phone is useful and a great work of art.” The opinion words in the sentence are “useful” and “great.”
    Semantic orientation of useful = 1
    Semantic orientation of great = 1
    score(t) = 1 + 1 = 2
    Negation rule: This identifies the negation word (which can be one or two places before the opinion word) and reverses the opinion expressed in a sentence. For example, “The phone is not good.” Here phone gets negative orientation.
    Context-dependent rules: For features for which no opinion words are found, context-dependent constructs are used to identify the orientation score. For example, “The phone is good but battery-life is short.” The only opinion word in the sentence is “good” (“short” is a context-dependent word). Phone gets positive orientation because of “good.” Battery-life gets negative orientation because of the word “but” being present between good and battery-life.
    Topic Evolution. The next step after topic modeling is to understand how topics and trends develop, evolve and go viral over time.
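    Returning briefly to the sentiment rules above, the feature score, tweet score and negation rule can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the mini-lexicon, the negation list and the whitespace tokenizer are hypothetical stand-ins, distance is measured as the difference in word positions, and the context-dependent "but" rule is not implemented.

```python
# Hypothetical mini-lexicon (assumption): +1 positive, -1 negative.
LEXICON = {"useful": 1, "great": 1, "good": 1, "bad": -1, "poor": -1}
NEGATIONS = {"not", "no", "never"}  # assumed negation words
PUNCT = ".,!?\"'"


def tokenize(text):
    """Lowercase, split on whitespace and strip surrounding punctuation."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]


def orientation(tokens, i):
    """Polarity of the opinion word at position i, reversed when a
    negation word appears one or two places before it."""
    polarity = LEXICON[tokens[i]]
    if any(t in NEGATIONS for t in tokens[max(0, i - 2):i]):
        polarity = -polarity
    return polarity


def feature_score(text, feature):
    """Sum of polarity / distance over all opinion words in the text.
    Raises ValueError if the feature word is not present."""
    tokens = tokenize(text)
    pos = tokens.index(feature)
    return sum(
        orientation(tokens, i) / abs(i - pos)
        for i, t in enumerate(tokens)
        if t in LEXICON and i != pos
    )


def tweet_score(text):
    """Sum of polarities of all opinion words in the text."""
    tokens = tokenize(text)
    return sum(orientation(tokens, i) for i, t in enumerate(tokens) if t in LEXICON)


if __name__ == "__main__":
    tweet = "The phone is useful and a great work of art."
    print(feature_score(tweet, "phone"))   # 1/2 + 1/5
    print(tweet_score(tweet))              # 1 + 1
    print(feature_score("The phone is not good.", "phone"))
```

    Run on the article's example sentence, the sketch reproduces the worked values of 0.7 for the feature "phone" and 2 for the tweet, and the negated sentence yields a negative score for "phone".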
The topic evolution algorithm maintains a fixed number of topic streams and their statistics. Each tweet is processed as it comes in and is assigned to the “closest” topic stream (the topic stream most similar to it). If no topic stream is close enough, then a new stream is created and a stale stream is killed to maintain a fixed number
  • 39.
  • 40.
    of topic streams. Streams are constantly monitored for the rate of arrival of tweets. Whenever there is a burst of tweets in a particular topic stream, an alert for the trending topic is generated.
    THE REAL-TIME EDGE
    A multi-agent distributed framework enables the processing of real-time data and facilitates decision-making by allowing for easy deployment of analytical tasks in the form of process flows. In this multi-agent paradigm, an agent is a software program designed to carry out one or more tasks and can communicate with other agents in the system using agent communication language. Thus, an analytical task can be written as an agent, and the analytical process flow can be established by wiring together a set of communicating agents (an agency) that can run in sequence or in parallel.
    These agents were written using R to offer the analyst the benefits of a powerful and flexible statistical modeling language.
    Figure 1: Real-time text mining agency.
    OPERATIONALIZATION IN THE CLOUD
    The entire real-time platform was then deployed on a cloud ecosystem to allow for the following processes:
    Efficient resource management: The cloud platform provides the necessary virtual machine, network bandwidth and other
  • 41.
    infrastructure resources. Even when a machine goes down because of an unexpected failure, a new virtual machine is allocated for the application automatically.

    Dynamic scaling and load balancing: The cloud solution allows scaling out as well as scaling back an application depending on resource requirements. Multiple services running in tandem make the whole system computationally resource intensive. As resource demands increase, new role instances can be provisioned to handle the load. When demand decreases, these instances can be removed so that payment for unnecessary computing power is not required.

    Availability and durability: The cloud storage services replicate data on three different servers, guaranteeing it can be accessed at all times, even if a server shuts down unexpectedly.

    Better mobility: The application can be accessed from any place, as long as there is an Internet connection. There is no tight coupling with any physical server or machine.

    RESULTS

    Figure 2 shows a snapshot of the topic treemap generated in one run of the topic modeling algorithm (different topics are represented by different colors, with the areas representing occurrence frequency).

    Figure 2: Topic modeling treemap.
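The topic-stream bookkeeping described in the Topic Evolution passage (assign each tweet to the closest stream, replace a stale stream when nothing is close enough, alert on bursts) can be sketched roughly as follows. This is only an illustration under stated assumptions: the article does not say how similarity, staleness or a burst is measured, so Jaccard word overlap, "least recently updated" and "at least `burst_size` tweets within `window` seconds" are all stand-ins, and the class and parameter names are hypothetical.

```python
from collections import Counter, deque


class TopicStream:
    """One topic stream: its word counts and recent tweet arrival times."""

    def __init__(self, tokens, now):
        self.words = Counter(tokens)
        self.arrivals = deque([now])
        self.last_seen = now

    def similarity(self, tokens):
        """Jaccard overlap between the tweet's words and the stream's words."""
        a, b = set(tokens), set(self.words)
        return len(a & b) / len(a | b) if a | b else 0.0

    def add(self, tokens, now):
        self.words.update(tokens)
        self.arrivals.append(now)
        self.last_seen = now


class StreamTracker:
    """Keeps at most max_streams streams; all thresholds are assumptions."""

    def __init__(self, max_streams=3, min_similarity=0.2, window=60.0, burst_size=3):
        self.max_streams = max_streams
        self.min_similarity = min_similarity
        self.window = window
        self.burst_size = burst_size
        self.streams = []

    def process(self, tokens, now):
        """Assign one tweet; return its stream if that stream is bursting."""
        best = max(self.streams, key=lambda s: s.similarity(tokens), default=None)
        if best is not None and best.similarity(tokens) >= self.min_similarity:
            best.add(tokens, now)
        else:
            if len(self.streams) >= self.max_streams:
                # Kill the stalest stream (least recently updated).
                stale = min(self.streams, key=lambda s: s.last_seen)
                self.streams.remove(stale)
            best = TopicStream(tokens, now)
            self.streams.append(best)
        # Forget arrivals that fell out of the monitoring window.
        while best.arrivals and now - best.arrivals[0] > self.window:
            best.arrivals.popleft()
        return best if len(best.arrivals) >= self.burst_size else None


tracker = StreamTracker(max_streams=2)
for t, tweet in enumerate([["phone", "battery"], ["phone", "battery", "life"], ["phone", "screen"]]):
    alert = tracker.process(tweet, float(t))
print("trending:", alert.words.most_common(1) if alert else None)
```

Keeping the number of streams fixed bounds memory regardless of tweet volume, which is the property the article emphasizes; the choice of which stream counts as stale is the part most likely to differ in a real system.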
  • 42.
    Incoming tweets over a time period were captured in a stream graph visualization, as shown in the Figure 3 screenshot. Each topic is represented by a stream in the visualization and is characterized by the top words in that topic. At any point in time, the top words in each topic are displayed in a topic treemap below the stream graph. It is also possible to retrieve the keyword treemap for any past point in time.

    Figure 3: Trends stream graph.

    Successive runs of the sentiment analysis algorithm for batches of tweets are represented by the visual in Figure 4. Each bar captures the sentiment for a feature in a particular batch of tweets. The height of the bar represents the number of opinion words for the feature in that batch. The color of each bar represents the overall sentiment level expressed in a batch of data, ranging from extremely negative (dark red) to extremely positive (dark green). The change in color of the bars across batches can be used to identify stimuli that are driving the change.

    Selecting a particular bar provides a deeper analysis of that batch. The size of a bubble indicates the number of references to a particular opinion word, and the color shows the overall sentiment score for that opinion word. Together, size and color indicate which opinion words drive the sentiment for a feature in a batch.
  • 43.
    Figure 4: Sentiment analysis.

    CLOSING THOUGHTS

    Trending topics represent the popular “topics of conversation,” and when detected in real time, these hot topics are the social pulses that are usually ahead of any standard news media. Data analyzed via managed data centers can provide key insights into the evolving nature and patterns of social information and opinion and the general sentiment prevailing over such subjects.

    Aveek Mukhopadhyay is an associate manager at Mu Sigma where he works with the Innovation Development Team with a core focus on driving the adoption of advanced analytical platforms and techniques both internally and externally. He has interests in the fields of text mining, machine learning and analytics automation.

    Roger Barga, Ph.D., is group program manager for the CloudML team at Microsoft Corporation where his team is building machine learning as a service in the cloud. Barga is also a lecturer in the Data Science program at the University of Washington. He joined Microsoft in 1997 as a researcher in the Database Group of Microsoft Research (MSR), where he was involved in a number of systems research projects and product incubation efforts, before joining the Cloud and Enterprise Division of Microsoft in 2011.

    NOTES & REFERENCES
    1. The Economist (Feb. 25, 2010), “The Data Deluge” (http://www.economist.com/node/15579717).
    2. David M. Blei, “Probabilistic Topic Models,” Communications of the ACM, April 2012, Vol. 55, No. 4 (http://www.cs.princeton.edu/~blei/papers/Blei2012.pdf).
    3. Xiaowen Ding, Bing Liu and Philip S. Yu, “A Holistic Lexicon-Based Approach to Opinion Mining” (http://www.cs.uic.edu/~liub/FBS/opinion-mining-final-WSDM.pdf).
  • 44.
    THE DATA ECONOMY

    Why do so many analytics projects fail?
    Key considerations for deep analytics on big data, learning and insights.

    BY HALUK DEMIRKAN AND BULENT DAL

    What is big data? Big data, which means many things to many people, is not a new technological fad. In addition to providing innovative solutions and operational insights to enduring challenges and opportunities, big data with deep analytics instigates new ways to transform processes, organizations, entire industries and even society. Pushing the boundaries of deep data analytics uncovers new insights and opportunities, and “big” depends on where you start and how you proceed.

    Big data is not just “big.” The exponentially growing volume of data is only one of many characteristics that are often associated with big data, such as variety, velocity, veracity and others (the six Vs; see box). According to Gartner Research, business intelligence and analytics will remain the top focus for CIOs through 2017 [1]. According to Gartner,
  • 45.
    more than half of all analytics projects fail because they aren’t completed within budget or on schedule, or because they fail to deliver the features and benefits that are optimistically agreed on at their outset.

    Today, an abundance of knowledge and experience exists to have successful data and analytics-enabled decision support systems. So why do so many of these projects fail, and why are so many executives and users still so unhappy? While there are many reasons for the high failure rate, the biggest reason is that companies still treat these projects as just another IT project. Big data analytics is neither a product nor a computer system. Instead, it should be considered a constantly evolving strategy, vision and architecture that continuously seeks to align an organization’s operations and direction with its strategic business goals and tactical and operational decisions. Table 1 includes a list of common mistakes that can doom analytics projects.

    The six Vs of big data
    • Volume (data at rest): terabytes to exabytes, petabytes to zettabytes of data
    • Velocity (data in motion): streaming data, milliseconds to seconds; how fast data is being produced and how fast the data must be processed to meet the need or demand
    • Variety (data in many forms): structured, unstructured, text, multimedia, video, audio, sensor data, meter data, html, e-mails, etc.
    • Veracity (data in doubt): uncertainty due to data inconsistency and incompleteness, ambiguities, latency, deception, model approximations, accuracy, quality, truthfulness or trustworthiness
    • Variability (data in change): the differing ways in which the data may be interpreted; different questions require different interpretations
    • Value (data for co-creation and deep learning): the relative importance of different complex data from distributed locations
    Big data with deep analytics means greater insight and better decisions, something that every organization needs.
  • 46.
    KEY CONSIDERATIONS FOR DEEP ANALYTICS

    We live in an era of big data. Whether you work in financial services, consumer goods, travel, transportation, healthcare, education, supply chain, logistics or industrial products and professional services, analytics are becoming a competitive necessity for your organization. But having big data – and even people who can manipulate it successfully – is not enough. Companies need managers who can partner effectively with analysts to ensure that their work yields better strategic and tactical decisions.

    Big data with deep analytics is a journey that helps organizations solve key business issues and opportunities by converting data into insights to influence business actions and drive critical business outcomes. As organizations try to take advantage of the big data opportunity, they need not be overwhelmed by the various challenges that might await them.

    Going Deep & Wide on big data with deep analytics for deep learning

    Managers will need to start their journey by [2]:

    Identifying clear business need and value. Almost everything needs to be a business rather than a technology solution. Before companies start collecting big
  • 47.
    Table 1: Common mistakes for analytics projects.
    • Failing to build the need for big data within the organization
    • Islands of analytics with “Excel culture”
    • Data quality and reliability related issues
    • Not investigating vendor products enough, and instead blindly taking the path of least resistance
    • Departmental thinking rather than looking at the big picture
    • Considering this as a one-time implementation rather than a living eco-system
    • Developing silo dashboards to answer a few questions rather than strategic, tactical and operational dashboards
    • Not establishing company ontology and definitions for “single version of truth” culture
    • Lack of vision and not having a strategy; not having a clear organizational communications plan
    • Lack of upfront planning; overlooking the development of governance and program oversight
    • Failure to re-organize for big data
    • Not establishing a formal training program
    • Ignoring the need to sell success and market the big data program
    • Not having the adequate architecture for data integration
    • Forgetting rapidly increasing complexities with volume, velocity, variety, veracity, and many more
  • 48.
    data, they should have a clear idea of what they want to do with it from a business standpoint. Here’s what you need to consider:

    Turn over part or all of big data solution delivery to business leaders. Project management and ownership by the business (not IT) is the key to success in big data solutions. In the meantime, make sure to have clear alignment between business and IT.

    Partner with business peers to identify opportunities and solutions. If we talk about big data, the impact of these projects should also be “big.” Create a cross-organization team and involve all stakeholders early in the game.

    Co-creation of value with customers. The overall business objective should always be about customers. If one of the initiatives is about a big marketing outcome, then it should be about how to set up customer-centric marketing, how to provide targeted dynamic advertisement, how to engage customers and how to manage personalized shopping.

    Start small – with an eye to scale quickly. While big data solutions may be quite advanced, everything else surrounding them – best practices, methodologies, org structures, etc. – is nascent. No one has all the answers, at least not yet.

    Understand why traditional business intelligence and data warehousing projects can’t solve a problem.

    Small, simple and scalable. When launching big data initiatives, avoid 1) getting too complicated too fast, and 2) not being prepared to scale once a solution catches on. Big data solutions can quickly grow out of control since discovering value from data prompts wanting more data.

    Identify what part of the business would benefit from quick wins. Look for opportunities that will show quick wins within no more than three months. Success brings more people to the table.

    This is not a one-time implementation. Understand that this is a living and evolving organism that will grow exponentially. It is a culture change in the company in the way that you collect and use data, and the way you make outcome-based decisions.

    Develop a minimal set of big data governance directives upfront. Big data governance is a chicken-and-egg problem – you can’t govern or secure what you haven’t explored. However, exploring vast data sets without governance and security introduces risk.

    New processes to manage open source risks. Most big data solutions are being built on open source software, but open source has both legal and skill implications as firms are: 1) exposed to risk due to intellectual property issues and complex licensing agreements; 2) concerned about liability if systems built
  • 49.
    on open source fail; and 3) required to use technology that is often early release and not enterprise-class.

    New agile processes for solution delivery. Successful firms will embrace agile practices that allow end users of big data solutions to provide highly interactive inputs throughout the implementation process.

    Integrate structured and unstructured data from multiple sources. Integration of data is one of the most important and also most complex processes in supporting efficient and effective decision-making. In terms of data, it includes machine data, sensor data, videos, audio, documents, enterprise content in call centers, e-mail messages, wikis and, indeed, larger volumes of transactional and application data.

    Data sharing is key. In order for a company to build a big data ecosystem that drives business action, organizations have to share data.

    Build a strong data infrastructure to host and manage data. Make sure to have secure and reliable in-house and/or hosted (e.g., cloud) data and information management infrastructure.
  • 50.
    Ask yourself: What information do I collect today, and what analytics should I perform that can benefit me and others?

    New security and compliance procedures to protect extreme-scale data. In order to succeed with big data, new processes must be developed that recognize and protect the special nature of extreme-scale data that may be largely unexplored.

    Be ready to support rapid growth. Big data solutions can grow fast and exponentially. They can start as a pilot with a few terabytes of data, then become a petabyte very quickly. Since the same data can be used in different ways and re-analyzed for new insights easily, nothing ever gets deleted.

    Funding must move out of IT for big data success. Funding for these projects should come from outside of the CIO organization and move to a marketing or sales organization, for instance, so that the business has a vested stake in the game.

    Create a road map that gradually builds the skills of your organization. It’s important to create a road map that allows you to gradually build the required skills within your staff, minimize risk and capitalize on previous successes to gain more support. In the organization, there will be new roles and responsibilities such as the data scientist, who possesses a blend of skills that includes statistics, applied mathematics and computer science. This is different from any current decision support solution.

    With big data, organizations should look for new capabilities, such as: using advanced analytics to uncover patterns previously hidden; using visualization and exploration, with new types and greater volumes of data, to help the business find more complete answers, best represent the data to the user and highlight important patterns to the human eye; enabling operational decision-making with on-demand stream data by making floor employees into analytic consumers; and turning insight into action to drive a decision, either with a manual step or an automated process. Most important, be ready for rapidly increasing benefits and complexities from the six Vs.

    WHAT IS NEXT IN THE DATA ECONOMY?

    Organizations have access to a wealth of information, but they can’t get value out of it because it is sitting in its most raw form or in a semi-structured or unstructured format [3]. As a result, they don’t even know whether it’s worth keeping. So where is deep analytics for deep learning headed in the next few years? The exciting news is that many
  • 51.
  • 52.
    organizations are already realizing the value of big data analytics today. Insight-driven, information-centric initiatives will be deployed where the ability to capitalize on the six Vs of information will create new opportunities for organizations to exploit. By combining and integrating deep analytics, local rules, scoring, optimization techniques and machine learning with cognitive science into business processes and systems, decision management helps deliver decisions that are consistently optimized and aligned with the organization’s desired outcomes.

    Social analytics will ensure businesses know how, when and where to creatively engage with individual consumers and social communities to foster trusted, one-to-one relationships and better understand and manage the way their companies are perceived. Integrating demographic and transactional data with what can be learned about attitudes and opinions allows organizations to truly understand the motivations and intents of their constituents to better serve them at the right time and place.

    Deep analytics will help organizations uncover previously hidden patterns, identify classifications, associations and segmentations, and make highly accurate predictions from structured and unstructured information. Organizations will use real-time analysis of current activity to anticipate what will happen and identify drivers of various business outcomes so they can address issues and challenges before they occur. Many decisions will be made automatically by computers that also have deep-learning capabilities.

    When you are in the process of starting a big data journey, consider this question: What should our big data with deep analytics road map look like to achieve our objectives?

    Haluk Demirkan (haluk@uw.edu) is a professor of Service Innovation and Business Analytics, and the founder and executive director of the Center for Information Based Management at the Milgard School of Business, University of Washington-Tacoma. He has a Ph.D. in information systems and operations management from the University of Florida. He is a longtime member of INFORMS.

    Bulent Dal (bulent.dal@obase.com) is a co-founder and general manager of Obase Analytical Solutions (http://www.obase.com/index.php/en/obase), Istanbul, Turkey. His expertise is in scientific retail analytical solutions. He has a Ph.D. in computer sciences engineering from Istanbul University.

    Acknowledgement
    Part of this article is excerpted with permission of the publisher, HBR Turkey, from Demirkan, H. and Dal, B., “Big Data, Big Opportunities, Big Decisions,” Harvard Business Review Turkish Edition (published in Turkish), March 2014.

    REFERENCES
    1. Gartner, Inc., 2013, “Gartner Predicts Business Intelligence and Analytics Will Remain Top Focus for CIOs Through 2017,” Dec. 16, 2013, http://www.gartner.com/newsroom/id/2637615.
    2. Demirkan, H. and Dal, B., “Big Data, Big Opportunities, Big Decisions,” Harvard Business Review Turkish Edition (published in Turkish), March 2014, pp. 28-30.
    3. Davenport, T., 2013, “Analytics, 3.0,” Harvard Business Review, December.
  • 53.
  • 54.
    DATA SCIENTISTS IN DEMAND

    ‘It’s their time to shine’
    According to executive search firm head Linda Burtch, the job prospects for data scientists and other elite analytics professionals have never been better – and the future is even brighter.

    BY PETER HORNER

    In April, the executive search firm Burtch Works released the results of its first-of-its-kind salary and demographics survey of data scientists, a follow-up to a survey of big data professionals conducted a year earlier. Among other findings, the 2014 survey showed that data scientists are well paid, relatively young and overwhelmingly male, and that almost half (43 percent) are employed on the West Coast.

    Linda Burtch, managing partner of Burtch Works, has been involved in the recruitment and placement of high-end analytics talent for 30 years. She started her career with Smith-Hanley before founding her own company five years ago. Analytics magazine editor Peter Horner interviewed Burtch in April, not long after the survey of data scientists was released. Following are excerpts from the interview.

    What did you find that surprised you the most from the salary and demographics survey of data scientists?

    First of all, I find it funny that everyone is interested in salaries and what data scientists and big data professionals make, but it’s such a taboo subject to actually talk about. Not to me. I talk about salaries all the time. That’s my business.

    What surprised me? That’s an interesting question. It actually turned out the way I thought it would – a lot of the
  • 55.
    candidates living out on the West Coast and a higher predominance of Ph.D.s among data scientists than the general analytics population or the big data professionals, as I call them. It all pretty much made sense to me. It was interesting because it was actually quantified.

    Linda Burtch, founder and managing partner of Burtch Works.

    Weren’t you a little surprised by the extent of the concentration of data scientists – nearly 50 percent – on the West Coast?

    That’s for the moment, for now, but watch and see what happens. Analytics has been around for a long time, yet some people still ask me, “Are you sure this isn’t a fad?” It’s not. Analytics has become a hugely profitable specialty area within organizations as they try to optimize their operations, or target their marketing or look at return on investment issues, and that has been around for years and years.

    I would argue that those issues are sort of the humdrum stuff of analytics. Data-driven decision-making is really going to explode, and that’s what we are seeing with this whole area going toward data scientists. Data storage has become so much cheaper, computing power has become much faster, nanotechnology and sensors are now becoming ubiquitous. Self-driving cars, traffic sensors, the energy grid. The list goes on and on and on.

    Right now the obvious stuff is happening with understanding digital streams of data in applications related to social media. That’s pretty straightforward stuff, but wait until it hits the healthcare industry, for example. Self-driving cars are going to be a huge, huge deal. While a lot of it is being done out in California now, over the next five years we are going to see it scattered all over the United States.

    When it comes to recruiting candidates and job placement, who are you talking to?

    I recruit in analytics – people who have master’s degrees in statistics, operations research, econometrics, people who are out there working in business applications, solving problems related to marketing spend or credit worthiness
  • 56.
    or target marketing. More recently I’ve gotten into data science. That’s a huge umbrella description.

    You mentioned operations research, the heart and soul of INFORMS.

    It is. When I started out in recruiting more than 30 years ago, I focused on operations research candidates. It’s grown pretty dramatically since then. They have a very fond place in my heart because that’s how I got started. It’s one of those things that I’ve really been involved with – the INFORMS group back in New York when I was living there, and I’m really excited now because the INFORMS group in Chicago is getting re-energized. It’s really exciting to watch.

    When looking at the job marketplace, do you distinguish between, say, a data scientist and other analytics professionals?

    Let me back up a little bit. Last summer, when I was putting together the big data salary study, I saw that data scientists were a breed apart, and that they had higher compensation levels. So I made the decision to take them out of the general big data study and hold them for later because it’s such an emerging field that’s so different.

    They are working with what I would call unstructured data. You could get into a lot more detail over how a data scientist is different from a big data professional, but the primary distinguishing feature, in my opinion, is that data scientists are working with data that’s unstructured. It’s something that’s going to grow as sensors become more and more prevalent and data streams become continuous in so many application areas.

    How would you describe the current job market for quants, for lack of a better word?

    It’s hot. A couple of months ago we did a flash survey in which we simply asked, “How often are you contacted about a new job opportunity through LinkedIn?” We had 400 responses; 89 percent of the respondents said they were contacted at least monthly, and 25 percent said that they were contacted at least weekly. I’m working with elite data scientists, and they’re telling me that they get calls once or twice a day from recruiters, so it’s just crazy.

    Our candidates are seeing a 14 percent increase in salary when they change jobs, so there’s a lot of churn out there. If they stay with their existing company, they might see an annual increase of between 2 percent and 3 percent, so the 14 percent is a nice bounce if they decide to make a change. One of my data scientists in Boston said he received 30 calls in one week after he left a job and went on the job hunt.
  • 57.
    It’s amazing. Competing offers are another sign that the market is really hot. Sign-on bonuses are another thing that has become very commonplace in the analytics job market. Another sign that is important to note is that academic institutions have really stepped up, with many of them developing master’s programs in analytics, predictive analytics and the like, so that’s something that is very new in the last two or three years.

    In an interview with the New York Times, you said in reference to MBAs, and I quote, “In 15 years, if you don’t have a solid quant background, you might have a permanent pink slip.” That’s a little rough, isn’t it?

    I know, I’ve become the harbinger of the permanent pink slip. Seriously, I have seen many MBAs, your general MBA, look around and say, whoa, this is a little bit scary, because they are seeing this trend toward analytical decision-making
  • 58.
    W W W.I N F O R M S . O R G58 | A N A LY T I C S - M AGA Z I N E . O R G QA WITH LINDA BURTCH becoming so predominant in business. Personally, I think within 10 or 15 years if MBAs don’t have a quantitative foun- dation, they will be prevented from pro- motion. We’ll see. I always said back when I was working with the operations research people that my guys are so smart, they are the ones who should be running these companies. Now I’m see- ing it come true. In an episode of the TV show “Mad Men,” the ad agency employees, cir- ca late 1960s, were concerned that a new computer the size of a confer- ence room would make them expend- able. Your quote reminded me of that. Right. A lot of people ask me about that. There is going to be a disruption. There already has been. Just yesterday, the Times had a visual display of analyt- ics and quants and how it was disrupting things and what jobs were going to be eliminated, including truck drivers and airplane pilots. Self-driving cars, robots, analytics, algorithms and all this stuff is here to stay, and it’s only going to get bigger, but it’s not going to replace the ability to read, write and think critically. While I’m a big proponent of analytics, communi- cation will continue to be really impor- tant; human-to-human contact can’t be replaced, ever. Just how important are commu- nication skills to a data scientist? INFORMS, for example, now routinely holds “soft skills” workshops aimed at helping analysts explain their work to non-technical audiences in order to garner corporate buy-in. Yes. That’s absolutely critical. The other piece that goes hand in hand with that is having the ability to understand the business at hand. Business acumen is really important. You have to have that gut check; does it make sense and how can I best monetize the situation to benefit a client or employer? 
    It’s really important for people to understand not only what’s interesting – what a lot of quantitative people tend to gravitate toward – but also what’s important.

    If a company is just starting out on the analytics journey and has no in-house expertise in this area, how can they judge a candidate’s technical abilities?

    That’s an interesting problem. When I’m talking to a client, especially in this data science area that is so new, they will call me and sometimes they will have it down. They are talking the right language, they are thinking about the right things, they are asking the right questions. Other clients are floundering; they are still exploring.
  • 59.
    I think it’s very important that they make sure they understand where their needs are before they actually bring in somebody because it’s not inexpensive to apply analytics in an organization. You really need to think very carefully what the goals are, what the road map is going to look like and so on. I can certainly help with that, and I can give the names of consultants who can help a company really understand what their plan should be before they jump in and make hires.

    On the other side of that coin, what’s the best advice you can give an analytics candidate who is testing the job market?

    Another flash survey we did focused on understanding what motivates people to make a job change. The number one motivation is money, but it’s quickly followed by challenging work and the opportunity to grow within an organization.

    Money is important to everyone, but candidates shouldn’t make decisions regarding changing jobs based on
  • 60.
    salary alone because money isn’t going to be the factor that’s going to change their life. Rather, it’s the kind of work you will do and how engaged you will be. It’s really important to understand the challenge and the growth opportunity within whatever it is you are looking to jump into.

    The third thing I think is important to analyze for any quantitative person when they’re talking to a potential new employer is to understand if analytics has a seat at the corporate table. You have to make sure that there is buy-in within the organization and the stakeholders are really actively involved and engaged in conversations about how analytics can and should be used or embedded within any organization. That’s a huge factor in understanding how happy you will be in your job and how successful you can be as a quantitative professional.

    Getting back to the plight of the quant-poor MBA, how can a candidate boost analytical skills mid-career? Many colleges and universities are now offering analytics programs, often online, through their business schools, and INFORMS, for example, holds continuing education courses in the analytics area, as well as a certification program.

    I get that question a lot: “I’m really interested in beefing up my analytical skills so what should I do?” As you noted, there are more opportunities than ever to do that. In addition to the formal education programs, there are plenty of good books on the topic. I get the question all the time: What books should I be looking at?
  • 61.
    Another way that you can jump into this is through Kaggle competitions, which I recommend to people if they are interested in understanding data science and who else is out there doing this kind of work and what they are doing. There are many tools out there. Certainly what INFORMS is doing is terrific. It’s important to keep your skills fresh and make sure you continue to learn.

    When it comes to giving general career advice, especially to younger candidates, my advice is this: prepare for three or four careers during your lifetime. In today’s world, it’s not good to specialize in one thing and try to stick with one company or one industry or one vertical application for your entire career. It’s incredibly dangerous, and it likely won’t carry you through a 35-year career. You need to be continuously learning something new. People should keep that in mind.

    INFORMS offers an analytics certification program (CAP). Is that a differentiator in the job marketplace?

    No two candidates are ever equal, but it can certainly help once there are enough employers out there who understand what it means to be CAP certified. I’m seeing people put various MOOCs (massive open online courses) on their resumes now, along with Kaggle competition results. I have a candidate who actually got his job because of a Kaggle competition. The first couple of times he submitted his solution it was totally rejected, but as he continued to study the problem and resubmitted, he climbed up the leaderboard. Then he started getting calls and job opportunities because of his Kaggle rank.

    From your perspective, what does the future hold for data scientists and other analytics professionals?

    In my 30 years of experience, I have never seen anything like this. The opportunities for elite analytics candidates have never been better, and I think what we’re seeing now is just the tip of the iceberg.
    As I said earlier, I really think that my quantitative candidates are going to be running companies one day. Certainly the CMO (chief marketing officer) is going to be coming up through the analytics ranks. Now there’s all this talk about CAOs (chief analytics officers). I think the candidates I’m working with have a very strong chance – if they have leadership ability and the ambition – to advance up the ranks and continue to climb and run organizations at some point. Their quantitative skills are going to be unique and absolutely required to be a successful businessperson. It’s their time to shine.

    Peter Horner (peter.horner@mail.informs.org) is the editor of Analytics and OR/MS Today magazines.
  • 62.
    ANALYTICS ACROSS THE ENTERPRISE

    Analytics transforms a ‘dinosaur’

    The story of how IBM not only survived but thrived by realizing business value from big data.

    BY BRENDA DIETRICH, EMILY PLACHY AND MAUREEN NORTON

    This is the story of how an iconic company founded more than a century ago, and once deemed a “dinosaur” that would not be able to survive the 1990s, has learned lesson after lesson about survival and transformation. The use of analytics to bring more science into the business decision process is a key underpinning of this survival and transformation. Now for the first time, the inside story of how analytics is being used across the IBM enterprise is being told. According to Ginni Rometty, chairman, president and chief executive officer, IBM Corporation, “Analytics is forming the silver thread through the future of everything we do.”

    What is analytics? In simple terms, analytics is any mathematical or scientific method that augments data with the intent of providing new insight. With the nearly 1 trillion connected objects and devices generating an estimated 2.5 billion gigabytes of new data each day, analytics can help discover insights in the data. That insight creates competitive advantage when used to inform actions and decisions.
  • 63.
    Data is becoming the world’s new natural resource, and learning how to use that resource is a game changer. … Analytics is not just a technology; it is a way of doing business. Through the use of analytics, insights from data can be created to augment the gut feelings and intuition that many decisions are based on today. Analytics does not replace human judgment or diminish the creative, innovative spirit but rather informs it with new insights to be weighed in the decision process. … Analytics for the sake of analytics will not get you far. To drive the most value, analytics should be applied to solving your most important business challenges and deployed widely. Analytics is a means, not an end. It is a way of thinking that leads to fact-based decision-making. …

    This article is adapted from the book, “Analytics Across the Enterprise: How IBM Realizes Business Value from Big Data and Analytics.”

    BIG DATA AND ANALYTICS DEMYSTIFIED

    If analytics is any mathematical or scientific method that augments data with the intent of providing new insight, aren’t all data queries analytics? No. Analytics is often thought of as answering questions using data, but it involves more than simple data (or database) queries. Analytics involves the use of mathematical or scientific methods to generate insight from the data. Analytics should be thought of as a progression of capabilities, starting with the well-known methods of business intelligence, and extending through more complex methods involving significant amounts of both mathematical modeling and computation.

    Reporting is the most widely used analytic capability. Reporting gathers data from multiple sources, such as business automation, and creates standard summarizations of the data. Visualizations are created to bring the data to life and make it easy to interpret.

    As a generic example, consider store sales data from a retail chain. The data is generated through the point of sale system by reading the product bar codes at checkout. Daily reports might include total store revenue for each store, revenue by department for each region, and national revenue for each stock-keeping unit (SKU). Weekly reports might include
  • 64.
    the same metrics, as well as comparisons to the previous week and comparisons to the same week in the previous calendar year. Many reporting systems also allow for expanding the summarized data into its component parts. This is particularly useful in understanding changes in the sums.

    For example, a regional store manager might want to examine the store-level detail that resulted in an increase in revenue from the home entertainment department. She would be interested in knowing whether sales increased at most of the stores in the region, or whether the increase in total sales resulted from a significant sales jump in just a few stores. She might also look at whether the increase could be traced back to just a few SKUs, such as an unusually popular movie or video game. If a likely cause of the sales increase can be identified, she might alert the store managers to monitor inventory of the popular products, reposition the products within a store, or even reallocate inventory of the products across stores in her region. …

    WHY ANALYTICS MATTER

    Quite simply, analytics matters because it works. You can be overwhelmed with data and the value of it may be unattainable until you apply analytics to create the insights. Human brains were not built to process the amounts of data that are today being generated through social media, sensors, and more. While gut instinct is often the basis for decisions, analytically informed intuition is what wins going forward.

    Several studies have highlighted the value of analytics. Companies that use predictive analytics are outperforming those that do not by a factor of five.
    In a 2012 joint survey by the IBM Institute for Business Value and the Saïd Business School at the University of Oxford of more than 1,000 professionals around the world, 63 percent of respondents reported that the use of information (including big data and analytics) is creating a competitive advantage for their organizations. IBM depends on analytics to meet its business objectives and provide shareholder value.

    The bottom line is that analytics helps the bottom line. Your competition will not be waiting to take advantage of the new insights from big data. Should you?

    IBM has approached the use of analytics with a spirit of innovation and a belief that analytics will illuminate insights in data that can help improve outcomes. The company hasn’t been afraid to make mistakes or redesign programs that haven’t worked as planned. Unlike traditional IT projects, most analytics projects are exploratory. For example, the Development Expense Baseline Project
  • 66.
    explored innovative ways to determine development expense at a detailed level, thereby addressing a problem that many thought was impossible to solve. IBM analytic teams haven’t waited for perfect data to get started; rather, they have refined and improved their data along the way. …

    The key is to put a stake in the ground with a commitment that analytics will be woven into your strategy. That’s how IBM does it. This approach is also effective with big data. Rather than postpone the leveraging of big data, you should embrace it, establish a link between your business priorities and your information agenda, and apply analytics to become a smarter enterprise. …

    PROVEN APPROACHES

    Staying focused on solving business problems was the pragmatic start, and the other crucial element was having very high-level executive support from the beginning. From a governance perspective, those are two key levers to drive value: focus on actions and decisions that will generate value and have high-level executive sponsorship.

    The ideal team to do analytics is a collaboration between an experienced data scientist, a person steeped in the area of the business where the challenge needs to be solved, and an IT person with expertise in the data in that particular area of the business.

    A joint study by MIT Sloan and the IBM Institute for Business Value developed several recommendations. The first is that you start with your biggest and highest-value business challenge. The next recommendation is to ask a lot of questions about that challenge in order to understand what’s going on or what could be going on. Then you go out and look for what data you might have that’s relevant to that challenge. Finally, you determine which analytic technique can be used to analyze the data and solve the problem.
    Because most companies have constraints on the amount of money and skills available for projects, estimating the ROI can provide a better differentiator for selecting the project with the highest potential impact than relying on instincts. Estimating an analytics project’s ROI involves both capturing the project costs and measuring the value. …

    EMERGING THEMES

    Relationships inferred from data today may not be present in data collected tomorrow. The relationships that you infer from data about the past do not necessarily hold in data that you collect tomorrow. You cannot analyze data once and then make decisions forever based on old analysis. It’s important to
  • 68.
    continually analyze data to verify that previously detected relationships are still valid and to discover new ones. Fortunately, major discontinuities with data do not happen very often, so change generally happens gradually. Social media sentiment, however, has a much shorter half-life than most data.

    Using relationships derived from past data has been repeatedly demonstrated to work better than assuming that no relationships exist. The relationships that have been detected are likely correlation rather than causality. However, these relationships, if detected and acted upon quickly, may provide at least a temporary business advantage.

    You don’t have to understand analytics technology to derive value from it. For a long time, many business leaders expressed the opinion that mathematics should be used by only those who understood the details of the computations. However, in recent years this view has been changing, and analytics is being treated like other technologies. You must learn how to use it effectively, but it is not necessary to understand the inner workings in order to apply analytics to business decisions. You have to apply analytics methods in the context of the problem that is being solved and make the results accessible to the end user. But just as the user of a car navigation system does not need to understand the details of the routing algorithm, the end user of analytics does not have to understand the details of the math.

    Typically, making the results accessible to the end user involves wrapping the math in the language and the process of the end user. Also, the analytics can be embedded deep inside things so that the user does not see it, like in supply chain operations. Analytics should be usable by anyone, not just those with Ph.D.s in statistics or operations research.
    Some users will want to understand the algorithms and inner workings of an analytics model in order to trust the results prior to adoption, but they are the exception.

    Fast, cheap processors and cheap storage make analysis on big data possible. Moore’s law has resulted in vast increases in computing power and vast decreases in the cost of storing and accessing data. With readily available and inexpensive computing, we can do what-if calculations often and test a number of variables in big data for correlation.

    Doing things fast is almost always better than doing things perfectly. Often inexact but fast approaches produce enormous gains because they result in better choices than humans would have made without the use of analytics. Over time, the approximate analytics methods can be refined and improved to
  • 69.
    achieve additional gains. However, for many business processes, there is eventually a point of diminishing returns: The calculations may become more detailed and precise, but the end results are no more accurate or valuable.

    Using analytics leads to better auditability and accountability. With the use of analytics, the decision-making process becomes more structured and repeatable, and a decision becomes less dependent on the individual making the decision. When you change which people are in various positions, things still happen in the same way. You can often go back and find out what analysis was used and why a decision was made. …

    Dr. Brenda L. Dietrich is an IBM Fellow and vice president. She joined IBM in 1984, and during her career she has worked with almost every IBM business unit and applied analytics to numerous IBM decision processes. She currently leads the emerging technologies team in the IBM Watson group. For more than a decade, she led the Mathematical Sciences function in the IBM Research division, where she was responsible for both basic research on computational mathematics and for the development of novel applications of mathematics for both IBM and its clients. In addition to her work within IBM, she has been the president of INFORMS, the world’s largest professional society for operations research and management sciences. An INFORMS Fellow, she has received multiple service awards from INFORMS.

    Dr. Emily C. Plachy is a distinguished engineer in Business Analytics Transformation at IBM, where she is responsible for leading an increased use of analytics across IBM. Since joining IBM in 1982, she has integrated data analysis into her work and has held a number of technical leadership roles including CTO, process, methods, and tools in IBM Global Business Services.
    In 1992, Emily was elected to the IBM Academy of Technology, a body of approximately 1,000 of IBM’s top technical leaders, and she served as its president from 2009 to 2011. She is a member of INFORMS.

    Maureen Fitzgerald Norton, MBA, JD, is a distinguished market intelligence professional and executive program manager in Business Analytics Transformation, responsible for driving the widespread use of analytics across IBM. In her previous role, she led project teams applying analytics to IBM Smarter Planet initiatives in public safety, global social services, commerce and merchandising. Norton became the first woman in IBM to earn the designation of Distinguished Market Intelligence Professional for developing innovative approaches to solving business issues and knowledge gaps through analysis.

    Note: This article is adapted from the book, “Analytics Across the Enterprise: How IBM Realizes Business Value from Big Data and Analytics,” authored by Brenda L. Dietrich, Emily C. Plachy and Maureen F. Norton, published by Pearson/IBM Press, May 2014, ISBN 978-0-13-383303-4, ©2014 by International Business Machines Corporation. For more information, visit: ibmpressbooks.com.
  • 70.
    SOFTWARE SURVEY

    The future of forecasting

    Making predictions from hard and fast data.

    BY JACK YURKIEWICZ

    Here is an easy forecast to make: Forecasting will be part of our information flow for the foreseeable future. Forecasting is also a key topic in my “Decision Modeling for Management” course. In preparing the midterm exam for this past spring term, I wanted the students to analyze the enrollment figures for the Affordable Care Act and make some forecasts. The media has been talking about these enrollment figures since the rollout, and politicians have been making projections about them as well. In the course we covered various forecasting methodologies, including trend analysis. Thus, my plan for a midterm problem was to give the students the enrollment data and have them make a forecast for the May 1 enrollment deadline.

    Getting those enrollment numbers became obstacle number one. Figures 1 and 2 show some typical results of an Internet search. I found graphs, some better, more worse (look at the markers on the x-axis of the graph
  • 71.
    in Figure 1), lots of opinion articles with forecasts, but no data. I punted and decided to present the class a similar but far less-pressing problem.

    Figure 1: http://www.cnn.com/interactive/2013/09/health/map-obamacare/.

    Figure 2: http://www.whitehouse.gov/the-press-office/2014/04/17/fact-sheet-affordable-care-act-numbers.

    On March 31, the day of the midterm exam, I asked students to make forecasts for the cumulative domestic box-office gross for the recently released movie “Non-Stop.” The action film starring Liam Neeson had opened on Feb. 28, and I gave the students the daily domestic box-office gross values from opening day through
  • 72.
    March 16, or 17 days of data. The students were asked to make a time plot of these box-office figures (see Figure 3) and, after examining various trend models, get a forecast for the cumulative domestic box-office gross for a target date, midterm day, March 31. I knew that two days later (after I had graded their exams and returned them), Universal Studios would give the actual cumulative domestic gross of the film as of March 31. It was $85.39 million. Of the various trend models we covered, the Weibull curve yielded the most accurate forecast, $86.11 million; another model was reasonably close, and the others we discussed and they tried were way off.

    Figure 3: Initial daily domestic box-office gross of the motion picture (“Non-Stop”).

    CATEGORIZING THE FORECAST SOFTWARE

    Commercial forecasting software is available in two broad categories. Using the nomenclature from previous OR/MS Today forecasting surveys, the first category is called dedicated software. A dedicated product implies that the software only has various forecasting capabilities, such as Box-Jenkins, exponential smoothing, trend analysis, regression and other procedures.

    The second category is called general statistical software. This implies the product does have forecasting techniques as a subset of the many statistical procedures it can do. Thus, a product that can do ANOVA, factor analysis, etc., as well as Box-Jenkins techniques would fall into
  • 74.
    this group. In recent years, the number of products in the second category has been growing, as statistical software firms have been adding additional and more sophisticated forecasting methodologies to their lists of features and capabilities. However, some dedicated software manufacturers offer specific capabilities and features (e.g., transfer function, econometric models, etc.) that general statistical programs may not have.

    In both software categories, forecasting software varies when it comes to the degree to which the software can find the appropriate model and the optimal parameters of that model. For example, Winters’ method requires values for three smoothing constants and Box-Jenkins models have to be specified with various parameters, such as ARIMA(1,0,1)x(0,1,2). Products differ in how much of this parameter search they do for the user.

    For the purposes of this and previous surveys, the ability of the software to find the optimal model and parameters for the data is characterized. Software is labeled as automatic if it both recommends the appropriate model to use on a particular data set and finds the optimal parameters for that model. Automatic software typically asks the user to specify some parameter to minimize (e.g., Akaike Information Criterion (AIC), Schwarz Bayesian Information Criterion (SBIC), RMSE, etc.) and recommends a forecast model for the data, gives the model’s optimal parameters, calculates forecasts for a user-specified number of future periods, and gives various summary statistics and graphs. The user can manually overrule the recommended model and choose another, and the software finds the optimal parameters, forecasts, etc., for that one.

    The second category is called semiautomatic.
    Such software asks the user to pick a forecasting model from a menu and some statistic to minimize, and the program then finds the optimal parameters for that model, the forecasts, and various graphs and statistics.

    The third category is called manual software. Here the user must specify both the model that should be used and the corresponding parameters. The software then finds the forecasts, summary statistics and charts.

    If you frequently need to make forecasts of different types of time series, using manual software could be a tedious choice. Unfortunately, that broad advice may not be apropos for some software. Some products fall into two categories. For example, if you choose a Box-Jenkins model, the software may find the optimal parameters for that model, but if you specify that Winters’ method be used, the product may require that you manually enter the three smoothing constants.
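    The difference between manual and semiautomatic operation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any vendor's procedure: it assumes simple exponential smoothing as the user-chosen model, one-step-ahead RMSE as the statistic to minimize, a plain grid search over the smoothing constant, and a made-up series. The function names are hypothetical.

```python
import math


def ses_one_step_rmse(series, alpha):
    """RMSE of one-step-ahead forecasts from simple exponential smoothing."""
    level = float(series[0])  # assumption: start the level at the first observation
    squared_errors = []
    for y in series[1:]:
        squared_errors.append((y - level) ** 2)  # the forecast for this period is the current level
        level = alpha * y + (1.0 - alpha) * level  # update the level with the new observation
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def semiautomatic_fit(series, steps=99):
    """Grid-search the smoothing constant that minimizes one-step-ahead RMSE."""
    candidates = [(i + 1) / (steps + 1) for i in range(steps)]  # 0.01, 0.02, ..., 0.99
    scored = [(alpha, ses_one_step_rmse(series, alpha)) for alpha in candidates]
    return min(scored, key=lambda pair: pair[1])  # (best alpha, its RMSE)


if __name__ == "__main__":
    demo = [12.0, 13.5, 13.1, 14.8, 15.2, 16.9, 16.4, 18.0]  # synthetic values, illustration only
    # "Manual": the user supplies the smoothing constant.
    print("manual, alpha=0.2:", ses_one_step_rmse(demo, 0.2))
    # "Semiautomatic": the user picks the model, the program finds the constant.
    print("semiautomatic:", semiautomatic_fit(demo))
```

    An automatic product would, in effect, wrap one more loop around this search, repeating it for each candidate model and recommending the model with the best value of the chosen criterion.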
  • 75.
    When it comes to analyzing trends, most of the products I tried fall into the semiautomatic group. That is, I need to choose a trend curve, and the software finds the appropriate parameters for that model, gives forecasts, summary statistics and graphs.

    WORKING WITH A SAMPLE OF PRODUCTS

    In my class, students use StatTools, part of the Palisade Software Suite that comes with their textbook. Its forecasting capabilities are regression, exponential smoothing (Brown, Holt and Winters’) and moving averages. If data followed some nonlinear function, the students could make mathematical transformations to make the data linear and then use ordinary linear regression on it, and do the inverse transformation to get the forecast. They also have several Excel templates I developed (Gompertz, Pearl-Reed, Weibull, etc.) for the course. For this article, I tried a small sample of professional
  • 76.
    products from different categories, specifically Minitab, IBM SPSS and NCSS on the “Non-Stop” movie data. IBM SPSS falls into the automatic forecasting category; Minitab and NCSS are semiautomatic products. A caveat: This is not meant to be a critical review of any product mentioned.
    I let IBM SPSS first do the analysis of the movie data via its automatic mode, called “Expert Modeler” (i.e., choose the model and its parameters and get the forecasts). Figure 4 shows superimposed screen shots of IBM SPSS’ worksheet, showing the “Non-Stop” daily domestic box-office gross and the menu system to start the automatic forecasting procedure. The program then gave its recommended model, Brown’s method for data with linear trend, which uses one smoothing constant to estimate the intercept and slope of the fitted line (as compared to Holt’s method, which uses two independent smoothing constants) [1]. IBM SPSS’ accompanying statistics, forecast plot and additional output are shown in Figure 5.
    Figure 4: IBM SPSS input worksheet (showing the “Non-Stop” movie daily box-office returns).
    Figure 5: IBM SPSS’ results of “automatic” forecasting of the “Non-Stop” data.
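To make the idea of a single smoothing constant concrete, here is a minimal Python sketch of Brown's linear exponential smoothing, with a simple grid search standing in for the parameter optimization that semiautomatic software performs. The data values, the function names and the initialization (both smoothed series start at the first observation) are illustrative assumptions; this is not a description of how IBM SPSS or any other product mentioned here implements the method.

```python
import math

def brown_forecasts(series, alpha):
    """One-step-ahead forecasts from Brown's linear (double) exponential smoothing."""
    s1 = s2 = series[0]  # assumed initialization: both smoothed series start at the first value
    forecasts = []
    for y in series:
        level = 2 * s1 - s2                      # estimated intercept at the latest period
        slope = alpha / (1 - alpha) * (s1 - s2)  # estimated slope, driven by the same alpha
        forecasts.append(level + slope)          # forecast made before y is observed
        s1 = alpha * y + (1 - alpha) * s1        # first smoothing pass
        s2 = alpha * s1 + (1 - alpha) * s2       # second pass smooths the smoothed series
    return forecasts

def rmse(series, forecasts):
    """Root mean square error of the one-step-ahead forecasts."""
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(series, forecasts)) / len(series))

def best_alpha(series, steps=99):
    """Grid search over alpha = 0.01, 0.02, ..., 0.99 for the smallest RMSE."""
    candidates = [(i + 1) / (steps + 1) for i in range(steps)]
    return min(candidates, key=lambda a: rmse(series, brown_forecasts(series, a)))

# Hypothetical daily grosses (in $ millions), NOT the actual "Non-Stop" figures.
daily_gross = [10.0, 11.5, 7.2, 2.4, 2.1, 1.8, 1.6, 4.4, 6.3, 3.9]
alpha = best_alpha(daily_gross)
print(f"alpha = {alpha:.2f}, RMSE = {rmse(daily_gross, brown_forecasts(daily_gross, alpha)):.3f}")
```

Holt's method would carry a second, independent constant for the slope, so the search would run over two parameters instead of one.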
  • 77.
  • 78.
    IBM SPSS does have a curve fitting feature, so I utilized it and specified three possible models to be examined – the linear, growth and logistic curves. Figures 6 and 7 give the resulting output and plots for these choices.
    Figure 6: IBM SPSS’ fitted models for three specified growth curves.
    Figure 7: IBM SPSS’ plot of the data and growth curves.
    NCSS has, in addition to the standard forecasting procedures (Box-Jenkins and exponential smoothing models), an extensive list of more than 20 nonlinear curve models under its menu label “Growth and Other Models.” The user chooses a model, and NCSS finds the appropriate parameters for the particular data set. I chose, for the “Non-Stop” data, the “Logistic(4)” model [i.e., a logistic curve with four parameters; there is a Logistic(3) model available as well], and Figure 8 shows NCSS’ output.
    Minitab is a hybrid of a semiautomatic and manual forecasting product. If you specify that a Box-Jenkins model be used, the software finds the appropriate parameters for the model. However, if you choose Winters’ method, Minitab requires
  • 79.
    that you manually enter values for the three smoothing constants. Minitab also has, under the Time Series choice on the main menu, a Trend Analysis option. Choosing that gives the user four possible curves (linear, quadratic, exponential and Pearl-Reed logistic). Figure 9 gives the results of my choice for the “Non-Stop” data, the Pearl-Reed curve (Minitab calls it the S-Curve Trend Model).
    Figure 8: NCSS’ output. I chose the “Logistic(4)” from NCSS’ list of “Growth and Other Models.”
    Figure 9: Minitab’s output for the Pearl-Reed logistic growth model for the Non-Stop data.
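The transform-then-regress approach mentioned for nonlinear data can be sketched for the simplest case, an exponential trend y = a·exp(b·t). Taking logarithms makes the relationship linear, ordinary least squares gives the line, and exponentiating inverts the transformation. The function names and the sample values below are hypothetical and are not taken from the “Non-Stop” data or from any of the packages discussed.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_exponential(ts, ys):
    """Fit y = a * exp(b * t) by regressing ln(y) on t (requires every y > 0)."""
    intercept, slope = fit_line(ts, [math.log(y) for y in ys])
    return math.exp(intercept), slope  # inverse transformation recovers a

def forecast(a, b, t):
    """Forecast on the original scale."""
    return a * math.exp(b * t)

# Hypothetical, steadily decaying series for illustration only.
days = [1, 2, 3, 4, 5, 6]
gross = [9.8, 8.1, 6.5, 5.4, 4.5, 3.6]
a, b = fit_exponential(days, gross)
print(f"a = {a:.3f}, b = {b:.3f}, day-7 forecast = {forecast(a, b, 7):.3f}")
```

One caveat of this shortcut: least squares on the log scale minimizes relative rather than absolute errors, so the result generally differs somewhat from a direct nonlinear fit of the same curve.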
  • 80.
    Finally, Figure 10 shows the results of one of my Excel templates that uses the four-parameter Weibull trend curve and uses Solver’s nonlinear programming capability to find the optimal parameters that minimize the root mean square error for the entered data.
    Figure 10: The four-parameter Weibull curve fit for the Non-Stop data.
    THE SURVEY
    We e-mailed the vendors and asked them to respond to our online questionnaire so readers could see the features and capabilities of the software. The purpose of the survey is to inform the reader of a program’s forecasting capabilities and features. We tried to identify as many forecasting vendors and products as possible and contacted all the vendors that we identified and/or that responded to the last survey in 2012. For those who did not respond, we tried gentle reminders (several e-mails and some phone calls). In addition to the features and capabilities of the software, we wanted to know what techniques or enhancements have been added to the software since our previous survey. The information comes from the vendors, and we made no attempt to verify what they gave us.
  • 81.
    If you use data to make forecasts, what should you look for in a vendor and the product? First, find out the capabilities of the software. Specifically, what forecasting methodologies can the product do? Does it find the optimal parameters of the procedure for your particular data set or must you manually enter those values? How extensive, useful and clear is the output?
    Most, but not all, vendors allow you to download a time-trial version of the software that typically expires in anywhere from a week to a month. Ideally, the trial version should allow you to work with your own data and not just “canned” data that the vendor bundles with the trial software. Check whether the trial version limits the size of the data and, if so, whether those limits are overly restrictive.
    Ask about technical support, updating to a newer version when it is released and differences (if any) depending on the operating system you are using. Contact the vendor with your specific questions. Users tell me, and I have independently found, that most vendors have good and helpful technical support before and after you buy.
    Jack Yurkiewicz (yurk@optonline.net) is a professor of management science in the MBA program at the Lubin School of Business, Pace University, New York. He teaches data analysis, management science and operations management. His current interests include developing and assessing the effectiveness of distance-learning courses for these topics. He is a longtime member of INFORMS.
    SURVEY DATA & DIRECTORY
    To view the survey results as well as a directory of vendors who participated in the survey, click here.
  • 82.
    CONFERENCE PREVIEW
    S.F. conference set to capture hearts & minds
    BY CANDACE “CANDI” YANO
    Tony Bennett sang that he “left his heart in San Francisco” – and at the 2014 INFORMS Annual Meeting in San Francisco, you will begin to understand why as you take advantage of the opportunity to fill both your heart and your mind.
    Photo: Some of San Francisco’s many landmarks are mobile.
    To fill your mind, you can attend special presentations:
    • Alvin Roth, professor of economics at Stanford University and professor of economics and business administration at Harvard University who was awarded the 2012 Nobel Prize in Economics for his work in the area of Game Theory, will talk about his work.
    • Richard Cottle, emeritus professor at Stanford University, will offer a commemorative and historical perspective on George Dantzig in honor of Dantzig’s 100th birthday.
    • Jonathan Caulkins, professor at the Heinz School of Public Policy at Carnegie Mellon University, will discuss his work on health and drug-related policy issues.
  • 83.
    • Anthony Levandowski of Google will talk about the Google Driverless Car project, offering his perspective as both a developer and a user of the technology.
    • A panel of experts from within the INFORMS community will discuss their experience with, and offer advice on, massive open online courses (MOOCs).
    If this is not enough, there will be more than 4,000 technical presentations by experts from industry, academia and government. Topics will be wide-ranging, covering the full breadth of the field, from leading-edge advancements in operations research methodologies and analytics, to applications in healthcare, energy, critical infrastructure management, environmental management and supply chain management.
    If you are not already overwhelmed while filling your mind, you will have ample opportunity to fill your heart – and stomach. San Francisco is regarded as one of the most beautiful cities in the world and offers world-class cuisine from almost every ethnic heritage. The meeting will take place in two adjacent hotels, the Hilton San Francisco Union Square and the Parc 55 Wyndham. The location is in close proximity to the city’s prime shopping district and near the boarding point for cable cars to Fisherman’s Wharf – famous for fresh seafood and Pier 39 – where you can see dozens of sea lions and walk to ferries that offer everything from simple rides across San Francisco Bay to amazingly scenic tours, as well as Ghirardelli Square, known for Ghirardelli chocolate.
    Venturing into other parts of San Francisco, you can visit world-class museums, including the Palace of the Legion of Honor, de Young Museum, Asian Art Museum and California Academy of Sciences. The performing arts, including the symphony, ballet, opera, jazz, theater and concerts, are all within easy reach.
If you prefer the outdoors, you can take a trip to the former prison on Alcatraz (a limited number of tickets will be available to conferees for purchase), see the redwoods in Muir Woods, hike in the Marin Headlands with an unobstructed view of the Golden Gate Bridge, sign up to play a round of golf with other conferees at TPC Harding Golf Course the day before the conference, or simply wander through the haunts of the hippies in Haight-Ashbury or the Beat poets in North Beach. Just a bit further from the city are the wine regions of Napa and Sonoma, only an hour’s drive away.
Both the meeting and the venue will have much to offer in many dimensions. We look forward to seeing you there.
Candace “Candi” Yano is general chair of the 2014 INFORMS Annual Meeting in San Francisco. She is a longtime member of INFORMS.
  • 84.
    FIVE-MINUTE ANALYST
    Probabilistic parking problems
    BY HARRISON SCHRAMM, CAP
    Few things make me more conflicted than parking lots. On a personal level, I loathe the whole parking activity. It brings out what I think are the worst behaviors of humankind: hoarding, brinksmanship, scarcity mentality, irrational objective functions… and now you see why as an O.R. professional I love parking lots: because they are so interesting to study.
    Figure 1: A “smart meter” in a parking lot. This meter has a button next to the coin slot that may be pressed for a free hour of parking. Coins may be added for additional time, up to two hours.
    At the corner of Hades Street and Styx Ave. is (at least to me) the world’s worst parking lot. Here’s the set-up: There is an upper level with metered parking. The meter has a two-hour limit at a rate of $1.25/hour, but pressing a silver button on the meter sets the time to 60 minutes if the meter is currently less than 60 (see Figure 1). This makes parking here free to most visitors. The lower level is
  • 85.
    a standard parking garage, which has a flat $2 per hour fee which can be validated by the two “anchor” stores, making it essentially free for most patrons as well. While this is light and exploratory, there is serious work going on with parking problems [1].
    In the sterile world of figures and mathematics, this sounds like a reasonable way to run a parking lot, and patrons who miss the upstairs free parking will simply renege and take the lower level free parking. In reality, people “mob” the upstairs portion in search of “free parking.” My assistant and I had observed this behavior over a number of weeks, and we were interested in learning about the time parked cars spent in the lot, with an eye for simple metrics such as expected wait time for a parking spot or the expected number of cars “trolling” for a slot. This interest became action (the key for any analysis), and we chose 6:30 p.m. on a Thursday evening – a time that we knew the parking lot would be full – to collect data
  • 86.
    from the meters, which is displayed for anyone who wishes to see.
    What we found was surprising. We expected to see uncorrelated parking lot data. We did not expect to find many over-time parking spots. I hoped that the data would be exponential – which would lead to nice, clean analysis. What we discovered was, well, a mess.
    Of the 100 parking spots surveyed, 25 percent were “flashing” or over-time (violation). Of the parking spots that were not over-time, six showed times over one hour, implying that the persons parked there had in fact put money in the meter. We are completely discarding the possibility that someone would park in a spot that had been previously occupied but was not vacated, i.e., showing up with 30 minutes remaining on meter and not pressing the button/inserting coins. I had hoped that the sojourn times would be exponentially distributed, but that is a case that is pretty difficult to make with this dataset (see Figure 2).
    Figure 2: Histogram of raw parking meter data. Note the tri-modal nature of the data. “Overtime,” i.e., flashing parking meters are represented by -1 in the red-shaded oval and constitute the large bar at the origin of the graph. Known paid parking meters are at the right and have a blue oval.
    Now, we don’t actually know how many patrons have paid, or how many have simply run over. However, there are 100 parking spots considered, and of these, six currently have clocks over one hour. We can (crudely) estimate [2] the true number of paid parking spots by realizing that we are observing the last hour of what may be a two-hour process. Therefore, we think approximately
  • 87.
    12 parking spots have been paid for at any given time.
    YES, BUT WHAT DOES IT ALL MEAN?
    So in one sense, the distributions of the data are irrelevant; there are 100 parking spots on average, and the average time that a parking spot is occupied is some time greater than 27 minutes. If we make the (not bad!) assumption that the parking spots that run over are occupied for 90 minutes, then the average occupancy is 43 minutes. In a lot with 100 spots, this means that on average, one spot comes open every 30 seconds. This doesn’t sound so bad.
    Figure 3: Histogram of parking time remaining, less than 60 minutes. Approximately six of these data points are actually spill over from “paying” customers.
    If we treat the system as a queue, and use the (observed) steady state cars waiting of three, we can place a rough lower estimate [3] that a new car arrives every 30 seconds looking for a parking spot, and that they have between a 15 percent and 25 percent chance of finding an open spot. These crude estimates, however, do not agree very well with observation, because they neglect the “blocking” effect of other cars waiting for spots to open up. A better analysis of
  • 88.
    this parking lot would involve simulation, which would go beyond our intent.
    THE WORLD’S WORST PARKING LOT?
    Because of the behavior of the drivers while trolling for a parking spot, it might be considered the world’s worst parking lot. Enforcement of the parking policy might help because it would decrease the sojourn times of the cars parked in the lot, but there is no guarantee, and – more importantly – no direct incentive for the parking lot owners to do so. This is because the number of “free” parking spots is fixed, and once they are filled, they are filled, regardless of by whom. From the lot manager’s point of view, it doesn’t matter if they are “long” or “short” parkers. In fact, the rate structure is such that short parkers are slightly more lucrative for the parking lot owner than parking above ground.
    In conclusion, it’s probably a bit of literary hyperbole to imply that this is the world’s worst parking; I’m sure there are others that are much worse. This is because I like to make short trips to this area and visit the locations that don’t validate parking, and I really don’t like the risky behaviors aggressive parkers participate in. On the upside, there’s time to write 12 articles in a single push of the button! I’d be interested in hearing real contenders for the “World’s Worst Parking” lot.
    Update: Between the original draft of this article and its publication, the parking lot in question began installing an electronic system to help customers determine how many spots were available before entering the parking “queue.” It has yet to be determined if it will change the behaviors of the parking lot. Look forward to an update in a future column!
    Harrison Schramm (harrison.schramm@gmail.com) is an operations research professional in the Washington, D.C., area. He is a member of INFORMS and a Certified Analytics Professional (CAP).
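The back-of-the-envelope numbers in the column can be checked with a few lines of Python. This is a rough sketch under stated assumptions: the 27-minute figure is read as the average for the 75 percent of spots not in violation (a reading that reproduces the 43-minute average), and the M/M/1 function treats the observed three cars as the mean number in the system, which is only one possible reading of note [3], not a statement of the author's actual calculation. All names are illustrative.

```python
SPOTS = 100              # meters surveyed in the article
OVERTIME_SHARE = 0.25    # share of meters flashing (over-time)
OVERTIME_MINUTES = 90    # the article's assumption for spots that run over
OTHER_MINUTES = 27       # assumption: the 27-minute figure applies to the remaining spots

def average_occupancy_minutes():
    """Weighted average stay across over-time and other spots."""
    return OVERTIME_SHARE * OVERTIME_MINUTES + (1 - OVERTIME_SHARE) * OTHER_MINUTES

def seconds_between_openings(avg_minutes, spots=SPOTS):
    """If each spot turns over once per avg_minutes, the lot frees `spots` spaces in that time."""
    return avg_minutes * 60 / spots

def estimated_paid_spots(observed_over_one_hour=6):
    """A paid stay of up to two hours shows more than 60 minutes for only about
    half of its duration, so the article doubles the observed count."""
    return 2 * observed_over_one_hour

def mm1_idle_probability(mean_in_system):
    """M/M/1: L = rho / (1 - rho), so rho = L / (1 + L); the server is idle with probability 1 - rho."""
    rho = mean_in_system / (1 + mean_in_system)
    return 1 - rho

avg = average_occupancy_minutes()
print(f"average occupancy: {avg:.2f} minutes")
print(f"a spot opens about every {seconds_between_openings(avg):.1f} seconds")
print(f"estimated paid spots: {estimated_paid_spots()}")
print(f"M/M/1 chance of an idle server with L = 3: {mm1_idle_probability(3):.2f}")
```

The spacing between openings comes out a little under the article's round figure of 30 seconds, and the M/M/1 value lands at the upper end of the quoted 15 to 25 percent range.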
NOTES & REFERENCES
1. Fabusuyi, Hampshire, Hill and Sasanuma, 2014, “Decision Analytics for Parking Availability in Downtown Pittsburgh,” Interfaces, INFORMS, Hanover, Md.
2. This is just an estimate. More delicate techniques may be applied.
3. Using the M/M/1 queuing model to find the “lower” or optimistic estimates, and the M/G/1 queuing model to find the upper estimate.
  • 89.
    Join us in San Francisco
    INFORMS returns to the City by the Bay for its 2014 Annual Meeting with a rich and varied program, bridging data and decisions. Each year, the INFORMS meeting brings together experts from academia, industry and government to consider a broad range of ORMS and analytics research and applications. In 2014, we’ll offer that program excellence in one of America’s most exciting cities. Join us for INFORMS 2014!
    Registration Now Open!
    November 9-12, 2014
    Hilton San Francisco Union Square and Parc 55 Wyndham, San Francisco, California
    meetings2.informs.org/sanfrancisco2014
    The Premier Conference for OR/MS Professionals offers you:
    • Networking – connect with colleagues, share knowledge and ideas
    • Top industry and academic speakers
    • Two great receptions, Sunday and Tuesday
    • Tutorials, exhibits and software demonstrations
    • Extensive tracks on “hot topics” – the best in ORMS
    • Focus on Analytics and Practice – special tracks and sessions
    • Vibrant Interactive/Poster Sessions
  • 90.
    THINKING ANALYTICALLY
    Frog and fly
    BY JOHN TOCZEK
    A frog is looking to catch his next meal just as a fly wanders into his pond. The frog jumps randomly from one lily pad to the next in hopes of catching the fly. The fly is unaware of the frog and is moving randomly from one red flower to another. The frog can only move on the lily pads and the fly can only move on the flowers. The interval at which both the frog and the fly move to a new space is one second. They never sit still and always move away from the space they are currently on. Both the frog and the fly have an equal chance of moving to any nearby space including diagonals. For example, if the frog were on space A1, he would have a one-in-three chance each of moving to A2, B2 and B1. The frog will capture the fly when he lands on the same space as the fly.
    Figure 1: Where will the frog dine on the fly?
    QUESTION: On which space is the frog most likely to catch the fly?
    Send your answer to puzzlor@gmail.com by Aug. 15. The winner, chosen randomly from correct answers, will receive a $25 Amazon Gift Card. Past questions can be found at puzzlor.com.
    John Toczek is the senior director of Decision Support and Analytics for ARAMARK Corporation in the Global Operational Excellence group. He earned a bachelor of science degree in chemical engineering at Drexel University (1996) and a master’s degree in operations research from Virginia Commonwealth University (2005). He is a member of INFORMS.
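The movement rule in the puzzle can be written down as a small helper. Because Figure 1 is not reproduced here, the board below is a hypothetical 3-by-3 grid in which every space is a lily pad; the actual layout of pads and flowers must come from the figure. The sketch only illustrates the rule that each adjacent allowed space, diagonals included, is equally likely, and it does not solve the puzzle.

```python
def neighbors(space, allowed):
    """Adjacent allowed spaces (including diagonals) for a label such as 'A1'."""
    col, row = space[0], int(space[1:])
    result = []
    for dc in (-1, 0, 1):
        for dr in (-1, 0, 1):
            if dc == 0 and dr == 0:
                continue  # the animal never sits still
            candidate = f"{chr(ord(col) + dc)}{row + dr}"
            if candidate in allowed:
                result.append(candidate)
    return sorted(result)

def move_probabilities(space, allowed):
    """Each reachable neighbor is equally likely."""
    options = neighbors(space, allowed)
    return {s: 1 / len(options) for s in options}

# Hypothetical board: a 3x3 grid where every space is a lily pad (not the layout in Figure 1).
lily_pads = {f"{c}{r}" for c in "ABC" for r in (1, 2, 3)}
print(move_probabilities("A1", lily_pads))
```

Given the real sets of lily pads and flowers, the same helper could drive a simulation of both animals, or supply the rows of a transition matrix for a Markov-chain analysis.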
  • 91.