Translating data into action
Infosys Information Platform
Table of Contents
Abstract
Introduction
The Infosys Information Platform (IIP) in Action
  Analytics-driven preventative maintenance and downtime reduction
  Real-time operational visibility
  Augmented revenue and profitability
Information Management Realities
  There’s lots more data than ever
  Data has tremendous diversity
  Data is generated by multiple external sources
  Data arrives very quickly
  Cross-system data is generally uncorrelated
Barriers to Success
  Installation and integration of a modern data platform
  Staffing shortfalls
  Administrative hassles
What can change
IIP in Brief
  IIP in the enterprise
IIP Customer Success Grid
  Analytics-driven preventative maintenance & downtime reduction
    One of the world’s largest mining companies
    A major ATM manufacturer
    A multinational telecommunications enterprise
  Real-time business operational visibility
    ATP Tour
    A global financial services institution
    A world leader in agribusiness
    A global electronics and imaging major
  Augmenting revenue and profitability
    A global automation major
    A global pharmaceutical supplier
    North American freight railroad network
    The largest chocolate manufacturer in North America
IIP Building Blocks
  Layer 1: Flexible data management
  Layer 2: Insights development & analytics
  Layer 3: Insights-as-a-service
  Table 1: IIP Layer 1 components
  Table 2: IIP Layer 2 components
Next Steps
Abstract
Infosys has strengthened its proficiency in helping the world’s most advanced enterprises improve
operations and run their businesses with a power-packed, technology-driven platform known as the
Infosys Information Platform (IIP).
IIP is an analytics platform that orchestrates open source software, value-added enhancements, and strong
professional service expertise. Along with its single-click installer, data ingestion framework and graphical
data-modeling tool, IIP supplies a comprehensive array of adapters for diverse data sources as well as an
easy way to create new connections when needed. Out-of-the-box integration with RStudio simplifies
harnessing the power of clusters while modeling data. Significantly, all of IIP’s benefits can be realized
without requiring extensive coding.
All of these and other features help customers discover actionable insights and foresights by deriving
meaning from the abundant - and untapped - sources of information.
This paper will help you understand IIP, its design philosophy, technical architecture and success stories.
Introduction
So far, it’s been strenuous and time-consuming to obtain insights from the enormous amounts of raw data
from internal and external sources that flood enterprises every day. Even when information analysis has
been successful, tangible insights that convert to business results have frequently been elusive.
Infosys Information Platform (IIP) brings together the right ingredients - technology, toolsets, and
processes - to obtain insights in near real time from all kinds of data – historic or current, idle or
ever-changing, structured or unstructured.
Solutions developed using IIP deliver quick and meaningful business outcomes such as:
• Analytics-driven preventive maintenance and downtime reduction
• Real-time operational visibility
• Augmented revenue and profitability
Organizations that deploy IIP are able to realize these benefits while still protecting and rejuvenating
existing IT technology investments.
We begin this paper by describing a few common solutions, and then illustrate IIP’s architecture and
distinct value proposition. We also depict how the unique challenges of today’s information landscape
served as the rationale behind Infosys’ development of AiKiDo and the IIP solution that it incorporates.
The intended audience for this paper includes line-of-business executives, IT leadership, and anyone else
interested in translating raw data into insights and guidance that the business can swiftly use to drive action.
The Infosys Information Platform (IIP) in Action
By running existing workloads more efficiently while unearthing new opportunities, IIP makes it possible to
leverage untapped data from numerous internal and external systems and sources to reveal insights and
suggest quick courses of action.
Market adoption of IIP has been impressive: within the first year of its existence, 200 customers have
completed evaluations, with dozens now onboard. To help customers get up and running quickly, Infosys
offers a preconfigured Data Analytics solution.
Combined with rich professional expertise in roles such as business analysts, technology architects, data
scientists, data engineers, and software developers, IIP serves as the foundation for business solutions
constructed on the platform. These solutions have benefited the bottom line in dozens of engagements
across industries and applications.
These achievements cover a broad range of applications, such as:
• Fraud analytics
• Predictive analytics for maintenance
• Digital shopper insights
• Customer churn analysis
• Risk exposure analytics
• Trade data analysis and regulatory reporting
• Real-time machine learning
• Working capital allocation optimization
Below are some examples of our solutions.
Analytics-driven preventative maintenance and downtime reduction
1. One of the world’s largest mining operators has placed nearly 200 sensors on each of its autonomous,
unmanned vehicles. IIP’s real-time data analytics – ingesting and processing 27,000 messages per
second - predicts which of these vehicles is about to fail.
This guidance drives repairs before downtime can occur, and is a great illustration of a completely new
type of application made possible by IIP.
2. A major ATM manufacturer and service provider turned to IIP to develop a fresh, innovative solution
that analyzes four million records of alert and incident data from 8,500 machines in an effort to foresee
which ATMs would fail within one week.
The outcome included downtime reduction of 10%, a 14% increase in service call efficiency, and an
18% cost reduction thanks to more accurate, productive client visits.
3. A multinational telecommunications services company had recognized that network faults were the
biggest single cause of disruptions, but determining the time and location of impending failures was
nearly impossible. Complex analytics on millions of operational records conducted using IIP helped
unearth the fact that three percent of the company’s lines were at risk of having a fault sometime during
the next three weeks.
Armed with this knowledge, the organization immediately targeted the imperiled lines for repairs before
the anticipated outages could occur.
Real-time operational visibility
1. ATP Tour - the governing body of men’s professional tennis - wanted to add new color and depth to
fans’ understanding of the game in an open and cost-effective way.
Infosys loaded extensive historic data consisting of millions of data points from multiple systems into
IIP. The results - which were available in near real-time - provided a comprehensive collection of
in-depth, probability-led foresights on player performance to the media, game commentators, and the
sport’s worldwide fan base.
2. Spurred by regulatory trade requirements, a leading global financial institution processing six million
transactions per day employed near real-time analytics in IIP to slash report generation times from
10-15 minutes to 35 seconds.
This is an example of better utilization of existing infrastructure brought about by IIP.
3. A world leader in agribusiness was facing application performance challenges in its management
reporting solution, which was struggling under large volumes of data. Infosys implemented a
proof-of-concept using the Infosys Information Platform (IIP) to improve the performance of the
reporting solution. During the exercise, IIP ingested 19 million records in just six minutes. This was a
breakthrough compared to the existing platform, which took over an hour to ingest half a million records.
IIP could also complete report and dashboard navigation in under five seconds, while the existing
platform took over a minute to perform the same task.
4. A global imaging and electronics manufacturer of printers, photocopiers and fax machines sought to
rejuvenate its Accounts Receivable reporting process.
Along with ongoing, daily production details, Infosys migrated 24 months of historical information from
the existing data warehouse into IIP. At the same time, the enterprise’s data models, views, and reports
were all streamlined and optimized.
Turnaround time for daily data integration was 37% faster, and report performance times were cut in
half.
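To put the agribusiness ingestion figures above on a common scale, they can be converted to records per second. The following is a minimal Python sketch of that arithmetic. The record counts and durations come from the text; treating "over an hour" as exactly one hour is an assumption, so the existing platform's rate is an upper bound and the speedup a lower bound.

```python
# Ingestion throughput implied by the agribusiness proof-of-concept figures.
# "Over an hour" is treated as exactly one hour (an assumption), which makes
# the existing platform's rate an upper bound.

def records_per_second(records: int, seconds: float) -> float:
    """Average ingestion rate in records per second."""
    return records / seconds

iip_rate = records_per_second(19_000_000, 6 * 60)       # 19 million records in six minutes
legacy_rate = records_per_second(500_000, 60 * 60)      # half a million records in an hour
speedup = iip_rate / legacy_rate

print(f"IIP: {iip_rate:,.0f} records/s")
print(f"Existing platform: at most {legacy_rate:,.0f} records/s")
print(f"Speedup: at least {speedup:,.0f}x")
```

On these figures, IIP ingests roughly 53,000 records per second against fewer than 140 for the existing platform.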
Augmented revenue and profitability
1. To help identify existing customers with a propensity to buy specific products and services, a major
office automation vendor rolled out a new application that utilized IIP’s machine learning and in-memory
computing capabilities to analyze more than two million records of previous purchases.
The complete set of predictions was generated in seven seconds.
2. A global pharmaceutical supplier was hampered by the amount of time it took to identify backorders.
Using IIP to consume SAP-generated order details, they created a new solution that analyzes the entire
data set and identifies - and helps correct - supply shortfalls within 10 seconds.
3. A major North American freight railroad network was eager to come up with new tactics to reduce the
quantity of unnecessary braking events generated by the Positive Train Control (PTC) system for its
locomotives. These unanticipated incidents diminished the organization’s ability to adhere to its
published operating schedules. Infosys used IIP and the R programming language to analyze an
expansive set of operational metrics and then develop a delay event prediction model.
The new approach helped the railroad adjust the PTC and significantly diminished the number of
unnecessary braking occurrences.
4. The largest chocolate manufacturer in North America lacked a timely, systematic methodology for
determining when its products were unavailable for purchase at retail locations.
Since many consumers make their buying decisions impulsively, these frustrating inventory shortages
resulted in lost revenue and diminished brand loyalty.
IIP served as the computing platform for a collection of statistical models that helped to classify out-of-
stock events and determine their root cause. Along with this analysis, the new solution also alerted the
appropriate users to help prevent these costly episodes.
To learn more about these IIP accomplishments, please see the IIP Customer Success Grid that’s
presented later, or visit http://www.infosys.com/information-platform/case-studies/Pages/index.aspx
In the next section, we portray some of the modern information complexities that Infosys needed to
overcome when constructing the solutions that we just illustrated. These dynamics also helped influence
the design and development of the IIP platform itself.
Information Management Realities
Regardless of industry, every IT organization must confront an assortment of uncomfortable truths
about how data is created and utilized today. Each of these factors was an integral consideration when
Infosys developed IIP.
There’s lots more data than ever
According to a study [1] published by EMC and IDC, from 2013 to 2020, the digital universe will grow by a
factor of 10 – from 4.4 trillion gigabytes to 44 trillion. It is more than doubling every two years.
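The growth figures quoted from the study can be turned into a compound annual rate. The short Python sketch below uses only the two rounded endpoints given above (4.4 and 44 trillion gigabytes) and assumes smooth compound growth across the seven years, so the derived rates are approximations rather than figures taken from the study itself.

```python
import math

# Endpoints as quoted in the text, in trillions of gigabytes.
start_size, end_size = 4.4, 44.0
years = 2020 - 2013

growth_factor = end_size / start_size                   # tenfold overall
annual_rate = growth_factor ** (1 / years) - 1          # compound annual growth rate
two_year_factor = growth_factor ** (2 / years)          # growth over any two-year span
doubling_time = years * math.log(2) / math.log(growth_factor)

print(f"Compound annual growth: {annual_rate:.1%}")
print(f"Growth over two years: {two_year_factor:.2f}x")
print(f"Doubling time: {doubling_time:.2f} years")
```

Under these assumptions, the rounded endpoints correspond to growth of about 39% per year, which is close to a doubling every two years.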
Data has tremendous diversity
Previously, information was principally generated by enterprise applications using a standard relational
structure that was easily catalogued and employed.
Naturally, structured application information is still a big and meaningful portion of the overall IT portfolio,
but data variety now encompasses unstructured sources such as:
• Social media feeds
• Machine logs
• Document scans
• OCR data
Data is generated by multiple external sources
There was a time when IT leadership could simply focus attention on its own applications and data.
That’s passé: IT must now have a plan to interact with, and react to, data created by innumerable outside
sources.
Data arrives very quickly
These new information categories are typified by the speed at which they’re generated and distributed. For
example, consider how rapidly a video clip, tweet or Facebook post can go viral.
Cross-system data is generally uncorrelated
When the bulk of the enterprise’s data was from well-defined enterprise applications, it was relatively
straightforward to understand and manage information interconnections. This is much more daunting today,
since properly linking raw - and often unstructured - data from diverse sources takes a lot of work.
[1] The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things, EMC
Digital Universe with Research & Analysis by IDC
Barriers to Success
Smart enterprises appreciate the untapped value of the data that’s generated by their operations every day.
Predictably, many organizations are making significant investments in hardware, software, and personnel
in an effort to derive advantages from this idle information.
Sadly, these technology expenditures have often failed to deliver on their promise. At a recent Gartner Big
Data Industry Insights event [2], analyst Lisa Kart stated, "73 percent of enterprises are either investing or
planning to invest in big data. However, of these, 65 percent are struggling with determining how to get
value from big data."
The gap between anticipation and results is not surprising: making sense of big data takes careful thought
and deliberate action, not simply the selection and purchase of technology. Beyond the usual challenges of
data volume, velocity, and variety, many technology, skills, and process obstacles diminish the returns
from IT expenditures. Let’s take a look.
Installation and integration of a modern data platform
Even with IT infrastructure, budgets, and competencies in place, enterprises today demand advanced
analytics capabilities that require varied components to work together seamlessly to derive insights.
Modern information management platforms must be integrated with existing components while new big
data components are installed, which is a well-known challenge.
Staffing shortfalls
There’s a dearth of talented individuals with expertise in data science and platforms like Hadoop. This
means that key IT staff take on additional, often perplexing responsibilities such as managing data
diversity, creating an accurate, meaningful enterprise information model, and designing and developing
the algorithms and applications necessary to turn potential returns into reality.
Administrative hassles
Since each constituent in a modern information management platform has its own lifecycle, administering
these solutions is an ongoing, often highly convoluted process that entails:
• Keeping up with the rapid pace of open source technology advancements
• Controlling versions, for infrastructure as well as applications
• Managing across projects
• Configuring and managing enterprise-grade security
• Ensuring compliance with mandated internal standards as well as those from external regulatory
agencies
While IT struggles to surmount the gaps between the promise and reality of the enterprise’s data, the user
community is often disappointed at the pace - and eventual upshot - of these efforts.
[2] Gartner Webinar, Big Data Industry Insights, Lisa Kart, January 27, 2015
First, there is a remarkable scarcity of easily deployed, user friendly, and industry-specific applications that
can make sense of - and offer meaningful recommendations about - the enterprise’s substantial information
assets.
In the absence of packaged solutions, staff members naturally turn to self-service options - including
analytics - as a mechanism for gaining the insights they crave. This often results in yet another instance
where the promises portrayed by IT and software vendors diverge from the actual end-user experience.
Finally, even in those rare cases where all of the prerequisites are in place for an exceptional user
experience, delays during the data ingestion process - for both structured and unstructured information -
introduce unendurable lags in turning suggestions into action.
What can change
Enterprises that seek to profit from the modern data landscape in real time should look for technologies
that:
• are adept at handling both structured and unstructured information
• increase sophistication by moving from old-style, basic relations to more far-reaching correlations
• future-proof investments by leveraging existing assets and letting the enterprise select best-of-breed
modules
• make it easy to profit from technology advancements without causing unnecessary expenditures
or downtime
• deliver rapid insights by moving away from the existing user request/IT response paradigm
towards ‘self-discovery’
With these fundamentals in place, users are empowered with self-service capabilities that let them
answer questions without needing to request assistance from the IT organization.
All of the best practices that we’ve just described have served as foundational guidelines for the Infosys
Information Platform, which we describe next.
IIP in Brief
As a global leader in consulting, technology, outsourcing and next-generation services, Infosys enables
clients in more than 50 countries to stay a step ahead of emerging business trends and outperform the
competition.
An entrepreneurial adventure that began with seven engineers and $250, Infosys is now a publicly traded
company (NYSE: INFY) driven by more than 179,000 relentless innovators and annual revenues of more
than $8.7 billion.
By focusing on seven core mega-trends, Infosys supplies enterprises in every industry with strategic
insights and a framework to uncover opportunities for innovation-led growth.
Inspired by its unparalleled experience in supplying the precise combination of technology, people, and
know-how to the world’s largest and most sophisticated organizations, Infosys has launched the AiKiDo
initiative.
AiKiDo is an integrated, coordinated, and strategic solution amalgamated from three distinct families of
products and services:
• Ai: Platforms and Platforms as a Service
• Ki: Knowledge-based management and landscape evolution
• Do: Design thinking and design-led initiatives
AiKiDo’s overall mission is to help Infosys’ customers improve current operations - and thereby renew their
existing technology and business process landscape - while driving innovation by uncovering formerly
hidden opportunities to solve challenges.
IIP in the enterprise
As part of the Open Data aspect of the Ai platform set, Infosys has developed the Infosys Information
Platform (IIP). It’s a complete solution that blends open source software, value-added components, a robust
partner ecosystem, and deep professional service proficiency.
It is an analytics platform that enables you to quickly glean insights from all types of data sources and use
them for decision support across industries.
IIP offers all necessary open source components in one validated package that can be installed with a
single click. This saves a tremendous amount of time, and keeps staff focused on delivering value, rather
than downloading, installing, and maintaining individual software bundles.
All of IIP’s benefits can be attained without writing or manipulating any open source platform code, so
enterprises can emphasize core applications and business outcomes without having to hire expensive open
source experts.
IIP is tailored to avoid vendor lock-in, and works with any licensed software or open source Hadoop
distribution such as Cloudera or Hortonworks, or relevant components acquired directly from the related
open source Apache project.
An intuitive user interface abstracts away the nuances and complexities of the underlying platform’s open
source technologies, while also eliminating the need to code for many common tasks – including data
modeling. This greatly increases the productivity of scarce data engineers and data scientists. When coding
is necessary, everything that IIP generates is 100% open source for ease of maintenance.
Enterprises are free to make substitutions without incurring any downtime or outages, keeping all options
open when surveying the latest technology advancements. In fact, if they wish, they can replace IIP’s
open source components with already-installed technologies: it’s entirely up to the enterprise to select the
amount and type of open source software in its environment. They can also select any desired
deployment model for IIP, including on-premise, public cloud, private cloud, and hybrid topologies.
Because it runs on commodity hardware and its open source foundation carries no software license fees,
IIP significantly reduces hardware, software, professional service, and operational outlays, and thus offers
a better total cost of ownership than existing market offerings.
The most salient feature of IIP is that it is a secure platform while still taking full advantage of open source
technologies. Encryption, authentication, data lineage, and cluster monitoring tools provide far-reaching
security, management and compliance with data audit and governance mandates.
Infosys’ commitment to advancing the state of the art of modern information platforms extends far
beyond simply providing a well-integrated solution to its customers. For instance, Infosys is a Platinum
Sponsor of the Open Data Platform (ODP) initiative: the open ecosystem of Big Data.
Infosys actively works with other industry leaders to promote and enhance Big Data technologies and open
source projects such as Apache Hadoop. These advances include Infosys contributions to performance
and security, which we will describe later in this paper.
As another example, IIP has been certified as a test bed by the Industrial Internet Consortium,
demonstrating its relevance in the rapidly evolving landscape of sensor data analytics and the Internet of
Things (IoT).
IIP Customer Success Grid
The following section summarizes a few examples of customers that have profited from IIP’s speed,
scalability, and open architecture. Each example captures the business challenge, a high-level solution
summary, and the benefits. They are classified as:
1. Analytics-driven preventative maintenance & downtime reduction
2. Real-time business operational visibility
3. Augmenting revenue and profitability
Analytics-driven preventative maintenance & downtime reduction
One of the world’s largest mining companies
Business context:
One of the world’s largest mining companies utilizes a fleet of autonomous, unmanned trucks. These
vehicles operate in numerous locations globally, and a failure negatively impacts the entire supply chain.
Solution highlights:
Each truck is equipped with nearly 200 sensors, which continually broadcast 400 data points of telemetry
about the state of the vehicle (e.g. temperature, vibrations, and tire pressure) along with details about the
terrain in which it’s currently operating.
Apache Kafka was configured to stream 27,000 of these messages per second into IIP, where a
mathematical model developed in Apache Spark computed maintenance requirements as well as the
likelihood of an upcoming failure.
A native HTML5 application presented a color-coded global map indicating the state of all vehicles, and
permitting drill-down on any individual truck.
Results:
Machine breakdowns and production interruptions have been significantly diminished, resulting in less
downtime and more savings.
Users can interact with much more accurate indicators for a collection of critical metrics such as:
• Production schedule adjustments
• Spare part requirements
• Energy costs
• Optimal asset utilization
Thanks to the operational efficiencies gained from the real-time, elastic, and scalable IIP solution, the
enterprise is launching an initiative to increase its fleet of unmanned trucks by 300%.
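The case study does not describe the mathematical model that was built in Apache Spark. Purely as an illustration of how a stream of sensor readings can be turned into an early-warning signal, here is a minimal Python sketch that swaps in a simple rolling z-score. The class name, window size, threshold, and sample readings are all hypothetical and are not taken from the IIP solution.

```python
from collections import deque
from statistics import mean, pstdev

class RollingAnomalyScorer:
    """Flags readings that deviate strongly from a sensor's recent history.

    Illustrative stand-in only; not the model used in the case study.
    """

    def __init__(self, window: int = 50, threshold: float = 3.0):
        self.history = deque(maxlen=window)   # most recent readings for one sensor
        self.threshold = threshold            # z-score above which a reading is flagged

    def score(self, value: float) -> float:
        """Return the z-score of value against the current window, then record it."""
        if len(self.history) < 2:
            z = 0.0                           # not enough history to judge yet
        else:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            z = 0.0 if sigma == 0 else abs(value - mu) / sigma
        self.history.append(value)
        return z

    def is_anomalous(self, value: float) -> bool:
        return self.score(value) > self.threshold

# Hypothetical tire-pressure readings (arbitrary units) for a single truck.
scorer = RollingAnomalyScorer(window=20, threshold=3.0)
readings = [100.0, 101.0, 99.5, 100.5, 100.2, 99.8, 100.1, 100.4, 99.9, 100.3, 80.0]
flags = [scorer.is_anomalous(r) for r in readings]
print(flags)
```

In a deployment like the one described, one such scorer per sensor would be fed by the message stream; only the sudden drop at the end of the sample is flagged.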
A major ATM manufacturer
Business context:
A major ATM manufacturer and service provider sought techniques to reduce maintenance costs while
offering higher reliability and improved customer service to its clients.
Solution highlights:
An IIP solution - hosted on a 10-node Amazon Web Services (AWS) cluster - was developed to ingest four
million records of ticketing data generated by 8,500 ATMs.
The entire data loading and cleansing process took 27 seconds, and the follow-on Apache Spark-based
logistic regression analysis with reliability predictions concluded in only 60 seconds.
The final results were then transmitted to the customer’s Oracle database, and presented through the
Tableau business intelligence solution.
Results:
The IIP solution was able to predict - with an 80% reliability rate - the likelihood of an ATM failing within
one week.
Using the outage predictions generated by the IIP solution, each technician is now able to conduct 4
service calls per day, which is a significant increase from the earlier average of 3.5 service calls per
technician per day.
Accurate failure predictions and the resulting optimized service calls have helped shave costs by 18%.
Meanwhile, chronic defects are now corrected rapidly - in hours rather than weeks.
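The service call figures in this case study line up with the 14% efficiency gain quoted in the earlier overview. A minimal Python check of that arithmetic, using only the two averages stated above:

```python
# Average service calls per technician per day, as stated in the case study.
calls_before = 3.5
calls_after = 4.0

increase = (calls_after - calls_before) / calls_before
print(f"Service call increase: {increase:.1%}")
```

Moving from 3.5 to 4 calls per day is an increase of about 14.3%.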
A multinational telecommunications enterprise
Business context:
A multinational telecommunications enterprise wanted to identify - and then correct - potential network
faults that could result in costly and inconvenient service disruptions.
Solution highlights:
Experts from Infosys used IIP to process and analyze more than 16 million records of ADSL connection
details such as attenuation/loss, code violations, upload/download rates, and re-initializations.
These computationally-intensive examinations resulted in two distinct “signatures”:
1. A profile produced by normal, non-fault activities (“Control signature”)
2. A profile that indicated an incipient fault (“Fault signature”)
Applying a statistical model to then compare these two signatures served as a reliable indicator of which
lines were candidates for a near-term outage.
Results:
The IIP-based solution identified three percent of the firm’s lines as being at-risk of a looming interruption
at some point in the subsequent three weeks.
Using these insights as a roadmap, the firm was able to get a head start on repairing the problematic lines
before trouble could develop.
This has resulted in reduced downtime and more optimally allocated maintenance resources.
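The statistical model used to compare the two signatures is not specified in the case study. As a rough illustration of the idea, the Python sketch below classifies a line by whether its profile sits closer to a control signature or a fault signature. The nearest-signature distance, the feature names, and every numeric value are assumptions made for this example.

```python
import math

# Hypothetical features, loosely based on the ADSL connection details named in the text.
FEATURES = ("attenuation_db", "code_violations", "reinitializations")

# Hypothetical per-feature averages; real signatures would be derived from the
# operational records rather than hard-coded.
CONTROL_SIGNATURE = {"attenuation_db": 30.0, "code_violations": 5.0, "reinitializations": 1.0}
FAULT_SIGNATURE = {"attenuation_db": 45.0, "code_violations": 60.0, "reinitializations": 9.0}

def distance(line: dict, signature: dict) -> float:
    """Euclidean distance, with each feature scaled by the gap between the two signatures."""
    total = 0.0
    for name in FEATURES:
        scale = abs(FAULT_SIGNATURE[name] - CONTROL_SIGNATURE[name]) or 1.0
        total += ((line[name] - signature[name]) / scale) ** 2
    return math.sqrt(total)

def at_risk(line: dict) -> bool:
    """True when the line's profile is closer to the fault signature than to the control one."""
    return distance(line, FAULT_SIGNATURE) < distance(line, CONTROL_SIGNATURE)

healthy_line = {"attenuation_db": 31.0, "code_violations": 8.0, "reinitializations": 1.0}
degrading_line = {"attenuation_db": 43.0, "code_violations": 50.0, "reinitializations": 7.0}
print(at_risk(healthy_line), at_risk(degrading_line))
```

Scaling each feature by the gap between the signatures keeps a single large-valued metric, such as code violations, from dominating the comparison.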
Real-time business operational visibility
ATP Tour
Business context:
ATP Tour - the governing body of men’s professional tennis - eagerly sought new, innovative techniques
to help commentators and fans get a better understanding of the fast-paced game.
Their mission was to go far beyond traditional statistics to uncover previously hidden insights.
Solution highlights:
Infosys loaded millions of data points into IIP, including umpire data for 12 months as well as five years of
data from the computerized Hawk-Eye ball tracking system used in the Barclays ATP World Tour Finals.
Requiring just two nodes with eight-core CPUs and 16GB of RAM for hardware, IIP concluded its analysis
in near real-time.
Results:
An enormous number and variety of statistics - and their impact on the game - are now available for fans.
Just a few examples of these metrics include:
• Shot speed
• Shot placement
• Point winning shots
• Fatigue indexes
• Serve analysis
ATP now offers this research to match commentators along with publication on ATPWorldTour.com for
the benefit of fans and journalists.
A global financial services institution
Business context: A global financial services institution carries out approximately six million trades per day.
Regulatory requirements dictate that certain trades must be reported in a specific format within a 15-minute
window.
There were numerous instances where the organization failed to make obligatory notifications within the
mandated timeframe. These delays resulted in non-compliance alerts and costly financial penalties.
Solution highlights: Apache Sqoop extracts trade details from an Oracle database and loads them into the Hadoop
Distributed File System (HDFS) residing on IIP. The data extraction process includes data cleansing,
validation, and derivation operations.
Nearly 600 million transactions were loaded into a 100 AWS cluster at a rate of 130,000 transactions per
second.
Upon completion of the relevant computations, the regulatory results are returned to Oracle, and various
analytic reports are available from Tableau.
Results: IIP completed the entire processing and reporting assignment within 35 seconds.
The enterprise now has an elastic and scalable strategy that eliminates violations and penalties, and can
easily support future growth.
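To make the 15-minute reporting rule concrete, the check below flags trades whose report arrived late or never arrived. It is only a sketch: the field names and sample trades are hypothetical, and the actual regulatory computation performed on IIP is not described in this document.

```python
from datetime import datetime, timedelta

REPORTING_WINDOW = timedelta(minutes=15)  # window stated in the case study


def late_trades(trades, window=REPORTING_WINDOW):
    """Return IDs of trades reported after the window, or not reported at all."""
    late = []
    for trade in trades:
        reported = trade["reported_at"]
        if reported is None or reported - trade["executed_at"] > window:
            late.append(trade["trade_id"])
    return late


# Made-up trades; the field names are hypothetical.
t0 = datetime(2016, 1, 4, 9, 30, 0)
trades = [
    {"trade_id": "T1", "executed_at": t0, "reported_at": t0 + timedelta(minutes=2)},
    {"trade_id": "T2", "executed_at": t0, "reported_at": t0 + timedelta(minutes=16)},
    {"trade_id": "T3", "executed_at": t0, "reported_at": None},
]
print(late_trades(trades))
```

A trade reported exactly 15 minutes after execution is treated as on time here; whether the real rule is inclusive at the boundary is an assumption.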
A world leader in agribusiness
Business context: To offer timely information to their user community, a large agribusiness concern aimed to
speed up data loads and report generation by deploying an inexpensive, cloud-based business intelligence
data warehouse acceleration solution.
Solution highlights: The entire information portfolio - consisting of 19 million records of master data, sales,
costs, and inventory - was loaded into an on-premises two-node IIP cluster.
It took only six minutes to transfer the complete data set, and a full suite of reports was generated in less
than 20 seconds.
These reports - which were presented in Tableau - provided guidance on sales performance, budget variances,
and geographic revenue summaries.
Results: Month-end processing is now completed in near real-time, and the data load task is 600 times faster
than in the previous solution.
Users are able to gain access to the reports they need 60 times more quickly than before.
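The speedup factors can be turned into rough before-and-after durations. The arithmetic below assumes that the six-minute load and the 20-second report time are the figures the 600x and 60x factors refer to, and that "N times faster" means the elapsed time was divided by N; the document does not state either point explicitly.

```python
def implied_previous_duration(current_seconds, speedup_factor):
    """Duration before the speedup, assuming 'N times faster' means time / N."""
    return current_seconds * speedup_factor


# Six-minute load, reported as 600 times faster than before.
load_before = implied_previous_duration(6 * 60, 600)
# Reports in under 20 seconds, reported as 60 times quicker than before.
report_before = implied_previous_duration(20, 60)

print(f"Implied previous load time: {load_before / 3600:.0f} hours")
print(f"Implied previous report time: up to {report_before / 60:.0f} minutes")
```

Under those assumptions the earlier load would have taken about 60 hours and the earlier reports up to about 20 minutes.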
A global electronics and imaging major
Business context: An international manufacturer of imaging and electronics technology such as printers,
copiers, and fax machines desired a fresh alternative to its data warehouse infrastructure.
Solution highlights: Infosys created a collection of ingestion models to load active and historical data related
to payments, credits, debits, and adjustments from multiple source systems - including a massive existing
data lake - directly into IIP.
The data model was optimized and harmonized, with summary tables created in Apache Hive.
Spark SQL and Tableau were assigned the task of presenting information to users.
As part of this migration and streamlining effort, Infosys was able to cut the number of views in half, and
offered extensive new visualization and extraction options to users.
Results: The essential job of loading information from source applications and data warehouses was improved
by 37%.
Report generation times were trimmed by 50%, and the business profited from far greater accuracy and reduced
variances using the new solution.
Augmenting revenue and profitability
A global automation major
Business context: In an effort to more effectively allocate marketing and sales resources, a large office
automation concern sought a reliable method to predict the likelihood of existing customers making
subsequent purchases.
Solution highlights: More than two million records of current customer details and monthly sales transactions
were retrieved from production systems and loaded into Hadoop.
Apache Spark was used to create near real-time, in-memory machine learning models to accurately identify
which customers within a given sales territory were likely to make a purchase.
Results were available for user consumption in Tableau within seven seconds.
Results: The ensuing reports were accurate in identifying those customers that were genuine candidates for
repeat purchases.
The business used this information to drive cross-sell and upsell efforts.
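A repeat-purchase model of this kind can be illustrated with a small classifier. The case study says the models were built with Apache Spark but does not name the algorithm; the sketch below swaps in a plain-Python logistic regression so that it runs without a cluster. The two features, the training rows, and the hyperparameters are all invented for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, x):
    """Estimated probability that the customer buys again."""
    z = sum(w * v for w, v in zip(weights, x)) + weights[-1]
    return sigmoid(z)


def train(rows, labels, epochs=2000, lr=0.5):
    """Fit logistic-regression weights with plain batch gradient descent."""
    weights = [0.0] * (len(rows[0]) + 1)  # last entry is the bias term
    for _ in range(epochs):
        grads = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            error = predict(weights, x) - y
            for i, value in enumerate(x):
                grads[i] += error * value
            grads[-1] += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grads)]
    return weights


# Hypothetical features per customer, both scaled to roughly 0..1:
# [orders in the last year / 10, months since last order / 12]
rows = [[0.8, 0.1], [0.6, 0.2], [0.7, 0.1], [0.1, 0.9], [0.2, 0.8], [0.1, 1.0]]
labels = [1, 1, 1, 0, 0, 0]  # 1 = made a repeat purchase

weights = train(rows, labels)
likely = predict(weights, [0.9, 0.1])    # frequent, recent buyer
unlikely = predict(weights, [0.1, 0.9])  # infrequent, lapsed buyer
print(f"likely buyer: {likely:.2f}, unlikely buyer: {unlikely:.2f}")
```

Ranking the customers in a territory by this probability gives a simple list of cross-sell and upsell candidates.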
A global pharmaceutical supplier
Business context: A global pharmaceutical manufacturer’s revenue was negatively impacted by delays in
determining back order specifics.
To overcome these obstacles, the organization sought to take advantage of high performance distributed
computing.
Solution highlights: A comprehensive information portfolio was loaded into IIP.
This data set consisted of fine-grained details about customers, orders, products, and manufacturing plant
availability.
Computations were performed in IIP, with the resulting guidance delivered in 10 seconds on a single node
server.
The results were then presented to users via Tableau.
Results: Management now has a much more accurate picture of potential backorder issues, and can take
corrective action long before problems impact revenue.
North American freight railroad network
Business context: A major North American freight railroad network searched for ways to eliminate an ongoing
series of needless braking events for its locomotives.
Solution highlights: Infosys loaded a diverse set of metrics into IIP running on an AWS cloud. These values -
which created a data set of hundreds of terabytes - included locomotive brake data, engineer characteristics,
wayside data streams, weather information, maintenance details, and signal data from the Positive Train
Control (PTC) system.
Using the R programming language with resulting visualizations presented in Tableau, Infosys performed a
series of investigations such as Pareto analysis of braking events, basic text mining of delay comments, and
locomotive delay prediction.
These inquiries demonstrated that erroneous signals, speed restrictions, and switch alignments were the
primary culprits in the unwanted braking occurrences.
Results: Applying the recommendations delivered by the IIP-based solution helped the railroad predict - and
then prevent - the factors that were causing the braking events.
A one-mile-per-hour increase in train velocity can yield $200 million of incremental revenue, so these
adjustments had a major impact on the enterprise’s bottom line.
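A Pareto analysis of braking events ranks causes by frequency and keeps the few that account for most occurrences. The sketch below uses Python rather than the R code mentioned in the case study, and the event counts are invented; only the three leading cause names come from the text.

```python
from collections import Counter


def pareto(events, threshold=0.8):
    """Return causes, most frequent first, until `threshold` of events is covered."""
    counts = Counter(events)
    total = sum(counts.values())
    vital_few, covered = [], 0
    for cause, count in counts.most_common():
        covered += count
        vital_few.append((cause, count, round(covered / total, 2)))
        if covered / total >= threshold:
            break
    return vital_few


# Made-up braking-event log: one cause label per event.
events = (
    ["erroneous signal"] * 45
    + ["speed restriction"] * 25
    + ["switch alignment"] * 15
    + ["weather"] * 10
    + ["other"] * 5
)

for cause, count, cumulative in pareto(events):
    print(f"{cause}: {count} events, cumulative share {cumulative}")
```

Each tuple holds the cause, its count, and the cumulative share of all events, so the cut-off point is visible in the output.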
The largest chocolate manufacturer in North America
Business context: This organization recognized that its inability to accurately determine which retailers were
lacking inventory was resulting in lost revenue and unhappy customers.
Solution highlights: 350 million rows of sales and inventory data were loaded into a five-node IIP instance
running on AWS.
Infosys developed a collection of statistical models that classified - within four minutes of processing - the
out-of-stock incidents and ascertained their root causes.
The resulting Tableau dashboard - which presented heat maps of details about stores, days, times, and items -
gave users the necessary insights to avoid product availability shortfalls.
Results: According to industry trade journal Retail Wire, out-of-stock incidents such as those experienced by
this organization account for approximately 3.2% of lost revenue.
By eliminating these events, the enterprise stands to gain more than $100 million of incremental sales.
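The heat maps mentioned in this case study are, at their core, counts of incidents per cell of a grid. The sketch below builds such a grid by store and weekday. The record layout is hypothetical, and treating a zero on-hand quantity as an out-of-stock incident is an assumption; the statistical models Infosys used to classify incidents are not described here.

```python
from collections import defaultdict


def out_of_stock_grid(observations):
    """Count out-of-stock incidents for each (store, weekday) cell."""
    grid = defaultdict(int)
    for obs in observations:
        if obs["on_hand"] == 0:  # assumed definition of an incident
            grid[(obs["store"], obs["weekday"])] += 1
    return dict(grid)


def hottest_cells(grid, top=3):
    """Return the cells with the most incidents, highest count first."""
    return sorted(grid.items(), key=lambda cell: (-cell[1], cell[0]))[:top]


# Made-up inventory observations.
observations = [
    {"store": "S1", "weekday": "Fri", "item": "bar-a", "on_hand": 0},
    {"store": "S1", "weekday": "Fri", "item": "bar-b", "on_hand": 0},
    {"store": "S1", "weekday": "Mon", "item": "bar-a", "on_hand": 12},
    {"store": "S2", "weekday": "Sat", "item": "bar-a", "on_hand": 0},
    {"store": "S2", "weekday": "Sat", "item": "bar-b", "on_hand": 4},
]
grid = out_of_stock_grid(observations)
print(hottest_cells(grid))
```

The same grouping could be keyed by time of day or item to produce the other views the dashboard presented.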
IIP Building Blocks
As illustrated in Figure 1, IIP encompasses three fine-tuned yet well-integrated layers:
• Layer 1: Flexible data management
• Layer 2: Insights development & analytics
• Layer 3: Insights-as-a-service
Figure 1
Layer 1: Flexible data management
IIP’s data management layer is a pre-configured, curated, and optimized collection of well-known,
industrial-grade open source technologies. When architecting IIP, Infosys carefully surveyed the market to
choose each component.
This tactic supplies all of open source’s advantages - such as cost, performance, transparency, and vendor
flexibility - while minimizing the drawbacks such as research, technology acquisition, and maintenance that
are prevalent when deploying open source.
Customers are also free to substitute their own already-implemented infrastructure for any of the bundled
technologies provided by Infosys.
Table 1 enumerates the extensive list of open source software that constitutes IIP layer 1.
Infosys has been an active participant in advancing the open source projects that make up
layer 1. A few instances of these contributions include:
• Data-level authorization on Spark views along with Hadoop Distributed File System (HDFS) tables accessible
via Spark
• Role-based access control on Spark views and HDFS tables accessible via Spark
• Auto-registration of Spark views on Apache Thrift server restart
• Registration of multi-table joins as views through Spark beeline
• Multi-threading in Sqoop
• Callback capabilities in Sqoop created to report execution statistics
Infosys has also developed its own sentiment analytics engine that offers text analytics models and
algorithms for meaningful indicators such as:
• Buzz
• Sentiment
• Affinity
• Opinion
• Network
• Influencer
Layer 2: Insights development & analytics
The second layer of the IIP architecture builds on the robust foundation of open source and customer-
supplied information processing components that form IIP’s base layer.
Infosys has developed a collection of far-reaching, value-added, enterprise-grade capabilities that assist
customers with essential tasks like:
• Installation
• Administration
• Data loading and modeling
• Performance
• Scalability
• Publication framework
• Security
Table 2 describes each of the items found in IIP’s second layer.
We continue to make major investments in IIP.
Upcoming capabilities will include:
• Rules engine integration
• Elasticsearch integration
• High availability and disaster recovery
• Web aggregators
• Archiving and aging
Layer 3: Insights-as-a-service
IIP is a potent combination of open source and Infosys-supplied supplemental software. It provides
customers with the technical prerequisites to build applications that fully exploit today’s information
landscape. Infosys stands behind IIP with a large pool of highly skilled specialists covering all aspects of
developing modern applications:
• Infrastructure management
• Functional expertise
• Technology acumen
• Business analysts
• Data scientists
Given our history of achievement, many clients also opt to take advantage of a group of related service
offerings such as:
• Integration and implementation customizations
• Custom data extractors and adaptors
• Client-specific data modeling and cleansing
• Client-specific data science and advanced analytics
• On-demand agile application development
Table 1: IIP Layer 1 components
Component: Purpose
Apache Hadoop: A popular framework and ecosystem that facilitates batch-oriented distributed processing of massive amounts of data.
Apache Hive: Infrastructure erected on top of Hadoop and the Hadoop Distributed File System (HDFS) to provide data warehouse capabilities such as querying, analysis, and summarization.
Apache Kafka: A message broker that streamlines and speeds the important job of ingesting real-time data feeds.
Apache OpenNLP: A machine learning toolkit intended for processing natural language text.
Apache Shiro: Security framework for Java applications that offers authentication, authorization, encryption, and session management.
Apache Spark: A cluster computing and processing framework, designed for very high throughput and performance, especially when incorporating machine-learning algorithms.
Apache Sqoop: Technology developed to transfer data between relational databases and Hadoop.
Apache YARN: A platform that manages computational assets that are aggregated in clusters and schedules applications on those resources.
Apache Zeppelin: Provides easily created, interactive data analytics using popular Big Data technology back-ends.
Apache ZooKeeper: Provides a naming registry for large distributed systems, along with keeping track of configuration and synchronizing information across the computing cluster.
Azkaban: Technology developed by LinkedIn to permit scheduling of batch Hadoop jobs.
Hibernate: A framework for mapping objects between Java and relational databases.
HIPI: Hadoop Image Processing Interface: a library targeted at very fast image processing using MapReduce computational patterns.
Java Development Kit (JDK): Complete application development infrastructure for the Java programming language.
Kerberos: Software that implements a network authentication protocol that makes it possible for nodes to securely communicate, regardless of whether the underlying network is secure or not.
MySQL: A widely adopted open source relational database, utilized as internal storage by the IIP platform’s Quartz scheduler.
Quartz: An open source Java library that permits job scheduling and workflow coordination directly from an application.
RStudio: A specialized programming language (“R”) and supporting development studio intended for developing data analysis and statistical applications.
RStudio Server: Enables a browser-based user interface to applications written in the R programming language that are running on a remote server.
Twitter4J: A software library that integrates Java applications with the Twitter API.
Table 2: IIP Layer 2 components
Component: Purpose
Administration workbench: Permits users to configure and manage workspaces and data sources.
Apache Ambari: Software meant to make the job of administering and managing Hadoop clusters less taxing.
Cluster maintenance: A single click installs all of the components in the IIP platform. Infosys engineers provide robust maintenance and support for ongoing open source upgrades.
Data explorer: A graphical user interface-based information modeling and query tool for designing joins, aggregation, and filtering. Offers drag-and-drop capabilities to quickly correlate multiple disparate information sources and data types. This sets the stage for uncovering insights while still insulating developers from the specifics of the underlying technologies. It renders its results via visualization tools such as Tableau and Qlik, as well as native HTML5. The Data Explorer integrates with external data science and analytics toolsets - such as the R programming language - via commonly accepted standards and protocols.
Data extractor: A configurable, extensible, and fault-tolerant workbench that provides a drag-and-drop user interface for ingesting data (initially and for subsequent updates) with near real-time performance. It’s adept at loading data from multiple data sources such as relational databases, data streams, message queues, NoSQL databases, social media, and log files. It’s able to digest CSV, XML, PDF, and JSON encoding formats.
Governance: Delivers complete metadata and view management via data ingestion and management workbenches.
In-memory analytics: IIP supplies a high performance, comprehensive collection of libraries and features to facilitate rapid data mining and modeling. They apply mathematical and statistical algorithms to uncover patterns in raw data. This supports fast data transformation to create joins and views for subsequent consumption.
Resource manager: All IIP-hosted applications can be launched, monitored, and administered from a single integrated Web-based user interface.
Security: IIP was built to incorporate robust security capabilities. First, it provides three levels of authentication: operating system, LDAP, and Kerberos. It also offers highly granular cell-based authorization and role-based access to information. Customers are free to specify fine-grained role-based access control:
• For the platform
• For all tables
• For all views
• For all fields within tables and views
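The four access-control scopes listed for the Security component (platform, tables, views, and fields) can be pictured as nested grants, where holding a broader scope implies access to everything inside it. The sketch below is a hypothetical model of that idea. The role names, resource paths, and `can_read` function are inventions for illustration and do not represent IIP's actual interface.

```python
# Grants map a role to the resources it may read. A resource is a tuple path:
# ("platform",), ("table", "sales"), ("view", "sales_summary"),
# or ("table", "sales", "margin") for a single field.
GRANTS = {
    "analyst": {("table", "sales"), ("view", "sales_summary")},
    "auditor": {("table", "sales", "order_id"), ("table", "sales", "amount")},
    "admin": {("platform",)},
}


def can_read(role, resource, grants=GRANTS):
    """Allow access if the role holds the resource or any enclosing scope."""
    held = grants.get(role, set())
    if ("platform",) in held:
        return True
    # Check the resource itself, then its parent table or view.
    return any(resource[:n] in held for n in range(len(resource), 1, -1))


print(can_read("analyst", ("table", "sales", "margin")))  # table grant covers its fields
print(can_read("auditor", ("table", "sales", "margin")))  # field not granted
print(can_read("admin", ("view", "anything")))            # platform grant covers all
```

Modeling resources as paths keeps the check small: a field-level grant is simply a longer path than a table-level one.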
Next Steps
Infosys offers a collection of helpful resources that provide more information about IIP:
1. To learn more about the platform, visit the IIP website.
2. Sign up for an IIP test drive.
3. Buy today on AWS Marketplace.
Beyond the test drive, Infosys provides the ability to completely host IIP using customer-supplied cloud
environments or on-premises hardware. This option consists of a fully configured, multi-node IIP solution
that’s designed to deliver real-time insights. For more information, write to us at askus@infosys.com.

  • 4. Abstract
Infosys has strengthened its incomparable proficiency in helping the world’s most advanced enterprises improve operations and run their businesses with a power-packed, technology-driven platform known as the Infosys Information Platform (IIP). IIP is an analytics platform that orchestrates open source software, value-added enhancements, and strong professional service expertise. Along with its single-click installer, data ingestion framework, and graphical data-modeling tool, IIP supplies a comprehensive array of adapters for diverse data sources as well as an easy way to create new connections when needed. Out-of-the-box integration with RStudio simplifies harnessing the power of clusters while modeling data. Significantly, all of IIP’s benefits can be realized without requiring extensive coding. These and other features help customers discover actionable insights and foresights by deriving meaning from abundant - and untapped - sources of information. This paper will help you understand IIP, its design philosophy, technical architecture, and success stories.
Introduction
So far, it has been strenuous and time-consuming to obtain insights from the enormous amounts of raw data from internal and external sources that flood enterprises every day. Even when information analysis has been successful, tangible insights that convert to business results have frequently been elusive.
Infosys Information Platform (IIP) brings together the right ingredients - technology, toolsets, and processes - to obtain insights in near real time from all kinds of data: historic or current, idle or ever-changing, structured or unstructured. Solutions developed using IIP deliver quick and meaningful business outcomes such as:
- Analytics-driven preventive maintenance and downtime reduction
- Real-time operational visibility
- Augmented revenue and profitability
Organizations that deploy IIP are able to realize these benefits while still protecting and rejuvenating existing IT technology investments.
We begin this paper by describing a few common solutions, and then we illustrate IIP’s architecture and distinct value proposition. We then depict how the unique challenges of today’s information landscape served as the rationale behind Infosys’ development of AiKiDo and the IIP solution that it incorporates.
The intended audience for this paper includes line-of-business executives, IT leadership, and anyone else interested in translating raw data into insights and guidance that the business can swiftly use to drive action.
  • 5. The Infosys Information Platform (IIP) in Action
By running existing workloads more efficiently while unearthing new opportunities, IIP makes it possible to leverage untapped data from numerous internal and external systems and sources to reveal insights and suggest quick courses of action. Market adoption of IIP has been impressive: within the first year of its existence, 200 customers have completed evaluations, with dozens now onboard. To help customers get up and running quickly, Infosys offers a preconfigured Data Analytics solution.
IIP, combined with rich professional expertise in roles such as business analyst, technology architect, data scientist, data engineer, and software developer, produces business solutions constructed on the platform. These solutions have delivered bottom-line benefits in dozens of engagements across industries. The achievements cover a broad range of applications, such as:
- Fraud analytics
- Predictive analytics for maintenance
- Digital shopper insights
- Customer churn analysis
- Risk exposure analytics
- Trade data analysis and regulatory reporting
- Real-time machine learning
- Working capital allocation optimization
Below are some examples of our solutions.
Analytics-driven preventative maintenance and downtime reduction
1. One of the world’s largest mining operators has placed nearly 200 sensors on each of its autonomous, unmanned vehicles. IIP’s real-time data analytics - ingesting and processing 27,000 messages per second - predicts which of these vehicles is about to fail. This guidance drives repairs before downtime can occur, and is a great illustration of a completely new type of application made possible by IIP.
2. A major ATM manufacturer and service provider turned to IIP to develop a fresh, innovative solution that analyzes four million records of alert and incident data from 8,500 machines in an effort to foresee which ATMs would fail within one week. The outcome included a downtime reduction of 10%, a 14% increase in service call efficiency, and an 18% cost reduction thanks to more accurate, productive client visits.
3. A multinational telecommunications services company had recognized that network faults were the biggest single cause of disruptions, but determining the time and location of impending failures was nearly impossible. Complex analytics on millions of operational records conducted using IIP helped unearth the fact that three percent of the company’s lines were at risk of having a fault sometime during the next three weeks. Armed with this knowledge, the organization immediately targeted the imperiled lines for repairs before the anticipated outages could occur.
  • 6. Real-time operational visibility
1. ATP Tour - the governing body of men’s professional tennis - wanted to add new color and depth to fans’ understanding of the game in an open and cost-effective way. Infosys loaded extensive historic data consisting of millions of data points from multiple systems into IIP. The results - which were available in near real-time - provided a comprehensive collection of in-depth, probability-led foresight on player performance to the media, game commentators, and the sport’s worldwide fan base.
2. Spurred by regulatory trade requirements, a leading global financial institution processing six million transactions per day employed near real-time analytics in IIP to slash report generation times from 10-15 minutes to 35 seconds. This is an example of better utilization of existing infrastructure brought about by IIP.
3. A world leader in agribusiness was facing application performance challenges in its management reporting solution, which was overwhelmed by large volumes of data. Infosys implemented a proof-of-concept using IIP to improve the performance of the reporting solution. During the exercise, IIP ingested 19 million records in just six minutes. This was a breakthrough compared to the existing platform, which took over an hour to ingest half a million records. IIP could complete report and dashboard navigation in under five seconds, while the existing platform took over a minute to do the same.
4. A global imaging and electronics manufacturer of printers, photocopiers, and fax machines sought to rejuvenate its Accounts Receivable reporting process. Along with ongoing, daily production details, Infosys migrated 24 months of historical information from the existing data warehouse into IIP. At the same time, the enterprise’s data models, views, and reports were all streamlined and optimized. Turnaround time for daily data integration was 37% faster, and report performance times were cut in half.
Augmented revenue and profitability
1. To help identify existing customers with a propensity to buy specific products and services, a major office automation vendor rolled out a new application that utilized IIP’s machine learning and in-memory computing capabilities to analyze more than two million records of previous purchases. The complete set of predictions was produced in seven seconds.
2. A global pharmaceutical supplier was hampered by the amount of time it took to identify backorders. Using IIP to consume SAP-generated order details, they created a new solution that analyzes the entire data set and identifies - and helps correct - supply shortfalls within 10 seconds.
3. A major North American freight railroad network was eager to come up with new tactics to reduce the quantity of unnecessary braking events generated by the Positive Train Control (PTC) system for its locomotives. These unanticipated incidents diminished the organization’s ability to adhere to its published operating schedules. Infosys used IIP and the R programming language to analyze an expansive set of operational metrics and then develop a delay event prediction model. The new approach helped the railroad adjust the PTC and significantly diminished the number of unnecessary braking occurrences.
  • 7. 4. The largest chocolate manufacturer in North America lacked a timely, systematic methodology for determining when its products were unavailable for purchase at retail locations. Since many consumers make their buying decisions impulsively, these frustrating inventory shortages resulted in lost revenue and diminished brand loyalty. IIP served as the computing platform for a collection of statistical models that helped to classify out-of-stock events and determine their root cause. Along with this analysis, the new solution also alerted the appropriate users to help prevent these costly episodes.
To learn more about these IIP accomplishments, please see the IIP Customer Success Grid that’s presented later, or visit http://www.infosys.com/information-platform/case-studies/Pages/index.aspx.
In the next section, we portray some of the modern information complexities that Infosys needed to overcome when constructing the solutions that we just illustrated. These dynamics also helped influence the design and development of the IIP platform itself.
  • 8. Information Management Realities
Regardless of industry, every IT organization must confront an assortment of common, uncomfortable truths about how data is created and utilized today. Each of these factors was an integral consideration when Infosys developed IIP.
There’s lots more data than ever
According to a study[1] published by EMC and IDC, from 2013 to 2020 the digital universe will grow by a factor of 10, from 4.4 trillion gigabytes to 44 trillion. That is roughly a doubling every two years.
Data has tremendous diversity
Previously, information was principally generated by enterprise applications using a standard relational structure that was easily catalogued and employed. Naturally, structured application information is still a big and meaningful portion of the overall IT portfolio, but data variety now encompasses unstructured sources such as:
- Social media feeds
- Machine logs
- Document scans
- OCR data
Data is generated by multiple external sources
There was a time when IT leadership could simply focus attention on its own application and data collection. That’s passé: IT must now have a plan to interact with, and react to, data created by innumerable outside sources.
Data arrives very quickly
These new information categories are typified by the speed at which they’re generated and distributed. For example, consider how rapidly a video clip, tweet, or Facebook post can go viral.
Cross-system data is generally uncorrelated
When the bulk of the enterprise’s data came from well-defined enterprise applications, it was relatively straightforward to understand and manage information interconnections. This is much more daunting today, since properly linking raw - and often unstructured - data from diverse sources takes a lot of work.
[1] The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things, EMC Digital Universe with Research & Analysis by IDC
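As a quick check on the growth figures quoted under "There’s lots more data than ever", the short Python sketch below converts the tenfold increase between 2013 and 2020 into a yearly growth factor and a doubling time. It assumes growth at a constant rate over the seven years, which is our simplification and not something the study states; the constant and function names are illustrative.

```python
import math

# Figures quoted in the text from the EMC/IDC study.
START_TRILLION_GB = 4.4    # size of the digital universe in 2013
END_TRILLION_GB = 44.0     # projected size in 2020
YEARS = 2020 - 2013

def annual_growth_factor(start, end, years):
    """Constant yearly multiplier that grows `start` into `end` over `years`."""
    return (end / start) ** (1.0 / years)

def doubling_time_years(start, end, years):
    """Years needed to double at that same constant rate."""
    return years * math.log(2) / math.log(end / start)

yearly = annual_growth_factor(START_TRILLION_GB, END_TRILLION_GB, YEARS)
two_year = yearly ** 2
doubling = doubling_time_years(START_TRILLION_GB, END_TRILLION_GB, YEARS)

print(f"growth per year:      x{yearly:.2f}")
print(f"growth per two years: x{two_year:.2f}")
print(f"doubling time:        {doubling:.1f} years")
```

Under that assumption the data volume grows by about 39% a year and doubles in a little over two years.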
  • 9. Barriers to Success
Smart enterprises appreciate the untapped value of the data that’s generated by their operations every day. Predictably, many organizations are making significant investments in hardware, software, and personnel in an effort to derive advantages from this idle information. Sadly, these technology expenditures have often failed to deliver on their promise. At a recent Gartner Big Data Industry Insights event[2], analyst Lisa Kart stated, "73 percent of enterprises are either investing or planning to invest in big data. However, of these, 65 percent are struggling with determining how to get value from big data."
The gap between anticipation and results is not surprising: endeavors premised on “making sense of big data” need careful thought and deliberate action, not simply the selection and purchase of technology. Also, apart from the usual data volume, velocity, and variety, there are many technology, skills, and process challenges that diminish the returns from IT expenditures. Let’s take a look.
Installation and integration of a modern data platform
Even where IT infrastructure, budgets, and competencies already exist, enterprises today demand advanced analytics capabilities, which require varied components working together seamlessly to derive insights. A modern information management platform has to be integrated with existing components while new big data components are installed alongside them, which is a well-known challenge.
Staffing shortfalls
There’s a dearth of talented individuals with expertise in data science and platforms like Hadoop. This means that key IT staff have additional, often perplexing responsibilities such as managing data diversity, creating an accurate, meaningful enterprise information model, and designing and developing the algorithms and applications necessary to turn potential returns into reality.
Administrative hassles
Since each constituent in a modern information management platform has its own lifecycle, administering these solutions is an ongoing, often highly convoluted process that entails:
- Keeping up with the rapid pace of open source technology advancements
- Controlling versions, for infrastructure as well as applications
- Managing across projects
- Configuring and managing enterprise-grade security
- Ensuring compliance with mandated internal standards as well as those from external regulatory agencies
While IT struggles to close the gaps between the promise and reality of the enterprise’s data, the user community is often disappointed at the pace - and eventual upshot - of these efforts.
[2] Gartner Webinar, Big Data Industry Insights, Lisa Kart, January 27, 2015
  • 10. First, there is a remarkable scarcity of easily deployed, user-friendly, and industry-specific applications that can make sense of - and offer meaningful recommendations about - the enterprise’s substantial information assets.
In the absence of packaged solutions, staff members naturally turn to self-service options - including analytics - as a mechanism for gaining the insights they crave. This often results in yet another instance where the promises portrayed by IT and software vendors diverge from the actual end-user experience.
Finally, even in those rare cases where all of the prerequisites are in place for an exceptional user experience, delays during the data ingestion process - for both structured and unstructured information - introduce unendurable lags in turning suggestions into action.
What can change
Enterprises that seek to profit from the modern data landscape in real time should look for technologies that:
- are adept at handling both structured and unstructured information
- increase sophistication by moving from old-style, basic relations to more far-reaching correlations
- future-proof investments by leveraging existing assets and letting the enterprise select best-of-breed modules
- make it easy to profit from technology advancements without causing unnecessary expenditures or downtime
- deliver rapid insights by moving away from the existing user request/IT response paradigm towards ‘self-discovery’
With these fundamentals in place, users are empowered with self-service capabilities that let them answer questions without needing to request assistance from the IT organization.
All of the best practices that we’ve just described have served as foundational guidelines for the Infosys Information Platform, which we describe next.
  • 11. IIP in Brief
As a global leader in consulting, technology, outsourcing and next-generation services, Infosys enables clients in more than 50 countries to stay a step ahead of emerging business trends and outperform the competition. An entrepreneurial adventure that began with seven engineers and $250, Infosys is now a publicly traded company (NYSE: INFY) driven by more than 179,000 relentless innovators and annual revenues of more than $8.7 billion. By focusing on seven core mega-trends, Infosys supplies enterprises in every industry with strategic insights and a framework to uncover opportunities for innovation-led growth.
Inspired by its unparalleled experience in supplying the precise combination of technology, people, and know-how to the world’s largest and most sophisticated organizations, Infosys has launched the AiKiDo initiative. AiKiDo is an integrated, coordinated, and strategic solution amalgamated from three distinct families of products and services:
- Ai: Platforms and Platforms as a Service
- Ki: Knowledge-based management and landscape evolution
- Do: Design thinking and design-led initiatives
AiKiDo’s overall mission is to help Infosys’ customers improve current operations - and thereby renew their existing technology and business process landscape - while driving innovation by uncovering formerly hidden opportunities to solve challenges.
IIP in the enterprise
As part of the Open Data aspect of the Ai platform set, Infosys has developed the Infosys Information Platform (IIP). It’s a complete solution that blends open source software, value-added components, a robust partner ecosystem, and deep professional service proficiency. It is an analytics platform that enables you to quickly glean insights from all types of data sources and use them for decision support across industries.
IIP offers all necessary open source components in one validated package that can be installed with a single click.
This saves a tremendous amount of time, and keeps staff focused on delivering value, rather than downloading, installing, and maintaining individual software bundles. All of IIP’s benefits can be attained without writing or manipulating any open source platform code, so enterprises can emphasize core applications and business outcomes without having to hire expensive open source experts. IIP is tailored to avoid vendor lock-in, and works with any licensed software or open source Hadoop distribution such as Cloudera or Hortonworks, or relevant components acquired directly from the related open source Apache project.
  • 12. An intuitive user interface abstracts away the nuances and complexities of the underlying platform’s open source technologies, while also eliminating the need to code for many common tasks, including data modeling. This greatly increases the productivity of scarce data engineers and data scientists. When coding is necessary, everything that IIP generates is 100% open source for ease of maintenance.
Enterprises are free to make substitutions without incurring any downtime or outages, keeping all options open when surveying the latest technology advancements. In fact, if they wish, they can replace IIP’s open source components with already-installed technologies: it’s entirely up to the enterprise to select the amount and type of open source software in its environment. Enterprises can also select any desired deployment model for IIP, including on-premise, public cloud, private cloud, and hybrid topologies.
Because it runs on commodity hardware and, being open source, carries no software license fees, IIP significantly reduces hardware, software, professional service, and operational outlays. Its total cost of ownership is therefore significantly lower than that of existing market offerings.
The most salient feature of IIP is that it is a secure platform that still takes full advantage of open source technologies. Encryption, authentication, data lineage, and cluster monitoring tools provide far-reaching security and management, along with compliance with data audit and governance mandates.
Infosys’ commitment to advancing the state of the art of modern information platforms extends far beyond simply providing a well-integrated solution to its customers. For instance, Infosys is a Platinum Sponsor of the Open Data Platform (ODP) initiative: the open ecosystem of Big Data. Infosys actively works with other industry leaders to promote and enhance Big Data technologies and open source projects such as Apache Hadoop. These advances include Infosys contributions to performance and security, which we will describe later in this paper. As another example, IIP has been certified as a test bed by the Industrial Internet Consortium, demonstrating its relevance in the rapidly evolving landscape of sensor data analytics and the Internet of Things (IoT).
IIP Customer Success Grid
The following section summarizes a few examples of customers that have profited from IIP’s speed, scalability, and open architecture. Each one captures the business challenge, a high-level solution summary, and the benefits. They are classified as:
1. Analytics-driven preventative maintenance & downtime reduction
2. Real-time business operational visibility
3. Augmenting revenue and profitability
  • 13. Analytics-driven preventative maintenance & downtime reduction
One of world’s largest mining companies
Business context: One of the world’s largest mining companies utilizes a fleet of autonomous, unmanned trucks. These vehicles operate in numerous locations globally, and a failure negatively impacts the entire supply chain.
Solution highlights: Each truck is equipped with nearly 200 sensors, which continually broadcast 400 data points of telemetry about the state of the vehicle (e.g. temperature, vibrations, and tire pressure) along with details about the terrain in which it’s currently operating. Apache Kafka was configured to stream 27,000 of these messages per second into IIP, where a mathematical model developed in Apache Spark computed maintenance requirements as well as the likelihood of an upcoming failure. A native HTML5 application presented a color-coded global map indicating the state of all vehicles, and permitting drill-down on any individual truck.
Results: Machine breakdowns and production interruptions have been significantly diminished, resulting in less downtime and more savings. Users can interact with much more accurate indicators for a collection of critical metrics such as:
- Production schedule adjustments
- Spare part requirements
- Energy costs
- Optimal asset utilization
Thanks to the operational efficiencies gained from the real-time, elastic, and scalable IIP solution, the enterprise is launching an initiative to increase its fleet of unmanned trucks by 300%.
  • 14. A major ATM manufacturer
Business context: A major ATM manufacturer and service provider sought techniques to reduce maintenance costs while offering higher reliability and improved customer service to its clients.
Solution highlights: An IIP solution - hosted on a 10-node Amazon Web Services (AWS) cluster - was developed to ingest four million records of ticketing data generated by 8,500 ATMs. The entire data loading and cleansing process took 27 seconds, and the follow-on Apache Spark-based logistic regression analysis with reliability predictions concluded in only 60 seconds. The final results were then transmitted to the customer’s Oracle database, and presented through the Tableau business intelligence solution.
Results: The IIP solution was able to predict - with an 80% reliability rate - the likelihood of an ATM failing within one week. Using the outage predictions generated by the IIP solution, each technician is now able to conduct 4 service calls per day, which is a significant increase from the earlier average of 3.5 service calls per technician per day. Accurate failure predictions and the resulting optimized service calls have helped shave costs by 18%. Meanwhile, chronic defects are now corrected rapidly - in hours rather than weeks.
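The service-call figures in the results above can be related to the 14% efficiency gain quoted for this customer earlier in the paper. The following Python sketch does that arithmetic; the function and constant names are illustrative and not part of IIP.

```python
# Service calls per technician per day, as quoted in the case study.
CALLS_BEFORE = 3.5
CALLS_AFTER = 4.0

def percent_increase(before, after):
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0

increase = percent_increase(CALLS_BEFORE, CALLS_AFTER)
print(f"service calls per technician per day: +{increase:.1f}%")
```

Moving from 3.5 to 4 calls per day is a gain of about 14.3%, consistent with the 14% figure.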
  • 15. A multinational telecommunications enterprise
Business context: A multinational telecommunications enterprise wanted to identify - and then correct - potential network faults that could result in costly and inconvenient service disruptions.
Solution highlights: Experts from Infosys used IIP to process and analyze more than 16 million records of ADSL connection details such as attenuation/loss, code violations, upload/download rates, and re-initializations. These computationally-intensive examinations resulted in two distinct “signatures”:
1. A profile produced by normal, non-fault activities (“Control signature”)
2. A profile that indicated an incipient fault (“Fault signature”)
Applying a statistical model to then compare these two signatures served as a reliable indicator of which lines were candidates for a near-term outage.
Results: The IIP-based solution identified three percent of the firm’s lines as being at-risk of a looming interruption at some point in the subsequent three weeks. Using these insights as a roadmap, the firm was able to get a head start on repairing the problematic lines before trouble could develop. This has resulted in reduced downtime and more optimally allocated maintenance resources.
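The paper does not say which statistical model compared the control and fault signatures. As one minimal sketch of the idea, the Python example below builds each signature as the per-feature mean of a group of lines and flags a line as at-risk when it sits closer to the fault signature. The nearest-mean rule, the three feature names, and every number in it are our assumptions, invented purely for illustration.

```python
from math import sqrt

def signature(rows):
    """Per-feature mean across a group of lines (a simple 'profile')."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(line, control_sig, fault_sig):
    """Label a line by whichever signature it is closer to."""
    if distance(line, fault_sig) < distance(line, control_sig):
        return "at-risk"
    return "normal"

# Hypothetical toy data: [attenuation_db, code_violations, reinitializations]
control_rows = [[20.0, 1.0, 0.0], [22.0, 0.0, 1.0], [21.0, 2.0, 0.0]]
fault_rows = [[35.0, 40.0, 6.0], [33.0, 55.0, 8.0], [37.0, 45.0, 7.0]]

control_sig = signature(control_rows)
fault_sig = signature(fault_rows)

print(classify([34.0, 50.0, 5.0], control_sig, fault_sig))  # at-risk
print(classify([21.5, 3.0, 0.0], control_sig, fault_sig))   # normal
```

A production model would at least scale the features so that no single metric dominates the distance, and would be validated against lines that actually failed.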
  • 16. Real-time business operational visibility
ATP Tour
Business context: ATP Tour - the governing body of men’s professional tennis - eagerly sought new, innovative techniques to help commentators and fans get a better understanding of the fast-paced game. Their mission was to go far beyond traditional statistics to uncover previously hidden insights.
Solution highlights: Infosys loaded millions of data points into IIP, including umpire data for 12 months as well as five years of data from the computerized Hawk-Eye ball tracking system used in the Barclays ATP World Tour Finals. Requiring just two nodes of eight core CPUs and 16GB of RAM for hardware, IIP concluded its analysis in near real-time.
Results: An enormous number and variety of statistics - and their impact on the game - are now available for fans. Just a few examples of these metrics include:
- Shot speed
- Shot placement
- Point winning shots
- Fatigue indexes
- Serve analysis
ATP now offers this research to match commentators along with publication on ATPWorldTour.com for the benefit of fans and journalists.
  • 17. A global financial services institution
Business context: A global financial services institution carries out approximately six million trades per day. Regulatory requirements dictate that certain trades must be reported in a specific format within a 15-minute window. There were numerous instances where the organization failed to make obligatory notifications within the mandated timeframe. These delays resulted in non-compliance alerts and costly financial penalties.
Solution highlights: Apache Sqoop extracts trade details from an Oracle database and loads them into the Hadoop Distributed File System (HDFS) residing on IIP. The data extraction process includes data cleansing, validation, and derivation operations. Nearly 600 million transactions were loaded into a 100-node AWS cluster at a rate of 130,000 transactions per second. Upon completion of the relevant computations, the regulatory results are returned to Oracle, and various analytic reports are available from Tableau.
Results: IIP completed the entire processing and reporting assignment within 35 seconds. The enterprise now has an elastic and scalable strategy that eliminates violations and penalties, and can easily support future growth.
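To put the bulk-load figures above in perspective, the Python sketch below estimates how long the historical load would run at the quoted rate and how many trading days it represents. It assumes the 130,000 transactions per second rate was sustained for the whole load and that daily volume was a steady six million trades; both are simplifications on our part, and the names are illustrative.

```python
# Figures quoted in the case study.
TRANSACTIONS_LOADED = 600_000_000   # "nearly 600 million transactions"
LOAD_RATE_PER_SEC = 130_000         # quoted load rate
TRADES_PER_DAY = 6_000_000          # "approximately six million trades per day"

def load_duration_seconds(records, rate_per_sec):
    """Time to load `records` at a constant `rate_per_sec`."""
    return records / rate_per_sec

load_seconds = load_duration_seconds(TRANSACTIONS_LOADED, LOAD_RATE_PER_SEC)
days_of_history = TRANSACTIONS_LOADED / TRADES_PER_DAY

print(f"bulk load time:  {load_seconds / 60:.0f} minutes")
print(f"history covered: {days_of_history:.0f} trading days")
```

On these assumptions the bulk load takes roughly 77 minutes and covers about 100 days of trading. This is a separate activity from the 35-second processing and reporting run described in the results.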
  • 18. A world leader in agribusiness
Business context: To offer timely information to their user community, a large agribusiness concern aimed to speed up data loads and report generation by deploying an inexpensive, cloud-based business intelligence data warehouse acceleration solution.
Solution highlights: The entire information portfolio - consisting of 19 million records of master data, sales, costs, and inventory - was loaded into an on-premise two-node IIP cluster. It took only six minutes to transfer the complete data set, and a full suite of reports was generated in less than 20 seconds. These reports - which were presented in Tableau - provided guidance on sales performance, budget variances, and geographic revenue summaries.
Results: Month-end processing is now completed in near real-time, and the data load task is 600 times faster than in the previous solution. Users are able to gain access to the reports they need 60 times more quickly than before.
  • 19. A global electronics and imaging major
Business context: An international manufacturer of imaging and electronics technology such as printers, copiers, and fax machines desired a fresh alternative to its data warehouse infrastructure.
Solution highlights: Infosys created a collection of ingestion models to load active and historical data related to payments, credits, debits, and adjustments from multiple source systems - including a massive existing data lake - directly into IIP. The data model was optimized and harmonized, with summary tables created in Apache Hive. Spark SQL and Tableau were assigned the task of presenting information to users. As part of this migration and streamlining effort, Infosys was able to cut the number of views in half, and offered extensive new visualization and extraction options to users.
Results: The essential job of loading information from source applications and data warehouses was improved by 37%. Report generation times were trimmed by 50%, and the business profited from far greater accuracy and reduced variances using the new solution.
  • 20. Augmenting revenue and profitability
    A global automation major
    Business context: In an effort to more effectively allocate marketing and sales resources, a large office automation concern sought a reliable method to predict the likelihood of existing customers making subsequent purchases.
    Solution highlights: More than two million records of current customer details and monthly sales transactions were retrieved from production systems and loaded into Hadoop. Apache Spark was used to create near real-time, in-memory machine learning models to accurately identify which customers within a given sales territory were likely to make a purchase. Results were available for user consumption in Tableau within seven seconds.
    Results: The ensuing reports were accurate in identifying those customers that were genuine candidates for repeat purchases. The business used this information to drive cross-sell and upsell efforts.
  • 21. A global pharmaceutical supplier
    Business context: A global pharmaceutical manufacturer’s revenue was negatively impacted by delays in determining back order specifics. To overcome these obstacles, the organization sought to take advantage of high-performance distributed computing.
    Solution highlights: A comprehensive information portfolio was loaded into IIP. This data set consisted of fine-grained details about customers, orders, products, and manufacturing plant availability. Computations were performed in IIP, with the resulting guidance delivered in 10 seconds on a single-node server. The results were then presented to users via Tableau.
    Results: Management now has a much more accurate picture of potential back order issues, and can take corrective action long before problems impact revenue.
  • 22. North American freight railroad network
    Business context: A major North American freight railroad network searched for ways to eliminate an ongoing series of needless braking events for its locomotives.
    Solution highlights: Infosys loaded a diverse set of metrics into IIP running on AWS. These values, which created a data set of hundreds of terabytes, included locomotive brake data, engineer characteristics, wayside data streams, weather information, maintenance details, and signal data from the Positive Train Control (PTC) system. Using the R programming language, with the resulting visualizations presented in Tableau, Infosys performed a series of investigations such as Pareto analysis of braking events, basic text mining of delay comments, and locomotive delay prediction.
    Results: These inquiries demonstrated that erroneous signals, speed restrictions, and switch alignments were the primary culprits in the unwanted braking occurrences. Applying the recommendations delivered by the IIP-based solution helped the railroad predict, and then prevent, the factors that were causing the braking events. A one-mile-per-hour increase in train velocity can yield $200 million of incremental revenue, so these adjustments had a major impact on the enterprise’s bottom line.
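    The Pareto analysis mentioned in this case study can be sketched briefly. The project itself used R; the Python below is only an illustration of the technique, and the event counts, cause labels, and the function names `pareto_table` and `vital_few` are invented for the example rather than taken from the railroad's data.

```python
from collections import Counter


def pareto_table(events):
    """Return (cause, count, cumulative_share) rows, largest cause first."""
    counts = Counter(events)
    total = sum(counts.values())
    rows = []
    running = 0
    for cause, count in counts.most_common():
        running += count
        rows.append((cause, count, running / total))
    return rows


def vital_few(rows, threshold=0.8):
    """Return the leading causes needed to reach the cumulative threshold."""
    selected = []
    for cause, _count, cumulative in rows:
        selected.append(cause)
        if cumulative >= threshold:
            break
    return selected


# Hypothetical braking-event log: one cause label per event (made-up counts).
braking_events = (
    ["erroneous signal"] * 45
    + ["speed restriction"] * 25
    + ["switch alignment"] * 15
    + ["weather"] * 10
    + ["other"] * 5
)

rows = pareto_table(braking_events)
for cause, count, cumulative in rows:
    print(f"{cause:20s} {count:4d} {cumulative:6.0%}")
print("Focus on:", vital_few(rows))
```

    Sorting causes by frequency and tracking the cumulative share is what lets an analyst see whether a handful of causes dominate, which is the kind of finding the case study reports.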
  • 23. The largest chocolate manufacturer in North America
    Business context: This organization recognized that its inability to accurately determine which retailers were lacking inventory was resulting in lost revenue and unhappy customers.
    Solution highlights: 350 million rows of sales and inventory data were loaded into a five-node IIP instance running on AWS. Infosys developed a collection of statistical models that classified the out-of-stock incidents and ascertained their root causes within four minutes of processing. The resulting Tableau dashboard, which presented heat maps of details about stores, days, times, and items, gave users the necessary insights to avoid product availability shortfalls.
    Results: According to industry trade journal Retail Wire, out-of-stock incidents such as those experienced by this organization account for approximately 3.2% of lost revenue. By eliminating these events, the enterprise stands to gain more than $100 million of incremental sales.
  • 24. IIP Building Blocks
    As illustrated in Figure 1, IIP encompasses three fine-tuned yet well-integrated layers:
    - Layer 1: Flexible data management
    - Layer 2: Insights development & analytics
    - Layer 3: Insights-as-a-service
    Figure 1
  • 25. Layer 1: Flexible data management
    IIP’s data management layer is a pre-configured, curated, and optimized collection of well-known, industrial-grade open source technologies. When architecting IIP, Infosys carefully surveyed the market to choose each component. This approach supplies all of open source’s advantages (such as cost, performance, transparency, and vendor flexibility) while minimizing the drawbacks, such as research, technology acquisition, and maintenance, that are prevalent when deploying open source. Customers are also free to substitute their own already-implemented infrastructure for any of the bundled technologies provided by Infosys. Table 1 enumerates the extensive list of open source software that constitutes IIP layer 1.
  • 26. Infosys has been an active participant in advancing the open source projects that make up layer 1. A few instances of these contributions include:
    - Data-level authorization on Spark views, along with Hadoop Distributed File System (HDFS) tables accessible via Spark
    - Role-based access control on Spark views and HDFS tables accessible via Spark
    - Auto-registration of Spark views on Apache Thrift server restart
    - Registration of multi-table joins as views through Spark Beeline
    - Multi-threading in Sqoop
    - Callback capabilities in Sqoop, created to report execution statistics
    Infosys has also developed its own sentiment analytics engine that offers text analytics models and algorithms for meaningful indicators such as:
    - Buzz
    - Sentiment
    - Affinity
    - Opinion
    - Network
    - Influencer
    Layer 2: Insights development & analytics
    The second layer of the IIP architecture builds on the robust foundation of open source and customer-supplied information processing components that form IIP’s base layer. Infosys has developed a collection of far-reaching, value-added, enterprise-grade capabilities that assist customers with essential tasks like:
    - Installation
    - Administration
    - Data loading and modeling
    - Performance
    - Scalability
    - Publication framework
    - Security
    Table 2 describes each of the items found in IIP’s second layer.
    We continue to make major investments in IIP. Upcoming capabilities will include:
    - Rules engine integration
    - Elasticsearch integration
    - High availability and disaster recovery
    - Web aggregators
    - Archiving and aging
  • 27. Layer 3: Insights-as-a-service
    IIP is a potent combination of open source and Infosys-supplied supplemental software. It provides customers with the technical prerequisites to build applications that fully exploit today’s information landscape. Infosys stands behind IIP with a large pool of highly skilled specialists covering all aspects of developing modern applications:
    - Infrastructure management
    - Functional expertise
    - Technology acumen
    - Business analysts
    - Data scientists
    Given our history of achievement, many clients also opt to take advantage of a group of related service offerings such as:
    - Integration and implementation customizations
    - Custom data extractors and adaptors
    - Client-specific data modeling and cleansing
    - Client-specific data science and advanced analytics
    - On-demand agile application development
  • 28. Table 1: IIP Layer 1 components
    Component | Purpose
    Apache Hadoop | A popular framework and ecosystem that facilitates batch-oriented distributed processing of massive amounts of data
    Apache Hive | Infrastructure built on top of Hadoop and the Hadoop Distributed File System (HDFS) to provide data warehouse capabilities such as querying, analysis, and summarization
    Apache Kafka | A message broker that streamlines and speeds the important job of ingesting real-time data feeds
    Apache OpenNLP | A machine learning toolkit intended for processing natural language text
    Apache Shiro | Security framework for Java applications that offers authentication, authorization, encryption, and session management
    Apache Spark | A cluster computing and processing framework, designed for very high throughput and performance, especially when incorporating machine learning algorithms
    Apache Sqoop | Technology developed to transfer data between relational databases and Hadoop
  • 29. Component | Purpose
    Apache YARN | A platform that manages computational assets that are aggregated in clusters and schedules applications on those resources
    Apache Zeppelin | Provides easily created, interactive data analytics using popular Big Data technology back ends
    Apache ZooKeeper | Provides a naming registry for large distributed systems, along with keeping track of configuration and synchronizing information across the computing cluster
    Azkaban | Technology developed by LinkedIn to permit scheduling of batch Hadoop jobs
    Hibernate | A framework for mapping objects between Java and relational databases
    HIPI | Hadoop Image Processing Interface: a library targeted at very fast image processing using MapReduce computational patterns
  • 30. Java Development Kit (JDK) | Complete application development infrastructure for the Java programming language
    Kerberos | Software that implements a network authentication protocol that makes it possible for nodes to communicate securely, regardless of whether the underlying network is secure
    MySQL | A widely adopted open source relational database, used as internal storage by the IIP platform’s Quartz scheduler
    Quartz | An open source Java library that permits job scheduling and workflow coordination directly from an application
    RStudio | A development studio for the R programming language, which is intended for developing data analysis and statistical applications
    RStudio Server | Enables a browser-based user interface to applications written in the R programming language that are running on a remote server
    Twitter4J | A software library that integrates Java applications with the Twitter API
  • 31. Table 2: IIP Layer 2 components
    Component | Purpose
    Administration workbench | Permits users to configure and manage workspaces and data sources.
    Apache Ambari | Software meant to make the job of administering and managing Hadoop clusters less taxing.
    Cluster maintenance | A single click installs all of the components in the IIP platform. Infosys engineers provide robust maintenance and support for ongoing open source upgrades.
    Data explorer | A graphical user interface-based information modeling and query tool for designing joins, aggregation, and filtering. Offers drag-and-drop capabilities to quickly correlate multiple disparate information sources and data types. This sets the stage for uncovering insights while still insulating developers from the specifics of the underlying technologies. It renders its results via visualization tools such as Tableau and Qlik, as well as native HTML5. The Data Explorer integrates with external data science and analytics toolsets, such as the R programming language, via commonly accepted standards and protocols.
  • 32. Component | Purpose
    Data extractor | A configurable, extensible, and fault-tolerant workbench that provides a drag-and-drop user interface for ingesting data (initially and for subsequent updates) with near real-time performance. It’s adept at loading data from multiple data sources such as relational databases, data streams, message queues, NoSQL databases, social media, and log files. It’s able to digest CSV, XML, PDF, and JSON encoding formats.
    Governance | Delivers complete metadata and view management via data ingestion and management workbenches.
    In-memory analytics | IIP supplies a high-performance, comprehensive collection of libraries and features to facilitate rapid data mining and modeling. They apply mathematical and statistical algorithms to uncover patterns in raw data. This supports fast data transformation to create joins and views for subsequent consumption.
    Resource manager | All IIP-hosted applications can be launched, monitored, and administered from a single integrated Web-based user interface.
  • 33. Component | Purpose
    Security | IIP was built to incorporate robust security capabilities. First, it provides three levels of authentication: operating system, LDAP, and Kerberos. It also offers highly granular cell-based authorization and role-based access to information. Customers are free to specify fine-grained role-based access control:
    - For the platform
    - For all tables
    - For all views
    - For all fields within tables and views
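    The four levels of role-based access control listed above (platform, table, view, field) can be pictured with a small sketch. This is not IIP's implementation or API; the `Grant` and `Role` classes, the `is_allowed` function, and the role and object names are all hypothetical, chosen only to show how a grant at a broader level could cover requests at narrower ones.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Grant:
    """One permission. Scope is 'platform', 'table', 'view', or 'field'."""
    scope: str
    target: str = "*"  # object name; for fields, "object.field"


@dataclass
class Role:
    name: str
    grants: set = field(default_factory=set)


def is_allowed(role, kind, obj, field_name):
    """Check a read of obj.field_name, where kind is 'table' or 'view'.

    Access is granted if the role holds a platform-wide grant, a grant on
    the whole table or view, or a grant on the specific field.
    """
    candidates = {
        Grant("platform"),
        Grant(kind, obj),
        Grant("field", f"{obj}.{field_name}"),
    }
    return bool(role.grants & candidates)


# Hypothetical roles for illustration only.
admin = Role("admin", {Grant("platform")})
analyst = Role(
    "analyst",
    {Grant("view", "sales_summary"), Grant("field", "customers.region")},
)

print(is_allowed(admin, "table", "orders", "amount"))
print(is_allowed(analyst, "view", "sales_summary", "total"))
print(is_allowed(analyst, "table", "customers", "email"))
```

    Checking from the broadest scope down to the individual field keeps the rule set small: an administrator needs a single platform grant, while a restricted role lists only the views or fields it may read.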
  • 34. Next Steps
    Infosys offers a collection of helpful resources that provide more information about IIP:
    1. To learn more about the platform, visit the IIP website.
    2. Sign up for an IIP test drive.
    3. Buy today on AWS Marketplace.
    Beyond the test drive, Infosys provides the ability to completely host IIP using customer-supplied cloud environments or on-premise hardware. This option consists of a fully configured, multi-node IIP solution that’s designed to deliver real-time insights.
    For more information, write to us at askus@infosys.com