Big Transaction Data (BTD) provides a new approach to big data analytics by capturing entire business transactions and associated context such as users, locations, behaviors, and outcomes. This approach addresses the primary challenge with big data today, which is understanding and preparing disparate data sources. With BTD, data is stored and linked in a single, normalized stream so business analysts can directly experiment with and gain insights from the data in hours rather than months. BTD can significantly reduce the time and resources spent understanding and preparing data for analytics.
2. www.optier.com
Agenda
What is Big Data?
What is Big Transaction Data (BTD)?
Why does BTD reinvent the economics and
approach of Data Analytics & Big Data?
Real World Processes
[Diagram: Previous Orders, Customer Loyalty, Rebates and Offers, Finance, Tax Calculation, Final Order Value]
Real World Data Stored in Silos
[Diagram: Previous Orders, Customer Loyalty, Rebates and Offers, Finance, Final Order Value, Tax Calculation]
• Old system with no useful records
• Transactions are recorded only by customer loyalty number
• Only aggregate data: totals
• Transactions only indexed by non-synchronized clock time
• Terabytes of data per hour
• Data is scattered across 20,000 machines
This drives significant inefficiencies and costs when analyzing data using traditional methods & also with Big Data
The Challenge
• What technology can be virtualized or put in the cloud?
• Which alternative IT investment will have the largest impact on revenue?
• Which technology can I decommission with minimal impact on business?
• What improvements in IT performance will have the greatest impact on reducing churn of high value customers?
Current Approach to Big Data
Big Data Process – “4 Steps”
Analytics
The primary issue with Big Data today is not the large volumes; it is the need to
clean, normalize, and infer data relationships through statistical modeling
• $$$$ – Involves multiple technologies, large number of IT
resources, and huge computing power/storage
• Complex – Requires manual data mapping, application
knowledge, and data relationships determined through
advanced statistical analysis
• Slow – Answering a business question can take months!
Gartner Agrees
The majority of IT time and money is spent trying to make sense of and prepare data
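[Chart: share of effort across the six phases of the analytic process, in order: 5-10%, 10-15%, 30-60% (data preparation), 20-30%, 20-30%, 5-10%]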
Source: Gartner (March 2012) – Typical analytic process using CRISP-DM, the cross-industry
standard process for data mining methodology
Overcoming this Challenge
Create a Business Transaction
[Diagram: Web, Mobile, and Device channels feeding three layers: Application and Business Outcomes, User and Session, Infrastructure Performance]
• Create transactions across tiers in the context of a business process, user, location, etc.
• Normalized, linked, real-time data set.
• Addresses the primary challenge and largest effort for consumers of big data today: finding the relationships or links between various data sets.
What is Big Transaction Data?
Big Transaction Data Process – “3 Steps”
Analytics
• User information and behavior
• End to end transaction topology
• Entity relationships and payload
• Session information
• Performance of each involved IT component
• Business outcome

• Business analysts experiment directly with the data
• Data from other sources, such as CRM and inventory, can be easily added

• Data visualization solution
• Dashboards

It takes hours or days, not weeks or months, to get answers to your questions!
• Big Data – Big Data is the process of storing large amounts of data, performing analytics on those data, and providing real business benefit.
• Transaction – Refers to any business process that is supported by applications or systems.
• Context – The “who,” “what,” and “where” information associated with a transaction.
BTD’s Effect on Gartner’s Report
• Almost eliminates the dependence on IT for analytics
• Can remove all steps associated with data understanding and preparation
• All time can be focused on understanding problems, modeling, and evaluating results
[Chart repeated from the Gartner slide: 5-10%, 10-15%, 30-60%, 20-30%, 20-30%, 5-10%]
Elements of a Transaction
A business transaction is any interaction between an end-user and one or many applications
BIG TRANSACTION DATA
The contextual relevance most valuable to 99% of business questions is rooted in the following:
USERS & CUSTOMERS
- User
- Location
- Device
- Browser
- Connection Speed
BEHAVIOURS & ACTIONS
- Transaction type
- User actions
- Visits
- Result
CUSTOM BUSINESS DATA (Unique to Application)
- Trade Value
- Search Criteria
- Funds Transfer Amount
SERVICE PROVIDED
- System Response Time
- Reliability (errors)
- IT Topology
BUSINESS OUTCOMES (3rd Party Data)
- Call center volume
- IT costs
- Revenue transactions
Example Use Case
A simple illustrative example deals with the IT investment that will have the biggest positive impact on revenue
USERS & CUSTOMERS
- User
- Location
- Device
- Browser
- Connection Speed
BEHAVIOURS & ACTIONS
- Transaction type
- User actions
- Visits
- Result
INTERNAL BUSINESS DATA
- Shopping Cart Value
- Search Criteria
- Fund Transfer Amount
SERVICE PROVIDED
- Front End Response Time
- Back End Response Time
- Reliability
- IT Topology
EXTERNAL DATA
- Call center volume
- IT costs
- Revenue transactions
- Marketing Info
General Question
Where should I invest IT budget to have the biggest impact on revenue?
E-Retailer Specific Example
What is the infrastructure whose performance impacts customer purchase decisions the most?
Example Use Case – Continued
The analysis clearly shows that improving performance of the product catalog will
have the biggest impact on revenue
[Charts: Product Catalog DB – Strong Correlation; Authentication Service – No Correlation]
Doing this analysis with BTD is easy!
Step 1: Aggregate transaction stream into a clean historic data set.
(Each visit, performance during that visit, the sales outcome)
Step 2: Explore historic data set by aggregating the visits into a report summary.
(Tier performance by average revenue per visit)
Conclusion
The primary issue with Big Data today is not the large volumes; it is the need to clean, normalize, and infer data relationships through statistical modeling.
Big Transaction Data will:
• Save a single stream of Big Data, with context, generated from transactions at the time of execution
• Store this contextual data as a normalized, linked, real-time data set
• Take analytics out of the hands of IT and put it in the hands of the people who understand the business
• Allow business questions to be answered in a matter of hours vs. months with traditional methods
Who we are…
Business Transactions Are Our DNA
Application and Business Outcomes
User and Session
Infrastructure Performance
• Create over 3 Billion Business Transactions a Month
• Current client base includes:
  - World’s largest transaction processor company
  - 50% of Fortune 50 financial institutions
  - Industry’s #1 retail mortgage lender
  - World’s largest mutual and property casualty insurance company
  - Largest SaaS company
• Historically analyzed Business Transactions for APM
• Rebranding Offering to Include BTD
Editor's Notes
Today I will explain what is meant by the term Big Data. I will introduce you to the term Big Transaction Data (BTD) and discuss the economics and approaches to Big Data and Big Transaction Data.
As a catch-all term, "big data" can be pretty nebulous, in the same way that the term "cloud" covers diverse technologies. Input data to big data systems could be chatter from social networks, web server logs, traffic flow sensors, satellite imagery, broadcast audio streams, banking transactions, MP3s of rock music, the content of web pages, scans of government documents, GPS trails, telemetry from automobiles, financial market data; the list goes on. The value of big data to an organization falls into two categories: analytical use and enabling new products. Big data analytics can reveal insights previously hidden by data too costly to process, such as peer influence among customers, revealed by analyzing shoppers' transactions, social and geographical data. Being able to process every item of data in reasonable time removes the troublesome need for sampling and promotes an investigative approach to data, in contrast to the somewhat static nature of running predetermined reports.
The past decade's successful web startups are prime examples of big data used as an enabler of new products and services. For example, by combining a large number of signals from a user's actions and those of their friends, Facebook has been able to craft a highly personalized user experience and create a new kind of advertising business. It's no coincidence that the lion's share of ideas and tools underpinning big data have emerged from Google, Yahoo, Amazon and Facebook. Big Data promises to provide competitive advantage, better customer understanding and revenue growth.
Where does the data in Big Data come from in the real world? Well, as you can see in this real world architecture, it comes from many machines and systems.
In the real world, there are multiple processes initiated from these systems. Each of these processes can initiate multiple downstream processes as well. There are synchronous, asynchronous, inbound and outbound calls made across various platforms including mainframe, web servers, application servers, databases, web services, 3rd parties, and the list goes on and on… These various systems and processes provide data that your business relies on. Is this a new or returning customer? If a returning customer, what items have they purchased in the past? What rebates, if any, will be offered? What are the tax and the value of the final order? As an example, I want to go buy a loaf of bread at my local grocery store. When I check out, I scan my customer loyalty card. Instantly, my previous orders are accessed and rebate offers are generated based on my past and current purchases. These rebates or coupons appear on the back of my receipt, which is printed once my tax has been calculated and the final order is paid.
Even something as simple as buying a loaf of bread generates a lot of data and complexity on the back end systems. All of this generated data is stored in silos and organized inconsistently. There are old systems with no useful records. Some systems collect all data; some systems only aggregate data. Some transactions are recorded by a customer number while other transactions are only indexed by non-synchronized clock time. Terabytes of data are created by thousands of machines and scattered across them, and the data is siloed. This is what drives significant inefficiencies and costs when analyzing data using traditional methods & also with Big Data.
So when the analysts in your organization sit down to answer business questions with transactional data, where do they usually start? First they try to define the problem at hand. For example, they might be interested in determining:
- What technology can be virtualized or put in the cloud?
- What improvements in IT performance will have the greatest impact on reducing churn of high value customers?
- Which alternative IT investment will have the largest impact on revenue?
- Which technology can I decommission with minimal impact on business?
The challenge in answering these questions is to make sense of fragmented data generated by distributed, complex applications with data stored in siloed databases. Leaders end up making decisions from the snippets of data that are available, or traces left behind in operational data stores. This includes logs, services, tools and reports, each with different quality and format. Even determining if a question can be answered involves data warehouse teams and IT impact analysis.
Now that the business analysts have determined which questions they want answered, the next step is to determine what data is necessary to answer these questions. This data collection is the first step in the current four-step approach to Big Data:
Step 1: Data Collection
Step 2: Data Cleansing and Normalizing
Step 3: Data Transformation
Step 4: Reporting
This four-step process is very complex and requires a lot of time, money, and resources. Steps 2 & 3 are where most of the challenge lies. It usually takes months to answer the business questions that the analysts want answered. Just to reiterate, the primary challenge with Big Data today is not the large volumes; it is the need to clean, normalize, and infer data relationships through statistical modeling. The current approach is kind of like this guy trying to run a marathon: it takes a lot of time and effort, and it is quite exhausting.
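To make the "infer data relationships" problem concrete, here is a minimal Python sketch. It assumes two hypothetical silos: one records orders by loyalty number, the other indexes tax calculations only by an unsynchronized clock. All record layouts, values, and the two-second tolerance are illustrative assumptions, not taken from any real system.

```python
from datetime import datetime, timedelta

# Hypothetical silo A: orders keyed by loyalty number, stamped by its own clock.
orders = [
    {"loyalty_no": "C-1001", "ts": datetime(2012, 11, 5, 10, 0, 2), "total": 42.50},
    {"loyalty_no": "C-2002", "ts": datetime(2012, 11, 5, 10, 0, 4), "total": 17.25},
]

# Hypothetical silo B: tax calculations indexed only by (unsynchronized) clock time.
tax_records = [
    {"ts": datetime(2012, 11, 5, 10, 0, 3), "tax": 3.40},
    {"ts": datetime(2012, 11, 5, 10, 0, 5), "tax": 1.38},
]


def candidate_matches(order, records, tolerance=timedelta(seconds=2)):
    """Return every record whose timestamp lies within `tolerance` of the order."""
    return [r for r in records if abs(r["ts"] - order["ts"]) <= tolerance]


for order in orders:
    matches = candidate_matches(order, tax_records)
    print(order["loyalty_no"], "->", len(matches), "candidate tax record(s)")
```

With no shared key, the second order matches two tax records equally well, so the link has to be guessed or modeled. That guessing is the kind of work Steps 2 and 3 absorb.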
And Gartner agrees with this… data preparation takes 30-60% of the time and money. The reason that data preparation consumes 30-60% of analytics processing is that data is saved without context, but context is paramount! Context, the association between data elements, brings the data to life. Think of data as the brick. Context is the mortar that ties it all together, allowing for timely, accurate and substantive decisions based on facts. Context is defined as the relationship between different data elements that defines the answers to what, where, when, why and who. Without context, the process of integrating data from multiple databases is incredibly complicated and, because the process takes so long, the results are often fleeting in terms of timely business relevance. To be more specific, companies often spend 2-3 months per application just to understand, clean, and format the data before sending it to their data warehouse or Big Data store. Gartner just released a new report in October on Big Data and the amount of money that will be spent by IT on Big Data initiatives. In 2012, the IT spend on Big Data will total $28 billion. By 2016, that number will almost double to $54 billion. From 2011 to 2016 a total of $232 billion will be spent on Big Data initiatives. This spend is a combination of upgrades, replacements and net new projects. Gartner also reports that almost 80% of this spend will be for IT services to support these initiatives.
How does one go about overcoming this challenge? First you need to have some form of tracking technology that is embedded across the enterprise to immediately connect separate events into true business transactions. It should be configurable to harvest context specific to your business, no matter where that data is processed. This analytic process would then ensure that these business transactions are connected to real world users and events. The result should be a consistent, accurate, and timely data model of your business as it is happening, as well as the end-to-end detailed information associated with how customers (as well as partners and employees) electronically interact with your company.
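As a rough illustration of connecting separate events into one business transaction, the Python sketch below assumes every tier tags its events with a shared transaction identifier. The `txn_id` field, the tier names, and the context keys are hypothetical; the notes do not describe how the tracking technology actually propagates identity.

```python
from collections import defaultdict

# Hypothetical events emitted by different tiers; every field name is illustrative.
events = [
    {"txn_id": "T1", "tier": "web", "user": "alice", "location": "NYC", "ms": 120},
    {"txn_id": "T1", "tier": "app", "ms": 80},
    {"txn_id": "T2", "tier": "web", "user": "bob", "location": "LA", "ms": 95},
    {"txn_id": "T1", "tier": "db", "ms": 310},
    {"txn_id": "T2", "tier": "app", "ms": 60},
]


def stitch(events):
    """Group tier-level events into one record per business transaction."""
    transactions = defaultdict(lambda: {"context": {}, "hops": []})
    for event in events:
        txn = transactions[event["txn_id"]]
        # Keep the path through the tiers and the time spent at each hop.
        txn["hops"].append((event["tier"], event["ms"]))
        # Context captured at any tier applies to the whole transaction.
        for key in ("user", "location"):
            if key in event:
                txn["context"][key] = event[key]
    return dict(transactions)


for txn_id, txn in stitch(events).items():
    print(txn_id, txn["context"], txn["hops"])
```

Because the link is recorded at execution time, no timestamp matching or statistical inference is needed afterwards to know which database call belonged to which user.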
What if you could visualize a business transaction from the disparate events? This flow represents the provisioning process for a large Telco. As you can see, the diagram is still complex and has a lot of systems, but this is an actual transaction flow that is now in the context of the provisioning process (this is the “what”). This diagram shows the “where” of the context. Additional information associated with this specific transaction flow would be the “who” and “when”. Earlier I showed you an image of a real world architecture diagram; it was just that…an architecture diagram. It did not provide a transaction flow in the context of a business service.
So, what is Big Transaction Data? Well, let’s put a few things in context.
Big Data is the process of storing large amounts of data, performing analytics on those data, and providing real business benefit. It is NOT solely data generated by social media outlets.
A Transaction is any business process that is supported by applications or systems. Examples include “make payment,” “checkout,” or “search”.
Context is the “who,” “what,” and “where” information associated with a transaction. This is also referred to as meta-data.
At the core, Big Transaction Data saves a single stream of Big Data, with context, generated from transactions at the time of execution. So how can Big Transaction Data help address the current challenges with Big Data? Big Transaction Data reinvents the cost model by creating a normalized data set. It takes the existing 4 step process and turns it into a 3 step process. The Big Transaction Data model provides a holistic, timely, and trustworthy contextual data source that merges application silos across your business into a single transaction data model. That model gives decision makers the data they need to align IT investment with business results based on facts and to prioritize budgetary investments appropriately.
By providing answers within hours or days rather than weeks or months, Big Transaction Data can significantly reduce the time and money spent on data modeling, and increase the total value of a Big Data initiative.
Remember our marathon runner? He goes from looking like he won’t even get across the finish line to winning the race.
How does BTD affect Gartner’s report? Big Transaction Data, with its context, allows you to almost eliminate the dependence on IT for analytics. IT will no longer need to be involved in the data preparation, verification of data quality, and modeling efforts. This will help reduce the effort conservatively by 45-85% and will allow you to have real-time visibility into Big Data vs. the historic visibility that you get with the current 4 step model. With the BTD 3 step model, all of the time can be focused on understanding problems and evaluating results, which you will receive in hours or days as opposed to months. If you remember from an earlier slide, Gartner reported that almost 80% of the spend is on IT services. If BTD can cut that 80% in half, that amounts to substantial savings, not to mention quicker answers to the questions.
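The arithmetic behind these figures can be checked with a short Python sketch. One assumption is mine rather than the notes': the 45-85% figure appears to be the sum of the first three effort ranges on the Gartner slide. The notes do not say which phases were added together, so treat that mapping as a guess that happens to reproduce the number.

```python
# Effort ranges (percent) in the order they appear on the Gartner slide.
# Only the 30-60% figure is labeled in the notes (data preparation).
ranges = [(5, 10), (10, 15), (30, 60), (20, 30), (20, 30), (5, 10)]

# Assumption: the quoted 45-85% is the sum of the first three ranges.
low = sum(lo for lo, _ in ranges[:3])
high = sum(hi for _, hi in ranges[:3])
print(f"first three phases: {low}-{high}%")

# "Almost 80% of the spend is on IT services"; cutting that share in half
# would remove this fraction of total spend.
services_share = 0.80
saved_share = services_share / 2
print(f"share of total spend saved: {saved_share:.0%}")
```

Under the halving scenario the notes describe, roughly 40% of total Big Data spend would be saved.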
So let’s take a look at the elements that make up a business transaction. Another way to think of a transaction is as any interaction between an end-user and one or many applications. I would suggest that the contextual relevance that is most valuable to 99% of business questions is rooted in the following areas:
- Users & Customers
- Behaviors & Actions
- Custom Business Data
- Service Provided
- Business Outcomes
I would also suggest to you that Big Transaction Data captures and stores, in context, the information described within the first four areas. The information in the Business Outcomes area can then augment the Big Transaction Data model to provide a complete picture of the business transaction.
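One way to picture a record built from these five areas is the Python sketch below. The class and every field name are hypothetical, chosen only to mirror the categories on the slide; this is not BTD's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TransactionRecord:
    """One business transaction with its context (illustrative schema only)."""
    # Users & Customers
    user: str
    location: str
    device: str
    # Behaviors & Actions
    transaction_type: str
    user_actions: List[str]
    # Custom Business Data (unique to the application)
    business_data: Dict[str, float]
    # Service Provided: response time in ms per IT component, plus error count
    response_ms: Dict[str, float]
    errors: int = 0
    # Business Outcomes: empty at capture time, augmented later
    outcomes: Dict[str, float] = field(default_factory=dict)


record = TransactionRecord(
    user="alice",
    location="NYC",
    device="mobile",
    transaction_type="checkout",
    user_actions=["search", "add_to_cart", "pay"],
    business_data={"shopping_cart_value": 42.50},
    response_ms={"front_end": 120.0, "product_catalog_db": 310.0},
)

# Outcome data from another source is attached afterwards.
record.outcomes["revenue"] = 42.50
print(record)
```

The first four areas are filled in when the transaction executes. The outcomes field starts empty and is augmented later, which matches the split described above.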
Let’s take a look at an example use case for Big Transaction Data. Which IT investment will have the biggest positive impact on revenue?
General question - Where should I invest IT budget to have the biggest impact on revenue?
Specific question for an E-Retailer - What is the infrastructure whose performance impacts customer purchase decisions the most?
Now that we have defined the question, what sources of data do we need?
You can see that as response time for the Product Catalog DB increases, average revenue per visit decreases. There is a strong inverse correlation there. However, for the Authentication Service, there appears to be no correlation between response time and average revenue per visit. So, it is clear that investing in the Product Catalog DB infrastructure will have the biggest positive impact on revenue. If you think about it, I would tolerate a slow authentication page. If I keep performing searches or lookups on products and they are slow to respond, I would probably start looking at competitors’ sites for the products.
So how did BTD help with the analysis? It aggregated the transaction stream into a clean historic data set with context (visits, performance, sales). It captured all the data in context needed to answer the question. A Business Intelligence tool can then explore the historic data set by aggregating the visits into a report summary (performance by average revenue per visit).
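The two analysis steps can be sketched in a few lines of Python. The visit data is invented for illustration, and the column names, bucket thresholds, and use of a Pearson coefficient are my assumptions; the slides do not say how the correlation was measured.

```python
from math import sqrt
from statistics import mean

# Step 1 (assumed output): a clean historic data set, one row per visit.
visits = [
    {"catalog_ms": 200, "auth_ms": 100, "revenue": 60.0},
    {"catalog_ms": 400, "auth_ms": 140, "revenue": 50.0},
    {"catalog_ms": 600, "auth_ms": 120, "revenue": 42.0},
    {"catalog_ms": 800, "auth_ms": 120, "revenue": 30.0},
    {"catalog_ms": 1000, "auth_ms": 140, "revenue": 21.0},
    {"catalog_ms": 1200, "auth_ms": 100, "revenue": 10.0},
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def avg_revenue_by_bucket(visits, tier_key, threshold_ms):
    """Step 2: average revenue per visit for fast vs. slow responses of one tier."""
    fast = [v["revenue"] for v in visits if v[tier_key] <= threshold_ms]
    slow = [v["revenue"] for v in visits if v[tier_key] > threshold_ms]
    return mean(fast), mean(slow)


revenue = [v["revenue"] for v in visits]
for tier, threshold in (("catalog_ms", 600), ("auth_ms", 120)):
    r = pearson([v[tier] for v in visits], revenue)
    fast, slow = avg_revenue_by_bucket(visits, tier, threshold)
    print(f"{tier}: r={r:.2f}, fast avg={fast:.2f}, slow avg={slow:.2f}")
```

In this made-up data, the catalog tier shows a strong inverse relationship with revenue while the authentication tier shows essentially none, which is the pattern the two charts describe.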
In conclusion, the primary issue with Big Data today is not the large volumes; it is the need to clean, normalize, and infer data relationships through statistical modeling. How do you solve this problem? With Big Transaction Data! Big Transaction Data will save a single stream of Big Data, with context, generated from transactions at the time of execution. This provides you with a normalized, linked, real-time data set that allows you to take analytics out of the hands of IT and put it in the hands of people who understand the business. Now, the business questions will be answered in a matter of hours and not months.
At the beginning I said that “Big Data promises to provide competitive advantage, better customer understanding and revenue growth.” Big Transaction Data can make it faster, easier and cheaper to deliver on this promise.
I want to thank you for your time today.