Enterprise Applications in the Cloud:
A Roadmap to Workload Characterization and Prediction
Leonid Grinshpan, Oracle Corporation (www.oracle.com)

Subject
An enterprise application (EA) capacity planning methodology based on queuing models
provides reliable estimates of the cloud resources needed to satisfy dynamically
changing business workloads [Leonid Grinshpan. Solving Enterprise Applications
Performance Puzzles: Queuing Models to the Rescue, Wiley-IEEE Press, 2012,
http://tinyurl.com/7hbalv5]. The biggest challenge in implementing the methodology is
the quantitative characterization of the EA transactional workload, which serves as input
data for EA models. This article provides a road map to workload characterization and its
prediction by:
- Identifying the constituents of EA transactional workload and specifying the
  metrics to quantify it.
- Reviewing the technologies generating raw transactional data.
- Examining Big Data Analytics ability to extract workload characterization from
  raw transactional data.
- Assessing the methods that discover the workload variability patterns.

Cloud computing is becoming the most attractive technology for deploying
enterprise applications (EAs) because of its intrinsic ability to adapt quickly to
variations in business demand. The performance of a cloud-deployed EA depends
on the cloud’s ability to dynamically redistribute the resources allocated to the EA so
that they stay synchronized with fluctuations in business workload. Queuing model based
capacity planning helps to find the right balance between demand from the businesses
that use the EA and the supply of resources provided by the cloud infrastructure.
Transactional workload – a characterization of a business demand for system services
EA business users generate transactional workload that requires allocation of system
resources to be processed. The transactions belong to two categories (Figure 1): the ones
that are initiated by people (user transactions) and the others that are triggered by all
sorts of equipment connected to EA over the networks (operational transactions).

Figure 1 Two sources of transactional workload: user requests and operational
equipment.
The intensity of user transactions at any given time depends on the number of users
actively interacting with the system. It also depends on the pace at which each user
submits one transaction after another. The interval between successive transactions from
the same user can be substantial, because a user needs time to assess the application’s
reply to the previous request and to prepare the next one.
Operational data is generated by devices connected to the application over a network (for
example, cash registers). Such devices trigger an operational transaction for each event
(for example, a sales operation in a retail store). Operational transactions let the EA
keep track of business activities across all of the company’s entities.
Transactional workload is characterized by the following parameters:
- List of transactions (user generated and operational).
- For each transaction, its intensity expressed as the average number of times the
  transaction is initiated per hour by a single user or a single device.
- For each transaction, the number of users or devices requesting it.
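The three parameters above map naturally onto a small record type. The following is a minimal sketch, not part of the original methodology: the class name, field names, and all numbers are hypothetical and serve only to show how the list of transactions, the per-user (or per-device) intensity, and the population size combine into an hourly arrival count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadEntry:
    """One line of a workload characterization (hypothetical layout)."""
    transaction: str      # transaction name
    source: str           # "user" or "device" (operational transaction)
    rate_per_hour: float  # average initiations per hour by one user or device
    population: int       # number of users or devices requesting the transaction

    def hourly_arrivals(self) -> float:
        # Total number of times the transaction is initiated per hour.
        return self.rate_per_hour * self.population


# Hypothetical numbers, for illustration only.
workload = [
    WorkloadEntry("Report", "user", 2.0, 150),
    WorkloadEntry("Financial consolidation", "user", 0.5, 40),
    WorkloadEntry("Sale", "device", 30.0, 20),
]

total_per_hour = sum(entry.hourly_arrivals() for entry in workload)
print(total_per_hour)
```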
While being processed by an EA, a transaction consumes system resources. To make
such an abstract object as a transaction easier to comprehend, it can be visualized as a
car moving through a tangle of hardware servers and appliances, receiving service at
each component, and consuming that component’s resources while being served. This is
quite similar to a car driving on a web of highways and receiving service from highway
toll attendants. A highway toll booth is a metaphor for a hardware server; the expanse of
the toll plaza represents memory, since a car has to occupy some space to be processed
by a toll attendant. At rush hour a car approaching a toll plaza might not find space
to enter it and will have to wait, the same way a transaction waits for memory when all
of it is allocated to other transactions.

Transaction profile – a measure of single transaction demand for system resources
Each transaction initiated by a user or device triggers a multitude of transaction
processing activities in different infrastructure components (servers, appliances, and
networks). Table 1 lists, in order, the components of a classical three-tier
system involved in processing a transaction that retrieves a sales report.
Table 1
Step | Description of activity | Infrastructure component
1 | Transfer a transaction from a user to a load balancing appliance | Network
2 | Implementation of a load balancing algorithm in order to direct a transaction to a particular Web server | Load balancing appliance
3 | Transfer a transaction from load balancing appliance to a Web server | Network
4 | Setting up connection and session between user and system, directing transaction for further processing to Application server | Web server
5 | Transfer a transaction from Web server to Application server | Network
6 | Analyzing transaction and determining what security and metadata needed to retrieve a report | Application server
7 | Transfer a transaction from Application server to SQL server | Network
8 | Retrieving report security data and metadata | SQL server
9 | Transferring report security data and metadata to Application server | Network
10 | Checking user credentials using report security data and metadata, preparing a request to retrieve data from SQL server needed for report generation | Application server
11 | Transferring report to SQL server | Network
12 | Fetching data for report generation | SQL server
13 | Transferring data for report generation to Application server | Network
14 | Generating report | Application server
15 | Transferring report to Web server | Network
16 | Rendering report | Web server
17 | Transferring report to user | Network

As Table 1 indicates, the following infrastructure components are involved in
transaction processing: network, load balancing appliance, as well as Web, Application
and SQL servers. Each component allocates its resources for transaction processing for a
particular time interval. In general, each component has the following assets to be
allocated:
Active resources:
- CPU time (data processing)
- I/O time (data transfer)
- Network time (data transfer)
Passive resources:
- Software connections to the servers and services (for example, Web
  server connections, database connections)
- Software threads
- Storage space
- Memory space

Active resources implement transaction processing and data transfer. Passive resources
provide access to active resources. In order to be processed by any active resource, a
transaction has to request and be allocated the necessary passive resources. If any of
the assets needed for transaction processing is not available because the entire supply
is taken by other transactions, then the transaction waits until the asset is released,
and the wait time increases the transaction response time.
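The effect of an exhausted passive resource on wait time can be illustrated with a small deterministic simulation. This is only a sketch under simplifying assumptions that are not in the original text: every transaction needs exactly one unit of a single pooled passive resource (say, a database connection), holds it for a fixed service time, and requests are served first come, first served. The function name and the numbers are hypothetical.

```python
import heapq


def simulate_pool(arrivals, service_time, pool_size):
    """Return the wait time of each transaction for a pool of identical
    passive resources, served first come, first served."""
    free_at = [0.0] * pool_size  # time at which each pooled resource is released
    heapq.heapify(free_at)
    waits = []
    for arrival in sorted(arrivals):
        earliest = heapq.heappop(free_at)  # resource that is released soonest
        start = max(arrival, earliest)     # wait only if the pool is exhausted
        waits.append(start - arrival)
        heapq.heappush(free_at, start + service_time)
    return waits


# Four transactions compete for a pool of two connections (hypothetical numbers).
waits = simulate_pool(arrivals=[0.0, 0.0, 0.0, 1.0], service_time=2.5, pool_size=2)
print(waits)
```

The first two transactions find a free resource and do not wait; the other two are delayed until a resource is released, and that delay adds directly to their response time.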
Consumption of an active resource is measured by the time interval during which it was
serving a transaction. The metric for passive resource usage depends on the resource type:
for software connections and threads it is the number of connections or threads; for
memory and storage it is the size of the allocated space.
We define a transaction profile as a set of numbers (a vector) specifying the quantity of
each resource consumed by a transaction during its processing in hardware components.
Table 2 describes the profile of the Report transaction.
Table 2

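Component | Active and passive resources | Profile of “Report” transaction
Web server | Connections | 1
Application server | Threads | 3
Application server | Memory | 100 KB
Application server | CPU (seconds) | 2.5
SQL server | Connections | 2
SQL server | CPU (seconds) | 0.08
SQL server | I/O system (seconds) | 0.15
Web server CPU, Load balancing appliance CPU, Networking hardware (seconds) | 0.01, 0.09, 0.01 (the assignment of these three values to the three columns is not recoverable)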

The Report transaction uses resources of the Web, Application, and SQL servers as well as
the network and the load balancing appliance. It spends 2.5 seconds in Application server
CPU; before receiving CPU time it requires 3 threads and 100 KB of memory to be
allocated. The transaction spends 0.08 seconds in SQL server CPU and 0.15 seconds in the
I/O system, but only after it has acquired 2 connections to the SQL server.

Transactional workload represents a flow of requests to be satisfied by EA.
Transaction profile is a quantification of a single transaction demand for
passive and active resources. Transactional workload and Transaction
profile quantify total demand for EA resources generated by business.
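A transaction profile can be held in a simple nested structure that keeps active and passive resources apart for each component. The sketch below is one possible layout, not the author's representation; the class and key names are hypothetical. It contains only the Application server and SQL server values stated in the text for the Report transaction, so it is a partial profile.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentDemand:
    """Demand of one transaction on one infrastructure component."""
    active_seconds: dict = field(default_factory=dict)  # resource -> seconds of service
    passive: dict = field(default_factory=dict)         # resource -> quantity held


# Partial profile of the Report transaction (values taken from the text above).
report_profile = {
    "application_server": ComponentDemand(
        active_seconds={"cpu": 2.5},
        passive={"threads": 3, "memory_kb": 100},
    ),
    "sql_server": ComponentDemand(
        active_seconds={"cpu": 0.08, "io": 0.15},
        passive={"connections": 2},
    ),
}


def total_active_seconds(profile):
    """Sum of the service time spent in all active resources of the profile."""
    return sum(sum(d.active_seconds.values()) for d in profile.values())


print(round(total_active_seconds(report_profile), 2))
```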
Constituents of transactional data
The menu of business transactions processed by EA is defined during EA development.
Each transaction is characterized by a function it is executing as well as by a set of
parameters:
Transaction A = {<function>; <parameter 1>, ..., <parameter N>}
Here is an example of a transaction implementing financial consolidation for a
particular geographical region, currency, and time period:
Transaction A = {financial consolidation; <geographical region>, <currency>, <time period>}
A transaction can be identified by its unique ID. Unique ID enables tracking of a
transaction path among system servers and measurement of active and passive
resources consumed by transaction during its processing in each server. Information on
each executed transaction has to be saved in log file; in general, this information
includes:
- Unique ID
- Name of a user or device that initiated transaction
- Value of each parameter (for example, for Transaction A the parameters
  are: North America, US dollars, year 2012)
- Transaction start date and time
- Transaction total execution time
- For each server a transaction was processed in:
  - Quantity of passive resource 1
  - ...
  - Quantity of passive resource M
  - CPU processing time
  - I/O data transfer time
- Network data transfer time
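The fields listed above could be written to a log as one structured record per executed transaction. The sketch below assumes a JSON-lines log; that format, the class names, the field names, and every value in the sample record are hypothetical and are not prescribed by the article.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class ServerUsage:
    server: str
    passive: dict        # passive resource name -> quantity
    cpu_seconds: float   # CPU processing time
    io_seconds: float    # I/O data transfer time


@dataclass
class TransactionRecord:
    transaction_id: str  # unique ID
    initiator: str       # user or device that initiated the transaction
    function: str
    parameters: dict
    start: str           # start date and time, ISO 8601
    total_seconds: float
    network_seconds: float
    servers: list = field(default_factory=list)


# Hypothetical record for Transaction A; all values are made up for illustration.
record = TransactionRecord(
    transaction_id="tx-0001",
    initiator="jsmith",
    function="financial consolidation",
    parameters={"region": "North America", "currency": "US dollars", "period": "2012"},
    start="2012-12-31T09:00:00",
    total_seconds=12.4,
    network_seconds=0.3,
    servers=[ServerUsage("application server", {"threads": 3}, 9.0, 0.0)],
)

log_line = json.dumps(asdict(record))  # one line per executed transaction
print(log_line)
```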

Gathering transactional data
Collecting transactional information requires a combination of different technologies.
At the application development stage, an analysis of the business process identifies all
transactions and their parameters. The analysis is based on interviews with the key
process participants. It produces flow charts revealing the logistics of the business
process, and identifies the steps that can be automated by the application and framed as
transactions [http://tinyurl.com/c8yn8fj].
Application instrumentation is the most potent approach to obtaining
transactional data [http://tinyurl.com/7mchth]. Application developers have adopted a
number of instrumentation technologies, among them:
- Application Response Measurement – ARM [http://tinyurl.com/bmxlu83]
- Apache Commons Monitoring [http://tinyurl.com/cm87akg]
- Tracing and Instrumenting Applications written in Visual Basic and
  Visual C# [http://tinyurl.com/dye2qep]
- Java Management Extensions – JMX [http://tinyurl.com/ccyqxtf]

The latest releases of Oracle EAs embrace Execution Context ID (ECID) technology.
An ECID is a unique identifier that helps to correlate the events associated with the same
transaction across several infrastructure components. The ECID value for a particular
transaction is generated at the first layer of the system (usually a Web server) and is
passed down to the subsequent layers. The ECID value is logged (and auditable) in each
software component involved in processing the transaction. ECID allows tracking the
end-to-end flow of a particular transaction across all EA components
(https://blogs.oracle.com/sduloutr/entry/using_execution_context_id_ecid). Here is an
example of a message with an ECID as it appears in a log file of one of the Oracle EAs
(the ECID is the value of the ecid field):
[2013-06-06T15:20:10.018-04:00] [FoundationServices0] [ERROR] [01301]
[oracle.bi.bifndnepm.workspace.security.RoleChecker] [tid: 609] [userId: <anonymous>] [ecid:
0000JwQvd83CKuG5Uz1Fic1HYwby0008Zk,0] [SRC_CLASS:
com.oracle.workspace.security.RoleChecker] [APP: WORKSPACE#11.1.2.0] [SRC_METHOD: ] Could not
resolve role: native://DN=cn=HAVA:0000011ed1cf2e7a-0000-4dc00a8f1415,ou=HAVA,ou=Roles,dc=css,dc=hyperion,dc=com?ROLE
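Correlating log messages by ECID can be sketched with a regular expression. The example assumes that every component writes the identifier as a bracketed `[ecid: ...]` field, as in the message above; other components may use a different layout. The helper names are hypothetical and the sample line is an abbreviated copy of that message.

```python
import re
from collections import defaultdict

# Assumption: the ECID is logged as a bracketed "[ecid: <value>]" field.
ECID_FIELD = re.compile(r"\[ecid:\s*([^\]]+)\]")


def extract_ecid(log_line):
    """Return the ECID found in a log line, or None if the line has none."""
    match = ECID_FIELD.search(log_line)
    return match.group(1).strip() if match else None


def group_by_ecid(lines):
    """Collect the log lines that belong to the same transaction."""
    groups = defaultdict(list)
    for line in lines:
        ecid = extract_ecid(line)
        if ecid is not None:
            groups[ecid].append(line)
    return dict(groups)


sample = (
    "[2013-06-06T15:20:10.018-04:00] [FoundationServices0] [ERROR] [01301] "
    "[oracle.bi.bifndnepm.workspace.security.RoleChecker] [tid: 609] "
    "[userId: <anonymous>] [ecid: 0000JwQvd83CKuG5Uz1Fic1HYwby0008Zk,0] "
    "[APP: WORKSPACE#11.1.2.0] Could not resolve role"
)
print(extract_ecid(sample))
```

Applying `group_by_ecid` to the lines gathered from the log files of all servers yields, for each transaction, the set of messages needed to reconstruct its path.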

Despite the availability of a number of instrumentation methods, it is fair to say that
EA instrumentation implemented to an extent that makes the EA efficiently manageable is
a rarity. This gap gave rise to a range of products known as business transaction
management (BTM) systems. As Wikipedia.org
explains, BTM systems track each of the hops in the transaction path using a variety of
data collection methods, including OS-level sockets, network packet sniffing, log
parsing, agent-based middleware protocol sniffing, and others
[http://tinyurl.com/ck3gwby].
Instrumented EAs and BTM systems generate large volumes of raw transactional data
that have to be processed to extract workload and transaction profile information
representing input data for EA queuing models.
Big Data Analytics – from raw transactional data to workload characterization and
transaction profiles
EAs deployed in the cloud are designed to process large numbers of business
transactions. For example, salesforce.com estimates that its cloud processes over 1 billion
complex transactions every single day (http://tinyurl.com/ntgw9tm). Such large data
volumes make Big Data Analytics platforms attractive tools for extracting
workload characterizations and transaction profiles from raw transactional data.
Transactional data is not that different from the other records that various Big Data
implementations already work with successfully. Transactional data is mostly saved as
text, XML, or Excel files, as well as in SQL and OLAP databases. A challenge in
dealing with transactional data is that the pieces of information belonging to the same
transaction are scattered all over the hosting platform and reside in various files on
different servers.
The open source Apache Hadoop data processing framework provides the necessary
functionality for real time characterization of workload and transaction profiles based
on raw transactional records (http://tinyurl.com/3cy6sd). To begin with, transactional
data from files and tables spread all around the EA has to be loaded into Hadoop. Many
ETL tools are available to load transactional data into the highly efficient Hadoop
Distributed File System (HDFS).
Files in HDFS are split into blocks, which allows speedy parallel processing of
the same file. Processing is based on the MapReduce paradigm: a Map procedure filters
data based on the unique transaction ID; a Reduce procedure counts the number of
transactions with particular unique IDs executed by a particular user during a one-hour
time interval. The highly parallelizable nature of MapReduce allows the use of a large
number of commodity servers to generate workload characterization and transaction
profiles in real time.
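The Map and Reduce roles described above can be illustrated in plain Python. This sketch runs in memory and does not use the Hadoop API; it only shows the shape of the computation. The record layout is hypothetical, and the key is assumed to be the transaction name together with the user and the hour in which the transaction started.

```python
from collections import defaultdict
from datetime import datetime


def map_record(record):
    """Map: emit ((transaction, user, hour), 1) for one raw transactional record."""
    hour = datetime.fromisoformat(record["start"]).strftime("%Y-%m-%dT%H")
    yield (record["transaction"], record["user"], hour), 1


def reduce_counts(pairs):
    """Reduce: sum the counts emitted for each key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


# Hypothetical raw records.
records = [
    {"transaction": "Report", "user": "ann", "start": "2013-06-06T15:05:00"},
    {"transaction": "Report", "user": "ann", "start": "2013-06-06T15:40:00"},
    {"transaction": "Report", "user": "bob", "start": "2013-06-06T15:20:00"},
    {"transaction": "Report", "user": "ann", "start": "2013-06-06T16:01:00"},
]

pairs = (pair for record in records for pair in map_record(record))
counts = reduce_counts(pairs)
print(counts)
```

In a real cluster the Map calls run in parallel on the blocks of a file and the framework groups the emitted pairs by key before the Reduce step; the per-key sums are the same.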
Workload variability patterns
Workload characterizations generated by Big Data Analytics for different periods of time
provide a basis for discovering workload patterns and for predicting workload
variability.
In order to allocate system resources sufficient to satisfy incoming transactions with
acceptable quality, it is necessary to predict fluctuations of hourly workload service
demand (WSD). We define hourly WSD as a vector (one-dimensional array):

{hourly workload service demand} = ∑ (over all transactions) {hourly transaction service demand}

where the vector {hourly transaction service demand} is calculated from the vector
{transaction profile} as shown below for users (a similar calculation applies to
devices):
{hourly transaction service demand} = {transaction profile} * number of transactions per user per hour * number of users
It can be seen from the formula that {hourly workload service demand} grows linearly
with the number of users (as well as devices delivering operational data)
and with the intensity of their communication with the system. It also depends on the
parameters in {transaction profile}, which in turn depend on hardware specifications such
as the speed of CPU, I/O, and network, and the size of available memory and storage.
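The two formulas can be worked through numerically. The sketch below restricts the profile vector to three active resources, which is an assumption made to keep the example short; the Report values come from the transaction profile discussed earlier, while the second profile, the per-user rates, and the user counts are hypothetical.

```python
def hourly_transaction_demand(profile, rate_per_user_per_hour, users):
    """{transaction profile} * transactions per user per hour * number of users."""
    scale = rate_per_user_per_hour * users
    return [quantity * scale for quantity in profile]


def hourly_workload_demand(workload):
    """Element-wise sum of the hourly demand vectors over all transactions."""
    demands = [hourly_transaction_demand(*entry) for entry in workload]
    return [sum(column) for column in zip(*demands)]


# Assumed vector layout:
# [Application server CPU s, SQL server CPU s, SQL server I/O s]
report = [2.5, 0.08, 0.15]         # active-resource part of the Report profile
consolidation = [10.0, 1.0, 2.0]   # hypothetical profile

workload = [
    (report, 2, 100),        # 2 per user per hour, 100 users (hypothetical)
    (consolidation, 1, 10),  # 1 per user per hour, 10 users (hypothetical)
]

wsd = hourly_workload_demand(workload)
print([round(x, 6) for x in wsd])
```

With these numbers the Report transaction contributes 200 executions per hour and the consolidation transaction 10, so the first element of the result is 2.5 * 200 + 10.0 * 10 = 600 seconds of Application server CPU per hour.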
EA workload variability patterns can be discovered using a number of approaches. The
publication “Workload Analysis and Demand Prediction of Enterprise Data Center
Applications” (http://tinyurl.com/lq5fnka) describes an analysis of six months of
workload data for 139 workloads generated by the users of customer relationship
management applications. The study concludes that EA workloads typically show
burstiness as well as periodicity, which can be a multiple of hours, days, or weeks. The
study presents methods for deducing workload patterns and assessing their quality.
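One common way to check an hourly workload series for periodicity is to look for the lag with the highest autocorrelation. The sketch below uses that generic technique; it is not the method of the cited study. The series is synthetic (one made-up daily shape repeated for seven days), and the function names are hypothetical.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation at a given lag, normalized by the number of
    overlapping pairs so that a perfectly periodic series scores 1.0."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series) / n
    if variance == 0:
        return 0.0
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    ) / (n - lag)
    return covariance / variance


def dominant_period(series, max_lag):
    """Lag (in samples) with the highest autocorrelation."""
    return max(range(1, max_lag + 1), key=lambda lag: autocorrelation(series, lag))


# Synthetic hourly workload: quiet nights, busy working hours, repeated for 7 days.
daily = [10, 10, 10, 10, 10, 10, 10, 10, 80, 90, 100, 90,
         80, 90, 100, 90, 80, 40, 20, 10, 10, 10, 10, 10]
series = daily * 7

period = dominant_period(series, max_lag=36)
print(period)
```

Real workload data is noisy, so the peak is less sharp than in this synthetic case and the result should be read as a hint to be verified rather than as a definitive period.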
A paper http://tinyurl.com/lwazjyj “Dynamic Provisioning of Multi-tier Internet
Applications” presents a workload predictor that estimates the tail of the arrival rate
distribution (i.e., the peak demand) for the next T hours. If T = 1, the methodology
predicts the peak demand for the next one hour at the beginning of the hour. The
prediction is based on historical workload data collected for each hour of the day over
the past few days. A probability distribution is built for historical hourly workload data
and the peak workload for a particular hour is estimated as a high percentile of the
workload distribution for that hour.
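The idea of estimating the peak as a high percentile of the historical data for a given hour can be sketched as follows. The paper's exact estimator is not reproduced here; the example assumes a simple nearest-rank percentile, and the function names and the historical arrival rates are hypothetical.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile of a list of observations, with p in (0, 100]."""
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def predict_peak(history, hour, p=95):
    """history maps an hour of the day (0-23) to the arrival rates observed
    at that hour on past days; the prediction is a high percentile of them."""
    return percentile(history[hour], p)


# Hypothetical arrival rates (transactions per hour) seen at 9:00 on ten past days.
history = {9: [120, 135, 128, 160, 142, 131, 150, 138, 125, 147]}

peak = predict_peak(history, hour=9, p=90)
print(peak)
```

Choosing a higher percentile makes the prediction more conservative, at the price of allocating resources that are idle most of the time.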
A web post http://tinyurl.com/lrrm6dg specifies a few patterns observed for different
EAs in the cloud. Two examples are:
- Workloads that have a relatively short period of activity; they can be
  generated, for example, by the users of enterprise planning applications
  that are in use once per quarter.
- Bursting workloads featuring high spikes in demand; they are
  produced by the users of web retail applications during holiday seasons.

Workload characterization and prediction stack
The technologies described above enable workload characterization and prediction; they
form the stack shown in Figure 2.

Figure 2 Workload characterization and prediction stack

Transactional workload constitutes the input data for an EA queuing model that delivers
estimates of EA performance for various infrastructure platforms. Before reassigning
EA resources based on modeling estimates, a cloud provider has to evaluate the impact of
the reassignment on overall cloud performance. That necessity originates from the cloud’s
nature as a multi-tenant platform. In a dedicated system there is the luxury of enacting
a change, then launching the EA and evaluating how the change affects EA performance; if
the change does not deliver the needed improvement, we can try a new change and repeat
the cycle again and again. But
reassigning resources for a cloud-based EA in order to improve its performance might
degrade the performance of other EAs, as well as make the cloud run less profitably. A
provider has to use modeling to assess the unexpected consequences of a change.
Because EA workload changes over time, it is necessary to periodically repeat all the
steps in the stack presented in Figure 2 to ensure acceptable performance of the EA as
well as of the cloud as a multi-tenant environment.
Takeaway from the article
1. Transactional workload defines a business demand for EA services. It is
comprised of the requests coming from EA users as well as from operational
equipment.
2. Transactional workload is characterized by:
o List of transactions.
o Per each transaction its intensity expressed in an average number of times
a transaction was initiated per one hour by single user or single device.
o Per each transaction a number of users or devices requesting it.
3. Transaction profile is a measure of single transaction demand for system active
and passive resources (active resources - CPU, I/O, and network times; passive
resources - memory and storage space, number of Web server and database
connections, number of software threads).
4. Transactional workload and transaction profile quantify total demand from a
business for EA resources.
5. EA instrumentation generates raw transactional data. Big Data Analytics extracts
transactional workload characterization and transaction profile information from
raw transactional data.
6. Workload variability patterns can be identified by analyzing workload
characterization for different periods of time; the patterns can be used for
predictive capacity planning.
About the author
During his last fifteen years as an Oracle consultant, the author was engaged hands-on
in performance tuning and sizing of enterprise applications for various corporations
(Dell, Citibank, Verizon, Clorox, Bank of America, AT&T, Best Buy, Aetna, Halliburton,
Pfizer, Astra Zeneca, Starbucks, Praxair, Baxter, American Express, etc.).

More Related Content

Similar to Enterprise applications in the cloud: a roadmap to workload characterization and prediction

USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATIONUSING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATIONijaia
 
Using Semi-supervised Classifier to Forecast Extreme CPU Utilization
Using Semi-supervised Classifier to Forecast Extreme CPU UtilizationUsing Semi-supervised Classifier to Forecast Extreme CPU Utilization
Using Semi-supervised Classifier to Forecast Extreme CPU Utilizationgerogepatton
 
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATIONUSING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATIONgerogepatton
 
How the detailed process of soa
How the detailed process of soaHow the detailed process of soa
How the detailed process of soaThony78
 
Dot Net performance monitoring
 Dot Net performance monitoring Dot Net performance monitoring
Dot Net performance monitoringKranthi Paidi
 
I p-o in different data processing systems
I p-o in different data processing systemsI p-o in different data processing systems
I p-o in different data processing systemsKinshook Chaturvedi
 
IRJET- Effective Technique for Optimizing Timestamp Ordering in Read-Write/Wr...
IRJET- Effective Technique for Optimizing Timestamp Ordering in Read-Write/Wr...IRJET- Effective Technique for Optimizing Timestamp Ordering in Read-Write/Wr...
IRJET- Effective Technique for Optimizing Timestamp Ordering in Read-Write/Wr...IRJET Journal
 
IRJET-Concurrency Control Model for Distributed Database
IRJET-Concurrency Control Model for Distributed DatabaseIRJET-Concurrency Control Model for Distributed Database
IRJET-Concurrency Control Model for Distributed DatabaseIRJET Journal
 
SA UNIT I STREAMING ANALYTICS.pdf
SA UNIT I STREAMING ANALYTICS.pdfSA UNIT I STREAMING ANALYTICS.pdf
SA UNIT I STREAMING ANALYTICS.pdfManjuAppukuttan2
 
E commerce technologies
E commerce technologiesE commerce technologies
E commerce technologiesAnne ndolo
 
New Relic_Heroku_Presentation_Dreamforce11
New Relic_Heroku_Presentation_Dreamforce11New Relic_Heroku_Presentation_Dreamforce11
New Relic_Heroku_Presentation_Dreamforce11New Relic
 
Lecture notes -001
Lecture notes -001Lecture notes -001
Lecture notes -001Eric Rotich
 
IRJET- Adopting Encryption for Intranet File Communication System
IRJET- Adopting Encryption for Intranet File Communication SystemIRJET- Adopting Encryption for Intranet File Communication System
IRJET- Adopting Encryption for Intranet File Communication SystemIRJET Journal
 
Application-Servers.pdf
Application-Servers.pdfApplication-Servers.pdf
Application-Servers.pdfSamir Paul
 
Performance testing basics
Performance testing basicsPerformance testing basics
Performance testing basicsCharu Anand
 
SDN Federation White Paper
SDN Federation White PaperSDN Federation White Paper
SDN Federation White PaperBrian Hedstrom
 

Similar to Enterprise applications in the cloud: a roadmap to workload characterization and prediction (20)

USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATIONUSING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
 
Using Semi-supervised Classifier to Forecast Extreme CPU Utilization
Using Semi-supervised Classifier to Forecast Extreme CPU UtilizationUsing Semi-supervised Classifier to Forecast Extreme CPU Utilization
Using Semi-supervised Classifier to Forecast Extreme CPU Utilization
 
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATIONUSING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
 
How the detailed process of soa
How the detailed process of soaHow the detailed process of soa
How the detailed process of soa
 
Dot Net performance monitoring
 Dot Net performance monitoring Dot Net performance monitoring
Dot Net performance monitoring
 
I p-o in different data processing systems
I p-o in different data processing systemsI p-o in different data processing systems
I p-o in different data processing systems
 
Copy of sec d (2)
Copy of sec d (2)Copy of sec d (2)
Copy of sec d (2)
 
Copy of sec d (2)
Copy of sec d (2)Copy of sec d (2)
Copy of sec d (2)
 
chapter 2.pdf
chapter 2.pdfchapter 2.pdf
chapter 2.pdf
 
IRJET- Effective Technique for Optimizing Timestamp Ordering in Read-Write/Wr...
IRJET- Effective Technique for Optimizing Timestamp Ordering in Read-Write/Wr...IRJET- Effective Technique for Optimizing Timestamp Ordering in Read-Write/Wr...
IRJET- Effective Technique for Optimizing Timestamp Ordering in Read-Write/Wr...
 
IRJET-Concurrency Control Model for Distributed Database
IRJET-Concurrency Control Model for Distributed DatabaseIRJET-Concurrency Control Model for Distributed Database
IRJET-Concurrency Control Model for Distributed Database
 
SA UNIT I STREAMING ANALYTICS.pdf
SA UNIT I STREAMING ANALYTICS.pdfSA UNIT I STREAMING ANALYTICS.pdf
SA UNIT I STREAMING ANALYTICS.pdf
 
E commerce technologies
E commerce technologiesE commerce technologies
E commerce technologies
 
Ch13
Ch13Ch13
Ch13
 
New Relic_Heroku_Presentation_Dreamforce11
New Relic_Heroku_Presentation_Dreamforce11New Relic_Heroku_Presentation_Dreamforce11
New Relic_Heroku_Presentation_Dreamforce11
 
Lecture notes -001
Lecture notes -001Lecture notes -001
Lecture notes -001
 
IRJET- Adopting Encryption for Intranet File Communication System
IRJET- Adopting Encryption for Intranet File Communication SystemIRJET- Adopting Encryption for Intranet File Communication System
IRJET- Adopting Encryption for Intranet File Communication System
 
Application-Servers.pdf
Application-Servers.pdfApplication-Servers.pdf
Application-Servers.pdf
 
Performance testing basics
Performance testing basicsPerformance testing basics
Performance testing basics
 
SDN Federation White Paper
SDN Federation White PaperSDN Federation White Paper
SDN Federation White Paper
 

More from Leonid Grinshpan, Ph.D.

Introduction to enterprise applications capacity planning
Introduction to enterprise applications capacity planning Introduction to enterprise applications capacity planning
Introduction to enterprise applications capacity planning Leonid Grinshpan, Ph.D.
 
Enterprise applications in the cloud: analysis of pay-per-use plans
Enterprise applications in the cloud:  analysis of pay-per-use plansEnterprise applications in the cloud:  analysis of pay-per-use plans
Enterprise applications in the cloud: analysis of pay-per-use plansLeonid Grinshpan, Ph.D.
 
Solving enterprise applications performance puzzles queuing models to the r...
Solving enterprise applications performance puzzles   queuing models to the r...Solving enterprise applications performance puzzles   queuing models to the r...
Solving enterprise applications performance puzzles queuing models to the r...Leonid Grinshpan, Ph.D.
 
Enterprise application in the cloud – virtualized deployment
Enterprise application in the cloud – virtualized deployment Enterprise application in the cloud – virtualized deployment
Enterprise application in the cloud – virtualized deployment Leonid Grinshpan, Ph.D.
 
Enterprise applications in the cloud: improving cloud efficiency by transacti...
Enterprise applications in the cloud: improving cloud efficiency by transacti...Enterprise applications in the cloud: improving cloud efficiency by transacti...
Enterprise applications in the cloud: improving cloud efficiency by transacti...Leonid Grinshpan, Ph.D.
 
Beyond IT optimization there is a (promised) land of application performance ...
Beyond IT optimization there is a (promised) land of application performance ...Beyond IT optimization there is a (promised) land of application performance ...
Beyond IT optimization there is a (promised) land of application performance ...Leonid Grinshpan, Ph.D.
 
Enterprise applications in the cloud: non-virtualized deployment
Enterprise applications in the cloud: non-virtualized deploymentEnterprise applications in the cloud: non-virtualized deployment
Enterprise applications in the cloud: non-virtualized deploymentLeonid Grinshpan, Ph.D.
 
Queuing model based load testing of large enterprise applications
Queuing model based load testing of large enterprise applicationsQueuing model based load testing of large enterprise applications
Queuing model based load testing of large enterprise applicationsLeonid Grinshpan, Ph.D.
 
Methodology of enterprise application capacity planning by real life examples
Methodology of enterprise application capacity planning by real life examplesMethodology of enterprise application capacity planning by real life examples
Methodology of enterprise application capacity planning by real life examplesLeonid Grinshpan, Ph.D.
 
Methodology Of Enterprise Applications Capacity Planning
Methodology Of Enterprise Applications Capacity PlanningMethodology Of Enterprise Applications Capacity Planning
Methodology Of Enterprise Applications Capacity PlanningLeonid Grinshpan, Ph.D.
 
