My results.
HOW TO CREATE
A BROCHURE
To print (and preserve) these brochure instructions, click Print
on the File menu. Press ENTER to print the brochure.
Using this template, you can create a professional brochure.
Here’s how:
Insert your words in place of these words, using or re-arranging
the preset paragraph styles.
Print pages 1 and 2 back-to-back onto sturdy, letter size paper.
Fold the paper like a letter to create a three-fold brochure
(positioning the panel with the large picture on the front).
What Else Should I Know?
To change the style of any paragraph, select the text by
positioning your cursor anywhere in the paragraph. Then, select
a style from the Style list on the Formatting toolbar.
To change the picture, click it to select it. Click Picture on the
Insert menu, and then click From File. Select a new picture, and
then click Insert.
Company Name
Street Address
Address 2
City, ST ZIP Code
Phone (704) 555-0125
Fax (704) 555-0145
Web site address
Future Solutions Now
Customized Turnkey Training Courseware
Adventure Works
Date of publication
How to Customize This Brochure
You’ll probably want to customize all your templates when you
discover how editing and saving your templates makes creating
future documents easier. To customize this brochure template:
1. Insert your company information in place of the sample text.
2. Click Save As on the File menu. Click Document Template in
the Save as Type box (the file name extension should change
from .doc to .dot).
3. Next time you want to use it, click New on the File menu, and
then double-click your template.
About the “Picture” Fonts
The “picture” fonts in this brochure are Wingdings typeface
symbols. To insert a new symbol, select the symbol character
and click Symbol on the Insert menu. Select a new symbol from
the map, click Insert, and then click Close.
Working with Breaks
Breaks in a Microsoft Word document appear as labeled dotted
lines on the screen. Using the Break command, you can insert
manual page breaks, column breaks, and section breaks.
To insert a break, click Break on the Insert menu. Select an
option. Click OK to accept your choice.
Working with Spacing
To reduce the spacing between, for example, body text
paragraphs, click in this paragraph, and click Paragraph on the
Format menu. Reduce Spacing After to 6 points, and make
additional adjustments as needed.
To save your style changes (with the insertion point in the
changed paragraph), click the style in the Style list on the
Formatting toolbar. Press ENTER to save the changes and
update all similar styles.
To adjust character spacing, select the text to be modified and
click Font on the Format menu. Click Character Spacing and
then enter a new value.
Other Brochure Tips
To change a font size, click Font on the Format menu. Adjust
the size as needed, and then click OK or Cancel.
To change the shading of shaded paragraphs, click
Borders and Shading on the Format menu. Select a new shade or
pattern, and then click OK. Experiment to achieve the best
shade for your printer.
To remove a character style, select the text and press
CTRL+SPACEBAR. You can also click Default Paragraph Font
on the Style list.
Brochure Ideas
“Picture” fonts, like Wingdings, are gaining popularity.
Consider using other symbol fonts to create highly customized
icons.
Consider printing your brochure on colorful, preprinted
brochure paper—available from many paper suppliers.
rpsgroup.com/energy
National Data Repositories (NDR)
define, develop, deliver
Introduction
RPS Energy helps companies develop natural energy resources across
the complete asset life cycle, combining our technical and commercial
skills with an in-depth knowledge of environmental issues.
The expertise within RPS Energy is applied world-wide to a broad
range of projects across a number of industry sectors. In each of
these areas, we provide our clients with independent flexible support
to help them achieve their technical and commercial goals.
RPS Energy has major regional offices across the UK, Australia, USA
and Canada as well as local offices and agencies in many other areas.
Oil and gas projects remain a central part of our work, but we are
also world leaders in advice to windfarm operators and are
increasingly involved in other forms of renewable energy.
Transferring skills across these sectors is a core capability for
RPS Energy.
Our clients include governments, NOCs, IOCs, independents and
financial institutions, as well as companies in the wider energy
industry and other infrastructure and asset owners.
Increasingly we operate on projects where the issues surrounding
the development of energy resources and the preservation of the
environment converge. RPS Energy brings a unique combination of
such skills to all our projects.
RPS Energy, through its acquisition of Paras Consulting, is one
of the world’s leading independent consulting companies in
the field of Information Systems and Processes. This
combination of vendor/service neutrality and E&P technical
know-how makes RPS Energy uniquely placed to offer services
to governments implementing an NDR.
The large system vendors are vital stakeholders in the NDR
domain and RPS Energy maintains excellent relationships with
these companies. We manage technology procurement projects
for both energy companies and governments, providing a
unique understanding of the capabilities of the technologies on
the market.
Typically, the role of any governing body with responsibility for a
country’s oil and gas industry is as the domestic petroleum resource
owner, manager, and regulator of the country’s E&P industry. The role
represents a number of responsibilities including:
• Regulatory compliance
• Promotion of inward investment
• Cost savings
• Long term preservation of data for scientific purposes
As investment within a country’s E&P
industry increases, so do the volumes and
complexity of data. An existing, efficient
data repository is an attractive proposition
to any exploration company looking for
new regions to explore.
The challenges facing government
departments include:
• Improved recognition of data as an asset and as a key enabler for
  potential new investment
• Adding value to current oil and gas assets and realising new
  opportunities in an environment with business, technological and
  political fluctuations
• Managing increasing multi-sourced volumes of a variety of data in a
  multi-client environment
• Ready availability of high integrity integrated datasets
• Continuous monitoring of data quality and improved understanding of
  integrity of datasets
• Improved interaction between the Government and Industry
• Improved methods of addressing security and entitlement issues
• Improved standards and procedures to ensure preservation and
  increase in value from the data including regulatory compliance
• Establishing an improved strategy for storage and archive of data
National Data Repositories
An NDR would provide long
term storage for future use,
ensuring that data retains
value into the future.
How can we help?
National Data Repositories Lifecycle
Define Develop Deliver Ongoing Audit
RPS Credentials
We were among the pioneers of the
concept of NDRs through work with
energy companies and government in the
UK. We managed the design and set-up of
the UK’s Common Data Access (CDA)
initiative and have been involved since
inception in its continual evolution. Our
work as leaders in this field has taken us
to Norway, Colombia, Venezuela, Algeria,
Peru, Russia, and Romania. In parallel with
this NDR work, RPS Energy continues to
advise companies in data management
strategies. We are therefore able to
integrate a given country strategy into
a company’s internal data management
strategy. This has worked very effectively
in Norway.
Recent Project Examples
Specifically, RPS Energy is experienced in
working with a number of government
bodies around the world in developing
strategies for NDR implementation, and
has undertaken a number of projects in
this area. These include:
• Original design and implementation of the UK Common Data Access
  (CDA) systems. CDA provides data storage and access systems to a
  consortium of exploration and production companies in the UK
  Continental Shelf.
• Enhancement of the system to provide UKDEAL – the web based
  information system available on the internet which shows data
  availability to interested parties.
• Development of the commercial and process models for the UK
  National Hydrocarbon Data Archive (NHDA) which is designed to
  provide long term storage of data which is of no current interest
  to the E&P industry.
• Development of the Banco de Información Petrolera (BIP) for
  Ecopetrol in Colombia. The system was designed to hold all of
  Ecopetrol’s information which it shared with the operating
  companies in Colombia, plus all the data from within Ecopetrol.
• Assisting Perupetro with the definition and tendering for a
  commercial solution for their petroleum data system.
• Carrying out a feasibility study on behalf of the West Australian
  Government and industry for development of a Petroleum Data Centre
  supporting both government and industry requirements. Further work
  was carried out to update the proposed roadmap as a result of
  ongoing differences between Government and Industry on the way
  forward.
• Working with the Malay Thai Joint Authority (MTJA) to help it
  manage its interactions with the Joint Development Area (JDA)
  license holders more effectively. The work includes streamlining
  decision-making processes, developing data analysis capabilities
  to allow MTJA to better assess operational activities, and
  investigating a shared system for managing common data and
  information. A common document management system and a data
  analysis technology have been implemented.
In addition to our work with national
authorities we have extensive experience
in developing systems and processes
for managing data and information in oil
companies. Our clients in this area include:
• BP
• BG
• OMV Group, including Petrom
• Shell
• Anadarko
• Hess
• UK DTI Oil and Gas division (now part of DECC)
• Maersk Oil
• GDF Suez
Our Approach
RPS Energy has a structured approach
to undertaking projects of this nature
based on a “define, develop, deliver”
methodology.
Define
We are the E&P industry’s leading
independent data management
consultancy, and have the resource
capacity to support large and complex
programmes. RPS Energy has no
connections with data management
products or contracting services, and
therefore provides a service focused
entirely on adding value to the business.
We have an unrivalled reputation for
taking on complex problems and breaking
them down into manageable elements to
form an integrated programme, using a
wide range of specialist expertise in the
Technical Information Management arena.
• Understand the stakeholder drivers (NOC/oil companies/ministry etc)
• Set up an effective governance structure
• Understand data volumes, owners and locations
• Build the business case and assist in promoting the concept
• Outline statement of requirements
• Assist in securing funding
Develop
The core capability of an NDR is
organizing and exploiting large volumes
of data covering a comprehensive
geographical area to serve a community
of users and prospective investors,
allowing improved promotion of acreage
and improved management of existing
acreage.
An NDR would therefore be of national
strategic importance for the prosperity of
the country’s petroleum industry.
An NDR would be able to store and
provide access through a single portal,
reducing the number of applications
required to access the information and
consolidating data sources into a single
framework. The data types included in
an NDR would include all exploration,
drilling and production activity – both
national and private enterprise. However,
there is scope to include other nationally
important data – for example planning
and environmental information.
Additionally the data stored can be
structured or unstructured data, and
include the majority of raw data formats.
• Write the technology and service RFP (as either one or two tenders)
• Manage procurement
• Define the entitlements system
• Assist in setting up the data release policies
• Advise on optimal vendor/service combination
• Assist in contract negotiations with the chosen vendor
Deliver
Through the implementation of the plans
developed by RPS Energy, the challenges
facing government departments are met,
through delivery of the following:
• Implementation plan
• Map out processes
• Document policies and procedures
• Detailed data loading planning
• System roll out
• Testing
• Integration testing
• Training
• Manage Change
Ongoing Audit and support
RPS Energy has a continuing relationship
with many of the governments we have
worked with in the past, ensuring that the
systems put in place continue to function
fully. RPS Energy offers the following
services:
• Validating the security of systems including the entitlements system
• Retendering of systems as and when required
• Validating the processes used to capture and manage data, improving
  them where appropriate
In addition to the services outlined above,
RPS Energy is also involved in advising
governments on inward investment into
E&P. Data is a vital part of such a process
and is therefore integral to a country’s
strategy in attracting investment from the
oil industry.
RPS Energy is a part of RPS Group Plc, a consultancy organisation
employing over 5000 professionals with a unique blend of skills and
experience. We operate worldwide from regional offices in North
America, Europe, Australia and S E Asia.
We have a reputation for successfully meeting the challenges posed by
large complex projects and for providing reliable and practical advice
to clients in all sectors of the economy. RPS Energy conducts business
in an open and fair manner, contributing to society in a positive way.
23456 August 2009
Printed on FSC certified, 100% post consumer recycled paper, bleached
using an elemental chlorine free process.
UK | USA | Canada | Australia | Malaysia | Singapore | Russia
For more information about our
Energy Services please contact:
[email protected]
Data Repository & Customer Portal
With the print data from your customer accounts backed up in
the FMAudit™ Central™
database repository at your location, you are the owner of a
powerful informational resource
that will help you provide better services for your customers,
increase operational
efficiencies and grow your business.
Fuel Your Growth Engine with:
Reports and/or Billing > Connecting Central™ to your
accounting system,
ERP or CRM system, you can generate reports and/or billing
invoices. With meter
synchronization with Onsite™, your accurate and timely
invoices sail through customer
approval with ease, resulting in quicker collection of your
money!
Online Customer Meter Validation > As a customer portal,
Central™ facilitates customer validation of meter readings
online – a convenience for you
and your customers and a huge source of savings for you!
(Note: You can also extract data
as .xls, csv, http and xml files.)
Contract Device Management > You get the data you need to
optimize contract efficiency for both you and your customers –
savings for your customer,
increased revenues for you.
Cross Account Analysis > You can compare account assets by
consumption,
geography, industry, account manager and other variables – spot
trends, manage sales and
data mine for golden opportunities to increase business.
Better Technical Services > Central’s Device Dashboard lets
you see an
easy-to-read virtual representation of critical service alerts –
technical support, toner low, etc.
– your customers get better service, you get better revenues.
How FMAudit™ Central™ Works
Residing at your location, Central™ is a backup database
repository for the print asset and
metering data gathered using any or all of FMAudit’s family of
data collection products:
Viewer USB™ (portable USB device; download collected data
for backup)
Onsite™ (resident on customer network; automatic,
synchronized feed into Central™)
WebAudit™ (internet based data collection; synchronized feed
into Central™)
When Used with Viewer USB™
Central allows you to protect and save valuable meter readings
and audits in one central
repository that stays with you, regardless of where the USB key
goes.
When used with Onsite™
Central™ can synchronize and consolidate as many Onsite™
metering data feeds as you need
without any third party involvement. No ASP! The data is
secure and confidential.
Start revving your growth engine with print data that is
accurate, timely and all yours to explore
– to spur growth and to speed through revenues.
> DEALER PRINCIPALS
> SALES MANAGERS
> OPERATIONS
> SERVICE
> FINANCE
Call: 1-573-632-2461
E-mail: [email protected]
Website: www.FMAudit.com
Minimum Technical Requirements
Customer Workstation or Server
Windows 2000 or higher
Connected to target network
308 East High Street, Suite 109 Jefferson City, MO 65101
Q3 How Do Organizations Use Data Warehouses and Data
Marts to Acquire Data? 305
is shown to be -600. This result, an error, occurred because the
operational data showed that 19,800 units were ordered and 19,800
units were sold. However, the operational data also showed that 600
units were damaged. Clearly, something is wrong, somewhere. It could
be due to a keying mistake by someone on the receiving dock, it could
be that the vendor subsequently shipped replacement items that were
not charged and therefore did not appear in the accounts payable
database that Lucas queried, or it could be due to some other reason.
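The arithmetic behind the -600 figure can be sketched as follows. This is only an illustration, assuming the quantity in question is units ordered minus units sold minus units damaged; the function name `expected_on_hand` is hypothetical and does not come from the text.

```python
def expected_on_hand(ordered: int, sold: int, damaged: int) -> int:
    """Units that should remain, given the three operational totals."""
    return ordered - sold - damaged


remaining = expected_on_hand(ordered=19_800, sold=19_800, damaged=600)
print(remaining)  # -600

# A negative quantity is physically impossible, so treat it as a warning
# that the integrated sources disagree rather than as a real result.
if remaining < 0:
    print("Inconsistent source data: review receiving, sales, and damage records")
```

A check like this does not say which source is wrong; it only flags that the combined figures cannot all be right.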
Such a discrepancy is not unusual for BI analyses. When data are
integrated from several or many different sources, the resulting
collection is frequently inconsistent. The only safeguard against
inaccurate analyses from such inconsistent data is for the analysts
and knowledge workers to know that such inconsistencies are possible,
to be on the lookout for them, and to apply a critical eye to BI
results.
Addison, Drew, and Lucas would use a process similar to that just
discussed to finish their analysis. They would likely add costs to the
data they've already gathered and analyze it so as to produce an
average cost per item for each vendor and other similar results. The
particulars are not important here; just realize they would continue
in a similar vein until they were finished.
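An average cost per item for each vendor could be computed along these lines. This is a minimal sketch: the vendor names, quantities, and costs are made up for illustration, and `average_cost_per_item` is a hypothetical helper, not something the text defines.

```python
from collections import defaultdict

# Hypothetical purchase rows: (vendor, units, total_cost). Values are invented.
purchases = [
    ("Vendor A", 100, 250.0),
    ("Vendor A", 300, 690.0),
    ("Vendor B", 200, 560.0),
]


def average_cost_per_item(rows):
    """Total cost divided by total units, computed separately per vendor."""
    units = defaultdict(int)
    cost = defaultdict(float)
    for vendor, n, total in rows:
        units[vendor] += n
        cost[vendor] += total
    # Skip vendors with zero units to avoid dividing by zero
    return {v: cost[v] / units[v] for v in units if units[v] > 0}


averages = average_cost_per_item(purchases)
for vendor, avg in sorted(averages.items()):
    print(f"{vendor}: {avg:.2f} per item")
```

Dividing summed cost by summed units weights each purchase by its size, which is usually what "average cost per item" is taken to mean.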
At that point, according to the process summary in Figure 9-3, they
would publish their results. Several possibilities exist:
• Print and distribute the results via email or a collaboration tool.
• Publish via a Web server or SharePoint.
• Publish on a BI server.
• Automate the results via a Web service.
We will discuss these alternatives in more detail in Q7. For now, just
realize that GearUp would choose among these alternatives according to
its needs. If the business intelligence is only created to provide
guidance for buyers, Addison and Drew might be content just to print
their results and email them to buyers or share them using a
collaboration tool. As an alternative, they could also produce the
report in HTML and place it on a Web server. As an extension to that
option, they could use SharePoint to publish the results. Although we
didn't discuss them in Chapter 2, SharePoint has extensive features
and functions for BI reporting. Addison and Drew could integrate their
analyses with these features and functions so that users could go to a
SharePoint site for the latest data. Fourth, they could publish via a
BI server, which is a Web server application that is specialized for
publishing BI results. Finally, Lucas might assign a programmer in his
department to create a Web service that would make it possible for
other programs to obtain the BI results programmatically. Most likely,
for their situation, they will print the results and email them or
share them via a collaboration tool.
With this example in mind, we will now discuss each of the elements of
Figure 9-3 in greater detail.
How Do Organizations Use Data
Warehouses and Data Marts to Acquire
Data?
Although it is possible to create basic reports and perform simple
analyses from operational data, this course is not usually
recommended. For reasons of security and control, IS professionals do
not want business analysts like Addison processing operational data.
If Addison makes an error, that error could cause a serious disruption
in GearUp's operations. Also, operational data is structured for fast
and reliable transaction processing. It is seldom structured in a way
that readily supports BI analysis. Finally, BI analyses can require
considerable processing; placing BI applications on operational
servers can dramatically reduce system performance.
306 CHAPTER 9 Business Intelligence Systems
[Figure 9-11: Components of a Data Warehouse. Diagram labels:
Production Databases; Other Internal Data; External Data; Data
Extraction/Cleaning/Preparation Programs; Data Warehouse Metadata;
Data Warehouse Database; Data Warehouse DBMS; Business Intelligence
Tools; Business Intelligence Users]
For these reasons, most organizations extract operational data for BI
processing. For a small organization like GearUp, the extraction may
be as simple as an Access database. Larger organizations, however,
typically create and staff a group of people who manage and run a data
warehouse, which is a facility for managing an organization's BI data.
The functions of a data warehouse are to:
• Obtain data
• Cleanse data
• Organize and relate data
• Catalog data
Figure 9-11 shows the components of a data warehouse. Programs read
production and other data and extract, clean, and prepare that data
for BI processing. The prepared data are stored in a data warehouse
database using a data warehouse DBMS, which can be different from the
organization's operational DBMS. For example, an organization might
use Oracle for its operational processing, but use SQL Server for its
data warehouse. Other organizations use SQL Server for operational
processing, but use DBMSs from statistical package vendors such as SAS
or SPSS in the data warehouse.
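The extract, clean, and store flow described above might look like this in miniature. The sketch uses Python's built-in `sqlite3` module as a stand-in for a data warehouse DBMS purely for convenience; the table layout, the sample rows, and the `clean` and `load_warehouse` helpers are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical operational rows: (order_id, customer, amount_text)
operational_rows = [
    (1, " alice ", "19.99"),
    (2, "BOB", "5.00"),
    (3, "", "7.50"),       # missing customer: rejected during cleaning
    (4, "carol", "n/a"),   # unparseable amount: rejected during cleaning
]


def clean(row):
    """Return a prepared row, or None if the row cannot be repaired."""
    order_id, customer, amount_text = row
    customer = customer.strip().title()
    if not customer:
        return None
    try:
        amount = float(amount_text)
    except ValueError:
        return None
    return (order_id, customer, amount)


def load_warehouse(rows):
    """Clean the extracted rows and store them in a separate database."""
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    prepared = [r for r in map(clean, rows) if r is not None]
    warehouse.executemany("INSERT INTO orders VALUES (?, ?, ?)", prepared)
    warehouse.commit()
    return warehouse


warehouse = load_warehouse(operational_rows)
loaded = warehouse.execute(
    "SELECT customer, amount FROM orders ORDER BY order_id"
).fetchall()
print(loaded)  # [('Alice', 19.99), ('Bob', 5.0)]
```

Keeping the prepared data in its own database reflects the point made above: BI queries then run against the warehouse copy instead of the operational system.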
Data warehouses include data that are purchased from outside sources.
The purchase of data about other companies is not unusual or
particularly concerning from a privacy standpoint. However, some
companies, like Fox Lake, might choose to buy personal, consumer data
(like marital status) from data vendors like Acxiom Corporation.
Figure 9-12 lists some of the consumer data that can be readily
purchased. An amazing (and from a privacy standpoint, frightening)
amount of data is available.
Figure 9-12: Examples of Consumer Data for Sale
• Name, address, phone
• Age
• Gender
• Ethnicity
• Religion
• Income
• Education
• Voter registration
• Home ownership
• Vehicles
• Magazine subscriptions
• Hobbies
• Catalog orders
• Marital status, life stage
• Height, weight, hair and eye color
• Spouse name, birth date
• Children's names and birth dates
Metadata concerning the data (its source, its format, its assumptions
and constraints, and other facts about the data) is kept in a data
warehouse metadata database. The data warehouse DBMS extracts and
provides data to BI applications.
Most operational and purchased data have problems that inhibit their
usefulness for business intelligence. Figure 9-13 lists the major
problem categories. First, although data that are critical for
successful operations must be complete and accurate, data that are
only marginally necessary need not be. For example, some systems
gather demographic data in the ordering process. But, because such
data are not needed to fill, ship, and bill orders, their quality
suffers.
Problematic data are termed dirty data. Examples are a value of B for
customer gender and of 213 for customer age. Other examples are a
value of 999-999-9999 for a U.S. phone number, a part color of gren,
and an email address of [email protected] WhoLAM.org. All of these
values can be problematic for BI purposes.
Purchased data often contain missing elements. Most data vendors state
the percentage of missing values for each attribute in the data they
sell. An organization buys such data because for some uses some data
are better than no data at all. This is especially true for data items
whose values are difficult to obtain, such as Number of Adults in
Household, Household Income, Dwelling Type, and Education of Primary
Income Earner. However, care is required here because for some BI
applications a few missing or erroneous data points can seriously bias
the analysis.
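The per-attribute missing percentage that vendors report can be reproduced on any dataset with a few lines. This sketch assumes missing values are stored as `None`; the household rows and the `missing_percentages` helper are invented for illustration.

```python
def missing_percentages(rows, attributes):
    """Percentage of rows in which each attribute is missing (None)."""
    total = len(rows)
    if total == 0:
        return {}
    return {
        attr: 100.0 * sum(1 for r in rows if r.get(attr) is None) / total
        for attr in attributes
    }


# Hypothetical purchased household data
households = [
    {"adults": 2, "income": 54000, "dwelling": "house"},
    {"adults": None, "income": 61000, "dwelling": "apartment"},
    {"adults": 1, "income": None, "dwelling": None},
    {"adults": 3, "income": None, "dwelling": "house"},
]

report = missing_percentages(households, ["adults", "income", "dwelling"])
print(report)  # {'adults': 25.0, 'income': 50.0, 'dwelling': 25.0}
```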
Inconsistent data, the third problem in Figure 9-13, is particularly
common for data that have been gathered over time. When an area code
changes, for example, the phone number for a given customer before the
change will not match the customer's number after the change.
Likewise, part codes can change, as can sales territories. Before such
data can be used, they must be recoded for consistency over the period
of the study.
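Recoding can be as simple as mapping every old code to its current equivalent before comparing records. In this sketch the area codes and phone numbers are made up, and `AREA_CODE_RECODES` and `recode_phone` are hypothetical names.

```python
# Hypothetical recode table: old area code -> current area code
AREA_CODE_RECODES = {"555": "556"}


def recode_phone(phone: str) -> str:
    """Rewrite a phone number so it uses the current area code."""
    area, rest = phone.split("-", 1)
    return f"{AREA_CODE_RECODES.get(area, area)}-{rest}"


before = "555-010-0123"  # as recorded before the area code changed
after = "556-010-0123"   # the same customer, recorded after the change

print(before == after)                              # False
print(recode_phone(before) == recode_phone(after))  # True
```

The same idea applies to part codes and sales territories: pick one coding as the standard for the study period and translate everything else into it.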
Some data inconsistencies occur from the nature of the business
activity. Consider a Web-based order-entry system used by customers
worldwide. When the Web server records the time of order, which time
zone does it use? The server's system clock time is irrelevant to an
analysis of customer behavior. Coordinated Universal Time (formerly
called Greenwich Mean Time) is also meaningless. Somehow, Web server
time must be adjusted to the time zone of the customer.
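One way to make that adjustment, assuming each order stores a UTC timestamp and the customer's UTC offset is known, is sketched below. The function name and the sample order are hypothetical, and a fixed offset ignores daylight saving time.

```python
from datetime import datetime, timedelta, timezone


def to_customer_time(server_utc: datetime, utc_offset_hours: float) -> datetime:
    """Convert a UTC server timestamp to the customer's local clock time."""
    customer_zone = timezone(timedelta(hours=utc_offset_hours))
    return server_utc.astimezone(customer_zone)


# An order the server logged at 02:30 UTC, placed by a customer at UTC-5
order_utc = datetime(2024, 3, 5, 2, 30, tzinfo=timezone.utc)
local = to_customer_time(order_utc, -5)
print(local.isoformat())  # 2024-03-04T21:30:00-05:00
```

Seen in local time, this order was placed in the evening of the previous day, which is the fact that matters for a study of customer behavior. For named time zones with daylight saving rules, the standard library's `zoneinfo` module is an alternative to fixed offsets.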
Another problem is nonintegrated data. A particular BI analysis might
require data from an ERP system, an e-commerce system, and a social
networking application. Analysts may wish to integrate that
organizational data with purchased consumer data. Such a data
collection will likely have relationships that are not represented in
primary key/foreign key relationships. It is the function of personnel
in the data warehouse to integrate such data, somehow.
Data can also have the wrong granularity, a term that refers to the
level of detail represented by the data. Granularity can be too fine
or too coarse. For the former, suppose we want to analyze the
placement of graphics and controls on an order-entry Web page. It is
possible to capture the customers' clicking behavior in what is termed
clickstream data. Those data, however, include everything the customer
does at the Web site. In the middle of the order stream are data for
clicks on the news, email, instant chat, and a weather check. Although
all of that data may be useful for a study of consumer browsing
behavior, it will be overwhelming if all we want to know is how
customers respond to an ad located differently on the screen. To
proceed, the data analysts must throw away millions and millions of
clicks.
Figure 9-13: Possible Problems with Source Data
• Dirty data
• Missing values
• Inconsistent data
• Data not integrated
• Wrong granularity
  - Too fine
  - Not fine enough
• Too much data
  - Too many attributes
  - Too many data points
Data can also be too coarse. For example, a file of regional sales
totals cannot be used to investigate the sales in a particular store
in a region, and total sales for a store cannot be used to determine
the sales of particular items within a store. Instead, we need to
obtain data that is fine enough for the lowest-level report we want to
produce.
In general, it is better to have too fine a granularity than too
coarse. If the granularity is too fine, the data can be made coarser
by summing and combining. Only analysts' labor and computer processing
are required. If the granularity is too coarse, however, there is no
way to separate the data into constituent parts.
The final problem listed in Figure 9-13 is to have too much data. As
shown in the figure, we can have either too many attributes or too
many data points. Think back to the discussion of tables in Chapter 5.
We can have too many columns or too many rows.
Consider t h e f i r s t p r o b l e m : t o o m a n y a t t r i b u t e
s . Suppose w e w a n t t o k n o w t h e
factors t h a t i n f l u e n c e h o w customers r e s p o n d to a p
r o m o t i o n . I f we c o m b i n e i n t e r n a l
c u s t o m e r d a t a w i t h p u r c h a s e d c u s t o m e r d a t
a , w e w i l l have m o r e t h a n a h u n d r e d
d i f f e r e n t a t t r i b u t e s t o c o n s i d e r . H o w do w e
select a m o n g t h e m ? Because o f a
p h e n o m e n o n c a l l e d t h e curse of dimensionality, t h e
m o r e a t t r i b u t e s t h e r e are,
t h e easier i t is t o b u i l d a m o d e l t h a t fits t h e s a m p l
e data b u t t h a t is w o r t h l e s s as a
p r e d i c t o r . There are o t h e r good reasons f o r r e d u c i
n g the n u m b e r o f a t t r i b u t e s , a n d
one o f t h e m a j o r a c t i v i t i e s i n d a t a m i n i n g c o n
c e r n s e f f i c i e n t a n d effective ways o f
selecting a t t r i b u t e s .
The second way to have too much data is to have too many data points: too
many rows of data. Suppose we want to analyze clickstream data on CNN.com. How
many clicks does that site receive per month? Millions upon millions! In order to
meaningfully analyze such data we need to reduce the amount of data. One good
solution to this problem is statistical sampling. Organizations should not be reluctant
to sample data in such situations.
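As a rough sketch of statistical sampling, the Python below draws a simple random sample from a large list of click records and estimates a proportion from the sample alone. The click records, the 1 percent sampling fraction, and the function name `sample_clicks` are all hypothetical choices for illustration.

```python
import random


def sample_clicks(clicks, fraction, seed=0):
    """Return a simple random sample holding roughly `fraction` of the clicks."""
    rng = random.Random(seed)            # fixed seed so the sketch is repeatable
    sample_size = max(1, int(len(clicks) * fraction))
    return rng.sample(clicks, sample_size)


# Hypothetical clickstream: one click in four lands on the news page.
clicks = [
    {"click_id": i, "page": "news" if i % 4 == 0 else "home"}
    for i in range(100_000)
]

sample = sample_clicks(clicks, 0.01)
share_news = sum(c["page"] == "news" for c in sample) / len(sample)
print(len(sample), round(share_news, 3))
```

The estimate from 1,000 sampled clicks should land close to the true share of 25 percent, at a fraction of the processing cost of reading every row.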
Data Warehouses Versus Data Marts
To understand the difference between data warehouses and data marts, think of a
data warehouse as a distributor in a supply chain. The data warehouse takes data from
the data manufacturers (operational systems and purchased data), cleans and
processes the data, and locates the data on the shelves, so to speak, of the data
warehouse. The people who work with a data warehouse are experts at data management,
data cleaning, data transformation, data relationships and the like. However, they are
not usually experts in a given business function.
A data mart is a data collection, smaller than the data warehouse, that addresses
the needs of a particular department or functional area of the business. If the data
warehouse is the distributor in a supply chain, then a data mart is like a retail store in
a supply chain. Users in the data mart obtain data that pertain to a particular business
function from the data warehouse. Such users do not have the data management
expertise that data warehouse employees have, but they are knowledgeable analysts
for a given business function.
Figure 9-14 illustrates these relationships. The data warehouse takes data from
the data producers and distributes the data to three data marts. One data mart is
used to analyze clickstream data for the purpose of designing Web pages. A second
analyzes store sales data and determines which products tend to be purchased
together. This information is used to train salespeople on the best way to up-sell to
customers. The third data mart is used to analyze customer order data for the
Figure 9-14 Data Mart Examples
The data warehouse (with data warehouse metadata) supplies three data marts:
• Web Sales Data Mart: Web log data; BI tools for Web clickstream analysis; used for Web page design features
• Store Sales Data Mart: store sales data; BI tools for store management; used for market-basket analysis for sales training
• Inventory Data Mart: inventory history data; BI tools for inventory management; used for inventory layout for optimal item picking
purpose of reducing labor for item picking from the warehouse. A company like
Amazon.com, for example, goes to great lengths to organize its warehouses to
reduce picking expenses.
As you can imagine, it is expensive to create, staff, and operate data warehouses
and data marts. Only large organizations with deep pockets can afford to operate a
system like that shown in Figure 9-11. Smaller organizations like GearUp operate
subsets of this system, but they must find ways to solve the basic problems that data
warehouses solve, even if those ways are informal.
How Do Organizations Use Typical
Reporting Applications?
A reporting application is a BI application that inputs data from one or more sources
and applies reporting operations to that data to produce business intelligence. We will
first summarize reporting operations and then illustrate two important reporting
applications: RFM analysis and OLAP.
Reporting applications produce business intelligence using five basic operations:
• Sorting
• Filtering
• Grouping
• Calculating
• Formatting
None of these operations is particularly sophisticated; they can all be accomplished
using SQL and basic HTML or a simple report writing tool.
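To make the five operations concrete, here is a minimal Python sketch that applies each of them to a handful of rows. The rows, the field names (`vendor`, `item`, `shortage`), and the report layout are invented for illustration; the text itself names SQL, HTML, and report writing tools as the usual means.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input rows; field names are assumptions for this sketch.
orders = [
    {"vendor": "V2", "item": "tent", "shortage": 40.0},
    {"vendor": "V1", "item": "stove", "shortage": 0.0},
    {"vendor": "V1", "item": "lamp", "shortage": 25.0},
    {"vendor": "V2", "item": "pack", "shortage": 10.0},
    {"vendor": "V1", "item": "rope", "shortage": 5.0},
]


def build_report(rows):
    # Filtering: keep only rows that actually show a shortage.
    kept = [r for r in rows if r["shortage"] > 0]
    # Sorting: by vendor, then by decreasing shortage within a vendor.
    kept.sort(key=lambda r: (r["vendor"], -r["shortage"]))
    lines = []
    # Grouping: one block of lines per vendor (input must already be sorted).
    for vendor, group in groupby(kept, key=itemgetter("vendor")):
        group = list(group)
        # Calculating: total shortage for the vendor.
        total = sum(r["shortage"] for r in group)
        # Formatting: fixed-width text lines.
        lines.append(f"{vendor}  total shortage: {total:8.2f}")
        for r in group:
            lines.append(f"    {r['item']:<8}{r['shortage']:8.2f}")
    return lines


report = build_report(orders)
print("\n".join(report))
```

Each step is a single short statement, which fits the remark that none of the operations is particularly sophisticated.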
Addison at GearUp used Access to apply all five of these operations in the
preparation of the reports discussed in Q2. Examine, for example, Figure 9-7 (page 301).
The results are sorted and grouped by VendorID and, within a vendor, sorted in
decreasing order by value of SalesShortage. The value of SalesShortage as well as the
A Group Exercise
Do You Have a Club Card?
A data aggregator is a company that obtains data from
public and private sources and stores, combines, and publishes
it in sophisticated ways. When you use your grocery
store club card, the data from your grocery shopping trip are
sold to a data aggregator. Credit card data, credit data, public
tax records, insurance records, product warranty card data,
voter registration data, and hundreds of other types of data are
sold to aggregators.
Not all of the data are identified in the same way (or, in
terms of Chapter 5, not all of it has the same primary key). But,
using a combination of phone number, address, email
address, name, and other partially identifying data, such
companies can integrate that disparate data into an integrated,
coherent whole. They then query, report, and mine the
integrated data to form detailed descriptions about companies,
communities, zip codes, households, and individuals.
As you will learn in Chapter 12, laws limit the types of data
that federal and other governmental agencies can acquire and
store. There are also some legal safeguards on data maintained
by credit bureaus and medical facilities. However, no such laws
limit data storage by most companies (nor are there laws that
prohibit governmental agencies from buying results from data
aggregators).
Acxiom Corporation, a data aggregator with $1.2 billion in
sales in 2009, has been described as the "biggest company you
never heard of." Visit www.acxiom.com and complete the
following tasks:
1. Navigate the Acxiom Web site and make a list of 10 different
products that Acxiom provides.
2. Describe Acxiom's top customers.
3. Examine your answers to items 1 and 2 and describe, in
general terms, the kinds of data that Acxiom must collect
to be able to provide these products to its customers.
4. In what ways might companies like Acxiom need to limit
their marketing so as to avoid a privacy outcry from the
public?
5. According to the Web site, what is Acxiom's privacy policy?
Are you reassured by its policy? Why or why not?
6. Should there be laws governing companies like Acxiom?
Why or why not?
7. Prepare a 3-minute presentation of your answers to items
3, 4, 5, and 6. Give your presentation to the rest of the
class.
Data mining and other
business intelligence systems
are useful, but they are not
without problems, as
discussed in the Guide on
pages 330-331.
How Do Organizations Use Typical
Data Mining Applications?
Data mining is the application of statistical techniques to find patterns and
relationships among data for classification and prediction. As shown in Figure 9-19, data
mining resulted from a convergence of disciplines. Data mining techniques emerged
from statistics and mathematics and from artificial intelligence and machine-learning
fields in computer science. As a result, data mining terminology is an odd blend of
terms from these different disciplines. Sometimes people use the term knowledge
discovery in databases (KDD) as a synonym for data mining.
Data mining techniques take advantage of developments in data management for
processing the enormous databases that have emerged in the last 10 years. Of course,
these data would not have been generated were it not for fast and cheap computers,
and without such computers the new techniques would be impossible to compute.
Figure 9-19 (elements shown converging on data mining: Statistics/Mathematics; Artificial Intelligence and Machine Learning; Cheap Computer Processing and Storage; Huge Databases; Data Management Technology; Marketing, Finance, and Other Business Professionals)
Most data mining techniques are sophisticated, and many are difficult to use well.
Such techniques are valuable to organizations, however, and some business
professionals, especially those in finance and marketing, have become expert in their use. In
fact, today there are many interesting and rewarding careers for business professionals
who are knowledgeable about data mining techniques.
Data mining techniques fall into two broad categories: unsupervised and
supervised. We explain both types in the following sections.
With unsupervised data mining, analysts do not create a model or hypothesis before
running the analysis. Instead, they apply a data mining application to the data and
observe the results. With this method, analysts create hypotheses after the analysis, in
order to explain the patterns found.
One common unsupervised technique is cluster analysis. With it, statistical
techniques identify groups of entities that have similar characteristics. A common use
for cluster analysis is to find groups of similar customers from customer order and
demographic data.
For example, suppose a cluster analysis finds two very different customer groups:
One group has an average age of 33, owns three Android phones and two iPads, has
an expensive home entertainment system, drives a Lexus SUV, and tends to buy
expensive children's play equipment. The second group has an average age of 64,
owns Arizona vacation property, plays golf, and buys expensive wines. Suppose the
analysis also finds that both groups buy designer children's clothing.
These findings are obtained solely by data analysis. There is no prior model about the
patterns and relationships that exist. It is up to the analyst to form hypotheses, after the
fact, to explain why two such different groups are both buying designer children's clothes.
Supervised Data Mining
With supervised data mining, data miners develop a model prior to the analysis and
apply statistical techniques to data to estimate parameters of the model. For example,
suppose marketing experts in a communications company believe that cell phone
usage on weekends is determined by the age of the customer and the number of
months the customer has had the cell phone account. A data mining analyst would
then run an analysis that estimates the impact of customer and account age.
One such analysis, which measures the impact of a set of variables on another
variable, is called a regression analysis. A sample result for the cell phone example is:
CellphoneWeekendMinutes = 12 + (17.5 × CustomerAge)
+ (23.7 × NumberMonthsOfAccount)
Many problems arise with
classification schemes,
especially those that classify
people. The Ethics Guide on
pages 318-319 examines
some of these problems.
Using this equation, analysts can predict the number of minutes of weekend cell
phone use by summing 12, plus 17.5 times the customer's age, plus 23.7 times the
number of months of the account.
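The arithmetic of the sample equation can be written as a small Python function. The coefficients 12, 17.5, and 23.7 come from the sample result above; the function name and the example customer (age 30, account open 10 months) are assumptions for illustration.

```python
def predict_weekend_minutes(customer_age: float, months_of_account: float) -> float:
    """Evaluate the sample regression equation for weekend cell phone minutes."""
    intercept = 12.0
    age_coefficient = 17.5       # minutes added per year of customer age
    months_coefficient = 23.7    # minutes added per month the account has existed
    return (
        intercept
        + age_coefficient * customer_age
        + months_coefficient * months_of_account
    )


# Hypothetical customer: 30 years old, account open for 10 months.
minutes = predict_weekend_minutes(30, 10)
print(round(minutes, 1))  # 12 + 525 + 237
```

Evaluating the equation is the easy part; judging whether the coefficients are trustworthy is a separate statistical question.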
As you will learn in your statistics classes, considerable skill is required to interpret the
quality of such a model. The regression tool will create an equation, such as the one
shown. Whether that equation is a good predictor of future cell phone usage depends on
statistical factors, such as t values, confidence intervals, and related statistical techniques.
Neural networks are another popular supervised data mining application used to
predict values and make classifications such as "good prospect" or "poor prospect"
customers. The term neural networks is deceiving because it connotes a biological
process similar to that in animal brains. In fact, although the original idea of neural
nets may have come from the anatomy and physiology of neurons, a neural network is
nothing more than a complicated set of possibly nonlinear equations. Explaining the
techniques used for neural networks is beyond the scope of this text. If you want to
learn more, search http://kdnuggets.com for the term neural network.
In the next sections, we will describe and illustrate two typical data mining tools
(market-basket analysis and decision trees) and show applications of those techniques.
From this discussion, you can gain a sense of the nature of data mining. These examples
should give you, a future manager, a sense of the possibilities of data mining techniques.
You will need additional coursework in statistics, data management, marketing, and
finance, however, before you will be able to perform such analyses yourself.
Suppose you run a dive shop, and one day you realize that one of your salespeople is
much better at up-selling to your customers. Any of your sales associates can fill a
customer's order, but this one salesperson is especially good at selling customers
items in addition to those for which they ask. One day, you ask him how he does it.
"It's simple," he says. "I just ask myself what is the next product they would want to
buy. If someone buys a dive computer, I don't try to sell her fins. If she's buying a dive
computer, she's already a diver and she already has fins. But, these dive computer
displays are hard to read. A better mask makes it easier to read the display and get the
full benefit from the dive computer."
A market-basket analysis is an unsupervised data mining technique for
determining sales patterns. A market-basket analysis shows the products that customers
tend to buy together. In marketing transactions, the fact that customers who buy
product X also buy product Y creates a cross-selling opportunity; that is, "If they're
buying X, sell them Y" or "If they're buying Y, sell them X."
Figure 9-20 shows hypothetical sales data from 400 sales transactions at a dive
shop. Each cell counts the transactions that included both the row item and the
column item, so the cell where an item's row meets its own column is the total number
of times that item was sold. For example, the 270 in the Mask row under Mask means
that 270 of the 400 transactions included masks. The 120 in the Dive Computer row
under Dive Computer means that 120 of the 400 transactions included dive computers.
We can use these totals to estimate the probability that a customer will purchase
an item. Because 270 of the 400 transactions included masks, we can estimate the
probability that a customer will buy a mask to be 270/400, or .675.
In market-basket terminology, support is the probability that two items will be
purchased together. To estimate that probability, we examine sales transactions and
count the number of times that two items occurred in the same transaction. For the data
in Figure 9-20, fins and masks appeared together 250 times, and thus the support for fins
and a mask is 250/400, or .625. Similarly, the support for fins and weights is 20/400, or .05.
These data are interesting by themselves, but we can refine the analysis by taking
another step and considering additional probabilities. For example, what proportion
of the customers who bought a mask also bought fins? Masks were purchased
270 times, and of those individuals who bought masks, 250 also bought fins. Thus,
given that a customer bought a mask, we can estimate the probability that he or she
will buy fins to be 250/270, or .926. In market-basket terminology, such a conditional
probability estimate is called the confidence.
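Support and confidence can both be computed by counting transactions. The Python sketch below does so for a tiny, made-up list of baskets; the baskets and the function names `support` and `confidence` are assumptions for illustration and are not the data of Figure 9-20.

```python
# Hypothetical transactions; each set holds the items in one sale.
baskets = [
    {"mask", "fins"},
    {"mask", "fins"},
    {"mask", "fins"},
    {"mask", "dive computer"},
    {"tank", "weights"},
    {"tank", "weights"},
    {"fins", "tank"},
    {"dive computer"},
]


def support(transactions, item_a, item_b):
    """Share of all transactions that contain both items."""
    both = sum(1 for t in transactions if item_a in t and item_b in t)
    return both / len(transactions)


def confidence(transactions, given, target):
    """Share of the transactions containing `given` that also contain `target`."""
    with_given = [t for t in transactions if given in t]
    both = sum(1 for t in with_given if target in t)
    return both / len(with_given)


mask_fins_support = support(baskets, "mask", "fins")          # 3 of 8 baskets
fins_given_mask = confidence(baskets, "mask", "fins")         # 3 of 4 mask baskets
print(mask_fins_support, fins_given_mask)
```

Support divides by all transactions, while confidence divides only by the transactions that contain the given item, which is why confidence is the larger of the two here.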
Figure 9-20 Market-Basket Analysis at a Dive Shop
                         Mask          Tank          Fins       Weights Dive Computer
Mask                      270            10           250            10            90
Tank                       10           200            40           130            30
Fins                      250            40           280            20            20
Weights                    10           130            20           130            10
Dive Computer              90            30            20            10           120
Num Trans                 400
Support
Mask                    0.675         0.025         0.625         0.025         0.225
Tank                    0.025           0.5           0.1         0.325         0.075
Fins                    0.625           0.1           0.7          0.05          0.05
Weights                 0.025         0.325          0.05         0.325         0.025
Dive Computer           0.225         0.075          0.05         0.025           0.3
Confidence
Mask                        1          0.05   0.892857143   0.076923077          0.75
Tank              0.037037037             1   0.142857143             1          0.25
Fins              0.925925926           0.2             1   0.153846154   0.166666667
Weights           0.037037037          0.65   0.071428571             1   0.083333333
Dive Computer     0.333333333          0.15   0.071428571   0.076923077             1
Lift (improvement)
Mask                            0.074074074   1.322751323   0.113960114   1.111111111
Tank              0.074074074                 0.285714286             2           0.5
Fins              1.322751323   0.285714286                  0.21978022   0.238095238
Weights           0.113960114             2    0.21978022                 0.256410256
Dive Computer     1.111111111           0.5   0.238095238   0.256410256
Reflect on the meaning of this confidence value. In Figure 9-20, fins appear in 280
of the 400 transactions, so the likelihood of someone walking in the door and buying
fins is 280/400, or .70. But the likelihood of someone buying fins, given that he or she
bought a mask, is .926. Thus, if someone buys a mask, the likelihood that he or she will
also buy fins increases substantially, from .70 to .926. Thus, all sales personnel should
be trained to try to sell fins to anyone buying a mask.
Now consider dive computers and fins. Of the 400 transactions, fins were sold
280 times, so the probability that someone walks into the store and buys fins is .70.
But of the 120 purchases of dive computers, only 20 appeared with fins. So the
likelihood of someone buying fins, given he or she bought a dive computer, is 20/120,
or .1667. Thus, when someone buys a dive computer, the likelihood that she will also
buy fins falls from .70 to .1667.
The ratio of confidence to the base probability of buying an item is called lift. Lift
shows how much the base probability increases or decreases when other products are
purchased. The lift of fins and a mask is the confidence of fins given a mask, divided by
the base probability of fins. In Figure 9-20, the lift of fins and a mask is .926/.70, or 1.32.
Thus, the likelihood that people buy fins when they buy a mask increases by 32 percent.
Surprisingly, it turns out that the lift of fins and a mask is the same as the lift of a mask
and fins. Both are 1.32.
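The symmetry of lift is easy to check with a short Python sketch. The counts below are read from the count table of Figure 9-20 (masks in 270 transactions, fins in 280, both together in 250, out of 400); the function name `lift` and its parameter names are assumptions for illustration.

```python
def lift(pair_count, given_count, target_count, total):
    """Confidence of target given the other item, divided by the target's base probability."""
    confidence = pair_count / given_count
    base_probability = target_count / total
    return confidence / base_probability


TOTAL = 400        # transactions in the figure
MASKS = 270        # transactions that included a mask
FINS = 280         # transactions that included fins
MASK_AND_FINS = 250

lift_fins_given_mask = lift(MASK_AND_FINS, MASKS, FINS, TOTAL)
lift_mask_given_fins = lift(MASK_AND_FINS, FINS, MASKS, TOTAL)
print(round(lift_fins_given_mask, 2), round(lift_mask_given_fins, 2))
```

Both calls reduce to the same expression, total × pair count / (mask count × fins count), so the two lifts are equal whichever item is treated as given.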
We need to be careful here, though, because this analysis only shows shopping
carts with two items. We cannot say from this data what the likelihood is that
customers, given that they bought a mask, will buy both weights and fins. To assess
that probability, we need to analyze shopping carts with three items. This statement
illustrates, once again, that we need to know what problem we're solving before
we start to build the information system to mine the data. The problem definition
will help us decide if we need to analyze three-item, four-item, or some other sized
shopping cart.
Many organizations are benefiting from market-basket analysis today. You can
expect that this technique will become a standard CRM analysis during your career.

More Related Content

Similar to My results.HOW TO CREATEA BROCHURETo print (and preserve) .docx

Mike Brough 2015
Mike Brough 2015Mike Brough 2015
Mike Brough 2015Mike Brough
 
Accelerating Innovation in Energy
Accelerating Innovation in EnergyAccelerating Innovation in Energy
Accelerating Innovation in Energyaccenture
 
Autodesk Sustainability: Progress Report FY2014
Autodesk Sustainability: Progress Report FY2014Autodesk Sustainability: Progress Report FY2014
Autodesk Sustainability: Progress Report FY2014Autodesk
 
Profile Jakob Bent Smed
Profile Jakob Bent SmedProfile Jakob Bent Smed
Profile Jakob Bent SmedJBS Consulting
 
Philip Hodgkinson - Resume April 2016
Philip Hodgkinson - Resume April 2016Philip Hodgkinson - Resume April 2016
Philip Hodgkinson - Resume April 2016Philip Hodgkinson
 
2009 01 12 Cv Jan Pedersen Eng
2009 01 12 Cv Jan Pedersen Eng2009 01 12 Cv Jan Pedersen Eng
2009 01 12 Cv Jan Pedersen EngJan Pedersen
 
HPE Accelerates its Sustainability Goals While Improving the Impact of IT On ...
HPE Accelerates its Sustainability Goals While Improving the Impact of IT On ...HPE Accelerates its Sustainability Goals While Improving the Impact of IT On ...
HPE Accelerates its Sustainability Goals While Improving the Impact of IT On ...Dana Gardner
 
Sustainability Business Proposal Template PowerPoint Presentation Slides
Sustainability Business Proposal Template PowerPoint Presentation SlidesSustainability Business Proposal Template PowerPoint Presentation Slides
Sustainability Business Proposal Template PowerPoint Presentation SlidesSlideTeam
 
I-Byte Energy july 2021
I-Byte Energy july 2021I-Byte Energy july 2021
I-Byte Energy july 2021EGBG Services
 
Sustainability Report 2013/14
Sustainability Report 2013/14Sustainability Report 2013/14
Sustainability Report 2013/14AT&S_IR
 
Progress Reporting 3 (Week 5)Greetings everyone,Thank you fo.docx
Progress Reporting 3 (Week 5)Greetings everyone,Thank you fo.docxProgress Reporting 3 (Week 5)Greetings everyone,Thank you fo.docx
Progress Reporting 3 (Week 5)Greetings everyone,Thank you fo.docxstilliegeorgiana
 
Bse consulting introduction sharepoint(office365)
Bse consulting introduction sharepoint(office365)Bse consulting introduction sharepoint(office365)
Bse consulting introduction sharepoint(office365)Steve Kim
 
Policies to Reduce Air Pollution Levels Project Proposal by Slidesgo.pptx
Policies to Reduce Air Pollution Levels Project Proposal by Slidesgo.pptxPolicies to Reduce Air Pollution Levels Project Proposal by Slidesgo.pptx
Policies to Reduce Air Pollution Levels Project Proposal by Slidesgo.pptxMario
 
7. ict strategy
7. ict strategy7. ict strategy
7. ict strategynealgs69
 
Mi0040 – technology management
Mi0040 – technology managementMi0040 – technology management
Mi0040 – technology managementsmumbahelp
 
Brittani Washington Resume - April 2016
Brittani Washington Resume - April 2016Brittani Washington Resume - April 2016
Brittani Washington Resume - April 2016Brittani Washington
 
Piti Prasertsuksean-CV 201501
Piti Prasertsuksean-CV 201501Piti Prasertsuksean-CV 201501
Piti Prasertsuksean-CV 201501petemail2k
 

Similar to My results.HOW TO CREATEA BROCHURETo print (and preserve) .docx (20)

Mike Brough 2015
Mike Brough 2015Mike Brough 2015
Mike Brough 2015
 
Accelerating Innovation in Energy
Accelerating Innovation in EnergyAccelerating Innovation in Energy
Accelerating Innovation in Energy
 
Drowning in Data
Drowning in DataDrowning in Data
Drowning in Data
 
Autodesk Sustainability: Progress Report FY2014
Autodesk Sustainability: Progress Report FY2014Autodesk Sustainability: Progress Report FY2014
Autodesk Sustainability: Progress Report FY2014
 
Profile Jakob Bent Smed
Profile Jakob Bent SmedProfile Jakob Bent Smed
Profile Jakob Bent Smed
 
Vinod_peddireddy
Vinod_peddireddyVinod_peddireddy
Vinod_peddireddy
 
Philip Hodgkinson - Resume April 2016
Philip Hodgkinson - Resume April 2016Philip Hodgkinson - Resume April 2016
Philip Hodgkinson - Resume April 2016
 
2009 01 12 Cv Jan Pedersen Eng
2009 01 12 Cv Jan Pedersen Eng2009 01 12 Cv Jan Pedersen Eng
2009 01 12 Cv Jan Pedersen Eng
 
HPE Accelerates its Sustainability Goals While Improving the Impact of IT On ...
HPE Accelerates its Sustainability Goals While Improving the Impact of IT On ...HPE Accelerates its Sustainability Goals While Improving the Impact of IT On ...
HPE Accelerates its Sustainability Goals While Improving the Impact of IT On ...
 
Sustainability Business Proposal Template PowerPoint Presentation Slides
Sustainability Business Proposal Template PowerPoint Presentation SlidesSustainability Business Proposal Template PowerPoint Presentation Slides
Sustainability Business Proposal Template PowerPoint Presentation Slides
 
T BYTES IOT & AR
T BYTES IOT & ART BYTES IOT & AR
T BYTES IOT & AR
 
I-Byte Energy july 2021
I-Byte Energy july 2021I-Byte Energy july 2021
I-Byte Energy july 2021
 
Sustainability Report 2013/14
Sustainability Report 2013/14Sustainability Report 2013/14
Sustainability Report 2013/14
 
Progress Reporting 3 (Week 5)Greetings everyone,Thank you fo.docx
Progress Reporting 3 (Week 5)Greetings everyone,Thank you fo.docxProgress Reporting 3 (Week 5)Greetings everyone,Thank you fo.docx
Progress Reporting 3 (Week 5)Greetings everyone,Thank you fo.docx
 
Bse consulting introduction sharepoint(office365)
Bse consulting introduction sharepoint(office365)Bse consulting introduction sharepoint(office365)
Bse consulting introduction sharepoint(office365)
 
Policies to Reduce Air Pollution Levels Project Proposal by Slidesgo.pptx
Policies to Reduce Air Pollution Levels Project Proposal by Slidesgo.pptxPolicies to Reduce Air Pollution Levels Project Proposal by Slidesgo.pptx
Policies to Reduce Air Pollution Levels Project Proposal by Slidesgo.pptx
 
7. ict strategy
7. ict strategy7. ict strategy
7. ict strategy
 
Mi0040 – technology management
Mi0040 – technology managementMi0040 – technology management
Mi0040 – technology management
 
Brittani Washington Resume - April 2016
Brittani Washington Resume - April 2016Brittani Washington Resume - April 2016
Brittani Washington Resume - April 2016
 
Piti Prasertsuksean-CV 201501
Piti Prasertsuksean-CV 201501Piti Prasertsuksean-CV 201501
Piti Prasertsuksean-CV 201501
 

  • 5. Code Phone (704) 555-0125 Fax (704) 555-0145 Web site address ) ( Company Name Street Address Address 2 City, ST ZIP Code Phone (555)555-0125 Fax (555)555-0145 Web site address ) rpsgroup.com/energy National Data Repositories (NDR) define, develop, deliver
  • 6. Introduction rpsgroup.com/energy RPS Energy helps companies develop natural energy resources across the complete asset life cycle, combining our technical and commercial skills with an in-depth knowledge of environmental issues. The expertise within RPS Energy is applied world-wide to a broad range of projects across a number of industry sectors. In each of these areas, we provide our clients with independent flexible support to help them achieve their technical and commercial goals. RPS Energy has major regional offices across the UK, Australia, USA and Canada as well as local offices and agencies in many other areas.
  • 7. Oil and gas projects remain a central part of our work, but we are also world-leaders in advice to windfarm operators and are increasingly involved in other forms of renewable energy. Transferring skills across these sectors is a core capability for RPS Energy. Our clients include governments, NOCs, IOCs, independents and financial institutions, as well as companies in the wider energy industry and other infrastructure and asset owners. Increasingly we operate on projects where the issues surrounding the development energy resources and the preservation of the environment converge. RPS Energy brings a unique combination of such skills to all our projects. RPS Energy, through its acquisition of Paras Consulting, is one of the world’s leading independent consulting companies in the field of Information Systems and Processes. This
  • 8. combination of vendor/service neutrality combined with E&P technical know-how, makes RPS Energy uniquely placed to offer services to governments to implement an NDR. The large system vendors are vital stakeholders in the NDR domain and RPS Energy maintains excellent relationships with these companies. We manage technology procurement projects for both energy companies and governments, providing a unique understanding of the capabilities of the technologies on the market. rpsgroup.com/energy Typically, the role of any governing body with the responsibility for a country’s oil and gas industry, is as the domestic petroleum resource owner, manager, and regulator of the country’s E&P industry. The role represents a number of
  • 9. responsibilities including: n Regulatory compliance n Promotion of inward investment n Cost savings n Long term preservation of data for scientific purposes As investment within a country’s E&P industry increases, so do the volumes and complexity of data. An existing, efficient data repository is an attractive proposition to any exploration company looking for new regions to explore. The challenges facing government departments include: n Improved recognition of data as an asset and as a key enabler for potential
  • 10. new investment n Adding value to current oil and gas assets and realising new opportunities in an environment with business, technological and political fluctuations n Managing increasing multi-sourced volumes of a variety of data in a multi-client environment n Ready availability of high integrity integrated datasets n Continuous monitoring of data quality and improved understanding of integrity
  • 11. of datasets n Improved interaction between the Government and Industry n Improved methods of addressing security and entitlement issues n Improved standards and procedures to ensure preservation and increase in value from the data including regulatory compliance n Establishing an improved strategy for storage and archive of data National Data Repositories An NDR would provide long term storage for future use, ensuring that data retains value into the future.
  • 12. How can we help? rpsgroup.com/energy National Data Repositories Lifecycle Define Develop Deliver Ongoing Audit rpsgroup.com/energy National Data Repositories Lifecycle National Data Repositories Lifecycle Define Develop Deliver Ongoing Audit RPS Credentials We were among the pioneers of the concept of NDR’s through work with energy companies and government in the UK. We managed the design and set up the UK’s Common Data Access (CDA)
  • 13. initiative and have been involved since inception in its continual evolution. Our work as leaders in this field has taken us to Norway, Colombia, Venezuela, Algeria, Peru, Russia, and Romania. In parallel with this NDR work, RPS Energy continues to advise companies in data management strategies. We are therefore able to integrate a given country strategy into a company’s internal data management strategy. This has worked very effectively in Norway. Recent Project Examples Specifically, RPS Energy is experienced in working with a number of government bodies around the world in developing strategies for NDR implementation, and has undertaken a number of projects in this area. These include: n Original design and implementation of the UK Common Data Access (CDA) systems. CDA provides data storage
  • 14. and access systems to a consortium of exploration and production companies in the UK Continental shelf. n Enhancement of the system to provide UKDEAL – the web based information system available on the internet which shows data availability to interested parties. n Development of the commercial and process models for the UK National Hydrocarbon Data Archive (NHDA) which is designed to provide long term storage of data which is of no current interest to the E&P industry. n Development of the Banco de Información Petroleó (BIP) for Ecopetrol in Colombia. The system was designed to hold all of Ecopetrol’s information which it shared with the operating companies in Colombia, plus all the data from within Ecopetrol.
  • 15. n Assisting Perupetro with the definition and tendering for a commercial solution for their petroleum data system. n Carrying out a feasibility study on behalf of the West Australian Government and industry for development of a Petroleum Data Centre supporting both government and industry requirements. Further work was carried out to update the proposed roadmap as a result of ongoing differences between Government and Industry on the way forward. n Working with the Malay Thai Joint Authority (MTJA) to help it manage its interactions with the Joint Development Area (JDA) license holders more effectively. The work includes streamlining decision- making processes, developing data analysis capabilities to allow MTJA to
  • 16. better assess operational activities, and investigating a shared system for managing common data and information. A common document management system and a data analysis technology have been implemented. In addition to our work with national authorities we have extensive experience in developing systems and processes for managing data and information in oil companies. Our clients in this area include: n BP
  • 17. n BG n OMV Group, including Petrom n Shell n Anadarko n Hess n UK DTI Oil and Gas division (now part of DECC) n Maersk Oil n GDF Suez rpsgroup.com/energy Our Approach RPS Energy have a structured approach
  • 18. to undertaking projects of this nature based on a “define, develop, deliver” methodology. Define We are the E&P industry’s leading independent data management consultancy, and have the resource capacity to support large and complex programmes. RPS Energy has no connections with data management products or contracting services, and therefore provide a service focused entirely on adding value to the business. We have an unrivalled reputation for taking on complex problems and breaking them down into manageable elements to form an integrated programme, using a wide range of specialist expertise in the Technical Information Management arena. n Understand the stakeholder drivers (NOC/oil companies/ministry etc)
  • 19. n Set up an effective governance structure n Understand data volumes, owners and locations n Build the business case and assist in promoting the concept n Outline statement of requirements n Assist secure funding Develop The core capability of an NDR is organizing and exploiting large volumes of data covering a comprehensive geographical area to serve a community of users and prospective investors, allowing improved promotion of acreage
  • 20. and improved management of existing acreage. An NDR would therefore be of national strategic importance for the prosperity of the country’s petroleum industry. An NDR would be able to store and provide access through a single portal, reducing the number of applications required to access the information and consolidating data sources into a single framework. The data types included in an NDR would include all exploration, drilling and production activity – both
  • 21. national and private enterprise. However, there is scope to include other nationally important data – for example planning and environmental information. Additionally, the data stored can be structured or unstructured, and can include the majority of raw data formats. Write technology and service RFP (either in one or two tenders); Manage procurement; Define the entitlements system; Assist in setting up the data release
  • 22. policies; Advise on optimal vendor/service combination; Assist in contract negotiations with the chosen vendor. Deliver. Through the implementation of the plans developed by RPS Energy, the challenges facing government departments are met, through delivery of the following: Implementation plan; Map out processes; Document policies and procedures; Detailed data loading planning;
  • 23. System roll out; Testing; Integration testing; Training; Manage change. rpsgroup.com/energy Ongoing Audit and support. RPS Energy has a continuing relationship with many of the governments we have worked with in the past, ensuring that the systems put in place continue to function fully. RPS Energy offers the following services: Validating the security of systems including the entitlements system;
  • 24. Retendering of systems as and when required; Validating the processes used to capture and manage data, improving them where appropriate. In addition to the services outlined above, RPS Energy is also involved in advising governments on inward investment into E&P. Data is a vital part of such a process and is therefore integral to a country’s strategy in attracting investment from the oil industry. [Figure: National Data Repositories Lifecycle: Define, Develop, Deliver, Ongoing Audit] RPS Energy is a part of RPS Group Plc, a consultancy organisation employing over 5000 professionals with a unique blend of skills and experience. We
  • 25. operate worldwide from regional offices in North America, Europe, Australia and SE Asia. We have a reputation for successfully meeting the challenges posed by large complex projects and for providing reliable and practical advice to clients in all sectors of the economy. RPS Energy conducts business in an open and fair manner, contributing to society in a positive way.
  • 29. UK | USA | Canada | Australia | Malaysia | Singapore | Russia. For more information about our Energy Services please contact: [email protected]
  • 30. Data Repository & Customer Portal With the print data from your customer accounts backed up in the FMAudit™ Central™ database repository at your location, you are the owner of a powerful informational resource that will help you provide better services for your customers, increase operational efficiencies and grow your business. Fuel Your Growth Engine with: Reports and/or Billing > Connecting Central™ to your accounting system, ERP or CRM system, you can generate reports and/or billing invoices. With meter synchronization with Onsite™, your accurate and timely invoices sail through customer approval with ease, resulting in quicker collection of your
  • 31. money! Online Customer Meter Validation > As a customer portal, Central™ facilitates customer validation of meter readings online – a convenience for you and your customers and a huge source of savings for you! (Note: You can also extract data as .xls, csv, http and xml files.) Contract Device Management > You get the data you need to optimize contract efficiency for both you and your customers – savings for your customer, increased revenues for you. Cross Account Analysis > You can compare account assets by consumption, geography, industry, account manager and other variables – spot trends, manage sales and data mine for golden opportunities to increase business. Better Technical Services > Central’s Device Dashboard lets
  • 32. you see an easy-to-read virtual representation of critical service alerts – technical support, toner low, etc. – your customers get better service, you get better revenues. How FMAudit™ Central™ Works. Residing at your location, Central™ is a backup database repository for the print asset and metering data gathered using any or all of FMAudit’s family of data collection products: Viewer USB™ (portable USB device; download collected data for backup); Onsite™ (resident on customer network; automatic, synchronized feed into Central™); WebAudit™ (internet based data collection; synchronized feed into Central™). When used with Viewer USB™, Central™ allows you to protect and save valuable meter readings and audits in one central
  • 33. repository that stays with you, regardless of where the USB key goes. When used with Onsite™, Central™ can synchronize and consolidate as many Onsite™ metering data feeds as you need without any third party involvement. No ASP! The data is secure and confidential. Start revving your growth engine with print data that is accurate, timely and all yours to explore – to spur growth and to speed through revenues. > DEALER PRINCIPALS > SALES MANAGERS > OPERATIONS > SERVICE > FINANCE
  • 34. Call: 1-573-632-2461. E-mail: [email protected] Website: www.FMAudit.com. Minimum Technical Requirements: Customer Workstation or Server; Windows 2000 or higher; Connected to target network. 308 East High Street, Suite 109, Jefferson City, MO 65101. Q3 How Do Organizations Use Data Warehouses and Data Marts to Acquire Data? is shown to be -600. This result, an error, occurred because the operational data showed that 19,800 units were ordered and 19,800 units were sold. However, the operational data also showed that 600 units were damaged. Clearly, something is wrong,
  • 35. somewhere. It could be due to a keying mistake by someone on the receiving dock, it could be that the vendor subsequently shipped replacement items that were not charged and therefore did not appear in the accounts payable database that Lucas queried, or it could be due to some other reason. Such a discrepancy is not unusual for BI analyses. When data are integrated from several or many different sources, the resulting collection is frequently inconsistent. The only safeguard against inaccurate analyses from such inconsistent data is for the analysts and knowledge workers to know that such inconsistencies are possible, to be on the lookout for them, and to apply a critical eye to BI results. Addison, Drew, and Lucas would use a process similar to that just discussed to finish their analysis. They would likely add costs to the data they've already gathered and analyze it so as to produce an average cost per item for each vendor and other similar
  • 36. results. The particulars are not important here; just realize they would continue in a similar vein until they were finished. At that point, according to the process summary in Figure 9-3, they would publish their results. Several possibilities exist: print and distribute the results via email or a collaboration tool; publish via a Web server or SharePoint; publish on a BI server; automate the results via a Web service. We will discuss these alternatives in more detail in Q7. For now, just realize that GearUp would choose among these alternatives according to its needs. If the business intelligence is only created to provide guidance for buyers, Addison and Drew might be content just to print their results and email them to buyers or share them using a collaboration tool. As an alternative, they could also produce the report in HTML and
  • 37. place it on a Web server. As an extension to that option, they could use SharePoint to publish the results. Although we didn't discuss them in Chapter 2, SharePoint has extensive features and functions for BI reporting. Addison and Drew could integrate their analyses with these features and functions so that users could go to a SharePoint site for the latest data. Fourth, they could publish via a BI server, which is a Web server application that is specialized for publishing BI results. Finally, Lucas might assign a programmer in his department to create a Web service that would make it possible for other programs to obtain the BI results programmatically. Most likely, for their situation, they will print the results and email them or share them via a collaboration tool. With this example in mind, we will now discuss each of the elements of Figure 9-3 in greater detail. How Do Organizations Use Data Warehouses and Data Marts to Acquire
  • 38. Data? Although it is possible to create basic reports and perform simple analyses from operational data, this course is not usually recommended. For reasons of security and control, IS professionals do not want business analysts like Addison processing operational data. If Addison makes an error, that error could cause a serious disruption in GearUp's operations. Also, operational data is structured for fast and reliable transaction processing. It is seldom structured in a way that readily supports BI analysis.
  • 39. CHAPTER 9 Business Intelligence Systems. [Figure 9-11: Components of a Data Warehouse. Labels: Production Databases; Other Internal Data; External Data; Data Extraction/Cleaning/Preparation Programs; Data Warehouse Metadata; Data Warehouse Database; Data Warehouse DBMS; Business Intelligence Tools; Business Intelligence Users]
  • 40. Finally, BI analyses can require considerable processing; placing BI applications on operational servers can dramatically
  • 41. reduce system performance. For these reasons, most organizations extract operational data for BI processing. For a small organization like GearUp, the extraction may be as simple as an Access database. Larger organizations, however, typically create and staff a group of people who manage and run a data warehouse, which is a facility for managing an organization's BI data. The functions of a data warehouse are to: obtain data; cleanse data; organize and relate data; catalog data. Figure 9-11 shows the components of a data warehouse. Programs read production and other data and extract, clean, and prepare that data for BI processing. The prepared data are stored in a data warehouse database using a data warehouse DBMS, which can be different from the organization's operational DBMS. For
  • 42. example, an organization might use Oracle for its operational processing, but use SQL Server for its data warehouse. Other organizations use SQL Server for operational processing, but use DBMSs from statistical package vendors such as SAS or SPSS in the data warehouse. Data warehouses include data that are purchased from outside sources. The purchase of data about other companies is not unusual or particularly concerning from a privacy standpoint. However, some companies, like Fox Lake, might choose to buy personal, consumer data (like marital status) from data vendors like Acxiom Corporation. Figure 9-12 lists some of the consumer data that can be readily purchased. An amazing (and from a privacy standpoint, frightening) amount of data is available.
  • 43. [Figure 9-12: Examples of Consumer Data for Sale: name, address, phone; age; gender; ethnicity; religion; income; education; voter registration; home ownership; vehicles; magazine subscriptions; hobbies; catalog orders; marital status, life stage; height, weight, hair and eye color; spouse name, birth date; children's names and birth dates]
  • 44. Metadata concerning the data (its source, its format, its assumptions and constraints, and other facts about the data) is kept in a data warehouse metadata database. The data warehouse DBMS extracts and provides data to BI applications. Most operational and purchased data have problems that inhibit their usefulness for business intelligence. Figure 9-13 lists the major problem categories. First, although data that are critical for successful operations must be complete and accurate, data that are only marginally necessary need not be. For example, some systems gather demographic data in the ordering process. But, because such data are not needed to fill, ship, and bill orders, their quality suffers. Problematic data are termed dirty data. Examples are a value of B for customer gender and of 213 for customer age. Other examples are a value of 999-999-9999 for a
  • 45. U.S. phone number, a part color of gren, and an email address of [email protected] WhoLAM.org. All of these values can be problematic for BI purposes. Purchased data often contain missing elements. Most data vendors state the percentage of missing values for each attribute in the data they sell. An organization buys such data because for some uses some data are better than no data at all. This is especially true for data items whose values are difficult to obtain, such as Number of Adults in Household, Household Income, Dwelling Type, and Education of Primary Income Earner. However, care is required here because for some BI applications a few missing or erroneous data points can seriously bias the analysis. Inconsistent data, the third problem in Figure 9-13, is particularly common for data that have been gathered over time. When an area code changes, for example, the phone number for a given customer before the change will
  • 46. not match the customer's number after the change. Likewise, part codes can change, as can sales territories. Before such data can be used, they must be recoded for consistency over the period of the study. Some data inconsistencies occur from the nature of the business activity. Consider a Web-based order-entry system used by customers worldwide. When the Web server records the time of order, which time zone does it use? The server's system clock time is irrelevant to an analysis of customer behavior. Coordinated Universal Time (formerly called Greenwich Mean Time) is also meaningless. Somehow, Web server time must be adjusted to the time zone of the customer. Another problem is nonintegrated data. A particular BI analysis might require data from an ERP system, an e-commerce system, and a social networking application. Analysts may wish to integrate that organizational
  • 47. data with purchased consumer data. Such a data collection will likely have relationships that are not represented in primary key/foreign key relationships. It is the function of personnel in the data warehouse to integrate such data, somehow. Data can also have the wrong granularity, a term that refers to the level of detail represented by the data. Granularity can be too fine or too coarse. For the former, suppose we want to analyze the placement of graphics and controls on an order-entry Web page. It is possible to capture the customers' clicking behavior in what is termed clickstream data. [Figure 9-13: Possible Problems with Source Data: dirty data; missing values; inconsistent data; data not integrated; wrong granularity (too fine, not fine enough); too much data (too many attributes, too many data points)]
  • 48. Those data, however, include everything the customer does at the Web site. In the middle of the order stream are data for clicks on the news, email, instant chat, and a weather check. Although all of that data may be useful for a study of consumer browsing behavior, it will be overwhelming if all we want to know is how customers respond to an ad located differently on the screen. To proceed, the data analysts must throw away millions and millions of clicks. Data can also be too coarse. For example, a file of regional sales totals cannot be used to investigate the sales in a particular store in a region, and total sales for a store cannot be used to determine the sales of particular items within a store. Instead, we need to obtain data that is fine enough for the lowest-level report we want to produce.
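As a rough illustration of the granularity point in the sales example, the sketch below rolls item-level records up to store and region totals. The records, the field order, and the function name `roll_up` are assumptions invented for this example; they do not come from the text.

```python
from collections import defaultdict

# Hypothetical item-level sales records: (region, store, item, amount).
SALES = [
    ("West", "Store A", "tent", 200.0),
    ("West", "Store A", "stove", 50.0),
    ("West", "Store B", "tent", 100.0),
    ("East", "Store C", "stove", 75.0),
]


def roll_up(records, key_len):
    """Sum amounts by the first key_len fields (1 = region, 2 = region and store)."""
    totals = defaultdict(float)
    for record in records:
        key = record[:key_len]      # the coarser grouping key
        totals[key] += record[-1]   # the amount is the last field
    return dict(totals)


region_totals = roll_up(SALES, 1)
store_totals = roll_up(SALES, 2)
print(region_totals)
print(store_totals)
```

Fine-grained records can be summed to any coarser level, but nothing in `region_totals` alone would let an analyst recover the store or item figures.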
  • 49. In general, it is better to have too fine a granularity than too coarse. If the granularity is too fine, the data can be made coarser by summing and combining. Only analysts' labor and computer processing are required. If the granularity is too coarse, however, there is no way to separate the data into constituent parts. The final problem listed in Figure 9-13 is to have too much data. As shown in the figure, we can have either too many attributes or too many data points. Think back to the discussion of tables in Chapter 5. We can have too many columns or too many rows. Consider the first problem: too many attributes. Suppose we want to know the factors that influence how customers respond to a promotion. If we combine internal customer data with purchased customer data, we will have more than a hundred different attributes to consider. How do we
  • 50. select among them? Because of a phenomenon called the curse of dimensionality, the more attributes there are, the easier it is to build a model that fits the sample data but that is worthless as a predictor. There are other good reasons for reducing the number of attributes, and one of the major activities in data mining concerns efficient and effective ways of selecting attributes. The second way to have too much data is to have too many data points, that is, too many rows of data. Suppose we want to analyze clickstream data on CNN.com. How many clicks does that site receive per month? Millions upon millions! In order to meaningfully analyze such data we need to reduce the amount of data. One good solution to this problem is statistical sampling. Organizations should not be reluctant to sample data in such situations. Data Warehouses Versus Data Marts. To understand the difference between data
  • 51. warehouses and data marts, think of a data warehouse as a distributor in a supply chain. The data warehouse takes data from the data manufacturers (operational systems and purchased data), cleans and processes the data, and locates the data on the shelves, so to speak, of the data warehouse. The people who work with a data warehouse are experts at data management, data cleaning, data transformation, data relationships and the like. However, they are not usually experts in a given business function. A data mart is a data collection, smaller than the data warehouse, that addresses the needs of a particular department or functional area of the business. If the data warehouse is the distributor in a supply chain, then a data mart is like a retail store in a supply chain. Users in the data mart obtain data that pertain to a particular business function from the data warehouse. Such users do not have the data management expertise that data warehouse employees have, but they are knowledgeable analysts
  • 52. for a given business function. Figure 9-14 illustrates these relationships. The data warehouse takes data from the data producers and distributes the data to three data marts. One data mart is used to analyze clickstream data for the purpose of designing Web pages. A second analyzes store sales data and determines which products tend to be purchased together. This information is used to train salespeople on the best way to up-sell to customers. The third data mart is used to analyze customer order data for the purpose of reducing labor for item picking from the warehouse. A company like Amazon.com, for example, goes to great lengths to organize its warehouses to reduce picking expenses.
  • 53. [Figure 9-14: Data Mart Examples. Labels: Data Warehouse; Data Warehouse Metadata; Web Sales Data Mart (Web log data; BI tools for Web clickstream analysis; Web page design features); Store Sales Data Mart (store sales data; BI tools for store management; market-basket analysis for sales training); Inventory Data Mart (inventory history data; BI tools for inventory management; inventory layout for optimal item picking)]
  • 54. As you can imagine, it is expensive to create, staff, and operate data warehouses and data marts. Only large organizations with deep pockets can afford to operate a system like that shown in Figure 9-11. Smaller organizations like GearUp operate subsets of this system, but they must find ways to solve the basic problems that data warehouses solve, even if those ways are informal.
  • 55. Q4 How Do Organizations Use Typical Reporting Applications? A reporting application is a BI application that inputs data from one or more sources and applies reporting operations to that data to
  • 56. produce business intelligence. We will first summarize reporting operations and then illustrate two important reporting applications: RFM analysis and OLAP. Reporting applications produce business intelligence using five basic operations: sorting, filtering, grouping, calculating, and formatting. None of these operations is particularly sophisticated; they can all be accomplished using SQL and basic HTML or a simple report writing tool. Addison at GearUp used Access to apply all five of these operations in the preparation of the reports discussed in Q2. Examine, for example, Figure 9-7 (page 301). The results are sorted and grouped by VendorID and, within a vendor, sorted in
  • 57. decreasing order by value of SalesShortage. The value of SalesShortage as well as the A Group Exercise: Do You Have a Club Card? A data aggregator is a company that obtains data from public and private sources and stores, combines, and publishes it in sophisticated ways. When you use your grocery store club card, the data from your grocery shopping trip are sold to a data aggregator. Credit card data, credit data, public tax records, insurance records, product warranty card data, voter registration data, and hundreds of other types of data are sold to aggregators. Not all of the data are identified in the same way (or, in terms of Chapter 5, not all of it has the same primary key). But, using a combination of phone number, address, email address, name, and other partially identifying data, such companies can integrate that disparate data into an integrated,
  • 58. coherent whole. They then query, report, and mine the integrated data to form detailed descriptions about companies, communities, zip codes, households, and individuals. As you will learn in Chapter 12, laws limit the types of data that federal and other governmental agencies can acquire and store. There are also some legal safeguards on data maintained by credit bureaus and medical facilities. However, no such laws limit data storage by most companies (nor are there laws that prohibit governmental agencies from buying results from data aggregators). Acxiom Corporation, a data aggregator with $1.2 billion in sales in 2009, has been described as the "biggest company you never heard of." Visit www.acxiom.com and complete the following tasks: 1. Navigate the Acxiom Web site and make a list of 10 different products that Acxiom provides. 2. Describe Acxiom's top customers. 3. Examine your answers to items 1 and 2 and describe, in general terms, the kinds of data that Acxiom must collect
  • 59. to be able to provide these products to its customers. 4. In what ways might companies like Acxiom need to limit their marketing so as to avoid a privacy outcry from the public? 5. According to the Web site, what is Acxiom's privacy policy? Are you reassured by its policy? Why or why not? 6. Should there be laws governing companies like Acxiom? Why or why not? 7. Prepare a 3-minute presentation of your answers to items 3, 4, 5, and 6. Give your presentation to the rest of the class. Data mining and other business intelligence systems are useful, but they are not without problems, as discussed in the Guide on pages 330-331. How Do Organizations Use Typical Data Mining Applications?
  • 60. Data mining is the application of statistical techniques to find patterns and relationships among data for classification and prediction. As shown in Figure 9-19, data mining resulted from a convergence of disciplines. Data mining techniques emerged from statistics and mathematics and from artificial intelligence and machine-learning fields in computer science. As a result, data mining terminology is an odd blend of terms from these different disciplines. Sometimes people use the term knowledge discovery in databases (KDD) as a synonym for data mining. Data mining techniques take advantage of developments in data management for processing the enormous databases that have emerged in the last 10 years. Of course, these data would not have been generated were it not for fast and cheap computers, and without such computers the new techniques would be impossible to compute.
  • 61. [Figure 9-19: disciplines converging in data mining. Labels: Statistics/Mathematics; Artificial Intelligence; Machine Learning; Cheap Computer Processing and Storage; Huge Databases; Data Management Technology; Marketing, Finance, and Other Business Professionals]
  • 62. Most data mining techniques are sophisticated, and many are difficult to use well. Such techniques are valuable to organizations, however, and some business professionals, especially those in finance and marketing, have become expert in their use. In fact, today there are many interesting and rewarding careers for business professionals who are knowledgeable about data mining techniques. Data mining techniques fall into two broad categories: unsupervised and supervised. We explain both types in the following sections. With unsupervised data mining, analysts do not create a model or hypothesis before running the analysis. Instead, they apply a data mining application to the data and observe the results. With this method, analysts create hypotheses after the analysis, in order to explain the patterns found.
  • 63. One common unsupervised technique is cluster analysis. With it, statistical techniques identify groups of entities that have similar characteristics. A common use for cluster analysis is to find groups of similar customers from customer order and demographic data. For example, suppose a cluster analysis finds two very different customer groups: One group has an average age of 33, owns three Android phones and two iPads, has an expensive home entertainment system, drives a Lexus SUV, and tends to buy expensive children's play equipment. The second group has an average age of 64, owns Arizona vacation property, plays golf, and buys expensive wines. Suppose the analysis also finds that both groups buy designer children's clothing. These findings are obtained solely by data analysis. There is no prior model about the patterns and relationships that exist. It is up to the analyst to form hypotheses, after the
  • 64. fact, to explain why two such different groups are both buying designer children's clothes. Supervised Data Mining. With supervised data mining, data miners develop a model prior to the analysis and apply statistical techniques to data to estimate parameters of the model. For example, suppose marketing experts in a communications company believe that cell phone usage on weekends is determined by the age of the customer and the number of months the customer has had the cell phone account. A data mining analyst would then run an analysis that estimates the impact of customer and account age. One such analysis, which measures the impact of a set of variables on another variable, is called a regression analysis. A sample result for the cell phone example is: CellphoneWeekendMinutes = 12 + (17.5 × CustomerAge) + (23.7 × NumberMonthsOfAccount)
  • 65. Using this equation, analysts can predict the number of minutes of weekend cell phone use by summing 12, plus 17.5 times the customer's age, plus 23.7 times the number of months of the account. As you will learn in your statistics classes, considerable skill is required to interpret the quality of such a model. The regression tool will create an equation, such as the one shown. Whether that equation is a good predictor of future cell phone usage depends on statistical factors, such as t values, confidence intervals, and related statistical techniques. (Many problems arise with classification schemes, especially those that classify people. The Ethics Guide on pages 318-319 examines some of these problems.)
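The sample regression result for weekend cell phone minutes can be written out as a short Python sketch. The coefficients 12, 17.5, and 23.7 come from the sample result in the text; the function name and the example customer are hypothetical and only illustrate how the equation is applied, not how it was estimated.

```python
def predict_weekend_minutes(customer_age: float, months_of_account: float) -> float:
    """Apply the sample regression equation from the text.

    CellphoneWeekendMinutes = 12 + 17.5 * CustomerAge + 23.7 * NumberMonthsOfAccount
    """
    intercept = 12.0            # constant term of the sample equation
    age_coefficient = 17.5      # minutes added per year of customer age
    months_coefficient = 23.7   # minutes added per month the account has existed
    return (intercept
            + age_coefficient * customer_age
            + months_coefficient * months_of_account)


# Hypothetical customer: 30 years old, account open for 10 months.
# 12 + 17.5 * 30 + 23.7 * 10 comes to about 774 minutes.
print(round(predict_weekend_minutes(30, 10), 1))
```

The sketch only evaluates the equation. Judging whether the equation predicts well still requires the statistical measures mentioned in the text.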
  • 66. Neural networks are another popular supervised data mining application used to predict values and make classifications such as "good prospect" or "poor prospect" customers. The term neural networks is deceiving because it connotes a biological process similar to that in animal brains. In fact, although the original idea of neural nets may have come from the anatomy and physiology of neurons, a neural network is nothing more than a complicated set of possibly nonlinear equations. Explaining the techniques used for neural networks is beyond the scope of this text. If you want to learn more, search http://kdnuggets.com for the term neural network. In the next sections, we will describe and illustrate two typical data mining tools, market-basket analysis and decision trees, and show applications of those techniques. From this discussion, you can gain a sense of the nature of data mining. These examples should give you, a future manager, a sense of the
  • 67. possibilities of data mining techniques. You will need additional coursework in statistics, data management, marketing, and finance, however, before you will be able to perform such analyses yourself. Suppose you run a dive shop, and one day you realize that one of your salespeople is much better at up-selling to your customers. Any of your sales associates can fill a customer's order, but this one salesperson is especially good at selling customers items in addition to those for which they ask. One day, you ask him how he does it. "It's simple," he says. "I just ask myself what is the next product they would want to buy. If someone buys a dive computer, I don't try to sell her fins. If she's buying a dive computer, she's already a diver and she already has fins. But these dive computer displays are hard to read. A better mask makes it easier to read the display and get the full benefit from the dive computer."
  • 68. A market-basket analysis is an unsupervised data mining technique for determining sales patterns. A market-basket analysis shows the products that customers tend to buy together. In marketing transactions, the fact that customers who buy product X also buy product Y creates a cross-selling opportunity; that is, "If they're buying X, sell them Y" or "If they're buying Y, sell them X." Figure 9-20 shows hypothetical sales data from 400 sales transactions at a dive shop. The number where an item's row meets its own column is the total number of times that item was sold. For example, the 270 in the Mask row under Mask means that 270 of the 400 transactions included masks. The 120 in the Dive Computer row under Dive Computer means that 120 of the 400 transactions included dive computers. We can use these totals to estimate the probability that a customer will purchase an item. Because 270 of the 400 transactions included masks, we
  • 69. can estimate the probability that a customer will buy a mask to be 270/400, or .675. In market-basket terminology, support is the probability that two items will be purchased together. To estimate that probability, we examine sales transactions and count the number of times that two items occurred in the same transaction. For the data in Figure 9-20, fins and masks appeared together 250 times, and thus the support for fins and a mask is 250/400, or .625. Similarly, the support for fins and weights is 20/400, or .05. These data are interesting by themselves, but we can refine the analysis by taking another step and considering additional probabilities. For example, what proportion of the customers who bought a mask also bought fins? Masks were purchased 270 times, and of those individuals who bought masks, 250 also bought fins. Thus, given that a customer bought a mask, we can estimate the probability that he or she will buy fins to be 250/270, or .926. In market-basket terminology,
  • 70. such a conditional probability estimate is called the confidence.

    Figure 9-20 Market-Basket Analysis at a Dive Shop

    Num Trans: 400

    Transaction counts
                            Mask         Tank         Fins      Weights  Dive Computer
    Mask                     270           10          250           10             90
    Tank                      10          200           40          130             30
    Fins                     250           40          280           20             20
    Weights                   10          130           20          130             10
    Dive Computer             90           30           20           10            120

    Support
                            Mask         Tank         Fins      Weights  Dive Computer
    Mask                   0.675        0.025        0.625        0.025          0.225
    Tank                   0.025          0.5          0.1        0.325          0.075
    Fins                   0.625          0.1          0.7         0.05           0.05
    Weights                0.025        0.325         0.05        0.325          0.025
    Dive Computer          0.225        0.075         0.05        0.025            0.3

    Confidence (row item, given column item)
                            Mask         Tank         Fins      Weights  Dive Computer
    Mask                       1         0.05  0.892857143  0.076923077           0.75
    Tank             0.037037037            1  0.142857143            1           0.25
    Fins             0.925925926          0.2            1  0.153846154    0.166666667
    Weights          0.037037037         0.65  0.071428571            1    0.083333333
    Dive Computer    0.333333333         0.15  0.071428571  0.076923077              1

    Lift (improvement)
                            Mask         Tank         Fins      Weights  Dive Computer
    Mask                       -  0.074074074  1.322751323  0.113960114    1.111111111
    Tank             0.074074074            -  0.285714286            2            0.5
    Fins             1.322751323  0.285714286            -   0.21978022    0.238095238
    Weights          0.113960114            2   0.21978022            -    0.256410256
    Dive Computer    1.111111111          0.5  0.238095238  0.256410256              -

  • 71. Reflect on the meaning of this confidence value. The likelihood of someone walking in the door and buying fins is 280/400, or .7. But the likelihood of someone buying fins, given that he or she bought a mask, is .926. Thus, if someone buys a mask, the likelihood that he or she will also buy fins
  • 72. increases substantially, from .7 to .926. Thus, all sales personnel should be trained to try to sell fins to anyone buying a mask. Now consider dive computers and fins. Of the 400 transactions, fins were sold 280 times, so the probability that someone walks into the store and buys fins is .7. But of the 120 purchases of dive computers, only 20 appeared with fins. So the likelihood of someone buying fins, given he or she bought a dive computer, is 20/120, or .167. Thus, when someone buys a dive computer, the likelihood that she will also buy fins falls from .7 to .167. The ratio of confidence to the base probability of buying an item is called lift. Lift shows how much the base probability increases or decreases when other products are purchased. The lift of fins and a mask is the confidence of fins given a mask, divided by the base probability of fins. In Figure 9-20, the lift of fins and a mask is .926/.7, or 1.32. Thus, the likelihood that people buy fins when they
  • 73. buy a mask increases by 32 percent. Surprisingly, it turns out that the lift of fins and a mask is the same as the lift of a mask and fins. Both are 1.32. We need to be careful here, though, because this analysis only shows shopping carts with two items. We cannot say from this data what the likelihood is that customers, given that they bought a mask, will buy both weights and fins. To assess that probability, we need to analyze shopping carts with three items. This statement illustrates, once again, that we need to know what problem we're solving before we start to build the information system to mine the data. The problem definition will help us decide if we need to analyze three-item, four-item, or some other sized shopping cart. Many organizations are benefiting from market-basket analysis today. You can expect that this technique will become a standard CRM analysis
  • 74. during your career.
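The support, confidence, and lift calculations described for the dive shop can be sketched in a few lines of Python. The counts below are the two-item co-occurrence counts from Figure 9-20, where each diagonal entry is the number of the 400 transactions that included that item (for example, 270 for masks and 280 for fins). The names ITEMS, COUNTS, support, base_probability, confidence, and lift are hypothetical helpers chosen for this illustration, not part of any particular data mining tool.

```python
ITEMS = ["Mask", "Tank", "Fins", "Weights", "Dive Computer"]
NUM_TRANSACTIONS = 400

# COUNTS[i][j] = number of transactions containing both item i and item j.
# The diagonal COUNTS[i][i] is the total number of transactions with item i.
COUNTS = [
    [270, 10, 250, 10, 90],    # Mask
    [10, 200, 40, 130, 30],    # Tank
    [250, 40, 280, 20, 20],    # Fins
    [10, 130, 20, 130, 10],    # Weights
    [90, 30, 20, 10, 120],     # Dive Computer
]


def support(item_a: str, item_b: str) -> float:
    """Estimated probability that both items appear in one transaction."""
    a, b = ITEMS.index(item_a), ITEMS.index(item_b)
    return COUNTS[a][b] / NUM_TRANSACTIONS


def base_probability(item: str) -> float:
    """Estimated probability that a transaction includes the item."""
    i = ITEMS.index(item)
    return COUNTS[i][i] / NUM_TRANSACTIONS


def confidence(item: str, given: str) -> float:
    """Estimated probability of buying `item`, given that `given` was bought."""
    i, g = ITEMS.index(item), ITEMS.index(given)
    return COUNTS[i][g] / COUNTS[g][g]


def lift(item: str, given: str) -> float:
    """Ratio of confidence to the base probability of `item`."""
    return confidence(item, given) / base_probability(item)


print(f"support(Fins, Mask)    = {support('Fins', 'Mask'):.3f}")
print(f"confidence(Fins|Mask)  = {confidence('Fins', 'Mask'):.3f}")
print(f"lift(Fins, Mask)       = {lift('Fins', 'Mask'):.2f}")   # 1.32, as in the figure
print(f"lift(Mask, Fins)       = {lift('Mask', 'Fins'):.2f}")   # the same value
print(f"confidence(Fins|Dive Computer) = {confidence('Fins', 'Dive Computer'):.3f}")
```

The lift is the same in both directions because each version reduces to the support of the pair divided by the product of the two base probabilities. As in the text, this sketch only covers two-item carts; questions about three or more items would need counts taken over larger item sets.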