ASSIGNMENTS
Subject code: MB0036 (4 credits)
Set 1
Marks: 60
SUBJECT NAME: BUSINESS INTELLIGENCE & TOOLS
Note: Each question carries 10 marks.
Q1. Define the term business intelligence tools. Briefly explain how the data from one end gets transformed into information at the
other end.
Ans:
Business intelligence tools. The various tools of this suite are:
•
Data Integration Tools:
These tools extract, transform and load (ETL) the data from the source databases to the target
database. There are two categories: Data Integrator and Rapid Marts. Data Integrator is an
ETL tool with a GUI. Rapid Marts is a packaged ETL with pre-built data models for
reporting and query analysis that makes initial prototype development easy and fast for
ERP applications. The important components of Data Integrator include:
Graphical designer:
This is a GUI used to build and test ETL jobs for data cleansing, validation and auditing.
Data integration server:
This integrates data from different source databases.
Metadata repository:
This repository keeps source and target metadata and the transformation rules.
Administrator:
This is a web-based tool that can be used to start, stop, schedule and monitor ETL jobs.
•
BI Platform:
This platform provides a set of common services to deploy, use and manage the tools and
applications. These services include security, broadcasting, collaboration, metadata
and developer services.
•
Reporting Tools and Query & Analysis Tools:
These tools provide the facility for standard report generation, ad hoc queries and data
analysis.
•
Performance Management Tools:
These tools help in managing the performance of a business by analyzing and tracking key
metrics and goals.
•
Business intelligence tools are a type of application software designed to
help in making better business decisions. These tools aid in the analysis and
presentation of data in a more meaningful way and so play a key role in the strategic planning
process of an organization. They illustrate business intelligence in the areas
of market research and segmentation, customer profiling, customer support,
profitability, and inventory and distribution analysis, to name a few.

•
Various types of BI systems, viz. Decision Support Systems, Executive
Information Systems (EIS), multidimensional analysis software or OLAP (On-Line
Analytical Processing) tools, and data mining tools, are discussed further. Whatever
the type, the business intelligence capability of the system is to let its users slice
and dice the information from their organization's numerous databases without having to wait
for their IT departments to develop complex queries and elicit answers.
•
Although it is possible to build BI systems without the benefit of a data
warehouse, in practice most of these systems are an integral part of the user-facing end of
the data warehouse. In fact, we can hardly think of building a data warehouse
without BI systems. That is the reason the terms 'data warehousing' and
'business intelligence' are sometimes used interchangeably.
•
Figure 1.1 depicts how the data from one end gets transformed into business information at the
other end.
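As a rough illustration of the flow from source data to information, the sketch below extracts rows from a made-up source, transforms them, loads them into an in-memory SQLite table standing in for the target database, and then queries that table. The table name, column names and sample rows are all assumptions made for this example; they do not come from any particular BI suite.

```python
import sqlite3

# Hypothetical rows "extracted" from a source system (assumed sample data).
source_rows = [
    {"region": " north ", "amount": "120.50"},
    {"region": "South", "amount": "80.00"},
    {"region": "NORTH", "amount": "99.50"},
]


def transform(row):
    """Standardize the region name and convert the amount to a number."""
    return (row["region"].strip().title(), float(row["amount"]))


def run_etl(rows):
    """Load the transformed rows into an in-memory target table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)", [transform(r) for r in rows]
    )
    conn.commit()
    return conn


def sales_by_region(conn):
    """Turn the loaded data into information: total sales per region."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return cursor.fetchall()


if __name__ == "__main__":
    print(sales_by_region(run_etl(source_rows)))
```

The three source rows spell the same region in different ways; only after the transform step can the query group them together, which is the point of the "T" in ETL.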
Q2. What do you mean by data warehouse? What are the major concepts and terminology used in the study of
data warehouse?
Ans:
In simple terms, a data warehouse is the repository of an organization's historical data (also
termed the corporate memory). For example, an organization
would use the information stored in its data warehouse to find out on what day of the
week it sold the most gadgets in May 2002, or how many employees were on
sick leave in a specific week.
A data warehouse is a database designed to support decision making in an organization. Here,
the data from various production databases are copied
to the data warehouse so that queries can be run without disturbing the stability or
performance of the production systems.
So the main factor that leads to the use of a data warehouse is that complex queries and
analysis can be run over the
information without slowing down the operational systems.
While operational systems are optimized for simplicity and speed of modification (online transaction processing, or
OLTP), the data warehouse is optimized for reporting and analysis (online analytical processing,
or OLAP). (The concepts of OLTP and OLAP are discussed in later Units.)
Apart from traditional query and reporting, a data warehouse provides
the base for powerful data analysis techniques such as data mining
and multidimensional analysis (discussed in detail in later Units). Making use of these
techniques will result in easier access to the information you need for informed decision making.
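The "best day of the week in May 2002" question can be sketched as a small analytical query over historical records. The sales figures below are invented for illustration, and `best_weekday` is a hypothetical helper rather than part of any warehouse product.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical historical sales records: (sale date, gadgets sold).
sales_history = [
    (date(2002, 5, 3), 40),   # Friday
    (date(2002, 5, 4), 75),   # Saturday
    (date(2002, 5, 11), 60),  # Saturday
    (date(2002, 5, 13), 30),  # Monday
    (date(2002, 6, 1), 500),  # outside May 2002, so it is ignored
]


def best_weekday(records, year, month):
    """Return (weekday name, units sold) for the best-selling weekday
    in the given month, or None if there are no sales in that month."""
    totals = Counter()
    for day, units in records:
        if day.year == year and day.month == month:
            totals[WEEKDAYS[day.weekday()]] += units
    if not totals:
        return None
    return totals.most_common(1)[0]


if __name__ == "__main__":
    print(best_weekday(sales_history, 2002, 5))
```

Because the query reads accumulated history rather than live transactions, it can run against the warehouse copy without touching the operational system.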
Characteristics of a Data Warehouse
According to Bill Inmon, who is considered to be the father of data warehousing, the data in a
data warehouse has the following characteristics:
Subject oriented
The first feature of a DW is its orientation toward the major subjects of the organization instead of
applications. The subjects are categorized in such a way that the subject-wise collection
of information helps in decision-making. For example, the data in the data
warehouse of an insurance company can be organized as customer ID, customer
name, premium, payment period, etc. rather than as auto insurance, life insurance, fire insurance, etc.
Integrated
The data contained within the boundaries of the warehouse are integrated. This means that
all inconsistencies regarding naming conventions and value representations need to be removed in
a data warehouse. For example, one of the applications of an organization might
code gender as 'm' and 'f' and another application might code the same
attribute as '0' and '1'. When the data is moved from the operational
environment to the data warehouse environment, this will result in a conflict unless both are
converted to a single representation.
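A minimal sketch of how the gender-coding conflict might be resolved during loading is given below. The source names `app_a` and `app_b` are hypothetical, and the text does not say which of 0 and 1 stands for which gender, so the mapping chosen here is purely an assumption.

```python
# Assumed source encodings: one application uses 'm'/'f', the other '0'/'1'.
# Which digit means which gender is NOT given in the text; this is a guess
# made only so that the example can run.
GENDER_MAPS = {
    "app_a": {"m": "M", "f": "F"},
    "app_b": {"0": "M", "1": "F"},
}


def standardize_gender(source, value):
    """Convert a source-specific gender code to the warehouse code."""
    mapping = GENDER_MAPS[source]
    key = str(value).strip().lower()
    if key not in mapping:
        raise ValueError(f"unknown gender code {value!r} from {source}")
    return mapping[key]


if __name__ == "__main__":
    print(standardize_gender("app_a", "m"), standardize_gender("app_b", 1))
```

Rejecting unknown codes with an error, rather than passing them through, keeps inconsistent values out of the warehouse.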
Time variant
The data stored in a data warehouse is not the current data. The data is time series
data, as the data warehouse is a place where the data is accumulated periodically.
This is in contrast to the data in an operational system, where the data in the
databases is accurate as of the moment of access.
Non-volatility of the data
The data in the data warehouse is non-volatile, which means the data is stored in a read-only
format and it does not change over a period of time. This is the reason the data in a data
warehouse forms a single source for all decision support system processing.
Keeping the above characteristics in view, a 'data warehouse' can be defined as
a subject-oriented, integrated, non-volatile, time-variant collection of data designed
to support the decision-making requirements of an organization.
Q3. What are the data modeling techniques used in data warehousing environment?
Ans:
There are two data modeling techniques that are relevant in a data
warehousing environment. They are Entity Relationship modeling (ER
modeling) and dimensional modeling.
•
ER modeling produces a data model of the specific area of interest, using two basic concepts:
Entities and the Relationships between them. A detailed ER model may also contain attributes,
which can be properties of either the entities or the relationships. The ER model is an
abstraction tool as it can be used to simplify, understand and
analyze the ambiguous data relationships in the real business world.
•
Dimensional modeling uses three basic concepts:
Facts, Dimensions and Measures. Dimensional
modeling is powerful in representing the requirements of the business user in the context of
database tables and also in the area of data warehousing.
Both ER and dimensional modeling can be used to create an abstract model of a specific subject.
However, each of them has
its own limited set of modeling concepts and associated notation conventions.
Consequently, the techniques seem different, and they are indeed different
in terms of semantic representation. There is much debate as to which method
is better and the conditions under which a specific technique is to be selected. There can be
no definite answer; an understanding of the circumstances and the business
requirements finally leads to the selection of an appropriate technique.
Entity-Relationship (E-R) Modeling
Basic Concepts
MB0036 – Business Intelligence & Tools
An ER model is represented by an ER diagram, which uses three basic graphic
symbols to conceptualize the data: entity, relationship, and attribute.
Entity
An entity is defined to be a person, place, thing, or event of interest to the business
or the organization. It represents a class of objects, which are things in the real business world
that can be observed and classified by their properties and characteristics. In general, an entity
has its own business definition and a clear boundary definition that is required to describe what
is included and what is not.
In a practical modeling project, the team members share a definition
template for integration and a consistent entity definition in the model. In high-level
business modeling, an entity can be very generic, but it must be quite specific in
detailed logical modeling.
There are four entities in the ER diagram (refer to Figure 4.1): PRODUCT, PRODUCT MODEL,
PRODUCT COMPONENT, and COMPONENT. They are represented as rectangles.
Fig. 4.1: A Simple ER Model
The four diagonal lines on the corners of the PRODUCT COMPONENT entity indicate that the
entity is an 'associative entity', which exists to resolve the many-to-many
relationship between two entities. PRODUCT MODEL and COMPONENT are independent of
each other but have a business relationship between them. A PRODUCT MODEL
consists of many components and a component is related to many product models. With this
business rule alone, you cannot tell which components make up a product model. To
do this, you can define a resolving entity. For example, the
PRODUCT COMPONENT entity can provide the information about which
components are related to which product model.
In ER modeling, naming the entities is important for easy understanding and
clear communication. An entity name is expressed grammatically in the form of a noun rather
than a verb, and the criterion for selecting an entity name is
how well the name represents the characteristics and scope of the entity.
Also, defining a unique identifier of an entity is the most critical task. These unique
identifiers are called candidate keys. Among them, you can select the key that is most commonly
used to identify the entity, called the 'primary key'.
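The role of the associative entity can be sketched with three small tables in an in-memory SQLite database. The column names, sample models and sample components are assumptions made for the example; only the three entity names come from the text.

```python
import sqlite3


def build_model():
    """Create PRODUCT MODEL, COMPONENT and the associative entity
    PRODUCT COMPONENT, then load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE product_model (
            model_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE component (
            component_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        -- Associative entity: one row per (model, component) pairing.
        CREATE TABLE product_component (
            model_id INTEGER NOT NULL REFERENCES product_model(model_id),
            component_id INTEGER NOT NULL REFERENCES component(component_id),
            PRIMARY KEY (model_id, component_id)
        );
    """)
    conn.executemany("INSERT INTO product_model VALUES (?, ?)",
                     [(1, "Model A"), (2, "Model B")])
    conn.executemany("INSERT INTO component VALUES (?, ?)",
                     [(10, "Motor"), (11, "Casing"), (12, "Battery")])
    conn.executemany("INSERT INTO product_component VALUES (?, ?)",
                     [(1, 10), (1, 11), (2, 10), (2, 12)])
    return conn


def components_of(conn, model_name):
    """List the components that make up one product model."""
    rows = conn.execute("""
        SELECT c.name
        FROM product_component pc
        JOIN product_model m ON m.model_id = pc.model_id
        JOIN component c ON c.component_id = pc.component_id
        WHERE m.name = ?
        ORDER BY c.name
    """, (model_name,))
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(components_of(build_model(), "Model A"))
```

Here the composite primary key of `product_component` is built from two foreign keys, so each pairing is stored once and the many-to-many rule becomes two one-to-many relationships.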
Relationship
Relationships represent the structural interaction and association among the entities
in a model and they are represented with lines drawn between the two specific entities.
Generally, a relationship is named grammatically by a verb (such as owns, belongs,
and has) and the relationship between the entities can be defined in terms of
the cardinality. Cardinality represents the maximum number of instances of one entity
that are related to a single instance of another entity, and vice versa. Thus the possible cardinalities
include one-to-one (1:1), one-to-many (1:M), and many-to-many (M:M). In a detailed
normalized ER model, an M:M relationship is not shown because it is resolved into an
associative entity.
Attributes
Attributes describe the characteristics or properties of the entities. The Product ID,
Description, and Picture are attributes of the PRODUCT entity in Figure 4.1. The
name of an attribute has to be unique within an entity and should be
self-explanatory to ensure clarity. For example, rather than naming attributes date1 and date2, you may use
the names order date and delivery date. When an instance has no value for
an attribute, the minimum cardinality of the attribute is zero, which means the attribute is either
nullable or optional.
In Figure 4.1, you can see the characters P, m, o, and F, which stand
for primary key, mandatory, optional, and foreign key. The Picture attribute of the
PRODUCT entity is optional, which means it is nullable. A foreign key of an entity is defined to
be the primary key of another entity. In Figure 4.1, the Product ID attribute of the
PRODUCT MODEL entity is a foreign key as it is the primary key of the PRODUCT entity.
These foreign keys are useful in determining the
relationships, such as the referential integrity, between the entities.
Other Concepts
Supertype and Subtype
An entity can have subtypes and supertypes, and the relationship between a supertype entity and
its subtype entity is an IS A relationship. An IS A relationship is used where one entity is
a generalization of several more specialized entities. The supertype and subtype relationship is
represented by a triangle on the relationship. Figure 4.2 shows an example of supertype
and subtype entities in which SALES OUTLET is the supertype of RETAIL STORE and
CORPORATE SALES OFFICE, and RETAIL STORE and
CORPORATE SALES OFFICE are subtypes of SALES OUTLET. Here, each subtype entity
inherits attributes from its supertype entity.
Also, each subtype entity can have its own
distinctive attributes. In this example, Region ID and Outlet ID are
inherited attributes and the subtype entities have their own attributes (such as
number of cash registers and floor space of the RETAIL STORE
subtype). The practical benefit of supertyping and subtyping is that they make a data
model more directly expressive. Just by looking at the ER diagram, you can see that sales outlets
are composed of 'retail stores' and 'corporate sales offices'.
Fig. 4.2: Supertype and Subtype
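The IS A relationship of Figure 4.2 maps naturally onto class inheritance, as the sketch below suggests. The attribute types and the `account_managers` attribute of the corporate sales office are assumptions; the text names only Region ID, Outlet ID, number of cash registers and floor space.

```python
from dataclasses import dataclass


@dataclass
class SalesOutlet:
    """Supertype: attributes shared by every kind of sales outlet."""
    outlet_id: int
    region_id: int


@dataclass
class RetailStore(SalesOutlet):
    """Subtype: inherits outlet_id and region_id, adds its own attributes."""
    cash_registers: int
    floor_space: float


@dataclass
class CorporateSalesOffice(SalesOutlet):
    """Subtype with a hypothetical attribute not named in the text."""
    account_managers: int


if __name__ == "__main__":
    store = RetailStore(outlet_id=1, region_id=7,
                        cash_registers=4, floor_space=250.0)
    # A retail store IS A sales outlet and carries the inherited attributes.
    print(isinstance(store, SalesOutlet), store.region_id)
```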
Other important concepts in the area of ER modeling are 'domain' and 'normalization'.
•
A domain consists of all the possible acceptable values and categories that are allowed for an
attribute. It is the set of all real possible occurrences. The format or data type, such as integer,
date, and character, provides a clear definition of domain. The practical benefit of domain is that
it is imperative for building the data dictionary or repository, and consequently for implementing
the database.
•
Normalization is a process of assigning the attributes to entities in a way that reduces data
redundancy, avoids data anomalies, provides a solid architecture for updating data, and
reinforces the long-term integrity of the data model (the third normal form is usually adequate).
Dimensional Modeling
Dimensional modeling is a relatively new concept compared to ER modeling. This method is
simpler, more expressive, and easier to understand. This technique is mainly aimed at
conceptualizing and visualizing data models as a set of measures that are described by
common aspects of the business. It is useful for summarizing and rearranging the data
and presenting views of the data to support data analysis. Also, the technique focuses on
numeric data, such as values, counts, and weights.
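To make facts, dimensions and measures concrete, the sketch below keeps a tiny fact table of sales and summarizes a numeric measure by an attribute of a chosen dimension. The dimension members, fact rows and the helper `total_by` are all invented for illustration.

```python
from collections import defaultdict

# Dimensions: descriptive context for the facts (assumed sample members).
DIMENSIONS = {
    "product": {
        1: {"category": "Hardware"},
        2: {"category": "Hardware"},
        3: {"category": "Books"},
    },
    "store": {
        100: {"region": "North"},
        101: {"region": "South"},
    },
}

# Facts: each row holds dimension keys plus numeric measures.
SALES_FACTS = [
    {"product": 1, "store": 100, "units": 5, "revenue": 50.0},
    {"product": 2, "store": 100, "units": 2, "revenue": 30.0},
    {"product": 3, "store": 101, "units": 1, "revenue": 12.0},
    {"product": 1, "store": 101, "units": 4, "revenue": 40.0},
]


def total_by(dimension, attribute, measure):
    """Sum one measure, grouped by one attribute of one dimension."""
    totals = defaultdict(float)
    for fact in SALES_FACTS:
        member = DIMENSIONS[dimension][fact[dimension]]
        totals[member[attribute]] += fact[measure]
    return dict(totals)


if __name__ == "__main__":
    print(total_by("product", "category", "revenue"))
    print(total_by("store", "region", "units"))
```

The same fact rows answer both questions; only the dimension used for grouping changes, which is what makes the model convenient for summarizing and rearranging data.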
Q4. Discuss the categories in which data is divided before structuring it into data
warehouse?
Ans:
The Data Warehouses can be divided into two types:
•
Enterprise Data Warehouse
•
Data Mart
Enterprise Data Warehouse
The Enterprise data warehouse consists of the data drawn from multiple operational systems of
an organization. This data warehouse supports time-series and trend
analysis across different business areas of an organization and so can be used for strategic
decision-making. Also, this data warehouse is used to populate various data marts.
Data Mart
As data warehouses contain large amounts of data, organizations often create 'data
marts' that are precise and specific to a department or product line. Thus a data mart is a
physical and logical subset of an Enterprise data warehouse and is also termed a
department-specific data warehouse. Generally, data marts are organized around a single
business process.
There are two types of data marts: independent and dependent. The data is fed directly from
the legacy systems in the case of an independent data mart and from the enterprise
data warehouse in the case of a dependent data mart. In the long run,
dependent data marts are much more stable architecturally than independent data marts.
Advantages and Limitations of a DW System
Use of a data warehouse brings in the following advantages for an organization:
•
End-users can access a wide variety of data.
•
Management can obtain various kinds of trends and patterns of data.
•
A warehouse provides competitive advantage to the company by providing the data and timely information.
•
A warehouse acts as a significant enabler of commercial business applications viz.,
Customer Relationship Management (CRM) applications.
However, the following are the concerns that one has to keep in mind while
using a data warehouse:
•
The scope of a data warehousing project is to be managed carefully to attain the defined content and value.
•
The process of extracting, cleaning and loading the data and finally storing it in a
data warehouse is time-consuming.
•
The problems of compatibility with the existing systems need to be resolved before building
a data warehouse.
•
Security of the data may become a serious issue, especially if the warehouse is web accessible.
•
Building and maintenance of the data warehouse can be handled only through skilled resources and requires huge
investment.

Data Warehouse Concepts and Terminology
Various concepts and the key terms used in the study of data warehouse are provided below.
•
Dashboard:
This is a reporting tool that consolidates, aggregates and arranges measurements and
metrics (measurements compared to a goal) on a single screen so that information can be
monitored at a glance.
•
Data Management:
This is the process of controlling, protecting, and facilitating access to data in order
to provide the end users with timely access to the data they need.
•
Data Mining (or Data Surfing):
This is a technique geared for the typical user who does not know exactly
what they are searching for, but is looking for particular patterns or trends. Data
mining is the process of sifting through large amounts of data to produce data
content relationships. It can predict future trends and behaviors, allowing
businesses to make proactive, knowledge-driven decisions. The most valuable
results from data mining include clustering, classifying, and estimating the things
that occur together. Many kinds of tools play a role in data mining,
including neural networks, decision trees, visualization,
genetic algorithms, fuzzy logic, etc.
•
Data Modeling:
A method used to define and analyze data requirements needed to support the
business functions of an organization.
•
Data Profiling:
Data profiling is a critical step in data migration that automates the
identification of problematic data and metadata, and enables
organizations to correct inconsistencies, redundancies and inaccuracies in their
databases.
•
Data Visualization:
Data visualization involves examining the data represented by dynamic images rather than pure
numbers. These are the techniques that turn the data into information by using the high capacity
of the human brain to visually recognize patterns and trends.
•
Decentralized Warehouse:
A remote data source that users can query or access via a central gateway that provides
a logical view of corporate data in terms that users can understand. The gateway
parses and distributes queries in real time to remote data sources and returns result sets to
users.
•
Drill-down:
This is the capacity to browse the information through a hierarchical structure as
shown below.
•
External Data Source:
This is the data that is not available in the OLTP systems, but is required to
enhance the information quality in the data warehouse. Examples of this data
include the data of competitors, information from regulatory and
government bodies, and research data from professional bodies and universities.
•
Metadata:
Metadata is data about data. The examples of metadata include data element descriptions, data
type descriptions, attribute descriptions, and process descriptions.
•
On-Line Analytical Processing (OLAP):
This is a category of software technology that enables users to gain
insight into data through fast, consistent, interactive access to a wide variety of possible
views of information that has been transformed from raw data to reflect the real dimensionality of
the organization. It is implemented in a multi-user client/server mode and offers consistently
rapid response to queries, regardless of database size and complexity.
This software is also called Multidimensional Analysis Software.
•
On-Line Transaction Processing (OLTP):
This is the way the data is processed by an end user or a computer system. Here, the
data is detail oriented and highly repetitive, with large numbers of updates and changes. The major
task of these systems is to perform on-line transaction and query processing. These systems
cover most of the day-to-day operations of the organization, such as
purchasing, inventory, manufacturing, payroll, banking, accounting and registration.

•
Operational Databases:
These are detail-oriented databases defined to meet the needs of complex processes
of an organization. Here, the data is highly normalized to avoid data
redundancy and double maintenance. A large number of transactions take place every
hour on these databases, so they are always "up to date" and represent a snapshot of
the current situation. In contrast to these databases, there are informational databases
that are stable over a period of time and represent a situation at a specific point in time in the past.
Architecture of a Data Warehouse
The architecture describes the overall system of a data warehouse from various
perspectives, such as data, process, and infrastructure, in order to study the inter-relationships
among various components.
•
The data perspective includes the source and target data structures and so it aids the user
in understanding what data assets are available in a data warehouse and how they are related.
•
The process perspective is primarily concerned with communicating the process and flow of data
from the originating source system through the process of loading the data warehouse and
extracting data from the warehouse.
•
The infrastructure or technology perspective details the various hardware and software products
used to implement the distinct components of the overall system.
Depending upon the specifics of an organizational situation, the following types of
data warehouse architecture can be used:
•
Basic architecture of a Data warehouse
•
Architecture of a Data warehouse with Staging area
•
Architecture of a Data warehouse with Staging area and Data marts
Fig 2.1 shows a simple architecture of a data warehouse wherein the end users directly
access the data derived from several source systems through the data warehouse.

Q 5. Discuss the purpose of executive information system in an organization?
Ans:
An Executive Information System (EIS) is a set of management tools supporting the information and
decision-making needs of management by combining information available within the organisation with external
information in an analytical framework.
EIS are targeted at management's need to quickly assess the status of a business or a section
of a business. These packages are aimed firmly at the type of business user who needs an instant
and up-to-date understanding of critical business information to aid decision making.
The idea behind an EIS is that information can be collated and displayed to the user without manipulation
or further processing. The user can then quickly see the status of their chosen department or
function, enabling them to concentrate on decision making. Generally an EIS is configured to
display data such as order backlogs, open sales, purchase order backlogs, shipments, receipts
and pending orders. This information can then be used to make executive decisions at a strategic
level.
The emphasis of the system as a whole is on the easy-to-use interface and the integration with a
variety of data sources. It offers strong reporting and data mining capabilities which can provide
all the data the executive is likely to need. Traditionally the interface was menu driven with either
reports or text presentation. Newer systems, and especially the newer Business Intelligence
systems which are replacing EIS, have a dashboard or scorecard type display.
Before these systems became available, decision makers had to rely on disparate spreadsheets and reports,
which slowed down the decision making process. Now massive amounts of relevant information
can be accessed in seconds. The two main aspects of an EIS are integration and
visualisation. The newest method of visualisation is the Dashboard and Scorecard.
The Dashboard is one screen that presents key data and organisational information on an almost
real-time and integrated basis. The Scorecard is another one-screen display with measurement
metrics which can give a percentile view of whatever criteria the executive chooses.
Behind these two front-end screens can be an immense data processing infrastructure, or a couple of
integrated databases, depending entirely on the organisation that is using the system. The
backbone of the system is traditional server hardware and a fast network. The EIS software itself
is run from here and presented to the executive over this network. The databases need to be fully
integrated into the system and have real-time connections both in and out. This information then
needs to be collated, verified, processed and presented to the end user, so a real-time connection
into the EIS core is necessary.
Executive Information Systems come in two distinct types: ones that are data driven, and
ones that are model driven. Data driven systems interface with databases and data
warehouses. They collate information from different sources and present it to the user in
an integrated dashboard style screen. Model driven systems use forecasting, simulations and
decision tree like processes to present the data.
As with any emerging and progressive market, service providers are continually improving
their products and offering new ways of doing
business. Modern EIS can also present industry trend information and competitor
behaviour trends if needed. They can filter and analyse data; create graphs, charts and scenario
generations; and offer many other options for presenting data.
There are a number of ways to link decision making to organisational performance. From a
decision maker's perspective these tools provide an excellent way of viewing data. Outcomes
displayed include single metrics, trend analyses, demographics, market shares and a myriad
of other options. The simple interface makes it quick and easy to navigate and call up the
information required.
For a system that seems to offer business so much, it is used by relatively few organisations.
Current estimates indicate that as few as 10% of businesses use EIS. One of the
reasons for this is the complexity of the system and support infrastructure. It is difficult to
create such a system and populate it effectively. Combining all the necessary systems and
data sources can be a daunting task, and seems to put many businesses off implementing it.
The system vendors have addressed this issue by offering turnkey solutions for potential
clients. Companies like Actuate and Oracle are both offering complete out-of-the-box
Executive Information Systems, and these aren't the only ones. Expense is also an issue. Once
the initial cost is calculated, there is the additional cost of support infrastructure, training, and
the means of making the company data meaningful to the system.
Does EIS warrant all of this expense? Green King certainly thinks so. They installed a
Cognos system in 2003 and their first few reports illustrated business opportunities in excess
of £250,000. The AA is also using a Business Objects variant of an EIS and they
expect a return of 300% in three years. (Guardian 31/7/03)
An effective Executive Information System isn't something you can just set up and leave to
do its work. Its success depends on the support and the timely, accurate data it gets
to be able to provide something meaningful. It can provide the information executives need to
make educated decisions quickly and effectively. An EIS can provide a competitive edge
to business strategy that can pay for itself in a very short space of time.
Q6. Discuss the challenges involved in data integration and coordination process?
Ans:
In general, most of the data that the warehouse gets is extracted
from a combination of legacy mainframe systems, old minicomputer
applications, and some client/server systems. But these source systems do not
conform to the same set of business rules. They may often follow different naming
conventions and varied standards for data representation. Thus the process of data integration and
consolidation plays a vital role. Here, data integration includes combining all
relevant operational data into coherent data structures so as to make them ready
for loading into the data warehouse. It standardizes the names and data
representations and resolves the discrepancies. Some of the challenges
involved in the data integration and consolidation process are as follows.
Identification of an Entity
Suppose there are three legacy applications in use in your organization: the
order entry system, the customer service support system, and the
marketing system. Each of these systems might have its own customer file.
Even though most customers will be common to all three
files, the same customer may have a different unique identification number in each file.
As you need to keep a single record for each customer in the data warehouse, you
need to get the transactions of each customer from the various source systems and
match them up before loading them into the data warehouse. This is an entity identification
problem: you do not know which of the customer records relate to the same
customer. The problem is prevalent wherever multiple sources exist for the same entities;
other entities prone to it include vendors, suppliers, employees, and
the various products manufactured by a company.
In the case of three customer files, you have to design
complex algorithms to match records from all three files and form groups of matching
records. This is a difficult exercise. If the matching criterion is too
tight, some records might escape the groups. Similarly, a particular
group may include records of more than one customer if the matching
criterion is too loose. You might also have to involve your
users or the respective stakeholders to understand the transactions accurately.
Some companies tackle this problem in two phases. In the first phase, all
records, irrespective of whether they are duplicates, are assigned unique
identifiers; in the second phase, the duplicates are reconciled periodically, either through
automatic algorithms or manually.
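The two-phase approach can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the field names, the sample rows, the use of `difflib.SequenceMatcher` as the similarity measure, and the 0.85 threshold. It is not a production matching algorithm.

```python
from difflib import SequenceMatcher
from itertools import count


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two normalized strings."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def assign_surrogate_keys(records):
    """Phase 1: give every source record its own warehouse key, duplicates included."""
    keys = count(1)
    return [dict(rec, wh_key=next(keys)) for rec in records]


def reconcile(records, threshold=0.85):
    """Phase 2: group records whose name and city are similar enough.

    A higher threshold is a tighter criterion (true duplicates may escape
    their group); a lower one is looser (different customers may be merged).
    """
    groups = []
    for rec in records:
        for group in groups:
            head = group[0]
            score = (similarity(rec["name"], head["name"])
                     + similarity(rec["city"], head["city"])) / 2
            if score >= threshold:
                group.append(rec)
                break
        else:
            groups.append([rec])
    return groups


# Hypothetical rows from the order entry, customer service and marketing files.
source_rows = [
    {"system": "orders", "id": "C-1001", "name": "Acme Traders Ltd", "city": "Leeds"},
    {"system": "service", "id": "88213", "name": "ACME Traders Ltd.", "city": "Leeds"},
    {"system": "marketing", "id": "M77", "name": "Acme Traders", "city": "Leeds"},
    {"system": "orders", "id": "C-2040", "name": "Brightside Foods", "city": "York"},
]

keyed = assign_surrogate_keys(source_rows)
groups = reconcile(keyed)
for group in groups:
    print([rec["wh_key"] for rec in group])
```

With the sample rows, the three Acme records fall into one group and Brightside Foods stays on its own. Raising the threshold to 0.99 splits the Acme records apart again, which is the "too tight" case described above.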
Existence of Multiple Sources
Another major challenge in data integration and consolidation results
from a single data element having more than one source. For instance, cost values are calculated
and updated at specific intervals in the standard costing application.
Similarly, your order processing application also carries the unit costs for all products.
Thus there are two sources from which to obtain the unit cost of a product, and
their values could vary slightly. Which of these systems should
supply the unit cost stored in the data warehouse becomes an important
question. One easy way of handling this situation is to prioritize the two sources; alternatively, you
may select the source on the basis of the last update date.
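Both resolution strategies can be sketched in a few lines of Python. The source names, the priority order, the cost values and the dates are all hypothetical.

```python
from datetime import date

# Hypothetical source names and priority order; the lower number wins.
SOURCE_PRIORITY = {"standard_costing": 1, "order_processing": 2}


def pick_by_priority(candidates):
    """Choose the value from the highest-priority source that supplied one."""
    return min(candidates, key=lambda c: SOURCE_PRIORITY[c["source"]])


def pick_by_last_update(candidates):
    """Choose the most recently updated value, regardless of source."""
    return max(candidates, key=lambda c: c["updated"])


# Two candidate values for the unit cost of the same product (made-up figures).
unit_cost_candidates = [
    {"source": "standard_costing", "unit_cost": 12.40, "updated": date(2010, 3, 31)},
    {"source": "order_processing", "unit_cost": 12.65, "updated": date(2010, 4, 12)},
]

by_priority = pick_by_priority(unit_cost_candidates)
by_recency = pick_by_last_update(unit_cost_candidates)
print(by_priority["source"], by_priority["unit_cost"])
print(by_recency["source"], by_recency["unit_cost"])
```

The two rules can disagree, as they do for this sample data, so the choice between them has to be made once and recorded with the transformation rules.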
Implementation of Transformation
The implementation of data transformation is a complex exercise. You
may have to go beyond the manual methods, that is, the usual practice of writing conversion
programs while deploying the operational systems. You need to consider several other factors to
decide the methods to be adopted. If you are considering automating the data transformation
functions, you have to identify, configure and install the tools, train the team on these
tools, and integrate them into the data warehouse environment. In practice, a combination of both
methods proves to be effective. The issues you may face in using manual methods and
transformation tools are discussed below.
Manual Methods
These are the traditional methods that have been in practice until recently. These
methods are adequate in the case of smaller data warehouses. They
include manually coded programs and scripts that are mainly executed in the data
staging area. Since these methods call for elaborate coding and testing,
only programmers and analysts who possess specialized knowledge in this
area can produce the programs and scripts.
Although the initial cost may be reasonable, ongoing maintenance may escalate the cost
of these methods. Moreover, these methods are prone to errors. Another disadvantage
concerns the creation of metadata. Even if the in-house programs
record the metadata initially, the metadata needs to be updated every time
the transformation rules change.

Transformation Tools
The difficulties involved in using the manual methods can be
largely eliminated by using the sophisticated and comprehensive set of transformation
tools that are now available. Use of these automated tools certainly improves efficiency
and accuracy. If the inputs provided to the tools are accurate, the rest of the work
is performed efficiently by the tool. So you have to carefully specify the required
parameters, the data definitions and the rules to the transformation tool.
The transformation tools also enable the recording of metadata. When you specify
the transformation parameters and rules, these values are stored as metadata by the tool, and
this metadata becomes a part of the overall metadata component of the data
warehouse. When changes occur to business rules or data definitions, you just have to enter the
changes into the tool and the metadata for the transformations gets adjusted
automatically. However, relying on the transformation tools alone, without any manual
methods, is not practical either.
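The idea that transformation rules double as metadata can be shown with a small Python sketch. It does not reproduce any particular vendor tool; the rule format, the column names and the two transform functions are assumptions made for illustration.

```python
# Named transforms available to the rules (hypothetical examples).
TRANSFORMS = {
    "trim_upper": lambda v: v.strip().upper(),
    "cents_to_units": lambda v: int(v) / 100,
}

# The rules are plain data, so the same list drives the conversion
# and serves as the transformation metadata.
RULES = [
    {"source": "prod_cd", "target": "product_code", "transform": "trim_upper"},
    {"source": "cost_cents", "target": "unit_cost", "transform": "cents_to_units"},
]


def apply_rules(row, rules):
    """Build a target row by applying each rule to its source column."""
    return {r["target"]: TRANSFORMS[r["transform"]](row[r["source"]]) for r in rules}


def describe(rules):
    """Return human-readable metadata derived from the rules themselves."""
    return [f'{r["source"]} -> {r["target"]} via {r["transform"]}' for r in rules]


row = {"prod_cd": " ab-101 ", "cost_cents": "1240"}
result = apply_rules(row, RULES)
print(result)
print(describe(RULES))
```

Because `describe` reads the same `RULES` list that `apply_rules` executes, editing a rule updates the metadata at the same time, which is the behaviour attributed to transformation tools above.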
Transformation for Dimension Attributes
Now we consider the updating of the dimension tables. The dimension tables are more stable in
nature and so are less volatile than the fact tables. The fact tables
change through an increase in the number of rows, but the dimension tables change
mainly through changes to their attributes. For instance, consider a product dimension
table. Every year, rows are added as new models become available. But what about the
attributes within the dimension table? You might face a situation where
the product dimension table changes because a particular product was
moved into a different product category, so
the corresponding values must be changed in
the product dimension table. Though most dimensions are generally constant over a
period of time, they may change slowly.
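The two kinds of change to a product dimension can be sketched in Python: a new row for a new model, and an in-place update when a product moves to another category. The table layout and product names are hypothetical. Note that the overwrite shown here discards the previous category; the text does not say how history should be kept, so that is left out.

```python
# Hypothetical product dimension, keyed by a warehouse product key.
product_dim = {
    101: {"product_name": "Model A", "category": "Compact"},
    102: {"product_name": "Model B", "category": "Compact"},
}


def add_product(dim, key, name, category):
    """A new model becomes available: the dimension gains a row."""
    if key in dim:
        raise KeyError(f"product key {key} already exists")
    dim[key] = {"product_name": name, "category": category}


def change_category(dim, key, new_category):
    """A product is moved to a different category: the attribute is overwritten."""
    dim[key]["category"] = new_category


add_product(product_dim, 103, "Model C", "Family")
change_category(product_dim, 102, "Family")
print(product_dim)
```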


http://www.scribd.com/doc/44415081/MB0036-Business-Intelligence-amp-Tolls-Fall-10
SET 2



Q.1 Explain business development life cycle in detail? [10 Marks]

May 092012


Ans.

The Business development Lifecycle is a methodology adopted for planning, designing,
implementing and maintaining the BI system. Various steps involved in this approach are
depicted below.

Each of the phases in the above life cycle is described below.



Project Planning

Developing a project plan involves identification of all the tasks necessary to implement the BI
project. The Project Manager identifies the key team members, assigns the tasks, and develops
the effort estimates for their tasks. There is much interplay between this activity and the activity
of defining the Business Requirements, because aligning the BI system/data warehouse system with
the business requirements is crucial. Therefore you need to understand the business
requirements properly before proceeding further.



Project Management

This is the phase wherein the actual implementation of the project takes place. The first step here
is to define the business requirements and the implementation is carried out in three phases on
the basis of the requirements. The first phase (includes technical architecture design, selection
and installation of a product) deals with technology, the second phase (includes Dimensional
Modeling, Physical Design, ETL Design & Development) focuses on data and the last phase
(includes BI Application Specification, BI Application Development) deals with design and
development of analytical applications. The steps in these phases are discussed below.



1 Defining the Business Requirements

Business requirements are the bedrock of the BI system and so the Business Requirements
Definition acts as the foundation of the Lifecycle methodology. The business requirements
defined at this stage provide the necessary guidance to make the decisions. This process mainly
includes the following activities:

       Requirements planning
       Collecting the business requirements
       Post-collection documentation and follow-up



2 Technical Architecture Design

Creation of the Technical Architecture includes the following steps:

1. Establishing an Architecture task-force

2. Collecting Architecture-related requirements

3. Documenting the Architecture requirements

4. Developing a high-level Architectural model

5. Designing and specifying the subsystems

6. Determining Architecture implementation phases

7. Documenting the technical Architecture

8. Reviewing and finalizing the Architecture



3 Selection and Installation of a Product

The selection and installation of a business intelligence product are carried out in the following
steps:

1. Understanding the corporate purchasing process

2. Developing a product evaluation matrix

3. Conducting market research

4. Shortlisting the options and performing detailed evaluations

5. Conducting a prototype (if necessary)

6. Selecting a product, installing on trial, and negotiating the value/price
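A product evaluation matrix (step 2) is often a weighted scoring table. A minimal Python sketch follows; the criteria, the integer weights, the product names and the scores are all hypothetical.

```python
# Hypothetical criteria with integer weights (higher means more important).
WEIGHTS = {"functionality": 4, "cost": 3, "vendor_support": 3}

# Hypothetical scores on a 1-10 scale for two shortlisted products.
SCORES = {
    "Product A": {"functionality": 8, "cost": 6, "vendor_support": 7},
    "Product B": {"functionality": 7, "cost": 9, "vendor_support": 6},
}


def weighted_total(scores, weights):
    """Sum each criterion score multiplied by its weight."""
    return sum(scores[criterion] * weight for criterion, weight in weights.items())


totals = {name: weighted_total(s, WEIGHTS) for name, s in SCORES.items()}
best = max(totals, key=totals.get)
print(totals, best)
```

The totals only rank the shortlist; the detailed evaluation, prototype and price negotiation in steps 4 to 6 still decide the final choice.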



4 Dimensional Modeling

A dimensional model packages the data in a symmetric format whose design goals are
user understandability, query performance, and resilience to change. In this step, a data-modeling
team is formed and design workshops are conducted to create the dimensional model. Once the
modeling team is confident of the model prepared, the model is demonstrated to and validated with
a broader audience and then documented.

5 Physical Design

In this step, the dimensional model created in the previous step is translated into a physical
design. The physical model includes details such as the physical database structure, data types,
key declarations, and the permissibility of nulls.
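As a sketch of what such a physical design might specify, the following Python snippet uses the standard `sqlite3` module to declare one dimension table and one fact table. The table names, column names and types are assumptions chosen for illustration, and a real warehouse would use its own DBMS and naming standards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE product_dim (
    product_key     INTEGER PRIMARY KEY,   -- key declaration
    product_name    TEXT NOT NULL,         -- nulls not permitted
    category        TEXT NOT NULL,
    discontinued_on TEXT                   -- nulls permitted
);
CREATE TABLE sales_fact (
    product_key INTEGER NOT NULL REFERENCES product_dim(product_key),
    sale_date   TEXT    NOT NULL,
    quantity    INTEGER NOT NULL,
    amount      REAL    NOT NULL
);
""")


def is_rejected(sql):
    """Return True if the database refuses the statement on a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


conn.execute("INSERT INTO product_dim VALUES (1, 'Model A', 'Compact', NULL)")
print(is_rejected("INSERT INTO product_dim VALUES (2, NULL, 'Compact', NULL)"))
print(is_rejected("INSERT INTO sales_fact VALUES (99, '2010-05-01', 1, 9.5)"))
```

The first rejected statement violates the NOT NULL rule on `product_name`; the second refers to a product key that does not exist in the dimension table.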

6 ETL Design & Development

ETL stands for Extraction, Transformation, and Loading. ETL tools are used to extract the data
from the operational data sources, transform it, and load it into a data warehouse.
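The three stages can be sketched end to end in Python using only the standard library. The CSV layout, the cleaning rules and the target table are hypothetical; an ETL tool would add scheduling, error handling and metadata capture on top of this.

```python
import csv
import io
import sqlite3

# Hypothetical extract from an operational order system.
RAW = """order_id,product,qty,unit_price
1, widget ,2,9.50
2,Gadget,1,20.00
"""


def extract(text):
    """Extract: read the source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean product names, cast types, derive the sale amount."""
    return [
        (int(r["order_id"]), r["product"].strip().title(),
         int(r["qty"]), int(r["qty"]) * float(r["unit_price"]))
        for r in rows
    ]


def load(conn, rows):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact ("
        "order_id INTEGER PRIMARY KEY, product TEXT NOT NULL, "
        "qty INTEGER NOT NULL, amount REAL NOT NULL)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW)))
total = conn.execute("SELECT SUM(amount) FROM sales_fact").fetchone()[0]
print(total)
```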



7 BI Application Specification

In this step, a set of analytical applications is identified for building a BI system based on the
business requirements definition, the type of data being used, and the proposed architecture of
the warehouse.

8 BI Application Development

This is the step wherein a specific application (tool) is selected from the identified applications
for actual implementation of the BI system.

9 Deployment

This is the step wherein the technology, data and analytical application tracks are converged. The
completion of this step can be assumed as the completion of actual building of the BI system.

10 Maintenance & Growth

During this step, the project team provides user support to the end-users of the system. The
team is also involved in providing the technical support required for the system so as to ensure
its continuous utilization. This step may also include making some minor
enhancements to the BI system.

Revising the Project Planning

As the project progresses, the project manager has to revise the project plan to
accommodate new business interests and the concerns raised by the end-users.




http://www.scribd.com/doc/75437915/MI0036-SET-1-amp-SET-2

http://www.scribd.com/doc/75437878/MI0034-SET-1-amp-SET-2

http://www.scribd.com/santosh143hsv143

Bi assignment

  • 1. ASSIGNMENTSSubject code: MB0036(4 credits)Set 1Marks 60SUBJECT NAME: BUSINESS INTELLIGENCE & TOOLSNote: Each Question carries 10 marksQ1. Define the term business intelligence tools? Briefly explain how the data from oneend gets transformed into information at the other end?Ans: Business intelligence tools. The various tools of this suite are: • Data Integration Tools: These tools extract, transform and load the data from the source databases to the target database. There are two categories; Data Integrator andRapid Marts. Data Integrator is an ETL tool with a GUI. Rapid Marts is a packagedETL with pre -built data models for reporting and query analysis that makes initial prototype development easy and fast for ERP applications.The important components of Data Integrator include; Graphicaldesigner: This is a GUI used to build and test ETL jobs for data cleansing,validation and auditing. Data integration server: This integrates data from different source databases. Metadata repository: T h i s r e p o s i t o r y k e e p s s o u r c e a n d t a r g e t m e t a d a t a a n d t h e transformation rules. Administrator: This is a web-based tool that can be used to start, stop, schedule andmonitor ETL jobs. • BI Platform: This platform provides a set of common services to deploy, use andmanage the tools and applications. These services include providing the security, broadcasting, collaboration, metadata and developer services. • Reporting Tools and Query & Analysis Tools : These tools provide the facility for standard reports generation, ad hoc queries and data analysis. • Performance Management Tools: These tools help in managing the performance of a business by analyzing and tracking key metrics and goals. • B u s i n e s s i n t e l l i g e n c e t o o l s a r e a t yp e o f a p p l i c a t i o n s o f t w a r e d e s i g n e d t o h e l p i n making better business decisions. 
These tools aid in the analysis and presentation of data in a more meaningful way and so play a key role in the strategic planning process o f a n o r g a n i z a t i o n . T h e y i l l u s t r a t e b u s i n e s s i n t e l l i g e n c e i n t h e a r e a s o f m a r k e t research and segmentation, customer profiling, customer support, profitability, andinventory and distribution analysis to name a few. • Various types of BI systems viz. Decision Support Systems, Executive InformationSystems (EIS), Multidimensi onal Analysis software or OLAP (On-Line AnalyticalProcessing) tools, data mining tools are discussed further. Whatever is the type, theBusiness Intelligence capabilities of the system is to let its users slice
  • 2. and dice theinformation from their organization‟s numerous databases without having to wait for their IT departments to develop complex queries and elicit answers. • Although it is possible to build BI systems without the benefit of a d a t a w a r e h o u s e , m o s t o f t h e s ys t e m s a r e a n i n t e g r a l p a r t o f t h e user-facing end of the data warehouse in practice. In fact, we can never think of building a data warehouse without BI Systems. That isthe reason; sometimes, the words „data warehousing‟ and „businessintelligence‟ are being used interchangeably. • Figure 1.1 depicts how the data from one end gets transformed to information at the other end for business information. • Q2. What do you mean by data ware house? What are the major concepts andterminology used in the study of data warehouse?Ans: In simple terms, a data warehouse is the repository of an organization‟s historical data( a l s o termed as the corporate memory). For example, an organization w o u l d g e t t h e information that is stored in its data warehouse to find out what day of the week they sold themost number of gadgets in May 2002, or how employees were on sick leave for a specificweek.A data warehouse is a database designed to support decision making in an organization. Here,the data from various production databases are copied to the data warehouse so that queriescan be forwarded without disturbing the stability or performance of the production systems.S o t h e m a i n f a c t o r t h a t l e a d s t o t h e u s e o f a d a t a w a r e h o u s e i s t h a t c o m p l e x q u e r i e s a n d analysis can be obtained over the information without slowing down the operational systems.While operational systems are optimized for simplicity and speed of modification (online transaction processing, or OLTP), the data warehouse is optimized for reporting and analysis(online analytical processing, or OLAP). 
The concepts of OLTP and OLAP are discussed in later Units.
Apart from traditional query and reporting, a data warehouse provides the base for powerful data analysis techniques such as data mining and multidimensional analysis (discussed in detail in later Units). Making use of these techniques will result in easier access to the information you need for informed decision making.
Characteristics of a Data Warehouse
According to Bill Inmon, who is considered to be the father of data warehousing, the data in a data warehouse has the following characteristics:
Subject oriented
The first feature of a DW is its orientation toward the major subjects of the organization instead of applications. The subjects are categorized in such a way that the subject-wise collection of information helps in decision-making. For example, the data in the data warehouse of an insurance company can be organized as customer ID, customer name, premium, payment period, etc. rather than as auto insurance, life insurance, fire insurance, etc.
Integrated
  • 3. The data contained within the boundaries of the warehouse are integrated. This means that all inconsistencies in naming conventions and value representations need to be removed in a data warehouse. For example, one application in an organization might code gender as 'm' and 'f' while another codes the same attribute as '0' and '1'. When the data is moved from the operational environment to the data warehouse environment, this results in a conflict that has to be resolved.
Time variant
The data stored in a data warehouse is not the current data. It is time-series data, as the data warehouse is a place where data is accumulated periodically. This is in contrast to an operational system, where the data in the databases is accurate as of the moment of access.
Non-volatility of the data
The data in the data warehouse is non-volatile, which means the data is stored in a read-only format and does not change over time. This is the reason the data in a data warehouse forms a single source for all decision support processing.
Keeping the above characteristics in view, a 'data warehouse' can be defined as a subject-oriented, integrated, non-volatile, time-variant collection of data designed to support the decision-making requirements of an organization.
Q3. What are the data modeling techniques used in data warehousing environment?
Ans: There are two data modeling techniques that are relevant in a data warehousing environment. They are Entity Relationship modeling (ER modeling) and dimensional modeling.
• ER modeling produces a data model of the specific area of interest, using two basic concepts: Entities and the Relationships between them. A detailed ER model may also contain attributes, which can be properties of either the entities or the relationships.
The ER model is an abstraction tool, as it can be used to simplify, understand and analyze the ambiguous data relationships in the real business world.
• Dimensional modeling uses three basic concepts: Facts, Dimensions and Measures. Dimensional modeling is powerful in representing the requirements of the business user in the context of database tables and also in the area of data warehousing.
Both ER and dimensional modeling can be used to create an abstract model of a specific subject. However, each of them has its own limited set of modeling concepts and associated notation conventions. Consequently, the techniques seem different, and they are indeed different in terms of semantic representation. There is much debate as to which method is better and the conditions under which a specific technique is to be selected. There can be no definite answer; an understanding of the circumstances and the business requirements finally leads to the selection of an appropriate technique.
Entity-Relationship (E-R) Modeling
Basic Concepts
  • 4. An ER model is represented by an ER diagram, which uses three basic graphic symbols to conceptualize the data: entity, relationship, and attribute.
Entity
An entity is defined to be a person, place, thing, or event of interest to the business or the organization. It represents a class of objects, which are things in the real business world that can be observed and classified by their properties and characteristics. In general, an entity has its own business definition and a clear boundary definition that is required to describe what is included and what is not.
In a practical modeling project, the team members share a definition template for integration and a consistent entity definition in the model.
In high-level business modeling, an entity can be very generic, but it must be quite specific in detailed logical modeling. There are four entities, PRODUCT, PRODUCT MODEL, PRODUCT COMPONENT, and COMPONENT, in the ER diagram (refer to Figure 4.1), and they are represented as rectangles.
Fig. 4.1: A Simple ER Model
  • 5. The four diagonal lines on the corners of the PRODUCT COMPONENT entity indicate that the entity is an 'associative entity', which serves to resolve the many-to-many relationship between two entities. PRODUCT MODEL and COMPONENT are independent of each other but have a business relationship between them. A PRODUCT MODEL consists of many components and a component is related to many product models. With this
  • 6. business rule, you cannot tell which components make up a product model. To do this, you can define a resolving entity. For example, the PRODUCT COMPONENT entity can provide the information about which components are related to which product model.
In ER modeling, naming the entities is important for easy understanding and clear communication. An entity name is expressed grammatically as a noun rather than a verb, and the criteria for selecting it depend on how well the name represents the characteristics and scope of the entity. Also, defining a unique identifier for an entity is the most critical task. These unique identifiers are called candidate keys. Among them, you can select the key that is most commonly used to identify the entity, called the 'primary key'.
Relationship
Relationships represent the structural interaction and association among the entities in a model, and they are represented by lines drawn between the two specific entities. Generally, a relationship is named grammatically by a verb (such as owns, belongs, and has), and the relationship between the entities can be defined in terms of cardinality. Cardinality represents the maximum number of instances of one entity that are related to a single instance of another entity, and vice versa. Thus the possible cardinalities include one-to-one (1:1), one-to-many (1:M), and many-to-many (M:M). In a detailed normalized ER model, no M:M relationship is shown, because each is resolved to an associative entity.
Attributes
Attributes describe the characteristics or properties of the entities. ProductID, Description, and Picture are attributes of the PRODUCT entity in Figure 4.1.
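The four entities and the associative entity described above can be sketched as relational tables. This is a minimal illustration, not a reproduction of Figure 4.1: apart from ProductID, Description and Picture on PRODUCT, and the Product ID foreign key on PRODUCT MODEL, every column name and every sample row here is an assumption made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript(
    """
    CREATE TABLE product (
        product_id  INTEGER PRIMARY KEY,   -- primary key
        description TEXT NOT NULL,         -- assumed mandatory
        picture     BLOB                   -- optional, so nullable
    );
    CREATE TABLE product_model (
        model_id   INTEGER PRIMARY KEY,    -- hypothetical key name
        product_id INTEGER NOT NULL REFERENCES product(product_id)
    );
    CREATE TABLE component (
        component_id INTEGER PRIMARY KEY,  -- hypothetical key name
        name         TEXT NOT NULL
    );
    -- Associative entity: one row per (model, component) pair resolves the M:M relationship.
    CREATE TABLE product_component (
        model_id     INTEGER NOT NULL REFERENCES product_model(model_id),
        component_id INTEGER NOT NULL REFERENCES component(component_id),
        PRIMARY KEY (model_id, component_id)
    );
    """
)

conn.execute("INSERT INTO product VALUES (1, 'Widget', NULL)")
conn.executemany("INSERT INTO product_model VALUES (?, ?)", [(10, 1), (11, 1)])
conn.executemany("INSERT INTO component VALUES (?, ?)", [(100, "bolt"), (101, "motor")])
conn.executemany(
    "INSERT INTO product_component VALUES (?, ?)",
    [(10, 100), (10, 101), (11, 100)],
)


def components_of(connection, model_id):
    """List the component names that make up one product model."""
    rows = connection.execute(
        """
        SELECT c.name
        FROM product_component AS pc
        JOIN component AS c ON c.component_id = pc.component_id
        WHERE pc.model_id = ?
        ORDER BY c.name
        """,
        (model_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(components_of(conn, 10))
```

Because the foreign keys are enforced, an attempt to link a component to a model that does not exist is rejected, which is one way the referential integrity between entities shows up in practice.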
The name of an attribute has to be unique within an entity and should be self-explanatory to ensure clarity. For example, rather than naming attributes date1 and date2, you may use the names order date and delivery date. When an instance has no value for an attribute, the minimum cardinality of the attribute is zero, which means it is either nullable or optional.
In Figure 4.1, you can see the characters P, m, o, and F, which stand for primary key, mandatory, optional, and foreign key. The Picture attribute of the PRODUCT entity is optional, which means it is nullable. A foreign key of an entity is defined to be the primary key of another entity. In Figure 4.1, the Product ID attribute of the PRODUCT MODEL entity is a foreign key, as it is the primary key of the PRODUCT entity. These foreign keys are useful in determining relationships such as the referential integrity between the entities.
Other Concepts
Supertype and Subtype
An entity can have subtypes and supertypes, and the relationship between a supertype entity and its subtype entity is an IS A relationship. An IS A relationship is used where one entity is a generalization of several more specialized entities. The supertype and subtype relationship is represented by a triangle on the relationship. Figure 4.2 shows an example of supertype and subtype entities in which SALES OUTLET is the supertype of RETAIL STORE and CORPORATE SALES OFFICE, and RETAIL STORE and CORPORATE SALES OFFICE are subtypes of SALES OUTLET. Here, each subtype entity inherits attributes from its supertype entity.
Also, each subtype entity can have its own distinctive attributes. In the example provided above, Region ID and Outlet ID are
  • 7. inherited attributes, and the subtype entities have their own attributes (such as the number of cash registers and the floor space of the RETAIL STORE subtype). The practical benefit of supertyping and subtyping is that they make a data model more directly expressive. Just by looking at the ER diagram, you can see that sales outlets are composed of 'retail stores' and 'corporate sales offices'.
Fig. 4.2: Supertype and Subtype
Other important concepts in the area of ER modeling are 'domain' and 'normalization'.
• A domain consists of all the possible acceptable values and categories that are allowed for an attribute. It is the set of all real possible occurrences. The format or data type, such as integer, date, and character, provides a clear definition of the domain. The practical benefit of a domain is that it is imperative for building the data dictionary or repository, and consequently for implementing the database.
• Normalization is a process of assigning attributes to entities in a way that reduces data redundancy, avoids data anomalies, provides a solid architecture for updating data, and reinforces the long-term integrity of the data model (the third normal form is usually adequate).
Dimensional Modeling
Dimensional modeling is a relatively new concept compared to ER modeling. This method is simpler, more expressive, and easier to understand. The technique is mainly aimed at conceptualizing and visualizing data models as a set of measures that are described by common aspects of the business. It is useful for summarizing and rearranging the data and presenting views of the data to support data analysis. Also, the technique focuses on numeric data, such as values, counts, and weights.
Q4.
Discuss the categories in which data is divided before structuring it into data warehouse?
Ans: Data warehouses can be divided into two types:
• Enterprise Data Warehouse
• Data Mart
Enterprise Data Warehouse
The enterprise data warehouse consists of data drawn from multiple operational systems of an organization. This data warehouse supports time-series and trend analysis across different business areas of an organization and so can be used for strategic decision-making. Also, this data warehouse is used to populate various data marts.
Data Mart
As data warehouses contain large amounts of data, organizations often create 'data marts' that are precise and specific to a department or product line. Thus a data mart is a physical and logical subset of an enterprise data warehouse and is also termed a department-specific data warehouse. Generally, data marts are organized around a single business process.
There are two types of data marts: independent and dependent. In an independent data mart the data is fed directly from the legacy systems, and in a dependent data mart the data is fed from the enterprise data warehouse. In the long run, dependent data marts are architecturally much more stable than independent data marts.
Advantages and Limitations of a DW System
  • 8. Use of a data warehouse brings the following advantages for an organization:
• End-users can access a wide variety of data.
• Management can obtain various kinds of trends and patterns of data.
• A warehouse provides competitive advantage to the company by providing the data and timely information.
• A warehouse acts as a significant enabler of commercial business applications, viz. Customer Relationship Management (CRM) applications.
However, the following are the concerns that one has to keep in mind while using a data warehouse:
• The scope of a data warehousing project has to be managed carefully to attain the defined content and value.
• The process of extracting, cleaning and loading the data and finally storing it in a data warehouse is time-consuming.
• The problems of compatibility with the existing systems need to be resolved before building a data warehouse.
• Security of the data may become a serious issue, especially if the warehouse is web accessible.
• Building and maintaining the data warehouse can be handled only by skilled resources and requires huge investment.
Data Warehouse Concepts and Terminology
Various concepts and the key terms used in the study of data warehouse are provided below.
• Dashboard: This is a reporting tool that consolidates, aggregates and arranges measurements and metrics (measurements compared to a goal) on a single screen so that information can be monitored at a glance.
• Data Management: This is the process of controlling, protecting, and facilitating access to data in order to provide the end users with timely access to the data they need.
• Data Mining (or Data Surfing): This is a technique geared for the typical user who does not know exactly what he is searching for, but is looking for particular patterns or trends.
Data mining is the process of sifting through large amounts of data to produce data content relationships. It can predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. The most valuable
  • 9. results from data mining include clustering, classifying, and estimating the things that occur together. Many kinds of tools play a role in data mining, including neural networks, decision trees, visualization, genetic algorithms, fuzzy logic, etc.
• Data Modeling: A method used to define and analyze the data requirements needed to support the business functions of an organization.
• Data Profiling: Data profiling is a critical step in data migration that automates the identification of problematic data and metadata, and enables organizations to correct inconsistencies, redundancies and inaccuracies in their databases.
• Data Visualization: Data visualization involves examining data represented by dynamic images rather than pure numbers. These are the techniques that turn data into information by using the high capacity of the human brain to visually recognize patterns and trends.
• Decentralized Warehouse: A remote data source that users can query or access via a central gateway that provides a logical view of corporate data in terms that users can understand. The gateway parses and distributes queries in real time to remote data sources and returns result sets to users.
• Drill-down: This is the capacity to browse the information through a hierarchical structure as shown below.
• External Data Source: This is data that is not available in the OLTP systems but is required to enhance the information quality in the data warehouse. Examples include data about competitors, information from regulatory and government bodies, and research data from professional bodies and universities.
• Metadata: Metadata is data about data. Examples of metadata include data element descriptions, data type descriptions, attribute descriptions, and process descriptions.
• On-Line Analytical Processing (OLAP): This is a category of software technology that enables users to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the organization. It is implemented in a multi-user client/server mode and offers consistently rapid response to queries, regardless of database size and complexity. This software is also called Multidimensional Analysis Software.
  • 10. • On-Line Transaction Processing (OLTP): This is the way data is processed by an end user or a computer system. Here, the data is detail oriented and highly repetitive, with large numbers of updates and changes. The major task of these systems is to perform on-line transaction and query processing. These systems cover most of the day-to-day operations of the organization, such as purchasing, inventory, manufacturing, payroll, banking, accounting and registration.
• Operational Databases: These are detail-oriented databases defined to meet the needs of the complex processes of an organization. Here, the data is highly normalized to avoid data redundancy and double maintenance. A large number of transactions take place every hour on these databases, which are always "up to date" and represent a snapshot of the current situation. In contrast to these databases, there are informational databases, which are stable over a period of time and represent a situation at a specific point in time in the past.
Architecture of a Data Warehouse
The architecture describes the overall system of a data warehouse from various perspectives, such as data, process, and infrastructure, to study the inter-relationships among various components.
• The data perspective includes the source and target data structures, and so it aids the user in understanding what data assets are available in a data warehouse and how they are related.
• The process perspective is primarily concerned with communicating the process and flow of data from the originating source system through the process of loading the data warehouse and extracting data from the warehouse.
• The infrastructure or technology perspective details the various hardware and software products used to implement the distinct components of the overall system.
Depending upon the specifics of an organizational situation, the following types of data warehouse architecture are possible:
• Basic architecture of a Data warehouse
• Architecture of a Data warehouse with Staging area
• Architecture of a Data warehouse with Staging area and Data marts
Fig 2.1 shows a simple architecture of a data warehouse wherein the end users directly access the data derived from several source systems through the data warehouse.
Q5. Discuss the purpose of executive information system in an organization?
Ans:
  • 11. An Executive Information System (EIS) is a set of management tools supporting the information and decision-making needs of management by combining information available within the organisation with external information in an analytical framework. EIS are targeted at management's need to quickly assess the status of a business or a section of a business. These packages are aimed firmly at the type of business user who needs an instant and up-to-date understanding of critical business information to aid decision making.
The idea behind an EIS is that information can be collated and displayed to the user without manipulation or further processing. The user can then quickly see the status of the chosen department or function, and so concentrate on decision making. Generally an EIS is configured to display data such as order backlogs, open sales, purchase order backlogs, shipments, receipts and pending orders. This information can then be used to make executive decisions at a strategic level.
The emphasis of the system as a whole is on the easy-to-use interface and the integration with a variety of data sources. It offers strong reporting and data mining capabilities which can provide all the data the executive is likely to need. Traditionally the interface was menu driven, with either reports or text presentation. Newer systems, and especially the newer Business Intelligence systems which are replacing EIS, have a dashboard or scorecard type display.
Before these systems became available, decision makers had to rely on disparate spreadsheets and reports, which slowed down the decision-making process. Now massive amounts of relevant information can be accessed in seconds. The two main aspects of an EIS are integration and visualisation. The newest methods of visualisation are the Dashboard and the Scorecard. The Dashboard is one screen that presents key data and organisational information on an almost real-time and integrated basis.
The Scorecard is another one-screen display, with measurement metrics that can give a percentile view of whatever criteria the executive chooses.
Behind these two front-end screens can be an immense data processing infrastructure or a couple of integrated databases, depending entirely on the organisation that is using the system. The backbone of the system is traditional server hardware and a fast network. The EIS software itself is run from here and presented to the executive over this network. The databases need to be fully integrated into the system and have real-time connections both in and out. This information then needs to be collated, verified, processed and presented to the end user, so a real-time connection into the EIS core is necessary.
Executive Information Systems come in two distinct types: ones that are data driven, and ones that are model driven. Data-driven systems interface with databases and data warehouses. They collate information from different sources and present it to the user in an integrated dashboard-style screen. Model-driven systems use forecasting, simulations and decision-tree-like processes to present the data.
As with any emerging and progressive market, service providers are continually improving their products and offering new ways of doing business. Modern EIS can also present industry trend information and competitor behaviour trends if needed. They can filter and analyse data; create graphs, charts and scenario generations; and offer many other options for presenting data.
There are a number of ways to link decision making to organisational performance. From a decision maker's perspective these tools provide an excellent way of viewing data. Outcomes displayed include single metrics, trend analyses, demographics, market shares and a myriad of other options.
The simple interface makes it quick and easy to navigate and call up the information required.
For a system that seems to offer business so much, it is used by relatively few organisations. Current estimates indicate that as few as 10% of businesses use EIS. One of the reasons for this is the complexity of the system and support infrastructure. It is difficult to create such a system and populate it effectively. Combining all the necessary
  • 12. systems and data sources can be a daunting task, and seems to put many businesses off implementing it. The system vendors have addressed this issue by offering turnkey solutions for potential clients. Companies like Actuate and Oracle both offer complete out-of-the-box Executive Information Systems, and these aren't the only ones. Expense is also an issue. Once the initial cost is calculated, there is the additional cost of support infrastructure, training, and the means of making the company data meaningful to the system.
Does EIS warrant all of this expense? Green King certainly thinks so. They installed a Cognos system in 2003 and their first few reports illustrated business opportunities in excess of £250,000. The AA is also using a Business Objects variant of an EIS and expects a return of 300% in three years. (Guardian 31/7/03)
An effective Executive Information System isn't something you can just set up and leave to do its work. Its success depends on the support and the timely, accurate data it gets in order to provide something meaningful. It can provide the information executives need to make educated decisions quickly and effectively. An EIS can provide a competitive edge to business strategy that can pay for itself in a very short space of time.
Q6. Discuss the challenges involved in data integration and coordination process?
Ans: In general, most of the data that the warehouse gets is extracted from a combination of legacy mainframe systems, old minicomputer applications, and some client/server systems. But these source systems do not conform to the same set of business rules. Thus they may often follow different naming conventions and varied standards for data representation. Thus the process of data integration and consolidation plays a vital role.
Here, data integration includes combining all relevant operational data into coherent data structures so as to make them ready for loading into the data warehouse. It standardizes the names and data representations and resolves the discrepancies. Some of the challenges involved in the data integration and consolidation process are as follows.
Identification of an Entity
Suppose there are three legacy applications in use in your organization: the first is the order entry system, the second is the customer service support system, and the third is the marketing system. Each of these systems might have its own customer file to support the system. Even though most of the customers will be common to all three files, the same customer has a different unique identification number in each of them.
As you need to keep a single record for each customer in a data warehouse, you need to get the transactions of each customer from the various source systems and then match them up to load into the data warehouse. This is an entity identification problem, in which you do not know which of the customer records relate to the same customer. This problem is prevalent where multiple sources exist for the same entities; other entities that are prone to this type of problem include vendors, suppliers, employees, and the various products manufactured by a company.
In the case of three customer files, you have to design complex algorithms to match records from all three files and form groups of matching records. But this is a difficult exercise. If the matching criterion is too tight, then some records might escape the groups. Similarly, a particular group may include records of more than one customer if the matching criterion is too loose.
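The effect of a tight or loose matching criterion can be seen in a small sketch. It is a simplified illustration rather than a production matching algorithm: the customer names, identifiers and thresholds are made up, names are the only field compared, and similarity is measured with `difflib.SequenceMatcher` from the Python standard library.

```python
from difflib import SequenceMatcher

# Made-up customer records from three hypothetical source systems; each system
# issues its own identifier, so only the names can be compared here.
RECORDS = [
    {"source": "order_entry", "id": "OE-17", "name": "John Smith"},
    {"source": "customer_service", "id": "CS-903", "name": "Jon Smith"},
    {"source": "marketing", "id": "MK-5521", "name": "John Smyth"},
    {"source": "order_entry", "id": "OE-42", "name": "John Smithson"},  # a different customer
]


def similarity(a, b):
    """Return a similarity score between 0 and 1 for two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_records(records, threshold):
    """Greedy grouping: a record joins the first group whose first member is similar enough."""
    groups = []
    for record in records:
        for group in groups:
            if similarity(record["name"], group[0]["name"]) >= threshold:
                group.append(record)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([record])
    return groups


for threshold in (0.95, 0.88, 0.85):
    sizes = [len(group) for group in group_records(RECORDS, threshold)]
    print(threshold, sizes)
```

With these sample names, a threshold of 0.95 is too tight and leaves every record in its own group, while 0.85 is too loose and pulls the unrelated "John Smithson" into the same group as the three spellings of John Smith. Only the middle value separates them, and a different data set would need a different value.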
Also, you might have to involve your users or the respective stakeholders to understand the transactions accurately. Some companies approach this problem in two phases. In the first phase, all records, irrespective of whether they are duplicates or not, are assigned unique
  • 13. identifiers, and in the second phase the duplicates are reconciled periodically, either through automatic algorithms or manually.
Existence of Multiple Sources
Another major challenge in the area of data integration and consolidation results from a single data element having more than one source. For instance, cost values are calculated and updated at specific intervals in the standard costing application. Similarly, your order processing application also carries the unit costs for all products. Thus there are two sources available from which to obtain the unit cost of a product, and so there could be a slight variation in their values. Which of these systems should be used as the source of the unit cost stored in the data warehouse becomes an important question. One easy way of handling this situation is to prioritize the two sources, or you may select the source on the basis of the last update date.
Implementation of Transformation
The implementation of data transformation is a complex exercise. You may have to go beyond the manual methods, that is, the usual methods of writing conversion programs while deploying the operational systems. You need to consider several other factors to decide the methods to be adopted. If you are considering automating the data transformation functions, you have to identify, configure and install the tools, train the team on these tools, and integrate them into the data warehouse environment. But a combination of both methods proves to be effective. The issues you may face in using manual methods and transformation tools are discussed below.
Manual Methods
These are the traditional methods that have been in practice until recently. These methods are adequate in the case of smaller data warehouses.
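As an illustration of a small hand-coded transformation, the two rules mentioned under "Existence of Multiple Sources" (prioritizing the sources, or taking the most recently updated value) can be sketched as follows. The source names, the priority order, the cost values and the dates are all assumptions made up for the example.

```python
from datetime import date

# Hypothetical ranking of the two sources: a lower number means a more trusted source.
SOURCE_PRIORITY = {"standard_costing": 1, "order_processing": 2}

# Made-up candidate values for the unit cost of one product.
candidates = [
    {"source": "standard_costing", "unit_cost": 12.40, "updated": date(2010, 3, 31)},
    {"source": "order_processing", "unit_cost": 12.55, "updated": date(2010, 4, 12)},
]


def pick_by_priority(rows):
    """Rule 1: take the value from the highest-priority source."""
    return min(rows, key=lambda row: SOURCE_PRIORITY[row["source"]])


def pick_by_last_update(rows):
    """Rule 2: take the value that was updated most recently."""
    return max(rows, key=lambda row: row["updated"])


print(pick_by_priority(candidates)["unit_cost"])
print(pick_by_last_update(candidates)["unit_cost"])
```

The two rules can disagree, as they do for this sample data, so the choice between them has to be settled as part of the transformation rules.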
These methods include manually coded programs and scripts that are mainly executed in the data staging area. They call for elaborate coding and testing, and only programmers and analysts who possess specialized knowledge in this area can produce the programs and scripts.
Although the initial cost may be reasonable, ongoing maintenance may escalate the cost of these methods. Moreover, these methods are prone to errors. Another disadvantage concerns the creation of metadata. Even if the in-house programs record the metadata initially, the metadata needs to be updated every time the transformation rules change.
Transformation Tools
The difficulties involved in using the manual methods can be largely overcome by using the sophisticated and comprehensive sets of transformation tools that are now available. Use of these automated tools improves efficiency and accuracy. If the inputs provided to the tool are accurate, the rest of the work is performed efficiently by the tool. So you have to carefully specify the required parameters, the data definitions and the rules to the transformation tool.
Also, the transformation tools enable the recording of metadata. When you specify the transformation parameters and rules, these values are stored as metadata by the tool, and this metadata becomes a part of the overall metadata component of the data warehouse. When changes occur to business rules or data definitions, you just have to enter the
  • 14. changes into the tool and the metadata for the transformations gets adjusted automatically. However, relying on the transformation tools alone, without any manual methods, is not practical either.
Transformation for Dimension Attributes
Now we consider the updating of the dimension tables. The dimension tables are more stable in nature and so are less volatile than the fact tables. The fact tables change through an increase in the number of rows, but the dimension tables change through changes to their attributes. For instance, consider a product dimension table. Every year, rows are added as new models become available. But what about the attributes within the dimension table? You might face a situation where the product dimension table must change because a particular product was moved into a different product category. The corresponding values must then be changed in the product dimension table. Though most dimensions are generally constant over a period of time, they may change slowly.
http://www.scribd.com/doc/44415081/MB0036-Business-Intelligence-amp-Tolls-Fall-10
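The product dimension example above distinguishes two kinds of change: new rows for new models, and a changed attribute value when a product moves to a different category. A minimal Python sketch of both is given below. It assumes the simplest possible handling, in which the old category value is overwritten in place; the text does not say whether history should be kept, and the table layout, keys, product names and categories are hypothetical.

```python
# A toy product dimension table keyed by a product key (made-up rows).
product_dim = {
    101: {"name": "Model A", "category": "Home Audio"},
    102: {"name": "Model B", "category": "Home Audio"},
}


def add_product(dim, product_key, name, category):
    """Add a row for a new model; refuse to reuse an existing key."""
    if product_key in dim:
        raise ValueError(f"product key {product_key} already exists")
    dim[product_key] = {"name": name, "category": category}


def change_attribute(dim, product_key, attribute, new_value):
    """Overwrite one attribute of an existing row.

    Returns the old value so that the caller can log the change.
    """
    row = dim[product_key]
    old_value = row[attribute]
    row[attribute] = new_value
    return old_value


# A new model becomes available: the dimension grows by one row.
add_product(product_dim, 103, "Model C", "Portable Audio")

# An existing product moves to a different category: an attribute changes.
previous_category = change_attribute(product_dim, 102, "category", "Portable Audio")
```

Overwriting loses the earlier category, so reports on past periods would show the product under its new category. If the earlier value matters, a different approach that preserves history would be needed.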
  • 15. SET 2
Q.1 Explain business development life cycle in detail? [10 Marks] May 092012
Ans. The Business Development Lifecycle is a methodology adopted for planning, designing, implementing and maintaining the BI system. The various steps involved in this approach are depicted below. Each of the phases in the above life cycle is described below.
Project Planning
Developing a project plan involves identifying all the tasks necessary to implement the BI project. The Project Manager identifies the key team members, assigns the tasks, and develops the effort estimates for their tasks. There is much interplay between this activity and the activity of defining the Business Requirements, and aligning the BI system/data warehouse system with the business requirements is crucial. Therefore you need to understand the business requirements properly before proceeding further.
Project Management
This is the phase wherein the actual implementation of the project takes place. The first step is to define the business requirements; the implementation is then carried out in three phases on the basis of those requirements. The first phase (technical architecture design, and selection and installation of a product) deals with technology; the second phase (Dimensional Modeling, Physical Design, ETL Design & Development) focuses on data; and the last phase (BI Application Specification, BI Application Development) deals with the design and development of analytical applications. The steps in these phases are discussed below.
1 Defining the Business Requirements
Business requirements are the bedrock of the BI system and so the Business Requirements Definition acts as the foundation of the Lifecycle methodology. The business requirements
  • 16. defined at this stage provide the necessary guidance to make the decisions. This process mainly includes the following activities:
• Requirements planning
• Collecting the business requirements
• Post-collection documentation and follow-up
2 Technical Architecture Design
Creation of the Technical Architecture includes the following steps:
1. Establishing an Architecture task-force
2. Collecting Architecture-related requirements
3. Documenting the Architecture requirements
4. Developing a high-level Architectural model
5. Designing and specifying the subsystems
6. Determining Architecture implementation phases
7. Documenting the technical Architecture
8. Reviewing and finalizing the Architecture
3 Selection and Installation of a Product
The selection and the installation of a business intelligence product are carried out in the following steps:
1. Understanding the corporate purchasing process
2. Developing a product evaluation matrix
3. Conducting market research
4. Shortlisting the options and performing detailed evaluations
5. Conducting a prototype (if necessary)
  • 17. 6. Selecting a product, installing on trial, and negotiating the value/price.
4 Dimensional Modeling
A dimensional model packages the data in a symmetric format whose design goals are user understandability, query performance, and resilience to change. In this step, a data-modeling team is formed and design workshops are conducted to create the dimensional model. Once the modeling team is confident of the model prepared, the model is demonstrated to and validated with a broader audience, and then documented.
5 Physical Design
In this step, the dimensional model created in the previous step is translated into a physical design. The physical model includes details such as the physical database, data types, key declarations, and permissibility of nulls.
6 ETL Design & Development
ETL stands for Extraction, Transformation, and Loading. ETL tools are used to extract the data from the operational data sources, transform it, and load it into a data warehouse.
7 BI Application Specification
In this step, a set of analytical applications is identified for building a BI system, based on the business requirements definition, the type of data being used, and the proposed architecture of the warehouse.
8 BI Application Development
This is the step wherein a specific application (tool) is selected from the identified applications for the actual implementation of the BI system.
9 Deployment
This is the step wherein the technology, data and analytical application tracks converge. The completion of this step can be regarded as the completion of the actual building of the BI system.
10 Maintenance & Growth
During this step, the project team provides user support to the end-users of the system. The team is also involved in providing the technical support required for the system so as to ensure the
  • 18. continuous utilization of the system. This step may also include making some minor enhancements to the BI system.
Revising the Project Planning
As the project progresses, the project manager has to revise the project plan to accommodate new business interests and concerns raised by the end-users.
http://www.scribd.com/doc/75437915/MI0036-SET-1-amp-SET-2
http://www.scribd.com/doc/75437878/MI0034-SET-1-amp-SET-2
http://www.scribd.com/santosh143hsv143