© Sisense Inc, 2015
MODELING DATA FOR BUSINESS INTELLIGENCE: KEY
CONCEPTS, TIPS & TRICKS
Presented by Sisense: Business
Intelligence Software for Complex Data
The Goal of Data Modeling in BI:
To enable data analysts to produce a new report or dashboard, or simply get a
new analytic question answered, in real time or at least in time.
To Get There, Data Needs to Be:
ACCURATE
Records should be reliable
and reflect the reality of
the business
UP-TO-DATE
Data has to be complete
and pertain to the relevant
period
READY FOR ANALYSIS
Structured in a way that
lets you get answers to
new questions
Typical Business Challenge
CEO: "We need to increase our sales!"
MRKT MNGR: "What other offerings can we sell to customers?"
IT MNGR: While upgrading platforms and implementing a new CRM system, estimates that the information will be available in 20-30 days...
MRKT MNGR: "A month? Don't we already have this data in our system?"
IT MNGR: "Yes, the data is there, but it DOESN'T HAVE THE RIGHT STRUCTURE to answer those questions."
MRKT MNGR: Keeps thinking: "If the data is there, why is it so difficult to get answers?"
IT MNGR: Keeps thinking: "The marketing manager asks for weird things with no time at all!"
CEO: Just wants to sell more.
Most sold products?
Most successful product bundles?
Data Modeling Steps
1. Locate and gather the relevant data sources
2. ETL (Extract, Transform, Load): transform the data into a workable format
3. Centralize the transformed data to create a single source of truth
4. Query/import: make the data available and accessible to analysts
5. Analyze: start asking questions and visualizing the answers
…and Challenges
• Dispersed datasets
• Query language
• Structure
• Size
• Growth rate
• Detail
MAP DATA
WHAT DATA DO I NEED?
WHAT DATA DO I NEED? - MAP THE DATA
Dimensions: Key business entities (subjects) that we want to analyze
Facts: Performance measurements
Filter & Order: A set of conditions and order that specify the data subset that we want to look at
DIMENSIONS
Dimensions Are Mostly Categorical – Each Has A Discrete Set Of Values
• Place – UK/USA
• Person – Customer
• Object – Products
• Time and Date – Year
• Process – Packaging
• Hierarchy – Country > City > Zip
FACTS
Facts are presented in aggregate format: Max, Sum, Average, Variance, Median, Count, Year-to-Date
• Number of transactions
• Quantity
• Amount
• Cost
• Revenue
• Discount
• Profit
FILTER & ORDER
A set of conditions that specify the data subset AND order in which to see the aggregations
• Greater than
• Between
• When
• True/False
• Range
Correspondence Between Business Question And SQL Queries
1. Business Question
“What were the best-selling products this year, per country? (show only products that sold more than 20,000 units)”
2. SQL Structure
Select <Dimensions>, <Facts>
From <Tables>
Where <Conditions>
Group by <Dimensions>
Having <Conditions>
Order by <Order Specifications>
3. SQL Query
Select Country, Product, Sum(quantity)
From OrdersSales
Where Getyear(SaleDate) = 2015
Group by Country, Product
Having Sum(quantity) > 20000
Order by Country, Sum(quantity) Desc
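As a rough illustration of the query above, the following Python sketch runs it against an in-memory SQLite table. The sample rows are invented for the example, and SQLite's `strftime` stands in for `Getyear`, which SQLite does not provide. It is a sketch under those assumptions, not how Sisense itself executes the query.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the slide.
ROWS = [
    ("UK", "Laptop", 15000, "2015-03-01"),
    ("UK", "Laptop", 9000, "2015-07-15"),
    ("UK", "Phone", 12000, "2015-05-20"),
    ("USA", "Phone", 30000, "2015-02-11"),
    ("USA", "Laptop", 25000, "2014-12-30"),  # previous year, removed by WHERE
]

QUERY = """
SELECT Country, Product, SUM(Quantity) AS Units   -- dimensions, then the fact
FROM OrdersSales
WHERE strftime('%Y', SaleDate) = '2015'           -- condition on single rows
GROUP BY Country, Product
HAVING SUM(Quantity) > 20000                      -- condition on the aggregate
ORDER BY Country, Units DESC
"""


def best_sellers(rows):
    """Load the rows into a throwaway database and run the slide's query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE OrdersSales "
        "(Country TEXT, Product TEXT, Quantity INTEGER, SaleDate TEXT)"
    )
    conn.executemany("INSERT INTO OrdersSales VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


result = best_sellers(ROWS)
print(result)
```

Note how the WHERE clause drops the 2014 row before aggregation, while HAVING drops the UK/Phone group only after its quantities have been summed.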
JOIN DATA
HOW DO I CONNECT DIFFERENT SOURCES?
HOW DO I CONNECT DIFFERENT SOURCES? - JOINING DATA
Relationship: The way separate data sources can reference each other
Join Types: The total portion of data included when connecting separate data sources
Key: Field(s) used to connect data sources
Data Relationships
How an instance of data from one source is related to data in another source
• One-to-One: Wife – Husband
• One-to-Many: Song – Artist
• Many-to-Many: Subject – Student
Join Types
What portion of the connected data is required for analysis
• Inner Join
• Left Join
• Right Join
• Full Join
• Other Join Options
Examples of Data Keys
TABLE A: SALES
• PRODUCT ID
• EMPLOYEE ID
• ORDER DATE
• DELIVERY DATE
• CLIENT ID
• AMOUNT
TABLE B: STOCK
• PRODUCT ID
• STOCK DATE
• UNITS
• COST
• EMPLOYEE ID
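To make the role of a key and of the join type more concrete, here is a minimal Python sketch that joins cut-down versions of the Sales and Stock tables above on PRODUCT ID using SQLite. The rows and the reduced column set are assumptions made for illustration; only inner and left joins are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales (ProductID INTEGER, ClientID INTEGER, Amount REAL);
CREATE TABLE Stock (ProductID INTEGER, Units INTEGER);
INSERT INTO Sales VALUES (1, 100, 250.0), (2, 101, 80.0), (3, 102, 40.0);
INSERT INTO Stock VALUES (1, 12), (2, 0);   -- product 3 has no stock record
""")

# Inner join: keep only sales whose key also appears in Stock.
inner = conn.execute("""
    SELECT s.ProductID, s.Amount, k.Units
    FROM Sales s INNER JOIN Stock k ON s.ProductID = k.ProductID
    ORDER BY s.ProductID
""").fetchall()

# Left join: keep every sale; stock columns are NULL when no match exists.
left = conn.execute("""
    SELECT s.ProductID, s.Amount, k.Units
    FROM Sales s LEFT JOIN Stock k ON s.ProductID = k.ProductID
    ORDER BY s.ProductID
""").fetchall()
conn.close()

print(inner)
print(left)
```

The inner join silently loses the sale of product 3, while the left join keeps it with an empty Units value, which is why the join type has to be chosen according to the portion of data the analysis needs.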
CLEAN DATA
HOW DO I WANT TO ANALYZE DATA?
HOW DO I WANT TO ANALYZE DATA? – CLEAN DATA
Valid: Data is correct and reasonable
Accurate: Data is precise and shows the right values
Complete & Consistent: Corrections related to missing, incomplete, incorrect or inconsistent data
Valid
• Stable response. Example: Compare samples
• Have a sufficient portion of data. Example: Access comprehensive portion of data
• Measures what it is supposed to. Example: Compare multiple measurements
Accurate
• Data Capture. Example: Correct at source of entry
• Data Decay + Movement. Example: Constant updates
Complete and Consistent
• Data correction. Example: Transform data
• Data consistency. Example: Standardization
• Data completeness. Example: Merge Data
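One possible reading of "standardization" and "merge data" is sketched below in Python. The record layout, the alias table, and both helper functions are hypothetical and exist only to illustrate the idea; they are not taken from the slides.

```python
# Hypothetical alias table: spelling variants mapped to one standard label.
COUNTRY_ALIASES = {
    "uk": "UK",
    "united kingdom": "UK",
    "usa": "USA",
    "united states": "USA",
}


def standardize(record):
    """Data consistency: map spelling variants of a country to one label."""
    raw = (record.get("country") or "").strip().lower()
    cleaned = dict(record)
    cleaned["country"] = COUNTRY_ALIASES.get(raw)  # None when the value is unknown
    return cleaned


def merge(primary, secondary):
    """Data completeness: fill fields missing in one source from another."""
    merged = dict(secondary)
    # Values present in the primary source win; its gaps keep the secondary value.
    merged.update({k: v for k, v in primary.items() if v is not None})
    return merged


# Two hypothetical sources describing the same client.
crm = {"client_id": 7, "country": " United Kingdom ", "email": None}
billing = {"client_id": 7, "country": "uk", "email": "a@example.com"}

clean = merge(standardize(crm), standardize(billing))
print(clean)
```

After standardization both sources agree on the country label, and the merge fills the email that the first source was missing.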
DATA MODELING IN SISENSE
Ease of Modeling in Sisense
ACCESS / PREPARE FOR ANALYSIS / ACCURATE + ON TIME
• Visual with No Coding
• Connect Directly to Raw Data
• Single Model – Many Sources, Rows & Columns
• Drag & Drop to Join Varied Data Sources
• Automatically Model Based on Query
• Complete Solution: ETL & Analysis
• Change Incrementally as Needed
• Synchronization
WANT TO LEARN MORE?
Visit sisense.com
To see real end-to-end business
analytics software in action
Image Credits
Images courtesy of pakorn, Stuart Miles, winnond, adamr, sattva, markuso, Mister GC, John Kasawa and tungphoto at FreeDigitalPhotos.net
