Big Data
MIS 6309 Business Data Warehousing Fall 2014 -GROUP 6
What is Big Data?
‘Data’: The New Oil of the Information Revolution
What makes it ‘Big’ Data?
Volume
Velocity
Variety
Hadoop
HDFS
MapReduce
Processing Logs
Key = index.php, Value = 1
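The key/value pair on this slide (Key = index.php, Value = 1) can be illustrated with a minimal, single-process sketch of the map, shuffle, and reduce phases. This is plain Python rather than the Hadoop API, and the log-line format and all function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical access-log lines; the format is an assumption for illustration.
LOG_LINES = [
    "10.0.0.1 GET /index.php",
    "10.0.0.2 GET /about.php",
    "10.0.0.3 GET /index.php",
]


def map_phase(line):
    """Emit one (page, 1) pair per log line, e.g. ("index.php", 1)."""
    page = line.split()[-1].lstrip("/")
    yield page, 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the 1s emitted for a page to get its hit count."""
    return key, sum(values)


def count_hits(lines):
    """Run map, shuffle, and reduce in sequence on a list of log lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


hits = count_hits(LOG_LINES)
```

In a real cluster the map and reduce functions would run in parallel on different nodes; here they run in one process only to show how the (key, 1) pairs turn into per-page counts.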
Hadoop Ecosystem
Big Data Landscape
What’s in store for us?
• More jobs
• More opportunities
• More Money!
Sectors Using Big Data
Enhancing the multichannel consumer experience:
• Use big data to integrate promotions and pricing seamlessly for shoppers, whether they are online, in-store, or perusing a catalog.
• Integrate customer databases with household information such as income, housing values, and number of children, and use it to create different versions of catalogs and other materials attuned to the behavior and preferences of different customer groups.
Big Data Revenue
Increased Efficiency
Current Limitations for Big Data Analytics
• Meeting the need for speed
• Understanding the data
• Addressing data quality
• Displaying meaningful results
• Big data skills are in short supply.
Problems & Threats – Big Data
• Privacy breaches and embarrassments
• Anonymization could become impossible
• Data masking could be defeated to reveal personal
information
• Unethical actions based on interpretations
• Big data analytics are not 100% accurate
• Discrimination
• Few (if any) legal protections exist for the involved
individuals
• Big data will probably exist forever
• Concerns for e-discovery
• Making patents and copyrights irrelevant
Case Studies – Recent Data Breaches
• Target breach: 40 million credit and debit accounts were compromised over a three-week period, costing the company $148 million.
• JPMorgan Chase reported that 76 million households and 7 million small businesses were exposed in a data breach.
  - Customer names, addresses, phone numbers and e-mail addresses were taken.
  - Hackers also obtained internal data identifying customers by category, such as whether they are clients of the private-bank, mortgage, auto or credit-card divisions, according to a person briefed on the matter.
• Third-party external data, in the news: banks turn to Facebook and Twitter to keep track of education loan takers.
Thinking Dimensionally: Data, Big or Small
Sentiment_Analysis Table
• Sentiment_ID (e.g., 1, 2, 3)
• Sentiment_description (e.g., Wow, Awesome, Crap)
• Customer_ID
• Product_ID
Dim_Customer
• Customer_ID
• Customer_Name
• Gender
• Age
Dim_Product
• Product_ID
• Product_Name
• Category
• Product_Description
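The three tables on this slide form a small star schema: Sentiment_Analysis holds the facts, and Dim_Customer and Dim_Product describe them. The sketch below builds that schema in an in-memory SQLite database using Python's standard `sqlite3` module. The column types and the sample rows are assumptions; only the table and column names come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Dim_Customer (
        Customer_ID   INTEGER PRIMARY KEY,
        Customer_Name TEXT,
        Gender        TEXT,
        Age           INTEGER
    );
    CREATE TABLE Dim_Product (
        Product_ID          INTEGER PRIMARY KEY,
        Product_Name        TEXT,
        Category            TEXT,
        Product_Description TEXT
    );
    CREATE TABLE Sentiment_Analysis (
        Sentiment_ID          INTEGER PRIMARY KEY,
        Sentiment_description TEXT,
        Customer_ID INTEGER REFERENCES Dim_Customer(Customer_ID),
        Product_ID  INTEGER REFERENCES Dim_Product(Product_ID)
    );
    """
)

# Made-up sample rows, for illustration only.
conn.execute("INSERT INTO Dim_Customer VALUES (1, 'Customer A', 'F', 30)")
conn.execute(
    "INSERT INTO Dim_Product VALUES (10, 'Phone X', 'Electronics', 'Sample phone')"
)
conn.executemany(
    "INSERT INTO Sentiment_Analysis VALUES (?, ?, ?, ?)",
    [(1, "Wow", 1, 10), (2, "Awesome", 1, 10), (3, "Crap", 1, 10)],
)

# Slice the sentiment facts by a dimension attribute (product category).
rows = conn.execute(
    """
    SELECT p.Category, s.Sentiment_description, COUNT(*)
    FROM Sentiment_Analysis AS s
    JOIN Dim_Product AS p ON p.Product_ID = s.Product_ID
    GROUP BY p.Category, s.Sentiment_description
    ORDER BY s.Sentiment_description
    """
).fetchall()
```

The same join-and-group pattern applies whether the sentiment rows number in the hundreds or the billions, which is one way to read "Data, big or small".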
Conformed Dimensions
Online_Customer Table
Customer Name     Location
Avadhoot Patil    Dallas
Store_Customer Table
Customer Name     Location
Ankur Kaushik     Dallas
Sort and Merge
Customer Name     Location
Avadhoot Patil    Dallas
Ankur Kaushik     Dallas
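The sort-and-merge step that combines the Online_Customer and Store_Customer tables into one conformed customer dimension might look like the sketch below. Which customer belongs to which source table, the field names, the cleanup rules, and the de-duplication by (name, location) are all assumptions made for illustration.

```python
# Hypothetical source rows; assignment of each customer to a table is assumed.
online_customers = [{"customer_name": "Avadhoot Patil", "location": "Dallas"}]
store_customers = [{"customer_name": "Ankur Kaushik", "location": "Dallas"}]


def conform(row):
    """Bring a source row to the shared column names and formatting."""
    return {
        "customer_name": row["customer_name"].strip().title(),
        "location": row["location"].strip().title(),
    }


def sort_and_merge(*sources):
    """Merge conformed rows from all sources, drop duplicates, sort by key."""
    merged = {}
    for source in sources:
        for row in map(conform, source):
            merged[(row["customer_name"], row["location"])] = row
    return [merged[key] for key in sorted(merged)]


dim_customer = sort_and_merge(online_customers, store_customers)
```

Because both sources pass through the same `conform` function, a customer who appears in both the online and store systems ends up as a single row in the merged dimension.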
Selecting Keys
• Anchor Dimensions with Durable Surrogate Keys
Airport Data_source (natural keys)
Airport Name    City      Country
ABC             Dallas    USA
Datawarehouse System (durable surrogate keys, slowly changing dimension)
Airport_ID    Airport Name    City      Country
1001          ABC             Dallas    USA
1002          XYZ             Dallas    USA
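As a rough sketch of anchoring a dimension with durable surrogate keys, the class below assigns a warehouse-owned Airport_ID the first time a natural key from the source is seen and reuses it afterwards. Treating the airport name as the natural key, starting the sequence at 1001, and overwriting attributes in place (rather than keeping history, as a full slowly changing dimension would) are assumptions made for illustration.

```python
from itertools import count


class AirportDimension:
    """Maps source natural keys to durable, warehouse-owned surrogate keys."""

    def __init__(self, first_key=1001):
        self._next_key = count(first_key)
        self._key_by_natural = {}
        self.rows = {}

    def upsert(self, airport_name, city, country):
        """Return the surrogate key for an airport, creating one if needed."""
        if airport_name not in self._key_by_natural:
            self._key_by_natural[airport_name] = next(self._next_key)
        key = self._key_by_natural[airport_name]
        # Attributes are overwritten in place; the surrogate key never changes.
        self.rows[key] = {
            "airport_name": airport_name,
            "city": city,
            "country": country,
        }
        return key


dim_airport = AirportDimension()
abc_key = dim_airport.upsert("ABC", "Dallas", "USA")
xyz_key = dim_airport.upsert("XYZ", "Dallas", "USA")
abc_again = dim_airport.upsert("ABC", "Dallas", "USA")
```

Fact tables would store `abc_key` rather than the source system's own identifier, so a change in the source's naming or keying does not ripple through the warehouse.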
Governance
• Dimensionalize data before applying governance
• Dimensionalize data as early as possible in the data pipeline
Parse, Match, Identify, Resolution on the Fly
Privacy
• Privacy is the most important governance perspective
  - For most forms of analysis, personal details should be masked
  - Data should be aggregated enough that individuals cannot be identified
  - Data should be masked or encrypted on write, or masked on read
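Two of the privacy points, masking on write and aggregating enough to hide individuals, can be sketched as follows. The field names, the keyed-hash masking scheme, the placeholder secret, and the minimum group size of 2 are all assumptions for illustration. As the deck notes elsewhere, masking can sometimes be defeated, so this is not a complete privacy control.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret; in practice it would come from a managed key store.
SECRET_KEY = b"replace-with-a-managed-secret"


def mask(value):
    """Replace a personal value with a keyed hash (a pseudonym)."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def mask_on_write(record):
    """Mask the personal field before the record is stored."""
    masked = dict(record)
    masked["customer_name"] = mask(record["customer_name"])
    return masked


def aggregate_by_city(records, min_group_size=2):
    """Count records per city, suppressing groups too small to hide anyone."""
    counts = Counter(r["city"] for r in records)
    return {city: n for city, n in counts.items() if n >= min_group_size}


# Made-up sample records.
raw = [
    {"customer_name": "Customer A", "city": "Dallas"},
    {"customer_name": "Customer B", "city": "Dallas"},
    {"customer_name": "Customer C", "city": "Austin"},
]
stored = [mask_on_write(r) for r in raw]
summary = aggregate_by_city(stored)
```

Masking on read would apply the same `mask` function when data is queried instead of when it is stored, which keeps the raw values available to privileged users at the cost of storing them unmasked.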
THANK YOU!
