Data Exploration & BI
Cristian Guajardo Garcia
Cristián Guajardo-García
www.5561.cl
MBA Politecnico di Milano & IIM Lucknow (India) B-Schools.
Has worked in Chile & Italy with companies such as Universidad Andrés Bello, ProChile, IBM and Luxottica.
Agenda
1. Why data matters | BI
2. Data gathering: Scraping with Import.io
3. Data cleaning: Spreadsheets
4. Data visualization: Tableau
5. Insights for Marketing
6. Business Case: Luxottica
7. Q&A
Why Data matters | BI
Today we have several sources generating tons of data per
second. Businesses need to anticipate the consumer in order
to remain competitive.
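SQL = Structured Query Language, used with structured (relational) data.
NoSQL = Non-relational stores, often used for semi-structured or unstructured data.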
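A minimal sketch of the difference, assuming an in-memory SQLite table for the structured side and plain JSON records for the unstructured side. The table name, field names and values are made up for illustration.

```python
import json
import sqlite3

# Structured data: fixed columns, queried with SQL (in-memory SQLite database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, price_eur REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("sneaker", 80.0), ("sneaker", 100.0), ("sunglasses", 150.0)],
)
avg_sneaker = conn.execute(
    "SELECT AVG(price_eur) FROM sales WHERE product = ?", ("sneaker",)
).fetchone()[0]
conn.close()

# Document-style data: each record is free-form and may carry different fields.
documents = [
    json.loads('{"user": "anna", "comment": "Love these sneakers!"}'),
    json.loads('{"user": "luca", "rating": 4, "tags": ["sunglasses", "summer"]}'),
]
fields_seen = sorted({key for doc in documents for key in doc})

print(avg_sneaker)   # average of the two sneaker prices
print(fields_seen)   # every field that appears in at least one document
```

The SQL side can answer an aggregate question in one query because every row has the same columns. The document side first needs a pass to discover which fields exist at all.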
The goal of BI is to make the right information available to the right people at the right time.
Big Data is nothing more than gathering tons of structured and unstructured data, filtering it, cleaning it, visualizing it and, last but not least, getting insights from it.
How and where do I start?
The 2 most common scenarios to start working with data:
By Exploring the Data: I don’t actually know where this will take me. I just collect data from several sources and start to explore it.
By Answering a Question: I know I have one question to ask. This will guide my research and let me get rid of data that is useless for this stage.
The Steps
1. Web Scraping (gather data)
2. Clean
3. Visualize and Analyze
Data Gathering: Scraping with Import.io
What is Scraping?
Scraping is a technique used to extract data from a source and move it to another place, which is usually a table.
Tabula = Extracts data from PDFs
OCR = Extracts data from images
Import.io = Extracts data from the web
Import.io tools: Extractors, Crawlers, Connectors
Scraping is a very basic, yet useful, automation technique; it is sometimes loosely described as artificial intelligence.
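To make the idea concrete, here is a minimal sketch of scraping in Python, using only the standard library. It is not how Import.io works internally. The HTML fragment and the `TableScraper` class are hypothetical stand-ins for a real page and a real tool.

```python
from html.parser import HTMLParser

# Hypothetical page fragment standing in for a real web page.
SAMPLE_HTML = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Sneaker A</td><td>79.90</td></tr>
  <tr><td>Sneaker B</td><td>120.00</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])          # start a new row
        elif tag in ("td", "th"):
            self._in_cell = True
            self.rows[-1].append("")      # start a new empty cell

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


scraper = TableScraper()
scraper.feed(SAMPLE_HTML)
header, *records = scraper.rows
print(header)
print(records)
```

The result is exactly what the definition describes: data that lived inside a page ends up as rows in a table.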
What is Import.io?
● Machine reading of the web
● Real-time crawling through an API
● Maps the data of a website
● Point & click UI
● Turns web data into structured data
● Tailor-made crawlers
● Cloud scaling
● Wide integration options
Get the maximum output from a minimum input.
How does it work?
Question to answer: What is the average price (€) of a Nike sneaker on eBay Italy?
1. Open Import.io
2. Create a new Connector
3. Go to ebay.it
4. Click the “I’m there” button
5. Click the red button, which makes Import.io start recording your click trail
6. Click the stop button
7. Tell Import.io which pieces of information matter to you and what type each one is (image, text, link, etc.)
Now you have the data you needed
You create a bot that gets pieces of information and stores them in a table.
Once you have trained the bot to crawl all the results, you can remove the columns you will not use.
Now it is time to manipulate the data and get information such as the average price, the most common products and so on.
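As a rough sketch of that last step, the snippet below computes an average price and the most common product from a handful of rows. The rows, column names and prices are invented for illustration; a real run would load the table exported from the scraper.

```python
from collections import Counter
from statistics import mean

# Hypothetical scraped rows; prices arrive as text, and some are missing.
rows = [
    {"product": "Air Max", "price_eur": "89.90"},
    {"product": "Air Force 1", "price_eur": "110.00"},
    {"product": "Air Max", "price_eur": "95.10"},
    {"product": "Cortez", "price_eur": "n/a"},  # missing price
]


def parse_price(text):
    """Return the price as a float, or None when it cannot be parsed."""
    try:
        return float(text)
    except ValueError:
        return None


# Keep only the rows whose price could be read as a number.
prices = [p for p in (parse_price(r["price_eur"]) for r in rows) if p is not None]
average_price = round(mean(prices), 2)

# Count how often each product name appears in the results.
most_common_product, count = Counter(r["product"] for r in rows).most_common(1)[0]

print(average_price)
print(most_common_product, count)
```

Rows without a usable price are skipped rather than counted as zero, so they do not drag the average down.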
Data Cleaning: Spreadsheets
Store and clean
Once you have gathered the data, you might want to hide or delete columns, fill in the n/a cells or build a pivot table. Whatever the case, a spreadsheet is a great way to go.
Pivot Table: summarizes large amounts of information
VLOOKUP and HLOOKUP: find specific information by searching down a column (VLOOKUP) or across a row (HLOOKUP).
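For readers who think in code rather than spreadsheets, the same three ideas (handling n/a cells, a pivot table, a lookup) can be sketched in plain Python. The rows and column names are hypothetical, and this only approximates what a spreadsheet does.

```python
from collections import defaultdict

# Hypothetical rows as they might look after scraping.
rows = [
    {"brand": "Nike", "model": "Air Max", "price_eur": 90.0},
    {"brand": "Nike", "model": "Cortez", "price_eur": None},  # n/a cell
    {"brand": "Adidas", "model": "Gazelle", "price_eur": 70.0},
    {"brand": "Nike", "model": "Air Force 1", "price_eur": 110.0},
]

# Cleaning: set aside rows whose price is missing before summarizing.
priced = [r for r in rows if r["price_eur"] is not None]

# Pivot-table style summary: average price per brand.
totals = defaultdict(lambda: [0.0, 0])  # brand -> [sum of prices, row count]
for r in priced:
    totals[r["brand"]][0] += r["price_eur"]
    totals[r["brand"]][1] += 1
pivot = {brand: total / n for brand, (total, n) in totals.items()}

# VLOOKUP-style lookup: search the "model" column, return the matching price.
price_by_model = {r["model"]: r["price_eur"] for r in rows}
gazelle_price = price_by_model.get("Gazelle")

print(pivot)
print(gazelle_price)
```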
Data Visualization: Tableau
What is it for?
Tableau is a desktop and cloud solution for visualizing data coming from several sources.
Remember: privacy is an illusion
It works well for merging information from several sources: survey data, social media, SEM and analytics, all visualized in one dashboard.
Perfect for reporting and for meeting the needs of several clients.
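What "merging information from several sources" means in practice can be sketched in a few lines of Python. This is not Tableau's own mechanism, only an illustration of joining two sources on a shared key (here, the date). The source names and figures are invented.

```python
# Hypothetical daily figures from two separate sources, keyed by ISO date.
web_visits = {"2015-03-01": 1200, "2015-03-02": 1350}
social_mentions = {"2015-03-01": 45, "2015-03-03": 60}

# Take every date that appears in either source and line the values up.
dates = sorted(set(web_visits) | set(social_mentions))
dashboard_rows = [
    {"date": d, "visits": web_visits.get(d), "mentions": social_mentions.get(d)}
    for d in dates
]

for row in dashboard_rows:
    print(row)
```

Dates that are missing from one source show up as `None`, which makes gaps in either source visible in the merged view.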
Why is it useful?
1. Tailor-made dashboards
2. Several layers (and sources) of information
3. Set clear goals and KPIs
4. Easy to export
5. Works for several industries and roles
Visualize and analyze data
Example of Tableau
How we can harness the power of the web
When we start working with data we stop “believing” and start thinking. All the data available can help us build consumer profiles, identify specific interests, spot potential issues with our product or even find new ways to connect with consumers.
1. Forecast (where the puck is going)
2. The Rise of the Robots (automation)
3. Cross-selling and tailor-made dashboards per client
4. Insights like you’ve never seen before
Business Case
Applied to Marketing
The scenario
Untapped customer intelligence: Luxottica needed to analyze historical data pertaining to more than 100 million customers to increase marketing effectiveness.
© Copyright IBM Corporation 2013
The Impact
Centralized analytics: The company deployed advanced analytics technology from IBM to create a 360-degree view of customers.
Actionable insights: By identifying the highest-value customers and creating individualized marketing campaigns, Luxottica anticipates a 10 percent boost in marketing effectiveness.
1. Anticipates a 10% improvement in marketing effectiveness
2. Identifies the highest-value customers out of nearly 100 million
3. Targets individual customers based on unique preferences and histories
Recommendations
Add-ons for Google Sheets
● Merge Sheets
● Data Everywhere
● Mapping Sheets
● Find Fuzzy Matches
● DukeDeploy
● BlockSpring
● Text Analysis
● Translate My Sheet
● AppSheet
● BigML
Q&A
