SlideShare a Scribd company logo
Data Story-telling with AWS QuickSight
US E-COMMERCE DATASET
DevAx Online Workshop Oct 2021- AWS Vietnam
BY QUYEN DINH
Notebook and report are available at my github repository
GOAL
- Creating a dashboard
- Finding insights about future trends and
improvements for the companies and/or products
TRACK
- Provided dataset
- US E-Commerce dataset
CONTENTS
1. Exploratory Data Analysis
2. Dashboard Overview
3. US-Ecommerce - Insights
4. Conclusions
1. EXPLORATORY DATA ANALYSIS
- Provided file type: .csv
- Number of table: 1
- Number of columns: 16
OVERVIEW ABOUT US E-COMMERCE DATASET
- Number of rows: 65, 535
- Header in table: True
- Data need to be cleaned up: Yes
SOME IMPORTANT ASSUMPTION ABOUT THE DATA
Transaction_start: Total Transaction.
Transaction_result:
+ Label 1: Completed Transactions
+ Label 0: Uncompleted/Abandoned/Return Order
Since there is no communication to get more inputs from that E-commerce company, we need to
make some assumption about the data for the further analysis:
- Assumption 1: In this context, cost is considered as Revenue
“Amount US$”: Total cost for the order => Total revenue from the order
“IndividualPriceUS$”: Cost for each item => Revenue on each item
- Assumption 2: After checking the value of the 2 columns (“Transaction_start” and
“Transaction_result, we interpreted the Transaction as followed:
PROBLEM WITH THE DATA
DATASET IS NOT READY FOR VISUALIZATION ON AWS QUICKSIGHT
Problem 1. The format for Date in the dataset is DD/MM/YYYY, while
the default in QuickSight is MM/DD/YY => data is not valid => Should
change the QuickSight Date format to DD/MM/YYYY before loading
the data into Visualization.
Problem 2. Some missing values in the
Individual_Price_US$ column
Problem 3. Some data values are not in a
desirable format
For example, 2 unique values on
“Delivery_Type” is 'one-day deliver' and
'Normal Delivery' => Need to capitalize the
first letter for consistency.
EXPLORATORY DATA ANALYSIS WAS DONE USING
PYTHON - JUPYTER NOTEBOOK
REASON FOR PERFORMING EDA USING PYTHON/JUPYTER NOTEBOOK
- Account was not linked to AWS SageMaker
- More familiar with Python and Jupyter Notebook, didn’t have enough time to explore other options on AWS
TASKS DONE
1. Convert the “Date” column to the standard format so that it could be ready to use in AWSQuicksight
2. Remove 159 rows that have Individual_Price_US is #VALUE! (0.24% of the data)
3. Clean up some values to get a better format and consistency:
spectacles' => 'Spectacles', 'vessels' => 'Vessels', 'one-day deliver' => 'One-day deliver'
4. Add the column “Weekday”: Convert Date to weekday 0 to 6, where 0 is Sunday and 6 is Saturday.
5. Change the “Time” column to only contain the hourly order information (00 to 23) for tracking daily pattern.
Notebook is available at my github repository
Remove 159 rows that have Individual_Price_US is #VALUE! (0.24% of the data) didn’t affect the data distribution.
Brief overview of the data distribution on original and cleaned up data (Python Matplotlib library)
Original data Cleaned-up data
IMPORTANT NOTE:
2. DASHBOARD OVERVIEW
BUILDING DASHBOARD WITH AWS QUICKSIGHT
US-COMMERCE DASHBOARD SCREENSHOTS
BUILDING DASHBOARD WITH AWS QUICKSIGHT
US-COMMERCE DASHBOARD SCREENSHOTS (continued)
BUILDING DASHBOARD WITH AWS QUICKSIGHT
US-COMMERCE DASHBOARD SCREENSHOTS (continued)
3. US E-COMMERCE
SOME IMPORTANT INSIGHTS
PART 3 WOULD BE PRESENTED AS THE FOLLOWING FORMAT
1. Describe the data/graphs/charts and some insights.
2. Using green text box for discussing about future trends and improvements for the
companies and/or products
US-COMMERCE PERFORMANCE OVERVIEW
- The data ranges in 115 days (5 months):
Sep 13, 2012 to Jan 14, 2014.
- Total revenue from all the orders was
around $268.8M.
- Customers/Orders were from 3 cities Los
Angeles (California), New York City (New
York), and Seattle (Washington).
- The majority of revenue was from Seattle.
But there was no order activities in Seattle
since Dec 13, 2013.
- The revenue was significant decreased
overtime.
QUESTIONS/DISCUSSION NEEDED FOR PERFORMANCE IMPROVEMENT
1. What happened to the e-commerce activity in Seattle after Dec 13,
2013?
2. Why there was a great revenue decrease? Is it related to season?
(Normally Nov- Dec is the high season comparing to other months
of the year).
When looking at the Revenue by
Month, except Seattle, the
revenue/performance looks stable
in New York city and LA. The sharp
decrease in revenue could be
related to the problem in Seattle.
⇨ In such a short period of time (5 months), could not make any assumption on the revenue decrease. Need data on other
periods to investigate more.
PRODUCT INSIGHTS
CATEGORY
- 8 different product categories.
- The top categories that contributed to revenue
is Fashion, Clothing, Wearables, and Stationaries.
PRODUCT
- 12 specifics products.
- The top products were Fairness Cream (Fashion),
Shirt, Jeans (Clothing), Spectacles (Fashion), Shoes
(Wearables). Notes that in this product chart, the
color reflects its own category.
- There are the huge differences in “Individual Price of Product” with a lot of outliers
- The quantity in each product transaction is quite “stable” with not extreme outliers.
⇨ Need to perform Product Segmentation (normal, luxury, etc.) to see the performance of different product segments
⇨ Need to have further inputs from the company on how to do Product Segmentation
CUSTOMER SEGMENTATION
- Revenue came mostly from the “Member”
customer, followed by Guest order.
- In New York, the behavior is slightly different:
more orders came from guest account comparing
to other cities.
⇨ More programs to “take care” of the “Member” customers.
⇨ Program to convert other customer types to “Members”.
⇨ The approach for launching the loyalty program could be
different in New York.
ORDER ACTIVITY - ORDERING PATTERN
- The order activities was slightly
high on Sunday and Thursday, and
low on Friday and Saturday.
⇨ More promotion and
advertisement on Sunday and
Thursday
⇨ More promotion and advertisement on Sunday and
Thursday
- Surprisingly, there was no downtime in 24 hours of a day!!!
Everyone in the 3 big cities was just wide awake and did
shopping day and night!!
⇨ Need to investigate more on the Timestamp data to see if there is any
problem with data input/ETL process.
TOTAL TRANSACTION VERSUS COMPLETED TRANSACTION
- The percentage of completed transaction (conversion rate) is 86.74%, which is pretty good.
- When looking as the total transaction and completed transaction overtime, the conversion rate looks stable across 5 months.
⇨ Need to improve it? Then it needs more input from the company.
PLATFORMS USED BY CUSTOMERS
- 81% of the transaction was performed by Web.
- In New York and Los Angeles, the device usage tendency was equal in male and female. But it is not the same for
Seattle. More female used mobile for shopping.
⇨ Need to discuss suitable actions for
promotion/advertisements/ maintenance
on Web.
⇨ Improve experience for Mobile app.
⇨ The approach could be different in Seattle
since more female used mobile app
comparing to other cities.
CUSTOMER BEHAVIOR BY GENDER
- Majority of customer (>50%) was Male. Significantly, in New York, 70.13% of customer was Male.
⇨ Need to discuss suitable actions for promotion/advertisements/ maintenance target on Gender.
CATEGORY/PRODUCT TENDENCY BY GENDER
- Interestingly, Female and Male Customer shared the
same interest/need in buying Fairness Cream.
- Except that, all other products were purchased
mostly either by Male or Female => Really
gender-specific product!
⇨ Need to discuss suitable actions for
promotion/advertisements/ maintenance target
on Gender:
- Keep the product gender-specific like that, or
change?
- The way to expand the product variety: what
kinds of product to be expanded?
4. CONCLUSIONS
CONCLUSION ON US E-COMMERCE DATASET
- Data science project is a repetitive cycle, which needs lots of inputs/feedback from the company for
further actions (data analysis, finding more insights for improving and forecasting purposes).
⇨ Besides all the discussions/suggestions on other slides, E-commerce dataset needs more data and
input for further analysis.
CONCLUSION ON AWS VIETNAM - DEVAX ONLINE WORKSHOP
(Oct 2021)
OVERALL
- Great contents, organization and participation! Great job! Thank you!
MY EXPERIENCE WITH AWS QUICKSIGHT
Since I could only join the 3rd and 4th session of this workshop, I could give my opinion on AWS Quicksight only.
- QuickSight is a potential platform for data visualization.
- I realized that I need more customized graphs, don’t know whether QuickSight supports it, such as:
+ Combo charts: Stack more charts in the same figure without available fields.
+ Customize the length, width of the bars (on bar charts)
+ Adding a few label values (not all the values) on the existing charts.
+ Drill-down/Filtering features are not so user friendly (visualization result).
+ I still could not figure out how to modify/prepare, clean data on QuickSight. I still need to
prepare/modify the data using other tools.
- Impression on community support: comparing to Tableau and other BI tools, I have the feeling that the
available resources for QuickSight is not as much as other tools. When searching for some
troubleshootings, there were few answers available. Many online questions/issues were left
unanswered.
Thank you!

More Related Content

What's hot

All you wanted to know about analytics in e commerce- amazon, ebay, flipkart
All you wanted to know about analytics in e commerce- amazon, ebay, flipkartAll you wanted to know about analytics in e commerce- amazon, ebay, flipkart
All you wanted to know about analytics in e commerce- amazon, ebay, flipkart
Anju Gothwal
 
Retail Analytics Solution - CloudMoyo
Retail Analytics Solution - CloudMoyoRetail Analytics Solution - CloudMoyo
Retail Analytics Solution - CloudMoyo
CloudMoyo
 
How to avoid Cart Abandonment on an eCommerce store?
How to avoid Cart Abandonment on an eCommerce store?How to avoid Cart Abandonment on an eCommerce store?
How to avoid Cart Abandonment on an eCommerce store?
Knowband Store
 
Industry Focus: Departmental Stores Retail
Industry Focus: Departmental Stores RetailIndustry Focus: Departmental Stores Retail
Industry Focus: Departmental Stores Retail
Integrated Computer Systems, Inc.
 
Retail Management System
Retail Management SystemRetail Management System
Retail Management System
effiasoft
 
Big Data Analytics and its Application in E-Commerce
Big Data Analytics and its Application in E-CommerceBig Data Analytics and its Application in E-Commerce
Big Data Analytics and its Application in E-Commerce
Uyoyo Edosio
 

What's hot (6)

All you wanted to know about analytics in e commerce- amazon, ebay, flipkart
All you wanted to know about analytics in e commerce- amazon, ebay, flipkartAll you wanted to know about analytics in e commerce- amazon, ebay, flipkart
All you wanted to know about analytics in e commerce- amazon, ebay, flipkart
 
Retail Analytics Solution - CloudMoyo
Retail Analytics Solution - CloudMoyoRetail Analytics Solution - CloudMoyo
Retail Analytics Solution - CloudMoyo
 
How to avoid Cart Abandonment on an eCommerce store?
How to avoid Cart Abandonment on an eCommerce store?How to avoid Cart Abandonment on an eCommerce store?
How to avoid Cart Abandonment on an eCommerce store?
 
Industry Focus: Departmental Stores Retail
Industry Focus: Departmental Stores RetailIndustry Focus: Departmental Stores Retail
Industry Focus: Departmental Stores Retail
 
Retail Management System
Retail Management SystemRetail Management System
Retail Management System
 
Big Data Analytics and its Application in E-Commerce
Big Data Analytics and its Application in E-CommerceBig Data Analytics and its Application in E-Commerce
Big Data Analytics and its Application in E-Commerce
 

Similar to [Provided Data - US] ChiQuyen Dinh

CaseStudy.pptx
CaseStudy.pptxCaseStudy.pptx
CaseStudy.pptx
XimenaBustamante14
 
Building a financial data warehouse: A lesson in empathy
Building a financial data warehouse: A lesson in empathyBuilding a financial data warehouse: A lesson in empathy
Building a financial data warehouse: A lesson in empathy
Solmaz Shahalizadeh
 
1.1 DetailsCase Study Scenario - Global Trading PLCGlobal Tra.docx
1.1 DetailsCase Study Scenario - Global Trading PLCGlobal Tra.docx1.1 DetailsCase Study Scenario - Global Trading PLCGlobal Tra.docx
1.1 DetailsCase Study Scenario - Global Trading PLCGlobal Tra.docx
jackiewalcutt
 
1.1DetailsCase Study Scenario - Global Trading PLCGlo.docx
1.1DetailsCase Study Scenario - Global Trading PLCGlo.docx1.1DetailsCase Study Scenario - Global Trading PLCGlo.docx
1.1DetailsCase Study Scenario - Global Trading PLCGlo.docx
elliotkimberlee
 
How and Why: Embedded Analytics Interfaces For Your SaaS Product
How and Why: Embedded Analytics Interfaces For Your SaaS ProductHow and Why: Embedded Analytics Interfaces For Your SaaS Product
How and Why: Embedded Analytics Interfaces For Your SaaS Product
Aggregage
 
Modern Product Data Workflows: How and Why: Embedded Analytics Interfaces For...
Modern Product Data Workflows: How and Why: Embedded Analytics Interfaces For...Modern Product Data Workflows: How and Why: Embedded Analytics Interfaces For...
Modern Product Data Workflows: How and Why: Embedded Analytics Interfaces For...
Hannah Flynn
 
Management information system
Management information systemManagement information system
Management information systemKush Sharma
 
Case Study Scenario - Global Trading PLCGlobal Trading PLC is.docx
Case Study Scenario - Global Trading PLCGlobal Trading PLC is.docxCase Study Scenario - Global Trading PLCGlobal Trading PLC is.docx
Case Study Scenario - Global Trading PLCGlobal Trading PLC is.docx
tidwellveronique
 
Data Provisioning & Optimization
Data Provisioning & OptimizationData Provisioning & Optimization
Data Provisioning & OptimizationAmbareesh Kulkarni
 
Introduction To Msbi By Yasir
Introduction To Msbi By YasirIntroduction To Msbi By Yasir
Introduction To Msbi By Yasir
guest7c8e5f
 
Caching Business Logic in the Database
Caching Business Logic in the DatabaseCaching Business Logic in the Database
Caching Business Logic in the Database
Jonathan Levin
 
Content marketing analytics: what you should really be doing
Content marketing analytics: what you should really be doingContent marketing analytics: what you should really be doing
Content marketing analytics: what you should really be doing
Daniel Smulevich
 
Project report aditi paul1
Project report aditi paul1Project report aditi paul1
Project report aditi paul1
guest9529cb
 
Microsoft Office Performance Point
Microsoft Office Performance PointMicrosoft Office Performance Point
Microsoft Office Performance Point
Kashif Akram
 
Content Marketing Analytics - What you should really be doing... and probably...
Content Marketing Analytics - What you should really be doing... and probably...Content Marketing Analytics - What you should really be doing... and probably...
Content Marketing Analytics - What you should really be doing... and probably...
DigitalMarketingShow
 
Benefits of a data warehouse presentation by Being topper
Benefits of a data warehouse presentation by Being topperBenefits of a data warehouse presentation by Being topper
Benefits of a data warehouse presentation by Being topper
Being Topper
 
Sai Charan_Thotapalli_Internship Poster
Sai Charan_Thotapalli_Internship PosterSai Charan_Thotapalli_Internship Poster
Sai Charan_Thotapalli_Internship PosterSai Charan Thotapalli
 
Intro to Data warehousing lecture 15
Intro to Data warehousing   lecture 15Intro to Data warehousing   lecture 15
Intro to Data warehousing lecture 15
AnwarrChaudary
 
What is OLAP -Data Warehouse Concepts - IT Online Training @ Newyorksys
What is OLAP -Data Warehouse Concepts - IT Online Training @ NewyorksysWhat is OLAP -Data Warehouse Concepts - IT Online Training @ Newyorksys
What is OLAP -Data Warehouse Concepts - IT Online Training @ Newyorksys
NEWYORKSYS-IT SOLUTIONS
 
207828627 sap-bootcamp-quiz-sd
207828627 sap-bootcamp-quiz-sd207828627 sap-bootcamp-quiz-sd
207828627 sap-bootcamp-quiz-sd
homeworkping8
 

Similar to [Provided Data - US] ChiQuyen Dinh (20)

CaseStudy.pptx
CaseStudy.pptxCaseStudy.pptx
CaseStudy.pptx
 
Building a financial data warehouse: A lesson in empathy
Building a financial data warehouse: A lesson in empathyBuilding a financial data warehouse: A lesson in empathy
Building a financial data warehouse: A lesson in empathy
 
1.1 DetailsCase Study Scenario - Global Trading PLCGlobal Tra.docx
1.1 DetailsCase Study Scenario - Global Trading PLCGlobal Tra.docx1.1 DetailsCase Study Scenario - Global Trading PLCGlobal Tra.docx
1.1 DetailsCase Study Scenario - Global Trading PLCGlobal Tra.docx
 
1.1DetailsCase Study Scenario - Global Trading PLCGlo.docx
1.1DetailsCase Study Scenario - Global Trading PLCGlo.docx1.1DetailsCase Study Scenario - Global Trading PLCGlo.docx
1.1DetailsCase Study Scenario - Global Trading PLCGlo.docx
 
How and Why: Embedded Analytics Interfaces For Your SaaS Product
How and Why: Embedded Analytics Interfaces For Your SaaS ProductHow and Why: Embedded Analytics Interfaces For Your SaaS Product
How and Why: Embedded Analytics Interfaces For Your SaaS Product
 
Modern Product Data Workflows: How and Why: Embedded Analytics Interfaces For...
Modern Product Data Workflows: How and Why: Embedded Analytics Interfaces For...Modern Product Data Workflows: How and Why: Embedded Analytics Interfaces For...
Modern Product Data Workflows: How and Why: Embedded Analytics Interfaces For...
 
Management information system
Management information systemManagement information system
Management information system
 
Case Study Scenario - Global Trading PLCGlobal Trading PLC is.docx
Case Study Scenario - Global Trading PLCGlobal Trading PLC is.docxCase Study Scenario - Global Trading PLCGlobal Trading PLC is.docx
Case Study Scenario - Global Trading PLCGlobal Trading PLC is.docx
 
Data Provisioning & Optimization
Data Provisioning & OptimizationData Provisioning & Optimization
Data Provisioning & Optimization
 
Introduction To Msbi By Yasir
Introduction To Msbi By YasirIntroduction To Msbi By Yasir
Introduction To Msbi By Yasir
 
Caching Business Logic in the Database
Caching Business Logic in the DatabaseCaching Business Logic in the Database
Caching Business Logic in the Database
 
Content marketing analytics: what you should really be doing
Content marketing analytics: what you should really be doingContent marketing analytics: what you should really be doing
Content marketing analytics: what you should really be doing
 
Project report aditi paul1
Project report aditi paul1Project report aditi paul1
Project report aditi paul1
 
Microsoft Office Performance Point
Microsoft Office Performance PointMicrosoft Office Performance Point
Microsoft Office Performance Point
 
Content Marketing Analytics - What you should really be doing... and probably...
Content Marketing Analytics - What you should really be doing... and probably...Content Marketing Analytics - What you should really be doing... and probably...
Content Marketing Analytics - What you should really be doing... and probably...
 
Benefits of a data warehouse presentation by Being topper
Benefits of a data warehouse presentation by Being topperBenefits of a data warehouse presentation by Being topper
Benefits of a data warehouse presentation by Being topper
 
Sai Charan_Thotapalli_Internship Poster
Sai Charan_Thotapalli_Internship PosterSai Charan_Thotapalli_Internship Poster
Sai Charan_Thotapalli_Internship Poster
 
Intro to Data warehousing lecture 15
Intro to Data warehousing   lecture 15Intro to Data warehousing   lecture 15
Intro to Data warehousing lecture 15
 
What is OLAP -Data Warehouse Concepts - IT Online Training @ Newyorksys
What is OLAP -Data Warehouse Concepts - IT Online Training @ NewyorksysWhat is OLAP -Data Warehouse Concepts - IT Online Training @ Newyorksys
What is OLAP -Data Warehouse Concepts - IT Online Training @ Newyorksys
 
207828627 sap-bootcamp-quiz-sd
207828627 sap-bootcamp-quiz-sd207828627 sap-bootcamp-quiz-sd
207828627 sap-bootcamp-quiz-sd
 

More from Lam Le

Module 2 - Datalake
Module 2 - DatalakeModule 2 - Datalake
Module 2 - Datalake
Lam Le
 
Module 3 - QuickSight Overview
Module 3 - QuickSight OverviewModule 3 - QuickSight Overview
Module 3 - QuickSight Overview
Lam Le
 
Module 1 - CP Datalake on AWS
Module 1 - CP Datalake on AWSModule 1 - CP Datalake on AWS
Module 1 - CP Datalake on AWS
Lam Le
 
[Custom Data] Ngo Duy Vu
[Custom Data] Ngo Duy Vu[Custom Data] Ngo Duy Vu
[Custom Data] Ngo Duy Vu
Lam Le
 
[Custom Data] Alice Nguyen
[Custom Data] Alice Nguyen[Custom Data] Alice Nguyen
[Custom Data] Alice Nguyen
Lam Le
 
[Provided Data - US] Thien Tran
[Provided Data - US] Thien Tran[Provided Data - US] Thien Tran
[Provided Data - US] Thien Tran
Lam Le
 
[Custom Data] Hy Dang
 [Custom Data] Hy Dang [Custom Data] Hy Dang
[Custom Data] Hy Dang
Lam Le
 
[Custom Data] Ha Hoang
[Custom Data] Ha Hoang[Custom Data] Ha Hoang
[Custom Data] Ha Hoang
Lam Le
 
[Provided Data - US] Tran Chau
[Provided Data - US] Tran Chau[Provided Data - US] Tran Chau
[Provided Data - US] Tran Chau
Lam Le
 
[Custom Data] Alice Nguyen
[Custom Data] Alice Nguyen[Custom Data] Alice Nguyen
[Custom Data] Alice Nguyen
Lam Le
 
[Provided Data - Brazil] Vuong.le
[Provided Data - Brazil] Vuong.le[Provided Data - Brazil] Vuong.le
[Provided Data - Brazil] Vuong.le
Lam Le
 
[Provided data - Brazil] Tran Manh Cuong
[Provided data - Brazil] Tran Manh Cuong[Provided data - Brazil] Tran Manh Cuong
[Provided data - Brazil] Tran Manh Cuong
Lam Le
 
[Custom data] Ngo Duy Vu
[Custom data] Ngo Duy Vu[Custom data] Ngo Duy Vu
[Custom data] Ngo Duy Vu
Lam Le
 

More from Lam Le (13)

Module 2 - Datalake
Module 2 - DatalakeModule 2 - Datalake
Module 2 - Datalake
 
Module 3 - QuickSight Overview
Module 3 - QuickSight OverviewModule 3 - QuickSight Overview
Module 3 - QuickSight Overview
 
Module 1 - CP Datalake on AWS
Module 1 - CP Datalake on AWSModule 1 - CP Datalake on AWS
Module 1 - CP Datalake on AWS
 
[Custom Data] Ngo Duy Vu
[Custom Data] Ngo Duy Vu[Custom Data] Ngo Duy Vu
[Custom Data] Ngo Duy Vu
 
[Custom Data] Alice Nguyen
[Custom Data] Alice Nguyen[Custom Data] Alice Nguyen
[Custom Data] Alice Nguyen
 
[Provided Data - US] Thien Tran
[Provided Data - US] Thien Tran[Provided Data - US] Thien Tran
[Provided Data - US] Thien Tran
 
[Custom Data] Hy Dang
 [Custom Data] Hy Dang [Custom Data] Hy Dang
[Custom Data] Hy Dang
 
[Custom Data] Ha Hoang
[Custom Data] Ha Hoang[Custom Data] Ha Hoang
[Custom Data] Ha Hoang
 
[Provided Data - US] Tran Chau
[Provided Data - US] Tran Chau[Provided Data - US] Tran Chau
[Provided Data - US] Tran Chau
 
[Custom Data] Alice Nguyen
[Custom Data] Alice Nguyen[Custom Data] Alice Nguyen
[Custom Data] Alice Nguyen
 
[Provided Data - Brazil] Vuong.le
[Provided Data - Brazil] Vuong.le[Provided Data - Brazil] Vuong.le
[Provided Data - Brazil] Vuong.le
 
[Provided data - Brazil] Tran Manh Cuong
[Provided data - Brazil] Tran Manh Cuong[Provided data - Brazil] Tran Manh Cuong
[Provided data - Brazil] Tran Manh Cuong
 
[Custom data] Ngo Duy Vu
[Custom data] Ngo Duy Vu[Custom data] Ngo Duy Vu
[Custom data] Ngo Duy Vu
 

Recently uploaded

一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
pchutichetpong
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
nscud
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
ocavb
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
ewymefz
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
slg6lamcq
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
ahzuo
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Boston Institute of Analytics
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 

Recently uploaded (20)

一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 

[Provided Data - US] ChiQuyen Dinh

  • 1. Data Story-telling with AWS QuickSight US E-COMMERCE DATASET DevAx Online Workshop Oct 2021- AWS Vietnam BY QUYEN DINH Notebook and report are available at my github repository
  • 2. GOAL - Creating a dashboard - Finding insights about future trends and improvements for the companies and/or products TRACK - Provided dataset - US E-Commerce dataset
  • 3. CONTENTS 1. Exploratory Data Analysis 2. Dashboard Overview 3. US-Ecommerce - Insights 4. Conclusions
  • 5. - Provided file type: .csv - Number of table: 1 - Number of columns: 16 OVERVIEW ABOUT US E-COMMERCE DATASET - Number of rows: 65, 535 - Header in table: True - Data need to be cleaned up: Yes
  • 6. SOME IMPORTANT ASSUMPTION ABOUT THE DATA Transaction_start: Total Transaction. Transaction_result: + Label 1: Completed Transactions + Label 0: Uncompleted/Abandoned/Return Order Since there is no communication to get more inputs from that E-commerce company, we need to make some assumption about the data for the further analysis: - Assumption 1: In this context, cost is considered as Revenue “Amount US$”: Total cost for the order => Total revenue from the order “IndividualPriceUS$”: Cost for each item => Revenue on each item - Assumption 2: After checking the value of the 2 columns (“Transaction_start” and “Transaction_result, we interpreted the Transaction as followed:
  • 7. PROBLEM WITH THE DATA DATASET IS NOT READY FOR VISUALIZATION ON AWS QUICKSIGHT Problem 1. The format for Date in the dataset is DD/MM/YYYY, while the default in QuickSight is MM/DD/YY => data is not valid => Should change the QuickSight Date format to DD/MM/YYYY before loading the data into Visualization. Problem 2. Some missing values in the Individual_Price_US$ column Problem 3. Some data values are not in a desirable format For example, 2 unique values on “Delivery_Type” is 'one-day deliver' and 'Normal Delivery' => Need to capitalize the first letter for consistency.
  • 8. EXPLORATORY DATA ANALYSIS WAS DONE USING PYTHON - JUPYTER NOTEBOOK REASON FOR PERFORMING EDA USING PYTHON/JUPYTER NOTEBOOK - Account was not linked to AWS SageMaker - More familiar with Python and Jupyter Notebook, didn’t have enough time to explore other options on AWS TASKS DONE 1. Convert the “Date” column to the standard format so that it could be ready to use in AWSQuicksight 2. Remove 159 rows that have Individual_Price_US is #VALUE! (0.24% of the data) 3. Clean up some values to get a better format and consistency: spectacles' => 'Spectacles', 'vessels' => 'Vessels', 'one-day deliver' => 'One-day deliver' 4. Add the column “Weekday”: Convert Date to weekday 0 to 6, where 0 is Sunday and 6 is Saturday. 5. Change the “Time” column to only contain the hourly order information (00 to 23) for tracking daily pattern. Notebook is available at my github repository
  • 9. Remove 159 rows that have Individual_Price_US is #VALUE! (0.24% of the data) didn’t affect the data distribution. Brief overview of the data distribution on original and cleaned up data (Python Matplotlib library) Original data Cleaned-up data IMPORTANT NOTE:
  • 11. BUILDING DASHBOARD WITH AWS QUICKSIGHT US-COMMERCE DASHBOARD SCREENSHOTS
  • 12. BUILDING DASHBOARD WITH AWS QUICKSIGHT US-COMMERCE DASHBOARD SCREENSHOTS (continued)
  • 13. BUILDING DASHBOARD WITH AWS QUICKSIGHT US-COMMERCE DASHBOARD SCREENSHOTS (continued)
  • 14. 3. US E-COMMERCE SOME IMPORTANT INSIGHTS
  • 15. PART 3 WOULD BE PRESENTED AS THE FOLLOWING FORMAT 1. Describe the data/graphs/charts and some insights. 2. Using green text box for discussing about future trends and improvements for the companies and/or products
  • 16. US-COMMERCE PERFORMANCE OVERVIEW - The data ranges in 115 days (5 months): Sep 13, 2012 to Jan 14, 2014. - Total revenue from all the orders was around $268.8M. - Customers/Orders were from 3 cities Los Angeles (California), New York City (New York), and Seattle (Washington). - The majority of revenue was from Seattle. But there was no order activities in Seattle since Dec 13, 2013. - The revenue was significant decreased overtime.
  • 17. QUESTIONS/DISCUSSION NEEDED FOR PERFORMANCE IMPROVEMENT 1. What happened to the e-commerce activity in Seattle after Dec 13, 2013? 2. Why there was a great revenue decrease? Is it related to season? (Normally Nov- Dec is the high season comparing to other months of the year). When looking at the Revenue by Month, except Seattle, the revenue/performance looks stable in New York city and LA. The sharp decrease in revenue could be related to the problem in Seattle. ⇨ In such a short period of time (5 months), could not make any assumption on the revenue decrease. Need data on other periods to investigate more.
  • 18. PRODUCT INSIGHTS CATEGORY - 8 different product categories. - The top categories that contributed to revenue is Fashion, Clothing, Wearables, and Stationaries. PRODUCT - 12 specifics products. - The top products were Fairness Cream (Fashion), Shirt, Jeans (Clothing), Spectacles (Fashion), Shoes (Wearables). Notes that in this product chart, the color reflects its own category.
  • 19. - There are the huge differences in “Individual Price of Product” with a lot of outliers - The quantity in each product transaction is quite “stable” with not extreme outliers. ⇨ Need to perform Product Segmentation (normal, luxury, etc.) to see the performance of different product segments ⇨ Need to have further inputs from the company on how to do Product Segmentation
  • 20. CUSTOMER SEGMENTATION - Revenue came mostly from the “Member” customer, followed by Guest order. - In New York, the behavior is slightly different: more orders came from guest account comparing to other cities. ⇨ More programs to “take care” of the “Member” customers. ⇨ Program to convert other customer types to “Members”. ⇨ The approach for launching the loyalty program could be different in New York.
  • 21. ORDER ACTIVITY - ORDERING PATTERN - The order activities was slightly high on Sunday and Thursday, and low on Friday and Saturday. ⇨ More promotion and advertisement on Sunday and Thursday ⇨ More promotion and advertisement on Sunday and Thursday - Surprisingly, there was no downtime in 24 hours of a day!!! Everyone in the 3 big cities was just wide awake and did shopping day and night!! ⇨ Need to investigate more on the Timestamp data to see if there is any problem with data input/ETL process.
  • 22. TOTAL TRANSACTION VERSUS COMPLETED TRANSACTION - The percentage of completed transaction (conversion rate) is 86.74%, which is pretty good. - When looking as the total transaction and completed transaction overtime, the conversion rate looks stable across 5 months. ⇨ Need to improve it? Then it needs more input from the company.
  • 23. PLATFORMS USED BY CUSTOMERS - 81% of the transaction was performed by Web. - In New York and Los Angeles, the device usage tendency was equal in male and female. But it is not the same for Seattle. More female used mobile for shopping. ⇨ Need to discuss suitable actions for promotion/advertisements/ maintenance on Web. ⇨ Improve experience for Mobile app. ⇨ The approach could be different in Seattle since more female used mobile app comparing to other cities.
  • 24. CUSTOMER BEHAVIOR BY GENDER - Majority of customer (>50%) was Male. Significantly, in New York, 70.13% of customer was Male. ⇨ Need to discuss suitable actions for promotion/advertisements/ maintenance target on Gender.
  • 25. CATEGORY/PRODUCT TENDENCY BY GENDER - Interestingly, Female and Male Customer shared the same interest/need in buying Fairness Cream. - Except that, all other products were purchased mostly either by Male or Female => Really gender-specific product! ⇨ Need to discuss suitable actions for promotion/advertisements/ maintenance target on Gender: - Keep the product gender-specific like that, or change? - The way to expand the product variety: what kinds of product to be expanded?
  • 27. CONCLUSION ON US E-COMMERCE DATASET - Data science project is a repetitive cycle, which needs lots of inputs/feedback from the company for further actions (data analysis, finding more insights for improving and forecasting purposes). ⇨ Besides all the discussions/suggestions on other slides, E-commerce dataset needs more data and input for further analysis.
  • 28. CONCLUSION ON AWS VIETNAM - DEVAX ONLINE WORKSHOP (Oct 2021) OVERALL - Great contents, organization and participation! Great job! Thank you!
  • 29. MY EXPERIENCE WITH AWS QUICKSIGHT Since I could only join the 3rd and 4th session of this workshop, I could give my opinion on AWS Quicksight only. - QuickSight is a potential platform for data visualization. - I realized that I need more customized graphs, don’t know whether QuickSight supports it, such as: + Combo charts: Stack more charts in the same figure without available fields. + Customize the length, width of the bars (on bar charts) + Adding a few label values (not all the values) on the existing charts. + Drill-down/Filtering features are not so user friendly (visualization result). + I still could not figure out how to modify/prepare, clean data on QuickSight. I still need to prepare/modify the data using other tools. - Impression on community support: comparing to Tableau and other BI tools, I have the feeling that the available resources for QuickSight is not as much as other tools. When searching for some troubleshootings, there were few answers available. Many online questions/issues were left unanswered.