202312 Exploration Data Analysis Visualization (English version)

FEG

EDA Visualization Before Building a Model
Orozco Hsu
2023-10-31

About me
• Education
  • NCU (MIS), NCCU (CS)
• Work Experience
  • Telecom big data innovation
  • AI projects
  • Retail marketing technology
• User Group
  • TW Spark User Group
  • TW Hadoop User Group
  • Taiwan Data Engineer Association Director
• Research
  • Big Data / ML / AIoT / AI columnist

Tutorial Content
• Iris dataset summary
• EDA and visualization
• Homework

Code
• Download materials:
  • https://drive.google.com/drive/folders/1Kaneenrtd2P2IWbo-PhMd3b6NvtT5FOc?usp=sharing

The most recommended dataset
• The independent variables are numerical and the dependent variable is categorical
• Such a dataset is also easy to cluster
• The preferred type for the dependent variable is binary, meaning it is either 'YES' or 'NO'
• When there are more than two independent variables, accuracy tends to decrease
• The most commonly used algorithm for this kind of dataset is logistic regression
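To make the last bullet concrete, here is a minimal sketch of logistic regression on a dataset of the recommended shape: numerical independent variables and a binary 'YES'/'NO' dependent variable. The toy rows, the function names (`fit_logistic`, `predict`), and the learning rate and epoch count are all assumptions made for illustration; the rows are invented and already centered around zero.

```python
import math

# Hypothetical toy data: two numerical independent variables (centered
# around zero) and a binary dependent variable.
X = [(-2.0, -1.5), (-1.5, -2.0), (-1.0, -1.2),
     (1.0, 1.5), (1.5, 1.0), (2.0, 1.8)]
y = ["NO", "NO", "NO", "YES", "YES", "YES"]


def sigmoid(z):
    """Map a real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=1000):
    """Fit weights and bias with plain batch gradient descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    targets = [1.0 if label == "YES" else 0.0 for label in y]
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, ti in zip(X, targets):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - ti  # gradient of the log-loss with respect to z
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        # Step against the average gradient.
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(w, b, x):
    """Return 'YES' when the predicted probability is at least 0.5."""
    p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return "YES" if p >= 0.5 else "NO"


w, b = fit_logistic(X, y)
print([predict(w, b, x) for x in X])
```

In practice a library implementation would normally be used; the hand-written loop only shows what the algorithm does with numerical inputs and a binary target.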

Homework
• Change the dataset to one with a numeric target feature
• Explain what the data visualizations show
• housing.tab

Table
• Load dataset
• Iris.tab
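Loading a tab-separated dataset like the one named on this slide could be sketched as follows. This assumes a plain tab-separated layout with a single header row; some tools write extra header rows for column types and roles, which would have to be skipped first. The column names, the sample rows, and the `load_tab` helper are hypothetical stand-ins, not the contents of the actual file.

```python
import csv
import io

# Hypothetical stand-in for the contents of a tab-separated file;
# the real file's column names and header layout may differ.
SAMPLE_TAB = (
    "sepal length\tsepal width\tpetal length\tpetal width\tiris\n"
    "5.1\t3.5\t1.4\t0.2\tIris-setosa\n"
    "7.0\t3.2\t4.7\t1.4\tIris-versicolor\n"
    "6.3\t3.3\t6.0\t2.5\tIris-virginica\n"
)


def load_tab(handle, target):
    """Read tab-separated rows into (features, labels).

    features is a list of dicts mapping column name to float;
    labels holds the values of the target column.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    features, labels = [], []
    for row in reader:
        labels.append(row.pop(target))
        features.append({name: float(value) for name, value in row.items()})
    return features, labels


features, labels = load_tab(io.StringIO(SAMPLE_TAB), target="iris")
print(len(features), "rows; columns:", list(features[0]))
print("classes:", sorted(set(labels)))
```

To read a file from disk instead, pass `open(path, newline="")` in place of the `io.StringIO` object.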

Rank
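Ranking means scoring each feature by how well it separates the classes and then sorting by that score. The slide does not say which score is used, so the sketch below assumes one common choice, the one-way ANOVA F-statistic (between-class variance divided by within-class variance). The nine rows are a small illustrative sample in the style of the Iris data, and `anova_f` is a hypothetical helper name.

```python
from collections import defaultdict

FEATURES = ["sepal length", "sepal width", "petal length", "petal width"]

# Small illustrative sample: three rows per class.
ROWS = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((4.7, 3.2, 1.3, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.9, 3.1, 4.9, 1.5), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
    ((7.1, 3.0, 5.9, 2.1), "virginica"),
]


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a numeric feature against class labels."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    grand_mean = sum(values) / len(values)
    between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )
    if within == 0:
        return float("inf")  # classes are perfectly separated by this feature
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (between / df_between) / (within / df_within)


labels = [label for _, label in ROWS]
scores = {
    name: anova_f([row[i] for row, _ in ROWS], labels)
    for i, name in enumerate(FEATURES)
}
# Highest score first: the feature that best separates the classes.
ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for name, score in ranking:
    print(f"{name:13s} F = {score:.2f}")
```

Other scores, such as information gain or the Gini index, would slot into the same sort step.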

Box plot
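A box plot draws a five-number summary, so reading one is easier once the numbers behind it are clear. The sketch below computes the quartiles, the whisker ends, and the outliers using the common 1.5 × IQR rule; plotting tools may use a slightly different quartile method. The values and the `box_plot_summary` helper are hypothetical.

```python
import statistics

# Hypothetical measurements of one numerical feature.
values = [2.0, 2.9, 3.0, 3.0, 3.1, 3.2, 3.4, 3.5, 4.4]


def box_plot_summary(data):
    """Return the quantities a box plot draws for one feature."""
    ordered = sorted(data)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in ordered if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,                     # bottom edge of the box
        "median": median,             # line inside the box
        "q3": q3,                     # top edge of the box
        "whisker_low": inside[0],     # lowest value within the fences
        "whisker_high": inside[-1],   # highest value within the fences
        "outliers": [v for v in ordered if v < lower_fence or v > upper_fence],
    }


summary = box_plot_summary(values)
for key, value in summary.items():
    print(f"{key:12s} {value}")
```

Comparing these summaries across classes, one box per class, shows quickly whether a feature separates the classes or overlaps heavily.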

Feature Statistics
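Feature statistics summarize each column before any model is built. A minimal sketch with the standard library follows; the nine rows are a small illustrative sample in the style of the Iris data, and the choice of statistics (mean, median, standard deviation, minimum, maximum) is an assumption about what such a summary typically contains.

```python
import statistics

FEATURES = ["sepal length", "sepal width", "petal length", "petal width"]

# Small illustrative sample: one tuple of measurements per row.
ROWS = [
    (5.1, 3.5, 1.4, 0.2),
    (4.9, 3.0, 1.4, 0.2),
    (4.7, 3.2, 1.3, 0.2),
    (7.0, 3.2, 4.7, 1.4),
    (6.4, 3.2, 4.5, 1.5),
    (6.9, 3.1, 4.9, 1.5),
    (6.3, 3.3, 6.0, 2.5),
    (5.8, 2.7, 5.1, 1.9),
    (7.1, 3.0, 5.9, 2.1),
]


def feature_statistics(rows, names):
    """Compute basic descriptive statistics for every column."""
    stats = {}
    for i, name in enumerate(names):
        column = [row[i] for row in rows]
        stats[name] = {
            "mean": statistics.mean(column),
            "median": statistics.median(column),
            "stdev": statistics.stdev(column),  # sample standard deviation
            "min": min(column),
            "max": max(column),
        }
    return stats


stats = feature_statistics(ROWS, FEATURES)
for name, s in stats.items():
    print(
        f"{name:13s} mean={s['mean']:.2f} median={s['median']:.2f} "
        f"stdev={s['stdev']:.2f} min={s['min']} max={s['max']}"
    )
```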
