Analyzing Social Opinions on Social Media Platforms:
Women Driving as a Social Case
Aseel Addawood
PhD student in data science and social science
ABD
About Me
• Fifth-year PhD candidate in Informatics at the University of
Illinois at Urbana-Champaign.
• Master of Science in Information Management, University of
Illinois at Urbana-Champaign.
• Master of Engineering in Computer Science, Cornell University.
• Bachelor of Science in Information Technology, King Saud
University.
@aseel_addawood https://sites.google.com/view/aseeladdawood/
My Research Interest
Understanding the discussions of
controversial issues in social media
• Field of study?
• Have you done DS
before?
• Programming
experience, which
language?
1. Brief intro to data science
2. Skills needed to become a
data scientist
3. Environment Setup
4. 10 min break
5. Data science cycle:
a. Data collection
b. 10 min break
c. Data annotation
6. 30 min is for questions
Data Is The New Oil !
Why? 3 reasons…
• The value of data does not come from
its volume; it comes from its connections and
the insights you can generate from it.
• Data cannot be depleted; in fact, the
amount of data seems to be exploding.
• Data is infinitely durable and reusable.
https://cdn-images-1.medium.com/max/1200/1*KFHLIacf2U44bDcQGbMaBw.jpeg
What Is Data
http://effectualsystems.com/data-need-information/
Data Science
DIFFERENT TYPES OF DATA SOURCES
New type of data
Cyborg =
organic
+
biomechatronic body
parts
What Can We Do With Data?
• Recommender systems
• Image Recognition
• Digital Advertisements
• Speech Recognition
• Gaming
• Price Comparison Websites
• Airline Route Planning
• Fraud and Risk Detection
• Delivery logistics
• Etc…
[Figure: Components of Data Science. Labels: computer science, math and statistics, domain knowledge, data science.]
How to become a data scientist ?
T-shaped Skill Set
https://www.slideshare.net/ryanorban/how-to-become-a-data-scientist
You do not need a PhD to do data science
The best way to learn data science is by doing
data science
Environment Setup
Python + Jupyter notebook
http://bit.ly/2Dj35WO
First…
1. Create a folder on your desktop.
2. Name it DSTutorial.
3. Download the code files and save them in the new
directory.
Second…
1. Open a terminal or CMD.
2. Go to the folder you created: cd ~/Desktop/DSTutorial
(in Windows CMD: cd %USERPROFILE%\Desktop\DSTutorial)
3. Open the notebook: jupyter notebook
You should have your notebook open and READY!
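The setup steps above can also be checked from Python. This is only a sketch: the helper names `ensure_tutorial_dir` and `jupyter_available` are made up for illustration, and the check assumes Jupyter was installed as the `notebook` or `jupyterlab` package.

```python
from pathlib import Path
import importlib.util


def ensure_tutorial_dir(base: Path, name: str = "DSTutorial") -> Path:
    """Create the tutorial folder under `base` if it is missing and return its path."""
    folder = base / name
    folder.mkdir(parents=True, exist_ok=True)
    return folder


def jupyter_available() -> bool:
    """Return True if a Jupyter front end package can be found (assumed package names)."""
    return any(
        importlib.util.find_spec(pkg) is not None
        for pkg in ("notebook", "jupyterlab")
    )
```

For example, `ensure_tutorial_dir(Path.home() / "Desktop")` would create the DSTutorial folder if your desktop lives at that path, which is an assumption that may not hold on every system.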
break
Time: 10 min
Data Science Workflow
• Identify problem
• Data collection: query, data source, store the data
• Data annotation: identify class, annotate, feature extraction
• Data cleaning: preprocess, missing data
• Data exploration: descriptive statistics, plotting, word analysis
• ML classification models: model training, classification models, accuracy assessment
• Result communication: visualization, application/product, report finding
Step 1: Identify the problem /
research questions
What are you interested in understanding that can help with expanding the
knowledge?
What previous work has been done that you can [...]?
Two ways:
1. Start with a question in mind (e.g., unemployment)
2. Start with the data (e.g., Saher, the automated traffic enforcement system)
To make this more realistic, let's take an example…
A social event that stirred much controversy
Women driving in Saudi Arabia
The issue of women driving highlights the historical and ideological
conflict in the Kingdom of Saudi Arabia between conservative
and more liberal voices
https://www.albayan.ae/five-senses/east-and-west/2018-05-29-1.3278235
Step 2: Data Collection
Build the query
For Twitter, you need to identify the
keywords and the time range.
The choice of keywords matters:
bootstrapping etc.
Source of Twitter data
collection
Paid firehose access: Crimson
Hexagon
Free access: Twitter API
Storing the data
Excel files saved as CSV
JSON file
Databases, SQL
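The three storage options above can be sketched with Python's standard library. The records and the field names (`id`, `created_at`, `text`) are hypothetical and are not the workshop's actual schema; SQLite stands in for "databases, SQL" only because it needs no server.

```python
import csv
import json
import sqlite3

# Hypothetical records; field names are illustrative only.
tweets = [
    {"id": 1, "created_at": "2017-09-26", "text": "example tweet one"},
    {"id": 2, "created_at": "2017-09-27", "text": "example tweet two"},
]
FIELDS = ["id", "created_at", "text"]


def save_csv(rows, path):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def save_json(rows, path):
    """Write the rows to a JSON file, keeping non-ASCII text readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(rows, f, ensure_ascii=False, indent=2)


def save_sqlite(rows, path):
    """Insert the rows into a `tweets` table, creating it if needed."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS tweets "
                "(id INTEGER PRIMARY KEY, created_at TEXT, text TEXT)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO tweets VALUES (:id, :created_at, :text)",
                rows,
            )
    finally:
        conn.close()
```

CSV is the easiest to open in Excel, JSON keeps nested tweet metadata intact, and a database helps once the collection no longer fits comfortably in one file.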
Why Twitter?
Vast amount of data with
easy access.
Saudi Arabia is among the
countries with the highest
number of Twitter users
among its online population.
Saudi Arabia produces
40% of all tweets in the Arab
world.
1. Countries with most Twitter users 2018 | Statistic. Retrieved from: https://www.statista.com/statistics/242606/number-of-active-twitter-users-in-selectedcountries.
2. Saudi Arabia: number of internet users 2022 | Statistic. Retrieved from: https://www.statista.com/statistics/462959/internet-users-saudi-arabia
3. Salem, F., Mourtada, R.: Citizen engagement and public services in the Arab world: The potential of social media. The Governance and Innovation Program at the
Mohammed Bin Rashid School of Government, Dubai (2014).
How to use Twitter
API
Build The Query
(("سواقة" OR "قيادة" OR "قياده" OR "سواقه") AND ("مراءة" OR "مراءه" OR "المراءة" OR "المراءه" OR "المرأة" OR "المراه" OR "النساء" OR "حريم" OR "حرمه"))
OR
(("سواقة" OR "قيادة" OR "قياده" OR "سواقه") AND ("مراءة" OR "مراءه" OR "المراءة" OR "المراءه" OR "المرأة" OR "المراه" OR "النساء" OR "حريم" OR "حرمه")
AND
(("الغاء" OR "لن تقودي" OR "رفض" OR "ضد" OR "مع") OR ("سيارة" OR "سياره" OR "رخصة" OR "رخص" OR "مدرسة" OR "مدارس" OR "تعليم" OR "مدرسه")))
Keyword glosses:
• Driving: spelling variants of the standard and colloquial words for "driving"
• Women: spelling variants of "woman" and "the woman", plus "women" and colloquial terms for women
• Stance: "cancellation", "you will not drive", "rejection", "against", "with"
• Related: "car", "license", "licenses", "school", "schools", "teaching"
Time Frame
September 1-30, 2017:
the month in which permission for women
to drive was announced
Total number of tweets
collected
10,247 tweets
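The keyword query and the time frame above can be assembled programmatically. This is a sketch, not the tool that was actually used for collection: the `AND`/`OR` spelling is generic boolean notation (each collection tool has its own operator syntax), the helper names are invented here, and the `created_at` value is assumed to be an ISO 8601 string.

```python
from datetime import date, datetime

# Keyword groups copied from the slide; the group names are assumptions.
DRIVING = ["سواقة", "قيادة", "قياده", "سواقه"]
WOMEN = ["مراءة", "مراءه", "المراءة", "المراءه", "المرأة", "المراه",
         "النساء", "حريم", "حرمه"]
STANCE = ["الغاء", "لن تقودي", "رفض", "ضد", "مع"]
RELATED = ["سيارة", "سياره", "رخصة", "رخص", "مدرسة", "مدارس", "تعليم", "مدرسه"]

# Collection window: September 1-30, 2017 (inclusive).
START, END = date(2017, 9, 1), date(2017, 9, 30)


def any_of(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query():
    """Combine the keyword groups in the same shape as the slide's query."""
    core = f"({any_of(DRIVING)} AND {any_of(WOMEN)})"
    extended = f"({core} AND ({any_of(STANCE)} OR {any_of(RELATED)}))"
    return f"{core} OR {extended}"


def in_time_frame(created_at):
    """Check whether an ISO 8601 timestamp falls inside the collection window."""
    day = datetime.fromisoformat(created_at).date()
    return START <= day <= END
```

Keeping the keyword lists separate from the query string makes it easier to add a spelling variant later without retyping the whole expression.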
Let's open the Excel sheet.
https://bit.ly/2EZBnQd
Step 3: Data Annotation
Identify classes (this
corresponds to your research
question)
Binary (positive, negative | for,
against | gender etc.)
Multi-class (types of evidence,
users etc.)
Annotate
Human (build the codebook, train annotators,
inter-annotator agreement - Cohen's kappa
etc.)
Automatic
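Inter-annotator agreement between two human annotators is commonly summarized with Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The sketch below computes it from two label lists; the function name and the sample labels are illustrative, not taken from the workshop data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: for each class, product of the two annotators' label rates.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical class throughout
    return (observed - expected) / (1 - expected)


annotator_1 = ["for", "against", "neutral", "for"]
annotator_2 = ["for", "against", "neutral", "against"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance; low values usually signal that the codebook needs clearer instructions.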
Feature
extraction
Linguistic (LIWC, MPQA)
Syntactic (POS tags)
Twitter related (# followers,
#retweets)
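As a small illustration of feature extraction, the sketch below turns one tweet record into a dictionary of numeric features. It covers only simple surface counts and the Twitter-related counts; lexicon-based (LIWC, MPQA) and POS features need external resources and are not shown. The record keys `text`, `followers` and `retweets` are hypothetical names chosen for this example.

```python
import re

HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")
URL = re.compile(r"https?://\S+")


def extract_features(tweet):
    """Map a tweet record (a dict with assumed keys) to simple numeric features."""
    text = tweet.get("text", "")
    return {
        "n_tokens": len(text.split()),              # whitespace-separated tokens
        "n_hashtags": len(HASHTAG.findall(text)),
        "n_mentions": len(MENTION.findall(text)),
        "n_urls": len(URL.findall(text)),
        "followers": int(tweet.get("followers", 0)),
        "retweets": int(tweet.get("retweets", 0)),
    }


example = {
    "text": "@user I support this #Women2Drive http://t.co/abc",
    "followers": 120,
    "retweets": 3,
}
features = extract_features(example)
```

Each tweet becomes one row of numbers, which is the form a classification model expects in the later workflow steps.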
Identify Classes
Label: Neutral
Instructions:
• News about the driving issue
• Questions about the driving issue
• Any expression in which the opinion is unclear
• Mockery of the driving issue
• Opposition aimed at something other than driving itself, e.g. the license
• Engaging with other topics unrelated to driving
• Conflicting opinions in the same tweet
Example: Women's driving demonstration in #السعودية (Saudi Arabia) against the state in 1990 #Women2Drive http://t.co/PyzAO0mUpV
Label: For
Instructions: Any clear expression of an opinion in favor of women driving
Example: @Qahtani098 If the decision is put into effect, I will not stop my sisters. [...] evidence that forbids women from driving; it would be more justified to forbid her from riding with a non-mahram driver
Label: Against
Instructions: Any clear expression of rejecting women driving
Example: @Saamaa2 At a time when we were waiting for women's real problems to be raised, I hear that the women of the Shura Council submitted a recommendation about driving cars! And, glory be to God, what a coincidence that it coincides with #قيادة_26اكتوبر (Driving_26October) 😏
Annotation Instruction
1. Open the Google Sheet.
2. Read the tweet.
3. Based on the table, label the tweet as either for,
against or neutral.
4. Add your name to each tweet you label.
5. Add notes if needed.
6. If you do not know how to label a tweet, skip to
the next one.
Each person should annotate 10 tweets
http://bit.ly/2U6xc9D
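Once the sheet is filled in, the labels can be checked and counted before any analysis. The sketch below assumes each row is a dictionary with a `label` key (a hypothetical column name) and treats an empty label as a skipped tweet, matching instruction 6.

```python
from collections import Counter

ALLOWED_LABELS = {"for", "against", "neutral"}


def tally_labels(rows):
    """Count labels per class, counting blanks as skipped and rejecting unknown labels."""
    counts = Counter()
    for row in rows:
        label = (row.get("label") or "").strip().lower()
        if not label:
            counts["skipped"] += 1
        elif label in ALLOWED_LABELS:
            counts[label] += 1
        else:
            raise ValueError(f"unexpected label: {label!r}")
    return counts
```

Rejecting unknown labels early catches typos in the sheet before they turn into a spurious fourth class.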
Let's annotate.
Time: 10 min
Social Media Data Challenges
• Online users’ expressions are written informally, so may include sarcasm,
spelling mistakes, unconventional grammar, slang words and expressions.
• Annotators may disagree on how to label the same tweet.
• Annotators need to share the cultural background of the users who wrote the tweets.
• It might not be representative of the whole population, but it qualifies as a
representative sample.
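Informal spelling is one challenge that can be reduced in code. The collection query lists several spellings of the same Arabic word, and a normalization step can map such variants to one form before counting or matching. The rules below are a common heuristic, not the workshop's own preprocessing, and the function name is illustrative.

```python
import re

# Arabic diacritics (U+064B to U+0652) and the tatweel elongation mark (U+0640).
DIACRITICS = re.compile("[\u064B-\u0652\u0640]")
# Alef with madda, hamza above, or hamza below.
ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")


def normalize_arabic(text):
    """Collapse common Arabic spelling variants into a single form."""
    text = DIACRITICS.sub("", text)
    text = ALEF_VARIANTS.sub("\u0627", text)   # alef variants -> bare alef
    text = text.replace("\u0629", "\u0647")    # ta marbuta -> ha
    text = text.replace("\u0649", "\u064A")    # alef maqsura -> ya
    return text
```

With these rules, "قيادة" and "قياده" normalize to the same string, so one keyword can match both spellings. The trade-off is that a few distinct words may also collapse together, so the rules are worth checking against a sample of the data.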
Time: 10 min
break
Upload the data file to your folder…
Questions

Data Science Workshop - day 1

Editor's Notes

  1. Who among you has heard this sentence before? Show of hands. It should be all of you, since it has been repeated so often :)
  2. But this sentence is not entirely correct and has been blown out of proportion. I suspect its purpose was marketing more than describing what data really is.
  3. The value of data does not come from its quantity being large. What does "big data" even mean? It has little meaning: is 1 GB big? What about 1.1 GB? The value of data does not come from its size but from our ability to extract the patterns in it. True, with oil, the more you have, the more money you make, but the value of data grows with the value you can extract from it and with the connections you can make to the general context around it. For example, the data from Twitter is large and plentiful, but what use is it unless we tie it to solving a problem or understanding something specific? What matters is not collecting a lot of data but collecting the right data. Oil is a natural resource that can be depleted and is hard to obtain, whereas data cannot be depleted; in fact, the amount of data seems to be exploding. Oil is a finite resource and is not reusable, whereas data is highly durable and can be used.
  4. Data is like jigsaw puzzle pieces: when you look at them you do not understand them, and you do not know what might emerge if you arranged and understood them. We can arrange this data in several ways so that it turns into understandable information that I can use to solve or understand the problem I am facing. For example, imagine the data we could get from Danube: it is literally incredible, but, like this picture, on its own it is jumbled and meaningless, and its volume without scrutiny is of no use. If the Danube company were to [...] This process of turning data into information I can use is data science.
  5. This process of turning data into information I can use is data science.
  6. What could these puzzle pieces be?
  7. So what is data science made of? What are the most important parts that give me the right formula for data science to solve any problem?
  8. 7:30
  9. 8
  10. Any data science project goes through these steps in general: first you need to identify the problem.
  11. Download the file from Excel. Change the file to CSV. Upload the file to your folder.