1. University of Gujrat, Sialkot Sub Campus
Department of Computer Science
Title Data Science
Code CS-442
Credit Hours 3.0
Prerequisite Data Structure + Software Engineering
Instructor Dr. Javaid Anjum Sheikh + Sami Ur Rehman (TA)
Aims and Objectives
This course introduces data science, its applications, and why it is
needed in today’s world. It also introduces students to tools widely
used by data scientists around the world, including Python, NumPy,
pandas, Jupyter Notebooks, and scikit-learn, among others. Beyond
learning data analysis tools, students gain a better understanding of
some aspects of the world. Data science can help us to:
- Improve the actions of emergency responders
- Understand the impact of human activities on the environment
- Personalize services for our customers
Course Goals
By the end of the course, you should be able to find useful datasets,
form research questions about the data, perform basic data analysis to
help answer those questions, and present your findings.
Text Books
1. Data Science from Scratch by Joel Grus
2. Python for Data Analysis by Wes McKinney
Reference Books 1. Internet
Assessment Criteria Sessional 25%, Mid 25%, Final 50%
2. Sixteen-week lecture plan
Week Lecture Topic
1 1,2 Welcome and overview of the course. Introduction to data science and the
value of learning data science, big data, modern data science skills, why Python
2 3,4 The data science process, basic steps in a data science project, data
collection, data preparation, data cleaning, data visualization, data modeling,
presenting data science outcomes
3 5,6 Python: basics, variables, data types, objects, loops, conditions
4 7,8 Python: functions, string functions, lists, tuples, dictionaries, sets
5 9,10 Jupyter: a widely used tool in data science, key features, getting started,
documenting with Markdown text
6 11,12 NumPy for data analysis, time- and space-efficient functions, ndarray basics
7 13,14 Pandas: critical data analysis functionality, benefits, pandas data structures,
pandas Series, pandas DataFrame, descriptive statistics functions: describe(),
corr(), min(), max(), mode(), median(), data cleaning with pandas
8 15,16 Mid Term
9 17,18 Data visualization, plotting with pandas, bar chart, box plot, histogram
10 19,20 Data analysis, frequent data operations, merging data frames, frequent string
operations: split(), contains(), extract()
11 21,22 Data analysis in detail using Jupyter Notebooks, NumPy, pandas, etc., data
visualization
12 23,24 Matplotlib Library, case studies
13 25,26 Machine learning, applications, categories of machine learning, numeric and
categorical variables, decision tree, constructing a decision tree, scikit-learn: a
powerful library for machine learning
14 27,28 Clustering, cluster analysis, K-means clustering, evaluation of cluster results,
regression analysis
15 29,30 Relational data model, Twitter – working with text
16 31,32 Evaluation of Projects & Presentations
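
As a rough illustration of the pandas functions named in weeks 7 and 10 of the plan above, the following sketch applies them to a small made-up table. The data, column names, and variable names are hypothetical and are not part of the course material. It assumes pandas is installed.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
df = pd.DataFrame({
    "name": ["Ali Khan", "Sara Malik", "Omar Sheikh", "Hina Raza"],
    "hours_studied": [2, 4, 6, 8],
    "score": [50, 60, 70, 80],
})

# Week 7: descriptive statistics functions.
numeric = df[["hours_studied", "score"]]
summary = numeric.describe()       # count, mean, std, min, quartiles, max
correlation = numeric.corr()       # pairwise Pearson correlation matrix
lowest = df["score"].min()
highest = df["score"].max()
middle = df["score"].median()
most_common = df["score"].mode()   # a Series, since ties are possible

# Week 10: frequent string operations via the .str accessor.
first_names = df["name"].str.split(" ").str[0]       # text before the space
has_sheikh = df["name"].str.contains("Sheikh")       # boolean mask
surnames = df["name"].str.extract(r"\s(\w+)$")[0]    # regex capture group

print(summary)
print(correlation)
print(first_names.tolist(), surnames.tolist())
```

The string methods return a new Series aligned with the original rows, so a result such as has_sheikh can be used directly to filter the data frame, for example df[has_sheikh].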
Quizzes 5%
Assignments/Project 20%