Data Science and Machine Learning
Course Overview
Dr. Pratishtha Verma
Assistant Professor, NIT Kurukshetra
Course Learning Objectives:
1. The major goal of the course is to enable computers to learn (potentially complex) patterns from data and then make decisions based on these patterns.
2. To provide a strong foundation in data science and its related application areas.
3. To present the underlying core concepts and emerging technologies in data science.
4. A data scientist requires an integrated skill set spanning mathematics, probability and statistics, optimization, and branches of computer science such as databases and machine learning.
Course Overview
Module List:
1. Introduction to Data Science: What is Data Science? Linear algebra for data science: algebraic and geometric view. Data Representation & Statistical Inference: data objects and attribute types, types of data, descriptive statistics, notion of probability, distributions, mean, variance, covariance, understanding univariate and multivariate normal distributions.
2. Data Analysis: Probability and random variables, correlation, regression, attribute transformation, sampling, feature subset selection, similarity measures. High-dimensional data: curse of dimensionality, dimensionality reduction (PCA, SVD, etc.).
3. Data Visualization, Bayesian Learning & Evaluating Hypotheses: Basic principles; scalar, vector, and tensor visualization; multivariate data visualization; text data visualization; network data visualization; visualization techniques; Bayesian approach; Bayes' theorem; evaluating hypotheses (Z-test, t-test, chi-square test).
4. Machine Learning (Supervised & Unsupervised Learning): Basic concepts of classification, k-nearest neighbor, decision tree classification, naïve Bayes classifier, linear regression models, logistic regression, basic concepts of clustering, k-means, hierarchical clustering, DBSCAN.
What is Data Science?
Data science is the in-depth study of massive amounts of data. It involves extracting meaningful insights from raw, structured, and unstructured data, which is processed using the scientific method, a range of technologies, and algorithms.
It is a multidisciplinary field that uses tools and techniques to manipulate data so that you can find something new and meaningful.
Data science uses powerful hardware, programming systems, and efficient algorithms to solve data-related problems. It is the future of artificial intelligence.
In short, we can say that data science is all about:
Example:
Suppose we want to travel from station A to station B by car. We need to make some decisions: which route reaches the destination fastest, which route has no traffic jam, and which route is the most cost-effective. All these decision factors act as input data, and analyzing them gives us an appropriate answer. This analysis is called data analysis, which is a part of data science.
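The route decision above can be sketched in a few lines of Python. This is only an illustration: the route names, travel times, costs, and the scoring rule (minutes plus weighted cost, restricted to jam-free routes) are assumptions made for the sketch, not part of the course material.

```python
# Hypothetical route data: names and numbers are illustrative only.
routes = [
    {"name": "highway", "minutes": 40, "traffic_jam": False, "cost": 12.0},
    {"name": "city", "minutes": 55, "traffic_jam": True, "cost": 6.0},
    {"name": "ring_road", "minutes": 45, "traffic_jam": False, "cost": 8.0},
]


def best_route(routes, cost_weight=1.0):
    """Pick the jam-free route with the lowest score (minutes + weighted cost)."""
    candidates = [r for r in routes if not r["traffic_jam"]]
    return min(candidates, key=lambda r: r["minutes"] + cost_weight * r["cost"])


choice = best_route(routes)
print(choice["name"])
```

Changing `cost_weight` shifts the balance between speed and cost, which is the kind of trade-off the decision factors in the example describe.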
Linear algebra for data science: algebraic and geometric view
Vectors and their operations:
1) What are vectors?
2) Operations on vectors: vector addition and scalar multiplication from the geometric, algebraic, and data science viewpoints.
3) Length of a vector.
4) Dot product.
5) Zero vector, unit vector, orthogonal and orthonormal vectors.
6) Projection: scalar and vector.
What are vectors?
Quantities that have both direction and magnitude are called vectors. Ex: force, velocity, displacement.
Quantities that have only magnitude are called scalars. Ex: mass, distance.
Representation:
A vector can be represented by an arrow starting at a reference point called the origin.
Two vectors are equal if they have the same direction and the same length.
[Figure: vectors A, B, C, and D drawn as arrows]
Vectors from data science point of view:
A vector is a list of attribute values that describes an object, that is, one specific instance.
Ex:
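As a minimal sketch of this view, the snippet below stores one instance as a list of attribute values. The house example and its attribute names are hypothetical and chosen only for illustration.

```python
# Hypothetical instance: a house described by three numeric attributes.
attribute_names = ["area_sq_m", "bedrooms", "age_years"]
house = [120.0, 3.0, 15.0]  # one value per attribute, in the same order

# Pair each attribute name with its value for this instance.
described = dict(zip(attribute_names, house))
print(described)
```

The list `house` is the vector; the names only document which position holds which attribute.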
Vector operations (Geometrically):
1. Vector addition (geometric view): vectors are added head to tail. Place the start of w at the end of v; the sum v + w runs from the start of v to the end of w.
Properties of vector addition: 1) Commutative
2) Associative
2. Scalar multiplication (geometric view): scalar multiplication scales the length of the vector. A positive scalar keeps its direction, and a negative scalar reverses it.
Properties of scalar multiplication: 1) Distributive over addition
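The properties listed above can be checked numerically. The sketch below is an assumption-laden illustration: the helper names `add` and `scale` and the sample vectors are made up for the example, and a check on a few vectors demonstrates the properties rather than proving them.

```python
def add(v, w):
    """Component-wise sum of two vectors given as tuples."""
    return tuple(a + b for a, b in zip(v, w))


def scale(c, v):
    """Multiply every component of v by the scalar c."""
    return tuple(c * a for a in v)


# Illustrative integer vectors, so the comparisons below are exact.
u, v, w = (1, 2), (4, 2), (-3, 2)

commutative = add(v, w) == add(w, v)
associative = add(add(u, v), w) == add(u, add(v, w))
distributive = scale(3, add(v, w)) == add(scale(3, v), scale(3, w))
print(commutative, associative, distributive)
```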
Vector operations (Algebraically)
Let us define a coordinate system so that vector operations can be defined algebraically:
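Basis vectors: $i = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $j = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$
Example vectors: $A = \begin{pmatrix} 4 \\ 2 \end{pmatrix} = 4i + 2j$, $B = \begin{pmatrix} -3 \\ 2 \end{pmatrix}$
1) Addition: $A + B = \begin{pmatrix} 4 \\ 2 \end{pmatrix} + \begin{pmatrix} -3 \\ 2 \end{pmatrix} = \begin{pmatrix} 1 \\ 4 \end{pmatrix}$
2) Multiplication (here with $A = \begin{pmatrix} 3 \\ 2 \end{pmatrix}$): $3A = 3 \begin{pmatrix} 3 \\ 2 \end{pmatrix} = \begin{pmatrix} 9 \\ 6 \end{pmatrix}$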
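The component-wise rules can be sketched with a small two-dimensional vector type. The class name `Vec2` is hypothetical; the numbers are the ones used in the example above, namely (4, 2), (-3, 2), and 3 times (3, 2).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec2:
    x: float
    y: float

    def __add__(self, other):
        # Addition: add matching components.
        return Vec2(self.x + other.x, self.y + other.y)

    def __rmul__(self, c):
        # Scalar multiplication, written as c * v.
        return Vec2(c * self.x, c * self.y)


i, j = Vec2(1, 0), Vec2(0, 1)   # basis vectors
A, B = Vec2(4, 2), Vec2(-3, 2)

print(4 * i + 2 * j == A)       # A written in terms of i and j
print(A + B)
print(3 * Vec2(3, 2))
```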
Data Science View
Length of a vector:
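For reference, the length (Euclidean norm) of a vector $v = (v_1, \dots, v_n)$ is $\|v\| = \sqrt{v_1^2 + \dots + v_n^2}$. The sketch below computes it in plain Python; the function names `length` and `unit` are illustrative choices, not names from the course material.

```python
import math


def length(v):
    """Euclidean length: square root of the sum of squared components."""
    return math.sqrt(sum(a * a for a in v))


def unit(v):
    """Vector of length 1 pointing in the same direction as v."""
    n = length(v)
    if n == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(a / n for a in v)


print(length((3, 4)))
print(unit((3, 4)))
```

Dividing a non-zero vector by its length gives a unit vector, which is why the zero vector is rejected explicitly.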
● Data Objects and Attribute Types
● Basic Statistical Descriptions of Data
● Measuring Data Similarity and Dissimilarity
Data-Related Issues
Type of Data:
– Data sets differ in a number of ways.
– The type of data determines which techniques can be used to analyze the data.
Quality of Data:
– Data is often far from perfect.
– Improving data quality improves the quality of the resulting analysis.
Preprocessing Steps to Make Data More Suitable:
– Raw data must be processed in order to make it suitable for analysis.
• Improve data quality.
• Modify data so that it better fits a specified data mining technique.
Analyzing Data in Terms of its Relationships:
– Find relationships among data objects and then perform the remaining analysis using these relationships rather than the data objects themselves.
– There are many similarity or distance measures, and the proper choice depends on the type of data and the application.
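Two commonly used measures are Euclidean distance and cosine similarity. The sketch below shows both for numeric vectors; the sample points `p` and `q` are made up, and picking these two measures is an illustrative choice rather than a recommendation from the slides.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def cosine_similarity(p, q):
    """Cosine of the angle between p and q (both must be non-zero)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


# Illustrative data: q points in the same direction as p but is twice as long.
p, q = (1.0, 2.0), (2.0, 4.0)
print(euclidean_distance(p, q))
print(cosine_similarity(p, q))
```

Here the two objects are far apart by distance yet have a cosine similarity close to 1, which shows why the choice of measure depends on the data and the application.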
What is Data?
• Data sets are made up of data objects.
Example: A data object represents an entity - in a sales database, the objects may be customers, store items,
and sales; in a medical database, the objects may be patients; in a university database, the objects may be
students, professors, and courses.
– Also called sample, example, instance, data point, object, tuple.
• Data objects are described by attributes.
• An attribute is a property or characteristic of a data object.
– Examples: eye color of a person, temperature, etc.
– An attribute is also known as a variable, field, characteristic, or feature.
• A collection of attributes describes an object.
• Attribute values are numbers or symbols assigned to an attribute.
A Data Object
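A data object and its attributes can be sketched as a small record type. The `Student` class, its fields, and the sample values are hypothetical, loosely following the university database example above.

```python
from dataclasses import dataclass, asdict


@dataclass
class Student:
    """Hypothetical data object; each field is one attribute."""
    student_id: int
    name: str
    eye_color: str
    gpa: float


# One data object (also called a sample, instance, or tuple).
s = Student(student_id=1, name="Asha", eye_color="brown", gpa=8.4)

# Attribute values: the numbers or symbols assigned to each attribute.
attributes = asdict(s)
print(attributes)
```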