SlideShare a Scribd company logo
1 of 17
Download to read offline
Proposal of Classes
Andres Mendez-Vazquez
Associate Research Professor
Cinvestav GDL
Datalab Adviser
Contents
1 Introduction 4
I Python Module 5
2 Basic Python 5
2.1 History of Python . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 The Basic Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2.1 Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2.2 Python Basic Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2.3 Control Flow Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2.4 Defining Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2.5 Functional Programming - Lambda Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3 Data Structures in Python 7
3.1 Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2 Tuples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.3 Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.4 Dictionaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.5 Hash Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.6 Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.7 Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.8 Divide and Conquer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
II Basic Mathematical Modules 9
4 Linear Algebra [1] 9
4.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
4.2 System of Linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
4.3 Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
4.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1
DataLab Classes
5 Probability and Statistical Theory [2] 9
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
5.2 Bayes Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
5.3 Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
5.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
6 Optimization [3] 10
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
6.2 Basic Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
III Feature Engineering 11
IV Supervised Learning 12
7 Basic Supervised Learning 12
7.1 Introduction [4, 5, 6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
7.2 Linear Classifiers [4, 5, 6, 7, 8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
7.3 Probability Classifiers [4, 5, 6, 7, 8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
7.4 Advanced Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
7.5 Advanced Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
8 Intermediate Supervised Learning 13
8.1 Probability Models for Supervised Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
8.2 Important Issues [4, 5, 6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
8.3 Kernel Based Classifiers [9] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
9 Advanced Supervised Learning 13
9.1 Graphical Model Based Classifiers [4, 6, 9] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
9.2 Data Preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
9.3 Combining Classifiers [4, 10] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
V Unsupervised Learning 15
10 Basic Unsupervised Learning [11, 7] 15
11 Intermediate Unsupervised Learning [11] 15
12 Advanced Unsupervised Learning [11, 6] 15
VI Data Mining 16
13 Basic Data Mining 16
13.1 Mining the Web for Structured Data [12, 13, 14, 15] . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
13.2 Frequent Itemsets and Association rules [12, 15] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
A New Era 2
DataLab Classes
14 Advanced Data Mining 16
14.1 Near Neighbor Search in High Dimensional Data [14, 16] . . . . . . . . . . . . . . . . . . . . . . . . . 16
14.2 Structure of the Webgraph [13] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
References 17
A New Era 3
DataLab Classes
1 Introduction
Here, we propose the following schedule of classes.
Feature Engineering
Feature Engineering
Basic Mathematical Modules
Linear Algebra/Regression Basic Probability Basic Optimization Methods
Supervised Learning
Basic Supervised Learning
Intermediate Supervised Learning
Advanced Supervised Learning
Advanced Optimization Advanced Probability
Unsupervised Learning
Basic Unsupervised Learning
Intermediate Unsupervised Learning
Advanced Unsupervised Learning
Python
Data Structures
Basic Python
Data Mining
Advanced Data Mining
Basic Data Mining
A New Era 4
DataLab Classes
Part I
Python Module
2 Basic Python
2.1 History of Python
1. Python was conceived in the late 1980s.
2. By Guido Van Rossum
(a) Benevolent dictator for life!!!
3. Python reached version 1.0 in January 1994.
4. The major new features included in this release were the functional programming tools:
(a) lambda - the basic lambda calculus implementation.
(b) map - it applies a given function to each element of a list.
(c) filter - it filters out items based on a test function.
(d) reduce - it is a really useful function for performing some computation on a list and returning the result.
5. Standard Implementations
(a) CPython
(b) Jython (1997)
(c) IronPython (2006) - .NET
(d) PyPy (2007) written in RPython
6. Exotic Implementation
(a) Stackless Python (2000) does not depend on the C call stack.
2.2 The Basic Structures
2.2.1 Variables
There is a list of things that need to be covered:
1. Assigning Values to Variables.
2. Multiple Assignment
3. Standard Data Types
(a) Numbers
(b) String
(c) List
(d) Tuple
(e) Dictionary
A New Era 5
DataLab Classes
2.2.2 Python Basic Operators
Here, we need to review:
1. Arithmetic Operators
2. Comparison (Relational) Operators
3. Assignment Operators
4. Logical Operators
5. Bitwise Operators
6. Membership Operators
7. Identity Operators
2.2.3 Control Flow Structures
1. The If Statement
2. Loop Instructions
(a) For Instruction
(b) While Instruction
(c) Techniques
3. Else in loops
4. The range instruction
5. break and continue Statements
6. pass Statements
2.2.4 Defining Functions
Here, we classically define the syntax for functions using:
1. Fibonacci function
Then, we look at
1. Default Argument Values
2. Keyword Arguments
3. Arbitrary Argument Lists
In addition, we look at small expressions using lambda expressions.
A New Era 6
DataLab Classes
2.2.5 Functional Programming - Lambda Calculus
Review the difference between:
1. Procedural
2. Declarative
3. Object-Oriented
4. Functional
3 Data Structures in Python
3.1 Lists
Review the most basic functions for a list:
1. append(x)
2. extend(x)
3. remove(x)
4. pop()
5. index(x)
6. count(x)
7. sort()
8. reverse()
Then given all this operations it is possible to use a list as
1. Stacks
2. Queues
Here, you need to be careful because the emulation vs native implementation.
3.2 Tuples
Here, we will review the Tuple.
3.3 Sets
They are an unordered collection with no duplicate elements
3.4 Dictionaries
Another useful data type built into Python is the dictionary. It can be seen as an unordered set of key:value pairs.
Here, we will see how to substitute the case statement by a dictionary
A New Era 7
DataLab Classes
3.5 Hash Tables
Here, we will review some of the basics in
1. Hash Functions
2. Hash Tables
3.6 Trees
As you can imagine trees are an evolution from the chain lists. Thus, we will review how to :
1. Build a tree.
2. Traverse a tree
3. Expression trees
3.7 Graphs
Look at the basics on how building graphs by using lists and dictionaries. Then, look at
1. Breath First Search
2. Depth First Search
3.8 Divide and Conquer
Here, review the basics of divide and conquer using
1. Merge Sort
2. Quick Sort
A New Era 8
DataLab Classes
Part II
Basic Mathematical Modules
4 Linear Algebra [1]
4.1 Vector Spaces
1. Using Vector to Represent Samples
2. Basis of Vector Spaces
3. Sub-spaces
4. Linear independence
4.2 System of Linear Equations
1. Elementary row operations and elementary matrices
2. Example the inverse matrix
3. Squared Matrices and Inverse
4. Determinants
5. Derivatives of Matrices
4.3 Basis
1. Orthonormal Basis
2. Eigenvalues and Eigenvectors
4.4 Applications
1. Linear Regression
2. Graph Theory
3. Sparse Matrices
5 Probability and Statistical Theory [2]
5.1 Introduction
1. Probability Definition
2. Basic Set Operations
3. Counting
A New Era 9
DataLab Classes
5.2 Bayes Theorem
1. Conditional probability
2. Independence of events
5.3 Distributions
1. Random Variables
2. Distributions
(a) Continuous
(b) Discrete
(c) Examples in Machine Learning
3. Mean and variance
4. Joint distributions
5. Covariance and Correlation
5.4 Applications
1. Bayesian inference
2. Principal Component Analysis
3. Analysis of Samples
6 Optimization [3]
6.1 Introduction
1. Formulation
2. Example: Least Squared Error
3. Continuous versus Discrete Optimization
4. Constrained and Unconstrained Optimization
6.2 Basic Optimization
1. How to recognize a minimum?
2. Search using Derivatives
(a) Gradient Descent
3. Linear Search
A New Era 10
DataLab Classes
Part III
Feature Engineering
1. Introduction
2. Types of Data
(a) Categorical
(b) Numerical
3. Extracting meaning from the data
4. Encoding techniques of categorical variables:
(a) Feature hashing and bin-counting
5. Feature engineering for numeric data:
(a) Filtering, binning, scaling, log transforms, and power transforms
(b) Statistical analysis of the data
6. Natural text techniques:
(a) bag-of-words, n-grams.
7. The Cycle of feature engineering
(a) Forward, backward selection, etc
8. Non-Linear Featurization and Model Stacking
A New Era 11
DataLab Classes
Part IV
Supervised Learning
7 Basic Supervised Learning
7.1 Introduction [4, 5, 6]
1. What is Learning?
2. The Learning Process
7.2 Linear Classifiers [4, 5, 6, 7, 8]
1. What is a Classifier?
2. Introduction to Linear Classifiers
3. Mean-Square Error Linear Estimation
4. Regularization
7.3 Probability Classifiers [4, 5, 6, 7, 8]
1. Discriminant Functions
2. Naive Bayes
7.4 Advanced Optimization
1. Differentiable Convex Functions
2. Newton’s Method
3. Quasi-Newton’s Methods
4. Constrained Optimization
(a) Lagrange Multipliers
(b) The Karush-Kuhn-Tucher Condiditons
(c) Duality
7.5 Advanced Probability
1. Central limit theorem
2. Bayes Theorem
(a) Likelihood Distribution
A New Era 12
DataLab Classes
3. Bayesian inference
(a) Maximum A Posteriori
4. Generative Models
5. Point Estimation
6. Dirichlet Processes
8 Intermediate Supervised Learning
8.1 Probability Models for Supervised Classification
1. Logistic Regression
2. Expectation Maximization
3. Maximum A Posteriori in the LASSO
8.2 Important Issues [4, 5, 6]
1. Statistical Analysis of Classifiers
2. Bias-Varance Dilemma
3. The Confusion Matrix
4. K-Cross Validation
8.3 Kernel Based Classifiers [9]
1. Introduction
2. Support Vector Machines
9 Advanced Supervised Learning
9.1 Graphical Model Based Classifiers [4, 6, 9]
1. Decision Trees.
2. Neural Networks
• Perceptron
• Multilayer Perceptron
3. Deep Neural Networks
(a) CNN
A New Era 13
DataLab Classes
9.2 Data Preparation
1. Feature selection [4, 5, 6]
(a) Preprocessing
(b) Statistical Methods
(c) Class Separability
(d) Feature Subset Selection.
2. Feature Generation [4, 5, 6, 17]
(a) Introduction
(b) Fisher Method
(c) Principal Component Analysis
(d) Dimensionality Reduction
9.3 Combining Classifiers [4, 10]
1. Average Rules
2. Majority Voting Rule
3. A Bayesian viewpoint
4. Boosting
A New Era 14
DataLab Classes
Part V
Unsupervised Learning
10 Basic Unsupervised Learning [11, 7]
1. Introduction
2. Proximity Measures.
3. Admissible Cost Functions.
4. Basic Clustering Algorithms:
(a) K-Means
(b) K-Centers
(c) K-Medians
11 Intermediate Unsupervised Learning [11]
1. Hierarchical Clustering
(a) Graph Model For Hierarchical Clustering
2. Partitional Clustering
(a) Grouping Based in Disimilarities
(b) Grouping Based on Cluster Volume
3. Hierarchical Clustering vs. Partitional Clustering
12 Advanced Unsupervised Learning [11, 6]
1. Large Data Set Clustering:
(a) Density Based: DBSCAN,
(b) Grid Based: STING, CLIQUE
2. Strategies for Large Data Sets
(a) One-pass strategies
(b) Summarization strategies
(c) Sampling/batch strategies
(d) Approximation strategies
(e) Divide and conquer strategies
A New Era 15
DataLab Classes
Part VI
Data Mining
13 Basic Data Mining
13.1 Mining the Web for Structured Data [12, 13, 14, 15]
1. Introduction
2. Why do we want to deal with data from the web?
3. Extracting information from the web
13.2 Frequent Itemsets and Association rules [12, 15]
1. The Market Problem
2. The basic algorithm
3. Improvements
14 Advanced Data Mining
14.1 Near Neighbor Search in High Dimensional Data [14, 16]
1. Locality Sensitive Hashing
2. Near Neighbor Search
14.2 Structure of the Webgraph [13]
1. Page Rank Algorithm
2. Topic Specific Page Rank
3. Trust Rank
A New Era 16
DataLab Classes
References
[1] G. Strang, Introduction to Linear Algebra. Wellesley, MA: Wellesley-Cambridge Press, fourth ed., 2009.
[2] R. Ash, Basic Probability Theory. Dover Books on Mathematics Series, Dover Publications, Incorporated,
2012.
[3] J. Nocedal and S. J. Wright, Numerical Optimization, second edition. World Scientific, 2006.
[4] C. M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics). Secaucus, NJ,
USA: Springer-Verlag New York, Inc., 2006.
[5] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification (2Nd Edition). Wiley-Interscience, 2000.
[6] S. Theodoridis and K. Koutroumbas, Pattern Recognition, Fourth Edition. Academic Press, 4th ed., 2008.
[7] S. Theodoridis, Machine Learning: A Bayesian and Optimization Perspective. Academic Press, 1st ed., 2015.
[8] G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning: With Applications
in R. Springer Publishing Company, Incorporated, 2014.
[9] S. Haykin, Neural Networks and Learning Machines. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 2008.
[10] R. Rojas, “Adaboost and the super bowl of classifiers a tutorial introduction to adaptive boosting,” 2009.
[11] S. Wierzchoń and M. Kłopotek, Modern Algorithms of Cluster Analysis, vol. 34. Springer, 2017.
[12] J. Han, Data Mining: Concepts and Techniques. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.,
2011.
[13] A. N. Langville and C. D. Meyer, Google’s PageRank and Beyond: The Science of Search Engine Rankings.
Princeton, NJ, USA: Princeton University Press, 2006.
[14] A. Rajaraman and J. D. Ullman, Mining of Massive Datasets. New York, NY, USA: Cambridge University
Press, 2011.
[15] I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, Second Edition
(Morgan Kaufmann Series in Data Management Systems). San Francisco, CA, USA: Morgan Kaufmann
Publishers Inc., 2005.
[16] H. Samet, Foundations of Multidimensional and Metric Data Structures (The Morgan Kaufmann Series in
Computer Graphics and Geometric Modeling). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.,
2005.
[17] R. E. Schapire and Y. Freund, Boosting: Foundations and algorithms. MIT press, 2012.
A New Era 17

More Related Content

What's hot (8)

Jdbc
JdbcJdbc
Jdbc
 
Perl 5 guide
Perl 5 guidePerl 5 guide
Perl 5 guide
 
Perl <b>5 Tutorial</b>, First Edition
Perl <b>5 Tutorial</b>, First EditionPerl <b>5 Tutorial</b>, First Edition
Perl <b>5 Tutorial</b>, First Edition
 
report
reportreport
report
 
R-intro
R-introR-intro
R-intro
 
R intro
R introR intro
R intro
 
R Intro
R IntroR Intro
R Intro
 
foobar
foobarfoobar
foobar
 

Similar to Python and Math Modules for Data Science Classes

Similar to Python and Math Modules for Data Science Classes (20)

Nguyễn Nho Vĩnh - Problem solvingwithalgorithmsanddatastructures
Nguyễn Nho Vĩnh - Problem solvingwithalgorithmsanddatastructuresNguyễn Nho Vĩnh - Problem solvingwithalgorithmsanddatastructures
Nguyễn Nho Vĩnh - Problem solvingwithalgorithmsanddatastructures
 
Arules_TM_Rpart_Markdown
Arules_TM_Rpart_MarkdownArules_TM_Rpart_Markdown
Arules_TM_Rpart_Markdown
 
Perltut
PerltutPerltut
Perltut
 
python learn basic tutorial learn easy..
python learn basic tutorial learn easy..python learn basic tutorial learn easy..
python learn basic tutorial learn easy..
 
C++ For Quantitative Finance
C++ For Quantitative FinanceC++ For Quantitative Finance
C++ For Quantitative Finance
 
Am06 complete 16-sep06
Am06 complete 16-sep06Am06 complete 16-sep06
Am06 complete 16-sep06
 
Practical Programming.pdf
Practical Programming.pdfPractical Programming.pdf
Practical Programming.pdf
 
Practical Programming.pdf
Practical Programming.pdfPractical Programming.pdf
Practical Programming.pdf
 
<img src="../i/r_14.png" />
<img src="../i/r_14.png" /><img src="../i/r_14.png" />
<img src="../i/r_14.png" />
 
perltut
perltutperltut
perltut
 
perltut
perltutperltut
perltut
 
Perl tut
Perl tutPerl tut
Perl tut
 
Sage tutorial
Sage tutorialSage tutorial
Sage tutorial
 
Dsa
DsaDsa
Dsa
 
numpy-ref-1.11.0.pdf
numpy-ref-1.11.0.pdfnumpy-ref-1.11.0.pdf
numpy-ref-1.11.0.pdf
 
Odoo new-api-guide-line
Odoo new-api-guide-lineOdoo new-api-guide-line
Odoo new-api-guide-line
 
MSc_Dissertation
MSc_DissertationMSc_Dissertation
MSc_Dissertation
 
Java
JavaJava
Java
 
Coding interview preparation
Coding interview preparationCoding interview preparation
Coding interview preparation
 
project Report on LAN Security Manager
project Report on LAN Security Managerproject Report on LAN Security Manager
project Report on LAN Security Manager
 

More from Andres Mendez-Vazquez

01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectors01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectorsAndres Mendez-Vazquez
 
01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issues01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issuesAndres Mendez-Vazquez
 
05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiation05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiationAndres Mendez-Vazquez
 
01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep Learning01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep LearningAndres Mendez-Vazquez
 
25 introduction reinforcement_learning
25 introduction reinforcement_learning25 introduction reinforcement_learning
25 introduction reinforcement_learningAndres Mendez-Vazquez
 
Neural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning SyllabusNeural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning SyllabusAndres Mendez-Vazquez
 
Introduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabusIntroduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabusAndres Mendez-Vazquez
 
Ideas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data SciencesIdeas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data SciencesAndres Mendez-Vazquez
 
20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variations20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variationsAndres Mendez-Vazquez
 

More from Andres Mendez-Vazquez (20)

2.03 bayesian estimation
2.03 bayesian estimation2.03 bayesian estimation
2.03 bayesian estimation
 
05 linear transformations
05 linear transformations05 linear transformations
05 linear transformations
 
01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectors01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectors
 
01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issues01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issues
 
01.02 linear equations
01.02 linear equations01.02 linear equations
01.02 linear equations
 
01.01 vector spaces
01.01 vector spaces01.01 vector spaces
01.01 vector spaces
 
06 recurrent neural_networks
06 recurrent neural_networks06 recurrent neural_networks
06 recurrent neural_networks
 
05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiation05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiation
 
Zetta global
Zetta globalZetta global
Zetta global
 
01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep Learning01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep Learning
 
25 introduction reinforcement_learning
25 introduction reinforcement_learning25 introduction reinforcement_learning
25 introduction reinforcement_learning
 
Neural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning SyllabusNeural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning Syllabus
 
Introduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabusIntroduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabus
 
Ideas 09 22_2018
Ideas 09 22_2018Ideas 09 22_2018
Ideas 09 22_2018
 
Ideas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data SciencesIdeas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data Sciences
 
Analysis of Algorithms Syllabus
Analysis of Algorithms  SyllabusAnalysis of Algorithms  Syllabus
Analysis of Algorithms Syllabus
 
20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variations20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variations
 
18.1 combining models
18.1 combining models18.1 combining models
18.1 combining models
 
17 vapnik chervonenkis dimension
17 vapnik chervonenkis dimension17 vapnik chervonenkis dimension
17 vapnik chervonenkis dimension
 
A basic introduction to learning
A basic introduction to learningA basic introduction to learning
A basic introduction to learning
 

Recently uploaded

High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZTE
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learningmisbanausheenparvam
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxPoojaBan
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 

Recently uploaded (20)

High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 

Python and Math Modules for Data Science Classes

  • 1. Proposal of Classes Andres Mendez-Vazquez Associate Research Professor Cinvestav GDL Datalab Adviser Contents 1 Introduction 4 I Python Module 5 2 Basic Python 5 2.1 History of Python . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2.2 The Basic Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2.2.1 Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2.2.2 Python Basic Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2.2.3 Control Flow Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2.2.4 Defining Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2.2.5 Functional Programming - Lambda Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 3 Data Structures in Python 7 3.1 Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 3.2 Tuples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 3.3 Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 3.4 Dictionaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 3.5 Hash Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 3.6 Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 3.7 Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 3.8 Divide and Conquer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 II Basic Mathematical Modules 9 4 Linear Algebra [1] 9 4.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 4.2 System of Linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 4.3 Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 4.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 1
  • 2. DataLab Classes 5 Probability and Statistical Theory [2] 9 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 5.2 Bayes Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 5.3 Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 5.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 6 Optimization [3] 10 6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 6.2 Basic Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 III Feature Engineering 11 IV Supervised Learning 12 7 Basic Supervised Learning 12 7.1 Introduction [4, 5, 6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 7.2 Linear Classifiers [4, 5, 6, 7, 8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 7.3 Probability Classifiers [4, 5, 6, 7, 8] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 7.4 Advanced Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 7.5 Advanced Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 8 Intermediate Supervised Learning 13 8.1 Probability Models for Supervised Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 8.2 Important Issues [4, 5, 6] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 8.3 Kernel Based Classifiers [9] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 9 Advanced Supervised Learning 13 9.1 Graphical Model Based Classifiers [4, 6, 9] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 9.2 Data Preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 9.3 Combining Classifiers [4, 10] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 V Unsupervised Learning 15 10 Basic Unsupervised Learning [11, 7] 15 11 Intermediate Unsupervised Learning [11] 15 12 Advanced Unsupervised Learning [11, 6] 15 VI Data Mining 16 13 Basic Data Mining 16 13.1 Mining the Web for Structured Data [12, 13, 14, 15] . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 13.2 Frequent Itemsets and Association rules [12, 15] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 A New Era 2
  • 3. DataLab Classes 14 Advanced Data Mining 16 14.1 Near Neighbor Search in High Dimensional Data [14, 16] . . . . . . . . . . . . . . . . . . . . . . . . . 16 14.2 Structure of the Webgraph [13] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 References 17 A New Era 3
  • 4. DataLab Classes 1 Introduction Here, we propose the following schedule of classes. Feature Engineering Feature Engineering Basic Mathematical Modules Linear Algebra/Regression Basic Probability Basic Optimization Methods Supervised Learning Basic Supervised Learning Intermediate Supervised Learning Advanced Supervised Learning Advanced Optimization Advanced Probability Unsupervised Learning Basic Unsupervised Learning Intermediate Unsupervised Learning Advanced Unsupervised Learning Python Data Structures Basic Python Data Mining Advanced Data Mining Basic Data Mining A New Era 4
  • 5. DataLab Classes Part I Python Module 2 Basic Python 2.1 History of Python 1. Python was conceived in the late 1980s. 2. By Guido Van Rossum (a) Benevolent dictator for life!!! 3. Python reached version 1.0 in January 1994. 4. The major new features included in this release were the functional programming tools: (a) lambda - the basic lambda calculus implementation. (b) map - it applies a given function to each element of a list. (c) filter - it filters out items based on a test function. (d) reduce - it is a really useful function for performing some computation on a list and returning the result. 5. Standard Implementations (a) CPython (b) Jython (1997) (c) IronPython (2006) - .NET (d) PyPy (2007) written in RPython 6. Exotic Implementation (a) Stackless Python (2000) does not depend on the C call stack. 2.2 The Basic Structures 2.2.1 Variables There is a list of things that need to be covered: 1. Assigning Values to Variables. 2. Multiple Assignment 3. Standard Data Types (a) Numbers (b) String (c) List (d) Tuple (e) Dictionary A New Era 5
  • 6. DataLab Classes 2.2.2 Python Basic Operators Here, we need to review: 1. Arithmetic Operators 2. Comparison (Relational) Operators 3. Assignment Operators 4. Logical Operators 5. Bitwise Operators 6. Membership Operators 7. Identity Operators 2.2.3 Control Flow Structures 1. The If Statement 2. Loop Instructions (a) For Instruction (b) While Instruction (c) Techniques 3. Else in loops 4. The range instruction 5. break and continue Statements 6. pass Statements 2.2.4 Defining Functions Here, we classically define the syntax for functions using: 1. Fibonacci function Then, we look at 1. Default Argument Values 2. Keyword Arguments 3. Arbitrary Argument Lists In addition, we look at small expressions using lambda expressions. A New Era 6
  • 7. DataLab Classes 2.2.5 Functional Programming - Lambda Calculus Review the difference between: 1. Procedural 2. Declarative 3. Object-Oriented 4. Functional 3 Data Structures in Python 3.1 Lists Review the most basic functions for a list: 1. append(x) 2. extend(x) 3. remove(x) 4. pop() 5. index(x) 6. count(x) 7. sort() 8. reverse() Then given all this operations it is possible to use a list as 1. Stacks 2. Queues Here, you need to be careful because the emulation vs native implementation. 3.2 Tuples Here, we will review the Tuple. 3.3 Sets They are an unordered collection with no duplicate elements 3.4 Dictionaries Another useful data type built into Python is the dictionary. It can be seen as an unordered set of key:value pairs. Here, we will see how to substitute the case statement by a dictionary A New Era 7
  • 8. DataLab Classes 3.5 Hash Tables Here, we will review some of the basics in 1. Hash Functions 2. Hash Tables 3.6 Trees As you can imagine trees are an evolution from the chain lists. Thus, we will review how to : 1. Build a tree. 2. Traverse a tree 3. Expression trees 3.7 Graphs Look at the basics on how building graphs by using lists and dictionaries. Then, look at 1. Breath First Search 2. Depth First Search 3.8 Divide and Conquer Here, review the basics of divide and conquer using 1. Merge Sort 2. Quick Sort A New Era 8
  • 9. DataLab Classes Part II Basic Mathematical Modules 4 Linear Algebra [1] 4.1 Vector Spaces 1. Using Vector to Represent Samples 2. Basis of Vector Spaces 3. Sub-spaces 4. Linear independence 4.2 System of Linear Equations 1. Elementary row operations and elementary matrices 2. Example the inverse matrix 3. Squared Matrices and Inverse 4. Determinants 5. Derivatives of Matrices 4.3 Basis 1. Orthonormal Basis 2. Eigenvalues and Eigenvectors 4.4 Applications 1. Linear Regression 2. Graph Theory 3. Sparse Matrices 5 Probability and Statistical Theory [2] 5.1 Introduction 1. Probability Definition 2. Basic Set Operations 3. Counting A New Era 9
  • 10. DataLab Classes 5.2 Bayes Theorem 1. Conditional probability 2. Independence of events 5.3 Distributions 1. Random Variables 2. Distributions (a) Continuous (b) Discrete (c) Examples in Machine Learning 3. Mean and variance 4. Joint distributions 5. Covariance and Correlation 5.4 Applications 1. Bayesian inference 2. Principal Component Analysis 3. Analysis of Samples 6 Optimization [3] 6.1 Introduction 1. Formulation 2. Example: Least Squared Error 3. Continuous versus Discrete Optimization 4. Constrained and Unconstrained Optimization 6.2 Basic Optimization 1. How to recognize a minimum? 2. Search using Derivatives (a) Gradient Descent 3. Linear Search A New Era 10
  • 11. DataLab Classes Part III Feature Engineering 1. Introduction 2. Types of Data (a) Categorical (b) Numerical 3. Extracting meaning from the data 4. Encoding techniques of categorical variables: (a) Feature hashing and bin-counting 5. Feature engineering for numeric data: (a) Filtering, binning, scaling, log transforms, and power transforms (b) Statistical analysis of the data 6. Natural text techniques: (a) bag-of-words, n-grams. 7. The Cycle of feature engineering (a) Forward, backward selection, etc 8. Non-Linear Featurization and Model Stacking A New Era 11
  • 12. DataLab Classes Part IV Supervised Learning 7 Basic Supervised Learning 7.1 Introduction [4, 5, 6] 1. What is Learning? 2. The Learning Process 7.2 Linear Classifiers [4, 5, 6, 7, 8] 1. What is a Classifier? 2. Introduction to Linear Classifiers 3. Mean-Square Error Linear Estimation 4. Regularization 7.3 Probability Classifiers [4, 5, 6, 7, 8] 1. Discriminant Functions 2. Naive Bayes 7.4 Advanced Optimization 1. Differentiable Convex Functions 2. Newton’s Method 3. Quasi-Newton’s Methods 4. Constrained Optimization (a) Lagrange Multipliers (b) The Karush-Kuhn-Tucher Condiditons (c) Duality 7.5 Advanced Probability 1. Central limit theorem 2. Bayes Theorem (a) Likelihood Distribution A New Era 12
  • 13. DataLab Classes 3. Bayesian inference (a) Maximum A Posteriori 4. Generative Models 5. Point Estimation 6. Dirichlet Processes 8 Intermediate Supervised Learning 8.1 Probability Models for Supervised Classification 1. Logistic Regression 2. Expectation Maximization 3. Maximum A Posteriori in the LASSO 8.2 Important Issues [4, 5, 6] 1. Statistical Analysis of Classifiers 2. Bias-Varance Dilemma 3. The Confusion Matrix 4. K-Cross Validation 8.3 Kernel Based Classifiers [9] 1. Introduction 2. Support Vector Machines 9 Advanced Supervised Learning 9.1 Graphical Model Based Classifiers [4, 6, 9] 1. Decision Trees. 2. Neural Networks • Perceptron • Multilayer Perceptron 3. Deep Neural Networks (a) CNN A New Era 13
  • 14. DataLab Classes 9.2 Data Preparation 1. Feature selection [4, 5, 6] (a) Preprocessing (b) Statistical Methods (c) Class Separability (d) Feature Subset Selection. 2. Feature Generation [4, 5, 6, 17] (a) Introduction (b) Fisher Method (c) Principal Component Analysis (d) Dimensionality Reduction 9.3 Combining Classifiers [4, 10] 1. Average Rules 2. Majority Voting Rule 3. A Bayesian viewpoint 4. Boosting A New Era 14
  • 15. DataLab Classes Part V Unsupervised Learning 10 Basic Unsupervised Learning [11, 7] 1. Introduction 2. Proximity Measures. 3. Admissible Cost Functions. 4. Basic Clustering Algorithms: (a) K-Means (b) K-Centers (c) K-Medians 11 Intermediate Unsupervised Learning [11] 1. Hierarchical Clustering (a) Graph Model For Hierarchical Clustering 2. Partitional Clustering (a) Grouping Based in Disimilarities (b) Grouping Based on Cluster Volume 3. Hierarchical Clustering vs. Partitional Clustering 12 Advanced Unsupervised Learning [11, 6] 1. Large Data Set Clustering: (a) Density Based: DBSCAN, (b) Grid Based: STING, CLIQUE 2. Strategies for Large Data Sets (a) One-pass strategies (b) Summarization strategies (c) Sampling/batch strategies (d) Approximation strategies (e) Divide and conquer strategies A New Era 15
  • 16. DataLab Classes Part VI Data Mining 13 Basic Data Mining 13.1 Mining the Web for Structured Data [12, 13, 14, 15] 1. Introduction 2. Why do we want to deal with data from the web? 3. Extracting information from the web 13.2 Frequent Itemsets and Association rules [12, 15] 1. The Market Problem 2. The basic algorithm 3. Improvements 14 Advanced Data Mining 14.1 Near Neighbor Search in High Dimensional Data [14, 16] 1. Locality Sensitive Hashing 2. Near Neighbor Search 14.2 Structure of the Webgraph [13] 1. Page Rank Algorithm 2. Topic Specific Page Rank 3. Trust Rank A New Era 16
  • 17. DataLab Classes References [1] G. Strang, Introduction to Linear Algebra. Wellesley, MA: Wellesley-Cambridge Press, fourth ed., 2009. [2] R. Ash, Basic Probability Theory. Dover Books on Mathematics Series, Dover Publications, Incorporated, 2012. [3] J. Nocedal and S. J. Wright, Numerical Optimization, second edition. World Scientific, 2006. [4] C. M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics). Secaucus, NJ, USA: Springer-Verlag New York, Inc., 2006. [5] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification (2Nd Edition). Wiley-Interscience, 2000. [6] S. Theodoridis and K. Koutroumbas, Pattern Recognition, Fourth Edition. Academic Press, 4th ed., 2008. [7] S. Theodoridis, Machine Learning: A Bayesian and Optimization Perspective. Academic Press, 1st ed., 2015. [8] G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning: With Applications in R. Springer Publishing Company, Incorporated, 2014. [9] S. Haykin, Neural Networks and Learning Machines. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 2008. [10] R. Rojas, “Adaboost and the super bowl of classifiers a tutorial introduction to adaptive boosting,” 2009. [11] S. Wierzchoń and M. Kłopotek, Modern Algorithms of Cluster Analysis, vol. 34. Springer, 2017. [12] J. Han, Data Mining: Concepts and Techniques. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2011. [13] A. N. Langville and C. D. Meyer, Google’s PageRank and Beyond: The Science of Search Engine Rankings. Princeton, NJ, USA: Princeton University Press, 2006. [14] A. Rajaraman and J. D. Ullman, Mining of Massive Datasets. New York, NY, USA: Cambridge University Press, 2011. [15] I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2005. [16] H. Samet, Foundations of Multidimensional and Metric Data Structures (The Morgan Kaufmann Series in Computer Graphics and Geometric Modeling). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2005. [17] R. E. Schapire and Y. Freund, Boosting: Foundations and algorithms. MIT press, 2012. A New Era 17