2. 1. Introduction to Data Science
• What is Data Science? Why Data Science?
• Need for Data Scientists in Industries
• Role of a Data Scientist in Industries
• How to become a Data Scientist?
www.allyedu.in
3. 2. Business Statistics
o Univariate Analysis
• Measures of central tendency (Mean, Median & Mode)
• Measures of dispersion (Range, Quartiles, Deciles,
Percentiles, Standard deviation, Variance, Mean/Median/Mode
Deviation)
• Measures of shape (Normal distribution, Central Limit Theorem,
Skewness (Left & Right), Kurtosis (Platykurtic, Mesokurtic & Leptokurtic))
• Tables (Counts, Frequency tables, Class intervals)
• Charts (Histograms, Polygonal Charts, Ogive Charts)
o Data Types
• Qualitative & Quantitative data
o Measurement & Scaling
• Nominal, Ordinal, Interval & Ratio
o Bi-variate Analysis
• Cross tabs, Correlations
o Hypothesis Testing
• Null & Alternate hypotheses, level of significance, Type-I Error,
Type-II Error
• Parametric Tests (Z-tests, T-tests, Chi-sq tests, ANOVA (one-way,
two-way))
• Non-Parametric Tests (Wilcoxon Signed-Rank test, Wilcoxon Rank-Sum
test, Kruskal-Wallis test, Friedman Rank test)
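The univariate measures listed above (central tendency and dispersion) can be illustrated with a minimal sketch using Python's standard `statistics` module. The `sales` figures are hypothetical, and the population forms of variance and standard deviation are an assumption; the course may use the sample forms instead.

```python
import statistics

# Hypothetical sample data, for illustration only.
sales = [12, 15, 15, 18, 20, 22, 30]

# Measures of central tendency.
mean = statistics.mean(sales)
median = statistics.median(sales)
mode = statistics.mode(sales)

# Measures of dispersion.
data_range = max(sales) - min(sales)
variance = statistics.pvariance(sales)          # population variance
std_dev = statistics.pstdev(sales)              # population standard deviation
q1, q2, q3 = statistics.quantiles(sales, n=4)   # quartiles

print("mean:", round(mean, 2), "median:", median, "mode:", mode)
print("range:", data_range, "variance:", round(variance, 2),
      "std dev:", round(std_dev, 2))
print("quartiles:", q1, q2, q3)
```

Deciles and percentiles follow the same pattern with `n=10` and `n=100`.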
4. o Probability
• Definitions (Probability, Events, Non-events, Mutually exclusive,
Independent, Dependent, Exhaustive)
• Distributions (Gaussian, Binomial, Bernoulli)
• Variable Types (Continuous & Discrete random variables)
o Multivariate Analysis
o Modeling Methodology
• Setting the working directory, Importing the data,
Splitting of the data, Data preprocessing (Missing value
treatment, Outlier treatment), Multicollinearity, variable
importance, Model development, Model validation using
various diagnostic techniques
o Supervised
• Regression (Simple Linear, Multiple Linear & Binary
Logistic), SVM, CART, Decision Tree, Random Forest, Naïve
Bayes, KNN, ANN
o Unsupervised
• Clustering, Apriori
o NLP
• Text Mining
o Forecasting
• Regular, Seasonal, Cyclic, Irregular trends, Moving
averages, Weighted moving averages, ACF, PACF, ARIMA
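The moving averages and weighted moving averages named under Forecasting can be sketched in plain Python as follows. The function names, the `demand` series and the weights are hypothetical, chosen only for illustration; the weighting convention (last weight applies to the most recent point) is an assumption.

```python
def moving_average(series, window):
    """Simple moving average: unweighted mean of each trailing window."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]


def weighted_moving_average(series, weights):
    """Weighted moving average; weights[-1] applies to the most recent point."""
    window = len(weights)
    total = sum(weights)
    return [
        sum(w * x for w, x in zip(weights, series[i - window + 1:i + 1])) / total
        for i in range(window - 1, len(series))
    ]


demand = [10, 12, 14, 16, 18]                      # hypothetical series
sma = moving_average(demand, 3)
wma = weighted_moving_average(demand, [1, 2, 3])   # recent points weigh more
print(sma)
print([round(v, 2) for v in wma])
```

On an upward-trending series like this one, the weighted average tracks recent values more closely than the simple average.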
5. 3. R
o Introduction
• Introduction, History, Various versions,
Installation, Terminologies, Advantages &
Disadvantages
o Packages
• Definition, Objective, Installation, CRAN mirror
o Help
• Help & search functions
o Working Directory
• Objective, Setting the working directory, getting the
working directory, file.choose
o Importing & exporting
• CSV, Excel & Text files
o Data Types
• Introduction to various data types - Vectors, Lists, Matrices,
Arrays, Data Frames & Factors & corresponding functions
o Data Creation
o Data Conversions
• Converting from one data type to another data type
6. o Understanding the data
• Type, structure, dimension, # of rows & columns, nature of the
data
o Slicing & extraction of data
• Subsetting the data by variables & records based on logic
o Date & Time functions
• Date & time formats, converting from one format to another
o Joins
• Left, Right, Inner & Outer
o Data Merging
• Merging the data by rows, columns
7. o Functions
• Numeric, String, Arithmetic
o Family Functions
• Apply family (apply, lapply, sapply), Normal distribution family (dnorm, pnorm, qnorm, rnorm)
o Missing values
• Identification, position, # of missing values, removing the
missing values, removing the records with missing values
o Conditions
• Ifelse, For loop, While loop
o Plots
• Bar, Pie, Histograms, Line & additional functions
o Statistics
• Univariate, Bivariate & Multivariate (Supervised & Unsupervised)
analysis
8. 4. Python
o Introduction
• Introduction, Advantages, Installation of Anaconda
o Packages
• Introduction to various packages like pandas, NumPy,
scikit-learn, SciPy, Matplotlib & TensorFlow
o Data Types
• Strings, Tuples, Dictionaries, Sets, Lists & Arrays
o Data Conversions
• Converting from one data type to another data type
o Understanding the data
• Type, structure, dimension, # of rows & columns, nature of the
data
9. o Slicing & extraction of data
• Subsetting the data by variables & records based on logic
o Working Directory
• Objective, Setting the working directory, getting the working
directory, file.choose
o Errors & Exceptions
• Differences between errors, exceptions, handling exceptions
o Missing values
• Identification, position, # of missing values, removing the missing
values, removing the records with missing values
o Conditions
• Ifelse, For loop, While loop
o Statistics
• Univariate, Bivariate
o Multivariate Analysis
• Regression (Simple Linear, Multiple Linear & Binary Logistic)
• Machine Learning (SVM, Decision Tree, Random Forest, KNN, ANN,
Boosting Techniques)
• Deep Learning (Image processing using CNN)
• NLP (Text Mining)
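The missing-value steps listed in the Python module (identification, position, count, and removing records with missing values) can be sketched in plain Python as below. The `records` data is hypothetical and `None` is assumed to mark a missing value; in pandas the same steps are usually done with `isna()` and `dropna()`.

```python
# Hypothetical records; None marks a missing value.
records = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 61000},
    {"age": 29, "income": None},
    {"age": 41, "income": 58000},
]

# Identification and position: (row index, column) of each missing value.
missing_positions = [
    (i, col)
    for i, row in enumerate(records)
    for col, value in row.items()
    if value is None
]

# Number of missing values per column.
missing_counts = {
    col: sum(row[col] is None for row in records) for col in records[0]
}

# Removing the records that contain any missing value.
complete_records = [
    row for row in records if all(v is not None for v in row.values())
]

print(missing_positions)
print(missing_counts)
print(len(complete_records), "complete records")
```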
10. 5. Tableau
o Introduction
• Introduction to Visualization, Importance, Various tools,
Tableau variants, Application of various charts based on data
o Basics
• Installation of trial version, Data importing, Live & extract
connections, Dimensions, Measures, Parameters, Filters
o Types of variables
• Character, Numeric, Geographical, Hierarchical, Calculated
variables
o Functions
• Arithmetic, Numeric, Character, Logical & case based
functions
o Panes & Legends
• Various panes in Tableau, Filters, Color legends, Filter legends
o Filters
• Context, Local, Global filters, etc.
o Charts
• Bar, Line, Pie, Area, Circle, Bubble, Bullet, etc.
o Advanced
• Data blending, data extracts, packaged versions, Dashboards
(Static & Dynamic), Publishing the dashboards, Geographical maps (at
ZIP code level, County level, State level) & Map layers
11. 6. Excel
o Introduction
• Importance, benefits of Excel, Menu bar, Cell, Formula bar, # of
rows & columns in Excel, Difference between Excel & CSV
o Basics
• Writing functions, Arithmetic, Random number creation, saving the
Excel files, Sorting, Data filtering, Removal of duplicates,
Inserting rows & columns, Deleting rows & columns,
Toggling from left to right & top to bottom, Selecting rows &
columns
o Advanced
• Pivot tables, Conditional formatting, Countif, Countifs, Sumif,
Sumifs, Formatting for reports, Vlookup, Hlookup, Generating
the report
o Statistics
• Univariate, Bivariate Analysis
o Multivariate Analysis
• Linear Regression
12. 7. SQL
o Introduction
• Introduction, various SQL languages, databases, importance of
SQL
o Basics
• Various SQL statements & clauses like SELECT, CREATE, INSERT INTO,
FROM, TOP 100
o Advanced
• Joins (Left, Right, Full Outer & Inner), Views (Importance,
Creation)
o Conditions
• Where, Having, Order by, Group by
o Operators
• AND, OR & other logical operators
o Functions
• Arithmetic, Numeric, Character & Logical
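The SQL topics above (CREATE, INSERT INTO, SELECT, joins, WHERE, GROUP BY, HAVING, ORDER BY) can be tried out with a minimal sketch using Python's built-in `sqlite3` module. The table names, columns and rows are hypothetical. Note that `TOP 100` is SQL Server syntax; SQLite, assumed here only because it ships with Python, uses `LIMIT` for the same purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
cur = conn.cursor()

# CREATE and INSERT INTO with hypothetical tables and rows.
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")
cur.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                [(1, "Asha", "Pune"), (2, "Ravi", "Delhi"), (3, "Meena", "Pune")])
cur.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, 1, 250.0), (2, 1, 100.0), (3, 2, 75.0)])

# LEFT JOIN keeps customers without orders, WHERE filters rows,
# GROUP BY aggregates per customer, HAVING filters the groups,
# ORDER BY sorts the result, LIMIT caps the row count.
rows = cur.execute("""
    SELECT c.name, COUNT(o.id) AS n_orders, COALESCE(SUM(o.amount), 0) AS total
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    WHERE c.city = 'Pune'
    GROUP BY c.id, c.name
    HAVING COALESCE(SUM(o.amount), 0) < 1000
    ORDER BY total DESC
    LIMIT 100
""").fetchall()
conn.close()

for name, n_orders, total in rows:
    print(name, n_orders, total)
```

Swapping `LEFT JOIN` for `INNER JOIN` would drop the customer who has no orders, which is a quick way to see the difference between the two join types.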
13. 8. Artificial Intelligence and Machine Learning with Python
• Linear Models
Understand linear approximation and modelling of problems and develop linear models.
• Dimensionality Reduction
Use ideas from linear algebra to transform dimensions and warp space, providing additional
flexibility and functionality to linear models.
• SVM
Develop and implement kernel-based methods to build nonlinear models that solve a few
complex tasks.
• Nearest Neighbours, K-means, and Gaussian Mixture Models
Review pattern recognition ideas with distance- and cluster-based models to understand
similarity measures and grouping criteria.
• Naive Bayes and Decision Trees
Dive into applications of Bayes' theorem and the use of decision criteria when learning
from data.
• Search
Look at search from the perspective of graphs, trees and heuristic-based optimizations.
• Logic and Planning
Discover ways to encode logic and develop agents that plan actions in an environment.
• Reinforcement Learning and Hidden Markov Models
Engineer agents that learn from a sequence of actions using rewards and penalties.
• Q-Learning and Policy Gradient
Operate in a stateful world using value and policy approximation.
14. Data Preprocessing
Regression Techniques
Simple Linear Regression
Multiple Linear Regression
Polynomial Linear Regression
Support Vector Regression
Decision Tree Regression
Random Forest Regression
Evaluating Regression Model Performance
Classification Techniques
K-Nearest Neighbors (KNN)
Support Vector Machine (SVM)
Kernel SVM
Naive Bayes Classification
Decision Tree Classification
Random Forest Classification
Evaluating Classification Model Performance
Natural Language Processing (NLP)
Basics of NLP
Language Preprocessing Techniques
Automatic summarization of a given text document
Clustering Techniques
K-Means Clustering
Mini-Batch K-Means Clustering
Hierarchical Clustering
15. Elbow Method
Curve Smoothing Techniques
Association Rule Learning
Reinforcement Learning
Basics of NumPy and pandas
Deep Learning
Basics / What is Deep Learning?
Artificial Neural Networks
Dimension Reduction Techniques
Principal Component Analysis (PCA)
Linear Discriminant Analysis (LDA)
Statistics Basics
Standard Deviation
Variance
Covariance
T-distribution
Pearson Correlation Coefficient (PCC)/ Correlation Coefficient
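The Statistics Basics topics (variance, standard deviation, covariance and the Pearson correlation coefficient) are related by simple formulas, sketched below in plain Python. The helper names and the `hours` / `scores` data are hypothetical, and the sample (n - 1) form is an assumption.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def variance(xs):
    """Sample variance: the covariance of a variable with itself."""
    return covariance(xs, xs)


def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(variance(xs) * variance(ys))


hours = [1, 2, 3, 4, 5]            # hypothetical study hours
scores = [52, 55, 61, 64, 70]      # hypothetical test scores
r = pearson(hours, scores)

print("variance of hours:", variance(hours))
print("std dev of hours:", round(math.sqrt(variance(hours)), 3))
print("covariance:", covariance(hours, scores))
print("Pearson r:", round(r, 3))
```

Because the coefficient divides the covariance by both standard deviations, it always lies between -1 and 1 regardless of the units of the inputs.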