Statistics in Data Science with Python

Mahe Karim

Who Am I?
Mahe Karim
Front End Developer
ID - 162-15-7770
Areas of Interest:
• Full Stack Developer
• Data Analyst
• Animation
• Why not jump into passive income? ;)

Implementation of our course
• Step 1: Statistics
• Step 2: Data Science
• Step 3: Python

Basic Road to Data Science
• Statistics
• Machine Learning
• Deep Learning
• Programming Language (Python / R)
• Data Science

Smartest way to be a Data Scientist / Analyst
Step 1: Statistics
• Core Statistics
• Statistical Machine Learning
• Probabilistic Modeling
Step 2: Computing
• Database
• Data Mining
• Data Design
Step 3: ML
• Deep Learning
• NLP
• Data Analysis

3 steps to learning the statistics and probability required for data science:
Step 1: Core Statistics Concepts
• Descriptive statistics, distributions, hypothesis testing, and regression.
Step 2: Bayesian Thinking
• Conditional probability, priors, posteriors, and maximum likelihood.
Step 3: Intro to Statistical Machine Learning
• Learn basic machine learning concepts and how statistics fits in.
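
The Bayesian thinking step (conditional probability, priors, posteriors) can be made concrete with a small sketch of Bayes' rule. The slides give no worked example, so everything here is an assumption for illustration: the helper name `posterior` is hypothetical, and the numbers (1% prior, 95% sensitivity, 90% specificity) are made up.

```python
def posterior(prior, sensitivity, specificity):
    """Return P(condition | positive test) using Bayes' rule."""
    # P(positive) = P(pos | cond) * P(cond) + P(pos | no cond) * P(no cond)
    false_positive_rate = 1.0 - specificity
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    # Posterior = likelihood * prior / evidence
    return sensitivity * prior / evidence


# Illustrative (made-up) numbers: a rare condition and a fairly accurate test
p = posterior(prior=0.01, sensitivity=0.95, specificity=0.90)
print(f"P(condition | positive test) = {p:.3f}")
```

Even with a fairly accurate test, the posterior stays below 10% here because the prior is so small. This is the kind of intuition the Bayesian step is meant to build.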

Most Important Topics in Statistics
• Part 1 - Simple Linear Regression
• Part 2 - Multivariate Linear Regression
• Part 3 - Logistic Regression
• Part 4 - Multivariate Logistic Regression
• Part 5 - Neural Networks
• Part 6 - Support Vector Machines
• Part 7 - K-Means Clustering & PCA
• Part 8 - Anomaly Detection & Recommendation

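import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Jupyter/IPython magic: render plots inline (remove when running as a plain script)
%matplotlib inline
# Build the path portably; the 'data' subfolder is an assumption, because the
# path separators were lost on the slide ('dataex1data1.txt')
path = os.path.join(os.getcwd(), 'data', 'ex1data1.txt')
# The file has no header row, so name the two columns explicitly
data = pd.read_csv(path, header=None, names=['Population', 'Profit'])
data.head()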

data.plot(kind='scatter', x='Population', y='Profit', figsize=(12,8))

Implementing Simple Linear Regression
def computeCost(X, y, theta):
    # squared error of each prediction (X * theta.T) against the target y
    inner = np.power(((X * theta.T) - y), 2)
    # cost: sum of squared errors divided by twice the number of samples
    return np.sum(inner) / (2 * len(X))

# append a ones column to the front of the data set
data.insert(0, 'Ones', 1)
# set X (training data) and y (target variable)
cols = data.shape[1]
X = data.iloc[:,0:cols-1]
y = data.iloc[:,cols-1:cols]
# convert from data frames to numpy matrices
X = np.matrix(X.values)
y = np.matrix(y.values)
theta = np.matrix(np.array([0,0]))
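
The slides define the cost function but do not show how the parameters are fitted. Below is a minimal sketch of batch gradient descent for the same cost, J(theta) = sum((X theta - y)^2) / (2m). Several things are assumptions rather than the author's code: the function names `compute_cost` and `gradient_descent`, the learning rate and iteration count, the use of plain NumPy arrays instead of `np.matrix`, and the synthetic data standing in for the Population/Profit file.

```python
import numpy as np


def compute_cost(X, y, theta):
    """Cost J(theta) = sum((X @ theta - y)^2) / (2m)."""
    residuals = X @ theta - y
    return float(residuals @ residuals) / (2 * len(y))


def gradient_descent(X, y, theta, alpha, iters):
    """Batch gradient descent; returns the final theta and the cost per iteration."""
    m = len(y)
    costs = np.zeros(iters)
    for i in range(iters):
        # Gradient of J with respect to theta: X^T (X theta - y) / m
        gradient = X.T @ (X @ theta - y) / m
        theta = theta - alpha * gradient
        costs[i] = compute_cost(X, y, theta)
    return theta, costs


# Synthetic stand-in data (NOT the real data set): profit is roughly linear in population
rng = np.random.default_rng(0)
population = rng.uniform(5, 20, size=100)
profit = -4.0 + 1.2 * population + rng.normal(0, 1, size=100)

# Prepend a column of ones so that theta[0] acts as the intercept
X = np.column_stack([np.ones_like(population), population])
y = profit

# alpha and iters are illustrative choices, not values taken from the slides
g, costs = gradient_descent(X, y, np.zeros(2), alpha=0.005, iters=20000)
print("fitted parameters (intercept, slope):", g)
print("final cost:", costs[-1])
```

In this sketch `g` is a one-dimensional array, so the intercept is `g[0]` and the slope is `g[1]`. Code that expects a 1x2 matrix indexed as `g[0, 0]` and `g[0, 1]` could use `g.reshape(1, -1)` instead.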

# g holds the fitted parameters [intercept, slope] as a 1x2 matrix; the
# gradient descent step that produces it is not shown on the slides
x = np.linspace(data.Population.min(), data.Population.max(), 100)
f = g[0, 0] + (g[0, 1] * x)
fig, ax = plt.subplots(figsize=(12,8))
ax.plot(x, f, 'r', label='Prediction')
ax.scatter(data.Population, data.Profit, label='Training Data')
ax.legend(loc=2)
ax.set_xlabel('Population')
ax.set_ylabel('Profit')
ax.set_title('Predicted Profit vs. Population Size')

Resources:
• https://elitedatascience.com/learn-statistics-for-data-science
• https://github.com/datasciencemasters/go
• An Introduction to Statistical Learning with Applications in R, by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani
• http://www.johnwittenauer.net/machine-learning-exercises-in-python-part-1/
• Think Stats
