Python for R Users

By Chandan Routray, as part of an internship at www.decisionstats.com
Uploaded by Ajay Ohri (Experienced Data Scientist)
Basic Commands
Dec 2014 Copyright www.decisionstats.com. Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
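Functions, each with the R command followed by its Python counterpart:
Downloading and installing a package
    R:      install.packages('name')
    Python: pip install name    (run from the shell, not inside the Python interpreter)
Load a package
    R:      library('name')
    Python: import name as other_name
Checking working directory
    R:      getwd()
    Python: import os
            os.getcwd()
Setting working directory
    R:      setwd('path')
    Python: os.chdir('path')
List files in a directory
    R:      dir()
    Python: os.listdir()
List all objects
    R:      ls()
    Python: globals()
Remove an object
    R:      rm('name')
    Python: del name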
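As a rough illustration of the Python commands listed above, the following sketch runs them in sequence. The temporary directory, the file name `example.txt` and the variable `scratch` are assumptions made for this example and do not come from the slides. It uses `dir()` with no arguments, which lists the names in the current scope; at module level these are the same names that `globals()` holds.

```python
import os
import tempfile

# Remember the starting directory so the example can restore it afterwards.
start_dir = os.getcwd()                   # R: getwd()

with tempfile.TemporaryDirectory() as tmp:
    os.chdir(tmp)                         # R: setwd(tmp)
    open("example.txt", "w").close()      # create an empty file so there is something to list
    files = os.listdir()                  # R: dir()
    os.chdir(start_dir)                   # leave the directory before it is deleted

scratch = 42                              # hypothetical object
exists_before = "scratch" in dir()        # R: "scratch" %in% ls()
del scratch                               # R: rm(scratch)
exists_after = "scratch" in dir()

print(files, exists_before, exists_after)
```

Note that `del` is a statement that takes a name, not a function that takes a string.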
Data Frame Creation
R and Python (using pandas package*)
Creating a data frame "df" of dimension 6x4 (6 rows and 4 columns) containing random numbers
R:
    A <- matrix(runif(24, 0, 1), nrow=6, ncol=4)
    df <- data.frame(A)
Here,
• the runif function generates 24 random numbers between 0 and 1
• the matrix function creates a matrix from those random numbers; nrow and ncol set the number of rows and columns of the matrix
• data.frame converts the matrix to a data frame
Python:
    import numpy as np
    import pandas as pd
    A = np.random.randn(6, 4)
    df = pd.DataFrame(A)
Here,
• np.random.randn generates a matrix of 6 rows and 4 columns drawn from the standard normal distribution; this function is part of the numpy** library. For uniform numbers between 0 and 1, as produced by runif, use np.random.rand(6, 4) instead
• pd.DataFrame converts the matrix into a data frame
*To install the pandas library visit: http://pandas.pydata.org/; to import it type: import pandas as pd
**To import the numpy library type: import numpy as np
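R's `runif` draws from a uniform distribution, while `np.random.randn` draws from a standard normal one, so the two snippets do not produce the same kind of numbers. A minimal sketch of both variants follows, assuming NumPy's `default_rng` generator interface and an arbitrary seed chosen only to make the example repeatable.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)        # seed chosen arbitrarily for repeatability

# Uniform draws on [0, 1), the counterpart of runif(24, 0, 1) in R.
uniform_values = rng.random((6, 4))
df_uniform = pd.DataFrame(uniform_values)

# Standard normal draws, the counterpart of np.random.randn(6, 4).
normal_values = rng.standard_normal((6, 4))
df_normal = pd.DataFrame(normal_values)

print(df_uniform.shape)                    # (6, 4)
```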
Data Frame: Inspecting and Viewing Data
R and Python (using pandas package*)
Getting the names of rows and columns of data frame "df"
    R:      rownames(df)    returns the names of the rows
            colnames(df)    returns the names of the columns
    Python: df.index        returns the names of the rows
            df.columns      returns the names of the columns
Seeing the top and bottom "x" rows of the data frame "df"
    R:      head(df,x)      returns the top x rows of the data frame
            tail(df,x)      returns the bottom x rows of the data frame
    Python: df.head(x)      returns the top x rows of the data frame
            df.tail(x)      returns the bottom x rows of the data frame
Getting the dimensions of data frame "df"
    R:      dim(df)         returns in this format: rows, columns
    Python: df.shape        returns in this format: (rows, columns)
Length of data frame "df"
    R:      length(df)          returns the number of columns in the data frame
    Python: len(df.columns)     returns the number of columns in the data frame
            len(df)             returns the number of rows, not columns
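The inspection calls can be tried on a small frame. This is a sketch with made-up data (the integers 0 to 23), not data from the slides. Note that `len(df)` counts rows in pandas, whereas `length(df)` counts columns in R.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(24).reshape(6, 4))   # 6 rows, 4 columns

row_labels = list(df.index)        # R: rownames(df)
col_labels = list(df.columns)      # R: colnames(df)
top_two = df.head(2)               # R: head(df, 2)
bottom_two = df.tail(2)            # R: tail(df, 2)
n_rows, n_cols = df.shape          # R: dim(df)

print(len(df))                     # 6, the number of rows
print(len(df.columns))             # 4, matches length(df) in R
```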
Data Frame: Inspecting and Viewing Data (continued)
R and Python (using pandas package*)
Getting a quick summary (mean, standard deviation etc.) of the data in the data frame "df"
    R:      summary(df)
            returns minimum, first quartile, median, mean, third quartile and maximum
    Python: df.describe()
            returns count, mean, standard deviation, minimum, 25%, 50%, 75% and maximum
Setting the row names and column names of the data frame "df"
    R:      rownames(df) <- c("A", "B", "C", "D", "E", "F")
            sets the row names to A, B, C, D, E and F
            colnames(df) <- c("P", "Q", "R", "S")
            sets the column names to P, Q, R and S
    Python: df.index = ["A", "B", "C", "D", "E", "F"]
            sets the row names to A, B, C, D, E and F
            df.columns = ["P", "Q", "R", "S"]
            sets the column names to P, Q, R and S
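A short sketch of relabelling and summarising a frame, using made-up values (0 to 23) so the summary statistics are easy to check by hand:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(24, dtype=float).reshape(6, 4))

df.index = ["A", "B", "C", "D", "E", "F"]    # R: rownames(df) <- c(...)
df.columns = ["P", "Q", "R", "S"]            # R: colnames(df) <- c(...)

summary = df.describe()                      # R: summary(df)
print(summary.loc["mean", "P"])              # 10.0, the mean of 0, 4, 8, 12, 16, 20
```

The result of `describe()` is itself a data frame, so individual statistics can be looked up by row label ("mean", "std", "50%" and so on) and column name.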
Data Frame: Sorting Data
R and Python (using pandas package*)
Sorting the data in the data frame "df" by column name "P"
    R:      df[order(df$P),]
    Python: df.sort_values(by='P')
            (the older df.sort(['P']) has been removed from later pandas releases)
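A minimal sketch of sorting with `sort_values`, on a made-up three-row frame. Like `order()` in R, it returns a reordered copy and leaves the original frame untouched.

```python
import pandas as pd

df = pd.DataFrame({"P": [3, 1, 2], "Q": ["c", "a", "b"]})

ascending = df.sort_values(by="P")                     # R: df[order(df$P), ]
descending = df.sort_values(by="P", ascending=False)   # R: df[order(-df$P), ]

print(ascending["Q"].tolist())     # ['a', 'b', 'c']
```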
Data Frame: Data Selection
R and Python (using pandas package*)
Slicing the rows of a data frame from row no. "x" to row no. "y" (including rows x and y)
    R:      df[x:y,]
    Python: df[x-1:y]
            Python starts counting from 0 and excludes the end of a slice
Selecting the columns named "X", "Y" etc. of a data frame "df"
    R:      myvars <- c("X","Y")
            newdata <- df[myvars]
    Python: df.loc[:,['X','Y']]
Selecting the data from row no. "x" to "y" and column no. "a" to "b"
    R:      df[x:y,a:b]
    Python: df.iloc[x-1:y,a-1:b]
Selecting the element at row no. "x" and column no. "y"
    R:      df[x,y]
    Python: df.iat[x-1,y-1]
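The index arithmetic is easy to get wrong, so here is a small worked sketch. The frame contents and the values chosen for `x`, `y`, `a` and `b` are assumptions for illustration; they are 1-based positions as an R user would write them.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(24).reshape(6, 4), columns=["P", "Q", "R", "S"])

x, y = 2, 4      # 1-based first and last row, as in R
a, b = 2, 3      # 1-based first and last column, as in R

rows = df[x - 1:y]                     # R: df[x:y, ]
cols = df.loc[:, ["P", "R"]]           # R: df[c("P", "R")]
block = df.iloc[x - 1:y, a - 1:b]      # R: df[x:y, a:b]
cell = df.iat[x - 1, y - 1]            # R: df[x, y]

print(block.shape, cell)               # (3, 2) 7
```

Subtracting 1 from the start converts to 0-based counting, while the end is left alone because Python slices stop just before the end position.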
Data Frame: Data Selection (continued)
R and Python (using pandas package*)
Using a single column's values to select data, column name "A"
    R:      subset(df,A>0)
            selects all the rows in which the value in column A is greater than 0
    Python: df[df.A > 0]
            does the same as the R function
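A brief sketch of boolean selection on made-up data. The second line shows how two conditions can be combined; each condition needs its own parentheses and `&` replaces R's `&` inside `subset`.

```python
import pandas as pd

df = pd.DataFrame({"A": [-1.0, 0.5, 2.0, 0.0], "B": [10, 20, 30, 40]})

positive = df[df["A"] > 0]                       # R: subset(df, A > 0)
both = df[(df["A"] > 0) & (df["B"] < 30)]        # R: subset(df, A > 0 & B < 30)

print(positive["B"].tolist())                    # [20, 30]
```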
Mathematical Functions
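Functions, each with the R command followed by its Python counterpart (import the math and numpy libraries)
Sum
    R:      sum(x)
    Python: math.fsum(x)
Square Root
    R:      sqrt(x)
    Python: math.sqrt(x)
Standard Deviation
    R:      sd(x)
    Python: numpy.std(x, ddof=1)
            (numpy.std(x) alone divides by n, whereas sd(x) in R divides by n-1)
Log
    R:      log(x)
    Python: math.log(x) or math.log(x, base)
Mean
    R:      mean(x)
    Python: numpy.mean(x)
Median
    R:      median(x)
    Python: numpy.median(x)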
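The functions can be checked against a small hand-computable sample. The eight values below are made up for this sketch; their population standard deviation is exactly 2, which makes the difference between the two standard deviation conventions visible.

```python
import math
import numpy as np

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

total = math.fsum(x)               # R: sum(x)
root = math.sqrt(x[1])             # R: sqrt(4)
mean = np.mean(x)                  # R: mean(x)
median = np.median(x)              # R: median(x)
sd_population = np.std(x)          # divides by n
sd_sample = np.std(x, ddof=1)      # divides by n - 1, matches sd(x) in R
log2_of_8 = math.log(8, 2)         # R: log(8, base = 2)

print(total, mean, median, sd_population)
```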
Data Manipulation
Functions, each with the R command followed by its Python counterpart (import math and numpy library)
Convert character variable to numeric variable
    R:      as.numeric(x)
    Python: for a single value: int(x), float(x) (long(x) exists only in Python 2)
            for lists, vectors etc.: list(map(int,x)), list(map(float,x))
Convert factor/numeric variable to character variable
    R:      paste(x)
    Python: for a single value: str(x)
            for lists, vectors etc.: list(map(str,x))
Check missing value in an object
    R:      is.na(x)
    Python: math.isnan(x)    (for a single numeric value)
Delete missing value from an object
    R:      na.omit(list)
    Python: cleanedList = [x for x in list if str(x) != 'nan']
Calculate the number of characters in character value
    R:      nchar(x)
    Python: len(x)
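A sketch of these conversions in Python 3, on made-up values. In Python 3, `map()` returns a lazy iterator, so it is wrapped in `list()` here; a list comprehension does the same job.

```python
import math

raw = ["1", "2", "3"]
as_ints = [int(v) for v in raw]                      # R: as.numeric(raw)
as_floats = list(map(float, raw))                    # map() is lazy in Python 3, hence list()
as_text = [str(v) for v in as_ints]                  # R: paste(as_ints)

values = [1.5, float("nan"), 3.0]
is_missing = [math.isnan(v) for v in values]         # R: is.na(values)
cleaned = [v for v in values if not math.isnan(v)]   # R: na.omit(values)

n_chars = len("decisionstats")                       # R: nchar("decisionstats")
print(as_ints, cleaned, n_chars)
```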
Date & Time Manipulation
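Functions in R (import lubridate library) and Python (import datetime library)
Getting time and date at an instant
    R:      Sys.time()
    Python: datetime.datetime.now()
Parsing or formatting date and time in the format YYYY MM DD HH:MM:SS
    R:      d <- Sys.time()
            d_format <- ymd_hms(d)
    Python: d = datetime.datetime.now()
            fmt = "%Y %m %d %H:%M:%S"
            d_format = d.strftime(fmt)
            (%m is the two-digit month; %b would give the abbreviated month name.
            strftime turns a datetime into text; datetime.datetime.strptime parses text into a datetime)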
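Formatting and parsing are separate operations in Python's `datetime` module. A minimal sketch of both directions follows; the sample timestamp string is made up for the example.

```python
from datetime import datetime

fmt = "%Y %m %d %H:%M:%S"                  # YYYY MM DD HH:MM:SS

now = datetime.now()                       # R: Sys.time()
as_text = now.strftime(fmt)                # datetime -> string

parsed = datetime.strptime("2014 12 01 09:30:00", fmt)   # string -> datetime
print(parsed.year, parsed.month, parsed.day)             # 2014 12 1
```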
Data Visualization
Functions, each with the R command followed by its Python counterpart (import matplotlib library**)
Scatter Plot variable1 vs variable2
    R:      plot(variable1,variable2)
    Python: plt.scatter(variable1,variable2)
            plt.show()
Boxplot for Var
    R:      boxplot(Var)
    Python: plt.boxplot(Var)
            plt.show()
Histogram for Var
    R:      hist(Var)
    Python: plt.hist(Var)
            plt.show()
Pie Chart for Var
    R:      pie(Var)
    Python: plt.pie(Var)
            plt.show()
**To import the matplotlib library type: import matplotlib.pyplot as plt
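The four plot types can be drawn together on one figure. This is a sketch under a few assumptions: the data values are made up, the non-interactive "Agg" backend is selected so that no window opens, and the figure is written to an in-memory buffer rather than shown with `plt.show()`.

```python
import io

import matplotlib
matplotlib.use("Agg")                  # assumption: render off-screen instead of opening a window
import matplotlib.pyplot as plt

variable1 = [1, 2, 3, 4, 5]
variable2 = [2, 4, 5, 4, 6]
shares = [40, 35, 25]

fig, axes = plt.subplots(2, 2, figsize=(8, 6))
axes[0, 0].scatter(variable1, variable2)          # R: plot(variable1, variable2)
axes[0, 0].set_title("Scatter plot")
axes[0, 1].boxplot(variable2)                     # R: boxplot(variable2)
axes[0, 1].set_title("Box plot")
axes[1, 0].hist(variable2, bins=3)                # R: hist(variable2)
axes[1, 0].set_title("Histogram")
axes[1, 1].pie(shares, labels=["a", "b", "c"])    # R: pie(shares)
axes[1, 1].set_title("Pie chart")

fig.tight_layout()
buffer = io.BytesIO()
fig.savefig(buffer, format="png")                 # PNG bytes held in memory
```

In an interactive session, dropping the `matplotlib.use("Agg")` line and calling `plt.show()` displays the figure instead.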
[Figures: example plots in R and Python shown side by side for scatter plot, box plot, histogram, line plot, bar chart, bubble chart and pie chart]
Thank You
For feedback contact
DecisionStats.com
Coming up
● Data Mining in Python and R (see draft slides afterwards)
Machine Learning: SVM on Iris Dataset
R (using the svm* function):
    library(e1071)
    data(iris)
    trainset <- iris[1:149,]
    testset <- iris[150,]
    svm.model <- svm(Species ~ ., data = trainset, cost = 100, gamma = 1, type = 'C-classification')
    svm.pred <- predict(svm.model, testset[-5])
    svm.pred
Output: virginica
Python (using the sklearn** library):
    # Loading the library
    from sklearn import svm
    # Importing the datasets module
    from sklearn import datasets
    # Creating the SVM classifier
    clf = svm.SVC()
    # Loading the dataset
    iris = datasets.load_iris()
    # Constructing training data (every row except the last)
    X, y = iris.data[:-1], iris.target[:-1]
    # Fitting SVM
    clf.fit(X, y)
    # Testing the model on the last row; [-1:] keeps the 2-D shape that predict expects
    print(clf.predict(iris.data[-1:]))
Output: [2], which corresponds to virginica
*To know more about the svm function in R visit: http://cran.r-project.org/web/packages/e1071/
**To install the sklearn library visit: http://scikit-learn.org/; to know more about sklearn svm visit: http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
Linear Regression: Iris Dataset
R (using the lm* function):
    data(iris)
    total_size <- dim(iris)[1]
    num_target <- c(rep(0,total_size))
    for (i in 1:length(num_target)){
      if(iris$Species[i]=='setosa'){num_target[i] <- 0}
      else if(iris$Species[i]=='versicolor'){num_target[i] <- 1}
      else{num_target[i] <- 2}
    }
    iris$Species <- num_target
    train_set <- iris[1:149,]
    test_set <- iris[150,]
    fit <- lm(Species ~ 0+Sepal.Length+Sepal.Width+Petal.Length+Petal.Width, data=train_set)
    coefficients(fit)
    predict.lm(fit,test_set)
Output: 1.64
Python (using the sklearn** library):
    from sklearn import linear_model
    from sklearn import datasets
    iris = datasets.load_iris()
    regr = linear_model.LinearRegression()
    X, y = iris.data[:-1], iris.target[:-1]
    regr.fit(X, y)
    print(regr.coef_)
    print(regr.predict(iris.data[-1:]))
Output: 1.65
The R formula starts with 0+, which removes the intercept, whereas LinearRegression fits an intercept by default, so the two models are not identical.
*To know more about the lm function in R visit: https://stat.ethz.ch/R-manual/R-devel/library/stats/html/lm.html
**To know more about sklearn linear regression visit: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
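The R formula `Species ~ 0 + ...` fits a model without an intercept, while scikit-learn's `LinearRegression` includes one unless told otherwise. A sketch of fitting both variants in Python follows, assuming the same 149-row training split; the variable names are chosen for this example. No output values are quoted here because they depend on the fitted coefficients.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression

iris = load_iris()
X, y = iris.data[:-1], iris.target[:-1]    # train on every row except the last

# scikit-learn default: an intercept term is estimated.
with_intercept = LinearRegression().fit(X, y)

# fit_intercept=False mirrors the "0 +" in the R formula.
without_intercept = LinearRegression(fit_intercept=False).fit(X, y)

last_row = iris.data[-1:]                  # slicing keeps the 2-D shape predict() expects
print(with_intercept.predict(last_row), without_intercept.predict(last_row))
```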
Random Forest: Iris Dataset
R (using the randomForest* package):
    library(randomForest)
    data(iris)
    total_size <- dim(iris)[1]
    num_target <- c(rep(0,total_size))
    for (i in 1:length(num_target)){
      if(iris$Species[i]=='setosa'){num_target[i] <- 0}
      else if(iris$Species[i]=='versicolor'){num_target[i] <- 1}
      else{num_target[i] <- 2}
    }
    iris$Species <- num_target
    train_set <- iris[1:149,]
    test_set <- iris[150,]
    iris.rf <- randomForest(Species ~ ., data=train_set, ntree=100, importance=TRUE, proximity=TRUE)
    print(iris.rf)
    predict(iris.rf, test_set[-5], predict.all=TRUE)
Output: 1.845
Python (using the sklearn** library):
    from sklearn import ensemble
    from sklearn import datasets
    clf = ensemble.RandomForestClassifier(n_estimators=100, max_depth=10)
    iris = datasets.load_iris()
    X, y = iris.data[:-1], iris.target[:-1]
    clf.fit(X, y)
    print(clf.predict(iris.data[-1:]))
Output: 2
Because Species was recoded as a number, randomForest fits a regression forest in the R code, so its output is a number (1.845) rather than a class label; the Python code fits a classifier and returns the class 2.
*To know more about the randomForest package in R visit: http://cran.r-project.org/web/packages/randomForest/
**To know more about sklearn random forest visit: http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html
Decision Tree: Iris Dataset
R (using the rpart* package):
    library(rpart)
    data(iris)
    sub <- c(1:149)
    fit <- rpart(Species ~ ., data = iris, subset = sub)
    fit
    predict(fit, iris[-sub,], type = "class")
Output: virginica
Python (using the sklearn** library):
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier
    clf = DecisionTreeClassifier(random_state=0)
    iris = load_iris()
    X, y = iris.data[:-1], iris.target[:-1]
    clf.fit(X, y)
    print(clf.predict(iris.data[-1:]))
Output: [2], which corresponds to virginica
*To know more about the rpart package in R visit: http://cran.r-project.org/web/packages/rpart/
**To know more about sklearn decision tree visit: http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html
Gaussian Naive Bayes: Iris Dataset
R (using the e1071* package):
    library(e1071)
    data(iris)
    trainset <- iris[1:149,]
    testset <- iris[150,]
    classifier <- naiveBayes(trainset[,1:4], trainset[,5])
    predict(classifier, testset[,-5])
Output: virginica
Python (using the sklearn** library):
    from sklearn.datasets import load_iris
    from sklearn.naive_bayes import GaussianNB
    clf = GaussianNB()
    iris = load_iris()
    X, y = iris.data[:-1], iris.target[:-1]
    clf.fit(X, y)
    print(clf.predict(iris.data[-1:]))
Output: [2], which corresponds to virginica
*To know more about the e1071 package in R visit: http://cran.r-project.org/web/packages/e1071/
**To know more about sklearn Naive Bayes visit: http://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.GaussianNB.html
K Nearest Neighbours: Iris Dataset
R (using the kknn* package):
    library(kknn)
    data(iris)
    trainset <- iris[1:149,]
    testset <- iris[150,]
    iris.kknn <- kknn(Species~., trainset, testset, distance = 1, kernel = "triangular")
    summary(iris.kknn)
    fit <- fitted(iris.kknn)
    fit
Output: virginica
Python (using the sklearn** library):
    from sklearn.datasets import load_iris
    from sklearn.neighbors import KNeighborsClassifier
    knn = KNeighborsClassifier()
    iris = load_iris()
    X, y = iris.data[:-1], iris.target[:-1]
    knn.fit(X, y)
    print(knn.predict(iris.data[-1:]))
Output: [2], which corresponds to virginica
*To know more about kknn package in R visit:
**To know more about sklearn k nearest neighbours visit: http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html
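The classifier examples above each train on 149 rows and predict the single remaining row, which shows the API but says little about accuracy. A minimal sketch of a fuller comparison follows, assuming a stratified 70/30 train/test split; the split ratio, the `random_state` values and the dictionary names are choices made for this illustration and are not from the slides.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0, stratify=iris.target
)

# Same estimators as in the slides, all with default settings.
models = {
    "SVM": SVC(),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Gaussian naive Bayes": GaussianNB(),
    "k nearest neighbours": KNeighborsClassifier(),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)   # fraction of correct predictions

for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.2f}")
```

Because every scikit-learn estimator shares the `fit`, `predict` and `score` methods, swapping one model for another only changes the constructor line.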
Thank You
For feedback please let us know at
ohri2007@gmail.com
1 of 34

Recommended

Unit 1 - R Programming (Part 2).pptx by
Unit 1 - R Programming (Part 2).pptxUnit 1 - R Programming (Part 2).pptx
Unit 1 - R Programming (Part 2).pptxMalla Reddy University
447 views67 slides
Python For Data Science Cheat Sheet by
Python For Data Science Cheat SheetPython For Data Science Cheat Sheet
Python For Data Science Cheat SheetKarlijn Willems
5.2K views1 slide
Pandas Cheat Sheet by
Pandas Cheat SheetPandas Cheat Sheet
Pandas Cheat SheetACASH1011
13.4K views2 slides
Python Pandas by
Python PandasPython Pandas
Python PandasSunil OS
613.7K views119 slides
The matplotlib Library by
The matplotlib LibraryThe matplotlib Library
The matplotlib LibraryHaim Michael
298 views16 slides
Cheat Sheet for Machine Learning in Python: Scikit-learn by
Cheat Sheet for Machine Learning in Python: Scikit-learnCheat Sheet for Machine Learning in Python: Scikit-learn
Cheat Sheet for Machine Learning in Python: Scikit-learnKarlijn Willems
4.3K views1 slide

More Related Content

What's hot

Slope one recommender on hadoop by
Slope one recommender on hadoopSlope one recommender on hadoop
Slope one recommender on hadoopYONG ZHENG
29.5K views28 slides
Python Datatypes by SujithKumar by
Python Datatypes by SujithKumarPython Datatypes by SujithKumar
Python Datatypes by SujithKumarSujith Kumar
4.6K views33 slides
Pandas by
PandasPandas
PandasJyoti shukla
537 views15 slides
Reading Data into R by
Reading Data into RReading Data into R
Reading Data into RKazuki Yoshida
3.3K views50 slides
Python pandas Library by
Python pandas LibraryPython pandas Library
Python pandas LibraryMd. Sohag Miah
6.8K views36 slides
TeXの後継として、HTML5&CSS組版〜Vivliostyleプロジェクト by
TeXの後継として、HTML5&CSS組版〜VivliostyleプロジェクトTeXの後継として、HTML5&CSS組版〜Vivliostyleプロジェクト
TeXの後継として、HTML5&CSS組版〜VivliostyleプロジェクトShinyu Murakami
3.3K views18 slides

What's hot(20)

Slope one recommender on hadoop by YONG ZHENG
Slope one recommender on hadoopSlope one recommender on hadoop
Slope one recommender on hadoop
YONG ZHENG29.5K views
Python Datatypes by SujithKumar by Sujith Kumar
Python Datatypes by SujithKumarPython Datatypes by SujithKumar
Python Datatypes by SujithKumar
Sujith Kumar4.6K views
TeXの後継として、HTML5&CSS組版〜Vivliostyleプロジェクト by Shinyu Murakami
TeXの後継として、HTML5&CSS組版〜VivliostyleプロジェクトTeXの後継として、HTML5&CSS組版〜Vivliostyleプロジェクト
TeXの後継として、HTML5&CSS組版〜Vivliostyleプロジェクト
Shinyu Murakami3.3K views
pandas - Python Data Analysis by Andrew Henshaw
pandas - Python Data Analysispandas - Python Data Analysis
pandas - Python Data Analysis
Andrew Henshaw15.8K views
Data visualization in Python by Marc Garcia
Data visualization in PythonData visualization in Python
Data visualization in Python
Marc Garcia8.7K views
Introduction to Pandas and Time Series Analysis [PyCon DE] by Alexander Hendorf
Introduction to Pandas and Time Series Analysis [PyCon DE]Introduction to Pandas and Time Series Analysis [PyCon DE]
Introduction to Pandas and Time Series Analysis [PyCon DE]
Introduction to Python Pandas for Data Analytics by Phoenix
Introduction to Python Pandas for Data AnalyticsIntroduction to Python Pandas for Data Analytics
Introduction to Python Pandas for Data Analytics
Phoenix 1.7K views
Python Functions Tutorial | Working With Functions In Python | Python Trainin... by Edureka!
Python Functions Tutorial | Working With Functions In Python | Python Trainin...Python Functions Tutorial | Working With Functions In Python | Python Trainin...
Python Functions Tutorial | Working With Functions In Python | Python Trainin...
Edureka!1.3K views
Python 3 Programming Language by Tahani Al-Manie
Python 3 Programming LanguagePython 3 Programming Language
Python 3 Programming Language
Tahani Al-Manie11.5K views
Data Analysis with Python Pandas by Neeru Mittal
Data Analysis with Python PandasData Analysis with Python Pandas
Data Analysis with Python Pandas
Neeru Mittal696 views

Similar to Python for R Users

Python for R users by
Python for R usersPython for R users
Python for R usersSatyarth Praveen
891 views39 slides
R language introduction by
R language introductionR language introduction
R language introductionShashwat Shriparv
1.4K views27 slides
R Introduction by
R IntroductionR Introduction
R IntroductionSangeetha S
92 views44 slides
PPT on Data Science Using Python by
PPT on Data Science Using PythonPPT on Data Science Using Python
PPT on Data Science Using PythonNishantKumar1179
7.5K views42 slides
R for Pythonistas (PyData NYC 2017) by
R for Pythonistas (PyData NYC 2017)R for Pythonistas (PyData NYC 2017)
R for Pythonistas (PyData NYC 2017)Christopher Roach
776 views27 slides
R Language Introduction by
R Language IntroductionR Language Introduction
R Language IntroductionKhaled Al-Shamaa
8.6K views95 slides

Similar to Python for R Users(20)

R programming & Machine Learning by AmanBhalla14
R programming & Machine LearningR programming & Machine Learning
R programming & Machine Learning
AmanBhalla14676 views
PyTorch 튜토리얼 (Touch to PyTorch) by Hansol Kang
PyTorch 튜토리얼 (Touch to PyTorch)PyTorch 튜토리얼 (Touch to PyTorch)
PyTorch 튜토리얼 (Touch to PyTorch)
Hansol Kang1.3K views
Poetry with R -- Dissecting the code by Peter Solymos
Poetry with R -- Dissecting the codePoetry with R -- Dissecting the code
Poetry with R -- Dissecting the code
Peter Solymos1K views
PPT ON MACHINE LEARNING by Ragini Ratre by RaginiRatre
PPT ON MACHINE LEARNING by Ragini RatrePPT ON MACHINE LEARNING by Ragini Ratre
PPT ON MACHINE LEARNING by Ragini Ratre
RaginiRatre265 views
Wprowadzenie do technologii Big Data / Intro to Big Data Ecosystem by Sages
Wprowadzenie do technologii Big Data / Intro to Big Data EcosystemWprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Sages1.2K views
Next Generation Programming in R by Florian Uhlitz
Next Generation Programming in RNext Generation Programming in R
Next Generation Programming in R
Florian Uhlitz1.7K views
Standardizing on a single N-dimensional array API for Python by Ralf Gommers
Standardizing on a single N-dimensional array API for PythonStandardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for Python
Ralf Gommers119 views

More from Ajay Ohri

Introduction to R ajay Ohri by
Introduction to R ajay OhriIntroduction to R ajay Ohri
Introduction to R ajay OhriAjay Ohri
1.2K views129 slides
Introduction to R by
Introduction to RIntroduction to R
Introduction to RAjay Ohri
9.2K views129 slides
Social Media and Fake News in the 2016 Election by
Social Media and Fake News in the 2016 ElectionSocial Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 ElectionAjay Ohri
1.6K views26 slides
Pyspark by
PysparkPyspark
PysparkAjay Ohri
998 views9 slides
Download Python for R Users pdf for free by
Download Python for R Users pdf for freeDownload Python for R Users pdf for free
Download Python for R Users pdf for freeAjay Ohri
1.9K views1 slide
Install spark on_windows10 by
Install spark on_windows10Install spark on_windows10
Install spark on_windows10Ajay Ohri
608 views9 slides

More from Ajay Ohri(20)

Introduction to R ajay Ohri by Ajay Ohri
Introduction to R ajay OhriIntroduction to R ajay Ohri
Introduction to R ajay Ohri
Ajay Ohri1.2K views
Introduction to R by Ajay Ohri
Introduction to RIntroduction to R
Introduction to R
Ajay Ohri9.2K views
Social Media and Fake News in the 2016 Election by Ajay Ohri
Social Media and Fake News in the 2016 ElectionSocial Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 Election
Ajay Ohri1.6K views
Download Python for R Users pdf for free by Ajay Ohri
Download Python for R Users pdf for freeDownload Python for R Users pdf for free
Download Python for R Users pdf for free
Ajay Ohri1.9K views
Install spark on_windows10 by Ajay Ohri
Install spark on_windows10Install spark on_windows10
Install spark on_windows10
Ajay Ohri608 views
Ajay ohri Resume by Ajay Ohri
Ajay ohri ResumeAjay ohri Resume
Ajay ohri Resume
Ajay Ohri1.4K views
Statistics for data scientists by Ajay Ohri
Statistics for  data scientistsStatistics for  data scientists
Statistics for data scientists
Ajay Ohri14.2K views
National seminar on emergence of internet of things (io t) trends and challe... by Ajay Ohri
National seminar on emergence of internet of things (io t)  trends and challe...National seminar on emergence of internet of things (io t)  trends and challe...
National seminar on emergence of internet of things (io t) trends and challe...
Ajay Ohri4.1K views
Tools and techniques for data science by Ajay Ohri
Tools and techniques for data scienceTools and techniques for data science
Tools and techniques for data science
Ajay Ohri10.8K views
How Big Data ,Cloud Computing ,Data Science can help business by Ajay Ohri
How Big Data ,Cloud Computing ,Data Science can help businessHow Big Data ,Cloud Computing ,Data Science can help business
How Big Data ,Cloud Computing ,Data Science can help business
Ajay Ohri4.1K views
Training in Analytics and Data Science by Ajay Ohri
Training in Analytics and Data ScienceTraining in Analytics and Data Science
Training in Analytics and Data Science
Ajay Ohri5.8K views
Tradecraft by Ajay Ohri
Tradecraft   Tradecraft
Tradecraft
Ajay Ohri989 views
Software Testing for Data Scientists by Ajay Ohri
Software Testing for Data ScientistsSoftware Testing for Data Scientists
Software Testing for Data Scientists
Ajay Ohri2.5K views
A Data Science Tutorial in Python by Ajay Ohri
A Data Science Tutorial in PythonA Data Science Tutorial in Python
A Data Science Tutorial in Python
Ajay Ohri15.9K views
How does cryptography work? by Jeroen Ooms by Ajay Ohri
How does cryptography work?  by Jeroen OomsHow does cryptography work?  by Jeroen Ooms
How does cryptography work? by Jeroen Ooms
Ajay Ohri976 views
Using R for Social Media and Sports Analytics by Ajay Ohri
Using R for Social Media and Sports AnalyticsUsing R for Social Media and Sports Analytics
Using R for Social Media and Sports Analytics
Ajay Ohri1.8K views
Kush stats alpha by Ajay Ohri
Kush stats alpha Kush stats alpha
Kush stats alpha
Ajay Ohri1.7K views
Analyze this by Ajay Ohri
Analyze thisAnalyze this
Analyze this
Ajay Ohri962 views

Recently uploaded

LIVE OAK MEMORIAL PARK.pptx by
LIVE OAK MEMORIAL PARK.pptxLIVE OAK MEMORIAL PARK.pptx
LIVE OAK MEMORIAL PARK.pptxms2332always
7 views6 slides
6498-Butun_Beyinli_Cocuq-Daniel_J.Siegel-Tina_Payne_Bryson-2011-259s.pdf by
6498-Butun_Beyinli_Cocuq-Daniel_J.Siegel-Tina_Payne_Bryson-2011-259s.pdf6498-Butun_Beyinli_Cocuq-Daniel_J.Siegel-Tina_Payne_Bryson-2011-259s.pdf
6498-Butun_Beyinli_Cocuq-Daniel_J.Siegel-Tina_Payne_Bryson-2011-259s.pdf10urkyr34
7 views259 slides
PRIVACY AWRE PERSONAL DATA STORAGE by
PRIVACY AWRE PERSONAL DATA STORAGEPRIVACY AWRE PERSONAL DATA STORAGE
PRIVACY AWRE PERSONAL DATA STORAGEantony420421
7 views56 slides
[DSC Europe 23] Aleksandar Tomcic - Adversarial Attacks by
[DSC Europe 23] Aleksandar Tomcic - Adversarial Attacks[DSC Europe 23] Aleksandar Tomcic - Adversarial Attacks
[DSC Europe 23] Aleksandar Tomcic - Adversarial AttacksDataScienceConferenc1
5 views20 slides
[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh ... by
[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh ...[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh ...
[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh ...DataScienceConferenc1
10 views18 slides
Lack of communication among family.pptx by
Lack of communication among family.pptxLack of communication among family.pptx
Lack of communication among family.pptxahmed164023
14 views10 slides

Recently uploaded(20)

LIVE OAK MEMORIAL PARK.pptx by ms2332always
LIVE OAK MEMORIAL PARK.pptxLIVE OAK MEMORIAL PARK.pptx
LIVE OAK MEMORIAL PARK.pptx
ms2332always7 views
6498-Butun_Beyinli_Cocuq-Daniel_J.Siegel-Tina_Payne_Bryson-2011-259s.pdf by 10urkyr34
6498-Butun_Beyinli_Cocuq-Daniel_J.Siegel-Tina_Payne_Bryson-2011-259s.pdf6498-Butun_Beyinli_Cocuq-Daniel_J.Siegel-Tina_Payne_Bryson-2011-259s.pdf
6498-Butun_Beyinli_Cocuq-Daniel_J.Siegel-Tina_Payne_Bryson-2011-259s.pdf
10urkyr347 views
PRIVACY AWRE PERSONAL DATA STORAGE by antony420421
PRIVACY AWRE PERSONAL DATA STORAGEPRIVACY AWRE PERSONAL DATA STORAGE
PRIVACY AWRE PERSONAL DATA STORAGE
antony4204217 views
[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh ... by DataScienceConferenc1
[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh ...[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh ...
[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh ...
Lack of communication among family.pptx by ahmed164023
Lack of communication among family.pptxLack of communication among family.pptx
Lack of communication among family.pptx
ahmed16402314 views
[DSC Europe 23] Ivan Dundovic - How To Treat Your Data As A Product.pptx by DataScienceConferenc1
[DSC Europe 23] Ivan Dundovic - How To Treat Your Data As A Product.pptx[DSC Europe 23] Ivan Dundovic - How To Treat Your Data As A Product.pptx
[DSC Europe 23] Ivan Dundovic - How To Treat Your Data As A Product.pptx
[DSC Europe 23][AI:CSI] Aleksa Stojanovic - Applying AI for Threat Detection ... by DataScienceConferenc1
[DSC Europe 23][AI:CSI] Aleksa Stojanovic - Applying AI for Threat Detection ...[DSC Europe 23][AI:CSI] Aleksa Stojanovic - Applying AI for Threat Detection ...
[DSC Europe 23][AI:CSI] Aleksa Stojanovic - Applying AI for Threat Detection ...
[DSC Europe 23] Danijela Horak - The Innovator’s Dilemma: to Build or Not to ... by DataScienceConferenc1
[DSC Europe 23] Danijela Horak - The Innovator’s Dilemma: to Build or Not to ...[DSC Europe 23] Danijela Horak - The Innovator’s Dilemma: to Build or Not to ...
[DSC Europe 23] Danijela Horak - The Innovator’s Dilemma: to Build or Not to ...
[DSC Europe 23] Zsolt Feleki - Machine Translation should we trust it.pptx by DataScienceConferenc1
[DSC Europe 23] Zsolt Feleki - Machine Translation should we trust it.pptx[DSC Europe 23] Zsolt Feleki - Machine Translation should we trust it.pptx
[DSC Europe 23] Zsolt Feleki - Machine Translation should we trust it.pptx
[DSC Europe 23][DigiHealth] Muthu Ramachandran AI and Blockchain Framework fo... by DataScienceConferenc1
[DSC Europe 23][DigiHealth] Muthu Ramachandran AI and Blockchain Framework fo...[DSC Europe 23][DigiHealth] Muthu Ramachandran AI and Blockchain Framework fo...
[DSC Europe 23][DigiHealth] Muthu Ramachandran AI and Blockchain Framework fo...
[DSC Europe 23] Rania Wazir - Opening up the box: the complexity of human int... by DataScienceConferenc1
[DSC Europe 23] Rania Wazir - Opening up the box: the complexity of human int...[DSC Europe 23] Rania Wazir - Opening up the box: the complexity of human int...
[DSC Europe 23] Rania Wazir - Opening up the box: the complexity of human int...
[DSC Europe 23] Ales Gros - Quantum and Today s security with Quantum.pdf by DataScienceConferenc1
[DSC Europe 23] Ales Gros - Quantum and Today s security with Quantum.pdf[DSC Europe 23] Ales Gros - Quantum and Today s security with Quantum.pdf
[DSC Europe 23] Ales Gros - Quantum and Today s security with Quantum.pdf
Product Research sample.pdf by AllenSingson
Product Research sample.pdfProduct Research sample.pdf
Product Research sample.pdf
AllenSingson33 views

Python for R Users

  • 1. Python for R Users By Chandan Routray As a part of internship at www.decisionstats.com
  • 2. Basic Commands Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. i Functions R Python Downloading and installing a package install.packages('name') pip install name Load a package library('name') import name as other_name Checking working directory getwd() import os os.getcwd() Setting working directory setwd() os.chdir() List files in a directory dir() os.listdir() List all objects ls() globals() Remove an object rm('name') del('object')
  • 3. Data Frame Creation R Python (Using pandas package*) Creating a data frame “df” of dimension 6x4 (6 rows and 4 columns) containing random numbers A<­ matrix(runif(24,0,1),nrow=6,ncol=4) df<­data.frame(A) Here, • runif function generates 24 random numbers between 0 to 1 • matrix function creates a matrix from those random numbers, nrow and ncol sets the numbers of rows and columns to the matrix • data.frame converts the matrix to data frame import numpy as np import pandas as pd A=np.random.randn(6,4) df=pd.DataFrame(A) Here, • np.random.randn generates a matrix of 6 rows and 4 columns; this function is a part of numpy** library • pd.DataFrame converts the matrix in to a data frame *To install Pandas library visit: http://pandas.pydata.org/; To import Pandas library type: import pandas as pd; **To import Numpy library type: import numpy as np; Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 1
  • 4. Data Frame Creation: R, Python
  • 5. Data Frame: Inspecting and Viewing Data (Python uses the pandas package*)
      Task | R | Python
      Names of the rows of data frame "df" | rownames(df) | df.index
      Names of the columns | colnames(df) | df.columns
      Top x rows of the data frame | head(df, x) | df.head(x)
      Bottom x rows of the data frame | tail(df, x) | df.tail(x)
      Dimension of the data frame | dim(df), returned as rows, columns | df.shape, returned as (rows, columns)
      Length of the data frame | length(df), the number of columns | len(df), the number of rows; len(df.columns) gives the number of columns
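One difference is easy to miss when moving from R: length(df) counts columns, whereas len(df) on a pandas DataFrame counts rows. The sketch below checks this on a small frame; the zero-filled data and the column names P, Q, R, S are placeholders for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((6, 4)), columns=["P", "Q", "R", "S"])

n_rows, n_cols = df.shape        # R: dim(df)
row_count = len(df)              # counts rows, unlike R's length(df)
col_count = len(df.columns)      # R: length(df) or ncol(df)
top_two = df.head(2)             # R: head(df, 2)

print(n_rows, n_cols, row_count, col_count, top_two.shape)
```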
  • 6. Data Frame: Inspecting and Viewing Data: R, Python
  • 7. Data Frame: Inspecting and Viewing Data (Python uses the pandas package*)
      Getting a quick summary (mean, standard deviation, etc.) of the data in data frame "df":
          R: summary(df) returns the minimum, first quartile, median, mean, third quartile and maximum
          Python: df.describe() returns the count, mean, standard deviation, minimum, 25%, 50%, 75% and maximum
      Setting the row names and column names of data frame "df":
          R: rownames(df) <- c("A", "B", "C", "D", "E", "F") sets the row names to A through F
             colnames(df) <- c("P", "Q", "R", "S") sets the column names to P, Q, R and S
          Python: df.index = ["A", "B", "C", "D", "E", "F"] sets the row names to A through F
                  df.columns = ["P", "Q", "R", "S"] sets the column names to P, Q, R and S
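As a small illustration of renaming and summarising, the sketch below labels a 6x4 frame and inspects what describe() returns. The numeric contents (0 to 23) are made up for the example so that the summary values are easy to verify by hand.

```python
import numpy as np
import pandas as pd

# Placeholder data: the numbers 0..23 laid out in 6 rows and 4 columns
df = pd.DataFrame(np.arange(24, dtype=float).reshape(6, 4))

df.index = ["A", "B", "C", "D", "E", "F"]    # R: rownames(df) <- c(...)
df.columns = ["P", "Q", "R", "S"]            # R: colnames(df) <- c(...)

summary = df.describe()                      # R: summary(df)
print(list(summary.index))
print(summary.loc["mean", "P"])              # column P holds 0, 4, ..., 20
```

The summary is itself a DataFrame, so individual statistics can be looked up by row label and column name.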
  • 8. Data Frame: Inspecting and Viewing Data: R, Python
  • 9. Data Frame: Sorting Data (Python uses the pandas package*)
      Sorting the data in data frame "df" by the column named "P":
          R: df[order(df$P), ]
          Python: df.sort_values("P") (older pandas releases used df.sort(['P']), which has since been removed)
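In current pandas releases, sorting by a column is done with sort_values, which returns a new frame and leaves the original untouched, much like df[order(df$P), ] in R. A minimal sketch with made-up data:

```python
import pandas as pd

# Hypothetical three-row frame used only for this illustration
df = pd.DataFrame({"P": [3, 1, 2], "Q": ["c", "a", "b"]})

ascending = df.sort_values("P")                     # R: df[order(df$P), ]
descending = df.sort_values("P", ascending=False)   # R: df[order(-df$P), ]

print(ascending)
print(descending)
```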
  • 10. Data Frame: Sorting Data: R, Python
  • 11. Data Frame: Data Selection (Python uses the pandas package*)
      Task | R | Python
      Slicing the rows of a data frame from row no. "x" to row no. "y" (including rows x and y) | df[x:y, ] | df[x-1:y] (Python starts counting from 0 and a slice excludes its end position)
      Selecting the columns named "X", "Y" etc. of data frame "df" | myvars <- c("X", "Y"); newdata <- df[myvars] | df.loc[:, ['X', 'Y']]
      Selecting the data from row no. "x" to "y" and column no. "a" to "b" | df[x:y, a:b] | df.iloc[x-1:y, a-1:b]
      Selecting the element at row no. "x" and column no. "y" | df[x, y] | df.iat[x-1, y-1]
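The index arithmetic is easier to follow on concrete numbers. In the sketch below, x, y, a and b are 1-based positions as they would be written in R; the frame contents (0 to 23) and column names are assumptions chosen so each selected value can be checked by eye.

```python
import numpy as np
import pandas as pd

# Cell at 0-based row i, column j holds the value 4*i + j
df = pd.DataFrame(np.arange(24).reshape(6, 4), columns=["P", "Q", "R", "S"])

x, y = 2, 4   # rows 2..4 in R's 1-based, inclusive numbering
a, b = 2, 3   # columns 2..3 in R's numbering

rows = df[x - 1:y]                    # R: df[x:y, ]
block = df.iloc[x - 1:y, a - 1:b]     # R: df[x:y, a:b]
cell = df.iat[x - 1, a - 1]           # R: df[x, a]
cols = df.loc[:, ["P", "S"]]          # R: df[c("P", "S")]

print(rows["P"].tolist(), block.values.tolist(), cell, list(cols.columns))
```

Subtracting 1 from the start converts to 0-based counting, while the end stays as is because Python slices exclude their upper bound.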
  • 12. Data Frame: Data Selection: R, Python
  • 13. Data Frame: Data Selection (Python uses the pandas package*)
      Using a single column's values to select data (column name "A"):
          R: subset(df, A > 0) selects all rows in which the value in column A is greater than 0
          Python: df[df.A > 0] does the same as the R function
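Boolean selection extends naturally to more than one condition. The sketch below uses a made-up frame with columns A and B; note that combined conditions need parentheses and the & operator rather than the keyword and.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"A": [-1.0, 0.5, 2.0, -3.0], "B": [10, 20, 30, 40]})

positive = df[df.A > 0]                           # R: subset(df, A > 0)
positive_and_big = df[(df.A > 0) & (df.B > 20)]   # R: subset(df, A > 0 & B > 20)
same_via_query = df.query("A > 0")                # string form of the same filter

print(positive)
print(positive_and_big)
```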
  • 14. Mathematical Functions (Python: import the math and numpy libraries)
      Function | R | Python
      Sum | sum(x) | math.fsum(x)
      Square root | sqrt(x) | math.sqrt(x)
      Standard deviation | sd(x) | numpy.std(x, ddof=1) (numpy.std(x) alone divides by n rather than n-1, so it does not match sd)
      Log | log(x) | math.log(x[, base])
      Mean | mean(x) | numpy.mean(x)
      Median | median(x) | numpy.median(x)
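R's sd(x) is the sample standard deviation (divisor n-1), while numpy.std(x) defaults to the population form (divisor n); numpy.std(x, ddof=1) matches R. The sketch below shows the two forms using only the standard library's statistics module, which is a substitution made here to keep the example dependency-free; the data values are invented.

```python
import math
import statistics

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # made-up sample

total = math.fsum(x)                  # R: sum(x)
mean = statistics.mean(x)             # R: mean(x)
median = statistics.median(x)         # R: median(x)
population_sd = statistics.pstdev(x)  # divisor n, like numpy.std(x)
sample_sd = statistics.stdev(x)       # divisor n-1, like R's sd(x)
log_base_2 = math.log(8, 2)           # R: log(8, base = 2)

print(total, mean, median, population_sd, sample_sd, log_base_2)
```

For this sample the squared deviations sum to 32, so the population form gives sqrt(32/8) = 2 and the sample form gives sqrt(32/7), roughly 2.14.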
  • 15. Mathematical Functions: R, Python
  • 16. Data Manipulation Functions (Python: import the math and numpy libraries)
      Task | R | Python
      Convert character variable to numeric variable | as.numeric(x) | for a single value: int(x), float(x) (long(x) exists only in Python 2); for lists, vectors etc.: list(map(int, x)), list(map(float, x))
      Convert factor/numeric variable to character variable | paste(x) | for a single value: str(x); for lists, vectors etc.: list(map(str, x))
      Check for a missing value in an object | is.na(x) | math.isnan(x)
      Delete missing values from an object | na.omit(list) | cleanedList = [x for x in list if str(x) != 'nan']
      Calculate the number of characters in a character value | nchar(x) | len(x)
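The conversions above can be chained in a short pipeline. This sketch assumes the input is a list of strings in which missing values are spelled "nan"; that encoding, and the variable names, are assumptions for the example.

```python
import math

raw = ["1.5", "2", "nan", "4.25"]                       # hypothetical character data

numbers = [float(s) for s in raw]                       # R: as.numeric(raw)
cleaned = [v for v in numbers if not math.isnan(v)]     # R: na.omit(numbers)
as_text = [str(v) for v in cleaned]                     # R: paste(cleaned)
lengths = [len(s) for s in as_text]                     # R: nchar(as_text)

print(cleaned, as_text, lengths)
```

Testing with math.isnan works on the numeric values directly, so it does not rely on how a missing value happens to print.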
  • 17. Date & Time Manipulation Functions (R: import the lubridate library; Python: import the datetime library)
      Getting the date and time at an instant:
          R: Sys.time()
          Python: datetime.datetime.now()
      Date and time in the format YYYY MM DD HH:MM:SS:
          R: d <- Sys.time()
             d_format <- ymd_hms(d)
          Python: d = datetime.datetime.now()
                  format = "%Y %m %d %H:%M:%S" (%m is the numeric month; %b would give the abbreviated month name)
                  d_format = d.strftime(format)
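In Python's datetime module, strftime turns a datetime into text and strptime parses text back into a datetime, both driven by the same format string. The sketch below uses a fixed, made-up timestamp instead of now() so the result is reproducible.

```python
from datetime import datetime

fmt = "%Y %m %d %H:%M:%S"                 # YYYY MM DD HH:MM:SS with a numeric month

d = datetime(2014, 12, 1, 9, 30, 0)       # fixed example timestamp (an assumption)
text = d.strftime(fmt)                    # datetime -> string
parsed = datetime.strptime(text, fmt)     # string -> datetime

print(text)
print(parsed == d)
```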
  • 18. Data Visualization Functions (Python: import the matplotlib library**)
      Plot | R | Python
      Scatter plot of variable1 vs variable2 | plot(variable1, variable2) | plt.scatter(variable1, variable2); plt.show()
      Boxplot for Var | boxplot(Var) | plt.boxplot(Var); plt.show()
      Histogram for Var | hist(Var) | plt.hist(Var); plt.show()
      Pie chart for Var | pie(Var) | plt.pie(Var); plt.show()
      **To import the matplotlib library type: import matplotlib.pyplot as plt
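The four plot types can also be drawn on one figure and written to an image instead of a window, which is convenient on a machine without a display. This is a sketch under assumptions: the data values are invented, and the non-interactive "Agg" backend plus an in-memory buffer stand in for plt.show().

```python
import io

import matplotlib
matplotlib.use("Agg")                  # render off-screen; no display required
import matplotlib.pyplot as plt

variable1 = [1, 2, 3, 4, 5]            # made-up data
variable2 = [2, 4, 5, 4, 6]

fig, axes = plt.subplots(2, 2, figsize=(8, 6))
axes[0, 0].scatter(variable1, variable2)   # R: plot(variable1, variable2)
axes[0, 0].set_title("Scatter")
axes[0, 1].boxplot(variable2)              # R: boxplot(Var)
axes[0, 1].set_title("Boxplot")
axes[1, 0].hist(variable2)                 # R: hist(Var)
axes[1, 0].set_title("Histogram")
axes[1, 1].pie(variable1)                  # R: pie(Var)
axes[1, 1].set_title("Pie")

buffer = io.BytesIO()
fig.savefig(buffer, format="png")          # write the figure as PNG bytes
plt.close(fig)
png_bytes = buffer.getvalue()
print(len(png_bytes) > 0)
```

In an interactive session, calling plt.show() instead of savefig displays the same figure in a window.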
  • 19. Data Visualization: Scatter Plot: R, Python
  • 20. Data Visualization: Box Plot: R, Python
  • 21. Data Visualization: Histogram: R, Python
  • 22. Data Visualization: Line Plot: R, Python
  • 23. Data Visualization: Bubble: R, Python
  • 24. Data Visualization: Bar: R, Python
  • 25. Data Visualization: Pie Chart: R, Python
  • 26. Thank You. For feedback, contact DecisionStats.com
  • 27. Coming up: Data Mining in Python and R (see draft slides afterwards)
  • 28. Machine Learning: SVM on Iris Dataset
      R (using the svm* function):
          library(e1071)
          data(iris)
          trainset <- iris[1:149, ]
          testset <- iris[150, ]
          svm.model <- svm(Species ~ ., data = trainset, cost = 100, gamma = 1, type = 'C-classification')
          svm.pred <- predict(svm.model, testset[-5])
          svm.pred
      Output: Virginica
      Python (using the sklearn** library):
          # Loading library
          from sklearn import svm
          # Importing dataset module
          from sklearn import datasets
          # Calling SVM
          clf = svm.SVC()
          # Loading the dataset
          iris = datasets.load_iris()
          # Constructing training data (all samples except the last)
          X, y = iris.data[:-1], iris.target[:-1]
          # Fitting SVM
          clf.fit(X, y)
          # Testing the model on the held-out last sample (kept 2-D with a slice)
          print(clf.predict(iris.data[-1:]))
      Output: 2, corresponds to Virginica
      *To know more about the svm function in R visit: http://cran.r-project.org/web/packages/e1071/
      **To install the sklearn library visit: http://scikit-learn.org/; to know more about sklearn svm visit: http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
  • 29. Linear Regression: Iris Dataset
      R (using the lm* function):
          data(iris)
          total_size <- dim(iris)[1]
          num_target <- c(rep(0, total_size))
          for (i in 1:length(num_target)) {
            if (iris$Species[i] == 'setosa') {num_target[i] <- 0}
            else if (iris$Species[i] == 'versicolor') {num_target[i] <- 1}
            else {num_target[i] <- 2}
          }
          iris$Species <- num_target
          train_set <- iris[1:149, ]
          test_set <- iris[150, ]
          fit <- lm(Species ~ 0 + Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data = train_set)
          coefficients(fit)
          predict.lm(fit, test_set)
      Output: 1.64
      Python (using the sklearn** library):
          from sklearn import linear_model
          from sklearn import datasets
          iris = datasets.load_iris()
          regr = linear_model.LinearRegression()
          X, y = iris.data[:-1], iris.target[:-1]
          regr.fit(X, y)
          print(regr.coef_)
          print(regr.predict(iris.data[-1:]))
      Output: 1.65
      The R formula omits the intercept (0 + ...), whereas LinearRegression fits one by default, so the two models are not identical.
      *To know more about the lm function in R visit: https://stat.ethz.ch/R-manual/R-devel/library/stats/html/lm.html
      **To know more about sklearn linear regression visit: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
  • 30. Random Forest: Iris Dataset
      R (using the randomForest* package):
          library(randomForest)
          data(iris)
          total_size <- dim(iris)[1]
          num_target <- c(rep(0, total_size))
          for (i in 1:length(num_target)) {
            if (iris$Species[i] == 'setosa') {num_target[i] <- 0}
            else if (iris$Species[i] == 'versicolor') {num_target[i] <- 1}
            else {num_target[i] <- 2}
          }
          iris$Species <- num_target
          train_set <- iris[1:149, ]
          test_set <- iris[150, ]
          iris.rf <- randomForest(Species ~ ., data = train_set, ntree = 100, importance = TRUE, proximity = TRUE)
          print(iris.rf)
          predict(iris.rf, test_set[-5], predict.all = TRUE)
      Output: 1.845
      Python (using the sklearn** library):
          from sklearn import ensemble
          from sklearn import datasets
          clf = ensemble.RandomForestClassifier(n_estimators=100, max_depth=10)
          iris = datasets.load_iris()
          X, y = iris.data[:-1], iris.target[:-1]
          clf.fit(X, y)
          print(clf.predict(iris.data[-1:]))
      Output: 2
      Because Species was recoded as a number, randomForest fits a regression forest in the R code, which is why its output is 1.845 rather than a class label.
      *To know more about the randomForest package in R visit: http://cran.r-project.org/web/packages/randomForest/
      **To know more about sklearn random forest visit: http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html
  • 31. Decision Tree: Iris Dataset
      R (using the rpart* package):
          library(rpart)
          data(iris)
          sub <- c(1:149)
          fit <- rpart(Species ~ ., data = iris, subset = sub)
          fit
          predict(fit, iris[-sub, ], type = "class")
      Output: Virginica
      Python (using the sklearn** library):
          from sklearn.datasets import load_iris
          from sklearn.tree import DecisionTreeClassifier
          clf = DecisionTreeClassifier(random_state=0)
          iris = load_iris()
          X, y = iris.data[:-1], iris.target[:-1]
          clf.fit(X, y)
          print(clf.predict(iris.data[-1:]))
      Output: 2, corresponds to virginica
      *To know more about the rpart package in R visit: http://cran.r-project.org/web/packages/rpart/
      **To know more about sklearn decision tree visit: http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html
  • 32. Gaussian Naive Bayes: Iris Dataset
      R (using the e1071* package):
          library(e1071)
          data(iris)
          trainset <- iris[1:149, ]
          testset <- iris[150, ]
          classifier <- naiveBayes(trainset[, 1:4], trainset[, 5])
          predict(classifier, testset[, -5])
      Output: Virginica
      Python (using the sklearn** library):
          from sklearn.datasets import load_iris
          from sklearn.naive_bayes import GaussianNB
          clf = GaussianNB()
          iris = load_iris()
          X, y = iris.data[:-1], iris.target[:-1]
          clf.fit(X, y)
          print(clf.predict(iris.data[-1:]))
      Output: 2, corresponds to virginica
      *To know more about the e1071 package in R visit: http://cran.r-project.org/web/packages/e1071/
      **To know more about sklearn Naive Bayes visit: http://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.GaussianNB.html
  • 33. K Nearest Neighbours: Iris Dataset
      R (using the kknn* package):
          library(kknn)
          data(iris)
          trainset <- iris[1:149, ]
          testset <- iris[150, ]
          iris.kknn <- kknn(Species ~ ., trainset, testset, distance = 1, kernel = "triangular")
          summary(iris.kknn)
          fit <- fitted(iris.kknn)
          fit
      Output: Virginica
      Python (using the sklearn** library):
          from sklearn.datasets import load_iris
          from sklearn.neighbors import KNeighborsClassifier
          knn = KNeighborsClassifier()
          iris = load_iris()
          X, y = iris.data[:-1], iris.target[:-1]
          knn.fit(X, y)
          print(knn.predict(iris.data[-1:]))
      Output: 2, corresponds to virginica
      *To know more about the kknn package in R visit:
      **To know more about sklearn k nearest neighbours visit: http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html
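The Python snippets in the machine-learning slides share one pattern: train on every iris sample except the last, then predict the last one. Under Python 3 and recent scikit-learn releases, print is a function and predict expects a 2-D array, so the held-out sample is taken with the slice iris.data[-1:] rather than iris.data[-1]. The sketch below applies that pattern to four of the classifiers; default hyperparameters are an assumption here, so the predicted labels are printed rather than asserted.

```python
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, y_train = iris.data[:-1], iris.target[:-1]   # first 149 samples
X_test = iris.data[-1:]                               # last sample, shape (1, 4)

models = {
    "SVM": SVC(),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Gaussian naive Bayes": GaussianNB(),
    "K nearest neighbours": KNeighborsClassifier(),
}

predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    class_index = int(model.predict(X_test)[0])       # numeric class: 0, 1 or 2
    predictions[name] = str(iris.target_names[class_index])
    print(name, class_index, predictions[name])
```

Looking the numeric class up in iris.target_names gives the species name directly, which makes the output comparable with the R results.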
  • 34. Thank You. For feedback, please let us know at ohri2007@gmail.com