SlideShare a Scribd company logo
1 of 34
Download to read offline
Python for R Users
By
Chandan Routray
As a part of internship at
www.decisionstats.com
Basic Commands
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. i
Functions R Python
Downloading and installing a package install.packages('name') pip install name
Load a package library('name') import name as other_name
Checking working directory getwd() import os
os.getcwd()
Setting working directory setwd() os.chdir()
List files in a directory dir() os.listdir()
List all objects ls() globals()
Remove an object rm('name') del('object')
Data Frame Creation
R Python
(Using pandas package*)
Creating a data frame “df” of
dimension 6x4 (6 rows and 4
columns) containing random
numbers
A<­
matrix(runif(24,0,1),nrow=6,ncol=4)
df<­data.frame(A)
Here,
• runif function generates 24 random
numbers between 0 to 1
• matrix function creates a matrix from
those random numbers, nrow and ncol
sets the numbers of rows and columns
to the matrix
• data.frame converts the matrix to data
frame
import numpy as np
import pandas as pd
A=np.random.randn(6,4)
df=pd.DataFrame(A)
Here,
• np.random.randn generates a
matrix of 6 rows and 4 columns;
this function is a part of numpy**
library
• pd.DataFrame converts the matrix
in to a data frame
*To install Pandas library visit: http://pandas.pydata.org/; To import Pandas library type: import pandas as pd;
**To import Numpy library type: import numpy as np;
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 1
Data Frame Creation
R Python
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 2
Data Frame: Inspecting and Viewing Data
R Python
(Using pandas package*)
Getting the names of rows and
columns of data frame “df”
rownames(df)
returns the name of the rows
colnames(df)
returns the name of the columns
df.index
returns the name of the rows
df.columns
returns the name of the columns
Seeing the top and bottom “x”
rows of the data frame “df”
head(df,x)
returns top x rows of data frame
tail(df,x)
returns bottom x rows of data frame
df.head(x)
returns top x rows of data frame
df.tail(x)
returns bottom x rows of data frame
Getting dimension of data frame
“df”
dim(df)
returns in this format : rows, columns
df.shape
returns in this format : (rows,
columns)
Length of data frame “df” length(df)
returns no. of columns in data frames
len(df)
returns no. of columns in data frames
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 3
Data Frame: Inspecting and Viewing Data
R Python
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 4
Data Frame: Inspecting and Viewing Data
R Python
(Using pandas package*)
Getting quick summary(like
mean, std. deviation etc. ) of
data in the data frame “df”
summary(df)
returns mean, median , maximum,
minimum, first quarter and third quarter
df.describe()
returns count, mean, standard
deviation, maximum, minimum, 25%,
50% and 75%
Setting row names and columns
names of the data frame “df”
rownames(df)=c(“A”, ”B”, “C”, ”D”, 
“E”, ”F”)
set the row names to A, B, C, D and E
colnames=c(“P”, ”Q”, “R”, ”S”)
set the column names to P, Q, R and S
df.index=[“A”, ”B”, “C”, ”D”, 
“E”, ”F”]
set the row names to A, B, C, D and
E
df.columns=[“P”, ”Q”, “R”, ”S”]
set the column names to P, Q, R and
S
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 5
Data Frame: Inspecting and Viewing Data
R Python
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 6
Data Frame: Sorting Data
R Python
(Using pandas package*)
Sorting the data in the data
frame “df” by column name “P”
df[order(df$P),] df.sort(['P'])
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 7
Data Frame: Sorting Data
R Python
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 8
Data Frame: Data Selection
R Python
(Using pandas package*)
Slicing the rows of a data frame
from row no. “x” to row no.
“y”(including row x and y)
df[x:y,] df[x­1:y]
Python starts counting from 0
Slicing the columns name “x”,”Y”
etc. of a data frame “df”
myvars <­ c(“X”,”Y”)
newdata <­ df[myvars]
df.loc[:,[‘X’,’Y’]]
Selecting the the data from row
no. “x” to “y” and column no. “a”
to “b”
df[x:y,a:b] df.iloc[x­1:y,a­1,b]
Selecting the element at row no.
“x” and column no. “y”
df[x,y] df.iat[x­1,y­1]
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 9
Data Frame: Data Selection
R Python
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 10
Data Frame: Data Selection
R Python
(Using pandas package*)
Using a single column’s values
to select data, column name “A”
subset(df,A>0)
It will select the all the rows in which the
corresponding value in column A of that
row is greater than 0
df[df.A > 0]
It will do the same as the R function
PythonR
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 11
Mathematical Functions
Functions R Python
(import math and numpy library)
Sum sum(x) math.fsum(x)
Square Root sqrt(x) math.sqrt(x)
Standard Deviation sd(x) numpy.std(x)
Log log(x) math.log(x[,base])
Mean mean(x) numpy.mean(x)
Median median(x) numpy.median(x)
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 12
Mathematical Functions
R Python
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 13
Data Manipulation
Functions R Python
(import math and numpy library)
Convert character variable to numeric variable as.numeric(x) For a single value: int(x), long(x), float(x)
For list, vectors etc.: map(int,x), map(float,x)
Convert factor/numeric variable to character
variable
paste(x) For a single value: str(x)
For list, vectors etc.: map(str,x)
Check missing value in an object is.na(x) math.isnan(x)
Delete missing value from an object na.omit(list) cleanedList = [x for x in list if str(x) !
= 'nan']
Calculate the number of characters in character
value
nchar(x) len(x)
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 14
Date & Time Manipulation
Functions R
(import lubridate library)
Python
(import datetime library)
Getting time and date at an instant Sys.time() datetime.datetime.now()
Parsing date and time in format:
YYYY MM DD HH:MM:SS
d<­Sys.time()
d_format<­ymd_hms(d)
d=datetime.datetime.now()
format= “%Y %b %d  %H:%M:%S”
d_format=d.strftime(format)
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 15
Data Visualization
Functions R Python
(import matplotlib library**)
Scatter Plot variable1 vs variable2 plot(variable1,variable2) plt.scatter(variable1,variable2)
plt.show()
Boxplot for Var boxplot(Var) plt.boxplot(Var)
plt.show()
Histogram for Var hist(Var) plt.hist(Var)
plt.show()
Pie Chart for Var pie(Var) from pylab import *
pie(Var)
show()
** To import matplotlib library type: import matplotlib.pyplot as plt
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 16
Data Visualization: Scatter Plot
R Python
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 17
Data Visualization: Box Plot
R Python
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 18
Data Visualization: Histogram
R Python
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 19
Data Visualization: Line Plot
R Python
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 20
Data Visualization: Bubble
R Python
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 22
Data Visualization: Bar
R Python
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 21
Data Visualization: Pie Chart
R Python
Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 23
Thank You
For feedback contact
DecisionStats.com
Coming up
● Data Mining in Python and R ( see draft slides
afterwards)
Machine Learning: SVM on Iris Dataset
*To know more about svm function in R visit: http://cran.r-project.org/web/packages/e1071/
** To install sklearn library visit : http://scikit-learn.org/, To know more about sklearn svm visit: http://scikit-
learn.org/stable/modules/generated/sklearn.svm.SVC.html
R(Using svm* function) Python(Using sklearn** library)
library(e1071)
data(iris)
trainset <­iris[1:149,]
testset <­iris[150,]
svm.model <­ svm(Species ~ ., data = 
trainset, cost = 100, gamma = 1, type= 'C­
classification')
svm.pred<­ predict(svm.model,testset[­5])
svm.pred
#Loading Library
from sklearn import svm
#Importing Dataset
from sklearn import datasets
#Calling SVM
clf = svm.SVC()
#Loading the package
iris = datasets.load_iris()
#Constructing training data
X, y = iris.data[:­1], iris.target[:­1]
#Fitting SVM
clf.fit(X, y)
#Testing the model on test data
print clf.predict(iris.data[­1])
Output: Virginica Output: 2, corresponds to Virginica
Linear Regression: Iris Dataset
*To know more about lm function in R visit: https://stat.ethz.ch/R-manual/R-devel/library/stats/html/lm.html
** ** To know more about sklearn linear regression visit : http://scikit-
learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
R(Using lm* function) Python(Using sklearn** library)
data(iris)
total_size<­dim(iris)[1]
num_target<­c(rep(0,total_size))
for (i in 1:length(num_target)){
  if(iris$Species[i]=='setosa'){num_target[i]<­0}
  else if(iris$Species[i]=='versicolor')
{num_target[i]<­1}
  else{num_target[i]<­2}
}
iris$Species<­num_target
train_set <­iris[1:149,]
test_set <­iris[150,]
fit<­lm(Species ~ 0+Sepal.Length+ Sepal.Width+ 
Petal.Length+ Petal.Width , data=train_set)
coefficients(fit)
predict.lm(fit,test_set)
from sklearn import linear_model
from sklearn import datasets
iris = datasets.load_iris()
regr = linear_model.LinearRegression()
X, y = iris.data[:­1], iris.target[:­1]
regr.fit(X, y)
print(regr.coef_)
print regr.predict(iris.data[­1])
Output: 1.64 Output: 1.65
Random forest: Iris Dataset
*To know more about randomForest package in R visit: http://cran.r-project.org/web/packages/randomForest/
** To know more about sklearn random forest visit : http://scikit-
learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html
R(Using randomForest* package) Python(Using sklearn** library)
library(randomForest)
data(iris)
total_size<­dim(iris)[1]
num_target<­c(rep(0,total_size))
for (i in 1:length(num_target)){
  if(iris$Species[i]=='setosa'){num_target[i]<­0}
  else if(iris$Species[i]=='versicolor')
{num_target[i]<­1}
  else{num_target[i]<­2}}
iris$Species<­num_target
train_set <­iris[1:149,]
test_set <­iris[150,]
iris.rf <­ randomForest(Species ~ ., 
data=train_set,ntree=100,importance=TRUE,
                        proximity=TRUE)
print(iris.rf)
predict(iris.rf, test_set[­5], predict.all=TRUE)
from sklearn import ensemble
from sklearn import datasets
clf = 
ensemble.RandomForestClassifier(n_estimato
rs=100,max_depth=10)
iris = datasets.load_iris()
X, y = iris.data[:­1], iris.target[:­1]
clf.fit(X, y)
print clf.predict(iris.data[­1])
Output: 1.845 Output: 2
Decision Tree: Iris Dataset
*To know more about rpart package in R visit: http://cran.r-project.org/web/packages/rpart/
** To know more about sklearn desicion tree visit : http://scikit-
learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html
R(Using rpart* package) Python(Using sklearn** library)
library(rpart)
data(iris)
sub <­ c(1:149)
fit <­ rpart(Species ~ ., data = iris, 
subset = sub)
fit
predict(fit, iris[­sub,], type = "class")
from sklearn.datasets import load_iris
from sklearn.tree import 
DecisionTreeClassifier
clf = 
DecisionTreeClassifier(random_state=0)
iris = datasets.load_iris()
X, y = iris.data[:­1], iris.target[:­1]
clf.fit(X, y)
print clf.predict(iris.data[­1])
Output: Virginica Output: 2, corresponds to virginica
Gaussian Naive Bayes: Iris Dataset
*To know more about e1071 package in R visit: http://cran.r-project.org/web/packages/e1071/
** To know more about sklearn Naive Bayes visit : http://scikit-
learn.org/stable/modules/generated/sklearn.naive_bayes.GaussianNB.html
R(Using e1071* package) Python(Using sklearn** library)
library(e1071)
data(iris)
trainset <­iris[1:149,]
testset <­iris[150,]
classifier<­naiveBayes(trainset[,1:4], 
trainset[,5]) 
predict(classifier, testset[,­5])
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB
clf = GaussianNB()
iris = datasets.load_iris()
X, y = iris.data[:­1], iris.target[:­1]
clf.fit(X, y)
print clf.predict(iris.data[­1])
Output: Virginica Output: 2, corresponds to virginica
K Nearest Neighbours: Iris Dataset
*To know more about kknn package in R visit:
** To know more about sklearn k nearest neighbours visit : http://scikit-
learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html
R(Using kknn* package) Python(Using sklearn** library)
library(kknn)
data(iris)
trainset <­iris[1:149,]
testset <­iris[150,]
iris.kknn <­ kknn(Species~., 
trainset,testset, distance = 1,            
    kernel = "triangular")
summary(iris.kknn)
fit <­ fitted(iris.kknn)
fit
from sklearn.datasets import load_iris
from sklearn.neighbors import 
KNeighborsClassifier
knn = KNeighborsClassifier()
iris = datasets.load_iris()
X, y = iris.data[:­1], iris.target[:­1]
knn.fit(X,y) 
print knn.predict(iris.data[­1])
Output: Virginica Output: 2, corresponds to virginica
Thank You
For feedback please let us know at
ohri2007@gmail.com

More Related Content

What's hot

計量時系列分析の立場からビジネスの現場のデータを見てみよう - 30th Tokyo Webmining
計量時系列分析の立場からビジネスの現場のデータを見てみよう - 30th Tokyo Webmining計量時系列分析の立場からビジネスの現場のデータを見てみよう - 30th Tokyo Webmining
計量時系列分析の立場からビジネスの現場のデータを見てみよう - 30th Tokyo WebminingTakashi J OZAKI
 
機械学習による統計的実験計画(ベイズ最適化を中心に)
機械学習による統計的実験計画(ベイズ最適化を中心に)機械学習による統計的実験計画(ベイズ最適化を中心に)
機械学習による統計的実験計画(ベイズ最適化を中心に)Kota Matsui
 
Visual Studio CodeでRを使う
Visual Studio CodeでRを使うVisual Studio CodeでRを使う
Visual Studio CodeでRを使うAtsushi Hayakawa
 
トピックモデルを用いた 潜在ファッション嗜好の推定
トピックモデルを用いた 潜在ファッション嗜好の推定トピックモデルを用いた 潜在ファッション嗜好の推定
トピックモデルを用いた 潜在ファッション嗜好の推定Takashi Kaneda
 
ggplot2再入門(2015年バージョン)
ggplot2再入門(2015年バージョン)ggplot2再入門(2015年バージョン)
ggplot2再入門(2015年バージョン)yutannihilation
 
最適化計算の概要まとめ
最適化計算の概要まとめ最適化計算の概要まとめ
最適化計算の概要まとめYuichiro MInato
 
Devsumi 2018summer
Devsumi 2018summerDevsumi 2018summer
Devsumi 2018summerHarada Kei
 
Deep Learningによる画像認識革命 ー歴史・最新理論から実践応用までー
Deep Learningによる画像認識革命 ー歴史・最新理論から実践応用までーDeep Learningによる画像認識革命 ー歴史・最新理論から実践応用までー
Deep Learningによる画像認識革命 ー歴史・最新理論から実践応用までーnlab_utokyo
 
臨床家が知っておくべき臨床疫学・統計
臨床家が知っておくべき臨床疫学・統計臨床家が知っておくべき臨床疫学・統計
臨床家が知っておくべき臨床疫学・統計Yasuaki Sagara
 
12. Diffusion Model の数学的基礎.pdf
12. Diffusion Model の数学的基礎.pdf12. Diffusion Model の数学的基礎.pdf
12. Diffusion Model の数学的基礎.pdf幸太朗 岩澤
 
StanとRで折れ線回帰──空間的視点取得課題の反応時間データを説明する階層ベイズモデルを例に──【※Docswellにも同じものを上げています】
StanとRで折れ線回帰──空間的視点取得課題の反応時間データを説明する階層ベイズモデルを例に──【※Docswellにも同じものを上げています】StanとRで折れ線回帰──空間的視点取得課題の反応時間データを説明する階層ベイズモデルを例に──【※Docswellにも同じものを上げています】
StanとRで折れ線回帰──空間的視点取得課題の反応時間データを説明する階層ベイズモデルを例に──【※Docswellにも同じものを上げています】Hiroyuki Muto
 
距離と分類の話
距離と分類の話距離と分類の話
距離と分類の話考司 小杉
 
機械学習を用いた予測モデル構築・評価
機械学習を用いた予測モデル構築・評価機械学習を用いた予測モデル構築・評価
機械学習を用いた予測モデル構築・評価Shintaro Fukushima
 
これからの Vision & Language ~ Acadexit した4つの理由
これからの Vision & Language ~ Acadexit した4つの理由これからの Vision & Language ~ Acadexit した4つの理由
これからの Vision & Language ~ Acadexit した4つの理由Yoshitaka Ushiku
 
fastTextの実装を見てみた
fastTextの実装を見てみたfastTextの実装を見てみた
fastTextの実装を見てみたYoshihiko Shiraki
 
『逆転オセロニア 』における、機械学習モデルを用いたデッキのアーキタイプ抽出とゲーム運用への活用
『逆転オセロニア 』における、機械学習モデルを用いたデッキのアーキタイプ抽出とゲーム運用への活用『逆転オセロニア 』における、機械学習モデルを用いたデッキのアーキタイプ抽出とゲーム運用への活用
『逆転オセロニア 』における、機械学習モデルを用いたデッキのアーキタイプ抽出とゲーム運用への活用RyoAdachi
 
{tidygraph}と{ggraph}による モダンなネットワーク分析(未公開ver)
{tidygraph}と{ggraph}による モダンなネットワーク分析(未公開ver){tidygraph}と{ggraph}による モダンなネットワーク分析(未公開ver)
{tidygraph}と{ggraph}による モダンなネットワーク分析(未公開ver)Takashi Kitano
 
統計学における相関分析と仮説検定の基本的な考え方とその実践
統計学における相関分析と仮説検定の基本的な考え方とその実践統計学における相関分析と仮説検定の基本的な考え方とその実践
統計学における相関分析と仮説検定の基本的な考え方とその実践id774
 
なぜ統計学がビジネスの 意思決定において大事なのか?
なぜ統計学がビジネスの 意思決定において大事なのか?なぜ統計学がビジネスの 意思決定において大事なのか?
なぜ統計学がビジネスの 意思決定において大事なのか?Takashi J OZAKI
 

What's hot (20)

計量時系列分析の立場からビジネスの現場のデータを見てみよう - 30th Tokyo Webmining
計量時系列分析の立場からビジネスの現場のデータを見てみよう - 30th Tokyo Webmining計量時系列分析の立場からビジネスの現場のデータを見てみよう - 30th Tokyo Webmining
計量時系列分析の立場からビジネスの現場のデータを見てみよう - 30th Tokyo Webmining
 
機械学習による統計的実験計画(ベイズ最適化を中心に)
機械学習による統計的実験計画(ベイズ最適化を中心に)機械学習による統計的実験計画(ベイズ最適化を中心に)
機械学習による統計的実験計画(ベイズ最適化を中心に)
 
Visual Studio CodeでRを使う
Visual Studio CodeでRを使うVisual Studio CodeでRを使う
Visual Studio CodeでRを使う
 
トピックモデルを用いた 潜在ファッション嗜好の推定
トピックモデルを用いた 潜在ファッション嗜好の推定トピックモデルを用いた 潜在ファッション嗜好の推定
トピックモデルを用いた 潜在ファッション嗜好の推定
 
R seminar on igraph
R seminar on igraphR seminar on igraph
R seminar on igraph
 
ggplot2再入門(2015年バージョン)
ggplot2再入門(2015年バージョン)ggplot2再入門(2015年バージョン)
ggplot2再入門(2015年バージョン)
 
最適化計算の概要まとめ
最適化計算の概要まとめ最適化計算の概要まとめ
最適化計算の概要まとめ
 
Devsumi 2018summer
Devsumi 2018summerDevsumi 2018summer
Devsumi 2018summer
 
Deep Learningによる画像認識革命 ー歴史・最新理論から実践応用までー
Deep Learningによる画像認識革命 ー歴史・最新理論から実践応用までーDeep Learningによる画像認識革命 ー歴史・最新理論から実践応用までー
Deep Learningによる画像認識革命 ー歴史・最新理論から実践応用までー
 
臨床家が知っておくべき臨床疫学・統計
臨床家が知っておくべき臨床疫学・統計臨床家が知っておくべき臨床疫学・統計
臨床家が知っておくべき臨床疫学・統計
 
12. Diffusion Model の数学的基礎.pdf
12. Diffusion Model の数学的基礎.pdf12. Diffusion Model の数学的基礎.pdf
12. Diffusion Model の数学的基礎.pdf
 
StanとRで折れ線回帰──空間的視点取得課題の反応時間データを説明する階層ベイズモデルを例に──【※Docswellにも同じものを上げています】
StanとRで折れ線回帰──空間的視点取得課題の反応時間データを説明する階層ベイズモデルを例に──【※Docswellにも同じものを上げています】StanとRで折れ線回帰──空間的視点取得課題の反応時間データを説明する階層ベイズモデルを例に──【※Docswellにも同じものを上げています】
StanとRで折れ線回帰──空間的視点取得課題の反応時間データを説明する階層ベイズモデルを例に──【※Docswellにも同じものを上げています】
 
距離と分類の話
距離と分類の話距離と分類の話
距離と分類の話
 
機械学習を用いた予測モデル構築・評価
機械学習を用いた予測モデル構築・評価機械学習を用いた予測モデル構築・評価
機械学習を用いた予測モデル構築・評価
 
これからの Vision & Language ~ Acadexit した4つの理由
これからの Vision & Language ~ Acadexit した4つの理由これからの Vision & Language ~ Acadexit した4つの理由
これからの Vision & Language ~ Acadexit した4つの理由
 
fastTextの実装を見てみた
fastTextの実装を見てみたfastTextの実装を見てみた
fastTextの実装を見てみた
 
『逆転オセロニア 』における、機械学習モデルを用いたデッキのアーキタイプ抽出とゲーム運用への活用
『逆転オセロニア 』における、機械学習モデルを用いたデッキのアーキタイプ抽出とゲーム運用への活用『逆転オセロニア 』における、機械学習モデルを用いたデッキのアーキタイプ抽出とゲーム運用への活用
『逆転オセロニア 』における、機械学習モデルを用いたデッキのアーキタイプ抽出とゲーム運用への活用
 
{tidygraph}と{ggraph}による モダンなネットワーク分析(未公開ver)
{tidygraph}と{ggraph}による モダンなネットワーク分析(未公開ver){tidygraph}と{ggraph}による モダンなネットワーク分析(未公開ver)
{tidygraph}と{ggraph}による モダンなネットワーク分析(未公開ver)
 
統計学における相関分析と仮説検定の基本的な考え方とその実践
統計学における相関分析と仮説検定の基本的な考え方とその実践統計学における相関分析と仮説検定の基本的な考え方とその実践
統計学における相関分析と仮説検定の基本的な考え方とその実践
 
なぜ統計学がビジネスの 意思決定において大事なのか?
なぜ統計学がビジネスの 意思決定において大事なのか?なぜ統計学がビジネスの 意思決定において大事なのか?
なぜ統計学がビジネスの 意思決定において大事なのか?
 

Similar to Python for R Users

PPT on Data Science Using Python
PPT on Data Science Using PythonPPT on Data Science Using Python
PPT on Data Science Using PythonNishantKumar1179
 
R for Pythonistas (PyData NYC 2017)
R for Pythonistas (PyData NYC 2017)R for Pythonistas (PyData NYC 2017)
R for Pythonistas (PyData NYC 2017)Christopher Roach
 
pandas dataframe notes.pdf
pandas dataframe notes.pdfpandas dataframe notes.pdf
pandas dataframe notes.pdfAjeshSurejan2
 
Python Functions Tutorial | Working With Functions In Python | Python Trainin...
Python Functions Tutorial | Working With Functions In Python | Python Trainin...Python Functions Tutorial | Working With Functions In Python | Python Trainin...
Python Functions Tutorial | Working With Functions In Python | Python Trainin...Edureka!
 
Python Pandas
Python PandasPython Pandas
Python PandasSunil OS
 
What's new in Python 3.11
What's new in Python 3.11What's new in Python 3.11
What's new in Python 3.11Henry Schreiner
 
Python-for-Data-Analysis.pptx
Python-for-Data-Analysis.pptxPython-for-Data-Analysis.pptx
Python-for-Data-Analysis.pptxParveenShaik21
 
R programming & Machine Learning
R programming & Machine LearningR programming & Machine Learning
R programming & Machine LearningAmanBhalla14
 
PyTorch 튜토리얼 (Touch to PyTorch)
PyTorch 튜토리얼 (Touch to PyTorch)PyTorch 튜토리얼 (Touch to PyTorch)
PyTorch 튜토리얼 (Touch to PyTorch)Hansol Kang
 
Poetry with R -- Dissecting the code
Poetry with R -- Dissecting the codePoetry with R -- Dissecting the code
Poetry with R -- Dissecting the codePeter Solymos
 
PPT ON MACHINE LEARNING by Ragini Ratre
PPT ON MACHINE LEARNING by Ragini RatrePPT ON MACHINE LEARNING by Ragini Ratre
PPT ON MACHINE LEARNING by Ragini RatreRaginiRatre
 
Wprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data EcosystemWprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data EcosystemSages
 
python-numpyandpandas-170922144956 (1).pptx
python-numpyandpandas-170922144956 (1).pptxpython-numpyandpandas-170922144956 (1).pptx
python-numpyandpandas-170922144956 (1).pptxAkashgupta517936
 
Next Generation Programming in R
Next Generation Programming in RNext Generation Programming in R
Next Generation Programming in RFlorian Uhlitz
 
Standardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonStandardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonRalf Gommers
 

Similar to Python for R Users (20)

Python for R users
Python for R usersPython for R users
Python for R users
 
R language introduction
R language introductionR language introduction
R language introduction
 
R Introduction
R IntroductionR Introduction
R Introduction
 
PPT on Data Science Using Python
PPT on Data Science Using PythonPPT on Data Science Using Python
PPT on Data Science Using Python
 
R for Pythonistas (PyData NYC 2017)
R for Pythonistas (PyData NYC 2017)R for Pythonistas (PyData NYC 2017)
R for Pythonistas (PyData NYC 2017)
 
R Language Introduction
R Language IntroductionR Language Introduction
R Language Introduction
 
pandas dataframe notes.pdf
pandas dataframe notes.pdfpandas dataframe notes.pdf
pandas dataframe notes.pdf
 
Python Functions Tutorial | Working With Functions In Python | Python Trainin...
Python Functions Tutorial | Working With Functions In Python | Python Trainin...Python Functions Tutorial | Working With Functions In Python | Python Trainin...
Python Functions Tutorial | Working With Functions In Python | Python Trainin...
 
Python Pandas
Python PandasPython Pandas
Python Pandas
 
What's new in Python 3.11
What's new in Python 3.11What's new in Python 3.11
What's new in Python 3.11
 
Python-for-Data-Analysis.pptx
Python-for-Data-Analysis.pptxPython-for-Data-Analysis.pptx
Python-for-Data-Analysis.pptx
 
R programming & Machine Learning
R programming & Machine LearningR programming & Machine Learning
R programming & Machine Learning
 
PyTorch 튜토리얼 (Touch to PyTorch)
PyTorch 튜토리얼 (Touch to PyTorch)PyTorch 튜토리얼 (Touch to PyTorch)
PyTorch 튜토리얼 (Touch to PyTorch)
 
Poetry with R -- Dissecting the code
Poetry with R -- Dissecting the codePoetry with R -- Dissecting the code
Poetry with R -- Dissecting the code
 
PPT ON MACHINE LEARNING by Ragini Ratre
PPT ON MACHINE LEARNING by Ragini RatrePPT ON MACHINE LEARNING by Ragini Ratre
PPT ON MACHINE LEARNING by Ragini Ratre
 
Wprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data EcosystemWprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
 
python-numpyandpandas-170922144956 (1).pptx
python-numpyandpandas-170922144956 (1).pptxpython-numpyandpandas-170922144956 (1).pptx
python-numpyandpandas-170922144956 (1).pptx
 
Lecture 9.pptx
Lecture 9.pptxLecture 9.pptx
Lecture 9.pptx
 
Next Generation Programming in R
Next Generation Programming in RNext Generation Programming in R
Next Generation Programming in R
 
Standardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonStandardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for Python
 

More from Ajay Ohri

Introduction to R ajay Ohri
Introduction to R ajay OhriIntroduction to R ajay Ohri
Introduction to R ajay OhriAjay Ohri
 
Introduction to R
Introduction to RIntroduction to R
Introduction to RAjay Ohri
 
Social Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 ElectionSocial Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 ElectionAjay Ohri
 
Download Python for R Users pdf for free
Download Python for R Users pdf for freeDownload Python for R Users pdf for free
Download Python for R Users pdf for freeAjay Ohri
 
Install spark on_windows10
Install spark on_windows10Install spark on_windows10
Install spark on_windows10Ajay Ohri
 
Ajay ohri Resume
Ajay ohri ResumeAjay ohri Resume
Ajay ohri ResumeAjay Ohri
 
Statistics for data scientists
Statistics for  data scientistsStatistics for  data scientists
Statistics for data scientistsAjay Ohri
 
National seminar on emergence of internet of things (io t) trends and challe...
National seminar on emergence of internet of things (io t)  trends and challe...National seminar on emergence of internet of things (io t)  trends and challe...
National seminar on emergence of internet of things (io t) trends and challe...Ajay Ohri
 
Tools and techniques for data science
Tools and techniques for data scienceTools and techniques for data science
Tools and techniques for data scienceAjay Ohri
 
How Big Data ,Cloud Computing ,Data Science can help business
How Big Data ,Cloud Computing ,Data Science can help businessHow Big Data ,Cloud Computing ,Data Science can help business
How Big Data ,Cloud Computing ,Data Science can help businessAjay Ohri
 
Training in Analytics and Data Science
Training in Analytics and Data ScienceTraining in Analytics and Data Science
Training in Analytics and Data ScienceAjay Ohri
 
Software Testing for Data Scientists
Software Testing for Data ScientistsSoftware Testing for Data Scientists
Software Testing for Data ScientistsAjay Ohri
 
A Data Science Tutorial in Python
A Data Science Tutorial in PythonA Data Science Tutorial in Python
A Data Science Tutorial in PythonAjay Ohri
 
How does cryptography work? by Jeroen Ooms
How does cryptography work?  by Jeroen OomsHow does cryptography work?  by Jeroen Ooms
How does cryptography work? by Jeroen OomsAjay Ohri
 
Using R for Social Media and Sports Analytics
Using R for Social Media and Sports AnalyticsUsing R for Social Media and Sports Analytics
Using R for Social Media and Sports AnalyticsAjay Ohri
 
Kush stats alpha
Kush stats alpha Kush stats alpha
Kush stats alpha Ajay Ohri
 
Analyze this
Analyze thisAnalyze this
Analyze thisAjay Ohri
 

More from Ajay Ohri (20)

Introduction to R ajay Ohri
Introduction to R ajay OhriIntroduction to R ajay Ohri
Introduction to R ajay Ohri
 
Introduction to R
Introduction to RIntroduction to R
Introduction to R
 
Social Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 ElectionSocial Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 Election
 
Pyspark
PysparkPyspark
Pyspark
 
Download Python for R Users pdf for free
Download Python for R Users pdf for freeDownload Python for R Users pdf for free
Download Python for R Users pdf for free
 
Install spark on_windows10
Install spark on_windows10Install spark on_windows10
Install spark on_windows10
 
Ajay ohri Resume
Ajay ohri ResumeAjay ohri Resume
Ajay ohri Resume
 
Statistics for data scientists
Statistics for  data scientistsStatistics for  data scientists
Statistics for data scientists
 
National seminar on emergence of internet of things (io t) trends and challe...
National seminar on emergence of internet of things (io t)  trends and challe...National seminar on emergence of internet of things (io t)  trends and challe...
National seminar on emergence of internet of things (io t) trends and challe...
 
Tools and techniques for data science
Tools and techniques for data scienceTools and techniques for data science
Tools and techniques for data science
 
How Big Data ,Cloud Computing ,Data Science can help business
How Big Data ,Cloud Computing ,Data Science can help businessHow Big Data ,Cloud Computing ,Data Science can help business
How Big Data ,Cloud Computing ,Data Science can help business
 
Training in Analytics and Data Science
Training in Analytics and Data ScienceTraining in Analytics and Data Science
Training in Analytics and Data Science
 
Tradecraft
Tradecraft   Tradecraft
Tradecraft
 
Software Testing for Data Scientists
Software Testing for Data ScientistsSoftware Testing for Data Scientists
Software Testing for Data Scientists
 
Craps
CrapsCraps
Craps
 
A Data Science Tutorial in Python
A Data Science Tutorial in PythonA Data Science Tutorial in Python
A Data Science Tutorial in Python
 
How does cryptography work? by Jeroen Ooms
How does cryptography work?  by Jeroen OomsHow does cryptography work?  by Jeroen Ooms
How does cryptography work? by Jeroen Ooms
 
Using R for Social Media and Sports Analytics
Using R for Social Media and Sports AnalyticsUsing R for Social Media and Sports Analytics
Using R for Social Media and Sports Analytics
 
Kush stats alpha
Kush stats alpha Kush stats alpha
Kush stats alpha
 
Analyze this
Analyze thisAnalyze this
Analyze this
 

Recently uploaded

办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 

Recently uploaded (20)

办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 

Python for R Users

  • 1. Python for R Users By Chandan Routray As a part of internship at www.decisionstats.com
  • 2. Basic Commands Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. i Functions R Python Downloading and installing a package install.packages('name') pip install name Load a package library('name') import name as other_name Checking working directory getwd() import os os.getcwd() Setting working directory setwd() os.chdir() List files in a directory dir() os.listdir() List all objects ls() globals() Remove an object rm('name') del('object')
  • 3. Data Frame Creation R Python (Using pandas package*) Creating a data frame “df” of dimension 6x4 (6 rows and 4 columns) containing random numbers A<­ matrix(runif(24,0,1),nrow=6,ncol=4) df<­data.frame(A) Here, • runif function generates 24 random numbers between 0 to 1 • matrix function creates a matrix from those random numbers, nrow and ncol sets the numbers of rows and columns to the matrix • data.frame converts the matrix to data frame import numpy as np import pandas as pd A=np.random.randn(6,4) df=pd.DataFrame(A) Here, • np.random.randn generates a matrix of 6 rows and 4 columns; this function is a part of numpy** library • pd.DataFrame converts the matrix in to a data frame *To install Pandas library visit: http://pandas.pydata.org/; To import Pandas library type: import pandas as pd; **To import Numpy library type: import numpy as np; Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 1
  • 4. Data Frame Creation R Python Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 2
  • 5. Data Frame: Inspecting and Viewing Data R Python (Using pandas package*) Getting the names of rows and columns of data frame “df” rownames(df) returns the name of the rows colnames(df) returns the name of the columns df.index returns the name of the rows df.columns returns the name of the columns Seeing the top and bottom “x” rows of the data frame “df” head(df,x) returns top x rows of data frame tail(df,x) returns bottom x rows of data frame df.head(x) returns top x rows of data frame df.tail(x) returns bottom x rows of data frame Getting dimension of data frame “df” dim(df) returns in this format : rows, columns df.shape returns in this format : (rows, columns) Length of data frame “df” length(df) returns no. of columns in data frames len(df) returns no. of columns in data frames Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 3
  • 6. Data Frame: Inspecting and Viewing Data R Python Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 4
  • 7. Data Frame: Inspecting and Viewing Data R Python (Using pandas package*) Getting quick summary(like mean, std. deviation etc. ) of data in the data frame “df” summary(df) returns mean, median , maximum, minimum, first quarter and third quarter df.describe() returns count, mean, standard deviation, maximum, minimum, 25%, 50% and 75% Setting row names and columns names of the data frame “df” rownames(df)=c(“A”, ”B”, “C”, ”D”,  “E”, ”F”) set the row names to A, B, C, D and E colnames=c(“P”, ”Q”, “R”, ”S”) set the column names to P, Q, R and S df.index=[“A”, ”B”, “C”, ”D”,  “E”, ”F”] set the row names to A, B, C, D and E df.columns=[“P”, ”Q”, “R”, ”S”] set the column names to P, Q, R and S Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 5
  • 8. Data Frame: Inspecting and Viewing Data R Python Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 6
  • 9. Data Frame: Sorting Data R Python (Using pandas package*) Sorting the data in the data frame “df” by column name “P” df[order(df$P),] df.sort(['P']) Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 7
  • 10. Data Frame: Sorting Data R Python Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 8
  • 11. Data Frame: Data Selection R Python (Using pandas package*) Slicing the rows of a data frame from row no. “x” to row no. “y”(including row x and y) df[x:y,] df[x­1:y] Python starts counting from 0 Slicing the columns name “x”,”Y” etc. of a data frame “df” myvars <­ c(“X”,”Y”) newdata <­ df[myvars] df.loc[:,[‘X’,’Y’]] Selecting the the data from row no. “x” to “y” and column no. “a” to “b” df[x:y,a:b] df.iloc[x­1:y,a­1,b] Selecting the element at row no. “x” and column no. “y” df[x,y] df.iat[x­1,y­1] Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 9
  • 12. Data Frame: Data Selection R Python Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 10
  • 13. Data Frame: Data Selection R Python (Using pandas package*) Using a single column’s values to select data, column name “A” subset(df,A>0) It will select the all the rows in which the corresponding value in column A of that row is greater than 0 df[df.A > 0] It will do the same as the R function PythonR Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 11
  • 14. Mathematical Functions Functions R Python (import math and numpy library) Sum sum(x) math.fsum(x) Square Root sqrt(x) math.sqrt(x) Standard Deviation sd(x) numpy.std(x) Log log(x) math.log(x[,base]) Mean mean(x) numpy.mean(x) Median median(x) numpy.median(x) Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 12
  • 15. Mathematical Functions R Python Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 13
  • 16. Data Manipulation Functions R Python (import math and numpy library) Convert character variable to numeric variable as.numeric(x) For a single value: int(x), long(x), float(x) For list, vectors etc.: map(int,x), map(float,x) Convert factor/numeric variable to character variable paste(x) For a single value: str(x) For list, vectors etc.: map(str,x) Check missing value in an object is.na(x) math.isnan(x) Delete missing value from an object na.omit(list) cleanedList = [x for x in list if str(x) ! = 'nan'] Calculate the number of characters in character value nchar(x) len(x) Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 14
  • 17. Date & Time Manipulation Functions R (import lubridate library) Python (import datetime library) Getting time and date at an instant Sys.time() datetime.datetime.now() Parsing date and time in format: YYYY MM DD HH:MM:SS d<­Sys.time() d_format<­ymd_hms(d) d=datetime.datetime.now() format= “%Y %b %d  %H:%M:%S” d_format=d.strftime(format) Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 15
  • 18. Data Visualization Functions R Python (import matplotlib library**) Scatter Plot variable1 vs variable2 plot(variable1,variable2) plt.scatter(variable1,variable2) plt.show() Boxplot for Var boxplot(Var) plt.boxplot(Var) plt.show() Histogram for Var hist(Var) plt.hist(Var) plt.show() Pie Chart for Var pie(Var) from pylab import * pie(Var) show() ** To import matplotlib library type: import matplotlib.pyplot as plt Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 16
  • 19. Data Visualization: Scatter Plot R Python Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 17
  • 20. Data Visualization: Box Plot R Python Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 18
  • 21. Data Visualization: Histogram R Python Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 19
  • 22. Data Visualization: Line Plot R Python Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 20
  • 23. Data Visualization: Bubble R Python Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 22
  • 24. Data Visualization: Bar R Python Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 21
  • 25. Data Visualization: Pie Chart R Python Dec 2014 Copyrigt www.decisionstats.com Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 23
  • 26. Thank You For feedback contact DecisionStats.com
  • 27. Coming up ● Data Mining in Python and R ( see draft slides afterwards)
  • 28. Machine Learning: SVM on Iris Dataset *To know more about svm function in R visit: http://cran.r-project.org/web/packages/e1071/ ** To install sklearn library visit : http://scikit-learn.org/, To know more about sklearn svm visit: http://scikit- learn.org/stable/modules/generated/sklearn.svm.SVC.html R(Using svm* function) Python(Using sklearn** library) library(e1071) data(iris) trainset <­iris[1:149,] testset <­iris[150,] svm.model <­ svm(Species ~ ., data =  trainset, cost = 100, gamma = 1, type= 'C­ classification') svm.pred<­ predict(svm.model,testset[­5]) svm.pred #Loading Library from sklearn import svm #Importing Dataset from sklearn import datasets #Calling SVM clf = svm.SVC() #Loading the package iris = datasets.load_iris() #Constructing training data X, y = iris.data[:­1], iris.target[:­1] #Fitting SVM clf.fit(X, y) #Testing the model on test data print clf.predict(iris.data[­1]) Output: Virginica Output: 2, corresponds to Virginica
  • 29. Linear Regression: Iris Dataset *To know more about lm function in R visit: https://stat.ethz.ch/R-manual/R-devel/library/stats/html/lm.html ** ** To know more about sklearn linear regression visit : http://scikit- learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html R(Using lm* function) Python(Using sklearn** library) data(iris) total_size<­dim(iris)[1] num_target<­c(rep(0,total_size)) for (i in 1:length(num_target)){   if(iris$Species[i]=='setosa'){num_target[i]<­0}   else if(iris$Species[i]=='versicolor') {num_target[i]<­1}   else{num_target[i]<­2} } iris$Species<­num_target train_set <­iris[1:149,] test_set <­iris[150,] fit<­lm(Species ~ 0+Sepal.Length+ Sepal.Width+  Petal.Length+ Petal.Width , data=train_set) coefficients(fit) predict.lm(fit,test_set) from sklearn import linear_model from sklearn import datasets iris = datasets.load_iris() regr = linear_model.LinearRegression() X, y = iris.data[:­1], iris.target[:­1] regr.fit(X, y) print(regr.coef_) print regr.predict(iris.data[­1]) Output: 1.64 Output: 1.65
  • 30. Random forest: Iris Dataset *To know more about randomForest package in R visit: http://cran.r-project.org/web/packages/randomForest/ ** To know more about sklearn random forest visit : http://scikit- learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html R(Using randomForest* package) Python(Using sklearn** library) library(randomForest) data(iris) total_size<­dim(iris)[1] num_target<­c(rep(0,total_size)) for (i in 1:length(num_target)){   if(iris$Species[i]=='setosa'){num_target[i]<­0}   else if(iris$Species[i]=='versicolor') {num_target[i]<­1}   else{num_target[i]<­2}} iris$Species<­num_target train_set <­iris[1:149,] test_set <­iris[150,] iris.rf <­ randomForest(Species ~ .,  data=train_set,ntree=100,importance=TRUE,                         proximity=TRUE) print(iris.rf) predict(iris.rf, test_set[­5], predict.all=TRUE) from sklearn import ensemble from sklearn import datasets clf =  ensemble.RandomForestClassifier(n_estimato rs=100,max_depth=10) iris = datasets.load_iris() X, y = iris.data[:­1], iris.target[:­1] clf.fit(X, y) print clf.predict(iris.data[­1]) Output: 1.845 Output: 2
  • 31. Decision Tree: Iris Dataset *To know more about rpart package in R visit: http://cran.r-project.org/web/packages/rpart/ ** To know more about sklearn desicion tree visit : http://scikit- learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html R(Using rpart* package) Python(Using sklearn** library) library(rpart) data(iris) sub <­ c(1:149) fit <­ rpart(Species ~ ., data = iris,  subset = sub) fit predict(fit, iris[­sub,], type = "class") from sklearn.datasets import load_iris from sklearn.tree import  DecisionTreeClassifier clf =  DecisionTreeClassifier(random_state=0) iris = datasets.load_iris() X, y = iris.data[:­1], iris.target[:­1] clf.fit(X, y) print clf.predict(iris.data[­1]) Output: Virginica Output: 2, corresponds to virginica
  • 32. Gaussian Naive Bayes: Iris Dataset *To know more about e1071 package in R visit: http://cran.r-project.org/web/packages/e1071/ ** To know more about sklearn Naive Bayes visit : http://scikit- learn.org/stable/modules/generated/sklearn.naive_bayes.GaussianNB.html R(Using e1071* package) Python(Using sklearn** library) library(e1071) data(iris) trainset <­iris[1:149,] testset <­iris[150,] classifier<­naiveBayes(trainset[,1:4],  trainset[,5])  predict(classifier, testset[,­5]) from sklearn.datasets import load_iris from sklearn.naive_bayes import GaussianNB clf = GaussianNB() iris = datasets.load_iris() X, y = iris.data[:­1], iris.target[:­1] clf.fit(X, y) print clf.predict(iris.data[­1]) Output: Virginica Output: 2, corresponds to virginica
  • 33. K Nearest Neighbours: Iris Dataset *To know more about kknn package in R visit: ** To know more about sklearn k nearest neighbours visit : http://scikit- learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html R(Using kknn* package) Python(Using sklearn** library) library(kknn) data(iris) trainset <­iris[1:149,] testset <­iris[150,] iris.kknn <­ kknn(Species~.,  trainset,testset, distance = 1,                 kernel = "triangular") summary(iris.kknn) fit <­ fitted(iris.kknn) fit from sklearn.datasets import load_iris from sklearn.neighbors import  KNeighborsClassifier knn = KNeighborsClassifier() iris = datasets.load_iris() X, y = iris.data[:­1], iris.target[:­1] knn.fit(X,y)  print knn.predict(iris.data[­1]) Output: Virginica Output: 2, corresponds to virginica
  • 34. Thank You For feedback please let us know at ohri2007@gmail.com