aggregate()
The aggregate() function computes summary statistics of the data by group.
Common statistics include mean, min, max, and sum.
Syntax:
aggregate(dataframe$aggregate_column,
          list(dataframe$group_column), FUN)
where
dataframe is the input data frame.
aggregate_column is the column to be aggregated.
group_column is the column whose values define the groups.
FUN is the function applied to each group, such as sum, mean, min, or max.
# create a data frame with 4 columns
data <- data.frame(subjects = c("java", "python", "java",
                                "java", "php", "php"),
                   id = c(1, 2, 3, 4, 5, 6),
                   names = c("manoj", "sai", "mounika",
                             "durga", "deepika", "roshan"),
                   marks = c(89, 89, 76, 89, 90, 67))

# display
print(data)

# aggregate sum of marks by subject
print(aggregate(data$marks, list(data$subjects), FUN = sum))

# aggregate minimum of marks by subject
print(aggregate(data$marks, list(data$subjects), FUN = min))

# aggregate maximum of marks by subject
print(aggregate(data$marks, list(data$subjects), FUN = max))
# aggregate average of marks by subject
# (uses the same data frame as the previous example)
print(aggregate(data$marks, list(data$subjects), FUN = mean))
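The same grouped mean can also be written with the formula interface of aggregate(), which keeps the column names in the result. This is a minimal sketch rather than part of the original slides; the data frame repeats only the two columns it needs, and the name by_subject is chosen here for illustration.

```r
# Same subjects and marks as in the slides above
data <- data.frame(
  subjects = c("java", "python", "java", "java", "php", "php"),
  marks    = c(89, 89, 76, 89, 90, 67)
)

# Formula interface: aggregate marks, grouped by subjects
by_subject <- aggregate(marks ~ subjects, data = data, FUN = mean)
print(by_subject)
```

With this form the result columns are named subjects and marks instead of Group.1 and x.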
apply(), lapply(), sapply(), and tapply() in R
• The apply() family is part of base R, so no extra package is
needed. These functions apply a given function to a matrix, data
frame, list, or vector and return the result as a vector, matrix,
array, or list, depending on the function used. The four members
covered here are apply(), lapply(), sapply(), and tapply().
• apply() function
• The apply() function applies a function to the rows or columns
of a matrix or data frame. It takes the matrix or data frame, a
margin that says whether to work by row or by column, and the
function to apply, and it returns a vector, array, or list of the
results.
• Syntax: apply(X, MARGIN, FUN)
• Parameters:
• X: the input array, including a matrix.
• MARGIN: 1 applies the function to each row; 2 applies it to
each column.
• FUN: the function to be applied to the input data.
# create a 3 x 10 matrix; the values 1:10 are recycled to fill it
sample_matrix <- matrix(1:10, nrow = 3, ncol = 10)

print("sample matrix:")
sample_matrix

# use apply() across rows to find the sum of each row
print("sum across rows:")
apply(sample_matrix, 1, sum)

# use apply() across columns to find the mean of each column
print("mean across columns:")
apply(sample_matrix, 2, mean)
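The function passed to apply() does not have to be a built-in one. The sketch below is an illustration added here, not part of the original slides; the names m, row_ranges, and col_sums are made up for it. It passes an anonymous function that returns the spread (max minus min) of each row.

```r
# 2 x 3 matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2, ncol = 3)

# MARGIN = 1: the anonymous function receives one row at a time
row_ranges <- apply(m, 1, function(r) max(r) - min(r))

# MARGIN = 2: sum() receives one column at a time
col_sums <- apply(m, 2, sum)

print(row_ranges)
print(col_sums)
```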
lapply() function
• The lapply() function applies a function to each element of a
list, vector, or data frame and always returns a list of the
same length as the input. Because it works element by element,
it does not need a MARGIN argument.
• Syntax: lapply(X, FUN)
• Parameters:
• X: the input list, vector, or data frame.
• FUN: the function to be applied to each element.
# create sample data
names <- c("priyank", "abhiraj", "pawananjani",
           "sudhanshu", "devraj")
print("original data:")
names

# apply lapply() function
print("data after lapply():")
lapply(names, toupper)
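lapply() is also useful when the input is already a list whose elements have different lengths. The following sketch is an added illustration with made-up values (the list scores is hypothetical); it computes one mean per element and shows that the result is again a list.

```r
# A named list of numeric vectors with different lengths
scores <- list(java = c(89, 76, 89), php = c(90, 67), python = 89)

# One mean per list element; the result is a list with the same names
means <- lapply(scores, mean)

str(means)
```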
sapply() function
• The sapply() function applies a function to each element of a
list, vector, or data frame and simplifies the result where it
can.
• If every result has length 1, sapply() returns a vector; if
every result has the same length greater than 1, it returns a
matrix; otherwise it returns a list, as lapply() does.
• Because it works element by element, it does not need a
MARGIN argument.
• It is the same as lapply() except for the type of the
returned object.
• Syntax: sapply(X, FUN)
• Parameters:
• X: the input list, vector, or data frame.
• FUN: the function to be applied to each element.
# create sample data
sample_data <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                          y = c(3, 2, 4, 2, 34, 5))
print("original data:")
sample_data

# apply sapply() function
print("data after sapply():")
sapply(sample_data, max)
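To make the difference in return type concrete, the sketch below runs lapply() and sapply() on the same data frame as the slide. It is an added illustration; the variable names as_list, as_vector, and ranges are chosen here.

```r
sample_data <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                          y = c(3, 2, 4, 2, 34, 5))

# lapply() keeps the results in a list, one element per column
as_list <- lapply(sample_data, max)

# sapply() simplifies length-1 results to a named vector
as_vector <- sapply(sample_data, max)

# range() returns two values per column, so sapply() builds a matrix
ranges <- sapply(sample_data, range)

print(class(as_list))
print(as_vector)
print(ranges)
```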
tapply() function
• The tapply() function computes a statistic (mean, median, min,
max, etc.) or a user-written function for each group of a
vector, where the groups are defined by a factor.
• It splits the vector into subsets, one per factor level, and
applies the function to each subset.
• Syntax: tapply(X, INDEX, FUN)
• Parameters:
• X: the input vector.
• INDEX: the factor (or list of factors) that defines the
groups.
• FUN: the function to be applied to each group.
# load library tidyverse (it attaches ggplot2, which provides the diamonds dataset)
library(tidyverse)

# print head of diamonds dataset
print("Head of data:")
head(diamonds)

# apply tapply function to get average price by cut
print("Average price for each cut of diamond:")
tapply(diamonds$price, diamonds$cut, mean)
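tapply() itself is part of base R, so the same pattern works without loading any package. The sketch below is an added illustration that reuses the subjects and marks values from the aggregate() example earlier in the deck; the name max_by_subject is chosen here.

```r
subjects <- c("java", "python", "java", "java", "php", "php")
marks    <- c(89, 89, 76, 89, 90, 67)

# Split marks by subject and take the maximum of each group
max_by_subject <- tapply(marks, subjects, max)

print(max_by_subject)
```

Unlike aggregate(), which returns a data frame, tapply() returns a named array with one entry per group.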
• Read Data:
• Input:
• Input is the first step in any processing, including analytical data
processing.
• Here the input is a dataset.
• A dataset is read with read.table() or read.csv():
Fruits <- read.csv("Fruits.csv")
Fruits
• Describing the data structure
• The dataset can be described using functions such as
names(), str(), summary(), head(), and tail(). R is case-sensitive, so
these function names must be written in lowercase:
str(Fruits)
head(Fruits, 3)
tail(Fruits, 3)
summary(Fruits)
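Since Fruits.csv is not included with the slides, the following sketch writes a small stand-in file to a temporary location and then runs the describing functions on it. The column names and values are invented for illustration only.

```r
# Write a small, made-up CSV file to a temporary path
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,price,qty",
             "apple,1.2,10",
             "banana,0.5,25",
             "cherry,3.0,5",
             "mango,2.1,8"), csv_path)

# Read it back and describe it
fruits <- read.csv(csv_path)

names(fruits)      # column names
str(fruits)        # type and first values of each column
head(fruits, 3)    # first three rows
tail(fruits, 3)    # last three rows
summary(fruits)    # per-column summary statistics
```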
Methods for Reading Data
• Read CSV
• One of the most widely used data storage formats is the .csv
(comma-separated values) file. R loads several packages at
start-up, including the utils package, which provides the
read.csv() function for opening CSV files. Here is the syntax
for read.csv():
• read.csv(file, header = TRUE, sep = ",")
• Arguments:
• file: the path where the file is stored
• header: whether the first line of the file holds the column
names; TRUE by default
• sep: the character that separates the fields; `,` by default
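The header and sep arguments matter when a file deviates from the defaults. The sketch below is an added illustration with invented contents: a semicolon-separated file with no header row, written to a temporary path.

```r
# A made-up file: semicolon-separated, no header line
path <- tempfile(fileext = ".csv")
writeLines(c("apple;1.2", "banana;0.5"), path)

# header = FALSE keeps the first line as data; sep = ";" splits on semicolons
no_header <- read.csv(path, header = FALSE, sep = ";")

print(no_header)
```

Without a header row, R assigns the default column names V1 and V2.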
Read Excel files
• Excel files are very popular among data analysts. Spreadsheets are
easy to work with and flexible. The readxl package imports Excel
spreadsheets into R.
• Use this code
• require(readxl)
• to check whether readxl is installed on your machine. If you installed
R through conda with the r-essentials bundle, the package should
already be installed. You should see in the command window:
• Output:
• Loading required package: readxl
• If the package does not exist, you can install it with conda; in the
terminal, use conda install -c mittner r-readxl.
• Use the following command to load the library to import Excel files.
• library(readxl)
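Once readxl is available, a workbook is read with read_excel(). The sketch below is an added illustration, not part of the original slides. It assumes readxl is installed and uses one of the example workbooks that ship with the package, located via readxl_example(), so no external file is needed.

```r
# Only runs if readxl is installed
if (requireNamespace("readxl", quietly = TRUE)) {
  # Path to an example workbook bundled with readxl
  xlsx_path <- readxl::readxl_example("datasets.xlsx")

  # List the sheet names, then read the first sheet
  sheets <- readxl::excel_sheets(xlsx_path)
  first_sheet <- readxl::read_excel(xlsx_path, sheet = 1)

  print(sheets)
  print(head(first_sheet))
}
```

For your own data, replace xlsx_path with the path to your .xlsx or .xls file.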
Import data from other statistical software
• We will import different file formats with the haven
package. This package supports SAS, STATA, and SPSS.
We can use the following functions to open different types
of dataset, according to the extension of the file:
• SAS: read_sas()
• STATA: read_dta() (or read_stata(), which is identical)
• SPSS: read_sav() or read_por(), depending on the file
extension
• Only one argument is required by these functions: the
path where the file is stored. These functions also accept
a URL.
• library(haven)
• Read STATA
• For STATA data files you can use read_dta().
We use exactly the same dataset, but stored in a
.dta file.
PATH_stata <- 'https://github.com/guru99-edu/R-Programming/blob/master/binary.dta?raw=true'
df <- read_dta(PATH_stata)
head(df)
• Read SPSS
• We use the read_sav() function to open an SPSS
file. The file extension is ".sav".
PATH_spss <- 'https://github.com/guru99-edu/R-Programming/blob/master/binary.sav?raw=true'
df <- read_sav(PATH_spss)
head(df)
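The STATA and SPSS readers can also be tried without a network connection. The sketch below is an added illustration and assumes haven is installed. It writes a small invented data frame to temporary .dta and .sav files with haven's write_dta() and write_sav(), then reads them back with the functions described above.

```r
# Only runs if haven is installed
if (requireNamespace("haven", quietly = TRUE)) {
  # A small, made-up data frame
  marks_df <- data.frame(id = c(1, 2, 3), marks = c(89, 76, 90))

  # Round trip through a STATA file
  dta_path <- tempfile(fileext = ".dta")
  haven::write_dta(marks_df, dta_path)
  from_dta <- haven::read_dta(dta_path)

  # Round trip through an SPSS file
  sav_path <- tempfile(fileext = ".sav")
  haven::write_sav(marks_df, sav_path)
  from_sav <- haven::read_sav(sav_path)

  print(head(from_dta))
  print(head(from_sav))
}
```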
• Read SAS
• The sas7bdat package can also import SAS files
• The second package we are going to use is
the sas7bdat package. This package was written for the
sole purpose of reading SAS files in R.
• Can R open SAS files?
• Yes, R can open SAS files. Here are 3 steps to open a
SAS file in R with haven:
1) Install haven: install.packages("haven")
2) Load the package: require(haven)
3) Open the SAS
file: read_sas(PATH_TO_YOUR_SAS7BDAT_FILE)
Note: this assumes that R is already installed on your
computer.
• How to install R packages:
• Installing R packages is quite easy. Below, we will
learn about two methods.
• Install R packages using
the install.packages() function:
Open up RGui (or RStudio) and type the following
in the console:
install.packages(c("haven", "sas7bdat"))
• Install using Conda:
Open the Anaconda Prompt and type:
conda install -c conda-forge r-haven r-sas7bdat r-rio
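Before calling install.packages(), it can be convenient to check which packages are actually missing. The sketch below is an added illustration; the helper name missing_packages is made up here, and the install step is limited to interactive sessions so that a script does not trigger downloads unexpectedly.

```r
# Return the names of packages that cannot be loaded
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("haven", "sas7bdat"))

# Install only what is missing, and only when run interactively
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```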
How to Read a SAS (.sas7bdat) File in R into a DataFrame
• In this section, we are going to learn how to import data
into R. First, we are going to import data in R using the
haven package. After this, we are going to use the
sas7bdat package to read a .sas7bdat file into R. Finally,
we are going to do the same using the rio package.
• Method 1: Load a SAS file in R using haven
# importing the SAS file
df <- read_sas("airline.sas7bdat")
head(df)
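The slides show code only for the haven method. As a hedged sketch of the other two methods mentioned above, the block below uses read.sas7bdat() from the sas7bdat package and import() from the rio package. It assumes both packages are installed and that airline.sas7bdat (the file name used on the slide) is in the working directory; each step is skipped if its requirement is not met.

```r
# File name taken from the slide; assumed to be in the working directory
sas_path <- "airline.sas7bdat"

if (file.exists(sas_path)) {
  # Method 2 (sketch): the sas7bdat package
  if (requireNamespace("sas7bdat", quietly = TRUE)) {
    df_sas7bdat <- sas7bdat::read.sas7bdat(sas_path)
    print(head(df_sas7bdat))
  }
  # Method 3 (sketch): the rio package, which picks a reader from the extension
  if (requireNamespace("rio", quietly = TRUE)) {
    df_rio <- rio::import(sas_path)
    print(head(df_rio))
  }
} else {
  message("File not found: ", sas_path)
}
```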
