It covers- Introduction to R language, Creating, Exploring data with Various Data Structures e.g. Vector, Array, Matrices, and Factors. Using Methods with examples.
Overview and about R, R Studio Installation, Fundamentals of R Programming: Data Structures and Data Types, Operators, Control Statements, Loop Statements, Functions,
Descriptive Analysis using R: Maximum, Minimum, Range, Mean, Median and Mode, Variance, Standard Deviation, Quantiles, IQR, Summary
Overview and about R, R Studio Installation, Fundamentals of R Programming: Data Structures and Data Types, Operators, Control Statements, Loop Statements, Functions,
Descriptive Analysis using R: Maximum, Minimum, Range, Mean, Median and Mode, Variance, Standard Deviation, Quantiles, IQR, Summary
In this tutorial, we learn to create variables in R. Followed by that, we explore the different data types including numeric, integer, character, logical and date/time.
This hands-on R course will guide users through a variety of programming functions in the open-source statistical software program, R. Topics covered include indexing, loops, conditional branching, S3 classes, and debugging. Full workshop materials available from http://projects.iq.harvard.edu/rtc/r-prog
Analysis of data in Python with SciPy and pandas, Ubuntu installation, PyCharm configuration, Series, DataFrame, big data, medical data, merging data, groupby, graphing data, iPython using Wakari.io, and analyzing stock prices of US automakers including Ford and Telsa. As presented at Penguicon 2016.
All data values in Python are encapsulated in relevant object classes. Everything in Python is an object and every object has an identity, a type, and a value. Like another object-oriented language such as Java or C++, there are several data types which are built into Python. Extension modules which are written in C, Java, or other languages can define additional types.
To determine a variable's type in Python you can use the type() function. The value of some objects can be changed. Objects whose value can be changed are called mutable and objects whose value is unchangeable (once they are created) are called immutable.
This presentation discusses the following topics:
Basic features of R
Exploring R GUI
Data Frames & Lists
Handling Data in R Workspace
Reading Data Sets & Exporting Data from R
Manipulating & Processing Data in R
Introduction to Data Science, Prerequisites (tidyverse), Import Data (readr), Data Tyding (tidyr),
pivot_longer(), pivot_wider(), separate(), unite(), Data Transformation (dplyr - Grammar of Manipulation): arrange(), filter(),
select(), mutate(), summarise()m
Data Visualization (ggplot - Grammar of Graphics): Column Chart, Stacked Column Graph, Bar Graph, Line Graph, Dual Axis Chart, Area Chart, Pie Chart, Heat Map, Scatter Chart, Bubble Chart
Abstract: This PDSG workshop introduces the basics of Python libraries used in machine learning. Libraries covered are Numpy, Pandas and MathlibPlot.
Level: Fundamental
Requirements: One should have some knowledge of programming and some statistics.
The goal of this workshop is to introduce fundamental capabilities of R as a tool for performing data analysis. Here, we learn about the most comprehensive statistical analysis language R, to get a basic idea how to analyze real-word data, extract patterns from data and find causality.
This presentation educates you about R - data types in detail with data type syntax, the data types are - Vectors, Lists, Matrices, Arrays, Factors, Data Frames.
For more topics stay tuned with Learnbay.
In this tutorial, we learn to create variables in R. Followed by that, we explore the different data types including numeric, integer, character, logical and date/time.
This hands-on R course will guide users through a variety of programming functions in the open-source statistical software program, R. Topics covered include indexing, loops, conditional branching, S3 classes, and debugging. Full workshop materials available from http://projects.iq.harvard.edu/rtc/r-prog
Analysis of data in Python with SciPy and pandas, Ubuntu installation, PyCharm configuration, Series, DataFrame, big data, medical data, merging data, groupby, graphing data, iPython using Wakari.io, and analyzing stock prices of US automakers including Ford and Telsa. As presented at Penguicon 2016.
All data values in Python are encapsulated in relevant object classes. Everything in Python is an object and every object has an identity, a type, and a value. Like another object-oriented language such as Java or C++, there are several data types which are built into Python. Extension modules which are written in C, Java, or other languages can define additional types.
To determine a variable's type in Python you can use the type() function. The value of some objects can be changed. Objects whose value can be changed are called mutable and objects whose value is unchangeable (once they are created) are called immutable.
This presentation discusses the following topics:
Basic features of R
Exploring R GUI
Data Frames & Lists
Handling Data in R Workspace
Reading Data Sets & Exporting Data from R
Manipulating & Processing Data in R
Introduction to Data Science, Prerequisites (tidyverse), Import Data (readr), Data Tyding (tidyr),
pivot_longer(), pivot_wider(), separate(), unite(), Data Transformation (dplyr - Grammar of Manipulation): arrange(), filter(),
select(), mutate(), summarise()m
Data Visualization (ggplot - Grammar of Graphics): Column Chart, Stacked Column Graph, Bar Graph, Line Graph, Dual Axis Chart, Area Chart, Pie Chart, Heat Map, Scatter Chart, Bubble Chart
Abstract: This PDSG workshop introduces the basics of Python libraries used in machine learning. Libraries covered are Numpy, Pandas and MathlibPlot.
Level: Fundamental
Requirements: One should have some knowledge of programming and some statistics.
The goal of this workshop is to introduce fundamental capabilities of R as a tool for performing data analysis. Here, we learn about the most comprehensive statistical analysis language R, to get a basic idea how to analyze real-word data, extract patterns from data and find causality.
This presentation educates you about R - data types in detail with data type syntax, the data types are - Vectors, Lists, Matrices, Arrays, Factors, Data Frames.
For more topics stay tuned with Learnbay.
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docxcarliotwaycave
INFORMATIVE ESSAY
The purpose of the Informative Essay assignment is to choose a job or task that you know how to do and then write a minimum of 2 full pages, maximum of 3 full pages, Informative Essay teaching the reader how to do that job or task. You will follow the organization techniques explained in Unit 6.
Here are the details:
1. Read the Lecture Notes in Unit 6. You may also find the information in Chapter 10.5 in our text on Process Analysis helpful. The lecture notes will really be the most important to read in writing this assignment. However, here is a link to that chapter that you may look at in addition to the lecture notes:
https://open.lib.umn.edu/writingforsuccess/chapter/10-5-process-analysis/ (Links to an external site.)
2. Choose your topic, that is, the job or task you want to teach. As the notes explain, this should be a job or task that you already know how to do, and it should be something you can do well. At this point, think about your audience (reader). Will your reader need any knowledge or experience to do this job or task, or will you write these instructions for a general reader where no experience is required to perform the job?
3. Plan your outline to organize this essay. Unit 6 notes offer advice on this organization process. Be sure to include an introductory paragraph that has the four main points presented in the lecture notes.
4. Write the essay. It will need to be at least 2 FULL pages long, maximum of 3 full pages long. You will use the MLA formatting that you used in previous essays from Units 3, 4, and 5.
5. Be sure to include a title for your essay.
6. After writing the essay, be sure to take time to read it several times for revision and editing. It would be helpful to have at least one other person proofread it as well before submitting the assignment.
Quiz2
# comments start with #
# to quit q()
# two steps to install any library
#install.packages("rattle")
#library(rattle)
setwd("D:/AJITH/CUMBERLANDS/Ph.D/SEMESTER 3/Data Science & Big Data Analy (ITS-836-51)/RStudio/Week2")
getwd()
x <- 3 # x is a vector of length 1
print(x)
v1 <- c(2,4,6,8,10)
print(v1)
print(v1[3])
v <- c(1:10) #creates a vector of 10 elements numbered 1 through 10. More complicated data
print(v)
print(v[6])
# Import test data
test<-read.csv("CVEs.csv")
test1<-read.csv("CVEs.csv", sep=",")
test2<-read.table("CVEs.csv", sep=",")
write.csv(test2, file="out.csv")
# Write CSV in R
write.table(test1, file = "out1.csv",row.names=TRUE, na="",col.names=TRUE, sep=",")
head(test)
tail(test)
summary(test)
head <- head(test)
tail <- tail(test)
cor(test$X, test$index)
sd(test$index)
var(test$index)
plot(test$index)
hist(test$index)
str(test$index)
quit()
Quiz3
setwd("C:/Users/ialsmadi/Desktop/University_of_Cumberlands/Lectures/Week2/RScripts")
getwd()
# Import test data
data<-read.csv("yearly_sales.csv")
#A 5-number summary is a set of 5 descriptive statistics for summarizing a continuous univariate data set.
#It consists o ...
STAT-522 (Data Analysis Using R) by SOUMIQUE AHAMED.pdfSOUMIQUE AHAMED
STAT-522 (Data Analysis Using R) by SOUMIQUE AHAMED, Division of Agronomy, Faculty of Agriculture - Wadura, Sher-e-Kashmir University of Agricultural Sciences and Technology of Kashmir.
Opendatabay - Open Data Marketplace.pptxOpendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay AI-driven features streamline the data workflow. Finding the data you need shouldn't be a complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with a dedicated, AI-generated, synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...pchutichetpong
M Capital Group (“MCG”) expects to see demand and the changing evolution of supply, facilitated through institutional investment rotation out of offices and into work from home (“WFH”), while the ever-expanding need for data storage as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
1. R Language (basics, vectors,
arrays, matrices, factors
By K K Singh
RGUKT Nuzvid
19-08-2017KK Singh, RGUKT Nuzvid
1
2. Introduction
R is a programming language and software environment for statistical analysis, graphics
representation and reporting. R was created by Ross Ihaka and Robert Gentleman at the
University of Auckland, New Zealand, and is currently developed by the R Development
Core Team.
R is freely available under the GNU General Public License, and pre-compiled binary
versions are provided for various operating systems like Linux, Windows and Mac.
This programming language was named R, based on the first letter of first name of the
two R authors (Robert Gentleman and Ross Ihaka), and partly a play on the name of the
Bell Labs Language S.
19-08-2017KK Singh, RGUKT Nuzvid
2
3. R environment setting
Windows Installation
You can download the Windows installer version of R from R-3.2.2 for
Windows (32/64 bit) and save it in a local directory.
As it is a Windows installer (.exe) with a name "R-version-win.exe". You can
just double click and run the installer accepting the default settings.
Linux Installation
R is available as a binary for Linux at the location R Binaries.
you may use yum command to install R as follows −
$ yum install R
then you can launch R prompt as follows −
$ R
Now you can use install command at R prompt to install the required package.
For example, the following command will install plotrix package.
> install.packages("plotrix" )
19-08-2017KK Singh, RGUKT Nuzvid
3
4. Basic Sentence
Start your R command prompt by typing the following command (in Linux) −
$ R
OR double click on installed .exe file ( in windows)
This will launch R interpreter and you will get a prompt >
where you can start typing your program as follows −
myString <- "Hello, World!"
> print ( myString)
[1] "Hello, World!"
R Script File
Usually, you will do your programming by writing your programs in script files
and execute those scripts with the help of R interpreter called Rscript. Ex:
# My first program in R Programming
myString <- "Hello, World!“
print ( myString)
Save it as test.R and execute it at R command prompt.
> source(“test.R”)
[1] "Hello, World!"
19-08-2017KK Singh, RGUKT Nuzvid
4
5. Some basic useful command
>help(word) # get the description of the word
>?word # get the description
>getwd() # get the working directory
>setwd(“C:/kk”) # set the working directory
>q() # quit
>source(“filename.R”) #execute Rscript
……………………………………………………………..
>sink(“filename.txt”) # direct output into filenale.txt
>source(“file.R”)
>sink() #exit from sink mode
19-08-2017KK Singh, RGUKT Nuzvid
5
6. R –Data Types
In contrast to other programming languages like C and java in R, the
variables are not declared as some data type.
The variables are assigned with R-Objects and the data type of the R-
object becomes the data type of the variable.
Vectors
Arrays
Matrices
Lists
Factors
Data Frames
19-08-2017KK Singh, RGUKT Nuzvid
6
7. Vector-Data Types
Data Type Example Verify
Logical TRUE, FALSE
v <- TRUE
print(class(v))
[1] "logical"
Numeric 12.3, 5, 999
v <- 23.5
print(class(v))
[1] "numeric"
Integer 2L, 34L, 0L
v <- 2L
print(class(v))
[1] "integer"
Complex 3 + 2i
v <- 2+5i
print(class(v))
[1] "complex"
Character
'a' , '"good",
"TRUE", '23.4'
v <- "TRUE"
print(class(v))
[1] "character" 19-08-2017KK Singh, RGUKT Nuzvid
7
8. Vector (Cont..)
To create vector with more than one element,
use c() function which combines elements into a vector.
# Create a vector.
apple <- c('red','green',"yellow")
print(apple) # Get the class of the vector.
print(class(apple))
………………………………………………………………………..
The non-character values are coerced to character type
if one of the elements is a character.
s <- c('apple','red',5,TRUE)
print(s)
it produces the following result −
[1] "apple" "red" "5" "TRUE"
19-08-2017KK Singh, RGUKT Nuzvid
8
9. Vector (Cont..)
Multiple Elements Vector
Using colon operator with numeric data
# Creating a sequence from 5 to 13.
v <- 5:13
print(v)
……………………………………………………………………………………………
v <- 6.6:12.6 # Creating a sequence from 6.6 to 12.6
print(v)
………………………………………………………………………………………………
# If the final element not belong to the sequence, it is discarded.
v <- 3.8:11.4
print(v)
………………………………………………………………………
Using sequence (Seq.) operator
# Create vector with elements from 5 to 9 incrementing by 0.4.
print(seq(5, 9, by = 0.4))
……………………………………………………………………….
# empty vector
>x<-numeric()
>x[3]<-5
>x
19-08-2017KK Singh, RGUKT Nuzvid
9
10. Accessing vector elements
Accessing Vector Elements
Elements of a Vector are accessed using indexing.
The [ ] brackets are used for indexing. Indexing starts
with position 1.
Giving a negative value in the index drops that element
from result.
TRUE, FALSE or 0 and 1 can also be used for indexing.
# Accessing vector elements using position.
t <- c("Sun","Mon","Tue","Wed","Thurs","Fri","Sat")
u <- t[c(2,3,6)]
print(u)
…………………………………………………………………………………………………….
v <- t[c(TRUE,FALSE,FALSE,FALSE,FALSE,TRUE,FALSE)]
print(v)
……………………………………………………………………………………………………………
……
x <- t[c(-2,-5)] # print t excluding 2nd & 5th index value
print(x)
19-08-2017KK Singh, RGUKT Nuzvid
10
11. Vector Manipulation
Two vectors of same length can be added, subtracted, multiplied or divided
giving the result as a vector output.
# Create two vectors.
v1 <- c(3,8,4,5,0,11)
v2 <- c(4,11,0,8,1,2)
add.result <- v1+v2
print(add.result)
……………………………………………………………………………
sub.result <- v1-v2
print(sub.result)
…………………………………………………………………………………………………
multi.result <- v1*v2
print(multi.result)
………………………………………………………………………………………….
divi.result <- v1/v2
print(divi.result)
………………………………………………………………………………………………………….19-08-2017KK Singh, RGUKT Nuzvid
11
What is
output
12. Vector Element Sorting
Elements in a vector can be sorted using
the sort() function.
v <- c(3,8,4,5,0,11, -9, 304) # Sort the elements of the vector.
sort.result <- sort(v)
print(sort.result) # Sort the elements in the reverse order.
revsort.result <- sort(v, decreasing = TRUE)
print(revsort.result)
19-08-2017KK Singh, RGUKT Nuzvid
13
13. Arrays
Arrays are the R data objects which can store data in more than two
dimensions.
For example − If we create an array of dimension (2, 3, 4) then it
creates 4 rectangular matrices each with 2 rows and 3 columns.
Arrays can store only one data type.
An array is created using the array() function.
Example creates an array of two 3x3 matrices each with 3 rows and
3 columns.
vector1 <- c(5,9,3)
vector2 <- c(10,11,12,13,14,15) # Take these vectors as input to the array.
result <- array(c(vector1,vector2),dim = c(3,3,1))
print(result)
19-08-2017KK Singh, RGUKT Nuzvid
14
14. Naming Columns and Rows
We can give names to the rows, columns and matrices in the array by using
the dimnames parameter.
# Create two vectors of different lengths.
vector1 <- c(5,9,3)
vector2 <- c(10,11,12,13,14,15)
column.names <- c("COL1","COL2","COL3")
row.names <- c("ROW1","ROW2","ROW3")
matrix.names <- c("Matrix1","Matrix2") # Take these vectors as input to the array.
result <- array(c(vector1,vector2),dim = c(3,3,2),dimnames = list(row.names,column.names, m
print(result)
19-08-2017KK Singh, RGUKT Nuzvid
16
15. Accessing Array Elements# see previous example
print(result[3,,2]) # Print the third row of the second matrix of the array.
print(result[1,3,1]) # Print the element in the 1st row and 3rd column of the 1st matrix.
print(result[,,2]) # Print the 2nd Matrix.
It produces the following result −
COL1 COL2 COL3
3 12 15
[1] 13
COL1 COL2 COL3
ROW1 5 10 13
ROW2 9 11 14
ROW3 3 12 15
Q. Access 2nd & 4th column of the result.
19-08-2017KK Singh, RGUKT Nuzvid
18
16. Manipulating Array Elements
As array is made up matrices in multiple dimensions, the operations
on elements of array are carried out by accessing elements of the matrices.
# Create two vectors of different lengths.
vector1 <- c(5,9,3)
vector2 <- c(10,11,12,13,14,15)
array1 <- array(c(vector1,vector2),dim = c(3,3,2))
vector3 <- c(9,1,0)
vector4 <- c(6,0,11,3,14,1,2,6,9)
array2 <- array(c(vector1,vector2),dim = c(3,3,2))
matrix1 <- array1[,,2]
matrix2 <- array2[,,2]
result <- matrix1+matrix2
print(result)
19-08-2017KK Singh, RGUKT Nuzvid
19
17. R-Matrix
> Matrices are the R objects, which is two dimensional array.
> They contain elements of the same atomic types
> Matrix is created using the matrix() function.
Syntax
matrix(data, nrow, ncol, byrow, dimnames)
Example
M <- matrix(c(3:14), nrow = 4, byrow = TRUE)
print(M)
N <- matrix(c(3:14), nrow = 4, byrow = FALSE) # Elements are arranged
sequentially by column.
print(N) # Define the column and row names.
rownames = c("row1", "row2", "row3", "row4")
colnames = c("col1", "col2", "col3")
P <- matrix(c(3:14), nrow = 4, byrow = TRUE, dimnames = list(rownames,
colnames))
print(P)
19-08-2017KK Singh, RGUKT Nuzvid
21
18. Accessing Elements of a Matrix
Elements of a matrix can be accessed by using the column and row index of the element.
rownames = c("row1", "row2", "row3", "row4")
colnames = c("col1", "col2", "col3")
P <- matrix(c(3:14), nrow = 4, byrow = TRUE, dimnames = list(rownames, colnames)) # Create the matrix
print(P[1,3]) # Access the element at 3rd column and 1st row.
print(P[4,2]) # Access the element at 2nd column and 4th row.
print(P[2,]) # Access only the 2nd row.
print(P[,3]) # Access only the 3rd column.
It produces the following result −
[1] 5
[1] 13
col1 col2 col3
6 7 8
row1 row2 row3 row4
5 8 11 14
Q. A matrix has 100 columns, access all even index column.
19-08-2017KK Singh, RGUKT Nuzvid
23
19. 19-08-2017KK Singh, RGUKT Nuzvid
24
Factorsare the data objects which are used to categorize the data and store it as levels.
They can store both strings and integers.
They are useful in the columns which have a limited number of unique values.
Like "Male, "Female" and True, False etc. They are useful in data analysis for statistical modeling.
Factors are created using the factor () function by taking a vector as input.
Example
# Create a vector as input.
data <- c("East","West","East","North","North","East","West","West","West","East","North")
print(data)
print(is.factor(data)) # Apply the factor function.
factor_data <- as.factor(data)
print(factor_data)
print(is.factor(factor_data))
It produces the following result −
[1] "East" "West" "East" "North" "North" "East" "West" "West" "West" "East" "North"
[1] FALSE
[1] East West East North North East West West West East North Levels: East North West
[1] TRUE
20. 19-08-2017KK Singh, RGUKT Nuzvid
25
Factors in Data Frame
On creating any data frame with a column of text data,
R treats the text column as categorical data and creates factors on it.
# Create the vectors for data frame.
height <- c(132,151,162,139,166,147,122)
weight <- c(48,49,66,53,67,52,40)
gender <- c("male","male","female","female","male","female","male")
input_data <- data.frame(height,weight,gender)
print(input_data) # Test if the gender column is a factor.
print(as.factor(input_data$gender)) # Print the gender column so see the levels.
print(input_data$gender)