R Programming: Introduction to Vectors

In this tutorial, we explore the most basic data structure in R, the vector. We cover everything from creating vectors to subsetting them in different ways.

Published in: Education

  1. R2 Academy, R Programming: Vectors (www.r-squared.in/git-hub)
  2. Course Material: All the material related to this course is available on our website. Scripts can be downloaded from GitHub. Videos can be viewed on our YouTube channel.
  3. Objectives: ➢ Understand the concept of vectors ➢ Learn to create vectors of different data types ➢ Understand coercion of different data types in vectors ➢ Perform simple operations on vectors ➢ Handle missing data in vectors ➢ Learn to index/subset vectors
  4. Introduction: A vector is the most basic data structure in R. It is a sequence of elements of the same data type. If the elements are of different types, they are coerced to a common type that can accommodate all of them. Vectors are generally created using the c() function, although other methods can be used depending on the data type of the vector being created. In this unit, we learn to create vectors of the following data types: ✓ Numeric ✓ Integer ✓ Character ✓ Logical
  5. Vectors: Examples by type and length: Integer (length 1): 1; Numeric (length 2): 3.5, 8.2; Logical (length 3): TRUE, FALSE, TRUE; Character (length 4): 'John', 'Jack', 'Jill', 'Jim'
  6. Numeric Vector: We will create a numeric vector using the c() function (the concatenate function), but you can use any function that creates a sequence of numbers. After that, we will use is.vector() to test whether it is a vector and class() to check its class/data type.
  7. Numeric Vector: We have used three different functions to create a sequence of numbers. We leave it as an assignment to the student to understand these functions using the help and documentation.
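The code shown on these two slides is not part of the transcript, so the following is only a sketch of what creating and checking a numeric vector might look like. The variable names and values are illustrative, and the choice of seq(), rep() and the colon operator is an assumption, since the slides do not name the three functions they used.

```r
# Create a numeric vector with c(); the values are illustrative
vect_num <- c(1.5, 2.7, 3.9)

is.vector(vect_num)   # TRUE: the object is a vector
class(vect_num)       # "numeric"

# Other ways to generate a sequence of numbers (assumed, not taken from the slides)
vect_seq   <- seq(from = 1, to = 10, by = 2.5)   # 1.0 3.5 6.0 8.5
vect_rep   <- rep(2.5, times = 3)                # 2.5 2.5 2.5
vect_colon <- 1:5                                # 1 2 3 4 5, stored as integer

# Built-in help for any of these functions
# ?seq
# ?rep
```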
  8. Integer Vector: Creating integer vectors is similar to creating numeric vectors, except that we need to instruct R to treat the values as integer and not as numeric. We will use the same methods that we used for creating numeric vectors. To specify that the data type is integer, we suffix the number with L.
  9. Integer Vector
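As a minimal sketch of the L suffix described above (the variable names and values are illustrative, not taken from the slides):

```r
# Without the L suffix, whole numbers are stored as numeric (double)
vect_dbl <- c(1, 2, 3)
class(vect_dbl)               # "numeric"

# With the L suffix, the same values are stored as integers
vect_int <- c(1L, 2L, 3L)
class(vect_int)               # "integer"
is.integer(vect_int)          # TRUE

# as.integer() converts an existing numeric vector
class(as.integer(vect_dbl))   # "integer"
```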
  10. Character Vector: A character vector may contain a single character, a word or a group of words. The elements must be enclosed in single or double quotes.
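A short sketch of a character vector, using made-up strings, might look like this:

```r
# Each quoted string is one element, whether it is a letter, a word or a phrase
vect_chr <- c("a", "word", 'a group of words')

class(vect_chr)    # "character"
length(vect_chr)   # 3
nchar(vect_chr)    # 1 4 16: number of characters in each element
```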
  11. Logical Vector: A vector of logical values contains the values TRUE and FALSE.
  12. Logical Vector: In fact, you can create an integer vector and coerce it to type logical, and vice versa.
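The conversion between integer and logical can be sketched as follows. The values are illustrative; as.logical() and as.integer() are the base R conversion functions.

```r
vect_lgl <- c(TRUE, FALSE, TRUE)
class(vect_lgl)              # "logical"

# Integer to logical: 0 becomes FALSE, any other value becomes TRUE
as.logical(c(0L, 1L, 5L))    # FALSE TRUE TRUE

# Logical to integer: FALSE becomes 0, TRUE becomes 1
as.integer(vect_lgl)         # 1 0 1

# Because TRUE counts as 1, sum() counts the TRUE values
sum(vect_lgl)                # 2
```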
  13. Naming Vector Elements: It is possible to name the different elements of a vector. We can use these names to access the elements after creating the vector. Let us look at an example:
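The original example is not included in the transcript. One possible version, with hypothetical names and values, is:

```r
# Assign names after creating the vector with names()
scores <- c(85, 92, 78)
names(scores) <- c("John", "Jack", "Jill")

# Or supply the names when creating the vector
ages <- c(John = 25, Jack = 30)

scores["Jack"]   # the element named Jack, with value 92
names(ages)      # "John" "Jack"
```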
  14. Vector Coercion: Vectors are homogeneous, i.e. all the elements of a vector must be of the same data type. If we try to create a vector by combining different data types, the elements are coerced to the most flexible type. The table below shows the order in which coercion occurs. Character is the most flexible data type, while logical is the least flexible. If you combine any other data type with character, all the elements are coerced to character. In the absence of character data, if any element is numeric, all the elements are coerced to numeric. Finally, if the data includes neither character nor numeric types, all the elements are coerced to integer. In the next slide, we look at a few examples. Data types and coerced type: Numeric, Integer, Character & Logical -> Character; Numeric, Integer & Logical -> Numeric; Integer & Logical -> Integer
  15. Vector Coercion
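The coercion rules can be checked with a few small vectors. This is a sketch with made-up values; the variable names are not from the slides.

```r
# Any character element forces the whole vector to character
mixed_chr <- c(1.5, 2L, "three", TRUE)
mixed_chr          # "1.5" "2" "three" "TRUE"
class(mixed_chr)   # "character"

# Without character data, a numeric element forces numeric
mixed_num <- c(1.5, 2L, TRUE)
mixed_num          # 1.5 2.0 1.0
class(mixed_num)   # "numeric"

# Integer and logical only: the result is integer
mixed_int <- c(2L, TRUE, FALSE)
mixed_int          # 2 1 0
class(mixed_int)   # "integer"
```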
  16. Vector Coercion: To summarize, below is the order in which coercion takes place: Logical -> Integer -> Numeric -> Character
  17. Vector Operations: In this section, we look at simple operations that can be performed on vectors in R. Remember that the nature of the operations depends on the type of elements in the vector. Let us look at some examples:
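The examples from the slides are not in the transcript. The sketch below uses illustrative values to show element-wise arithmetic on numeric vectors of equal length, plus one operation that applies to character vectors instead.

```r
x <- c(2, 4, 6, 8)
y <- c(1, 2, 3, 4)

# Arithmetic is applied element by element
x + y     # 3 6 9 12
x * y     # 2 8 18 32
x / y     # 2 2 2 2
x ^ 2     # 4 16 36 64

# Summary functions reduce the vector to a single value
sum(x)    # 20
mean(x)   # 5

# Character vectors support string operations rather than arithmetic
toupper(c("a", "b"))   # "A" "B"
```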
  18. Vector Operations: In the previous examples, the length, i.e. the number of elements, of the vectors was the same. What happens if the lengths of the vectors are unequal? In such cases, the shorter vector is recycled to match the length of the longer vector. A few examples will make this concept clear.
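Recycling can be sketched as follows, with illustrative vectors. The last case wraps the operation in suppressWarnings() only to keep the output clean; without it, R completes the operation and warns that the longer length is not a multiple of the shorter one.

```r
long_vec  <- c(1, 2, 3, 4, 5, 6)
short_vec <- c(10, 20)

# short_vec is reused as 10 20 10 20 10 20
long_vec + short_vec   # 11 22 13 24 15 26

# A single value is recycled across every element
long_vec * 2           # 2 4 6 8 10 12

# Lengths 3 and 2: recycling still happens, normally with a warning
uneven <- suppressWarnings(c(1, 2, 3) + c(10, 20))
uneven                 # 11 22 13
```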
  19. Missing Data: Missing data is a reality. No matter how careful you are in collecting data for your analysis, chances are high that you will end up with some missing data. In R, missing values are represented by NA (not available). NA is neither a string nor a number; it simply indicates that the data is missing. In this section, we will focus on the following: ● Test for missing data ● Remove missing data ● Exclude missing data from analysis
  20. Missing Data: We first create a vector with missing values. After that, we use is.na() to test whether the vector contains missing data. The is.na() function returns a vector of logical values equal in length to the vector being tested: TRUE if data is missing and FALSE otherwise. The complete.cases() function works in a similar way but with the opposite sense: it returns TRUE where data is present and FALSE where it is missing. Both return a vector of logical values indicating the presence/absence of missing data.
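A minimal sketch of both tests, using an illustrative vector (the name vect_na is hypothetical):

```r
vect_na <- c(4, NA, 7, NA, 1)

# TRUE marks a missing value
is.na(vect_na)            # FALSE TRUE FALSE TRUE FALSE

# TRUE marks a value that is present
complete.cases(vect_na)   # TRUE FALSE TRUE FALSE TRUE

# Count the missing values
sum(is.na(vect_na))       # 2
```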
  21. Omit Missing Data: The na.omit() function returns the vector after removing missing data. We may not always want to remove missing data, but in its presence computations such as sum() and mean() return NA. Let us see how we can address this issue:
  22. Exclude Missing Data: We can set the na.rm argument to TRUE in functions that support it, such as sum() and mean(), to ensure that missing data is excluded from computations.
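The two approaches, removing missing values with na.omit() and skipping them with na.rm, can be compared in a short sketch. The vector is illustrative.

```r
vect_na <- c(4, NA, 7, NA, 1)

# A missing value propagates through the computation
mean(vect_na)                  # NA

# na.omit() drops the missing values; as.vector() strips the extra
# attribute that records which positions were removed
cleaned <- as.vector(na.omit(vect_na))
cleaned                        # 4 7 1

# na.rm = TRUE keeps the vector intact and skips NA during the computation
mean(vect_na, na.rm = TRUE)    # 4
sum(vect_na, na.rm = TRUE)     # 12
```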
  23. Index/Subset Vectors: One of the most important steps in data analysis is selecting a subset of data from a bigger dataset. Indexing helps in retrieving individual values or a set of values that meet specific criteria. In this section, we look at various ways of indexing/subsetting a vector.
  24. Index Operator: [] is the index operator in R. We can use various expressions within [] to subset data. In R, index positions begin at 1 and not 0. To begin with, let us look at values in different index positions. The index operator is then used to access values at index positions 3 and 7. In the next example, we see what happens when we specify an out of range index.
  25. Out Of Range Index: In the first case, we specified the index as 0 and in the second case we used the index 11, which is greater than the length of the vector. R returns an empty vector in the first case and NA in the second.
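The slides' own vector is not in the transcript. The sketch below assumes a vector of 10 illustrative values so that positions 3, 7, 0 and 11 behave as described.

```r
vect <- c(12, 45, 7, 33, 28, 91, 64, 5, 19, 50)   # 10 elements

vect[3]    # 7:  value at index position 3
vect[7]    # 64: value at index position 7

# Out of range indexes
vect[0]    # numeric(0): an empty vector
vect[11]   # NA: there is no 11th element
```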
  26. Negative Index: Using a negative index returns the vector without the value at that index position; the original vector is not modified unless the result is assigned back to it. Unlike in some other languages, a negative index does not count backwards from the end of the vector. Let us look at an example to understand how negative indexes work in R.
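A small sketch of negative indexing, with illustrative values:

```r
vect <- c(12, 45, 7, 33, 28)

# Drop the element at position 2
vect[-2]             # 12 7 33 28

# Drop several positions at once
vect[-c(1, 5)]       # 45 7 33

# The original vector is unchanged
vect                 # 12 45 7 33 28

# To get the last element, use the length instead of a negative index
vect[length(vect)]   # 28
```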
  27. Subset Multiple Elements: If we do not specify anything within [], all the elements in the vector will be returned. We can specify the index of elements using any expression that generates a sequence as seen in the second example. In the last case, we subset elements from position 5 to end using the length of the vector.
  28. Subset Multiple Elements: What if we want elements that are not in a sequence, unlike those in the last example? In such cases, we have to create a vector of index positions and use it to subset the elements of the original vector.
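The cases described on these two slides can be sketched with an illustrative 10-element vector:

```r
vect <- c(12, 45, 7, 33, 28, 91, 64, 5, 19, 50)

# Empty brackets return every element
vect[]                 # 12 45 7 33 28 91 64 5 19 50

# A sequence of index positions
vect[1:4]              # 12 45 7 33

# From position 5 to the end, using the length of the vector
vect[5:length(vect)]   # 28 91 64 5 19 50

# Positions that do not form a sequence: pass a vector of indexes
vect[c(2, 6, 9)]       # 45 91 19
```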
  29. Subset Named Vectors: Vectors can be subset using the names of the elements. When using element names for subsetting, ensure that the names are enclosed in single or double quotes; otherwise R looks for an object with that name and, if none exists, returns an error.
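A sketch of subsetting by name, using a hypothetical named vector. tryCatch() is used here only so that the unquoted case can be shown without stopping the script.

```r
scores <- c(John = 85, Jack = 92, Jill = 78)

# Quoted names select the matching elements
scores["Jack"]               # Jack 92
scores[c("John", "Jill")]    # John 85, Jill 78

# Without quotes, R looks for an object called Jack; none exists here,
# so the expression fails and we capture the error message instead
msg <- tryCatch(scores[Jack], error = function(e) conditionMessage(e))
msg
```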
  30. Subset Using Logical Values: Logical values can be used to subset vectors. They are not very flexible but can be used in simple indexing. When the logical vector is shorter than the vector being subset, its values are recycled to match the length of that vector.
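Subsetting with logical values, including recycling, can be sketched as follows (illustrative values):

```r
vect <- c(12, 45, 7, 33, 28, 91)

# One logical value per element: TRUE keeps, FALSE drops
vect[c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)]   # 12 7 28

# A shorter logical vector is recycled, giving the same result
vect[c(TRUE, FALSE)]                             # 12 7 28

# A single TRUE keeps everything, a single FALSE keeps nothing
vect[TRUE]                                       # 12 45 7 33 28 91
vect[FALSE]                                      # numeric(0)
```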
  31. Index Using Logical Expressions: Logical expressions can be used to extract elements that meet specified criteria. This method is the most flexible and useful, as we can combine multiple conditions using relational and logical operators. Before we use logical expressions, let us spend some time understanding comparison and logical operators, as we will be using them extensively in all the modules hereafter. R provides the following comparison operators (operator, name, example with x <- 5, result): > greater than, x > 5, FALSE; >= greater than or equal to, x >= 5, TRUE; < less than, x < 5, FALSE; <= less than or equal to, x <= 5, TRUE; == equal to, x == 5, TRUE; != not equal to, x != 5, FALSE
  32. Comparison Operators: The output from a comparison operator is always a logical value, i.e. TRUE or FALSE. To begin with, let us see how we can use comparison operators to subset elements in a vector.
  33. Comparison Operators
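Since the slide examples are not in the transcript, here is a sketch of comparison operators used for subsetting. The vector and thresholds are illustrative.

```r
vect <- c(12, 45, 7, 33, 28, 91)

# A comparison returns one logical value per element
vect > 30            # FALSE TRUE FALSE TRUE FALSE TRUE

# Using that result inside [] keeps the elements where it is TRUE
vect[vect > 30]      # 45 33 91
vect[vect <= 12]     # 12 7
vect[vect != 7]      # 12 45 33 28 91

# which() returns the index positions instead of the values
which(vect == 33)    # 4
```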
  34. Logical Operators: R provides the following logical operators (operator, name): ! not; | or; & and
  35. Truth Table - ! (NOT): Truth table for the ! operator, with examples using x <- 5 (x, !x, example, result): True, False, x >= 5 and !(x >= 5), TRUE and FALSE; False, True, x < 5 and !(x < 5), FALSE and TRUE
  36. Truth Table - & (AND): Truth table for the & operator, with examples using x <- 5 and y <- 8 (x, y, x & y, example, result): False, False, False, (x > 5 & y > 8), FALSE; False, True, False, (x > 5 & y <= 8), FALSE; True, False, False, (x <= 5 & y > 8), FALSE; True, True, True, (x <= 5 & y <= 8), TRUE
  37. Truth Table - | (OR): Truth table for the | operator, with examples using x <- 5 and y <- 8 (x, y, x | y, example, result): False, False, False, (x > 5 | y > 8), FALSE; False, True, True, (x > 5 | y <= 8), TRUE; True, False, True, (x <= 5 | y > 8), TRUE; True, True, True, (x <= 5 | y <= 8), TRUE
  38. Logical Operators: Let us combine comparison and logical operators to create logical expressions and see how such expressions can be used to subset the elements of a vector.
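Combining the two kinds of operators might look like the sketch below. The vector and the conditions are illustrative, not taken from the slides.

```r
vect <- c(12, 45, 7, 33, 28, 91)

# AND: both conditions must hold
vect[vect > 10 & vect < 40]    # 12 33 28

# OR: at least one condition must hold
vect[vect < 10 | vect > 50]    # 7 91

# NOT: invert a condition
vect[!(vect > 30)]             # 12 7 28
```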
  39. Summary: ● Vectors are the most basic data structures in R. ● They are homogeneous, i.e. they can hold a single type of data. ● When multiple data types are combined, they are coerced to the most flexible data type. ● The index of vectors begins at 1 and not zero. ● An index greater than the length of the vector returns NA; index 0 returns an empty vector. ● A negative index drops the element at the specified index position. ● Logical expressions can be used to index/subset vectors.
  40. Next Steps: In the next module: ✓ Create matrix ✓ Matrix operations ✓ Combine matrices ✓ Index/subset matrix ✓ Dissolve matrix
  41. Visit Rsquared Academy for tutorials on: → R Programming → Business Analytics → Data Visualization → Web Applications → Package Development → Git & GitHub
