# Basic R

### Transcript

• 1. Prof. Dr. Roberto Dantas de Pinho, roberto.pinho@mct.gov.br, 26/jul/2012. This presentation is based on courses by Dr. Paulo Justiniano Ribeiro Jr (UFPR) & Dr. Cosme Marcelo Furtado Passos da Silva (FIOCRUZ). SEXECASCAV|CGIN
• 2.
  ◦ A First R Session
  ◦ Objects
  ◦ Data input
    ▪ Some analyses
  ◦ Filter & select
  ◦ Saving your work
  ◦ Changing data
  ◦ Sums
  ◦ Now that we have aggregated data...
  ◦ Linear regression
  And lots of other things along the way.
• 3.
  ◦ Install, configuration etc.
  ◦ R internals, structure etc.
  ◦ Handling large datasets
  ◦ Fancy plots beyond the basics
• 4.
  ◦ You can use R to evaluate some simple expressions. Just type:
      `1 + 2 + 3`
      `2 + 3 * 4`
      `3/2 + 1`
      `4 * 3**3`
  ◦ R is an environment and a language.
• 5.
  ◦ The R environment lets you submit commands and see the results immediately.
  ◦ The R language is the set of rules and functions that the R environment can run.
  ◦ You may keep command sequences (scripts) for later use.
• 6.
  ◦ Several functions are available. A couple of simple examples:
    ▪ `sqrt(2)` computes √2
    ▪ `abs(-10)` computes |−10|
    ▪ `sin(pi)` computes sin(π)
  ◦ `pi` is a constant in R; its value is already defined.
• 7.
  ◦ Results, input data, tables etc. are all stored in R as objects.
  ◦ Objects have a name, content and type, and are stored in memory. Ex.
    ▪ Create the object "x" with the number 10: `x <- 10`
    ▪ Show the content of x: `x`
  R is case-sensitive: `abc` is different from `ABC`.
• 8.
  ◦ Try:
      `X <- sqrt(2)`
      `Y = sin(pi)`
      `Z = sqrt(X+Y)`
    For assignments like these, `<-` and `=` are equivalent.
  ◦ In the above examples, X, Y and Z store the result of each operation.
  In R, there are always many ways of doing the same thing. We will try to focus on a single way of doing each task.
• 9.
  ◦ What is the value of C at the end of the script?
      `A = 1`
      `B = 2`
      `C = A + B`
      `A = 5`
      `B = 5`
  ◦ Why?
• 11.
  ◦ A tool that makes it easier to use R
  ◦ Manages work windows
  ◦ Easier access to objects, scripts, command history and plots
• 12. Editing scripts & object view; Console
• 13. Object list & history; Help, plots, files & packages
• 14.
  ◦ Vectors are objects that hold multiple values of a single type.
  ◦ The function `c( )` ("c" from concatenate) groups values to build a vector: `X = c(1,3,6)`
  ◦ To access vector elements: `X[1]`, `X[3]`
• 15.
  ◦ Operations may be performed and functions applied over the whole vector. Ex.
      `X = c(1,3,5)`
      `Y = c(10,20,30)`
      `X+Y` gives `[1] 11 23 35`
      `sum(X)` gives `[1] 9`
  ◦ How about `X + 100`? It gives `[1] 101 103 105`, due to the recycling rule.
• 16.
  ◦ When the size of an object required by an operation is different from its actual size, the available data is repeated as needed.
  ◦ As X has 3 elements, `X+100` is the same as `X + c(100,100,100)`.
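The recycling behaviour described above can be checked directly. This is a minimal sketch using the same `X`; the vector `long` and its values are illustrative assumptions, not part of the slides.

```r
X <- c(1, 3, 5)

# A single value is repeated to match the length of X
a <- X + 100
b <- X + c(100, 100, 100)
identical(a, b)    # TRUE

# Recycling also applies to longer operands: c(10, 20) is reused
# in turn for each pair of elements of `long`
long <- 1:6
long * c(10, 20)   # 10 40 30 80 50 120
```

When the longer length is not a multiple of the shorter one, R still recycles but issues a warning.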
• 17.
      `> X = 1:10`
      `> X`
      `[1] 1 2 3 4 5 6 7 8 9 10`
      `> X = seq(0,1,by=0.1)`
      `> X`
      `[1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0`
      `> rep("a",5)`
      `[1] "a" "a" "a" "a" "a"`
      `> names = c("fulano", "beltrano", "cicrano")`
      `> names`
      `[1] "fulano" "beltrano" "cicrano"`
      `> letras = letters[1:5]`
      `> letras`
      `[1] "a" "b" "c" "d" "e"`
      `> letras = LETTERS[1:5]`
      `> letras`
      `[1] "A" "B" "C" "D" "E"`
• 18.
  ◦ numeric
    ▪ `is.numeric( )`
    ▪ `as.numeric( )`
  ◦ integer
    ▪ `is.integer( )`
    ▪ `as.integer( )`
  ◦ character
    ▪ `is.character( )`
    ▪ `as.character( )`
  ◦ logical
    ▪ `T == TRUE == 1`
    ▪ `F == FALSE == 0`
  `A == B` means "is A equal to B?"
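As a rough illustration of the type checks and conversions listed above, the sketch below converts a character vector step by step. The values in `x` are made up for the example.

```r
x <- c("10", "20", "abc")              # character vector (illustrative values)
is.character(x)                        # TRUE

# "abc" cannot be read as a number, so it becomes NA (with a warning)
n <- suppressWarnings(as.numeric(x))
n                                      # 10 20 NA
is.numeric(n)                          # TRUE
is.integer(n)                          # FALSE: as.numeric() returns doubles

i <- as.integer(n)
is.integer(i)                          # TRUE

# Logical values behave as 1 and 0 in arithmetic
sum(c(TRUE, FALSE, TRUE))              # 2
TRUE == 1                              # TRUE
```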
• 19.
  ◦ A matrix is a vector arranged in rows & columns:
      `m1 <- matrix(1:12, ncol = 3)`
      `     [,1] [,2] [,3]`
      `[1,]    1    5    9`
      `[2,]    2    6   10`
      `[3,]    3    7   11`
      `[4,]    4    8   12`
• 20.
      `> length(m1)`
      `[1] 12`
      `> dim(m1)`
      `[1] 4 3`
      `> nrow(m1)`
      `[1] 4`
      `> ncol(m1)`
      `[1] 3`
• 21.
      `> m1[1, 2]`
      `[1] 5`
      `> m1[2, 2]`
      `[1] 6`
      `> m1[ , 2]`
      `[1] 5 6 7 8`
      `> m1[3, ]`
      `[1] 3 7 11`
  `m1[1,2] = 99` changes the value of the cell.
• 22.
      `> m1[1:2, 2:3]`
      `     [,1] [,2]`
      `[1,]    5    9`
      `[2,]    6   10`
• 23.
      `> colnames(m1)`
      `NULL`
      `> rownames(m1)`
      `NULL`
      `> colnames(m1) = c("C1","C2","C3")`
      `> m1[,"C1"]`
      `[1] 1 2 3 4`
  `t(m1)` gives the transpose of m1.
• 24.
  ◦ An array is a "matrix" with many dimensions. Ex. 3 dimensions:
      `ar1 <- array(1:24, dim = c(3, 4, 2))`
      `, , 1` (1st matrix)
      `     [,1] [,2] [,3] [,4]`
      `[1,]    1    4    7   10`
      `[2,]    2    5    8   11`
      `[3,]    3    6    9   12`
      `, , 2` (2nd matrix)
      `     [,1] [,2] [,3] [,4]`
      `[1,]   13   16   19   22`
      `[2,]   14   17   20   23`
      `[3,]   15   18   21   24`
  For a 3-dimensional array, you might visualize the 3rd dimension as a collection of matrices.
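Indexing an array works like indexing a matrix, with one index per dimension. The following sketch reuses `ar1` from the slide; the name `m2` is an assumption introduced for the example.

```r
ar1 <- array(1:24, dim = c(3, 4, 2))
dim(ar1)            # 3 4 2

ar1[2, 3, 1]        # 8: row 2, column 3 of the first matrix
ar1[2, 3, 2]        # 20: the same cell in the second matrix

# Leaving an index empty keeps that whole dimension
m2 <- ar1[, , 2]    # the second matrix, 3 rows by 4 columns
dim(m2)             # 3 4
```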
• 25. How to work with this kind of data?

| Year | State (UF) | Agency code | Agency | Budget unit (UO) code | Budget unit | Function | Subfunction | Program | Action | Locator | Action description | R&D value | ACTC value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2010 | AC | 1 | Adm direta e indireta | 1 | Adm direta e indireta | 19 | 121 | 2056 | 1548 | | MODERNIZAÇÃO DO SISTEMA DE PLANEJAMENTO E GESTÃO DA SDCT | R\$ - | R\$ 16.655,00 |
| 2010 | AC | 1 | Adm direta e indireta | 1 | Adm direta e indireta | 19 | 121 | 2056 | 1549 | | PROGRAMA DE COOPERAÇÃO TÉCNICA E FINANCEIRA COM INSTIT. NAC. INTERN. GOVERNAMENTAIS E NÃO GOVERNAMENTAIS | R\$ - | R\$ 715.000,00 |
| 2010 | AC | 1 | Adm direta e indireta | 1 | Adm direta e indireta | 19 | 122 | 2009 | 2224 | | MANUTENÇÃO DO GABINETE DO SECRETÁRIO | R\$ - | R\$ 27.732,11 |
| 2010 | AC | 1 | Adm direta e indireta | 1 | Adm direta e indireta | 19 | 122 | 2009 | 2227 | | DEPARTAMENTO DE GESTÃO INTERNA | R\$ - | R\$ 2.266.169,90 |
• 26.
  ◦ Each column of a data.frame has its own data type:
      `d = data.frame(letters[1:4], 1:4, 10.5)`
      `  letters.1.4. X1.4 X10.5`
      `1            a    1  10.5`
      `2            b    2  10.5`
      `3            c    3  10.5`
      `4            d    4  10.5`
    We will be using data.frames most of the time.
  ◦ We can change column names:
      `colnames(d) = c("letra","num", "valor")`
      `colnames(d)`
      `[1] "letra" "num" "valor"`
      `d$valor # selects column "valor" from d`
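Column names can also be given when the data.frame is created, which avoids the automatic names shown above. This is a small sketch of that alternative, not the approach used on the slide; it reuses the slide's column names.

```r
# Name the columns at creation time instead of renaming afterwards
d <- data.frame(letra = letters[1:4], num = 1:4, valor = 10.5)

colnames(d)        # "letra" "num" "valor"
sapply(d, class)   # one class per column
d$valor            # the single value 10.5 was recycled to 4 rows

# Rows can be filtered with a condition on a column
d[d$num > 2, ]
```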
• 27. list and factor: later...
• 28.
  ◦ Several possible sources.
  ◦ We will see:
    ▪ Keyboard: `x = scan( )`
    ▪ Excel files
    ▪ CSV files
    ▪ SQL databases
• 29.
      `require(XLConnect)`
      `wb <- loadWorkbook("AC_PDACTCaula.xls")`
      `plan1 <- readWorksheet(wb, sheet = 1)`
      `str(plan1)`
      `View(plan1)`
• 30. `require(XLConnect)`
  ◦ Loads the package XLConnect.
  ◦ Packages are sets of functions and data that add capabilities to R.
  ◦ If the package is not installed:
      `setInternet2() # only on Windows, and only needed in old versions of R`
      `install.packages("XLConnect", dep=T)`
• 31. Creates an object "wb" that points to the Excel file: `wb <- loadWorkbook("AC_PDACTCaula.xls")`
• 32. Load the data from the first sheet into an object called "plan1": `plan1 <- readWorksheet(wb, sheet = 1)`
  R functions identify parameters by order (here `wb`), by name (here `sheet = 1`), or both.
• 33.
  ◦ Show the structure of the new object: `str(plan1)`
    `str()` works with any R object. It is very useful.
  ◦ Show the data in a window: `View(plan1)`
    In RStudio, you may click on an object in the objects list to the same effect.
• 34. `args(readWorksheet) # shows available parameters`
      `function (object,   # workbook "wb"`
      `          sheet,    # number or name of the sheet`
      `          startRow,`
      `          startCol,`
      `          endRow,`
      `          endCol,`
      `          header    # T or F: use first line to name columns`
      `          )`
• 35.
  ◦ Comma-separated values
  ◦ Very popular format for data interchange
  ◦ Other separators are also popular: `;`, `<tab>`, `<space>`
  ◦ Example:
      `uf   ano    valido   somaactc      somapd`
      `AC   2009   1        34296430.67   3630841.04`
      `AC   2010   1        29397712.04   3579715.12`
      `AL   2009   1        12650160.51   8903714.41`
• 36.
  ◦ Example (tab-separated, with decimal commas):
      `uf   ano    valido   somaactc      somapd`
      `AC   2009   1        34296430,67   3630841,04`
      `AC   2010   1        29397712,04   3579715,12`
      `AL   2009   1        12650160,51   8903714,41`
  ◦ To read this file:
      `d = read.csv(file="AgregaUF20110930_b.txt",`
      `    header=T,  # uses first line as column names`
      `    sep="\t",  # separator is <tab>`
      `    dec=","    # decimal separator is a comma`
      `)`
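To try these `read.csv` options without the original file, the same three rows can be read from a string through the `text` argument. This is a self-contained sketch; the object name `raw` is an assumption introduced for the example.

```r
# The example rows as tab-separated text with decimal commas
raw <- paste(
  "uf\tano\tvalido\tsomaactc\tsomapd",
  "AC\t2009\t1\t34296430,67\t3630841,04",
  "AC\t2010\t1\t29397712,04\t3579715,12",
  "AL\t2009\t1\t12650160,51\t8903714,41",
  sep = "\n"
)

d <- read.csv(text = raw,
              header = TRUE,  # first line holds the column names
              sep = "\t",     # fields are separated by tabs
              dec = ",")      # decimal separator is a comma

str(d)                # somaactc and somapd are read as numbers
is.numeric(d$somapd)  # TRUE
```

Base R also offers `read.delim2()`, whose defaults are a tab separator and a decimal comma.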
• 37.
  ◦ `str(d) # structure`
  ◦ `summary(d) # statistical summary`
  ◦ `head(d) # first rows`
  ◦ `tail(d) # last rows`
  ◦ `plot(d) # standard plot`
• 38.
      `require(RODBC)`
      `canal <- odbcConnect("base_ODBC", case="tolower", uid="user", pwd="password")`
      `d <- sqlQuery(canal, "select * from table where year = 2010", as.is=T)`
• 39. How to get the sum of the values in a data.frame column?
      `sum(data.frame$column)`
      `sum(d$somapd)`
      `[1] NA`
• 40.
  ◦ `NA`: Not Available
    ▪ Missing values.
  ◦ `NaN`: Not a Number
    ▪ A value that cannot be represented as a number.
  ◦ `Inf` & `-Inf`
    ▪ Plus and minus infinity.
  Try: `c(-1,0,1)/0`
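The special values above can be told apart with a few base R predicates. A minimal sketch, with the vector `x` made up for the example:

```r
v <- c(-1, 0, 1) / 0
v                      # -Inf  NaN  Inf
is.nan(v)              # FALSE  TRUE FALSE
is.finite(v)           # FALSE FALSE FALSE

x <- c(1, NA, 3)       # illustrative vector with one missing value
is.na(x)               # FALSE  TRUE FALSE
sum(x)                 # NA: any missing value makes the sum missing
sum(x, na.rm = TRUE)   # 4: missing values are dropped first

is.na(NaN)             # TRUE: is.na() also flags NaN
```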
• 41.
  ◦ Sum: `sum(d$somapd, na.rm=T)` gives `[1] 4836882446`
  ◦ Mean: `mean(d$somapd, na.rm=T)`
  ◦ Median: `median(d$somapd, na.rm=T)`
  ◦ Standard deviation: `sd(d$somapd, na.rm=T)`
• 42. For these examples: `milsa = read.csv("milsaText.txt", sep="\t", head=T, dec=".")`
• 43.
  ◦ Absolute frequencies: `table(milsa$civil)`
  ◦ Relative frequencies: `table(milsa$civil) / length(milsa$civil)` or `prop.table(table(milsa$civil))`
  ◦ Pie chart: `pie(table(milsa$civil))`
• 44. With `attach(milsa)`:
  ◦ Absolute frequencies: `table(civil)`
  ◦ Relative frequencies: `table(civil) / length(civil)` or `prop.table(table(civil))`
  ◦ Pie chart: `pie(table(civil))`
  After: `detach(milsa)`
• 45.
  ◦ Bar plot: `barplot(table(instrucao))`
  ◦ Remember:
    ▪ Any result may be saved as an object to be used later.
      `instrucao.tb = table(instrucao)`
      `barplot(instrucao.tb)`
      `pie(instrucao.tb)`
• 46.
  ◦ Try: `prop.table(filhos)`
  ◦ Solution: `prop.table(table(filhos))`
  ◦ Other solution:
    ▪ Filter out elements with NA
• 47.
    ▪ `mean(filhos, na.rm=T)`
    ▪ `median(filhos, na.rm=T)`
    ▪ `range(filhos, na.rm=T)`
    ▪ `var(filhos, na.rm=T) # variance`
    ▪ `sd(filhos, na.rm=T) # standard deviation`
  ◦ Quantiles:
    ▪ `filhos.quartis = quantile(filhos, na.rm=T)`
  ◦ Interquartile range (3rd quartile minus 1st quartile):
    ▪ `filhos.quartis[4] - filhos.quartis[2]`
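By default `quantile()` returns five values (minimum, the three quartiles and maximum), so the first quartile is the second element and the third quartile is the fourth. The sketch below checks the interquartile range against the built-in `IQR()`; the values in `filhos` are hypothetical.

```r
filhos <- c(0, 1, 2, NA, 3, 1, 2, 5, NA, 2)   # hypothetical values

q <- quantile(filhos, na.rm = TRUE)
names(q)                      # "0%" "25%" "50%" "75%" "100%"

# Interquartile range: 75% quantile minus 25% quantile
iqr_manual <- unname(q[4] - q[2])

# Base R provides the same computation directly
iqr_builtin <- IQR(filhos, na.rm = TRUE)
all.equal(iqr_manual, iqr_builtin)   # TRUE
```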
• 48.
  ◦ `plot(milsa)`
  ◦ `plot(salario ~ ano)`
  ◦ `hist(salario)`
  ◦ `boxplot(salario)`
  ◦ `stem(salario)`
• 49.
  ◦ Selecting some rows:
      `milsaNovo = milsa[c(1,3,5,6) , ]`
  ◦ Selecting some columns:
      `milsaNovo = milsa[ , c(1,3,5)]`
      `milsaNovo = milsa[ , c("funcionario", "instrucao", "salario")]`
  ◦ Attention:
    ▪ New copy: `milsaNovo = milsa[c(1,3,5,6) , ]`
    ▪ Replaces the previous object: `milsa = milsa[c(1,3,5,6) , ]`
• 50.
  ◦ Who earns above the median?
      `acimamediana = milsa[ salario > median(salario), ]`
  ◦ Who is married and has a higher education degree?
      `casadoEsuperior = milsa[ civil=="casado" & instrucao == "Superior", ]`
  AND (`&`): both must be true.
• 51.
  ◦ Who is married or has a higher education degree?
      `casadoOUsuperior = milsa[ civil=="casado" | instrucao == "Superior", ]`
  OR (`|`): at least one must be true.
• 52. NOT:
      `milsaLimpo = milsa[!is.na(salario), ]`
  ◦ In English:
    ▪ New table: `milsaLimpo`
    ▪ equals: `=`
    ▪ Old table: `milsa`
    ▪ Select: `[`
    ▪ Rows where salary is not NA: `!is.na(salario)`
    ▪ And all columns: `, ]`
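The AND, OR and NOT filters from the last three slides can be tried on a small stand-in table. The data below is invented for illustration (the real `milsa` file is not reproduced here), and columns are referenced as `milsa$...` so the sketch does not depend on `attach()`.

```r
# Hypothetical five-row stand-in for the milsa data
milsa <- data.frame(
  civil     = c("casado", "solteiro", "casado", "solteiro", "casado"),
  instrucao = c("Superior", "1oGrau", "2oGrau", "Superior", "Superior"),
  salario   = c(17.2, 4.0, NA, 15.9, 19.4)
)

# AND: both conditions must be TRUE for a row to be kept
keep <- milsa$civil == "casado" & milsa$instrucao == "Superior"
keep                    # TRUE FALSE FALSE FALSE TRUE
milsa[keep, ]

# OR: at least one condition must be TRUE
either <- milsa$civil == "casado" | milsa$instrucao == "Superior"
sum(either)             # 4

# NOT: keep only rows whose salary is not missing
milsaLimpo <- milsa[!is.na(milsa$salario), ]
nrow(milsaLimpo)        # 4
```

The condition inside `[ , ]` is just a logical vector with one entry per row, which is why it can be stored and reused.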
• 53.
  ◦ How many are married?
      `sum(civil=="casado")`
    ▪ or `table(civil)["casado"]`
  ◦ How many are married and have a higher education degree?
      `sum(civil=="casado" & instrucao == "Superior")`
    ▪ or `table(civil,instrucao)["casado","Superior"]`
• 54. milsaNovo is equal to milsa, without rows 1, 2 & 5 and without columns 1 & 8: `milsaNovo = milsa[-c(1,2,5), -c(1,8)]`
• 55. For which rows is this TRUE?
      `sup = which(instrucao=="Superior")`
      `sup`
      `[1] 19 24 31 33 34 36`
  ◦ May use it again later:
    ▪ `mean(milsa[sup,"salario"])`
    ▪ Mean salary for those with higher education
  Advantage: it is not a copy!
• 56.
  ◦ A random sample of 10 rows from milsa:
      `amostra = sample(x=nrow(milsa), size=10)`
      `amostra`
      `[1] 12 29 1 3 17 14 26 33 20 31`
  ◦ Mean salary for the sample:
      `mean(milsa[amostra,"salario"])`
• 57.
  ◦ By number of children: `milsa[order(filhos),]`
  ◦ Decreasing: `milsa[order(filhos, decreasing=T),]`
  ◦ By number of children and then age: `milsa[order(filhos,ano),]`
  ◦ 10 youngest: `head(milsa[order(ano),], 10)`
  ◦ 10 oldest: `tail(milsa[order(ano),], 10)`
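`order()` does not sort the data itself; it returns the row positions in sorted order, which are then used as a row index. A small sketch with a made-up table (`df` and its values are assumptions for the example):

```r
# Hypothetical data: number of children and age
df <- data.frame(nome   = c("a", "b", "c", "d"),
                 filhos = c(2, 0, 2, 1),
                 ano    = c(40, 35, 28, 50))

# Sort by children, breaking ties by age
ord <- order(df$filhos, df$ano)
ord                # 2 4 3 1
df[ord, ]

# Decreasing order on a single column
desc <- df[order(df$filhos, decreasing = TRUE), ]
desc$filhos        # 2 2 1 0
```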
• 58.
  ◦ Removing an object:
    ▪ `rm(milsaNovo)`
  ◦ Removing every object:
    ▪ `rm(list = ls())`
  `ls()`: list of current objects
• 59.
  ◦ List objects are collections that may include different types of objects.
      `lis = list(A=1:10, B="Text", C = matrix(1:9,ncol=3))`
  ◦ They are often used as parameters to functions or as result sets from them.
  ◦ `lis[1:2]`
    ▪ A list with the first two objects from lis (A & B)
  ◦ `lis[[1]]`
    ▪ The object stored at the first position of the list (the content of A). The same as `lis$A`.
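The difference between single and double brackets on a list is easy to verify. This sketch reuses the `lis` object from the slide:

```r
lis <- list(A = 1:10, B = "Text", C = matrix(1:9, ncol = 3))

# Single brackets return a (shorter) list
class(lis[1])                 # "list"
names(lis[1:2])               # "A" "B"

# Double brackets and $ return the stored object itself
class(lis[[1]])               # "integer"
identical(lis[[1]], lis$A)    # TRUE

# Elements keep their own structure, e.g. the matrix stored in C
lis$C[2, 3]                   # 8
```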
• 60.
  ◦ Saving all objects: `save.image("file.RData")`
  ◦ Saving selected objects: `save(x, y, file="file.RData")`
  ◦ Loading: `load("file.RData")`
  Several "loads": objects with distinct names are kept in memory.
• 61.
  ◦ Saving a ".R" script that reproduces the desired output.
  ◦ Advantages:
    ▪ It may be used to document the work performed;
    ▪ It may be run again over updated data to update the results.
  ◦ Hybrid model:
    ▪ Save intermediate results that take a long time to process. Update them less often.
• 62. Add a column to a data.frame: `milsa$idade = milsa$ano + milsa$mes/12`
• 63 to 67. [Figures: diagrams with two sets labelled X and Y; slide 63 carries the annotation 6+3+5=14.]
• 68. Example: [figure] & [figure]
• 69.
  ◦ Only rows found in both data.frames:
      `merge(x=milsa, y=tabInst, by.x="instrucao", by.y="desc", all=F)`
  ◦ All rows from data.frame x:
      `merge(x=milsa, y=tabInst, by.x="instrucao", by.y="desc", all.x=T)`
• 70.
  ◦ All rows from data.frame y:
      `merge(x=milsa, y=tabInst, by.x="instrucao", by.y="desc", all.y=T)`
  ◦ All rows from data.frames x & y:
      `merge(x=milsa, y=tabInst, by.x="instrucao", by.y="desc", all=T)`
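The four `merge()` variants differ only in which unmatched rows they keep. The sketch below uses two tiny invented tables; the contents of `milsa` and `tabInst`, the column `anos`, and the levels "Mestrado" and "Doutorado" are assumptions made so that each table has one row without a match.

```r
# Hypothetical tables: "Mestrado" has no match in tabInst,
# and "Doutorado" has no match in milsa
milsa <- data.frame(funcionario = 1:4,
                    instrucao = c("1oGrau", "Superior", "2oGrau", "Mestrado"))
tabInst <- data.frame(desc = c("1oGrau", "2oGrau", "Superior", "Doutorado"),
                      anos = c(8, 11, 15, 21))

inner <- merge(milsa, tabInst, by.x = "instrucao", by.y = "desc")
left  <- merge(milsa, tabInst, by.x = "instrucao", by.y = "desc", all.x = TRUE)
right <- merge(milsa, tabInst, by.x = "instrucao", by.y = "desc", all.y = TRUE)
full  <- merge(milsa, tabInst, by.x = "instrucao", by.y = "desc", all = TRUE)

nrow(inner)   # 3: only matched rows
nrow(left)    # 4: every row of milsa, NA in anos for "Mestrado"
nrow(right)   # 4: every row of tabInst, NA in funcionario for "Doutorado"
nrow(full)    # 5: every row of both tables
```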
• 71.
  ◦ From text to numeric: `d.f$novaColuna = as.numeric(d.f$coluna)`
  ◦ From numeric to text: `d.f$novaColuna = as.character(d.f$coluna)`
  ◦ From text or numeric to integer: `d.f$novaColuna = as.integer(d.f$coluna)`
  Integers save memory.
• 72.
  ◦ Factors are a representation for categorical data:
    ▪ Nominal
      ▪ "married", "single"
    ▪ Ordinal
      ▪ "tall", "short"
  ◦ They ensure proper treatment of these variables by many R functions.
  Factors save memory.
• 73.
  ◦ Nominal:
      `milsa$fatorcivil = factor(milsa$civil, ordered=F)`
      `$fatorcivil : Factor w/ 2 levels "casado","solteiro": 2 1 1 2 2 1 2 2 1 2`
  ◦ Ordinal:
      `milsa$fatormes = factor(milsa$mes, ordered=T)`
      `$fatormes : Ord.factor w/ 12 levels "0"<"1"<"2"<"3"<..: 4 11 6 11 8 1 1 5 11 7 ...`
  It is possible to define a custom order: `?factor`
• 74.
  ◦ From factor to text: `d.f$novaColuna = as.character(d.f$colunaFator)`
  ◦ From factor to numeric: `d.f$novaColuna = as.numeric(as.character(d.f$colunaFator))`
  The internal representation of a factor is different from its text description.
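The reason for going through `as.character()` becomes visible on a small factor whose labels look like numbers. The values below are made up for the example.

```r
f <- factor(c("10", "20", "10", "5"))   # illustrative values

levels(f)                      # "10" "20" "5": levels are sorted as text

# Direct conversion returns the internal level codes, not the labels
as.numeric(f)                  # 1 2 1 3

# Converting to text first recovers the numbers that were written
as.numeric(as.character(f))    # 10 20 10 5
```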
• 75.
  ◦ Using: `m1 <- matrix(1:12, ncol = 3)`
  ◦ Sum of columns (one value for each column):
      `colSums(m1)`
      `[1] 10 26 42`
    ▪ or
      `apply(m1,2,sum)`
      `[1] 10 26 42`
• 76.
  ◦ Sum of rows (one value for each row):
      `rowSums(m1)`
      `[1] 15 18 21 24`
    ▪ or
      `apply(m1,1,sum)`
      `[1] 15 18 21 24`
  `apply` may use any function, even your own.
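As an example of passing your own function to `apply`, the sketch below computes the spread (maximum minus minimum) of each column and row of `m1`. The helper name `amplitude` is hypothetical.

```r
m1 <- matrix(1:12, ncol = 3)

# A user-defined function: spread of the values in a vector
amplitude <- function(v) max(v) - min(v)

apply(m1, 2, amplitude)   # 3 3 3: one value per column
apply(m1, 1, amplitude)   # 8 8 8 8: one value per row

# apply() with sum gives the same values as colSums()
all(apply(m1, 2, sum) == colSums(m1))   # TRUE
```

The second argument selects the dimension to keep: 1 for rows, 2 for columns.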
• 77.
      `aggregate(salario ~ instrucao, data = milsa, mean)`
      `  instrucao   salario`
      `1    1oGrau  7.836667`
      `2    2oGrau 11.528333`
      `3  Superior 16.475000`
• 78.
      `aggregate(salario ~ instrucao + civil, data = milsa, mean)`
      `  instrucao    civil   salario`
      `1    1oGrau   casado  7.044000`
      `2    2oGrau   casado 12.825000`
      `3  Superior   casado 17.783333`
      `4    1oGrau solteiro  8.402857`
      `5    2oGrau solteiro  8.935000`
      `6  Superior solteiro 15.166667`
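The formula `salario ~ instrucao` reads as "salary grouped by education level". To run the same call without the original file, here is a sketch on a six-row invented table; the numbers are not from the real `milsa` data.

```r
# Hypothetical stand-in data, one row per education/marital-status pair
milsa <- data.frame(
  instrucao = c("1oGrau", "1oGrau", "2oGrau", "2oGrau", "Superior", "Superior"),
  civil     = c("casado", "solteiro", "casado", "solteiro", "casado", "solteiro"),
  salario   = c(6, 8, 12, 10, 18, 16)
)

# Mean salary for each education level
res <- aggregate(salario ~ instrucao, data = milsa, FUN = mean)
res$salario      # 7 11 17

# One group per combination of the two variables
res2 <- aggregate(salario ~ instrucao + civil, data = milsa, FUN = mean)
nrow(res2)       # 6

# tapply() gives the same group means as a named vector
tapply(milsa$salario, milsa$instrucao, mean)
```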
• 79.
      `model = lm(formula = salario ~ ano + instrucao, data = milsa)`
      `summary(model)`
  Just one line!
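Once a model is fitted, its coefficients and predictions can be extracted from the model object. The sketch below uses simulated data, since the `milsa` file is not available here: the data frame `dados`, the true slope of 0.3 and the noise level are all assumptions chosen for the example.

```r
set.seed(1)                      # make the simulation reproducible

# Simulated data: salary grows by about 0.3 per year of age, plus noise
ano     <- 20:49
salario <- 2 + 0.3 * ano + rnorm(30, sd = 1)
dados   <- data.frame(ano, salario)

model <- lm(salario ~ ano, data = dados)

coef(model)                      # estimated intercept and slope
summary(model)$r.squared         # share of variance explained

# Predicted salary for a new observation
predict(model, newdata = data.frame(ano = 35))
```

The estimated slope should land close to the 0.3 used to generate the data, which is a quick sanity check on the fit.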