HiRoshima.R #1 1-3 LT

  1. 2011-06-17 HiRoshima.R #1@ Saturday, June 18, 2011
  2. Agenda 1. R ― ― 2. R 3. R
  5. t
  7. A B
  8. “however”: 109 347 8 493
     [ ] However, ....
     [ ] ..., however, ....
     [ ] ..., however.
  9. > freq <- c(109,347,8)
     > chisq.test(freq,correct=FALSE)
     Chi-squared test for given probabilities
     data: freq
     X-squared = 391.7371, df = 2, p-value < 2.2e-16
     # 2
     # http://homepage2.nifty.com/nandemoarchive/toukei_kiso/t_F_chi.htm
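The statistic on slide 9 can be reproduced by hand. The sketch below assumes, as chisq.test() does when no p argument is given, that the three counts are tested against equal expected frequencies. The names expected, x2, df and p_value are illustrative and do not come from the slides.

```r
# Observed counts from slide 9
freq <- c(109, 347, 8)

# Under the default null hypothesis every category is equally likely
expected <- rep(sum(freq) / length(freq), length(freq))

# Pearson chi-squared statistic and its degrees of freedom
x2 <- sum((freq - expected)^2 / expected)
df <- length(freq) - 1

# Upper-tail p-value from the chi-squared distribution
p_value <- pchisq(x2, df = df, lower.tail = FALSE)

# Same test via the built-in function, for comparison
res <- chisq.test(freq, correct = FALSE)
print(c(by_hand = x2, built_in = unname(res$statistic)))
```

Note that correct=FALSE should make no difference for this call, since the continuity correction in chisq.test() only applies to 2 by 2 tables.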
  11. Agenda 1. R ― ― 2. R 3. R
  14. 1. 2. 3. 4. 5. 6.
  15. 1.
      • ns <- scan("ns_raw.txt", what="character")
      • ns <- scan(choose.files(), what="char")
      • getwd()!
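A minimal sketch of what scan() returns, using a temporary file with two made-up sentences in place of ns_raw.txt (whose contents are not shown in the slides). With what="character" and the default separator, scan() splits on whitespace, so each element is one token; passing sep="\n" keeps one element per line instead. The names tmp and ns_lines are illustrative.

```r
# Write two made-up sentences to a temporary file (stand-in for ns_raw.txt)
tmp <- tempfile(fileext = ".txt")
writeLines(c("the school is near the station",
             "students walk to school"), tmp)

# Default separator: any whitespace, so every word becomes one element
ns <- scan(tmp, what = "character", quiet = TRUE)
print(length(ns))

# One element per line instead of one per word
ns_lines <- scan(tmp, what = "character", sep = "\n", quiet = TRUE)
print(ns_lines)

# The file is looked up relative to the working directory
print(getwd())
```

choose.files() is only available on Windows; file.choose() is the portable base R alternative for picking a file interactively.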
  16. 2.
      • head( , )
      • tail( , )
  17. 2.
      • grep(" ", )
      • > grep("school", ns)
      • ns
        > ns[grep("school", ns)]
  18. 2.
      • [ ]
      • > ns[100]
      • 100
      • > ns[c(98,99,100)]
      • 98, 99, 100
      • c
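The inspection functions from step 2 can be tried on a small made-up vector; the four sentences below are an assumption standing in for the real contents of ns, and hits is an illustrative name.

```r
# Made-up data standing in for the text read from ns_raw.txt
ns <- c("the school is near the station",
        "students walk to school",
        "the station is old",
        "a new library opened")

# First and last two elements
print(head(ns, 2))
print(tail(ns, 2))

# grep() returns the positions of the elements that match the pattern
hits <- grep("school", ns)
print(hits)

# Indexing with those positions returns the matching elements themselves
print(ns[hits])

# value = TRUE does both steps in one call
print(grep("school", ns, value = TRUE))

# c() combines several positions into one index vector
print(ns[c(1, 3)])
```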
  19. 3.
      • strsplit( , " ")
        > strsplit(ns, " ")
      • ns
      • list
  20. 3.
      • > ns_list <- strsplit(ns, " ")
      • ns_list
        > unlist(ns_list)
      • ns_list
      • unlist(strsplit(ns, " "))
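A short sketch of step 3 on two made-up sentences, showing that strsplit() returns a list with one character vector per input element and that unlist() flattens it into a single vector. The name words is illustrative.

```r
# Two made-up sentences standing in for ns
ns <- c("the school is near the station",
        "students walk to school")

# One list element per input string, each holding that string's words
ns_list <- strsplit(ns, " ")
str(ns_list)

# Flatten the list into one character vector of words
words <- unlist(ns_list)
print(words)
```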
  21. 4. sort( )
      > ns2 <- sort(unlist(ns_list))
  22. 4. unique( )
      > ns3 <- unique(sort(unlist(ns_list)))
      # sort(unique(unlist(ns_list)))
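Slide 22 lists two orderings of the same calls. A quick check on made-up data suggests they give the same result, although de-duplicating first means sort() has fewer elements to handle.

```r
# Made-up word vector standing in for unlist(ns_list)
words <- c("the", "school", "is", "near", "the", "station",
           "students", "walk", "to", "school")

# Sort first, then drop duplicates (as on the slide)
ns3 <- unique(sort(words))

# Drop duplicates first, then sort
ns3_alt <- sort(unique(words))

print(ns3)
print(identical(ns3, ns3_alt))
```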
  23. 5. table( )
      > ns4 <- table(unlist(strsplit(ns, " ")))
      # table
  24. 5.
      > ns5 <- length(unlist(strsplit(ns, " ")))
  25. 5.
      > ns6 <- length(unique(sort(unlist(strsplit(ns, " ")))))
      > ns7 <- unique(sort(unlist(ns_list)))
      > length(ns7)
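The three counts in step 5 can be compared side by side on made-up data: ns4 is a frequency table per word, ns5 the number of running words (tokens) and ns6 the number of distinct words (types). The type-token ratio at the end is an addition for illustration and is not computed in the slides.

```r
# Two made-up sentences standing in for ns
ns <- c("the school is near the station",
        "students walk to school")
words <- unlist(strsplit(ns, " "))

# Frequency of each distinct word
ns4 <- table(words)
print(sort(ns4, decreasing = TRUE))

# Number of tokens (all running words)
ns5 <- length(words)

# Number of types (distinct words)
ns6 <- length(unique(words))

# Illustrative extra: type-token ratio
ttr <- ns6 / ns5
print(c(tokens = ns5, types = ns6, ttr = ttr))
```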
  26. 6.
      > write.table(ns4, file="freq1.txt")
      > write.table(ns5, file="freq2.txt")
      > write.table(ns6, file="freq3.txt")
      # getwd()
      # Excel
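If the goal is to open the frequency list in a spreadsheet, one option is to convert the table to a data frame and write it tab-separated without quotes or row names. This is a sketch under that assumption; the column names word and freq, the variable freq_df and the temporary output path are illustrative, not from the slides.

```r
# Made-up word vector standing in for unlist(strsplit(ns, " "))
words <- c("the", "school", "is", "near", "the", "station",
           "students", "walk", "to", "school")
ns4 <- table(words)

# Turn the table into a two-column data frame
freq_df <- as.data.frame(ns4, stringsAsFactors = FALSE)
names(freq_df) <- c("word", "freq")

# Tab-separated, no quotes, no row names: easy to paste or import
out <- tempfile(fileext = ".txt")
write.table(freq_df, file = out, sep = "\t",
            quote = FALSE, row.names = FALSE)

# Read the file back to confirm what was written
check <- read.delim(out, stringsAsFactors = FALSE)
print(check)
```

Files written with a relative name such as "freq1.txt" end up in the directory reported by getwd().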
  28. Agenda 1. R ― ― 2. R 3. R
  30. ... orz
  31. RMeCab
  32. RMeCab
      • R MeCab
      • R
  33. • RMeCabText()
      • RMeCabFreq()
      • Ngram(): N-gram
      • collocate()
  35. 2,940 1,785 3,780
  37. twitter: @sakaue
      e-mail: tsakaue<AT>hiroshima-u.ac.jp