Los Angeles R users group - Dec 14 2010 - Part 1



  1. A SQL primer for R users
     with examples from Pokemon
     Neal Fultz, UCLA Statistics

     Goal of talk
       • Make SQL look easy
       • And present R equivalents
       • Not another customer db

     Paradigms
       • R: fundamental unit is the vector
       • RDBMS: fundamental unit is the table

     Pokemon
       • Best-selling video game of the 90s, sold in multiple versions (and a major fad)
       • Turn-based JRPG
       • Featuring hundreds(!) of characters to collect
       • Gotta catch 'em all!
  2. Pokemon (2)
       Image from http://www.giantbomb.com/pokemon-yellow-special-pikachu-edition/61-18673/

     Pokemon (3)
  3.   Image from Pokemon for Dummies

     Pokemon (4)
  4.   Image from http://guides.ign.com/guides/818481/page_2.html

     Data Model for Pokemon
       Pokemon
         • ID Number
         • Name
         • Type(s)
         • Version
       Type Table
         • Attack Type
         • Defense Type
         • Multiplier
  5.   In R it's natural to represent the type table as a matrix. In SQL, it's natural to pivot it to tuples.

     More concretely
       id   Name        Type 1  Type 2  In Red  In Blue
       001  Bulbasaur   Grass   Poison  T       T
       002  Ivysaur     Grass   Poison  T       T
       003  Venusaur    Grass   Poison  T       T
       004  Charmander  Fire            T       T
       005  Charmeleon  Fire            T       T
       006  Charizard   Fire    Flying  T       T

     What's in Red only?
       select id, name
       from pokemon
       where red and not blue;

     What's in Red only? (2)
       23;"Ekans"
       24;"Arbok"
       43;"Oddish"
       44;"Gloom"
       45;"Vileplume"
       56;"Mankey"
       57;"Primeape"
       58;"Growlithe"
       59;"Arcanine"
  6.   123;"Scyther"
       125;"Electabuzz"

     What's in Red only? (R)
       # row subset; red and blue are logical vectors, one entry per Pokemon
       pokemon[red & !blue, ];

     Consider Psyduck
       select * from pokemon where name like 'Psyduck';
       Image from http://strategywiki.org/wiki/Pok%C3%A9mon_Gold_and_Silver/Ilex_Forest

     Consider Psyduck (2)
       54;"Psyduck";"Water";"";t;t

     Consider Psyduck (R)
       pokemon[grep("Psyduck", names), ];
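The two filters above (the Red-only query and the Psyduck lookup) can be tried without a database server. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for whatever RDBMS the talk used; the four-row table is a hypothetical sample, and 1/0 stand in for t/f because SQLite has no boolean type.

```python
import sqlite3

# Hypothetical four-row sample of the pokemon table (not the talk's full data).
# SQLite has no boolean type, so 1/0 stand in for t/f in the red/blue columns.
ROWS = [
    (1, "Bulbasaur", "Grass", "Poison", 1, 1),
    (23, "Ekans", "Poison", None, 1, 0),
    (27, "Sandshrew", "Ground", None, 0, 1),
    (54, "Psyduck", "Water", None, 1, 1),
]


def load_sample():
    """Create an in-memory database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table pokemon "
        "(id integer, name text, type1 text, type2 text, red integer, blue integer)"
    )
    conn.executemany("insert into pokemon values (?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def red_only(conn):
    """Rows available in Red but not in Blue (the slide's first query)."""
    return conn.execute(
        "select id, name from pokemon where red and not blue"
    ).fetchall()


def by_name(conn, pattern):
    """Rows whose name matches a LIKE pattern (the slide's Psyduck query)."""
    return conn.execute(
        "select * from pokemon where name like ?", (pattern,)
    ).fetchall()


conn = load_sample()
print(red_only(conn))
print(by_name(conn, "Psyduck"))
```

The WHERE clause plays the same role as the logical index in the R versions: it keeps the rows for which the condition is true.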
  7. What types are least common?
       select type1, count(type1) as c
       from pokemon
       group by type1
       order by c;

     What types are least common? (2)
       "Ice";2
       "Ghost";3
       "Dragon";3
       ...

     What types are least common? (R)
       sort(table(type1));

     Second Types
       select type1, type2, count(type2) as c
       from pokemon
       where type2 is not null
       group by type1, type2
       order by type2

     Second Types (2)
       "Water";"Fighting";1
       "Normal";"Flying";8
       "Fire";"Flying";1
  8.   "Water";"Flying";1
       "Rock";"Flying";1

     Second Types (R)
       table(type1, type2);  # NA second types are dropped by default

     Vs Gyarados?
       select attackType, multiplier
       from pokemon, pokemonType
       where name like 'Gyarados'
       and defendType in (type1, type2)

     Vs Gyarados (2)
       "Fighting";0.5
       "Ground";0
       "Rock";2
       "Bug";0.5
       "Fire";0.5
       "Water";0.5
       "Grass";0.5
       "Grass";2
       "Electric";2
       "Electric";2
       "Ice";2
       "Ice";0.5

     Vs Gyarados (R)
       i <- grep("Gyarados", names);
  9.   multipliers <- types[, c(type1[i], type2[i])];
       multipliers[which(multipliers != 1)];

     Vs Gyarados Cont
       -- SQL has no product aggregate, so multiply via exp(sum(ln(x)));
       -- the tiny offset keeps ln() defined when a multiplier is 0
       select attackType,
         round(exp(sum(ln(multiplier+.00000000000001))),3)
       from pokemon, pokemonType
       where name like 'Gyarados'
       and defendType in (type1, type2)
       group by attackType

     Vs Gyarados Cont (2)
       "Ground";0.000
       "Bug";0.500
       "Grass";1.000
       "Water";0.500
       "Ice";1.000
       "Rock";2.000
       "Fighting";0.500
       "Fire";0.500
       "Electric";4.00

     Vs Gyarados Cont (R)
       i <- grep("Gyarados", names);
       multipliers <- types[, c(type1[i], type2[i])];
       apply(multipliers, 1, prod);  # product across the two defending types, per attack type

     Vs Gyarados Final
       select o.name,
  10.   round(exp(sum(ln(multiplier+.00000000000001))),3) as m
        from pokemon p, pokemonType t, pokemon o
        where p.name like 'Gyarados'
        and defendType in (p.type1, p.type2)
        and attackType in (o.type1, o.type2)
        group by o.name
        order by m desc;

      Vs Gyarados Final (2)
        "Raichu";4.000
        "Electabuzz";4.000
        "Jolteon";4.000
        "Electrode";4.000
        "Zapados";4.000
        "Magneton";4.000
        "Pikachu";4.000
        "Magnemite";4.000
        "Voltorb";4.000
        "Aerodactyl";2.000
        "Bellsprout";1.000
        "Bulbasaur";1.000
        ...

      Vs Gyarados Final (R)
        i <- grep("Gyarados", names);
        multipliers <- types[, c(type1[i], type2[i])];
        totals <- apply(multipliers, 1, prod);  # combined multiplier per attack type
        # multiply the totals for each Pokemon's own types; a missing second type counts as 1
        cbind(names, totals[type1] * ifelse(type2 %in% rownames(types), totals[type2], 1));

      Conclusions
  11.   See the pattern?
        SQL: SELECT (cols) FROM (tables) WHERE (row condition)
        R:
          • Subsetting (logical, index, multiple index)
          • grep()
          • table()
          • apply()
          • merge()
        See also: sqldf library

      Questions/Comments

      Resources
        • PostgreSQL: an open source RDBMS
        • W3Schools: SQL tutorial
        • Wikipedia: comparison page
        • Bulbapedia: everything about Pokemon
        • Pokemon for Dummies
        • Log Parser: a Windows utility for running SQL directly against files
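The "Vs Gyarados Cont" query multiplies values inside a GROUP BY even though SQL offers no product aggregate: it relies on the identity exp(sum(ln(x))) = prod(x), with a tiny offset so that ln() is still defined when a multiplier is 0. The sketch below tries that on a small scale. It assumes Python's built-in sqlite3 module rather than the PostgreSQL setup listed under Resources, registers ln and exp by hand because not every SQLite build includes them, and loads only a hypothetical subset of the type table (the multipliers shown match the slide output for Electric, Grass, Ground and Rock).

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")

# Register ln/exp explicitly: SQLite builds do not always ship math functions.
conn.create_function("ln", 1, math.log)
conn.create_function("exp", 1, math.exp)

# Hypothetical subset of the talk's tables: one defender, four attack types.
conn.executescript(
    """
    create table pokemon (name text, type1 text, type2 text);
    create table pokemonType (attackType text, defendType text, multiplier real);
    insert into pokemon values ('Gyarados', 'Water', 'Flying');
    insert into pokemonType values
      ('Electric', 'Water', 2), ('Electric', 'Flying', 2),
      ('Grass', 'Water', 2), ('Grass', 'Flying', 0.5),
      ('Ground', 'Flying', 0), ('Rock', 'Flying', 2);
    """
)

# Same shape as the slide's query; "order by" added only to fix the row order.
QUERY = """
select attackType,
       round(exp(sum(ln(multiplier + .00000000000001))), 3) as m
from pokemon, pokemonType
where name like 'Gyarados'
  and defendType in (type1, type2)
group by attackType
order by attackType
"""


def combined_multipliers(conn):
    """Map each attack type to the product of its multipliers vs Gyarados."""
    return dict(conn.execute(QUERY).fetchall())


result = combined_multipliers(conn)
print(result)
```

In R the same step is a one-liner, since prod() can be applied across the rows of the multiplier matrix directly; the logarithm detour is only needed on the SQL side.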