Los Angeles R users group - Dec 14 2010 - Part 1

  • A SQL primer for R users, with examples from Pokemon
    Neal Fultz, UCLA Statistics
  • Goal of talk
    Make SQL look easy
    And present R equivalents
    Not another customer db
  • Paradigms
    R: fundamental unit is the vector
    RDBMS: fundamental unit is the table
  • Pokemon
    Best-selling video game of the 90s, sold in multiple versions (and a major fad)
    Turn-based JRPG
    Featuring hundreds(!) of characters to collect
    Gotta catch 'em all!
  • Pokemon (2)
    Image from http://www.giantbomb.com/pokemon-yellow-special-pikachu-edition/61-18673/
  • Pokemon (3)
    Image from Pokemon for Dummies
  • Pokemon (4)
    Image from http://guides.ign.com/guides/818481/page_2.html
  • Data Model for Pokemon
    Pokemon: ID Number, Name, Type(s), Version
    Type Table: Attack Type, Defense Type, Multiplier
    In R it's natural to represent the type table as a matrix. In SQL, it's natural to pivot it to tuples.
  • More concretely
    id   Name        Type 1  Type 2  In Red  In Blue
    001  Bulbasaur   Grass   Poison  T       T
    002  Ivysaur     Grass   Poison  T       T
    003  Venusaur    Grass   Poison  T       T
    004  Charmander  Fire            T       T
    005  Charmeleon  Fire            T       T
    006  Charizard   Fire    Flying  T       T
  • What's in Red only?
    select id, name
    from pokemon
    where red and not blue;
  • What's in Red only? (2)
    23;"Ekans"
    24;"Arbok"
    43;"Oddish"
    44;"Gloom"
    45;"Vileplume"
    56;"Mankey"
    57;"Primeape"
    58;"Growlithe"
    59;"Arcanine"
    123;"Scyther"
    125;"Electabuzz"
  • What's in Red only? (R)
    pokemon[red & !blue, ];
  • Consider Psyduck
    select * from pokemon where name like 'Psyduck';
    Image from http://strategywiki.org/wiki/Pok%C3%A9mon_Gold_and_Silver/Ilex_Forest
  • Consider Psyduck (2)
    54;"Psyduck";"Water";"";t;t
  • Consider Psyduck (R)
    pokemon[grep("Psyduck", names), ];
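The two lookups above can be tried on a small stand-in data frame. This is only a sketch: the data frame layout, the column names `red` and `blue`, and the five rows are assumptions made for illustration, not the speaker's actual data set (the flags for Ekans and Psyduck follow the query results shown above).

```r
# Toy stand-in for the pokemon table; rows and column names are illustrative.
pokemon <- data.frame(
  id    = c(1, 4, 23, 52, 54),
  name  = c("Bulbasaur", "Charmander", "Ekans", "Meowth", "Psyduck"),
  type1 = c("Grass", "Fire", "Poison", "Normal", "Water"),
  type2 = c("Poison", NA, NA, NA, NA),
  red   = c(TRUE, TRUE, TRUE, FALSE, TRUE),
  blue  = c(TRUE, TRUE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# SQL: select id, name from pokemon where red and not blue;
# The logical vector picks rows, the character vector picks columns.
red_only <- pokemon[pokemon$red & !pokemon$blue, c("id", "name")]

# SQL: select * from pokemon where name like 'Psyduck';
# LIKE without wildcards is an exact match, so == is the closest equivalent;
# grep("Psyduck", pokemon$name) would also match longer names containing it.
psyduck <- pokemon[pokemon$name == "Psyduck", ]

print(red_only)
print(psyduck)
```

Note the comma inside the brackets: on a data frame, the part before the comma plays the role of WHERE and the part after it plays the role of the SELECT column list.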
  • What types are least common?
    select type1, count(type1) as c
    from pokemon
    group by type1
    order by c;
  • What types are least common? (2)
    "Ice";2
    "Ghost";3
    "Dragon";3
    ...
  • What types are least common? (R)
    sort(table(type1));
  • Second Types
    select type1, type2, count(type2) as c
    from pokemon
    where type2 is not null
    group by type1, type2
    order by type2
  • Second Types (2)
    "Water";"Fighting";1
    "Normal";"Flying";8
    "Fire";"Flying";1
    "Water";"Flying";1
    "Rock";"Flying";1
  • Second Types (R)
    table(type1, type2, useNA = "no");   # assumes missing second types are stored as NA
  • Vs Gyarados?
    select attackType, multiplier
    from pokemon, pokemonType
    where name like 'Gyarados'
    and defendType in (type1, type2)
  • Vs Gyarados (2)
    "Fighting";0.5
    "Ground";0
    "Rock";2
    "Bug";0.5
    "Fire";0.5
    "Water";0.5
    "Grass";0.5
    "Grass";2
    "Electric";2
    "Electric";2
    "Ice";2
    "Ice";0.5
  • Vs Gyarados (R)
    i <- grep("Gyarados", names);
    multipliers <- types[, c(type1[i], type2[i])];
    multipliers[which(multipliers != 1)];
  • Vs Gyarados Cont
    -- SQL has no product aggregate: exp(sum(ln(x))) multiplies the grouped rows,
    -- and the tiny constant keeps ln() away from a multiplier of 0.
    select attackType, round(exp(sum(ln(multiplier+.00000000000001))),3)
    from pokemon, pokemonType
    where name like 'Gyarados'
    and defendType in (type1, type2)
    group by attackType
  • Vs Gyarados Cont (2)
    "Ground";0.000
    "Bug";0.500
    "Grass";1.000
    "Water";0.500
    "Ice";1.000
    "Rock";2.000
    "Fighting";0.500
    "Fire";0.500
    "Electric";4.00
  • Vs Gyarados Cont (R)
    i <- grep("Gyarados", names);
    multipliers <- types[, c(type1[i], type2[i])];
    apply(multipliers, 1, prod);   # rows are attack types: one product per attack type
  • Vs Gyarados Final
    select o.name,
    round(exp(sum(ln(multiplier+.00000000000001))),3) as m
    from pokemon p, pokemonType t, pokemon o
    where p.name like 'Gyarados'
    and defendType in (p.type1, p.type2)
    and attackType in (o.type1, o.type2)
    group by o.name
    order by m desc;
  • Vs Gyarados Final (2)
    "Raichu";4.000
    "Electabuzz";4.000
    "Jolteon";4.000
    "Electrode";4.000
    "Zapdos";4.000
    "Magneton";4.000
    "Pikachu";4.000
    "Magnemite";4.000
    "Voltorb";4.000
    "Aerodactyl";2.000
    "Bellsprout";1.000
    "Bulbasaur";1.000
    ...
  • Vs Gyarados Final (R)
    i <- grep("Gyarados", names);
    multipliers <- types[, c(type1[i], type2[i])];
    totals <- apply(multipliers, 1, prod);   # one product per attack type
    # assumes type1/type2 are character vectors and types has attack types as row names
    cbind(names, totals[type1] * ifelse(is.na(type2), 1, totals[type2]));
  • Conclusions
    See the pattern?
    SQL: SELECT (cols) FROM (tables) WHERE (row condition)
    R: Subsetting (logical, index, multiple index), grep(), table(), apply(), merge()
    See also: sqldf library
  • Questions/Comments
  • Resources
    PostgreSQL: an open source RDBMS
    W3Schools: SQL tutorial
    Wikipedia comparison page
    Bulbapedia: everything about Pokemon
    Pokemon for Dummies
    Log Parser: a Windows utility for running SQL directly against files
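To make the matrix-versus-tuples contrast from the talk concrete, here is a minimal base R sketch of the "Vs Gyarados" calculation done both ways. The 3 x 2 slice of the type chart is a made-up stand-in, with values chosen to agree with the query results shown earlier. The names `types`, `long`, `by_matrix`, and `by_tuples` are illustrative, not the speaker's.

```r
# Toy slice of the type chart: rows are attack types, columns are defend types.
types <- matrix(
  c(2, 2,      # Electric vs Water, Flying
    2, 0.5,    # Grass    vs Water, Flying
    1, 0),     # Ground   vs Water, Flying
  nrow = 3, byrow = TRUE,
  dimnames = list(attack = c("Electric", "Grass", "Ground"),
                  defend = c("Water", "Flying"))
)

# R paradigm: index the matrix by the defender's types, multiply across each row.
defender <- c("Water", "Flying")   # Gyarados
by_matrix <- apply(types[, defender], 1, prod)

# SQL paradigm: pivot the matrix to (attack, defend, multiplier) tuples,
# filter the rows (WHERE), then aggregate per attack type (GROUP BY).
long <- as.data.frame(as.table(types),
                      responseName = "multiplier",
                      stringsAsFactors = FALSE)
by_tuples <- aggregate(multiplier ~ attack,
                       data = long[long$defend %in% defender, ],
                       FUN = prod)

print(by_matrix)
print(by_tuples)
```

Because R has `prod()`, neither version needs the `exp(sum(ln(...)))` workaround used in the SQL queries.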