2013 10-16-sbc3610-research methcomm

Research methods & comms
Some info about careers, about writing, introduction to R practicals (regular expressions, functions, loops), experimental design.
@ Queen Mary U London.

Published in: Technology

2013 10-16-sbc3610-research methcomm

  1. 1. Research Methods & Comms. y.wurm@qmul.ac.uk
  2. 2. © Alex Wild & others
  3. 3. Atta leaf-cutter ants © National Geographic
  4. 4. Atta leaf-cutter ants © National Geographic
  5. 5. Atta leaf-cutter ants © National Geographic
  6. 6. Oecophylla Weaver ants © ameisenforum.de
  7. 7. © ameisenforum.de Weaver ants (Fourmis tisserandes)
  8. 8. © ameisenforum.de Oecophylla Weaver ants
  9. 9. © wynnie@flickr © forestryimages.org
  10. 10. Tofilski et al 2008 Forelius pusillus
  11. 11. Forelius pusillus hides the nest entrance at night Tofilski et al 2008
  12. 12. Forelius pusillus hides the nest entrance at night Tofilski et al 2008
  13. 13. Forelius pusillus hides the nest entrance at night Tofilski et al 2008
  14. 14. Forelius pusillus hides the nest entrance at night Tofilski et al 2008
  15. 15. Forelius pusillus hides the nest entrance at night. Before: workers staying outside die, “preventive self-sacrifice”. Tofilski et al 2008
  16. 16. Dorylus driver ants: ants with no home © BBC
  17. 17. Animal biomass (Brazilian rainforest), from Fittkau & Klinge 1973 • Mammals • Birds • Reptiles • Other insects • Amphibians • Earthworms • Spiders • Soil fauna excluding earthworms, ants & termites • Ants & termites
  18. 18. Big data is invading biology
  19. 19. This changes everything. 454, Illumina, Solid... Any lab can sequence anything!
  20. 20. Big data is invading biology • Genomics • Biodiversity assessments • Stool microbiome sequencing • Personalized medicine • Cancer genomics • Sensor networks - e.g. tracking microclimates • Aerial surveys (Drones) - e.g. crop productivity; rainforest cover • Camera traps
  21. 21. Learning to deal with big data takes time • New Master’s Programs @ QM: • Bioinformatics (for biologists) • Ecological & Evolutionary Genomics (or Biodiversity Informatics) • Our 6 hours of practicals.
  22. 22. Practicals • Aim: get relevant data handling skills • Doing things by hand: slow, error-prone, often impossible. • Automate! • Basic programming • in R • no stats!
  23. 23. Practicals: format • Groups - ok? • 3h practical this week • data accessing/subsetting • search/replace • regular expressions • 3h in two weeks • 2h practical • functions • loops • 1h exam (last hour of practical)
  24. 24. http://tryr.codeschool.com
  25. 25. Regular expressions: Text search on steroids.
      Regular expression                    Finds
      David                                 David
      Dav(e|id)                             David, Dave
      Dav(e|id|ide|o)                       David, Dave, Davide, Davo
      At{1,2}enborough                      Attenborough, Atenborough
      Atte[nm]borough                       Attenborough, Attemborough
      At{1,2}[ei][nm]bo{0,1}ro(ugh){0,1}    Atimbro, attenbrough, etc.
      Easy counting, replacing all with “Sir David Attenborough”
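The counting and replacing mentioned on this slide can be sketched in R roughly as follows. This is a minimal illustration, not the practical's own code: the `mentions` vector is invented sample data, and `grepl()` and `gsub()` are base R functions assumed to be the tools used.

```r
# Hypothetical sample text, invented for illustration only
mentions <- c("David Attenborough",
              "Dave Atenborough",
              "Davo Attemborough",
              "Charles Darwin")

# Which entries contain one of the first-name variants from the slide?
first_name <- "Dav(e|id|ide|o)"
is_david <- grepl(first_name, mentions)
print(is_david)

# Easy counting: TRUE counts as 1, FALSE as 0
n_david <- sum(is_david)
print(n_david)

# Replace every spelling variant with one canonical form
full_name <- "Dav(e|id|ide|o) At{1,2}[ei][nm]bo{0,1}ro(ugh){0,1}"
cleaned <- gsub(full_name, "Sir David Attenborough", mentions)
print(cleaned)
```

Entries that do not match the pattern, such as the last one here, are returned unchanged by `gsub()`.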
  26. 26. Regular expressions
      Expression    Synonymous with
      \d            [0-9]
      [A-z]         roughly [A-Za-z] (strictly, [A-z] also matches [ \ ] ^ _ ` which sit between Z and a in ASCII)
      \s            whitespace
      .             any single character
      .+            one to many of anything
      b*            between 0 and infinity of the letter ‘b’
      [^abc]        any character other than a, b or c
      \(            (
      [:punct:]     any of these: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
      • Google “Regular expression cheat sheet”
      • ?regexp
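A short sketch of how these character classes behave in R, assuming base R's `grepl()` and `gsub()`. The `samples` vector is invented for illustration. Two R-specific details are worth noting: a backslash must be doubled inside an R string, and POSIX classes such as `[:punct:]` only work inside an outer pair of brackets.

```r
# Hypothetical sample strings, invented for illustration only
samples <- c("ant 42", "Atta_sp.", "no digits here!")

# In an R string, "\\d" is the regular expression \d (any digit)
has_digit <- grepl("\\d", samples)
print(has_digit)

# [0-9] gives the same answer as \d here
same_as_range <- identical(has_digit, grepl("[0-9]", samples))
print(same_as_range)

# POSIX classes go inside an extra pair of brackets: [[:punct:]]
no_punct <- gsub("[[:punct:]]", "", samples)
print(no_punct)

# [A-z] is wider than [A-Za-z]: it also matches "_".
# perl = TRUE is used so the range follows ASCII order regardless of locale.
underscore_in_Az <- grepl("[A-z]", "_", perl = TRUE)
underscore_in_AZaz <- grepl("[A-Za-z]", "_", perl = TRUE)
print(c(underscore_in_Az, underscore_in_AZaz))
```

When only letters are wanted, `[A-Za-z]` or `[[:alpha:]]` is the safer choice.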
  27. 27. Functions • R has many, e.g.: plot(), t.test() • Making your own:
      tree_age_estimate <- function(diameter, species) {
        # [...do the magic...]
        # maybe something like (growth.rates: a named vector defined elsewhere):
        growth.rate  <- growth.rates[species]
        age.estimate <- diameter / growth.rate
        return(age.estimate)
      }
      > tree_age_estimate(25, "White Oak")
      [1] 66
      > tree_age_estimate(60, "Carya ovata")
      [1] 190
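A runnable version of the function above might look like the following sketch. The slide does not give the growth rates, so the values in `growth.rates` are made-up placeholders and the results will not match the 66 and 190 shown on the slide. The check for unknown species is also an addition, not something from the slide.

```r
# Hypothetical growth rates (diameter units per year).
# These numbers are invented placeholders, not real data.
growth.rates <- c("Quercus alba" = 0.5, "Carya ovata" = 0.25)

tree_age_estimate <- function(diameter, species) {
  # Fail early with a clear message if the species is not in the table
  if (!species %in% names(growth.rates)) {
    stop("No growth rate known for species: ", species)
  }
  # [[ ]] extracts the bare number, dropping the species name
  growth.rate  <- growth.rates[[species]]
  age.estimate <- diameter / growth.rate
  return(age.estimate)
}

age_oak     <- tree_age_estimate(25, "Quercus alba")
age_hickory <- tree_age_estimate(60, "Carya ovata")
print(age_oak)
print(age_hickory)
```

Keeping the lookup table outside the function, as the slide hints, means new species can be added without touching the function body.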
  28. 28. “for” Loop
      > possible_colours <- c('blue', 'cyan', 'sky-blue', 'navy blue',
      +                       'steel blue', 'royal blue', 'slate blue', 'light blue',
      +                       'dark blue', 'prussian blue', 'indigo', 'baby blue',
      +                       'electric blue')
      > possible_colours
       [1] "blue"          "cyan"          "sky-blue"      "navy blue"
       [5] "steel blue"    "royal blue"    "slate blue"    "light blue"
       [9] "dark blue"     "prussian blue" "indigo"        "baby blue"
      [13] "electric blue"
      > for (colour in possible_colours) {
      +   print(paste("The sky is so, oh so", colour))
      + }
      [1] "The sky is so, oh so blue"
      [1] "The sky is so, oh so cyan"
      [1] "The sky is so, oh so sky-blue"
      [1] "The sky is so, oh so navy blue"
      [1] "The sky is so, oh so steel blue"
      [1] "The sky is so, oh so royal blue"
      [1] "The sky is so, oh so slate blue"
      [1] "The sky is so, oh so light blue"
      [1] "The sky is so, oh so dark blue"
      [1] "The sky is so, oh so prussian blue"
      [1] "The sky is so, oh so indigo"
      [1] "The sky is so, oh so baby blue"
      [1] "The sky is so, oh so electric blue"
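The loop on this slide only prints each sentence. A common next step is to store the results, sketched below with a shortened, illustrative colour vector. The vectorised `paste()` call at the end is an alternative offered for comparison, not something the slide itself shows.

```r
# Shortened colour list, for illustration only
possible_colours <- c("blue", "cyan", "indigo")

# Loop version: pre-allocate a result vector, then fill it one element at a time
sentences_loop <- character(length(possible_colours))
for (i in seq_along(possible_colours)) {
  sentences_loop[i] <- paste("The sky is so, oh so", possible_colours[i])
}
print(sentences_loop)

# paste() is vectorised, so the same result needs no explicit loop
sentences_vec <- paste("The sky is so, oh so", possible_colours)
print(identical(sentences_loop, sentences_vec))
```

Loops remain useful when each step depends on the previous one or has side effects such as writing a file.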
  29. 29. xkcd
  30. 30. 1.23 (?)
  31. 31. More career stuff • Internships? • What does PhD mean? • Basic CV rules
  32. 32. Writing • an essay • a cover letter • a reference letter • (a “new scientist article”) • a dissertation • (an abstract)
  33. 33. QMUL marking scheme: Marking Criteria and Mark Scheme for Essay-style Questions
      Levels 5 - 6 Level 6 All Levels (Desirable in other years)
      Columns: Evidence of Comprehension | Breadth and Depth of Knowledge | Irrelevant Material and Errors | Synthesis & Balance | Originality & Innovation
      A+: Outstanding. Deep insight | Outstanding. As much as could be expected | Absent or minimal | Evidence of critical analysis | Original ideas and insight
      [A]: Clear understanding. Shrewd and appropriate | Extensive. Almost as much as could be expected | Minimal or absent | Astute selection and juxtaposition | Some evidence of creative
      A-: Tending to description rather than interpretation | Extensive | Minimal | Appropriate selection and combination | Some
      A--: Sufficient to marshal a well-organised, direct response | Most key points but not extensive | Perhaps some minor errors and tangential material | Inappropriate balance, partial synthesis | Limited
      [B]: Sufficient to marshal an organised, direct response | Not all key points but comprehensive and accurate | Some minor errors and tangential material | Inappropriate balance, partial synthesis | Limited
      [C]: Not a direct response but sufficient for a logical presentation | Several omissions but some key points | Some errors, tangential material | Minimal | Minimal
      D,E: Poor comprehension, muddled organisation | Major omissions. No key points. A few basic facts | Major factual errors. Frequently irrelevant | None | None
      F+: Almost none | One or two very minor points correct | Extensively irrelevant or wrong | None | None
      [F]: None | One or two very minor points just about correct | Extensively irrelevant or wrong | None | None
      F-: None | No evidence of being better if longer | Almost all irrelevant or wrong | None | None
      [0]: Nothing written | Nothing written | Nothing written | None | None
      Notes:
      • In order to qualify for an "A-grade" the work must meet most of the indicated criteria.
      • Grade to % conversion: A+ = 100; A = 92; A- = 83; A-- = 74; B+ = 68; B = 65; B- = 63; C+ = 58; C = 55; C- = 53; D+ = 49; D = 48; D- = 47; E+ = 44; E = 43; E- = 42; F++ = 39; F+ = 37; F = 27; F- = 17; 0 = 0
  34. 34. Important for all: Structure • Clear overall structure? • Separate intro: starts from general points and announces the structure (paragraphs or major sections). • One paragraph per idea/point. • Clear structure within each paragraph. • If it includes a list: “three lines of evidence suggest that X. First, ... Second, ... Finally, ...” • Clarity of each sentence. • No unnecessary words! • Try to make smooth transitions.
  35. 35. More writing tips. • No ping-ponging! • No unnecessary ideas. • Eliminate unnecessary words. • “We have performed X” → “We did X” • Shorter is better. • Put MS Word in “strict grammar” mode. • Eliminate jargon. • Write for the “general smart scientist” with little domain-specific knowledge.
  36. 36. Writing • an essay • a cover letter • a reference letter • (a “new scientist article”) • a dissertation • (an abstract)
  37. 37. Rest of our time together Experimental design (Reproducible research)
  38. 38. Why consider experimental design? • If you’re performing experiments • Cost • Time • for experiment • for analysis • Ethics • If you’re deciding to fund? to buy? to approve? to compete? • are the results real? • can you trust the data?
  39. 39. Main potential problems • Pseudoreplication • Confounding factors • Insufficient data/power • Inappropriate statistics Inaccurate Wrong Misleading
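The “insufficient data/power” problem can be made concrete with a quick power calculation before running an experiment. The sketch below uses `power.t.test()` from R's built-in stats package. The effect size, standard deviation, significance level and target power are all assumed values chosen for illustration, not figures from the slides.

```r
# Assumed (hypothetical) design values:
#   delta     = smallest difference between group means worth detecting
#   sd        = expected standard deviation within each group
#   sig.level = accepted false-positive rate
#   power     = desired probability of detecting a true difference
result <- power.t.test(delta = 1, sd = 1, sig.level = 0.05, power = 0.8)

# Samples needed per group, rounded up to whole individuals
n_per_group <- ceiling(result$n)
print(n_per_group)
```

Note that pseudoreplicated measurements, such as repeated measures of the same individual, do not count as independent samples toward this number.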
