Essential UNIX skills for biologists

Learning UNIX for biomedical researchers

Published in: Education
  • cd C:Documents and SettingsypouliotMy DocumentsResearchResourcesReferences

    1. 1. Essential UNIX Skills for Biologists
          Yannick Pouliot, PhD, Bioresearch Informationist
          Lane Medical Library & Knowledge Management Center, 1/14/2009
          http://lane.stanford.edu
    2. 2. The Bioresearch Informationist: At Your Service
          • Yannick Pouliot, PhD, Lane Medical Library & Knowledge Management Center
          • Bioresearch Informationist ≈ computational biologist in residence
          • Lane Library service, closely coordinated with CMGM
          • Role: support laboratory researchers regarding biocomputational resources and their use, especially postdocs
          • Contact: lanebioresearch@stanford.edu
    3. 3. Goals
          • Deliver a basic understanding of core UNIX commands
          • Tips on running UNIX on Mac and Windows
          • On a procedural note, we’ll be using anonymous polling to determine whether you’re happy with the material and speed of delivery
    4. 4. But First: LaneConnex -- Your Key to Finding Resources Quickly
    5. 5. So, Why UNIX?
          • UNIX is good for:
             1. performing complex operations with very few keystrokes
             2. operating on large numbers of objects, e.g.:
                  searching file contents very specifically
                  renaming files
                  moving/copying files
          • UNIX is fast: fast running and fast to invoke
          • Linux (≈ UNIX) is free and runs on almost everything
    6. 6. UNIX Trip-Ups
          • UNIX is capitalization-sensitive: ls ≠ Ls
          • What you type is what you get: no mistyping!
          • Mind those commands, e.g., rm -fr * = delete everything (except hidden files) in the current directory and its subdirectories, without asking for confirmation! → DON’T DO THIS AT HOME!
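A cautious sketch of how to handle the rm trip-up, not part of the original slides. It assumes a POSIX-style shell and works inside a throwaway directory made with mktemp so that nothing real is deleted; the file and directory names are hypothetical.

```shell
# Work in a scratch directory so the rm below cannot touch real files.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir results
touch results/run1.txt results/run2.txt keep.fasta

# Preview what the pattern expands to before deleting anything:
# prefixing the command with echo prints it instead of running it.
echo rm -r results/*

# Delete only the directory that was named explicitly.
# In an interactive session, "rm -ri results" would ask before each removal.
rm -r results

remaining=$(ls)
echo "$remaining"
```

Previewing with echo costs a few keystrokes and shows exactly which names a wildcard would hand to rm.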
    7. 7. So How Does One Access UNIX?
          • Mac: UNIX underlies the Mac’s graphical interface
               Applications → Utilities → Terminal
          • Windows: must install software (more later)
    8. 8. Exploring UNIX
    9. 9. Key UNIX Concepts
          • UNIX is command-line based (no cute icons).
          • There are flavors of UNIX: “Mac” UNIX ≈ Linux ≈ UNIX
          • “Shell” = command-line interface
               different shells exist, all with very similar basic functionality
          • Anything you can imagine, UNIX can do… but you may have to think about it…
          • In UNIX, anything can be done in at least three different ways…
          • UNIX has:
               commands (built-in) → most of today’s workshop
               utilities ≈ “super-commands”, e.g., grep, for parsing text; not built-in but usually there
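To see the difference between something built into the shell and a separate utility, one can ask the shell how it resolves a name. This is a small sketch, not from the original slides; it assumes a POSIX-style shell such as bash, and the variable names are only for illustration.

```shell
# "type" reports how the shell resolves a name.
cd_kind=$(type cd)             # cd is built into the shell itself
grep_path=$(command -v grep)   # grep is a separate program stored on disk

echo "$cd_kind"
echo "$grep_path"
```

The exact wording printed by type and the location of grep vary from system to system.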
    10. 10. Concept: Redirection ***
          • Redirection operators:
               “>” : send output to a file (overwrites existing content)
               “>>” : append output to a file (doesn’t overwrite)
               “<” : read input from a file
               “<<” : read input from text typed inline (a “here-document”)
          • Applies to both input and output:
               prog.exe < file.txt
               prog.exe > file.txt
               prog.exe < file.txt > file1.txt
               prog.exe >> file.txt
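A minimal sketch of the four redirection operators in a POSIX-style shell, run in a scratch directory. The file names and gene labels are hypothetical and not from the original slides.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

printf 'gene1\ngene2\n' > genes.txt   # ">" creates or overwrites genes.txt
printf 'gene3\n' >> genes.txt         # ">>" appends a third line

# "<" feeds the file to sort as input; ">" captures the sorted output.
sort -r < genes.txt > genes_desc.txt

# "<<" supplies input inline (a here-document) up to the END marker.
wc -l <<END > count.txt
alpha
beta
END

line_count=$(wc -l < genes.txt)
first_desc=$(head -n 1 genes_desc.txt)
echo "$line_count lines; first in reverse order: $first_desc"
```

Because ">" silently overwrites, ">>" is the safer choice when a file may already hold results worth keeping.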
    11. 11. Concept: Metacharacters ***
          • “*” = 0 or more characters of any kind (in file names)
          • “?” (in file names) or “.” (in regular expressions) = exactly one character of any kind
               Exact character depends on the tool…
          • Metacharacters can be used with nearly any command, e.g.:
               ls file?.txt
               ls file*.txt
               ls *.*
               more *.txt
               grep omics *.txt
          • NB: There are lots of other kinds of metacharacters…
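A short sketch of how the shell expands file-name metacharacters, assuming a POSIX-style shell and a scratch directory with hypothetical file names. It also illustrates why a pattern meant for grep is best quoted: an unquoted “*” would be expanded by the shell before grep ever sees it.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch file1.txt file2.txt file10.txt notes.md

one_char=$(echo file?.txt)   # "?" matches exactly one character
any_len=$(echo file*.txt)    # "*" matches zero or more characters
echo "$one_char"
echo "$any_len"

# Quote the pattern so it reaches grep unchanged; *.txt is left unquoted
# on purpose so the shell turns it into the list of files to search.
printf 'genomics\nproteomics\nhistology\n' > topics.txt
omics_lines=$(grep -c 'omics' *.txt | grep -c ':2$')
```

Using echo in front of a pattern is a harmless way to check what it matches before handing it to another command.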
    12. 12. Concept: Stringing Commands Together Using Pipes
          • “|” = pipe: sends the output of one command to the next command as its input, e.g.:
               ls -1 | more
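A minimal sketch of chaining several commands with pipes, assuming a POSIX-style shell. The scratch directory and file names are hypothetical.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch a.fasta b.fasta c.txt

# ls -1 prints one name per line, grep keeps the .fasta names,
# and wc -l counts the lines that are left.
fasta_count=$(ls -1 | grep '\.fasta$' | wc -l)
echo "$fasta_count"

# A second chain: list, sort in reverse order, keep the first name.
last_name=$(ls -1 | sort -r | head -n 1)
echo "$last_name"
```

Each command in the chain does one small job, which is why short pipelines can replace a lot of manual work.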
    13. 13. Polling Time: How’s the speed?
          1. Too fast
          2. Too slow
          3. More or less OK
          4. I feel nauseous
    14. 14. Overview of Selected UNIX Commands
    15. 15. ls [options] [names] ****
          • Lists contents of directories, including directories themselves
          • Basically, lists files…
          • When names are provided, lists files contained in a directory name or that match a file name. names can include filename metacharacters.
          • The options display information in different formats. The most useful options include -F, -R, -l, and -s.
          Examples
             1. List all details of all files in the current directory: ls -l
             2. List just the filenames, one per line: ls -1
             3. Create a file that contains a list of the filenames: ls -1 > mylist.txt
             4. List files named “example” followed by a single character, e.g., example1.txt: ls -1 example?.txt
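The ls options named above can be tried safely in a scratch directory. This is a sketch under the assumption of a POSIX-style shell; the file and directory names are hypothetical.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir data
touch example1.txt example2.txt example10.txt data/reads.fastq

ls -l    # long format: permissions, owner, size, date, name
ls -F    # marks directories with a trailing "/"
ls -R    # also lists the contents of subdirectories
ls -s    # shows the size of each file in blocks

# Save the names matching "example" plus exactly one character.
ls -1 example?.txt > mylist.txt
matched=$(wc -l < mylist.txt)
echo "$matched"
```

example10.txt is not listed in mylist.txt because “?” stands for exactly one character.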
    16. 16. cat/more/head/tail → commands to look at content of files
          • cat: returns everything
          • more: same but one page at a time ****
          • head: returns top x lines
          • tail: returns bottom x lines
          • all can operate on multiple files
          Examples
             1. Show contents of all txt files: cat *.txt
             2. Show first 100 lines of file: head -n 100 file.txt
             3. Show first 1000 lines of file and paginate: head -n 1000 file.txt | more
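A small sketch of head and tail on a generated 1000-line file, assuming a POSIX-style shell with awk available. The file name is hypothetical; combining the two commands to pull out a middle slice is an addition not shown on the slide.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Build a test file holding the numbers 1 to 1000, one per line.
awk 'BEGIN { for (i = 1; i <= 1000; i++) print i }' > numbers.txt

top3=$(head -n 3 numbers.txt | tr '\n' ' ')      # first three lines
bottom2=$(tail -n 2 numbers.txt | tr '\n' ' ')   # last two lines

# Lines 101 to 105: take the first 105 lines, then the last 5 of those.
middle=$(head -n 105 numbers.txt | tail -n 5 | tr '\n' ' ')

echo "$top3"
echo "$bottom2"
echo "$middle"
```

The same head-then-tail trick is handy for inspecting a region of a very large sequence or data file without opening it in an editor.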
    17. 17. grep: Searching File Contents Using “Regular Expressions” ****
          grep [options] pattern [files]
          • Very powerful: searches file contents for the presence of a string, e.g.: grep protein *.pdf
          • About a million options…
          • Also searches using regular expressions
               Definition: a pattern that describes the characteristics of one or more strings, e.g.:
                  te?xt (with grep -E: matches “txt” or “text”)
                  [a-z]+omics (with grep -E: one or more lowercase letters followed by “omics”)
          Examples
             1. Find all text files whose contents contain words ending in “omics” (“genomics”, “proteomics”, “transcriptomics”): grep -l -w -E '[a-z]+omics' *.txt
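A sketch of grep with extended regular expressions, assuming a grep that supports the -E, -l, -w, -o, -c and -i options (both GNU and BSD grep do). The file names and contents are hypothetical.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
printf 'genomics and proteomics overview\n' > a.txt
printf 'histology notes saved as txt or text\n' > b.txt

# -l prints only the names of files that contain a match, -w requires the
# match to be a whole word, -E enables extended regular expressions.
omics_files=$(grep -l -w -E '[a-z]+omics' *.txt)

# -o prints each match on its own line; "e?" makes the "e" optional.
variants=$(grep -o -E 'te?xt' b.txt | tr '\n' ' ')

# -c counts matching lines, -i ignores capitalization.
count=$(grep -c -i 'GENOMICS' a.txt)

echo "$omics_files"
echo "$variants"
echo "$count"
```

Quoting the pattern keeps the shell from expanding characters such as “*” or “?” before grep sees them.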
    18. 18. Polling Time: How’s the speed?
          1. Too fast
          2. Too slow
          3. More or less OK
          4. Need coffee
    19. 19. uniq [options] filename **
          • Very handy for listing unique (or duplicate) lines in a file
          • Has options to…
               ignore the first n fields delimited by tabs or spaces (-f n)
               ignore the first n characters (-s n)
          • Only detects duplicates on adjacent lines, so it gives the expected result ONLY on sorted files
          Examples
             1. List unique lines using an unsorted file: sort test1.txt | uniq
             2. Count the occurrences of each unique line using an already-sorted file: uniq -c test2.txt
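A minimal sketch showing why uniq needs sorted input, assuming a POSIX-style shell. The gene names in the test file are hypothetical sample data.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
printf 'BRCA1\nTP53\nBRCA1\nEGFR\nTP53\nBRCA1\n' > genes.txt

# Without sorting, no duplicates are adjacent, so uniq removes nothing.
unsorted=$(uniq genes.txt | wc -l)

# After sorting, duplicates sit next to each other and collapse.
distinct=$(sort genes.txt | uniq | wc -l)

# -c prefixes each distinct line with its count; -d lists repeated lines only.
sort genes.txt | uniq -c
repeated=$(sort genes.txt | uniq -d | tr '\n' ' ')

# Most frequent entry: count, sort numerically in reverse, keep the top line.
top=$(sort genes.txt | uniq -c | sort -rn | head -n 1 | awk '{print $2}')
echo "$top"
```

The sort | uniq -c | sort -rn chain is a common idiom for building a quick frequency table from any column of text.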
    20. 20. find [pathnames] [conditions] ***
          • Very powerful: can specify almost anything, including exclusions and negations
          • Descends the directory tree beginning at each pathname and locates files that meet the specified conditions. With GNU find, the default pathname is the current directory.
          • Most useful conditions are -name and -type (for general use)
          • Can search very large numbers of file names, if slowly…
          Examples
             1. List all files named chapter1 in the /work directory and below: find /work -name chapter1 -print
             2. Look for filenames in the current directory and below that don't begin with a capital letter (-name takes a filename pattern, not a regular expression): find . ! -name '[A-Z]*' -print
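A sketch of find with -name, -type and negation in a scratch directory tree, assuming a POSIX-style shell. The directory layout is hypothetical, and the character class [[:upper:]] is used here instead of [A-Z] as an assumption-free way to match capital letters regardless of locale settings.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir -p work/draft
touch work/chapter1 work/draft/chapter1 work/Notes.txt work/readme.txt

# Every entry named chapter1 under work, at any depth.
by_name=$(find work -name chapter1 -print | wc -l)

# Directories only (work itself and work/draft).
dirs=$(find work -type d -print | wc -l)

# Regular files whose names do not start with a capital letter.
# "!" negates the condition that follows it.
lower=$(find work -type f ! -name '[[:upper:]]*' -print | wc -l)

echo "$by_name $dirs $lower"
```

Quoting the -name pattern matters: unquoted, the shell could expand it against the current directory before find runs.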
    21. 21. UNIX on Windows
          • Easy: UnxUtils = UNIX “light”
               Excellent for most tasks
               Not a complete emulation of UNIX
               Download here; make sure to follow installation instructions
               More later…
          • Hard: Cygwin
               difficult to make it behave perfectly
               can run in parallel with Windows
          • Easier: create a dual boot
               Provides ability to boot either Windows or Linux
               Requires a reboot to switch…
    22. 22. Resources
          • UNIX commands: http://en.wikibooks.org/wiki/Guide_to_Unix/Comm
          • Another list of UNIX utilities: http://en.wikipedia.org/wiki/List_of_Unix_utilities
    23. 23. Everything You Need to Know About UNIX in Short Form: eBooks from Lane
          • The ultimate quick reference for Linux
          • More than you typically need, but you can zoom into what you need
    24. 24. UnxUtils Installation: The MiniMe of UNIX
          • Download
          • Installation instructions
          → Let’s do it together if you have a PC and want it