2. Andrea Telatin, Becoming a Bioinformatician
We started with…
1. Most bioinformatics file formats are text files!
2. There are quite a few robust programs
3. Our goal is often to create a pipeline
4. For most pipelines we need some glue
5. Playing with Bash
• Example 1:
• Download 3 to 5 PNG images from the web
• Install the program “ImageMagick” from the
repository (apt-get…)
BASH COMMANDS
6. Playing with Bash
• Move to the directory containing the downloaded images
• Type “convert -resize 50% image1.png small1.png”
• How can we automate this to create a smaller
version of every image?
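One possible answer is a for loop over a glob. The sketch below wraps the convert command from the slide in a function; the function name resize_all and the "small_" output prefix are illustrative choices, not part of the original exercise.

```shell
#!/usr/bin/env bash
# Sketch: shrink every PNG in a directory to 50% of its size using
# ImageMagick's convert. "resize_all" and the "small_" prefix are
# hypothetical names chosen for this example.
resize_all() {
    local dir="${1:-.}"      # directory to process, default: current one
    local img name
    for img in "$dir"/*.png; do
        [ -e "$img" ] || continue        # no PNG found: the glob did not expand
        name=$(basename "$img")
        case "$name" in
            small_*) continue ;;         # skip outputs of a previous run
        esac
        convert -resize 50% "$img" "$dir/small_$name"
    done
}

# Usage: resize_all ~/Downloads/images
```

Quoting "$img" keeps the loop working for file names that contain spaces.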
7. Playing with Bash
• Remember the “man” command (for example, “man convert”)
• Never forget about Google
8. Playing with Bash
• Now you should create a directory for today’s tasks
• Then download into it (using wget):
• http://www.telatin.com/reads.tar.gz
• http://www.telatin.com/amplicon.tar.gz
• The human chromosome 2 (hg19)
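These steps could be scripted roughly as follows. The two telatin.com URLs come from the slide; the directory name and the chromosome 2 URL are assumptions (the slide gives no link for it, and the UCSC address below should be checked before use).

```shell
#!/usr/bin/env bash
# Sketch: create a working directory and download the course files into it.
# "fetch_course_data" and the default directory name are hypothetical.
# CHR2_URL is an assumption: the slide does not say where to get hg19 chr2.
CHR2_URL="${CHR2_URL:-http://hgdownload.soe.ucsc.edu/goldenPath/hg19/chromosomes/chr2.fa.gz}"

fetch_course_data() {
    local workdir="${1:-$HOME/bioinfo_today}"
    local url
    mkdir -p "$workdir" || return 1
    for url in \
        http://www.telatin.com/reads.tar.gz \
        http://www.telatin.com/amplicon.tar.gz \
        "$CHR2_URL"
    do
        # -P sets the directory where wget saves the file
        wget -P "$workdir" "$url" || return 1
    done
}

# Usage: fetch_course_data ~/bioinfo_today
```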
9. Read alignment
• Extract the .tar.gz archives using tar. Search
Google for how to do this (tar is a strange program)
• Now we have to install bwa to align reads. We can
use the repository again.
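For reference, a minimal sketch of the extraction step. The flags are standard tar options (x = extract, z = gunzip, f = archive file, t = list contents); the function name extract_archives is illustrative, and the apt-get line assumes a Debian or Ubuntu system.

```shell
#!/usr/bin/env bash
# Sketch: extract one or more .tar.gz archives into the current directory.
# "extract_archives" is a hypothetical helper name.
extract_archives() {
    local archive
    for archive in "$@"; do
        # Listing the contents first fails early on a corrupt download
        tar -tzf "$archive" > /dev/null || return 1
        tar -xzf "$archive" || return 1
    done
}

# Usage: extract_archives reads.tar.gz amplicon.tar.gz
#
# Installing bwa from the repository (needs administrator rights):
#   sudo apt-get install bwa
```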
BIO TOOLS
10. Read alignment
• Create an index:
bwa index genome.fa
• Align reads:
bwa mem genome.fa reads.fastq > output.sam
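The two commands can be combined into a small helper that builds the index only when it is missing. This is a sketch: the function name align_reads is hypothetical, and it assumes that bwa index writes a file named genome.fa.bwt next to the genome, which is used here as the sign that the index already exists.

```shell
#!/usr/bin/env bash
# Sketch: index the genome if needed, then align reads with bwa mem.
# "align_reads" is a hypothetical name; file names are examples.
align_reads() {
    local genome="$1" reads="$2" out="$3"
    if [ ! -r "$genome" ] || [ ! -r "$reads" ]; then
        echo "align_reads: cannot read genome or reads file" >&2
        return 1
    fi
    # Indexing a whole chromosome takes a while, so do it only once
    [ -e "$genome.bwt" ] || bwa index "$genome" || return 1
    bwa mem "$genome" "$reads" > "$out"
}

# Usage: align_reads genome.fa reads.fastq output.sam
```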
11. SAMtools
• Install them from the repository
• SAM to BAM pipeline:
• samtools view -bS file.sam > file.bam
• samtools sort file.bam sorted_file
(samtools 1.3 and later: samtools sort -o sorted_file.bam file.bam)
• samtools index sorted_file.bam
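The three steps can be chained in one function. This sketch uses the "sort -o" syntax of samtools 1.x rather than the older "sort input prefix" form; the function name and the ".sorted.bam" suffix are illustrative choices.

```shell
#!/usr/bin/env bash
# Sketch: convert a SAM file to a sorted, indexed BAM (samtools 1.x syntax).
# "sam_to_indexed_bam" and the ".sorted.bam" suffix are hypothetical names.
sam_to_indexed_bam() {
    local sam="$1"
    local prefix="${sam%.sam}"           # file.sam -> file
    if [ ! -r "$sam" ]; then
        echo "sam_to_indexed_bam: cannot read '$sam'" >&2
        return 1
    fi
    samtools view -b "$sam" > "$prefix.bam" || return 1              # SAM -> BAM
    samtools sort -o "$prefix.sorted.bam" "$prefix.bam" || return 1  # sort by coordinate
    samtools index "$prefix.sorted.bam"                              # writes the .bai index
}

# Usage: sam_to_indexed_bam file.sam
```

IGV needs both the sorted BAM and its index file in the same directory.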
12. IGV
• DON’T install it from the repository. Download it
from the internet!
• Unzip it into a directory (e.g. IGV in your home)
• Launch it from the terminal: “sh igv.sh”
13. IGV
• Load human chromosome 2 as the genome
• Load both the BED and the BAM files as tracks
18. Imitating shell programs is a good way to write
good ones.
Shell programs:
• Have a guide (documentation)
• Behave in standardized ways (they are quick to learn
once you know these standards)
• Are robust (they check their input and give errors that
help us run them correctly)
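As an illustration of these three traits, here is a minimal sketch of a small tool written in the same spirit. The tool itself (count_reads) is a made-up example, and it assumes a simple FASTQ file with exactly four lines per read.

```shell
#!/usr/bin/env bash
# Sketch: a tiny program that imitates shell tools. It has a help text,
# follows common conventions (-h, exit codes, errors on stderr) and
# checks its input. "count_reads" is a hypothetical example tool.
usage() {
    echo "Usage: count_reads [-h] FILE.fastq"
    echo "Print the number of reads in a four-line-per-record FASTQ file."
}

count_reads() {
    if [ "$1" = "-h" ] || [ "$1" = "--help" ]; then
        usage                     # documentation on request
        return 0
    fi
    if [ "$#" -ne 1 ]; then
        usage >&2                 # wrong usage: explain, exit code 2
        return 2
    fi
    if [ ! -r "$1" ]; then
        echo "count_reads: cannot read '$1'" >&2
        return 1
    fi
    local lines
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))         # assumption: 4 lines per read
}

# Usage: count_reads reads.fastq
```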
19. Programming means being able to break our
goal down into steps
that a computer can carry out.