FBW
3-11-2015
Wim Van Criekinge
Bioinformatics.be
GitHub: Hosted GIT
• Largest open source git hosting site
• Public and private options
• User-centric rather than project-centric
• http://github.ugent.be (use your UGent login and password)
– Accept invitation from Bioinformatics-I-2015
URI:
– https://github.ugent.be/Bioinformatics-I-2015/Python.git
Control Structures
if condition:
    statements
[elif condition:
    statements] ...
else:
    statements

while condition:
    statements

for var in sequence:
    statements

break
continue
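The templates above can be made concrete with a small sketch. The function names, the length thresholds and the `limit` parameter are assumptions chosen for illustration only; they do not come from the course material.

```python
def classify(lengths, limit=1000):
    """Label sequence lengths, showing if/elif/else, continue and break."""
    labels = []
    for n in lengths:
        if n <= 0:
            continue              # skip invalid entries and move to the next item
        if n > limit:
            break                 # leave the loop entirely at the first oversized entry
        if n < 100:
            labels.append("short")
        elif n < 500:
            labels.append("medium")
        else:
            labels.append("long")
    return labels


def countdown(start):
    """Collect start, start-1, ..., 1 using a while loop."""
    steps = []
    while start > 0:
        steps.append(start)
        start -= 1
    return steps


print(classify([50, 0, 260, 700, 5000, 80]))
print(countdown(3))
```

Note that `break` ends the loop at 5000, so the trailing 80 is never classified, whereas `continue` only skips the single entry 0.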
Lists
• Flexible arrays, not Lisp-like linked lists
• a = [99, "bottles of beer", ["on", "the", "wall"]]
• Same operators as for strings
• a+b, a*3, a[0], a[-1], a[1:], len(a)
• Item and slice assignment
• a[0] = 98
• a[1:2] = ["bottles", "of", "beer"]
-> [98, "bottles", "of", "beer", ["on", "the", "wall"]]
• del a[-1] # -> [98, "bottles", "of", "beer"]
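The list operations on this slide can be run in sequence as follows. The second list `b` is an assumption added so that `a+b` has something to concatenate.

```python
a = [99, "bottles of beer", ["on", "the", "wall"]]
b = [1, 2]                      # hypothetical second list for the a+b example

combined = a + b                # concatenation creates a new list
repeated = b * 3                # repetition: [1, 2, 1, 2, 1, 2]
first, last = a[0], a[-1]       # indexing from the front and from the back
tail = a[1:]                    # slicing returns a new list
size = len(a)

a[0] = 98                               # item assignment
a[1:2] = ["bottles", "of", "beer"]      # slice assignment may change the length
after_slice = list(a)                   # snapshot before deleting

del a[-1]                               # remove the last item (the nested list)
print(a)
```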
Dictionaries
• Hash tables, "associative arrays"
• d = {"duck": "eend", "water": "water"}
• Lookup:
• d["duck"] -> "eend"
• d["back"] # raises KeyError exception
• Delete, insert, overwrite:
• del d["water"] # {"duck": "eend"}
• d["back"] = "rug" # {"duck": "eend", "back": "rug"}
• d["duck"] = "duik" # {"duck": "duik", "back": "rug"}
Regex.py
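import re

text = 'abbaaabbbbaaaaa'
pattern = 'ab'
# Report every non-overlapping match of the pattern together with its span
for match in re.finditer(pattern, text):
    s = match.start()
    e = match.end()
    print('Found "%s" at %d:%d' % (text[s:e], s, e))

# Capture a single leading capital letter followed by a space
# (line is assumed to hold one line of input text)
m = re.search("^([A-Z]) ", line)
if m:
    from_letter = m.groups()[0]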
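Both patterns can be checked in a self-contained sketch. The helper name `leading_letter` and the two sample lines are assumptions; the sample lines mimic records whose lines begin with a one-letter field code.

```python
import re

text = "abbaaabbbbaaaaa"

# Collect the (start, end) span of every non-overlapping "ab" match
spans = [(m.start(), m.end()) for m in re.finditer("ab", text)]


def leading_letter(line):
    """Return the capital letter that starts the line, or None if there is none."""
    m = re.search(r"^([A-Z]) ", line)
    return m.group(1) if m else None


print(spans)
print(leading_letter("H ZIMJ680104"))
print(leading_letter("  6.00 10.76"))
```

`re.search` returns `None` when nothing matches, which is why the result is tested before a group is read.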
Install Biopython
pip is the preferred installer program. Starting with Python 3.4, it is included by default with the Python binary installers.
pip3.5 install Biopython
#pip3.5 install yahoo_finance
from yahoo_finance import Share
yahoo = Share('AAPL')
print (yahoo.get_open())
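After installing, one way to confirm that a package is visible to the interpreter is to ask the import system for it. This is a minimal sketch using only the standard library; the helper name `is_installed` is an assumption. Biopython is imported under the top-level name `Bio`.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


for name in ("Bio", "json"):
    status = "available" if is_installed(name) else "not installed"
    print(name, status)
```

The helper is meant for top-level names only; for dotted names `find_spec` first imports the parent package and can raise an error if that parent is missing.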
BioPython
• Make a histogram of the MW (in kDa) of all proteins in Swiss-Prot
• Find the most basic and the most acidic protein in Swiss-Prot
• What is the biological relevance of the results?
From AAIndex
H ZIMJ680104
D Isoelectric point (Zimmerman et al., 1968)
R LIT:2004109b PMID:5700434
A Zimmerman, J.M., Eliezer, N. and Simha, R.
T The characterization of amino acid sequences in proteins by statistical methods
J J. Theor. Biol. 21, 170-201 (1968)
C KLEP840101 0.941 FAUJ880111 0.813 FINA910103 0.805
I    A/L   R/K   N/M   D/F   C/P   Q/S   E/T   G/W   H/Y   I/V
    6.00 10.76  5.41  2.77  5.05  5.65  3.22  5.97  7.59  6.02
    5.98  9.74  5.74  5.48  6.30  5.68  5.66  5.89  5.66  5.96
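The I block of the entry can be turned into a dictionary keyed by residue. In each `X/Y` column header, the first letter belongs to the first row of values and the second letter to the second row. The sketch below uses only the standard library; the function names are assumptions, and averaging the per-residue values over a sequence is only a crude illustrative score, not a calculated isoelectric point.

```python
header = "A/L R/K N/M D/F C/P Q/S E/T G/W H/Y I/V"
row1 = "6.00 10.76 5.41 2.77 5.05 5.65 3.22 5.97 7.59 6.02"
row2 = "5.98 9.74 5.74 5.48 6.30 5.68 5.66 5.89 5.66 5.96"


def parse_index_block(header, row1, row2):
    """Map each residue letter to its value from the two data rows."""
    index = {}
    pairs = [column.split("/") for column in header.split()]
    for (first, second), v1, v2 in zip(pairs, row1.split(), row2.split()):
        index[first] = float(v1)    # first letter -> value in the first row
        index[second] = float(v2)   # second letter -> value in the second row
    return index


def mean_index(seq, index):
    """Average the index over the residues of seq that have an entry."""
    values = [index[aa] for aa in seq if aa in index]
    return sum(values) / len(values) if values else None


pi_index = parse_index_block(header, row1, row2)
print(pi_index["R"], pi_index["D"])
print(mean_index("RK", pi_index))
```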
Biopython AAindex -> Dictionary
… file parser
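from Bio import SeqIO

c = 0
# The path is specific to the author's machine; point it at your own copy of uniprot_sprot.dat
handle = open(r'/Users/wvcrieki/Downloads/uniprot_sprot.dat')
for seq_rec in SeqIO.parse(handle, "swiss"):
    print(seq_rec.id)
    print(repr(seq_rec.seq))
    print(len(seq_rec))
    c += 1
    if c > 5:  # stop after the first six records
        break
handle.close()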
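For the histogram exercise, once a molecular weight in daltons is available for each record (for instance from Biopython's `Bio.SeqUtils.molecular_weight` with `seq_type="protein"`), the binning itself needs only the standard library. The sketch below assumes such a list of weights already exists; the function names, the 10 kDa bin width and the sample weights are illustrative.

```python
from collections import Counter


def kda_histogram(weights_da, bin_width_kda=10):
    """Count weights (given in Da) per kDa bin; keys are the lower bin edges."""
    counts = Counter()
    for weight in weights_da:
        kda = weight / 1000.0
        bin_start = int(kda // bin_width_kda) * bin_width_kda
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


def render(hist, bin_width_kda=10):
    """Draw one text bar per bin."""
    lines = []
    for start, count in hist.items():
        lines.append("%4d-%-4d kDa | %s" % (start, start + bin_width_kda, "#" * count))
    return "\n".join(lines)


hypothetical_weights = [5000, 12000, 18500, 47000]   # made-up values in Da
hist = kda_histogram(hypothetical_weights)
print(render(hist))
```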
Parsing sequences from the net
Parsing GenBank records from the net
Parsing SwissProt sequence from the net
Handles are not always from files
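>>> from Bio import Entrez
>>> from Bio import SeqIO
>>> # NCBI asks callers to identify themselves by setting Entrez.email before querying
>>> handle = Entrez.efetch(db="nucleotide", rettype="fasta", id="6273291")
>>> seq_record = SeqIO.read(handle, "fasta")
>>> handle.close()
>>> seq_record.description

>>> from Bio import ExPASy
>>> from Bio import SeqIO
>>> # "6273291" is the nucleotide ID from the example above; it likely needs replacing with a Swiss-Prot accession here
>>> handle = ExPASy.get_sprot_raw("6273291")
>>> seq_record = SeqIO.read(handle, "swiss")
>>> handle.close()
>>> print(seq_record.id)
>>> print(seq_record.name)
>>> print(seq_record.description)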
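The point that handles are not always files can be shown without a network connection: any object that yields lines of text can play the role of a handle. The sketch below feeds an in-memory `io.StringIO` object to a tiny single-record FASTA reader. The reader `read_fasta` is a hypothetical stand-in written for this example, not Biopython's parser, and the record content is made up.

```python
import io


def read_fasta(handle):
    """Read exactly one FASTA record from any iterable of text lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                raise ValueError("more than one record found")
            header = line[1:]
        else:
            chunks.append(line)
    if header is None:
        raise ValueError("no record found")
    return header, "".join(chunks)


# An in-memory handle: same interface as an open file or a network response
handle = io.StringIO(">demo1 hypothetical record\nACGT\nTTGA\n")
description, sequence = read_fasta(handle)
handle.close()
print(description, sequence)
```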
Extra Questions
• How many records have a sequence of length 260?
• What are the first 20 residues of 143X_MAIZE?
• What is the identifier for the record with the shortest sequence? Is there more than one record with that length?
• What is the identifier for the record with the longest sequence? Is there more than one record with that length?
• How many contain the subsequence "ARRA"?
• How many contain the substring "KCIP-1" in the description?
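One possible shape for answering these questions is a set of small helpers over a list of records. The sketch below uses a `namedtuple` with `id`, `description` and `seq` fields as a stand-in for parsed records, and four made-up toy records; real answers require running the same logic over the records parsed from the Swiss-Prot file. All helper names are assumptions.

```python
from collections import namedtuple

Record = namedtuple("Record", "id description seq")

# Toy data, invented for illustration only
records = [
    Record("REC_A", "toy protein KCIP-1 homolog", "MARRAKL"),
    Record("REC_B", "toy protein", "MK"),
    Record("REC_C", "another toy", "MKARRAQQ"),
    Record("REC_D", "toy", "GG"),
]


def count_length(records, n):
    """Number of records whose sequence has length n."""
    return sum(1 for r in records if len(r.seq) == n)


def ids_with_extreme_length(records, pick=min):
    """IDs of all records sharing the shortest (pick=min) or longest (pick=max) length."""
    target = pick(len(r.seq) for r in records)
    return [r.id for r in records if len(r.seq) == target]


def count_subsequence(records, motif):
    """Number of records whose sequence contains motif."""
    return sum(1 for r in records if motif in str(r.seq))


def count_in_description(records, text):
    """Number of records whose description contains text."""
    return sum(1 for r in records if text in r.description)


def first_residues(records, record_id, n=20):
    """First n residues of the record with the given ID, or None if absent."""
    for r in records:
        if r.id == record_id:
            return str(r.seq)[:n]
    return None


print(ids_with_extreme_length(records, min))
print(count_subsequence(records, "ARRA"))
```

Returning a list of IDs rather than a single ID answers the follow-up question directly: more than one entry in the list means several records share that length.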

2015 bioinformatics bio_python_partii
