Pro-Gyan
Build and share protein classifiers from FASTA files

http://code.google.com/p/pro-gyan/

Published in: Technology

  1. Pro-Gyan: Build and share protein classifiers from FASTA files
  2. What is Pro-Gyan? It builds a binary classifier directly from protein sequences. It calculates ~5000 different properties from the protein sequence, selects a “maximal relevant and minimal redundant feature subset”, and ranks those features by applying information theory. The top-ranked features are used to build the final SVM classifier with 5-fold cross-validation.
  3. How to use Pro-Gyan • Download “Pro_Gyan_1.0.zip” from https://code.google.com/p/pro-gyan/downloads/list • Extract all the files. • Double-click Pro-Gyan.jar, which opens the main window of “Pro-Gyan”. • Let us build a classifier.
  4. How to build a protein classifier 1. To build a classifier, we need two sets of proteins (such as mitochondrial and non-mitochondrial) in FASTA format. 2. Now press “Create Classifier”.
  5. How to build a protein classifier
  6. How to build a protein classifier • Give a name to your classifier. • Add a description of the classifier and the data set. • Label the positive and negative input data appropriately. • Browse to the FASTA files and press the “Save” button.
  7. How to build a protein classifier: Pro-Gyan is now ready to build a new classifier. Press “Self Learn”; this will take some time, depending on the data size.
  8. Different performance metrics: confusion matrix; accuracy, sensitivity, specificity; Matthews correlation coefficient.
  9. Receiver operating characteristic
  10. Selected ranked features & statistics
  11. Evaluation on independent test data: Data not used in training is important for evaluating the chance of overfitting.
  12. Performance on independent test set
  13. Export the classifier: The classifier can be exported/saved in “Pro-Gyan classifier” (pgc) format and then uploaded to a web server, e-mailed, etc. The name and description of the classifier can be updated at the time of export.
  14. Classify novel proteins
  15. Classify novel proteins • Import the required Pro-Gyan-built classifier.
  16. Training information of the classifier
  17. Classify novel proteins • Copy-paste multiple protein sequences in FASTA format into the text area, or upload them with the “Fasta File” button, and “Classify” them.
  18. Prediction result • The result is displayed in tabular format, which can be copy-pasted into any text file or spreadsheet.

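
The performance slide names the confusion matrix, accuracy, sensitivity, specificity and the Matthews correlation coefficient without giving their definitions. As a minimal sketch using the standard textbook formulas (not code taken from Pro-Gyan), they can be computed from true and predicted binary labels as follows. The label vectors are made-up illustrative values, and the helper names are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def binary_metrics(tp, tn, fp, fn):
    """Standard metrics; each falls back to 0.0 when its denominator is zero."""
    total = tp + tn + fp + fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,  # true positive rate
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,  # true negative rate
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Made-up labels for illustration: 4 positives and 4 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("confusion matrix (TP, TN, FP, FN):", (tp, tn, fp, fn))
print(binary_metrics(tp, tn, fp, fn))
```

Computing the same quantities on sequences that were held out of training gives the independent-test view described in the slides, which is what indicates whether the cross-validated figures are inflated by overfitting.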