Bioinformatics presentation to students University of Minho
Published in: Technology
  1. PROTOFILWW: A computational platform for the analysis of the relationships between microorganisms and environmental parameters in activated sludge plants
     José Fernandes, Bioinformatics Master Thesis
     Prof. Anália Lourenço, Prof. Ana Nicolau
  2. System requirements
     • Data insertion and retrieval must be quick and easy
     • Data must be exportable so it can be analyzed with other software systems
     • Support for statistical assessments
     • User-friendly visualization capabilities
     • Controlled access to data, based on user roles, to account for data privacy issues
     • Easy dissemination of related studies and results
     • Always online (web-based)
     • Help in finding additional information about the microorganisms present in the biological samples
  3. Overview of the workflow of field and lab work (diagram: PROTOFILWW)
  4. 1,635 lines x 137 columns
  5. ProtoFilWW system major components
     1. Content Management component: supports the researchers in managing and analyzing the data obtained from the WWTP samples
     2. Text Mining component: finds additional information about the microorganisms present in the biological samples
  6. High-level integration perspective of ProtoFilWW (diagram labels: Drupal core; plugins: Import data, Reports, Access control, Other services...; PROTOFILWW; SQL; import formats: XLS, TXT, CSV; Export data: XLS, TXT, CSV; Solr/Lucene; Views Solr Backend; Views; XML; Relational Database; UIMA)
  7. Content Management component
     • Open source Content Management System (CMS) and Content Management Framework (CMF)
     • Highly modular and highly extensible
     • Built in the PHP scripting language
  8. WWTP Sample
     1. Filamentous bacteria
     2. Protozoa
     3. Metazoa
     4. Physical-chemical
     5. Sample characterization
  9. User roles (columns: visitors, collaborators, WWTP researchers, administrators)
     • Find studies and results: x x x x
     • Contact researchers: x x x
     • Analysis of available data: x
     • Data insertion: x x
     • Creation of reports: x
     • Export data: x
     • Managing users: x
     • Backup data: x
     • Text Mining: x x x x
  10. Dynamic reporting and charting (screenshots: Reports creation; Reports display)
  11. Geolocation of the WWTPs (screenshots: Address geocoding; Map display)
  12. Text Mining component: listing the species mentioned in a document
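To make the idea of "listing the species mentioned in a document" concrete, here is a minimal sketch of dictionary-based name matching in Python. It is an illustration only, not the LINNAEUS implementation or its API: the `SPECIES_DICTIONARY` contents and the `list_species` function are hypothetical, and a real system would also handle abbreviations and ambiguous names.

```python
import re

# Hypothetical toy dictionary: species name -> NCBI Taxonomy identifier.
# A real dictionary would be far larger and include synonyms.
SPECIES_DICTIONARY = {
    "Escherichia coli": 562,
    "Homo sapiens": 9606,
    "Saccharomyces cerevisiae": 4932,
}


def list_species(text, dictionary=SPECIES_DICTIONARY):
    """Return sorted (name, taxonomy_id) pairs for dictionary names found in text."""
    found = set()
    for name, tax_id in dictionary.items():
        # Case-insensitive match on the whole name, bounded by word boundaries.
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add((name, tax_id))
    return sorted(found)
```

Calling `list_species` on the text of a document returns each recognized species once, together with the identifier it was normalized to.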
  13. Major Text Mining technologies used
     • Lucene is a high-performance text search engine library.
     • Solr is a standalone enterprise search server with a REST-like API.
     • UIMA is a powerful infrastructure for the storage, transport, and retrieval of document and annotation knowledge accumulated in NLP pipeline systems.
     • LINNAEUS is a popular organism name identification system for biomedical literature that is capable of normalizing to unambiguous NCBI taxonomy identifiers.
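As an illustration of what a "REST-like API" means for Solr, the sketch below builds the URL of a search request against Solr's standard `select` handler. The host, the core name `articles`, and the field name `species` are assumptions made for the example; they are not taken from the ProtoFilWW configuration, and the exact URL layout depends on the Solr version and setup.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, core, query, rows=10):
    """Build a Solr select URL that asks for JSON results.

    base_url and core are hypothetical deployment details.
    """
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


# Hypothetical usage: search a field named "species" for a genus name.
url = build_solr_query_url(
    "http://localhost:8983/solr", "articles", 'species:"Vorticella"', rows=5
)
```

The resulting URL can be fetched with any HTTP client, which is what makes Solr straightforward to integrate with a web platform.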
  14. Text Mining process in ProtoFilWW (diagram labels: PMC Open Access Subset; XPath conversion; Solr XML documents; Solr/Lucene; Solr UIMA; LINNAEUS)
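The "XPath conversion" step, turning PMC Open Access articles into Solr XML documents, could look roughly like the following Python sketch. It uses the limited XPath support in the standard library and Solr's `<add><doc><field name="...">` update format. The choice of fields (`id`, `title`, `abstract`) and the function name are assumptions for illustration; the actual ProtoFilWW mapping is not shown in the slides.

```python
import xml.etree.ElementTree as ET


def _flat_text(element):
    """Concatenate all text inside an element and normalize whitespace."""
    if element is None:
        return ""
    return " ".join("".join(element.itertext()).split())


def pmc_to_solr_xml(pmc_xml):
    """Convert one PMC article XML string into a Solr XML <add> document.

    The target field names are hypothetical.
    """
    article = ET.fromstring(pmc_xml)
    fields = {
        "id": _flat_text(article.find(".//article-id[@pub-id-type='pmc']")),
        "title": _flat_text(article.find(".//article-title")),
        "abstract": _flat_text(article.find(".//abstract")),
    }
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    return ET.tostring(add, encoding="unicode")


# Made-up sample article, used only to exercise the function.
sample = (
    "<article><front><article-meta>"
    "<article-id pub-id-type='pmc'>12345</article-id>"
    "<title-group><article-title>Protozoa in <italic>sludge</italic>"
    "</article-title></title-group>"
    "<abstract><p>Short text.</p></abstract>"
    "</article-meta></front></article>"
)
solr_xml = pmc_to_solr_xml(sample)
```

The output string can then be posted to Solr's update handler for indexing.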
  15. Solr LINNAEUS Annotator (screenshots: UIMA Component Descriptor Editor plugin; UIMA type system for LINNAEUS)
  16. LINNAEUS UIMA wrapper running on CVD
  17. Drupal Views Solr Backend
  18. Major contributions
     1. The Web-based computational system: www.protofilww.org
     2. The Drupal module Views Solr Backend
     3. The Solr UIMA plug-in for LINNAEUS Annotator
  19. What I have been working on since the master's…
  20. Preventive Medicine
     • Alert the user to the risk of Type 2 Diabetes.
     • How?
       1. We know the user has a gene mutation associated with Type 2 Diabetes, because he gave us his genome!
       2. We know what he has eaten, because he told us!
       3. We know what exercise he has been doing, because he told us!
       4. Genehome connects the dots!