DATABASE TECHNOLOGIES
IN BIOINFORMATICS
GLEB SKLYR
THE PROBLEM
• Bioinformatics research produces highly irregular and
unstructured data
• Example: gene EGFR
THE PROBLEM
• New emerging technologies allow data to be generated quicker, cheaper, and in
larger quantities
• Example:
Gebelhoff, Robert. "Sequencing the genome creates so much data we don’t know what to do with it." The Washington Post. WP Company, 07 July 2015. Web. 01 May 2017.
THE PROBLEM
• Bioinformatics data is generated globally and is stored and
processed at multiple sites around the world. Each research
center and university has its own data storage solution, and
many different centralized repositories exist
• Examples:
THE PROBLEM
• Additionally, data analysis algorithms are complex
• Examples:
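- Pairwise global alignment by dynamic programming: O(NM) for sequences of lengths N and M (BLAST avoids this cost with a heuristic local search)
- Multiple Sequence Alignment: O(2^N L^N) for N sequences of length L
…Most algorithms use heuristic approaches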
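The O(NM) figure for pairwise global alignment comes from filling a dynamic-programming table with one cell per pair of positions. A minimal sketch of the Needleman-Wunsch score computation follows; the scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions chosen for illustration, not taken from the slides.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Score of the best global alignment of a and b (Needleman-Wunsch).

    Time is O(N*M) for len(a) == N and len(b) == M; only two rows of the
    table are kept, so memory is O(M).
    """
    # prev[j] = best score for aligning the empty prefix of a with b[:j]
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # curr[0] = best score for aligning a[:i] with the empty prefix of b
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1], or open a gap in either sequence.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

The doubly nested loop is what makes exact alignment expensive for long sequences, which is one reason heuristic tools are preferred for database-scale searches.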
MOTIVATION
• Understand the “secret of life”: how biology works
• Replicate biological processes
• Cure disease
• Much more
MOTIVATION
• Every paper repeats the 3 points: data is unstructured,
scattered, and growing fast (“data tsunami”)
• This field has many problems that individual companies do
not face, which makes it unique
• What solutions exist? What solutions are proposed?
• As a database administrator/designer, how can you alleviate the
hard work that goes into bioinformatics?
EXISTING WORK – XML IN RDBMS
EXISTING WORK – ORACLE RDBMS
• Offers an XML data type
• Has data mining libraries
• Continuously works to adapt to industry standards
• ACID – Atomicity, Consistency, Isolation, Durability
PROBLEM
• Relational databases are constrained by schema and
relationships: every row in a table has the same columns, and
foreign key constraints must hold
• Performance degrades with increasing schema complexity,
data volume, and data distribution
SOLUTION – NOSQL SYSTEMS
• Are not restricted by schema or relationships
• Designed with performance in mind
• Designed with data distribution in mind
• Highly scalable
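To make the contrast with a fixed relational schema concrete, here is a small sketch of schema-free, document-style records using plain Python dictionaries. It does not use any database driver; the field names, the placeholder record `GENE_X`, and the helper functions `find` and `field_names` are hypothetical and chosen only for illustration.

```python
# Hypothetical gene records: each document carries only the fields it has.
# In a relational table, every row would need the same set of columns.
genes = [
    {"symbol": "EGFR", "chromosome": "7", "aliases": ["ERBB1", "HER1"]},
    {"symbol": "TP53", "chromosome": "17", "pathways": [{"name": "apoptosis"}]},
    {"symbol": "GENE_X"},  # placeholder record with almost nothing known
]


def find(docs, **criteria):
    """Return the documents whose top-level fields equal every criterion."""
    return [d for d in docs
            if all(d.get(key) == value for key, value in criteria.items())]


def field_names(docs):
    """Collect every top-level field that appears in at least one document."""
    return sorted({key for d in docs for key in d})


if __name__ == "__main__":
    print(find(genes, chromosome="7"))
    print(field_names(genes))
```

Documents that lack a queried field simply do not match, so new kinds of annotation can be added to some records without altering the others.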
SOLUTIONS – MONGODB
UNSTRUCTURED DATA
SOLUTIONS – CASSANDRA FOR COMPUTATIONALLY INTENSIVE DATA
CASE STUDY - BIGNASIM
CONCLUSION
• NoSQL technologies are the future of bioinformatics
• In a field of unstructured, distributed, and rapidly growing
data, it is important to be able to pick the right system for your
application
BIBLIOGRAPHY
• Blackwell, Bruce, and Siva Ravada. "Oracle's technology for bioinformatics and future directions." ACM Digital
Library. Australian Computer Society, Inc., n.d. Web. 03 May 2017.
• Alger, Abdullah. "Redis and MongoDB in the biomedical domain." Compose Articles. Compose Articles, 03 Feb.
2017. Web. 03 May 2017.
• Aniceto, Rodrigo, Rene Xavier, Maristela Holanda, Maria Emilia Walter, and Sergio Lifschitz. "Genomic data
persistency on a NoSQL database system." 2014 IEEE International Conference on Bioinformatics and Biomedicine
(BIBM) (2014): n. pag. Web.
• Gebelhoff, Robert. "Sequencing the genome creates so much data we don’t know what to do with it." The
Washington Post. WP Company, 07 July 2015. Web. 01 May 2017.
• Guimaraes, Valeria, Fernanda Hondo, Rodrigo Almeida, Harley Vera, Maristela Holanda, Aleteia Araujo, Maria
Emilia Walter, and Sergio Lifschitz. "A study of genomic data provenance in NoSQL document-oriented database
systems." 2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (2015): n. pag. Web.
• Hospital, Adam, Pau Andrio, Cesare Cugnasco, Laia Codo, Yolanda Becerra, Pablo D. Dans, Federica Battistini,
Jordi Torres, Ramón Goñi, Modesto Orozco, and Josep Ll. Gelpí. "BIGNASim: a NoSQL database structure and
analysis portal for nucleic acids simulation data." Nucleic Acids Research 44.D1 (2015): n. pag. Web.
• Lima, Iasmini, Matheus Oliveira, Diego Kieckbusch, Maristela Holanda, Maria Emilia M. T. Walter, Aleteia Araujo,
Marcio Victorino, Waldeyr M. C. Silva, and Sergio Lifschitz. "An evaluation of data replication for bioinformatics
workflows on NoSQL systems." 2016 IEEE International Conference on Bioinformatics and Biomedicine
(BIBM) (2016): n. pag. Web.
• Stromback, Lena, and Juliana Freire. "XML Management for Bioinformatics Applications." Computing in Science &
Engineering 13.5 (2011): 12-23. Web.
QUESTIONS
