The document discusses the ISA software suite, which was created to address the problems of fragmented data formats and inconsistent terminology that make it difficult for biologists to submit and share experimental data. The key components of the ISA suite are ISA-Tab, a general-purpose yet flexible format for experimental metadata, and the ISAtools, which provide software to help biologists get data into the ISA-Tab format, share it, and convert or analyze it. The goal is to make it easier for researchers to report and reuse experimental data.
ISA tools presentation
1. The ISA software suite
Eamonn Maguire
Lead Software Engineer
eamonn.maguire@oerc.ox.ac.uk
Novartis, 21st October 2011
Tuesday, 8 November 2011
2. Who am I?
it’s rhetorical...
Irish
Formal background is Computer Science (Bachelors) and Bioinformatics (Masters)
Lead software engineer on the ISA project
DPhil Student at Oxford in Visualization in the Dept. of Computer Science
Have my own graphic design company (Antarctic Design)
Part of a small but productive and vibrant team at Oxford
headed by Susanna-Assunta Sansone.
Our work includes the ISA tools/infrastructure, MIBBI &
BioSharing.
3. What is ISA all about?
We want to enable better reporting of experiments...
We want to make it easier for submitters...
We want to provide tooling which biologists will want to use...
6. What’s the problem?
Could be beans. Could be peas. Could be soup.
- a different representation...non-Latin language
Might be petit pois - a different terminology
Analogy time.
Each can is an experiment. We have no labels, so no indication about what is in the can.
(Tin can analogy borrowed from Norman Morrison & converted from ontologies to metadata transfer standards.)
In biology, things aren’t quite as bad as this: we have some labels, but they aren’t all in the same language. What do I mean by this? Well...
1. there is fragmentation: the formats used to describe experiments are different, e.g. MAGE-Tab, PRIDE-ML, SRA-XML, but in essence they capture much of the same information; and
2. the terminologies used to describe experiments are different, even though many concepts are shared, such as sample description. Field names as well as values...
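9. What’s the problem?
Can you imagine having to translate everything you write into a different language in order to submit your data?
译 语 编 吗 转换
译 错
(Chinese characters, roughly: translate, language, compile, a question particle, convert; translate, wrong.)
An féidir leat a shamhlú go bhfuil gach rud a scríobh tú a aistriú isteach i dteanga eile d'fhonn a chur isteach do chuid sonraí? Fiú uirlisí chomhshó, cosúil le google translate a fháilsé mícheart.
(Irish, roughly: "Can you imagine having to translate everything you write into another language in order to submit your data? Even conversion tools, like Google Translate, get it wrong.")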
10. Take home point...
Repositories are making it difficult for biologists to submit data, and for others to use it. This is particularly true for those performing multi-omic experiments: to submit, say, proteomic and transcriptomic data, one must provide the same general data in two very different formats...why?
Well, people like to have their own formats...plus, ad hoc is easier in general.
Our solution is one general purpose, flexible format, herein referred to as ISA-Tab.
A domain agnostic format to capture experimental metadata in omic experiments (transcriptomic, genomic, proteomic, metabolomic) as well as traditional experiments such as clinical chemistry and histology.
...it works on lots of (I won’t dare say all) types of data...nutrigenomics, toxicogenomics, public health... etc.
11. Tell me more...
investigation: high level concept to link related studies
study: the central unit, containing information on the subject under study, its characteristics and any treatments applied. A study has associated assays.
assay: test performed either on material taken from the subject or on the whole initial subject, which produces qualitative or quantitative measurements (data)
assay(s): pointers to data file names/location
data: external files in native or other formats
Biologists like tab. They don’t like XML. Through basic inference...ISA-Tab is good :)
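The investigation / study / assay hierarchy described above can be sketched as a small data model. This is only an illustration in Python, not the object model used by the (Java-based) ISA tools; every class name, field name and example value below is an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    """A test performed on the subject or on material taken from it."""
    measurement: str
    # Pointers to external data files (names/locations) kept in their native formats.
    data_files: List[str] = field(default_factory=list)


@dataclass
class Study:
    """The central unit: the subject under study plus its associated assays."""
    subject: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    """High-level concept that links related studies."""
    title: str
    studies: List[Study] = field(default_factory=list)

    def all_data_files(self) -> List[str]:
        # Walk investigation -> study -> assay and collect the file pointers.
        return [f for s in self.studies for a in s.assays for f in a.data_files]


# Hypothetical example content.
investigation = Investigation(
    title="Example investigation",
    studies=[
        Study(
            subject="H. sapiens",
            assays=[
                Assay("transcription profiling", ["h1-s1.cel", "h1-s2.cel"]),
                Assay("protein expression profiling", ["h1-s1.raw"]),
            ],
        )
    ],
)
```

The point of the nesting is that one study can carry several assays of different kinds while the sample description is recorded once.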
12. But we don’t want to do this...
http://xkcd.com/927/
13. A format on its own isn’t very much though...
Too true...the secret to adoption is to provide the tooling to enable biologists to get data into the format, share it, convert and analyse it!
The ISAtools provide this tool support.
14. The ISA tools
Developed on top of the ISA-Tab format...modular, configurable, open source, Java based*
converter, isacreator & others being developed by the ISA community...
Perl Parser for ISA by Bob MacCallum and Python Parser for ISA by Brad Chapman
*apart from the R, Perl and Python packages of course...
15. The ISA tools... modular
Convert to ISA (converter): Convert from MAGE-Tab to ISATab. More formats coming soon...
Convert from ISA (converter): Convert to MAGE-TAB, PRIDE-ML, SRA-XML for submission to international public repositories
Configure: Curator creates template
Create (isacreator): Experimentalist uses editor to report investigation.
Validate: Check adherence to template
Load: Curator stores metadata in database using BII data management tool
Browse: Users browse investigations, query and view experimental metadata, and access associated data files
Analyze: Perform analysis of data in context with the metadata using the Galaxy or R analysis engines.
Requires Configuration XML
16. The ISA tools... configurable
Are you just using buzz words? Well we like buzz words as much as everyone else, but no.
We need to be configurable to support evolving checklists and requirements. Just check out
mibbi.org, lots of checklists! 32 in fact at the last count.
MIBBI is trying to harmonise these checklists to reduce redundancy and make them
interoperable.
17. Checklists...what are they?
When we report things, there are some things which are really important.
In a school report, we have the child’s name, their class, teacher, subjects taken and so on.
Well, in a biological experiment, the very same principles apply. We need information about the
sample (species, strain, age) and information about the protocols applied during the experiment
and subsequent parameters.
We have 32 checklists at present because there are differences in what is deemed important
depending on the experiment being performed.
Good reporting means that statistics can be applied better, experiments can be reproduced more
easily, and data mashups can occur in the future.
Experiments are expensive, we should make sure that their full value is realised.
18. On this point...
Helping to demystify the
unwieldy world of
standards...
Find out what standards are out
there...MI Checklists, ontologies
and formats plus what domains
they are suited to...
Find out about data sharing
policies from NIH for example.
19. Configurable...back to that
We need to support lots of different checklists,
and it should be easy for people to change their
requirements should they need to....
So, our infrastructure is built upon XML files.
These are created by the ISAConfigurator.
A configuration XML file describes the fields (or
checklist) required to describe a particular
experiment!
20. ISAconfigurator
ISAconfigurator: The brick maker...a kiln
Configuration XML: The bricks...
24. The configuration xml...
This is an example of a field definition created by the
configurator. In this instance we are describing a label
field, in particular, one used to describe the label used in
a microarray experiment.
We have defined it to come from an ontology, and we
recommend the ChEBI ontology. It is also required.
25. The configuration xml...
Aside from strong ontology support, the configuration xml also lets you specify regular expressions which field contents should match, and whether a field is an integer, double, list value, boolean, string or a field which should accept a file location...
The configuration xml is an important part of the infrastructure and is utilised in various components in differing capacities.
isacreator: Used in content validation, but its main purpose here is to build the user interface...more on this later.
Used in content validation. The validation component is also called in the ISAconverter and BII data manager before conversion and loading respectively.
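To make the kinds of field checks described above concrete, here is a minimal sketch in Python of validating one value against one field definition. It is not the ISA validator: the `FieldDefinition` class, its attribute names and the two example fields are assumptions introduced purely for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FieldDefinition:
    """Hypothetical stand-in for one field entry in a configuration xml."""
    name: str
    data_type: str = "string"   # "string", "integer", "double", "boolean" or "list"
    required: bool = False
    pattern: Optional[str] = None               # regular expression the value must match
    allowed_values: Optional[List[str]] = None  # used when data_type == "list"


def validate(defn: FieldDefinition, value: str) -> List[str]:
    """Return a list of problems; an empty list means the value is acceptable."""
    if value == "":
        return [f"{defn.name}: a value is required"] if defn.required else []

    problems = []
    if defn.data_type == "integer":
        if not re.fullmatch(r"[+-]?\d+", value):
            problems.append(f"{defn.name}: '{value}' is not an integer")
    elif defn.data_type == "double":
        try:
            float(value)
        except ValueError:
            problems.append(f"{defn.name}: '{value}' is not a number")
    elif defn.data_type == "boolean":
        if value.lower() not in ("true", "false"):
            problems.append(f"{defn.name}: '{value}' is not a boolean")
    elif defn.data_type == "list":
        if value not in (defn.allowed_values or []):
            problems.append(f"{defn.name}: '{value}' is not an allowed value")

    # The regular expression check applies on top of the type check.
    if defn.pattern is not None and not re.fullmatch(defn.pattern, value):
        problems.append(f"{defn.name}: '{value}' does not match {defn.pattern}")
    return problems


# Hypothetical field definitions.
age = FieldDefinition(name="age", data_type="integer", required=True)
data_file = FieldDefinition(name="data file", pattern=r".+\.cel")
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a submission at once.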
26. isacreator
Create & Edit ISA-Tab
27. The ISAcreator...
isacreator
Developed to be a user friendly way to enter standards-compliant metadata: it has lots of features...
file chooser
publication searcher
visualization
ontology search
QR code generator
automated ontology tagging (powered by ncbo annotator)
spreadsheet-like interface
But these are just some of them...we also have a data entry wizard and an import utility...
28. Use of the configuration xml
Configuration xml schema (XSD) is consumed by an XML beans goal in maven and Java stubs are created which are then used to load the XML files into memory
XML definition(s)
Import into Java Object Model using classes created by XML beans (Java Object: TableReferenceObject)
Construct spreadsheet model. Columns, rows, etc.
Assign cell editors. Ontology terms are given the ontology selection tool as a cell editor, file fields are given a file chooser etc.
<xml>
<field>sample</field>
<field>protocol ref</field>
<field>extract name</field>
<field>label</field>
...
</xml>
The configuration is also used to define the form view using a similar mechanism....
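The flow above (load field definitions, build columns, assign a cell editor per field) can be sketched in a few lines. The real tools use Java classes generated by XML beans; this Python sketch uses the standard library parser instead, and the XML layout (a `<field>` element with `name` and `type` attributes), the editor names and the `build_columns` function are all assumptions, not the actual ISA configuration schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration layout; the real ISA configuration schema differs.
CONFIG_XML = """
<configuration>
  <field name="sample" type="string"/>
  <field name="protocol ref" type="string"/>
  <field name="label" type="ontology"/>
  <field name="raw data file" type="file"/>
</configuration>
"""

# Which editor a spreadsheet cell gets for each field type (illustrative names).
EDITOR_FOR_TYPE = {
    "ontology": "ontology selection tool",
    "file": "file chooser",
}


def build_columns(config_xml: str):
    """Parse the field definitions and pair each column with a cell editor."""
    root = ET.fromstring(config_xml)
    columns = []
    for field_el in root.findall("field"):
        field_type = field_el.get("type", "string")
        editor = EDITOR_FOR_TYPE.get(field_type, "plain text editor")
        columns.append((field_el.get("name"), editor))
    return columns


columns = build_columns(CONFIG_XML)
```

Because the columns are derived from the configuration at load time, changing a checklist means editing the XML rather than the interface code.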
29. Sounds good...what does it look like?...
41. Ontologies
We use the NCBO BioPortal and the EBI’s OLS to do searching and browsing on ontologies.
Features: ontology field restriction, ontology browsing & searching, ontology tagging.
Ontology Resource Manager
The resource manager provides seamless searching of ontology resources, regardless of their origins, their underlying data schema or the mechanism (REST, SOAP or local file store) through which they are accessed.
Sources: NCBO BioPortal (Search, Hierarchy and Annotator services), Ontology Lookup Service (OLS), Plugin
ISAcreator manages ontology metadata such as version information as well as individual term accessions, source, URI and so forth.
Ontology search code is usable outside of ISAcreator. In fact, the ISAconfigurator imports ISAcreator as a Maven dependency and reuses its components to do ontology restriction...plugins can also make use of our ontology search and browse functionalities.
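The idea of one search entry point over several differently accessed sources can be sketched as a small facade. This is an illustration in Python only, not the ISAcreator resource manager: the class names, the `search` method and the toy in-memory sources are assumptions, and no real BioPortal or OLS calls are made.

```python
from typing import Dict, List


class LocalFileSource:
    """Toy source backed by an in-memory term list (stands in for a local file store)."""

    def __init__(self, name: str, terms: List[str]):
        self.name = name
        self._terms = terms

    def search(self, query: str) -> List[str]:
        q = query.lower()
        return [t for t in self._terms if q in t.lower()]


class ResourceManager:
    """Sends one query to every registered source and groups hits by source name."""

    def __init__(self):
        self._sources = []

    def register(self, source) -> None:
        # Any object with a `name` attribute and a `search(query)` method will do,
        # for example a wrapper around a REST or SOAP client.
        self._sources.append(source)

    def search(self, query: str) -> Dict[str, List[str]]:
        return {s.name: s.search(query) for s in self._sources}


manager = ResourceManager()
manager.register(LocalFileSource("source A", ["Homo sapiens", "Mus musculus"]))
manager.register(LocalFileSource("source B", ["homo sapiens neanderthalensis"]))
results = manager.search("homo")
```

Keeping the source name alongside each hit is what lets the caller record where a term came from.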
42. Ontologies...some more technical details
How do we browse so quickly without downloading and reasoning over ontologies?
(disclaimer: speed also depends on if you access OLS/BioPortal from Europe/America)
Ontologies are all accessed by web services...this part is clear.
But browsing over ontologies, especially those coming from 2 separate resources, in different parts of the world with very different implementations isn’t easy.
ontology loaded: root; level(root) + 1
root expanded: root, level 0; branch a; level(a) + 1
node a expanded: root, level 0; branch a; branch b; level(b) + 1
To make the browsing experience not so slow and painful, we preload parts of the ontology tree in advance of them being requested by the user.
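The preloading idea can be sketched as a cache that always fetches one level ahead of what the user has opened. This Python sketch is an assumption-laden illustration, not ISAcreator code: `fetch_children` stands in for a web service call, the toy ontology is invented, and the look-ahead runs synchronously here whereas a real client would more likely do it in the background.

```python
from typing import Callable, Dict, List


class PreloadingBrowser:
    """Caches child lists and fetches one level ahead of what the user has opened."""

    def __init__(self, fetch_children: Callable[[str], List[str]]):
        self._fetch = fetch_children   # stands in for a remote service call
        self._cache: Dict[str, List[str]] = {}
        self.remote_calls = 0

    def _children(self, term: str) -> List[str]:
        if term not in self._cache:
            self.remote_calls += 1
            self._cache[term] = self._fetch(term)
        return self._cache[term]

    def load(self, root: str) -> None:
        # On load, fetch level(root) + 1 so the first expansion needs no remote call.
        self._children(root)

    def expand(self, term: str) -> List[str]:
        children = self._children(term)   # already cached if the parent was expanded
        for child in children:            # look ahead: preload level(child) + 1
            self._children(child)
        return children


# Hypothetical toy ontology used in place of OLS/BioPortal.
TOY_ONTOLOGY = {"root": ["a", "b"], "a": ["a1", "a2"], "b": [], "a1": [], "a2": []}
browser = PreloadingBrowser(lambda term: TOY_ONTOLOGY.get(term, []))
browser.load("root")
calls_after_load = browser.remote_calls
browser.expand("root")
calls_after_root = browser.remote_calls
shown = browser.expand("a")
```

The trade-off is extra requests for branches the user may never open, in exchange for expansions that appear immediate.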
43. Plugins
In ISAcreator, we use the Apache Felix implementation of the OSGi framework...it’s really good.
Plugins can be developed for 3 different purposes:
Search (adds extra search space for ontology tool)
Custom cell editors (for spreadsheet)
Extra general functionality (which appears in a plugin menu)
44. Plugins...example Novartis Metastore Search
A search function for the Novartis Metastore: it integrates Metastore search results into the ontology search tool.
So, with the Novartis plugin in your plugin directory, you’ll be able to search the Novartis Metastore directly within
ISAcreator, and it will handle all the tasks involved in recording the term source, etc.
45. Make sure the ISA-Tab is correct
46. Checks:
the structure of the ISA-Tab to ensure it’s well formed;
the contents to ensure that they match what is defined in the configuration XML
Then:
maps the tab structure into a graph-based object model
Tab rows:
H1 | H. Sapiens | 35 Years | H1.sample1 | Labeling | H1.sample1.labeled | h1-s1.cel
H1 | H. Sapiens | 35 Years | H1.sample2 |          |                    | h1-s2.cel
H2 | H. Sapiens | 33 Years | H2.sample1 | Labeling | H2.sample1.labeled | h2-s1.cel
Graph:
H1 (H. Sapiens, 35 Years) -> H1.sample1 -> Labeling -> H1.sample1.labeled -> h1-s1.cel
H1 (H. Sapiens, 35 Years) -> H1.sample2 -> h1-s2.cel
H2 (H. Sapiens, 33 Years) -> H2.sample1 -> Labeling -> H2.sample1.labeled -> h2-s1.cel
Actions such as conversion to other formats and persisting to the DB are performed on this object
model (called the BIIObjectStore).
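A rough sketch of how tabular rows can be folded into a graph, using the three example rows from the slide. The column names (`source`, `organism`, `age`, `sample`, `protocol`, `extract`, `file`) and the function itself are assumptions made for illustration; this is not the BIIObjectStore API.

```python
def rows_to_graph(rows):
    """Fold table rows into nodes and directed edges; repeated values share a node."""
    nodes = {}     # node id -> attribute dict
    edges = set()  # (from id, to id)
    for row in rows:
        # A source appears once in the graph even if it spans several rows.
        nodes[row["source"]] = {"organism": row["organism"], "age": row["age"]}
        chain = [row["source"], row["sample"]]
        if row.get("protocol"):
            # Each application of a protocol is its own node, keyed by its input,
            # so "Labeling" on two samples does not collapse into one node.
            chain.append(f'{row["protocol"]}({row["sample"]})')
            chain.append(row["extract"])
        chain.append(row["file"])
        for node in chain:
            nodes.setdefault(node, {})
        edges.update(zip(chain, chain[1:]))
    return nodes, edges


rows = [
    {"source": "H1", "organism": "H. Sapiens", "age": "35 Years", "sample": "H1.sample1",
     "protocol": "Labeling", "extract": "H1.sample1.labeled", "file": "h1-s1.cel"},
    {"source": "H1", "organism": "H. Sapiens", "age": "35 Years", "sample": "H1.sample2",
     "protocol": "", "extract": "", "file": "h1-s2.cel"},
    {"source": "H2", "organism": "H. Sapiens", "age": "33 Years", "sample": "H2.sample1",
     "protocol": "Labeling", "extract": "H2.sample1.labeled", "file": "h2-s1.cel"},
]
nodes, edges = rows_to_graph(rows)
```

The source H1 ends up as a single node with two outgoing edges, which is the point of moving from rows to a graph.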
48. or...
validate from the command line...
or...
within ISAcreator directly...
49. Convert to or from differing formats
50. The converters
Fully endorsed by ArrayExpress, PRIDE and the European Nucleotide Archive (ENA)...
Converts MAGE-TAB to ISA-Tab.
This is still in beta, but we are getting close to a fully working version. We’ve successfully
created validated ISA-Tab for ~90% of the 21k experiments in ArrayExpress.
Available as a web service and a web interface; the source is also available for running conversions locally.
http://isatab.sourceforge.net/magetoisa/
52. or...
convert from the command line...
or...
within ISAcreator directly...
53. Automagically filters
out the formats you
can’t export to...e.g., if I
have no sequencing
experiments, I won’t
need to export in SRA
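The filtering can be pictured as a lookup from export format to the assay technology it needs. The mapping below is an assumption made up for this sketch (only the SRA/sequencing pairing comes from the slide), and `available_exports` is a hypothetical function, not part of the converter.

```python
# Assumed mapping from export format to the assay technologies it applies to.
REQUIRED_TECHNOLOGY = {
    "SRA": {"nucleotide sequencing"},
    "MAGE-TAB": {"DNA microarray"},
    "PRIDE-ML": {"mass spectrometry"},
}


def available_exports(assay_technologies):
    """Return only the formats for which at least one matching assay exists."""
    present = set(assay_technologies)
    return sorted(fmt for fmt, needed in REQUIRED_TECHNOLOGY.items()
                  if needed & present)
```

With no sequencing assays in the study, `available_exports(["DNA microarray", "mass spectrometry"])` leaves SRA out of the list.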
54. Get ISA-Tab into a database
Share it (or don’t) with the world
55. GUI & command line interface to get ISA-Tab into an instance of the BII (BioInvestigation Index)
Calls the validator first, then persists the BIIObjectStore object to the database via Hibernate
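The validate-then-persist order can be sketched as follows. The `validate` and `persist` callables are placeholders for the real validator and the Hibernate-backed persistence layer, and all names here are hypothetical.

```python
class ValidationError(Exception):
    """Raised when the ISA-Tab fails validation; nothing is persisted."""


def load_into_database(isatab_dir, validate, persist):
    # validate is assumed to return (object_store, list_of_error_messages).
    object_store, errors = validate(isatab_dir)
    if errors:
        # Stop before touching the database if validation found problems.
        raise ValidationError("; ".join(errors))
    persist(object_store)
    return object_store
```

Keeping validation as a gate in front of persistence means an invalid submission never reaches the database.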
56. Lots of admin functionality is available from the GUI;
it is also available from the command line or API
Disclaimer
Over X11, using such an
interface is slow...I’d suggest
making use of the API or
command line tools available...
57. Database
58. Database
The term BioInvestigation Index is overloaded:
it refers to both the database and the web application.
The database itself is too complicated to describe in detail in a single presentation, but the key
take-home message is that it is graph based...remember this?
H1 (H. Sapiens, 35 Years) -> H1.sample1 -> Labeling -> H1.sample1.labeled -> h1-s1.cel
H1 (H. Sapiens, 35 Years) -> H1.sample2 -> h1-s2.cel
H2 (H. Sapiens, 33 Years) -> H2.sample1 -> Labeling -> H2.sample1.labeled -> h2-s1.cel
In the BII, we have Materials, Processes, Cross References and Annotations.
This makes things pretty generic...and the BII model is even more generic than ISA-Tab
59. Database
One more word about the database, (and a few sentences)
then I’ll show the web application.
Scalable.
As far as we know... :)
ArrayExpress v2 makes use of all of the BII object model.
They just add a table for bio entities (or genes) and that’s it!
AE has >21,000 experiments and >500,000 hybridizations loaded
into its database.
60. Web Application
61. Web application
62. Web application
63. Web application
64. Web application
65. Web application
66. Web application
We created the web application as a lightweight solution that enables users to share their data.
(But it’s a J2EE solution, so I think we’ve got an oxymoron on our hands.)
Even though it’s enterprise level, it is at least light on maintenance: you won’t have to do much with the BII once it is
running. The EBI version, running across 2 servers (one as backup), has been live for 6 months so far without a single
unplanned restart...I only restarted it to deploy a new instance.
67. Web application
We use JBoss Seam, mainly because we don’t have to worry about HTTP sessions, scope,
etc. It manages all of that for us, which is particularly important in heavily
accessed systems and frees up time for more interesting things...
It’s also a really good “integration framework”, pulling in JSP, JSF, EJB, JPA, Hibernate, etc.
68. Web application
We use HQL instead of platform-specific SQL, so the database can be
Oracle, MySQL, PostgreSQL...a database-independent application.
Query results come back directly as objects.
We construct the schema from POJOs and some XML.
69. Web application
Lucene creates a document-based index of the database contents
We use annotations to specify which fields should be indexed
This index can be accessed and queried very quickly,
so we use this to build the user interface
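To illustrate the idea of marking fields for indexing, here is a toy in-memory inverted index. It is not Lucene and not the BII code: Python dataclass field metadata stands in for the Java annotations, and the `Study` class, its fields and the `indexed` helper are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field, fields


def indexed(**kwargs):
    """Mark a dataclass field as searchable (stand-in for an index annotation)."""
    return field(metadata={"indexed": True}, **kwargs)


@dataclass
class Study:
    accession: str                    # not indexed
    title: str = indexed(default="")
    organism: str = indexed(default="")
    internal_note: str = ""           # not indexed


def build_index(records):
    """Map each lower-cased token of every indexed field to record positions."""
    index = defaultdict(set)
    for position, record in enumerate(records):
        for f in fields(record):
            if f.metadata.get("indexed"):
                for token in str(getattr(record, f.name)).lower().split():
                    index[token].add(position)
    return index
```

A lookup in the index is a dictionary access, which is why a user interface built on it does not have to query the database for every search.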
70. Being deployed on a cloud-enabled instance of the BioLinux VM
This will make it easier to create deployments of the BII database and
web application...
71. Last but not least...
Analysis
72. A package reads ISA-Tab into R, and Bioconductor in particular, so you can run analysis
scripts on your data...
It can automatically call microarray, mass spec and flow cytometry
analysis packages on appropriate datasets...
Created by Audrey Kauffman; we still need to upload it to Bioconductor.
There is also a script to create Galaxy libraries from ISA-Tab
Brad Chapman is working on this at HSPH
73. Who’s using ISA?
Fortunately, lots of people are now taking ISA on board... people are realising that MAGE-TAB,
SOFT, PRIDE-ML and SRA-XML are an overhead which can be avoided, especially in multi-omic
experiments.
The National Center for
Toxicological Research (NCTR)
& others...see the case study section on the ISA tools web site
74. Who’s using ISA?
Case study: Metabolomics repository - MetaboLights
Built on top of the ISA infrastructure with a custom front-end web interface...
Data entry tooling - ISAcreator, ISAvalidator and ISAconverter
Data management tools - BII data manager, BII database
Also developing their own plugins for ISAcreator (of type: custom cell editor)
to help users in reporting metabolite assignments.
75. Who’s using ISA?
Case study: Metabolomics repository - MetaboLights
76. Who’s using ISA?
Case study: SCDE
Curated stem cell informatics resource linked with the Galaxy analysis engine
Built on top of the ISA infrastructure in its entirety
Contributing automatic deployment scripts for the BII
(linked with the cloud BioLinux initiative)
Created the Python Parser for ISA-Tab
77. Who’s using ISA?
Case study: SCDE
78. Who’s using ISA?
Case study: GeneData - InnoMed
Biggest public study of its kind
Only available in ISA-Tab
720 animals
16 compounds
3 doses
~20,000 assays
79. Who’s using ISA?
Case study: GeneData - InnoMed
Biggest public study of its kind
Only available in ISA-Tab
720 animals
16 compounds
3 doses
~20,000 assays
Assay types:
protein expression profiling by mass spectrometry
transcription profiling by DNA microarray
metabolite profiling by mass spectrometry
metabolite profiling by NMR spectroscopy
histology
clinical chemistry
hematology
80. Who’s using ISA?
Case study: GeneData - InnoMed
81. Our next steps...as a community
Visualization, further adoption, analysis
[Figure: experimental workflows for a low-dose aspirin study drawn as graphs, from liver, kidney, blood serum and blood plasma samples (x5 each) through the SAMP, EX, LABEL, HYB, SCAN and TRANS steps. One workflow is a well-described process from sample to data file; another has missing protocols and no information about what was being measured.]
Making visual comparisons is straightforward using this approach. The longest path
is constructed based on all other known datasets in the pool of workflows being compared.
82. We can’t do everything by ourselves...
ISA team: Susanna-Assunta Sansone, Philippe Rocca-Serra, Eamonn Maguire
Contributors: Marco Brandizi, Nataliya Sklyar, Brad Chapman, Bob MacCallum, Kenneth Haug, Pablo Conesa, Audrey Kauffman
Collaborators at: The National Center for Toxicological Research (NCTR)
Funders: [logos]
83. ISA software suite: supporting standards-compliant
experimental annotation and enabling curation at the
community level
Philippe Rocca-Serra; Marco Brandizi; Eamonn Maguire; Nataliya Sklyar; Chris Taylor; Kimberly Begley; Dawn
Field; Stephen Harris; Winston Hide; Oliver Hofmann; Steffen Neumann; Peter Sterk; Weida Tong; Susanna-
Assunta Sansone
Bioinformatics 2010 26: 2354-2356
84. Thanks for listening...
Questions??
You can email us...
isatools@googlegroups.com
View our website
http://www.isa-tools.org
View our Git repo & contribute
http://github.com/ISA-tools
View our blog
http://isatools.wordpress.com
Follow us on Twitter
@antarcticdesign