
Overview of the features and architecture of Glowing Bear and tranSMART


Glowing Bear is a cohort selection user interface for the TranSMART clinical data warehouse. In recent years, features for several use cases have been added: time series data, standard ontologies, family relations, and sample-level lab data. Meanwhile, the structure of the platform has been transformed to be more modular and maintainable. We give an overview of the added features and the changes to the data model and architecture.
Presented at the i2b2 tranSMART Foundation 2019 Fall European Symposium on 9 October 2019 in Tübingen, Germany. https://transmartfoundation.org/tubingen-2019/



  1. Overview of the features and architecture of Glowing Bear and TranSMART. Gijs Kant, The Hyve, Utrecht, The Netherlands. i2b2 tranSMART Foundation 2019 Fall European Symposium, Tübingen, Germany, 9 October 2019.
  2. TranSMART 17.2 / Glowing Bear. i2b2, TranSMART: open source platforms for managing and sharing healthcare and research data; combine health research data from heterogeneous data sources into a single repository; search and explore available data, define cohorts; targeted at researchers and data managers, moving towards clinicians.
  3. Example: Netherlands twin register. Survey data and metadata from SPSS files; pedigree relations data, Boomsma et al. (2008); metadata database.
  4. Example: Princess Máxima Center. Data from tab separated and comma separated files, code books, NGS sequencing data; Central Subject Registry data model.
  5. Example: Data Integration for Future Medicine. Data from various sources stored in a data lake; variant data in a dedicated variant store; transformed into HL7 FHIR resources.
  6. Cohort selection.
  7. Ontology tree.
  8. Cross table.
  9. Data export.
  10. Variations. Pedigree relations data; clinical visits; trial visits; specific variable metadata; specific export formats; data update notifications; sample level data; connect to an external variant store.
  11. Support for pedigree relations.
  12. Sample data and update notifications.
  13. Observations data model. Observation (value, start_date, end_date); Patient; Concept (code, path); Encounter (start_date, end_date, status, location, type); Trial visit (label, relative_time); Study; Observation metadata (modifier, value); Encounter mapping; Patient mapping.
  14. Ontology data model. Tree node (name, type); Study; Concept (code, path); parent.
  15. Pedigree data model. Patient; Relation (biological); Relation type (label, biological, symmetrical).
  16. Data model and database schema. Observation-centered star-schema data model; based on the i2b2 Data Repository (CRC) Cell database schema; flexible data model to store observations data; flexible ontology support; trade-off: performance and validation versus genericity.
  17. Solution ingredients. Clinical data storage; clinical data querying application; saving cohorts based on queries; data export; data visualisations; data loading.
  18. Component overview. LDAP / AD (identities, authentication); PostgreSQL db; Redis; TranSMART API server (clinical data, REST API); transmart-copy (data loader); TSV files; Python API client; PostgreSQL db; Fractalis (visualisations, REST API); Transmart Packer (exports, REST API); Glowing Bear backend (user queries, notifications, REST API); Keycloak (single sign-on, permissions, OIDC API); Glowing Bear user interface; Redis; Java transmart-lib client.
  19. Glowing Bear user interface. Angular with TypeScript; reusable and extendable components, dependency injection; configurable; high test coverage.
  20. Architectural principles. Separate components for distinct functionality; clear, typed and well-documented interfaces; recognise commonalities/patterns: develop reusable, customisable libraries/tools/frameworks for those; complexity of integration versus testability, maintainability of components; automated tests of the system behaviour; automated deployment.
  21. Component overview (same components as slide 18).
  22. LDAP / AD (identities, authentication); PostgreSQL db; Redis; TranSMART variant store connector (REST API); transmart-copy (data loader); Python API client; PostgreSQL db; Fractalis (visualisations, REST API); Transmart Packer (exports, REST API); Glowing Bear backend (user queries, notifications, REST API); Keycloak (single sign-on, permissions, OIDC API); Glowing Bear user interface; Redis; Java transmart-lib client; MariaDB; onco-store-loader (variant data loader); VCF files; TSV files; Oncostore (variant data, REST API); TranSMART API server (clinical data, REST API).
  23. Query interface. API documentation as OpenAPI/Swagger; data serialised as multi-dimensional data structure (hypercube), because tabular format is not sufficient for, e.g., timeseries data; hypercube serialised as JSON or Protobuf; query observations based on values, concepts, studies, patients, custom dimensions, custom biomarker constraints; combine with and, or, not, temporal operators, union, intersection. Examples: select all male patients with asthma; select patients with a heart attack after receiving medication X; select patients diagnosed for osteoarthritis, but not diabetes.
  24. OpenAPI documentation. https://transmart.thehyve.net
  25. OpenAPI documentation.
  26. Hypercube.
      Dimensions: Patient (id, sex); Concept (code); Sample (id); Start time (inline).
      Patients: SUBJ1 (male); SUBJ2 (female); SUBJ3 (female).
      Concepts: Age; Tissue type.
      Samples: S1; S2; S3; -.
      Cells (patient index, concept index, sample index; start time; value):
      0, 0, 0; 12-03-1995; 21
      1, 0, 0; 02-04-2001; 78
      2, 0, 0; 24-11-2003; 45
      0, 1, 1; 14-03-1995; kidney
      0, 1, 2; 15-03-1995; heart
      1, 1, 3; 12-04-2001; heart
      ...
  27. Results: application improvements. Better developer experience, easier customisation; separate user interface from storage and querying backends; decomposition of the system with clear, well documented interfaces; database schema managed by Liquibase; unit tests, user interface tests; cleanup of legacy code; containerised deployment (Docker); authentication and user management with Keycloak; performance improvements (e.g., bit sets for patient sets).
  28. Results: reusable libraries. Data loading libraries: optimised database insert tool transmart-copy; interactive data loading toolkit tmtk; Python transmart-loader package; automated pipelines in Python, using luigi task manager and well tested transformation code. Client libraries: Python transmart API client; transmart-lib REST API library for Java.
  29. Challenges. Scalability/performance; data curation, mapping and transformation tools; validation, richer data model; visualisations of the data, interacting with the data.
  30. TranSMART 17.2 / Glowing Bear. https://glowingbear.app
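
The three example queries on the query interface slide (slide 23) combine constraints with and, not and a temporal operator. The sketch below illustrates that idea in Python by treating each constraint as a set of patient ids and combining the sets with intersection and complement. The `Observation` class, the helper functions, the concept names and all data are made up for illustration; this is not the TranSMART API or its constraint format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """Hypothetical, simplified observation: one value for a patient and concept."""
    patient: str
    concept: str
    value: object
    start: date


# Made-up observations, only for illustration.
OBS = [
    Observation("P1", "sex", "male", date(2000, 1, 1)),
    Observation("P1", "asthma", True, date(2010, 5, 1)),
    Observation("P1", "medication_x", True, date(2011, 1, 1)),
    Observation("P1", "heart_attack", True, date(2012, 3, 1)),
    Observation("P2", "sex", "female", date(2000, 1, 1)),
    Observation("P2", "asthma", True, date(2008, 7, 1)),
    Observation("P2", "heart_attack", True, date(2009, 2, 1)),
    Observation("P2", "medication_x", True, date(2010, 6, 1)),
    Observation("P3", "sex", "male", date(2000, 1, 1)),
    Observation("P3", "osteoarthritis", True, date(2015, 4, 1)),
    Observation("P3", "diabetes", True, date(2016, 9, 1)),
    Observation("P4", "osteoarthritis", True, date(2017, 2, 1)),
]

ALL_PATIENTS = {o.patient for o in OBS}


def patients_with(concept, value=None):
    """Patients having an observation for the concept (optionally with a given value)."""
    return {
        o.patient
        for o in OBS
        if o.concept == concept and (value is None or o.value == value)
    }


def and_(*patient_sets):
    """Logical 'and' expressed as set intersection."""
    return set.intersection(*patient_sets)


def not_(patient_set):
    """Logical 'not' expressed as complement within all known patients."""
    return ALL_PATIENTS - patient_set


def after(later_concept, earlier_concept):
    """Patients with a later_concept observation strictly after an earlier_concept one."""
    return {
        a.patient
        for a in OBS
        if a.concept == earlier_concept
        for b in OBS
        if b.concept == later_concept
        and b.patient == a.patient
        and b.start > a.start
    }


# Select all male patients with asthma.
male_asthma = and_(patients_with("sex", "male"), patients_with("asthma"))

# Select patients with a heart attack after receiving medication X.
heart_attack_after_medication = after("heart_attack", "medication_x")

# Select patients diagnosed for osteoarthritis, but not diabetes.
osteoarthritis_not_diabetes = and_(
    patients_with("osteoarthritis"), not_(patients_with("diabetes"))
)
```

With this toy data, P2 is excluded from the second query because the heart attack precedes the medication, which is the kind of distinction a temporal operator is meant to capture.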

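
The hypercube slide (slide 26) shows cells that refer to dimension elements by index, with the start time stored inline in each cell. The sketch below shows one way such cells could be expanded into flat rows in Python. It uses the values from the slide, but the data layout, the `decode` function and the variable names are hypothetical. The sketch also assumes that sample index 0 means "no sample" and that indexes 1 to 3 refer to S1 to S3; the slide does not state this explicitly. Dates are kept as strings as they appear on the slide.

```python
# Dimension elements, listed once and referenced by index from the cells.
patients = [
    {"id": "SUBJ1", "sex": "male"},
    {"id": "SUBJ2", "sex": "female"},
    {"id": "SUBJ3", "sex": "female"},
]
concepts = ["Age", "Tissue type"]
# Assumption: index 0 stands for "no sample"; the slide does not confirm this.
samples = [None, "S1", "S2", "S3"]

# Each cell: (patient index, concept index, sample index), inline start time, value.
cells = [
    ((0, 0, 0), "12-03-1995", 21),
    ((1, 0, 0), "02-04-2001", 78),
    ((2, 0, 0), "24-11-2003", 45),
    ((0, 1, 1), "14-03-1995", "kidney"),
    ((0, 1, 2), "15-03-1995", "heart"),
    ((1, 1, 3), "12-04-2001", "heart"),
]


def decode(cell):
    """Resolve the dimension indexes of one cell into a flat row."""
    (patient_idx, concept_idx, sample_idx), start_time, value = cell
    patient = patients[patient_idx]
    return {
        "patient": patient["id"],
        "sex": patient["sex"],
        "concept": concepts[concept_idx],
        "sample": samples[sample_idx],
        "start_time": start_time,
        "value": value,
    }


rows = [decode(cell) for cell in cells]
for row in rows:
    print(row)
```

Listing each patient, concept and sample once and pointing to them by index avoids repeating their properties in every cell, which is one reason an indexed layout can be more compact than a flat table for repeated measurements.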