Clinical data eav


Published in: Technology

  1. CHCDB & CHCDBWEB: Clinical Annotation Database and web interface. Thomas Burguiere, INSERM Unité 674, May 5th, 2011.
  2. Outline
     1 Introduction
     2 Relational Database
        Principle
        Benefits
        Interface
     3 Clinical Annotation data
        Specificities
        E.A.V.
     4 CHCDB
     5 CHCDBWEB
     6 Conclusions
  3. Introduction
     • 1500 liver tumor samples
     • Malignant (HCC) and benign (HCA) tumors
     • Normal tissue
     Existing Data
     • Clinical annotations of malignant tumors (4D)
     • Excel files which contain:
        • Clinical annotations of malignant & benign tumors
        • Other annotations (mutations, clinical studies, etc.)
        • Tissue extraction listings (concentrations / quantities)
  4. Introduction: Problems
     • Clinical annotations of malignant tumors can only be accessed on a single machine
     • Redundant data among different files
     • Duplicated files on different machines
     → Discrepancies between different files
     → Cross-checking data between the different data sources is cumbersome
  6. Relational Database: Software
     • Relational Database Management System: software which contains and organizes data (Oracle™, MySQL, DB2™, SQL Server™, etc.)
     • Client-server architecture:
        • Server software, which manages data, installed on a single machine
        • Client software, which queries the server, installed on any machine used to consult the database
  7. Client-server architecture (figure)
  8. Relational Database: Data
     • The data is stored in a set of tables
     • One can define a set of constraints on the data contained in the tables
     • The tables can be linked to one another by logical links: integrity constraints
  9. Relational Database: Example
     Classical table (e.g. Excel sheet), one row per tissue (e.g. CHC358T), with the columns:
     TissueID, TumorType, Sex, Age, Steatosis, nb_adenomas, CRP
  10. Relational Database: Example
      1 Breaking down data
      2 Typing constraints
      3 Uniqueness constraints
      4 Integrity constraints
      Table1: TissueID, TumorType (VARCHAR), Sex (VARCHAR), Age (INT)
      Table2: TissueID, Steatosis (BOOLEAN), nb_adenomas (INT), CRP (FLOAT)
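The four steps listed on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in for the MySQL server, the table names `table1` and `table2` are borrowed from the deck's query, and every data value other than the identifier CHC358T is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when explicitly asked to.
conn.execute("PRAGMA foreign_keys = ON")

# Step 1: the data is broken down into two tables.
conn.executescript("""
CREATE TABLE table1 (
    TissueID  VARCHAR(16) PRIMARY KEY,   -- step 3: uniqueness constraint
    TumorType VARCHAR(8),
    Sex       VARCHAR(1),
    -- step 2: typing constraint (explicit CHECK, because SQLite is loosely typed)
    Age       INT CHECK (Age IS NULL OR typeof(Age) = 'integer')
);
CREATE TABLE table2 (
    TissueID    VARCHAR(16) PRIMARY KEY
                REFERENCES table1 (TissueID),  -- step 4: integrity constraint
    Steatosis   BOOLEAN,
    nb_adenomas INT,
    CRP         FLOAT
);
""")

# Invented illustrative values.
conn.execute("INSERT INTO table1 VALUES ('CHC358T', 'HCA', 'F', 43)")
conn.execute("INSERT INTO table2 VALUES ('CHC358T', 1, 1, 0.5)")


def rejected(sql):
    """Return True when the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# The same tissue cannot be entered twice.
duplicate_id = rejected("INSERT INTO table1 VALUES ('CHC358T', 'HCC', 'M', 60)")
# An age must be an integer.
bad_type = rejected("INSERT INTO table1 VALUES ('CHC999T', 'HCC', 'M', 'sixty')")
# An annotation cannot point to a tissue that does not exist.
orphan_row = rejected("INSERT INTO table2 VALUES ('CHC000T', 0, 0, 1.0)")
print(duplicate_id, bad_type, orphan_row)
```

On a MySQL server the column types themselves would carry the typing constraint; the explicit `CHECK` is only needed here because of the SQLite stand-in.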
  11. Relational Database: Benefits
      • Data centralisation on the server side
      • Constraints make it possible, in some instances, to avoid data inconsistencies
      → Consistent data
      • Efficient: tables containing millions of rows can be easily manipulated
      • Querying a correctly structured database allows one to cross-check data very rapidly*
  12. A database requires a graphical interface
      • Data manipulation in a database is done exclusively with queries, written in SQL (Structured Query Language)
      • Example: retrieving the classical table row (TissueID, TumorType, Sex, Age, Steatosis, nb_adenomas, CRP) of tissue CHC358T:
          SELECT table1.TissueID, table1.TumorType, table1.Sex, table1.Age,
                 table2.Steatosis, table2.nb_adenomas, table2.CRP
          FROM table1
          INNER JOIN table2 ON (table1.TissueID = table2.TissueID)
          WHERE table1.TissueID = 'CHC358T';
      • Powerful language, albeit counterintuitive
      → A graphical interface must be associated with the database
  13. Graphical interface: principle
      Mechanism
      1 The interface receives instructions from the user and transforms them into SQL queries sent to the server
      2 The server receives the SQL queries and sends back results
      3 The interface receives the results from the server and displays them to the user
      Interface types
      • 2 types of interface: desktop program or web interface
      • In our case, we decided to develop a web interface
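The three-step mechanism can be sketched as a single function. This is a minimal illustration, not CHCDBWEB's actual code: it assumes Python's built-in `sqlite3` in place of the MySQL server, the helper names `make_demo_db` and `show_tissue` are hypothetical, and the stored values are invented.

```python
import sqlite3

COLUMNS = ("TissueID", "TumorType", "Sex", "Age", "Steatosis", "nb_adenomas", "CRP")

QUERY = """
SELECT table1.TissueID, table1.TumorType, table1.Sex, table1.Age,
       table2.Steatosis, table2.nb_adenomas, table2.CRP
FROM table1
INNER JOIN table2 ON (table1.TissueID = table2.TissueID)
WHERE table1.TissueID = ?
"""


def make_demo_db():
    """Build a throwaway in-memory database with one invented tissue."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE table1 (TissueID TEXT PRIMARY KEY, TumorType TEXT, Sex TEXT, Age INT);
    CREATE TABLE table2 (TissueID TEXT PRIMARY KEY, Steatosis INT, nb_adenomas INT, CRP REAL);
    INSERT INTO table1 VALUES ('CHC358T', 'HCA', 'F', 43);
    INSERT INTO table2 VALUES ('CHC358T', 1, 1, 0.5);
    """)
    return conn


def show_tissue(conn, tissue_id):
    # Step 1: the user's request becomes a parameterised SQL query.
    cursor = conn.execute(QUERY, (tissue_id,))
    # Step 2: the server sends back the matching row, if any.
    row = cursor.fetchone()
    if row is None:
        return f"No tissue named {tissue_id}"
    # Step 3: the result is formatted for display.
    return ", ".join(f"{name}={value}" for name, value in zip(COLUMNS, row))


conn = make_demo_db()
print(show_tissue(conn, "CHC358T"))
print(show_tissue(conn, "UNKNOWN"))
```

Passing the identifier as a query parameter (`?`) rather than pasting it into the SQL text keeps user input from being interpreted as SQL.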
  14. Web interface
      • Software installed on a single machine: the web server
      • Accessing the interface only requires a web browser
      → Avoids installation and maintenance issues on the client machines
      → Avoids OS compatibility issues (Mac, Windows, Linux, etc.)
  15. Web client / server architecture (figure)
  17. Specific features of clinical annotation data
      Specificities
      • New variables are frequently added
      • Data regarding the same variable can be entered differently, depending on sample provenance and type (malignant or benign tumor)
      Consequences in a database
      • Frequent addition of new columns or sub-tables
      • Tables contain a lot of columns, with sparsely filled rows
      → Constant maintenance of the database
      Clinical annotation data must be stored in a specific database structure: the E.A.V. structure
  18. E.A.V.: Principle
      • E.A.V. = Entity Attribute Value
      • An E.A.V. is a subset of tables in a relational database, with a specific organization
      • This data organization is particularly suitable for clinical annotation data
      • In the E.A.V., all clinical annotation data is stored in one 3-column table:
         • Entity: contains the identifier of the entity for which an annotation is stored (in our case, an entity is a tissue)
         • Attribute: contains the identifier (e.g. the name) of the annotation variable
         • Value: contains the value of the annotation, for a given entity
  19. E.A.V.: Example. Starting point: the two classical tables, (TissueID, TumorType, Sex, Age) and (TissueID, Steatosis, nb_adenomas, CRP)
  20. E.A.V.: Example. The same two classical tables, with their Entities, Attributes and Values labelled
  21. E.A.V.: Example. The same data in a single E.A.V. table (columns: Entities, Attributes, Values):
      TissueID   VariableID    Value
      CHC358T    Sex           F
      CHC358T    TumorType     HCA
      CHC358T    Age           …
      CHC358T    Steatosis     TRUE
      CHC358T    nb_adenomas   …
      CHC358T    CRP           …
      followed by the Sex, Age and TumorType rows of two HCC tissues
  22. E.A.V.: Pros & Cons
      Pros
      • New annotation = new line in the table
      • No structural modifications
      • No sparsely filled tables
      Cons
      • A lot of lines in the table
      • Very complex queries
      • Columns are no longer typed
      Example: the new variables OralContraception (TRUE for CHC358T) and Edmonson (III for an HCC tissue) are added as extra lines of the E.A.V. table, whose Value column is VARCHAR
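Both sides of this trade-off show up in a short sketch. It rests on assumptions: Python's built-in `sqlite3` replaces the MySQL server, the table name `annotation` and the tissue identifier CHC001T are hypothetical, and the values are illustrative. Adding a variable is a plain `INSERT`, while rebuilding a classical one-row-per-tissue view takes one conditional aggregate per variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE annotation (
        TissueID   TEXT NOT NULL,   -- entity
        VariableID TEXT NOT NULL,   -- attribute
        Value      TEXT,            -- value, always stored as text
        PRIMARY KEY (TissueID, VariableID)
    )
""")
conn.executemany(
    "INSERT INTO annotation VALUES (?, ?, ?)",
    [
        ("CHC358T", "Sex", "F"),
        ("CHC358T", "TumorType", "HCA"),
        ("CHC358T", "Steatosis", "TRUE"),
        ("CHC001T", "Sex", "M"),
        ("CHC001T", "TumorType", "HCC"),
    ],
)

# Pro: a brand-new variable is just one more line, with no ALTER TABLE.
conn.execute("INSERT INTO annotation VALUES ('CHC001T', 'Edmonson', 'III')")

# Con: getting back one row per tissue needs one CASE expression per variable.
pivot = """
SELECT TissueID,
       MAX(CASE WHEN VariableID = 'Sex'       THEN Value END) AS Sex,
       MAX(CASE WHEN VariableID = 'TumorType' THEN Value END) AS TumorType,
       MAX(CASE WHEN VariableID = 'Steatosis' THEN Value END) AS Steatosis,
       MAX(CASE WHEN VariableID = 'Edmonson'  THEN Value END) AS Edmonson
FROM annotation
GROUP BY TissueID
ORDER BY TissueID
"""
rows = conn.execute(pivot).fetchall()
for row in rows:
    print(row)
```

Variables that were never recorded for a tissue come back as `None`, which is how the sparsity of the classical table reappears only at query time.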
  23. E.A.V.: Metadata
      • Losing information regarding variable data types is problematic
      → This information is stored in an ancillary table, the metadata table
      VariableID          DataType
      Sex                 VARCHAR
      TumorType           VARCHAR
      Age                 INT
      Steatosis           BOOLEAN
      nb_adenomas         INT
      CRP                 FLOAT
      OralContraception   BOOLEAN
      Edmonson            VARCHAR
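One way a metadata table can restore typing is sketched below, under assumptions: Python's built-in `sqlite3` stands in for MySQL, the table names `metadata` and `annotation`, the `CONVERTERS` mapping and the function `typed_annotations` are hypothetical, and the stored values are invented. The text values are cast on the way out, according to the type declared for each variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metadata (
    VariableID TEXT PRIMARY KEY,
    DataType   TEXT NOT NULL
);
CREATE TABLE annotation (
    TissueID   TEXT NOT NULL,
    VariableID TEXT NOT NULL REFERENCES metadata (VariableID),
    Value      TEXT,
    PRIMARY KEY (TissueID, VariableID)
);
INSERT INTO metadata VALUES
    ('Sex', 'VARCHAR'), ('Age', 'INT'), ('Steatosis', 'BOOLEAN'), ('CRP', 'FLOAT');
INSERT INTO annotation VALUES
    ('CHC358T', 'Sex', 'F'), ('CHC358T', 'Age', '43'),
    ('CHC358T', 'Steatosis', 'TRUE'), ('CHC358T', 'CRP', '0.5');
""")

# One converter per declared data type.
CONVERTERS = {
    "VARCHAR": str,
    "INT": int,
    "FLOAT": float,
    "BOOLEAN": lambda text: text.upper() == "TRUE",
}


def typed_annotations(conn, tissue_id):
    """Return {variable: value}, each value cast according to the metadata table."""
    rows = conn.execute(
        """
        SELECT a.VariableID, a.Value, m.DataType
        FROM annotation AS a
        INNER JOIN metadata AS m ON a.VariableID = m.VariableID
        WHERE a.TissueID = ?
        """,
        (tissue_id,),
    )
    return {
        name: None if value is None else CONVERTERS[dtype](value)
        for name, value, dtype in rows
    }


result = typed_annotations(conn, "CHC358T")
print(result)
```

The same lookup could be used on the way in, to refuse a value that does not parse as the declared type before it is stored.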
  25. CHCDB
      Software
      • R.D.B.M.S.: MySQL
         • Open source & free
         • Most widely used open-source R.D.B.M.S.
         → Actively maintained
         → Lots of maintenance and development tools
      • The machine hosting the R.D.B.M.S. has yet to be bought
      Data
      • CHCDB's tables fall into one of three categories:
         • Tissue listings
         • Clinical annotation data, in the E.A.V. structure
         • Extraction data
  26. Database structure (figure)
  28. A web interface
      Peculiarities
      • Installed on the server hosting the R.D.B.M.S.
      • Can be reached from any machine on the CEPH network
      Features
      • Consultation and modification of clinical annotations for a given tissue
      • Listing of tissues and their annotations
      • Listing of tissue extractions
      • Management (add/modify/delete) of annotation variables
      • Batch import of annotations
      • Batch import of extraction data
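The batch import of annotations is not detailed in the slides. As a rough sketch of what such an import could involve, assuming the source is a classical sheet exported as CSV with one row per tissue, each non-empty cell becomes one (entity, attribute, value) line. The function name `wide_to_eav`, the table name `annotation`, the identifier CHC001T and the sample sheet are all hypothetical, and Python's built-in `sqlite3` stands in for MySQL.

```python
import csv
import io
import sqlite3


def wide_to_eav(csv_text, id_column="TissueID"):
    """Yield (entity, attribute, value) triples from a wide CSV, skipping empty cells."""
    for record in csv.DictReader(io.StringIO(csv_text)):
        tissue_id = record[id_column]
        for variable, value in record.items():
            if variable != id_column and value not in (None, ""):
                yield (tissue_id, variable, value)


# Invented sample sheet: the Edmonson cell is empty for the first tissue.
sheet = """TissueID,TumorType,Sex,Edmonson
CHC358T,HCA,F,
CHC001T,HCC,M,III
"""

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE annotation (
        TissueID   TEXT NOT NULL,
        VariableID TEXT NOT NULL,
        Value      TEXT,
        PRIMARY KEY (TissueID, VariableID)
    )
""")
# One transaction for the whole batch: either every line is imported or none.
with conn:
    conn.executemany("INSERT INTO annotation VALUES (?, ?, ?)", wide_to_eav(sheet))

count = conn.execute("SELECT COUNT(*) FROM annotation").fetchone()[0]
print(count)
```

Empty cells produce no line at all, which is why the E.A.V. table stays dense even when the source sheet is sparsely filled.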
  29. Consultation & modification of the annotations of a given tissue (screenshot)
  30. Consultation & modification of the annotations of a given tissue (screenshot)
  31. Listing of tissues and annotations (screenshot)
  32. Listing of tissue extractions (screenshot)
  33. Annotation variables management (screenshot)
  34. Annotation variables management (screenshot)
  36. Conclusions
      Missing features in CHCDBWEB
      • The tissue management interface is not yet complete
      • The batch import interface for annotations and extractions is missing
      CHCDB
      • Defining a starting set of variables
      • Importing existing data into CHCDB
      Hardware
      • Acquiring and configuring the machine which will host the database and the web server
      CHCDB and CHCDBWEB should enter production phase in June 2011.
