Clinical data eav
Clinical data eav Presentation Transcript

  • 1. CHCDB & CHCDBWEB: Clinical Annotation Database and web interface. Thomas Burguiere, INSERM Unité 674, May 5th, 2011
  • 2. 1 Introduction
       2 Relational Database: Principle, Benefits, Interface
       3 Clinical Annotation data: Specificities, E.A.V.
       4 CHCDB
       5 CHCDBWEB
       6 Conclusions
  • 3. 1500 liver tumor samples; malignant (HCC) and benign (HCA) tumors; normal tissue
    Existing Data
      • Clinical annotations of malignant tumors (4D)
      • Excel files which contain:
          • Clinical annotations of malignant & benign tumors
          • Other annotations (mutations, clinical studies, etc.)
          • Tissue extraction listings (concentrations / quantities)
  • 4. Problems
      • Clinical annotations of malignant tumors can only be accessed on a single machine
      • Redundant data among different files
      • Duplicated files on different machines
        → Discrepancies between different files
        → Cross-checking data between the different data sources is cumbersome
  • 6. Principle. Relational Database: Software
      • Relational Database Management System: software which contains and organizes data (Oracle™, MySQL, DB2™, SQL Server™, etc.)
      • Client-server architecture:
          • Server software, which manages data, installed on a single machine
          • Client software, which queries the server, installed on any machine used to consult the database
  • 7. Principle. Client-server architecture [diagram]
  • 8. Principle. Relational Database: Data
      • The data is stored in a set of tables
      • One can define a set of constraints on the data contained in the tables
      • The tables can be linked to one another by logical links: integrity constraints
  • 9. Principle. Relational Database: Example. [Table: columns TissueID, TumorType, Sex, Age, Steatosis, nb_adenomas, CRP, one row per tissue] Classical table (e.g. Excel sheet)
  • 10. Principle. Relational Database: Example
      1 Breaking down data
      2 Typing constraints
      3 Uniqueness constraints
      4 Integrity constraints
      [Figure: the classical table split into Table1 (TissueID, TumorType, Sex, Age) and Table2 (TissueID, Steatosis, nb_adenomas, CRP), with column types VARCHAR, INT, BOOLEAN and FLOAT]
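
The typing, uniqueness (unicity) and integrity constraints listed on this slide can be sketched with Python's built-in sqlite3 module. This is only an illustration: SQLite stands in for the MySQL server, the split into table1 and table2 follows the query shown later in the deck, and every inserted value is made up.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only on request

conn.executescript("""
CREATE TABLE table1 (
    TissueID  TEXT PRIMARY KEY,                                -- uniqueness constraint
    TumorType TEXT NOT NULL CHECK (TumorType IN ('HCC', 'HCA')),
    Sex       TEXT CHECK (Sex IN ('M', 'F')),
    Age       INTEGER NOT NULL CHECK (typeof(Age) = 'integer') -- typing constraint
);
CREATE TABLE table2 (
    TissueID    TEXT PRIMARY KEY REFERENCES table1 (TissueID), -- integrity constraint
    Steatosis   INTEGER CHECK (Steatosis IN (0, 1)),
    nb_adenomas INTEGER,
    CRP         REAL
);
""")

with conn:  # commit one valid row (illustrative values, not real data)
    conn.execute("INSERT INTO table1 VALUES ('CHC358T', 'HCA', 'F', 40)")


def rejected(sql):
    """Return True when the database refuses the statement."""
    try:
        with conn:
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_id = rejected("INSERT INTO table1 VALUES ('CHC358T', 'HCC', 'M', 60)")
wrong_type = rejected("INSERT INTO table1 VALUES ('CHC001T', 'HCC', 'M', 'sixty')")
orphan_row = rejected("INSERT INTO table2 VALUES ('CHC999T', 1, 2, 0.5)")
print(duplicate_id, wrong_type, orphan_row)
```

Each of the three rejected statements violates one kind of constraint: a repeated identifier, a non-numeric age, and an annotation row for a tissue that does not exist in table1.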
  • 11. Benefits. Relational Database: Benefits
      • Data centralisation on the server side
      • Constraints make it possible, in some instances, to avoid data inconsistencies
        → Consistent data
      • Efficient: tables containing millions of rows can be easily manipulated
      • Querying a correctly structured database allows one to cross-check data very rapidly
  • 12. Interface. A database requires a graphical interface
      • Data manipulation in a database is done exclusively with queries, written in SQL (Structured Query Language)
      • [Table: columns TissueID, TumorType, Sex, Age, Steatosis, nb_adenomas, CRP]
      • SELECT table1.TissueID, table1.TumorType, table1.Sex, table1.Age,
               table2.Steatosis, table2.nb_adenomas, table2.CRP
        FROM table1 INNER JOIN table2 ON (table1.TissueID = table2.TissueID)
        WHERE table1.TissueID = 'CHC358T';
      • Powerful language, albeit counterintuitive
        → A graphical interface must be associated with the database
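
To see the query from this slide return a row, here is a minimal sketch using Python's sqlite3 module as a stand-in for the MySQL server. The table layout is inferred from the query itself, and the row values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the database server (assumption)
conn.executescript("""
CREATE TABLE table1 (TissueID TEXT PRIMARY KEY, TumorType TEXT, Sex TEXT, Age INTEGER);
CREATE TABLE table2 (TissueID TEXT PRIMARY KEY, Steatosis INTEGER, nb_adenomas INTEGER, CRP REAL);
-- Illustrative rows only; the values are made up.
INSERT INTO table1 VALUES ('CHC358T', 'HCA', 'F', 40), ('CHC001T', 'HCC', 'M', 60);
INSERT INTO table2 VALUES ('CHC358T', 1, 2, 0.5), ('CHC001T', 0, 0, 1.5);
""")

# Same query as on the slide, with the tissue identifier passed as a parameter.
query = """
SELECT table1.TissueID, table1.TumorType, table1.Sex, table1.Age,
       table2.Steatosis, table2.nb_adenomas, table2.CRP
FROM table1
INNER JOIN table2 ON (table1.TissueID = table2.TissueID)
WHERE table1.TissueID = ?;
"""
rows = conn.execute(query, ("CHC358T",)).fetchall()
print(rows)
```

The join brings the columns of both tables back together for the requested tissue, which is the cross-checking that was cumbersome with separate Excel files.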
  • 13. Interface. Graphical interface: principle
    Mechanism
      1 The interface receives instructions from the user and transforms them into SQL queries sent to the server
      2 The server receives the SQL queries and sends back results
      3 The interface receives the results from the server and displays them to the user
    Interface types
      • Two types of interface: desktop program or web interface
      • In our case, we decided to develop a web interface
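
The three-step mechanism on this slide might look roughly like the sketch below. It is not the CHCDBWEB code: the `tissues` table, the column whitelist and the function names are hypothetical, and SQLite again stands in for the database server.

```python
import sqlite3

# Hypothetical column whitelist; it keeps raw user input out of the SQL text.
ALLOWED_COLUMNS = ("TissueID", "TumorType", "Sex", "Age")


def build_query(columns):
    """Step 1: translate the user's column selection into an SQL query."""
    unknown = [c for c in columns if c not in ALLOWED_COLUMNS]
    if not columns or unknown:
        raise ValueError(f"invalid column selection: {columns}")
    return f"SELECT {', '.join(columns)} FROM tissues WHERE TissueID = ?"


def handle_request(conn, tissue_id, columns):
    sql = build_query(columns)                # 1. user instruction -> SQL query
    cursor = conn.execute(sql, (tissue_id,))  # 2. the server runs it and returns rows
    return [dict(zip(columns, row)) for row in cursor]  # 3. results ready for display


conn = sqlite3.connect(":memory:")  # stands in for the database server
conn.execute("CREATE TABLE tissues (TissueID TEXT, TumorType TEXT, Sex TEXT, Age INTEGER)")
conn.execute("INSERT INTO tissues VALUES ('CHC358T', 'HCA', 'F', 40)")  # made-up values

result = handle_request(conn, "CHC358T", ["TissueID", "TumorType"])
print(result)
```

The user never writes SQL: the interface only accepts a tissue identifier and a choice among known columns, and builds the query itself.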
  • 14. Interface. Web interface
      • Software installed on a single machine: the web server
      • Accessing the interface only requires a web browser
        → Avoids installation and maintenance issues on the client machines
        → Avoids OS compatibility issues (Mac, Windows, Linux, etc.)
  • 15. Interface. Web client / server architecture [diagram]
  • 17. Specificities. Specific features of clinical annotation data
    Specificities
      • New variables are frequently added
      • Data regarding the same variable can be input differently, depending on sample provenance and type (malignant or benign tumor)
    Consequences in a database
      • Frequent addition of new columns or sub-tables
      • Tables contain a lot of columns, with sparsely filled rows
        → Constant maintenance of the database
    Clinical annotation data must be stored in a specific database structure: the E.A.V. structure
  • 18. E.A.V. Principle
      • E.A.V. = Entity Attribute Value
      • An E.A.V. is a subset of tables in a relational database, with a specific organization
      • This data organization is particularly suitable for clinical annotation data
      • In the E.A.V., all clinical annotation data is stored in one three-column table:
          • Entity: contains the identifier of the entity for which an annotation is stored (in our case, an entity is a tissue)
          • Attribute: contains the identifier (e.g. the name) of the annotation variable
          • Value: contains the value of the annotation, for a given entity
  • 19. E.A.V. Example. [Tables: Table1 (TissueID, TumorType, Sex, Age) and Table2 (TissueID, Steatosis, nb_adenomas, CRP)]
  • 20. E.A.V. Example. [Same tables, annotated with the labels Entities, Attributes and Values]
  • 21. E.A.V. Example. [Table: E.A.V. table with columns TissueID (entity), VariableID (attribute) and Value, one row per tissue and variable, e.g. CHC358T, TumorType, HCA]
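
The transformation shown in this example, from conventional wide tables to an E.A.V. table, can be sketched in a few lines of Python. The function name `to_eav` is hypothetical, the tissue CHC001T is invented, and all values are illustrative.

```python
def to_eav(rows, entity_key="TissueID"):
    """Turn wide rows (one dict per tissue) into (entity, attribute, value) triples."""
    triples = []
    for row in rows:
        entity = row[entity_key]
        for attribute, value in row.items():
            if attribute == entity_key or value is None:
                continue  # a missing annotation simply produces no row
            triples.append((entity, attribute, str(value)))
    return triples


# Illustrative data: each tissue has a different empty cell in the wide layout.
wide_rows = [
    {"TissueID": "CHC358T", "TumorType": "HCA", "Steatosis": True, "Edmonson": None},
    {"TissueID": "CHC001T", "TumorType": "HCC", "Steatosis": None, "Edmonson": "III"},
]
eav = to_eav(wide_rows)
for triple in eav:
    print(triple)
```

Every value is stored as text, and empty cells of the wide layout disappear instead of being stored as blanks.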
  • 22. E.A.V. Pros & Cons
    Pros
      • New annotation = new line in the table
      • No structural modifications
      • No sparsely filled tables
    Cons
      • A lot of lines in the table
      • Very complex queries
      • Columns are no longer typed
    [Table: the E.A.V. table with added rows for the variables OralContraception and Edmonson; the Value column has the single type VARCHAR]
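
To give a feel for the "very complex queries" drawback, the sketch below rebuilds a one-row-per-tissue view from an E.A.V. table using conditional aggregation. This is one common technique, not necessarily the one used in CHCDB; the table name `eav` and the data are invented, and SQLite stands in for MySQL (where the cast would be written `CAST(Value AS SIGNED)`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL server (assumption)
conn.executescript("""
CREATE TABLE eav (TissueID TEXT, VariableID TEXT, Value TEXT,
                  PRIMARY KEY (TissueID, VariableID));
-- Made-up annotations for illustration.
INSERT INTO eav VALUES
    ('CHC358T', 'TumorType', 'HCA'), ('CHC358T', 'Age', '40'),
    ('CHC001T', 'TumorType', 'HCC'), ('CHC001T', 'Age', '60'),
    ('CHC001T', 'Edmonson', 'III');
""")

# One CASE expression per variable rebuilds a conventional one-row-per-tissue view.
pivot = """
SELECT TissueID,
       MAX(CASE WHEN VariableID = 'TumorType' THEN Value END) AS TumorType,
       MAX(CASE WHEN VariableID = 'Age' THEN CAST(Value AS INTEGER) END) AS Age,
       MAX(CASE WHEN VariableID = 'Edmonson' THEN Value END) AS Edmonson
FROM eav
GROUP BY TissueID
ORDER BY TissueID;
"""
rows = conn.execute(pivot).fetchall()
print(rows)
```

The query grows by one expression for every variable to display, and each value has to be cast by hand because the Value column is plain text.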
  • 23. E.A.V. Metadata
      • Losing information regarding variable data types is problematic
        → This data is stored in an ancillary table, the metadata table
    [Tables: the E.A.V. table (TissueID, VariableID, Value) and the metadata table (VariableID, DataType): Sex VARCHAR, TumorType VARCHAR, Age INT, Steatosis BOOLEAN, nb_adenomas INT, CRP FLOAT, OralContraception BOOLEAN, Edmonson VARCHAR]
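
One way an application could use such a metadata table is to convert, and thereby validate, the text stored in the Value column. The sketch below is an assumption about how this might be done, not the CHCDB implementation; `METADATA`, `CASTERS` and `typed_value` are hypothetical names and the rows are made up.

```python
# Metadata: declared type of each annotation variable.
METADATA = {"Sex": "VARCHAR", "Age": "INT", "Steatosis": "BOOLEAN", "CRP": "FLOAT"}

# One conversion function per declared type; a bad value raises an exception.
CASTERS = {
    "VARCHAR": str,
    "INT": int,
    "FLOAT": float,
    "BOOLEAN": lambda text: {"TRUE": True, "FALSE": False}[text.upper()],
}


def typed_value(variable_id, raw_value):
    """Convert the text stored in the Value column using the variable's declared type."""
    data_type = METADATA[variable_id]
    return CASTERS[data_type](raw_value)


# Made-up rows for illustration.
eav_rows = [
    ("CHC358T", "Age", "40"),
    ("CHC358T", "Steatosis", "TRUE"),
    ("CHC358T", "CRP", "0.5"),
]
typed = {variable: typed_value(variable, value) for _, variable, value in eav_rows}
print(typed)
```

A value that does not match its declared type, such as `typed_value("Age", "forty")`, raises an error, so the typing that the Value column lost can be re-enforced before data is written.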
  • 25. CHCDB
    Software
      • R.D.B.M.S.: MySQL
          • Open source & free
          • Most widely used open-source R.D.B.M.S.
            → Actively maintained
            → Lots of maintenance and development tools
      • The machine hosting the R.D.B.M.S. has yet to be bought
    Data
      • CHCDB's tables fall into one of three categories:
          • Tissue listings
          • Clinical annotation data, in the E.A.V. structure
          • Extraction data
  • 26. Database structure
  • 28. A web interface
    Peculiarities
      • Installed on the server hosting the R.D.B.M.S.
      • Can be reached from any machine on the CEPH network
    Features
      • Consultation and modification of clinical annotations for a given tissue
      • Listing of tissues and their annotations
      • Listing of tissue extractions
      • Management (add/modify/delete) of annotation variables
      • Batch import of annotations
      • Batch import of extraction data
  • 29. Consultation & modification of the annotations of a given tissue
  • 30. Consultation & modification of the annotations of a given tissue
  • 31. Listing of tissues and annotations
  • 32. Listing of tissue extractions
  • 33. Annotation variables management
  • 34. Annotation variables management
  • 36. Missing features in CHCDBWEB
      • The tissue management interface is not yet complete
      • The batch import interface for annotations and extractions is missing
    CHCDB
      • Defining a starting set of variables
      • Importing existing data into CHCDB
    Material
      • Acquiring and configuring the machine which will host the database and the web server
    CHCDB and CHCDBWEB should enter production phase in June 2011.