Database for cancer immunology


Published on

Published in: Health & Medicine
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Database for cancer immunology

  1. 1. Bernhard Mlecnik Database for cancer immunology Master Thesis 1 Institute for Biomedical Engineering, Graz University of Technology, Graz, Austria 2 ´ ´ Institut National de la Sante et de la Recherche Medical ´ ´ Unite 255, Centre de Recherches Biomedicales des Cordeliers, Paris, France Supervisors: Dipl.-Ing. Robert Molidor1 , Dr. Jerome Galon2 , Ao.Univ.-Prof. ´ ˆ Dipl.-Ing. Dr.techn. Zlatko Trajanoski1 Evaluator: Ao.Univ.-Prof. Dipl.-Ing. Dr.techn. Zlatko Trajanoski1 Head of Institute: Univ.-Prof. Dipl.-Ing. Dr.techn. Gert Pfurtscheller1 Graz, February 2003
  2. 2. For my parents ¨ Fur meine Eltern
  3. 3. Abstract Abstract Cancer is a worldwide public health problem. Each year, 6 million people die from cancer and 8,1 million new cases are diagnosed. In twenty years from now, the cancer burden will exceed 50% due to the ageing of the population and their increasing exposure to risk factors. It is proven that the immune system plays a major role in recognising and destroying tumour cells, and it is possible that it may induce immunological responses, which may have therapeutical benefits against certain tumours. The broad, long-term objective of the functional genomic studies in this thesis is to identify molecular signatures and pathways in T-cells surrounding cancer. The specific aim of this thesis was to develop a tumoral microenvironment (TME) database for storing and maintaining all the data which are arising from different immunological experiments. The data were obtained from cancer patients as well as from healthy donors. The used software tech- nology was based on the newest Java-Client-Server technologies and applied Java Database Connectivity (JDBC), Java Server Pages (JSP) and Enterprise Java Beans (EJB). The collected FACS (fluorescence ac- tivated cell sorter) data was clustered using hierarchical clustering algorithm. The results demonstrated that immunophenotypic and functional data can be used to group patients and controls into distinct groups. In future work, immunophenotypic and functional data will be integrated with microarray data in order to explore new relations between expression patterns and cell surface markers. Keywords: Cancer, Tumoral Microenvironment, T-Cells, Databases, Bioinformatics i
  4. 4. Abstract Kurzfassung Krebs hat sich l¨ ngst zu einem weltweiten Gesundheitsproblem entwickelt. J¨ hrlich sterben 6 Mil- a a lionen Menschen an den Folgen einer Krebserkrankung und 8,1 Millionen neue F¨ lle werden diagnos- a tiziert. In den kommenden zwanzig Jahren soll die Krebsrate um 50% steigen. Es ist bewiesen, dass das Immunsystem eine wichtige Rolle im Erkennen und Zerst¨ ren von Krebszellen einnimmt, wobei es o immunologische Reaktionen hervorrufen k¨ nnte, die therapeutisch gegen gewisse Krebsarten einsetzbar o w¨ ren. a Das Ziel langfristiger funktioneller genomischer Studien in dieser Diplomarbeit soll neue moleku- lare Signaturen in T-Zellen aufdecken, die sich in unmittelbarer Umgebung eines Tumors befinden. Das Ziel dieser Arbeit war es eine Datenbank zu entwickeln, die phenotypische wie funktionelle immunolo- gische Daten speichern und verwalten soll, die w¨ hrend verschiedner Experimente aufkamen, bzw. noch a aufkommen werden. Die Softwaretechnologie zur Realisierung dieser Diplomarbeit basiert auf der neuesten Java-Client- Server Technologie, unter Verwendung von Java Server Pages (JSP), Java Database Connectivity (JDBC) und Enterprise Java Bean (EJB). Die gespeicherten FACS (fluorescence activated cell sorter) Daten wur- den vereint und mit hierarchischen Cluster-Algorithmen geclustert. Es konnte gezeigt werden, dass immunophenotypische und funktionelle Daten von Patienten und Kontrollpersonen verwendet werden k¨ nnen, um sie in verschiedene Gruppen zu unterteilen. o In Zukunft sollen auch Microarray-Experimente mit den immunologischen Daten zusammengef¨ hrt u werden, um neue Zusammenh¨ nge zwischen intrazellul¨ ren Expressionsmustern und Oberfl¨ chenmark- a a a ern zu erforschen. ¨ Schlusselw¨ rter: Krebs, Tumoral Micro Environment T-Zellen, Datenbanken, Bioinformatik o ii
  5. 5. Contents Glossary viii 1 Introduction 1 1.1 Cancer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Tumoral microenvironment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3 Immunity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3.1 Innate and adaptive immunity . . . . . . . . . . . . . . . . . . . . . . 4 1.3.2 B cells . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.3.3 T cells . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.3.4 Cluster Designation (CD) . . . . . . . . . . . . . . . . . . . . . . . . 5 1.3.5 Cytokines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.4 Tumour immunology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.4.1 Immune surveillance . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.4.2 Tumour antigens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 1.4.3 Immunotherapy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2 Objectives 10 2.1 Database development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.2 Application server deployment . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.3 Web application development . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 3 Methods 12 iii
  6. 6. CONTENTS CONTENTS 3.1 Fluorescent-activated cell sorter (FACS) . . . . . . . . . . . . . . . . . . . . . 12 3.1.1 FACS analyses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 3.1.2 Sample treatments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 3.1.3 Phenotype analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 3.1.4 Proliferation analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.1.5 Cytokine secretion analysis . . . . . . . . . . . . . . . . . . . . . . . . 18 3.2 Database development (Enterprise Information System (EIS)-Tier) . . . . . . . 19 3.2.1 Relational databases . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 Normalisation . . . . . . . . . . . . . . . . . . . . . . . . . 20 Integrity rules . . . . . . . . . . . . . . . . . . . . . . . . . 20 3.2.2 Structured Query Language (SQL) . . . . . . . . . . . . . . . . . . . . 21 Data Definition Language (DDL) . . . . . . . . . . . . . . . 21 Data Manipulation Language (DML) . . . . . . . . . . . . . 22 3.2.3 Java Database Connectivity (JDBC) . . . . . . . . . . . . . . . . . . . 22 Two-tier and Three-tier Models . . . . . . . . . . . . . . . . 23 3.3 Application server deployment (Middle-Tier) . . . . . . . . . . . . . . . . . . 24 3.3.1 Enterprise Java Beans 2 (EJB2) . . . . . . . . . . . . . . . . . . . . . 24 3.3.2 The Java 2 Enterprise Edition (J2EE) server . . . . . . . . . . . . . . . 26 3.3.3 Java Cryptography Extension (JCE) . . . . . . . . . . . . . . . . . . . 27 3.3.4 Extensible Markup Language (XML) . . . . . . . . . . . . . . . . . . 27 3.3.5 JDOM (Java Document Object Model) . . . . . . . . . . . . . . . . . 28 3.4 Web application development (WEB-Tier) . . . . . . . . . . . . . . . . . . . . 28 3.4.1 Java Server Page (JSP) . . . . . . . . . . . . . . . . . . . . . . . . . . 28 3.4.2 JSP Tag libraries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.4.3 The Jakarta Struts Framework . . . . . . . . . . . . . . . . . . . . . . 31 3.4.4 Struts Application Flow . . . . . . . . . . . . . . . . . . . . . . . . . 31 4 Results 34 4.1 The Database Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 iv
  7. 7. CONTENTS CONTENTS 4.1.1 Patient and Experiment Table . . . . . . . . . . . . . . . . . . . . . . 35 4.1.2 User Management Tables . . . . . . . . . . . . . . . . . . . . . . . . . 35 4.1.3 Experiment Related Tables . . . . . . . . . . . . . . . . . . . . . . . . 35 4.1.4 Application server connection . . . . . . . . . . . . . . . . . . . . . . 38 4.2 The Web Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 4.2.1 The TME.db Web Page . . . . . . . . . . . . . . . . . . . . . . . . . . 39 4.2.2 User Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 4.2.3 The Patient Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 Cancer Information . . . . . . . . . . . . . . . . . . . . . . 42 Biological Markers . . . . . . . . . . . . . . . . . . . . . . 43 Treatments . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 FACS Phenotype Assays . . . . . . . . . . . . . . . . . . . 44 FACS Proliferation Assays . . . . . . . . . . . . . . . . . . 45 FACS Cytokine Secretion Assays . . . . . . . . . . . . . . . 49 4.2.4 Basic Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 4.2.5 Custom Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 Building A Custom Query . . . . . . . . . . . . . . . . . . . 52 Clustering the FACS data . . . . . . . . . . . . . . . . . . . 55 5 Discussion 59 Bibliography 62 A Database Model 65 B Cluster Designation List 67 v
  8. 8. List of Figures 1.1 Tumour micro ecology [14] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.2 The principal components of the immune system [27] . . . . . . . . . . . . . . 3 1.3 T cell encounters an APC [27] . . . . . . . . . . . . . . . . . . . . . . . . . . 5 3.1 Left: FACS with two dye channels FL1-H and FL3-H. Right: Scatter plot with histogram [27] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 3.2 Phenotype scatter plots. Day 0, cells without CD25+ markers. Right: Day 5 after stimulation, cells with an increased amount of CD25+ markers . . . . . . 16 3.3 Proliferation scatter plots. Left: Day 0, cells stained with CFSE. Right: Day 5 after stimulation with Il-10 and IL-15. . . . . . . . . . . . . . . . . . . . . . . 17 3.4 Cytokine secreting cell [20] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 3.5 Multi Tier Model [11] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 3.6 EJB Model [1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 3.7 JSP MVC model [23] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.8 Process flow for displaying a JSP page within a Struts Project [5] . . . . . . . . 32 4.1 TME Web Page Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 4.2 Add User Panel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 4.3 Patient Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 4.4 Phenotype Assay Overview with scatter plot. . . . . . . . . . . . . . . . . . . 44 4.5 Proliferation Assay Overview with: FSC-SSC-plot, scatter plot and histogram plot. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 vi
  9. 9. LIST OF FIGURES LIST OF FIGURES 4.6 Basic Query Page with certain selected Gates and Antigens . . . . . . . . . . . 49 4.7 Custom List Section with selected experiment tab . . . . . . . . . . . . . . . . 50 4.8 Custom Query Section with selected cancer tab . . . . . . . . . . . . . . . . . 51 4.9 Custom Savings Section with saved query . . . . . . . . . . . . . . . . . . . . 52 4.10 Custom Query Listings page with specified selections . . . . . . . . . . . . . . 53 4.11 Patient FACS experiment list . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 4.12 Custom Query pages with specified selections . . . . . . . . . . . . . . . . . . 54 4.13 File download . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 4.14 Hierarchical Cluster result of the FACS data . . . . . . . . . . . . . . . . . . . 55 4.15 3D plot of the clusters within the 3 most influential Principle Components . . . 57 vii
  10. 10. Glossary APCs Antigen presenting cells, have the ability to present antigen particles bound to specific receptors on their surface. CDxx Cluster of Designation, terms lymphoid surface antigens. CFSE Carboxy-fluorescein diacetate, succinimidyl ester, used for intracellular staining of cells. CSFs Colony-stimulating factors, have influence in controlling and directing the division and differentiation of bone-marrow stem cells. DDL Data Definition Language, is a sub section of SQL allowing the creation and deletion of tables in the database. DML Data Manipulation Language, is a sub section of SQL include the syntax for complex queries as well as for updates, insertions and deletions of data records. EJB Enterprise Java Bean, a business data object specification developed in java technology. ELISA special kit for detecting particular cytokines secreted by APCs or other cytokine se- creting cells. viii
  11. 11. Glossary FACS Fluorescent-activated cell sorter, designed for an automatic separation and analysis of fluorescently stained cells. Ficoll is used to separate immune cells from the blood. FK Foreign Key, is a PK of a foreign data table. HCL Hierarchical Clustering, cluster algorithm for creating a relational data tree (dendro- gram). IFNs Interferons, are mainly involved into the cell’s prevention of certain viral infections. IgG Immune globulin G, soluble antibody secreted by B cells. ILs Interleukins, cytokines mainly produced by T cells. JCE Java Cryptography Extension, is a non-commercial cryptography extension for Java con- taining a package which provides implementations for encryption. JDBC Java Database Connectivity, Java software tool for accessing databases. JSPs Java Server Pages, have been developed to provide an easy and intuitive way in creating dynamical generated HTML pages. MFI Mean fluorescent intensity, is measured within FACS experiments on stained antigens bound to cell surfaces. MHC Major Histocompatibility Complex, antigen presenting receptor. NK Natural Killer cells, are a group of lymphocytes which have intrinsic ability to recognise and destroy some virally infected cells. ix
  12. 12. Glossary PCA Principle Component Analysis, determines basis functions of a similarity matirx. PHA Phytohemagglutin, mitogen for activating T cell receptors. PK Primary Key, defines the uniqueness of each data record. SQL Structured Query Language, non procedural language for accessing relational databases. TCR T cell receptor, connects with the antigen presenting MHC receptor. TGFs Transforming growth factors, have a partial effect in mediating inflammation reactions. TME Tumoral microenvironment, comprises all the biomolecules and cells which surround a tumour. TNFs Tumour necrosis factors, have a partial effect in cytotoxic reactions against tumour cells. x
  13. 13. Chapter 1 Introduction Cancer is a public health problem worldwide. Each year, 6 million people die from cancer and 8.1 million new cases are diagnosed. The growth rate of cancer is now 2.1% per year, a rate that exceeds the growth rate of the world’s population at 1.7 % per year. The leading causes of worldwide cancer deaths are lung cancer, which accounts for over 900,000 deaths, gastric cancer (600,000 deaths) and colorectal cancer (400,000 deaths) [3]. The occurring cases of different types of cancer differ between developed and developing countries, whereby more than 55% of the detected deaths occur in developing countries. The most common cancers in developed countries are lung, stomach, breast, and colorectal cancer, whereas in developing countries lung, stomach, breast, cervical, and oesophageal cancer accounts for the main part of the occurring cases. The average 5-year survival is as low as 8 % in Europe and 14% in the United States [3]. 1.1 Cancer The main indicator of cancer is the uncontrolled growth and dispersion of cells as a result of abnormal changes to the genetic material contained in those cells. A single cell or group of cells can undergo genetic events such as mutations, influenced by inherited or environmental factors as well as a result of certain levels of hormones or growth factors, which may change the cells’ behaviours. These events, which may take years to arise, cause the cells to proceed down the 1
  14. 14. Introduction 1.2 Tumoral microenvironment pathway to the development of cancer [2]. If cells divide abnormal in an early stage of development, they may evolve into a cell pop- ulation that could be immortalized and which may lose the control mechanisms of normal cell division, activity and interactions with neighbouring cells. Such immortalized cell populations may evolve into malignant tumour cell populations, whose behaviour may violate the tissue environment. Once certain cell populations became malignant they may form solid tumours which invade and destroy sane tissues as well as they may metastasize (spread) all over the body by releasing tumour cells into the blood and lymph system, where they may continue to grow and develop by forming new cancers [2]. 1.2 Tumoral microenvironment The tumoral microenvironment comprises all the biomolecules and cells which surround the tumour and have a major influence to its development and behaviour. During the whole tumour aetiology, progression and final metastases the tumour microenvironment of the local host tissue may represent an active participant in tumour-host interaction. Throughout all the cancer stages the tumour-host interaction is accompanied by Figure 1.1: Tumour micro ecology enzyme and cytokine exchange of cancer and stromal [14] cells that change and modify the extra cellular matrix as well as survival and proliferation [14]. 1.3 Immunity All living species are threatened by other organisms constantly. This is the reason why many species have tried to develop protections and defensive mechanisms against foreign aggressors and intruders. Vertebrates have developed an own and very complex defensive mechanism against intruding microorganisms, which is called the immune system. The meaning of the word 2
  15. 15. Introduction 1.3 Immunity immunity derives from the Latin word immunis (unhurt, protected) and describes the protection and immunity against particular infectious agents. During the encounter with foreign micro organisms the immune system runs through a learn process whereby the recognition of the infectious agents is a crucial step in immune defence. The most important task of the immune system is to distinguish between own and foreign bio molecules to make sure that only foreign intruders are attacked and destroyed [15]. The immune cells (leukocytes) are distinguished into three major subgroups (Fig.1.2): • Lymphocytes: These kind of immune cells (B-, T-Cells) induces adaptive immune re- sponses (adaptive immunity) and create specific memory cells to prevent further encoun- ters with pathogens. • Phagocytes: The main function of these cells (mononuclear phagocytes, neutrophils, eosinophils) is the unspecific destruction (innate immunity) of foreign pathogens by en- gulfing or releasing lytic granules to dissolve them. • Auxiliary Cells: These cells (basophils, mast cells) are mainly involved in inducing in- flammatory reactions which support and accelerate curing processes. Figure 1.2: The principal components of the immune system [27] 3
  16. 16. Introduction 1.3 Immunity 1.3.1 Innate and adaptive immunity Any immune response firstly recognises the pathogen or foreign material and eliminates it af- terwards. There are different immune responses which fall into two categories: innate (or non- adaptive) immune responses and adaptive immune responses. The main difference between these two is that an adaptive immune response is highly specific against a particular pathogen. The innate immune response does not alter during repeated encounters with infectious agents, whereby the adaptive immune response improves with each successive encounter with the same pathogen: in effect the adaptive immune response ’remembers’ the infectious agent and may prevent it from causing a disease later [27]. The main tasks of the innate immune system are non-specific recognition and digestion of foreign intruders and therefore it is also called the first line of defence against infection. The major participants in this kind of immunity are the phagocytic cells (Fig.1.2), the monocytes as well as the macrophages [27]. The strength of the adaptive immunity is the specific recognition of particular pathogens in the host’s tissues and body fluids. Lymphocytes, which are distinguished into two major sub groups, the B and the T lymphocytes (Fig.1.2), support the cells’ acting within the adaptive immunity [27]. 1.3.2 B cells Every B cell bears a unique genetic information in its genome to encode its own very specific surface marker which may only bind to one particular antigen. Once a B cell encounters a specific antigen, fitting to its receptors, internal pathways are activated and the cell starts to proliferate and differentiates itself into a plasma cell. Differentiated plasma cells raise a large amount of soluble antibodies which are secreted afterwards. These antibodies are large glyco- proteins situated in the blood which bind to the same type of antigen which initially encounters the B cell’s receptors. So the antigen which has evoked an immunological response is covered with antibodies all over, which may bind to the surface of a phagocyte that may engulf the antigen and destroy it later [27]. 4
  17. 17. Introduction 1.3 Immunity 1.3.3 T cells Both B and T cells have the same precursors, the bone-marrow stem cells that are situated in the cavities of the large bones. It is crucial that all specific immune cells are tested against auto- immunity to prevent the body’s own proteins from being attacked. Mature lymphocytes which show auto-immunity are detected and destroyed before they can enter the lymphatic system. T cells migrate to the thymus where they mature and on passing the auto-immunity check (nega- tive selection) they are released to the lymph system. Otherwise the cells are destroyed. T cells are characteristic for detecting and bind- ing unexceptionally to antigen presenting cells (APCs) which have the ability to present antigen particles bound to specific receptors on their surface. Because of their far reaching genetic invariance these kinds of antigen Figure 1.3: T cell encounters an APC representing receptors are termed the major histocom- [27] patibity complex (MHC). The T cells in turn possess the T cell receptor (TCR) which may connect with the antigen presenting MHC receptor (Fig.1.3). The T cells are distinguished into two categories, the cytotoxic T cells (TC ), which kill the APCs in case of an encounter, and the T helper cells (TH ) which initiate a secretion of soluble proteins called cytokines to induce several different cell events. The cytotoxic cells use in addition to the TCR the CD8 (Cluster Designation) receptor to detect the MHC of type I, whereas the T helper cells use the CD4 re- ceptor to detect the MHC of type II. These complex recognition qualities are important security mechanisms to make sure that there may not be any confusion in these complicated interactions of cells [27]. 1.3.4 Cluster Designation (CD) Researchers in many scattered laboratories identified many of new lymphoid surface antigens and termed them with self defined names. It became apparent that there were a vast amount 5
  18. 18. Introduction 1.3 Immunity of various called antigens which seemed to be identically. Due to these confusions the cluster designation (CD) system has been developed over the last few years. Now new investigated antigens at first have to be termed ’CDw’ whereby ’w’ indicates the not yet being confirmed label and within some years the label is changed to a true CD designation confirmed by an international committee [6] [21]. A list of CD labels used in this thesis is given in appendix B. 1.3.5 Cytokines All cells participating within an immune response, communicate among themselves by secret- ing soluble molecules called cytokines. These molecules pertain to a large group of different proteins which fall into several categories described below [27]. Interleukins (ILs) are mainly produced by T cells (IL − 1 to IL − 22), but there are also many other kinds of cells which have the ability to secrete interleukins like some phagocytes or tissue cells. They induce manifold cell activities, but their major function appears to be the direction of other cells’ division and differentiation [27]. Interferons (IFNs) are mainly involved into the cell’s prevention of certain viral infections whereby IFNα and/or IFNβ are produced and released during a cell’s infection. Certain acti- vated T cells may also release another type of interferon, the INFγ [27]. Chemokines direct the cells’ movement around the body which goes from the blood stream into the tissues and further on to appropriate locations within each tissue. They induce cells to cross tissue boundaries but they also may have some certain functions in activating cells [27] (Chemokines may be responsible for spreading and metastasis of cancer cells). Colony-stimulating factors (CSFs) may have influence in controlling and directing the di- vision and differentiation of bone-marrow stem cells. Blood leukocytes and their production mostly depend on the CSFs’ balance too [27]. 6
  19. 19. Introduction 1.4 Tumour immunology Other cytokines like the tumour necrosis factors (TNFs) and transforming growth factors (TGFs) have a partial effect in mediating inflammation and cytotoxic reactions [27]. All these protein molecules are secreted by the white blood cells and act like hormones by having a major influence in cells’ interactions and behaviour. Although a lot of cytokines are already known there remains a vast of unidentified functions they may induce. 1.4 Tumour immunology It is well known that tumours may induce immunological responses. In the early 19th century Paul Ehrlich was one of the first who suggested that tumours might be regarded as similar to grafted tissue which may be rejected by the host, if causing an immunological response. A first approach revealed a regression of grafted tumour tissue in the host, but it came in discredit as it was apparent that the regression was caused by the foreignness of the MHC receptors (every individuum bears its own kind of MHC receptors). Hence it was established that tumours might be rejected in presence of antigenic disparity between tumour and host. Later, when inbred rodents became available, studies on animals bearing a tumour expressing identical MHC anti- gens were accomplished and the term immune surveillance became more and more important [27]. 1.4.1 Immune surveillance At first it was suggested that the immune system surveys and recognises abnormal cells in a very early stage in order to destroy them. There were proposals that the immune system plays a major part in delaying growth and causing regression of already established tumours. Some evidences which arose are outlined below: • Spontaneous regression of tumours occurred. 7
  20. 20. Introduction 1.4 Tumour immunology • Postmortem data suggested that there may be more tumours than became clinically ap- parent. • Tumours arose frequently in immunosuppressed or immunodeficient individuals. • Many tumours contained lymphoid infiltrates which may have been a favourable sign. Despite these impressive evidences there was no proved correlation between immunosup- pression and an increased tumour incidence. It rather seemed that the immune surveillance acted against viruses which also might have caused tumoral spreading [27]. 1.4.2 Tumour antigens Abundant evidences have shown that almost all tumours indicate genetic alterations which lead to the expression of mutated and sometimes overexpressed molecules. Tumour antigens were first demonstrated by transplantation tests. When a tumour was grafted to an animal previously immunized with inactivated cells of the same tumour, resistance to the graft was seen. There are two known groups of tumour antigens: • Tumour-specific antigens caused by genetic mutations in tumour genes. • Shared tumour antigens found on several tumours induced by viruses. It is important to explore a variety of new means which take the fact into account that the immune system possesses the ability of recognising specific antigens on cells surfaces. This specificity is of greatest importance in applying immunotherapies with T cells [27]. 1.4.3 Immunotherapy There is a long history behind immunotherapy but only during the last recent years it gained more and more importance and is now established as a new reliable form of therapy for some kinds of tumours. There are several forms of interventions like active or passive, specific or non-specific treatments, but all of these remain currently in an experimental stage. Some of the latest established therapies are listed below: 8
  21. 21. Introduction 1.4 Tumour immunology • Immunization with tumour antigens • Immunotherapy with cytokine can cause tumour regression • Immunization against oncogenic viruses • Therapy with lymphokine-activated killer cells • Immunotherapy with T cells • Therapy with antibodies As already mentioned all these therapies may only act against some particular forms of can- cer and are therefore very restricted in their appliance. But nonetheless there are a lot of studies engaged with the development of new means pertaining to this kind of therapeutic treatments [27]. 9
  22. 22. Chapter 2 Objectives The aim of this thesis is to develop a database for different immunological data and make it accessible via a web interface. The database should be maintainable by an administrator or different users with particular access admissions. A basic demand was an encryption of data pertaining to patient and user specific information. Different upload routines and input masks, which should be accessible via a web browser, have to be written in order to fill the database with the required data. Several query algorithms have to be established with which the requested information should be aligned in an appropriate way for the clustering software. Finally the database’s re-obtained immunological data should be brought into a clusterable form in order to apply the different cluster algorithms to observe different expression patterns in the same way, as it is already used for microarray data [29]. The received data should contain all the different experiment and patient information for each patient, all aligned in a matrix which can be clustered with particular algorithms. Some cluster results should be shown to demonstrate the functionality and necessity of this project. The major goals of this project will be separated into three main parts: 10
  23. 23. Objectives 2.1 Database development 2.1 Database development The first part is to develop and design a database for immunological data which will arise within several different FACS analyses of different kinds of sample material (lung cancer tissue, pleural liquid, etc.) from patients and healthy donors. The database should be flexible and easily extendable to ensure the possibility of adding new tables and relations. 2.2 Application server deployment For the second part the business logic should be programmed and deployed to an application server to map the entity relations of the database in an appropriate and easy accessible way for other client programs. The application server should bear the business logic and separate and hide the data management layer from the accessing client layer. The data management should contain the up- and download routines as well as the data maintaining methods. 2.3 Web application development To make the data accessible for clients the third part should be the development of a web applica- tion that communicates with the application server which processes the data from the database. By using a browser the web server can be accessed by an appropriate web address. The server creates and returns html pages which contain the requested information for the user. 11
  24. 24. Chapter 3 Methods This chapter will give a brief survey of the FACS technology in respect to phenotypic and functional analyses of immune cells. Further it will give a survey of current Java technologies with regard to establishing server side applications with database connections. First it shows the data storing layer, next it leads through the business logic up to the web server application layer and finally faces the user’s web interface. 3.1 Fluorescent-activated cell sorter (FACS) Flow cytometry is a powerful means in modern biology and has already gained a key position in immunology and cell biology. It can be used to separate various kinds of cells using different stainings of their diverse surface markers. It also allows examining the number of cell cycles with intracellular immunofluorescence. The FACS was designed for an automatic separation and analysis of fluorescently stained cells. The diluted cells pass through a thin vibrating nozzle forming very small droplets which contain just one cell at a time. These cells are detected one by one passing a laser beam which measures the wavelength and the intensity of their fluorescence at a time. Dependent on this information the type and the size of one single cell can be examined and displayed in an appropriate scatter plot (Fig.3.1). Newer FACS equipments may recognize more than two different fluorescence colours at one time, thus it is possible to mark the cells’ surface antigens with several fluorescently stained antibodies to isolate different cell populations 12
  25. 25. Methods 3.1 Fluorescent-activated cell sorter (FACS) [27] [10]. Figure 3.1: Left: FACS with two dye channels FL1-H and FL3-H. Right: Scatter plot with histogram [27] Every dot in a two dimensional scatter plot indicates a particular fluorescently marked cell whereby the two detected fluorescent intensities (FI) are displayed in a logarithmic scale on each axis. The mean fluorescent intensity (MFI) of one dye channel is the fluorescence intensity of one species of equal stained antibodies adhered to the cell surface. The histogram plot indicates the amount of cells with a certain fluorescent intensity. 3.1.1 FACS analyses For this thesis various samples of patients and healthy donors were available. The samples were obtained from several volunteers of different hospitals. Thus currently received samples are listed in Table 3.1. The used FACS (FACSCALIBUR from BECTON DICKINSON) was equipped with two lasers allowing the recognition of four colour staining at one time. Hence for all FACS experi- ments four different surface markers can be applied to examine four different surface antigens simultaneously. 13
  26. 26. Methods 3.1 Fluorescent-activated cell sorter (FACS) Donor Sample material Primitive cancer Cancer type Cancer subtype Healthy blood, pleural liq- - - - uid, purified T cells Patient tumour, lymphn- lung, breast, KBP, MesM, K2 ADK, Kepi, K1P, ode, pleural liq- colon NOS, Infl., Sark. uid, blood, not tu- moral biopsy Table 3.1: Sample material list 3.1.2 Sample treatments To prepare the different samples for the FACS analyses, several material treatments have to be done, in order to extract the cells of interest. Dilacerations are done to break up solid sample materials to bring them into an utilisable form for further processing. Ficoll is used on all sample materials to separate the immunologic cells of interest (leuko- cytes) from the remaining blood compartment (red blood cells, dead cells, necrotic cells etc.). The achieved leukocytes are purified, diluted and mixed with different fluorescent stained anti- bodies later on to prepare it for the different FACS analyses. Proliferation experiments are performed before and/or after T cell purification. The cells are activated and stimulated with different stimulus (cytokines, antibodies, mitogens etc.) and the following proliferation was analysed by the FACS. Purification of cell mixture with the monoclonal antibodies has to be done in order to isolate T cells for proliferation assays and RNA extraction. There are several purification kits available dependent on ensuing analyses and starting material. If there is a sufficient amount of cells RNA extractions may be done for subsequent microarray experiments. 14
  27. 27. Methods 3.1 Fluorescent-activated cell sorter (FACS) 3.1.3 Phenotype analysis A phenotype is a physical manifestation of internally coded, inheritable information of a geno- type which encodes and maintains the cells entire behaviours and structures [4]. Hence a phe- notype analysis tries to examine different cell populations within a multicomponent mixture of cells. The only way in doing this is to use different signs which are common within a partic- ular cell subpopulation. In immunology the easiest way to distinguish between different cell populations is to use monoclonal antibodies against the multiplicity of a cell’s surface markers which are specific for a certain population. The phenotype analysis starts with the proportioning of the prepared cell dilution into sev- eral tubes. Each tube’s cell mixture is stained by using four different species of antibodies marked with different fluorescent dyes. The fluorescent antibodies bind to their specific antigen receptors and in subsequent FACS analyses they reveal the cells’ characteristics. Tube FL1H FL2H FL3H FL4H 1 IgG1 IgG1 IgG1 IgG1 2 CD19 CD56 CD3 CD14 3 CD4 CD103 CD3 CD69 4 CD1a CD83 CD45 CD14 Table 3.2: FACS tube list FL1H to FL4H mark the different fluorescent dyes. The IgG1 in Tube 1 is an immuno glob- ulin which binds with its FC region end to FC receptors. Thus it is used for a calibrating process to reveal the amount of FC receptors on the cell’s surface because specific antigens may also bind to the FC receptors and falsify the FACS result. The calibration process starts by inserting the first tube (with the four different stained IgGs) into the FACS whereby each of the acquired scatter plots indicates a cloud of points lying close together. This cloud is used to calibrate the axes of the scatter plot by moving it into the left lower quadrant. All of the following analyses use this adjustment, which defines if a cell has a positive (points above the axis) or negative (point below the axis) expression of a certain surface marker. 15
  28. 28. Methods 3.1 Fluorescent-activated cell sorter (FACS) The phenotype analyses shown below (Fig.3.2) were made at two different points in time whereby the first scatter plot was captured at day zero and the second at day five after incubation and activation with CD3/CD28 and stimulation with IL-2. Figure 3.2: Phenotype scatter plots. Day 0, cells without CD25+ markers. Right: Day 5 after stimulation, cells with an increased amount of CD25+ markers The first scatter plot shows two groups of cells, CD3+ (T cells; CD3 is a special T cell marker, whereby the + indicates a positive expression on the cell’s surface and the − a negative) and CD3− but both of them are CD25− . The second scatter plot reveals a major increase of CD25 markers on the CD3+ (T)cells’ surfaces after stimulation with IL-2 indicated by the high MFI (mean fluorescence intensity) of the CD25 markers. This possibly may come from the fact that CD25 is a special receptor for IL-2 cytokines. IL-2 is known as an inducer of cell proliferation but under certain conditions it also may cause apoptosis (self mediated cell death). A listing of several used antibody combinations is given in Table 3.2. For example in Tube 2, following cell populations can be distinguished: CD3+ represent T cells, CD19+ represent B cells, CD56+ are NK (natural killer) cells (CD3+ CD56+ cells are NK T cells) and CD14+ are monocytes. In Tube 3, CD3+ CD4+ cells are TH (T helper) cells, CD3+ CD69+ are activated T cells, therefore CD3+ CD4+ CD69+ cells are activates TH cells, whereas CD3+ CD103+ represent a subpopulation of regulatory T cells. In Tube 4, CD45+ represent all hematopoetic cells, CD1a+ CD83+ are dendritic cells (special APCs) and CD14+ CD1a− are monocytes. 16
  29. 29. Methods 3.1 Fluorescent-activated cell sorter (FACS) 3.1.4 Proliferation analysis The proliferation analysis tries to examine the cells’ behaviour of cell division and augmentation under certain activation and stimulation conditions. To observe cell proliferation Carboxy- fluorescein diacetate, succinimidyl ester (CFSE) a red fluorescent dye is used for an intracellular staining of all cells. Within each cell division the CFSE amount and therefore the MFI bisects and diminishes. Figure 3.3: Proliferation scatter plots. Left: Day 0, cells stained with CFSE. Right: Day 5 after stimulation with Il-10 and IL-15. Proliferation assays start with the same proportioning of the cell mixture into tubes as it was already described for the phenotype analysis. Each tube stained with CFSE is stimulated then with a different combination of cytokines followed by incubation for several days. FACS experiments are made on different points in incubation time to record the cells’ behaviour in respect of their different stimulation conditions. One of the four FACS’s dye channels is used to detect the CFSE fluorescence intensity whereby the three remaining channels are used for stained antibodies as supplied before. To activate the cells, micro titer plates are filled with specific antibodies CD3/CD28 or mitogens e.g. PHA (phytohemagglutin). The proliferation assays depicted above (Fig.3.3) were made at different days of incubation. The first scatter plot indicates the initial state without activation and stimulation at day zero. The other scatter plots were captured at day five after 17
  30. 30. Methods 3.1 Fluorescent-activated cell sorter (FACS) stimulation with IL-10 or IL-15. The IL-15 stimulation reveals a considerably amount of cell divisions of CD3+ cells whereas IL-10 seems to inhibit cell proliferation. 3.1.5 Cytokine secretion analysis Cytokine secretion assays use special Kits (e.g. ELISA) to detect particular cytokines secreted by APCs or other cytokine secreting cells. The basic idea of such an assay is to detect cytokines which are released under certain stimulation conditions. Therefore this cytokine detection Kits provide Cy- tokine Catch Reagents and highly specific Cytokine Detection Antibodies. The Cytokine Catch Reagents bind to the receptors of cytokine secreting cells and may catch cytokines which are secreted by these cells. When a cell has secreted its different species of cy- Figure 3.4: Cytokine secreting cell tokines they diffuse to the Cytokine Catch Reagents [20] and bind to them. After a certain incubation time the different stained Cytokine Detection Antibodies are mixed to the cell compound and each anti- body binds to its specific kind of cytokine. Now the concentrations of certain secreted cytokine species can be detected by a following FACS analysis which is performed in the same way than for the phenotype analysis [20]. 18
  31. 31. Methods 3.2 Database development (Enterprise Information System (EIS)-Tier) 3.2 Database development (Enterprise Information System (EIS)-Tier) 3.2.1 Relational databases Relational databases are rested upon the theory of relational mathematics based on the set the- ory. The basic idea behind relational database models is a collection of two-dimensional tables, linked among themselves by different keys. Real world objects are mapped by abstract entities which are represented by their according tables [11]. Tables are storing information about in- stances of entities, their attributes and their relations among each others. Every entity instance consisting of a unique record (tuple) of data represents a row in the table . The uniqueness of each data record is based on one well-defined primary key (PK) whose occurrence must be unique within each table. To realise one to many (1:N) or many to one (N:1) interrelations, data records must contain primary keys of foreign tables, called foreign keys (FK). An implementa- tion of many to many (N:N) relations requires an additional table storing explicit allocations of different foreign keys. The fact of defined relations allows the stored data to be broken down into smaller logical and easier maintainable units. Some good reasons why relational databases should be used are outlined below: • Reducing of duplicate data: Leads to improved data integrity • Data independence: Data can be thought of as being stored in tables regardless of how physically stored • Application independence: The database is independent of accessing systems and pro- grams • Data sharing with a multiplicity of users • Single queries may retrieve data from more than just one table 19
  32. 32. Methods 3.2 Database development (Enterprise Information System (EIS)-Tier) Normalisation Normalisation is used to break up a raw database into logical and easy maintainable units called tables. The idea is to create a level of minimized redundancy that allows two or more tables to be joined within a single query. To realise such an implementation certain theoretical rules, called normal forms, have to be performed. There are six nested normal forms but in generally the first three are used to meet the requirements of a well-formed business database. First normal form (1NF): All attributes must be atomic and there must not be any repeating values whereby each row/column intersection may have just one value [11]. Second normal form (2NF): The table must be in 1NF and there must not be any partial dependencies in a table. Hence every non prime attribute must be fully functionally dependent on a primary key. Third normal form (3NF): table must be in 2NF and there must be no transitive dependencies hence the value of a non-key attribute must not depend on another non-key’s value. Integrity rules There are three integrity rules which have to be performed in a well-designed database. • Key constraint: Candidate keys are defined for each relation and must be unique for every tuple in any relation instance of that schema. • Entity integrity: All values pertaining to the primary (PK) must be no ’null’ values. Each tuple must be uniquely identifiable. • Referential integrity: There must not exist any foreign key (FK) in the database which is not another table’s primary key [11] [25]. To prevent violations of integrity rules some safety precautions, like the interdiction of PK alterations or a cascading alteration of all entries associated to the PK in case of an inevitable change of the PK can be taken into account. 20
  33. 33. Methods 3.2 Database development (Enterprise Information System (EIS)-Tier) 3.2.2 Structured Query Language (SQL) The father of relational databases, and thus SQL, is Dr. E.F. ’Ted’ Codd who worked for IBM. After Codd described a relational model for databases in 1970, IBM spent a lot of time and money researching how to implement his ideas [9]. Now SQL has already evolved into a stan- dard, open language without cooperative ownership and almost all nowadays available database implementations are designed to meet the SQL standards. SQL pertains to the category of non procedural languages called declarative languages. In opposition to procedural languages which result in many lines of code, SQL results in just one statement of the desired demand. A single database query consists of a SQL statement which includes all desired requests. This statement is sent then to the database management system (DBMS) which executes a hidden internal code and returns a somehow defined dataset [9]. There are two possibilities in accomplishing data manipulations with SQL: commands which return demanded datasets are defined in the data manipulation language (DML) and manipulating commands which alter the database’s internal structures use the data definition language (DDL). Data Definition Language (DDL) The DDL is a sub section of SQL allowing the creation and deletion of tables in the database as well as the definition of indexes (keys) and links between tables. It is also possible to enable constraints among different tables, defined by foreign keys[7]. Some of the most important DDL commands are listed below: • CREATE TABLE - creates a new database table • ALTER TABLE - alters (changes) a database table • DROP TABLE - deletes a database table • CREATE INDEX - creates an index (search key) • DROP INDEX - deletes an index 21
  34. 34. Methods 3.2 Database development (Enterprise Information System (EIS)-Tier) Data Manipulation Language (DML) The DML defines the second part of SQL commands. It includes the syntax for complex queries as well as for updates, insertions and deletions of data records [7]. The four basic manipulation commands are outlined below: • SELECT - extracts data from a database table • UPDATE - updates data in a database table • DELETE - deletes data from a database table • INSERT INTO - inserts new data into a database table The basic body of almost all query statements is given in the following example: • The SELECT statement creates a recordset from existing tables according to the param- eters that follow the statement. • The FROM command apprises the database engine to return all the fields in the selected tables. The fields specified in the SQL statement become the columns in the new record- set. • The WHERE condition restricts the rows returned to only rows containing the data spec- ified in the SQL statement. • The ORDER BY command notifies the database engine to sort the records before return- ing them. Example: SELECT address FROM patients WHERE ( name = ’...’) 3.2.3 Java Database Connectivity (JDBC) JDBC is a low-level API (application programming interface) performed in Java programming language [11] which allows an external access to any SQL database to manipulate and update 22
  35. 35. Methods 3.2 Database development (Enterprise Information System (EIS)-Tier) stored data. It provides library routines which supports the integration of direct SQL calls into the Java programming environment. These routines support an interface which makes it very easy to access the database by opening a connection and send SQL code to the database engine which executes the demanded commands. Having accomplished the request the Java program closes the connection and continues with its execution [31]. Java is already a well-established web programming language and in combination with JDBC it becomes an extremely useful tool in generating web based database applications. Due to Java’s platform independence it is an extremely useful tool no matter which operating system is used [30]. Compendious JDBC makes it possible to do three things: • Establishes a connection to a database • Sends SQL queries • Processes and returns the results Two-tier and Three-tier Models A tier structure represents ab- stract layers which communicate among themselves via different in- terfaces. Each tier performs its own particular duties and inter- acts with other layers to accom- plish different tasks. This seg- mentation into different tiers causes a separation between user inter- face and business logic which in- tercommunicate via well-defined Figure 3.5: Multi Tier Model [11] interfaces [30]. 23
  36. 36. Methods 3.3 Application server deployment (Middle-Tier) Two-tier models may be java applications or applets which directly access the database. Therefore a JDBC driver is needed which can communicate with the particular DBMS to send SQL statements to the database. If the database is located on another machine it is called a client/server configuration, whereby the accessing application acts as client. The connection may be established via intranet or internet standard TCP/IP protocols [30]. Three-tier models (see Fig.3.5) use a middle layer the ’middle-tier’ which receives com- mands from two different sides. The middle-tier conduces as service layer which executes commands obtained from the user layer and sends them to the database. The database (EIS- Tier) in turn processes the received SQL commands and returns the appropriate results to the middle tier which then sends them to the Client-Tier. The major advance of a middle-tier is an encapsulating of low-level calls hidden for the user who may access them by a high-level appli- cation interface (Client-Tier). This architecture may also provide performance and maintaining advantages [30]. 3.3 Application server deployment (Middle-Tier) 3.3.1 Enterprise Java Beans 2 (EJB2) EJBs are business data objects developed in java technology running in an EJB container sup- ported by Java 2 Enterprise Edition (J2EE) application servers. The encapsulation mechanism of EJBs allows the developer to concentrate on assignments belonging to his own business, without caring about the beans’ interactions with the container. EJBs may be accessed by an abundance of different users with appropriate admissions. The major advantage of EJBs is its portability among a variety of application servers supporting the J2EE container specification [13] [1]. The major improvement of EJB2 is the advanced EJB QL (Query Language) allowing complex queries with optimised SQL statements mixed with Java code. EJBs fall into three sub groups enumerated below: Session beans: This kind of beans account for the first layer of an EJB structure model (see Fig.3.6) seen by the client and mainly support getter and setter methods for the client. Thus 24
  37. 37. Methods 3.3 Application server deployment (Middle-Tier) Session Beans comprise the main part of the client’s business logic for accessing the data layer. As implied by their name, session beans only exist during on single session by executing one specific remote method invocation. To speed up client connections session beans, once if they were used, are sent to a pool where they wait for other invocations [1]. Session beans may be subdivided into two groups: • Stateless session beans do not maintain their state among different method invocations. • Stateful session beans hold the client state across method invocations. Entity beans: These are beans persisted within the EJB container (see Fig.3.6) for a di- rectly mapping of database entries. One entity bean corresponds exactly to one table within the database, whereby each table’s entry accords with setter and getter methods defined in the en- tity bean. These entity accessing methods may be invoked by session beans in case of a client’s request. The entity layer therefore is a mediator between databases and session beans by hiding the database’s specific accessing language from the developer [1]. Entity beans fall into two groups: • Container-managed persistence (CMP) - In case of a CMP the container must supply the full synchronisation between the database and the entity layer. The container ensures the consistency and integrity during the beans’ entire lifetime. The developer does not care about how the beans access the database, but it is important that the underlying database is relational in nature. • Bean-managed persistence (BMP) - In case of a BMP the programmer is entirely respon- sible for all the synchronising steps to connect the entity beans with the database. All the necessary SQL statements and JDBC calls must be coded by the programmer. The ad- vantage of this kind of persisting entity beans is the full control over all actions pertaining the database, allowing an access optimisation [1]. Message driven beans: Message deliveries in contrast to method invocations are asyn- chronous. Therefore an availability of the bean at the time of an occurring message can not be 25
  38. 38. Methods 3.3 Application server deployment (Middle-Tier) assumed [1]. Hence this kind of beans must be driven by an asynchronous message receipt to send information to the EJBs’ business logic. Figure 3.6: EJB Model [1] 3.3.2 The Java 2 Enterprise Edition (J2EE) server Every application server which wants to meet the EJB technology must confirm to the J2EE container specifications. But most application servers support also a variety of other technolo- gies which sometimes may cause a loss of the portability of J2EE applications across different vendors [16],[18]. Some of the services provided by J2EE servers are outlined below: • EJB: allows the user to call remote methods supported by the EJB technology via TCP/IP. • HTTP (Hyper text transfer protocol): supports the accession of Java Server Pages (JSPs) and servlets via a web browser. • Authentication: increases security issues concerning the user loggings. 26
  39. 39. Methods 3.3 Application server deployment (Middle-Tier) • JNDI (Java Naming and Directory Interface): enables the location of programs and ser- vices within the J2EE platform. 3.3.3 Java Cryptography Extension (JCE) JCE is a non-commercial cryptography extension for Java containing a package which provides implementations for encryption, key generation and key agreement, and Message Authentica- tion Code (MAC) algorithms [19]. It is an extremely valuable encipher tool for information of a high security level. Restrictions to applets or application may be specified in certain ’jurisdic- tion policy files’ dependent on their different jurisdiction context (location). Some features of interest are listed below [19]: • JCE is a pure java implementation • Implementations and interfaces of ciphers, key agreements, MACs, etc. • Support for the following algorithms by the SunJCE provider: – DES – DESede – Blowfish – PBEWithMD5AndDES – Diffie-Hellman key agreement among multiple parties – HmacMD5 – HmacSHA1 3.3.4 Extensible Markup Language (XML) XML was developed by the W3C between 1996 and 1998 to provide a universal format for describing structured documents and data; in other words, it allows data to be self-describing [12]. XML tries to bring information into a clearly arranged text form storable in flat files. 27
  40. 40. Methods 3.4 Web application development (WEB-Tier) Analogical to HTML (Hyper Text Markup Language) XML uses tags which may be defined in DTDs (Document Type Definition) by the programmer arbitrarily [28]. DTDs describe in a formal way which names are to be used for the different types of tag elements, where they may occur, and how they all fit together. A well-defined tree structure makes it easy to parse XML files to extract information. Different from HTML which is used to define the display of web pages, XML’s is applied to store and transmit information whereby it is often used to save configurations. The most important reason of using XML files is their quality of storing information outside of an program application. Hard coded (binary) constants may be separated into XML files which are easily modified without changing the applications source code. 3.3.5 JDOM (Java Document Object Model) JDOM is an open source library for Java-optimised XML data manipulation similar to DOM (Document Object Model) but not build on it. The DOM model tries to represent documents as a hierarchy of Node objects which may have child Nodes of various types. JDOM supplies methods for an easy and efficient reading of XML files and is not an XML parser, but rather a document object model that uses XML parsers to build documents whereby nearly any parser may be used [8]. 3.4 Web application development (WEB-Tier) 3.4.1 Java Server Page (JSP) Java Server Page technology has been developed to provide an easy and intuitive way in creating dynamical generated HTML pages. JSPs are like HTML pages but in addition to the static HTML tags JSPs may contain Java code and specific tags, which account for the dynamic generated part. By carrying the extension *.jsp the web engine of a JSP supporting web server compiles the JSPs to servlets which are launched then in the web container to perform their demanded tasks. Servlets running on the web server are similar to applets which are running 28
  41. 41. Methods 3.4 Web application development (WEB-Tier) in a web browser. Servlets may invoke Enterprise Java Beans or create a direct connection to a database and when they have finished their work, they send back dynamically generated HTML pages which are displayed then by the client browser [17]. One of the most common used architectures is depicted in Fig.3.7. Figure 3.7: JSP MVC model [23] 3.4.2 JSP Tag libraries JSP is a solution for creating and assembling together dynamical websites and it applies an interconnection of various programming languages to control the entire web application. Hence, JSP technology gains more and more complexity and becomes non transparent for a multitude of web designers. The HTML sites mixed with a vast code of pure Java are very confusing and programmers may quickly lose the plot. Therefore JSP developers came up with the idea to create special tags for the code sections written in Java, by replacing pure Java with tags having the same functionality. The intension was to create tags as they were already used for HTML or XML, to perform a certain consistence within the static HTML code. These tags should enable 29
  42. 42. Methods 3.4 Web application development (WEB-Tier) an easy accession and usage of Java extensions for non Java programmers and web designers [5]. A Tag is represented by a well-defined syntax which exactly constitutes where it starts and where it ends. Tags are enclosed by angled brackets which may bear attributes defining the tag’s behaviours and it may embed information or further hierarchically arranged tags (in case of XML files) in its body. An example is given below: < tag1 attribute1=’value1’ attribute2=’value2’ > The tag‘s body ... < / tag1 > Tags are used to store information in text oriented flat files and due to their well-structured form they are easily parsed in turn. It depends on the application how to handle and translate the parsed file’s information. So if the application is a web browser, the information file will be interpreted in a graphical way displayed on a screen. In case of a JSP the engine of the web server processes the *.jsp file and generates a servlet by using the JSP tags. Every JSP tag possesses its own special Java tag class which defines the tag’s behaviours and contains the pertaining Java code which was previously defined in the JSP. The attributes the tags possibly have, are associated with the tags’ class initialising setter and getter methods. These tag classes are used by the web engine to compile the servlet class which processes the associated tasks and returns the result to the client’s web browser. A tag library now stores packages of different tag classes in a clearly arranged way hidden for the JSP developer who merely sees the JSP tags he is using [5]. Some of the major advantages of JSP tags: • The average web designer can now maintain JSP sites • Tags are portable within different web applications • Tags speed up web developments by reusing Java code • Tags are easily extendible by additional functionality 30
  43. 43. Methods 3.4 Web application development (WEB-Tier) 3.4.3 The Jakarta Struts Framework Struts is the first open source framework supporting web design practises along with the thought of the JSP Custom Tag technology. It was developed by Craig R. McClanahan who freely offered the source to the Apache Software Foundation [5]. Struts implements the previous mentioned Model 2 JSP web application architecture (Fig.3.7) that uses a servlet as Controller for dispatching the incoming requests, a Java Bean representing the Model part which stores the data for the request, and a JSP visualising the data to the user accounts for the View part. Hence Struts represents the perfect decoupling of business logic, application control and presentation [5] [24]. Other benefits to Struts are: Extensive JSP Custom Tag libraries: which reduce the major part of Java scriptlet code from the Java Server Pages and enable its reusability. A generally reduce of code for web applications: by supplying a well tested base of software technology. A support of internationalisation for web applications: the web sites are dynamically updated with the appropriate language of the operating system. 3.4.4 Struts Application Flow The following characterisation of how Struts is handling a user request refers to Fig.3.8 depicted below: 1. The first step to invoke a Struts web application is to open an appropriate web site which sends a request to the Action Servlet controller by triggering a submit action (e.g. a button or invoking a site). 2. Receiving the request the Action Servlet checks the Action Mappings and instantiates an Action Form Bean which pertain to the invoking HTML form, to store then the form fields’ information. The Action Form Bean bears a validation method which cancels the user’s request if wrong parameters have been entered. In this case the Action Servlet sends back the previous invoked JSP and indicates the occurred error messages. 31
  44. 44. Methods 3.4 Web application development (WEB-Tier) Figure 3.8: Process flow for displaying a JSP page within a Struts Project [5] 3. If no insertion faults occur the Action Servlet calls the suitable Action Bean dependent on the Action Mapping’s information. 4. This Action Bean may invoke the associated Action Form Bean’s methods to gather its information in order to start data transactions with EJBs or databases directly. 5. Start of the data transaction. 6. After the data transactions are done the Action Bean may invoke the Action Form Bean to store back new data again. 7. By having done this, the Action Bean calls a forward method which accesses the Action Mappings again whereby these mappings now indicate the JSP page which should be displayed next. Every forward requests a different or even the same JSP page depending on the Action Bean’s state. 8. The Action Servlet sends a request the JSP claimed by the action forward and in case of its first invocation, the web container will compile the JSP into a servlet class. 32
  45. 45. Methods 3.4 Web application development (WEB-Tier) 9. The JSP servlet calls and includes demanded tag libraries in its method, processing and generating dynamically the HTML code by including the Action Form Bean’s data and the JSP’s own HTML code. 10. The following response called by the JSP servlet returns the generated HTML code to the Action Servlet. 11. The Action Servlet in turn induces its own response and delivers the HTML code back to the browser. The browser parses the HTML code and visualises the web site for the user. 33
  46. 46. Chapter 4 Results This chapter will present the developed data model for the Tumoral Micro Environment database (TME.db) for storing the immunological data, which was obtained from different cancer pa- tients and controls. Further the functionality of the developed web application will be shown by giving some maintaining and querying examples. Finally some cluster experiments, performed with re-obtained and particular aligned FACS and patient data, will be shown. 4.1 The Database Model The first part for storing all the arising immunological data as well as the donor specific infor- mation was to develop an appropriate, flexible and easily maintainable data model. The data tables should be broken up into smallest possible units to ensure best flexibility among different data tables. To realise an adequate model a relational database management system (Oracle) was chosen for gathering the data. Oracle was the best choice because it has already been used by the bioinformatics work group for several database projects like GOLD.db or MARS.db. All considerations due to the table design were accomplished in regard to real world’s facts in order to create an intuitive data model. 34
  47. 47. Results 4.1 The Database Model 4.1.1 Patient and Experiment Table The core of the data model is embodied by the Patients and the Experiments table represented by the red tables (see appendix A). The patient table contains specific personal information about the donors of the sample materials whereby all the data is encrypted with a special algo- rithm supported by JCE. The patient table is linked to the Hospitals table within a many to one relation as well as one patient may have links to many experiments stored in the experiment table. The Experiments table comprises all the information related to the experiment , and by hav- ing an one to many relation to all the possible experiment tables, it links the database’s entire available information (Therapies → orange section, Cancers → blue section, Biolmarkers → yellow section, Proliferations → pink section, Sampletreatments → blue section, FACSLympho- cytes(Phenotypes) → light green section, Testmaterial → grey section, Cytokineexperiments → green section). 4.1.2 User Management Tables The light orange tables depicted in appendix A store the user related information. The centre of this section is represented by the PatientDBUsers table which stores the encrypted (by JCE) user related information. Each user may save some specific query options which are stored in the SavedQueryOptions table. A many to many relation between the PatientDBUsers table and the PatientDBUserRoles table, established by the UsersUserRoles table, enables multiple user roles for one single user. The same relation construction is created with the UsersHospitals to add to one user a variety of hospitals, which in turn are again linked with the patients table. 4.1.3 Experiment Related Tables All the following explanations refer to appendix A whereby the description starts with the ther- apies table and swift through the model counter clockwise. 35
  48. 48. Results 4.1 The Database Model The orange section stores possible therapeutic patient treatments like chemo therapies, x- ray treatments etc. The Therapies table contains all necessary ids and links (many to one) to the Treatments table which stores the actual therapy name. The blue section contains a collection of different cancer related tables like primitive cancer, cancer type, cancer subtype, tumoral liquid, as well as different cancer stages. The Cancer table stores solely the FKs (foreign keys) which link to the associated tables like PrimitiveCancers, CancerType, CancerSubType, CancerTumoralLiquid and CancerStages. The cancer stage table in turn contains again a tuple of FK which link to four different cancer stages; PStage, TStage, NStage and MStage (particulars to these will be given in a later part of this chapter). The yellow section refers to certain patient stimulations with different biological markers which may induce ascertained health effects. The BiolFactors table contains the different stim- ulation values and links with two FKs to the BiolMarkers table storing the different markers, and to the TestType table which stores the test types of used stimulations. The TestTypeBiol- markers table enables a many to many relation between the latter two tables which defines the affiliation of the test types to the biological markers. The pink section pertains to the proliferation assays whereby each experiment may have many proliferation experiments. To store and access the proliferation experiments’ information more flexible, the data is split into a table containing particular experiment information about the experiment’s handling and a FACS data specific table containing all the data processed by the FACS analyses. As explained the Proliferations table stores FACS experiment specific data and links to the FACSCellProliferations table that stores the percentage of cell expressions and pertaining MFI values. This table as well contains two FKs linked to the Activations and the Stimulations table which save possible activations and stimulations for the proliferating cells. The ActivationsStimulations table establishes again a many to many relation of these latter tables and defines the stimulations which may account to appointed activations. At last the stimulation table relates to the StimulationRanges table which contains particular stimulation 36
  49. 49. Results 4.1 The Database Model ranges comprising a min and max value for each stimulation. This is an important feature for later queries to map the values to defined ranges, claimed by the Genesis software for analysing the data. The blue section is a special one which stores all the possible pre-treatments of one particu- lar test material (e.g. blood or pleural liquid) for one single experiment. There are a lot of many to many relations between these tables because a multiple performance of all different treatment should be enabled. All these different treatment tables relate to the one SampleTreatments ta- ble containing particular material treatments, which have no multiple occurrences. The other tables store e.g. information about RNA extractions which may be performed with different RNA-kits on even one sample material gathering RNA for microarray assays, or information about different stimulations of the material before ficoll etc. The light green section relates to the phenotype analyses and uses a similar data model as for the proliferation experiments. In this case the FACSLymphocytes table again stores experiment specific information and links to the FACSLymphocyteGates containing all the possible gates and the FACSLymphocyteTypes containing all possibly occurring phenotypes. The FACSLymphocyteGatesTypes table characterises again a many to many interconnection of the phenotypes pertaining to one single gate and the AntigenRanges table define particular ranges of the expression of certain surface antigens and their MFI values. This form table model was chosen to ensure the possibility of a dynamically update of gates and antigens at every time to enlarge the storable data set (the same was applied for the proliferation and cytokine experi- ments). The grey section (table) defines all possible sample materials and was separated from the experiment table to enable a later update of additional arising materials. The green section describes the cytokine experiment table relations with exact the same data model as used for the phenotypes. These similarities in the data models allow a faster develop- 37
  50. 50. Results 4.2 The Web Application ment of the accessing and querying software by enabling the use of equal code fragments for all these demanding accessions. The two remaining tables in the right upper corner have no linkage at all. One of these stores constant values important for queries and the other one stores the number of primary keys already given to certain updatable tables. For each new insertion into one of these tables the number of its pertaining given PK is increased by one to ensure the integrity rule of key constrains (see section for integrity rules). 4.1.4 Application server connection To get the data available for Java [8] the open source technology of the JBoss ( application server was used which supports the J2EE technology. Hence the EJB technology could be used to map entity relations to the database’s tables. Nearly all tables were mapped by entity beans to ensure an easy access to the data, but in some cases there was no necessity of this because no frequently maintaining was claimed. Most of the data manipulations are done between session and entity beans, but all the queries are done by session beans directly by using JDBC connections to the database because of its swiftness. The EJBs contain all the business logic to access the database and hide all the data gathering manipulations from the web server side which only sees the methods which may invoke them. 4.2 The Web Application The developed web application is based on the Jakarta Struts Framework and uses JSPs to gen- erate the users view in the web browser. For the web server Tomcat4.0.6 is used which is open source technology supported by the Jakarta project as well. The Struts application was deployed to the web server’s web container and can be accessed by the address 38
  51. 51. Results 4.2 The Web Application 4.2.1 The TME.db Web Page If one has appropriate access admissions (password, username) he may log into the TME.db and the Web Page will be populated with its permitted tool buttons due to his given user roles. All the following explanations refer to an administrator account with all possible permissions. The displayed web page (Fig.4.1) shows a toolbar on its top and bottom with the possibilities of adding or listing patients and different database query options. Figure 4.1: TME Web Page Overview On the left side a panel is shown which enables certain search options for particular patient IDs or FACSIDs of different experiments. The page’s centre shows an additional link to the 39
  52. 52. Results 4.2 The Web Application administration page which allows to add/edit user as well as hospital entries. There is also the possibility to add/edit Gates, Antigens, Activations and Stimulations for the different FACS analyses. 4.2.2 User Management To add a new user all the required fields have to be populated (Fig.4.2). Particular user roles define the new user’s permissions whereby a user may have multiple user roles. These user roles are important in respect to different users which may not see the entire available information (e.g. Clinicians may not see immunological data from the FACS experiments). A ’select hospitals’ field restricts the user’s view to patients pertaining to the selected hospitals. Hence the user may only see patients of specified hospitals. The ’period of validity’ field restricts the user account to a certain expiration date. A hierarchical listing of Figure 4.2: Add User Panel available user roles is given below: • GUEST: role is assigned for default without any permissions. • VIEW HOSPITAL INFO: permits the clinicians an insight to particular patient infor- mation like cancer, given biological markers or the patient’s treatments. • EDIT HOSPITAL INFO: includes the previous roles and allows the clinicians to mod- ify the patient information (add/edit). • DELETE HOSPITAL INFO: includes the previous roles and allows a deletion of pa- tients as well as patient information. 40
  53. 53. Results 4.2 The Web Application • VIEW IMMUNOLOGICAL INFO: includes previous VIEW roles and allows the im- munologists to view the immunological experiments (phenotype, cytokine and prolifera- tion analyses). • EDIT IMMUNOLOGICAL INFO: includes previous VIEW and EDIT roles permis- sions with additional modifying rights of immunological data. • DELETE IMMUNOLOGICAL INFO: includes all previous permissions and the dele- tion rights of immunological information. • ADMIN: includes all roles and the right of maintaining users, hospitals and specific per- missions already mentioned in section 4.2.1. 4.2.3 The Patient Overview Figure 4.3: Patient Overview 41
  54. 54. Results 4.2 The Web Application The Patient Overview joins together the entire given patient information (Fig.4.3). The top of the page contains the patient’s particulars and if appropriate permissions are given (section 4.2.2) the patient information may be changed or deleted. Every patient may have many exper- iments whereby each of them is displayed in a separately tabbed panel. The header of the experiment panels shows the experiment’s specific information as well as options to alter and delete it. There is also a link to the sample treatments page which contains information about certain performed treatments of the experiment’s sample material (Fig.4.4). Below the description header the experiment specific performed assays are shown, each pictured with a particular icon (Cancer Information, Biological Markers, Treatments, FACS Phenotype assays, FACS Cytokine Secretion assays, FACS Proliferation assays). Cancer Information This panel provides specific cancer information about the experiment’s sample material. • Primitive Cancer: describes an umbrella term of a certain kind of cancer like lung, colon, breast etc. • Cancer Type: characterises a specific occurrence of certain forms of cancer. • Cancer Sub Type: represents a particular sub specification of a cancer type. • Tumoral Liquid: bears important information dependent on having tumoral cells in it or not. • P-Stage: constitutes if the detected tumour is malignant or not [26]. • T-Stage: distinguishes the tumoral stage into different states [26] – A Tx tumour has a proven existence but cannot be assessed. – A T1 cancer is less than 3cm in size and completely surrounded by lung tissue. – A T2 cancer is larger then 3cm without invading structures in the middle of the chest. – A T3 cancer is of any size invading chest structures and it is still savely resectable. 42
  55. 55. Results 4.2 The Web Application – A T4 is a tumour of any size invading vital structures and is unresectable. • N-Stage: refers to the involvement of cancer into lymph nodes and is distinguished into different stages [26]: – N0 refers to the absence of any lymph node involvement. – N1 refers to the presence of cancer in the hilar lymph nodes. – N2 refers to an involvement of the mediastinal lymph nodes on the cancer side. – N3 cancers involve the lymph nodes on the other side of the chest, or in the supra- clavicular area. • M-Stage: is used to define the presence of metastasis [26]: – M0 implies the absence of any evidence of cancer spread to other organs. – M1 implies cancer spread to any organ. The M-Stages may be subdivided into more precise stages. To all these cancer information values were given to score their occurrence in queries later on. Biological Markers Biological markers are given to organisms to reveal physiological and biochemical responses provoked by them. To identify potential markers several factors like the correlation between response and effect, the feasibility of determining them or the specificity of the response need to be investigated. In this project the availability of biological marker types like IHC or ELISA, used on patients, should be taken into consideration by scoring them with specific values. These kinds of markers may reveal certain kinds of cancers or their occurrence. But for the moment there are no markers available at all and still remain in prospective. Treatments The treatments refer to certain therapies which were applied to the patient like chemo therapies or radiological treatments. These patient treatments serve as additional information which may 43
  56. 56. Results 4.2 The Web Application be of interest for querying the database. So far all patients were analysed before chemotherapies or radiotherapies have been applied. FACS Phenotype Assays FACS Phenotype Assays try to examine different cell populations within a compound of various cells (see section 3.1.3). The following example will give an explanation of how the phenotype data is achieved from the FACS and how it is stored in the database. When the FACS analyses of all prepared tubes have finished the MFI values and the per- centage of the different labelled surface markers are stored into a special file format. Then all the data of interest are extracted into a text file which may be uploaded to the database in turn. During the upload process the text file is parsed and potential errors in its content are logged and displayed afterwards. A screen shot of one particular phenotype analysis is given in Fig.4.4 (right side). It contains all determined data pertaining to one single FACS experiment. The tabs’ label (e.g. CD3, nonT(cells), non-Tum etc.) indicates the specific gate in which the different stained antigens were detected. Figure 4.4: Phenotype Assay Overview with scatter plot. On the left side of Fig.4.4 one single phenotype scatter plot is shown which was done to ex- amine the CD127+ cells within the CD3 gate (that means the percentage of all T cells bearing CD127 markers is determined). Below the plot the list of the particular determined percentage 44
  57. 57. Results 4.2 The Web Application and MFI values is given which directly result form the FACS analysis. The first column defines the four quadrants of the scatter plot, the second refers to the percentage of gated cells (in this case the ’% Gated’ refers to all cells which are alive), the third (% Total) displays the percentage of cells in a quadrant as regard to all detected cells (dead cells included) and the X, Y Geo Mean columns indicate the logarithmic scaled MFI of the stained CD markers in each quadrant (in this case CD127 refers to the X Mean and CD3 to the Y Mean). As depicted in Fig.4.4 the CD3 gate surrounds the both upper quadrants, hence the percent- age of all T cells which are CD127+ is calculated as following: U R of % Gated · 100 20, 58 · 100 CD3 CD127 % = = = 72, 41% U R of % gated + U L of % Gated 20, 58 + 7, 84 The MFI value of the T cells which are CD127+ is the X Geo Mean of the upper right quadrant (Fig.4.4). CD3 CD127 M F I = 530, 08 This tuple of phenotype parameters is determined for each antigen within each of the given gates whereby each gate represents a particular phenotype. FACS Proliferation Assays FACS Proliferation Assays are performed in order to observe cell proliferation under certain stimulation conditions (see section 3.1.4). The following example will give an explanation of how the proliferation data is determined from a FACS analysis. The proliferation FACS analysis follows the same data flow as already mentioned for the phenotype analysis the only difference is that there may be more than just one analysis for one experiment (e.g analyses on day 6, 7, etc. Fig.4.3). The FACS data is stored into an spread sheet and after performing some modifications the data is uploaded into the database. The following example will give explanation of how the data is modified before the upload process 45
  58. 58. Results 4.2 The Web Application is accomplished. Figure 4.5: Proliferation Assay Overview with: FSC-SSC-plot, scatter plot and histogram plot. Figure 4.5 shows a particular proliferation data set with a FSC(Forward Scatter)-SSC(Side Scatter) plot (see Fig.3.1), a proliferation scatter plot and a histogram plot of the previous scatter plot. Below these three plots a screen shot of the web site displaying the already modified and stored proliferation data. Both the FSC and the SSC are determined by the FACS, whereby the FSC indicates information related to the cell surface by diffracting the FACS’s laser light, and the SSC reveals the cell’s intracellular granularity by reflecting the light. Hence the FSC-SSC plot gives information about the cells’ structure through which the cells are characterised into lymphocytes, monocytes, neutrophiles etc. The FACS software enables the possibility of setting gates (regions) on the plots to frame certain populations of cells. Below the FSC-SSC plot four gated regions and their appropriate percentage of cells are listed. The histogram plot shows the counted cells with their appropriate CFSE fluorescent intensity distribution. 46