
CLASSIFICATION OF BRAIN MRI SERIES BY USING DECISION TREE LEARNING (1)

Yong Uk Kim, Juntae Kim, Ky Hyun Um, Hyung Je Jo
Dept. of Computer Engineering, Dongguk University
yukim@dgu.ac.kr, jkim@dgu.ac.kr, khum@dgu.ac.kr, chohj@dgu.ac.kr

(1) This research has been funded by the Korea Science and Engineering Foundation.

Abstract

In this paper we present a system that classifies brain MRI series by using decision tree learning. Two kinds of information can be obtained from MRI. One is a set of low-level features that can be obtained directly from the original image, such as sizes, colors, textures and contours. The other is a set of high-level features that are derived by interpreting the segmented images. To classify images based on their semantic contents, learning and classification should be performed on high-level features. The proposed system first classifies the image segments by using low-level features. The high-level features are then synthesized, and the whole MRI series is classified by using those features. Experiments have been performed to classify brain MRI series into normal brains, cerebral infarctions and brain tumors, and the results are discussed.

Key Words

Image Retrieval, Classification, Learning, Decision Tree

1. Introduction

Due to advances in computer and communication technologies, many medical information systems such as HIS (hospital information system), RIS (radiology information system), and PACS (picture archiving and communication system) have been studied and developed. These systems have been very helpful in managing clinical documents and medical images, and in sharing them through a local network or the internet. However, since computerized tomography (CT) and magnetic resonance imaging (MRI) images are large, retrieval time becomes critical as the number of images increases rapidly. As the amount of data increases, more efficient and intelligent retrieval systems become necessary [3][11][12]. Furthermore, classification or retrieval should be performed on the entire series of images (images of a patient that are photographed multiple times at a regular interval), not on a single image. This is because a diagnosis is made by looking at the entire image series, not at any one image in it.

Conventional image retrieval systems can be classified into annotation-based retrieval systems [1] and content-based retrieval systems [4][5][7]. In annotation-based retrieval systems, the opinions of experts are attached to each image and are used for retrieval. These systems can achieve relatively high accuracy thanks to the annotations, but providing annotations requires considerable time and human intervention. In content-based retrieval systems, such textual information is not used; the system interprets images by analyzing their features. However, these systems usually cannot achieve high accuracy, because it is generally difficult to interpret the semantic contents of an image by using low-level information such as color, texture, and shape. It has therefore been suggested that a content-based image retrieval system should extract high-level information, such as spatial or logical relationships, and take advantage of it [2][3][10].

In this paper, we propose a content-based image classification method based on decision tree learning [6][9][13] to achieve high accuracy in retrieval of brain MRI series. The proposed system classifies an MRI series as a normal, cerebral infarction, or brain tumor case. The decision tree learning is performed at two separate levels. At the first level, segmented images are classified by using the low-level features. At the second level, entire MRI series are classified by using the high-level features synthesized from the low-level classification results.

2. Backgrounds

This chapter presents the characteristics of medical images, especially those of the magnetic resonance image (MRI), and introduces related research.

2.1 Characteristics of medical images

Medical images are effective sources of information for diagnosing a disease and determining its location, size, and type. Various types of medical imaging are used, including X-ray, computerized tomography (CT) and magnetic resonance imaging (MRI).
MR images are generally gray-scale, and their texture characteristics are not easily noticeable. In addition, different parts of the brain, such as the cerebrum, midbrain, cerebellum and pituitary, have unclear boundaries. The structural shapes and the relationships between parts are complicated, and the differences between the values of various features are small. MR images also vary in their acquisition parameters: spatial resolution, contrast resolution, filming angles, etc. Figure 1 shows examples of brain MRI series [14][15].

[Figure 1: image panels for Patient 1, Patient 2, ..., Patient n]

Figure 1. Examples of brain MRI series

2.2 Retrieval of medical images

To classify and retrieve images based on their contents, an image retrieval system utilizes various kinds of information from the images. Some systems use low-level information, and others use high-level information.

2.2.1 Use of Low-level Information

Low-level information consists of the primitive or fundamental features obtained directly from an image, such as color, texture, shape, etc. It does not represent the semantic contents of an image. Such low-level information is used in the conventional method of CBIR (content-based image retrieval). CBIR is a method to automatically classify and retrieve images based on the surface characteristics extracted from the image itself. CBIR has the advantage that it makes it possible to build an automatic system that does not need human experts. However, since it ignores the semantic contents of images, it is difficult to retrieve images that have the same semantic contents but different shapes or colors.

Systems that use content-based image retrieval include QBIC, VIR, Visual Retrieval Ware, etc. QBIC (query by image content) is an image retrieval engine developed by IBM [4]. In QBIC, a user can retrieve an image by means of the image texture expressed as color ratio, distribution, location and graphics.

2.2.2 Use of High-level Information

High-level information is the logical relationship between images or the semantics shown by an image series, such as the distance between image objects, the direction of each object in the entire image, the extent to which objects overlap, the location of objects, etc.

Recently there have been several studies that attempt to extract high-level information semi-automatically or automatically. The KMeD (Knowledge-based Multimedia Medical Distributed Database) system is one such example [3]. KMeD uses images and text to query a medical multimedia database. For image retrieval it uses high-level information such as contour, area, circumference ratio, shape, direction of object pairs, etc. It uses the instance-based MDISC algorithm for classification.

3. Classification of MRI Series

This chapter presents the methods of extracting low-level and high-level information, and the two-level learning and classification algorithm.

3.1 Decision Tree

A decision tree is used to classify data based on selected features [6][9][13]. In learning, a tree is generated from training examples by a divide-and-conquer method. The order in which features are chosen is determined by using the concept of entropy. The entropy of a data set S is high if the data are evenly distributed over the target classes. The decision tree learning algorithm computes the information gain for a feature A, which is the expected reduction in entropy when A is chosen to classify the data at the present state. The formulas for the entropy and the gain are as follows, where c is the number of target classes, p_i is the proportion of S belonging to class i, and S_v is the subset of S for which feature A has value v:

$$\mathrm{Entropy}(S) \equiv \sum_{i=1}^{c} -p_i \log_2 p_i \qquad (1)$$

$$\mathrm{Gain}(S, A) \equiv \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\, \mathrm{Entropy}(S_v) \qquad (2)$$

Decision tree learning is usually robust against noise, and the result can easily be converted into rules. In this paper, we used the Weka library, in which the C4.5 decision tree algorithm is implemented in Java [13].

3.2 Separation of learning and classification level

The proposed method performs learning and classification at two separate levels: object learning/classification and image series learning/classification. Each level extracts its own content-based features (low-level or high-level) and applies decision tree learning separately. Two-level learning is used because it can extract high-level features more effectively. The logical high-level features are synthesized based on the
semantic interpretation of the segmented images. Figure 2 shows a diagram of the two-level process. Images are preprocessed and segmented into several objects. The object classification rules are learned from training data (manually classified segments) by using the low-level features. The image series classification rules are learned from a set of classified MRI data by using the high-level features. When a new MRI series is given, each image is first segmented into several objects. The object classification rules are then applied to classify them, and the high-level features for the entire series are generated from the results. The MRI series, represented as a set of high-level features, is then classified by using the image series classification rules.

[Figure 2: flow diagram. Low-level: the medical image is segmented, low-level features are extracted, object learning on the segment training data produces the object classification rules, and objects are classified. High-level: high-level features are extracted, image series learning on the image series training data produces the image series classification rules, and the image series is classified into a result class.]

Figure 2. Two level learning and classification process

3.3 Learning object classification rules

The purpose of object learning is to generate rules for the anatomic classification of image segments. Decision tree learning is performed on training data represented by low-level features. For each object, contour length, brightness, area, center, extrusion, roundness and MBR (minimum bounding rectangle) are used as low-level features, as shown in Table 1. The equations for computing extrusion and roundness are as follows, where contour(x_i) is the i-th of the n contour points of the object.

$$\mathrm{Average} = \frac{\sum_{i=1}^{n} \mathrm{Distance}(\mathrm{center}, \mathrm{contour}(x_i))}{n} \qquad (3)$$

$$\mathrm{Extrusive} = \sum_{i=1}^{n} \left(\mathrm{Average} - \mathrm{Distance}(\mathrm{center}, \mathrm{contour}(x_i))\right)^2 \qquad (4)$$

$$\mathrm{Innercircle} = \min_{1 \le i \le n} \left(\pi \times \mathrm{Distance}(\mathrm{center}, \mathrm{contour}(x_i))^2\right) \qquad (5)$$

$$\mathrm{Outercircle} = \max_{1 \le i \le n} \left(\pi \times \mathrm{Distance}(\mathrm{center}, \mathrm{contour}(x_i))^2\right) \qquad (6)$$

$$\mathrm{Roundness} = \frac{\mathrm{Innercircle}}{\mathrm{Outercircle}} \qquad (7)$$

The result of learning is a decision tree that can be represented as rules. Figure 3 shows an example of segmented image objects and the learned object classification rules.

Feature Name                          Content
ID                                    Image Id of patient
OID                                   Object Id within the image
Bright                                Color histogram
Area                                  Area of each object
Extrusive                             Extrusion of object
Round                                 Roundness of object
Center_X, Center_Y                    Center of object
MBR_ULX, MBR_ULY, MBR_DRX, MBR_DRY    Minimum bounding rectangle of object

Table 1. Low-level features used in learning.

[Figure 3: segmented image with objects labelled White matter, Gray matter and Unknown Object]

Figure 3. Examples of segmented objects and object classification rules

3.4 Learning image series classification rules

The purpose of image series learning is to generate the rules used to classify entire image series. The learning is based on the high-level features, which are generated from the low-level classification results. The generation of high-level features of an image series consists of two phases. The first phase is to compute logical features by using the direction and location information of the classified objects. The second phase
is to compute other features that can be obtained directly from the entire images.

The high-level features for image series are shown in Table 2. They are the patient information, the existence of cerebrospinal fluid (CSF), the distance between the center of the brain and the center of the UO (unknown object), the direction of the UO, the closeness between the UO and the cerebrospinal fluid, the brightness and area ratios between objects, etc. The other features for the entire image series are computed by averaging or summing the feature values of each image. The vertical object locations are also computed.

Feature                            Feature Name        Content
Information of patient             Age                 Age of patient
                                   Sex                 Sex of patient
Existence of object                ExistOfCsf          Whether CSF exists
                                   ExistOfUO           Whether UO exists
Ratio of area between objects      AreaRatio_UO        UO area / sum of areas
                                   AreaRatio_W_G       White matter / gray matter
Ratio of brightness between        BrightRatio_UO_W    UO / white matter brightness
objects                            BrightRatio_UO_G    UO / gray matter brightness
Direction                          UO_Direction        Direction of UO
Spatial relationship               SpatialRel_CSF_UO   Whether CSF and UO are joined
Distance                           Distance_UO_C       Distance between UO and brain center
Series                             Total_Area_UO       Sum of UO area over the image series
                                   Total_Area_CSF      Sum of CSF area over the image series
                                   Total_Area_White    Sum of white matter area over the image series
                                   Total_Area_Gray     Sum of gray matter area over the image series
Vertical position of objects       Vertical_CSF        Vertical position of CSF in the image series
                                   Vertical_UO         Vertical position of UO in the image series

Table 2. High-level features used in learning

The brightness ratios between objects are used as features because the absolute brightness value depends on the imaging device. The direction information gives the direction of an object from the center of the head. As Figure 4-(a) shows, the brain is divided into 8 directions from the center, and the direction of an object is determined by examining which of the 8 directions the center of the object belongs to. These 8 directions indicate the frontal lobe, the temporal lobes and the occipital lobe. The spatial relationship between the cerebrospinal fluid and the UO indicates whether or not the UO has infiltrated the cerebrospinal fluid, as in Figure 4-(b). The vertical position expresses the vertical location of an object within the entire image series as a ratio measured from the top. For the vertical position, the central cerebrospinal fluid and the UO are used to indicate where the possible disease area is located in three dimensions.

[Figure 4: (a) the eight directions N, NE, E, SE, S, SW, W, NW around the center of the head; (b) spatial relationship between CSF and UO]

Figure 4. Examples of (a) direction, (b) spatial relationship

The generated high-level features are applied to the learning of the image series classification rules. Figure 5 shows examples of image series and the learned classification rules. Each image series is assigned to one of three general categories: normal, infarct, and tumor.

[Figure 5: example image series for the Normal, Infarct and Tumor classes]

Figure 5. Examples of image series and image series classification rules

4. Experimental Results

We have implemented a prototype of the proposed system, and experiments have been performed by using a set of real brain MRI series collected from a local hospital. The dataset consists of 1400 MR images of 72 persons, of which 10 were normal, 33 were infarction, and 29 were tumor cases.
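The rates reported in this section can be reproduced from the raw counts. The following is a minimal Python sketch, not the authors' Weka-based implementation: it takes the image series counts given in Table 4 as a confusion matrix (rows are the true classes, columns the assigned classes) and computes the per-class rate that the tables label Precision, together with the overall accuracy. The names CLASSES, CONFUSION, per_class_rate and overall_accuracy are illustrative assumptions.

```python
# Counts taken from Table 4: rows are the true classes,
# columns are the classes assigned by the image series rules.
CLASSES = ["Normal", "Infarct", "Tumor"]
CONFUSION = [
    [9, 0, 1],    # true Normal
    [0, 31, 2],   # true Infarct
    [0, 2, 27],   # true Tumor
]


def per_class_rate(matrix):
    """Share of each row (true class) that falls on the diagonal."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


def overall_accuracy(matrix):
    """Correctly classified items divided by all items."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


if __name__ == "__main__":
    for name, rate in zip(CLASSES, per_class_rate(CONFUSION)):
        print(f"{name}: {rate:.1%}")
    print(f"Overall: {overall_accuracy(CONFUSION):.1%}")
```

Because each rate divides the diagonal count by its row total, it measures the share of each true class that was classified correctly, a quantity that other texts often call recall.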
Table 3 shows the results of object classification in terms of precision. The classification accuracy for GM (gray matter) and WM (white matter) is relatively high. About 17% of the CSF (cerebrospinal fluid) segments were misclassified as GM or UO, and 11% of the UO segments were misclassified as GM. This is because GM usually has feature values that lie between those of CSF and UO. The overall object classification accuracy was 97.9%.

                Classification Result
Data       CSF      GM      WM      UO    Incorrect    Precision
CSF         97      15       0       5        20         82.9%
GM          13    2980      23      16        52         98.3%
WM           0      28    1755       0        28         98.4%
UO           0       7       0      59         7         89.4%
Total                             4998       107         97.9%

Table 3. Results of objects classification

Table 4 shows the result of the image series classification. In the case of brain tumor, there were abnormal regions (classified as UO) that could be distinguished in most cases. In the case of brain infarction, there were cases with no abnormal region, but they could be classified based on other features such as the color ratio between objects and the shape of the cerebrospinal fluid. The overall classification accuracy was 93.1%.

                Classification Result
Data       Normal   Infarct   Tumor    Incorrect    Precision
Normal        9        0        1          1          90.0%
Infarct       0       31        2          2          93.9%
Tumor         0        2       27          2          93.1%
Total                          72          5          93.1%

Table 4. Result of image series classification

Experiments on classifying three different kinds of brain tumor were also performed, but the result showed only 63.3% accuracy because there was insufficient data for learning.

5. Conclusion

This paper presents a system for classification of brain MRI series. To learn the classification rules, the decision tree algorithm is used at two levels. At the low level, each image is segmented into smaller objects that are represented by low-level features, and the object classification rules are learned. At the high level, logical features such as the relationships between different objects are generated, and the image series classification rules are learned. Unlike conventional image retrieval systems, low-level features and high-level features are applied separately in classification in order to perform learning and classification effectively. The preliminary experiments with 72 cases showed 93.1% accuracy on MRI series classification.

Currently, our system classifies a single MRI series taken at one point in time. Extending the system to classify temporal sequences of MRI series taken at regular intervals is a possible direction for future research. Further study is also needed on selecting features and on introducing more complicated high-level features.

References

[1] Chang, N. S. and Fu, K. S., "Query-by pictorial example," IEEE Transactions on Software Engineering, SE-6(6):519-524, Nov. 1980.
[2] Chang, S. K. and Hsu, A., "Image information systems: Where do we go from here?," IEEE Transactions on Knowledge and Data Engineering, 4(5):431-442, 1992.
[3] Chu, W. W., Cardenas, A. F. and Taira, R. K., "KMeD: A knowledge-based multimedia medical distributed database system," Information Systems, 20(2):75-96, 1995.
[4] Flickner, M. et al, "Query by image Content: The QBIC System," IEEE Computer, Special issue on Content Based Retrieval, Vol. 28, No. 9, 1995.
[5] Gudivada, V. N. and Raghavan, J. V., "Special issue on content-based image retrieval systems," IEEE Computer Magazine, 28(9):18-62, 1995.
[6] Mitchell, T. M., Machine learning, McGraw-Hill, New York, NY, 1997.
[7] Niblack, W., Barber, R., Equitz, W., Flickner, M., Glasman, E., Petkovic, D., Yanker, P., Faloutsos, C. and Taubin, G., "Efficient and Effective Querying by Image Content," Journal of Intelligent Information Systems, 3:231-262, 1994.
[8] Ogle, V. E. and Stonebraker, M., "Chabot: Retrieval from a relational database of images," In Proceedings of International Conference on IEEE Computer, 28(9), September 1995.
[9] Quinlan, J. R., C4.5: Programs for machine learning, Morgan Kaufmann, 1993.
[10] Rui, Y., Huang, T. S. and Chang, S. F., "Image Retrieval: Current Techniques, Promising Directions and Open Issues," Journal of Visual Communication and Image Representation, 10:39-62, 1999.
[11] Sauer, F. and Kabuka, M., "Multimedia technology in the radiology department," Proceedings of the second ACM international conference on Multimedia, Pages 263-269, 1994.
[12] Sonka, M., Hlavac, V., Boyle, R., Image Processing Analysis and Machine Vision, PWS Publishing, 1999.
[13] Witten, I. H. and Frank, E., Data Mining: Practical Machine Learning Tools and Techniques with JAVA Implementations, Morgan Kaufmann, 2000.
[14] The Whole Brain Atlas, http://www.med.harvard.edu/AANLIB
[15] The Korean Neurosurgical Society, Neurosurgery, 2000.