Defense PowerPoint

Dissertation defense slides

  • Good afternoon, everybody, and welcome to my Ph.D. defense. In my dissertation, I proposed “A Generalized Multidimensional Index Structure for Multimedia Data to Support Content-Based Similarity Searches in a Collaborative Environment”. Yes, the title is a bit long, but this dissertation addresses some important issues in multimedia data management and I wanted to emphasize them. I have been working on this topic for the past 5.5 years with Dr. Shu-Ching Chen at the DMIS Lab.
  • Before starting my presentation, I wish to sincerely thank my committee members for agreeing to be a part of this.
  • I would also like to thank SCIS for its continuing, generous support, which enabled me to concentrate on my research without worrying about my financial security. The numerous awards I received from the department in recognition of my work also motivated and encouraged me. I also want to thank FIU for the DYF and for the travel grants I received to attend two conferences. Apart from that, I sincerely thank my lab-mates for making my workplace enjoyable and for always being willing to help me. And I definitely want to thank the SCIS staff, especially Olga, who made my life here at school a little easier.
  • Today’s presentation will follow this outline. I will go over the motivation of this research in detail, as I sincerely believe that it is the fundamental part of Ph.D. research. Unless you are clear about, and convinced of, why you want to work on something and why it matters to the wider scientific community, it is very difficult to remain focused for a long stretch of time. Thus a solid motivation is necessary to remain motivated! Then I will go over the contributions of this research. There are three major contributions: a generalized index structure specially designed to organize multimedia data, a query refinement framework, and a technique to visualize and analyze the semantic relationships of multimedia data in a collaborative environment. Along with the discussion of each contribution, I will go over the existing work in each field. This will be followed by a discussion of the limitations and assumptions of the proposed framework. I will finish the presentation with a brief discussion of the future direction of this research. So, first the motivation.
  • What is so special about multimedia data? Over the past couple of decades, it has gained immense popularity and has become the preferred medium of communication. Why? Consider this example. On the left is a handwritten recipe card for your grandma’s legendary cake, which has been passed down for generations. Now, you are a novice at cooking and all you have is this card. On the other hand, let’s imagine your grandma is pretty tech savvy and has convinced your grandpa to record all her specialty cooking with special instructions. She then shares those videos with all your family members (in an effort to keep up the cooking talent that she is convinced this family possesses!). Which would you prefer? O.K., for sentimental and keepsake reasons, the recipe card might be useful. But for someone who has never done any baking, it is pretty daunting. So, multimedia data is far more expressive, and thus more attractive, than traditional alphanumeric data.
  • But as with all things in this world, the special qualities of multimedia data come at a price. The three main characteristics that make it useful yet complicated are 1) the multidimensional representation, 2) the perception subjectivity and 3) the semantic gap. I will explain what each characteristic stands for in a bit. Overall, we can conclude that it is very different from traditional data.
  • A multimedia data object is made up of low-level features such as color, texture, sound, etc., which are what make multimedia data so attractive. A feature extraction technique extracts these low-level features and represents the multimedia data object as a multidimensional feature vector. For example, applying color feature extraction to an image in the HSV color space yields the following feature vector (made up of multiple dimensions). If projected into a multidimensional space, the multimedia data object is reduced to a point. Here, an example is shown with three dimensions, as anything greater than three is rather difficult to represent and comprehend. It is worth mentioning that it is this feature vector that is used to organize the data in the database.
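A minimal sketch of the idea of turning an image into a color feature vector in HSV space. The bin layout (one black bin, one white bin, a configurable number of hue bins) and the thresholds are illustrative assumptions, not the extraction method used in the dissertation.

```python
import colorsys


def hsv_color_histogram(pixels, n_hue_bins=6):
    """Return a normalised histogram [black, white, hue_0, ..., hue_n-1].

    pixels: iterable of (r, g, b) tuples with components in 0..255.
    The 0.2 thresholds and the bin count are hypothetical choices.
    """
    counts = [0] * (2 + n_hue_bins)
    total = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        if v < 0.2:            # very dark pixel: count as black
            counts[0] += 1
        elif s < 0.2:          # bright but unsaturated: count as white
            counts[1] += 1
        else:                  # otherwise quantise the hue
            hue_bin = int(h * n_hue_bins) % n_hue_bins
            counts[2 + hue_bin] += 1
        total += 1
    return [c / total for c in counts] if total else [0.0] * len(counts)


# Two red pixels, one black, one white.
vector = hsv_color_histogram([(255, 0, 0), (255, 0, 0), (0, 0, 0), (255, 255, 255)])
```

Each entry of `vector` is the fraction of pixels falling into one bin, so the image becomes a single point in an 8-dimensional space.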
  • A video is even more expressive than an image. Naturally, it carries more varied content and hence has a more complex representation. A video can be modeled as a hierarchical structure, with each video made up of a number of sequentially related shots and each shot made up of a number of sequentially related frames. Also, a video is represented as a multi-modal feature vector, as a single mode is unable to capture all its nuances. Thus, along with the color features (as used in the image representation), it has additional features such as audio features, shot-level features, etc. Any unit of the video can be stored separately with its feature representation in the multidimensional feature space, as described in the previous slide. After the click, the animation starts. Explain the shot-level multi-modal features.
  • The second important characteristic of multimedia data is perception subjectivity. A single multimedia data object can represent several concepts, and each user might have a different interpretation. Also, the same user might think differently at different points in time, based on circumstances or state of mind. Take these two images. Each can carry a different perception. The one on the left can communicate the idea of Togetherness, Baking, Family, Quality Time, etc. The same holds for the picture on the right. One person can think of it as Sunset, another as Dolphins, and a third might think of something completely abstract which I couldn’t comprehend!
  • A multimedia data object broadly contains two types of content: the low-level content (the features with which you represent it) and the high-level content (semantics/perception). Frequently, there remains a gap between these two, which makes their management a huge challenge. Basically, for storage purposes we use the low-level content (it is a quantitative, definitive measure), but during retrieval users are more interested in the high-level content (semantics). Thus, when the gap is large, such organization strategies fail miserably. For example, look at these two images. Both have similar low-level color content but semantically represent two hugely different concepts.
  • So, the next question is: are the existing DBMS frameworks able to handle these three atypical characteristics of multimedia data? Let’s take the example of a simple query that you ordinarily issue to a database: “Select…….” Now, if we want to use a relational database to organize multimedia data, the query issued to retrieve a particular image might look something like: “Selec……” Obviously, you can see that the existing query frameworks do not NATURALLY accommodate the requirements of multimedia retrieval. Of course, they can be made to store multimedia data in an ad hoc manner, as is done today, but that will definitely not meet the quality of service that is expected. Please note that I have not said “NOT POSSIBLE”, but rather “NOT SUITABLE”. Explain how multimedia data can be accommodated by the existing relational DBMS in an ad hoc manner.
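To illustrate what "ad hoc" means here, the sketch below emulates an "is-close-to" predicate in a relational database with a tolerance on each feature column. The table layout, column names, sample rows and the tolerance `eps` are all hypothetical; the point is only that the query has to spell out every feature dimension by hand and says nothing about semantics.

```python
import sqlite3

# Hypothetical table: one row per image, one column per color feature.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (name TEXT, red REAL, black REAL)")
conn.executemany(
    "INSERT INTO images VALUES (?, ?, ?)",
    [("sunset.jpg", 0.25, 0.35), ("forest.jpg", 0.02, 0.40), ("rose.jpg", 0.24, 0.10)],
)


def close_to_query(conn, red, black, eps):
    """Emulate 'is-close-to' with an absolute tolerance on each column."""
    rows = conn.execute(
        "SELECT name FROM images "
        "WHERE ABS(red - ?) <= ? AND ABS(black - ?) <= ? "
        "ORDER BY name",
        (red, eps, black, eps),
    ).fetchall()
    return [name for (name,) in rows]


matches = close_to_query(conn, red=0.245, black=0.356, eps=0.05)
```

With only two feature columns the query is manageable; with dozens of dimensions and a user-dependent notion of similarity, this style of query quickly becomes unsuitable.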
  • What’s missing from the existing database management frameworks? (i) Suitable data organization (index structure), (ii) suitable query handling (content-based retrieval) and (iii) suitable handling of the semantic information carried by the multimedia data. Here is an architecture of the traditional database management framework. It can be concluded that almost all the components need to be tailored to meet the requirements of multimedia data. In this dissertation, I mostly deal with the index structure, as it can be considered the most pivotal part of a successful DBMS framework.
  • So, let’s go into the details of the contributions. First, the index structure:
  • There are three major expectations of an index structure that is to manage multimedia data successfully. First: to provide a single seamless framework for different types of multimedia data. As you know, multimedia data can be of various types, such as images, videos and documents (even a web page), and each of them has a different representation and different retrieval requirements. Having separate index structures for individual data types is not practical, because whatever index structures you have need to be embedded into the database kernel, and other components such as the query optimizer and the query processor need to be tuned to the index structure. If you have multiple index structures, there might be conflicting tuning issues. Moreover, for answering context-based queries, where users might be interested in finding both images and videos pertaining to a particular context, the cross-similarity between images and videos needs to be determined.
  • The second expectation is to accommodate varied multidimensional representations. Why do we need it? Most of the existing index structures embedded into database kernels are single-dimensional and hence cannot be used for multidimensional data types. Even where there are a few multidimensional index structures, they are not capable of handling the query requirements of multimedia data, as they do not consider semantics at all. There is a plethora of feature representations, so you need a flexible structure that can accommodate the varied types. Otherwise the utility of an index structure will be severely limited.
  • The third and final expectation is that it should be able to accommodate the particular query methodology of multimedia data, i.e. content-based retrieval. For this, the query handling needs to consider both the low-level and the high-level content, which the existing index structures are not designed to do.
  • Before going into the details of how the proposed GeM-Tree meets these expectations, let’s quickly go over what has been done so far in this area. The first-generation index structure, which is still widely used in most database management frameworks, is the B-Tree and its variants. It is a single-dimensional, tree-based index structure. Then came the genre of multidimensional index structures. They can be broadly divided into two categories: feature-based and distance-based. Feature-based index structures such as the KDB-Tree, the R-Tree and, more recently, the Hybrid-Tree index the multidimensional feature space that is used to represent the multimedia data. Distance-based index structures such as the M-Tree and the VP-Tree index the metric space formed from the similarity measurements between pairs of multimedia data objects. This genre of index structure is useful because it can be used even if the feature values of the data objects are not available and only their mutual (dis)similarity measurements are provided. Both genres of index structure are useful, depending on the dataset and the retrieval requirements of a particular application.
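A minimal sketch of why distance-based indexes only need a metric: with distances to a single reference object (a pivot, or vantage point) stored in advance, the triangle inequality lets a range search skip objects without computing their distance to the query. This is a flat, one-pivot illustration under assumed names, not the M-Tree, the VP-Tree or the GeM-Tree themselves.

```python
import math


def build_pivot_table(objects, pivot):
    """Precompute and store the distance from the pivot to every object."""
    return [(math.dist(pivot, obj), obj) for obj in objects]


def pivot_range_search(table, pivot, query, radius):
    """Return (results, number of distance computations at query time)."""
    d_query_pivot = math.dist(query, pivot)
    computed = 1
    results = []
    for d_pivot_obj, obj in table:
        # Triangle inequality: d(query, obj) >= |d(query, pivot) - d(pivot, obj)|.
        # If that lower bound already exceeds the radius, obj cannot qualify.
        if abs(d_query_pivot - d_pivot_obj) > radius:
            continue
        computed += 1
        if math.dist(query, obj) <= radius:
            results.append(obj)
    return results, computed


points = [(0, 0), (1, 0), (5, 0), (10, 0)]
table = build_pivot_table(points, pivot=(0, 0))
found, n_computed = pivot_range_search(table, (0, 0), query=(1, 0), radius=1.5)
```

Here two of the four objects are eliminated from stored distances alone, which is the kind of saving the "distance computation" measurements later in the talk refer to.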
  • Replace the VP-tree with an M-tree description
  • So, if we already have so many index structures, why do we need another one? Well, there are a few issues with the existing ones, and they are pretty serious where multimedia data is concerned. First, let’s see how each multidimensional index structure performs when handling semantic relationships during CBR. For feature-based index structures, there needs to be a direct correlation between the low-level features and the semantic information. Thus, if there is a semantic gap for a particular dataset, this kind of query handling is not useful. For distance-based index structures, there is no existing mechanism for capturing semantics. None of the existing index structures is designed to handle different data types such as videos and documents, and none of them can handle multiple data types from one seamless framework.
  • GeM-Tree is a distance-based index structure. Let’s now see how it addresses each of the expectations. (i) Provide a single framework to manage different types of multimedia data: I propose a very flexible data signature to represent different data objects. It has three parts: an image part, a video part and an identification part. We have used only images and videos as the datasets, thus there are only two feature parts. But other data types can be represented with such a signature as well. The image part stores the color/texture values, the video part stores the pixel-change and audio features, and the ids store the information about the hierarchical relationship of the data (if there is any). Let’s see some examples. For images, this is how the data signature will look. The image part has the values, the video part has all 0s, and the ids have only the identification of the particular object. As there is no hierarchical/containment relationship, v_id and s_id are zero. W is the distribution weight. In the next slide, we will see what the weight signifies. A video has the image part (as color/texture are common to both types) and additionally the video part as well as the containment relationship. Here, a shot is represented. If you recall, a video shot is contained within a video. Thus the ids are 1, 1, 0, specifying that this particular shot has id 1 and belongs to a video with id 1. If we were to represent a frame, it would have been 1, 1, 1….. Thus, we can see that both images and videos can be represented with a single data signature.
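The data signature described above can be sketched as a small record type. The field names, the number of video features, and the choice of storing an image's own identifier in the last id slot are assumptions made for illustration; the notes only state that there is an image part, a video part, an id part with v_id and s_id, and a weight W.

```python
from dataclasses import dataclass
from typing import Tuple

N_VIDEO_FEATURES = 2  # hypothetical size, e.g. pixel-change plus one audio feature


@dataclass(frozen=True)
class DataSignature:
    image_part: Tuple[float, ...]   # color/texture values
    video_part: Tuple[float, ...]   # pixel-change, audio features; zeros for images
    ids: Tuple[int, int, int]       # assumed layout: (v_id, s_id, f_id)
    weight: float = 1.0             # W, the distribution weight


def image_signature(image_id, color_texture, weight=1.0):
    """Images have no containment relationship, so v_id = s_id = 0."""
    zeros = (0.0,) * N_VIDEO_FEATURES
    return DataSignature(tuple(color_texture), zeros, (0, 0, image_id), weight)


def video_unit_signature(v_id, s_id, f_id, color_texture, video_features, weight=1.0):
    """A shot has f_id = 0; a frame carries all three ids."""
    return DataSignature(tuple(color_texture), tuple(video_features),
                         (v_id, s_id, f_id), weight)


def kind(signature):
    """Infer the data type from the id part alone."""
    v_id, s_id, f_id = signature.ids
    if v_id == 0 and s_id == 0:
        return "image"
    if f_id != 0:
        return "frame"
    return "shot" if s_id != 0 else "video"


photo = image_signature(7, [0.16, 0.08, 0.04])
shot = video_unit_signature(1, 1, 0, [0.2, 0.1, 0.0], [0.5, 0.3])
frame = video_unit_signature(1, 1, 1, [0.2, 0.1, 0.0], [0.5, 0.3])
```

Because every object has the same layout, a single index can hold images, shots and frames side by side.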
  • (ii) Accommodate varied multidimensional representations. To accommodate varied representations, we use the earth mover’s distance (EMD) as the similarity measurement metric between multimedia data objects. It calculates the distance between two distributions, where the distributions can be of variable lengths. Thus, you are no longer required to represent every multimedia data object with the same kind of feature distribution. This is particularly useful for multimedia retrieval strategies such as region-based/object-based retrieval, where each image/video unit is represented by a varying number of regions/objects. EMD is based on the transportation problem, where the amount of work needed to convert one distribution into the other is minimized.
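A minimal sketch of EMD between two signatures of different lengths, restricted to the one-dimensional case with equal total mass. In that special case the transportation problem has a closed form (the area between the two cumulative distributions), so no solver is needed. The general multidimensional EMD used for feature distributions requires solving the transportation problem and is not shown here.

```python
def emd_1d(sig_a, sig_b):
    """Earth mover's distance between two 1-D weighted point sets.

    Each signature is a list of (position, weight) pairs; the lists may
    have different lengths. Weights are normalised so both sum to 1.
    """
    def normalise(sig):
        total = sum(w for _, w in sig)
        return [(p, w / total) for p, w in sig]

    # Mass in sig_a raises the CDF difference, mass in sig_b lowers it.
    events = sorted(
        [(p, w) for p, w in normalise(sig_a)]
        + [(p, -w) for p, w in normalise(sig_b)]
    )
    work = 0.0
    cdf_diff = 0.0
    prev = events[0][0]
    for pos, delta in events:
        work += abs(cdf_diff) * (pos - prev)  # mass carried across this gap
        cdf_diff += delta
        prev = pos
    return work


one_region = [(0.0, 1.0)]                 # a single region
two_regions = [(1.0, 0.5), (3.0, 0.5)]    # two regions, different length
distance = emd_1d(one_region, two_regions)
```

In the example, half of the mass has to travel a distance of 1 and the other half a distance of 3, so the total work is 2.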
  • (iii) Accommodate CBR of individual data types along with concept retrievals involving cross-similarity between multimedia data. GeM-Tree addresses the third expectation of supporting CBR with the help of the data signature and the distance function, used along with a high-level semantics capturing mechanism. The high-level semantics between multimedia data objects are captured using a construct called the Affinity Relationship, which determines the closeness of two multimedia objects by following the access patterns. It should be pointed out here that this semantic capturing does not rely on feature-level similarity and hence performs well in cases of semantic gap.
  • Thus, we see that GeM-Tree covers the three expectations quite successfully. Let’s now go into a little detail on how GeM-Tree introduces retrieval techniques into its framework. Basically, a multidimensional index structure answers queries following two strategies: range search and k-NN search. For range search, the database is searched for objects that are within a given range of the query object. For k-NN search, the entire database is searched to retrieve the k objects most similar to the query object. Of course, k-NN search is a more natural fit for CBR, as you cannot really expect a user to specify a particular range in the form of a numerical value. It is more convenient to search the entire database for the most similar objects. Introducing CBR into the k-NN search requires a pretty complex algorithm, because you need to make sure that, while you are considering both the low-level and the high-level similarity, the properties of the underlying metric space (viz. positivity, symmetry and the triangle inequality) are not violated. However, as a simple representation, the main step of the k-NN search implementing CBR is as follows, where both the low-level similarity in the form of ‘d’ (Euclidean distance) and the high-level similarity (affinity) between an indexed database object and the query object are considered.
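The difference between the two query strategies can be sketched with a plain linear scan. This uses only the low-level Euclidean distance; the way GeM-Tree combines that distance with the affinity value is given on the slide and in the dissertation and is deliberately not reproduced here. The function names are illustrative.

```python
import heapq
import math


def linear_range_search(objects, query, radius):
    """Every object whose distance to the query is within the given radius."""
    return [obj for obj in objects if math.dist(obj, query) <= radius]


def linear_knn_search(objects, query, k):
    """The k objects closest to the query; the user supplies no radius."""
    return heapq.nsmallest(k, objects, key=lambda obj: math.dist(obj, query))


database = [(0, 0), (1, 1), (2, 2), (5, 5)]
in_range = linear_range_search(database, query=(0, 0), radius=1.5)
nearest = linear_knn_search(database, query=(0, 0), k=3)
```

The range query needs a numeric radius that a user would have to guess, while the k-NN query only needs the number of results wanted.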
  • Additionally, GeM-Tree supports cross-multimedia similarity search. The data signature is designed such that the Euclidean distance between two signatures can determine the similarity between two different multimedia data types. This is proved in a lemma in the dissertation and is beyond the scope of this presentation. The high-level similarity between different types of multimedia data is determined from the id representation of the data signature along with the HMMM model used to capture the semantic relationship. For example, say you want to find the high-level similarity between a video shot and a frame. We traverse up the HMMM hierarchy to find the video shot to which the frame belongs and compare the similarity between the two shots.
  • Here we present the performance of GeM-Tree in terms of distance computations and accuracy. It is compared with an index structure developed only for images and with another developed only for videos. We also compare it with a framework having no index structure. It can be seen that the computation overhead of GeM-Tree is slightly higher than that of the dedicated image/video index structures. This is because GeM-Tree needs to manage two types of media, hence the candidate pool is more varied and more elimination is necessary to reach the desired objects. However, it has the added functionality of answering mixed-type queries with a reasonable computation overhead. The accuracy of GeM-Tree is also acceptable, though slightly lower than that of the other two index structures. Seq (the framework with no index structure) has the highest accuracy, as it scans the entire dataset to provide the query results. The high accuracy comes at the cost of a very high computation overhead.
  • Here, we demonstrate the capability of GeM-Tree in handling variable-length features. These are the distance computations while building the tree with variable-length features. We could not compare it with any other tree-based index structure, as there is practically none that can do so.
  • Next I will go over the second contribution, the query refinement.
  • What is query refinement? Query refinement is necessary for multimedia data management frameworks to alleviate three major problem areas: 1) the semantic gap, 2) the perception subjectivity and 3) the fuzziness of users’ expression. This is a multimedia retrieval application (it does not have database management). Users submit a query and the system gives back a set of results. Not all results are related to the user’s query. The user is then given a chance to refine his/her requirement by marking result images as positive or negative. He/she then resubmits the query, and in the next iteration the system gives back results that take the user’s feedback into account. Thus two things happen here. First, the number of query points increases in each iteration as the user marks positives; for each subsequent iteration, the system should consider the originally submitted query along with the positives. Second, the semantic requirement of the user is redefined. Thus, as there is a modification in the query representation as well as in the semantic requirements, the index structure needs to accommodate these dynamic changes.
  • Are there any existing query refinement techniques implemented by multidimensional index structures? Yes, there are query refinement models for feature-based index structures, which try to capture the user requirements by adjusting the intra- and inter-feature weights. That is, they again try to find a correlation between low-level features and high-level semantics. The approach has two major drawbacks. First, if there is a semantic gap, it remains. Second, it cannot be used for distance-based index structures, as it has been shown that such inter- and intra-feature weights violate the metric property of the triangle inequality.
  • GeM-Tree handles the first requirement, that is, the increase in the number of query points in each iteration, by introducing the concept of a multipoint query and modifying the distance function (necessary for the calculation of similarity) accordingly.
  • In order to handle the refinement of the high-level semantic information, it introduces an affinity update mechanism and brings the affinity into the index structure for the multipoint query, as shown. All these equations are established with detailed lemmas in the dissertation. The basic idea of the first part is that when two data objects are marked as similar by the user, the access value for that pair is increased by one.
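The basic idea of the access-count update can be sketched as follows. Only the "increase the pair's access value by one" step comes from the notes; treating the query and all positively marked results as one mutually similar group is an assumption, and the dissertation's actual affinity equations are not reproduced here.

```python
from collections import Counter
from itertools import combinations


class AccessLog:
    """Counts how often two data objects were marked as similar together."""

    def __init__(self):
        self.access = Counter()

    def record_feedback(self, query_id, positives):
        # Assumption: the query and every positive result form one group,
        # and each unordered pair in that group is incremented by one.
        group = sorted({query_id, *positives})
        for a, b in combinations(group, 2):
            self.access[(a, b)] += 1

    def count(self, a, b):
        return self.access[tuple(sorted((a, b)))]


log = AccessLog()
log.record_feedback("q", ["img_a", "img_b"])   # first iteration
log.record_feedback("q", ["img_a"])            # second iteration
```

After these two rounds the pair ("q", "img_a") has been marked twice, so it would contribute a stronger semantic relationship than ("q", "img_b").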
  • In order to evaluate the performance of a retrieval framework, two factors need to be considered: how fast it provides the result and how accurate the provided result is. One can be improved at the cost of the other, so a balance is necessary. You can increase the accuracy by evaluating more database objects (i.e. at the expense of computation cost). Conversely, you can provide some sort of result by considering only a few objects (lowering the computation cost at the expense of accuracy). I proposed a score based on the computation cost (T) and the accuracy (computed with the F-score).
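For reference, a minimal sketch of the accuracy half of that score, assuming the F-score meant here is the usual balanced F1 (the harmonic mean of precision and recall). How the proposed evaluation score combines it with the computation cost T is defined in the dissertation and is not shown.

```python
def f_score(retrieved, relevant):
    """Balanced F-score (F1) of a result set against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(retrieved)   # fraction of results that are relevant
    recall = hits / len(relevant)       # fraction of relevant objects found
    return 2 * precision * recall / (precision + recall)


# 4 results returned, 2 of them relevant, 3 relevant objects in total.
score = f_score(retrieved=[1, 2, 3, 4], relevant=[2, 4, 6])
```

In the example precision is 1/2 and recall is 2/3, giving an F-score of 4/7.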
  • We compared the accuracy of the proposed system, here AH-Tree refine (blue line), with three frameworks: a distance-based multidimensional index structure without a refinement model (yellow line), a feature-based multidimensional index structure with a refinement model (pink line), and a naïve method (no index structure, but taking the user’s relevance feedback into account). We see that Naïve has the best accuracy (as it considers all the objects), followed by our refinement model, then the one without refinement, and finally the feature-based refinement. Next we evaluate the computation time…….
  • Thus, there are several values and it is a little difficult to evaluate the overall goodness of a particular model. Here, the evaluation score comes in handy. Computing the proposed evaluation score, we find that our proposed system performs the best. We call the model AH-Tree refine instead of GeM-Tree because we considered only images.
  • Next we discuss the third contribution, Visualizing and Analyzing Multimedia…..
  • Why do we need to consider a collaborative environment? With the explosion of social network applications, and with multimedia data being the popular medium of communication, there needs to be a proper way to manage this data that takes the dynamic and evolving relationships into account. Thus, the…….. Let’s consider a Facebook page: users frequently share videos from YouTube with a particular group of users. With the increasing number of users on Facebook, we could use this information to manage the data on YouTube and provide easy, cheap access based not only on annotations but also on content and user behavior.
  • For the multimedia data network, each data object acts as an actor (node) and the relationships between data objects act as the edges.
  • Now, the relationship considered is an interesting factor. It varies with the applications considered. For utilizing such information in the database management framework, we considered the semantic relationship, as perceived and reported by the users, as the relationship. In our case, the application is a multimedia retrieval framework in a collaborative environment. User behavior….
  • Previously, we had this information as a text file looking something like this. It is pretty difficult to form any overall idea about the data relationships from such a presentation of the information. So we form a data network such as this. It represents the semantic relationships among the different data objects in the database.
  • The generated data network is a weighted graph and is disconnected in nature. It is pretty large, and thus visual interpretation is challenging. To overcome the problem, a preview generation technique is proposed, which represents a large graph structure with fewer nodes but preserves its overall characteristics.
  • Thus, the Graph Preview has the following approach as just discussed.
  • From the literature, there has been so far only one frequently used approach, the clustered Graph Layouts.
  • But it has the following issues: 1. Determining the …
  • It has the following 5 steps:
  • Since we need to represent the original graph with fewer nodes, we first filter the nodes using different sampling techniques.
  • Next we determine the node metrics. Two types of metrics are identified: structural metrics and semantic metrics.
  • In step 3, the structural similarity between the original nodes and the sampled nodes is determined using a graph similarity approach, as shown in the following equation. It considers both the nodes and the edges.
  • In step 4, node assignment is done. That is, the sampled nodes are assigned to some of the original nodes in such a way that the total similarity between them is maximized. An assignment-problem approach, the Hungarian algorithm, is used to do so.
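A minimal sketch of the assignment step on a toy similarity matrix. For brevity it finds the optimum by brute force over all one-to-one assignments, which is a stand-in for the Hungarian algorithm named above (the Hungarian algorithm reaches the same optimum in polynomial time). The matrix values are made up.

```python
from itertools import permutations


def best_assignment(similarity):
    """Assign each sampled node (row) to a distinct original node (column)
    so that the total similarity is maximised.

    similarity[i][j] is the similarity between sampled node i and original
    node j. Requires len(rows) <= len(columns). Brute force: only suitable
    for very small inputs.
    """
    n_rows, n_cols = len(similarity), len(similarity[0])
    best_total, best_cols = float("-inf"), None
    for cols in permutations(range(n_cols), n_rows):
        total = sum(similarity[i][j] for i, j in enumerate(cols))
        if total > best_total:
            best_total, best_cols = total, cols
    return list(best_cols), best_total


# Two sampled nodes, three original nodes (hypothetical similarities).
sim = [
    [0.9, 0.1, 0.3],
    [0.8, 0.7, 0.2],
]
assignment, total = best_assignment(sim)
```

Note that a greedy choice would also give sampled node 1 its best match (original node 0); the assignment formulation resolves such conflicts by maximising the total.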
  • At the last step, in order to form the representative graph, the sampled and assigned nodes need to be connected. It is done using the shortest path approach.
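A minimal sketch of connecting two assigned nodes through the original graph with a shortest path, using Dijkstra's algorithm. Treating the edge weights as costs is an assumption made for illustration; the notes do not say how the weights of the data network are turned into path lengths.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a weighted graph.

    graph: {node: {neighbour: non-negative weight}}.
    Returns (cost, path), or (inf, []) if the target is unreachable,
    which can happen because the data network is disconnected.
    """
    queue = [(0.0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return float("inf"), []


network = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1, "D": 5},
    "C": {"A": 4, "B": 1, "D": 1},
    "D": {"B": 5, "C": 1},
    "E": {},  # isolated node in a disconnected component
}
cost, path = shortest_path(network, "A", "D")
```

If "A" and "D" were both sampled nodes, the representative graph would connect them with a single edge standing in for the path A, B, C, D.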
  • To evaluate the representative graph against the original graph, an overall structural comparison is done using centrality measurements. Centrality is basically a measurement of the power/importance of an individual node. Thus, it provides information about the connectivity of the individual nodes with respect to the entire network (holistic behavior). The Ec value gives a score describing the structure of a particular graph. We can then find the deviation between the scores of the original and representative graphs to determine the effectiveness of the result. The denominator is the maximum possible value of the numerator. Ec has a minimum value of 0 (star configuration) and a maximum value of 1 (circle configuration).
  • OK, so now we have a multimedia data network that can be visualized and analyzed with ease. How do we utilize it in our multimedia data management framework, and especially in the index structure? It should be recalled here that the index structure was built based ONLY on the low-level features. Semantic information (in the form of the affinity value) was introduced during the query. The information obtained from the analysis of the multimedia data network can be used to introduce semantic relationships into the indexed metric space without violating any of its properties.
  • We analyze the multimedia data network using the analysis techniques of social networks. For insertion, we find the degree centrality of each element (node) of the multimedia data network. The degree centrality is defined as the number of links incident upon a node, that is, the number of ties it has. It helps to identify the power/importance of a particular node in the entire network. For example, let’s assume we have these two images in two nodes. ……..
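Degree centrality as defined above (the number of links incident upon a node) can be computed directly from an edge list. The node names are hypothetical.

```python
def degree_centrality(edges):
    """Number of links incident upon each node of an undirected graph."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return degree


# Hypothetical data network: img_a is related to three other objects.
edges = [("img_a", "img_b"), ("img_a", "img_c"), ("img_a", "img_d"), ("img_c", "img_d")]
degrees = degree_centrality(edges)
```

Here img_a has the highest degree and would be considered the most important node of this small network.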
  • Currently, any delete operation requested by the user is carried out without considering what effect it might have on the quality of subsequent query results. Let’s take this example:
  • Deletion policies are based on the betweenness centrality measurement of the network. What is betweenness centrality? It is defined as the number of nodes that connect via a particular node. If the betweenness centrality is high for a node in a delete request, ask the user to reconsider, stating the effect. Other similar decisions can be made as well by carefully analyzing the data network.
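A minimal sketch of betweenness centrality using its standard graph-theoretic definition: for each node, the number of shortest paths between pairs of other nodes that pass through it (counted fractionally when several shortest paths exist). The implementation follows Brandes' algorithm for unweighted, undirected graphs; the threshold at which a delete request should trigger a warning is not specified in the notes and is left out.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Betweenness of every node; adjacency is {node: set of neighbours}."""
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search counting shortest paths from the source.
        order = []
        predecessors = {v: [] for v in adjacency}
        n_paths = {v: 0 for v in adjacency}
        n_paths[source] = 1
        depth = {v: -1 for v in adjacency}
        depth[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if depth[w] < 0:
                    depth[w] = depth[v] + 1
                    queue.append(w)
                if depth[w] == depth[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        dependency = {v: 0.0 for v in adjacency}
        while order:
            w = order.pop()
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # In an undirected graph every pair was counted from both endpoints.
    return {v: s / 2 for v, s in score.items()}


star = {"hub": {"x", "y", "z"}, "x": {"hub"}, "y": {"hub"}, "z": {"hub"}}
centrality = betweenness_centrality(star)
```

In the star example every connection between two leaves runs through "hub", so deleting it would disconnect the whole network; that is the situation in which the user should be asked to reconsider.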
  • Next, let’s see what the limitations and assumptions of the proposed work are (here comes the Achilles’ heel):
  • I assumed that the features used for indexing the multimedia data are sufficient. The framework has a plug-in type of approach: you can feed in any feature representation or any high-level capturing mechanism, and it will generate results accordingly. The framework ensures that the quality of the input information is reflected in the output results. Accuracy calculations are of course subjective. It can handle only numeric data and no nominal data. And only soccer videos were used.
  • So, what is the future direction of this research?
  • This research was started with the vision of laying down the foundation of a full-fledged multimedia database management framework. Thus, it is far from complete. The basic and perhaps the most important part, an index structure, is proposed in this research, and it can be extended in the following useful directions:……

Defense PowerPoint Presentation Transcript

  • A Generalized Multidimensional Index Structure for Multimedia Data to Support Content-Based Similarity Searches in a Collaborative Environment Kasturi Chatterjee Distributed Multimedia Information Systems Laboratory School of Computing and Information Sciences Florida International University
  • Committee Members • Dr. Shu-Ching Chen (Advisor) • Dr. Jainendra K. Navlakha • Dr. Xudong He • Dr. Keqi Zhang • Dr. Mei-Ling Shyu
  • Acknowledgment: School of Computing and Information Sciences (Continuing Graduate Assistantship (GA, RA); awards recognizing research); Florida International University (Dissertation Year Fellowship; Travel Grants (GSA)); members of DMIS Lab; SCIS staff, special thanks to Olga
  • Outline i. Motivation ii. Contributions a. Generalized Index Structure b. Query Refinement c. Visualizing & Analyzing Multimedia Semantic Relationships in Collaborative Environments iii. Discussions iv. Future Direction
  • What is so special about multimedia data? i. Expressive ii. Attractive Which medium is more helpful?!
  • Everything comes at a price i. Multidimensional Representation ii. Perception Subjectivity iii. Semantic Gap Very different from traditional data!
  • Multidimensional Representation: Image. Apply feature extraction (HSV color space) to obtain the feature vector (0.1602, 0.0818, 0.0405, 0.0536, 0.0685, 0.0667, 0, 0, 0.0287, 0, 0, 0). Color bins labelled on the slide: black, red, yellow, green, blue, purple, white, red-yellow, yellow-green, green-blue, blue-purple, purple-red. [Figure: a data point <3.5, 0, 8> plotted in a three-dimensional space with axes X, Y, Z]
  • Multidimensional Representation: Video. [Figure: video hierarchy with the labels Videos, Shots, Key Frames and Frames (temporally related frames)] Apply feature extraction (multi-modal): (color-features, video-features, audio-features, ……). Video features: pixel-change, histogram-change, …. Audio features: average-volume, average-energy, ….
  • Perception Subjectivity. Left image: Togetherness, Baking, Family, Quality Time, …………. Right image: Sunset, Dolphins, ………….
  • Semantic Gap: similar feature representation, very different semantic information
  • Are existing DBMS frameworks able to handle Multimedia Data? A Typical Query. Traditional alpha-numeric queries: SELECT studentName FROM table WHERE studentAge > 20 AND studentMajor = 'Computer Science'; Multimedia queries: SELECT image FROM table WHERE red 'is-close-to' 0.245 AND black 'is-close-to' 0.356 AND red-yellow 'is-close-to' 0.5672 AND …….. AND semanticInterpretation = 'something' ….etc.
  • What is missing? i. Suitable data organization (index structure) ii. Suitable query handling iii. Suitable handling of semantic contents [Figure: architecture of a traditional DBMS with the components Communication Manager, Application Front Ends, SQL Interface, SQL Compiler/Interpreter, Query Evaluation Engine (Query Optimizer, Query Processor, Query Evaluator), Catalog Manager, Transaction Manager, Lock Manager, Buffer Manager, Recovery Manager, Access Structure Manager, Storage Manager, Index Structure, Index Access]
  • Generalized Index Structure: GeM-Tree [chat09c]. Expectations: i. Provide a single framework to manage different types of multimedia data. Separate index structures for different data types are inefficient to embed into the database kernel. 14
  • Generalized Index Structure: GeM-Tree. Expectations: ii. Accommodate varied Multidimensional Representation. Existing index structures for database kernels are mostly single-dimensional. Existing multidimensional index structures cannot handle retrieval requirements of multimedia data. The plethora of feature representations calls for a flexible structure. 15
  • Generalized Index Structure: GeM-Tree. Expectations: iii. Accommodate CBR of individual data type along with concept retrievals involving cross-similarity between multimedia data. Query handling needs to consider low-level features & semantic information. Existing index structures cannot handle such retrieval approaches. 16
  • What has been done so far. First generation index structures: B-Tree [1] • tree-based index structure • single-dimensional • currently used in relational databases. Multi-dimensional index structures: (a) Feature-Based • feature space indexed based on feature dimension • KDB-Tree [2], R-Tree [5], Hybrid-Tree [6]; (b) Distance-Based • metric-space formed from the distances between data objects is indexed • M-Tree [4], VP-Tree [3] 17
  • KDB-Tree [Figure: data space with points A to T partitioned into regions 1 to 8, and the corresponding tree: root 12345678; internal nodes 1234 and 5678; then 12, 34, 56, 78; leaf pages holding the point groups DE, ABC, FGH, IJ, ST, PQR, KLM, NO] 18
  • VP-Tree: Data Space Partition for VP-Tree around vantage point V: (A,B,C,D) closest to V; (E,F,G,H) next close; (I,J,K) farthest 19
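The partition on the VP-Tree slide can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance and a single three-way split by distance rank; the function name `partition_by_vantage_point` and the sample coordinates are hypothetical, and a full VP-Tree would apply a split like this recursively.

```python
import math

def partition_by_vantage_point(points, vantage, n_groups=3):
    """Split points into groups ordered by distance from the vantage point."""
    if not points:
        return []
    # Rank every point by its distance to the vantage point.
    ranked = sorted(points, key=lambda p: math.dist(p, vantage))
    # Cut the ranking into consecutive chunks of (nearly) equal size.
    size = math.ceil(len(ranked) / n_groups)
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]

# Hypothetical 2-D points; the coordinates are made up for illustration.
vantage = (0.0, 0.0)
points = [(1, 0), (0, 2), (3, 0), (0, 4), (5, 0), (0, 6)]
closest, middle, farthest = partition_by_vantage_point(points, vantage)
```

With eleven points, as on the slide, the same call yields groups of four, four, and three objects.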
  • Issues? (Feature-Based Indexes vs. Distance-Based Indexes) Semantic information during CBR: feature-based indexes have low-level feature values correlated to semantics; distance-based indexes have no existing semantics-capturing model embedded into search queries. Different data types: none designed for handling videos/documents. Seamless solution: none designed to handle multiple data types from a single framework. 20
  • GeM-Tree: how does it accomplish the goals? Expectation I: Provide a single framework to manage different types of multimedia data. Using a data-signature to represent multimedia data objects: $F_{image} = \big(\underbrace{(x_1, x_2, \ldots, x_i)}_{F_A},\ \underbrace{(0, 0, 0, \ldots, 0)}_{F_B},\ \underbrace{(1, 0, 0)}_{F_C},\ 1\big)$ and $F_{shot} = \big(\underbrace{(z_1, z_2, \ldots, z_i)}_{F_A},\ \underbrace{(y_1, y_2, \ldots, y_j)}_{F_B},\ \underbrace{(1, 1, 0)}_{F_C},\ 1\big)$. Image part: F_A = {x1, x2, ……., xi}; Video part: F_B = {y1, y2, ……., yj}; Ids: F_C = {object_id, v_id, s_id} 21
  • GeM-Tree: how does it accomplish the goals? Expectation II: Accommodate varied Multidimensional Representation. Using Earth Mover's Distance (EMD) to calculate (dis)similarity • Derived from the Monge-Kantorovich transportation problem • Calculates distance between 2 distributions • Distributions can be of variable lengths. Given two distributions $x = (X, w) \in D^{K,n}$ and $y = (Y, u) \in D^{K,m}$, a flow between x and y is a matrix $F = (f_{ij}) \in R^{m \times n}$; find a flow that minimizes the overall work, $Work(x, y, F) = \sum_{i=1}^{m} \sum_{j=1}^{n} d_{ij} f_{ij}$. EMD is calculated by: $EMD(x, y) = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} d_{ij} f_{ij}}{\sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}}$ 22
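To make the variable-length property concrete, here is a minimal sketch of EMD for the special case of one-dimensional signatures, where the optimal flow has a closed form (the area between the two cumulative distributions) and no linear-programming solver is needed. It assumes both signatures are normalized to unit total weight and that the ground distance is |x − y|; the function name `emd_1d` is hypothetical, and the multidimensional EMD on the slide requires solving the transportation problem in general.

```python
def emd_1d(xs, ws, ys, us):
    """EMD between two 1-D weighted point sets, each normalized to total weight 1."""
    total_w, total_u = sum(ws), sum(us)
    # Signed mass events: +w at positions of the first set, -u at the second.
    events = sorted(
        [(x, w / total_w) for x, w in zip(xs, ws)]
        + [(y, -u / total_u) for y, u in zip(ys, us)]
    )
    emd, cdf_gap = 0.0, 0.0
    # Sweep left to right; the surplus mass carried across each gap is the work done.
    for (pos, mass), (next_pos, _) in zip(events, events[1:]):
        cdf_gap += mass
        emd += abs(cdf_gap) * (next_pos - pos)
    return emd

# Signatures of different lengths: two points versus one point.
distance = emd_1d([0, 2], [1, 1], [1], [5])
```

Here half of the mass moves from position 0 to 1 and half from 2 to 1, so the total work is one unit.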
  • GeM-Tree: how does it accomplish the goals? Expectation III: Accommodate CBR of individual data type along with concept retrievals involving cross-similarity between multimedia data. Using data-signature + EMD + Affinity Relationship [8][9] • a stochastic construct called Markov Model Mediator [8] • extended into HMMM for videos • determines the closeness of two multimedia objects (affinity) by following the access patterns • “the more frequently two objects are accessed together, the greater is their semantic closeness/affinity” 23
  • How GeM-Tree supports CBR. Range Search: select all the appropriate database objects within a given range from the query. k-NN Search: search the entire database to select the k database objects most similar to the query: if ((d(F_index_object, F_query) <= d_k) && (A(data, query) >= affinity_k)) { add index_object to priority queue; update d_k and affinity_k; } else { check next index_object from priority queue; } 24
  • How GeM-Tree supports cross-multimedia similarity search. Low-level Similarity: Euclidean distance between F of data objects takes care of the image and video components. High-level Similarity: the HMMM [9] framework is traversed (upwards/downwards) according to the information gathered from the FC part, FC = {object_id, v_id, s_id} 25
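The k-NN admission test on this slide can be sketched as follows. This is one possible reading, not the dissertation's implementation: it scans a flat collection instead of traversing the tree, uses Euclidean distance for d, takes d_k as the largest distance among the current k candidates, and takes affinity_k as the smallest affinity among them. The function name `knn_with_affinity`, the `min_affinity` floor, and the sample data are all hypothetical.

```python
import heapq
import math

def knn_with_affinity(query, objects, affinity, k, min_affinity=0.0):
    """objects: id -> feature vector; affinity: id -> affinity to the query."""
    heap = []  # max-heap on distance, stored as (-distance, id)
    d_k, affinity_k = math.inf, min_affinity
    for obj_id, features in objects.items():
        dist = math.dist(features, query)
        aff = affinity.get(obj_id, 0.0)
        # Admission test from the slide: close enough AND semantically related enough.
        if dist <= d_k and aff >= affinity_k:
            heapq.heappush(heap, (-dist, obj_id))
            if len(heap) > k:
                heapq.heappop(heap)  # drop the farthest candidate
            if len(heap) == k:
                # Tighten both thresholds once k candidates are held.
                d_k = -heap[0][0]
                affinity_k = min(affinity.get(i, 0.0) for _, i in heap)
    ranked = sorted((-neg_dist, obj_id) for neg_dist, obj_id in heap)
    return [obj_id for _, obj_id in ranked]

objects = {"a": (1, 0), "b": (2, 0), "c": (0.5, 0), "d": (3, 0)}
affinity = {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.95}
result = knn_with_affinity((0, 0), objects, affinity, k=2)
```

In this example object c is the nearest in feature space but is rejected for low affinity, and d has high affinity but is too far away. Under this reading the outcome depends on the order in which objects are visited.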
  • Performance of GeM-Tree (AH: index structure handling only images; HAH: index structure handling only videos)
      Query       | # of Distance Computations | Accuracy
                  | GeM   AH    HAH   Seq      | GeM   AH    HAH   Seq
      Only Image  | 98    80    X     147      | 90%   93%   X     98%
      Only Video  | 63    X     50    147      | 90%   X     91%   95%
      Mixed Types | 80    X     X     147      | 80%   X     X     90%
    26
  • Performance of GeM-Tree: Capability of handling variable-length features and supporting queries such as region-based/object-based queries. Distance Computing during Developing Index Structure (GeM-Tree), by data type: Only Images 145; Only Videos 240; Both 960 27
  • What is Query Refinement? To alleviate: Semantic Gap; Perception Subjectivity; Fuzziness of multimedia query. i. Number of queries in each iteration increases ii. High-level semantic requirement of the user is modified 29
  • Where do we stand? Existing Query Refinement Models for Index Structures [7] attempt to capture user requirements by ONLY adjusting the inter- and intra-level feature weights 30
  • Query Refinement in GeM-Tree. Requirement I: Number of queries in each iteration increases. i. Introduces the concept of multi-point query ii. Modifies the (dis)similarity computation approach [Equation: DIST_MULTI(Q, O), a sum over i = 1..n involving the weights W_i and the terms C and F_i; exact form not recoverable from the slide text] 31
  • Query Refinement in GeM-Tree. Requirement II: High-level semantic requirement of the user is modified. i. Introduces affinity update method [Equation: update of aff_{m,n} from access counts; exact form not recoverable from the slide text] ii. Embeds semantic information into the index structure considering multi-point query: $\max_{i=1}^{n} \big( \max(affinity_{a,q_i}, affinity_{b,q_i}),\ \max(affinity_{a,q_{i+1}}, affinity_{b,q_{i+1}}) \big)$ 32
  • Evaluation: Evaluation score proposed to compare the utility of different multimedia data management frameworks: $Model\_Score = \left(1 - \frac{T - T_{min}}{3 \times \frac{\sum_{i=1}^{n} (T_i - T_{min})^2}{n}}\right) \times \left(1 - \left|\frac{F - F_{max}}{3 \times \frac{\sum_{i=1}^{n} (F_i - F_{max})^2}{n}}\right|\right)$ • Compares based on both computation time and accuracy • One can be improved at the cost of the other • A balance is necessary 33
  • Experimental Analysis [Figures: Accuracy Comparisons and CPU Time Comparison over iterations 1 to 3, for AH-Tree Refine, HybridTree Refine, and AH-Tree Naive] 34
  • Experimental Analysis 35
  • Why? Collaborative Environment • Explosion of social network applications (~ 400 million users *) • Multimedia Data an important communication medium (e.g., a shared youtube video) • Data management no longer an isolated task. The way a multimedia data is used in a social network can be used to generate A Multimedia Data Network. * http://www.facebook.com/press/info.php?statistics 37
  • Multimedia Data Network: Multimedia Data shared/accessed among a particular user group can form a social network. Each data object acts as an actor (node). Their relationship is the link (edge). 38
  • What kind of relationship? The edges defining the relationships vary with applications. Want to utilize information for customizing Multimedia Database Management strategies. User behavior collected for over 5 years using Multimedia Retrieval Application developed at DMIS, for COREL Dataset having 10,000 images. Used semantic similarity, as perceived and reported by users, as the relationship 39
  • Multimedia Data Network for 10,000 images. How the relationship information was presented before: affinity.txt, a 10000 × 10000 matrix of affinity values with rows and columns indexed by image (1, 2, …, 10000); e.g. row 1: 24, 34, …, 0; row 2: 12, 0, …, 45 40
  • Multimedia Data Network Characteristics of the generated network structure  A weighted Disconnected Graph Structure  Large Size  Visual Interpretation/Analysis becomes challenging 41
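As an illustration of how an affinity matrix turns into a weighted, disconnected graph, the sketch below builds an edge list and counts connected components. It is an assumption-laden toy: the 4 × 4 matrix is made up, the diagonal is ignored, and an undirected edge weight is taken as the sum of the two directed affinities, which the slides do not specify. The names `affinity_to_graph` and `connected_components` are hypothetical.

```python
def affinity_to_graph(matrix):
    """Return {(i, j): weight} for i < j, ignoring the diagonal."""
    n = len(matrix)
    edges = {}
    for i in range(n):
        for j in range(i + 1, n):
            # Assumption: combine both directions into one undirected weight.
            weight = matrix[i][j] + matrix[j][i]
            if weight > 0:
                edges[(i, j)] = weight
    return edges

def connected_components(n, edges):
    """Group node indices 0..n-1 into connected components via depth-first search."""
    neighbors = {i: set() for i in range(n)}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    seen, components = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbors[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        components.append(sorted(component))
    return components

# Made-up affinity values for four images.
matrix = [[24, 34, 0, 0], [12, 0, 0, 0], [0, 0, 5, 7], [0, 0, 0, 0]]
edges = affinity_to_graph(matrix)
components = connected_components(len(matrix), edges)
```

Two components come out of this toy matrix, which is the "disconnected" property the slide refers to; at 10,000 nodes the same structure becomes hard to inspect visually.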
  • Graph Preview Solution Approach Reduce number of nodes Maintain network characteristics Maximize similarity between original and represented networks 42
  • Existing Approaches: Using semantic information associated with data (content-based); Using structural information of data (structure-based); Identifying disjoint clusters; Represent clusters as glyphs or compound graph; Discovering node groupings/classes in data; Clustered Graph Layouts; Use Graph metrics 43
  • Issues with Clustered Graph representations Determining the cluster size Preserving overall structural similarity/equivalence Determining the representative nodes Preserving the network characteristics 44
  • Proposed Approach: (1) Node Filtering: pick nodes based on network structure/user choice. (2) Determine Metric: calculate structural and semantic metric. (3) Similarity Calculation: calculate structural & semantic similarity. (4) Node Assignment: assign filtered nodes to original nodes to maximize overall similarity. (5) Graph Layout: generate the representative graph. 45
  • Detailed Algorithm. Step 1, Node Filtering: pick nodes based on network structure/user choice. Sample nodes to capture overall network characteristics. Select nodes representing different groups in the network. Random sampling approaches which preserve the distribution. 46
  • Detailed Algorithm. Step 2, Determine Node Metric: calculate structural and semantic metric. Structural metrics • Adjacency Matrices: edge source & edge terminus. Semantic metrics • A matrix of scores of different centrality values 47
  • Detailed Algorithm. Step 3, Similarity Calculation: calculate structural & semantic similarity. Structural similarity • Coupled node-edge similarity score [12]: $y_{ij}(k) = x_{s(i)s(j)}(k-1) + x_{t(i)t(j)}(k-1)$ and $x_{ij}(k) = \sum_{t(k)=i,\, t(l)=j} y_{kl}(k-1) + \sum_{s(k)=i,\, s(l)=j} y_{kl}(k-1)$. Semantic similarity • Euclidean distance between semantic values 48
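The coupled node-edge update on this slide can be sketched directly from the two equations. This is a minimal illustration under stated assumptions: graphs are given as lists of (source, terminus) pairs, all scores start at one, and each iteration is followed by a Frobenius normalization (the normalization step and the names `coupled_node_edge_similarity` and `_normalize` are choices made here, not taken from the slides).

```python
import math

def _normalize(m):
    """Scale a matrix to unit Frobenius norm (left unchanged if all zeros)."""
    norm = math.sqrt(sum(v * v for row in m for v in row))
    return [[v / norm for v in row] for row in m] if norm else m

def coupled_node_edge_similarity(n_a, edges_a, n_b, edges_b, iterations=20):
    """Return (node_scores, edge_scores) between graph A and graph B."""
    x = [[1.0] * n_b for _ in range(n_a)]          # node-to-node scores
    y = [[1.0] * len(edges_b) for _ in edges_a]    # edge-to-edge scores
    for _ in range(iterations):
        # Edge score: similarity of the two sources plus similarity of the two termini.
        new_y = [[x[sa][sb] + x[ta][tb] for (sb, tb) in edges_b]
                 for (sa, ta) in edges_a]
        # Node score: sum of scores of edge pairs that start, or end, at the node pair.
        new_x = [[0.0] * n_b for _ in range(n_a)]
        for p, (sa, ta) in enumerate(edges_a):
            for q, (sb, tb) in enumerate(edges_b):
                new_x[sa][sb] += y[p][q]
                new_x[ta][tb] += y[p][q]
        x, y = _normalize(new_x), _normalize(new_y)
    return x, y

# Compare a 3-node directed path 0 -> 1 -> 2 with itself.
path = [(0, 1), (1, 2)]
node_scores, edge_scores = coupled_node_edge_similarity(3, path, 3, path)
```

For the path compared with itself, the middle node scores highest against itself, and the start node scores zero against the end node because they share no structural role.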
  • Detailed Algorithm. Step 4, Node Assignment: assign filtered nodes to original nodes to maximize overall similarity. Hungarian Algorithm • Pick up m nodes from the set of n nodes which maximize the total similarity score between the original graph and the sub-graph formed • Assignment Problem, applying Munkres Algorithm 49
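The assignment problem in this step can be illustrated with a brute-force search over all ways of mapping m filtered nodes onto n original nodes. This is deliberately not the Hungarian (Munkres) algorithm named on the slide: it finds the same optimum but in factorial time, so it is only suitable for tiny examples. The function name `best_assignment` and the similarity values are hypothetical.

```python
from itertools import permutations

def best_assignment(similarity):
    """similarity[i][j]: score for mapping filtered node i to original node j (m <= n)."""
    m, n = len(similarity), len(similarity[0])
    best_score, best_cols = float("-inf"), None
    # Try every injective mapping of the m rows onto the n columns.
    for cols in permutations(range(n), m):
        score = sum(similarity[i][j] for i, j in enumerate(cols))
        if score > best_score:
            best_score, best_cols = score, cols
    return list(best_cols), best_score

# Two filtered nodes, three original nodes; scores are made up.
similarity = [[0.9, 0.1, 0.4],
              [0.8, 0.7, 0.2]]
assignment, total = best_assignment(similarity)
```

Note that the greedy choice for the second row (column 0, score 0.8) is not available once the first row takes column 0, which is why a global method such as Munkres is needed at realistic sizes.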
  • Detailed Algorithm. Step 5, Graph Layout: generate the representative graph. Shortest Path Approach: connect node i and j with edge_{i,j} based on a threshold on $SP_{i,j} / Max(SP_{i,j})$ • Preserve the ties between nodes • Consider the overall reach/strength of each node 50
  • Evaluation • Overall structural comparison • Degree of similarity between connected nodes (dyads) • Using Euclidean distance between the centrality values: $E_c = 1 - \frac{\sqrt{\sum_{k=1}^{M} (c_{ik} - c_{jk})^2}}{\max \sqrt{\sum_{k=1}^{M} (c_{ik} - c_{jk})^2}}$. What is Centrality? [10] • Centrality measures the power/importance of a node with respect to the entire network it belongs to • Measure of holistic behavior of a node 51
  • Generated Previews low error value ~ 0.02 52
  • How is the Multimedia Data Network utilized? • identify mutual relationships and role of a particular multimedia data object in a database • design decisions of operations of the index structures. Index structure is built on ONLY the low-level features. Semantic relationship was introduced during querying. No existing insertion policies consider the semantic information stored in a data object 53
  • Insertion policies: Use degree centrality. Degree centrality is defined as the number of links incident upon a node (i.e., the number of ties that a node has). For a Multimedia Data Network, degree centrality identifies the power/importance of a particular data object in the entire network. [Figure: image to be inserted, candidate node 1 and node 2; insert at the node with higher centrality] 54
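A degree-centrality-based insertion choice can be sketched as below. The reading of the figure is an assumption: given two candidate nodes, the new image is placed with the one that has more ties in the Multimedia Data Network. The function names and the edge list are hypothetical.

```python
def degree_centrality(edges):
    """Count the number of links incident upon each node."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return degree

def choose_insertion_target(candidates, edges):
    """Pick the candidate node with the highest degree centrality."""
    degree = degree_centrality(edges)
    return max(candidates, key=lambda node: degree.get(node, 0))

# Made-up network: img1 has two ties, img2 has one.
edges = [("img1", "img3"), ("img1", "img4"), ("img2", "img3")]
target = choose_insertion_target(["img1", "img2"], edges)
```

Ties between candidates are resolved by list order here; a real policy would need an explicit tie-breaking rule.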
  • Deletion policies Current Status Any delete request from the users is entertained That the user and hence the data might belong to a collaborative environment is not considered 55
  • Deletion policies: Use betweenness centrality. Betweenness centrality is defined as the number of vertices that connect via a particular node. For a delete request, if betweenness centrality of the node is high, ask the user to reconsider 56
  • Assumptions and Limitations • Assumed that the features used for indexing represent the multimedia data well • Accuracy calculations are not quantitative and may vary from person to person • Can handle only numeric data • Only soccer videos were used as test bed; other domains were not checked 58
  • Future Direction • Intelligent multimedia index structure optimizer • Document indexing • Support traditional alpha-numeric data • Query optimizer for multimedia database • Multimedia data management framework for Collaborative Applications 60
  • Publications Journals & Book Chapters i. [chat10] Kasturi Chatterjee, Shixia Liu, Shu-Ching Chen, “Social Network Preview using Graph Similarity,” (submitted to ACM Transactions on Information Systems), 2010. ii. [chat09a] Kasturi Chatterjee, S. Masoud Sadjadi, Shu-Ching Chen, “A Distributed Multimedia Data Management over Grid,” Multimedia Services in Intelligent Environments – Integrated Systems, 2009 (in press). iii. [chat09b] Kasturi Chatterjee, Shu-Ching Chen, “HAH-tree: Towards a Multidimensional Index Structure Supporting Different Video Modeling Approaches in a Video Database Management System,” IJIDS, vol. 2, no. 2, pp. 188-207, 2010. iv. [chat09c] Kasturi Chatterjee, Shu-Ching Chen, “A Multimedia Data Management Approach with GeM-Tree,” JMM, 2010 (in press). v. [chat09d] Shu-Ching Chen, Min Chen, Na Zhao, Shahid Hamid, Kasturi Chatterjee, and Michael Armella, “Florida Public Hurricane Loss Model: Research in Multi-Disciplinary System Integration Assisting Government Policy Making,” Special Issue on Building the Next Generation Infrastructure for Digital Government, Government Information Quarterly, Volume 26, Issue 2, pp. 285-294, April 2009. vi. [chat07a] Kasturi Chatterjee and Shu-Ching Chen, “A Novel Indexing and Access Mechanism using Affinity Hybrid Tree for Content-Based Image Retrieval in Multimedia Databases,” International Journal of Semantic Computing (IJSC), Vol. 1, Issue 2, pp. 147-170, June 2007. 61
  • Publications Conferences i. [chat09d] Yudan Li, Kasturi Chatterjee, Shu-Ching Chen, and Keqi Zhang, “A 3-D Traffic Animation System with Storm Surge Response,” accepted for publication, IEEE International Symposium on Multimedia (ISM2009), 2009. ii. [chat08a] Kasturi Chatterjee and Shu-Ching Chen, “GeM-Tree: Towards a Generalized Multidimensional Index Structure Supporting Image and Video Retrieval,” the Fourth IEEE International Workshop on Multimedia Information Processing and Retrieval (MIPR2008), in conjunction with IEEE International Symposium on Multimedia (ISM2008), 2008. iii. [chat08c] Kasturi Chatterjee and Shu-Ching Chen, “Hierarchical Affinity-Hybrid Tree: A Multidimensional Index Structure to Organize Videos and Support Content-Based Retrievals,” Proceedings of the 2008 IEEE International Conference on Information Reuse and Integration (IEEE IRI-08), 2008. iv. [chat08d] Shu-Ching Chen, Min Chen, Na Zhao, Shahid Hamid, Khalid Saleem, and Kasturi Chatterjee, “Florida Public Hurricane Loss Model (FPHLM): Research Experience in System Integration,” the 9th Annual International Conference on Digital Government Research, 2008. 62
  • Publications Conferences v. [chat08e] Kasturi Chatterjee, Shixia Liu, and Shu-Ching Chen, “Using Graph Similarity for Social Network Analysis,” in 6th LA Grid Summit, (First Place), 2008. vi. [chat06a] Kasturi Chatterjee and Shu-Ching Chen, “Affinity Hybrid Tree: An Indexing Technique for Content-Based Image Retrieval in Multimedia Databases,” in proceedings of IEEE International Symposium on Multimedia (ISM2006), (Best Paper Award), 2006. vii. [chat06b] Kasturi Chatterjee, Khalid Saleem, Na Zhao, Min Chen, Shu-Ching Chen, and Shahid Hamid, “Modeling Methodology for Component Reuse and System Integration for Hurricane Loss Projection Application,” in proceedings of IEEE International Conference on Information Reuse and Integration (IEEE IRI-2006),2006. 63
  • References [1] R. Bayer, “Binary B-Trees for Virtual Memory,” in ACM-SIGFIDET Workshop, San Diego, California, Session 5B, pp. 219-235, 1971. [2] J. Robinson, “The k-d-b-tree: A search structure for large multidimensional dynamic indexes,” in Proceedings of the 1981 ACM SIGMOD International Conference on Management of Data, Ann Arbor, United States, pp. 10–18, 1981. [3] P. N. Yianilos, “Data structures and algorithms for nearest neighbor search in general metric spaces,” in Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 311-321, 1993. [4] P. Ciaccia, M. Patella, and P. Zezula, “M-tree: An efficient access method for similarity search in metric spaces,” in Proceedings of 23rd VLDB, pp. 426-435, 1997. [5] A. Guttman, “R-Trees: A Dynamic Index Structure for Spatial Searching,” in Proc. 1984 ACM SIGMOD International Conference on Management of Data, pp. 47-57, 1984. [6] K. Chakrabarti, S. Mehrotra, “The Hybrid Tree: An Index Structure for High Dimensional Feature Spaces,” in ICDE 1999, pp. 440-447, 1999. [7] K. Chakrabarti, et al., “Efficient Query Refinement in Multimedia Databases,” in Proc. International Conference on Data Engineering, pp. 196-200, 2000. [8] M-L. Shyu, S-C. Chen, M. Chen, C. Zhang, and C-M. Shu, “MMM: A Stochastic Mechanism for Image Database Queries,” Proceedings of the IEEE Fifth International Symposium on Multimedia Software Engineering (MSE2003), pp. 188-195, December 10-12, 2003, Taichung, Taiwan, ROC. 65
  • References [9] Shu-Ching Chen, Na Zhao, and Mei-Ling Shyu, “Modeling Semantic Concepts and User Preferences in Content-Based Video Retrieval,” International Journal of Semantic Computing (IJSC), Vol. 1, Issue 3, pp. 377-402, September 2007. [10] L. C. Freeman, “Centrality in Social Networks: Conceptual Clarification,” Social Networks, vol. 1, no. 3, pp. 215-239, 1979. [12] L. A. Zager, et al., “Graph Similarity Scoring and Matching,” Applied Mathematics Letters, vol. 21, no. 1, pp. 86-94, 2007. 66