Using Concept Lattices for Visual Navigation Assistance in Large Databases
Application to a Patent Database
Jean Villerd, Sylvie Ranwez, Michel Crampes, David Carteret
LGI2P – École des Mines d’Alès, France
I-Nova Company, Villeurbanne – Lyon, France
Outline
  1. Context and Problem Setting
  2. State of the Art
  3. Objective: Intended Visualization
  4. Proposal for a Visual Browsing Method
  5. Discussion and Perspectives
1. Context and Problem Setting
  Scalability issues
    storage of new information: +30% each year [Lyman & Varian 2003]
  Visual scalability
    “capability of visualization representations and visualization tools to display massive data sets effectively, in terms of either the number or the dimension of individual data elements” [Eick & Karr 2002]
  Figures: binary-tree visualization of the Yahoo search engine bot crawling the experimental website; Microsoft Research's Netscan project
  1. Context and Problem Setting | 2. State of the Art | 3. Expected Visualization | 4. Proposal for a Visual Browsing Method | 5. Perspectives
2. State of the Art
  Information Visualization
    “focus + context” paradigm [Shneiderman 1996]
  Collection Visualization
    collection’s structure: Grokker, Kartoo, TreeMaps
    semantic distance (PCA, MDS): Molage
  Our goal: merging both
  Figures: Molage, Grokker
FCA for Information Retrieval
  classification
  querying and browsing
  Credo, MailSleuth, ImageSleuth
  Can information visualization techniques improve this lattice-based navigation process?
  Figures: ImageSleuth [Eklund, Ducrou et al. 2006]; Credo [Carpineto et al. 2004]
3. Objective: Expected Visualization
  extract the structure
  give an overview of the structure
  focus on the content of a structure element (local view)
  propose navigation paths through the structure
  Figure labels: structure, semantic distance
3. Objective: Expected Visualization
  overview: collection of formal concepts
  local view: focus on a particular concept, e.g. intent = {plasma display, display panel}
4. Proposal for a Visual Browsing Method
  document indexation vector in raw data
    Boolean features
    numerical features
  Figure: indexation vector of a document x over the terms plasma_display, display_panel, plasma, with labels p0 = x0, p1 = x1, p2 = x2
4. Proposal for a Visual Browsing Method
  overview: collection of clusters, i.e. the concept lattice
  local view: one cluster’s content, i.e. one formal concept’s extent
  Figure labels: raw data; Boolean features: document / term matrix, resulting lattice, selection by user; numerical features: document / document distance matrix
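The step from a Boolean document / term matrix to the set of formal concepts can be sketched in Python as follows. This is a minimal, naive illustration and not the authors' implementation: the toy matrix `DOCS` and the helper names `extent`, `common_terms` and `formal_concepts` are hypothetical, and the enumeration closes every subset of terms, so it is only practical for a handful of terms.

```python
from itertools import combinations

# Hypothetical toy document/term matrix (Boolean features); not the patent data.
DOCS = {
    "d1": {"plasma_display", "display_panel"},
    "d2": {"plasma_display", "display_device"},
    "d3": {"plasma_display", "display_panel", "display_device"},
    "d4": {"display_panel"},
}


def extent(intent, docs):
    """Documents that carry every term of `intent`."""
    return frozenset(d for d, terms in docs.items() if intent <= terms)


def common_terms(doc_ids, docs, all_terms):
    """Terms shared by every document in `doc_ids` (all terms if it is empty)."""
    shared = set(all_terms)
    for d in doc_ids:
        shared &= docs[d]
    return frozenset(shared)


def formal_concepts(docs):
    """Enumerate (extent, intent) pairs by closing every subset of terms."""
    all_terms = frozenset().union(*docs.values())
    concepts = set()
    for size in range(len(all_terms) + 1):
        for subset in combinations(sorted(all_terms), size):
            ext = extent(frozenset(subset), docs)
            concepts.add((ext, common_terms(ext, docs, all_terms)))
    return concepts


if __name__ == "__main__":
    # Print concepts from the most general (largest extent) to the most specific.
    for ext, intent in sorted(formal_concepts(DOCS), key=lambda c: -len(c[0])):
        print(sorted(intent), "->", sorted(ext))
```

Each pair is a cluster in the sense of the slide: the intent is the set of shared terms and the extent is the content shown in the local view.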
4.1 Lattice spatialization
  Computation of a distance matrix using the distance described in [Ranwez 2006]
  Projection using the Force-Directed Placement (FDP) method with Molage
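The two steps above (a concept / concept distance matrix, then a force-directed projection) can be sketched as follows. The sketch rests on stated assumptions: the slide does not detail the distance of [Ranwez 2006], so a Jaccard distance between intents is swapped in as a stand-in, and a plain spring model replaces Molage's placement. The function names and parameters are hypothetical.

```python
import math
import random


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; defined as 0.0 for two empty sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def distance_matrix(intents):
    """Pairwise stand-in distances between concept intents."""
    n = len(intents)
    return [[jaccard_distance(intents[i], intents[j]) for j in range(n)]
            for i in range(n)]


def force_directed_layout(dist, iterations=500, step=0.05, seed=0):
    """Place n points in 2D so that pairwise distances approach `dist`.

    Every pair is linked by a spring whose rest length is the target
    distance: the spring pulls when the points are too far apart and
    pushes when they are too close.
    """
    rng = random.Random(seed)
    n = len(dist)
    pos = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    for _ in range(iterations):
        forces = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                d = math.hypot(dx, dy) or 1e-9
                f = (d - dist[i][j]) / d  # > 0 attracts, < 0 repels
                forces[i][0] += f * dx
                forces[i][1] += f * dy
                forces[j][0] -= f * dx
                forces[j][1] -= f * dy
        for i in range(n):
            pos[i][0] += step * forces[i][0]
            pos[i][1] += step * forces[i][1]
    return pos


if __name__ == "__main__":
    # Hypothetical intents of three concepts.
    intents = [
        frozenset({"plasma_display"}),
        frozenset({"plasma_display", "display_panel"}),
        frozenset({"plasma_display", "display_device"}),
    ]
    for intent, (x, y) in zip(intents, force_directed_layout(distance_matrix(intents))):
        print(sorted(intent), round(x, 3), round(y, 3))
```

A fixed random seed keeps the layout reproducible between runs, which matters when the overview is redrawn during navigation.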
4.1 Lattice spatialization
  Neighbours are emphasized when a concept is selected
  This suggests further navigation paths
  The intent and a simplified extent cardinality are displayed
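Finding the neighbours to emphasize for a selected concept can be sketched as follows. The sketch assumes concepts are identified by their intents and uses the standard FCA ordering, in which a more specific concept has a larger intent. The toy list `INTENTS` and both function names are hypothetical.

```python
# Hypothetical intents of a small concept lattice; not the patent data.
INTENTS = [
    frozenset(),
    frozenset({"plasma_display"}),
    frozenset({"display_panel"}),
    frozenset({"plasma_display", "display_panel"}),
    frozenset({"plasma_display", "display_device"}),
    frozenset({"plasma_display", "display_panel", "display_device"}),
]


def lower_neighbours(selected, intents):
    """Most general concepts that are strictly more specific than `selected`."""
    below = [i for i in intents if selected < i]
    # Keep a candidate only if no other candidate lies strictly between.
    return [i for i in below if not any(k < i for k in below)]


def upper_neighbours(selected, intents):
    """Most specific concepts that are strictly more general than `selected`."""
    above = [i for i in intents if i < selected]
    return [i for i in above if not any(i < k for k in above)]


if __name__ == "__main__":
    current = frozenset({"plasma_display"})
    print("refine to:", [sorted(i) for i in lower_neighbours(current, INTENTS)])
    print("generalize to:", [sorted(i) for i in upper_neighbours(current, INTENTS)])
```

The lower neighbours are the candidate refinements of the current query and the upper neighbours are its generalizations, which together give the navigation paths suggested on the overview.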
4.2 Document spatialization
  FDP using a single distance matrix for all local views
  Distance independent of the intent’s features
  Documents appear and disappear at the same position during navigation
  Figure: document / document distance matrix with entries d(o1,o2)
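The idea of one global document layout shared by all local views can be sketched as follows. The assumptions are labeled: the slide does not specify d(o1,o2), so a cosine distance on numerical weights is used as a stand-in, and the weights, the coordinates in `GLOBAL_POSITIONS` and the function names are hypothetical. In practice the coordinates would come from a force-directed placement of the whole distance matrix.

```python
import math

# Hypothetical numerical indexation weights per document; not the patent data.
WEIGHTS = {
    "d1": [0.9, 0.4, 0.0],
    "d2": [0.7, 0.0, 0.5],
    "d3": [0.6, 0.3, 0.6],
}

# Hypothetical coordinates, computed once for the whole collection.
GLOBAL_POSITIONS = {"d1": (0.0, 0.0), "d2": (1.0, 0.2), "d3": (0.6, 0.7)}


def cosine_distance(u, v):
    """1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def document_distances(weights):
    """Single document/document matrix, independent of any selected intent."""
    ids = sorted(weights)
    matrix = [[cosine_distance(weights[a], weights[b]) for b in ids] for a in ids]
    return ids, matrix


def local_view(global_positions, extent):
    """Positions of the documents in `extent`, taken unchanged from the global layout."""
    return {d: global_positions[d] for d in extent if d in global_positions}


if __name__ == "__main__":
    ids, matrix = document_distances(WEIGHTS)
    print(ids, [[round(x, 3) for x in row] for row in matrix])
    # Two local views that share d3: it keeps the same coordinates in both.
    print(local_view(GLOBAL_POSITIONS, {"d1", "d3"}))
    print(local_view(GLOBAL_POSITIONS, {"d2", "d3"}))
```

Because a local view only filters the global table, a document never moves when the selected concept changes; it is simply shown or hidden.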
Example data set
  329 patents indexed by 10 terms
  0.8 terms per patent on average
Figures: screenshots of a navigation session; intent labels visible on the successive slides:
  plasma_display
  plasma_display; plasma_display display_panel
  plasma_display display_panel
  plasma_display display_panel display_device
  plasma_display display_device
4.3 Synthesis
  Goals | Solutions
  extract the structure | formal concept extraction from indexed documents
  overview the structure | display the lattice with a semantic distance on edges
  focus on a structure element content (local view) | display the documents of a concept with respect to a semantic distance on their numerical features
  propose navigation paths through the structure | emphasize the current concept’s neighbours on the overview
5. Discussion and Perspectives
  Prototype and test on larger data sets
  Improve the lattice distance
  Visual assistance for feature selection
Using Concept Lattices for Visual Navigation Assistance in Large Databases
Application to a Patent Database
École des Mines d’Alès – http://www.ema.fr
I-Nova Company, Villeurbanne – http://www.i-nova.fr
