TagFS — Tag Semantics for Hierarchical File Systems

Finally a file can be in more than one folder! End of the tyranny of the hierarchy.

The talk describes the design and implementation of a tag-based file system. Presented at IKNOW 2006.

  • Publication available here: http://www.aifb.uni-karlsruhe.de/Publikationen/showPublikation?publ_id=1140

Transcript

  • 1. TagFS — Tag Semantics for Hierarchical File Systems. Bonus Track: Introducing SemFS
    • Stephan Bloehdorn, Institute AIFB, University of Karlsruhe, Germany
    • Olaf Görlitz, ISWeb, University of Koblenz-Landau, Germany
    • Simon Schenk, ISWeb, University of Koblenz-Landau, Germany
    • Max Völkel, Forschungszentrum Informatik, Karlsruhe, Germany
    • Talk given by Max Völkel, 2006
  • 2. Motivation
    • "Every user [...] indicated that their attempts to establish elaborate filing schemas for archived information failed because they proved to require more time and effort than the information was worth." (Barreau and Nardi, 1995)
    • Contributions
      • Mapping file system semantics to tagging semantics
      • Architecture for a semantic file system
  • 3. Hierarchical file systems have some problems
    • Single location property
      • E.g. where to put a song produced by two artists?
      • Each file may be only in one folder
        • There are links/shortcuts, but where to put the „primary“ file?
    • Browsing to maximum specificity
      • E.g. 5 clicks for /My Music/Fatboy Slim/2006/danceable/favourite even if there are only 5 Fatboy Slim songs altogether
      • The more you organise, the more you have to browse
    • Missing orthogonality
      • E.g. /2003/Fatboy Slim/favourite or /favourite/Fatboy Slim/2003 ?
      • Many dimensions, only one access path
    • No query refinement
      • The FS lists only the directories explicitly placed there and offers no help in narrowing down a search
  • 4. Tagging
    • Simple idea:
      • Instead of putting resources in folders (nested containers), put tags (labels) on resources
    • Tagging user interfaces
      • User sees a resource, can type in tags (simple, single keywords, separated by space or comma)
      • User can click on a tag, UI lists all resources with that tag assigned
      • Conjunctive queries: e.g. fatboyslim+favourite
    • Examples: del.icio.us, flickr, 100 more
      • Who has used a tagging system?
    • New with Web 2.0
  • 5. Example: del.icio.us (tagging, browsing, queries)
  • 6. Comparison
    • File system: partition of the address space
    • Tagging: overlapping sets
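    [Figure: Venn diagram of the tag sets a, b and c with the regions a+b, a+c, b+c and a+b+c; a folder tree /a, /a/b, /a/c; and the tree of all tag paths /a, /a/b, /a/b/c, /a/c, /a/c/b, /b, /b/a, /b/a/c, /b/c, /b/c/a, /c, /c/a, /c/a/b, /c/b, /c/b/a]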
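The path tree in the comparison can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name `virtual_paths` is made up here, and it assumes that a file is reachable under every ordering of every non-empty subset of its tags.

```python
from itertools import permutations


def virtual_paths(tags):
    """List every folder path under which a file carrying `tags` is reachable."""
    paths = []
    ordered = sorted(tags)  # sort first so the output order is deterministic
    for length in range(1, len(ordered) + 1):
        for combo in permutations(ordered, length):
            paths.append("/" + "/".join(combo))
    return paths


paths = virtual_paths({"a", "b", "c"})
print(len(paths))  # 3 one-tag + 6 two-tag + 6 three-tag paths = 15
```

In a classic hierarchy the same file would live under exactly one of these paths.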
  • 7. Mapping file system semantics to tagging semantics: Query and Browse, the easy parts
    • Query
      • Use path as query, e.g. /a/b/c = query for a + b + c
    • Browsing /a
      • Contained files: all resources tagged with a (if not too many)
      • Contained folders: all tags b for which the conjunctive query a+b is not empty
      • Any tag is a good starting point!
      • Note: Virtual directory views are computed at runtime
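The query and browse rules on this slide can be sketched as follows. The in-memory `TAGS` table, the file names and the function names are assumptions made for the example; the real TagFS keeps its tags in a metadata graph.

```python
# Hypothetical tag assignments: file name -> set of tags.
TAGS = {
    "song1.mp3": {"fatboyslim", "favourite", "2006"},
    "song2.mp3": {"fatboyslim", "2003"},
    "song3.mp3": {"lisaekdahl", "2006"},
}


def parse_path(path):
    """Interpret a path as a set of tags; segment order does not matter."""
    return {segment for segment in path.split("/") if segment}


def query(path, tags=TAGS):
    """Contained files: all resources that carry every tag in the path."""
    wanted = parse_path(path)
    return sorted(name for name, assigned in tags.items() if wanted <= assigned)


def subfolders(path, tags=TAGS):
    """Contained folders: every further tag b whose query path+b is not empty."""
    wanted = parse_path(path)
    found = set()
    for name in query(path, tags):
        found |= tags[name] - wanted
    return sorted(found)


print(query("/fatboyslim/2006"))
print(subfolders("/fatboyslim"))
```

Both functions compute their result at call time, which mirrors the note that virtual directory views are computed at runtime.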
  • 8. Mapping file system semantics to tagging semantics: Tagging, the hard part
    • Changing existing tagging
      • Copy file from a to b = tag file with b also
      • Move file from a to b = remove tag a, add tag b
      • Delete file from a = remove tag a
      • Rename a to b = for all files tagged with a: remove a, add b
    • Add files to TagFS
      • Add file to folder a = add file to TagFS; tag file with a
      • File identity is determined by hash or filename, which allows updating a file if its content is changed externally
    • Delete files from TagFS
      • Move file to deleteMe = delete file
    • Create tag a = create folder a
      • This folder would not be shown, because it's empty
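The mapping from file operations to tag operations might look like this in code. `TagStore` and its method names are invented for illustration, and files are identified by a plain string rather than by hash.

```python
class TagStore:
    """Hypothetical in-memory store: file id -> set of tags."""

    DELETE_FOLDER = "deleteMe"  # special folder named on the slide

    def __init__(self):
        self.tags = {}

    def add(self, file_id, folder):
        # Add file to folder a = add file to the store, tag it with a.
        self.tags.setdefault(file_id, set()).add(folder)

    def copy(self, file_id, src, dst):
        # Copy from a to b = tag the file with b as well; a stays.
        self.tags[file_id].add(dst)

    def move(self, file_id, src, dst):
        if dst == self.DELETE_FOLDER:
            # Moving a file to deleteMe removes it from the store entirely.
            del self.tags[file_id]
            return
        # Move from a to b = remove tag a, add tag b.
        self.tags[file_id].discard(src)
        self.tags[file_id].add(dst)

    def delete_from(self, file_id, folder):
        # Delete from a = remove tag a only; the file itself survives.
        self.tags[file_id].discard(folder)

    def rename(self, old, new):
        # Rename a to b = swap the tag on every file that carries a.
        for assigned in self.tags.values():
            if old in assigned:
                assigned.remove(old)
                assigned.add(new)


store = TagStore()
store.add("song.mp3", "fatboyslim")
store.copy("song.mp3", "fatboyslim", "favourite")
store.rename("favourite", "best")
print(sorted(store.tags["song.mp3"]))
```

Note that `delete_from` can leave a file with no tags at all; how such a file is found again is not covered by the slide and is left open here.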
  • 9. Tagging in the file system
    • Many location property
      • Each file may carry as many tags as desired
    • Browsing until result is small enough
      • Each folder contains all files tagged with the folder name
    • Orthogonality of information dimensions
      • Path interpreted as a conjunctive query, e.g. Lisa Ekdahl/2006 is the same as 2006/Lisa Ekdahl
      • Not all reviewers agreed on this
    • Query refinement
      • Each folder lists useful query refinements as sub-folders
    • Providing all standard filesystem operations
      • Compatibility with existing applications
  • 10. Introducing SemFS
    • Implementing TagFS with SemFS
    [Slide labels: Web 2.0, Web 3.0, Semantic Desktop, RDF, keywords, Bonus Track]
  • 11. What is a file system?
  • 12. What is a file system? Organising files …
    • An address is given as a path expression := Letter ":" ( "\" name )*
    • [Figure: a "dir" request goes to the file system; the response lists files and folders with name and metadata]
  • 13. What is a file system? … managing binary data
    • [Figure: file system holding metadata and data; write stores 1011, read returns 1011]
  • 14. What is a file system? … managing binary data
    • [Figure: rename operation on a file system holding metadata and data]
  • 15. What is a file system? … managing binary data
    • [Figure: add file or delete file on a file system holding metadata and data]
  • 16. What is a file system? Organising files and managing binary data
    • View(path) → Metadata-Table
    • Contained files,
    • Contained directories,
    • Metadata: getName, getDate, …
    • Binary content irrelevant
    • Update
    • Metadata: setName, setDate, add file, delete file, move file
    • Binary+Metadata: read file, write file, trunc file
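For an ordinary file system, the View(path) → Metadata-Table operation can be approximated with Python's standard library. The function name `view` and the choice of columns are assumptions for this sketch; only the metadata is read, the binary content is never opened.

```python
import os
from datetime import datetime, timezone


def view(path):
    """Return one metadata row per entry in the directory `path`."""
    rows = []
    with os.scandir(path) as entries:
        for entry in entries:
            info = entry.stat()
            rows.append({
                "name": entry.name,            # getName
                "is_dir": entry.is_dir(),      # contained directory or file
                "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),  # getDate
                "size": info.st_size,
            })
    # Sort by name so the table is stable across calls.
    return sorted(rows, key=lambda row: row["name"])
```

Calling `view(".")` lists the current directory as a table of dictionaries, which is the shape a virtual or semantic file system would have to produce for its own paths.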
  • 17. What is a virtual file system? Organising files and managing binary data
    • Looks and behaves like a file system, but is not one: it is implemented differently
    • [Figure: operations go to the virtual file system, which holds metadata and data and sits in front of X, Flickr, a CMS, or a file system]
  • 18. What is a semantic file system? Organising files and managing binary data
    • Flexible implementation
    • Unified metadata → unified search
    • [Figure: operations go to the semantic file system, which holds metadata and data and sits in front of X, Flickr, a CMS, or a file system]
  • 19. Architecture of SemFS: Filters
    • Input: Frontend gets path
    • SemFS maintains a list of filters
      • Formally: Filter(graph, path) → (filtered graph, shorter path)
    • Each filter is asked sequentially to process a path
    • A filter consumes some parts of a path, and returns a filtered metadata graph
      • Filter can delegate to a particular filter or delegate back to filter chain
    • Output: Finally the filtered metadata graph is transformed to a directory view
  • 20. Architecture of SemFS: Example for Filters
    • Filter list: [Favourite, Artist, Main]
    • Main-Filter processes "My Music" and returns a filtered graph containing all music resources. Delegates to Artist-Filter.
    • Artist-Filter processes "Fatboy Slim" and returns all resources that have, e.g., the dc:creator "Fatboy Slim". Delegates to filter chain.
    • Favourite-Filter processes "favourite" and returns all resources accessed more than n times in the last x days
    [Figure: remaining path at each filter. Main: My Music/Fatboy Slim/favourite; Artist: Fatboy Slim/favourite; Favourite: favourite]
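The filter chain for this example can be sketched in plain Python. Several things here are assumptions rather than SemFS code: the metadata graph is modelled as a list of dictionaries instead of RDF, "favourite" is simplified to a play-count threshold, the resource data is made up, and each filter returns `None` when it cannot handle the next path segment.

```python
# Made-up resources standing in for the RDF metadata graph.
RESOURCES = [
    {"name": "track1.mp3", "type": "music", "creator": "Fatboy Slim", "plays": 12},
    {"name": "track2.mp3", "type": "music", "creator": "Fatboy Slim", "plays": 1},
    {"name": "track3.mp3", "type": "music", "creator": "Lisa Ekdahl", "plays": 9},
    {"name": "notes.txt", "type": "text", "creator": "Max", "plays": 20},
]

# Each filter maps (graph, path) to (filtered graph, shorter path, delegate),
# or returns None if it does not understand the first path segment.


def favourite_filter(graph, path):
    if path[0] != "favourite":
        return None
    return [r for r in graph if r["plays"] > 5], path[1:], None


def artist_filter(graph, path):
    matches = [r for r in graph if r["creator"] == path[0]]
    if not matches:
        return None
    return matches, path[1:], None  # None = delegate back to the chain


def main_filter(graph, path):
    if path[0] != "My Music":
        return None
    music = [r for r in graph if r["type"] == "music"]
    return music, path[1:], artist_filter  # delegate to a particular filter


def resolve(graph, path, filters):
    """Ask filters in order until the whole path has been consumed."""
    delegate = None
    while path:
        candidates = ([delegate] if delegate else []) + filters
        for candidate in candidates:
            result = candidate(graph, path)
            if result is not None:
                graph, path, delegate = result
                break
        else:
            raise FileNotFoundError("/".join(path))
    return graph


chain = [favourite_filter, artist_filter, main_filter]
result = resolve(RESOURCES, ["My Music", "Fatboy Slim", "favourite"], chain)
print([r["name"] for r in result])
```

The returned list is the filtered graph that would then be transformed into a directory view.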
  • 21. Architecture of SemFS: Example for Filters
    • Filter list: [Favourite, Artist, Main]
    • Main-Filter processes "My Music" and returns a filtered graph containing all music resources. Delegates to Artist-Filter.
    • Artist-Filter processes "Fatboy Slim" and returns all resources that have the rdfs:label "Fatboy Slim". Delegates to filter chain.
    • Favourite-Filter processes "favourite" and returns all resources accessed more than n times in the last x days
    [Figure: remaining path at each filter. Main: My Music/Fatboy Slim/favourite; Artist: Fatboy Slim/favourite; Favourite: favourite]
    • View(path) → Metadata-Table
    • Contained files,
    • Contained directories,
    • Metadata: getName, getDate, …
  • 22. Architecture of SemFS: Class-Handlers
    • Different content-types have different meta-data and different ways to read/write
      • E.g. local bookmarks, remote bookmarks, images, mp3 files, …
    • Idea: access each class of content types via a Class-Handler
    • Metadata: getName, getDate, …
      • Can be cached in metadata graph
    • Updates
      • Metadata: setName, setDate, add file, delete file, move file
      • Binary+Metadata: read file, write file, trunc file
      • Other metadata operations, e.g. setGenre in ID3-tags
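One way to picture class handlers is an abstract interface with one implementation per class of content types. All names below (`ClassHandler`, `BlobHandler`, `BookmarkHandler`, `handler_for`) are hypothetical and only cover a small part of the operations listed on the slide.

```python
from abc import ABC, abstractmethod


class ClassHandler(ABC):
    """Hypothetical interface: metadata access plus binary read/write."""

    @abstractmethod
    def get_name(self, resource: dict) -> str: ...

    @abstractmethod
    def read(self, resource: dict) -> bytes: ...

    @abstractmethod
    def write(self, resource: dict, data: bytes) -> None: ...


class BlobHandler(ClassHandler):
    """Keeps binary content in a dict, standing in for a classic file store."""

    def __init__(self):
        self.blobs = {}

    def get_name(self, resource):
        return resource["name"]

    def read(self, resource):
        return self.blobs.get(resource["id"], b"")

    def write(self, resource, data):
        self.blobs[resource["id"]] = data


class BookmarkHandler(ClassHandler):
    """Renders a bookmark as a small text file that contains its URL."""

    def get_name(self, resource):
        return resource["title"] + ".url"

    def read(self, resource):
        return resource["url"].encode("utf-8")

    def write(self, resource, data):
        resource["url"] = data.decode("utf-8").strip()


# One handler instance per content class.
HANDLERS = {"mp3": BlobHandler(), "bookmark": BookmarkHandler()}


def handler_for(resource):
    """Pick the handler responsible for the resource's content class."""
    return HANDLERS[resource["class"]]
```

The frontend can then call `handler_for(resource).read(resource)` without knowing whether the bytes come from a file store or are generated from metadata.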
  • 23. Architecture of SemFS: Summary: Metadata Graph, Filters, Class-Handlers
    • Metadata graph (RDF)
      • Holds all metadata
      • Acts as a unifying cache when content has its own metadata
    • Filters
      • Create the view, organise resources (not limited to files)
    • Class Handlers
      • Handle different content types and manage binary data
  • 24. Using SemFS: A tagging file system (TagFS)
    • One Filter: Tag-Filter
      • Consumes a single path segment a
        • Contained files: Returns all resources tagged with a
        • Contained folders: all tags b for which the conjunctive query a+b is not empty
    • One Class Handler
      • Delegates classic metadata queries to underlying file store
      • Binary content resides in a folder of a classic file system
      • Tag-queries handled by metadata graph
    • New problems
      • No two files may have the same name (when rendered)
        • Rename at insertion or display time
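Renaming at display time could work roughly like this. The numbering scheme and the function name `display_names` are assumptions; the slides do not say how TagFS actually resolves name clashes.

```python
import os


def display_names(names):
    """Make names unique within one virtual folder by numbering repeats."""
    seen = {}
    rendered = []
    for name in names:
        count = seen.get(name, 0)
        seen[name] = count + 1
        if count == 0:
            rendered.append(name)
        else:
            # Insert the counter before the extension: song.mp3 -> song (1).mp3
            stem, ext = os.path.splitext(name)
            rendered.append(f"{stem} ({count}){ext}")
    return rendered


print(display_names(["song.mp3", "song.mp3", "cover.jpg", "song.mp3"]))
```

This sketch does not guard against a file that is already called `song (1).mp3`; a fuller version would have to check generated names against the existing ones too.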
  • 25. Conclusions
    • TagFS
    • Many location property
    • Browsing only until result is small enough, no further
    • Orthogonality of information dimensions
    • Query refinement
    • Compatible with existing applications
    • SemFS
    • An efficient way to implement virtual file systems with pluggable semantics
    • Allows rapid and solid prototype creation
    • Interoperability with Semantic Desktop applications
    • Download the prototype at http://isweb.uni-koblenz.de/Research (Linux only; WebDAV/Windows in progress)
    • Thank you. Questions?
  • 26. BACKUP
  • 27. Contact Information
    • Max Völkel (author, presenter)
      • Forschungszentrum Informatik, Karlsruhe, Germany
      • http://www.fzi.de
    • Stephan Bloehdorn (author)
      • Institute AIFB, University of Karlsruhe, Germany
      • http://aifb.uni-karlsruhe.de/
    • Olaf Görlitz (author) and Simon Schenk (author)
      • ISWeb, University of Koblenz-Landau, Germany
      • http://isweb.uni-koblenz.de/
  • 28. Architecture
  • 29. Tagging Ontology