Your SlideShare is downloading. ×
0
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Python Fuse
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Python Fuse

7,708

Published on

Python FUSE, write file-systems in few lines of code.

Python FUSE, write file-systems in few lines of code.

Published in: Technology
0 Comments
10 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
7,708
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
130
Comments
0
Likes
10
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Python FUSE File-System in USErspace Beyond the Traditional File-SystemsMatteo Bertozzi (Th30z) http://th30z.blogspot.com
  • 2. Talk OverviewWhat is a File-SystemBrief File-Systems HistoryWhat is FUSEBeyond the Traditional File-SystemAPI OverviewExamples (Finally some code!!!)http://mbertozzi.develer.com/python-fuseQ&A
  • 3. What is a File-SystemIs a Method of storing and organizing data to make it easy to find and access....to interact with an object You name it, and you say what you want it do. The Filesystem takes the name you give Looks through disk to find the object Gives the object your request to do something.
  • 4. What is a File-SystemOn Disk Format (...serialized struct)ext2, ext3, reiserfs, btrfs...Namespace(Mapping between name and content)/home/th30z/, /usr/local/share/test.c, ...Runtime Service: open(), read(), write(), ...
  • 5. (The Origins) ...A bit of HistoryMultics 1965 (File-System Paper)A General-Purpose File System For Secondary StorageUnix Late 1969 User Program User Space Kernel Space System Call Layer The File-System Only One File-System
  • 6. (The Evolution) ...A bit of HistoryMultics 1965 (File-System Paper)A General-Purpose File System For Secondary StorageUnix Late 1969 User Program User Space Kernel Space System Call Layer (which?) The File-System 1 The File-System 2
  • 7. (The Evolution) ...A bit of HistoryMultics 1965 (File-System Paper)A General-Purpose File System For Secondary StorageUnix Late 1969 User Program User Space Kernel Space System Call Layer (which?) FS 1 FS 2 FS 3 FS 4 ... FS N
  • 8. (The Solution) ...A bit of HistoryMultics 1965 (File-System Paper)A General-Purpose File System For Secondary StorageUnix Late 1969Sun Microsystem 1984 User Program User Space Kernel Space System Call Layer Vnode/VFS Layer FS 1 FS 2 FS 3 FS 4 ... FS N
  • 9. Virtual File-SystemProvides an abstraction within thekernel which allows different C Libraryfilesystem implementations to coexist. (open(), read(), write(), ...) User SpaceProvides the filesystem interface to Kernel Spaceuserspace programs. System Calls (sys_open(), sys_read(), ...) VFS Concepts VFS (vfs_read(), vfs_write(), ...)A super-block object represents afilesystem. Kernel ext2 ReiserFS XFS SupportedI-Nodes are filesystem objects such File-Systems ext3 Reiser4 JFSas regular files, directories, FIFOs, ... ext4 Btrfs HFS+A file object represents a file opened ... ... ...by a process.
  • 10. Wow, It seems not muchdifficult writing a filesystem
  • 11. Why File-System are ComplexYou need to know the Kernel (No helper libraries: Qt, Glib, ...)Reduce Disk Seeks / SSD Block limited write cyclesBe consistent, Power Down, Unfinished Write...(Journal, Soft-Updates, Copy-on-Write, ...)Bad Blocks, Disk ErrorDont waste to much space for MetadataExtra Features: Deduplication, Compression,Cryptography, Snapshots…
  • 12. File-Systems Lines of Code MinixFS 2,000MinixFS Fuse 800 UFS 2,000 UFS Fuse 1,000 FAT 6,000 ext2 8,000 ext3 16,000 ReiserFS 27,000 ext4 30,000 btrfs 50,000 NFS 10,000 8,000 Fuse 7,000 9,000 FtpFS 800 SshFS 2,000 Kernel Space User Space
  • 13. Building a File-System is DifficultWriting good code is not easy (Bugs, Typo, ...)Writing good code in Kernel Space is muchmore difficult.Too many reboots during the developmentToo many Kernel Panic during RebootWe need more flexibility and Speedups!
  • 14. FUSE, develop your file-systemwith your favorite language and library in user space
  • 15. What is FUSEKernel module! (like ext2, ReiserFS, XFS, ...)Allows non-privileged user to create their own file-system without editing the kernel code. (User Space)FUSE is particularly useful for writing "virtual filesystems", that act as a view or translation of anexisting file-system storage device. (Facilitate Disk-Based, Network-Based and Pseudo File-System)Bindings: Python, Objective-C, Ruby, Java, C#, ...
  • 16. File-Systems in User Space? ...Make File Systems Development Super EasyAll UserSpace Libraries are Available...Debugging ToolsNo Kernel RecompilationNo Machine Reboot!...File-System upgrade/fix2 sec downtime, app restart!
  • 17. Yeah, ...but what’s FUSE?It’s a File-System with user-space callbacks ntfs-3g sshfs ifuse ChrionFS zfs-fuse YouTubeFS gnome-vfs2 ftpfs cryptoFS RaleighFS U n i x
  • 18. FUSE Kernel Space and User Space Your FS The FUSE kernel module SshFS lib Fuse and the FUSE library FtpFS communicate via a ... User Space special file descriptor /dev/fuse Kernel Space which is obtained by FUSE opening /dev/fuse ext2 ... VFS ext4 drivers Your Fuse FS ... firmware User Input kernel Btrfs ls -l /myfuse/ lib FUSEKernel VFS FUSE
  • 19. ...be creativeBeyond the Traditional File-SystemsImapFS: Access to yourmail with grep. Thousand of tools availableSocialFS: Log all your cat/grep/sedsocial network to collectnews/jokes and othersocial things. open() is the most usedYouTubeFS: Watch YouTube function in ourvideo as on your disk. applicationsGMailFS: Use your mailboxas backup disk.
  • 20. FUSE API Overviewcreate(path, mode) mkdir(path, mode)truncate(path, size) unlink(path)mknod(path, mode, dev) readdir(path)open(path, mode) rmdir(path)write(path, data, offset) rename(opath, npath)read(path, length, offset) link(srcpath, dstpath)release(path) Your Fuse FSfsync(path) User Input ls -l /myfuse/ lib FUSEchmod(path, mode)chown(path, oid, gid) Kernel VFS FUSE
  • 21. (File Operations) FUSE API Overview Reading Appendingcat /myfuse/test.txt Writing echo World >> /myfuse/test2.txt Truncating getattr() echo Hello > /myfuse/test2.txt getattr() echo Woo > /myfuse/test2.txt getattr() getattr() open() open() create() truncate() read() write() write() open() read() flush() flush() write() release() release() release() flush() release() Removing getattr() unlink() rm /myfuse/test.txt
  • 22. (Directory Operations) FUSE API Overview Creating Removing mkdir /myfuse/folder Reading rmdir /myfuse/folder getattr() ls /myfuse/folder/ getattr() getattr() mkdir() rmdir() opendir() readdir() releasedir() Other Methods (getattr() is always called) chown th30z:develer /myfuse/test.txt getattr() -> chown() chmod 755 /myfuse/test.txt getattr() -> chmod()ln -s /myfuse/test.txt /myfuse/test-link.txt getattr() -> symlink()mv /myfuse/folder /myfuse/fancy-folder getattr() -> rename()
  • 23. First Code Example! HTFS (HashTable File-System)
  • 24. HTFS Overview FS Item/Object Traditional Filesystem Time of last access Metadata Object with Metadata Time of last modification Time of last status change (mode, uid, gid, ...) Protection and file-type (mode) User ID of owner (UID) HashTable (dict) keys are Group ID of owner (GID) paths values are Items. Extended Attributes (Key/Value) Data Item 1Path 1Path 2 Item can be a Regular File orPath 3 Item 2 Directory or FIFO...Path 4 Item 3 Data is raw data or filenamePath 5 Item 4 list if item is a directory.(Disk - Storage HashTable)
  • 25. HTFS Itemclass Item(object): def __init__(self, mode, uid, gid): # ----------------------------------- Metadata -- self.atime = time.time() # time of last acces self.mtime = self.atime # time of last modification self.ctime = self.atime # time of last status change self.mode = mode # protection and file-type self.uid = uid # user ID of owner self.gid = gid # group ID of owner # Extended Attributes self.xattr = {} # --- Data ----------- This is a File! if stat.S_ISDIR(mode): we’ve metadata self.data = set() else: data and even xattr self.data =
  • 26. (Data Helper) HTFS Itemdef read(self, offset, length): return self.data[offset:offset+length]def write(self, offset, data): length = len(data) self.data = self.data[:offset] + data + self.data[offset+length:] return lengthdef truncate(self, length): if len(self.data) > length: self.data = self.data[:length] else: self.data += x00 * (length - len(self.data)) ...a couple of utility methods to read/write and interact with data.
  • 27. HTFS Fuse Operationsclass HTFS(fuse.Fuse): def __init__(self, *args, **kwargs): getattr() is called before fuse.Fuse.__init__(self, *args, **kwargs) any operation. Tells to the VFS if you can access self.uid = os.getuid() self.gid = os.getgid() to the specified file and the “State”. root_dir = Item(0755 | stat.S_IFDIR, self.uid, self.gid) self._storage = {/: root_dir} def getattr(self, path): if not path in self._storage: return -errno.ENOENTFile-System must be initialized with # Lookup Item and fill the stat struct the / directory item = self._storage[path] st = zstat(fuse.Stat()) def main(): st.st_mode = item.mode st.st_uid = item.uid server = HTFS() st.st_gid = item.gid server.main() st.st_atime = item.atime st.st_mtime = item.mtime Your FUSE File-System st.st_ctime = item.ctime st.st_size = len(item.data) is like a Server... return st
  • 28. (File Operations) HTFS Fuse Operationsdef create(self, path, flags, mode): self._storage[path] = Item(mode | stat.S_IFREG, self.uid, self.gid) self._add_to_parent_dir(path)def truncate(self, path, len): self._storage[path].truncate(len)def read(self, path, size, offset): return self._storage[path].read(offset, size)def write(self, path, buf, offset): return self._storage[path].write(offset, buf) def unlink(self, path): self._remove_from_parent_dir(path)Disk is just a big dictionary... del self._storage[path] ...and files are items def rename(self, oldpath, newpath): key = name item = self._storage.pop(oldpath) self._storage[newpath] = item value = data
  • 29. (Directory Operations) HTFS Fuse Operationsdef mkdir(self, path, mode): self._storage[path] = Item(mode | stat.S_IFDIR, self.uid, self.gid) self._add_to_parent_dir(path)def rmdir(self, path): self._remove_from_parent_dir(path) del self._storage[path] Directory is a File that containsdef readdir(self, path, offset): dir_items = self._storage[path].data File names for item in dir_items: yield fuse.Direntry(item) as data!def _add_to_parent_dir(self, path): parent_path = os.path.dirname(path) filename = os.path.basename(path) self._storage[parent_path].data.add(filename)
  • 30. (XAttr Operations) HTFS Fuse Operationsdef setxattr(self, path, name, value, flags): self._storage[path].xattr[name] = valuedef getxattr(self, path, name, size): value = self._storage[path].xattr.get(name, ) if size == 0: # We are asked for size of the value return len(value) return value Extended attributes extenddef listxattr(self, path, size): the basic attributes attrs = self._storage[path].xattr.keys() associated with files and if size == 0: directories in the file system. return len(attrs) + len(.join(attrs)) They are stored as name:data return attrs pairs associated with filedef removexattr(self, path, name): system objects if name in self._storage[path].xattr: del self._storage[path].xattr[name]
  • 31. (Other Operations) HTFS Fuse Operations Lookup Item, def chmod(self, path, mode): item = self._storage[path] Access to its item.mode = modeinformation/data return or write it. def chown(self, path, uid, gid): item = self._storage[path] This is the item.uid = uid File-System’s Job item.gid = gid def symlink(self, path, newpath): item = Item(0644 | stat.S_IFLNK, self.uid, self.gid) item.data = path self._storage[newpath] = item self._add_to_parent_dir(newpath) Symlinks contains just def readlink(self, path): pointed file path. return self._storage[path].data
  • 32. Other small Examples
  • 33. Simulate Tera Byte Filesclass TBFS(fuse.Fuse): Read-Only FS def getattr(self, path): with 1 file st = zstat(fuse.Stat()) if path == /: of 128TiB st.st_mode = 0644 | stat.S_IFDIR st.st_size = 1 return st elif path == /tera.data: No st.st_mode = 0644 | stat.S_IFREG st.st_size = 128 * (2 ** 40) Disk/RAM Space return st return -errno.ENOENT Required! def read(self, path, size, offset): return 0 * size read() Send data only def readdir(self, path, offset): if path == /: when is requested yield fuse.Direntry(tera.data)
  • 34. X^OR File-Systemdef _xorData(data): data = [chr(ord(c) ^ 10) for c in data] 10101010 ^ return string.join(data, “”) 01010101 =class XorFS(fuse.Fuse): --------- ... 11111111 ^ def write(self, path, buf, offset): data = _xorData(buf) 01010101 = return _writeData(path, offset, data) --------- def read(self, path, length, offset): 10101010 data = _readData(path, offset, length) return _xorData(data) ... res = _xorData(“xor”) print res // “rex” res2 = _xorData(res) print res // “xor”
  • 35. Dup Write File-Systemclass DupFS(fuse.Fuse): def __init__(self, *args, **kwargs): Write on your Disk ... fd_disk1 = open(‘/dev/hda1’, ...) partition 1 and 2. fd_disk2 = open(‘/dev/hdb5’, ...) fd_log = open(‘/home/th30z/testfs.log’, ...) fd_net = socket.socket(...) Send data ... ... over Network def write(self, path, buf, offset): ... disk_write(fd_disk1, path, offset, buf) Log your disk_write(fd_disk2, path, offset, buf) file-system net_write(fd_net, path, offset, buf) log_write(fd_log, path, offset, buf) operations ... ...do other fancy stuff
  • 36. One more thing
  • 37. (File and Folders doesn’t fit)Rethink the File-System I dont’t know where I’ve to place this file... ...Ok, for now Desktop is a good place...
  • 38. (Mobile/Home Devices)Rethink the File-System Small Devices Small Files EMails, Text... We need to lookup quickly our data. Tags, Full-Text Search... ...Encourage people to view their content as objects.
  • 39. (Large Clusters, The Cloud...) Rethink the File-SystemDistributed data Scalability Fail over Cluster Rebalancing
  • 40. Q&A Python FUSEMatteo Bertozzi (Th30z) http://th30z.blogspot.com

×