9P Overview

Overview of the 9P Protocol

Transcript

  • 1. IBM Research 9P Overview Eric Van Hensbergen IBM Austin Research Lab (bergevan@us.ibm.com) © 2010 IBM Corporation
  • 2. Agenda
      • Historical Background (Plan 9 & Inferno)
      • 9P Protocol Basics
      • Extensions
      • Linux Client Code Overview
  • 3. Historical Background
      • Plan 9 from Bell Labs was a distributed operating system developed as a successor to UNIX starting in the mid-1980s.
      • Primary motivation for Plan 9 was to rethink operating systems in light of pervasive networking (networking was added as an afterthought to the original UNIX).
      • Plan 9 resources were scattered across a cluster of machines, with each machine having a role (Terminal, CPU Server, Auth Server, File Server).
      • Inferno was a commercial venture based on Plan 9 which provided Plan 9's environment tightly coupled with a virtual machine, on both native and hosted (Linux, BSD, Windows) platforms.
  • 4. Plan 9 Trivia
      • Supported multiple hosts, but only 32-bit
          • x86, MIPS, Alpha, SPARC, PowerPC, ARM
      • Native support for UTF-8 from inception
      • Own tool set (Ken Thompson's C compilers)
      • Some kernel stats
          • 37 syscalls
          • 178,738 lines of code amongst all ports (38k lines portable)
          • optional real-time scheduler
      • User development environment primarily C and Alef
      • ANSI/POSIX emulation environment available
      • Open sourced (Lucent Public License 1.02)
  • 5. Plan 9 Core Design Concepts
      • All resources represented as file hierarchies
          • System resources: processes, devices, networking stack
          • System services: DNS, window system, plumbing
          • Application services: editor interfaces, plumbing
      • Namespaces
          • private, per-process by default
          • user manipulatable
          • bind and union directories
      • Standard communication protocol
          • a standard protocol, 9P, used to access both local and remote resources
  • 6. Implication of Design Concepts
      • Since all resources were exposed as file hierarchies and remote hierarchies could be accessed via 9P
          • remote resources could be accessed as easily as local ones (audio, graphics, network) without specialized protocols for each
      • Since namespaces were private and per-process
          • individual users could compose namespaces of local and remote resources, and subsequent applications could access those resources transparently
          • individual applications could do this as well without affecting other applications (each window in the window manager had its own namespace)
  • 7. 9P Protocol Basics
      • Based around core Plan 9 system call I/O operations
          • Local operations degrade to function calls
          • Remote operations are closer to proxy operations
      • Pure request/response RPC model
      • Transport independent
          • only requires a reliable, in-order delivery mechanism
          • can be secured with authentication, encryption, & digesting
      • By default, requests are non-cached, avoiding coherence problems and race conditions
      • Design stresses keeping things simple, resulting in small and efficient clients and servers
  • 8. 9P Protocol Terms and Structures
      • tag - numeric identifier for multiplexing operations
      • fid - numeric identifier for file system entities
          • represent a transient position in the file system (directories or files)
          • also represent open files
          • transient fids can be navigated or queried for metadata; open fids can only be used for operations (read, write, close)
      • qids
          • qid.type: type of qid (directory, file, etc.)
          • qid.path: unique per-entity identifier
          • qid.version: monotonically increasing file version
      • stat - metadata structure (directories or files)
      • strings - always size prefixed
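The two simplest structures on this slide, size-prefixed strings and qids, can be sketched in a few lines of Python. This is a minimal illustration rather than code from the talk: it assumes the 9P2000 wire conventions (little-endian integers, a 2-byte string length, a 13-byte qid laid out as type[1] version[4] path[8]), and the helper names are made up for the example.

```python
import struct

QTDIR = 0x80  # qid.type bit that marks a directory in 9P2000


def pack_string(text: str) -> bytes:
    """Encode a 9P string: 2-byte little-endian length, then UTF-8 bytes."""
    raw = text.encode("utf-8")
    return struct.pack("<H", len(raw)) + raw


def unpack_string(buf: bytes, offset: int = 0):
    """Decode a 9P string at offset; return (text, offset after the string)."""
    (length,) = struct.unpack_from("<H", buf, offset)
    start = offset + 2
    return buf[start:start + length].decode("utf-8"), start + length


def unpack_qid(buf: bytes, offset: int = 0):
    """Decode a 13-byte qid; return (dict, offset after the qid)."""
    qtype, version, path = struct.unpack_from("<BIQ", buf, offset)
    qid = {"type": qtype, "version": version, "path": path}
    return qid, offset + 13


def is_directory(qid: dict) -> bool:
    """True when the qid's type field has the directory bit set."""
    return bool(qid["type"] & QTDIR)
```

A decoder for larger messages would chain these helpers, passing the returned offset from one field to the next.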
  • 9. 9P Basics: Protocol Overview
      • General message layout: size op tag ...
          • tag: numeric transaction id for multiplexing
      • Twrite request: size Twrite tag fid offset count data
          • fid: numeric pointer to a path element or open file
      • Rwrite response: size Rwrite tag count
      • Protocol specification available: http://ericvh.github.com/9p-rfc/
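As a rough sketch of the framing shown on this slide, the Python below builds a Twrite and parses an Rwrite. It assumes the 9P2000 field widths (size[4] type[1] tag[2] fid[4] offset[8] count[4], all little-endian) and the 9P2000 type numbers 118 and 119; the function names are illustrative only.

```python
import struct

TWRITE, RWRITE = 118, 119  # 9P2000 message type numbers


def pack_twrite(tag: int, fid: int, offset: int, data: bytes) -> bytes:
    """Build size[4] Twrite tag[2] fid[4] offset[8] count[4] data[count]."""
    body = struct.pack("<BHIQI", TWRITE, tag, fid, offset, len(data)) + data
    # The size field counts the whole message, including its own 4 bytes.
    return struct.pack("<I", len(body) + 4) + body


def unpack_rwrite(msg: bytes):
    """Parse size[4] Rwrite tag[2] count[4]; return (tag, count)."""
    size, op, tag, count = struct.unpack("<IBHI", msg)
    if size != len(msg) or op != RWRITE:
        raise ValueError("not a well-formed Rwrite")
    return tag, count


request = pack_twrite(tag=1, fid=5, offset=0, data=b"hello")
```

The reply carries the same tag as the request, which is what lets a client keep several requests outstanding on one connection.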
  • 10. 9P Basics: Operations
      • Session Management
          • Version: protocol version and capabilities negotiation
          • Attach: user identification and session option negotiation
          • Auth: user authentication enablement
          • Walk: hierarchy traversal and transaction management
          • Clunk: forget about a fid
      • Error Management
          • Error: a pending request triggered an error
          • Flush: cancel a pending request
      • Metadata Management
          • Stat: retrieve file metadata
          • Wstat: write file metadata
      • File I/O
          • Create: atomic create/open
          • Open, Read, Write, Close
          • Directory read packaged w/read operation (reads stat information with file list)
          • Remove
  • 11. version
      • Request: size Tversion tag msize version
      • Response: size Rversion tag msize version
      • Initial tag is always (ushort)~0
      • msize defines the maximum length in bytes of any single 9P message.
      • The version string (size prefixed) must always begin with 9P. If the server doesn't recognize it, it responds with version=unknown and the client retries until it gets a match.
      • The version of 9P is specified by the 4 characters after 9P (e.g. 9P2000).
      • Optional extensions are specified by . specifiers (9P2000.U and 9P2000.L).
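The version handshake can be sketched as follows. This is an assumption-laden illustration, not the v9fs code: it uses the 9P2000 type numbers (Tversion=100, Rversion=101), encodes (ushort)~0 as 0xFFFF, and the helper names are invented for the example.

```python
import struct

TVERSION, RVERSION = 100, 101
NOTAG = 0xFFFF  # (ushort)~0, the tag used for the version exchange


def pack_tversion(msize: int, version: str = "9P2000") -> bytes:
    """Build size[4] Tversion tag[2] msize[4] version[s]."""
    raw = version.encode("utf-8")
    body = struct.pack("<BHIH", TVERSION, NOTAG, msize, len(raw)) + raw
    return struct.pack("<I", len(body) + 4) + body


def unpack_rversion(msg: bytes):
    """Parse an Rversion; return (msize, version string)."""
    size, op, tag, msize, length = struct.unpack_from("<IBHIH", msg)
    if op != RVERSION or size != len(msg):
        raise ValueError("not a well-formed Rversion")
    return msize, msg[13:13 + length].decode("utf-8")


def accepted(version: str) -> bool:
    """A reply of 'unknown' means the server rejected the offered version."""
    return version.startswith("9P")
```

A client that receives `unknown` would typically retry with a plainer version string, for example falling back from 9P2000.u to 9P2000.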
  • 12. auth
      • Request: size Tauth tag afid uname aname
      • Response: size Rauth tag aqid
      • The user selects an afid to represent the authentication channel for a particular user (identified by uname) and attach parameter (aname).
      • The auth protocol is not defined by 9P; once it is complete, the afid is presented in a subsequent attach message.
      • The same validated afid may be used for multiple attach messages with the same uname and aname.
  • 13. attach
      • Request: size Tattach tag fid afid uname aname
      • Response: size Rattach tag qid
      • Serves as an introduction from the user to the server.
      • fid is chosen initially by the client
      • uname identifies the user to the server
      • aname identifies an attach parameter (optional)
      • afid identifies a previously negotiated authentication channel (set to (u32int)~0 if the client doesn't wish to authenticate)
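A minimal sketch of building the attach request, assuming the 9P2000 layout (fid and afid are 4 bytes each, strings carry a 2-byte length) and the 9P2000 type number 104. `NOFID` is the conventional name for (u32int)~0; the function names are illustrative.

```python
import struct

TATTACH = 104
NOFID = 0xFFFFFFFF  # (u32int)~0: the client does not wish to authenticate


def pack_string(text: str) -> bytes:
    """2-byte little-endian length followed by UTF-8 bytes."""
    raw = text.encode("utf-8")
    return struct.pack("<H", len(raw)) + raw


def pack_tattach(tag, fid, uname, aname="", afid=NOFID) -> bytes:
    """Build size[4] Tattach tag[2] fid[4] afid[4] uname[s] aname[s]."""
    body = struct.pack("<BHII", TATTACH, tag, fid, afid)
    body += pack_string(uname) + pack_string(aname)
    return struct.pack("<I", len(body) + 4) + body


# Unauthenticated attach as user "ericvh" with an empty attach parameter.
msg = pack_tattach(tag=0, fid=2, uname="ericvh")
```

If an auth exchange had been completed first, the afid used there would be passed instead of `NOFID`.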
  • 14. flush
      • Request: size Tflush tag oldtag
      • Response: size Rflush tag
      • Flush is sent to the server to cancel an outstanding operation (specified by oldtag)
      • Server always sends Rflush
      • It is permitted for the server to have already sent a response and still send Rflush
      • If the client receives a response before Rflush, it must honor the response
      • It is also permitted to flush a flush; the server must handle flush requests in order
      • A tag may not be reused until all Rflush messages for it have returned
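The client-side consequences of these rules can be sketched with a small bookkeeping class. This is a hypothetical illustration of the rules on the slide, not how v9fs implements them: the class name and methods are invented, and no messages are actually sent.

```python
class TagTable:
    """Tracks which tags are reserved while requests and flushes are pending."""

    def __init__(self):
        self.inflight = {}   # tag -> label of the original request
        self.flushes = {}    # tag of the Tflush -> oldtag it cancels
        self.completed = []  # labels of requests whose replies were honored

    def _free_tag(self):
        tag = 0
        while tag in self.inflight or tag in self.flushes:
            tag += 1
        return tag

    def send(self, label):
        tag = self._free_tag()
        self.inflight[tag] = label
        return tag

    def flush(self, oldtag):
        ftag = self._free_tag()
        self.flushes[ftag] = oldtag
        return ftag

    def on_reply(self, tag):
        if tag in self.flushes:
            # Rflush: release oldtag once no other flush still targets it.
            oldtag = self.flushes.pop(tag)
            if oldtag not in self.flushes.values():
                self.inflight.pop(oldtag, None)
            return ("flushed", oldtag)
        # A normal reply that beat the Rflush must still be honored.
        label = self.inflight[tag]
        self.completed.append(label)
        if tag not in self.flushes.values():
            del self.inflight[tag]  # no flush pending, tag is free again
        return ("reply", label)
```

The key design point is that a flushed tag stays reserved after its reply arrives, and only becomes reusable when the last Rflush for it comes back.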
  • 15. error
      • Response: size Rerror tag ename
      • Rerror is sent in response to report errors on other operations.
      • Plan 9 errors are returned as strings from the server.
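One way a client might surface Rerror is to check every reply header before decoding the body. The sketch below assumes the 9P2000 header layout (size[4] type[1] tag[2]) and type number 107 for Rerror; `NinePError` and `check_reply` are names made up for the example.

```python
import struct

RERROR = 107  # 9P2000 message type number


class NinePError(Exception):
    """Raised when the server answers a request with Rerror."""


def check_reply(msg: bytes, expected_op: int):
    """Return (tag, body) for the expected reply, or raise on Rerror."""
    size, op, tag = struct.unpack_from("<IBH", msg)
    if op == RERROR:
        # ename is a size-prefixed string right after the 7-byte header.
        (length,) = struct.unpack_from("<H", msg, 7)
        raise NinePError(msg[9:9 + length].decode("utf-8"))
    if op != expected_op:
        raise ValueError(f"unexpected message type {op}")
    return tag, msg[7:size]
```

Because the error travels as a string, the client can show it to the user as-is; mapping it to a numeric errno is an extra step that plain 9P2000 does not define.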
  • 16. walk - fid creation and navigation
      • Request: size Twalk tag fid newfid nwname wname ...
      • Response: size Rwalk tag nwqid qid ...
      • new fids are created by a walk with no name arguments (nwname=0)
          • this is also known as a 'clone' operation for historical reasons
      • walks with fid=newfid move the fid around the file system hierarchy, following the path specified by the nwname wname(s)
      • walks can both create and navigate fids (newfid is navigated)
      • partial path resolution failures return nwqid < nwname (with qids for the path elements successfully walked)
      • dot-dot (..) and dot (.) are treated specially, meaning parent directory and current directory
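The walk request and the partial-failure rule can be sketched as below. The layout and type numbers (Twalk=110, Rwalk=111, 13-byte qids) and the 16-element limit are taken from 9P2000 as assumptions; the function names are illustrative.

```python
import struct

TWALK, RWALK = 110, 111
MAXWELEM = 16  # 9P2000 limit on path elements in a single walk


def pack_twalk(tag, fid, newfid, names) -> bytes:
    """Build size[4] Twalk tag[2] fid[4] newfid[4] nwname[2] wname[s]..."""
    if len(names) > MAXWELEM:
        raise ValueError("too many path elements for one Twalk")
    body = struct.pack("<BHIIH", TWALK, tag, fid, newfid, len(names))
    for name in names:
        raw = name.encode("utf-8")
        body += struct.pack("<H", len(raw)) + raw
    return struct.pack("<I", len(body) + 4) + body


def unpack_rwalk(msg: bytes):
    """Parse an Rwalk; return (tag, list of (type, version, path) qids)."""
    size, op, tag, nwqid = struct.unpack_from("<IBHH", msg)
    if op != RWALK:
        raise ValueError("not an Rwalk")
    qids = [struct.unpack_from("<BIQ", msg, 9 + 13 * i) for i in range(nwqid)]
    return tag, qids


def walk_complete(names, qids) -> bool:
    """A walk fully succeeded only if every name produced a qid."""
    return len(qids) == len(names)


clone = pack_twalk(tag=0, fid=4, newfid=5, names=[])  # nwname=0 is a clone
```

When `walk_complete` is false, the number of qids returned tells the client which path element could not be resolved.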
  • 17. clunk - fid reclamation
      • Request: size Tclunk tag fid
      • Response: size Rclunk tag
      • sent when a fid is no longer needed; the client may reuse the fid as a newfid for other operations
      • even if clunk returns an error, the fid is no longer valid
      • typically invoked on a close, but also invoked when a transient reference is no longer needed
  • 18. Entity Operations
      • Create, Open, Read, Write, Remove, Stat, Wstat
          • basically what you would think
      • Create functions as an atomic create/open operation
      • Plan 9 has special open modes for exclusive access, append only, and temporary files.
      • No special dirread function, just open & read the directory
          • returns an integral number of stat structures, one for every file in the directory
      • Rename within a directory is accomplished with Wstat
          • non-directory renames are non-atomic
      • Read/Write include offsets in the operation
      • Wstat can selectively set attributes by using the "don't touch" flag
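Because every read carries an explicit offset, reading a whole file is a loop that advances the offset until the server returns zero bytes. The sketch below shows that loop against a stand-in for the server; `read_at` and `make_fake_file` are hypothetical helpers, and no 9P messages are exchanged.

```python
def read_all(read_at, chunk: int = 8192) -> bytes:
    """Read until a zero-length reply, advancing the offset by bytes received.

    read_at(offset, count) stands in for one Tread/Rread round trip.
    """
    offset = 0
    parts = []
    while True:
        data = read_at(offset, chunk)
        if not data:  # count 0 signals end of file
            break
        parts.append(data)
        offset += len(data)
    return b"".join(parts)


def make_fake_file(content: bytes):
    """Return a read_at function that serves slices of an in-memory file."""
    def read_at(offset, count):
        return content[offset:offset + count]
    return read_at


contents = read_all(make_fake_file(b"hello world\n"))
```

The loop advances by the number of bytes actually returned, not by the requested count, since a server may return fewer bytes than asked for.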
  • 19. 9P Packet Trace (from v9fs)
      <<< (0x8055650) Tattach tag 0 fid 2 afid -1 uname aname nuname 266594
      >>> (0x8055650) Rattach tag 0 qid (0000000000000002 48513969 'd')
      <<< (0x8055650) Twalk tag 0 fid 1 newfid 3 nwname 1 'test'
      >>> (0x8055650) Rwalk tag 0 nwqid 1 (000000000000401a 48613b9d 'd')
      <<< (0x8055650) Tstat tag 0 fid 3
      >>> (0x8055650) Rstat tag 0 'test' 'ericvh' 'root' '' q (000000000000401a 48513b9d 'd') m d777 at 1213278479 mt 1213283229 l 0 t 0 d 0 ext ''
      <<< (0x8055650) Twalk tag 0 fid 3 newfid 4 nwname 1 'hello.txt'
      >>> (0x8055650) Rwalk tag 0 nwqid 1 (000000000000401b 4851379d '')
      <<< (0x8055650) Tstat tag 0 fid 4
      >>> (0x8055650) Rstat tag 0 'hello.txt' 'ericvh' 'ericvh' '' q (000000000000401b 4851379d '') m 644 at 1213283229 mt 1213283229 l 12 t 0 d 0 ext ''
      <<< (0x8055650) Twalk tag 0 fid 4 newfid 5 nwname 0
      >>> (0x8055650) Rwalk tag 0 nwqid 0
      <<< (0x8055650) Topen tag 0 fid 5 mode 0
      >>> (0x8055650) Ropen tag 0 (000000000000401b 4851379d '') iounit 0
      <<< (0x8055650) Tstat tag 0 fid 4
      >>> (0x8055650) Rstat tag 0 'hello.txt' 'ericvh' 'ericvh' '' q (000000000000401b 4851379d '') m 644 at 1213283229 mt 1213283229 l 12 t 0 d 0 ext ''
      <<< (0x8055650) Tread tag 0 fid 5 offset 0 count 8192
      >>> (0x8055650) Rread tag 0 count 12 data 68656c6c 6f20776f 726c640a
      <<< (0x8055650) Tread tag 0 fid 5 offset 12 count 8192
      >>> (0x8055650) Rread tag 0 count 0 data
      <<< (0x8055650) Tclunk tag 0 fid 5
      >>> (0x8055650) Rclunk tag 0
      <<< (0x8055650) Tclunk tag 0 fid 4
      >>> (0x8055650) Rclunk tag 0
      <<< (0x8055650) Tclunk tag 0 fid 3
      >>> (0x8055650) Rclunk tag 0
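To read the trace, it helps to decode the hex payload of the Rread and to split each line into its parts. The helpers below are an informal sketch that assumes the line shape seen in this trace (direction marker, connection id, message name, then fields); the function names are invented for the example.

```python
def decode_trace_data(hex_words: str) -> bytes:
    """Turn the space-separated hex words after 'data' into raw bytes."""
    return bytes.fromhex(hex_words)


def parse_trace_line(line: str) -> dict:
    """Split one trace line into connection id, message name and fields."""
    _direction, conn, op, rest = line.split(None, 3)
    return {
        "conn": conn.strip("()"),
        "op": op,
        "is_request": op.startswith("T"),  # T-messages are client requests
        "fields": rest,
    }


payload = decode_trace_data("68656c6c 6f20776f 726c640a")
first_read = parse_trace_line(
    "<<< (0x8055650) Tread tag 0 fid 5 offset 0 count 8192"
)
```

The decoded payload is 12 bytes long, which agrees with both the `count 12` in the Rread and the `l 12` length field in the Rstat for hello.txt.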
  • 20. Extension Models
      • Extend arguments to existing operations to accommodate non-Plan 9 environments
      • Provide a single extension operation which encapsulates any extended protocol operations
      • Provide a set of complementary operations which provide any extensions (including extensions which are semantic changes to existing operations)
      • Provide synthetic file system interfaces which exist either within the hierarchy or within an alternate aname mount
          • can either be provided by primary server, or through a secondary server either mounted underneath
  • 21. Unix Extensions (9P2000.u)
      • Existing support:
          • UID/GID support
          • Error ID support
          • Stat mapping
          • Permissions mapping
          • Symbolic and hard links
          • Device files
      • All accomplished via optional extended arguments to existing operations and an extended Stat structure
  • 22. Future Work: .L extension series
      • The 9P protocol is a network mapping of the Plan 9 file system API
          • Many mismatches with Linux/POSIX
          • Existing .U extension model is clunky
      • Developing a more direct mapping to Linux VFS
          • New opcodes which match the VFS API
          • Linux native data formats (stat, permissions, etc.)
          • Direct support of extended attributes, locking, etc.
      • Should be able to co-exist with legacy 9P and 9P2000.u protocols and servers.
  • 23. 9P Client/Server Support
      • Comprehensive list: http://9p.cat-v.org/implementations
          • C, C#, Python, Ruby, Java, TCL, Limbo, Lisp, OCaml, Scheme, PHP and JavaScript
      • FUSE clients (for Linux, BSD, and Mac)
      • Native kernel support for OpenBSD
      • Windows support via Rangboom proprietary client
      • Inferno supports native 9P (aka Styx)
      • Simple server library available (libixp) (9P2000 only)
      • 9P2000.u available in spfs (single threaded) and npfs (multi-threaded)
      • golang client and server now available
  • 24. 9P in the Linux Kernel
      • Since 2.6.14
      • Small client code base
          • include/net/9p - global definitions and interface files
          • fs/9p: VFS interface, ~1500 lines of code
          • net/9p
              • Core: protocol handling, ~2500 lines of code
              • FD transport (sockets, etc.): ~1100 lines of code
              • Virtio transport: ~300 lines of code
              • RDMA transport: ~700 lines of code
      • Small server code base
          • Spfs (standard userspace server): ~7500 lines of code
          • Current KVM-qemu patch: ~1500 lines
  • 25. 9P Linux Kernel Debug
      • Enable debug for client side trace (-o debug=0xffff turns all on)
          • 0x001 - display verbose error messages (via syslog)
          • 0x002 - used for more verbose granular debug
          • 0x004 - 9p trace
          • 0x008 - VFS trace
          • 0x010 - marshalling debug
          • 0x020 - RPC debug
          • 0x040 - transport specific debug
          • 0x080 - allocation debug
          • 0x100 - display protocol message debug
          • 0x200 - display FID debug
          • 0x400 - display packet debug
          • 0x800 - display fscache tracing debug
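The debug value is a bitmask, so individual categories are combined with bitwise OR. The short sketch below uses the bit values and descriptions from this slide; the dictionary and function names are illustrative and not part of v9fs.

```python
DEBUG_BITS = {
    0x001: "verbose error messages (via syslog)",
    0x002: "more verbose granular debug",
    0x004: "9p trace",
    0x008: "VFS trace",
    0x010: "marshalling debug",
    0x020: "RPC debug",
    0x040: "transport specific debug",
    0x080: "allocation debug",
    0x100: "protocol message debug",
    0x200: "FID debug",
    0x400: "packet debug",
    0x800: "fscache tracing debug",
}


def debug_option(*bits: int) -> str:
    """Combine individual debug bits into a 'debug=0x...' mount option."""
    mask = 0
    for bit in bits:
        if bit not in DEBUG_BITS:
            raise ValueError(f"unknown debug bit {bit:#x}")
        mask |= bit
    return f"debug={mask:#x}"


def describe_debug(mask: int) -> list:
    """List the categories enabled by a debug mask, lowest bit first."""
    return [label for bit, label in sorted(DEBUG_BITS.items()) if mask & bit]
```

For example, `debug_option(0x004, 0x100)` yields the option for a 9p trace plus protocol message debug, without the noise of the other categories.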
  • 26. v9fs access modes
      • access=user
          • new attach every time a new user tries to access the file system
      • access=<uid>
          • single attach and only allows uid=<uid> to access
      • access=any
          • single attach and allows all users to access with rights of the user who performed the initial attach
  • 27. v9fs transport options
      • trans_fd module
          • tcp: normal socket operations
          • unix: mount a named pipe
          • fd: use passed file descriptors for connection (rfdno, wfdno)
      • virtio: use virtio channel
      • rdma: use InfiniBand RDMA
  • 28. v9fs cache modes
      • Default is no cache
      • cache=loose
          • no attempts are made at consistency; intended for exclusive access, read-only mounts
          • fids aren't generally clunked, in order to hold references to files
      • cache=fscache
          • use FS-Cache for a persistent, read-only cache backend
          • EXPERIMENTAL. Hasn't been fully tested.
      • Other options possible in future, including path caches (dentry cache) and/or a temporal based cache with semantics similar to other distributed file systems.
  • 29. v9fs other options
      • port=<port> - specify TCP port
      • uname=<user> - specify user to initially mount as
      • aname=<name> - attach argument
      • maxdata=<n> - specify maximum single packet size
      • noextend - only use vanilla protocol (no .u)
      • dfltuid - specify default uid to mount as (.u)
      • dfltgid - specify default gid to mount as (.u)
      • afid - specify a security channel (only valid for fd transport)
      • nodevmap - no special files, make any special files look normal
      • cachetag - optional persistent tag signature
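The options from the last few slides are combined into a single comma-separated string passed to mount with -o. The helper below is a hypothetical sketch of assembling that string. It assumes the transport is selected with a `trans=` key (the key name is not shown on the slides), and it only formats text without invoking mount.

```python
def mount_options(**opts) -> str:
    """Join options into a '-o' string: key=value pairs and bare flags.

    A value of True emits a bare flag (such as noextend); False or None
    drops the option entirely.
    """
    parts = []
    for key, value in opts.items():
        if value is True:
            parts.append(key)
        elif value is False or value is None:
            continue
        else:
            parts.append(f"{key}={value}")
    return ",".join(parts)


# Example: TCP transport to port 5670, per-user attaches, plain 9P2000.
options = mount_options(
    trans="tcp",      # assumed key name for choosing the transport
    port=5670,
    access="user",
    noextend=True,
    cache=None,       # leave the default (no cache)
)
```

Keeping the options in a dictionary-like call makes it easy to toggle a single setting, such as adding `cache="loose"` for a read-only mount, without rewriting the whole string.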
  • 30. Typical Regression Process
      • Simple mount against spfs file server
      • Test with a short set of Linux file system benchmarks
          • fsx -N 1000 -R -W testfile
          • echo run | postmark
          • bonnie -s 1
          • dbench -t 60 4
  • 31. 9p server operation
      • spfs/npfs: (9P2000.u)
          • ufs -p 5670 -s
              • -p specifies port number
              • -s specifies single user (whoever is running spfs)
              • can also pass -d to see server side trace
              • if using npfs, specify -w to limit number of threads
      • patched kvm-qemu (for virtio transport)
          • kvm <other_args> -share /
              • tells kvm to share / over virtio channel to guest
  • 32. Code Style and Development Goal
      • Stick to Linux Coding Style Guidelines (of course)
      • Keep It Simple
          • short names
          • limit any use of macro definitions or conditionals (#ifdef)
          • extensions should be kept optional
          • any cache extensions should be kept optional (configurable at mount time)
      • send patches for review on:
          • v9fs-developer@lists.sourceforge.net
      • bug tracking for client on bugzilla.kernel.org
      • protocol documentation/updates to
          • http://github.com/ericvh/9p-rfc
  • 33. Code Review
      • http://lxr.linux.no/linux/include/net/9p/
      • http://lxr.linux.no/linux/fs/9p/
      • http://lxr.linux.no/linux/net/9p/