Systems Research
- Designing large and/or complex computing systems
  - computers (from tablets to supercomputers)
  - networks
  - printers
  - etc.
- Includes the software:
  - operating systems
  - programming languages
  - networking stacks
  - file systems
  - distributed systems
  - etc.
Systems Research…
How it’s different from, say, algorithm design:
- external interface (requirement) is less precise, more complex, subject to change…
- more internal structure, thus more internal interfaces
- module-level design choices have wider implications
- measure of success is unclear
  - no such thing as a ‘best’ design
  - more important to avoid making terrible choices
- usually, the physical system doesn’t yet exist!
  - simultaneous development of the hardware and software
  - how do you design and test the software?
  - how do you design and test the hardware?
The 1992 Turing Award
Citation: For contributions to the development of distributed, personal computing environments and the technology for their implementation: workstations, networks, operating systems, programming systems, displays, security and document publishing.
A 1-page Biography
- Born in 1943, in Washington, DC
- Education:
  - B.A. in Physics at Harvard, 1964
  - Ph.D. in EECS at Univ. of California at Berkeley, 1967, aged 23!
- Work:
  - Xerox PARC, 1971-1984
  - DEC SRC, 1984-1995
  - Microsoft Research, 1995-
- Systems:
  - Alto (workstation)
  - Bravo (WYSIWYG editor)
  - Cal (timesharing system)
  - Dover (laser printer)
- Awards:
  - 1992 ACM Turing Award
  - ACM Software Systems Award
  - IEEE Computer Pioneer Award
  - IEEE von Neumann Award
  - 3 papers in ACM SIGOPS Hall of Fame!
The Xerox Alto
- The first personal computer, built in 1973 at PARC
  - First computer to use the desktop metaphor
  - First computer with a mouse-driven GUI
- Conceived in a 1972 memo written by Lampson
- Design led by Charles Thacker (2009 Turing)
- Lampson contributed to the design
  - Wrote the OS!
- Lampson et al. won the 1984 ACM Software System Award and the 2004 NAE Draper prize for the Alto
- Later work: led the design of Dorado and
The Bravo editor
- First WYSIWYG editor
- Shipped with the Xerox Alto
- Developed by Lampson, Simonyi et al. in 1974
- Led to the development of Microsoft Word
The Xerox 9700 and Dover Laser Printers
- First laser printer, at PARC, 1969, led by Gary Starkweather
- Lampson co-designed the electronics and software prototype for the Xerox 9700
- Dover Laser Printer
  - later version, in 1976
  - much cheaper
  - Lampson designed the electronics
SDS 940
- First general-purpose time-sharing system, with char-by-char interaction
- Project Genie at Berkeley
- Lampson joined as a graduate student
  - wrote parts of the OS
  - created several programming languages!
    - Cal: interactive language for numerical computation
    - QSPL: system programming
Other Research
- Two-phase commit protocol for distributed transactions (with Sturgis)
- Cal time-sharing system for CDC 6400 (at Berkeley)
  - pioneered shadow pages, redo logs
  - capability-based system: not a good basis for long-term security
- Programming Languages:
  - Mesa and SPL for systems programming
    - Mesa’s process mechanism influenced modern thread systems
  - Cedar: combining the virtues of Mesa and Lisp
  - Euclid: first language designed to enable program verification
  - Modula 2+: extends Wirth’s Modula-2
But wait, there’s more!
- Security
  - Access Matrix model, unifying capabilities and ACLs
  - Information Flow Control
  - Theory of Principals “speaking for” other principals
  - Microsoft Palladium (TCB)
  - Scrubbing disk storage
  - How economic factors (not technology) inhibit security
- Networking
  - co-inventor on Xerox patent for Ethernet!
  - switched LAN at DEC SRC
- Formal specification and proof
  - TCP connection establishment (at-most-once messaging)
  - Leslie Lamport’s Paxos protocol
The Hints paper...
- Presented at the 9th SOSP, appeared in SIGOPS OSR October 1983, reprinted in IEEE Software Jan 1984
- Summarizes the lessons learned from years of systems research and implementation
- Still influential after all these years... e.g.:
  - Werner Vogels, CTO, Amazon: “Butler’s paper shows a great mix of fundamentals and best practices from the early days of large scale system design. Almost all of his advice has withstood the test of time and as such they are even more important now than in 1983” (from a July 2012 blog post)
How it’s organized
- “Hints”:
  - not laws
  - not foolproof recipes
  - not consistent
  - not always appropriate
  - in short, no guarantees
- Each hint is:
  - summarized by a slogan
  - illustrated with examples from systems work
  - preceded by an appropriate quotation from Hamlet!
- Hints for functionality (does it work?), speed (is it fast enough?) and fault tolerance (does it keep working?)
- Pre-req: notion of an interface that separates an implementation of some abstraction from the clients who use the abstraction
  - “Defining interfaces is the most important part of system design”
Hints for Functionality
- Do one thing at a time, and do it well
  - KISS
  - Don’t generalize, generalizations are generally wrong!
  - Interface should capture the minimum essentials of an abstraction
    - mustn’t promise more features than the implementer knows how to deliver (without penalizing other clients)
  - Service should have a fairly predictable cost
- But, get it right!

Quotations:
- “Perfection is reached not when there is no longer anything to add, but when there is no longer anything to take away!” (A. Saint-Exupery)
- “We are faced with an insurmountable opportunity” (W. Kelley)
- “Everything should be made as simple as possible, but no simpler” (A. Einstein)
Hints for Functionality...
- Make it fast, rather than general or powerful
  - Analogous to the RISC principle...
  - Better to have fast, basic operations than slow, powerful ones
  - Clients who don’t need the power shouldn’t have to pay more for the basic functionality
  - As long as it’s fast, a client can program the additional functionality it needs; another client can program some other function
- Don’t hide power
  - If a low-level abstraction allows something to be done quickly, higher levels should not bury this power inside something more general
  - abstractions should conceal only undesirable properties!
Hints for Functionality...
- Use procedure arguments to provide flexibility in an interface
  - restricted or encoded as necessary
  - e.g. an interface for listing all elements of a set that satisfy a property
    - let clients pass in a filter vs. a special language of patterns
- Leave it to the client
  - keep the interface simple, flexible and high-performance
  - caveat: as long as it is cheap to pass control back and forth
  - e.g. success of monitors for synchronization, because locking and signalling do very little
    - clients take care of process scheduling, buffer allocation, resource accounting, etc.
    - or use other libraries for that
  - e.g. Unix-style utilities: stdin → processing → stdout
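The "pass in a filter" example can be sketched in a few lines of Python. This is only an illustration, not code from the Hints paper: the function `select` and the sample file names are hypothetical. The interface takes an arbitrary predicate instead of defining its own pattern language, so each client decides what "satisfies a property" means.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def select(items: Iterable[T], keep: Callable[[T], bool]) -> Iterator[T]:
    """Yield every element for which the client-supplied predicate is true."""
    for item in items:
        if keep(item):
            yield item


# Hypothetical data: two clients use the same interface for different filters.
files = ["notes.txt", "alto.mesa", "bravo.mesa", "README"]
mesa_sources = list(select(files, lambda name: name.endswith(".mesa")))
short_names = list(select(files, lambda name: len(name) <= 6))
```

The interface stays tiny, and neither client pays for the other's matching rules.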
Hints for Functionality...
- Keep basic interfaces stable
  - an interface embodies assumptions shared by many parts of a system
  - if type-checking is available, may be possible to check for number and types of arguments
    - but still requires programmers to rework the integration
    - may not detect semantic changes in interfaces
- Keep a place to stand...
  - ... if you do change interfaces
  - e.g. a compatibility package, implements an old interface on top of the new one
  - e.g. OS simulators
  - dev cost may be less vs. cost of fixing all client software
  - performance hit, but often acceptable
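A compatibility package can be as small as an adapter class. The sketch below is an assumption-laden illustration: `NewStore`, `OldStoreCompat` and their methods are invented names, not any real system's API. The old flat-path interface is implemented purely in terms of the new one, so existing clients keep working at the cost of one extra call.

```python
class NewStore:
    """Hypothetical new interface: objects are addressed by (directory, name)."""

    def __init__(self):
        self._data = {}

    def put(self, directory, name, contents):
        self._data[(directory, name)] = contents

    def get(self, directory, name):
        return self._data[(directory, name)]


class OldStoreCompat:
    """Hypothetical old interface (flat path strings), built on top of NewStore."""

    def __init__(self, store):
        self._store = store

    @staticmethod
    def _split(path):
        # "docs/hints.txt" -> ("docs", "hints.txt"); "README" -> ("", "README")
        directory, _, name = path.rpartition("/")
        return directory, name

    def write(self, path, contents):
        self._store.put(*self._split(path), contents)

    def read(self, path):
        return self._store.get(*self._split(path))


store = NewStore()
old_api = OldStoreCompat(store)
old_api.write("docs/hints.txt", "keep a place to stand")
```

Old clients call `old_api.read("docs/hints.txt")`, while new clients see the same object through `store.get("docs", "hints.txt")`.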
Hints for Functionality...
- Plan to throw one away...
  - ... you will, anyhow!
- Keep secrets...
  - assumptions about an implementation that clients aren’t allowed to make
  - tension vs. the desire not to hide power
  - tension vs. the need for performance
- Divide and conquer
  - reduce a problem to several easier ones
  - bite off as much as will fit, leaving the rest for the next iteration
  - e.g. Alto’s file system defragmentation and its use of memory
- Handle normal and worst cases separately
  - the normal case must be fast, the worst case must make progress
  - e.g. caches and hints help the normal case
  - e.g. Garbage Collector in Cedar doesn’t count refs in local frames; those are scanned completely during GC
Hints for Speed
- Split resources in a fixed way, rather than sharing them
  - faster to allocate, faster to access, more predictable
  - e.g. using register banks vs. memory
  - cost of extra resources is usually low
  - multiplexing overheads may be larger than the fragmentation waste
- Dynamic translation...
  - ... from a convenient representation to one that can be quickly interpreted
  - e.g. Smalltalk compiler generates bytecodes, implementation translates (and caches) a single procedure’s bytecodes to machine language when invoked
- Cache answers to expensive computations
  - store the triple [f, x, f(x)] in an associative store
  - basic example: processor cache or virtual memory systems
    - [Fetch, address, contents of address]
  - need to ensure cache invalidation/update for non-functional f() such as Fetch()
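The [Fetch, address, contents of address] example can be sketched as follows. This is a minimal illustration under stated assumptions: `CachedFetch` is a hypothetical class, a Python dict stands in for the slow backing store, and invalidation on every store is just one of several possible policies. Because Fetch is not a pure function, each write must invalidate (or update) the cached entry.

```python
class CachedFetch:
    """Caches results of fetch(address) and invalidates them on store()."""

    def __init__(self, backing):
        self._backing = backing  # stands in for slow memory: address -> contents
        self._cache = {}         # associative store of saved answers
        self.misses = 0          # counts trips to the slow backing store

    def fetch(self, address):
        if address not in self._cache:
            self.misses += 1
            self._cache[address] = self._backing[address]
        return self._cache[address]

    def store(self, address, value):
        self._backing[address] = value
        # Fetch is not functional: drop the stale answer for this address.
        self._cache.pop(address, None)


memory = {0x10: "old"}
cached = CachedFetch(memory)
first = cached.fetch(0x10)    # miss: goes to backing store
second = cached.fetch(0x10)   # hit: served from the cache
cached.store(0x10, "new")     # invalidates the cached entry
third = cached.fetch(0x10)    # miss again, sees the new contents
```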
Hints for Speed...
- Use hints
  - like a cache entry (saved result of some computation), but:
    - not necessarily reached by associative lookup
    - may be wrong!
  - if using a hint for unrecoverable actions, need to check if it’s wrong
  - e.g. store-and-forward routing tables in Arpanet
    - based on each node’s opinion about its links to neighbours
    - periodically broadcast (can be lost, or delivered out of order)
- When in doubt, use brute force
  - hardware is cheap (even in 1983!)
  - brute force allows for cheaper, faster implementation
  - e.g. chess-playing computers that use brute-force vs. sophisticated chess strategies
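The difference between a hint and a cache entry is that a hint is allowed to be wrong, so it is checked against the truth before use. The sketch below is illustrative only: `Directory` and its methods are hypothetical names. It remembers the last known position of each name and falls back to a slow authoritative search when the hint turns out to be stale.

```python
class Directory:
    """Looks up positions of names, using unverified hints for the fast path."""

    def __init__(self, names):
        self._names = list(names)  # the truth
        self._hint = {}            # name -> last known index; may be stale
        self.slow_lookups = 0

    def index_of(self, name):
        i = self._hint.get(name)
        # Check the hint against the truth before trusting it.
        if i is not None and i < len(self._names) and self._names[i] == name:
            return i
        # Hint missing or wrong: take the slow, authoritative path.
        self.slow_lookups += 1
        i = self._names.index(name)  # raises ValueError if the name is absent
        self._hint[name] = i
        return i

    def remove(self, name):
        # Deliberately leaves hints alone; they are allowed to go stale.
        self._names.remove(name)


d = Directory(["alto", "bravo", "cedar"])
before = d.index_of("cedar")   # slow path, records a hint
again = d.index_of("cedar")    # fast path, hint verified
d.remove("alto")               # the hint for "cedar" is now wrong
after = d.index_of("cedar")    # stale hint detected, slow path repairs it
```

Note that `remove` never has to find and fix hints; correctness comes from the check in `index_of`.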
Hints for Speed...
- Compute in the background where possible
  - especially in interactive or real-time systems
  - e.g., Cedar garbage collector, e-mail delivery, etc.
- Use batch processing if possible
  - doing things incrementally usually costs more
  - disks and tapes work better when accessed sequentially
  - e.g. banking systems use online data, but discard it after nightly batch
- Safety first
  - in allocating resources, avoid disaster vs. being optimal
  - “a general-purpose system cannot optimize the use of resources”
  - again, “hardware is cheap, and getting cheaper”
  - cleverness works only if you have very well-known loads
  - “The nicest thing about the Alto is that it doesn’t run faster at night” – Morris
    - no fancy processor scheduling, fixed share of cycles to each job
- Shed load to control demand
  - e.g. Arpanet moved from guaranteed to best-effort delivery
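One simple way to shed load is a bounded queue that refuses new work when full rather than letting the backlog grow without limit. This is a sketch under assumptions, not how the Arpanet did it: `SheddingQueue` and its method names are hypothetical, and dropping the newest item is only one possible policy.

```python
from collections import deque


class SheddingQueue:
    """Bounded work queue: rejects new items when full instead of growing."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._items = deque()
        self.dropped = 0

    def offer(self, item):
        """Return True if the item was accepted, False if it was shed."""
        if len(self._items) >= self._capacity:
            self.dropped += 1
            return False
        self._items.append(item)
        return True

    def take(self):
        """Return the oldest item, or None if the queue is empty."""
        return self._items.popleft() if self._items else None


q = SheddingQueue(capacity=2)
accepted = [q.offer(n) for n in (1, 2, 3)]  # third offer exceeds capacity
oldest = q.take()                           # frees one slot
accepted_later = q.offer(4)
```

The caller learns immediately that its request was refused and can retry, back off, or give up, which keeps the demand on the system under control.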
Hints for Fault-Tolerance
- End-to-end
  - Saltzer et al. classic paper, 1981 ICDCS
  - error recovery at app level is necessary for a reliable system
  - any other error detection or recovery is strictly for performance
  - e.g. file transfer across a network from A to B
    - reading from B’s disk and validating checksum against A’s disk is necessary
    - checking the transfer from A’s disk to A’s memory, or from A to B over the network, is not sufficient
    - not necessary either, but can help performance (retransfer only corrupted parts)
- Log updates to record the truth about an object’s state
  - current state of object is treated like a hint
  - log entry must include update procedure (functional) and its arguments
  - sequence of log entries can be re-executed if necessary to re-create the true state
- Make actions atomic or restartable
  - atomic: failure during the action has no effect
  - log entries need to be restartable / idempotent
    - can be partially executed any number of times before a complete execution
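The logging and restartability hints can be sketched together. This is a minimal illustration, with hypothetical names (`set_balance`, `UPDATES`, `apply_log`) and an in-memory list standing in for a durable log. Each log entry names an update procedure and its arguments. Because the update is idempotent, re-executing the whole log after a crash that had already applied part of it yields the same state as a clean run.

```python
def set_balance(state, account, amount):
    """Idempotent update: applying it twice is the same as applying it once."""
    state[account] = amount


# Table of update procedures that log entries may name (hypothetical).
UPDATES = {"set_balance": set_balance}


def apply_log(state, log):
    """Re-execute every log entry, in order, against the given state."""
    for procedure_name, args in log:
        UPDATES[procedure_name](state, *args)
    return state


log = [
    ("set_balance", ("alice", 100)),
    ("set_balance", ("bob", 50)),
    ("set_balance", ("alice", 70)),
]

clean = apply_log({}, log)

# Simulated crash: only the first two entries were applied before failure.
partial = apply_log({}, log[:2])
# Recovery restarts from the beginning of the log on the partial state.
recovered = apply_log(partial, log)
```

An update such as "add 10 to the balance" would not be safe to replay this way; it would need to be rewritten as an idempotent entry or guarded by some record of which entries were already applied.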
In Conclusion...
Most humbly do I take my leave, my lord.
- “Such a collection of good advice and anecdotes is rather tiresome to read; perhaps it is best taken in small doses at bedtime”!
- “I can only plead that I have ignored most of these rules at least once, and nearly always regretted it”
For more information…
- http://amturing.acm.org/award_winners/lampson_1142421.cfm - Profile of Lampson on ACM site
- http://amturing.acm.org/bib/lampson_1142421.cfm - citations for the 3 SIGOPS HoF papers
- http://research.microsoft.com/en-us/um/people/blampson/default.htm - Lampson’s homepage at MSR