9. Systems Research
Designing large and/or complex computing systems
computers (from tablets to supercomputers)
networks
printers
etc.
Includes the software:
operating systems
programming languages
networking stacks
file systems
distributed systems
etc.
10. Systems Research…
How it’s different from, say, algorithm design:
external interface (requirement) is less precise, more complex, subject to change…
more internal structure, thus more internal interfaces
module-level design choices have wider implications
measure of success is unclear
No such thing as a ‘best’ design
more important to avoid making terrible choices
Usually, the physical system doesn’t yet exist!
simultaneous development of the hardware and software
how do you design and test the software?
how do you design and test the hardware?
12. The 1992 Turing Award
CITATION:
For contributions to the development of distributed,
personal computing environments and the technology for
their implementation: workstations, networks, operating
systems, programming systems, displays, security and
document publishing.
13. A 1-page Biography
Born in 1943, in Washington, DC
Education:
B.A. in Physics at Harvard, 1964
Ph.D. in EECS at Univ. of California at Berkeley, 1967, aged 23!
Work:
Xerox PARC, 1971-1984
DEC SRC, 1984-1995
Microsoft Research, 1995-
Systems:
Alto (workstation)
Bravo (WYSIWYG editor)
Cal (timesharing system)
Dover (laser printer)
Awards:
1992 ACM Turing Award
ACM Software System Award
IEEE Computer Pioneer Award
IEEE von Neumann Award
3 papers in ACM SIGOPS Hall of Fame!
14. The Xerox Alto
The first personal computer, built in 1973 at PARC
First computer to use the desktop metaphor
First computer with a mouse-driven GUI
Conceived in a 1972 memo written by Lampson
Design led by Charles Thacker (2009 Turing)
Lampson contributed to the design
Wrote the OS!
Lampson et al. won the 1984 ACM Software System Award and the 2004 NAE Draper prize for the Alto
Later work: led the design of Dorado and
15. The Bravo editor
First WYSIWYG editor
Shipped with the Xerox Alto
Developed by Lampson, Simonyi et al. in 1974
Led to the development of Microsoft Word
16. The Xerox 9700 and Dover Laser Printers
First laser printer, at PARC, 1969, led by Gary Starkweather
Lampson co-designed the electronics and software
prototype for the Xerox 9700
Dover Laser Printer
later version, in 1976
much cheaper
Lampson designed the electronics
17. SDS 940
First general-purpose time-sharing system, with char-by-char interaction
Project Genie at Berkeley
Lampson joined as graduate student
wrote parts of the OS
created several programming languages!
Cal: interactive language for numerical computation
QSPL: system programming
18. Other Research
Two-phase commit protocol for distributed transactions (with Sturgis)
Cal time-sharing system for CDC 6400 (at Berkeley)
pioneered shadow pages, redo logs
capability-based system: not a good basis for long-term security
Programming Languages:
Mesa and SPL for systems programming
Mesa’s process mechanism influenced modern thread systems
Cedar: combining the virtues of Mesa and Lisp
Euclid: first language designed to enable program verification
Modula-2+: extends Wirth’s Modula-2
19. But wait, there’s more!
Security
Access Matrix model, unifying capabilities and ACLs
Information Flow Control
Theory of Principals “speaking for” other principals
Microsoft Palladium (TCB)
Scrubbing disk storage
How economic factors (not technology) inhibit security
Networking
co-inventor on Xerox patent for Ethernet!
switched LAN at DEC SRC
Formal specification and proof
TCP connection establishment (at-most-once messaging)
Leslie Lamport’s Paxos protocol
22. The Hints paper...
Presented at the 9th SOSP, appeared in SIGOPS OSR October 1983, reprinted in IEEE Software Jan 1984
Summarizes the lessons from years of systems research and implementation
Still influential after all these years... e.g.:
Werner Vogels, CTO, Amazon
“Butler's paper shows a great mix of fundamentals and best practices from the early days of large scale system design. Almost all of his advice has withstood the test of time and as such they are even more important now than in 1983”
from a July 2012 blog post
23. How it’s organized
“Hints”:
not laws
not foolproof recipes
not consistent
not always appropriate
in short, no guarantees
Each hint is:
summarized by a slogan
illustrated with examples from systems work
preceded by an appropriate quotation from Hamlet!
Hints for functionality (does it work?), speed (is it fast enough?) and fault tolerance (does it keep working?)
Pre-req: Notion of an interface that separates an implementation of some abstraction from the clients who use the abstraction
“Defining interfaces is the most important part of system design”
24. Hints for Functionality
Do one thing at a time, and do it well
KISS
Don’t generalize, generalizations are generally wrong!
Interface should capture the minimum essentials of an abstraction
mustn’t promise more features than the implementer knows how to deliver (without penalizing other clients)
Service should have a fairly predictable cost
But, get it right!
“Perfection is reached not when there is no longer anything to add, but when there is no longer anything to take away!” (A. Saint-Exupery)
“We are faced with an insurmountable opportunity” (W. Kelley)
“Everything should be made as simple as possible, but no simpler” (A. Einstein)
25. Hints for Functionality...
Make it fast, rather than general or powerful
Analogous to the RISC principle...
Better to have fast, basic operations than slow, powerful ones
Clients who don’t need the power shouldn’t have to pay more for the basic functionality
As long as it’s fast, a client can program the additional functionality it needs; another client can program some other function
Don’t hide power
If a low-level abstraction allows something to be done quickly, higher levels should not bury this power inside something more general
abstractions should conceal only undesirable properties!
26. Hints for Functionality...
Use procedure arguments to provide flexibility in an interface
restricted or encoded as necessary
e.g. an interface for listing all elements of a set that satisfy a property
let clients pass in a filter vs. a special language of patterns
Leave it to the client
keep the interface simple, flexible and high-performance
caveat: as long as it is cheap to pass control back and forth
e.g. success of monitors for synchronization, because locking and signalling do very little
clients take care of process scheduling, buffer allocation, resource accounting, etc. or use other libraries for that
e.g. Unix style utilities: stdin -> processing -> stdout
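The set-listing example above can be sketched in a few lines. This is only an illustration of the slogan: the function name enumerate_matching and the sample file names are hypothetical, not taken from the paper. The interface accepts any client-supplied predicate instead of defining its own pattern language.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def enumerate_matching(items: Iterable[T], keep: Callable[[T], bool]) -> Iterator[T]:
    """Yield every element for which the client-supplied predicate is true."""
    for item in items:
        if keep(item):
            yield item


# Hypothetical data: the interface knows nothing about file-name patterns.
files = ["notes.txt", "alto.mesa", "bravo.mesa", "README"]

# Two clients, two different filters, one interface.
mesa_sources = list(enumerate_matching(files, lambda name: name.endswith(".mesa")))
short_names = list(enumerate_matching(files, lambda name: len(name) <= 6))
```

A pattern language would have to anticipate both suffix matching and length tests; the procedure argument leaves that to the client.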
27. Hints for Functionality...
Keep basic interfaces stable
embodies assumptions shared by many parts of a system
if type-checking is available, may be possible to check for number and types of arguments
but still requires programmers to rework the integration
may not detect semantic changes in interfaces
Keep a place to stand...
... if you do change interfaces
e.g. a compatibility package, implements an old interface on top of the new one
e.g. OS simulators
dev cost may be less vs. cost of fixing all client software
performance hit, but often acceptable
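A compatibility package can be as small as an adapter class. The sketch below assumes a made-up pair of interfaces (NewStore with byte offsets, and an older page-oriented one); neither comes from the paper. Old clients keep calling the page interface while the data lives in the new store.

```python
PAGE_SIZE = 4  # tiny page size, chosen only to keep the example readable


class NewStore:
    """Hypothetical new interface: reads and writes at arbitrary byte offsets."""

    def __init__(self) -> None:
        self._data = bytearray()

    def write_at(self, offset: int, payload: bytes) -> None:
        end = offset + len(payload)
        if end > len(self._data):
            # Grow the store with zero bytes so the write fits.
            self._data.extend(b"\0" * (end - len(self._data)))
        self._data[offset:end] = payload

    def read_at(self, offset: int, length: int) -> bytes:
        return bytes(self._data[offset:offset + length])


class OldPageStoreCompat:
    """Hypothetical old page-oriented interface, implemented on top of NewStore."""

    def __init__(self, store: NewStore) -> None:
        self._store = store

    def write_page(self, page_no: int, page: bytes) -> None:
        if len(page) != PAGE_SIZE:
            raise ValueError("old interface only accepts whole pages")
        self._store.write_at(page_no * PAGE_SIZE, page)

    def read_page(self, page_no: int) -> bytes:
        return self._store.read_at(page_no * PAGE_SIZE, PAGE_SIZE)


store = NewStore()
legacy = OldPageStoreCompat(store)
legacy.write_page(1, b"ABCD")           # old client, unchanged code
page_one = legacy.read_page(1)
first_two_bytes = store.read_at(4, 2)   # new client sees the same data
```

The extra call per operation is the performance hit mentioned above; in exchange, no old client has to be rewritten.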
28. Hints for Functionality...
Plan to throw one away...
... you will, anyhow!
Keep secrets...
assumptions about an implementation that clients aren’t allowed to make
tension vs. the desire not to hide power
tension vs. the need for performance
Divide and conquer
reduce a problem to several easier ones
bite off as much as will fit, leaving the rest for the next iteration
e.g. Alto’s file system defragmentation and its use of memory
Handle normal and worst cases separately
the normal case must be fast, the worst case must make progress
e.g. caches and hints help the normal case
e.g. Garbage Collector in Cedar doesn’t count refs in local frames; those are scanned completely during GC
29. Hints for Speed
Split resources in a fixed way, rather than sharing them
faster to allocate, faster to access, more predictable
e.g. using register banks vs. memory
cost of extra resources is usually low
multiplexing overheads may be larger than the fragmentation waste
Dynamic translation...
... from a convenient representation to one that can be quickly interpreted
e.g. Smalltalk compiler generates bytecodes, implementation translates (and caches) a single procedure’s bytecodes to machine language when invoked
Cache answers to expensive computations
store the triple [f, x, f(x)] in an associative store
basic example: processor cache or virtual memory systems
[Fetch, address, contents of address]
need to ensure cache invalidation/update for non-functional f() such as Fetch()
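The [f, x, f(x)] idea and the invalidation caveat can be shown together. This is a minimal sketch under assumed names (CachedFetch, a dict standing in for memory); it is not how any particular processor cache works. Because Fetch is not a pure function, every store to an address must invalidate the cached entry.

```python
from typing import Callable, Dict


class CachedFetch:
    """Caches address -> contents pairs for a single fetch function."""

    def __init__(self, fetch: Callable[[int], int]) -> None:
        self._fetch = fetch
        self._cache: Dict[int, int] = {}
        self.misses = 0

    def read(self, address: int) -> int:
        if address not in self._cache:
            self.misses += 1
            self._cache[address] = self._fetch(address)
        return self._cache[address]

    def invalidate(self, address: int) -> None:
        # Must be called whenever the underlying contents change.
        self._cache.pop(address, None)


memory = {0: 10, 1: 20}  # stand-in for the slow backing store
cache = CachedFetch(lambda address: memory[address])

first = cache.read(0)    # miss: goes to memory
second = cache.read(0)   # hit: served from the cache
memory[0] = 99           # contents change, so the entry is now stale
cache.invalidate(0)
third = cache.read(0)    # miss again, picks up the new contents
```

Dropping the invalidate call would make the third read return the stale value 10, which is exactly the failure the slide warns about.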
30. Hints for Speed...
Use hints
like a cache entry (saved result of some computation), but:
not necessarily reached by associative lookup
may be wrong!
if using a hint for unrecoverable actions, need to check if it’s wrong
e.g. store-and-forward routing tables in Arpanet
based on each node’s opinion about its links to neighbours
periodically broadcast (can be lost, or delivered out of order)
When in doubt, use brute force
hardware is cheap (even in 1983!)
brute force allows for cheaper, faster implementation
e.g. chess-playing computers that use brute-force vs. sophisticated chess strategies
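The difference between a hint and a cache entry is that a hint is checked against the truth before use and a slower path takes over when it is wrong. The sketch below assumes a hypothetical Directory class that remembers the slot where a name was last found; the names are illustrative only.

```python
from typing import Dict, List, Optional, Tuple


class Directory:
    """Hypothetical name -> value table kept as an unsorted list of slots."""

    def __init__(self, entries: List[Tuple[str, str]]) -> None:
        self.slots = list(entries)
        self.hints: Dict[str, int] = {}   # name -> slot where it was last seen
        self.slow_lookups = 0

    def lookup(self, name: str) -> Optional[str]:
        slot = self.hints.get(name)
        # Check the hint against the truth before trusting it.
        if slot is not None and slot < len(self.slots) and self.slots[slot][0] == name:
            return self.slots[slot][1]
        # Hint missing or wrong: fall back to a linear scan and refresh the hint.
        self.slow_lookups += 1
        for index, (key, value) in enumerate(self.slots):
            if key == name:
                self.hints[name] = index
                return value
        self.hints.pop(name, None)
        return None


directory = Directory([("alto", "workstation"), ("dover", "printer")])
a = directory.lookup("dover")                    # slow path, records slot 1
b = directory.lookup("dover")                    # fast path via the hint
directory.slots.insert(0, ("bravo", "editor"))   # every entry shifts; the hint is now wrong
c = directory.lookup("dover")                    # hint rejected, slow path again
```

Nothing updates the hints when the slots move; correctness comes from the check, and the hint only affects speed.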
31. Hints for Speed...
Compute in the background where possible
especially in interactive or real-time systems
e.g., Cedar garbage collector, e-mail delivery, etc.
Use batch processing if possible
doing things incrementally usually costs more
disks and tapes work better when accessed sequentially
e.g. banking systems use online data, but discard it after nightly batch
Safety first
in allocating resources, avoid disaster vs. being optimal
“a general-purpose system cannot optimize the use of resources”
again, “hardware is cheap, and getting cheaper”
cleverness works only if you have very well-known loads
“The nicest thing about the Alto is that it doesn’t run faster at night” – Morris
no fancy processor scheduling, fixed share of cycles to each job
Shed load to control demand
e.g. Arpanet moved from guaranteed to best-effort delivery
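Shedding load can be sketched as a bounded queue that refuses new work once it is full, rather than letting the backlog grow without limit. The class name SheddingQueue and the capacity are assumptions made for this example.

```python
from collections import deque
from typing import Deque, Optional


class SheddingQueue:
    """Accepts requests up to a fixed capacity and rejects the rest."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._pending: Deque[str] = deque()
        self.rejected = 0

    def offer(self, request: str) -> bool:
        if len(self._pending) >= self._capacity:
            # Overloaded: tell the client now instead of queueing indefinitely.
            self.rejected += 1
            return False
        self._pending.append(request)
        return True

    def take(self) -> Optional[str]:
        return self._pending.popleft() if self._pending else None


requests = SheddingQueue(capacity=2)
accepted = [requests.offer(r) for r in ["r1", "r2", "r3"]]
next_request = requests.take()
```

The rejected client can retry later or go elsewhere; the accepted requests still see bounded waiting time.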
32. Hints for Fault-Tolerance
End-to-end
Saltzer et al. classic paper, 1981 ICDCS
error recovery at app level is necessary for a reliable system
any other error detection or recovery is strictly for performance
e.g. file transfer across a network from A to B
reading from B’s disk and validating checksum against A’s disk is necessary
checking the transfer from A’s disk to A’s memory, or from A to B over the network, is not sufficient
not necessary either, but can help performance (retransfer only corrupted parts)
Log updates to record the truth about an object’s state
current state of object is treated like a hint
log entry must include update procedure (functional) and its arguments
sequence of log entries can be re-executed if necessary to re-create the true state
Make actions atomic or restartable
atomic: failure during the action has no effect
log entries need to be restartable / idempotent
can be partially executed any number of times before a complete execution
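The last two hints combine naturally: if each log entry names an idempotent update and its arguments, the log can be replayed after a crash without knowing how far the previous run got. The entry format and operation names below are assumptions for illustration, not the paper's notation.

```python
from typing import Dict, List, Tuple

LogEntry = Tuple[str, str, int]  # (operation, key, value); value is unused for "delete"


def apply_entry(state: Dict[str, int], entry: LogEntry) -> None:
    """Apply one entry; both operations give the same result if repeated."""
    op, key, value = entry
    if op == "set":
        state[key] = value        # idempotent, unlike "add 30 to balance"
    elif op == "delete":
        state.pop(key, None)      # deleting a missing key is a no-op
    else:
        raise ValueError(f"unknown operation: {op}")


def replay(log: List[LogEntry]) -> Dict[str, int]:
    """Re-create the true state from an empty one by re-executing the log."""
    state: Dict[str, int] = {}
    for entry in log:
        apply_entry(state, entry)
    return state


log: List[LogEntry] = [("set", "balance", 100), ("set", "balance", 70), ("delete", "pending", 0)]
truth = replay(log)

# Simulated crash: two entries were applied, then recovery re-runs the whole log.
crashed: Dict[str, int] = {}
for entry in log[:2]:
    apply_entry(crashed, entry)
for entry in log:
    apply_entry(crashed, entry)
```

Had the entries been increments instead of assignments, the second pass would have double-counted, which is why the slide asks for restartable entries.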
33. In Conclusion...
Most humbly do I take my leave, my lord.
“Such a collection of good advice and
anecdotes is rather tiresome to read; perhaps
it is best taken in small doses at bedtime”!
“I can only plead that I have ignored most of
these rules at least once, and nearly always
regretted it”
34. For more information…
http://amturing.acm.org/award_winners/lampson_1142421.cfm - Profile of Lampson on ACM site
http://amturing.acm.org/bib/lampson_1142421.cfm - citations for the 3 SIGOPS HoF papers
http://research.microsoft.com/en-us/um/people/blampson/default.htm - Lampson’s homepage at MSR