SlideShare a Scribd company logo
Down Memory Lane:
Two Decades with the Slab Allocator
CTO
bryan@joyent.com
Bryan Cantrill
@bcantrill
Two decades ago…
Aside: Me, two decades ago…
Aside: Me, two decades ago…
Slab allocator, per Vahalia
First <3: Allocator footprint
I <3: Cache-aware data structures
I <3: Cache-aware data structures
Hardware Antiques Roadshow!
I <3: Magazine layer
I <3: Magazine autoscaling
I <3: Debugging support
• For many, the most salient property of the slab allocator is its
eponymous allocation properties…
• …but for me, its debugging support is much more meaningful
• Rich support for debugging memory corruption issues:
allocation auditing + detection of double-free/use-after-free/
use-before-initialize/buffer overrun
• Emphasis was leaving the functionality compiled in (and
therefore available in production) — even if off by default
• kmem_flags often used to debug many kinds of problems!
I <3: Debugging support
I <3: Debugger support
I <3: MDB support
• Work in crash(1M) inspired deeper kmem debugging support
in MDB, including commands to:
• Walk the buffers from a cache (::walk kmem)
• Display buffers allocated by a thread (::allocdby)
• Verify integrity of kernel memory (::kmem_verify)
• Determine the kmem cache of a buffer (::whatis)
• Find memory leaks (::findleaks)
• And of course, for Bonwick, ::kmastat and ::kmausers!
I <3: libumem
• Tragic to have a gold-plated kernel facility while user-land
suffered in the squalor of malloc(3C)…
• In the (go-go!) summer of 2000, Jonathan Adams (then a Sun
Microsystems intern from CalTech) ported the kernel memory
allocator to user-land as libumem
• Jonathan returned to Sun as a full-time engineer to finish and
integrate libumem; became open source with OpenSolaris
• libumem is the allocator for illumos and derivatives like
SmartOS — and has since been ported to other systems
libumem performance
• The slab allocator was designed to be scalable — not
necessarily to accommodate pathological software
• At user-level, pathological software is much more common…
• Worse, the flexibility of user-level means that operations that
are quick in the kernel (e.g., grabbing an uncontended lock)
are more expensive at user-level
• Upshot: while libumem provided great scalability, its latency
was worse than other allocators for applications with small,
short-lived allocations
libumem performance: node.js running MD5
I <3: Per-thread caching libumem
• Robert Mustacchi added per-thread caching to libumem:
• free() doesn’t free buffer, but rather enqueues on a per-
size cache on ulwp_t (thread) structure
• malloc() checks the thread’s cache at given size first
• Several problems:
• How to prevent cache from growing without bound?
• How to map dynamic libumem object cache sizes to fixed
range in ulwp_t structure without slowing fast path?
I <3: Per-thread caching libumem
• Cache growth problem solved by having each thread track
the total memory in its cache, and allowing per-thread cache
size to be tuned (default is 1M)
• Solving problem of optimal malloc() in light of arbitrary
libumem cache sizes is a little (okay, a lot) gnarlier; from

usr/src/lib/libumem/umem.c:
I <3 the slab allocator!
• It shows the importance of well-designed, well-considered,
well-described core services
• We haven’t need to rewrite it — it has withstood ~4 orders of
magnitude increase in machine size
• We have been able to meaningfully enhance it over the years
• It is (for me) the canonical immaculate system: more one to
use and be inspired by than need to fix or change
• It remains at the heart of our system — and its ethos very
much remains the zeitgeist of illumos!
Acknowledgements
• Jeff Bonwick for creating not just an allocator, but a way of
thinking about systems problems and of implementing their
solutions — and the courage to rewrite broken software!
• Jonathan Adams for not just taking on libumem, but also
writing about it formally and productizing it
• Robert Mustacchi for per-thread caching libumem
• Ryan Zezeski for bringing the slab allocator to a new
generation with his Papers We Love talk
Further reading
• Jeff Bonwick, The Slab Allocator: An Object-Caching Kernel Memory
Allocator, Summer USENIX, 1994
• Jeff Bonwick and Jonathan Adams, Magazines and Vmem: Extending
the Slab Allocator to Many CPUs and Arbitrary Resources, USENIX
Annual Technical Conference, 2001
• Robert Mustacchi, Per-thread caching in libumem, blog entry on
dtrace.org, July 2012
• Ryan Zezeski, Memory by the Slab: The Tale of Bonwick’s Slab
Allocator, Papers We Love NYC, September 2015
• Uresh Vahalia, UNIX Internals, 1996

More Related Content

What's hot

Run containers on bare metal already!
Run containers on bare metal already!Run containers on bare metal already!
Run containers on bare metal already!
bcantrill
 
The DIY Punk Rock DevOps Playbook
The DIY Punk Rock DevOps PlaybookThe DIY Punk Rock DevOps Playbook
The DIY Punk Rock DevOps Playbook
bcantrill
 
Platform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyondPlatform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyond
bcantrill
 
Leaping the chasm from proprietary to open: A survivor's guide
Leaping the chasm from proprietary to open: A survivor's guideLeaping the chasm from proprietary to open: A survivor's guide
Leaping the chasm from proprietary to open: A survivor's guide
bcantrill
 
Oral tradition in software engineering: Passing the craft across generations
Oral tradition in software engineering: Passing the craft across generationsOral tradition in software engineering: Passing the craft across generations
Oral tradition in software engineering: Passing the craft across generations
bcantrill
 
Docker's Killer Feature: The Remote API
Docker's Killer Feature: The Remote APIDocker's Killer Feature: The Remote API
Docker's Killer Feature: The Remote API
bcantrill
 
The Internet-of-things: Architecting for the deluge of data
The Internet-of-things: Architecting for the deluge of dataThe Internet-of-things: Architecting for the deluge of data
The Internet-of-things: Architecting for the deluge of data
bcantrill
 
node.js in production: Reflections on three years of riding the unicorn
node.js in production: Reflections on three years of riding the unicornnode.js in production: Reflections on three years of riding the unicorn
node.js in production: Reflections on three years of riding the unicorn
bcantrill
 
Joyent circa 2006 (Scale with Rails)
Joyent circa 2006 (Scale with Rails)Joyent circa 2006 (Scale with Rails)
Joyent circa 2006 (Scale with Rails)
bcantrill
 
Manta: a new internet-facing object storage facility that features compute by...
Manta: a new internet-facing object storage facility that features compute by...Manta: a new internet-facing object storage facility that features compute by...
Manta: a new internet-facing object storage facility that features compute by...
Hakka Labs
 
Zebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data pathZebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data path
bcantrill
 
Platform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system softwarePlatform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system software
bcantrill
 
Docker-N-Beyond
Docker-N-BeyondDocker-N-Beyond
Docker-N-Beyondsantosh007
 
Course 101: Lecture 5: Linux & GNU
Course 101: Lecture 5: Linux & GNU Course 101: Lecture 5: Linux & GNU
Course 101: Lecture 5: Linux & GNU
Ahmed El-Arabawy
 
Uklug2011.lotus.on.linux.report.technical.edition.v1.0
Uklug2011.lotus.on.linux.report.technical.edition.v1.0Uklug2011.lotus.on.linux.report.technical.edition.v1.0
Uklug2011.lotus.on.linux.report.technical.edition.v1.0dominion
 
Course 101: Lecture 1: Introduction to Embedded Systems
Course 101: Lecture 1: Introduction to Embedded SystemsCourse 101: Lecture 1: Introduction to Embedded Systems
Course 101: Lecture 1: Introduction to Embedded Systems
Ahmed El-Arabawy
 
Docker - Hack Salem! - November 2014
Docker - Hack Salem! - November 2014Docker - Hack Salem! - November 2014
Docker - Hack Salem! - November 2014
Charles Anderson
 
Meetup Mesos : Mesos, Chronos and Marathon in CI/CD factory
Meetup Mesos : Mesos, Chronos and Marathon in CI/CD factoryMeetup Mesos : Mesos, Chronos and Marathon in CI/CD factory
Meetup Mesos : Mesos, Chronos and Marathon in CI/CD factory
Laurent Grangeau
 
DevOps'n the Operating System
DevOps'n the Operating SystemDevOps'n the Operating System
DevOps'n the Operating System
C4Media
 

What's hot (20)

Run containers on bare metal already!
Run containers on bare metal already!Run containers on bare metal already!
Run containers on bare metal already!
 
The DIY Punk Rock DevOps Playbook
The DIY Punk Rock DevOps PlaybookThe DIY Punk Rock DevOps Playbook
The DIY Punk Rock DevOps Playbook
 
Platform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyondPlatform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyond
 
Leaping the chasm from proprietary to open: A survivor's guide
Leaping the chasm from proprietary to open: A survivor's guideLeaping the chasm from proprietary to open: A survivor's guide
Leaping the chasm from proprietary to open: A survivor's guide
 
Oral tradition in software engineering: Passing the craft across generations
Oral tradition in software engineering: Passing the craft across generationsOral tradition in software engineering: Passing the craft across generations
Oral tradition in software engineering: Passing the craft across generations
 
Docker's Killer Feature: The Remote API
Docker's Killer Feature: The Remote APIDocker's Killer Feature: The Remote API
Docker's Killer Feature: The Remote API
 
The Internet-of-things: Architecting for the deluge of data
The Internet-of-things: Architecting for the deluge of dataThe Internet-of-things: Architecting for the deluge of data
The Internet-of-things: Architecting for the deluge of data
 
node.js in production: Reflections on three years of riding the unicorn
node.js in production: Reflections on three years of riding the unicornnode.js in production: Reflections on three years of riding the unicorn
node.js in production: Reflections on three years of riding the unicorn
 
Joyent circa 2006 (Scale with Rails)
Joyent circa 2006 (Scale with Rails)Joyent circa 2006 (Scale with Rails)
Joyent circa 2006 (Scale with Rails)
 
Manta: a new internet-facing object storage facility that features compute by...
Manta: a new internet-facing object storage facility that features compute by...Manta: a new internet-facing object storage facility that features compute by...
Manta: a new internet-facing object storage facility that features compute by...
 
Zebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data pathZebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data path
 
Platform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system softwarePlatform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system software
 
Docker-N-Beyond
Docker-N-BeyondDocker-N-Beyond
Docker-N-Beyond
 
Course 101: Lecture 5: Linux & GNU
Course 101: Lecture 5: Linux & GNU Course 101: Lecture 5: Linux & GNU
Course 101: Lecture 5: Linux & GNU
 
Uklug2011.lotus.on.linux.report.technical.edition.v1.0
Uklug2011.lotus.on.linux.report.technical.edition.v1.0Uklug2011.lotus.on.linux.report.technical.edition.v1.0
Uklug2011.lotus.on.linux.report.technical.edition.v1.0
 
Course 101: Lecture 1: Introduction to Embedded Systems
Course 101: Lecture 1: Introduction to Embedded SystemsCourse 101: Lecture 1: Introduction to Embedded Systems
Course 101: Lecture 1: Introduction to Embedded Systems
 
Docker - Hack Salem! - November 2014
Docker - Hack Salem! - November 2014Docker - Hack Salem! - November 2014
Docker - Hack Salem! - November 2014
 
Meetup Mesos : Mesos, Chronos and Marathon in CI/CD factory
Meetup Mesos : Mesos, Chronos and Marathon in CI/CD factoryMeetup Mesos : Mesos, Chronos and Marathon in CI/CD factory
Meetup Mesos : Mesos, Chronos and Marathon in CI/CD factory
 
DevOps'n the Operating System
DevOps'n the Operating SystemDevOps'n the Operating System
DevOps'n the Operating System
 
Elatt Presentation
Elatt PresentationElatt Presentation
Elatt Presentation
 

Viewers also liked

Leadership Without Management: Scaling Organizations by Scaling Engineers
Leadership Without Management: Scaling Organizations by Scaling EngineersLeadership Without Management: Scaling Organizations by Scaling Engineers
Leadership Without Management: Scaling Organizations by Scaling Engineers
bcantrill
 
The State of Cloud 2016: The whirlwind of creative destruction
The State of Cloud 2016: The whirlwind of creative destructionThe State of Cloud 2016: The whirlwind of creative destruction
The State of Cloud 2016: The whirlwind of creative destruction
bcantrill
 
Debugging (Docker) containers in production
Debugging (Docker) containers in productionDebugging (Docker) containers in production
Debugging (Docker) containers in production
bcantrill
 
Corporate Open Source Anti-patterns
Corporate Open Source Anti-patternsCorporate Open Source Anti-patterns
Corporate Open Source Anti-patterns
bcantrill
 
Debugging microservices in production
Debugging microservices in productionDebugging microservices in production
Debugging microservices in production
bcantrill
 
A crime against common sense
A crime against common senseA crime against common sense
A crime against common sense
bcantrill
 
Software Architectures, Week 2 - Decomposition techniques
Software Architectures, Week 2 - Decomposition techniquesSoftware Architectures, Week 2 - Decomposition techniques
Software Architectures, Week 2 - Decomposition techniquesAngelos Kapsimanis
 
Application resiliency using netflix hystrix
Application resiliency using netflix hystrixApplication resiliency using netflix hystrix
Application resiliency using netflix hystrix
Senthilkumar Gopal
 
Concepts of React
Concepts of ReactConcepts of React
Concepts of React
inovex GmbH
 
Testing Microservices
Testing MicroservicesTesting Microservices
Testing Microservices
Anne-Marie Charrett
 

Viewers also liked (10)

Leadership Without Management: Scaling Organizations by Scaling Engineers
Leadership Without Management: Scaling Organizations by Scaling EngineersLeadership Without Management: Scaling Organizations by Scaling Engineers
Leadership Without Management: Scaling Organizations by Scaling Engineers
 
The State of Cloud 2016: The whirlwind of creative destruction
The State of Cloud 2016: The whirlwind of creative destructionThe State of Cloud 2016: The whirlwind of creative destruction
The State of Cloud 2016: The whirlwind of creative destruction
 
Debugging (Docker) containers in production
Debugging (Docker) containers in productionDebugging (Docker) containers in production
Debugging (Docker) containers in production
 
Corporate Open Source Anti-patterns
Corporate Open Source Anti-patternsCorporate Open Source Anti-patterns
Corporate Open Source Anti-patterns
 
Debugging microservices in production
Debugging microservices in productionDebugging microservices in production
Debugging microservices in production
 
A crime against common sense
A crime against common senseA crime against common sense
A crime against common sense
 
Software Architectures, Week 2 - Decomposition techniques
Software Architectures, Week 2 - Decomposition techniquesSoftware Architectures, Week 2 - Decomposition techniques
Software Architectures, Week 2 - Decomposition techniques
 
Application resiliency using netflix hystrix
Application resiliency using netflix hystrixApplication resiliency using netflix hystrix
Application resiliency using netflix hystrix
 
Concepts of React
Concepts of ReactConcepts of React
Concepts of React
 
Testing Microservices
Testing MicroservicesTesting Microservices
Testing Microservices
 

Similar to Down Memory Lane: Two Decades with the Slab Allocator

Linux Distribution Collaboration …on a Mainframe!
Linux Distribution Collaboration …on a Mainframe!Linux Distribution Collaboration …on a Mainframe!
Linux Distribution Collaboration …on a Mainframe!
All Things Open
 
Search Architecture at Evernote: Presented by Christian Kohlschütter, Evernote
Search Architecture at Evernote: Presented by Christian Kohlschütter, EvernoteSearch Architecture at Evernote: Presented by Christian Kohlschütter, Evernote
Search Architecture at Evernote: Presented by Christian Kohlschütter, Evernote
Lucidworks
 
Building an Event Bus at Scale
Building an Event Bus at ScaleBuilding an Event Bus at Scale
Building an Event Bus at Scale
jimriecken
 
Hard Coding as a design approach
Hard Coding as a design approachHard Coding as a design approach
Hard Coding as a design approach
Oren Eini
 
The economies of scaling software - Abdel Remani
The economies of scaling software - Abdel RemaniThe economies of scaling software - Abdel Remani
The economies of scaling software - Abdel Remani
jaxconf
 
Make It Cooler: Using Decentralized Version Control
Make It Cooler: Using Decentralized Version ControlMake It Cooler: Using Decentralized Version Control
Make It Cooler: Using Decentralized Version Control
indiver
 
Vulnerability, exploit to metasploit
Vulnerability, exploit to metasploitVulnerability, exploit to metasploit
Vulnerability, exploit to metasploit
Tiago Henriques
 
The Economies of Scaling Software
The Economies of Scaling SoftwareThe Economies of Scaling Software
The Economies of Scaling Software
Abdelmonaim Remani
 
Introduction of vertical crawler
Introduction of vertical crawlerIntroduction of vertical crawler
Introduction of vertical crawler
Jinglun Li
 
But we're already open source! Why would I want to bring my code to Apache?
But we're already open source! Why would I want to bring my code to Apache?But we're already open source! Why would I want to bring my code to Apache?
But we're already open source! Why would I want to bring my code to Apache?
gagravarr
 
eZ Publish 5: from zero to automated deployment (and no regressions!) in one ...
eZ Publish 5: from zero to automated deployment (and no regressions!) in one ...eZ Publish 5: from zero to automated deployment (and no regressions!) in one ...
eZ Publish 5: from zero to automated deployment (and no regressions!) in one ...
Gaetano Giunta
 
Hands on kubernetes_container_orchestration
Hands on kubernetes_container_orchestrationHands on kubernetes_container_orchestration
Hands on kubernetes_container_orchestration
Amir Hossein Sorouri
 
NDC London 2020 - Challenges of Managing CoreFx Repo -- Karel Zikmund
NDC London 2020 - Challenges of Managing CoreFx Repo -- Karel ZikmundNDC London 2020 - Challenges of Managing CoreFx Repo -- Karel Zikmund
NDC London 2020 - Challenges of Managing CoreFx Repo -- Karel Zikmund
Karel Zikmund
 
Scaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHPScaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHP
120bi
 
Scaling High Traffic Web Applications
Scaling High Traffic Web ApplicationsScaling High Traffic Web Applications
Scaling High Traffic Web Applications
Achievers Tech
 
Random House
Random HouseRandom House
Random House
victorlukianchikov
 
OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar...
OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar...OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar...
OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar...
MayaData
 
2020 oct zowe quarterly webinar series
2020 oct zowe quarterly webinar series2020 oct zowe quarterly webinar series
2020 oct zowe quarterly webinar series
Open Mainframe Project
 
MPI Requirements of the Network Layer
MPI Requirements of the Network LayerMPI Requirements of the Network Layer
MPI Requirements of the Network Layer
inside-BigData.com
 
Linux operating system
Linux operating systemLinux operating system
Linux operating system
ITz_1
 

Similar to Down Memory Lane: Two Decades with the Slab Allocator (20)

Linux Distribution Collaboration …on a Mainframe!
Linux Distribution Collaboration …on a Mainframe!Linux Distribution Collaboration …on a Mainframe!
Linux Distribution Collaboration …on a Mainframe!
 
Search Architecture at Evernote: Presented by Christian Kohlschütter, Evernote
Search Architecture at Evernote: Presented by Christian Kohlschütter, EvernoteSearch Architecture at Evernote: Presented by Christian Kohlschütter, Evernote
Search Architecture at Evernote: Presented by Christian Kohlschütter, Evernote
 
Building an Event Bus at Scale
Building an Event Bus at ScaleBuilding an Event Bus at Scale
Building an Event Bus at Scale
 
Hard Coding as a design approach
Hard Coding as a design approachHard Coding as a design approach
Hard Coding as a design approach
 
The economies of scaling software - Abdel Remani
The economies of scaling software - Abdel RemaniThe economies of scaling software - Abdel Remani
The economies of scaling software - Abdel Remani
 
Make It Cooler: Using Decentralized Version Control
Make It Cooler: Using Decentralized Version ControlMake It Cooler: Using Decentralized Version Control
Make It Cooler: Using Decentralized Version Control
 
Vulnerability, exploit to metasploit
Vulnerability, exploit to metasploitVulnerability, exploit to metasploit
Vulnerability, exploit to metasploit
 
The Economies of Scaling Software
The Economies of Scaling SoftwareThe Economies of Scaling Software
The Economies of Scaling Software
 
Introduction of vertical crawler
Introduction of vertical crawlerIntroduction of vertical crawler
Introduction of vertical crawler
 
But we're already open source! Why would I want to bring my code to Apache?
But we're already open source! Why would I want to bring my code to Apache?But we're already open source! Why would I want to bring my code to Apache?
But we're already open source! Why would I want to bring my code to Apache?
 
eZ Publish 5: from zero to automated deployment (and no regressions!) in one ...
eZ Publish 5: from zero to automated deployment (and no regressions!) in one ...eZ Publish 5: from zero to automated deployment (and no regressions!) in one ...
eZ Publish 5: from zero to automated deployment (and no regressions!) in one ...
 
Hands on kubernetes_container_orchestration
Hands on kubernetes_container_orchestrationHands on kubernetes_container_orchestration
Hands on kubernetes_container_orchestration
 
NDC London 2020 - Challenges of Managing CoreFx Repo -- Karel Zikmund
NDC London 2020 - Challenges of Managing CoreFx Repo -- Karel ZikmundNDC London 2020 - Challenges of Managing CoreFx Repo -- Karel Zikmund
NDC London 2020 - Challenges of Managing CoreFx Repo -- Karel Zikmund
 
Scaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHPScaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHP
 
Scaling High Traffic Web Applications
Scaling High Traffic Web ApplicationsScaling High Traffic Web Applications
Scaling High Traffic Web Applications
 
Random House
Random HouseRandom House
Random House
 
OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar...
OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar...OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar...
OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar...
 
2020 oct zowe quarterly webinar series
2020 oct zowe quarterly webinar series2020 oct zowe quarterly webinar series
2020 oct zowe quarterly webinar series
 
MPI Requirements of the Network Layer
MPI Requirements of the Network LayerMPI Requirements of the Network Layer
MPI Requirements of the Network Layer
 
Linux operating system
Linux operating systemLinux operating system
Linux operating system
 

More from bcantrill

Predicting the Present
Predicting the PresentPredicting the Present
Predicting the Present
bcantrill
 
Sharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of ToolmakingSharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
 
Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...
bcantrill
 
I have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systemsI have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systems
bcantrill
 
Towards Holistic Systems
Towards Holistic SystemsTowards Holistic Systems
Towards Holistic Systems
bcantrill
 
The Coming Firmware Revolution
The Coming Firmware RevolutionThe Coming Firmware Revolution
The Coming Firmware Revolution
bcantrill
 
Hardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden AgeHardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden Age
bcantrill
 
Tockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator tracesTockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator traces
bcantrill
 
No Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's LawNo Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's Law
bcantrill
 
Andreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software EngineeringAndreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software Engineering
bcantrill
 
Visualizing Systems with Statemaps
Visualizing Systems with StatemapsVisualizing Systems with Statemaps
Visualizing Systems with Statemaps
bcantrill
 
Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?
bcantrill
 
dtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the uniondtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the union
bcantrill
 
The Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systemsThe Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systems
bcantrill
 
Papers We Love: ARC after dark
Papers We Love: ARC after darkPapers We Love: ARC after dark
Papers We Love: ARC after dark
bcantrill
 
Principles of Technology Leadership
Principles of Technology LeadershipPrinciples of Technology Leadership
Principles of Technology Leadership
bcantrill
 
Debugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mindDebugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mind
bcantrill
 

More from bcantrill (17)

Predicting the Present
Predicting the PresentPredicting the Present
Predicting the Present
 
Sharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of ToolmakingSharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of Toolmaking
 
Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...
 
I have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systemsI have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systems
 
Towards Holistic Systems
Towards Holistic SystemsTowards Holistic Systems
Towards Holistic Systems
 
The Coming Firmware Revolution
The Coming Firmware RevolutionThe Coming Firmware Revolution
The Coming Firmware Revolution
 
Hardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden AgeHardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden Age
 
Tockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator tracesTockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator traces
 
No Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's LawNo Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's Law
 
Andreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software EngineeringAndreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software Engineering
 
Visualizing Systems with Statemaps
Visualizing Systems with StatemapsVisualizing Systems with Statemaps
Visualizing Systems with Statemaps
 
Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?
 
dtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the uniondtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the union
 
The Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systemsThe Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systems
 
Papers We Love: ARC after dark
Papers We Love: ARC after darkPapers We Love: ARC after dark
Papers We Love: ARC after dark
 
Principles of Technology Leadership
Principles of Technology LeadershipPrinciples of Technology Leadership
Principles of Technology Leadership
 
Debugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mindDebugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mind
 

Recently uploaded

PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Enhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZEnhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZ
Globus
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
Peter Spielvogel
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
Jen Stirrup
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 

Recently uploaded (20)

PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Enhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZEnhancing Performance with Globus and the Science DMZ
Enhancing Performance with Globus and the Science DMZ
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 

Down Memory Lane: Two Decades with the Slab Allocator

  • 1. Down Memory Lane: Two Decades with the Slab Allocator CTO bryan@joyent.com Bryan Cantrill @bcantrill
  • 3. Aside: Me, two decades ago…
  • 4. Aside: Me, two decades ago…
  • 7. I <3: Cache-aware data structures
  • 8. I <3: Cache-aware data structures Hardware Antiques Roadshow!
  • 10. I <3: Magazine autoscaling
  • 11. I <3: Debugging support • For many, the most salient property of the slab allocator is its eponymous allocation properties… • …but for me, its debugging support is much more meaningful • Rich support for debugging memory corruption issues: allocation auditing + detection of double-free/use-after-free/ use-before-initialize/buffer overrun • Emphasis was leaving the functionality compiled in (and therefore available in production) — even if off by default • kmem_flags often used to debug many kinds of problems!
  • 12. I <3: Debugging support
  • 13. I <3: Debugger support
  • 14. I <3: MDB support • Work in crash(1M) inspired deeper kmem debugging support in MDB, including commands to: • Walk the buffers from a cache (::walk kmem) • Display buffers allocated by a thread (::allocdby) • Verify integrity of kernel memory (::kmem_verify) • Determine the kmem cache of a buffer (::whatis) • Find memory leaks (::findleaks) • And of course, for Bonwick, ::kmastat and ::kmausers!
  • 15. I <3: libumem • Tragic to have a gold-plated kernel facility while user-land suffered in the squalor of malloc(3C)… • In the (go-go!) summer of 2000, Jonathan Adams (then a Sun Microsystems intern from CalTech) ported the kernel memory allocator to user-land as libumem • Jonathan returned to Sun as a full-time engineer to finish and integrate libumem; became open source with OpenSolaris • libumem is the allocator for illumos and derivatives like SmartOS — and has since been ported to other systems
  • 16. libumem performance • The slab allocator was designed to be scalable — not necessarily to accommodate pathological software • At user-level, pathological software is much more common… • Worse, the flexibility of user-level means that operations that are quick in the kernel (e.g., grabbing an uncontended lock) are more expensive at user-level • Upshot: while libumem provided great scalability, its latency was worse than other allocators for applications with small, short-lived allocations
  • 18. I <3: Per-thread caching libumem • Robert Mustacchi added per-thread caching to libumem: • free() doesn’t free buffer, but rather enqueues on a per- size cache on ulwp_t (thread) structure • malloc() checks the thread’s cache at given size first • Several problems: • How to prevent cache from growing without bound? • How to map dynamic libumem object cache sizes to fixed range in ulwp_t structure without slowing fast path?
  • 19. I <3: Per-thread caching libumem • Cache growth problem solved by having each thread track the total memory in its cache, and allowing per-thread cache size to be tuned (default is 1M) • Solving problem of optimal malloc() in light of arbitrary libumem cache sizes is a little (okay, a lot) gnarlier; from
 usr/src/lib/libumem/umem.c:
  • 20. I <3 the slab allocator! • It shows the importance of well-designed, well-considered, well-described core services • We haven’t need to rewrite it — it has withstood ~4 orders of magnitude increase in machine size • We have been able to meaningfully enhance it over the years • It is (for me) the canonical immaculate system: more one to use and be inspired by than need to fix or change • It remains at the heart of our system — and its ethos very much remains the zeitgeist of illumos!
  • 21. Acknowledgements • Jeff Bonwick for creating not just an allocator, but a way of thinking about systems problems and of implementing their solutions — and the courage to rewrite broken software! • Jonathan Adams for not just taking on libumem, but also writing about it formally and productizing it • Robert Mustacchi for per-thread caching libumem • Ryan Zezeski for bringing the slab allocator to a new generation with his Papers We Love talk
  • 22. Further reading • Jeff Bonwick, The Slab Allocator: An Object-Caching Kernel Memory Allocator, Summer USENIX, 1994 • Jeff Bonwick and Jonathan Adams, Magazines and Vmem: Extending the Slab Allocator to Many CPUs and Arbitrary Resources, USENIX Annual Technical Conference, 2001 • Robert Mustacchi, Per-thread caching in libumem, blog entry on dtrace.org, July 2012 • Ryan Zezeski, Memory by the Slab: The Tale of Bonwick’s Slab Allocator, Papers We Love NYC, September 2015 • Uresh Vahalia, UNIX Internals, 1996