The document discusses Goroutine stacks and stack allocation in Go. It explains that each Goroutine has an associated stack that starts small (128 bytes) but can grow as needed by allocating heap memory. There are two stacks - a system stack for M's and a user stack for Goroutines. The user stack uses a free list allocation scheme to efficiently allocate and free fixed-size stacks (e.g. 2KB, 4KB, 8KB). Stack growth is achieved by adjusting pointers when the stack size is exceeded.
НАДІЯ ФЕДЮШКО БАЦ «Професійне зростання QA спеціаліста»
Goroutine stack and local variable allocation in Go
1. Goroutine stack and local
variable allocation in Go
Cherie Hsieh
Golang Taiwan Community
GDG Hsinchu
Women Techmakers Community
2. Outline
1. The Run-Time Stack
2. User Stack in C Program
3. User Stack in Go goroutine
4. Goroutine Stack allocation
5. System Stack and User Stack
6. Stack Growing Mechanism
4. The Run-Time Stack
In common Linux systems, each process/thread has two stacks:
1. Kernel stack (system calls, interrupt handlers)
2. User stack (execute user-space applications)
5. The Run-Time Stack
User Stack provides last-in, first out memory management for procedure-calling.
source: https://www.mdpi.com/computers/computers-09-00048/article_deploy/html/images/computers-09-00048-g005.png
7. User Stack in C Program
The memory layout of a process with main thread.
source: https://cseweb.ucsd.edu/classes/sp16/cse120-a/applications/ln/proccontext.jpg
8. User Stack in C Program
The memory layout of a process with multiple threads.
source: https://cseweb.ucsd.edu/classes/sp16/cse120-a/applications/ln/thdcontext.jpg
9. Stack Size in C Program
Default user stack size of main thread: 8k
Use ulimit cmd-tool to check or change the default stack size.
execve() causes the program that is currently being run by the calling process to be
replaced with a new program, with newly initialized stack, heap, and data segments.
cherie@coscup:~$ ulimit -a
stack size (kbytes, -s) 8192
max user processes (-u) 59988
10. Stack Size in C Program
Create a new thread with a customized stack size using system call clone()
The stack argument specifies the location of the stack used by the child threads.
int clone(int (*fn)(void *), void *stack, int flags, void *arg, ...
/* pid_t *parent_tid, void *tls, pid_t *child_tid */);
12. User Stack in Go goroutine
Goroutine is lightweight, costing little more than the allocation of stack space. And the
stacks start small, so they are cheap.
/* source code: runtime/runtime2.go */
type stack struct {
lo uintptr
hi uintptr
}
type g struct {
// Stack parameters.
// stack describes the actual stack memory: [stack.lo, stack.hi).
stack stack
// ...more members
}
14. The Free List in stackCache and stackPool
Four size classes of free list: 2k, 4k, 8k, and 16k.
/* source code: runtime/malloc.go */
/* We want to cache 2KB, 4KB, 8KB, and 16KB stacks. Larger stacks */
/* will be allocated directly. */
/* Since FixedStack is different on different systems, we */
/* must vary NumStackOrders to keep the same maximum cached size. */
/* OS | FixedStack | NumStackOrders */
/* -----------------+------------+--------------- */
/* linux/darwin/bsd | 2KB | 4 */
/* windows/32 | 4KB | 3 */
/* windows/64 | 8KB | 2 */
/* plan9 | 4KB | 3 */
The size classes would be different due to different operating systems and
architectures.
15. The Free List in stackCache and stackPool
Considering the cache line of each architecture, a cache line padding is added to item
structure for performance optimization.
/* Global pool of spans that have free stacks. */
/* Stacks are assigned an order according to size. */
/* There is a free list for each order. */
var stackpool [_NumStackOrders]struct {
item stackpoolItem
_ [cpu.CacheLinePadSize -
unsafe.Sizeof(stackpoolItem{})%cpu.CacheLinePadSize]byte
}
17. System Stack in Go
There are two stacks in Go
1. User stack Every non-dead G has a user stack associated with it, which is what
user Go code executes on.
2. System stack Every M has a system stack associated with it (also known as the
M's "g0") . System and signal stacks cannot grow, but are large enough to execute
runtime and cgo code (8K).
Code running on the system stack is implicitly non-preemptible and the garbage
collector does not scan system stacks.
20. User Stack Extension in Go goroutine
A goroutines stack starts small, so they are cheap, and grow by allocating (and freeing)
heap storage as required.
const (
StackSmall = 128
StackBig = 4096
)
type g struct {
stack stack
stackguard0 uintptr
// ...more members
}
21. User Stack Extension in Go goroutine
Stack extension check:
1. frame size <= StackSmall
Comparing the stack pointer with the stack guard, then extend the stack if the stack pointer
is below the stack guard.
2. StackSmall < frame size <= StackBig
Moving down the stack pointer and comparing the stack pointer with the stack guard.
Extending the stack if the stack pointer is below the stack guard.
3. frame size > StackBig
Always does stack extension.
23. The Process of Stack Growth
How stack growth works?
1. record the stack map (global) for pointer checking.
2. adjust the pointer value according to adjustInfo.
28. Goroutine Initialization with > 2k function args
package main
/* change arrSize to 260 and check the result */
const arrSize = 4
var c = make(chan bool)
func bigArray(arr [arrSize]uint64) {
var sum uint64 = 1
for i := 0; i < arrSize; i++ {
sum = sum + arr[i]
}
c <- true
}
func main() {
var arr [arrSize]uint64
go bigArray(arr)
<-c
}
29. Further Study
1. stack frame is not always required
2. stack allocation with large frame size.
3. MemStats.StackInuse
memStats := new(runtime.MemStats)
runtime.ReadMemStats(memStats)