Understanding Real-World
Concurrency Bugs in Go
@kakashi
Hello!
I am kakashi
- Infra lead @UmboCV
- Co-organizer @ Golang Taipei Gathering
@kakashiliu
@kkcliu
Learning Camera Smart Cloud Neural A.I.
Agenda
● Introduction
● Concurrency in Go
● Go concurrency Bugs
○ Blocking
○ Non-Blocking
● Conclusion
Introduction
● A systematic study of 6 popular Go projects
Concurrency in Go
1. Making threads (goroutines) lightweight and easy to create
2. Using explicit messaging (via channels) to communicate across
threads
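A minimal sketch of both points, assuming a hypothetical helper named `squareAll` (not from the talk): each unit of work runs in its own goroutine, and results come back over a channel rather than through shared variables.

```go
package main

import "fmt"

// squareAll is a hypothetical helper: it starts one goroutine per input and
// collects the results over a channel instead of sharing memory.
func squareAll(nums []int) int {
	results := make(chan int)
	for _, n := range nums {
		go func(n int) { results <- n * n }(n) // goroutines are cheap to create
	}
	sum := 0
	for range nums {
		sum += <-results // explicit message passing
	}
	return sum
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3})) // prints 14
}
```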
Beliefs about Go:
● Makes concurrent programming easier and less error-prone
● Makes heavy use of message passing via channels, which is less error-prone than shared memory
● Has fewer concurrency bugs
● Built-in deadlock and data race detectors can catch any bugs
Go Concurrency Usage Patterns
A surprising finding: shared memory synchronisation operations are still used more often than message passing.
Thread-safe code, well-paid coders
Go Concurrency Bugs
1. Blocking: one or more goroutines are unintentionally stuck in their execution and cannot move forward.
2. Non-Blocking: all goroutines can finish their tasks, but their behavior is not what was intended.
Blocking Bugs Causes
Message passing operations are even more likely to cause blocking bugs than shared memory operations.
Blocking Bug caused by Mutex (buggy version)
    faultMutex.Lock()
    if faultDomain == nil {
        var err error
        faultDomain, err = fetchFaultDomain()
        if err != nil {
            // bug: returns with faultMutex still locked
            return cloudprovider.Zone{}, err
        }
    }
    zone := cloudprovider.Zone{}
    faultMutex.Unlock()
    return zone, nil
Blocking Bug caused by Mutex (fixed version)
    faultMutex.Lock()
    defer faultMutex.Unlock() // fix: released on every return path
    if faultDomain == nil {
        var err error
        faultDomain, err = fetchFaultDomain()
        if err != nil {
            return cloudprovider.Zone{}, err
        }
    }
    zone := cloudprovider.Zone{}
    // the explicit faultMutex.Unlock() is removed; defer handles it
    return zone, nil
Blocking Bug caused by WaitGroup (buggy version)
    var group sync.WaitGroup
    group.Add(len(pm.plugins))
    for _, p := range pm.plugins {
        go func(p *plugin) {
            defer group.Done()
        }(p)
        group.Wait() // bug: waits inside the loop, before all goroutines are started
    }
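Blocking Bug caused by WaitGroup (fixed version)
    var group sync.WaitGroup
    group.Add(len(pm.plugins))
    for _, p := range pm.plugins {
        go func(p *plugin) {
            defer group.Done()
        }(p)
        // group.Wait() was here and blocked; it is removed from the loop
    }
    group.Wait() // fixed: wait once, after every goroutine has been started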
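A runnable sketch of the fixed shape, assuming a hypothetical helper `runAll` that takes plain functions instead of plugins. `Add` is called once with the full count, and `Wait` is called once after the loop.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runAll is a hypothetical helper: it starts one goroutine per task and
// waits once, after the loop, for all of them to finish.
func runAll(tasks []func()) int64 {
	var done int64
	var group sync.WaitGroup
	group.Add(len(tasks))
	for _, t := range tasks {
		go func(t func()) {
			defer group.Done()
			t()
			atomic.AddInt64(&done, 1)
		}(t)
	}
	group.Wait() // outside the loop: every pending Done() has a goroutine to call it
	return atomic.LoadInt64(&done)
}

func main() {
	noop := func() {}
	fmt.Println(runAll([]func(){noop, noop, noop})) // prints 3
}
```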
Blocking Bug caused by Channel (buggy version)
    func finishReq(timeout time.Duration) (r ob) {
        ch := make(chan ob) // unbuffered
        go func() {
            result := fn()
            ch <- result // bug: blocks forever if the timeout fires first
        }()
        select {
        case result := <-ch:
            return result
        case <-time.After(timeout):
            return nil
        }
    }
Blocking Bug caused by Channel (fixed version)
    func finishReq(timeout time.Duration) (r ob) {
        ch := make(chan ob, 1) // fix: buffer of 1
        go func() {
            result := fn()
            ch <- result // used to block here; now the send always succeeds
        }()
        select {
        case result := <-ch:
            return result
        case <-time.After(timeout):
            return nil
        }
    }
Blocking Bug: Mistakenly using channel and mutex
    func goroutine1() {
        m.Lock()
        ch <- request // blocks while holding m: the receiver is waiting for m
        m.Unlock()
    }
    func goroutine2() {
        for {
            m.Lock() // blocks: goroutine1 still holds m
            m.Unlock()
            request = <-ch
        }
    }
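One possible way to break this cycle, shown as a sketch and not necessarily the fix used in the studied project: never block on the channel while holding the lock. The helper `trySend` is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

var m sync.Mutex

// trySend is a hypothetical helper: it holds the lock but never blocks on
// the channel, so the lock is always released promptly.
func trySend(ch chan<- int, v int) bool {
	m.Lock()
	defer m.Unlock()
	select {
	case ch <- v:
		return true
	default: // no receiver ready: give up instead of blocking under the lock
		return false
	}
}

func main() {
	ch := make(chan int)        // unbuffered, nobody is receiving
	fmt.Println(trySend(ch, 1)) // prints false
	m.Lock()                    // succeeds immediately: the lock was released
	m.Unlock()
}
```

The trade-off is that a send can now be dropped, so the caller has to decide whether to retry or discard the request.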
Non-Blocking Bugs Causes
Far fewer non-blocking bugs are caused by message passing than by shared memory accesses.
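Non-Blocking Bug caused by select and channel (buggy version)
    ticker := time.NewTicker(interval) // interval is a placeholder; the slide omits the argument
    for {
        f()
        select {
        // bug: if stopCh and the ticker are both ready, select picks one at
        // random, so f() may run again after stop was requested
        case <-stopCh:
            return
        case <-ticker.C:
        }
    }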
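Non-Blocking Bug caused by select and channel (fixed version)
    ticker := time.NewTicker(interval) // interval is a placeholder; the slide omits the argument
    for {
        select { // fix: check stopCh first, without blocking
        case <-stopCh:
            return
        default:
        }
        f()
        select {
        case <-stopCh:
            return
        case <-ticker.C:
        }
    }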
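Non-Blocking Bug caused by Timer (buggy version)
    timer := time.NewTimer(0) // bug: a zero-duration timer fires immediately
    if dur > 0 {
        timer = time.NewTimer(dur)
    }
    select {
    case <-timer.C: // taken right away when dur <= 0
    case <-ctx.Done():
        return nil
    }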
Non-Blocking Bug caused by Timer (fixed version)
    var timeout <-chan time.Time // fix: nil channel, a receive from it never proceeds
    if dur > 0 {
        timeout = time.NewTimer(dur).C
    }
    select {
    case <-timeout:
    case <-ctx.Done():
        return nil
    }
A data race caused by anonymous function (buggy version)
    for i := 17; i <= 21; i++ { // write
        go func() {
            apiVersion := fmt.Sprintf("v1.%d", i) // read of the shared i
        }()
    }
A data race caused by anonymous function (fixed version)
    for i := 17; i <= 21; i++ { // write
        go func(i int) { // fix: each goroutine gets its own copy of i
            apiVersion := fmt.Sprintf("v1.%d", i)
        }(i)
    }
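A runnable sketch of the fixed loop, assuming a hypothetical helper `versions` that collects the formatted strings so the result can be inspected. Note that since Go 1.22 each loop iteration has its own `i` (for modules that declare go 1.22 or later), so the unfixed form no longer races there; passing `i` explicitly works on every version.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// versions is a hypothetical helper: each goroutine receives its own copy of i.
func versions(from, to int) []string {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out []string
	)
	for i := from; i <= to; i++ {
		wg.Add(1)
		go func(i int) { // i is a parameter: a private copy per goroutine
			defer wg.Done()
			v := fmt.Sprintf("v1.%d", i)
			mu.Lock()
			out = append(out, v)
			mu.Unlock()
		}(i)
	}
	wg.Wait()
	sort.Strings(out) // goroutines finish in any order
	return out
}

func main() {
	fmt.Println(versions(17, 21)) // prints [v1.17 v1.18 v1.19 v1.20 v1.21]
}
```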
A data race caused by passing reference through channel
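The slide gives no code for this case, so the following is only an illustrative sketch with a hypothetical `config` type. Sending a pointer through a channel leaves sender and receiver sharing one object; sending a copy avoids that.

```go
package main

import "fmt"

type config struct{ retries int } // hypothetical payload type

// sendCopy sends the struct by value, so the receiver owns its own copy and
// later writes by the sender cannot affect what the receiver reads.
func sendCopy() int {
	ch := make(chan config, 1)
	c := &config{retries: 3}
	ch <- *c       // a copy, not the pointer
	c.retries = 99 // the sender keeps mutating its own object
	got := <-ch
	return got.retries
}

func main() {
	fmt.Println(sendCopy()) // prints 3
}
```

Had the channel carried `*config`, the receiver would observe the sender's later write, and with real goroutines on both sides that access would be a data race.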
Conclusion
1. Contrary to the common belief that message passing is less error-prone, more blocking bugs in the studied Go applications are caused by wrong message passing than by wrong shared memory protection.
2. Message passing causes fewer non-blocking bugs than shared memory synchronization.
3. Misusing Go libraries can cause both blocking and non-blocking bugs.
Q&A
