Threading Successes 02: Supreme Commander

© All Rights Reserved


Threading Successes 02 Supreme Commander Presentation Transcript

  • 1. THQ/Gas Powered Games Supreme Commander and Supreme Commander: Forged Alliance Thread for Performance
  • 2. Supreme Commander runs best on 4 cores - let’s see how!
    • Threading in mid-project can be done!
    • Decoupled threads give great performance
    • Memory management extends the gains
    • Lessons learned
  • 3. Threading was a mid-stream change
    • Code was initially single-threaded
      • Game demanded more performance
      • Changed mid-project (6-12 months into development)
      • Separate render/sim threads to run at different rates
      • Support multiple cores
    • Limited architecture choices due to existing code
    • Using Boost thread library
      • Portable, open-source thread library
  • 4. Render split is essential to speed
    • Lots of “little” threads: sound, loading, etc.
    • Sim thread: All simulation
    • Render thread: Full speed, <=10x per sim tick
    • Sync phase: Once frame is ready to render
      • Sync render and sim
      • Fully queued in and out of sim
      • Fast
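The split described in slides 3 and 4 can be sketched roughly as follows. This is a minimal illustration, not the game's code: it uses `std::thread` rather than Boost, and `SimState`, `SyncPoint`, and `runDecoupled` are hypothetical names. It also does not model the "<=10x per sim tick" cap; the render loop simply redraws from the last synced state until the sim finishes.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical game state: a tick counter and one value the sim advances.
struct SimState {
    long tick = 0;
    double unitX = 0.0;
};

// Hypothetical sync point: the sim publishes a finished state and the
// renderer copies the latest one. The lock is held only for the copy.
class SyncPoint {
public:
    void publish(const SimState& s) {
        std::lock_guard<std::mutex> lock(m_);
        latest_ = s;
    }
    SimState snapshot() const {
        std::lock_guard<std::mutex> lock(m_);
        return latest_;
    }
private:
    mutable std::mutex m_;
    SimState latest_;
};

// Runs `ticks` simulation steps on a separate thread while the calling
// thread "renders" as fast as it can. Returns the number of frames rendered.
long runDecoupled(long ticks, SyncPoint& sync) {
    assert(ticks > 0);
    std::atomic<bool> simDone{false};
    std::thread sim([&] {
        SimState s;
        for (long t = 1; t <= ticks; ++t) {
            s.tick = t;
            s.unitX += 0.5;      // stand-in for real simulation work
            sync.publish(s);     // sync phase: hand the finished state over
        }
        simDone.store(true);
    });
    long frames = 0;
    do {
        SimState view = sync.snapshot();  // render from the last synced state
        (void)view;                       // a real renderer would draw here
        ++frames;
    } while (!simDone.load());
    sim.join();
    return frames;
}
```

The render loop never waits for a tick to complete; it only takes the short lock needed to copy the latest state, which is one way to keep the sync phase fast.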
  • 5. Decoupled architecture is built for speed
    • Ready to start a frame and a simulation tick
    [Diagram label: Issue]
  • 6. Decoupled architecture is built for speed
    • Run decoupled sim and render
    • Fully buffered input to sim, call via Sim Thread Interface
    [Diagram labels: Issue, Sim Thread Interface, Simulation, Render]
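One way to picture "fully buffered input to sim" is a command queue that other threads append to and the sim thread drains once per tick. The slides do not show the real Sim Thread Interface, so the class below, its `issue` and `drain` methods, and the `SimCommand` fields are all assumptions made for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical command type; the real interface is not shown in the slides.
struct SimCommand {
    int unitId;
    std::string order;
};

// Sketch of a buffered interface into the sim thread.
class SimThreadInterface {
public:
    // Called from the render/UI side at any time; only appends to a buffer.
    void issue(SimCommand cmd) {
        assert(!cmd.order.empty());
        std::lock_guard<std::mutex> lock(m_);
        pending_.push_back(std::move(cmd));
    }
    // Called by the sim thread at the start of a tick; takes the whole
    // batch in one swap so the lock is held very briefly.
    std::vector<SimCommand> drain() {
        std::vector<SimCommand> batch;
        std::lock_guard<std::mutex> lock(m_);
        batch.swap(pending_);
        return batch;
    }
private:
    std::mutex m_;
    std::vector<SimCommand> pending_;
};
```

Because callers only ever touch the buffer, the sim never sees a command arrive in the middle of a tick.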
  • 7. Decoupled architecture is built for speed
    • Render can run repeatedly
    • Depends on sim duration
    [Diagram labels: Issue, Simulation, Render (up to 10x per sim tick)]
  • 8. Decoupled architecture is built for speed
    • Fully decoupled? No.
    • A few low level systems have locks.
    • No major performance impact!
    [Diagram labels: Issue, Simulation, Render (up to 10x per sim tick), Locks]
  • 9. Decoupled architecture is built for speed
    • Sync sim thread out to render thread, via the Sim Thread Interface (STI) again
    [Diagram labels: Issue, Sim Thread Interface, Simulation, Render (up to 10x per sim tick)]
  • 10. Decoupled architecture is built for speed
    • Multiplayer: Record everything going through STI
    • Send over network
    [Diagram labels: Issue, Sim Thread Interface, Simulation, Render (up to 10x per sim tick)]
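Slide 10's idea of recording everything that passes through the interface might look something like the sketch below. The record layout, the `CommandRecorder` name, and the line-based text format are all hypothetical, and no networking is shown; the string returned by `packetForTick` only stands in for whatever would be sent to other players.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record of one command that crossed the interface.
struct RecordedCommand {
    long tick;
    int unitId;
    std::string order;
};

// Sketch: log every command with the tick it applies to, then serialize
// one tick's commands as text for sending to other machines.
class CommandRecorder {
public:
    void record(long tick, int unitId, const std::string& order) {
        assert(tick >= 0);
        log_.push_back({tick, unitId, order});
    }
    // One "tick unitId order" line per command recorded for that tick.
    std::string packetForTick(long tick) const {
        std::ostringstream out;
        for (const RecordedCommand& c : log_) {
            if (c.tick == tick)
                out << c.tick << ' ' << c.unitId << ' ' << c.order << '\n';
        }
        return out.str();
    }
    std::size_t size() const { return log_.size(); }
private:
    std::vector<RecordedCommand> log_;
};
```

If every machine applies the same recorded commands on the same tick, each sim can run locally, which fits the slide's point that only interface traffic needs to cross the network.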
  • 11. Decoupled architecture is built for speed
    • And so on…
    [Diagram labels: Issue, Simulation, Render (up to 10x per sim tick), repeating for each following sim tick]
  • 12. Thread model adapts to varying loads
    • Architecture scales well with loads
      • Render load will often dominate
      • Re-render to keep frame rates up
      • Sim-heavy map will try to be sim-dominated
  • 13. Displaying frame times – cool!
    • Thread stats in real time
  • 14. Sometimes, there’s more to render
    • Render: runs as fast as possible
    • Simulation
    • Sim/render sync: both threads synced, fully queued in and out of sim
  • 15. Other times, there’s more to simulate
    • Sim runs across many rendered frames
  • 16. A little sync doesn’t slow this code down
    • Threads are busy most of the time!
    [Diagram labels: Frame n, Frame n+1, Sync, Busy, Waiting, Mostly waiting]
  • 17. Memory manager gives an additional boost
    • Memory: If you’re not careful in a threaded game…
      • Memory use can thrash cache – but not a problem here!
      • Memory alloc/free can be slow
    • Suspected memory management was a problem
      • Doing lots of small allocations
      • Built code to make it easy to switch mem managers
    • Custom mem manager outperforms default malloc/free
      • Can cause some debugging questions
      • Purchased commercial one for Supreme Commander
      • Wrote new one for Forged Alliance
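The slide mentions making it easy to switch memory managers and handling lots of small allocations. One plausible shape for that is a small interface with interchangeable backends, sketched below. None of these names come from the talk, and the pool shown is a generic fixed-size free-list allocator, not the commercial or in-house manager the team used. It is also not thread-safe; a threaded engine would need a lock or per-thread pools.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical interface that lets the engine swap memory managers.
class MemoryManager {
public:
    virtual ~MemoryManager() = default;
    virtual void* allocate(std::size_t bytes) = 0;
    virtual void deallocate(void* p) = 0;
};

// Default backend: forwards to malloc/free.
class MallocManager : public MemoryManager {
public:
    void* allocate(std::size_t bytes) override { return std::malloc(bytes); }
    void deallocate(void* p) override { std::free(p); }
};

// Fixed-size pool for small blocks: freed blocks go onto a free list and
// are reused. Returns nullptr when a request is too big or the pool is empty.
class SmallBlockPool : public MemoryManager {
public:
    SmallBlockPool(std::size_t blockSize, std::size_t blockCount)
        : blockSize_(roundUp(blockSize)), storage_(blockSize_ * blockCount) {
        assert(blockSize > 0 && blockCount > 0);
        for (std::size_t i = 0; i < blockCount; ++i)
            free_.push_back(storage_.data() + i * blockSize_);
    }
    void* allocate(std::size_t bytes) override {
        if (bytes > blockSize_ || free_.empty()) return nullptr;
        void* p = free_.back();
        free_.pop_back();
        return p;
    }
    void deallocate(void* p) override {
        if (p) free_.push_back(static_cast<unsigned char*>(p));
    }
private:
    // Round block sizes up so every block keeps the buffer's alignment.
    static std::size_t roundUp(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        return (n + a - 1) / a * a;
    }
    std::size_t blockSize_;
    std::vector<unsigned char> storage_;
    std::vector<unsigned char*> free_;
};
```

Code that allocates through a `MemoryManager*` can be pointed at either backend without changes, which makes it straightforward to compare a custom manager against the default one.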
  • 18. What are some current bottlenecks?
    • Multiplayer: all sims run concurrently
      • Limited by least-common-denominator machine
      • That’s the RTS way
    • Monolithic render thread
      • Multiple monitors, typically different views
      • Possibly split off top part of render for second monitor?
      • Too expensive/complex for niche feature
  • 19. This was a great learning experience!
    • Good intermediate step
      • Especially for threading mid-project
    • Would do it differently if doing it from scratch
      • Target more processor cores
      • General worker threads w/dispatch system
      • Templates to define an interface to common semantics
      • Directed work graph/node graph (hard to express)
      • Or …?
    • The engine is so good, it’ll be back in Demigod!
      • Demigod team using modified Supreme Commander engine
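The "general worker threads w/dispatch system" idea from slide 19 is what the speakers say they would build next time, not something the slides show. As a rough sketch of that kind of design, here is a small worker pool using the C++ standard library; `WorkerPool`, `dispatch`, and `waitIdle` are hypothetical names.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Sketch of a dispatch system: jobs go into one queue and any free
// worker thread picks up the next one.
class WorkerPool {
public:
    explicit WorkerPool(std::size_t workers) {
        assert(workers > 0);
        for (std::size_t i = 0; i < workers; ++i)
            threads_.emplace_back([this] { run(); });
    }
    ~WorkerPool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& t : threads_) t.join();
    }
    // Queue a job for any worker.
    void dispatch(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(m_);
            jobs_.push(std::move(job));
            ++outstanding_;
        }
        wake_.notify_one();
    }
    // Block until every dispatched job has finished.
    void waitIdle() {
        std::unique_lock<std::mutex> lock(m_);
        idle_.wait(lock, [this] { return outstanding_ == 0; });
    }
private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(m_);
                wake_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left to do
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock so workers do not serialize
            {
                std::lock_guard<std::mutex> lock(m_);
                if (--outstanding_ == 0) idle_.notify_all();
            }
        }
    }
    std::mutex m_;
    std::condition_variable wake_, idle_;
    std::queue<std::function<void()>> jobs_;
    std::size_t outstanding_ = 0;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};
```

A pool like this scales with the number of cores by changing one constructor argument, which is the appeal over a fixed sim/render split when targeting more processors.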
  • 20. We learned some DOs and DON’Ts
    • Do:
      • Architect for threading from the start, if you can
      • Thread single-threaded code, if you must
      • Decouple threads where possible
    • Don’t:
      • Be afraid to thread single-threaded code
  • 21. Supreme Commander runs best on 4 cores – that’s how!
    • Threading in mid-project can be done!
    • Decoupled threads give great performance
    • Memory management extends the gains
    • Lessons learned
  • 22. So, what do you think?
    • Have you tried something like this?
      • Successes?
      • Failures?
    • Have you rejected trying something like this?
      • Why?