© 2007 Rachael L Madsen and Beverly T Block
Rachael Madsen
 Multiple Solutions Computing
 Portland, Oregon

rachael@multi-sol.com
www.multi-sol.com




Threads vs. Processes



    Multi-processor hardware



    Software choices for process management





The use of multiple computers or processors to solve a problem or perform a function



Threads and Processes

  Are Not The Same


A single process achieves parallelism by creating separate threads for subtasks

A thread shares context with its parent process

On a single processor, parallelism is an illusion created by interleaving
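A minimal sketch of what shared context means in practice, assuming Python's standard `threading` module (the slides do not prescribe a package). Every subtask thread writes into one dictionary owned by the parent process. The names `record_square` and `squares_with_threads` are illustrative only.

```python
import threading


def record_square(n, shared_results, lock):
    """Subtask: compute n*n and store it in a dict owned by the parent."""
    with lock:  # serialize writes to the shared structure
        shared_results[n] = n * n


def squares_with_threads(values):
    """Run one thread per value; all threads share the same dict."""
    shared_results = {}  # lives in the process's single address space
    lock = threading.Lock()
    threads = [
        threading.Thread(target=record_square, args=(v, shared_results, lock))
        for v in values
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every subtask to finish
    return shared_results


if __name__ == "__main__":
    print(squares_with_threads([1, 2, 3, 4]))
```

No data is copied or sent anywhere: each thread sees the parent's dictionary directly, which is also why the lock is needed.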




Each process has its own context

Overheads for creation, communication and context switching are higher than for threads

Processes allow true concurrent computing, even on separate systems
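A small sketch of the separate-context point, assuming Python's standard `subprocess` module and a second copy of the running interpreter as the child. Because the child has its own context, the data has to be sent to it and the result sent back explicitly (here as JSON over pipes). `CHILD_CODE` and `sum_squares_in_child` are hypothetical names.

```python
import json
import subprocess
import sys

# Code run by the child process: read JSON from stdin, write the result to stdout.
CHILD_CODE = (
    "import json, sys\n"
    "values = json.load(sys.stdin)\n"
    "print(json.dumps(sum(v * v for v in values)))\n"
)


def sum_squares_in_child(values):
    """Start a separate interpreter process and exchange data over pipes."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=json.dumps(values),  # explicit communication: parent -> child
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)  # explicit communication: child -> parent


if __name__ == "__main__":
    print(sum_squares_in_child([1, 2, 3]))
```

Starting the interpreter and serializing the data are the creation and communication overheads the slide refers to.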




Threads are generally faster than processes

The difference is very dependent on hardware and operating system

Reliable metrics are difficult to generate





Intel/AMD
    vs.
    Cell


Currently available

     Intel: 2 – 4 cores

     AMD: 2 cores (4 soon)

     Cell processors: 9 cores

     Graphics Processing Unit (GPU)



[Diagram: four processor cores, each with its own Architectural State, Execution Engine and Local APIC. Each pair of cores shares a Second Level Cache and a Bus Interface; both Bus Interfaces connect to the System Bus.]




SPE - Synergistic Processing Element
SPU - Synergistic Processor Unit
SXU - Synergistic Execution Unit
MFC - Memory Flow Control
LS - Local Storage
PPE - PowerPC Processor Element
PPU - PowerPC Processing Unit
PXU - PowerPC Execution Unit
L1, L2 - Local Storage
MIC - Memory Interface Controller
BIC - Broadband Interface Controller

[Diagram: eight SPEs, four above and four below the Element Interconnect Bus (EIB, up to 96B/cycle). Each SPE is drawn as a stack of SPU, SXU, LS and MFC. The PPE (PPU, PXU, L1, L2), the MIC and the BIC are attached to the same bus.]
PowerPC Processing Unit

[Diagram: Local Storage 1 | PowerPC Execution Unit | Local Storage 2]



Synergistic Processor Unit
   Synergistic Execution Unit



       Local Storage

    Memory Flow Control

User-Level Threads

Kernel-Level Threads

 Hardware Threads




Defining and Preparing Threads: performed by Programming Environment and Compiler
Operating Threads: performed by OS using Processes
Executing Threads: performed by Processors




User-Level Threads
          Kernel-Level Threads
           Hardware Threads


Intel/AMD: Sophisticated firmware on chip to handle process execution
Cell: Minimal firmware on chip to handle execution



User-Level Threads
          Kernel-Level Threads
           Hardware Threads


Intel/AMD: Multiple process management of threads by Operating System
Cell: Process management written by user: total control!


User-Level Threads
          Kernel-Level Threads
           Hardware Threads


Intel/AMD: Use threading package to manage threads (OpenMP, Pthreads, TBB, etc.)
Cell: User manages threads directly or by adapting a threading package


Intel/AMD: Completely controlled by OS and chip
Cell: Controlled by user

For execution to be fast, the execution block (code and data) must be kept in cache as much as possible.




Global Interpreter Lock (GIL)




    Cache Management




    Data Management




    Program Flow




    Thread Design





Python's Global Interpreter Lock (GIL) allows only one thread of the interpreter to run Python code at any given time

True multi-processing within a single interpreter is only available by calling lower-level (C/C++/Fortran, etc.) routines

This is as it should be! The Python interpreter should not be parallelized
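One way to observe the effect, sketched under the assumption of a standard CPython build with the GIL enabled: run the same CPU-bound, pure-Python function serially and then in threads, and compare wall-clock times. `count_primes`, `timed` and `compare` are hypothetical helpers. No timings are shown because they depend heavily on hardware and operating system.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound pure-Python work: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed(fn):
    """Call fn() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def compare(limits):
    """Run count_primes over limits serially, then with one thread per limit."""
    serial, t_serial = timed(lambda: [count_primes(x) for x in limits])
    with ThreadPoolExecutor(max_workers=len(limits)) as pool:
        threaded, t_threaded = timed(lambda: list(pool.map(count_primes, limits)))
    return serial, threaded, t_serial, t_threaded


if __name__ == "__main__":
    serial, threaded, t_serial, t_threaded = compare([200_000, 200_000])
    print("same results:", serial == threaded)
    print(f"serial {t_serial:.2f}s, threaded {t_threaded:.2f}s")
```

With the GIL in place the threaded run is not expected to be much faster than the serial one, since only one thread executes Python code at a time.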



Significant Factors

    Available memory




    Number of other processes running




    How the OS handling of threads and the hardware handling of threads interact with each other




Strategies
    Design data structures so that data can be sliced into small chunks

    Start with a small program and small data structures, then grow them slowly, watching for performance degradation

    Optimize code in called processes




    Not enough control to do much else!
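The first strategy can be sketched as follows, assuming the data is held in a sliceable sequence such as a list. `chunks` and `process_in_chunks` are hypothetical helper names, and the chunk size is something to tune by experiment, as the second strategy suggests.

```python
def chunks(data, size):
    """Yield consecutive slices of data, each at most `size` items long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(data), size):
        yield data[start:start + size]


def process_in_chunks(data, size, worker):
    """Apply worker to each chunk and collect the per-chunk results."""
    return [worker(chunk) for chunk in chunks(data, size)]


if __name__ == "__main__":
    print(list(chunks(list(range(7)), 3)))
    print(process_in_chunks([1, 2, 3, 4, 5], 2, sum))
```

Keeping the slicing separate from the work makes it easy to rerun the same job at several chunk sizes while watching for the point where performance degrades.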





Significant Factors

    Available memory on PowerPC



    Whether there are other users on the Cell

    Progressive computation on one set of data vs. separate computation on separate data


Strategies
    Process plus data for each SPE must fit within 256 KB

    Optimize code running on SPEs: try different options for your specific application

    Divide tasks sent to the PPE into chunks that will fit into the SPEs
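A rough sizing sketch for the 256 K limit, assuming the compiled code size and the size of one data item are known in bytes. This is plain Python arithmetic for planning, not Cell SDK code; `LOCAL_STORE_BYTES`, `max_items_per_chunk` and `plan_chunks` are hypothetical names, and `reserve_bytes` stands in for stack and buffers.

```python
LOCAL_STORE_BYTES = 256 * 1024  # per-SPE budget quoted on the slide


def max_items_per_chunk(code_bytes, item_bytes, reserve_bytes=0):
    """How many data items fit next to the code in one SPE's budget."""
    available = LOCAL_STORE_BYTES - code_bytes - reserve_bytes
    if item_bytes <= 0 or available < item_bytes:
        raise ValueError("code and reserve leave no room for data")
    return available // item_bytes


def plan_chunks(total_items, code_bytes, item_bytes, reserve_bytes=0):
    """Return (start, stop) index pairs, one per chunk sent to an SPE."""
    per_chunk = max_items_per_chunk(code_bytes, item_bytes, reserve_bytes)
    return [
        (start, min(start + per_chunk, total_items))
        for start in range(0, total_items, per_chunk)
    ]


if __name__ == "__main__":
    # Example: 64 KB of code, 4-byte items, 100000 items in total.
    print(plan_chunks(100_000, 64 * 1024, 4))
```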

Data Stream


  Data 1      Data 2      Data 3      Data 4



 Process      Process     Process    Process


 Result 1    Result 2    Result 3    Result 4



Different data is put through the same process
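The pattern in this diagram can be sketched with Python's standard `concurrent.futures` module, assumed here purely for illustration. One function is mapped over several data items, and results come back in input order. `normalize` and `same_process_on_each` are hypothetical names; threads are used for brevity, so CPU-bound work would still be limited by the GIL.

```python
from concurrent.futures import ThreadPoolExecutor


def normalize(text):
    """Example process: collapse whitespace and lower-case the text."""
    return " ".join(text.split()).lower()


def same_process_on_each(data_items, process, workers=4):
    """Apply one process to every data item; result i belongs to item i."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process, data_items))


if __name__ == "__main__":
    print(same_process_on_each(["  Data 1", "DATA   2", "data 3 "], normalize))
```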


Data Stream


    Data        Data        Data        Data



  Process 1   Process 2   Process 3   Process 4


  Result 1     Result 2    Result 3   Result 4



The same data is put through different processes
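The second pattern can be sketched the same way, again assuming Python's standard `concurrent.futures` module. Several different functions are each handed the same data, and one result comes back per function. `each_process_on_same` is a hypothetical name, and the list of processes is assumed to be non-empty.

```python
from concurrent.futures import ThreadPoolExecutor


def each_process_on_same(data, processes):
    """Submit every process with the same data; result i belongs to process i."""
    with ThreadPoolExecutor(max_workers=len(processes)) as pool:
        futures = [pool.submit(process, data) for process in processes]
        return [future.result() for future in futures]


if __name__ == "__main__":
    # Four different processes applied to one data set.
    print(each_process_on_same([3, 1, 2], [min, max, sum, sorted]))
```

Because every process only reads the shared data here, no locking is needed; that changes as soon as one of them modifies it.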


Avoiding Deadlocks



    Avoiding Race Conditions



    Scaling





NetWorkSpaces



    RapidMind



    QT Threads





NetWorkSpaces

    Written in Python

    Python-like interface

    Written up in the August 2007 Dr. Dobb’s Journal (currently available on literature table)

    Can work with other languages

    Works on multiple processors as well as multi-core

    Handles appropriate breakdown of data





RapidMind

    Uses C++-like syntax to specify work to be done in parallel

    Otherwise similar in functionality to NetWorkSpaces

    Claims to be highly efficient

    Currently in commercial use

    Free for development; requires license for released product


QT Threads

    Originally intended to support GUIs across the internet (multiple systems)

    Covers mechanics of interface with processors

    Does not handle data




    PyQt is a Python implementation





http://www-128.ibm.com/developerworks/power/cell/docs_documentation.html

    Introduction to the Cell Multiprocessor

    Cell Broadband Engine Programming Tutorial

    Cell Broadband Engine Programming Handbook

    Programming high-performance applications on the Cell BE processor

    Maximizing the power of the CBE Processor





Dr. Dobb’s Journal article about depth-first search:
  http://www.ddj.com/dept/64bit/197801624

Software Development Kit
  http://www-128.ibm.com/developerworks/power/cell

Programming the Cell Broadband Engine
  http://www.embedded.com/showArticle.jhtml?articleID=188101999





  • 2. Rachael Madsen, Multiple Solutions Computing, Portland, Oregon. rachael@multi-sol.com, www.multi-sol.com
  • 3. Threads vs. Processes; Multi-processor hardware; Software choices for process management
  • 4. The use of multiple computers or processors to solve a problem or perform a function
  • 5. Threads and Processes Are Not The Same
  • 6. A single process achieves parallelism by creating separate threads for subtasks. A thread shares context with its parent process. On a single processor, parallelism is an illusion created by interleaving the threads' execution.
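As a minimal sketch of slide 6's point that a thread shares context with its parent process, the Python below starts two threads that write into a dictionary owned by the parent. The names `results`, `subtask`, and `run_subtasks` are illustrative assumptions and do not come from the talk.

```python
import threading

# Lives in the parent process; every thread started below sees this same object.
results = {}


def subtask(name, values):
    """Run in a worker thread and write the answer straight into the shared dict."""
    results[name] = sum(values)


def run_subtasks():
    """Split one job into two subtasks, one thread each, and collect the results."""
    results.clear()
    threads = [
        threading.Thread(target=subtask, args=("evens", range(0, 10, 2))),
        threading.Thread(target=subtask, args=("odds", range(1, 10, 2))),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until both subtasks have finished
    return dict(results)


if __name__ == "__main__":
    print(run_subtasks())
```

No data is copied or sent between the threads and the parent here, which is the shared context the slide describes. Separate processes would each need their own copy of `results` and an explicit way to send answers back.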
  • 7. Each process has its own context. Overheads for creation, communication, and context switching are higher than for threads. Processes allow true concurrent computing, even on separate systems.
  • 8. Threads are generally faster. Very dependent on hardware and operating system. Difficult to generate metrics.
  • 9. Intel/AMD vs. Cell
  • 10. Currently available: Intel, 2–4 cores; AMD, 2 cores (4 soon); Cell processors, 9 cores; Graphics Processing Unit (GPU)
  • 11. [Diagram] Four cores, each with its own Architectural State, Execution Engine, and Local APIC. Two Second Level Caches and two Bus Interfaces, one of each per pair of cores, connect to a single System Bus.
  • 12. [Diagram: Cell processor] Eight SPEs in two rows of four, each with an SPU (containing an SXU), an LS, and an MFC. The SPEs, the PPE (PPU, L1, L2, PXU), the MIC, and the BIC are joined by the Element Interconnect Bus (EIB), up to 96 B/cycle. Legend: SPE – Synergistic Processing Element; SPU – Synergistic Processor Unit; SXU – Synergistic Execution Unit; MFC – Memory Flow Control; LS – Local Storage; PPE – PowerPC Processor Element; PPU – PowerPC Processing Unit; PXU – PowerPC Execution Unit; L1, L2 – Local Storage; MIC – Memory Interface Controller; BIC – Broadband Interface Controller.
  • 13. [Diagram: PowerPC Processing Unit] Local Storage 2, PowerPC Execution Unit, Local Storage 1
  • 14. [Diagram: Synergistic Processor Unit] Synergistic Execution Unit, Local Storage, Memory Flow Control
  • 16. Defining and Preparing Threads: performed by the programming environment and compiler. Operating Threads: performed by the OS using processes. Executing Threads: performed by processors.
  • 17. User-Level Threads, Kernel-Level Threads, Hardware Threads. Intel/AMD: sophisticated firmware on chip to handle process execution. Cell: minimal firmware on chip to handle execution.
  • 18. User-Level Threads, Kernel-Level Threads, Hardware Threads. Intel/AMD: multiple process management of threads by the operating system. Cell: process management written by user (total control!).
  • 19. User-Level Threads, Kernel-Level Threads, Hardware Threads. Intel/AMD: use a threading package to manage threads (OpenMP, Pthreads, TBB, etc.). Cell: user manages threads directly or by adapting a threading package.
  • 20. Intel/AMD: completely controlled by OS and chip. Cell: controlled by user. For execution to be fast, the execution block (code and data) must be kept in cache as much as possible.
  • 21. Global Interpreter Lock (GIL); Cache Management; Data Management; Program Flow; Thread Design
  • 22. Within a Python process, the GIL allows only one thread to run the interpreter at any given time. True multi-processing is only available by calling lower-level (C/C++/Fortran/etc.) routines. This is as it should be! The Python interpreter should not be parallelized.
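One way to see slide 22's point in practice: in CPython, threads running pure-Python code take turns holding the GIL, but some lower-level routines release it while they work. The sketch below assumes the standard library's `hashlib`, whose documentation states that the GIL is released while hashing data larger than 2047 bytes. `digest` and `digest_all` are illustrative names, not part of the talk.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(block: bytes) -> str:
    """Hash one block; the heavy lifting happens in C, outside the interpreter loop."""
    return hashlib.sha256(block).hexdigest()


def digest_all(blocks, workers=4):
    """Hash every block on a pool of threads, returning digests in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(digest, blocks))


if __name__ == "__main__":
    # Four 1 MB blocks, large enough for hashlib to release the GIL while hashing.
    blocks = [bytes([i]) * 1_000_000 for i in range(4)]
    for hex_digest in digest_all(blocks):
        print(hex_digest[:16])
```

Whether the threads actually overlap, and by how much, depends on the hardware and operating system, so no timing figures are given here.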
  • 23. Significant Factors: available memory; number of other processes running; how the OS's handling of threads and the hardware's handling of threads interact with each other.
  • 24. Strategies: Design data structures so that data can be sliced into small chunks. Start with a small program and small data structures, then increase them slowly, looking for performance degradation. Optimize code in called processes. Not enough control to do much else!
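The first two strategies on slide 24 can be sketched in a few lines of Python: slice the data into chunks, then time the same work at increasing chunk sizes and look for the point where it gets slower. The function names and the summing workload are assumptions chosen for illustration; a real application would substitute its own per-chunk routine.

```python
import time


def slice_into_chunks(data, chunk_size):
    """Split a sequence into consecutive chunks of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunked_sum(data, chunk_size):
    """Stand-in workload: process each chunk separately, then combine the results."""
    return sum(sum(chunk) for chunk in slice_into_chunks(data, chunk_size))


def sweep(data, chunk_sizes):
    """Time the workload at each chunk size so degradation can be spotted."""
    timings = {}
    for size in chunk_sizes:
        start = time.perf_counter()
        chunked_sum(data, size)
        timings[size] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    data = list(range(1_000_000))
    for size, seconds in sweep(data, [1_000, 10_000, 100_000]).items():
        print(f"chunk size {size:>7}: {seconds:.4f} s")
```

The timings vary from machine to machine and run to run, which is consistent with the talk's remark that metrics are difficult to generate.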
  • 25. Significant Factors: available memory on the PowerPC; whether there are other users on the Cell; progressive computation on one set of data vs. separate computation on separate data.
  • 26. Strategies: Process plus data for SPEs must fit within 256 KB. Optimize code running on SPEs: try different options for your specific application. Divide tasks sent to the PPE into chunks that will fit into SPEs.
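The sizing rule on slide 26 is simple arithmetic, sketched here in Python. Only the 256 K limit comes from the slide. The 64 KiB code size, the 4-byte element size, and the function names are hypothetical values picked for the example.

```python
LOCAL_STORE_BYTES = 256 * 1024  # per-SPE limit quoted on the slide


def max_elements_per_chunk(code_bytes, element_bytes, reserve_bytes=0):
    """How many data elements fit beside the code in one SPE's local store."""
    free = LOCAL_STORE_BYTES - code_bytes - reserve_bytes
    if free <= 0:
        raise ValueError("code and reserve leave no room for data")
    return free // element_bytes


def chunk_count(total_elements, per_chunk):
    """Number of chunks needed to cover all elements (ceiling division)."""
    return -(-total_elements // per_chunk)


if __name__ == "__main__":
    # Hypothetical job: 64 KiB of SPE code, one million 4-byte values.
    per_chunk = max_elements_per_chunk(code_bytes=64 * 1024, element_bytes=4)
    print(per_chunk)                          # 49152
    print(chunk_count(1_000_000, per_chunk))  # 21
```

The `reserve_bytes` parameter is an assumed allowance for stack and buffers; the slide does not say how much to set aside.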
  • 27. [Diagram: Data Stream] Data 1 → Process → Result 1; Data 2 → Process → Result 2; Data 3 → Process → Result 3; Data 4 → Process → Result 4. Different data is put through the same process.
  • 28. [Diagram: Data Stream] Data → Process 1 → Result 1; Data → Process 2 → Result 2; Data → Process 3 → Result 3; Data → Process 4 → Result 4. The same data is put through different processes.
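The two layouts on slides 27 and 28 can be sketched side by side in Python. This is an illustration using the standard library's `concurrent.futures` thread pool, not the talk's own code, and both function names are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def same_process_different_data(process, datasets):
    """Slide 27: run one function over several data sets."""
    with ThreadPoolExecutor() as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(process, datasets))


def same_data_different_processes(processes, data):
    """Slide 28: run several functions over one data set."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(process, data) for process in processes]
        return [future.result() for future in futures]


if __name__ == "__main__":
    print(same_process_different_data(sum, [[1, 2], [3, 4], [5, 6]]))  # [3, 7, 11]
    print(same_data_different_processes([min, max, sum, len], [4, 1, 3]))  # [1, 4, 8, 3]
```

In the first layout the data must be divisible into independent pieces. In the second, every worker needs access to the whole data set, so sharing or copying it becomes the main cost.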
  • 29. Avoiding Deadlocks; Avoiding Race Conditions; Scaling
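As a hedged illustration of the first two items on slide 29, the Python below shows one common remedy for each: a lock around a shared update to avoid a race condition, and a fixed lock-acquisition order to avoid a deadlock. `Counter`, `Account`, and `transfer` are hypothetical names, and these are standard techniques rather than ones the talk prescribes.

```python
import threading


class Counter:
    """A shared counter whose read-modify-write is protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:  # only one thread at a time may update the value
            self.value += 1


def count_in_parallel(n_threads, per_thread):
    """Have several threads bump one counter; the lock keeps the total exact."""
    counter = Counter()

    def work():
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


class Account:
    def __init__(self, ident, balance):
        self.ident = ident
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    """Move funds while always locking the lower ident first.

    Two opposite transfers then request the locks in the same order,
    so neither can hold one lock while waiting forever for the other.
    """
    if src is dst:
        return  # nothing to move, and locking the same lock twice would hang
    first, second = sorted((src, dst), key=lambda account: account.ident)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount


if __name__ == "__main__":
    print(count_in_parallel(8, 1000))  # 8000
```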
  • 30. NetWorkSpaces; RapidMind; Qt Threads
  • 31. Written in Python. Python-like interface. Written up in the August 2007 Dr. Dobb's Journal (currently available on literature table). Can work with other languages. Works on multiple processors as well as multi-core. Handles appropriate breakdown of data.
  • 32. Uses C++-like syntax to specify work to be done in parallel. Otherwise similar in functionality to NetWorkSpaces. Claims to be highly efficient. Currently in commercial use. Free for development; requires license for released product.
  • 33. Originally intended to support GUI interfaces across the internet (multiple systems). Covers mechanics of interface with processors. Does not handle data. PyQt is a Python implementation.
  • 34. http://www-128.ibm.com/developerworks/power/cell/docs_documentation.html: Introduction to the Cell Multiprocessor; Cell Broadband Engine Programming Tutorial; Cell Broadband Engine Programming Handbook; Programming high-performance applications on the Cell BE processor; Maximizing the power of the CBE Processor
  • 35. Dr. Dobb's Journal article about depth-first search: http://www.ddj.com/dept/64bit/197801624. Software Development Kit: http://www-128.ibm.com/developerworks/power/cell. Programming the Cell Broadband Engine: http://www.embedded.com/showArticle.jhtml?articleID=188101999