www.thalesgroup.com




         Heterogeneous Manycore with Self Adaptive Capabilities
                        and the Corresponding Industrial Needs

                                                           RAW 2012
                                   Fabrice Lemonnier, 22nd May, 2012




Research & Technology
2 /                                        Manycore: main issue for industry


          Programmability:




                                                                           The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or
               Time to market
               Development cost
               Reuse of legacy software




                                                                           otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
      Why take so many risks with manycore?

      Most industrial users want to carry on as in
      the past few years: compile without
      thinking (as much as possible)!

      No more free lunch! In the near future,
      all processors will be multicore or manycore.

      Nevertheless, can we provide solutions to ease
      programming?
3 /                                      Programmability: Homogeneous manycores

      Tile-Gx100 from Tilera: 100 cores

      Programmability:
           •Standard C/C++ languages
           •Multicore Development Environment™ (MDE)
           •SMP Linux
           •Bare Metal Environment
           •Standard Debugging Tools (gdb 7)
4 /                                      Programmability: Homogeneous manycores

      Fermi from Nvidia: 512 cores organised in 16 Streaming
      Multiprocessors

      Programmability:
           •Programming languages: C/C++, OpenCL, …
           •Programming model: CUDA parallel multi-threading
5 /                                      Programmability: Homogeneous manycores

      MPPA from Kalray: 256 cores organised in 16 clusters

      Programmability:
           •Specific data flow language: sigmaC
           •Tools to automatically map the application
6 /                                      Homogeneous manycores

      Parallelisation is the only way to raise computing
      power at low power consumption.

      Homogeneity eases programming.

      Maximum performance is reached only for static
      applications.

      Moreover, tools can automatically optimise through
      data parallelism and generate a static allocation and
      schedule.
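
As an illustration of a static, data-parallel allocation, here is a minimal Python sketch. The function names `static_block_partition` and `run_statically` are hypothetical, the cores are only simulated by a sequential loop, and no real mapping tool is assumed to work exactly this way.

```python
def static_block_partition(n_items, n_cores):
    """Return one half-open (start, stop) range per core.

    The ranges depend only on the problem size and the core count,
    so they can be computed once, before the application runs.
    """
    if n_cores <= 0:
        raise ValueError("n_cores must be positive")
    base, extra = divmod(n_items, n_cores)
    ranges = []
    start = 0
    for core in range(n_cores):
        # The first `extra` cores take one additional item each.
        size = base + (1 if core < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def run_statically(data, kernel, n_cores):
    """Apply `kernel` to every item, following the precomputed allocation.

    Assumption: cores are simulated one after the other; on a real
    manycore each range would be executed by a different core.
    """
    out = [None] * len(data)
    for start, stop in static_block_partition(len(data), n_cores):
        for i in range(start, stop):
            out[i] = kernel(data[i])
    return out
```

Because the allocation never changes at run time, it suits the static applications mentioned above; a workload whose size or shape varies during execution would need the ranges to be recomputed.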
7 /                                      But parallelisation is not enough

      •Customisation

      Customisation is necessary to raise the efficiency for a
      targeted application domain

      (Picture: Australian Desert Animal: the Thorny Devil)
8 /                                      MPSoC

      Heterogeneity for the best efficiency (computing power /
      power consumption ratio) but for a dedicated domain

      (Picture: OMAP: Communication market)
9 /                                      Heterogeneous manycores

      Heterogeneous manycore P2012 from ST

      (Diagram: fabric of 9 clusters with a Fabric Controller core)
10 /                                     Heterogeneous manycores

      Dedicated to a specific application domain

      Only affordable for large product series

      Industries with small and medium product series have
      no way to develop their own heterogeneous manycore

      An alternative is to combine multicore
      and …
11 /                                     FPGA + multicore

      ZYNQ: Xilinx FPGA with a dual-core ARM Cortex-A9 MPCore
12 /                                     …or the inverse: GPP + FPGA

      Intel® Atom™ Processor E6x5C Series

      GPP + dedicated accelerators on FPGA on a Multi-Chip
      Package (MCP)
13 /                                     Our proposition

      A combination of a heterogeneous manycore solution
      such as P2012 and an FPGA + multicore approach such as
      ZYNQ

      (Diagram: fabric of 9 clusters with a Fabric Controller core)
14 /                                     Our proposition

      A 3D stacked chip based on:
           •A manycore layer
           •An FPGA layer
15 /                                     Most important Advantages

       Increased accessibility of heterogeneous manycore
       technology by allowing customisation by the user

       Reduced impact of the NRC (non-recurring costs)

       Allows implementation of the self-adaptive capabilities
       required by future interactive applications and by the
       constraints of current and future technologies
16 /                                     Future applications issues

      Embedded Real-Time Applications
      low power consumption
      low volume

      Adapt to environment -> dynamicity, flexibility & dependability

      Cognitive radio
      Smart camera
      UAV
17 /                                   Self adaptive capabilities, why?

       •To be able to dynamically adapt the architecture to the
       current request of the application for the same power
       consumption

       •Evolution of the technology: lower reliability
       and yield in current and future sub-micron
       technologies -> adaptation depending on the faulty cores.

       •Increase energy efficiency

       •Increase programming efficiency by handling part of
       the mapping complexity at runtime

       •Temperature management -> adaptation of the
       application mapping
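
A minimal Python sketch of two of these ideas, adaptation to faulty cores and temperature-driven remapping, under stated assumptions: the `Core` record, the `remap` function, the 85 °C threshold and the round-robin policy are all hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Core:
    core_id: int
    faulty: bool = False          # assumed to be reported by a self-test
    temperature_c: float = 40.0   # assumed to be read from a sensor


def remap(tasks, cores, max_temp_c=85.0):
    """Map tasks onto cores that are neither faulty nor too hot.

    Returns a dict {core_id: [tasks]}. Tasks are dealt round-robin,
    starting with the coolest usable core.
    """
    usable = [c for c in cores
              if not c.faulty and c.temperature_c <= max_temp_c]
    if not usable:
        raise RuntimeError("no usable core left")
    # Coolest first, so the coolest cores receive any extra task.
    usable.sort(key=lambda c: (c.temperature_c, c.core_id))
    mapping = {c.core_id: [] for c in usable}
    for i, task in enumerate(tasks):
        mapping[usable[i % len(usable)].core_id].append(task)
    return mapping
```

Calling `remap` again whenever a core is marked faulty or crosses the threshold moves the mapping decision from design time to run time, which is the kind of adaptation the bullets describe.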
18 /                                     State of the art

      Projects:
           •Morpheus (FP6 project): heterogeneous chip with 3 FPGA
            technologies managed by an ARM processor.
           •FOSFOR (ANR project): distributed OS for heterogeneous
            multicore on FPGA
           •Main drawbacks:
                •the scalability of the solution
                •the limitation of the size of the FPGA area
19 /




20 /                                     Holistic Approach

      (Diagram elements)
           •Model of Computation
           •Optimisation tools
           •Model of Execution
           •Programming model
           •Strategies of relocation
           •Common Interfaces
           •Flexible Hardware
21 /                                     Holistic Approach

      (Diagram elements)
           •Model of Computation
           •Optimisation tools
           •Model of Execution
           •Programming model
           •Strategies of relocation
           •Common Interfaces
           •Flexible Hardware
22 /                                     Programming efficiency: common execution model

      Master-slave execution model
           •Master Nodes: GPP
           •Slave Nodes: DSP nodes, eFPGA nodes

      (Diagram labels: GPP Node, NI, NoC, NI, accelerator node,
      Accelerator Interface (AI), DMA, DMA requests, acc requests,
      control / status, data)
23 /                                        Master-slave execution model

      Ensure that hardware and software are independent of the
      accelerator specificities

      [Figure: master-slave execution model]
      •GPP (master)
      •Requests FIFOs: data_transfer, synchro
      •DMA: wait_sync, send_data, receive_data, send_sync2acc, send_sync2gpp
      •Accelerator (slave): work, wait_sync, send_sync2dmu
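
The request names on this slide (send_data, send_sync2acc, wait_sync, work, send_sync2dmu, receive_data, send_sync2gpp) suggest a handshake between the GPP, the DMA and the accelerator. The sketch below is one possible reading of that handshake, not the actual implementation: the ordering of the requests, the use of Python queues as FIFOs and synchronisation channels, and the names run_offload, acc_memory and gpp_memory are all assumptions made for illustration.

```python
import queue
import threading


def run_offload(input_data, kernel):
    """Apply kernel to every item on a simulated slave accelerator."""
    # One-way synchronisation channels (hypothetical names).
    sync_to_acc = queue.Queue()
    sync_to_dmu = queue.Queue()
    sync_to_gpp = queue.Queue()
    acc_memory = {}  # stands in for the accelerator local memory
    gpp_memory = {}  # stands in for the master memory

    def dma(requests):
        # The DMA executes the requests of its FIFO in order.
        for req in requests:
            if req == "send_data":
                acc_memory["in"] = list(input_data)
            elif req == "send_sync2acc":
                sync_to_acc.put("data ready")
            elif req == "wait_sync":
                sync_to_dmu.get()  # blocks until the accelerator is done
            elif req == "receive_data":
                gpp_memory["out"] = list(acc_memory["out"])
            elif req == "send_sync2gpp":
                sync_to_gpp.put("result ready")

    def accelerator(requests):
        # The slave only waits, works and signals: it never
        # addresses the master memory directly.
        for req in requests:
            if req == "wait_sync":
                sync_to_acc.get()  # blocks until the input has arrived
            elif req == "work":
                acc_memory["out"] = [kernel(x) for x in acc_memory["in"]]
            elif req == "send_sync2dmu":
                sync_to_dmu.put("work done")

    # The master (GPP) fills both request FIFOs, then waits for the result.
    dma_fifo = ["send_data", "send_sync2acc", "wait_sync",
                "receive_data", "send_sync2gpp"]
    acc_fifo = ["wait_sync", "work", "send_sync2dmu"]
    workers = [threading.Thread(target=dma, args=(dma_fifo,)),
               threading.Thread(target=accelerator, args=(acc_fifo,))]
    for w in workers:
        w.start()
    sync_to_gpp.get()
    for w in workers:
        w.join()
    return gpp_memory["out"]


if __name__ == "__main__":
    print(run_offload([1, 2, 3], lambda x: 2 * x))
```

In this reading the master code stays the same whatever kernel the slave runs, which is the independence the slide asks for.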
24 /                                        Holistic Approach

      •Model of Computation
      •Optimisation tools
      •Model of Execution
      •Programming model
      •Common Interfaces
      •Strategies of relocation
      •Flexible Hardware
25 /                                                     Tool flow and MoC

  •Optimisation and parallelisation tools can only
  be used on static applications.

  •It is necessary to identify static clusters, based on
  the SDF/CSDF MoC, inside the applications.

  Actor: consumes and produces tokens of data according to
  predefined and static rules.

  [Figure: SDF, CSDF MoC. Actors (Act) are grouped into static clusters,
  and clusters into a cluster group.]
  Legend: Act = Actor; the other symbols mark a static cluster, a clusters
  group managed by one state management, cluster group input/output and
  cluster input/output.
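
Because SDF actors consume and produce fixed numbers of tokens, a tool can work out before run time how often each actor must fire. The sketch below illustrates this with the standard SDF balance equations (for every edge, firings of the producer times its production rate equal firings of the consumer times its consumption rate). It is not taken from SpearDE or Cosy; the function name repetition_vector and the three-actor graph are made up for illustration.

```python
from fractions import Fraction
from math import lcm


def repetition_vector(actors, edges):
    """Smallest firing counts that balance every edge of an SDF graph.

    edges is a list of (producer, tokens_produced, consumer, tokens_consumed).
    """
    rate = {actors[0]: Fraction(1)}  # relative firing rate of each actor
    changed = True
    while changed:
        changed = False
        for src, produced, dst, consumed in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * produced / consumed
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * consumed / produced
                changed = True
            elif src in rate and dst in rate:
                if rate[src] * produced != rate[dst] * consumed:
                    raise ValueError("inconsistent rates: no static schedule")
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")
    # Scale the fractions to the smallest integer solution.
    scale = lcm(*(r.denominator for r in rate.values()))
    return {a: int(rate[a] * scale) for a in actors}


if __name__ == "__main__":
    # A produces 2 tokens per firing, B consumes 3; B produces 1, C consumes 2.
    graph = [("A", 2, "B", 3), ("B", 1, "C", 2)]
    print(repetition_vector(["A", "B", "C"], graph))
```

With these rates A fires 3 times, B twice and C once per iteration, so every FIFO returns to its initial fill level. A graph whose rates cannot be balanced is rejected, which is one way to see why the tools need static clusters.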
26 /                                                     Tool flow and MoC

 The tool flow is based on 2 main tools:
 •Thales tool: SpearDE
 •ACE tool: Cosy

 [Figure: tool flow]
 Application (C code)
   -> Graphic input (manual) or C to SpearDE representation conversion (Cosy)
   -> Data parallelisation, mapping (SpearDE)
   -> Streaming optimisation (Cosy)
   -> Compilation (Cosy)
   -> executable code for slave cores and master cores
 Side inputs: architecture representation, library of IPs
27 /                                        Tools: partitioning, parallelisation and mapping

      [Figure: three steps applied to a static cluster]
      •Partitioning: cluster1 (actors A1 to A5) is split into partition1 (A1, A2),
       partition2 (A3, A4) and partition3 (A5)
      •Parallelisation: in cluster1p1, A1 becomes A1.1 to A1.4 and A2 becomes A2.1 to A2.4
      •Mapping: the target labels shown are DSP/FPGA, DSP/FPGA and DSP/GPP
      Legend: Ax = Actor number x; the other symbols mark a partition, a static
      cluster, cluster input/output and partition input/output.
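
The two steps that follow partitioning can be sketched in a few lines: data parallelisation replicates an actor (A1 becomes A1.1 to A1.4), and mapping assigns each partition to one of its candidate targets. This is only an illustration, not the SpearDE algorithm: the greedy first-fit policy, the function names, the node counts and the candidate list attached to each partition are assumptions.

```python
def parallelise(actor, factor):
    """Replicate an actor into numbered instances, e.g. A1 -> A1.1 .. A1.4."""
    return [f"{actor}.{i}" for i in range(1, factor + 1)]


def map_partitions(candidates, free_nodes):
    """Greedy first-fit mapping of partitions onto node kinds.

    candidates: partition name -> node kinds in order of preference
    free_nodes: node kind -> number of nodes still available
    """
    remaining = dict(free_nodes)
    mapping = {}
    for partition, kinds in candidates.items():
        for kind in kinds:
            if remaining.get(kind, 0) > 0:
                remaining[kind] -= 1
                mapping[partition] = kind
                break
        else:
            raise ValueError(f"no free node for {partition}")
    return mapping


if __name__ == "__main__":
    partitions = {
        "partition1": parallelise("A1", 4) + parallelise("A2", 4),
        "partition2": ["A3", "A4"],
        "partition3": ["A5"],
    }
    # Hypothetical candidate targets for each partition.
    candidates = {
        "partition1": ["FPGA", "DSP"],
        "partition2": ["DSP", "FPGA"],
        "partition3": ["GPP", "DSP"],
    }
    print(partitions["partition1"])
    print(map_partitions(candidates, {"FPGA": 1, "DSP": 1, "GPP": 1}))
```

A real tool would weigh the cost of each candidate target rather than take the first free one; first-fit is used here only to keep the sketch short.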
28 /                                        Holistic Approach

      •Model of Computation
      •Optimisation tools
      •Model of Execution
      •Programming model
      •Common Interfaces
      •Strategies of relocation
      •Flexible Hardware
29 /                                        Modularity and scalability: common interfaces

      [Figure: nodes connected to a NoC through generic interfaces]
      •Heterogeneous accelerator nodes: a DSP node and two dedicated accelerator
       nodes (eFPGA domain, reconfigurable HW acc.), each behind an AI and an NI
      •Homogeneous GPP nodes: three GPP nodes, each behind an NI
      •Other blocks on the NoC: DDR Ctrl., I/O, Config. Ctrl.
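
On the software side, a common interface means the master can drive any accelerator node with the same calls. The sketch below shows the idea with an abstract class; it is not the actual AI definition. The method names request, status and result are assumptions loosely based on the "requests" and "status" labels of the Accelerator Interface figure, and the two node classes are invented examples.

```python
from abc import ABC, abstractmethod


class AcceleratorInterface(ABC):
    """Hypothetical software view of the AI: same calls for every node."""

    @abstractmethod
    def request(self, data):
        """Submit input data to the node."""

    @abstractmethod
    def status(self):
        """Return 'idle' or 'done'."""

    @abstractmethod
    def result(self):
        """Return the output of the last request."""


class DspNode(AcceleratorInterface):
    """Invented example: a programmable node that scales its input by 2."""

    def __init__(self):
        self._out = None

    def request(self, data):
        self._out = [2 * x for x in data]

    def status(self):
        return "idle" if self._out is None else "done"

    def result(self):
        return self._out


class DedicatedAccelerator(AcceleratorInterface):
    """Invented example: a fixed-function node that accumulates its input."""

    def __init__(self):
        self._out = None

    def request(self, data):
        total, out = 0, []
        for x in data:
            total += x
            out.append(total)
        self._out = out

    def status(self):
        return "idle" if self._out is None else "done"

    def result(self):
        return self._out


def offload(node, data):
    """Master-side code: identical for every node type."""
    node.request(data)
    if node.status() != "done":
        raise RuntimeError("node did not complete")
    return node.result()


if __name__ == "__main__":
    for node in (DspNode(), DedicatedAccelerator()):
        print(type(node).__name__, offload(node, [1, 2, 3]))
```

Adding a new accelerator type then only requires a new class behind the interface; offload and the rest of the master code are untouched.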
30 /                                        Holistic Approach

      •Model of Computation
      •Optimisation tools
      •Model of Execution
      •Programming model
      •Common Interfaces
      •Strategies of relocation
      •Flexible Hardware
31 /                                        Dynamicity: the cluster group

      [Figure: a cluster group. An event enters the states management block;
      the group contains state 1, state 2 and state 3, each drawn as a static
      cluster of actors.]
      Legend: Act = Actor; the other symbols mark a static cluster, a clusters
      group managed by one state management, cluster group input/output and
      cluster input/output.
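
A cluster group can be read as a small state machine: each state holds one static cluster, and an event makes the states management switch to another state. The sketch below illustrates that reading only. The class ClusterGroup, the event name "start" and the two toy actors are assumptions, and a static cluster is reduced to a chain of Python functions.

```python
class ClusterGroup:
    """A group of static clusters, one per state, selected by events."""

    def __init__(self, states, transitions, initial):
        self.states = states            # state name -> list of actors
        self.transitions = transitions  # (state, event) -> next state
        self.state = initial

    def on_event(self, event):
        """States management: switch state if the event is known."""
        self.state = self.transitions.get((self.state, event), self.state)
        return self.state

    def run(self, token):
        """Fire the actors of the active static cluster in sequence."""
        for actor in self.states[self.state]:
            token = actor(token)
        return token


def double(x):
    return 2 * x


def increment(x):
    return x + 1


if __name__ == "__main__":
    group = ClusterGroup(
        states={"state 1": [], "state 2": [double, increment]},  # state 1 is a nop
        transitions={("state 1", "start"): "state 2"},
        initial="state 1",
    )
    print(group.run(3))        # state 1: the token passes through unchanged
    group.on_event("start")
    print(group.run(3))        # state 2: double, then increment
```

Inside a state nothing changes at run time, so each static cluster can still be optimised offline; only the switch between states is dynamic.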
32 /                                                                         Dynamicity at cluster group level

      [Figure: five cluster groups, each with its own event input and states management]
      •cluster group 1 (input: sensor data): state 1, nop
      •cluster group 2 (input: sensor data): state 2, a static cluster of actors
      •cluster group 3: state 1, a static cluster of actors
      •cluster group 4: state 1, a static cluster of actors
      •cluster group 5: state 1, plus state 1.1 and state 1.2 placed between
       scatter and gather blocks
      Legend: Act = Actor; the other symbols mark a static cluster, a clusters
      group managed by one state management, cluster group input/output and
      cluster input/output.
33 /                                                                       Start a new part of the application

      [Figure: same five cluster groups; cluster group 1 has left nop]
      •cluster group 1 (input: sensor data): state 2, a static cluster of actors
      •cluster group 2 (input: sensor data): state 2, a static cluster of actors
      •cluster group 3: state 1, a static cluster of actors
      •cluster group 4: state 1, a static cluster of actors
      •cluster group 5: state 1, plus state 1.1 and state 1.2 placed between
       scatter and gather blocks
      Legend: Act = Actor; the other symbols mark a static cluster, a clusters
      group managed by one state management, cluster group input/output and
      cluster input/output.
34 /                                                                                  Modification of the behaviour
           event                  cluster group 1                                                              event                 cluster group 4
                   states management                         event                           cluster group 3           states management
                                                                     states management




                                                                                                 state 1
                                                                         Act         Act           Act
 sensor                                                                                                                                    state 2
  data                                           state 2
                                                                               Act                                             Act         Act
                                                                                           Act




                        Act          Act          Act

                                                                                                                                     Act
                               Act                         event                             cluster group 5
                                                                   states management

                                                                                                 state 1
          event                   cluster group 2
                   states management                                     Act         Act           Act


                                                               s               Act         Act
                                                               c                                                g
  sensor                                                       a                                 state 1.1      a
   data                                                                                                                  Act    : Actor
                                                 state 2       t         Act         Act                        t
                                                                                                   Act
                                                               t                                                h
                       Act           Act            Act
                                                               e                                                e               : static cluster
                                                                               Act         Act
                                                               r                                                r
                              Act          Act                                                                                 : Clusters group managed
                                                                                                 state 1.2                     by one state management

                                                                         Act         Act           Act                     : Cluster group input/output

                                                                                                                          : Cluster input/output
                                                                               Act         Act
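
      One possible reading of this slide is that an event makes the states
      management of a cluster group switch which state (a cluster of actors)
      processes the data. The sketch below illustrates that reading only. The
      class ClusterGroup, the event name "switch" and the arithmetic actors are
      hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumption: an actor is modelled as a function applied to one data sample.
Actor = Callable[[float], float]


@dataclass
class ClusterGroup:
    """Hypothetical cluster group: several states, exactly one active at a time."""
    states: Dict[str, List[Actor]]           # state name -> chain of actors
    transitions: Dict[Tuple[str, str], str]  # (current state, event) -> next state
    current: str

    def on_event(self, event: str) -> None:
        # States management: an unknown event leaves the active state unchanged.
        self.current = self.transitions.get((self.current, event), self.current)

    def process(self, sample: float) -> float:
        # Only the actors of the active state see the data.
        for actor in self.states[self.current]:
            sample = actor(sample)
        return sample


group = ClusterGroup(
    states={
        "state 1": [lambda x: x + 1.0, lambda x: x * 2.0],
        "state 2": [lambda x: x - 1.0],
    },
    transitions={
        ("state 1", "switch"): "state 2",
        ("state 2", "switch"): "state 1",
    },
    current="state 1",
)

out_before = group.process(3.0)  # handled by the actors of state 1
group.on_event("switch")         # the event changes the behaviour
out_after = group.process(3.0)   # same input, now handled by state 2
```

      The same input gives a different result after the event, which is the
      sense of "modification of the behaviour" assumed here.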
35 /                                        Modification of the parallelisation level

      [Figure: same layout as the previous slide. Sensor data enters from the
      left and passes through cluster group 1 to cluster group 5. Each cluster
      group has an event input and its own states management, and shows states
      (state 1, state 2) built from actors. The lower part of the figure shows
      a scatter block and a gather block.]

      Legend:
          Act: Actor
          static cluster
          Clusters group managed by one state management
          Cluster group input/output
          Cluster input/output
36 /                                        Dynamicity at cluster level

      [Figure: three successive views of cluster1p1 along a time axis. Each view
      contains the actors A1.1 to A1.4, A2.1 to A2.4, A3, A4 and A5, and carries
      the label "relocation".
      Resource labels in the first view:  FPGA, FPGA, GPP
      Resource labels in the second view: DSP, DSP, GPP
      Resource labels in the third view:  DSP, DSP, DSP]
37 /                                        Dynamic relocation

      [Figure: two levels, labelled "compile time" and "runtime".
      Compile time: the Application passes through Tools for parallelisation
      and mapping. The result is shown as threads (thread1 to thread4) and
      resources (I/O, Acc1, Acc3, Acc1, Acc4) behind an API.
      Runtime: threads are placed on GPPs connected by a NoC, next to I/O, Acc1,
      Acc1, Acc3, Acc4 and a DDR ctrl, behind an API.
      The label "Dynamic relocation" sits between the two levels.]
38 /                                        Holistic Approach

      [Figure: overview of the approach, with the following blocks]
          Model of Computation
          Optimisation tools
          Model of Execution
          Programming model
          Common Interfaces
          Relocation strategies
          Flexible Hardware
39 /                                        A Virtualisation Layer for self adaptive capabilities

      Virtualisation services provide a high level of abstraction of the
      heterogeneous resources: communication and accelerator management.

      Self adaptive services define the actions to be taken in response to
      monitored events: relocation, DVFS, …

      [Figure: the Application and an Allocation file sit on top of the
      Virtualisation Layer. The layer contains the Self adaptive services, drawn
      as a loop MONITORING, DIAGNOSIS (O = F(L)), ACTION around the SYSTEM, and
      the Virtualisation services.
      Kernel services shown below the layer: Monitoring, Actuators, Task mngt,
      Memory mngt, Network services, Communication management, Scheduler,
      Cluster mngt, Semaphore event mngt.]
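
      The MONITORING, DIAGNOSIS, ACTION loop of this slide can be sketched as
      follows. This is a minimal illustration, not the layer's actual interface.
      It assumes that L in O = F(L) stands for a set of monitored measurements
      and O for the selected action, which the slide does not state. The metric
      names, thresholds and action names are hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Measurements = Dict[str, float]


def diagnose(measurements: Measurements) -> str:
    """DIAGNOSIS, read here as O = F(L): measurements in, action name out.

    Metric names and thresholds are illustrative assumptions.
    """
    if measurements.get("temperature_c", 0.0) > 85.0:
        return "dvfs_down"
    if measurements.get("load", 0.0) > 0.9:
        return "relocate"
    return "none"


def run_loop(samples: Iterable[Measurements],
             actions: Dict[str, Callable[[Measurements], None]]) -> List[str]:
    """Run one monitoring/diagnosis/action cycle per measurement set."""
    taken: List[str] = []
    for measurements in samples:         # MONITORING: one observation of the system
        decision = diagnose(measurements)  # DIAGNOSIS
        actions[decision](measurements)    # ACTION: applied back to the system
        taken.append(decision)
    return taken


log: List[str] = []
actions = {
    "dvfs_down": lambda m: log.append("lower frequency"),  # stands in for DVFS
    "relocate": lambda m: log.append("relocate task"),     # stands in for relocation
    "none": lambda m: None,
}
samples = [
    {"temperature_c": 60.0, "load": 0.5},
    {"temperature_c": 90.0, "load": 0.5},
    {"temperature_c": 60.0, "load": 0.95},
]
taken = run_loop(samples, actions)
```

      Keeping the diagnosis as a pure function of the measurements keeps the
      decision policy separate from the actuators, so the policy can be replaced
      without touching them.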
40 /                                        Self-adaptation

      [Figure: platform nodes attached to a NoC through network interfaces (NI).
      Node labels: DSP Node, Dedicated Accelerator Node (twice), GPP Node (three
      times), DDR Ctrl., I/O, Config. Ctrl. Three of the nodes also carry an AI
      block. One region is labelled eFPGA Domain (Reconfigurable HW acc.).
      An Accelerator/Virtual Code Mapping is attached to the loop MONITORING,
      DIAGNOSIS (O = F(L)), ACTION around the SYSTEM, with the label
      "Dynamic allocation / binding".]
41 /                                        Holistic Approach

      [Figure: overview of the approach, with the following blocks]
          Model of Computation
          Optimisation tools
          Model of Execution
          Programming model
          Common Interfaces
          Relocation strategies
          Flexible Hardware
42 /                                        New dynamic reconfigurable technology

      FlexTiles: a 3D stack chip

      [Figure: a homogeneous manycore of nine tiles connected by a NoC, with a
      3D stacked reconfigurable layer.]
43 /                                        New dynamic reconfigurable technology

      FlexTiles: a 3D stack chip

      [Figure: a homogeneous manycore of nine tiles connected by a NoC, with a
      3D stacked reconfigurable layer. Step shown: Map Accelerated functions.]
44 /                                        New dynamic reconfigurable technology

      FlexTiles: a 3D stack chip

      [Figure: a homogeneous manycore of nine tiles connected by a NoC, with a
      3D stacked reconfigurable layer. Step shown: Duplicate.]
45 /                                        New dynamic reconfigurable technology

      FlexTiles: a 3D stack chip

      [Figure: a homogeneous manycore of nine tiles connected by a NoC, with a
      3D stacked reconfigurable layer. Step shown: Migrate.]
46 /   Holistic Approach

       [Figure: diagram with the following labels]
        •   Model of Computation
        •   Optimisation tools
        •   Model of Execution
        •   Programming model
        •   Common Interfaces
        •   Relocation strategies
        •   Flexible Hardware
47 /   NoC QoS

       [Figure: chip block diagram]
        •   GPP node: icache, dcache, dLMEM GPP, NI
        •   DSP node: iLMEM DSP, dLMEM DSP, NI
        •   eFPGA node: iLMEM eFPGA, dLMEM eFPGA, NI
        •   shMEM on chip
        •   DDR ctrl + NI, connected to the DDR outside the chip
        •   Five NoCs: data, control, bitstream, test/debug, instruction
48 /   ANoC (CEA)

        •   Low latency
        •   Highly scalable
        •   Packet switching
        •   Wormhole protocol
        •   Power efficient and dependable
        •   Between nodes: no global clock, not even a local clock
        •   GALS: asynchronous logic in nodes, locally synchronous cores
49 /   Aethereal NoC (TUe)

        •   Globally synchronous with time slots
        •   Contention-free routing by construction
        •   Wormhole routing specified at design time
        •   Guaranteed levels of service and performance
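A minimal sketch of how time slots can make routing contention free by construction. It assumes a slot-table scheme in which every link has the same number of slots and a connection that enters its first link in slot s uses slot s+1 (modulo the table size) on the next link. The `reserve` function, the router names and the table size are illustrative assumptions, not details taken from the slides.

```python
from typing import Dict, List, Set, Tuple

Link = Tuple[str, str]  # directed link between two routers (hypothetical names)


def reserve(tables: Dict[Link, Set[int]], path: List[Link],
            start_slot: int, num_slots: int) -> bool:
    """Reserve one slot per hop; the slot index advances by one at each hop."""
    claims = [(link, (start_slot + hop) % num_slots)
              for hop, link in enumerate(path)]
    # Refuse the connection if any (link, slot) pair is already taken.
    if any(slot in tables.get(link, set()) for link, slot in claims):
        return False
    for link, slot in claims:
        tables.setdefault(link, set()).add(slot)
    return True


tables: Dict[Link, Set[int]] = {}
a_to_c = [("A", "B"), ("B", "C")]
d_to_c = [("D", "B"), ("B", "C")]

first = reserve(tables, a_to_c, start_slot=0, num_slots=4)  # B->C in slot 1
clash = reserve(tables, d_to_c, start_slot=0, num_slots=4)  # also needs B->C in slot 1
retry = reserve(tables, d_to_c, start_slot=1, num_slots=4)  # B->C in slot 2
```

Because every conflict is rejected when the tables are filled in, two packets can never compete for the same link in the same slot at run time.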
50 /                                                             Conclusion (1)


       Parallelisation is the only way to reach high-performance computing
       (HPC) at low power consumption.

       But industry is reluctant to take the plunge.

       Moreover, parallelisation is not enough; customisation is
       also necessary
        •   Only affordable for high volumes and very difficult to program

       Reconfigurable customisation is the solution:
        •   Increase accessibility to heterogeneous manycore technology
        •   Allow implementation of self adaptive capabilities
51 /                                                             Conclusion (2)


       Self adaptive capabilities provide:

          • Dynamic customisation of the manycore architecture to the
            current request of the application

          • Reduction of the programming complexity by handling part of the
            mapping complexity at runtime

          • Fault tolerance: adaptation depending on the faulty cores

          • Energy efficiency

          • Temperature management -> adaptation of the application
            mapping
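The last bullets (adaptation to faulty cores and to temperature) can be illustrated with a small runtime remapping sketch. Everything here is an assumption made for illustration: the `remap` function, the tile and task names, the temperature values and the 85-degree threshold do not come from the slides, and a real platform would use its own monitoring and allocation services.

```python
from typing import Dict, Set


def remap(mapping: Dict[str, str], faulty: Set[str],
          temperature: Dict[str, float], max_temp: float) -> Dict[str, str]:
    """Move tasks off faulty or overheated tiles onto free, healthy tiles."""
    unusable = set(faulty) | {t for t, c in temperature.items() if c > max_temp}
    busy = set(mapping.values())
    # Healthy tiles that currently host no task, in a deterministic order.
    spare = sorted(t for t in temperature if t not in unusable and t not in busy)
    new_mapping: Dict[str, str] = {}
    for task, tile in sorted(mapping.items()):
        if tile in unusable:
            if not spare:
                raise RuntimeError(f"no healthy tile left for task {task!r}")
            new_mapping[task] = spare.pop(0)
        else:
            new_mapping[task] = tile
    return new_mapping


temperature = {"t0": 60.0, "t1": 95.0, "t2": 55.0, "t3": 50.0, "t4": 52.0}
mapping = {"fft": "t0", "filter": "t1", "decode": "t2"}

# t2 is reported faulty and t1 exceeds the assumed thermal limit.
new_mapping = remap(mapping, faulty={"t2"}, temperature=temperature, max_temp=85.0)
```

The application code is unchanged by the remapping, which is the sense in which part of the mapping complexity moves from the programmer to the runtime.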
52 /   Holistic Approach

       [Figure: diagram with the following labels]
        •   Model of Computation
        •   Optimisation tools
        •   Model of Execution
        •   Programming model
        •   Common Interfaces
        •   Relocation strategies
        •   Flexible Hardware
53 /   Our proposition: a 3D stacked chip and …

       A 3D stacked chip based on:
        •   A manycore layer
        •   An FPGA layer
54 /                                               …a complete platform


                                     Application




                        Parallelisation, partitioning
       toolchain      Compilation       Synthesis, P&R




                   relocatable binary code        relocatable bitstream

                                   Operating Library API
                              Virtualisation                     ACTION
       operating                  layer
        library    Kernel      Resource
                              Monitoring &         MONITORING             DIAGNOSIS
                                                                            O = F(L)
                               Allocation                            SYSTEM
                                Hardware Abstraction Layer API
  heterogeneous                 Hardware Abstraction Layer
    manycore
                                   Hardware Nodes
55 /   FlexTiles: FP7 project

       FlexTiles
       www.flextiles.eu

        •   Project coordinator: THALES
        •   Funding budget: 3,670,000€
        •   Starting date: 15/10/2011
        •   Duration: 36 months
56 /   Consortium and questions

       8 partners in 5 countries

       Partners & Third Party   Country          Main scientific and technical contributions
       THALES                   France           Infrastructure and applications
       KIT                      Germany          Virtualisation layer
       TUE                      Netherlands      Kernel; NoC
       CSEM                     Switzerland      DSP
       CEA                      France           NoC; 3D stacking
       UR1                      France           Reconfigurable technology
       SUNDANCE                 United Kingdom   FPGA Demonstrator
       ACE                      Netherlands      Parallelisation and compilation tools
57 /   With FlexTiles, Industry will be able to…

       Take the plunge into manycore
58 /   Thank you for your attention

       Questions?

RAW 2012

  • 1.
    www.thalesgroup.com Heterogeneous Manycore with Self Adaptive Capabilities and the Corresponding Industrial Needs RAW 2012 Fabrice Lemonnier, 22nd May, 2012 Research & Technology
  • 2.
    2 / Manycore: main issue for industry  Programmability: The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or  Time to market  Development cost  Reuse of legacy software otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8 Why take so many risks with manycore ? Most of industrials want to continue like the past few years: compile without thinking (as much as possible) ! No more Free lunch ! In the near future the processors will all be made of multi-cores and many- cores. Nevertheless, can we provide solutions to ease the programmation ?
  • 3.
    3 / Tile-Gx100 from Tilera: 100 cores •SMP •Bare •Standard •Standard •Multicore Linux Programmability: environmentTM (MDE) Development C/C++ languages Metal Environment Debugging Tools (gdb 7) The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or Programmability: Homogeneous manycores otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 4.
    4 / Multiprocessor CUDA parallel multi-threading Programmability: C/C++, openCL, … Fermi from Nvidia 512 cores organised in 16 Streaming programming model: Programming languages: The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or Programmability: Homogeneous manycores otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 5.
    5 / •Tools sigmaC •specific the application Programmability: MPPA from Kalray: 256 cores organised in 16 clusters to automatically map data flow language: The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or Programmability: Homogeneous manycores otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 6.
    6 / Homogeneous manycores Parallelisation is the only way The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or to raise computing power for a low power consumption. Homogeneity eases the otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8 programming aspects Maximum of performance is reached only for static application. Moreover, tools can be used to make automatic optimisation through data parallelism and generate static allocation and scheduling.
  • 7.
    7 / • targeted application domain Customisation Australian Desert Animal: the Thorny Devil Customization is necessary to raise the efficiency for a The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or But parallelisation is not enough otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 8.
    8 / OMAP: Communication market power consumption ratio) but for a dedicated domain Heterogeneity for the best efficiency (computing power – The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or MPSoC otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 9.
    9 / Fabric Cluster Cluster Cluster Cluster Cluster Cluster Cluster Cluster Cluster core Fabric Controller Heterogeneous manycore P2012 from ST The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or Heterogeneous manycores otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 10.
    10 / and … Only affordable for large series of products. Dedicated to a specific domain of application no way to develop their own heterogeneous manycore Industry with small and medium series of products have An alternative is to use a combination between multicore The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or Heterogeneous manycores otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 11.
    11 / ZYNQ: Xilinx FPGA with a dual core ARM A9 MPCore The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or FPGA + multicore otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 12.
    12 / Package (MCP) Intel® Atom™ Processor E6x5C Series GPP + dedicated accelerators on FPGA on a Multi-Chip The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or …or the inverse: GPP + FPGA otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 13.
    13 / Fabric Cluster Cluster Cluster ZYNQ Cluster Cluster Cluster Cluster Cluster Cluster core Fabric Controller A combination between the heterogeneous manycore solution like P2012 and the FPGA+multicore approach like The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or Our proposition otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 14.
    14 / • • A FPGA layer A manycore layer A 3D stacked chip based on: The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or Our proposition otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 15.
    15 / Most important Advantages Increase accessibility to heterogeneous manycores The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or technology by allowing a customisation by the user Reduction of the impact of the NRC otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8 Allow implementation of self adaptive capabilities necessary for the future interactive applications and the constraints of the current and future technologies
  • 16.
    16 / low volume Cognitive radio low power consumption Embedded Real-Time Applications Smart camera UAV Adapt to environment  dynamicity, flexibility & dependability The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or Future applications issues otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 17.
    17 / Self adaptive capabilities, why? •Tobe able to dynamically adapt the architecture to the The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or current request of the application for the same power consumption •Evolutionof the technology: reduction of the reliability otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8 and the yield of current and future sub-micron technologies -> adaptation depending on the faulty cores. •Increase energy efficiency •Increasethe programming efficiency by taking a part of the mapping complexity at runtime •Temperature management -> adaptation of the application mapping
  • 18.
    18 / • • •Main •FOSFOR Projects: •Morpheus drawbacks: multicore on FPGA the scalability of the solution the limitation of the size of the FPGA area technologies managed by an ARM processor. (ANR project): distributed OS for heterogeneous (FP6 project): heterogeneous chip with 3 FPGA The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or State of the art otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 19.
    19 / The informationcontained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 20.
    20 / Model of Computation Optimisation tools Model of Execution Model of programmation Common Interfaces strategies of relocation Flexible Hardware The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or Holistic Approach otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 21.
    21 / Model of Computation Optimisation tools Model of Execution Model of programmation Common Interfaces strategies of relocation Flexible Hardware The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or Holistic Approach otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 22.
    22 / GPP DSP nodes eFPGA nodes Slave Nodes Master Nodes Master-slave execution model data DMA requests DMA NI NI NoC GPP Node control / status node accelerator acc requests Accelerator Interface (AI) The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or Programming efficiency: common execution model otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 23.
    23 / data_transfer requests FIFOs specificities DMA wait_sync send_data receive_data send_sync2acc send_sync2gpp GPP synchro (master) work wait_sync or (slave) Accelerat send_sync2dmu Ensure hardware and software independency with the accelerator The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or Master-slave execution model otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 24.
    24 / Model of Computation Optimisation tools Model of Execution Model of programmation Common Interfaces strategies of relocation Flexible Hardware The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or Holistic Approach otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 25.
    25 / Tool flow and MoC •Optimisation and parallelisation tools can only be used on static applications. The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or •Necessity to identify static clusters inside the applications based on SDF/CSDF MoC otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8 Act Act SDF, CSDF MoC Act Act Act : Actor Act : static cluster Act Act Act : Clusters group managed by one state management : Cluster group input/output actor: consume and produce token of data with : Cluster input/output predefined and static rules
  • 26.
    26 / Tool flow and MoC The Tool flow is based on Application (C code) 2 main tools: The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or •Thales tool: SpearDE Graphic C to SpearDE •ACE tool: Cosy input representation (manual) Conversion (Cosy) otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8 Data architecture parallelisation representation Mapping (SpearDE) Streaming optimisation (Cosy) Compilation (Cosy) Library of IPs executable code Slave cores Master cores
  • 27.
    27 / A1 A3 Ax A2 A4 A5 cluster1 : partition : static cluster : cluster input/output : partition input/output : Actor number x A3 A1 A4 A2 partition2 partition1 A5 partition3 cluster1p1 A1.4 A1.3 A1.1 A1.2 A2.4 A2.3 A2.1 A2.2 •DSP •FPGA A3 A4 A5 •DSP •FPGA cluster1p1 •DSP •GPP The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or Tools : partitionning, parallelisation and mapping otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 28.
    28 / Model of Computation Optimisation tools Model of Execution Model of programmation Common Interfaces strategies of relocation Flexible Hardware The information contained in this document and any attachments are the property of THALES. You are hereby notified that any review, dissemination, distribution, copying or Holistic Approach otherwise use of this document is strictly prohibited without Thales prior written approval. ©THALES 2011. Template trtp version 7.0.8
  • 29.
    29 / Modularity and scalability: common interfaces
    [Figure: nodes attached to a NoC through generic interfaces (NI, AI). Labels: homogeneous GPP nodes; heterogeneous accelerator nodes; DSP node; dedicated accelerators; eFPGA domain (reconfigurable HW acc.); DDR Ctrl.; I/O; Config. Ctrl.]
  • 30.
    30 / Holistic Approach
    [Figure labels: Model of Computation; Optimisation tools; Model of Execution; Programming model; Common interfaces; Relocation strategies; Flexible hardware.]
  • 31.
    31 / Dynamicity: the cluster group
    [Figure: a cluster group with an event-driven states management block and three states (state 1, state 2, state 3), drawn with static clusters of actors (Act). Legend: actor; static cluster; cluster input/output; cluster group input/output; cluster group managed by one state management.]
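    Note (illustrative, not from the original slides): one way to read the cluster group is as a small state machine in which each state selects the static clusters that run, and events trigger state changes. The sketch below assumes that reading; the class `ClusterGroup`, the event names and the cluster names are all hypothetical.

```python
class ClusterGroup:
    """A group of static clusters managed by one state management block."""

    def __init__(self, states, transitions, initial):
        self.states = states            # state name -> list of static cluster names
        self.transitions = transitions  # (state, event) -> next state
        self.state = initial

    def active_clusters(self):
        """Static clusters that run in the current state."""
        return self.states[self.state]

    def on_event(self, event):
        """Apply an event; unknown events leave the state unchanged."""
        self.state = self.transitions.get((self.state, event), self.state)
        return self.active_clusters()


# Hypothetical application: search until a target is detected, then track it.
group = ClusterGroup(
    states={
        "state 1": ["search"],
        "state 2": ["track", "display"],
        "state 3": [],  # idle: no static cluster scheduled
    },
    transitions={
        ("state 1", "target_detected"): "state 2",
        ("state 2", "target_lost"): "state 1",
        ("state 2", "shutdown"): "state 3",
    },
    initial="state 1",
)
print(group.on_event("target_detected"))
```

    Inside a state every cluster stays static, so the compile-time tools still apply; only the switch between states is dynamic.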
  • 32.
    32 / Dynamicity at cluster group level
    [Figure: five cluster groups (cluster group 1 to 5), each with its own event input and states management; sensor data inputs; scatter and gather stages; states labelled state 1, state 1.1, state 1.2 and state 2; one state contains a nop. Legend: actor; static cluster; cluster group managed by one state management; cluster group input/output; cluster input/output.]
  • 33.
    33 / Start a new part of the application
    [Figure: five cluster groups (cluster group 1 to 5), each with its own event input and states management; sensor data inputs; scatter and gather stages; states labelled state 1, state 1.1, state 1.2 and state 2. Legend: actor; static cluster; cluster group managed by one state management; cluster group input/output; cluster input/output.]
  • 34.
    34 / Modification of the behaviour
    [Figure: five cluster groups (cluster group 1 to 5), each with its own event input and states management; sensor data inputs; scatter and gather stages; states labelled state 1, state 1.1, state 1.2 and state 2. Legend: actor; static cluster; cluster group managed by one state management; cluster group input/output; cluster input/output.]
  • 35.
    35 / Modification of the parallelisation level
    [Figure: five cluster groups (cluster group 1 to 5), each with its own event input and states management; sensor data inputs; scatter and gather stages; states labelled state 1 and state 2. Legend: actor; static cluster; cluster group managed by one state management; cluster group input/output; cluster input/output.]
  • 36.
    36 / Dynamicity at cluster level
    [Figure: three mappings of cluster1p1 (A1.1 to A1.4, A2.1 to A2.4, A3, A4, A5) along a time axis, each labelled "relocation". Target labels per mapping: FPGA, FPGA, GPP; DSP, DSP, GPP; DSP, DSP, DSP.]
  • 37.
    37 / Dynamic relocation
    [Figure: two views. Compile time: the application goes through the tools for parallelisation and mapping, giving threads (thread1 to thread4) and accelerator functions (Acc1, Acc3, Acc4) behind an API. Runtime: dynamic relocation of the threads and accelerator functions on GPPs and accelerators connected by a NoC, with I/O and DDR ctrl.]
  • 38.
    38 / Holistic Approach
    [Figure labels: Model of Computation; Optimisation tools; Model of Execution; Programming model; Common interfaces; Relocation strategies; Flexible hardware.]
  • 39.
    39 / A Virtualisation Layer for self adaptive capabilities
    • Virtualisation services provide a high level of abstraction of the heterogeneous resources: communication and accelerator management.
    • Self adaptive services define the actions to be taken depending on events (monitoring): relocation, DVFS,…
    [Figure: the application and an allocation file sit on top of the virtualisation layer (self adaptive services, virtualisation services), which sits on the kernel. A loop links SYSTEM, MONITORING, DIAGNOSIS (O = F(L)) and ACTION. Kernel labels: monitoring; actuators; task management; memory management; network services; communication management; scheduler; cluster management; semaphore; event management.]
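    Note (illustrative, not from the original slides): the monitoring, diagnosis and action loop can be sketched as three small steps. The deck does not define O = F(L); the sketch simply treats diagnosis as a function from monitored readings to a list of decisions, which is an assumption. The names `Reading`, `diagnose`, `act`, the 85 °C threshold and the halving of the frequency are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One monitoring sample for a node (hypothetical fields)."""
    node: str
    temperature_c: float
    faulty: bool = False


def diagnose(readings, hot_threshold_c=85.0):
    """Diagnosis: turn monitored readings into (action, node) decisions."""
    decisions = []
    for r in readings:
        if r.faulty:
            decisions.append(("relocate", r.node))
        elif r.temperature_c > hot_threshold_c:
            decisions.append(("dvfs_down", r.node))
    return decisions


def act(decisions, mapping, spare_nodes, frequency_mhz):
    """Action: relocate tasks away from faulty nodes, slow down hot nodes."""
    spares = list(spare_nodes)
    for action, node in decisions:
        if action == "relocate" and spares:
            target = spares.pop(0)
            for task, where in list(mapping.items()):
                if where == node:
                    mapping[task] = target
        elif action == "dvfs_down":
            frequency_mhz[node] //= 2
    return mapping, frequency_mhz


# One pass of the loop on a hypothetical platform.
readings = [Reading("gpp0", 60.0, faulty=True), Reading("dsp0", 90.0), Reading("gpp1", 50.0)]
mapping = {"thread1": "gpp0", "thread2": "gpp1", "thread3": "dsp0"}
freq = {"gpp0": 800, "gpp1": 800, "gpp2": 800, "dsp0": 400}
mapping, freq = act(diagnose(readings), mapping, ["gpp2"], freq)
print(mapping, freq)
```

    Keeping diagnosis separate from action mirrors the split on the slide: the policy can change without touching the relocation or DVFS mechanisms.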
  • 40.
    40 / Self-adaptation
    [Figure: the NoC-based platform (DSP node, GPP nodes, dedicated accelerators, eFPGA domain (reconfigurable HW acc.), DDR Ctrl., I/O, Config. Ctrl., attached through NI and AI) with a loop linking SYSTEM, MONITORING, DIAGNOSIS (O = F(L)) and ACTION. Labels: accelerator/virtual code mapping; dynamic allocation / binding.]
  • 41.
    41 / Holistic Approach
    [Figure labels: Model of Computation; Optimisation tools; Model of Execution; Programming model; Common interfaces; Relocation strategies; Flexible hardware.]
  • 42.
    42 / New dynamic reconfigurable technology
    FlexTiles: a 3D stacked chip
    [Figure: a homogeneous manycore (tiles connected by a NoC) with a 3D stacked reconfigurable layer.]
  • 43.
    43 / New dynamic reconfigurable technology
    FlexTiles: a 3D stacked chip
    [Figure: a homogeneous manycore (tiles connected by a NoC) with a 3D stacked reconfigurable layer. Step shown: map accelerated functions.]
  • 44.
    44 / New dynamic reconfigurable technology
    FlexTiles: a 3D stacked chip
    [Figure: a homogeneous manycore (tiles connected by a NoC) with a 3D stacked reconfigurable layer. Step shown: duplicate.]
  • 45.
    45 / New dynamic reconfigurable technology
    FlexTiles: a 3D stacked chip
    [Figure: a homogeneous manycore (tiles connected by a NoC) with a 3D stacked reconfigurable layer. Step shown: migrate.]
  • 46.
    46 / Holistic Approach
    [Figure labels: Model of Computation; Optimisation tools; Model of Execution; Programming model; Common interfaces; Relocation strategies; Flexible hardware.]
  • 47.
    47 /
    [Figure: node and memory organisation. Labels: GPP with icache, dcache and NI; on-chip shMEM; DSP with iLMEM, dLMEM and NI; eFPGA with iLMEM, dLMEM, NI and NI ctrl; DDR and DDR chip; separate NoCs for data, control, bitstream, test/debug and instruction; NoC QoS.]
  • 48.
    48 / ANoC (CEA)
    • Low latency
    • Highly scalable
    • Packet switching, wormhole protocol
    • Power efficient and dependable
    • Between nodes: no global clock, not even a local clock
    • GALS: asynchronous logic in nodes, locally synchronous cores
  • 49.
    49 / Aethereal NoC (TUe)
    • Globally synchronous with time slots
    • Contention-free routing by construction
    • Wormhole routing specified at design time
    • Guaranteed levels of service and performance
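    Note (illustrative, not from the original slides): "contention free by construction" with time slots can be pictured as a slot table filled at design time, where a connection is accepted only if none of its (link, slot) pairs is already taken. The sketch below assumes that a connection starting in slot s on its first link uses slot (s + hop) modulo the table size on each following link; that rule, the function `allocate` and the link names are assumptions for this example, not the Aethereal design tools.

```python
def allocate(slot_table, path, start_slot, connection, num_slots):
    """Reserve one slot per link along path, or refuse if any slot is taken.

    slot_table maps (link, slot) -> connection name.
    Returns True and updates slot_table on success, False otherwise.
    """
    claims = [(link, (start_slot + hop) % num_slots)
              for hop, link in enumerate(path)]
    if any(claim in slot_table for claim in claims):
        return False  # would share a link in the same slot: rejected at design time
    for claim in claims:
        slot_table[claim] = connection
    return True


# Hypothetical 4-slot table and two-link path.
table = {}
print(allocate(table, ["L0", "L1"], 0, "c1", 4))  # c1 takes (L0, 0) and (L1, 1)
print(allocate(table, ["L1"], 1, "c2", 4))        # refused: (L1, 1) already belongs to c1
print(allocate(table, ["L1"], 2, "c2", 4))        # accepted in another slot
```

    Because every conflict is rejected before the chip runs, no two packets ever compete for a link at runtime, which is what allows guaranteed levels of service.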
  • 50.
    50 / Conclusion (1)
    Parallelisation is the only way to reach HPC at low power consumption, but industry does not want to take the plunge.
    Moreover, parallelisation is not enough: customisation is also necessary.
    • Only affordable for high volumes and very difficult to program
    Reconfigurable customisation is the solution:
    • Increases accessibility to heterogeneous manycore technology
    • Allows implementation of self adaptive capabilities
  • 51.
    51 / Conclusion (2)
    Self adaptive capabilities provide:
    • Dynamic customisation of the manycore architecture to the current request of the application
    • Reduction of the programming complexity by handling part of the mapping complexity at runtime
    • Fault tolerance: adaptation depending on the faulty cores
    • Energy efficiency
    • Temperature management -> adaptation of the application mapping
  • 52.
    52 / Holistic Approach
    [Figure labels: Model of Computation; Optimisation tools; Model of Execution; Programming model; Common interfaces; Relocation strategies; Flexible hardware.]
  • 53.
    53 / Our proposal: a 3D stacked chip and …
    A 3D stacked chip based on:
    • A manycore layer
    • An FPGA layer
  • 54.
    54 / …a complete platform
    [Figure: platform stack. Labels: application; parallelisation and partitioning toolchain; compilation, giving relocatable binary code; synthesis and P&R, giving relocatable bitstream; operating library API; operating library; virtualisation layer; kernel; resource allocation; monitoring, diagnosis (O = F(L)) and action loop on the system; Hardware Abstraction Layer API; Hardware Abstraction Layer; heterogeneous manycore; hardware nodes.]
  • 55.
    55 / FlexTiles: FP7 project
    • Duration: 36 months
    • Starting date: 15/10/2011
    • Funding budget: 3,670,000€
    • Project coordinator: THALES
    www.flextiles.eu
  • 56.
    56 / Consortium and questions
    8 partners in 5 countries

    Partners & Third Party   Country          Main scientific and technical contributions
    THALES                   France           Infrastructure and applications
    KIT                      Germany          Virtualisation layer
    TUE                      Netherlands      Kernel; NoC
    CSEM                     Switzerland      DSP
    CEA                      France           NoC; 3D stacking
    UR1                      France           Reconfigurable technology
    SUNDANCE                 United Kingdom   FPGA demonstrator
    ACE                      Netherlands      Parallelisation and compilation tools
  • 57.
    57 / With FlexTiles, industry will be able to…
    take the plunge into manycore utilisation
  • 58.
    58 / Questions?
    Thank you for your attention