A parallel for loop memory template
 for a high level synthesis compiler

                  Craig Moore
      Wim Meeus, Harald Devos, and Dirk Stroobandt




              Euromicro Conference on
                Digital System Design
                     Lille, France
                     02/09/2010
Outline

 ●   High Level Synthesis
 ●   Hardware Development
 ●   External Memory
 ●   Burst memory transfers
 ●   Parallel For Loops
 ●   Memory Template Overview
 ●   Small Example
 ●   Future Work
 ●   Conclusions
30/06/2010           Craig Moore, DSD 02/09/2010      2
High Level Synthesis (HLS)
                   Missing Pieces




Memory Templates
                                      as Tools
 ●   HDL Programmers have:
     ●   Toolkit of memory designs
     ●   Use the right tool for the job
     ●   Manually adapt their designs
 ●   HLS Compilers should:
     ●   Have a toolkit of templates
     ●   Adapt the template to the app
     ●   Evaluate each template
     ●   Suggest the best template

Basic Steps for any Algorithm

1) Read values from memory
2) Process each value
3) Store output in memory


     for (int i = start; i < end; i++)
     {
       b[i] = func(a[i]);
     }


Implement on Hardware




External Memory
                           for FPGAs
                            ●   A bottleneck
                            ●   Sequential in nature
                            ●   Number of values
                                returned each cycle
                                depends on bus
                                width.
                            ●   Each memory request
                                requires a handshake



Adapting to
                                    the Bottleneck
 ●   Stream values from
     memory
 ●   Pre-fetch values
 ●   Read/Write more than
     one value each clock
     cycle
 ●   Store values locally to
     mask latency
 ●   Reduce number of
     requests
Burst Transfers

 ●   Burst of consecutive memory operations




              Read Transfer
             Start Address: 3
                   Transfer: 4

 [Figure: memory addresses 0 to 6, with a read burst of 4 transfers starting at address 3]
              Write Transfer
             Start Address: 2
                   Transfer: 5

 [Figure: memory addresses 0 to 6, with a write burst of 5 transfers starting at address 2]
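The two burst examples above can be sketched in C as a purely functional model. This is a minimal sketch, not the template's hardware: it assumes that "Transfer" means the number of consecutive words moved, and it models only the addressing, not the handshake or timing. The names `burst_read`, `burst_write` and `MEM_WORDS` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MEM_WORDS 7  /* addresses 0..6, as in the figures */

/* Burst read: a single request (start, count) returns `count`
   consecutive words, instead of one request per word. */
void burst_read(const int mem[MEM_WORDS], size_t start, size_t count, int *dst)
{
    assert(start + count <= MEM_WORDS);  /* burst must stay inside memory */
    for (size_t k = 0; k < count; k++) {
        dst[k] = mem[start + k];
    }
}

/* Burst write: a single request (start, count) stores `count`
   consecutive words taken from src. */
void burst_write(int mem[MEM_WORDS], size_t start, size_t count, const int *src)
{
    assert(start + count <= MEM_WORDS);
    for (size_t k = 0; k < count; k++) {
        mem[start + k] = src[k];
    }
}
```

With these helpers, the read example corresponds to `burst_read(mem, 3, 4, dst)` and the write example to `burst_write(mem, 2, 5, src)`.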
Parallel for Loop

 ●   Each iteration is run in parallel
 ●   No loop dependencies
     ●   Loop Transformations to remove them

               Example with Dependencies
             for i = 1 to 4
             {
               a(i) = a(i) + 1
               b(i) = a(i - 1) + a(i + 1)
             }
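
One possible way to remove the dependencies in the example above, sketched in C. This is an illustration under stated assumptions, not necessarily the transformation the compiler applies: the loop is split in two, and the first loop reads only the original values of `a`. The function names and the array length `LEN` are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define LEN 6  /* indices 0..5, so a[i-1] and a[i+1] exist for i = 1..4 */

/* Original loop: iteration i reads a[i-1], which iteration i-1 has
   already incremented, and a[i+1], which is not yet incremented. */
void sequential(int a[LEN], int b[LEN])
{
    for (int i = 1; i <= 4; i++) {
        a[i] = a[i] + 1;
        b[i] = a[i - 1] + a[i + 1];
    }
}

/* Transformed version: two loops whose iterations are independent. */
void transformed(int a[LEN], int b[LEN])
{
    /* Reads only original values of a. The extra 1 replaces the
       increment the sequential loop had already applied to a[i-1];
       a[0] is never incremented, so i == 1 gets no correction. */
    for (int i = 1; i <= 4; i++) {
        b[i] = a[i - 1] + (i > 1 ? 1 : 0) + a[i + 1];
    }
    /* Each iteration touches only its own element. */
    for (int i = 1; i <= 4; i++) {
        a[i] = a[i] + 1;
    }
}

/* Runs both versions on the same input and asserts that they agree. */
void check_equivalence(void)
{
    int a1[LEN] = {10, 20, 30, 40, 50, 60};
    int a2[LEN];
    int b1[LEN] = {0};
    int b2[LEN] = {0};
    memcpy(a2, a1, sizeof a1);
    sequential(a1, b1);
    transformed(a2, b2);
    assert(memcmp(a1, a2, sizeof a1) == 0);
    assert(memcmp(b1, b2, sizeof b1) == 0);
}
```

After the split, every iteration of either loop could run in parallel, because no iteration reads a value that another iteration of the same loop writes.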

Template Overview

 [Block diagram not preserved; annotations on its components:]
 ●   Requests read bursts and controls execution of data paths, waits for
     output buffer if it is full
 ●   Non-pipelined loop bodies executing in parallel.

Manual Design

 [Block diagram not preserved; remaining annotations:]
 ●   With enough values, performs write bursts.
 ●   Starts and stops execution
 ●   Controls access to memory, grants permission based on request
     (output buffer priority)
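
The memory access control annotated above ("output buffer priority") can be sketched in C as a fixed-priority arbiter. This is a behavioural sketch under assumptions, not the template's actual HDL: all names are hypothetical, and "enough values" is assumed here to mean at least one full burst.

```c
#include <assert.h>
#include <stdbool.h>

enum grant { GRANT_NONE, GRANT_READ, GRANT_WRITE };

/* Fixed-priority arbiter: a write request from the output buffer
   wins over a read request for new input values. */
enum grant arbitrate(bool read_request, bool write_request)
{
    if (write_request) {
        return GRANT_WRITE;
    }
    if (read_request) {
        return GRANT_READ;
    }
    return GRANT_NONE;
}

/* Assumption: the output buffer asks for a write burst once it
   holds at least burst_length values. */
bool output_buffer_requests_write(int values_buffered, int burst_length)
{
    assert(burst_length > 0 && values_buffered >= 0);
    return values_buffered >= burst_length;
}
```

Giving the output buffer priority drains results before more input is fetched, which fits the note that the read side waits when the output buffer is full.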
Byte-Enable Signal

 ●   Multiple values for each memory transaction
 ●   Tells which bytes to replace and preserve




 [Figure not preserved; its labels mark each byte as "Ignore" or "Enable"]
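
The effect of a byte-enable signal on a write can be sketched in C. This is a minimal sketch assuming a 32-bit bus with four byte lanes, where bit k of the enable mask selects byte k; the function name and the bus width are assumptions, not taken from the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the memory word after a write with byte enables:
   byte k is replaced by new_word's byte when bit k of byte_enable
   is 1, and preserved from old_word when it is 0. */
uint32_t write_with_byte_enable(uint32_t old_word, uint32_t new_word,
                                uint8_t byte_enable)
{
    assert((byte_enable & 0xF0u) == 0);  /* only 4 byte lanes exist */
    uint32_t mask = 0;
    for (int k = 0; k < 4; k++) {
        if (byte_enable & (1u << k)) {
            mask |= (uint32_t)0xFFu << (8 * k);
        }
    }
    return (old_word & ~mask) | (new_word & mask);
}
```

For example, with an enable mask of `0x3` only the two low bytes are replaced, so a write burst can start or end in the middle of a bus word without overwriting neighbouring values.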
Parametrized Template

 Parameters
 ●   Memory Bus Width = M
 ●   Word Width = W
 ●   Max Words = A = M / W
 ●   Input FIFOs = X = Cx * A
 ●   Iterations = Output FIFOs = N = CN * X
 ●   Burst Length
 ●   Output FIFO Length
 ●   Iteration Length
 ●   Input FIFO Length
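
The derived parameters above can be worked through in C. This is a sketch under assumptions: the slides do not define Cx and CN, so they are treated here as positive integer multipliers, and M is assumed to be a multiple of W. The struct and function names are hypothetical.

```c
#include <assert.h>

struct template_params {
    int bus_width_bits;   /* M */
    int word_width_bits;  /* W */
    int cx;               /* Cx, assumed positive integer */
    int cn;               /* CN, assumed positive integer */
};

struct derived_params {
    int max_words;    /* A = M / W  */
    int input_fifos;  /* X = Cx * A */
    int iterations;   /* N = CN * X, also the number of output FIFOs */
};

struct derived_params derive(struct template_params p)
{
    assert(p.word_width_bits > 0);
    assert(p.bus_width_bits % p.word_width_bits == 0);
    assert(p.cx > 0 && p.cn > 0);

    struct derived_params d;
    d.max_words = p.bus_width_bits / p.word_width_bits;
    d.input_fifos = p.cx * d.max_words;
    d.iterations = p.cn * d.input_fifos;
    return d;
}
```

With illustrative values M = 64, W = 16, Cx = 2 and CN = 2, this gives A = 4 words per bus transfer, X = 8 input FIFOs and N = 16 parallel iterations.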
Example – Reading Values
Example – Processing Values
Example – Writing Values

 [Figures not preserved. Shared legend: values in memory, values to be read,
 byte enabled, byte disabled, values processed]
Future Work

 ●   More templates for other parallel for loops
     ●   Pipelined loop body
     ●   Data reuse
 ●   Compiler identifies parallel for loop
     ●   No keywords
     ●   Check for loop dependencies, and do loop
         transformations if required
 ●   Compiler suggests best memory template
     ●   Chosen based on performance estimate
     ●   Design space exploration using templates
Conclusions

 ●   HLS Tools don't create memory designs
 ●   Manual memory designs can take
     days/weeks/months to complete
 ●   Parametrized memory template designs are
     generated in seconds
     ●   Easy to perform design space exploration using
         different parameter values and/or templates




Thank You!
                                 Questions?
                              craig.moore@elis.ugent.be
                          http://www.elis.ugent.be/~cmoore



                      Wim Meeus*, Harald Devos‡, and Dirk Stroobandt*
             *{wim.meeus, dirk.stroobandt}@elis.ugent.be, ‡devos.harald@gmail.com


More Related Content

Recently uploaded

18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfakmcokerachita
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptxENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptxAnaBeatriceAblay2
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfadityarao40181
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting DataJhengPantaleon
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
Blooming Together_ Growing a Community Garden Worksheet.docx
Blooming Together_ Growing a Community Garden Worksheet.docxBlooming Together_ Growing a Community Garden Worksheet.docx
Blooming Together_ Growing a Community Garden Worksheet.docxUnboundStockton
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 

Recently uploaded (20)

18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdf
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptxENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdf
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
Blooming Together_ Growing a Community Garden Worksheet.docx
Blooming Together_ Growing a Community Garden Worksheet.docxBlooming Together_ Growing a Community Garden Worksheet.docx
Blooming Together_ Growing a Community Garden Worksheet.docx
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 

Featured

PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Applitools
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at WorkGetSmarter
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...DevGAMM Conference
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationErica Santiago
 

Featured (20)

PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy Presentation
 

A parallel 'for' loop memory template for a high level synthesis compiler

  • 1. A parallel for loop memory template for a high level synthesis compiler Craig Moore Wim Meeus, Harald Devos, and Dirk Stroobandt Euromicro Conference on Digital System Design Lille, France 02/09/2010
  • 2. Outline ● High Level Synthesis ● Hardware Development ● External Memory ● Burst memory transfers ● Parallel For Loops ● Memory Template Overview ● Small Example ● Future Work ● Conclusions 30/06/2010 Craig Moore, DSD 02/09/2010 2
  • 3. High Level Synthesis (HLS) Missing Pieces 30/06/2010 Craig Moore, DSD 02/09/2010 3
  • 4. HLS Missing Pieces 30/06/2010 Craig Moore, DSD 02/09/2010 4
  • 5. HLS Missing Pieces 30/06/2010 Craig Moore, DSD 02/09/2010 5
  • 6. Memory Templates as Tools ● HDL Programmers have: ● Toolkit of memory designs ● Use the right tool for the job ● Manually adapt their designs ● HLS Compilers should: ● Have a toolkit of templates ● Adapt the template to the app ● Evaluate each template ● Suggest the best template 30/06/2010 Craig Moore, DSD 02/09/2010 6
  • 7. Basic Steps for any Algorithm 1) Read values from memory 2) Process each value 3) Store output in memory for (int i = start; i < end; i++) { b[i] = func(a[i]); } 30/06/2010 Craig Moore, DSD 02/09/2010 7
  • 8. Implement on Hardware 30/06/2010 Craig Moore, DSD 02/09/2010 8
  • 9. External Memory for FPGAs ● A bottle neck ● Sequential in nature ● Number of values returned each cycle depends on bus width. ● Each memory request requires a handshake 30/06/2010 Craig Moore, DSD 02/09/2010 9
  • 10. Adapting to the Bottleneck ● Stream values from memory ● Pre-fetch values ● Read/Write more than one value each clock cycle ● Store values locally to mask latency ● Reduce number of requests 30/06/2010 Craig Moore, DSD 02/09/2010 10
  • 11. Burst Transfers ● Burst of consecutive memory operations 30/06/2010 Craig Moore, DSD 02/09/2010 11
  • 12. Burst Transfers ● Burst of consecutive memory operations Read Transfer Start Address: 3 Transfer: 4 0 1 2 3 4 5 6 30/06/2010 Craig Moore, DSD 02/09/2010 12
  • 17. Burst Transfers ● Burst of consecutive memory operations ● Write Transfer: Start Address: 2, Transfer: 5 [Figure: memory cells numbered 0 to 6]
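The read and write bursts on the Burst Transfers slides can be modelled with a start address and a word count. The sketch below assumes that "Transfer" on the slides means the number of consecutive words, and that the memory in the figure has seven cells (0 to 6); `burst_read`, `burst_write`, and `MEM_WORDS` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MEM_WORDS 7 /* cells 0..6, as in the slide figure */

/* Reads `count` consecutive words starting at address `start`.
 * Returns the number of words transferred, or 0 if the burst
 * would run past the end of memory. */
size_t burst_read(const int mem[MEM_WORDS], size_t start, size_t count,
                  int *dst) {
    assert(mem != NULL && dst != NULL);
    if (start > MEM_WORDS || count > MEM_WORDS - start)
        return 0;
    for (size_t k = 0; k < count; k++)
        dst[k] = mem[start + k];
    return count;
}

/* Writes `count` consecutive words starting at address `start`.
 * Same return convention as burst_read. */
size_t burst_write(int mem[MEM_WORDS], size_t start, size_t count,
                   const int *src) {
    assert(mem != NULL && src != NULL);
    if (start > MEM_WORDS || count > MEM_WORDS - start)
        return 0;
    for (size_t k = 0; k < count; k++)
        mem[start + k] = src[k];
    return count;
}
```

Under these assumptions, the read example (start address 3, transfer 4) touches cells 3 to 6, and the write example (start address 2, transfer 5) touches cells 2 to 6, each with a single request.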
  • 23. Parallel for Loop ● Each iteration runs in parallel ● No loop dependencies ● Loop transformations to remove any dependencies ● Example with dependencies: for i = 1 to 4 { a(i) = a(i) + 1; b(i) = a(i - 1) + a(i + 1) }
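The slide does not say which transformation would be applied to this example. As one possible illustration, the loop can be split into two loops whose iterations are independent: in the original, `a(i - 1)` has already been incremented when `i > 1`, while `a(i + 1)` still holds its old value, so `b` can be computed entirely from the old contents of `a` with a correction term. The names `seq_loop`, `par_loop`, and `LEN` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define LEN 6 /* indices 0..5; the loop touches 1..4 and reads 0 and 5 */

/* The loop from the slide, executed sequentially. */
void seq_loop(int a[LEN], int b[LEN]) {
    assert(a != NULL && b != NULL);
    for (int i = 1; i <= 4; i++) {
        a[i] = a[i] + 1;
        b[i] = a[i - 1] + a[i + 1];
    }
}

/* Transformed version: two loops, each free of loop-carried dependencies. */
void par_loop(int a[LEN], int b[LEN]) {
    assert(a != NULL && b != NULL);
    /* Loop 1: reads only the old values of a. The left neighbour was
     * already incremented in the sequential order whenever i > 1. */
    for (int i = 1; i <= 4; i++)
        b[i] = a[i - 1] + (i > 1 ? 1 : 0) + a[i + 1];

    /* Loop 2: each iteration touches only its own element. */
    for (int i = 1; i <= 4; i++)
        a[i] = a[i] + 1;
}
```

Each of the two loops in `par_loop` could be mapped to parallel loop bodies, since no iteration reads a value written by another iteration of the same loop.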
  • 24. Template Overview
  • 25. Template Overview Requests read bursts and controls execution of data paths, waits for output buffer if it is full
  • 26. Template Overview Non-pipelined loop bodies executing in parallel.
  • 27. Manual Design With enough values, performs write bursts.
  • 28. Manual Design Starts and stops execution
  • 29. Manual Design Controls access to memory, grants permission based on request (output buffer priority)
  • 30. Manual Design ● Controls access to memory, grants permission based on request (output buffer priority) ● Non-pipelined loop bodies executing in parallel. ● Requests read bursts and controls execution of data paths, waits for output buffer if it is full ● With enough values, performs write bursts. ● Starts and stops execution
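The control decisions described on these slides can be written down as three small predicates. This is a behavioural sketch only, not the authors' hardware design: it assumes that "output buffer priority" means a pending write burst wins over a pending read burst, and that "enough values" means at least one burst length. All names (`grant_t`, `arbitrate`, `may_request_read`, `wants_write`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { GRANT_NONE, GRANT_READ, GRANT_WRITE } grant_t;

/* Read controller: requests a read burst only while the output
 * buffer still has room; otherwise it waits. */
bool may_request_read(size_t out_count, size_t out_depth) {
    assert(out_count <= out_depth);
    return out_count < out_depth;
}

/* Output buffer: requests a write burst once it holds enough values
 * (assumed here: at least one burst length). */
bool wants_write(size_t out_count, size_t burst_len) {
    return out_count >= burst_len;
}

/* Memory arbiter: grants one requester per decision, with the
 * output buffer (write side) taking priority. */
grant_t arbitrate(bool read_req, bool write_req) {
    if (write_req)
        return GRANT_WRITE;
    if (read_req)
        return GRANT_READ;
    return GRANT_NONE;
}
```

Giving the write side priority drains the output buffer first, which in turn unblocks a read controller that is waiting on a full buffer.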
  • 31. Byte-Enable Signal ● Multiple values for each memory transaction ● Tells which bytes to replace and preserve
  • 32. Byte-Enable Signal ● Multiple values for each memory transaction ● Tells which bytes to replace and preserve [Figure legend: Ignore, Enable]
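The effect of a byte-enable signal on a write can be shown with a bit mask. The sketch assumes a 64-bit memory word with eight byte lanes and assumes that bit k of the byte-enable signal controls byte lane k (least significant byte first); the slides do not specify either. `apply_byte_enable` and `lane_mask` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Mask selecting one byte lane of a 64-bit word (lane 0 = lowest byte). */
static uint64_t lane_mask(unsigned lane) {
    assert(lane < 8);
    return (uint64_t)0xFF << (8 * lane);
}

/* Returns the memory word after a write: enabled bytes are replaced
 * by the incoming data, disabled bytes keep their stored value. */
uint64_t apply_byte_enable(uint64_t stored, uint64_t incoming,
                           uint8_t byte_enable) {
    uint64_t mask = 0;
    for (unsigned lane = 0; lane < 8; lane++) {
        if (byte_enable & (1u << lane))
            mask |= lane_mask(lane);
    }
    return (stored & ~mask) | (incoming & mask);
}
```

For example, a byte-enable value of `0x0F` replaces the four low bytes and preserves the four high bytes, so a write burst that starts or ends in the middle of a memory word does not overwrite neighbouring values.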
  • 36. Parametrized Template
  • 37. Parametrized Template Parameters ● Memory Bus Width = M
  • 38. Parametrized Template Parameters ● Memory Bus Width = M ● Word Width = W
  • 39. Parametrized Template Parameters ● Memory Bus Width = M ● Word Width = W ● Max Words = A = M / W
  • 40. Parametrized Template Parameters ● Memory Bus Width = M ● Word Width = W ● Max Words = A = M / W ● Input FIFOs = X = C_X * A
  • 41. Parametrized Template Parameters ● Memory Bus Width = M ● Word Width = W ● Max Words = A = M / W ● Input FIFOs = X = C_X * A ● Iterations = Output FIFOs = N = C_N * X
  • 42. Parametrized Template Parameters ● Memory Bus Width = M ● Word Width = W ● Max Words = A = M / W ● Input FIFOs = X = C_X * A ● Iterations = Output FIFOs = N = C_N * X ● Burst Length ● Output FIFO Length ● Iteration Length ● Input FIFO Length
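The derived parameters follow directly from the formulas on the slide: A = M / W, X = C_X * A, and N = C_N * X. The sketch below just evaluates them; the struct, the function name, the requirement that W divides M evenly, and the numeric values in the usage note are all assumptions made for illustration.

```c
#include <assert.h>

/* Derived template sizes (names are hypothetical). */
typedef struct {
    unsigned max_words;   /* A: words per memory transaction */
    unsigned input_fifos; /* X: number of input FIFOs */
    unsigned iterations;  /* N: parallel iterations = output FIFOs */
} template_sizes;

/* bus_width = M and word_width = W in bits; c_x and c_n are the
 * integer multipliers C_X and C_N from the slide. */
template_sizes derive_sizes(unsigned bus_width, unsigned word_width,
                            unsigned c_x, unsigned c_n) {
    /* Assumption: the bus holds a whole number of words. */
    assert(word_width > 0 && bus_width % word_width == 0);

    template_sizes s;
    s.max_words = bus_width / word_width; /* A = M / W   */
    s.input_fifos = c_x * s.max_words;    /* X = C_X * A */
    s.iterations = c_n * s.input_fifos;   /* N = C_N * X */
    return s;
}
```

With hypothetical values M = 64, W = 16, C_X = 2, and C_N = 2, this gives A = 4, X = 8, and N = 16. Sweeping such values is one way to picture the design space exploration mentioned in the conclusions.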
  • 44. Example – Reading Values [Figure legend: Values in Memory, Values to be read, Byte enabled, Byte disabled, Values processed]
  • 45. Example – Processing Values [Figure legend: Values in Memory, Values to be read, Byte enabled, Byte disabled, Values processed]
  • 46. Example – Writing Values [Figure legend: Values in Memory, Values to be read, Byte enabled, Byte disabled, Values processed]
  • 47. Future Work ● More templates for other parallel for loops ● Pipelined loop body ● Data reuse ● Compiler identifies parallel for loop ● No keywords ● Check for loop dependencies, and do loop transformations if required ● Compiler suggests best memory template ● Chosen based on performance estimate ● Design space exploration using templates
  • 48. Conclusions ● HLS Tools don't create memory designs ● Manual memory designs can take days/weeks/months to complete ● Parametrized memory template designs are generated in seconds ● Easy to perform design space exploration using different parameter values and/or templates
  • 49. Thank You! Questions? craig.moore@elis.ugent.be http://www.elis.ugent.be/~cmoore Wim Meeus*, Harald Devos‡, and Dirk Stroobandt* *{wim.meeus, dirk.stroobandt}@elis.ugent.be, ‡devos.harald@gmail.com