FPGA Design and Implementation
ASIC & VLSI
• Time-to-market: some large ASICs can take a year or more to design.
• Design issues: a great deal of time is needed to handle the mapping, routing, placement, and timing.
• The FPGA design flow eliminates the complex and time-consuming floorplanning, place-and-route, and timing-analysis work for the designer; these steps are handled automatically by the tools.
Conceptual FPGA (figure): logic blocks and I/O cells connected by programmable interconnect resources.
FPGA
• Speed: on-chip memory can be block RAM (BRAM) or distributed RAM.
• RAM is volatile (loses its data when power is removed); size and cost are also considerations.
• Floating-point vs. fixed-point arithmetic issues.
• Flexibility.
Design Flow (process diagram):
Design Entry → Technology Mapping → Placement → Routing → Programming Unit → Configured FPGA
Why HDL?
• To allow the designer to implement and verify complex hardware functionality at a high level, without having to know the details of the low-level implementation.
• Advantages:
• FPGAs have lower prototyping costs
• FPGAs have shorter production times
• Synthesis: the process that translates VHDL code into a complete circuit of logic elements (gates, flip-flops, etc.).
Maximum Throughput Designs
• Dataflow
• Unrolling
• Pipelining
• Merging
Loop Unrolling
• The arrays a[i], b[i], and c[i] are mapped to RAMs.
• Rolled loop: this implementation takes four clock cycles, uses a single multiplier, and each RAM can be single-port.
• Unrolled loop: the entire loop can be performed in a single clock cycle, but this requires four multipliers and the ability to perform four reads and four writes in the same clock cycle; it may also require the arrays to be implemented as register arrays rather than RAMs (sketched below).
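A minimal C sketch of the rolled loop, with the unroll directive shown as a comment (Vivado HLS-style pragmas are assumed here; the function and label names are illustrative, not taken from the slides):

// Rolled: one iteration per clock cycle, a single shared multiplier,
// and single-port RAMs for a[], b[] and c[] suffice (4 cycles total).
void multiply(int a[4], int b[4], int c[4]) {
  loop: for (int i = 0; i < 4; i++) {
    // #pragma HLS unroll   // uncomment to unroll: 4 multipliers and
    //                      // 4 reads + 4 writes in a single clock cycle,
    //                      // which may force register arrays instead of RAMs
    c[i] = a[i] * b[i];
  }
}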
Loop Merging
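The original slide is a figure; as a hedged illustration of the idea, two consecutive loops with identical bounds can be merged into a single loop so they share one set of loop-control logic and the total latency drops (Vivado HLS-style loop_merge directive assumed; all names are illustrative):

void merge_example(int a[4], int b[4], int c[4], int d[4]) {
  #pragma HLS loop_merge   // assumed directive: merge consecutive loops
  sum_loop:  for (int i = 0; i < 4; i++) c[i] = a[i] + b[i];
  diff_loop: for (int i = 0; i < 4; i++) d[i] = a[i] - b[i];
  // After merging, both assignments execute inside one 4-iteration loop.
}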
Pipelining
• Pipelining allows operations to happen concurrently: each stage can start working on new data while later stages are still finishing the previous data.
Pipelining
• Function pipelining is only possible when there is no resource contention or data dependency that prevents it. The input array "m[2]" is implemented with a single-port RAM, so the function cannot be pipelined: the two read operations on input "m[2]" ("op_Read_m[0]" and "op_Read_m[1]") cannot be performed in the same clock cycle.
• Solution: the resource contention can be resolved by using a dual-port RAM for array "m[2]", allowing both reads to be performed in the same clock cycle, or by increasing the initiation interval of the pipeline. A sketch follows.
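A minimal sketch of the situation described above (assumed Vivado HLS-style code; the function name and directives are illustrative, not taken from the slides):

int read_both(int m[2]) {
  #pragma HLS pipeline II=1
  // With m[] in a single-port RAM, m[0] and m[1] cannot be read in the
  // same cycle, so the target initiation interval II=1 cannot be met.
  // Possible fixes: request a dual-port RAM for m[] (e.g. a bind_storage/
  // RESOURCE directive selecting a ram_2p implementation), or relax the
  // initiation interval (e.g. II=2).
  return m[0] + m[1];
}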
Array Optimizations
• Mapping: when there are many small arrays, mapping them into a single large array reduces the storage overhead.
• Partitioning: if each small array gets a separate memory, a lot of memory space is potentially wasted, the design becomes large, and power consumption grows accordingly.
• Horizontal mapping: a new array is created by concatenating the original arrays end-to-end. Physically, this is implemented as a single array with more elements.
• Vertical mapping: a new array is created by concatenating the corresponding words of the original arrays. Physically, this is implemented as a single array with a larger bit-width. A sketch of both styles follows.
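A hedged sketch of array mapping (Vivado HLS-style ARRAY_MAP directives assumed; the instance name "combined" and the other names are illustrative, not taken from the slides):

void map_example(int d_in[8], int d_out[8]) {
  int array1[4];
  int array2[4];
  // Horizontal: array1 and array2 are concatenated end-to-end into one RAM
  // with 8 elements. Replacing "horizontal" with "vertical" would instead
  // concatenate corresponding words into one 4-element RAM of double width.
  #pragma HLS array_map variable=array1 instance=combined horizontal
  #pragma HLS array_map variable=array2 instance=combined horizontal
  split: for (int i = 0; i < 4; i++) {
    array1[i] = d_in[i];
    array2[i] = d_in[i + 4];
  }
  join: for (int i = 0; i < 4; i++) {
    d_out[i]     = array1[i];
    d_out[i + 4] = array2[i];
  }
}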
Horizontal mapping
• Although horizontal mapping can result in fewer RAM components and therefore less area, it can have an impact on throughput and performance.
• In the previous example, the accesses to "array1" and "array2" can both be performed in the same clock cycle.
• If both arrays are mapped to the same RAM, each read now requires a separate access, and therefore a separate clock cycle.
Vertical mapping
Array Partitioning
• Arrays can also be partitioned into smaller arrays. A memory has a limited number of read and write ports, which can limit the throughput of a load/store-intensive algorithm.
• The bandwidth can sometimes be improved by splitting the original array (a single memory resource) into multiple smaller arrays (multiple memories), effectively increasing the number of ports. See the sketch below.
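A minimal sketch of partitioning for bandwidth (assumed Vivado HLS-style directives; the cyclic factor and all names are illustrative):

int pair_sum(int d_in[8]) {
  int buf[8];
  // Split buf[] cyclically into two physical memories: even indices in one
  // bank, odd indices in the other, doubling the available read ports.
  #pragma HLS array_partition variable=buf cyclic factor=2 dim=1
  copy: for (int i = 0; i < 8; i++) buf[i] = d_in[i];
  int sum = 0;
  acc: for (int i = 0; i < 8; i += 2) {
    #pragma HLS pipeline II=1
    sum += buf[i] + buf[i + 1];   // two reads per cycle, one from each bank
  }
  return sum;
}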
Array Partitioning
• If the elements of an array are accessed one at a time, an efficient hardware implementation is to keep them grouped together and mapped into a RAM.
• If multiple elements of an array are required simultaneously, it may be more advantageous for performance to implement them as individual registers, allowing parallel access to the data.
• Implementing an array of storage elements as individual registers may help performance, but it consumes more area and increases power consumption, as the sketch and the table below illustrate.
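A short sketch of the register-based alternative (assumed Vivado HLS-style directive; names are illustrative). A complete partition turns the array into individual registers, so all elements can be read in parallel at the area and power cost quantified in the table below:

int sum_all(int d_in[4]) {
  int buf[4];
  // Complete partition: buf[] becomes four separate registers.
  #pragma HLS array_partition variable=buf complete
  copy: for (int i = 0; i < 4; i++) buf[i] = d_in[i];
  // All four elements are available in the same clock cycle.
  return buf[0] + buf[1] + buf[2] + buf[3];
}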
Device xa7a100tfgg484-2i, 2-D input array of size N = 128 × 128

Input Array    Dual-port RAM    Independent Registers
LUT            1642             10778
FF             835              9548
Power          246              2031