ASIP Synthesis Methodology (ASSIST) Project Prof. M. Balakrishnan Department of Computer Science & Engineering IIT Delhi 29th January 2002
Outline of Presentation
Project Details. ASSIST: ASIP Synthesis Methodology. Start Date: 12th May, 2000. Partner institutions: IIT Delhi and University of Dortmund. IIT Delhi: Faculty: Prof. M. Balakrishnan, Prof. Anshul Kumar; Students: Manoj Kumar Jain (Ph.D.), Rajeshwari M. Banakar (Ph.D.), Vishal Bhatt (M.Tech.), R. Ram Kumar (B.Tech.), Vijay G. Prabakaran (B.Tech.). University of Dortmund: Faculty: Prof. Peter Marwedel, Dr. Rainer Leupers; Students: Lars Wehmeyer (Ph.D.), Stefan Steinke (Ph.D.).
Application Specific Instruction set Processor (ASIP)
Objectives of the Project
Work done
Survey Jain, M.K.; Balakrishnan, M.; Anshul Kumar: “ASIP Design Methodologies: Survey and Issues”, VLSI 2001
Flow Diagram of ASIP Design Methodology Application & Design Constraints Application Analysis Architectural Design Space Exploration Instruction Set Generation Code Synthesis Hardware Synthesis Object Code Processor Description
Major Classification
Architectural Features Explored
Architecture Design Space: Issues to be addressed
Methodology: ASSIST Flow Diagram Basic Processor Config. Processor Pipeline + models Component Power models Area and Clock period data ASIP Compiler Retargetable Compiler Generator Constraints Application Application Parameters Parameter Extractor Profiler # of clocks Estimator Power Estimator Area and Clock Period Estimator Configuration Selector Processor Configurations Synthesizable VHDL Generator Synthesizable VHDL Design Space Explorer Leon Processor Syn.
Register Size Evaluation: Problem Definition
Register Size Evaluation: Methodology Parameterized compiler for ARM Execution Code-size, cycle, power and energy analysis Decision for next parameter value Parameter values
Experimental Setup Benchmark Suite Register File Size Trace Data encc Compiler Instruction Set Simulator
encc Compiler Environment C Code assembly trace file profiling information executable encc ISS trace  analyzer Assembler & Linker energy database
Results. Range: number of registers 3 to 8. Memory configurations: only off-chip; on-chip instruction, off-chip data. Results collected: number of instructions executed; number of cycles; ratio of spilling instructions (static); power consumption; energy consumption.
Result for the program me_ivlin [figure; annotations: knee due to execution time reduction, knee due to power saving]
Time saving and Power saving  contributions in Energy Saving
Energy Saving due to Voltage Scaling
Maximum variation in results
Benchmark Program    Performance            Power                  Energy
                     Reg. size   % inc.     Reg. size   % red.     Reg. size   % red.
biquad_N_sections    3 → 4       57.5       3 → 4       12.6       3 → 4       62.9
lattice_init         4 → 5       20.5       6 → 7        1.0       4 → 5       21.0
matrix-mult          3 → 4       29.7       7 → 8        7.4       3 → 4       33.4
me_ivlin             3 → 4       53.4       5 → 6       15.3       3 → 4       59.3
bubble_sort          4 → 5       46.3       4 → 5       17.3       4 → 5       55.6
heap_sort            6 → 7       25.6       6 → 7       10.3       6 → 7       33.2
insertion_sort       4 → 5       44.8       4 → 5       22.3       4 → 5       57.1
selection_sort       3 → 4       22.2       5 → 6       14.0       5 → 6       30.1
Average                          37.5                   12.5                   44.1
Conclusion
Register Windows Evaluation: Problem Definition. Performance analysis for the ASIP parameter, number of register windows
Register Windows
Overlapping Registers [figure: register windows W0 to W3, each with its own locals; overlapping pairs: W0 outs / W1 ins, W1 outs / W2 ins, W2 outs / W3 ins, W3 outs / W0 ins]
Effects of Number of Windows f1  Program f1 f3 f4 f2 f5 f2 f3 f4 Memory
Effects of Number of Windows  f1  Program f1 f3 f4 f2 f5 f2 f3 f4 f1 Memory SPILL
Effects of Number of Windows f5  Program f1 f3 f4 f2 f5 f2 f3 f4 f1 Memory SPILL
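The window-overflow behaviour pictured in the three frames above can be sketched in Python. This is a minimal sketch, not the project's tool: the function name `count_spills`, the "call"/"ret" trace encoding, and the policy of spilling exactly one window per overflow are all assumptions made for illustration. Real register-window hardware may also reserve a window for the trap handler, which is ignored here.

```python
def count_spills(trace, num_windows):
    """Count window spills (overflows) and fills (underflows) for a call trace.

    trace: iterable of "call" / "ret" events (hypothetical encoding).
    num_windows: number of register windows available to the program.
    """
    if num_windows < 1:
        raise ValueError("num_windows must be at least 1")
    resident = 1    # windows currently in the register file (the first function's)
    in_memory = 0   # windows that have been spilled to memory
    spills = 0
    fills = 0
    for event in trace:
        if event == "call":
            if resident == num_windows:
                # Register file is full: the oldest window is written to memory.
                spills += 1
                in_memory += 1
            else:
                resident += 1
        elif event == "ret":
            if resident > 1:
                resident -= 1
            elif in_memory > 0:
                # The caller's window was spilled earlier: read it back.
                fills += 1
                in_memory -= 1
            # Otherwise this is a return from the outermost function.
        else:
            raise ValueError(f"unknown event: {event!r}")
    return spills, fills


# f1 calls f2, which calls f3, and so on down to f5; then every call returns.
trace = ["call"] * 4 + ["ret"] * 4
print(count_spills(trace, 4))  # (1, 1)
```

With four windows the call to f5 forces f1's window out to memory, which matches the SPILL frame; with five or more windows the same trace causes no spill.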
Register Windows Evaluation: Methodology [flow diagram, Steps 1 to 3; boxes: Application, Modified Application, Compile & Execute, Spill Count, Memory Access Time Models, Compute T_avg_access, Compute Time Penalty, Time Penalty; code listing fragments containing F(); and DS(); calls]
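As a rough illustration of how the quantities named on this slide might combine, the sketch below derives an average access time and multiplies it by the number of words moved by spills and fills. The slide only names Spill Count, T_avg_access, and Time Penalty; the formulas, the hit-rate based memory model, the `words_per_window` parameter, and both function names are assumptions.

```python
def avg_access_time(hit_rate, t_hit, t_miss):
    """Average memory access time for a simple hit/miss model (assumed model)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * t_hit + (1.0 - hit_rate) * t_miss


def time_penalty(window_transfers, words_per_window, t_avg_access):
    """Time spent moving register windows to and from memory.

    window_transfers: spills plus fills (hypothetical input, e.g. from a trace).
    words_per_window: registers saved or restored per transfer (assumption).
    t_avg_access: average time per memory access, in cycles.
    """
    return window_transfers * words_per_window * t_avg_access


t_avg = avg_access_time(hit_rate=0.9, t_hit=1.0, t_miss=10.0)
print(time_penalty(window_transfers=2, words_per_window=16, t_avg_access=t_avg))
```

Keeping the memory model separate from the penalty computation makes it easy to swap in a different access-time model without touching the spill counting.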
Spill Count Computation
Memory Access Time Models
Memory Models considered
System Configurations
Total Execution Time
Execution time for MPEG Decoder
Cache v/s Scratchpad: Objectives
Target Architecture Main Memory Cache Scratch pad Cache
Methodology: Flow Diagram application encc Packing Algorithm ARMulator Scratchpad Performance Cache/Scratchpad size Trace analysis CACTI Area Model Area Energy Cache Performance
Cache and Scratch pad Memory TAG array DATA array Decoder Input Wordlines Bitlines Column mux Sense  amplifiers Comparators Output driver Mux drivers Sense  amplifier Output  driver Column Mux Column Mux Scratch pad memory Decoder Data array Peripheral Circuitry
Energy models. Scratch pad Energy Model: E_sptotal = SP_access * E_scratchpad, where SP_access = number of scratchpad accesses obtained from the trace analysis; E_scratchpad = the energy per access; E_sptotal = the total energy in the scratch pad.
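The scratch pad energy model is a single multiplication, and a short Python sketch makes the inputs explicit. The helper that counts accesses from an address trace, the address-range test, and the example numbers are hypothetical; the slide only states that SP_access comes from trace analysis.

```python
def count_scratchpad_accesses(trace_addresses, sp_base, sp_size):
    """Count accesses that fall inside the scratch pad address range.

    The address-range test is an assumption about how a trace would be analysed.
    """
    return sum(1 for addr in trace_addresses if sp_base <= addr < sp_base + sp_size)


def scratchpad_total_energy(sp_access, e_scratchpad):
    """E_sptotal = SP_access * E_scratchpad (model from the slide)."""
    if sp_access < 0 or e_scratchpad < 0:
        raise ValueError("inputs must be non-negative")
    return sp_access * e_scratchpad


# Hypothetical trace: three of the four addresses lie in a 1 KB scratch pad at 0x0.
accesses = count_scratchpad_accesses([0x0, 0x10, 0x400, 0x3FF], sp_base=0x0, sp_size=0x400)
print(scratchpad_total_energy(accesses, e_scratchpad=0.5))  # 1.5 (same unit as E_scratchpad)
```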
Memory Interaction Model Memory Access Model
Energy per access Cache Scratch pad
Results for bubble_sort. Area reduction: 34%; Energy reduction: 40%; Time reduction: 18%; Area-Time reduction: 46%
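The Area-Time figure is consistent with multiplying the remaining area and time fractions. The slide does not say how it was computed, so the product rule and the function name below are assumptions used only as an arithmetic check.

```python
def combined_reduction(*reductions_percent):
    """Percentage reduction of a product when each factor shrinks by the given percentage."""
    remaining = 1.0
    for reduction in reductions_percent:
        remaining *= 1.0 - reduction / 100.0
    return (1.0 - remaining) * 100.0


# Area shrinks by 34% and time by 18%, so the area-time product shrinks by about 46%.
print(round(combined_reduction(34, 18)))  # 46
```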
Energy Consumption for lattice Cache Scratch pad
Leon Synthesis Objectives
Salient features of Leon Processor
Architectural features varied
Leon Synthesis: Achievements
Leon Synthesis: Achievements contd.
Conclusion
Proposed Future Work
Publications (Journal and Reviewed Conference Papers) Jain, M.K.; Balakrishnan, M.; Anshul Kumar: “ASIP Design Methodologies: Survey and Issues”, VLSI 2001. Jain, M.K.; Wehmeyer, L.; Steinke, S.; Marwedel, P.; Balakrishnan, M.: “Evaluating Register File Size in ASIP Synthesis”, CODES 2001. Wehmeyer, L.; Jain, M.K.; Steinke, S.; Marwedel, P.; Balakrishnan, M.: “Analysis of the Influence of the Register File Size on Energy Consumption, Code Size and Execution Time”, IEEE TCAD, vol. 20, no. 11, Nov. 2001. Bhatt, V.; Balakrishnan, M.; Anshul Kumar: “Register Windows Analysis in ASIPs”, VLSI 2002.
Publications (Conference Papers) Wehmeyer, L.; Jain, M.K.; Steinke, S.; Marwedel, P.; Balakrishnan, M.: “Using a retargetable, Energy aware Compiler Framework for Deciding Number of Registers in ASIP Design”, Fifth International Workshop on Software and Compilers for Embedded Systems, SCOPES 2001, 20-22 March, 2001, St. Goar, Germany. Banakar, R.; Bose, R.; Balakrishnan, M.: “Low Power Design: Abstraction levels and RT level design techniques”, VLSI Design and Test Workshop, VDAT 2001, Aug. 2001, Bangalore, India.
Publications (Technical Reports) Jain, M. K.: “ASIP Design Methodologies: Survey and Issues”, TR #2000/24, Embedded Systems Project, Department of Computer Science and Engineering, IIT Delhi. Jain, M. K.; Wehmeyer, L.; Marwedel, P.; Balakrishnan, M.: “Register File Synthesis in ASIP Design”, TR #2000/746, Department of CS XII, University of Dortmund, Germany. Kumar, R. R.; Prabakaran, V. G.: “Application Specific Instruction Set Processor Synthesis and Estimation”, TR #2000/29 (B.Tech. Project Report), Embedded Systems Project, Department of Computer Science and Engineering, IIT Delhi. Bhatt, V. V.: “Register Window Analysis in ASIPs”, TR #2000/36 (M.Tech. Project Report), Embedded Systems Project, Department of Computer Science and Engineering, IIT Delhi. Banakar, R.; Steinke, S.; Lee, B. S.; Balakrishnan, M.; Marwedel, P.: “Comparison of Cache and Scratch-Pad based memory Systems with respect to Performance, Area and Energy Consumption”, TR #2001/762, Department of CS XII, University of Dortmund, Germany.
ASIP Synthesis and Retargetable Code Generation Workshop, Jan. 2, 2002 to Jan. 4, 2002, IIT Delhi. The Speakers: Prof. M. Balakrishnan, IIT Delhi; Prof. Anshul Kumar, IIT Delhi; Prof. Paolo Ienne, EPFL; Dr. Preeti Ranjan Panda, Synopsys Inc.; Prof. Nikil Dutt, UC Irvine; Prof. Peter Marwedel, Univ. of Dortmund; Dr. Uday Khedker, IIT Bombay; Dr. Rainer Leupers, Univ. of Dortmund
Thanks


Editor's Notes

  1. Here we assume that total execution time is constant. When execution requires fewer cycles, we increase the clock period to keep execution time constant, and the longer clock period allows a lower supply voltage. To estimate the supply voltage for a given clock period, we referred to A. P. Chandrakasan et al., “Low Power CMOS Digital Design”, IEEE J. Solid-State Circuits, Vol. 27, No. 4, pp. 473-484, April 1992. Energy is the product of average power consumption and execution time; since execution time is constant and power depends quadratically on voltage, we computed energy consumption from the estimated voltage.
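
The reasoning in this note can be sketched in Python. The function name and the example values are hypothetical, and the scaled supply voltage is taken as an input rather than derived from the delay model in the cited paper. The sketch also assumes that `avg_power_ref` is the average power at the reference voltage for the stretched clock period.

```python
def energy_after_voltage_scaling(avg_power_ref, exec_time, v_ref, v_scaled):
    """Energy at a lower supply voltage with execution time held constant.

    avg_power_ref: average power at v_ref (assumed to reflect the stretched clock).
    exec_time: total execution time, kept constant as described in the note.
    v_ref, v_scaled: reference and scaled supply voltages (v_scaled is an input here).
    """
    if v_ref <= 0 or v_scaled <= 0:
        raise ValueError("voltages must be positive")
    # Power depends quadratically on the supply voltage.
    scaled_power = avg_power_ref * (v_scaled / v_ref) ** 2
    # Energy is average power times execution time.
    return scaled_power * exec_time


# Hypothetical numbers: halving the voltage cuts energy to a quarter.
print(energy_after_voltage_scaling(avg_power_ref=2.0, exec_time=5.0, v_ref=2.0, v_scaled=1.0))  # 2.5
```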