How to create innovative architecture using
VisualSim?
Date: March 31, 2016
Host: Ranjith K R
Application Engineer
Mirabilis Design Inc.
Email: radiga@mirabilisdesign.com
Logistics of the Webinar
To ask a question, click on the arrow to the left
of Chat and type the question. Folks are
standing by to answer your questions.
There will also be time at the end for Q&A.
Purpose of this Webinar
• Importance of System level Modeling and Simulation
• Exploration of Complex Electronic systems with VisualSim
– Performance and Power
– Resource selection
– Bottleneck analysis
• Modeling Libraries and accuracy
Background on Mirabilis Design
• Provider of system-level modeling, simulation, analysis and
exploration software
• Supports systems, semiconductor and embedded software
• VisualSim- Modeling and simulation environment
• Based in Silicon Valley with experts in system modeling, power
measurements and architectures
• Largest source of system modeling IP with embedded timing
and power
Select the “Right” configuration to match customer request
About VisualSim
• Graphical and hierarchical
modeling
• Large library of stochastic
and cycle-accurate
components and IP blocks
with embedded timing and
power
• Library blocks are used to
assemble hardware,
software, network, traffic,
reports and use-cases
[Figure: Architecture Exploration, Performance Analysis, Power Analysis and HW-SW Partitioning across Application, Hardware, Interfaces and RTOS]
Validate and optimize your design quickly and accurately
Motivation to Analyze and Validate with VisualSim
[Figure: task1, task2, task3 and task4 mapped onto I/O, CPU1, DSP and CPU2]
Complex behavior
- input stream
- data dependent behavior
Contention
- limited resources
- scheduling/arbitration
Interference of multiple
applications
- limited resources
- scheduling/arbitration
- anomalies
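The contention and interference effects listed above can be illustrated with a minimal sketch in plain Python (not VisualSim's API). It assumes a single shared resource served in arrival order; the task names, arrival times and service times are hypothetical.

```python
def simulate_shared_resource(requests):
    """Serve requests on one shared resource in arrival order.

    requests: list of (arrival_time, service_time, task_name).
    Returns {task_name: [latency, ...]}, latency = completion - arrival.
    """
    free_at = 0.0
    latencies = {}
    for arrival, service, name in sorted(requests):
        start = max(arrival, free_at)      # wait while the resource is busy
        free_at = start + service
        latencies.setdefault(name, []).append(free_at - arrival)
    return latencies

# Hypothetical workload: task1 needs 4 time units every 10 units.
task1 = [(t * 10.0 + 1.0, 4.0, "task1") for t in range(5)]
# Hypothetical interfering workload: task2 needs 8 time units every 10 units.
task2 = [(t * 10.0, 8.0, "task2") for t in range(5)]

alone = simulate_shared_resource(task1)
shared = simulate_shared_resource(task1 + task2)

print("task1 alone :", alone["task1"])
print("task1 shared:", shared["task1"])
```

Run alone, task1 sees a constant latency. Once task2 shares the resource, the combined demand exceeds capacity and task1's latency grows with every period, which is the kind of anomaly that is hard to predict from a static spreadsheet.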
Unpredictable System behavior
Workload – 1
Workload – 2
Problem with HW/SW?
Early Design Explorations to get Decisions right! –
Modeling and Simulation
• Modeling complete Electronic system
– Hardware, Operating System, applications
• Start Exploring your System within Weeks
– Bottleneck Analysis, Performance and Power Analysis
– Conduct “What-if” analysis
• System sizing
– Hardware-Software Partitioning
– Evaluate Dynamic behavior
• Fault-Injection
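A "what-if" analysis for system sizing amounts to sweeping design parameters and checking each point against a requirement. The sketch below is plain Python, not VisualSim's API; the one-beat-per-cycle bus model, the parameter values and the 1 us budget are all assumptions chosen for illustration.

```python
from itertools import product

def transfer_time_us(payload_bytes, bus_width_bits, clock_mhz):
    """Time to move a payload, assuming one bus beat per clock cycle."""
    beats = -(-payload_bytes * 8 // bus_width_bits)   # ceiling division
    return beats / clock_mhz                          # cycles / MHz = microseconds

def what_if(payload_bytes, widths, clocks, budget_us):
    """Evaluate every (width, clock) pair against a latency budget."""
    results = []
    for width, clock in product(widths, clocks):
        t = transfer_time_us(payload_bytes, width, clock)
        results.append({"width": width, "clock_mhz": clock,
                        "time_us": t, "meets": t <= budget_us})
    return results

# Hypothetical sweep: 1 KB payload, three bus widths, two clock rates.
table = what_if(1024, widths=[32, 64, 128], clocks=[100, 200], budget_us=1.0)
for row in table:
    print(row)
```

A real exploration would replace the closed-form `transfer_time_us` with a simulation run, but the sweep-and-filter structure stays the same.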
Exploring System Architecture with VisualSim
Architecture Model
Behavior Flow
Spreadsheet
Modeling Hardware Blocks
• Large library of pre-configured parameterized hardware
blocks
• RegEx functions and scripting language to create
custom/proprietary blocks
• All blocks have timing and power information embedded
• Pre-defined statistics available
• Multi-clock domains across the entire system
• Define components as statistical or cycle-accurate
Largest, fully open IP library and rapid custom IP generator
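The difference between a statistical component and a more detailed, state-tracking one can be sketched as two interchangeable blocks. This is plain Python under stated assumptions, not a VisualSim library block: the class names, the exponential latency distribution, and the hit/miss cycle counts are hypothetical.

```python
import random

class StatisticalMemory:
    """Latency drawn from a distribution; no internal state is modeled."""
    def __init__(self, mean_cycles=10.0, seed=0):
        self.mean_cycles = mean_cycles
        self.rng = random.Random(seed)

    def access(self, address):
        # The address is ignored: only the average behavior is captured.
        return max(1, round(self.rng.expovariate(1.0 / self.mean_cycles)))

class StateTrackingMemory:
    """Latency depends on whether the addressed row is already open."""
    def __init__(self, row_bits=10, hit_cycles=4, miss_cycles=20):
        self.row_bits = row_bits
        self.hit_cycles = hit_cycles
        self.miss_cycles = miss_cycles
        self.open_row = None

    def access(self, address):
        row = address >> self.row_bits
        if row == self.open_row:
            return self.hit_cycles      # row hit
        self.open_row = row             # open the new row
        return self.miss_cycles         # row miss

detailed = StateTrackingMemory()
print([detailed.access(a) for a in (0, 8, 2048, 2052, 4)])
```

Both classes expose the same `access` method, so a model can start with the statistical block and swap in the detailed one when the address pattern starts to matter.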
SoC Modeling Kit
• Communication
– AMBA – AHB, APB, AXI
– Framework for Modeling Custom Bus Technologies
• Memory
– SDR, DDR, DDR2, DDR3, DDR4, QDR
– LPDDR, LPDDR2, LPDDR3, LPDDR4
– NVMe, Memory Controller
• Storage
– Flash, SSD, Disk Drive
• Interfaces
– Gigabit Ethernet, RapidIO, PCIe, Switched Ethernet
Modeling Software Tasks
• As a Flow Diagram
– Delay with conditions
• As a detailed software description
– Flow diagram
– Read data, write data
– Processing time, request for resources
• Software Instruction
– Instruction trace from an existing system
– Synthetic trace for a new or existing system
Largest variety of software modeling and direct software usage
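Two of the software modeling styles above, a flow with conditional delay and a synthetic instruction trace, can be sketched as follows. This is plain Python, not VisualSim's API; the delay constants, the instruction classes and their cycle costs are hypothetical.

```python
import random

# Hypothetical cycle cost per instruction class.
CYCLES = {"alu": 1, "load": 3, "store": 2, "branch": 2}

def flow_delay_cycles(payload_bytes, cache_hit):
    """Flow-diagram style task: fixed delay plus a condition-dependent delay."""
    delay = 100                      # fixed processing time
    delay += payload_bytes // 4      # read data, one 4-byte word per cycle
    if not cache_hit:
        delay += 50                  # extra delay on the miss path
    return delay

def synthetic_trace(length, mix, seed=0):
    """Generate a reproducible instruction trace from an instruction mix."""
    rng = random.Random(seed)
    ops = list(mix)
    return rng.choices(ops, weights=[mix[op] for op in ops], k=length)

def trace_cycles(trace, cycles=CYCLES):
    """Total cycles for a trace, assuming no stalls or overlap."""
    return sum(cycles[op] for op in trace)

trace = synthetic_trace(50, {"alu": 6, "load": 2, "store": 1, "branch": 1})
print(flow_delay_cycles(64, cache_hit=False), trace_cycles(trace))
```

The same `trace_cycles` function works on a trace captured from an existing system, as long as it is reduced to the same instruction classes.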
Power Modeling with VisualSim
[Figure: Block Functional Diagram (Function 1, Function 2, ..., Function N), Block Power Mode Diagram, and Power Management State Machine]
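A power management state machine can be reduced to a timeline of power states and an energy sum over it. The sketch below is plain Python, not VisualSim's API; the three states, their power values, and the idle-timeout policy are assumptions made for illustration.

```python
# Hypothetical power draw per state, in mW.
POWER_MW = {"active": 500.0, "idle": 100.0, "sleep": 5.0}

def build_timeline(busy_ms, gap_ms, n_tasks, sleep_after_ms):
    """Periodic task: active, then idle, then sleep once the idle timeout expires."""
    timeline = []
    for _ in range(n_tasks):
        timeline.append(("active", busy_ms))
        idle = min(gap_ms, sleep_after_ms)
        timeline.append(("idle", idle))
        if gap_ms > idle:
            timeline.append(("sleep", gap_ms - idle))
    return timeline

def average_power_mw(timeline, power=POWER_MW):
    """Energy (mW * ms) divided by total time (ms)."""
    total_time = sum(duration for _, duration in timeline)
    energy = sum(power[state] * duration for state, duration in timeline)
    return energy / total_time

with_sleep = average_power_mw(build_timeline(2.0, 8.0, 10, sleep_after_ms=2.0))
no_sleep = average_power_mw(build_timeline(2.0, 8.0, 10, sleep_after_ms=8.0))
print(with_sleep, no_sleep)
```

Changing `sleep_after_ms` is a simple way to compare power management policies; wake-up latency and transition energy are left out of this sketch.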
Multimedia SoC Platform Exploration
Requirements:
1. Process 13K Macro Blocks
2. Less than 1 W of Power
3. Perform HW-SW Partitioning
Explorations:
1. HW-SW Partitioning
2. Power Management
3. Parameters for Processor,
Memory and Accelerators
Analysis:
1. Performance
2. Power
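The HW-SW partitioning exploration can be sketched as an exhaustive search over stage-to-resource mappings, filtered by the performance and power requirements. This is plain Python, not the VisualSim model: the stage names, per-macroblock costs, power figures and the 33 ms deadline are hypothetical; only the 13K macroblock count and the 1 W limit come from the requirements above.

```python
from itertools import product

# Hypothetical (time per macroblock in us, power in mW) for each stage and mapping.
STAGES = {
    "motion_est": {"sw": (2.0, 150.0), "hw": (0.4, 300.0)},
    "dct":        {"sw": (1.0, 100.0), "hw": (0.2, 200.0)},
    "entropy":    {"sw": (0.8, 100.0), "hw": (0.3, 150.0)},
}

def evaluate(mapping, macroblocks):
    """Stages run back to back; power is averaged over processing time."""
    time_us = sum(STAGES[s][m][0] for s, m in mapping.items())
    energy = sum(STAGES[s][m][0] * STAGES[s][m][1] for s, m in mapping.items())
    return time_us * macroblocks / 1000.0, energy / time_us   # (ms, mW)

def explore(macroblocks=13000, deadline_ms=33.0, power_limit_mw=1000.0):
    """Return feasible mappings as (average power, total time, mapping)."""
    feasible = []
    for choice in product(("sw", "hw"), repeat=len(STAGES)):
        mapping = dict(zip(STAGES, choice))
        total_ms, avg_mw = evaluate(mapping, macroblocks)
        if total_ms <= deadline_ms and avg_mw < power_limit_mw:
            feasible.append((avg_mw, total_ms, mapping))
    return sorted(feasible, key=lambda item: item[0])

results = explore()
best_power, best_time, best_mapping = results[0]
print(best_mapping, round(best_time, 1), "ms", round(best_power, 1), "mW")
```

With these made-up numbers, the all-software mapping misses the deadline and moving only motion estimation to hardware gives the lowest average power among the feasible options.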
VisualSim Model
Application Behavior Description
Hardware Architecture
Analysis and Results
• Common Statistics for all devices
– End-to-end latency and Task Delay
– Throughput (MIPS or MB/s), Utilization (%), Task Delay
– Minimum, maximum, mean, standard deviation
• Processor
– Individual statistics for Internal Caches, Execution Units,
registers, Pipeline
– Flush Time, Stall (%), Thread swaps and Context switching
– Detailed pipeline activity
• Cache
– Hit-miss Ratio
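The common statistics listed above are simple to compute from raw simulation samples. A minimal sketch using Python's standard `statistics` module follows; the function names and sample values are hypothetical, and population standard deviation is an assumption since the slide does not say which form is reported.

```python
import statistics

def summarize(samples):
    """Minimum, maximum, mean and population standard deviation."""
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": statistics.mean(samples),
        "stdev": statistics.pstdev(samples),
    }

def utilization_pct(busy_time, total_time):
    """Share of time a resource was busy, in percent."""
    return 100.0 * busy_time / total_time

def hit_ratio(hits, misses):
    """Fraction of cache accesses that hit."""
    return hits / (hits + misses)

# Hypothetical end-to-end latency samples, in microseconds.
latency_us = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(summarize(latency_us), utilization_pct(30.0, 120.0), hit_ratio(90, 10))
```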
System Model Accuracy
• Development of highly optimized architecture for a 16-core Tensilica-
based Network Processor.
– Identified that mesh bus topology doesn’t meet performance requirements.
– Validated network ASIC against the industry benchmark.
– Development time: 3 man-months
• LPDDR-2 based wireless system
– All RAS, RP, RCD, Read and Write signals for all Banks matched with JEDEC and
RTL at every clock cycle
– Development time: 2.5 man-months
• Full Multimedia SoC
– Matched read and write latency with detailed model
– Development time: 3 weeks
Abstract system model can be AS Accurate as detailed models
Modeling Libraries
• Traffic (100)
– Distribution and Sequence
– Trace file input
– Instruction profiler
• Reports (2000)
– Latency, Throughput, Utilization
– Ave and peak power
– Custom generator
• SoC
– AMBA (AHB/APB/AXI)
– CoreConnect: PLB & OPB
– NoC, Virtual Channel
• Memory Controller
– SDR, DDR, DDR2, DDR3
– QDR, RDRAM
– LPDDR, LPDDR2
• Board-Level
– VME: Parallel & Daisy
– PCI/PCI-X/PCI-Express
– SPI 3.0
– Rapid IO
– 1553B
– FlexRay
– CAN
– AFDX
– TTEthernet
• Processors
– ARM
– PowerPC: Freescale and IBM
– Intel and AMD
– TI
– MIPS
– Tensilica
– Renesas SH
– Marvell
• Resources
– Time & quantity resources
– Assignment language
– Custom development script
– 600 RegEx functions
• Storage
– Flash
– Disk
– Memory Controller (Fixed, Round-Robin, Priority)
– Multi-Port Multi-Channel Controller
• Networking
– Switched Ethernet
– Resilient Packet Ring
– RP3
– Wireless LAN 802.11
– Bluetooth
– Spacewire
– AVB
– Fibre Channel
– FireWire
• Xilinx FPGA
– Hard & Soft IP
– Virtex
– Spartan
– Processors, Memory, Bus
– DMA, FSL, APU and MPMC-2
– Zynq 7000
Conclusion
• Start Exploring your System within Weeks
• Simulation of Dynamic behavior of System
Components for accurate Power and Performance
values
• Validate System Architecture against current and
future requirements
Visit us at Booth 1103
April 11-14, 2016 · Colorado Springs, Colorado USA
Backup Slides
VisualSim Accuracy (Performance-level)
• Deficit RR-based Router
– (Simulated vs. expected)
– 100% for throughput, latency & algorithm
– Input and output rates matched
• MPEG Encoder on TI DSP
– Customer Feedback
– 100% matched DSP utilization
– 98% of time for end-to-end latency
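For readers unfamiliar with the first test case, the sketch below shows the textbook deficit round robin (DRR) scheduling algorithm in plain Python. It is not the VisualSim router model or the customer's design; the queue names, packet sizes and quantum are hypothetical.

```python
from collections import deque

def deficit_round_robin(queues, quantum):
    """Textbook DRR: returns the send order as a list of (queue_name, packet_size).

    queues: dict mapping queue name to a list of packet sizes in bytes.
    quantum: bytes of credit added to each backlogged queue per round (> 0).
    """
    pending = {name: deque(sizes) for name, sizes in queues.items()}
    deficit = {name: 0 for name in pending}
    order = []
    while any(pending.values()):
        for name, queue in pending.items():
            if not queue:
                continue
            deficit[name] += quantum
            # Send packets while the head packet fits in the accumulated credit.
            while queue and queue[0] <= deficit[name]:
                size = queue.popleft()
                deficit[name] -= size
                order.append((name, size))
            if not queue:
                deficit[name] = 0       # an emptied queue keeps no credit
    return order

# Hypothetical flows: A sends large packets, B sends small ones.
sent = deficit_round_robin({"A": [600, 600], "B": [200] * 6}, quantum=400)
print(sent)
```

Both flows end up with the same number of bytes sent even though their packet sizes differ, which is the fairness property a simulated-versus-expected comparison would check.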
VisualSim Bus Accuracy (Architecture-Level)
PCI, Burst 64, Data 256
             Actual      VisualSim    Accuracy
Latency      2.16 us     1.97 us      91.29%
Throughput   107 MBps    111.3 MBps   95.92%

PCI, Burst 32, Data 128
             Actual      VisualSim    Accuracy
Latency      1.20 us     1.06 us      88.33%
Throughput   96 MBps     102 MBps     93.75%
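The slide does not state how the accuracy column was calculated. One plausible reading, sketched below as an assumption, is 100% minus the relative error against the actual measurement. This formula reproduces the Burst 32 figures (88.33% and 93.75%) and comes within about 0.1 percentage point of the Burst 64 figures.

```python
def accuracy_pct(actual, simulated):
    """Assumed definition: 100% minus relative error against the actual value."""
    return 100.0 * (1.0 - abs(simulated - actual) / actual)

# Values taken from the tables above: (label, actual, VisualSim).
rows = [
    ("Burst 64 latency (us)", 2.16, 1.97),
    ("Burst 64 throughput (MBps)", 107.0, 111.3),
    ("Burst 32 latency (us)", 1.20, 1.06),
    ("Burst 32 throughput (MBps)", 96.0, 102.0),
]
for label, actual, simulated in rows:
    print(f"{label}: {accuracy_pct(actual, simulated):.2f}%")
```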
