The document describes the specifications and operations of Double Data Rate (DDR) SDRAM memory. It details features like double data rate architecture, burst lengths, CAS latencies, commands like read, write, refresh, and initialization procedures. It provides timing diagrams for different memory operations.
The document discusses various on-chip bus architectures used for system-on-chip designs. It describes buses such as AMBA, CoreConnect, STBus, Wishbone and others. For each bus, it provides details on the bus hierarchy, protocols, and how they enable connection and data transfer between functional blocks in a system-on-chip.
DDR3 is an evolution of DDR2 RAM that provides faster speeds, lower power consumption, and other improvements. Key features of DDR3 include higher clock frequencies up to 1600MHz, lower voltage of 1.5V, 8-bit prefetch, on-die termination for better signal quality, and fly-by topology. DDR3 also has read/write leveling to calibrate timing, lower signaling standards for reduced power/noise, and improved routing guidelines.
Highlighted notes while studying Concurrent Data Structures:
DDR3 SDRAM
Source: Wikipedia
Double Data Rate 3 Synchronous Dynamic Random-Access Memory, officially abbreviated as DDR3 SDRAM, is a type of synchronous dynamic random-access memory (SDRAM) with a high bandwidth ("double data rate") interface, and has been in use since 2007. It is the higher-speed successor to DDR and DDR2 and predecessor to DDR4 synchronous dynamic random-access memory (SDRAM) chips. DDR3 SDRAM is neither forward nor backward compatible with any earlier type of random-access memory (RAM) because of different signaling voltages, timings, and other factors.
Wikipedia is a free online encyclopedia, created and edited by volunteers around the world and hosted by the Wikimedia Foundation.
Highlighted notes while studying Concurrent Data Structures:
DDR4 SDRAM
Source: Wikipedia
Double Data Rate 4 Synchronous Dynamic Random-Access Memory, officially abbreviated as DDR4 SDRAM, is a type of synchronous dynamic random-access memory with a high bandwidth ("double data rate") interface.
JTAG (Joint Test Action Group) is an IEEE 1149.1 standard that defines a 4-wire interface for testing printed circuit boards and integrated circuits. It allows for boundary scan testing which provides controllability and observability of a device's I/O pins through software. The standard defines instructions and registers that are used to perform functional tests, interconnect tests, and built-in self-tests. JTAG provides advantages like eliminating physical test points, reducing costly test fixtures, and increasing test speed. It works by using a TAP controller and instruction and data registers that are serially loaded to control and observe the device under test.
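The TAP controller mentioned above is a 16-state machine stepped by the TMS signal on each TCK edge. The following is a minimal sketch of its next-state table, not a driver for any real adapter; the names `TAP_NEXT` and `walk` are illustrative assumptions and do not come from the summarized document.

```python
# Next-state table for the IEEE 1149.1 TAP controller.
# Each entry maps a state to (next state if TMS=0, next state if TMS=1).
# TAP_NEXT and walk are hypothetical names chosen for this sketch.
TAP_NEXT = {
    "TEST_LOGIC_RESET": ("RUN_TEST_IDLE", "TEST_LOGIC_RESET"),
    "RUN_TEST_IDLE": ("RUN_TEST_IDLE", "SELECT_DR"),
    "SELECT_DR": ("CAPTURE_DR", "SELECT_IR"),
    "CAPTURE_DR": ("SHIFT_DR", "EXIT1_DR"),
    "SHIFT_DR": ("SHIFT_DR", "EXIT1_DR"),
    "EXIT1_DR": ("PAUSE_DR", "UPDATE_DR"),
    "PAUSE_DR": ("PAUSE_DR", "EXIT2_DR"),
    "EXIT2_DR": ("SHIFT_DR", "UPDATE_DR"),
    "UPDATE_DR": ("RUN_TEST_IDLE", "SELECT_DR"),
    "SELECT_IR": ("CAPTURE_IR", "TEST_LOGIC_RESET"),
    "CAPTURE_IR": ("SHIFT_IR", "EXIT1_IR"),
    "SHIFT_IR": ("SHIFT_IR", "EXIT1_IR"),
    "EXIT1_IR": ("PAUSE_IR", "UPDATE_IR"),
    "PAUSE_IR": ("PAUSE_IR", "EXIT2_IR"),
    "EXIT2_IR": ("SHIFT_IR", "UPDATE_IR"),
    "UPDATE_IR": ("RUN_TEST_IDLE", "SELECT_DR"),
}


def walk(state, tms_bits):
    """Advance the TAP state once per TMS bit (one TCK cycle each)."""
    for bit in tms_bits:
        state = TAP_NEXT[state][bit]
    return state


if __name__ == "__main__":
    # Holding TMS high for five clocks reaches Test-Logic-Reset from any state.
    print(all(walk(s, [1] * 5) == "TEST_LOGIC_RESET" for s in TAP_NEXT))
    # From reset, TMS = 0, 1, 0, 0 enters Shift-DR, where a data register is shifted.
    print(walk("TEST_LOGIC_RESET", [0, 1, 0, 0]))
```

Encoding the controller as a lookup table keeps the sketch easy to check against the state diagram in the standard; real tooling would also drive TDI and sample TDO while in the shift states.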
Designed a fully custom 128x10b SRAM by constructing the schematic and Virtuoso layout of the memory cell array (6T cell), row and column decoders, pre-charge circuit, write circuit, and sense amplifier using Cadence. Manually placed and routed all components, performed DRC and LVS debugging of the schematic and layout, and ran PEX to generate the final netlist. Simulated the final design in HSPICE/Spectre to verify correct functionality and to analyze the best-case read and write cycles and the worst-case read and write timing. Timing and power consumption were analyzed with PrimeTime static timing analysis (STA).
The document discusses the PowerPC processor. It provides details about the IBM 405Fx PowerPC processor core such as its 32-bit RISC design, 5-stage pipeline, separate instruction and data caches, virtual memory management unit, timers, and debug support. The PowerPC architecture consists of the user instruction set architecture, virtual environment architecture, and operating environment architecture. The processor core contains the pipeline, cache units, MMU, timers, and interfaces to other functions.
Semiconductor engineering is becoming a more dynamic field as technology scaling continues. Power reduction techniques are attractive solutions to the performance, area, and power trade-off, so power reduction in VLSI designs is critical.
The document discusses various types of computer memory technologies, including RAM types like DRAM, SRAM, DDR, DDR2, and DDR3. It explains the memory hierarchy from registers to cache to main memory to disks. Key points covered include how DRAM works using capacitors that must be periodically refreshed, advantages of SDRAM over regular DRAM like pipelining commands. Generations of DDR memory are compared in terms of clock speeds, data rates, and other features.
Low power VLSI design has become important due to increasing integration leading to higher power consumption. Low power design is essential for handheld devices to allow long battery life and better performance. There are various techniques for low power design including reducing supply voltage, minimizing capacitance and switching activity, and employing strategies like clock gating and power gating. Low power design can be achieved at different levels from system to logic to physical design.
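The three levers named above (supply voltage, capacitance, and switching activity) all appear in the standard first-order model of CMOS dynamic power, P = a * C * Vdd^2 * f. The sketch below is only an illustration of that formula; the function name and the example numbers are assumptions, not figures from the summarized document.

```python
def dynamic_power_watts(activity, capacitance_farads, vdd_volts, freq_hz):
    """First-order CMOS dynamic power: P = a * C * Vdd^2 * f.

    activity is the fraction of the switched capacitance that toggles
    per clock cycle (0..1). All example values below are made up.
    """
    return activity * capacitance_farads * vdd_volts ** 2 * freq_hz


if __name__ == "__main__":
    base = dynamic_power_watts(0.1, 1e-9, 1.0, 1e9)
    half_vdd = dynamic_power_watts(0.1, 1e-9, 0.5, 1e9)
    # Power scales with the square of the supply voltage, so halving Vdd
    # cuts dynamic power to a quarter, all else being equal.
    print(half_vdd / base)
```

The quadratic dependence on Vdd is why supply-voltage reduction is usually listed first; clock gating instead attacks the activity term, and power gating targets static (leakage) power, which this model does not cover.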
This document provides an overview of serial communication buses. It defines serial buses as using a single wire or fiber to transmit data one bit at a time. Common serial buses include RS-232, RS-422, RS-485, and USB. RS-232 defines standards for serial binary communication but has limitations like short maximum cable length. RS-422 and RS-485 use differential signaling to allow longer distances and higher speeds. RS-485 also enables multipoint connections. USB serves as a serial bus to connect devices and transfer data and power using a host controller.
DDR memory is a type of RAM that allows for increased performance over single data rate memory by facilitating two data transactions per clock cycle without doubling the clock speed. It consists of over 130 signals and uses mode and extended mode registers to control operations. DDR memory comes in SRAM and DRAM varieties, with DRAM being more common due to its lower power consumption and use in main memory, though it requires constant refreshing to prevent data loss.
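The "two data transactions per clock cycle" point can be made concrete with a peak-bandwidth calculation. This is a minimal sketch; the function name is hypothetical, and the 200 MHz, 64-bit example is a common DDR-400 module configuration chosen for illustration rather than taken from the summarized document.

```python
def peak_bandwidth_bytes_per_s(clock_hz, bus_width_bits, transfers_per_clock=2):
    """Theoretical peak bandwidth of a memory interface.

    transfers_per_clock is 2 for double data rate (data moves on both
    clock edges) and 1 for single data rate.
    """
    return clock_hz * transfers_per_clock * bus_width_bits // 8


if __name__ == "__main__":
    clock = 200_000_000  # 200 MHz I/O clock (assumed example)
    width = 64           # 64-bit data bus (assumed example)
    print(peak_bandwidth_bytes_per_s(clock, width, transfers_per_clock=1))  # 1600000000
    print(peak_bandwidth_bytes_per_s(clock, width))                         # 3200000000
```

At the same clock frequency, the double data rate interface doubles the theoretical peak; sustained bandwidth is lower in practice because of refresh, row activation, and bus turnaround overheads.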
This presentation is a short introduction to issues in hardware-software codesign. It discusses the definition of codesign, its significance, design issues in hardware-software codesign, abstraction levels, and the duality of hardware and software.
This document provides an overview of system on chip (SoC) interconnect architectures and standard bus protocols. It discusses key considerations for choosing an interconnect architecture such as bandwidth, latency, and clock domains. Common SoC bus standards including AMBA, CoreConnect, and Wishbone are described along with their bus architectures and components. The document also provides details on specific buses within standards, such as AMBA's AHB, ASB, and APB buses and CoreConnect's PLB, OPB, and DCR buses.
The document discusses various ATPG (Automatic Test Pattern Generation) methods and algorithms. It provides an introduction to ATPG, explaining that ATPG generates test patterns to detect faults in circuits. It then covers major ATPG classifications like pseudorandom, ad-hoc, and algorithmic. Several algorithmic ATPG methods are described, including the D-algorithm, PODEM, FAN, and genetic algorithms. Sequential ATPG is more complex due to memory elements. The summary reiterates that testing large circuits is difficult and many ATPG methods have been developed for combinational and sequential circuits.
The document discusses several challenges in embedded systems design. It notes that current scientific foundations separate hardware and software design paradigms in ways that make integrating computation and physical constraints difficult. Engineering practices also separate critical and best-effort design methods. The document argues that a successful approach to embedded systems design needs a mathematical basis that integrates abstract-machine and transfer-function models, allows combining critical and best-effort engineering, and encompasses heterogeneous components through constructs like compositionality and non-interference rules.
Power gating is the main technique for reducing static power. As technology scaling continues, static power becomes a critically important factor in VLSI designs. Power gating is therefore a recent power reduction technique under active research.
This document discusses various low power techniques for integrated circuits. It begins by describing the increasing challenges of power consumption as device densities and clock frequencies increase while supply voltages and threshold voltages decrease. It then discusses different types of power consumption, including dynamic power, static power, leakage power from different sources, and how they can be reduced. The document covers many low power design techniques like multi-threshold CMOS, clock gating, multi-voltage, DVFS, and more. It discusses the evolution of these techniques and challenges in their implementation like timing issues, level shifters, and floorplanning for multi-voltage designs.
PCIe is a standard expansion card interface introduced in 2004 to replace PCI and PCI-X. It uses serial instead of parallel communication and is scalable, allowing for higher maximum system bandwidth. The presentation discusses the history of expansion card standards leading to PCIe, including ISA, EISA, VESA, PCI, and PCI-X. It also covers key aspects of PCIe such as the root complex, endpoints, switches, lanes, bus:device.function notation, enumeration, and address spaces such as configuration space.
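The bus:device.function notation mentioned above packs three fields into a 16-bit identifier: 8 bits of bus, 5 bits of device, and 3 bits of function. The sketch below parses the short textual form (for example `03:00.1`); the helper names are hypothetical, and it deliberately does not handle an optional domain prefix such as `0000:` or the ARI variant of the encoding.

```python
import re

# Matches the short "bb:dd.f" form in hexadecimal; no domain prefix.
_BDF_RE = re.compile(r"^([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$")


def parse_bdf(text):
    """Parse 'bb:dd.f' into (bus, device, function) integers."""
    match = _BDF_RE.match(text)
    if match is None:
        raise ValueError(f"not a bus:device.function string: {text!r}")
    bus = int(match[1], 16)
    device = int(match[2], 16)
    function = int(match[3], 16)
    if device > 0x1F:  # the device field is only 5 bits wide
        raise ValueError(f"device number out of range: {device:#x}")
    return bus, device, function


def bdf_to_id(bus, device, function):
    """Pack the three fields into the 16-bit identifier."""
    return (bus << 8) | (device << 3) | function


if __name__ == "__main__":
    print(parse_bdf("03:00.1"))                   # (3, 0, 1)
    print(hex(bdf_to_id(*parse_bdf("03:00.1"))))  # 0x301
```

During enumeration, software probes configuration space at each such address to discover which functions are present.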
The document provides an overview of the responsibilities and functions of the Genie-PCIe data link layer. The data link layer is responsible for reliable transmission of transaction layer packets (TLPs) between the physical and transaction layers. It handles flow control initialization, sequencing, buffering, error detection and recovery for transmitted TLPs using ACK/NAK protocols and data link layer packets (DLLPs). The data link control state machine manages the link status and ensures proper initialization and maintenance of the link.
This document provides an overview of system architecture and processor architectures. It discusses different types of system architecture like system-level building blocks, components of a system, hardware and software implementation, and instruction-level parallelism. It also describes various processor architectures like sequential, pipelined, superscalar, VLIW, SIMD, array, and vector processors. Additionally, it covers memory and addressing in systems-on-chip including memory considerations, virtual memory, and the process of determining physical memory addresses.
This document summarizes the key aspects of a DDR2 SDRAM controller, including:
1) It describes the differences between DDR1 and DDR2 memory technologies, such as lower power consumption and higher data rates in DDR2.
2) It provides a block diagram of the main components and I/O signals of a DDR2 SDRAM controller.
3) It explains the basic functionality of a DDR2 SDRAM controller, including initialization, refresh operations, and read and write operations.
This document discusses pipelining as an approach to optimize sequential circuits. It describes how pipelining can be implemented using registers between logic blocks to improve resource utilization and increase throughput. This allows computations to be spread over multiple clock cycles in an assembly-line fashion. The document also discusses latch-based vs register-based pipelines and different logic styles like NORA-CMOS that can be used for pipelined structures. It covers design rules and considerations for ensuring correct pipelined operation. Finally, it briefly describes non-bistable sequential circuits like astable, monostable and Schmitt trigger circuits.
DMA stands for direct memory access, a method of transferring data from the computer's RAM to another part of the computer without processing it using the CPU.
Universal Flash Storage is an upcoming memory specification for use in mobile phones, tablets and other consumer electronics devices.
It is the successor to the currently prevailing embedded MultiMediaCard (eMMC) standard and will be available as storage both in on-chip form and in expandable form (as memory cards).
The document provides information on different types of computer system architectures including SISD, SIMD, MIMD, and MISD. It discusses the key characteristics of each architecture such as SISD involving a single processor executing a single instruction stream on data from a single memory. SIMD involves multiple processors executing the same instruction on multiple data streams simultaneously. MIMD involves multiple processors executing different instruction streams on different data simultaneously. Pipelining is described as a technique used to increase instruction throughput by splitting instruction processing into independent stages.
SOC Application Studies: Image Compression, by A B Shinde
This document discusses application studies of AES encryption and JPEG image compression on SOC designs. It provides an overview of the AES algorithm and requirements, describing the encryption process. An initial SOC design for AES is proposed using an ARM7 processor, and performance is evaluated. JPEG compression is also summarized, outlining the color space transformation, discrete cosine transform, and entropy coding steps. Finally, an example JPEG system for a digital still camera is presented using a TMS320C54x processor to implement the imaging pipeline and compression.
Architectural tricks to maximize memory bandwidth, by Deepak Shankar
Deepak Shankar, CEO and founder of Mirabilis Design Inc., hosted a webinar (Feb 17, 2016) on architectural possibilities for improving memory bandwidth. The webinar highlighted the role memory plays in the performance and power consumption of a system.
Introduction to Architecture Exploration of Semiconductor, Embedded Systems, ... by Deepak Shankar
- Identify design challenges, trade-offs, and exploration.
- Construct an architecture model using data available in documents, spreadsheets, existing code, datasheets, and future concepts.
- Analyze the model to determine the cause of a bottleneck or performance degradation.
Task allocation on many core-multi processor distributed system, by Deepak Shankar
Migrating software from single-core to multi-core and from single-threaded to multi-threaded execution, and integrating it into a distributed system, requires knowledge of the system and of scheduling algorithms. The system consists of a combination of hardware, RTOS, network, and traffic profiles. Of the 100+ popular scheduling algorithms, the majority use First Come, First Served with priority and preemption, Weighted Round Robin, or slot-based scheduling. Task allocation must take into account a number of factors, including the hardware configuration, RTOS scheduling, task dependencies, parallel partitioning, shared resources, and memory access. Additionally, embedded system architectures always have the option of using custom hardware to implement tasks associated with artificial intelligence, diagnostics, or image processing.
In this webinar, we will show you how to conduct trade-offs using a system model of the tasks and the target resources. You will learn to make decisions based on hardware and network statistics. The statistics help identify deadlocks, bottlenecks, possible failures, and hardware requirements. To estimate the best task allocation and partitioning, a discrete-event simulation with both time-shared and quantity-shared resource modeling is essential. The software must be defined as a UML model or a task graph.
Web: www.mirabilisdesign.com
Webinar Youtube Link: https://youtu.be/ZrV39SYTWSc
Using VisualSim Architect for Semiconductor System Analysis, by Deepak Shankar
Mirabilis Design provides architecture exploration software for semiconductors, electronics, and embedded software. Using this modeling and simulation solution, designers can trade off power against performance, partition functions between hardware and software, optimize timing, minimize power consumption, perform functional analysis, and evaluate the quality of the system in the event of a failure. The outcome of this early exploration is a highly validated specification, a reference design for prospective customers to evaluate, and data for certification purposes.
VisualSim has a large library of components (stochastic, hardware, software, network, and RTOS) used to assemble models of the entire system extremely quickly, at levels of abstraction ranging from stochastic to timing-accurate. These models are simulated against workloads and use cases, and the generated reports are used to make architecture decisions.
Webinar on Latency and throughput computation of automotive EE network, by Deepak Shankar
This solution enables architects to conduct trade-offs in early planning, system sizing, and network topology planning. This is part one of a three-part series covering systems engineering exploration of automotive EE systems. Technologies studied in this session include FlexRay, CAN, CAN FD, TSN, Ethernet, ECU, brake system, power supply electronics, Li-ion batteries, ADAS, and AUTOSAR.
The document discusses optimizing power and timing of RISC-V processors and systems. It describes evaluating pipeline stages, widths and speeds of RISC-V cores. It also discusses modeling SoC architectures containing RISC-V processors using VisualSim to evaluate power, performance and hardware accelerators for applications like media and SSD controllers.
Mirabilis_Design AMD Versal System-Level IP LibraryDeepak Shankar
Mirabilis Design provides the VisualSim Versal Library that enable System Architect and Algorithm Designers to quickly map the signal processing algorithms onto the Versal FPGA and define the Fabric based on the performance. The Versal IP support all the heterogeneous resource.
How to create innovative architecture using VisualSim?Deepak Shankar
In this presentation, we will get you started on using VisualSim Architect to conduct performance analysis, power measurement and functional validation. You will learn advanced concepts of system modeling and how to apply VisualSim Architect for a variety of applications.
Highlights include the application for both System-on-Chip and Large Systems including Designing memory interfaces using DDR3 and LPDDR3.
VisualSim Architect is used by systems and semiconductor companies to validate and analyze the specification of the product. The environment offers an easy-to-use methodology, huge library of technology components, extremely fast simulator and a huge reports list.
1. DESIGNING MEMORY CONTROLLER FOR DDR5 AND HBM2.0
Deepak Shankar
Founder
Mirabilis Design Inc.
Email: dshankar@mirabilisdesign.com
2. Before we get started…
All attendees are muted and will stay muted
Use the chat or the “Raise Hand” feature to bring questions to our attention
4. Agenda
Introduction to DDR5 and HBM2.0
Role and Importance of the Memory Controller
Parts of a Memory Controller and Options
Metrics to judge the Quality of Service
Introduction to Architecture Exploration
Role of Architecture Exploration in designing a Memory Controller
Parameters to describe the Memory Controller
Architecture model of a Memory Controller
Experiments
Other Controller Designs
Q&A
5. Introduction to HBM2.0 and DDR5
High Bandwidth Memory (HBM)
◦ High-speed computer memory interface for 3D-stacked SDRAM
◦ Used in conjunction with performance-sensitive consumer applications, graphics accelerators, network devices and supercomputers.
◦ HBM2 has up to eight dies per stack and doubles pin transfer rates up to 2 GT/s.
◦ 1024-bit wide access with memory bandwidth per package of 256 GB/s.
Double Data Rate 5 Synchronous Dynamic Random-Access Memory
◦ 4800 to 6400 million transfers per second (PC5-38400 to PC5-51200).
◦ Minimum burst length is 16, with the option of "burst chop" after 8 transfers.
◦ Number of bank groups is 8, with 4 banks per group.
◦ Two independent channels per DIMM.
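The bandwidth figures on this slide follow from bus width times transfer rate. The sketch below reproduces them; the function name is illustrative, and the 64-bit DDR5 data width (two 32-bit channels per DIMM) is an assumption that the slide does not state but that the PC5-38400 module name implies.

```python
def peak_bandwidth_gb_per_s(bus_width_bits: int, transfers_per_s: float) -> float:
    """Peak bandwidth in GB/s (1 GB = 1e9 bytes): bytes per transfer x transfers per second."""
    return bus_width_bits / 8 * transfers_per_s / 1e9


# HBM2: 1024-bit interface at 2 GT/s (both values from the slide).
hbm2 = peak_bandwidth_gb_per_s(1024, 2e9)

# DDR5: 4800 and 6400 MT/s from the slide; 64 data bits per DIMM is an assumption.
ddr5_4800 = peak_bandwidth_gb_per_s(64, 4800e6)
ddr5_6400 = peak_bandwidth_gb_per_s(64, 6400e6)

print(f"HBM2 per package: {hbm2:.1f} GB/s")
print(f"DDR5-4800 DIMM:   {ddr5_4800:.1f} GB/s")
print(f"DDR5-6400 DIMM:   {ddr5_6400:.1f} GB/s")
```

These are peak numbers only; the sustained bandwidth depends on the controller behavior discussed in the following slides.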
8. Role of Memory Controller
Manages the flow of data going to and from the computer's main memory
Shared memory is a key component and major performance bottleneck in multi-core processors
Location of the memory controllers at the interconnect has a major impact on throughput
Memory controller decides which request gets access to the memory, for what duration and in what order
Bandwidth impact of an in-order shared bus connecting the CPUs and the memory controller (Article)
◦ An intelligent read-to-write switching memory controller provides the same benefit as doubling the interleaved memory ranks
◦ Delayed write scheduling gives lower read latency across a range of throughput
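One way to see why delayed write scheduling helps is to count how often the data bus changes direction. The following is a minimal sketch, not the scheme from the cited article: the buffer depth, the function names and the request pattern are all hypothetical, and it ignores read-after-write hazards to the same address, which a real controller must check.

```python
from collections import deque


def count_turnarounds(ops):
    """Count read<->write direction switches in an issued command sequence."""
    return sum(1 for prev, cur in zip(ops, ops[1:]) if prev != cur)


def delayed_write_order(ops, write_buffer_depth=4):
    """Issue reads immediately; hold writes and drain them in a batch when the buffer fills.

    Simplification: no address tracking, so read-after-write ordering is not enforced.
    """
    issued, pending = [], deque()
    for op in ops:
        if op == "W":
            pending.append(op)
            if len(pending) == write_buffer_depth:
                issued.extend(pending)
                pending.clear()
        else:
            issued.append(op)
    issued.extend(pending)  # drain any writes left at the end of the trace
    return issued


requests = ["R", "W"] * 8  # hypothetical worst case: strictly alternating traffic
in_order_switches = count_turnarounds(requests)
delayed_switches = count_turnarounds(delayed_write_order(requests))
print(in_order_switches, delayed_switches)
```

Each avoided switch saves a bus turnaround penalty, and reads no longer wait behind individual writes, which is the intuition behind the lower read latency.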
9. Parts of a Memory Controller
Address decoder
Buffer and buffer management
Scheduling algorithm to select the next request
Read and Write channels
Interfaces to processor and DRAM
Signal handling and triggering the refresh
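As an illustration of the address decoder, the sketch below splits a flat address into row, bank, bank group and column fields. The field widths and their order are assumptions chosen to match the DDR5 organisation mentioned earlier (8 bank groups, 4 banks per group); a real controller would take them from the device datasheet and the chosen interleaving policy.

```python
from typing import NamedTuple

class DramAddress(NamedTuple):
    row: int
    bank: int
    bank_group: int
    column: int

# Assumed field widths, listed from the least significant bits upward.
COLUMN_BITS, BANK_GROUP_BITS, BANK_BITS, ROW_BITS = 10, 3, 2, 16

def decode(addr: int) -> DramAddress:
    """Peel fields off the low end of the address, one at a time."""
    column = addr & ((1 << COLUMN_BITS) - 1)
    addr >>= COLUMN_BITS
    bank_group = addr & ((1 << BANK_GROUP_BITS) - 1)
    addr >>= BANK_GROUP_BITS
    bank = addr & ((1 << BANK_BITS) - 1)
    addr >>= BANK_BITS
    row = addr & ((1 << ROW_BITS) - 1)
    return DramAddress(row, bank, bank_group, column)

print(decode((5 << 15) | (3 << 13) | (6 << 10) | 7))
```

Placing the bank group bits just above the column bits is one common choice, since consecutive accesses then tend to rotate across bank groups; other mappings are equally valid.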
10. Memory Controller Quality of Service
Latency vs Bandwidth
Bytes per Watt
Buffer occupancy
Algorithm efficiency
Maximum bandwidth for target application
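These metrics can be derived from a log of completed requests. The sketch below is a hypothetical post-processing step, not a VisualSim report: the `Completed` record, the energy figure and the occupancy samples are all assumed inputs, and "bytes per watt" is interpreted as throughput per watt, which is the same quantity as bytes per joule.

```python
from dataclasses import dataclass

@dataclass
class Completed:
    issue_ns: float      # time the request entered the controller
    done_ns: float       # time the data transfer finished
    size_bytes: int

def qos_summary(completed, energy_joules, occupancy_samples):
    """Summarise latency, bandwidth, energy efficiency and buffer occupancy."""
    span_ns = max(c.done_ns for c in completed) - min(c.issue_ns for c in completed)
    total_bytes = sum(c.size_bytes for c in completed)
    return {
        "avg_latency_ns": sum(c.done_ns - c.issue_ns for c in completed) / len(completed),
        "bandwidth_gb_s": total_bytes / span_ns,       # bytes per ns equals GB/s
        "bytes_per_joule": total_bytes / energy_joules,
        "avg_buffer_occupancy": sum(occupancy_samples) / len(occupancy_samples),
    }

log = [Completed(0, 100, 64), Completed(50, 200, 64)]
summary = qos_summary(log, energy_joules=1e-6, occupancy_samples=[1, 2, 1])
print(summary)
```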
12. Introduction to Architecture Exploration
Architecture Exploration
◦ Optimize and validate the system specification
◦ Specification: Processor speed, topology and arbitration
◦ Requirements: Timing, energy, cost, weight and efficiency
Performance Analysis
◦ Buffer size, utilization, throughput and response time
Power Measurement
◦ Peak and average power, energy and power/task
Functional Correctness
◦ Arbitration, software task scheduling and task graph
Failure Analysis
◦ Hardware, Software, network, data, power and logic
Making Better Quality Products
13. Analysis using Architecture Exploration
Buffer management
Power optimization
Core and processor selection and sizing
Response times for various data sizes and rates
Firmware algorithm selection
Algorithm, arbitration and scheduling design
Credit policy and impact of the flash memory selection on throughput
Memory management
Software Task graph
14. Performance Evaluation of System
Which Libraries?
1. Configured through parameter and data table settings only
• Traffic
• Expression
• MasterDevice
• Bus Arbiter/Bus
• DMA
• RAM
• Processor
• PCIe
• AMBA AXI
• Power Management
2. Need to create script code
• GPU Warp/PE
[Figure: block diagram with labels NXP i.MX6 / NVIDIA Drive PX, Xilinx FPGA Kintex 8, Discrete DMA, ARM A53, GPU, Display Ctrl, SRAM3, DRAM3, Video IN, Parameters]
15. Role of Architecture Exploration in Memory Controller Design
Two types
◦ Stochastic
◦ Cycle-accurate
Modeling
◦ Incorporates the interface fabric, workloads and the traffic model
◦ Define memory controller algorithm as a delay, order buffer or detailed algorithm
◦ Connect the memory controller into a SoC or embedded system
Simulation
◦ Different scheduling algorithms
◦ Separate or single channel for Read and Write
◦ Buffer size
◦ Clock Speed
◦ Connected DRAM
◦ Number of Masters or cores
Analysis
◦ Generated reports to evaluate the Quality of Service
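To make the stochastic style of modeling concrete, the sketch below treats the memory controller as a FIFO buffer with a fixed service delay and sweeps the buffer size. It is a toy model under stated assumptions (at most one arrival per cycle with a fixed probability, one request in service at a time); the function and parameter names are hypothetical and unrelated to any VisualSim block.

```python
import random

def simulate(buffer_len, service_cycles, arrival_prob, cycles=10_000, seed=1):
    """Return (mean latency in cycles, number of requests rejected by a full buffer)."""
    rng = random.Random(seed)         # fixed seed keeps runs repeatable
    queue = []                        # arrival cycle of each buffered request
    in_service = None                 # arrival cycle of the request being served
    busy_until = 0
    latencies, rejected = [], 0
    for now in range(cycles):
        if in_service is not None and now >= busy_until:
            latencies.append(now - in_service)
            in_service = None
        if rng.random() < arrival_prob:
            if len(queue) < buffer_len:
                queue.append(now)
            else:
                rejected += 1         # buffer full: the master would be stalled
        if in_service is None and queue:
            in_service = queue.pop(0)
            busy_until = now + service_cycles
    mean_latency = sum(latencies) / len(latencies) if latencies else 0.0
    return mean_latency, rejected

for buffer_len in (2, 8, 32):
    print(buffer_len, simulate(buffer_len, service_cycles=4, arrival_prob=0.2))
```

Sweeping one parameter at a time while holding the traffic seed fixed is what makes the resulting latency and rejection figures comparable between runs.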
16. Parameters to Define Memory controller
Stochastic model
◦ Delay for the controller
◦ Scheduling algorithm with buffer
◦ Memory Width
◦ Buffer length
Cycle-accurate
◦ Address breakdown by bits
◦ Fragmentation of large request
◦ Clock speed, bus width and memory width
◦ Buffer length
◦ Burst length
◦ Timing
◦ Refresh-related attributes
◦ Detailed scheduler design based on address and buffer settings
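Fragmentation of a large request can be sketched as splitting it at burst boundaries, where one burst moves memory width times burst length bits. The defaults below (32-bit channel, burst length 16, so 64 bytes per burst) are assumptions taken from the DDR5 figures earlier in the deck; the function name is illustrative.

```python
def fragment(start_addr: int, size_bytes: int, width_bits: int = 32, burst_length: int = 16):
    """Split one request into burst-aligned (address, length) pieces."""
    burst_bytes = width_bits // 8 * burst_length
    pieces = []
    addr, end = start_addr, start_addr + size_bytes
    while addr < end:
        next_boundary = (addr // burst_bytes + 1) * burst_bytes
        chunk_end = min(end, next_boundary)
        pieces.append((addr, chunk_end - addr))
        addr = chunk_end
    return pieces

print(fragment(0, 256))     # aligned request: four full bursts
print(fragment(48, 100))    # unaligned request: partial first and last pieces
```

An unaligned request produces partial pieces at its head and tail, which is one reason a cycle-accurate model needs the address breakdown as well as the burst length.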
17. Architecture Model of SoC
Master
Fabric
Exploration
Parameters
Memory Controller
DRAM Definition
Reports
20. Experiment with a Traffic Model
9/11/2020 MIRABILIS DESIGN INC. 20
[Figure: block diagram with labels DRAM, Display IO, AMBA AXI Bus, CPU, GPU, Display Ctrl, CAN, Packet, Ethernet]
21. Experiment with Detailed models of Processor, GPU and Interfaces
[Figure: block diagram with labels DRAM, Display IO, AMBA AXI Bus, CPU, GPU, Display Ctrl, PCIe, Video Camera, SRAM, Packet]
System Overview
◦ Camera: 30 fps, VGA-equivalent resolution
◦ CPU: ARM Cortex-A53, 1.2 GHz
◦ GPU: 64 cores (8 warps × 8 PEs), 32 threads, 1 GHz
◦ DisplayCtrl: display buffer of 293,888 bytes
◦ SRAM: SDR, 64 MB, 1.0 GHz
◦ DRAM: DDR3, 64 MB, 2.4 GHz
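The camera entry can be turned into an approximate traffic rate for the model. This is a back-of-the-envelope sketch: VGA is taken as 640 × 480 pixels, and the 2 bytes per pixel is an assumption (for example YUV 4:2:2) that the slides do not state.

```python
WIDTH, HEIGHT, FPS = 640, 480, 30     # VGA at 30 fps
BYTES_PER_PIXEL = 2                   # assumption, not given in the slides

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
camera_mb_s = frame_bytes * FPS / 1e6  # MB/s with 1 MB = 1e6 bytes

print(frame_bytes, camera_mb_s)
```

A different pixel format scales the result linearly, so the figure is best treated as an order-of-magnitude input to the traffic generator.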
22. Debugging Memory Controller Design
Review the latency, buffer usage and throughput
Compare the memory throughput with the Fabric
Modify attributes of the traffic, Fabric and Memory
25. About Mirabilis Design
Founded in 2003 and based in Sunnyvale, CA, USA.
Development and support centers in US, India, China, Korea and Czech Republic
Focused on system architecture exploration of electronics, semiconductors and software
40+ customers worldwide in Semiconductors, Aerospace, Computing and Automotive
VisualSim: modeling and simulation software
Largest source of system modeling IP with embedded timing and power
Hundreds of man-years of experience in system design and exploration of digital electronics
Select the “Right” configuration to match customer request
26. Introduction to VisualSim Architect
◦ Architect processors, hardware systems, software and network
◦ Map algorithms on integrated and distributed systems
◦ Compute resource requirements for application task graphs
◦ Test compliance to standards and generation of diagnostics
[Figure: VisualSim Architect capability areas: Timing and Throughput; Power measurement, management and Battery; Entire EE to Semiconductor; Functional and Safety Analysis; Libraries (Hardware, Software and Network); Graphical Modeling]
Adds functional, timing and power analysis to existing Model-based System Design
27. Largest Systems-Level Model Library
Largest library of traffic, resources, hardware, software and analysis
Traffic
• Distribution
• Sequence
• Trace file
• Instruction profile
Reports
• Timing and Buffer
• Throughput/Util
• Ave/peak power
• Statistics
Power
• State power table
• Power management
• Energy harvesters
• Battery
• RegEx operators
SoC Buses
• AMBA and CoreLink
• AHB, APB, AXI, ACE, CHI, CMN600
• Network-on-Chip
• TileLink
System Bus
• PCI/PCI-X/PCIe
• Rapid IO
• AFDX
• OpenVPX
• VME
• SPI 3.0
• 1553B
Processors
• GPU, DSP, µP and µC
• RISC-V
• Nvidia- Drive-PX
• PowerPC
• X86- Intel and AMD
• DSP- TI and ADI
• MIPS, Tensilica, SH
ARM
• M-, R-, 7TDMI
• A8, A53, A55, A72, A76, A77
Custom Creator
• Script language
• 600 RegEx fn
• Task graph
• Tracer
• C/C++/Java
• Python
Support
• Listener and Trace
• Debuggers
• Assertions
Stochastic
• FIFO/LIFO Queue
• Time Queue
• Quantity Queue
• System Resource
• Schedulers
• Cyber Security
RTOS
• Template
• ARINC 653
• AUTOSAR
Memory
• Memory Controller
• DDR DRAM 2, 3, 4, 5
• LPDDR 2, 3, 4
• HBM, HMC
• SDR, QDR, RDRAM
Storage
• Flash & NVMe
• Storage Array
• Disk and SATA
• Fibre Channel
• FireWire
Networking
• Ethernet & GigE
• Audio-Video Bridging
• 802.11 and Bluetooth
• 5G
• Spacewire
• CAN-FD
• TTEthernet
• FlexRay
• TSN & IEEE802.1Q
FPGA
• Xilinx- Zynq, Virtex, Kintex
• Intel-Stratix, Arria
• Microsemi- Smartfusion
• Programmable logic template
• Interface traffic generator
Software
• GEM5
• Software code integration
• Instruction trace
• Statistical software model
• Task graph
Interfaces
• Virtual Channel
• DMA
• Crossbar
• Serial Switch
• Bridge
RTL-like
• Clock, Wire-Delay
• Registers, Latches
• Flip-flop
• ALU and FSM
• Mux, DeMux
• Lookup table