This document discusses optimizations for PCIe 3.0 devices on Intel platforms. It describes the new PCIe protocol extensions, such as Transaction Layer Packet (TLP) Processing Hints, that can reduce latency and power consumption, and it covers device architecture considerations for using these extensions efficiently, including power management. Software must support capabilities such as Processing Hints through driver and firmware updates to realize the benefits.
1. SF 2009
PCI Express* 3.0 Technology: Device Architecture Optimizations on Intel Platforms
Mahesh Wagh
IO Architect
TCIS006
2. Agenda
• Next Generation PCI Express* (PCIe*) Protocol Extensions Summary
• Device Architecture Considerations
  – Energy Efficient Performance
  – Power Management
• Software Development
• Summary
• Call to Action
3. PCI Express* (PCIe*) 2.1 Protocol Extensions Summary
• Transaction Layer Packet (TLP) Processing Hints – Request hints to enable optimized processing within the host memory/cache hierarchy. Benefit: reduce access latency to system memory; reduce system interconnect & memory bandwidth and associated power consumption. Application class: NIC, storage, accelerators/GP-GPU.
• Latency Tolerance Reporting – Mechanisms for the platform to tune PM. Benefit: reduce platform power based on device service requirements. Application class: all devices/segments.
• Opportunistic Buffer Flush and Fill – Mechanisms for the platform to tune PM and to align device activities. Benefit: reduce platform power by aligning device activity with platform PM events. Application class: all devices/segments.
• Atomics – Atomic Read-Modify-Write mechanism. Benefit: reduced synchronization overhead; software library, algorithm and data structure re-use across cores and accelerators/devices. Application class: graphics, accelerators/GP-GPU.
• Resizable BAR – Mechanism to negotiate BAR size. Benefit: system resource optimizations – a breakaway from “all or nothing device address space allocation”. Application class: any device with large local memory (example: graphics).
• Multicast – Address-based multicast. Benefit: significant gain in efficiency compared to multiple unicasts. Application class: embedded, storage & multiple graphics adapters.
4. Continued…
• I/O Page Faults (Address Translation Services 1.1) – Extends IO address remapping for page faults. Benefit: system memory management optimizations. Application class: accelerators, GP-GPU usage models.
• Ordering Enhancements (IDO) – New ordering semantic to improve performance. Benefit: improved performance (latency reduction); ~20% read latency improvement by permitting unrelated reads to pass writes. Application class: all devices with two-party communication.
• Dynamic Power Allocation (DPA) – Mechanisms to allow dynamic power/performance management of D0 (active) substates. Benefit: dynamic component power/thermal control; manage endpoint function power usage to meet new customer or regulatory operation requirements. Application class: GP-GPU.
• Internal Error Reporting – Extends AER to report component internal errors (correctable/uncorrectable) and multiple error logs. Benefit: enables software to implement common and interoperable error handling services; improved error containment and recovery. Application class: RAS for switches.
• TLP Prefix – Mechanism to extend TLP headers. Benefit: scalable architecture headroom for TLP headers to grow with minimal impact to routing elements; supports vendor-specific header formats. Application class: MR-IOV, large topologies, and provisioning for future use models.
New Features with Broad Applicability
6. TLP Processing Hints (TPH)
[Diagram: PCIe device issuing hinted requests through the Root Complex toward the processor caches and system memory]
• TLP Processing Hints (PCIe Base 2.1 specification)
  – Apply to Memory Read, Memory Write and Atomic Operations
• System-specific Steering Tags (ST)
  – Identify a targeted resource, e.g. a system cache location
  – 256 unique Steering Tags
• Benefits – effective use of system resources:
  – Reduce access latency to system memory
  – Reduce memory & system interconnect bandwidth & power
Improves System Efficiency – Effective Use of System Fabric and Resources
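As a concrete illustration of the fields the slide describes, the sketch below bundles the logical TPH fields a requester attaches to a memory request. The field widths (a 1-bit TH flag, a 2-bit Processing Hint, and an 8-bit Steering Tag giving 256 unique tags) follow the PCIe 2.1 TPH definition; where the bits land inside the TLP header is device/IP specific and deliberately not modeled, and the helper name is invented for illustration.

```python
# Logical TPH fields per the PCIe 2.1 TPH definition. The PH encodings
# below (bi-directional / requester / target / target with priority)
# are the four 2-bit processing-hint code points.
PH_BIDIRECTIONAL = 0b00   # bi-directional data structure
PH_REQUESTER     = 0b01   # frequent access by the requester (device)
PH_TARGET        = 0b10   # frequent access by the target (host)
PH_TARGET_PRIO   = 0b11   # target access, with temporal re-use priority

def make_tph(ph: int, steering_tag: int) -> dict:
    """Bundle TPH fields for a hinted request (illustrative helper)."""
    if ph not in (PH_BIDIRECTIONAL, PH_REQUESTER, PH_TARGET, PH_TARGET_PRIO):
        raise ValueError("PH is a 2-bit field")
    if not 0 <= steering_tag <= 0xFF:
        raise ValueError("ST is an 8-bit field (256 unique tags)")
    return {"TH": 1, "PH": ph, "ST": steering_tag}
```

A device driver operating in one of the ST modes described on the following slides would pick the `ST` value from the platform-advertised steering-tag table.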
7. Requirements Checklist
[Diagram: software stack (Device Driver, OS/VMM, Firmware) above a platform containing a TPH-capable Root Complex with ST table and a TPH-capable PCIe device]
• Ecosystem
  – Platform support (Root Complex, routing elements)
  – System-specific Steering Tag advertisement
• Device Architecture
  – Specification compliance; PCIe capability support
  – Characterize workloads; application processor affinity
  – Steering Tag to workload association; device-specific ST assignments
  – Select modes of operation
• Software Development
  – Basic capability discovery, identification and management
  – System-specific Steering Tag advertisement and assignment
  – Device driver enhancements
8. No ST Mode
• No Steering Tags used
• Request steering is platform specific
• Basic capability enablement
• Minimal implementation cost and complexity
[Diagram: OS/VMM above a platform with a TPH-capable Root Complex and a TPH-capable (No ST) PCIe device]
Basic Support
9. Interrupt Vector Mode
• ST associated with the interrupt (MSI/MSI-X)
• Firmware provides platform-specific ST information to the OS/Hypervisor
• OS/Hypervisor assigns an ST along with the interrupt vector assignment
• Suitable for devices with workload/interrupt affinity to cores
[Diagram: Firmware and OS/VMM supplying the platform ST mapping to a TPH-capable Root Complex with ST table and a TPH-capable PCIe device]
TTM Advantage
10. Device Specific Mode
• Device-specific ST association
• Builds upon Interrupt Vector Mode s/w support
• Device driver determines processor affinity
• New API required to request ST assignments
• Independent of interrupt association
[Diagram: device driver requesting device-specific ST assignments through the OS/VMM and firmware ST mapping, down to a TPH-capable Root Complex with ST table and a TPH-capable PCIe device]
Scalable & Flexible Solution
11. TPH Aware Device Architecture
• Classify device-initiated transactions: bulk vs. control
• Select hints based on data structure use models:
  – Control structures (descriptors)
  – Headers for packet processing
  – Data payload (copies)
• Steering Tag modes:
  – No ST: basic hints only, no ST used
  – Interrupt Vector Mode: faster TTM with interrupt association
  – Device Specific Mode: scalable, flexible & dynamic; can provide a TTM advantage
• Software development:
  – Basic TPH capability identification, discovery and management
  – Firmware support to advertise ST assignments
  – OS/Hypervisor ST assignment support
  – Optional API support
TPH Permits Device Architecture Specific Trade-Offs
13. System Perspective
• Device behavior impacts the power consumption of other system components
  – Devices should consider their system power impact, not just their own device-level power consumption
  – Extreme example: Enhanced Host Controller Interface (EHCI)
• Systems (and devices) are idle most of the time
  – There’s a big opportunity for devices and systems to take advantage of that
• Latency Tolerance Reporting (LTR) enables lower power, longer exit latency system power states when devices can tolerate it
• Optimized Buffer Flush/Fill (OBFF) enables platform activity alignment, resulting in system power savings
Opportunity for Devices to Differentiate on Platform Power Savings
14. Platform Power Savings Opportunity
[Chart: platform power in watts vs. time in minutes – average power running MM07 (Office* productivity suite) on an Intel® Core™2 Duo platform running Windows* Vista*; idle workload power is ~8W–10W, leaving a large PM opportunity. Source: Intel Corporation]
• Usage analysis: a typical mobile platform in the S0 state is ~90% idle
• When idle, platform components are kept in a high power state to meet the service latency requirements of devices & applications
Power Consumption for the Idle Workload Is High
15. Optimized Buffer Flush/Fill
• The next several foils describe:
  – Platform activity alignment
  – PCI Express* (PCIe*) OBFF mechanisms
  – OBFF within the context of platform activity alignment
  – Device implementation impacts for OBFF
16. Platform Activity Alignment
Current platforms → power management opportunities → platforms with activity alignment. Platform traffic falls into classes:
• OS tick interrupts (critical) – time critical; buffer replenish; performance/throughput
• Device interrupts (critical) – time critical; status notifications; user command completions
• Device interrupts (deferrable) – not time critical; debug, statistics
• Device traffic – buffer over- (under-) run; throughput/performance
• Device traffic (deferrable) – not time critical; no buffering constraints; debug dumps
Creates PM Opportunities for Semi-Active Workloads
17. Optimized Buffer Flush/Fill (OBFF)
• OBFF: notify all Endpoints of optimal windows with minimal power impact
  – Solution 1: when possible, use WAKE# with new wire semantics
  – Solution 2: WAKE# not available – use the optional PCIe OBFF Message
• Optimal windows:
  – Processor Active – platform fully active; optimal for bus mastering and interrupts
  – OBFF – platform memory path available for memory reads and writes
  – Idle – platform is in a low power state
[Diagram: Root Complex signaling transition events to the PCIe device via WAKE# waveforms – Idle→OBFF, Idle→Processor Active, OBFF/Processor Active→Idle, OBFF→Processor Active, Processor Active→OBFF]
Greatest Potential Improvement When Implemented by All Platform Devices
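A device that consumes the message form of this signaling needs only a small decoder for the platform-state code points. The sketch below is illustrative; the specific code values (0x0 for Idle, 0x1 for OBFF, 0xF for CPU Active, others reserved) are my reading of the OBFF definition and should be verified against the PCI-SIG specification.

```python
# Map the assumed 4-bit OBFF Message code points to the three optimal
# windows described on the slide. Reserved codes are rejected.
OBFF_CODES = {0x0: "idle", 0x1: "obff", 0xF: "cpu_active"}

def decode_obff_message(code: int) -> str:
    """Return the platform window signaled by an OBFF Message code."""
    state = OBFF_CODES.get(code & 0xF)
    if state is None:
        raise ValueError(f"reserved OBFF code {code:#x}")
    return state
```

An endpoint would gate its deferrable traffic on the decoded state: issue memory reads/writes during "obff", interrupts and bus mastering during "cpu_active", and hold deferrable work during "idle".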
18. OBFF and Activity Alignment
[Diagram: interrupts and DMAs from PCIe devices A, B and C plus the OS timer tick interrupt, plotted over time. With no OBFF the traffic is scattered; with OBFF, WAKE# signaling groups the traffic into alternating idle, active and OBFF windows]
19. OBFF Device Implementation Impacts
• Classify device-initiated transactions: critical vs. deferrable
  – Perform critical transactions as necessary
  – Defer other transactions to align with platform activity
  – Decode platform idle / active / OBFF window signaling
• Maximize idle window duration for the platform
  – Align transactions with other devices in the system
  – Coalesce transactions into groups where possible: perform groups of transactions all at once, don’t trickle all the time
• Select data buffer depths to tolerate platform activity alignment
  – 300µs of deferral buffering recommended for Intel mobile platforms
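The deferral-and-coalescing guidance above can be sketched as a small scheduler: critical transactions go out immediately, deferrable ones are queued, and the queue is flushed as one burst either when a window opens or when the oldest deferred item approaches the 300µs buffering budget suggested for Intel mobile platforms. All class and method names here are invented for illustration, not part of any real driver API.

```python
from collections import deque

DEFER_BUDGET_US = 300  # recommended deferral buffering (this foil)

class ObffScheduler:
    """Illustrative OBFF-aware transaction scheduler."""
    def __init__(self):
        self.deferred = deque()  # (enqueue_time_us, transaction)
        self.issued = []         # stand-in for "sent on the link"

    def submit(self, txn, critical, now_us, window_open):
        if critical:
            self.issued.append(txn)          # never hold back critical work
        else:
            self.deferred.append((now_us, txn))
        self.maybe_flush(now_us, window_open)

    def maybe_flush(self, now_us, window_open):
        budget_hit = (self.deferred and
                      now_us - self.deferred[0][0] >= DEFER_BUDGET_US)
        if window_open or budget_hit:
            # coalesce: issue the whole group at once, don't trickle
            self.issued.extend(t for _, t in self.deferred)
            self.deferred.clear()
```

The budget check keeps the device functionally correct even if the platform never opens a window, which is why the slide ties deferral depth to data buffer sizing.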
20. Latency Tolerance Reporting
• The next several foils describe:
  – Power vs. response latency
  – PCI Express* (PCIe*) LTR mechanisms and semantics
  – Examples of device implementation schemes:
    • Application state driven LTR reporting
    • Data buffer depth driven LTR reporting
    • Software guided LTR reporting
  – Device implementation impact summary for LTR
21. Power vs. Response Latency (Mobile)
[Chart: platform response latency vs. platform power consumption. In the system active state (S0) the processor spans C0 (8–10W max power) to Cx (~600mW) at ~100µsec response latency; in system inactive states (Sx), S3 (~400mW) takes ~500msec and S4 ~10sec to respond]
• Low service latency requirements during active workloads give reliable performance
• High service latency requirements during idle workloads give more PM opportunity
• Devices and applications have fixed service latency expectations from a platform in S0
• The response latency penalty increases with lower power states
Variable Service Latency Requirements in S0 Are Optimal
22. Latency Tolerance Reporting (LTR)
• LTR mechanism: a PCI Express* (PCIe*) Message sent by an Endpoint carrying its tolerable latency
  – Capability to report both snooped & non-snooped values
  – “Terminate at Receiver” routing; a multi-function device (MFD) or Switch sends an aggregated message
• Benefits:
  – Device benefit: dynamically tune the platform PM state as a function of device activity level
  – Platform benefit: enables greater power savings without impact to performance/functionality
• Dynamic LTR example: with buffer 1 idle the device reports LTR (max); with buffer 2 active it reports LTR (activity adjusted)
LTR Enables Dynamic Power vs. Performance Tradeoffs at Minimal Cost Impact
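The latency values these messages carry use a compact value-and-scale format: a 10-bit latency value and a 3-bit scale, where scale s means a multiplier of 2^(5s) ns (1ns, 32ns, 1024ns, …; the two largest scale codes are reserved), plus a requirement bit. The sketch below encodes a tolerance in nanoseconds into that 16-bit field, rounding down so the device never over-reports its tolerance.

```python
def encode_ltr(latency_ns: int, require: bool = True) -> int:
    """Encode a latency tolerance into the 16-bit LTR field:
    bit 15 = requirement, bits 12:10 = scale, bits 9:0 = value."""
    for scale in range(6):                 # scale codes 6 and 7 are reserved
        unit = 1 << (5 * scale)            # 1ns, 32ns, 1024ns, 32768ns, ...
        if latency_ns // unit <= 0x3FF:    # does it fit in the 10-bit value?
            value = latency_ns // unit     # round down (conservative)
            return (int(require) << 15) | (scale << 10) | value
    raise ValueError("latency too large to encode")

def decode_ltr(field: int) -> int:
    """Recover the latency in nanoseconds from an LTR field."""
    value = field & 0x3FF
    scale = (field >> 10) & 0x7
    return value * (1 << (5 * scale))
```

Rounding down is the safe direction: reporting a slightly tighter tolerance only makes the platform less aggressive, never late.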
23. Application State Driven LTR
Example: WLAN device sending LTR with Wi-Fi legacy power save
[Timeline: around the beacon with TIM, PS-Poll, data and ACK exchanges the client is awake and reports a 100µsec latency tolerance; while the client is dozing between beacons it reports 1msec]
Example Use of Device PM States to Give Latency Guidance
24. Data Buffer Utilization Guided LTR
Example: active Ethernet NIC sending LTR
• A new LTR value is in effect no later than the value sent in the previous LTR message
[Timeline: while network data arrives and the device buffer crosses its LTR threshold, the device reports LTR = 100µs and platform components stay in higher-power, low-latency states; once the link has been idle past the device timeout (600µs shown), it reports LTR = 500µs and the platform drops to a very low power state]
Example Use of Buffering to Give Latency Guidance
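The buffer-utilization policy above boils down to one relation: while active, the tolerable latency is roughly the time until the receive buffer overflows, i.e. free buffer space divided by the line fill rate; once traffic idles past a timeout, the device reports a relaxed value. The sketch below assumes invented numbers and names purely for illustration.

```python
def latency_tolerance_us(free_bytes: int,
                         fill_rate_bytes_per_us: float,
                         idle: bool,
                         idle_value_us: int = 500) -> int:
    """Tolerable latency in microseconds for a buffer-guided device.

    Active: time until the buffer overflows at the current fill rate.
    Idle:   a relaxed fixed value (500us, as in the foil's example).
    """
    if idle:
        return idle_value_us
    return int(free_bytes / fill_rate_bytes_per_us)
```

At a hypothetical 1 Gb/s line rate (125 bytes/µs) with 12.5 KB of free buffer, this yields the 100µs figure shown on the timeline; deeper buffers directly buy the platform more latency tolerance, which is why buffer sizing appears again on the implementation-impacts foil.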
25. Software Guided Latency
• Three device categories:
  – Static: device can always support the max platform latency
  – Slow dynamic: latency requirements change infrequently
  – Fast dynamic: latency requirements change frequently
• Static and slow dynamic types of devices may choose SW guided messaging
• Policy logic for determining when to send latency messages (and what values) lives in software
  – E.g. use an MMIO register: a write to the register would trigger an LTR message
[Diagram: driver LTR policy logic writing an MMIO register on the PCI Express* device, which emits the LTR message toward the platform power management policy engine]
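The MMIO-register scheme above can be modeled in a few lines: the driver's policy code writes a latency value to a device register, and the write side-effect emits an LTR message upstream. The register layout and all names here are hypothetical, sketched only to show the division of labor between software policy and hardware messaging.

```python
class SwGuidedLtr:
    """Model of a device exposing a software-guided LTR register."""
    def __init__(self):
        self.sent_messages = []  # stand-in for LTR messages on the link

    def mmio_write_ltr(self, value_ns: int):
        # Hardware side-effect of the register write: transmit an LTR
        # message carrying the new tolerance.
        self.sent_messages.append(value_ns)

# Driver policy for a "slow dynamic" device: update the tolerance only
# on infrequent application state changes, not per transaction.
dev = SwGuidedLtr()
dev.mmio_write_ltr(100_000)    # workload started: tight tolerance
dev.mmio_write_ltr(3_000_000)  # workload stopped: relaxed tolerance
```

A fast-dynamic device would outgrow this scheme, which is why the foil steers it toward the hardware-driven reporting of the previous examples.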
26. LTR Device Implementation Impacts
• When idle, let the platform enter deep power saving states
  – Use MaxLatency (LTR Extended Capability field) when idle
• Require low latencies only when necessary – don’t keep the platform in a high power state longer than necessary
• Dynamic, hardware driven LTR:
  – Leverage application based opportunities to tolerate more latency, e.g. WLAN radio off between beacons
  – Implement a data buffering mechanism to comprehend LTR
• Software guided LTR:
  – Implement a simple MMIO register interface; register writes cause an LTR message to be sent
27. Software Enabling
• Features requiring basic software support (Capability Discovery, Identification and Management)
– 8GT/s speed upgrade
– Atomics
– Transaction Ordering Relaxations
– Internal Error Reporting
– TLP Prefix
• Features requiring additional support, above and beyond capability enablement (Resource Allocation, Enumeration & API)
– Transaction Processing Hints
– LTR & OBFF
– Resizable BAR
– IO Page Faults
– Dynamic Power Allocation
– Multicast
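The "Capability Discovery" step above is the same for every feature listed: software walks the PCIe extended capability list, which starts at configuration-space offset 0x100. Each 32-bit capability header carries the capability ID in bits [15:0], a version in bits [19:16], and the offset of the next capability in bits [31:20] (zero terminates the list). A sketch with configuration space modeled as a dict of 32-bit reads (the toy capability layout is ours):

```python
# Walk the PCIe extended capability list to locate a capability such
# as LTR (ID 0x0018) or TPH Requester (ID 0x0017).

EXT_CAP_START = 0x100  # extended capabilities begin here

def find_ext_cap(read32, cap_id):
    """Return the config-space offset of `cap_id`, or None if the
    device does not expose that extended capability."""
    offset = EXT_CAP_START
    while offset:
        header = read32(offset)
        if header & 0xFFFF == cap_id:
            return offset
        offset = (header >> 20) & 0xFFC  # next pointer, dword aligned
    return None

# Toy config space: an AER capability (ID 0x0001) at 0x100 chaining to
# an LTR capability (ID 0x0018) at 0x148, which ends the list.
cfg = {
    0x100: (0x148 << 20) | (0x1 << 16) | 0x0001,
    0x148: (0x000 << 20) | (0x1 << 16) | 0x0018,
}
print(find_ext_cap(cfg.get, 0x0018))  # 328 (0x148) - LTR found
print(find_ext_cap(cfg.get, 0x0017))  # None - no TPH capability here
```

Discovery alone is the "basic" tier; the second tier of features additionally needs resource allocation, enumeration, and API work in drivers and the OS after the capability is found.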
28. Summary
• Next Generation PCI Express* (PCIe*) Protocol Extensions Deliver Energy Efficient Performance
– Protocol Extensions with Broad Applicability
• Ecosystem Development is essential
– Platform Support
– Device architectures optimized around protocol features
– Software support and Enabling
29. Call to Action
• Device Architecture Considerations
– Develop device architecture to make the most of the proposed protocol extensions
• Differentiate products utilizing TPH
– Select Hints/ST modes based on device/market segment requirements
• Differentiate products utilizing LTR and OBFF
– Can differentiate by platform power impact, not just device power
– Power savings opportunity is huge
• Keep track of Next Generation PCI Express* Technology development
– PCI-SIG: www.pcisig.com
• Engage with Intel on Next Generation PCI Express product development
– www.intel.com/technology/pciexpress/devnet
30. Acknowledgements / Disclaimer
• I would like to acknowledge the contributions of the following
Intel employees
– Stephen Whalley, Intel Corporation
– Jasmin Ajanovic, Intel Corporation
– Anil Vasudevan, Intel Corporation
– Prashant Sethi, Intel Corporation
– Simer Singh, Intel Corporation
– Miles Penner, Intel Corporation
– Mahesh Natu, Intel Corporation
– Rob Gough, Intel Corporation
– Eric Wehage, Intel Corporation
– Jim Walsh, Intel Corporation
– Jaya Jeyaseelan, Intel Corporation
– Neil Songer, Intel Corporation
– Barnes Cooper, Intel Corporation
All opinions, judgments, recommendations, etc. that
are presented herein are the opinions of the presenter
of the material and do not necessarily reflect the
opinions of the PCI-SIG*.
31. Additional Sources of Information on This Topic
• Other Sessions
– TSIS007 – PCI Express* 3.0 Technology: PHY Implementation Considerations on Intel Platforms
– TSIS008 – PCI Express* 3.0 Technology: Electrical Requirements for Designing ASICs on Intel Platforms
– TCIQ002 – Q&A: PCI Express* 3.0 Technology
• www.intel.com/technology/pciexpress/devnet
33. Risk Factors
The above statements and any others in this document that refer to plans and expectations for the third quarter, the year and the
future are forward-looking statements that involve a number of risks and uncertainties. Many factors could affect Intel’s actual
results, and variances from Intel’s current expectations regarding such factors could cause actual results to differ materially from
those expressed in these forward-looking statements. Intel presently considers the following to be the important factors that could
cause actual results to differ materially from the corporation’s expectations. Ongoing uncertainty in global economic conditions pose
a risk to the overall economy as consumers and businesses may defer purchases in response to tighter credit and negative financial
news, which could negatively affect product demand and other related matters. Consequently, demand could be different from
Intel's expectations due to factors including changes in business and economic conditions, including conditions in the credit market
that could affect consumer confidence; customer acceptance of Intel’s and competitors’ products; changes in customer order
patterns including order cancellations; and changes in the level of inventory at customers. Intel operates in intensely competitive
industries that are characterized by a high percentage of costs that are fixed or difficult to reduce in the short term and product
demand that is highly variable and difficult to forecast. Additionally, Intel is in the process of transitioning to its next generation of
products on 32nm process technology, and there could be execution issues associated with these changes, including product defects
and errata along with lower than anticipated manufacturing yields. Revenue and the gross margin percentage are affected by the
timing of new Intel product introductions and the demand for and market acceptance of Intel's products; actions taken by Intel's
competitors, including product offerings and introductions, marketing programs and pricing pressures and Intel’s response to such
actions; and Intel’s ability to respond quickly to technological developments and to incorporate new features into its products. The
gross margin percentage could vary significantly from expectations based on changes in revenue levels; capacity utilization; start-up
costs, including costs associated with the new 32nm process technology; variations in inventory valuation, including variations
related to the timing of qualifying products for sale; excess or obsolete inventory; product mix and pricing; manufacturing yields;
changes in unit costs; impairments of long-lived assets, including manufacturing, assembly/test and intangible assets; and the
timing and execution of the manufacturing ramp and associated costs. Expenses, particularly certain marketing and compensation
expenses, as well as restructuring and asset impairment charges, vary depending on the level of demand for Intel's products and
the level of revenue and profits. The current financial stress affecting the banking system and financial markets and the going
concern threats to investment banks and other financial institutions have resulted in a tightening in the credit markets, a reduced
level of liquidity in many financial markets, and heightened volatility in fixed income, credit and equity markets. There could be a
number of follow-on effects from the credit crisis on Intel’s business, including insolvency of key suppliers resulting in product
delays; inability of customers to obtain credit to finance purchases of our products and/or customer insolvencies; counterparty
failures negatively impacting our treasury operations; increased expense or inability to obtain short-term financing of Intel’s
operations from the issuance of commercial paper; and increased impairments from the inability of investee companies to obtain
financing. The majority of our non-marketable equity investment portfolio balance is concentrated in companies in the flash memory
market segment, and declines in this market segment or changes in management’s plans with respect to our investments in this
market segment could result in significant impairment charges, impacting restructuring charges as well as gains/losses on equity
investments and interest and other. Intel's results could be impacted by adverse economic, social, political and
physical/infrastructure conditions in countries where Intel, its customers or its suppliers operate, including military conflict and other
security risks, natural disasters, infrastructure disruptions, health concerns and fluctuations in currency exchange rates. Intel's
results could be affected by adverse effects associated with product defects and errata (deviations from published specifications),
and by litigation or regulatory matters involving intellectual property, stockholder, consumer, antitrust and other issues, such as the
litigation and regulatory matters described in Intel's SEC reports. A detailed discussion of these and other risk factors that could
affect Intel’s results is included in Intel’s SEC filings, including the report on Form 10-Q for the quarter ended June 27, 2009.
Rev. 7/27/09