Rsockets provides a socket-like API for RDMA networking. It lets applications keep familiar socket programming concepts while achieving performance comparable to native RDMA. Existing socket applications can run over RDMA with minimal changes: an interception library captures their socket calls and maps them to the corresponding rsocket functions. Initial benchmarks show rsockets achieving lower latency and higher bandwidth than IPoIB, and performance competitive with native InfiniBand. Several MPI and HPC applications run over rsockets with no change other than linking to the interception library.
3. More Specifically…
Programming to Verbs
struct ibv_device **dev_list;
struct ibv_context *ib_ctx = NULL;
struct ibv_device_attr dev_attr;
struct ibv_port_attr port_attr;
int i, p, ret;
dev_list = ibv_get_device_list(NULL);
if (!dev_list)
error();
for (i = 0; dev_list[i]; i++) {
ib_ctx = ibv_open_device(dev_list[i]);
if (!ib_ctx)
error();
ret = ibv_query_device(ib_ctx, &dev_attr);
if (ret)
error();
Get a list of devices
and their attributes
4. More Specifically…
/* Port numbers start at 1 and run through phys_port_cnt inclusive */
for (p = 1; p <= dev_attr.phys_port_cnt; p++) {
ret = ibv_query_port(ib_ctx, p, &port_attr);
if (ret)
error();
if (port_attr.state == IBV_PORT_ACTIVE)
goto done;
}
ibv_close_device(ib_ctx);
ib_ctx = NULL;
}
done:
ibv_free_device_list(dev_list);
if (!ib_ctx)
error();
Select a port and
get its attributes
5. More Specifically…
struct ibv_pd *pd;
struct ibv_comp_channel *comp_channel;
struct ibv_cq *cq;
pd = ibv_alloc_pd(ib_ctx);
if (!pd)
error();
comp_channel = ibv_create_comp_channel(ib_ctx);
if (!comp_channel)
error();
cq = ibv_create_cq(ib_ctx, min(min(MY_SQ_SIZE + MY_RQ_SIZE,
dev_attr.max_qp_wr), dev_attr.max_cqe),
NULL, comp_channel, 0);
if (!cq)
error();
We need:
- protection domain
- completion channel
- completion queue
7. More Specifically…
void *msgs;
struct ibv_mr *mr;
msgs = calloc(qp_init_attr.cap.max_recv_wr, MY_MSG_SIZE);
if (!msgs)
error();
mr = ibv_reg_mr(pd, msgs, qp_init_attr.cap.max_recv_wr * MY_MSG_SIZE,
IBV_ACCESS_LOCAL_WRITE);
if (!mr)
error();
Allocate some messages
to receive data…
and register them
with the device
8. More Specifically…
struct ibv_recv_wr recv_wr, *bad_wr;
struct ibv_sge sge;
recv_wr.next = NULL;
recv_wr.sg_list = &sge;
recv_wr.num_sge = 1;
recv_wr.wr_id = 0;
sge.length = MY_MSG_SIZE;
sge.lkey = mr->lkey;
sge.addr = (uintptr_t) msgs;
for (i = 0; i < qp_init_attr.cap.max_recv_wr; i++) {
ret = ibv_post_recv(qp, &recv_wr, &bad_wr);
if (ret)
error();
sge.addr += MY_MSG_SIZE;
}
Post the messages
on the queue pair
*before* we connect
10. More Specifically…
void *msg;
struct ibv_mr *mr;
msg = calloc(1, MY_MSG_SIZE);
if (!msg)
error();
mr = ibv_reg_mr(pd, msg, MY_MSG_SIZE,
IBV_ACCESS_LOCAL_WRITE);
if (!mr)
error();
Allocate a send buffer…
and register it
with the device
11. More Specifically…
struct ibv_send_wr send_wr, *bad_wr;
struct ibv_sge sge;
send_wr.next = NULL;
send_wr.sg_list = &sge;
send_wr.num_sge = 1;
send_wr.wr_id = 0;
send_wr.opcode = IBV_WR_SEND;
send_wr.send_flags = IBV_SEND_SIGNALED;
sge.length = MY_MSG_SIZE;
sge.lkey = mr->lkey;
sge.addr = (uintptr_t) msg;
<format_msg(msg, 0);>
ret = ibv_post_send(qp, &send_wr, &bad_wr);
if (ret)
error();
All this just to send?
12. More Specifically…
struct ibv_wc wc;
struct ibv_cq *cq;
void *context;
int ret;
do {
ret = ibv_poll_cq(cq, 1, &wc);
if (ret)
break;
ret = ibv_req_notify_cq(cq, 0);
if (ret)
error();
ret = ibv_poll_cq(cq, 1, &wc);
if (ret)
break;
Wait for the send to complete
or we receive a response
Remember to poll the
completion queue after
requesting notification
13. More Specifically…
ret = ibv_get_cq_event(comp_channel, &cq, &context);
if (ret)
error();
ibv_ack_cq_events(cq, 1);
} while (1);
if (ret < 0)
error();
Wait for an event and
check the completion
queue again
And it’s just that easy to send data!
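Why the second poll matters can be shown with a toy model. This is a minimal sketch, not verbs code: `struct mock_cq` and the `mock_*` helpers are made-up stand-ins for a completion queue, assuming only that a notification fires for completions that arrive after the queue has been armed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a completion queue. NOT the verbs API. */
struct mock_cq {
    int  pending;  /* completions waiting to be polled */
    bool armed;    /* true after notification was requested */
    int  events;   /* events delivered on the mock completion channel */
};

/* Device side: a completion arrives. An event fires only if armed. */
static void mock_complete(struct mock_cq *cq)
{
    cq->pending++;
    if (cq->armed) {
        cq->armed = false;  /* notification is one-shot */
        cq->events++;
    }
}

/* Plays the role of ibv_poll_cq(cq, 1, &wc): 1 if a completion was reaped. */
static int mock_poll(struct mock_cq *cq)
{
    assert(cq->pending >= 0);
    if (cq->pending == 0)
        return 0;
    cq->pending--;
    return 1;
}

/* Plays the role of ibv_req_notify_cq(): arms for later completions only. */
static void mock_arm(struct mock_cq *cq)
{
    cq->armed = true;
}

/*
 * Poll, arm, poll again. Returns true only when it is safe to block
 * waiting for an event. The optional race callback models a completion
 * that slips in between the first poll and the arm.
 */
static bool must_block(struct mock_cq *cq, void (*race)(struct mock_cq *))
{
    if (mock_poll(cq))
        return false;
    if (race != NULL)
        race(cq);
    mock_arm(cq);
    if (mock_poll(cq))      /* catches the completion the arm missed */
        return false;
    return true;
}
```

In the racing case no event is ever generated, so a loop that skipped the second poll and went straight to waiting would sleep on a completion that is already sitting in the queue.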
14. Motivation continued
• And it’s just as bad on the receive side
Now, anyone want to DO an actual
RDMA operation?
15. Motivation continued
Actually, I just wanted to echo typing
between two systems connected by IB
that did not have ipoib (or sdp) but this
wouldn’t make as good an intro
16. Big Intro…
• RDMA sockets API
– Another API - ~joy~
• Calls that look and behave like sockets
• Connects like sockets
• Byte streaming transfers like sockets
– I.e. SOCK_STREAM
• Support for nonblocking operation like sockets
Ta-da!
Like sockets … except that it’s not
RSOCKETS!
17. Goals
• Socket programming concepts with minimal to
no need to learn anything about RDMA
– Let’s face it, no matter how many APIs we create
developers will still learn sockets
– Sockets will continue as the common fallback API
• Support existing socket applications under ideal
conditions
• SDP license free!
Support well-known network
programming concepts
18. Goals
• Outperform ipoib (and sdp)
– Or it’s pointless, except for limited environments
• Perform favorably compared to native RDMA
implementation
– Or there’s not a strong enough reason NOT to learn
RDMA programming
– Narrow the cost-benefit gap of maintaining verbs
support in an application long term
High performance
19. RSOCKETS Overview
• Proprietary protocol / algorithm
– I made it up
– Will be open sourced
• Entirely user-space
implementation
– Well, if we ignore the existing RDMA
support
– No need to merge anything
upstream!
RSOCKETS
Verbs
RDMA
CM
RDMA Device
Kernel
bypass
21. Supported Features
Functions take same parameters as sockets
• PF_INET, PF_INET6, SOCK_STREAM, IPPROTO_TCP
• MSG_DONTWAIT, MSG_PEEK
• SO_REUSEADDR, TCP_NODELAY, SO_ERROR
• SO_SNDBUF, SO_RCVBUF
• O_NONBLOCK
Implementation based on needs of
OSU and Intel MPI
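Because the calls take the same parameters as sockets, code written against a small table of function pointers can switch between the two. The sketch below is illustrative only: `struct sock_ops`, `tcp_ops`, and `request_reply` are hypothetical names, and the rsockets variant appears only in a comment, assuming the `rsocket`, `rsend`, `rrecv`, and `rclose` declarations from librdmacm's `<rdma/rsocket.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical dispatch table. It is not part of rsockets. */
struct sock_ops {
    int     (*socket_fn)(int domain, int type, int protocol);
    ssize_t (*send_fn)(int fd, const void *buf, size_t len, int flags);
    ssize_t (*recv_fn)(int fd, void *buf, size_t len, int flags);
    int     (*close_fn)(int fd);
};

/* Ordinary sockets. */
static const struct sock_ops tcp_ops = { socket, send, recv, close };

/*
 * Assumed rsockets equivalent (needs <rdma/rsocket.h> and -lrdmacm):
 *   static const struct sock_ops rdma_ops = { rsocket, rsend, rrecv, rclose };
 * rconnect, rbind, rlisten and raccept follow the same pattern.
 */

/* Send a request on a connected descriptor and read back one reply. */
static ssize_t request_reply(const struct sock_ops *ops, int fd,
                             const void *req, size_t req_len,
                             void *reply, size_t reply_cap)
{
    assert(ops != NULL && ops->send_fn != NULL && ops->recv_fn != NULL);

    if (ops->send_fn(fd, req, req_len, 0) != (ssize_t)req_len)
        return -1;
    return ops->recv_fn(fd, reply, reply_cap, 0);
}
```

A caller would pick the table once at startup and pass it to every helper, so the rest of the application never names a transport.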
23. More words from our sponsor…
Optimization Notice
Intel’s compilers may or may not optimize to the same degree for non-
Intel microprocessors for optimizations that are not unique to Intel
microprocessors. These optimizations include SSE2, SSE3, and SSSE3
instruction sets and other optimizations. Intel does not guarantee the
availability, functionality, or effectiveness of any optimization on
microprocessors not manufactured by Intel. Microprocessor-dependent
optimizations in this product are intended for use with Intel
microprocessors. Certain optimizations not specific to Intel
microarchitecture are reserved for Intel microprocessors. Please refer to
the applicable product User and Reference Guides for more information
regarding the specific instruction sets covered by this notice.
Notice revision #20110804
8-node Xeon X5570 @ 2.93 GHz (Nehalem) cluster
8 cores / node
40 Gbps InfiniBand
2-node latency and BW tests (rstream / perftest)
64-process MPI runs
24. What’s the Performance?
Promising latency
and bandwidth
Can it work with
existing apps?
At all? Well?
[Chart: 64-Byte Ping-Pong Latency (us) for IPoIB, SDP, RSOCKET, and IB]
[Chart: Bandwidth (Gbps) versus message size, 64 B to 1 MB, for IPoIB, SDP, RSOCKET, and IB]
N/2: 500 vs 650 B
Note: implementation has minimal optimizations
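N/2 is conventionally the message size at which a transport reaches half of its peak bandwidth, so a smaller value means small messages use the link more efficiently. As a rough guide, and assuming a simple fixed-overhead-plus-streaming cost model that the slides do not state, it works out to the per-message overhead multiplied by the peak bandwidth:

```latex
T(n) = t_0 + \frac{n}{B_{\max}}, \qquad
B(n) = \frac{n}{T(n)} = \frac{B_{\max}\, n}{n + t_0 B_{\max}}, \qquad
B(N_{1/2}) = \frac{B_{\max}}{2} \;\Longrightarrow\; N_{1/2} = t_0\, B_{\max}
```

Here $t_0$ is the fixed per-message cost and $B_{\max}$ the peak bandwidth. Both symbols are introduced only for this illustration.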
25. Supporting Existing Apps
MPI or socket application
LD_PRELOAD RSOCKET
conversion library
RSOCKET
RDMA Verbs / RDMA CM
Socket API
Real Socket API
Limited fallback
support
Export socket
calls and map
them to rsockets
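The mapping step can be sketched as a per-descriptor lookup. This is only an illustration under stated assumptions: `fd_table`, `mark_rsocket`, `intercepted_send`, and both stubs are hypothetical, and the stubs merely count calls. A real preload library would export a function actually named `send`, call `rsend` for RDMA-backed descriptors, and typically locate the original libc function with `dlsym(RTLD_NEXT, "send")` for the fallback path.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

enum fd_kind { FD_NORMAL = 0, FD_RSOCKET };

#define MAX_FDS 64

/* Which descriptors are backed by rsockets; zero-initialised to FD_NORMAL. */
static enum fd_kind fd_table[MAX_FDS];

static int real_calls;  /* times the fallback path was taken */
static int rdma_calls;  /* times the rsockets path was taken */

/* Stub standing in for the real send(). */
static ssize_t stub_real_send(int fd, const void *buf, size_t len)
{
    (void)fd; (void)buf;
    real_calls++;
    return (ssize_t)len;
}

/* Stub standing in for rsend(). */
static ssize_t stub_rsend(int fd, const void *buf, size_t len)
{
    (void)fd; (void)buf;
    rdma_calls++;
    return (ssize_t)len;
}

/* Record that a descriptor was created through the rsockets path. */
static void mark_rsocket(int fd)
{
    assert(fd >= 0 && fd < MAX_FDS);
    fd_table[fd] = FD_RSOCKET;
}

/* What the exported send() wrapper would do: dispatch on descriptor type. */
static ssize_t intercepted_send(int fd, const void *buf, size_t len)
{
    if (fd >= 0 && fd < MAX_FDS && fd_table[fd] == FD_RSOCKET)
        return stub_rsend(fd, buf, len);
    return stub_real_send(fd, buf, len);  /* limited fallback */
}
```

Keeping the decision per descriptor lets one process mix RDMA-backed connections with ordinary sockets, which is what the fallback path needs.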
26. IMB - Intel MPI Benchmarks
• Measure important MPI functionality
• Results for arbitrarily selected sizes
• IPoIB performance was much worse
– Omitted for space
• SDP tests failed for 64 ranks
– Had lower performance for fewer ranks
Results in microseconds -
lower is better
29. What About a “Real” App?
• HPC Challenge benchmarks
– Set of higher-level benchmarks
• As close to a “real” app as I could easily run
• Selected results reported
– SDP failed to run
– IPoIB results included
32. Closing the Performance Gap
• Notable area for improvement:
– Direct data placement (reduce memory copies)
• Possible, but…
• Most target applications use nonblocking
sockets
– Restricts use with recv()
– Which reduces usefulness with send()
• Alternatives?
33. Closing the Performance Gap
• Is there any way to add direct access to RDMA
operations through sockets?
– Get that last bit of performance
• While keeping it simple?
• And.. without actually needing to know anything
about RDMA?
– Or these acronyms: PD, CQ, HCA, MR, QP, LID, GID, …
• And make it generic, so that other technologies may
be able to use it
– Tag matching, file I/O, SSDs
• And continue to support the socket programming
model!
34. Direct Data Placement Extensions
• Can we find calls that blend in with existing calls?
• Now we may be talking about new programming
concepts
• Are there any existing calls that are usable?
– send, sendto, sendmsg, write, writev, pwrite …
– recv, recvfrom, recvmsg, read, readv, pread …
– mmap, lseek, fseek, fgetpos, fsetpos, fsync …
This is a discussion point only
Although not used with sockets, these
calls may be used as guides
35. Direct Data Placement APIs
• Map memory to a specified offset
• Specify access restrictions
• Maps to memory registration
rmmap
• Read from an offset into a local buffer
• Maps to RDMA read operation
rget
• Write from a local buffer to the given offset
• Maps to RDMA write operation
rput
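How the three calls fit together can be mocked up in a single process. Everything below is an assumption made for illustration: the slides give no signatures, so the `sk_` names, argument orders, and access flags are invented, and a plain `memcpy` stands in for the RDMA read and write that would cross the network.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define SK_READ  0x1  /* peer may read the mapped memory (rget) */
#define SK_WRITE 0x2  /* peer may write the mapped memory (rput) */

/* Hypothetical connection state: one region mapped at one offset. */
struct sk_conn {
    void  *buf;
    size_t len;
    off_t  offset;
    int    access;
    int    mapped;
};

/* rmmap sketch: expose buf at the given offset (memory registration). */
static int sk_rmmap(struct sk_conn *c, void *buf, size_t len,
                    int access, off_t offset)
{
    if (buf == NULL || len == 0 || offset < 0) {
        errno = EINVAL;
        return -1;
    }
    c->buf = buf; c->len = len; c->offset = offset;
    c->access = access; c->mapped = 1;
    return 0;
}

/* Translate an offset range to an address, enforcing access and bounds. */
static void *sk_resolve(const struct sk_conn *c, off_t offset,
                        size_t count, int need)
{
    size_t rel;

    assert(c != NULL);
    if (!c->mapped || !(c->access & need)) {
        errno = EACCES;
        return NULL;
    }
    if (offset < c->offset) {
        errno = ERANGE;
        return NULL;
    }
    rel = (size_t)(offset - c->offset);
    if (rel > c->len || count > c->len - rel) {
        errno = ERANGE;
        return NULL;
    }
    return (char *)c->buf + rel;
}

/* rput sketch: write from a local buffer to the offset (RDMA write). */
static ssize_t sk_rput(struct sk_conn *c, const void *local,
                       size_t count, off_t offset)
{
    void *dst = sk_resolve(c, offset, count, SK_WRITE);
    if (dst == NULL)
        return -1;
    memcpy(dst, local, count);
    return (ssize_t)count;
}

/* rget sketch: read from the offset into a local buffer (RDMA read). */
static ssize_t sk_rget(struct sk_conn *c, void *local,
                       size_t count, off_t offset)
{
    const void *src = sk_resolve(c, offset, count, SK_READ);
    if (src == NULL)
        return -1;
    memcpy(local, src, count);
    return (ssize_t)count;
}
```

The offset plays the role that a file position plays for pread and pwrite: the caller names a location, and the library resolves it to registered memory.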
36. Direct Data Placement
• Extends current usage model
– No change to connecting or send/recv calls
– Memory region data exchanged underneath
• Appears usable for multiple technologies
• Seems easy to learn and use
Sounds great, you should get to
work on this right away!
37. The Real Problem
Target
applications use
nonblocking
sockets
Direct data
placement calls
may not block
Notification of
completion
should come
from select() and
poll() calls
Would need to determine how to handle
nonblocking calls without an indecent
exposure to RDMA
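For reference, the nonblocking pattern that target applications already rely on looks roughly like this with ordinary sockets. It is a minimal sketch: `set_nonblocking` and `recv_when_ready` are hypothetical helper names, and the code uses plain POSIX calls, on the assumption that an rsockets application would follow the same shape with `rpoll` and `rrecv`.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Put a descriptor into nonblocking mode (O_NONBLOCK). */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Wait up to timeout_ms for readable data, then receive without blocking.
 * Returns bytes received, 0 if the peer closed the connection, or -1 with
 * errno set (EAGAIN when nothing became ready before the timeout).
 */
static ssize_t recv_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int n;

    assert(buf != NULL);
    n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;          /* poll failed; errno already set */
    if (n == 0) {
        errno = EAGAIN;     /* timed out with nothing to read */
        return -1;
    }
    return recv(fd, buf, len, MSG_DONTWAIT);
}
```

The application learns that work is ready from poll() and never blocks inside the data call, which is the behavior a direct data placement call would have to preserve.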
38. Requests to Verbs
• Asynchronous memory registration
– Assist with direct data placement
• A single file descriptor for all RDMA resources
– Event queue, completion queue, connections
– Simplifies implementation
• Way to transfer control of a set of RDMA
resources to another process
– Help support apps that fork
39. What’s Your Opinion?
Does rsockets have a
place going forward?
• It’s really 5 years
too late
• In limited
environments
• Absolutely
What’s the best way
to add direct data
placement?
• Not at all
• Best solution using
existing socket calls
• Extensions
What other features
are worth
implementing?
• Datagram support?
• Out of band data?
• Fork?