4. Introduction – who am I?
● Five years in the industry
● Spent all of that using SPUs, GPUs, vector units & DSPs
● Last two years focused on open standards (mostly OpenCL)
● Passionate about making compute easy
Neil Henning
neil@codeplay.com
5. Introduction – who are we?
● GPU Compiler Experts based out of Edinburgh, Scotland
● 35 employees working on contracts, R&D and internal tech
7. Current Landscape
● Languages – CUDA, RenderScript, C++AMP & OpenCL
● Targets – GPU (mobile & desktop), CPU (scalar & vector), DSPs, FPGAs
● Concerns – performance, power, precision, parallelism & portability
8. Current Landscape - CUDA
__global__ void kernel(char * a, char * b)
{
    // one element per block: the launch below uses SIZE blocks of 1 thread
    a[blockIdx.x] = b[blockIdx.x];
}

char in[SIZE], out[SIZE];
char * cIn, * cOut;
cudaMalloc((void **)&cIn, SIZE);
cudaMalloc((void **)&cOut, SIZE);
cudaMemcpy(cIn, in, SIZE, cudaMemcpyHostToDevice);   // host -> device
kernel<<<SIZE, 1>>>(cOut, cIn);
cudaMemcpy(out, cOut, SIZE, cudaMemcpyDeviceToHost); // device -> host
cudaFree(cIn);
cudaFree(cOut);
● CUDA incredibly established
● First major GPU compute approach to market
● Huge bank of tools, libraries and knowledge
● Used in banking, medical imaging, game asset creation, and many many more uses!
● Using CUDA means abandoning compute on majority of devices
● Really only had uptake in offline processing
● Standard isn’t open, little room (or enthusiasm) for other vendors to implement
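The launch `kernel<<<SIZE, 1>>>` in the slide's code asks for SIZE blocks of one thread each, which is why the kernel can index with `blockIdx.x` alone. The sketch below is a serial CPU emulation of that index mapping in plain C++, not CUDA itself: `LaunchIndex`, `globalIndex` and `emulateCopyLaunch` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for CUDA's built-in 1-D index variables.
struct LaunchIndex {
    unsigned blockIdx;   // which block this "thread" belongs to
    unsigned blockDim;   // number of threads per block
    unsigned threadIdx;  // position of the thread within its block
};

// Conventional flattened 1-D index: blockIdx.x * blockDim.x + threadIdx.x.
inline unsigned globalIndex(const LaunchIndex& i) {
    return i.blockIdx * i.blockDim + i.threadIdx;
}

// Serial emulation of `kernel<<<gridDim, blockDim>>>(out, in)` for a copy
// kernel. Precondition (an assumption of this sketch): the launch covers
// the input exactly, i.e. gridDim * blockDim == in.size().
std::vector<char> emulateCopyLaunch(const std::vector<char>& in,
                                    unsigned gridDim, unsigned blockDim) {
    assert(static_cast<std::size_t>(gridDim) * blockDim == in.size());
    std::vector<char> out(in.size());
    for (unsigned b = 0; b < gridDim; ++b) {
        for (unsigned t = 0; t < blockDim; ++t) {
            const unsigned idx = globalIndex({b, blockDim, t});
            out[idx] = in[idx];
        }
    }
    return out;
}
```

With `blockDim` equal to 1 the thread index is always 0, so `globalIndex` reduces to the block index, matching the slide's kernel. A launch of 2 blocks of 4 threads covers the same 8 elements through the general formula.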
9. Current Landscape - RenderScript
// Kernel script (foo.rs)
#pragma version(1)
#pragma rs java_package_name(foo)
rs_allocation gIn; rs_allocation gOut;
rs_script gScript;
void root(const char * in, char * out, const void * usr, uint32_t x, uint32_t y) {
    *out = *in;
}
void filter() {
    rsForEach(gScript, gIn, gOut, NULL);
}

// Java host code
Context ctxt = /* … */;
RenderScript rs = RenderScript.create(ctxt);
ScriptC_foo script = new ScriptC_foo(rs, getResources(), R.raw.foo);
Allocation in = Allocation.createSized(rs, Element.I8(rs), SIZE);
Allocation out = Allocation.createSized(rs, Element.I8(rs), SIZE);
script.set_gIn(in); script.set_gOut(out);
script.set_gScript(script);
script.invoke_filter();
● Intelligent runtime load balances kernels
● Only on Android
● Creates Java classes to interface with kernels
● Limited documentation & shortage of examples
● Focused on performance portability
● No real idea of feature roadmap
10. Current Landscape – C++AMP
int in[SIZE], out[SIZE];
array_view<const int, 1> aIn(SIZE, in);
array_view<int, 1> aOut(SIZE, out);
aOut.discard_data();
parallel_for_each(aOut.extent,
    [=](index<1> idx) restrict(amp)
    {
        aOut[idx] = aIn[idx];
    }
);
// can access aOut[…] like normal

● Very well thought out single source approach
● Lovely use of C++ templates to capture type information, array dimensions
● Great use of C++11 lambdas for capturing kernel intent
● Part of target community is really C++11 averse, need convincing
● Limited low-level support
● Initial interest by community faded fast
● Xbox One will support C++AMP – watch this space
11. Current Landscape - OpenCL
void kernel foo(global int * a, global int * b)
{
    int idx = get_global_id(0);
    a[idx] = b[idx];
}

// device, context, queue, in, out already created
cl_program program = clCreateProgramWithSource(context, 1, fooAsStr, NULL, NULL);
clBuildProgram(program, 1, &device, NULL, NULL, NULL);
cl_kernel kernel = clCreateKernel(program, "foo", NULL);
// set kernel arguments
clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &size, NULL, 0, NULL, NULL);
● Open standard with many contributors
● API is verbose, very very verbose!
● API puts control in developer hands
● Steep learning curve for new developers
● Support on lots of heterogeneous platforms – not just GPUs!
● Have to support diverse range of application types
12. Current Landscape
Modern systems have many compute-capable devices in them
Not unlike the fictitious system shown above!
13. Current Landscape
Scalar CPUs are the ‘normal’ target for programmers, easy to target, easy to use
Mostly a fallback target for compute currently
Neil Henning
neil@codeplay.com
14. Current Landscape
Scalar CPUs are the ‘normal’ target for programmers, easy
to target, easy to use
Mostly a fallback target for
compute currently
Vector units are supported if
kernel has vector types
Can auto-vectorize user kernels,
as vector units harder for ‘normal’ programmers to target
Neil Henning
neil@codeplay.com
15. Current Landscape
Scalar CPUs are the ‘normal’ target for programmers, easy
to target, easy to use
Mostly a fallback target for
compute currently
Vector units are supported if
kernel has vector types
Can auto-vectorize user kernels,
as vector units harder for ‘normal’ programmers to target
Can make no assumptions as to
what DSPs ‘look’ like
Digital Signal Processors (DSPs)
are a future target for the compute market
Neil Henning
neil@codeplay.com
16. Current Landscape
Scalar CPUs are the ‘normal’ target for programmers, easy to target, easy to use
Mostly a fallback target for compute currently
Vector units are supported if kernel has vector types
Can auto-vectorize user kernels, as vector units harder for ‘normal’ programmers to target
Digital Signal Processors (DSPs) are a future target for the compute market
Can make no assumptions as to what DSPs ‘look’ like
GPUs are the reason we have compute in the first place
GPUs do not forgive poor code like a CPU or even a DSP could, require large arrays of work to utilize
17. Current Landscape
●
Have to weigh up many competing concerns for languages
●
Platform, operating system, device type, battery life, use case
18. What is wrong with the current landscape
19. What is wrong with the current landscape
●
Compute approaches are not available on all device and OS combinations
●
No CUDA on AMD, RenderScript on iOS or C++AMP on Linux
●
Have to support offline precise compute & time-bound online compute
●
Very divergent targets/use cases/device types are problematic!
20. What is wrong with the current landscape
●
What if loop count is always multiple of four?
void foo(int * a, int * b, int * count)
{
for(int idx = 0; idx < *(count); ++idx)
{
a[idx] = 42 * b[idx];
}
}
22. What is wrong with the current landscape
●
What if loop count is always multiple of four?
●
Can unroll the loop four times!
void foo(int * a, int * b, int * count)
{
for(int idx = 0; idx < *(count); idx += 4)
{
a[idx + 0] = 42 * b[idx + 0];
a[idx + 1] = 42 * b[idx + 1];
a[idx + 2] = 42 * b[idx + 2];
a[idx + 3] = 42 * b[idx + 3];
}
}
●
What if pointers a & b are sixteen byte aligned?
25. What is wrong with the current landscape
●
What if loop count is always multiple of four?
●
Can unroll the loop four times!
●
What if pointers a & b are sixteen byte aligned?
●
Can vectorize the loop body!
void foo(int * a, int * b, int * count)
{
int vecCount = *(count) / 4;
int4 * vA = (int4 * )a;
int4 * vB = (int4 * )b;
for(int idx = 0; idx < vecCount; ++idx)
{
vA[idx] = vB[idx] * (int4 )42;
}
}
●
Why does my code look so radically different now?
●
Current languages force drastic developer interventions
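The assumptions behind the transformations above can be made explicit in plain C. The sketch below is illustrative only: the names foo_scalar, foo_unrolled4 and is_aligned16 are hypothetical and not from the slides. It checks the "multiple of four" assumption before taking the unrolled path and offers a helper for the sixteen-byte alignment question.

```c
#include <assert.h>
#include <stdint.h>

/* Reference version: one element per iteration, as in the original loop. */
void foo_scalar(int *a, const int *b, const int *count)
{
    for (int idx = 0; idx < *count; ++idx) {
        a[idx] = 42 * b[idx];
    }
}

/* Hypothetical helper: returns 1 when p sits on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16u) == 0;
}

/* Unrolled version: only valid when *count is a multiple of four. */
void foo_unrolled4(int *a, const int *b, const int *count)
{
    assert(*count % 4 == 0); /* the assumption the slide asks about */
    for (int idx = 0; idx < *count; idx += 4) {
        a[idx + 0] = 42 * b[idx + 0];
        a[idx + 1] = 42 * b[idx + 1];
        a[idx + 2] = 42 * b[idx + 2];
        a[idx + 3] = 42 * b[idx + 3];
    }
}
```

Both functions produce the same output for any count that is a multiple of four; the difference is that the developer, not the compiler, had to encode the assumption.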
26. What is wrong with the current landscape
void foo(int * a, int * b, int * count)
{
int vecCount = *(count) / 4;
int4 * vA = (int4 * )a;
int4 * vB = (int4 * )b;
for(int idx = 0; idx < vecCount; ++idx)
{
vA[idx] = vB[idx] * (int4 )42;
}
}
●
Existing languages (mostly) force developers to do coding wizardry that is unnecessary
●
Also no real feedback to developer as ‘main’ compute target has highly secretive ISAs
●
Don’t want to force vendors to reveal secrets, but do want ability to influence kernel code generation
27. What is wrong with the current landscape
●
Rely on vendors to provide tools to aid development
●
Debuggers, profilers, static analysis all increasingly required
●
Libraries can vastly decrease development time
●
Rely solely on vendors to provide all these complicated pieces
28. What is wrong with the current landscape
●
Vendors already have lots of targets to support
●
Every generation of devices needs its conformance tested
●
Need to support compilers, graphics, compute, tools, list goes on!
●
Why should the vendor be the only one taking the burden?
29. What is wrong with the current landscape
●
No one can agree on what is the ‘best’ approach
●
Personal preference of developer/organization sways opinions
●
Why not allow Lisp on a GPU? Lua on a DSP?
●
Vendor doesn’t need extra headache of supporting these niche use cases
30. What is wrong with the current landscape
●
My pitch – let community support compute standards
●
Take the approach of LLVM & Clang
●
Vendor has to support lower standard on their hardware
●
But allows community to support & innovate
31. How to enable your language on GPUs
32. How to enable your language on GPUs
●
First step – be able to compile language to a binary
●
Can’t output real binary though
●
Vendor doesn’t want to expose ISA
●
Developer wants portability of compiled kernels
33. How to enable your language on GPUs
●
Need to use an Intermediate Representation (IR)
●
Two approaches in development for this!
●
HSA Intermediate Language (HSAIL)
●
OpenCL Standard Portable Intermediate Representation (SPIR)
34. How to enable your language on GPUs
●
Language -> LLVM IR -> HSAIL
●
Low level mapping onto hardware, more of a virtual ISA than an IR
●
HSAIL heavily in development
●
Language -> LLVM IR -> SPIR
●
Then pass SPIR to OpenCL runtime as binary
●
Execute like normal OpenCL C Language kernel
●
Provisional specification available!
35. How to enable your language on GPUs
●
HSA will provide a low-level runtime to interface between HSA compiled binaries and OS
●
HSAIL is being standardized and ratified
●
Existing JIT’ed languages potential targets
●
OpenCL SPIR will require a SPIR compliant OpenCL implementation as target
●
Can compile using LLVM, then use clCreateProgramWithBinary, passing SPIR options
36. How to enable your language on GPUs
●
At present, SPIR is only target we can investigate
●
Intel has OpenCL drivers with provisional SPIR support
●
Can use Clang -> LLVM -> SPIR, then use Intel’s OpenCL to consume SPIR
●
Can take code that compiles to LLVM and run it on OpenCL
37. How to enable your language on GPUs
●
Various steps to getting your language working on GPUs with SPIR
●
We’ll use Intel’s OpenCL SDK with provisional SPIR support;
1.
Create a test harness to load a SPIR binary
2.
Create a simple kernel using Intel’s SPIR compiler on host
3.
Create a simple kernel using tip Clang (language OpenCL) targeting SPIR
4.
Try other languages that compile to LLVM with SPIR target
38. How to enable your language on GPUs
// some SPIR bitcode file, already loaded into memory
const size_t spir_bc_length = ... ;
const unsigned char * spir_bc = ... ;
// already initialized platform, device & context for a SPIR compliant device
cl_platform_id platform = ... ;
cl_device_id device = ... ;
cl_context context = ... ;
// create our program with our SPIR bitcode file
cl_program program = clCreateProgramWithBinary(
context, 1, &device, &spir_bc_length, &spir_bc, NULL, NULL);
// build, passing arguments telling the compiler language is SPIR, and the SPIR standard we are using
clBuildProgram(program, 1, &device, "-x spir -spir-std=1.2", NULL, NULL);
39. How to enable your language on GPUs
// already initialized memory buffers for our context
cl_mem in_mem = ... ;
cl_mem out_mem = ... ;
// assume our kernel function from the spir kernel was called foo
cl_kernel kernel = clCreateKernel(program, "foo", NULL);
// assume our kernel has one read buffer as first argument, and one write buffer as second
clSetKernelArg(kernel, 0, sizeof(cl_mem), (void * )&in_mem);
clSetKernelArg(kernel, 1, sizeof(cl_mem), (void * )&out_mem);
40. How to enable your language on GPUs
// already initialized command queue
cl_command_queue queue = … ;
cl_event write_event, run_event;
clEnqueueWriteBuffer(queue, in_mem, CL_FALSE, 0, BUFFER_SIZE,
&read_payload, 0, NULL, &write_event);
const size_t size = BUFFER_SIZE / sizeof(cl_int);
clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &size, NULL, 1, &write_event, &run_event);
clEnqueueReadBuffer(queue, out_mem, CL_TRUE, 0, BUFFER_SIZE,
&result_payload, 1, &run_event, NULL);
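The harness above assumes the SPIR bitcode is already in memory. A minimal sketch of reading it from disk follows; load_spir_file is a hypothetical helper name, not part of OpenCL or of the original harness, and it uses only the C standard library.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a whole file into a malloc'd buffer.
   Returns NULL on failure; on success stores the byte count in *length.
   The caller owns the returned buffer and must free() it. */
unsigned char *load_spir_file(const char *path, size_t *length)
{
    FILE *file = fopen(path, "rb");
    if (file == NULL) {
        return NULL;
    }

    /* Find the file size by seeking to the end. */
    if (fseek(file, 0, SEEK_END) != 0) {
        fclose(file);
        return NULL;
    }
    long size = ftell(file);
    if (size < 0) {
        fclose(file);
        return NULL;
    }
    rewind(file);

    /* Allocate at least one byte so an empty file still yields a buffer. */
    unsigned char *data = malloc(size > 0 ? (size_t)size : 1);
    if (data == NULL) {
        fclose(file);
        return NULL;
    }

    size_t bytes_read = fread(data, 1, (size_t)size, file);
    fclose(file);
    if (bytes_read != (size_t)size) {
        free(data);
        return NULL;
    }

    *length = (size_t)size;
    return data;
}
```

The returned pointer and length would then take the place of spir_bc and spir_bc_length in the call to clCreateProgramWithBinary.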
41. How to enable your language on GPUs
●
Now, create a simple OpenCL kernel
void kernel foo(global int * in, global int * out)
{
out[get_global_id(0)] = in[get_global_id(0)];
}
●
And use Intel’s command line (or GUI!) tool to build
ioc32 -cmd=build -input foo.cl -spir32=foo.bc
42. How to enable your language on GPUs
●
Next we point the buffer for our SPIR kernel at the generated SPIR kernel
●
And it fails…?
●
Turns out Intel’s OpenCL runtime doesn’t like being told that it is building SPIR!
●
Simply remove "-x spir -spir-std=1.2" from the build options and voila!
43. How to enable your language on GPUs
●
Next step – use tip Clang to build our foo.cl kernel
clang -cc1 -triple spir-unknown-unknown -emit-llvm-bc foo.cl -o foo.bc
●
Compiles ok, but when we run it fails…?
●
So Clang generated SPIR bitcode file could very well not work
●
We’ll take a look at the readable IR for the Intel & Clang compiled kernels
47. How to enable your language on GPUs
●
So the metadata is different!
●
We could fix Clang to produce the right metadata…?
●
Or just hack around!
●
Let’s use Intel’s compiler to generate a stub function
●
Then we can use an extern function defined in our Clang module!
48. How to enable your language on GPUs
extern int doSomething(int a);
void kernel foo(global int * in, global int * out)
{
int id = get_global_id(0);
out[id] = doSomething(in[id]);
}
int doSomething(int a)
{
return a;
}
49. How to enable your language on GPUs
●
And it fails…?
●
Intel’s compiler doesn’t like extern functions!
●
We’ve already bodged it thus far…
●
So let’s continue!
int __attribute__((weak)) doSomething(int a) {}
void kernel foo(global int * in, global int * out)
{
int id = get_global_id(0);
out[id] = doSomething(in[id]);
}
50. How to enable your language on GPUs
●
More than a little nasty…
●
Relies on Clang extension to declare function weak within OpenCL
●
Relies on Intel using Clang and allowing extension
●
But it works!
●
Can build both the Intel stub code & the Clang actual code
●
Then use llvm-link to pull them together!
51. How to enable your language on GPUs
●
So now we can compile two OpenCL kernels, link them together, and run it
●
What is next? Want to enable your language!
●
What about using Clang, but using a different language?
●
C & C++ come to mind!
52. How to enable your language on GPUs
●
Use a simple C file
int doSomething(int a)
{
return a;
}
●
And use Clang to compile it
clang -cc1 -triple spir-unknown-unknown -emit-llvm-bc foo.c -o foo.bc
53. How to enable your language on GPUs
●
Or a simple C++ file!
extern "C" int doSomething(int a);
template<typename T> T templatedSomething(const T t)
{
return t;
}
int doSomething(int a)
{
return templatedSomething(a);
}
54. How to enable your language on GPUs
●
Let’s have some real C++ code
●
Use features that OpenCL doesn’t provide us
●
We’ll do a matrix multiplication in C++
●
Use classes, constructors, templates
55. How to enable your language on GPUs
typedef float __attribute__((ext_vector_type(4))) float4;
typedef float __attribute__((ext_vector_type(16))) float16;
float __attribute__((overloadable)) dot(float4 a, float4 b);
template<typename T, unsigned int WIDTH, unsigned int HEIGHT> class Matrix
{
typedef T __attribute__((ext_vector_type(WIDTH))) RowType;
RowType rows[HEIGHT];
public:
Matrix() {}
template<typename U> Matrix(const U & u) { __builtin_memcpy(&rows, &u, sizeof(U)); }
RowType & operator[](const unsigned int index) { return rows[index]; }
const RowType & operator[](const unsigned int index) const { return rows[index]; }
};
56. How to enable your language on GPUs
template<typename T, unsigned int WIDTH, unsigned int HEIGHT>
Matrix<T, WIDTH, HEIGHT> operator *(const Matrix<T, WIDTH, HEIGHT> & a, const Matrix<T, WIDTH, HEIGHT> & b)
{
Matrix<T, HEIGHT, WIDTH> bShuffled;
for(unsigned int h = 0; h < HEIGHT; h++)
for(unsigned int w = 0; w < WIDTH; w++)
bShuffled[w][h] = b[h][w];
Matrix<T, WIDTH, HEIGHT> result;
for(unsigned int h = 0; h < HEIGHT; h++)
for(unsigned int w = 0; w < WIDTH; w++)
result[h][w] = dot(a[h], bShuffled[w]);
return result;
}
57. How to enable your language on GPUs
extern "C" float16 doSomething(float16 a, float16 b);
float16 doSomething(float16 a, float16 b)
{
Matrix<float, 4, 4> matA(a);
Matrix<float, 4, 4> matB(b);
Matrix<float, 4, 4> mul = matA * matB;
float16 result = (float16 )0;
result.s0123 = mul[0];
result.s4567 = mul[1];
result.s89ab = mul[2];
result.scdef = mul[3];
return result;
}
58. How to enable your language on GPUs
●
And when we run it…
ex5.vcxproj -> E:AMDDeveloperSummit2013buildExample5Debugex5.exe
Found 2 platforms!
Choosing vendor 'Intel(R) Corporation'!
Found 1 devices!
SPIR file length '3948' bytes!
[ 0.0, 1.0, 2.0, 3.0] * [ 16.0, 15.0, 14.0, 13.0] = [ 40.0, 34.0, 28.0, 22.0]
[ 4.0, 5.0, 6.0, 7.0] * [ 12.0, 11.0, 10.0, 9.0] = [200.0, 178.0, 156.0, 134.0]
[ 8.0, 9.0, 10.0, 11.0] * [ 8.0, 7.0, 6.0, 5.0] = [360.0, 322.0, 284.0, 246.0]
[ 12.0, 13.0, 14.0, 15.0] * [ 4.0, 3.0, 2.0, 1.0] = [520.0, 466.0, 412.0, 358.0]
●
Success!
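The printed result can be cross-checked on the host with a plain C reference. This is a sketch only: mat_mul4 and DIM are hypothetical names, and the sixteen floats are laid out row by row to mirror the float16 arguments of doSomething.

```c
#include <stddef.h>

#define DIM 4

/* Hypothetical host-side reference: result = a * b for row-major
   DIM x DIM matrices stored as flat arrays of DIM * DIM floats. */
void mat_mul4(const float *a, const float *b, float *result)
{
    for (size_t h = 0; h < DIM; ++h) {
        for (size_t w = 0; w < DIM; ++w) {
            float sum = 0.0f;
            /* Dot product of row h of a with column w of b. */
            for (size_t k = 0; k < DIM; ++k) {
                sum += a[h * DIM + k] * b[k * DIM + w];
            }
            result[h * DIM + w] = sum;
        }
    }
}
```

Feeding it a = 0.0 to 15.0 and b = 16.0 down to 1.0 gives a first row of 40.0, 34.0, 28.0, 22.0, matching the output on the slide.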
59. How to enable your language on GPUs
●
The least you need to target a GPU:
●
Generate correct LLVM IR with SPIR metadata
●
Or at least generate LLVM IR and use the approach we used to combine Clang and IOC generated kernels
!opencl.kernels = !{!0}
!opencl.enable.FP_CONTRACT = !{}
!opencl.spir.version = !{!6}
!opencl.ocl.version = !{!7}
!opencl.used.extensions = !{!8}
!opencl.used.optional.core.features = !{!8}
!opencl.compiler.options = !{!8}
!0 = metadata !{void (i32 addrspace(1)*, i32 addrspace(1)*)* @foo, metadata !1, metadata !2, metadata !3, metadata !4, metadata !5}
!1 = metadata !{metadata !"kernel_arg_addr_space", i32 1, i32 1}
!2 = metadata !{metadata !"kernel_arg_access_qual", metadata !"none", metadata !"none"}
!3 = metadata !{metadata !"kernel_arg_type", metadata !"int*", metadata !"int*"}
!4 = metadata !{metadata !"kernel_arg_type_qual", metadata !"", metadata !""}
!5 = metadata !{metadata !"kernel_arg_name", metadata !"a", metadata !"b"}
!6 = metadata !{i32 1, i32 0}
!7 = metadata !{i32 0, i32 0}
!8 = metadata !{}
60. How to enable your language on GPUs
●
Porting C/C++ libraries to SPIR requires a little more work
int foo(int * a)
{
return *a;
}
●
The data pointed to by ‘a’ will by default be put in the private address space
●
But a straight conversion to SPIR needs all data in global address space
●
Means that any porting of existing code could be quite intrusive
61. How to enable your language on GPUs
●
To target your language at GPUs
●
Need to be able to segregate work into parallel chunks
●
Have to ban certain features that don’t work with compute
●
Need to deal with distinct address spaces
●
Language could also provide an API onto OpenCL SPIR builtins
●
But with OpenCL SPIR it is now possible to make any language work on a GPU!
63. Developing tools for GPUs
●
Tools increasingly required to support development
●
Even having printf (which OpenCL 1.2 added) is novel!
●
But with increasingly complex code better tools needed
●
Main three are debuggers, profilers and compiler-tools
64. Developing tools for GPUs
●
Debuggers for compute are difficult for non-vendor to develop
●
Codeplay has developed such tools on top of compute standards
●
Problem is bedrock for these tools can change at any time
●
Hard to beat vendor-owned approach that has lower-level access
65. Developing tools for GPUs
●
Codeplay are pushing hard for HSA to have features that aid tool development
●
Debuggers are much easier with instruction support, debug info, change registers, call stacks
●
OpenCL SPIR harder to create debugger for without vendor support
●
Can we standardize a way to debug OpenCL SPIR, or allow debugging via emulation of SPIR?
66. Developing tools for GPUs
●
Profilers require superset of debugger feature-set
●
Need to be able to trap kernels at defined points
●
Accurate timings only other requirement beyond debugger support
●
More fun when we go beyond performance, and measure power
67. Developing tools for GPUs
●
HSA and OpenCL SPIR both good profiler targets
●
Could split SPIR kernels into profiling sections
●
Then use existing timing information in OpenCL
●
HSA will only require debugger features we are pushing for
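One way to picture the "profiling sections" idea: if each section of a split kernel is enqueued separately, OpenCL's event profiling reports a start and end timestamp in nanoseconds for each one. The sketch below only does the arithmetic on such timestamps; the section_timing type and the function names are hypothetical, and obtaining the timestamps from the runtime is assumed to happen elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one profiled section, in nanoseconds. */
typedef struct {
    uint64_t start_ns;
    uint64_t end_ns;
} section_timing;

/* Duration of a single section in milliseconds. */
double section_ms(section_timing t)
{
    return (double)(t.end_ns - t.start_ns) / 1.0e6;
}

/* Sum of all section durations in milliseconds. */
double total_ms(const section_timing *sections, size_t count)
{
    double total = 0.0;
    for (size_t i = 0; i < count; ++i) {
        total += section_ms(sections[i]);
    }
    return total;
}
```

Comparing each section's share of the total shows where a kernel spends its time without needing any vendor-specific tooling.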
68. Developing tools for GPUs
●
Compiler tools consist of optimizers and analysis
●
Both HSA and OpenCL SPIR being based on LLVM enable this!
●
We as compiler experts can aid existing runtimes
●
You as developers can add optimizations & analyse your kernels!
70. Conclusion
●
With the rise of open standards, compute is increasingly easy
●
With HSA & OpenCL SPIR hardware is finally open to us!
●
Just need standards to ratify, mature & be available on hardware!
●
Next big push into compute is upon us