We describe ocl, a Python library built on top of pyOpenCL and numpy that allows programming GPU devices from Python. Python functions marked with the provided decorator are converted into C99/OpenCL and compiled by the OpenCL just-in-time (JIT) compiler at runtime. This approach lowers the barrier to entry to programming GPU devices since it requires only Python syntax and no external compilation or linking steps. The resulting Python program runs even if a GPU is not available. As an example application, we solve the problem of computing the covariance matrix for historical stock prices and determining the optimal portfolio according to Modern Portfolio Theory.
2. 58 Computer Science & Information Technology (CS & IT)
C99 while the CUDA dialect is more powerful and, for example, supports templates.
A major complexity in writing CUDA and OpenCL code consists in having to write code inside
code (the kernel managed by the host). This process can be somewhat simplified using a different
language for the host. For example, the pyCUDA and pyOpenCL libraries created by Andreas
Klöckner[3] allow compiling, queuing, and running kernels (in CUDA and OpenCL respectively)
from inside a Python program. They are integrated with numpy[4], the Python numeric library.
In this paper we discuss a library called ocl, which is built on top of pyOpenCL. It allows writing both the host program and the kernels in the Python language without loss of performance compared to native OpenCL. ocl consists of two parts:
• A thin layer on top of pyOpenCL which encapsulates the device state (consisting of a
context and a task queue).
• A function decorator which, at runtime, converts decorated Python functions into C99
code and uses the OpenCL JIT to compile them.
Python + numpy + pyOpenCL + ocl allow development of parallel programs which run
everywhere (CPUs and GPUs), are written in a single language, and do not require any outside
compilation, linking, or other building step.
ocl is similar to Cython[8] and Clyther[9] in the sense that it works by parsing the Python code into a Python Abstract Syntax Tree (AST) and then serializing the AST into C99 code. There are differences between Clyther and ocl: the implementations are different, and the semantics used to build the kernel body are different and closer to C in the ocl case. Moreover, the ocl implementation is extensible: in fact, it can be used to generate and compile C code without the need for OpenCL (like Cython), and it can also be used to generate JavaScript code (to run, for example, in a browser). JavaScript code generation is beyond the scope of this paper since its application is in web development, not high-performance computing.
In order to explain the syntax of ocl we first consider the simple example of adding two vectors, and then a more complex financial example: computing the correlation matrix for the arithmetic returns of multiple time series and determining the optimal portfolio according to Modern Portfolio Theory.
2. SIMPLE EXAMPLE
In our first example we consider the following Python code, which defines two arrays, fills them with random Gaussian floating-point numbers, and adds them using the available device.
from ocl import Device
import numpy
import random

n = 1000000
a = numpy.zeros(n, dtype=numpy.float32)
b = numpy.zeros(n, dtype=numpy.float32)

for k in xrange(n):
    a[k] = random.gauss(0,1)
    b[k] = random.gauss(0,1)

device = Device()
a_buffer = device.buffer(source=a, mode=device.flags.READ_ONLY)
b_buffer = device.buffer(source=b)

@device.compiler.define_kernel(a='global:const:ptr_float',
                               b='global:ptr_float')
def add(a, b):
    k = new_int(get_global_id(0))
    b[k] = a[k]+b[k]

program = device.compile()
program.add(device.queue, [n], None, a_buffer, b_buffer)
b = device.retrieve(b_buffer, shape=(n,))
In this example:
• Lines 1-3 import the required modules.
• Lines 6-7 define two numpy arrays (a and b) of size n in main RAM and fill them with zeros.
• Lines 9-11 populate the arrays a and b with random Gaussian numbers with mean 0 and standard deviation 1.
• Line 13 selects the available device and encapsulates its state.
• Lines 14-15 copy the arrays a and b into the memory of the device, represented by a_buffer and b_buffer respectively.
• Lines 17-21 define a new kernel called "add".
• Lines 17-18 decorate the "add" function to specify that it is a kernel to be converted into C99 code, and declare the types of the function arguments.
• Lines 19-21 define the body of the kernel.
• Line 20 runs on the device (one instance per thread) and returns a different value for k on each thread.
• Line 23 converts and compiles the kernel, and loads it into the device.
• Line 24 calls the function "add". More precisely, it queues [n] threads, each calling the function "add", and specifies that a_buffer and b_buffer are to be passed as arguments.
• Line 25 retrieves the output array b_buffer from the device and copies it into b.
The program above is run with a single Bash shell command:
python myprogram.py
Notice how the kernel function “add” is preceded by the decorator
@device.compiler.define_kernel(...)
This decorator performs introspection of the function at runtime and parses the body of the function into an Abstract Syntax Tree (AST) for later conversion to C99/OpenCL code. This allows us to write OpenCL code using Python syntax.
Supported control structures and commands include def, if ... elif ... else, for ... in range, while, break, and continue.
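For example, the following kernel (a sketch of our own; clip_and_scale and its bounds are purely illustrative, built only from the constructs introduced above) combines a while loop with an if statement:

@device.compiler.define_kernel(a='global:ptr_float')
def clip_and_scale(a):
    k = new_int(get_global_id(0))
    x = new_float(a[k])
    while x > 1.0:            # halve until the value is within bound
        x = x/2.0
    if x < -1.0:              # clip from below
        x = -1.0
    a[k] = x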
4. 60 Computer Science & Information Technology (CS & IT)
The decorated function add is converted at runtime into the following Python AST:
FunctionDef(name='add',
    args=arguments(args=[Name(id='a'), Name(id='b', ctx=Param())]),
    body=[Assign(targets=[Name(id='k')],
                 value=Call(func=Name(id='new_int'),
                            args=[Call(func=Name(id='get_global_id'),
                                       args=[Num(n=0)])])),
          Assign(targets=[Subscript(value=Name(id='b'),
                                    slice=Index(value=Name(id='k')))],
                 value=BinOp(left=Subscript(value=Name(id='a'),
                                            slice=Index(value=Name(id='k'))),
                             op=Add(),
                             right=Subscript(value=Name(id='b'),
                                             slice=Index(value=Name(id='k'))))),
          Return(value=Name(id='None'))])
(we simplified the AST by omitting irrelevant parameters).
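The introspection step itself can be reproduced with the Python standard library alone. Below is a simplified sketch of our own (ocl actually relies on the meta library [7]); applying @show_ast to the add function above prints a FunctionDef node analogous to the one listed:

import ast
import inspect
import textwrap

def show_ast(func):
    # recover the source of the decorated function and parse it into
    # an AST, roughly as define_kernel does before emitting C99 code
    source = textwrap.dedent(inspect.getsource(func))
    print(ast.dump(ast.parse(source).body[0]))
    return func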
The call to device.compile(...) converts the AST into the following code:
__kernel void add(__global const float* a, __global float* b) {
    int k;
    k = get_global_id(0);
    b[k] = (a[k] + b[k]);
}
__kernel is an OpenCL modifier which declares the function to be a kernel. __global and __local are used to declare the type of memory storage: global memory of the device or local memory of the compute unit, respectively.
One complication is that C99/OpenCL is a statically typed language while Python is dynamically (though strongly) typed. Therefore one needs a way to declare the type of variables using Python syntax. For the function arguments this is done in the decorator arguments. For example:
a = "global:const:ptr_float"
is converted into
__global const float *a
The type of local variables is defined using the pseudo-casting operators new_int, new_float,
etc. For example:
k = new_int(get_global_id(0))
is converted into
int k;
k = get_global_id(0);
Variable types must be declared when first assigned. The converter checks that this is the case, in order to prevent more obscure compile-time errors later.
In this example, get_global_id(0) is an OpenCL function used to identify the rank of the current running instance of the kernel. If one queues [n] kernel tasks, it returns all values from 0 to n−1, a different value for each running task. The order in which queued tasks are executed is not guaranteed.
The return type of a function is determined automatically, but one must return a variable (or None), not an expression.
Other variable types and pointers can be defined and manipulated using the syntax shown in the
following table:
ocl                          C99/OpenCL
x = new_type(...)            type x = ...;
x = new_ptr_type(...)        type *x = ...;
x = new_ptr_ptr_type(...)    type **x = ...;
ADDR(x)                      &x
REFD(x)                      *x
CAST(ptr_type, x)            (type*)x
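For instance, inside a kernel one could write (our own illustration of the mappings above):

x = new_float(1.0)            # float x = 1.0;
p = new_ptr_float(ADDR(x))    # float *p = &x;
y = new_float(REFD(p))        # float y = *p;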
OpenCL provides special types of variables to take advantage of vector registers. For example, float4 can store four floating-point numbers, called x, y, z, and w. These special vector types can be used from ocl as follows:
q = new_float4((0.0,0.0,0.0,0.0))
q.x = 1.0
Notice that the goal of ocl is not to translate arbitrary Python code into C99/OpenCL but to allow the use of Python syntax to write OpenCL kernels and functions. Only C99/OpenCL variable types are allowed, not Python types such as lists and dictionaries. These may be supported in future versions of ocl. Moreover, all functions called in the body of the add function are intended to be OpenCL functions, not Python functions. While Python code is normally garbage collected, the decorated code runs on the device as C code and is not garbage collected; therefore explicit memory management may be necessary:
@device.compiler.define_kernel(...)
def some_kernel(...):
    d = new_ptr_float(malloc(size))
    ...
    free(d)
which would be converted into:
__kernel void some_kernel(...) {
    float* d;
    d = (float*)malloc(size);
    ...
    free(d);
}
One can define multiple kernels within the same program and call them when needed. One can use the @device.compiler.define(...) decorator to define other functions to be executed on the device. They will be visible to and callable by the kernels, but they will not be callable from the host program.
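For example, a device function defined this way can be called from kernels and must obey the return rule stated earlier (a sketch of our own: square is hypothetical, and we assume a scalar argument is declared with a plain type string such as 'float'):

@device.compiler.define(x='float')
def square(x):
    y = new_float(x*x)
    return y                  # return a variable, not the expression x*x

@device.compiler.define_kernel(a='global:ptr_float')
def apply_square(a):
    k = new_int(get_global_id(0))
    a[k] = square(a[k])       # visible to kernels, not to the host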
One can pass constants from the Python context to the OpenCL context:
device.compile(..., constants = dict(n = m))
where n is a constant in the kernel and m is a variable in the host program.
Additional C files can be included using the syntax

device.compile(..., includes=['#include <math.h>'])

where includes is a list of #include statements.
3. FINANCIAL APPLICATION EXAMPLE
In the following example, we solve the typical financial problem of analyzing multiple time series
such as stock prices, p[k,t] (simulated stock prices). For each time series k we compute the
arithmetic daily returns, r[k,t], and the average returns, mu[k]. We then compute the covariance
matrix, cov[i,j], and the correlation matrix, cor[i,j]. We use different kernels for each part of the
computation.
Finally, to make the application more practical, we use Modern Portfolio Theory [10] to compute a tangency portfolio which maximizes the Sharpe ratio, i.e. return/risk, under the assumption of normal returns:

    x ∝ Σ⁻¹(µ − r_f 1)

Here µ is the vector of average returns (mu), Σ is the covariance matrix (cov), and r_f is the input risk-free interest rate. The tangency portfolio is identified by the vector x (the array x in the code), whose terms indicate the amount to be invested in each stock (the amounts must add up to one dollar). We perform this maximization on the CPU to demonstrate integration with the numpy linear algebra package.
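The optimization reduces to a single linear solve followed by a normalization. The following self-contained sketch (our own; mu, cov, and the numbers are hypothetical stand-ins for the values computed below) shows the calculation in isolation:

import numpy

# tangency portfolio: solve cov * x = mu - rf, then normalize the
# weights so that they add up to one dollar of total investment
mu = numpy.array([0.0010, 0.0008, 0.0012])
cov = numpy.array([[0.010, 0.002, 0.001],
                   [0.002, 0.011, 0.003],
                   [0.001, 0.003, 0.012]])
rf = 0.05/250                         # daily risk-free interest rate
x = numpy.linalg.solve(cov, mu - rf)  # x proportional to inv(cov)*(mu-rf)
x /= x.sum()                          # normalize to one dollar
print(x)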
We use the symbols i and j to identify the stock time series, and the symbol t for time (for daily
data t is a day). n is the number of stocks and m is the number of trading days.
We use the canvas[6] library, based on the Python matplotlib library, to display one of the stock
price series and the resulting correlation matrix. Below is the complete code. The output from the
code can be seen in fig. 1.
from ocl import Device
from canvas import Canvas
import random
import numpy
from math import exp

n = 1000 # number of time series
m = 250  # number of trading days for time series

p = numpy.zeros((n,m), dtype=numpy.float32)
r = numpy.zeros((n,m), dtype=numpy.float32)
mu = numpy.zeros(n, dtype=numpy.float32)
cov = numpy.zeros((n,n), dtype=numpy.float32)
cor = numpy.zeros((n,n), dtype=numpy.float32)

for k in xrange(n):
    p[k,0] = 100.0
    for t in xrange(1,m):
        c = 1.0 if k==0 else (p[k-1,t]/p[k-1,t-1])
        p[k,t] = p[k,t-1]*exp(random.gauss(0.0001,0.10))*c

device = Device()
p_buffer = device.buffer(source=p, mode=device.flags.READ_ONLY)
r_buffer = device.buffer(source=r)
mu_buffer = device.buffer(source=mu)
cov_buffer = device.buffer(source=cov)
cor_buffer = device.buffer(source=cor)

@device.compiler.define_kernel(p='global:const:ptr_float',
                               r='global:ptr_float')
def compute_r(p, r):
    i = new_int(get_global_id(0))
    for t in range(0,m-1):
        r[i*m+t] = p[i*m+t+1]/p[i*m+t] - 1.0

@device.compiler.define_kernel(r='global:ptr_float',
                               mu='global:ptr_float')
def compute_mu(r, mu):
    i = new_int(get_global_id(0))
    sum = new_float(0.0)
    for t in range(0,m-1):
        sum = sum + r[i*m+t]
    mu[i] = sum/(m-1)

@device.compiler.define_kernel(r='global:ptr_float',
                               mu='global:ptr_float', cov='global:ptr_float')
def compute_cov(r, mu, cov):
    i = new_int(get_global_id(0))
    j = new_int(get_global_id(1))
    sum = new_float(0.0)
    for t in range(0,m-1):
        sum = sum + r[i*m+t]*r[j*m+t]
    cov[i*n+j] = sum/(m-1)-mu[i]*mu[j]

@device.compiler.define_kernel(cov='global:ptr_float',
                               cor='global:ptr_float')
def compute_cor(cov, cor):
    i = new_int(get_global_id(0))
    j = new_int(get_global_id(1))
    cor[i*n+j] = cov[i*n+j] / sqrt(cov[i*n+i]*cov[j*n+j])

program = device.compile(constants=dict(n=n,m=m))

q = device.queue
program.compute_r(q, [n], None, p_buffer, r_buffer)
program.compute_mu(q, [n], None, r_buffer, mu_buffer)
program.compute_cov(q, [n,n], None, r_buffer, mu_buffer, cov_buffer)
program.compute_cor(q, [n,n], None, cov_buffer, cor_buffer)

r = device.retrieve(r_buffer, shape=(n,m))
mu = device.retrieve(mu_buffer, shape=(n,))
cov = device.retrieve(cov_buffer, shape=(n,n))
cor = device.retrieve(cor_buffer, shape=(n,n))

points = [(x,y) for (x,y) in enumerate(p[0])]
Canvas(title='Price').plot(points).save(filename='price.png')
Canvas(title='Correlations').imshow(cor).save(filename='cor.png')

rf = 0.05/m # input daily risk free interest rate
x = numpy.linalg.solve(cov, mu-rf) # cov*x = (mu-rf)
x *= 1.00/sum(x) # assumes 1.00 dollars in total investment
open('optimal_portfolio','w').write(repr(x))
Notice how the memory buffers are always one-dimensional; therefore the i,j indexes have to be mapped into a 1D index i*n+j. Also notice that while the kernels compute_r and compute_mu are called [n] times (once per stock k), the kernels compute_cov and compute_cor are called [n,n] times, once per couple of stocks i,j. The values of i and j are retrieved by get_global_id(0) and get_global_id(1) respectively.
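The mapping is the usual row-major convention; the following one-line sketch (our own illustration; flat is a hypothetical helper) makes it explicit:

# element (i, j) of an n-by-n matrix stored in a 1D buffer
def flat(i, j, n):
    return i*n + j

assert flat(2, 3, n=5) == 13  # row 2, column 3 of a 5x5 matrix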
In this program we have defined multiple kernels and compiled them at once. We call one kernel at a time to make sure that the call to the previous kernel is completed before running the next one.
9. Computer Science & Information Technology (CS & IT) 65
Figure 1: The image on the left shows one of the randomly generated stock price histories. The image on the right represents the computed correlation matrix. Rows and columns correspond to stock returns, and the color at each intersection is their correlation (red for high correlation and blue for no correlation). The resulting shape is an artifact of the algorithm used to generate the random data.
4. OCL WITHOUT OpenCL
ocl can be used to generate C99 code, compile it, and run it without OpenCL. In this case the
code always runs on the CPU and not on the GPU. Here is a simple example:
from ocl import Compiler

c99 = Compiler()

@c99.define(n='int')
def factorial(n):
    output = 1
    for k in range(1, n + 1):
        output = output * k
    return output

compiled = c99.compile()
print(compiled.factorial(10))
assert compiled.factorial(10) == factorial(10)
In this example, we have replaced device.compiler with c99 = Compiler(), since "device" is an OpenCL-specific concept. We have been careful to engineer the function "factorial" so that it can run both in Python (factorial) and compiled in C99 (compiled.factorial). This is not always possible, but it is possible for functions that only call mathematical library functions, such as our factorial function. Notice that c99.compile() requires gcc to be installed, instead of the OpenCL JIT.
5. CONCLUSIONS
In this paper we provide an introduction to ocl, a library that allows writing OpenCL programs and kernels using exclusively Python syntax. ocl is built on top of pyOpenCL[3], meta[7], and numpy[4] (the Python scientific libraries). The Python code is converted to C99/OpenCL and compiled at runtime. It can run on any OpenCL-compatible device, including ordinary Intel/AMD CPUs, Nvidia/ATI GPUs, and ARM CPUs, with the same performance as native OpenCL code.
10. 66 Computer Science & Information Technology (CS & IT)
Our approach does not necessarily simplify the algorithms, but it makes them more readable and portable, and it eliminates external compilation and linking steps, thus lowering the barrier to entry to GPU programming. It does not eliminate the need to understand OpenCL semantics, but it allows the use of a simpler syntax.
As an example application we considered the computation of the covariance and correlation matrices of financial time series, in order to determine the optimal portfolio according to Modern Portfolio Theory [10].
We plan to extend ocl to support a richer syntax, to generate CUDA as well as OpenCL code, and to provide better debugging information. Moreover, since Python code can easily be serialized and deserialized at runtime, we plan to improve the library to allow distributed computations where the code is written once (in Python), distributed, then compiled and run on different worker nodes using heterogeneous devices.
ACKNOWLEDGEMENTS
I thank Andreas Klöckner for pyOpenCL and the excellent documentation.
REFERENCES
[1] Lindholm, Erik, et al. "NVIDIA Tesla: A unified graphics and computing architecture." IEEE Micro 28.2 (2008): 39-55.
[2] Munshi, Aaftab. "OpenCL: Parallel Computing on the GPU and CPU." SIGGRAPH, Tutorial (2008).
[3] Klöckner, Andreas, et al. "PyCUDA and PyOpenCL: A scripting-based approach to GPU run-time code generation." Parallel Computing 38.3 (2012): 157-174.
[4] Oliphant, Travis E. A Guide to NumPy. Vol. 1. USA: Trelgol Publishing, 2006.
[5] https://github.com/mdipierro/ocl
[6] https://github.com/mdipierro/canvas
[7] http://srossross.github.com/Meta/html/index.html
[8] Behnel, Stefan, et al. "Cython: The best of both worlds." Computing in Science & Engineering 13.2 (2011): 31-39.
[9] http://srossross.github.com/Clyther/
[10] Markowitz, Harry M. "Foundations of portfolio theory." The Journal of Finance 46.2 (1991): 469-477.