Numba is a Python compiler that translates Python code into fast machine code using the LLVM compiler infrastructure. It allows Python code that works with NumPy arrays to be just-in-time compiled to native machine instructions, achieving performance comparable to C, C++ and Fortran for numeric work. Numba provides decorators like @jit that can compile functions for improved performance on NumPy array operations. It aims to make Python a compiled and optimized language for scientific computing by leveraging type information from NumPy to generate fast machine code.
YouTube Link: https://youtu.be/QswQA1lRIQY
** Python Certification Training: https://www.edureka.co/python **
This Edureka PPT on 'Collections In Python' will cover the concepts of Collection data type in python along with the collections module and specialized collection data structures like counter, chainmap, deque etc. Following are the topics discussed:
What Are Collections In Python?
What Is A Collection Module In Python?
Specialized Collection Data Structures
Follow us to never miss an update in the future.
YouTube: https://www.youtube.com/user/edurekaIN
Instagram: https://www.instagram.com/edureka_learning/
Facebook: https://www.facebook.com/edurekaIN/
Twitter: https://twitter.com/edurekain
LinkedIn: https://www.linkedin.com/company/edureka
Castbox: https://castbox.fm/networks/505?country=in
a. Concept and Definition✓
b. Inserting and Deleting nodes ✓
c. Linked implementation of a stack (PUSH/POP) ✓
d. Linked implementation of a queue (Insert/Remove) ✓
e. Circular List
• Stack as a circular list (PUSH/POP) ✓
• Queue as a circular list (Insert/Remove) ✓
f. Doubly Linked List (Insert/Remove) ✓
For more course related material:
https://github.com/ashim888/dataStructureAndAlgorithm/
Personal blog
www.ashimlamichhane.com.np
YouTube Link: https://youtu.be/QswQA1lRIQY
** Python Certification Training: https://www.edureka.co/python **
This Edureka PPT on 'Collections In Python' will cover the concepts of Collection data type in python along with the collections module and specialized collection data structures like counter, chainmap, deque etc. Following are the topics discussed:
What Are Collections In Python?
What Is A Collection Module In Python?
Specialized Collection Data Structures
Follow us to never miss an update in the future.
YouTube: https://www.youtube.com/user/edurekaIN
Instagram: https://www.instagram.com/edureka_learning/
Facebook: https://www.facebook.com/edurekaIN/
Twitter: https://twitter.com/edurekain
LinkedIn: https://www.linkedin.com/company/edureka
Castbox: https://castbox.fm/networks/505?country=in
a. Concept and Definition✓
b. Inserting and Deleting nodes ✓
c. Linked implementation of a stack (PUSH/POP) ✓
d. Linked implementation of a queue (Insert/Remove) ✓
e. Circular List
• Stack as a circular list (PUSH/POP) ✓
• Queue as a circular list (Insert/Remove) ✓
f. Doubly Linked List (Insert/Remove) ✓
For more course related material:
https://github.com/ashim888/dataStructureAndAlgorithm/
Personal blog
www.ashimlamichhane.com.np
Python Functions Tutorial | Working With Functions In Python | Python Trainin...Edureka!
** Python Certification Training: https://www.edureka.co/python **
This Edureka PPT on Python Functions tutorial covers all the important aspects of functions in Python right from the introduction to what functions are, all the way till checking out the major functions and using the code-first approach to understand them better.
Agenda
Why use Functions?
What are the Functions?
Types of Python Functions
Built-in Functions in Python
User-defined Functions in Python
Python Lambda Function
Conclusion
Python Tutorial Playlist: https://goo.gl/WsBpKe
Blog Series: http://bit.ly/2sqmP4s
Follow us to never miss an update in the future.
Instagram: https://www.instagram.com/edureka_learning/
Facebook: https://www.facebook.com/edurekaIN/
Twitter: https://twitter.com/edurekain
LinkedIn: https://www.linkedin.com/company/edureka
The tar command stands for tape achieve, which is the most commonly used tape drive backup command used by the Linux system. It allows to quickly access a collection of files and placed them into a highly compressed archive file commonly called tar, gzip and bzip in Linux.
To visit www.excavatorinfo.com
Video: https://www.youtube.com/watch?v=JRFNIKUROPE . Talk for linux.conf.au 2017 (LCA2017) by Brendan Gregg, about Linux enhanced BPF (eBPF). Abstract:
A world of new capabilities is emerging for the Linux 4.x series, thanks to enhancements that have been included in Linux for to Berkeley Packet Filter (BPF): an in-kernel virtual machine that can execute user space-defined programs. It is finding uses for security auditing and enforcement, enhancing networking (including eXpress Data Path), and performance observability and troubleshooting. Many new open source tools that have been written in the past 12 months for performance analysis that use BPF. Tracing superpowers have finally arrived for Linux!
For its use with tracing, BPF provides the programmable capabilities to the existing tracing frameworks: kprobes, uprobes, and tracepoints. In particular, BPF allows timestamps to be recorded and compared from custom events, allowing latency to be studied in many new places: kernel and application internals. It also allows data to be efficiently summarized in-kernel, including as histograms. This has allowed dozens of new observability tools to be developed so far, including measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more.
This talk will summarize BPF capabilities and use cases so far, and then focus on its use to enhance Linux tracing, especially with the open source bcc collection. bcc includes BPF versions of old classics, and many new tools, including execsnoop, opensnoop, funcccount, ext4slower, and more (many of which I developed). Perhaps you'd like to develop new tools, or use the existing tools to find performance wins large and small, especially when instrumenting areas that previously had zero visibility. I'll also summarize how we intend to use these new capabilities to enhance systems analysis at Netflix.
Complete C++ programming Language CourseVivek chan
This is the Complete course of C++ Programming Language for Beginners. All Topics of C++ programming Language are covered in this single power point presentation.
Visit: www.cyberlabzone.com
Python Functions Tutorial | Working With Functions In Python | Python Trainin...Edureka!
** Python Certification Training: https://www.edureka.co/python **
This Edureka PPT on Python Functions tutorial covers all the important aspects of functions in Python right from the introduction to what functions are, all the way till checking out the major functions and using the code-first approach to understand them better.
Agenda
Why use Functions?
What are the Functions?
Types of Python Functions
Built-in Functions in Python
User-defined Functions in Python
Python Lambda Function
Conclusion
Python Tutorial Playlist: https://goo.gl/WsBpKe
Blog Series: http://bit.ly/2sqmP4s
Follow us to never miss an update in the future.
Instagram: https://www.instagram.com/edureka_learning/
Facebook: https://www.facebook.com/edurekaIN/
Twitter: https://twitter.com/edurekain
LinkedIn: https://www.linkedin.com/company/edureka
The tar command stands for tape achieve, which is the most commonly used tape drive backup command used by the Linux system. It allows to quickly access a collection of files and placed them into a highly compressed archive file commonly called tar, gzip and bzip in Linux.
To visit www.excavatorinfo.com
Video: https://www.youtube.com/watch?v=JRFNIKUROPE . Talk for linux.conf.au 2017 (LCA2017) by Brendan Gregg, about Linux enhanced BPF (eBPF). Abstract:
A world of new capabilities is emerging for the Linux 4.x series, thanks to enhancements that have been included in Linux for to Berkeley Packet Filter (BPF): an in-kernel virtual machine that can execute user space-defined programs. It is finding uses for security auditing and enforcement, enhancing networking (including eXpress Data Path), and performance observability and troubleshooting. Many new open source tools that have been written in the past 12 months for performance analysis that use BPF. Tracing superpowers have finally arrived for Linux!
For its use with tracing, BPF provides the programmable capabilities to the existing tracing frameworks: kprobes, uprobes, and tracepoints. In particular, BPF allows timestamps to be recorded and compared from custom events, allowing latency to be studied in many new places: kernel and application internals. It also allows data to be efficiently summarized in-kernel, including as histograms. This has allowed dozens of new observability tools to be developed so far, including measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more.
This talk will summarize BPF capabilities and use cases so far, and then focus on its use to enhance Linux tracing, especially with the open source bcc collection. bcc includes BPF versions of old classics, and many new tools, including execsnoop, opensnoop, funcccount, ext4slower, and more (many of which I developed). Perhaps you'd like to develop new tools, or use the existing tools to find performance wins large and small, especially when instrumenting areas that previously had zero visibility. I'll also summarize how we intend to use these new capabilities to enhance systems analysis at Netflix.
Complete C++ programming Language CourseVivek chan
This is the Complete course of C++ Programming Language for Beginners. All Topics of C++ programming Language are covered in this single power point presentation.
Visit: www.cyberlabzone.com
Making NumPy-style and Pandas-style code faster and run in parallel. Continuum has been working on scaled versions of NumPy and Pandas for 4 years. This talk describes how Numba and Dask provide scaled Python today.
Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...Intel® Software
In part two of this presentation, continue to learn about the latest developments and tools for high-performance Python* for scikit-learn*, NumPy, SciPy, Pandas, mpi4py, and Numba*. Apply low-overhead profiling tools to analyze mixed C, C++, and Python applications to detect performance bottlenecks in the code and to pinpoint hotspots as the target for performance tuning. Get the best performance from your Python application with the best-known methods, tools, and libraries.
A lecture given for Stats 285 at Stanford on October 30, 2017. I discuss how OSS technology developed at Anaconda, Inc. has helped to scale Python to GPUs and Clusters.
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014PyData
Pythran is a an ahead of time compiler that turns modules written in a large subset of Python into C++ meta-programs that can be compiled into efficient native modules. It targets mainly compute intensive part of the code, hence it comes as no surprise that it focuses on scientific applications that makes extensive use of Numpy. Under the hood, Pythran inter-procedurally analyses the program and performs high level optimizations and parallel code generation. Parallelism can be found implicitly in Python intrinsics or Numpy operations, or explicitly specified by the programmer using OpenMP directives directly in the Python source code. Either way, the input code remains fully compatible with the Python interpreter. While the idea is similar to Parakeet or Numba, the approach differs significantly: the code generation is not performed at runtime but offline. Pythran generates C++11 heavily templated code that makes use of the NT2 meta-programming library and relies on any standard-compliant compiler to generate the binary code. We propose to walk through some examples and benchmarks, exposing the current state of what Pythran provides as well as the limit of the approach.
Imagine writing a pure Python library which can achieve the performance of Fortran or C/C++.
To this end we have developed Pyccel, which translates Python code to either Fortran or C, and makes the generated code callable from Python. The generated Fortran or C code is not only fast, but also human-readable; hence it can easily be profiled and optimized for the target machine.
Pyccel has a focus on high-performance computing applications, where the efficient usage of the available hardware resources is fundamental.
To this end it provides type annotations, function decorators, and OpenMP pragmas.
Pyccel is easy to use, is almost completely written in Python, and compares favourably against other Python accelerators.
Keynote talk at PyCon Estonia 2019 where I discuss how to extend CPython and how that has led to a robust ecosystem around Python. I then discuss the need to define and build a Python extension language I later propose as EPython on OpenTeams: https://openteams.com/initiatives/2
Fix typo of original tutorial slide.
Introduction of PyTorch
Explains PyTorch usages by a CNN example.
Describes the PyTorch modules (torch, torch.nn, torch.optim, etc) and the usages of multi-GPU processing.
Also gives examples for Recurrent Neural Network and Transfer Learning.
The slides shown here have been used for talks given to scientists in informal contexts.
Python is introduced as a valuable tool for both producing and evaluating data.
The talk is essentially a guided tour of the author's favourite parts of the Python ecosystem. Besides the Python language itself, NumPy and SciPy as well as Matplotlib are mentioned.
A last part of the talk concerns itself with code execution speed. With this problem in sight, Cython and f2py are introduced as means of glueing different languages together and speeding Python up.
The source code for the slides, code snippets and further links are available in a git repository at
https://github.com/aeberspaecher/PythonForScientists
Introduction of PyTorch
Explains PyTorch usages by a CNN example.
Describes the PyTorch modules (torch, torch.nn, torch.optim, etc) and the usages of multi-GPU processing.
Also gives examples for Recurrent Neural Network and Transfer Learning.
Travis Oliphant "Python for Speed, Scale, and Science"Fwdays
Python is sometimes discounted as slow because of its dynamic typing and interpreted nature and not suitable for scale because of the GIL. But, in this talk, I will show how with the help of talented open-source contributors around the world, we have been able to build systems in Python that are fast and scalable to many machines and how this has helped Python take over Science.
Slides for the Cluj.py meetup where we explored the inner workings of CPython, the reference implementation of Python. Includes examples of writing a C extension to Python, and introduces Cython - ultimately the sanest way of writing C extensions.
Also check out the code samples on GitHub: https://github.com/trustyou/meetups/tree/master/python-c
C++ open positions and popularity remain high as media has recently, and there is a reason for that: from the many languages and platforms that developers have available today, C++ features uncontested capabilities in power and performance, allowing innovation outside the box (just think on action games, natural user interfaces or augmented reality, to mention some). In this talk you’ll see the new features and technologies that are coming with Visual C++ vNext, helping you build compelling applications with a renewed developer experience. Don’t miss it!!
Euro python2011 High Performance PythonIan Ozsvald
I ran this as a 4 hour tutorial at EuroPython 2011 to teach High Performance Python coding.
Techniques covered include bottleneck analysis by profiling, bytecode analysis, converting to C using Cython and ShedSkin, use of the numerical numpy library and numexpr, multi-core and multi-machine parallelisation and using CUDA GPUs.
Write-up with 49 page PDF report: http://ianozsvald.com/2011/06/29/high-performance-python-tutorial-v0-1-from-my-4-hour-tutorial-at-europython-2011/
Natural Language Processing with CNTK and Apache Spark with Ali ZaidiDatabricks
Apache Spark provides an elegant API for developing machine learning pipelines that can be deployed seamlessly in production. However, one of the most intriguing and performant family of algorithms – deep learning – remains difficult for many groups to deploy in production, both because of the need for tremendous compute resources and also because of the inherent difficulty in tuning and configuring.
In this session, you’ll discover how to deploy the Microsoft Cognitive Toolkit (CNTK) inside of Spark clusters on the Azure cloud platform. Learn about the key considerations for administering GPU-enabled Spark clusters, configuring such workloads for maximum performance, and techniques for distributed hyperparameter optimization. You’ll also see a real-world example of training distributed deep learning learning algorithms for speech recognition and natural language processing.Microsoft Cognitive Toolkit (CNTK) inside of Spark clusters on the Azure cloud platform. We’ll discuss the key considerations for administering GPU-enabled Spark clusters, configuring such workloads for maximum performance, and techniques for distributed hyperparameter optimization. We’ll illustrate a real-world example of training distributed deep learning learning algorithms for speech recognition and natural language processing.
The presentation from SPbPython meetup about simple self-made just-in-time (JIT) compiler for Python code.
N-th Fibonacci sequence number returning function is JIT-ed in the example.
Similar to Numba: Array-oriented Python Compiler for NumPy (20)
At my first visit to SciPy in Latin America, I was able to review the history of PyData, SciPy, and NumFOCUS, and discuss how to grow its communities and cooperate in the future. I also introduce OpenTeams as a way for open-source contributors to grow their reputation and build businesses.
Talk given at first OmniSci user conference where I discuss cooperating with open-source communities to ensure you get useful answers quickly from your data. I get a chance to introduce OpenTeams in this talk as well and discuss how it can help companies cooperate with communities.
Python is dominating the fast-growing data-science landscape. This talk provides a foundational overview of the practice of data science and some of the most popular Python libraries for doing data science. It also provides an overview of how Anaconda brings it all together.
With Dask and Numba, you can NumPy-like and Pandas-like code and have it run very fast on multi-core systems as well as at scale on many-node clusters.
With Anaconda (in particular Numba and Dask) you can scale up your NumPy and Pandas stack to many cpus and GPUs as well as scale-out to run on clusters of machines including Hadoop.
Using Anaconda to light up dark data. My talk given to the Berkeley Institute of Data Science describing Anaconda and the Blaze ecosystem for bringing a virtual analytical database to your data.
Conda is a cross-platform package manager that lets you quickly and easily build environments containing complicated software stacks. It was built to manage the NumPy stack in Python but can be used to manage any complex software dependencies.
Blaze: a large-scale, array-oriented infrastructure for PythonTravis Oliphant
This talk gives a high-level overview of the motivation, design goals, and status of the Blaze project from Continuum Analytics which is a large-scale array object for Python.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Numba: Array-oriented Python Compiler for NumPy
1. Numba: An array-oriented
Python compiler
SIAM Conference on Computational
Science and Engineering
Travis E. Oliphant
February 25, 2012
2. Big Picture
Empower domain experts with
high-level tools that exploit modern
hard-ware
?
Array Oriented Computing
3. Software Stack Future?
Plateaus of Code re-use + DSLs
SQL R
TDPL Matlab
Python
OBJC Julia C
FORTRAN C++
LLVM
4. Motivation
• Python is great for rapid development and
high-level thinking-in-code
• It is slow for interior loops because lack of
type information leads to a lot of indirection
and “extra” code.
5. Motivation
• NumPy users have had a lot of type
information for a long time --- but only
currently have one-size fits all pre-
compiled, vectorized loops.
• Idea is to use this type information to
allow compilation of arbitrary
expressions involving NumPy arrays
6. Current approaches
• Cython
• Weave
• Write fast-code in C/C++/Fortran and “wrap” with
- SWIG or Cython
- f2py or fwrap
- hand-written
Numba philosophy : don’t wrap or rewrite
--- just decorate
7. Simple API
• jit --- provide type information (fastest to call)
• autojit --- detects input types, infers output, generates
code if needed, and dispatches (a little more
expensive to call)
@jit('void(double[:,:], double, double)')
#@autojit
def numba_update(u, dx2, dy2):
nx, ny = u.shape
for i in xrange(1,nx-1):
for j in xrange(1, ny-1):
u[i,j] = ((u[i+1,j] + u[i-1,j]) * dy2 +
(u[i,j+1] + u[i,j-1]) * dx2) / (2*(dx2+dy2))
9. Image Processing ~1500x speed-up
@jit('void(f8[:,:],f8[:,:],f8[:,:])')
def filter(image, filt, output):
M, N = image.shape
m, n = filt.shape
for i in range(m//2, M-m//2):
for j in range(n//2, N-n//2):
result = 0.0
for k in range(m):
for l in range(n):
result += image[i+k-m//2,j+l-n//2]*filt[k, l]
output[i,j] = result
10. NumPy + Mamba = Numba
Python Function Machine Code
LLVM-PY
LLVM Library
ISPC OpenCL OpenMP CUDA CLANG
Intel AMD Nvidia Apple ARM
14. Fast vectorize
NumPy’s ufuncs take “kernels” and
apply the kernel element-by-element
over entire arrays
Write kernels in
from numbapro import vectorize
from math import sin
Python!
@vectorize([‘f8(f8)’, ‘f4(f4)’])
def sinc(x):
if x==0.0:
return 1.0
else:
return sin(x*pi)/(pi*x)
15. Updated Laplace Example
https://github.com/teoliphant/speed.git
Version Time Speed Up
NumPy 3.19 1.0
Numba 2.32 1.38
Vect. Numba 2.33 1.37
Cython 2.38 1.34
Weave 2.47 1.29
Numexpr 2.62 1.22
Fortran Loops 2.30 1.39
Vect. Fortran 1.50 2.13
16. Many Advanced Features
• Extension classes (jit a class)
• Struct support (NumPy arrays can be structs)
• SSA --- can refer to local variables as different types
• Typed lists and typed dictionaries coming
• pointer support
• calling ctypes and CFFI functions natively
• pycc (create stand-alone dynamic library and executable)
• pycc --python (create static extension module for Python)
17. Ufuncs
Generalized
UFuncs
Python
Function
Window
Kernel
Funcs
Function-
Uses of Numba
based
Indexing
Memory
Filters
Numba
NumPy Runtime
I/O Filters
Reduction
Filters
Computed
Columns
function pointer
18. Status
• Main team sponsored by Continuum Analytics and
composed of:
- Travis Oliphant (NumPy, SciPy)
- Jon Riehl (Mython, PyFront, Basil, ...)
- Mark Florrison (minivect, Cython)
- Siu Kwan Lam (pymothoa, llvmpy)
• Rapid progress this year
• Version 0.6 released first of February
• Version 0.7 next week numba.pydata.org
• Version 0.8 first of April
• Stable API (jit, autojit) easy to use
• Full Python support by 1.0 at end of summer
• Should be able to write equivalent of NumPy and
SciPy with Numba
19. Introducing NumbaPro
• Create parallel-for loops
• Parallel execution of
ufuncs
fast development and fast • Run ufuncs on the GPU
execution! • Write CUDA directly in
Python!
Python and NumPy compiled to
• Free for Academics
Parallel Architectures
(GPUs and multi-core
machines)
20. Create parallel-for loops
import numbapro # import first to make prange available
from numba import autojit, prange
@autojit
def parallel_sum2d(a):
sum = 0.0
for i in prange(a.shape[0]):
for j in range(a.shape[1]):
sum += a[i,j]
21. Ufuncs in parallel (multi-core or GPU)
from numbapro import vectorize
from math import sin
@vectorize([‘f8(f8)’, ‘f4(f4)’], target=‘gpu’)
def sinc(x):
if x==0.0:
return 1.0
else:
return sin(x*pi)/(pi*x)
@vectorize([‘f8(f8)’, ‘f4(f4)’], target=‘parallel’)
def sinc2(x):
if x==0.0:
return 1.0
else:
return sin(x*pi)/(pi*x)
22. Introducing CUDA-Python
from numbapro import cuda
from numba import autojit
@autojit(target=‘gpu’)
def array_scale(src, dst, scale):
tid = cuda.threadIdx.x
blkid = cuda.blockIdx.x
blkdim = cuda.blockDim.x CUDA Development
i = tid + blkid * blkdim
if i >= n: using Python syntax!
return
dst[i] = src[i] * scale
src = np.arange(N, dtype=np.float)
dst = np.empty_like(src)
array_scale[grid, block](src, dst, 5.0)
23. Example: Matrix multiply
@cuda.jit(argtypes=[f4[:,:], f4[:,:], f4[:,:]])
def cu_square_matrix_mul(A, B, C):
sA = cuda.shared.array(shape=(tpb, tpb),
dtype=f4)
sB = cuda.shared.array(shape=(tpb, tpb),
dtype=f4) bpg = 50
tpb = 32
tx = cuda.threadIdx.x n = bpg * tpb
ty = cuda.threadIdx.y
bx = cuda.blockIdx.x A = np.array(np.random.random((n, n)),
by = cuda.blockIdx.y dtype=np.float32)
bw = cuda.blockDim.x
bh = cuda.blockDim.y
B = np.array(np.random.random((n, n)),
dtype=np.float32)
x = tx + bx * bw C = np.empty_like(A)
y = ty + by * bh
stream = cuda.stream()
acc = 0. with stream.auto_synchronize():
for i in range(bpg): dA = cuda.to_device(A, stream)
if x < n and y < n: dB = cuda.to_device(B, stream)
sA[ty, tx] = A[y, tx + i * tpb]
sB[ty, tx] = B[ty + i * tpb, x]
dC = cuda.to_device(C, stream)
cu_square_matrix_mul[(bpg, bpg),
cuda.syncthreads() (tpb, tpb), stream](dA, dB, dC)
dC.to_host(stream)
if x < n and y < n:
for j in range(tpb):
acc += sA[ty, j] * sB[j, tx]
cuda.syncthreads()
if x < n and y < n:
C[y, x] = acc
27. NumFOCUS
www.numfocus.org
501(c)3 Public Charity
28. NumFOCUS Mission
• Sponsor development of high-level languages and
libraries for science
• Foster teaching of array-oriented and higher-order
computational approaches and applied
computational science
• Promote the use of open code in science and
encourage reproducible and accessible research
29. NumFOCUS Activities
• Sponsor sprints and conferences
• Provide scholarships and grants
• Provide bounties and prizes for code development
• Pay for freely-available documentation and basic
course development
• Equipment grants
• Sponsor BootCamps
• Raise funds from industries using open source
high-level languages