Talk given at the June 2008 meeting of the New Zealand Python User Group in Auckland.
Outline: An overview of approaches to parallel/concurrent programming in Python.
Code demonstrated in the presentation can be found here:
http://www.kloss-familie.de/moin/TalksPresentations
This document provides an introduction to parallel programming using Python. It discusses the motivation for parallel programming by utilizing multiple CPU cores simultaneously. The two main approaches in Python are forking processes using os.fork and spawning threads. It provides examples of forking processes and using threads via the _thread and threading modules. It also discusses challenges like synchronizing access to shared objects and introduces solutions like the multiprocessing module for interprocess communication.
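The fork-based approach mentioned above can be sketched as follows. This is a minimal illustration, not code from the talk; note that os.fork is POSIX-only and will not work on Windows:

```python
import os

def forked_work():
    """Fork a child process; parent and child then run concurrently."""
    pid = os.fork()  # returns 0 in the child, the child's pid in the parent
    if pid == 0:
        # Child process: do some work, then exit without parent-side cleanup
        print("child: pid", os.getpid())
        os._exit(0)
    else:
        # Parent process: wait for the child to finish, then return its pid
        os.waitpid(pid, 0)
        return pid

if __name__ == "__main__":
    child_pid = forked_work()
    print("parent: reaped child", child_pid)
```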
This document provides an overview of concurrency in Python using multiprocessing and threading. It begins by introducing the speaker and defining key terms like concurrency, threads, and processes. It then discusses the benefits and use cases of threads versus processes. The document also covers the Global Interpreter Lock (GIL) in Python and how multiprocessing can help avoid it. It provides an example benchmark showing multiprocessing can significantly outperform threading for CPU-bound tasks. Finally, it discusses key aspects of Python's multiprocessing module like Process, Queue, Pool, and Manager classes.
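As a hedged sketch of the Pool usage mentioned above (a hypothetical CPU-bound task, not the talk's own benchmark): each task runs in a separate process, so the GIL does not serialize the work.

```python
from multiprocessing import Pool

def cpu_bound(n):
    """A CPU-bound task: sum of squares below n."""
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # map distributes the tasks across the worker processes
        results = pool.map(cpu_bound, [10_000] * 8)
    print(results[0])
```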
Concurrency and parallelism in Python are always hot topics. This talk will look at the variety of forms of concurrency and parallelism. In particular, it will give an overview of the forms of message-passing concurrency that have become popular in languages like Scala and Go. A Python library called python-csp, which implements similar ideas in a Pythonic way, will be introduced, and we will look at how this style of programming can be used to avoid deadlocks, race hazards and "callback hell".
Do more than one thing at the same time, the Python way (Jaime Buelta)
The document discusses doing more than one thing at a time in Python using threads and processes. It describes how to create threads using the threading module and processes using the multiprocessing module. While threads are easier to use, the Global Interpreter Lock (GIL) in Python prevents true parallelism. Processes can better utilize multiple CPUs but require more work for communication. Asynchronous programming is recommended for I/O-bound tasks while processes are better for CPU-bound work. The talk cautions that threading should be used carefully in Python due to the GIL.
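A minimal sketch of creating threads with the threading module, assuming a simulated I/O-bound task (time.sleep stands in for network latency; real code might call urllib or requests):

```python
import threading
import time

def fetch(url, results, index):
    """Simulated I/O-bound task writing its result into a shared list slot."""
    time.sleep(0.1)  # stand-in for waiting on the network
    results[index] = f"done: {url}"

def run_all(urls):
    results = [None] * len(urls)
    threads = [threading.Thread(target=fetch, args=(u, results, i))
               for i, u in enumerate(urls)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # the waits overlap, so total time is ~0.1 s, not 0.1 s per URL
    return results

if __name__ == "__main__":
    print(run_all(["a", "b", "c"]))
```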
This document summarizes Gavin M. Roy's presentation on concurrency with multiprocessing in Python. It discusses using threads via the threading module, issues with the Global Interpreter Lock (GIL) in Python, and how to use the multiprocessing module to achieve true parallelism across multiple processes. It provides examples of creating threads and processes that run concurrently and examples of how to share objects between processes using connections, queues, pipes, managers and reduction tools.
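One of the sharing mechanisms listed above, a multiprocessing.Queue, can be sketched as follows (an illustrative toy, not an example from the presentation):

```python
from multiprocessing import Process, Queue

def worker(q, n):
    """Child process: push a result back to the parent through the shared queue."""
    q.put(n * n)

if __name__ == "__main__":
    q = Queue()
    procs = [Process(target=worker, args=(q, n)) for n in range(4)]
    for p in procs:
        p.start()
    # Order of arrival is not deterministic, so sort for a stable result
    results = sorted(q.get() for _ in procs)
    for p in procs:
        p.join()
    print(results)
```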
Presentation given on Monday 10 September at the ROOT Users' Workshop 2018 in Sarajevo. Progress update on the Automated Parallel Computation of Collaborative Statistical Models project, a collaboration between the Netherlands eScience Center and Nikhef.
We present an update on our recent efforts to further parallelize RooFit. We have performed extensive benchmarks and identified at least three bottlenecks that will benefit from parallelization. To tackle these and possible future bottlenecks, we designed a parallelization layer that allows us to parallelize existing classes with minimal effort, but with high performance and retaining as much of the existing class's interface as possible. The high-level parallelization model is a task-stealing approach. The implementation is currently based on the bi-directional memory mapped pipe (BidirMMapPipe), but could in the future be replaced by other modes of communication between processes.
This presentation summarizes the talks about TensorFlow.Data and TensorFlow.Hub from TensorFlow Dev Summit 2018, and was presented at TensorFlow Dev Summit Extended Seoul '18, held on April 14, 2018 in Seoul.
Writing concurrent code is becoming more and more important to leverage the parallelism of multicore architectures. The C++11 library introduced futures and promises as a first step towards task-based programming. However, C++ support for concurrency is still very limited. Other languages, like C# and Python, provide some form of resumable functions or coroutines, and in C# the async/await pattern makes it possible to write functions that suspend their execution while waiting for a computation or I/O to complete. This talk will describe a proposal for the addition of resumable functions and async/await to C++17. We will focus on the implementation of resumable functions on Windows, and we'll play with a first prototype of their implementation in the Visual Studio 2015 Preview. Finally, we will see how resumable functions can also be used to implement (lazy) generators, similar to the ones provided by "yield" statements in C#.
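Python's generators already provide the lazy, yield-based suspension the talk compares against; a minimal sketch of the idea in Python:

```python
from itertools import islice

def fibonacci():
    """A lazy generator: execution suspends at each yield and resumes on demand."""
    a, b = 0, 1
    while True:
        yield a          # suspend here; resume when the caller asks for the next value
        a, b = b, a + b

# Values are computed only as they are consumed
first_ten = list(islice(fibonacci(), 10))
print(first_ten)
```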
Dark Silicon, Mobile Devices, and Possible Open-Source Solutions (Koan-Sin Tan)
This document summarizes a presentation about dark silicon in mobile devices and possible open source solutions. It discusses how power and thermal constraints are more severe for mobile devices due to limited battery progress and no fans. It also covers big.LITTLE scheduling, thread-level parallelism challenges, and user-level threading libraries like AsyncTask. Finally, it notes that while some open source parallel programming frameworks exist, fully utilizing parallelism on mobile and addressing dark silicon remain challenges with no widely adopted solutions.
Some of the biggest issues at the center of analyzing large amounts of data are query flexibility, latency, and fault tolerance. Modern technologies that build upon the success of “big data” platforms, such as Apache Hadoop, have made it possible to spread the load of data analysis to commodity machines, but these analyses can still take hours to run and do not respond well to rapidly-changing data sets.
A new generation of data processing platforms -- which we call “stream architectures” -- have converted data sources into streams of data that can be processed and analyzed in real-time. This has led to the development of various distributed real-time computation frameworks (e.g. Apache Storm) and multi-consumer data integration technologies (e.g. Apache Kafka). Together, they offer a way to do predictable computation on real-time data streams.
In this talk, we will give an overview of these technologies and how they fit into the Python ecosystem. As part of this presentation, we also released streamparse, a new Python library that makes it easy to debug and run large Storm clusters.
Links:
* http://parse.ly/code
* https://github.com/Parsely/streamparse
* https://github.com/getsamsa/samsa
The document discusses attention mechanisms and their implementation in TensorFlow. It begins with an overview of attention mechanisms and their use in neural machine translation. It then reviews the code implementation of an attention mechanism for neural machine translation from English to French using TensorFlow. Finally, it briefly discusses pointer networks, an attention mechanism variant, and code implementation of pointer networks for solving sorting problems.
This document discusses using TensorFlow on Android. It begins by introducing TensorFlow and how it works as a dataflow graph. It then discusses efforts to optimize TensorFlow for mobile and embedded devices through techniques like quantization and models like MobileNet that use depthwise separable convolutions. It shares experiences building and running TensorFlow models on Android, including benchmarking an Inception model and building a label_image demo. It also compares TensorFlow mobile efforts to other mobile deep learning frameworks like CoreML and the upcoming Android Neural Networks API.
Java and the machine - Martijn Verburg and Kirk Pepperdine (JAX London)
In Terminator 3 - Rise of the Machines, bare metal comes back to haunt humanity, ruthlessly crushing all resistance. This keynote is here to warn you that the same thing is happening to Java and the JVM! Java was designed in a world where there were a wide range of hardware platforms to support. Its premise of Write Once Run Anywhere (WORA) proved to be one of the compelling reasons behind Java's dominance (even if the reality didn't quite meet the marketing hype). However, this WORA property means that Java and the JVM struggled to utilise specialist hardware and operating system features that could make a massive difference in the performance of your application. This problem has recently gotten much, much worse. Due to the rise of multi-core processors, massive increases in main memory and enhancements to other major hardware components (e.g. SSD), the JVM is now distant from utilising that hardware, causing some major performance and scalability issues! Kirk Pepperdine and Martijn Verburg will take you through the complexities of where Java meets the machine and loses. They'll give up some of their hard-won insights on how to work around these issues so that you can plan to avoid termination, unlike some of the poor souls that ran into the T-800...
Python Training in Bangalore | Multi threading | Learnbay.in (Learnbayin)
Learn multithreading in Python: how to create threads in Python, and about race conditions.
Learnbay provides Python training in Bangalore for network automation.
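A standard illustration of the race condition mentioned above and its fix with threading.Lock (a generic textbook example, not Learnbay's course material):

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    """Each 'counter += 1' is a read-modify-write; without the lock, concurrent
    threads can interleave those steps and lose updates (a race condition)."""
    global counter
    for _ in range(n):
        with lock:       # remove this lock to observe lost updates
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # with the lock this is always 400000
```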
Essentials of Multithreaded System Programming in C++ (Shuo Chen)
This document discusses challenges in multithreaded system programming in C++. It covers topics such as thread safety of libraries, RAII and fork(), signals and threads, and operating file descriptors in threads. The document is intended for C++ programmers familiar with threads and aims to explain interactions between threads and system calls/libraries to avoid common issues.
This document provides an overview and introduction to Muduo, a C++ network programming library for Linux. Some key points:
- Muduo is a non-blocking, event-driven, multi-core ready C++ network library that aims to provide high performance and modern features.
- The document discusses challenges with network programming using sockets APIs directly and how a library like Muduo can help abstract away complexity.
- It covers core concepts in non-blocking and event-driven network programming used by Muduo like the event loop, callbacks, and lifetime management of connection objects.
- Examples are provided of how Muduo implements patterns like chat servers, and comparisons are made to other libraries.
The document discusses intra-machine parallelism and threaded programming. It introduces key concepts like threads, processes, synchronization constructs (locks and condition variables), and challenges like overhead and Amdahl's law. An example of domain decomposition for parallel rendering is presented to demonstrate how to divide a problem into independent tasks and assign them to threads.
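The domain-decomposition idea can be sketched in Python: split the data into independent chunks, one per thread, and combine the partial results at the end (a toy summation rather than the rendering example from the talk):

```python
import threading

def partial_sum(data, lo, hi, out, i):
    """Each thread works on its own slice; the only shared write is out[i]."""
    out[i] = sum(data[lo:hi])

def decomposed_sum(data, nthreads=4):
    # Domain decomposition: split the index range into contiguous chunks
    chunk = (len(data) + nthreads - 1) // nthreads
    out = [0] * nthreads
    threads = []
    for i in range(nthreads):
        lo, hi = i * chunk, min((i + 1) * chunk, len(data))
        t = threading.Thread(target=partial_sum, args=(data, lo, hi, out, i))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    return sum(out)  # combine the independent partial results

print(decomposed_sum(list(range(1000))))
```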
[Harvard CS264] 02 - Parallel Thinking, Architecture, Theory & Patternsnpinto
This document outlines the topics that will be covered in the course on massively parallel computing, including computational thinking skills for parallel programming, hardware limitations and constraints on algorithms, and common parallel programming patterns. The topics include thinking in parallel, computer architecture, programming models, theoretical concepts, and parallel programming patterns. The goal is to provide students with the skills needed to design efficient parallel algorithms that maximize performance on modern parallel hardware.
Efficient logging in multithreaded C++ server (Shuo Chen)
This document discusses efficient logging in multithreaded C++ servers. It describes the muduo logging library which can log over 1 million messages per second with low latency. The key aspects are an efficient LogStream frontend, asynchronous backend using double buffering to pass log messages from threads to a log writer thread without blocking, and writing to local files for performance and reliability.
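The double-buffering idea can be sketched in Python rather than C++ (in muduo, a dedicated writer thread swaps and flushes the buffers on a timer or when a buffer fills; here flush is called explicitly to keep the sketch small):

```python
import threading

class AsyncLogger:
    """Sketch of double buffering: producers append to a front buffer; the
    writer swaps in an empty buffer and flushes the full one in one batch."""

    def __init__(self):
        self.front = []              # buffer currently receiving messages
        self.flushed = []            # stands in for the log file
        self.lock = threading.Lock()

    def log(self, msg):
        with self.lock:              # producers hold the lock only to append
            self.front.append(msg)

    def flush(self):
        with self.lock:              # swap buffers under the lock...
            batch, self.front = self.front, []
        self.flushed.extend(batch)   # ...then write the batch outside it
```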
In this talk, I briefly look back at TensorFlow's development over the past year, then discuss the overall development direction of machine learning frameworks, with an introduction to features that will be added to TensorFlow later on according to its roadmap, and touch on framework trends for 2017 and 2018.
Published on 11 May 2018
Chainer is a deep learning framework which is flexible, intuitive, and powerful.
This slide introduces some unique features of Chainer and its additional packages such as ChainerMN (distributed learning), ChainerCV (computer vision), ChainerRL (reinforcement learning), Chainer Chemistry (biology and chemistry), and ChainerUI (visualization).
This document discusses Zurg, a distributed process management system with a master-slave architecture. The Zurg slave runs on each host and can run commands, start and monitor applications, and collect performance data. It communicates with the Zurg master. Some challenges discussed include reliably detecting when processes exit, limiting output, and ensuring processes are properly restarted if the slave crashes. The master will store status information accessible via web interfaces.
Suggestions:
1) For best quality, download the PDF before viewing.
2) Open at least two windows: One for the Youtube video, one for the screencast (link below), and optionally one for the slides themselves.
3) The Youtube video is shown on the first page of the slide deck, for slides, just skip to page 2.
Screencast: http://youtu.be/VoL7JKJmr2I
Video recording: http://youtu.be/CJRvb8zxRdE (Thanks to Al Friedrich!)
In this talk, we take Deep Learning to task with real world data puzzles to solve.
Data:
- Higgs binary classification dataset (10M rows, 29 cols)
- MNIST 10-class dataset
- Weather categorical dataset
- eBay text classification dataset (8500 cols, 500k rows, 467 classes)
- ECG heartbeat anomaly detection
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
This document discusses designing for concurrency in Golang. It provides two examples of concurrent systems and how they can be modeled using Goroutines. Goroutines are lightweight threads that allow implementing one goroutine per concurrent activity. This approach avoids sharing memory and uses message passing with channels for communication. The document concludes that Goroutines are cheap to create and fast to schedule, enabling designs that don't share memory and have many concurrent Goroutines. Code samples and references for further reading on Golang concurrency patterns are also provided.
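A rough Python analogue of the goroutine-and-channel pattern described above, using queue.Queue in the role of a channel and a None sentinel in place of closing it (Go's real channels and scheduler are far cheaper than OS threads):

```python
import threading
import queue

def producer(ch, items):
    """One worker per concurrent activity; communicate by messages, not shared state."""
    for item in items:
        ch.put(item)
    ch.put(None)  # sentinel: no more items

def consumer(ch, results):
    while True:
        item = ch.get()
        if item is None:
            break
        results.append(item * 2)

ch = queue.Queue()  # plays the role of a Go channel
results = []
threads = [threading.Thread(target=producer, args=(ch, [1, 2, 3])),
           threading.Thread(target=consumer, args=(ch, results))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```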
Caffe (Convolutional Architecture for Fast Feature Embedding) is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors.
Caffe’s expressive architecture encourages application and innovation. Models and optimization are defined by configuration without hard-coding. Switch between CPU and GPU by setting a single flag to train on a GPU machine, then deploy to commodity clusters or mobile devices. Caffe’s extensible code fosters active development. In Caffe’s first year, it has been forked by over 1,000 developers and had many significant changes contributed back. Thanks to these contributors the framework tracks the state-of-the-art in both code and models. Speed makes Caffe perfect for research experiments and industry deployment. Caffe can process over 60M images per day with a single NVIDIA K40 GPU*. That’s 1 ms/image for inference and 4 ms/image for learning. We believe that Caffe is the fastest convnet implementation available. Caffe already powers academic research projects, startup prototypes, and even large-scale industrial applications in vision, speech, and multimedia. Join our community of brewers on the caffe-users group and Github.
This tutorial is designed to equip researchers and developers with the tools and know-how needed to incorporate deep learning into their work. Both the ideas and implementation of state-of-the-art deep learning models will be presented. While deep learning and deep features have recently achieved strong results in many tasks, a common framework and shared models are needed to advance further research and applications and reduce the barrier to entry. To this end we present the Caffe framework, public reference models, and working examples for deep learning. Join our tour from the 1989 LeNet for digit recognition to today’s top ILSVRC14 vision models. Follow along with do-it-yourself code notebooks. While focusing on vision, general techniques are covered.
This document provides an introduction to parallel programming using Python. It discusses the motivation for parallel programming being to utilize idle CPU capacity. The two main ways to run tasks in parallel in Python are using process forks and spawned threads. It then covers forking processes, threads, interprocess communication, and the multiprocessing module in Python for parallel programming.
This document provides an introduction to concurrency in Python using threads. It discusses how threads allow programs to perform multiple tasks simultaneously by sharing system resources like memory. The document covers basic threading concepts like creating and launching threads, as well as challenges like accessing shared data between threads, which can be non-deterministic due to thread scheduling. It aims to provide an overview of concurrency support in the Python standard library beyond just the user manual.
Dark Silicon, Mobile Devices, and Possible Open-Source SolutionsKoan-Sin Tan
This document summarizes a presentation about dark silicon in mobile devices and possible open source solutions. It discusses how power and thermal constraints are more severe for mobile devices due to limited battery progress and no fans. It also covers big.LITTLE scheduling, thread-level parallelism challenges, and user-level threading libraries like AsyncTask. Finally, it notes that while some open source parallel programming frameworks exist, fully utilizing parallelism on mobile and addressing dark silicon remain challenges with no widely adopted solutions.
Some of the biggest issues at the center of analyzing large amounts of data are query flexibility, latency, and fault tolerance. Modern technologies that build upon the success of “big data” platforms, such as Apache Hadoop, have made it possible to spread the load of data analysis to commodity machines, but these analyses can still take hours to run and do not respond well to rapidly-changing data sets.
A new generation of data processing platforms -- which we call “stream architectures” -- have converted data sources into streams of data that can be processed and analyzed in real-time. This has led to the development of various distributed real-time computation frameworks (e.g. Apache Storm) and multi-consumer data integration technologies (e.g. Apache Kafka). Together, they offer a way to do predictable computation on real-time data streams.
In this talk, we will give an overview of these technologies and how they fit into the Python ecosystem. As part of this presentation, we also released streamparse, a new Python that makes it easy to debug and run large Storm clusters.
Links:
* http://parse.ly/code
* https://github.com/Parsely/streamparse
* https://github.com/getsamsa/samsa
The document discusses attention mechanisms and their implementation in TensorFlow. It begins with an overview of attention mechanisms and their use in neural machine translation. It then reviews the code implementation of an attention mechanism for neural machine translation from English to French using TensorFlow. Finally, it briefly discusses pointer networks, an attention mechanism variant, and code implementation of pointer networks for solving sorting problems.
This document discusses using TensorFlow on Android. It begins by introducing TensorFlow and how it works as a dataflow graph. It then discusses efforts to optimize TensorFlow for mobile and embedded devices through techniques like quantization and models like MobileNet that use depthwise separable convolutions. It shares experiences building and running TensorFlow models on Android, including benchmarking an Inception model and building a label_image demo. It also compares TensorFlow mobile efforts to other mobile deep learning frameworks like CoreML and the upcoming Android Neural Networks API.
Java and the machine - Martijn Verburg and Kirk PepperdineJAX London
In Terminator 3 - Rise of the Machines, bare metal comes back to haunt humanity, ruthlessly crushing all resistance. This keynote is here to warn you that the same thing is happening to Java and the JVM! Java was designed in a world where there were a wide range of hardware platforms to support. Its premise of Write Once Run Anywhere (WORA) proved to be one of the compelling reasons behind Java's dominance (even if the reality didn't quite meet the marketing hype). However, this WORA property means that Java and the JVM struggled to utilise specialist hardware and operating system features that could make a massive difference in the performance of your application. This problem has recently gotten much, much worse. Due to the rise of multi-core processors, massive increases in main memory and enhancements to other major hardware components (e.g. SSD), the JVM is now distant from utilising that hardware, causing some major performance and scalability issues! Kirk Pepperdine and Martijn Verburg will take you through the complexities of where Java meets the machine and loses. They'll give up some of their hard-won insights on how to work around these issues so that you can plan to avoid termination, unlike some of the poor souls that ran into the T-800...
Python Training in Bangalore | Multi threading | Learnbay.inLearnbayin
Learn Multi threading in Python .How to create threads in Python.
About Race Condition.
Learnbay provides python training in Bangalore for network automation.
Essentials of Multithreaded System Programming in C++Shuo Chen
This document discusses challenges in multithreaded system programming in C++. It covers topics such as thread safety of libraries, RAII and fork(), signals and threads, and operating file descriptors in threads. The document is intended for C++ programmers familiar with threads and aims to explain interactions between threads and system calls/libraries to avoid common issues.
This document provides an overview and introduction to Muduo, a C++ network programming library for Linux. Some key points:
- Muduo is a non-blocking, event-driven, multi-core ready C++ network library that aims to provide high performance and modern features.
- The document discusses challenges with network programming using sockets APIs directly and how a library like Muduo can help abstract away complexity.
- It covers core concepts in non-blocking and event-driven network programming used by Muduo like the event loop, callbacks, and lifetime management of connection objects.
- Examples are provided of how Muduo implements patterns like chat servers and comparisons are made to other libraries
The document discusses intra-machine parallelism and threaded programming. It introduces key concepts like threads, processes, synchronization constructs (locks and condition variables), and challenges like overhead and Amdahl's law. An example of domain decomposition for parallel rendering is presented to demonstrate how to divide a problem into independent tasks and assign them to threads.
[Harvard CS264] 02 - Parallel Thinking, Architecture, Theory & Patternsnpinto
This document outlines the topics that will be covered in the course on massively parallel computing, including computational thinking skills for parallel programming, hardware limitations and constraints on algorithms, and common parallel programming patterns. The topics include thinking in parallel, computer architecture, programming models, theoretical concepts, and parallel programming patterns. The goal is to provide students with the skills needed to design efficient parallel algorithms that maximize performance on modern parallel hardware.
Efficient logging in multithreaded C++ serverShuo Chen
This document discusses efficient logging in multithreaded C++ servers. It describes the muduo logging library which can log over 1 million messages per second with low latency. The key aspects are an efficient LogStream frontend, asynchronous backend using double buffering to pass log messages from threads to a log writer thread without blocking, and writing to local files for performance and reliability.
이 발표에서는 TensorFlow의 지난 1년을 간단하게 돌아보고, TensorFlow의 차기 로드맵에 따라 개발 및 도입될 예정인 여러 기능들을 소개합니다. 또한 2017년 및 2018년의 머신러닝 프레임워크 개발 트렌드와 방향에 대한 이야기도 함께 합니다.
In this talk, I look back the TensorFlow development over the past year. Then discusses the overall development direction of machine learning frameworks, with an introduction to features that will be added to TensorFlow later on.
Published on 11 may, 2018
Chainer is a deep learning framework which is flexible, intuitive, and powerful.
This slide introduces some unique features of Chainer and its additional packages such as ChainerMN (distributed learning), ChainerCV (computer vision), ChainerRL (reinforcement learning), Chainer Chemistry (biology and chemistry), and ChainerUI (visualization).
This document discusses Zurg, a distributed process management system with a master-slave architecture. The Zurg slave runs on each host and can run commands, start and monitor applications, and collect performance data. It communicates with the Zurg master. Some challenges discussed include reliably detecting when processes exit, limiting output, and ensuring processes are properly restarted if the slave crashes. The master will store status information accessible via web interfaces.
Suggestions:
1) For best quality, download the PDF before viewing.
2) Open at least two windows: One for the Youtube video, one for the screencast (link below), and optionally one for the slides themselves.
3) The Youtube video is shown on the first page of the slide deck, for slides, just skip to page 2.
Screencast: http://youtu.be/VoL7JKJmr2I
Video recording: http://youtu.be/CJRvb8zxRdE (Thanks to Al Friedrich!)
In this talk, we take Deep Learning to task with real world data puzzles to solve.
Data:
- Higgs binary classification dataset (10M rows, 29 cols)
- MNIST 10-class dataset
- Weather categorical dataset
- eBay text classification dataset (8500 cols, 500k rows, 467 classes)
- ECG heartbeat anomaly detection
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
This document discusses designing for concurrency in Golang. It provides two examples of concurrent systems and how they can be modeled using Goroutines. Goroutines are lightweight threads that allow implementing one goroutine per concurrent activity. This approach avoids sharing memory and uses message passing with channels for communication. The document concludes that Goroutines are cheap to create and fast to schedule, enabling designs that don't share memory and have many concurrent Goroutines. Code samples and references for further reading on Golang concurrency patterns are also provided.
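Sticking to Python for this page's examples, the goroutine-plus-channel pattern maps onto threads communicating over `queue.Queue`; a minimal sketch of "don't share memory, pass messages":

```python
import queue
import threading

def producer(ch):
    for i in range(5):
        ch.put(i)                 # send a message on the "channel"
    ch.put(None)                  # sentinel: no more messages

def consumer(ch, out):
    while (item := ch.get()) is not None:
        out.append(item * item)   # only the consumer ever touches `out`

ch = queue.Queue()                # plays the role of a Go channel
out = []
threads = [
    threading.Thread(target=producer, args=(ch,)),
    threading.Thread(target=consumer, args=(ch, out)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(out)  # [0, 1, 4, 9, 16]
```

Python threads are far heavier than goroutines, so this mirrors the communication style rather than the one-goroutine-per-activity scalability.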
Caffe (Convolutional Architecture for Fast Feature Embedding) is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors.
Caffe’s expressive architecture encourages application and innovation. Models and optimization are defined by configuration without hard-coding. Switch between CPU and GPU by setting a single flag to train on a GPU machine, then deploy to commodity clusters or mobile devices. Caffe’s extensible code fosters active development. In Caffe’s first year, it has been forked by over 1,000 developers and had many significant changes contributed back. Thanks to these contributors, the framework tracks the state of the art in both code and models. Speed makes Caffe perfect for research experiments and industry deployment. Caffe can process over 60M images per day with a single NVIDIA K40 GPU*. That’s 1 ms/image for inference and 4 ms/image for learning. We believe that Caffe is the fastest convnet implementation available. Caffe already powers academic research projects, startup prototypes, and even large-scale industrial applications in vision, speech, and multimedia. Join our community of brewers on the caffe-users group and Github.
This tutorial is designed to equip researchers and developers with the tools and know-how needed to incorporate deep learning into their work. Both the ideas and implementation of state-of-the-art deep learning models will be presented. While deep learning and deep features have recently achieved strong results in many tasks, a common framework and shared models are needed to advance further research and applications and reduce the barrier to entry. To this end we present the Caffe framework, public reference models, and working examples for deep learning. Join our tour from the 1989 LeNet for digit recognition to today’s top ILSVRC14 vision models. Follow along with do-it-yourself code notebooks. While focusing on vision, general techniques are covered.
This document provides an introduction to parallel programming using Python. It discusses the motivation for parallel programming: utilizing otherwise idle CPU capacity. The two main ways to run tasks in parallel in Python are using process forks and spawned threads. It then covers forking processes, threads, interprocess communication, and the multiprocessing module in Python for parallel programming.
This document provides an introduction to concurrency in Python using threads. It discusses how threads allow programs to perform multiple tasks simultaneously by sharing system resources like memory. The document covers basic threading concepts like creating and launching threads, as well as challenges like accessing shared data between threads, which can be non-deterministic due to thread scheduling. It aims to provide an overview of concurrency support in the Python standard library beyond just the user manual.
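A minimal sketch of the shared-data hazard the document mentions, and the standard fix with a lock:

```python
import threading

counter = 0
lock = threading.Lock()

def worker(n):
    global counter
    for _ in range(n):
        with lock:        # without this lock the read-modify-write below
            counter += 1  # can interleave between threads and lose updates

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 400000 -- deterministic only because of the lock
```

Dropping the `with lock:` line turns the result non-deterministic, which is exactly the scheduling-dependent behaviour the document warns about.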
It's not your mother's C++ anymore. Manual memory management, tedious loops, and difficult-to-use STL algorithms are all a thing of the past now. The new C++ 11 standard contains a huge number of improvements to the C++ core language and standard library, and can help C++ developers be more productive.
In this session we will discuss the major features of C++ 11, including lambda functions, type inference for local variables, range-based for loops, smart pointers, and more. We will see how to use these features effectively to modernize your existing C++ programs and how to develop in the modern C++ style.
The document summarizes new features in C++ 11 and C++ 14, including language features like auto variables, lambda functions, rvalue references, and move semantics. It discusses new library features like smart pointers, the concurrency library, and user-defined literals. The presentation covers status and compiler support, best practices for modern C++, and what to expect in upcoming standards like C++ 14 with further language and library improvements.
The hair dryer works by using a heating element and fan. When plugged in, electric current heats the filaments in the heating element. The current also powers a motor that turns a fan. The fan blows air through the heating element, which warms the air. The warm air is then blown out of the dryer to dry hair by evaporating its water content faster. Safety features include a thermal fuse, temperature sensors, and insulation to prevent overheating.
Multithreading with modern C++ is hard: undefined variables, deadlocks, livelocks, race conditions, spurious wakeups, the double-checked locking pattern, and so on. And at the base is the new memory model, which does not make life any easier. The list of things that can go wrong is very long. In this talk I give you a tour through the things that can go wrong and show how you can avoid them.
The document discusses multithreading concepts like concurrency and threading, how to create and control threads including setting priorities and states, and how to safely share resources between threads using synchronization, locks, and wait/notify methods to avoid issues like deadlocks. It also covers deprecated thread methods and increased threading support in JDK 1.5.
This document discusses various places where CPU cycles are wasted in production systems based on insights from continuous profiling of large compute clusters. Some examples of wasted cycles include misconfiguration of the underlying OS, suboptimal choices of dependencies, and excessive serialization/deserialization. Specific issues highlighted include gettimeofday calls on older AWS instances, kubelet directory walking, exception throwing in xgboost, and high CPU usage from zlib. Continuous fleet-wide profiling is becoming an important tool for identifying such performance issues.
CT Brown - Doing next-gen sequencing analysis in the cloud (Jan Aerts)
This document summarizes work on digital normalization, a technique for reducing sequencing data size prior to assembly. Digital normalization works by discarding redundant reads whose median k-mer abundance already exceeds a cutoff, based on analysis of k-mer abundances across the dataset. It can remove over 95% of data in a single pass with fixed memory. This makes genome and metagenome assembly scalable to larger datasets using cloud computing resources. The work is done in an open science manner, with all code, data, and manuscripts openly accessible online.
Talk at Bioinformatics Open Source Conference, 2012 (c.titus.brown)
This document summarizes work on digital normalization, a technique for reducing sequencing data size prior to assembly. Digital normalization works by discarding redundant reads whose median k-mer abundance already exceeds a cutoff, based on analysis of k-mer frequencies in the de Bruijn graph. It can remove over 95% of data in a single pass with fixed memory. Digital normalization enables assembly of large datasets in the cloud by reducing data size and memory requirements. The document acknowledges collaborators and funding sources and provides links for code, blogs, papers, and future events.
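A toy sketch of the normalization idea in Python: a read is kept only while its k-mers are still rare (median abundance below a cutoff), so highly redundant reads are discarded in a single pass. The real tool uses fixed-memory probabilistic counting; `k` and `cutoff` here are illustrative values.

```python
from collections import Counter
from statistics import median

def kmers(read, k):
    return [read[i:i + k] for i in range(len(read) - k + 1)]

def digital_normalization(reads, k=4, cutoff=3):
    counts = Counter()
    kept = []
    for read in reads:                        # a single pass over the data
        km = kmers(read, k)
        if median(counts[x] for x in km) < cutoff:
            kept.append(read)                 # still novel: keep and count it
            counts.update(km)                 # discarded reads are never counted
    return kept

# Ten identical reads collapse to the few needed to reach the cutoff:
print(len(digital_normalization(["ACGTACGT"] * 10)))  # 3
```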
The document introduces Parallel Pixie Dust (PPD), a cross-platform thread library that aims to guarantee deadlock-free and race-condition free schedules that are optimal. It discusses the need for multiple threads due to factors like the memory wall. Current threading models are problematic because testing and debugging threaded code is difficult. PPD uses futures and thread pools to simulate data flow and generate tree-like thread schedules. It provides parallel versions of functions and thread-safe containers to enable multi-threaded standard library algorithms. The goal is to make writing correct multi-threaded programs easier.
What's new with JavaScript in GNOME: The 2020 edition (GUADEC 2020) (Igalia)
By Philip Chimento.
This talk is about all the improvements made in GNOME's JavaScript platform in the past year. If you are writing code for a GNOME app or shell extension that uses JavaScript and you want to know how to modernize your code or use new language features, this talk will be interesting for you. If you are curious about the progress made on the garbage collection bug, and what needs to happen before it can be fixed, this talk will be interesting for you. And if you are interested in working on a JavaScript engine and want some ideas for projects to get started with, from beginner through expert level, this talk will definitely be interesting for you!
(c) GUADEC 2020
July 22nd - 28th, 2020
https://2020.guadec.org
This document provides an agenda and overview for a hands-on introduction to multi-threaded programming and Pthreads. The tutorial will cover fundamental concepts of concurrency and multi-threading, and illustrate these concepts through simple C programs that utilize Pthreads. Attendees will learn about thread creation, synchronization methods like mutexes and barriers, and how to compile and run basic Pthreads applications.
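Keeping to Python for this page's examples, the same primitives the tutorial covers (threads, mutexes, barriers) map directly onto the `threading` module; a minimal sketch:

```python
import threading

N = 3
barrier = threading.Barrier(N)       # counterpart of pthread_barrier_t
lock = threading.Lock()              # counterpart of pthread_mutex_t
events = []

def worker(i):
    with lock:                       # mutex-protected shared list
        events.append(("phase-1", i))
    barrier.wait()                   # nobody proceeds until all N arrive
    with lock:
        events.append(("phase-2", i))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The barrier guarantees every phase-1 entry precedes every phase-2 entry.
assert [p for p, _ in events[:N]] == ["phase-1"] * N
```

The compile-and-run mechanics differ (no `-lpthread`, no explicit attribute structs), but the synchronization reasoning carries over unchanged.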
Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/2pjvrpW.
Joe Duffy talks about the concurrency's explosion onto the mainstream over the past 15 years. He looks at some of today's hottest trends (Cloud, IoT, Microservices) and attempts to predict what lies ahead not only for concurrent programming, but also distributed, from now to 15 years into the future. Filmed at qconlondon.com.
Joe Duffy is Director of Engineering for the Compiler and Language Group at Microsoft. He leads the teams building C++, C#, VB, and F# languages, compilers, and static analysis platforms, across many architectures and platforms.
Cloud computing: evolution or redefinition (PET Computação)
Field of knowledge: focus on complex and object-oriented modelling. Field of human communication: conceptual guidelines for complex modelling of human communication, mental representational channels, styles of learning and teaching; a test to identify representational channels; a test to identify personal styles; practical examples of applying complex knowledge to human communication; working with diversity and its potential in interpersonal communication to positively affect the workplace; conclusions and practical guidance for developing the competencies of the 21st-century professional.
How I Sped up Complex Matrix-Vector Multiplication: Finding Intel MKL's "S (Brandon Liu)
Implementing a fixed point int16_t integer matrix vector multiplication kernel for Intel processors with AVX-512 and the Xbyak just-in-time compiler (what Intel MKL jit_cgemm uses)
Parallelformers is a tool for efficiently parallelizing large language models across multiple GPUs. It was created to address the challenges of deploying and using very large models, which require extensive engineering and expensive hardware. Parallelformers uses model parallelism techniques inspired by Megatron-LM to split models across GPUs for efficient distributed processing and inference. The key design principles of Parallelformers are efficient model parallelism, scalability to support many models, simplicity of use, and enabling easy deployment of large models.
Course: "Introductory course to HLS FPGA programming" (Mirko Mariotti)
Slides of the course: "Introductory course to HLS FPGA programming", Nov 27 – 30, 2023. ICSC National research center on HPC, big data and Quantum Computing
This document provides a tutorial on multi-core CPUs, computer clusters, and grid computing for economists. It begins by discussing trends in microprocessor development such as increasing transistor counts and clock speeds. Future improvements will come from multi-core CPUs rather than increased clock speed. It also discusses increases in network speeds that enable computer clusters and grids. The document then provides suggestions for optimizing code performance on a single CPU before parallelization. This includes minimizing branches, inlining subroutines, and using high-performance libraries. The rest of the document discusses programming multi-core CPUs, clusters, and grids.
The genesis of clusterlib - An open source library to tame your favourite sup... (Arnaud Joly)
The presentation tells the story of clusterlib, an open source package, from the problem statement to a first-rate open source library. Useful tools for software projects are also presented.
The goal of the clusterlib is to ease the creation, launch and management of embarrassingly parallel jobs on supercomputers with schedulers such as SLURM and SGE.
Towards a Systematic Study of Big Data Performance and Benchmarking (Saliya Ekanayake)
This document summarizes a Ph.D. dissertation defense presented by Saliya Ekanayake on September 28, 2016. The dissertation studied big data performance and benchmarking, focusing on parallel machine learning. It evaluated performance factors like thread models, affinity, communication mechanisms, and optimizations for high-level languages. It also presented SPIDAL Java, a library for scalable parallel machine learning applications, and evaluated its performance on use cases like gene sequence clustering and stock data analysis using up to 48 nodes.
Why Cloud Computing has to go the FOSS way (Ahmed Mekkawy)
This presentation traces trends in the software industry to reach the conclusion that cloud computing as a concept is inevitable, and that having open clouds is inevitable as well.
This document summarizes challenges in assembling large DNA sequence data sets and strategies to address them.
1. The cost to generate DNA sequence data is decreasing rapidly, creating data sets too large for most computers to assemble. Hundreds to thousands of such data sets are generated each year.
2. Techniques like streaming compression and low-memory probabilistic data structures allow assembly memory usage to scale linearly with the sample size rather than the total data, enabling assembly of larger datasets.
3. Benchmarking different computational platforms revealed that while some platforms have faster processors, the ability to store large amounts of data locally is also important for assembly tasks. Scaling algorithms, rather than just optimizing code, is key to addressing these challenges.
Software and the Concurrency Revolution: Notes (Subhajit Sahu)
Highlighted notes of article while studying Concurrent Data Structures, CSE:
Software and the Concurrency Revolution
Herb Sutter
Software Architect, Microsoft
Software Development Consultant, www.gotw.ca/training
Herb Sutter is a prominent C++ expert. He is also a book author and was a columnist for Dr. Dobb's Journal. He joined Microsoft in 2002 as a platform evangelist for Visual C++ .NET, rising to lead software architect for C++/CLI.
A Survey on in-a-box parallel computing and its implications on system softwa... (ChangWoo Min)
1) The document surveys research on parallel computing using multicore CPUs and GPUs, and its implications for system software.
2) It discusses parallel programming models like OpenMP, Intel TBB, CUDA, and OpenCL. It also covers research on optimizing memory allocation, reducing system call overhead, and revisiting OS architecture for manycore systems.
3) The document reviews work on supporting GPUs in virtualized environments through techniques like GPU virtualization. It also summarizes projects that utilize the GPU in middleware for tasks like network packet processing.
Similar to Beating the (sh** out of the) GIL - Multithreading vs. Multiprocessing (20)
Kauri ID - A Self-Sovereign, Blockchain-based Identity System (Guy K. Kloss)
Presented on Friday, 13 July 2018 at the ITP Conference in Wellington, New Zealand
Kiwis can't express their identity digitally and securely across cultural backgrounds and across competitive boundaries. So far this remains an ongoing, unsolved problem. Yes, there are existing options, e.g. RealMe, Google federated identity, etc., but they all have their "warts". Some are expensive or cumbersome to use from an organisation's perspective. Others leak metadata to corporates whose goal is to use your information to sell to you more effectively, thus making the end user The Product (TM). Yet others lack the critical mass among the population to be successful.
A Bloomberg Intelligence report ("The Year Ahead 2018") quotes the cost to the US banking sector of KYC (Know Your Customer) and AML (Anti-Money Laundering) breaches at a total of US$16.1 billion from 2008 to 2015. The same report cites the Royal Bank of Scotland as employing 2,000 staff (early 2017) exclusively to comply with KYC rules, with the expectation of lowering this headcount by 95% given a viable digital solution.
Due to the magnitude of this problem, a local major bank has kicked off an initiative with the local community to venture into solution opportunities.
This paper presents the background to the problem statement, the goal definition and particularly the approach taken for the system. Design decisions and evaluations will be discussed for this system under the project title of "Kauri ID", a self-sovereign, blockchain-based identity infrastructure. It puts the user at the centre, and no company or organisation owns identity information or acts as a (formal) guardian.
Kauri ID employs privacy by design, enabling fine-granular, selective and confidential data sharing. Authenticity is implemented via a web of trust, attesting identity attribute claims.
Even though Kauri ID is inherently self-sovereign, sovereign aspects can be catered for via governmental attribute endorsements, thus building a bridge between New Zealand's RealMe system and Kauri ID.
Qrious about Insights - Big Data in the Real World (Guy K. Kloss)
Presentation for the Data Science Research Group Workshop on 7 February 2017 at AUT. The talk centres around the problem in Big Data analytics, tools for overcoming these problems, and the way the company Qrious leverages these to build solutions.
This document provides an explanation of blockchain technology through the example of building a digital currency called Infocoin. It explains how blockchain addresses key issues like double spending by using cryptographic signatures, serial numbers, proof-of-work, and maintaining a shared public ledger in the form of a blockchain. It then discusses potential applications of blockchain beyond digital currencies, as well as challenges regarding scalability, energy use, and choosing which blockchain to use.
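The proof-of-work mechanism described above can be sketched in a few lines of Python. This is a toy version; real blockchains hash structured block headers against much harder targets:

```python
import hashlib

def proof_of_work(block_data: bytes, difficulty: int = 3) -> int:
    """Find a nonce so that SHA-256(block_data + nonce) begins with
    `difficulty` hex zeros: expensive to find, trivial to verify."""
    nonce = 0
    target = "0" * difficulty
    while True:
        digest = hashlib.sha256(block_data + str(nonce).encode()).hexdigest()
        if digest.startswith(target):
            return nonce
        nonce += 1

# Any participant can verify the work with a single hash:
nonce = proof_of_work(b"Infocoin block #1")
check = hashlib.sha256(b"Infocoin block #1" + str(nonce).encode()).hexdigest()
assert check.startswith("000")
```

Raising `difficulty` by one multiplies the expected search effort by 16, which is how the asymmetry between producing and checking a block is tuned.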
Building a (Really) Secure Cloud Product (Guy K. Kloss)
Guest lecture to Master of Information Security and Digital Forensics students at Auckland University of Technology (AUT) on the development of the MEGAchat Cloud application.
Representational State Transfer (REST) and HATEOAS (Guy K. Kloss)
This document outlines Representational State Transfer (REST) and HATEOAS (Hypermedia as the Engine of Application State). It discusses the principles of REST including identification of resources, manipulation of resources through HTTP methods, self-descriptive messages, and HATEOAS. An example scenario of a flight booking API is provided to illustrate how HATEOAS links indicate state transitions within a REST API.
Introduction to LaTeX (For Word users) (Guy K. Kloss)
This document introduces LaTeX, an open-source document preparation system. LaTeX uses TeX as its typesetting engine and allows authors to focus on the content instead of formatting. It offers advantages over Word like portability, flexibility, precise control over formatting, and high quality output, especially for mathematical formulas. The document discusses what LaTeX is, how to pronounce it, its advantages over Word for large projects, and its ability to produce higher quality documents than Word.
MataNui - Building a Grid Data Infrastructure that "doesn't suck!" (Guy K. Kloss)
This document discusses the development of a grid data infrastructure called MataNui to manage large amounts of observational astronomical data and metadata from a collaboration between researchers in New Zealand and Japan. The infrastructure uses existing open-source tools like MongoDB, GridFTP, and the DataFinder GUI client to allow distributed storage and access of data while meeting requirements like handling large data volumes, metadata, and remote access. This approach provides a robust, reusable, and user-friendly system to address common data management challenges in scientific collaborations.
Operations Research and Optimization in Python using PuLP (Guy K. Kloss)
This document discusses mathematical optimization and summarizes a presentation given by Dr. Stuart Mitchell on the topic. It provides an example problem of optimizing wedding guest seating assignments. The key points are:
1) Mathematical optimization provides a precise way to formulate problems with objectives and constraints to find optimal solutions.
2) The wedding seating problem involves assigning guests to tables to maximize happiness based on relationships while ensuring each guest has a seat.
3) The problem is formulated as a set partitioning problem and implemented in Python code using the PuLP library to find the optimal seating arrangement.
Python Data Plotting and Visualisation Extravaganza (Guy K. Kloss)
This document is a presentation on Python data plotting and visualization tools. It outlines 2D and 3D plotting tools for Python, including Gnuplot, matplotlib, Mayavi, Visual Python, and the Mayavi "visual" module. The presentation was given by Guy K. Kloss at the first ever Kiwi PyCon in Christchurch, New Zealand on November 7, 2009.
Lecture "Open Source and Open Content" (Guy K. Kloss)
The document discusses open source software, education, content, standards and licenses. It provides examples of the growth and adoption of open source software like Linux and Apache. It also discusses open education initiatives like MIT OpenCourseWare and Wikipedia that provide free access to knowledge. Open standards and licenses like Creative Commons are presented as allowing open reuse and collaboration on intellectual property.
Largely based on Vishnu Gopal's presentation http://www.slideshare.net/vishnu/basic-source-control-with-subversion
Used for a quick SVN introduction in a Software Engineering course at Massey University.
Thinking Hybrid - Python/C++ Integration (Guy K. Kloss)
The document discusses integrating Python and C++ by using Python where possible for its simplicity and readability, but using C++ for performance-critical parts where needed. It provides tips on a real-world example of hybrid Python/C++ development. The example quote is "Python where we can, C++ where we must" from Alex Martelli, a senior Google developer.
Thinking Hybrid - Python/C++ Integration (Guy K. Kloss)
Talk given at the January 2008 meeting of the New Zealand Python User Group in Auckland.
Outline: Talk on integrating native C++ sensibly into Python for ease of use of the code base. Inheriting from C++ classes, overriding functionality, automatically generating the bindings using Py++ and SCons.
Code demonstrated in the presentation can be found here:
http://www.kloss-familie.de/moin/TalksPresentations
Gaining Colour Stability in Live Image Capturing (Guy K. Kloss)
The document discusses gaining colour stability in live image capturing. It describes how camera sensors interpret colour differently than the human eye based on lighting conditions. Traditional colour management using ICC profiles requires static calibration and does not work for changing environments. The document proposes adapting current colour constancy methods and exploiting slow background changes to create a processing loop that can automatically adapt to varying conditions in live image capturing.
This document provides an introduction to LaTeX for Word users. It summarizes what LaTeX is, the benefits of using LaTeX over Word, how to produce a simple LaTeX document, and how to install LaTeX on Windows. The presentation includes slides on document structure in LaTeX and common file types.
Thinking Hybrid - Python/C++ Integration (Guy K. Kloss)
Talk on integrating native C++ sensibly into Python for ease of use of the code base. Inheriting from C++ classes, overriding functionality, automatically generating the bindings using Py++ and SCons.
Code demonstrated in the presentation can be found here:
http://www.kloss-familie.de/moin/TalksPresentations
Beating the (sh** out of the) GIL - Multithreading vs. Multiprocessing
1. Threading Theory Multiprocessing Others Conclusion Finalise
Beating the (sh** out of the) GIL
Multithreading vs. Multiprocessing
Hair dryer, 1920s
Dark Roasted Blend: http://www.darkroastedblend.com/2007/01/retro-technology-update.html
Guy K. Kloss | Multithreading vs. Multiprocessing 1/36
Guy K. Kloss
Computer Science
Massey University, Albany
New Zealand Python User Group Meeting
Auckland, 12 June 2008
Outline
1 Threading
2 Theory
3 Multiprocessing
4 Others
5 Conclusion
Source: http://blog.snaplogic.org/?cat=29
What People Think Now
Threading and shared memory are common
(thanks to Windows and Java)
Python supports threads (Yay!)
Python also supports easy forking (Yay!)
The GIL . . . is a problem for pure Python,
non I/O bound applications
Lots of people “understand” threads . . .
. . . and fail at them (to do them properly)
What People Think Now
Blog post by Mark Ramm, 14 May 2008
A multi-threaded system is particularly important for people
who use Windows, which makes multi-process computing
much more memory intensive than it needs to be. As my
grandma always said, Windows can't fork worth a damn. ;)
[. . .]
So, really it's kinda like shared-memory optimized
micro-processes running inside larger OS level processes, and
that makes multi-threaded applications a lot more
reasonable to wrap your brain around. Once you start down
the path of lock management the non-deterministic character
of the system can quickly overwhelm your brain.
Simple Threading Example
import time
from threading import Thread
from stuff import expensiveFunction  # helper module from the talk's demo code
import config                        # demo configuration, provides config.segments

class MyClass(Thread):
    def __init__(self, argument):
        self.argument = argument
        Thread.__init__(self)  # Initialise the thread

    def run(self):
        self.value = expensiveFunction(self.argument)

callObjects = []
for i in range(config.segments):
    callObjects.append(MyClass(i))
for item in callObjects:
    item.start()
# Do something else.
time.sleep(15.0)
for item in callObjects:
    item.join()
    print item.value
Our Example with Threading
Our fractal example
now with threading.
Just a humble hair-dryer from the 30s: “One of the first machines used for permanent wave hairstyling back in the 1920’s and 1930’s.”
Dark Roasted Blend: http://www.darkroastedblend.com/2007/05/mystery-devices-issue-2.html
The GIL
Global Interpreter Lock
What is it for?
Cooperative multitasking
Interpreter knows when it’s “good to switch”
Often more efficient than preemptive multi–tasking
Can be released from native (C) code extensions
(done for I/O intensive operations)
Is it good?
Easy coding
Easy modules/extensions
Large base of available modules already
Speed improvement by factor 2
(for single–threaded applications)
Keeps code safe
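To see why the GIL mainly bites pure-Python, CPU-bound code, here is a minimal sketch (not from the talk's demo code). Threads blocked in time.sleep(), a stand-in for I/O during which the interpreter releases the GIL, still overlap just fine:

```python
import threading
import time

def blocking_task():
    # sleep() releases the GIL, just like real I/O would
    time.sleep(0.2)

start = time.time()
threads = [threading.Thread(target=blocking_task) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start
# The four 0.2 s sleeps overlap, so the total is roughly 0.2 s, not 0.8 s.
print("elapsed: %.2f s" % elapsed)
```

A CPU-bound loop in place of the sleep would show no such speedup under CPython.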
The GIL
Alternatives
Other implementations
(C) Python uses it
Jython doesn’t
IronPython doesn’t
They use their own/internal threading mechanisms
Is it a design flaw?
Maybe . . . but . . .
Fierce/intense discussions about changing the code base
Solutions that offer other benefits:
Processes create fewer inherent deadlock situations
Processes also scale to multi-host scenarios
Doug Hellmann in Python Magazine 10/2007:
Techniques using low-level, operating system-specific
libraries for process management are as passé as using
compiled languages for CGI programming. I don’t have time
for this low-level stuff any more, and neither do you. Let’s
look at some modern alternatives.
GIL–less Python
There was an attempt/patch “way back then ...”
There’s a new project now by Adam Olsen:
Python 3000 with “free threading” [1]
Uses monitors to isolate state
Design focus: usability
(for common cases, maintainable code)
Optional at compile time using --with-freethread
Sacrifices single-threaded performance
(60-65 %, but equivalent to threaded CPython)
Automatic deadlock detection
(detection/breaking, giving exceptions/stack trace)
Runs on Linux and OS X
Parallelisation in General
CPU vs. I/O bottlenecks
Threading: Good for I/O constraints
This talk targets CPU constraints
Threads vs. Processes
Threads: Within a process on one host
Processes: Independent at the OS level
Processes:
Are heavier in memory/overhead
Have their own namespace and memory
Involve fewer problems with competing access
to resources and their management
But:
On UN*X/Linux: Process overhead is very low
(C)Python is inefficient in handling threads
Stackless Python is much more efficient at threading
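The "own namespace and memory" point can be sketched quickly with the standard library's multiprocessing module (the modern descendant of the processing package discussed below); the bump() helper is made up for illustration:

```python
import multiprocessing

counter = 0

def bump():
    global counter
    counter += 1  # mutates only the child's copy of the module state

if __name__ == "__main__":
    p = multiprocessing.Process(target=bump)
    p.start()
    p.join()
    # The parent's counter is untouched: processes do not share memory.
    print(counter)
```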
Abstraction Level vs. Control
Abstraction levels for parallel computing models [7]
Level   Parallelism   Communication   Synchronisation
4       implicit      implicit        implicit
3       explicit      implicit        implicit
2       explicit      explicit        implicit
1       explicit      explicit        explicit

Explicit: The programmer specifies it in the parallel program
Implicit: A compiler/runtime system derives it from other information
Abstraction Level vs. Control
Low level: Close to hardware
Must specify parallelism
. . . communication
. . . and synchronisation
→ Best means for performance tuning
→ Premature optimisation?
High level: Highest machine independence
More/all handled by computing model
Up to automatic parallelisation approaches
Neither extreme has been very successful to date
Most developments now:
Level 3 for specific purposes
Level 1 for general programming
(esp. in the scientific community)
With Python, a consistent level 2 is possible
Common for Parallel Computing
Message Passing Interface (MPI)
for distributed memory
OpenMP
shared memory multi–threading
The two need not be categorised this strictly
Art by “Teknika Molodezhi,” Russia 1966
Dark Roasted Blend: http://www.darkroastedblend.com/2008/01/retro-future-mind-boggling.html
Processing around the GIL
Smart multi–processing
Smart task farming
(py)Processing module
By R. Oudkerk [2]
Written in C (really fast!)
Allows multiple cores and multiple hosts/clusters
Data synchronisation through managers
Easy “upgrade path”:
Drop-in replacement (mostly) for the threading module
Transparent to the user
Forks processes, but uses the Thread API
Supports queues, pipes, locks,
managers (for sharing state), worker pools
VERY fast, see PEP 371 [3]
Jesse Noller proposed pyprocessing for inclusion in core Python:
benchmarks available, awesome results!
The PEP is officially accepted: Thanks, Guido!
(py)Processing module
(continued)
Some details
Producer/consumer style system
– workers pull jobs
Hides most details of communication
– usable default settings
Communication is tweakable
(to improve performance or meet certain requirements)
(py)Processing module
Let’s see it!
Parallel Python module
By Vitalii Vanovschi [4]
Pure Python
Full “batteries included” approach:
Spawns automatically across detected cores,
and can spawn to clusters
Uses some thread module methods under the hood
More of a “task farming” approach
(potentially requires rethinking/restructuring)
Automatically deploys code and data,
no difficult/multiple installs
Fault tolerance, secure inter-node communication,
runs everywhere
Very active community,
good documentation, good support
Parallel Python module
Let’s see it!
Honourable Mentions
pprocess [5]
IPython for parallel computing [6]
Bulk Synchronous Parallel (BSP) Model [7]
sequence of super steps
(computation, communication, barrier synch)
Reactor based architectures, through Twisted [8]
“Don’t call us, we call you”
MPI (pyMPI, Pypar, MPI for Python, pypvm)
requires a constant number of processors for the
computation’s duration
Pyro (distributed object system)
Linda (PyLinda)
Scientific Python (master/slave computing model)
data distribution through call parameters/replication
Things to Note
Which approach is best?
Can’t say!
Many of the approaches are complementary
You need to evaluate what to use when
All, however, save you a lot of time over the alternative
of writing everything yourself with low–level libraries.
What an age to be alive!
Problems can arise when objects cannot be pickled
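A tiny sketch of that pickling caveat: objects pickle cannot look up by name, such as lambdas, fail to serialise and therefore cannot be shipped between processes, while plain module-level functions work fine:

```python
import pickle

square = lambda x: x * x

try:
    pickle.dumps(square)
    failed = False
except Exception as exc:  # PicklingError (or AttributeError, by version)
    failed = True
    print("cannot pickle a lambda:", exc)

def square_fn(x):
    return x * x

# A def at module level pickles fine (by reference), which is why
# worker functions handed to other processes are written this way.
data = pickle.dumps(square_fn)
print(failed, len(data) > 0)
```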
Conclusion
Removing the GIL is not necessarily the best solution:
Less efficient (single-threaded) runtime
Problems with shared memory access
Various approaches exist to beat the GIL
Solutions are complementary in many ways;
many scale beyond a local machine/memory system
Questions?
G.Kloss@massey.ac.nz
Slides and code available here:
http://www.kloss-familie.de/moin/TalksPresentations
References I
[1] A. Olsen,
Python 3000 with Free Threading project,
[Online]
http://code.google.com/p/python-safethread/
[2] R. Oudkerk,
Processing Package,
[Online] http://pypi.python.org/pypi/processing/
[3] J. Noller,
PEP-371,
[Online] http://www.python.org/dev/peps/pep-0371/
References II
[4] V. Vanovschi,
Parallel Python,
[Online] http://parallelpython.com/
[5] P. Boddie,
pprocess,
[Online] http://pypi.python.org/pypi/pprocess/
[6] Project Website,
IPython,
[Online] http://ipython.scipy.org/doc/ipython1/html/parallel_intro.html
[7] K. Hinsen,
Parallel Scripting with Python
Computing in Science & Engineering, Nov/Dec 2007
References III
[8] B. Eckel,
Concurrency with Python, Twisted, and Flex,
[Online] http://www.artima.com/weblogs/viewpost.jsp?thread=230001