A Curious Course on Coroutines and Concurrency
1. A Curious Course on
Coroutines and Concurrency
David Beazley
http://www.dabeaz.com
Presented at PyCon'2009, Chicago, Illinois
Copyright (C) 2009, David Beazley, http://www.dabeaz.com 1
2. This Tutorial
• A mondo exploration of Python coroutines
mondo:
1. Extreme in degree or nature.
(http://www.urbandictionary.com)
2. An instructional technique of Zen Buddhism
consisting of rapid dialogue of questions and
answers between master and pupil. (Oxford
English Dictionary, 2nd Ed)
• You might want to brace yourself...
3. Requirements
• You need Python 2.5 or newer
• No third party extensions
• We're going to be looking at a lot of code
http://www.dabeaz.com/coroutines/
• Go there and follow along with the examples
• I will indicate file names as appropriate
sample.py
4. High Level Overview
• What in the heck is a coroutine?
• What can you use them for?
• Should you care?
• Is using them even a good idea?
5. A Pictorial Overview
[Figure: the "Head Explosion Index" plotted against the stages of the tutorial. Levels marked on the index: Throbbing Headache, Killer Joke. A "You are here" marker sits at the start. Stages, in order: Intro to Generators, Some Coroutines, Some Data Processing, Event Handling, Mix in Some Threads, Coroutines as Tasks, Write a multitasking operating system, End]
6. About Me
• I'm a long-time Pythonista
• Author of the Python Essential Reference
(look for the 4th edition--shameless plug)
• Created several packages (Swig, PLY, etc.)
• Currently a full-time Python trainer
7. Some Background
• I'm an unabashed fan of generators and
generator expressions (Generators Rock!)
• See "Generator Tricks for Systems
Programmers" from PyCon'08
• http://www.dabeaz.com/generators
8. Coroutines and Generators
• In Python 2.5, generators picked up some
new features to allow "coroutines" (PEP-342).
• Most notably: a new send() method
• If Python books are any guide, this is the most
poorly documented, obscure, and apparently
useless feature of Python.
• "Oooh. You can now send values into
generators producing fibonacci numbers!"
9. Uses of Coroutines
• Coroutines apparently might be possibly
useful in various libraries and frameworks
"It's all really quite simple. The toelet is connected to
the footlet, and the footlet is connected to the
anklelet, and the anklelet is connected to the leglet,
and the is leglet connected to the is thighlet, and the
thighlet is connected to the hiplet, and the is hiplet
connected to the backlet, and the backlet is
connected to the necklet, and the necklet is
connected to the headlet, and ?????? ..... profit!"
• Uh, I think my brain is just too small...
10. Disclaimers
• Coroutines - The most obscure Python feature?
• Concurrency - One of the most difficult topics
in computer science (usually best avoided)
• This tutorial mixes them together
• It might create a toxic cloud
11. More Disclaimers
• As a programmer of the 80s/90s, I've never used
a programming language that had coroutines--
until they showed up in Python
• Most of the groundwork for coroutines
occurred in the 60s/70s and then stopped in
favor of alternatives (e.g., threads, continuations)
• I want to know if there is any substance to the
renewed interest in coroutines that has been
occurring in Python and other languages
12. Even More Disclaimers
• I'm a neutral party
• I didn't have anything to do with PEP-342
• I'm not promoting any libraries or frameworks
• I have no religious attachment to the subject
• If anything, I'm a little skeptical
13. Final Disclaimers
• This tutorial is not an academic presentation
• No overview of prior art
• No theory of programming languages
• No proofs about locking
• No Fibonacci numbers
• Practical application is the main focus
14. Performance Details
• There are some later performance numbers
• Python 2.6.1 on OS X 10.4.11
• All tests were conducted on the following:
• Mac Pro 2x2.66 GHz Dual-Core Xeon
• 3 Gbytes RAM
• Timings are 3-run average of 'time' command
15. Part I
Introduction to Generators and Coroutines
16. Generators
• A generator is a function that produces a
sequence of results instead of a single value
countdown.py
    def countdown(n):
        while n > 0:
            yield n
            n -= 1

    >>> for i in countdown(5):
    ...     print i,
    ...
    5 4 3 2 1
    >>>
• Instead of returning a value, you generate a
series of values (using the yield statement)
• Typically, you hook it up to a for-loop
17. Generators
• Behavior is quite different from that of a normal function
• Calling a generator function creates a
generator object. However, it does not start
running the function.
    def countdown(n):
        print "Counting down from", n
        while n > 0:
            yield n
            n -= 1

    >>> x = countdown(10)       # Notice that no output was produced
    >>> x
    <generator object at 0x58490>
    >>>
18. Generator Functions
• The function only executes on next()
    >>> x = countdown(10)
    >>> x
    <generator object at 0x58490>
    >>> x.next()                # Function starts executing here
    Counting down from 10
    10
    >>>
• yield produces a value, but suspends the function
• Function resumes on next call to next()
    >>> x.next()
    9
    >>> x.next()
    8
    >>>
19. Generator Functions
• When the generator returns, iteration stops
    >>> x.next()
    1
    >>> x.next()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    StopIteration
    >>>
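The generator behavior described on the preceding slides can be checked on a current interpreter. This is a minimal sketch, assuming Python 3, where `x.next()` is spelled `next(x)`. The `log` list is an illustrative stand-in for the print statement so that the order of events is easy to inspect; it is not part of the original countdown.py.

```python
def countdown(n, log):
    # The body does not run until the first next() call.
    log.append("Counting down from %d" % n)
    while n > 0:
        yield n
        n -= 1

log = []
x = countdown(3, log)
started_early = bool(log)   # creating the generator object runs no code

first = next(x)             # the function starts executing here
rest = list(x)              # pull the remaining values

try:
    next(x)                 # the generator has returned, so iteration stops
    exhausted = False
except StopIteration:
    exhausted = True
```
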
20. A Practical Example
• A Python version of Unix 'tail -f'
follow.py
    import time
    def follow(thefile):
        thefile.seek(0,2)      # Go to the end of the file
        while True:
            line = thefile.readline()
            if not line:
                time.sleep(0.1)    # Sleep briefly
                continue
            yield line
• Example use : Watch a web-server log file
    logfile = open("access-log")
    for line in follow(logfile):
        print line,
21. Generators as Pipelines
• One of the most powerful applications of
generators is setting up processing pipelines
• Similar to shell pipes in Unix
    input sequence -> generator -> generator -> generator -> for x in s:
• Idea: You can stack a series of generator
functions together into a pipe and pull items
through it with a for-loop
22. A Pipeline Example
• Print all server log entries containing 'python'
pipeline.py
    def grep(pattern,lines):
        for line in lines:
            if pattern in line:
                yield line

    # Set up a processing pipe : tail -f | grep python
    logfile = open("access-log")
    loglines = follow(logfile)
    pylines = grep("python",loglines)

    # Pull results out of the processing pipeline
    for line in pylines:
        print line,
• This is just a small taste
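Because follow() never terminates, the pull-style pipeline is easier to try out on a finite input. The sketch below assumes Python 3 and replaces the log file with an in-memory list; `strip_newlines` and the sample lines are hypothetical additions used only to show a second stage being stacked on the pipe.

```python
def grep(pattern, lines):
    # Pass through only the lines that contain the pattern.
    for line in lines:
        if pattern in line:
            yield line

def strip_newlines(lines):
    # Hypothetical extra stage: remove the trailing newline from each line.
    for line in lines:
        yield line.rstrip("\n")

# Stand-in for follow(logfile): any iterable of lines works as the input.
log_lines = [
    "GET /python/index.html\n",
    "GET /ruby/index.html\n",
    "GET /downloads/python.tgz\n",
]

# Stack the stages; nothing runs until the final consumer pulls items.
pylines = strip_newlines(grep("python", iter(log_lines)))
results = list(pylines)
```
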
23. Yield as an Expression
• In Python 2.5, a slight modification to the yield
statement was introduced (PEP-342)
• You could now use yield as an expression
• For example, on the right side of an assignment
grep.py
    def grep(pattern):
        print "Looking for %s" % pattern
        while True:
            line = (yield)
            if pattern in line:
                print line,
• Question : What is its value?
24. Coroutines
• If you use yield more generally, you get a coroutine
• These do more than just generate values
• Instead, functions can consume values sent to them.
    >>> g = grep("python")
    >>> g.next()        # Prime it (explained shortly)
    Looking for python
    >>> g.send("Yeah, but no, but yeah, but no")
    >>> g.send("A series of tubes")
    >>> g.send("python generators rock!")
    python generators rock!
    >>>
• Sent values are returned by (yield)
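The same interaction can be sketched for Python 3, where priming is written `next(g)`. As an assumption for testability, this variant appends matches to a caller-supplied `matches` list instead of printing them; the slide's grep.py prints.

```python
def grep(pattern, matches):
    # Coroutine: consumes lines delivered through send().
    while True:
        line = (yield)          # the value passed to send() appears here
        if pattern in line:
            matches.append(line)

matches = []
g = grep("python", matches)
next(g)                         # prime it: run up to the first yield

reply = g.send("Yeah, but no, but yeah, but no")
g.send("A series of tubes")
g.send("python generators rock!")
```

Since the coroutine uses a bare `(yield)`, each `send()` call returns `None` to the caller.
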
25. Coroutine Execution
• Execution is the same as for a generator
• When you call a coroutine, nothing happens
• They only run in response to next() and send()
methods
    >>> g = grep("python")      # Notice that no output was produced
    >>> g.next()                # On first operation, coroutine starts running
    Looking for python
    >>>
26. Coroutine Priming
• All coroutines must be "primed" by first
calling .next() (or send(None))
• This advances execution to the location of the
first yield expression.
    def grep(pattern):
        print "Looking for %s" % pattern
        while True:
            line = (yield)      # .next() advances the coroutine to the first yield expression
            if pattern in line:
                print line,
• At this point, it's ready to receive a value
27. Using a Decorator
• It is easy to forget to call .next()
• Solved by wrapping coroutines with a decorator
coroutine.py
    def coroutine(func):
        def start(*args,**kwargs):
            cr = func(*args,**kwargs)
            cr.next()
            return cr
        return start

    @coroutine
    def grep(pattern):
        ...
• I will use this in most of the future examples
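A possible Python 3 rendering of the priming decorator is sketched below. The use of `functools.wraps`, the `raw_grep` name, and the `matches` list are assumptions added for illustration; the slide's coroutine.py does none of these. The sketch also shows what happens when priming is skipped.

```python
import functools

def coroutine(func):
    # Wrap a generator function so each new coroutine is primed automatically.
    @functools.wraps(func)
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)                 # Python 3 spelling of cr.next()
        return cr
    return start

def raw_grep(pattern, matches):
    while True:
        line = (yield)
        if pattern in line:
            matches.append(line)

grep = coroutine(raw_grep)       # same effect as decorating with @coroutine

# Without priming, the first send() of a real value is rejected.
try:
    raw_grep("python", []).send("python generators rock!")
    unprimed_rejected = False
except TypeError:
    unprimed_rejected = True

matches = []
g = grep("python", matches)      # already advanced to the first yield
g.send("python generators rock!")
```
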
28. Closing a Coroutine
• A coroutine might run indefinitely
• Use .close() to shut it down
    >>> g = grep("python")
    >>> g.next()        # Prime it
    Looking for python
    >>> g.send("Yeah, but no, but yeah, but no")
    >>> g.send("A series of tubes")
    >>> g.send("python generators rock!")
    python generators rock!
    >>> g.close()
• Note: Garbage collection also calls close()
29. Catching close()
• close() can be caught (GeneratorExit)
grepclose.py
    @coroutine
    def grep(pattern):
        print "Looking for %s" % pattern
        try:
            while True:
                line = (yield)
                if pattern in line:
                    print line,
        except GeneratorExit:
            print "Going away. Goodbye"
• You cannot ignore this exception
• Only legal action is to clean up and return
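The two rules about close() can be observed directly. In this Python 3 sketch, the `events` list and the `stubborn` generator are hypothetical helpers: the first records the cleanup step, the second deliberately yields again after catching GeneratorExit to show that the interpreter refuses this.

```python
def grep(pattern, events):
    events.append("started")
    try:
        while True:
            line = (yield)
            if pattern in line:
                events.append(line)
    except GeneratorExit:
        events.append("closed")   # clean up, then fall off the end

def stubborn():
    try:
        yield
    except GeneratorExit:
        yield                     # not allowed: yields instead of returning

events = []
g = grep("python", events)
next(g)
g.send("python rocks")
g.close()                         # raises GeneratorExit at the yield

s = stubborn()
next(s)
try:
    s.close()
    ignored_error = False
except RuntimeError:              # close() reports the ignored GeneratorExit
    ignored_error = True
```
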
30. Throwing an Exception
• Exceptions can be thrown inside a coroutine
    >>> g = grep("python")
    >>> g.next()        # Prime it
    Looking for python
    >>> g.send("python generators rock!")
    python generators rock!
    >>> g.throw(RuntimeError,"You're hosed")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 4, in grep
    RuntimeError: You're hosed
    >>>
• Exception originates at the yield expression
• Can be caught/handled in the usual ways
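To illustrate "caught/handled in the usual ways", here is a small Python 3 sketch. Choosing `ValueError` as the recoverable exception and recording a `"<recovered>"` marker are assumptions made for the example; the slide only shows the uncaught case. The sketch passes an exception instance to `throw()`, which is the single-argument form.

```python
def grep(pattern, matches):
    while True:
        try:
            line = (yield)        # a thrown exception surfaces here
        except ValueError:
            matches.append("<recovered>")
            continue              # handled: keep waiting for input
        if pattern in line:
            matches.append(line)

matches = []
g = grep("python", matches)
next(g)
g.send("python generators rock!")
g.throw(ValueError("bad input"))  # caught inside the coroutine
g.send("more python")

try:
    g.throw(RuntimeError("You're hosed"))   # not caught inside
    propagated = None
except RuntimeError as exc:
    propagated = str(exc)         # the exception comes back to the caller
```
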
31. Interlude
• Despite some similarities, generators and
coroutines are basically two different concepts
• Generators produce values
• Coroutines tend to consume values
• It is easy to get sidetracked because methods
meant for coroutines are sometimes described as
a way to tweak generators that are in the process
of producing an iteration pattern (i.e., resetting its
value). This is mostly bogus.
32. A Bogus Example
• A "generator" that produces and receives values
bogus.py
    def countdown(n):
        print "Counting down from", n
        while n >= 0:
            newvalue = (yield n)
            # If a new value got sent in, reset n with it
            if newvalue is not None:
                n = newvalue
            else:
                n -= 1
• It runs, but it's "flaky" and hard to understand
    c = countdown(5)
    for n in c:
        print n
        if n == 5:
            c.send(3)
  output:
    5
    2
    1
    0
  Notice how a value got "lost" in the iteration protocol
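Where the value goes can be made visible by capturing what `send()` returns. A minimal Python 3 sketch, with the print statement dropped and two hypothetical lists (`seen`, `swallowed`) added to record the values:

```python
def countdown(n):
    while n >= 0:
        newvalue = (yield n)
        # If a new value got sent in, reset n with it
        if newvalue is not None:
            n = newvalue
        else:
            n -= 1

c = countdown(5)
seen = []        # values delivered to the for-loop
swallowed = []   # values returned by send() instead

for n in c:
    seen.append(n)
    if n == 5:
        # send() resumes the generator and returns its next yielded value,
        # so that value never reaches the for-loop.
        swallowed.append(c.send(3))
```

The 3 is not really lost: it is handed back as the return value of `send()`, which the loop on the slide discards.
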
33. Keeping it Straight
• Generators produce data for iteration
• Coroutines are consumers of data
• To keep your brain from exploding, don't mix
the two concepts together
• Coroutines are not related to iteration
• Note : There is a use for having yield produce a
value in a coroutine, but it's not tied to iteration.
34. Part 2
Coroutines, Pipelines, and Dataflow
35. Processing Pipelines
• Coroutines can be used to set up pipes
    --send()--> coroutine --send()--> coroutine --send()--> coroutine
• You just chain coroutines together and push
data through the pipe with send() operations
36. Pipeline Sources
• The pipeline needs an initial source (a producer)
    source --send()--> coroutine --send()--> ...
• The source drives the entire pipeline
    def source(target):
        while not done:
            item = produce_an_item()
            ...
            target.send(item)
            ...
        target.close()
• It is typically not a coroutine
37. Pipeline Sinks
• The pipeline must have an end-point (sink)
    ... --send()--> coroutine --send()--> sink
• Collects all data sent to it and processes it
    @coroutine
    def sink():
        try:
            while True:
                item = (yield)        # Receive an item
                ...
        except GeneratorExit:         # Handle .close()
            # Done
            ...
38. An Example
• A source that mimics Unix 'tail -f'
cofollow.py
    import time
    def follow(thefile, target):
        thefile.seek(0,2)      # Go to the end of the file
        while True:
            line = thefile.readline()
            if not line:
                time.sleep(0.1)    # Sleep briefly
                continue
            target.send(line)
• A sink that just prints the lines
    @coroutine
    def printer():
        while True:
            line = (yield)
            print line,
39. An Example
• Hooking it together
    f = open("access-log")
    follow(f, printer())
• A picture
    follow() --send()--> printer()
• Critical point : follow() is driving the entire
computation by reading lines and pushing them
into the printer() coroutine
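The source/sink arrangement can be exercised without a log file. This is a sketch under stated assumptions: Python 3, a finite `source` function standing in for follow(), and a `collector` sink that stores items in a list and notes when it is closed. All three names are hypothetical and not part of cofollow.py.

```python
import functools

def coroutine(func):
    # Priming decorator, as on the earlier slide (Python 3 spelling).
    @functools.wraps(func)
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)
        return cr
    return start

def source(lines, target):
    # Drives the whole computation: push every item, then shut the pipe down.
    for line in lines:
        target.send(line)
    target.close()

@coroutine
def collector(received, status):
    try:
        while True:
            item = (yield)            # receive an item
            received.append(item)
    except GeneratorExit:             # handle .close()
        status.append("closed")

received, status = [], []
source(["first\n", "second\n"], collector(received, status))
```
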
40. Pipeline Filters
• Intermediate stages both receive and send
    --send()--> coroutine --send()-->
• Typically perform some kind of data
transformation, filtering, routing, etc.
    @coroutine
    def filter(target):
        while True:
            item = (yield)        # Receive an item
            # Transform/filter item
            ...
            # Send it along to the next stage
            target.send(item)
41. A Filter Example
• A grep filter coroutine
copipe.py
    @coroutine
    def grep(pattern,target):
        while True:
            line = (yield)            # Receive a line
            if pattern in line:
                target.send(line)     # Send to next stage
• Hooking it up
    f = open("access-log")
    follow(f,
           grep('python',
           printer()))
• A picture
    follow() --send()--> grep() --send()--> printer()
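A push-style pipeline with two intermediate stages might look like the following in Python 3. The `upper` filter, the `collect` sink, and the sample lines are hypothetical; only `grep` mirrors the slide.

```python
import functools

def coroutine(func):
    @functools.wraps(func)
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)                      # prime the coroutine
        return cr
    return start

@coroutine
def grep(pattern, target):
    while True:
        line = (yield)                # receive a line
        if pattern in line:
            target.send(line)         # send to next stage

@coroutine
def upper(target):
    # Hypothetical transforming stage.
    while True:
        line = (yield)
        target.send(line.upper())

@coroutine
def collect(into):
    # Hypothetical sink that stores what it receives.
    while True:
        into.append((yield))

results = []
pipe = grep("python", upper(collect(results)))

# The driver pushes data in; each stage pushes it onward with send().
for line in ["python rocks", "perl too", "ply and python"]:
    pipe.send(line)
```
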
42. Interlude
• Coroutines flip generators around
generators/iteration:
    input sequence -> generator -> generator -> generator -> for x in s:
coroutines:
    source --send()--> coroutine --send()--> coroutine
• Key difference. Generators pull data through
the pipe with iteration. Coroutines push data
into the pipeline with send().
43. Being Branchy
• With coroutines, you can send data to multiple
destinations
[Diagram: a source sends data into a coroutine, which fans out through several send() calls to multiple downstream coroutines]
• The source simply "sends" data. Further routing
of that data can be arbitrarily complex
44. Example : Broadcasting
• Broadcast to multiple targets
cobroadcast.py
    @coroutine
    def broadcast(targets):
        while True:
            item = (yield)
            for target in targets:
                target.send(item)
• This takes a sequence of coroutines (targets)
and sends received items to all of them.
45. Example : Broadcasting
• Example use:
    f = open("access-log")
    follow(f,
           broadcast([grep('python',printer()),
                      grep('ply',printer()),
                      grep('swig',printer())])
           )

                         --> grep('python') --> printer()
    follow --> broadcast --> grep('ply')    --> printer()
                         --> grep('swig')   --> printer()
46. Example : Broadcasting
• A more disturbing variation...
cobroadcast2.py
    f = open("access-log")
    p = printer()
    follow(f,
           broadcast([grep('python',p),
                      grep('ply',p),
                      grep('swig',p)])
           )

                         --> grep('python') --+
    follow --> broadcast --> grep('ply')    --+--> printer()
                         --> grep('swig')   --+
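Both broadcasting variations can be compared side by side on a small input. The sketch assumes Python 3 and uses a hypothetical `collect` sink in place of printer() so the routing is visible in plain lists; the sample lines are made up.

```python
import functools

def coroutine(func):
    @functools.wraps(func)
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)
        return cr
    return start

@coroutine
def broadcast(targets):
    while True:
        item = (yield)
        for target in targets:
            target.send(item)

@coroutine
def grep(pattern, target):
    while True:
        line = (yield)
        if pattern in line:
            target.send(line)

@coroutine
def collect(into):
    while True:
        into.append((yield))

lines = ["python and swig", "ply only", "nothing here"]

# Variation 1: every branch has its own sink.
py, ply, swig = [], [], []
b1 = broadcast([grep("python", collect(py)),
                grep("ply", collect(ply)),
                grep("swig", collect(swig))])
for line in lines:
    b1.send(line)

# Variation 2: all branches merge into one shared sink.
shared = []
sink = collect(shared)
b2 = broadcast([grep("python", sink), grep("ply", sink), grep("swig", sink)])
for line in lines:
    b2.send(line)
```

With the shared sink, a line that matches two patterns arrives twice, once per branch.
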
47. Interlude
• Coroutines provide more powerful data routing
possibilities than simple iterators
• If you built a collection of simple data processing
components, you can glue them together into
complex arrangements of pipes, branches,
merging, etc.
• Although there are some limitations (later)
48. A Digression
• In preparing this tutorial, I found myself wishing
that variable assignment was an expression
@coroutine
def printer():
    while True:
        line = (yield)
        print line,

vs.

@coroutine
def printer():
    while (line = yield):
        print line,
• However, I'm not holding my breath on that...
• Actually, I'm expecting to be flogged with a
rubber chicken for even suggesting it.
49. Coroutines vs. Objects
• Coroutines are somewhat similar to OO design
patterns involving simple handler objects
class GrepHandler(object):
    def __init__(self,pattern, target):
        self.pattern = pattern
        self.target = target
    def send(self,line):
        if self.pattern in line:
            self.target.send(line)
• The coroutine version
@coroutine
def grep(pattern,target):
    while True:
        line = (yield)
        if pattern in line:
            target.send(line)
50. Coroutines vs. Objects
• There is a certain "conceptual simplicity"
• A coroutine is one function definition
• If you define a handler class...
• You need a class definition
• Two method definitions
• Probably a base class and a library import
• Essentially you're stripping the idea down to the
bare essentials (like a generator vs. iterator)
51. Coroutines vs. Objects
• Coroutines are faster
• A micro benchmark
# benchmark.py
@coroutine
def null():
    while True: item = (yield)

line = 'python is nice'
p1 = grep('python',null())          # Coroutine
p2 = GrepHandler('python',null())   # Object
• Send in 1,000,000 lines
timeit("p1.send(line)",
       "from __main__ import line,p1")    # 0.60 s
timeit("p2.send(line)",
       "from __main__ import line,p2")    # 0.92 s
52. Coroutines & Objects
• Understanding the performance difference
class GrepHandler(object):
    ...
    def send(self,line):
        if self.pattern in line:        # Look at these self lookups!
            self.target.send(line)
• Look at the coroutine
@coroutine
def grep(pattern, target):
    while True:
        line = (yield)
        if pattern in line:             # "self" free
            target.send(line)
53. Part 3
Coroutines and Event Dispatching
54. Event Handling
• Coroutines can be used to write various
components that process event streams
• Let's look at an example...
55. Problem
• Where is my ^&#&@* bus?
• Chicago Transit Authority (CTA) equips most
of its buses with real-time GPS tracking
• You can get current data on every bus on the
street as a big XML document
• Use "The Google" to search for details...
57. XML Parsing
• There are many possible ways to parse XML
• An old-school approach: SAX
• SAX is an event driven interface
XML Parser --events--> Handler Object

class Handler:
    def startElement():
        ...
    def endElement():
        ...
    def characters():
        ...
58. Minimal SAX Example
# basicsax.py
import xml.sax

class MyHandler(xml.sax.ContentHandler):
    def startElement(self,name,attrs):
        print "startElement", name
    def endElement(self,name):
        print "endElement", name
    def characters(self,text):
        print "characters", repr(text)[:40]

xml.sax.parse("somefile.xml",MyHandler())
• You see this same programming pattern in
other settings (e.g., HTMLParser module)
59. Some Issues
• SAX is often used because it can be used to
incrementally process huge XML files without
a large memory footprint
• However, the event-driven nature of SAX
parsing makes it rather awkward and low-level
to deal with
60. From SAX to Coroutines
• You can dispatch SAX events into coroutines
• Consider this SAX handler
# cosax.py
import xml.sax

class EventHandler(xml.sax.ContentHandler):
    def __init__(self,target):
        self.target = target
    def startElement(self,name,attrs):
        self.target.send(('start',(name,attrs._attrs)))
    def characters(self,text):
        self.target.send(('text',text))
    def endElement(self,name):
        self.target.send(('end',name))
• It does nothing but send events to a target
61. An Event Stream
• The big picture
SAX Parser --events--> Handler --send()--> (event,value)

Event type    Event values
'start'       ('direction',{})
'end'         'direction'
'text'        'North Bound'
• Observe : Coding this was straightforward
62. Event Processing
• To do anything interesting, you have to
process the event stream
• Example: Convert bus elements into
dictionaries (XML sucks, dictionaries rock)
<bus>                                     {
    <id>7574</id>                             'id' : '7574',
    <route>147</route>                        'route' : '147',
    <revenue>true</revenue>                   'revenue' : 'true',
    <direction>North Bound</direction>        'direction' : 'North Bound'
    ...                                       ...
</bus>                                    }
63. Buses to Dictionaries
# buses.py
@coroutine
def buses_to_dicts(target):
    while True:
        event, value = (yield)
        # Look for the start of a <bus> element
        if event == 'start' and value[0] == 'bus':
            busdict = { }
            fragments = []
            # Capture text of inner elements in a dict
            while True:
                event, value = (yield)
                if event == 'start': fragments = []
                elif event == 'text': fragments.append(value)
                elif event == 'end':
                    if value != 'bus':
                        busdict[value] = "".join(fragments)
                    else:
                        target.send(busdict)
                        break
64. State Machines
• The previous code works by implementing a
simple state machine
A --('start',('bus',*))--> B
B --('end','bus')--> A
• State A: Looking for a bus
• State B: Collecting bus attributes
• Comment : Coroutines are perfect for this
65. Buses to Dictionaries
@coroutine
def buses_to_dicts(target):
    while True:
        # State A
        event, value = (yield)
        # Look for the start of a <bus> element
        if event == 'start' and value[0] == 'bus':
            busdict = { }
            fragments = []
            # State B
            # Capture text of inner elements in a dict
            while True:
                event, value = (yield)
                if event == 'start': fragments = []
                elif event == 'text': fragments.append(value)
                elif event == 'end':
                    if value != 'bus':
                        busdict[value] = "".join(fragments)
                    else:
                        target.send(busdict)
                        break
66. Filtering Elements
• Let's filter on dictionary fields
@coroutine
def filter_on_field(fieldname,value,target):
    while True:
        d = (yield)
        if d.get(fieldname) == value:
            target.send(d)
• Examples:
filter_on_field("route","22",target)
filter_on_field("direction","North Bound",target)
67. Processing Elements
• Where's my bus?
@coroutine
def bus_locations():
    while True:
        bus = (yield)
        print "%(route)s,%(id)s,\"%(direction)s\","\
              "%(latitude)s,%(longitude)s" % bus
• This receives dictionaries and prints a table
22,1485,"North Bound",41.880481123924255,-87.62948191165924
22,1629,"North Bound",42.01851969751819,-87.6730209876751
...
68. Hooking it Together
• Find all locations of the North Bound #22 bus
(the slowest moving object in the universe)
xml.sax.parse("allroutes.xml",
              EventHandler(
                  buses_to_dicts(
                      filter_on_field("route","22",
                          filter_on_field("direction","North Bound",
                              bus_locations())))
              ))
• This final step involves a bit of plumbing, but
each of the parts is relatively simple
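The plumbing can be tried end to end on a tiny document. The following is a sketch under stated assumptions, not the tutorial's own files: Python 3 syntax, a made-up in-memory SAMPLE document instead of allroutes.xml, dict(attrs.items()) instead of the private attrs._attrs, and a hypothetical collect() sink instead of bus_locations().

```python
import xml.sax

def coroutine(func):
    """Advance a new generator to its first yield so it can accept send()."""
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)
        return cr
    return start

class EventHandler(xml.sax.ContentHandler):
    """Forward SAX callbacks to a coroutine as (event, value) tuples."""
    def __init__(self, target):
        super().__init__()
        self.target = target
    def startElement(self, name, attrs):
        self.target.send(("start", (name, dict(attrs.items()))))
    def characters(self, text):
        self.target.send(("text", text))
    def endElement(self, name):
        self.target.send(("end", name))

@coroutine
def buses_to_dicts(target):
    while True:
        event, value = (yield)
        if event == "start" and value[0] == "bus":      # state A -> B
            busdict, fragments = {}, []
            while True:
                event, value = (yield)
                if event == "start":
                    fragments = []
                elif event == "text":
                    fragments.append(value)
                elif event == "end":
                    if value != "bus":
                        busdict[value] = "".join(fragments)
                    else:
                        target.send(busdict)            # state B -> A
                        break

@coroutine
def filter_on_field(fieldname, value, target):
    while True:
        d = (yield)
        if d.get(fieldname) == value:
            target.send(d)

@coroutine
def collect(results):               # hypothetical sink used instead of bus_locations()
    while True:
        results.append((yield))

# Made-up sample data; the element names follow the slides.
SAMPLE = b"""<buses>
<bus><id>1485</id><route>22</route><direction>North Bound</direction></bus>
<bus><id>7574</id><route>147</route><direction>North Bound</direction></bus>
<bus><id>1629</id><route>22</route><direction>South Bound</direction></bus>
</buses>"""

matches = []
xml.sax.parseString(SAMPLE,
    EventHandler(
        buses_to_dicts(
            filter_on_field("route", "22",
                filter_on_field("direction", "North Bound",
                    collect(matches))))))
```

Only the first bus passes both filters, so matches ends up holding a single dictionary.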
69. How Low Can You Go?
• I've picked this XML example for a reason
• One interesting thing about coroutines is that
you can push the initial data source as low-
level as you want to make it without rewriting
all of the processing stages
• Let's say SAX just isn't quite fast enough...
70. XML Parsing with Expat
• Let's strip it down....
# coexpat.py
import xml.parsers.expat

def expat_parse(f,target):
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_size = 65536
    parser.buffer_text = True
    parser.returns_unicode = False
    parser.StartElementHandler = \
        lambda name,attrs: target.send(('start',(name,attrs)))
    parser.EndElementHandler = \
        lambda name: target.send(('end',name))
    parser.CharacterDataHandler = \
        lambda data: target.send(('text',data))
    parser.ParseFile(f)
• expat is low-level (a C extension module)
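A Python 3 flavored sketch of the same idea, assuming an in-memory byte string instead of a file and a hypothetical collect() sink. The returns_unicode setting from the slide is left out because it is a Python 2 option.

```python
import xml.parsers.expat

def coroutine(func):
    """Advance a new generator to its first yield so it can accept send()."""
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)
        return cr
    return start

def expat_parse(data, target):
    """Parse a bytes document and push (event, value) tuples into target."""
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True       # deliver contiguous text as one chunk
    parser.StartElementHandler = \
        lambda name, attrs: target.send(("start", (name, attrs)))
    parser.EndElementHandler = \
        lambda name: target.send(("end", name))
    parser.CharacterDataHandler = \
        lambda text: target.send(("text", text))
    parser.Parse(data, True)        # True marks this as the final chunk

@coroutine
def collect(results):               # hypothetical sink that records events
    while True:
        results.append((yield))

events = []
expat_parse(b"<bus><route>22</route></bus>", collect(events))
```

The event tuples have the same shape as the ones produced by the SAX handler, which is why the downstream stages do not need to change.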
71. Performance Contest
• SAX version (on a 30MB XML input)
xml.sax.parse("allroutes.xml",EventHandler(
    buses_to_dicts(
        filter_on_field("route","22",
            filter_on_field("direction","North Bound",
                bus_locations())))))                # 8.37s
• Expat version
expat_parse(open("allroutes.xml"),
    buses_to_dicts(
        filter_on_field("route","22",
            filter_on_field("direction","North Bound",
                bus_locations()))))                 # 4.51s (83% speedup)
• No changes to the processing stages
72. Going Lower
• You can even drop send() operations into C
• A skeleton of how this works...
/* cxml/cxmlparse.c */
PyObject *
py_parse(PyObject *self, PyObject *args) {
    char *filename;
    PyObject *target;
    PyObject *send_method;
    if (!PyArg_ParseTuple(args,"sO",&filename,&target)) {
        return NULL;
    }
    send_method = PyObject_GetAttrString(target,"send");
    ...
    /* Invoke target.send(item) */
    args = Py_BuildValue("(O)",item);
    result = PyEval_CallObject(send_method,args);
    ...
73. Performance Contest
• Expat version
expat_parse(open("allroutes.xml"),
    buses_to_dicts(
        filter_on_field("route","22",
            filter_on_field("direction","North Bound",
                bus_locations()))))                 # 4.51s
• A custom C extension written directly on top
of the expat C library (code not shown)
cxmlparse.parse("allroutes.xml",
    buses_to_dicts(
        filter_on_field("route","22",
            filter_on_field("direction","North Bound",
                bus_locations()))))                 # 2.95s (55% speedup)
74. Interlude
• ElementTree has fast incremental XML parsing
# iterbus.py
from xml.etree.cElementTree import iterparse

for event,elem in iterparse("allroutes.xml",('start','end')):
    if event == 'start' and elem.tag == 'buses':
        buses = elem
    elif event == 'end' and elem.tag == 'bus':
        busdict = dict((child.tag,child.text)
                       for child in elem)
        if (busdict['route'] == '22' and
            busdict['direction'] == 'North Bound'):
            print "%(id)s,%(route)s,\"%(direction)s\","\
                  "%(latitude)s,%(longitude)s" % busdict
        buses.remove(elem)
# 3.04s
• Observe: Coroutines are in the same range
75. Part 4
From Data Processing to Concurrent Programming
76. The Story So Far
• Coroutines are similar to generators
• You can create collections of small processing
components and connect them together
• You can process data by setting up pipelines,
dataflow graphs, etc.
• You can use coroutines with code that has
tricky execution (e.g., event driven systems)
• However, there is so much more going on...
77. A Common Theme
• You send data to coroutines
• You send data to threads (via queues)
• You send data to processes (via messages)
• Coroutines naturally tie into problems
involving threads and distributed systems.
78. Basic Concurrency
• You can package coroutines inside threads or
subprocesses by adding extra layers
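[Diagram: a source feeds a chain of coroutines; downstream coroutines can live in another thread (reached through a queue), in a subprocess (reached through a pipe), or on another host (reached through a socket)]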
• Will sketch out some basic ideas...
79. A Threaded Target
# cothread.py
@coroutine
def threaded(target):
    messages = Queue()
    def run_target():
        while True:
            item = messages.get()
            if item is GeneratorExit:
                target.close()
                return
            else:
                target.send(item)
    Thread(target=run_target).start()
    try:
        while True:
            item = (yield)
            messages.put(item)
    except GeneratorExit:
        messages.put(GeneratorExit)
80. A Threaded Target
@coroutine
def threaded(target):
    messages = Queue()                  # A message queue
    def run_target():
        while True:
            item = messages.get()
            if item is GeneratorExit:
                target.close()
                return
            else:
                target.send(item)
    Thread(target=run_target).start()
    try:
        while True:
            item = (yield)
            messages.put(item)
    except GeneratorExit:
        messages.put(GeneratorExit)
81. A Threaded Target
@coroutine
def threaded(target):
    messages = Queue()
    # A thread. Loop forever, pulling items out of the message
    # queue and sending them to the target
    def run_target():
        while True:
            item = messages.get()
            if item is GeneratorExit:
                target.close()
                return
            else:
                target.send(item)
    Thread(target=run_target).start()
    try:
        while True:
            item = (yield)
            messages.put(item)
    except GeneratorExit:
        messages.put(GeneratorExit)
82. A Threaded Target
@coroutine
def threaded(target):
    messages = Queue()
    def run_target():
        while True:
            item = messages.get()
            if item is GeneratorExit:
                target.close()
                return
            else:
                target.send(item)
    Thread(target=run_target).start()
    try:
        # Receive items and pass them into the thread (via the queue)
        while True:
            item = (yield)
            messages.put(item)
    except GeneratorExit:
        messages.put(GeneratorExit)
83. A Threaded Target
@coroutine
def threaded(target):
    messages = Queue()
    def run_target():
        while True:
            item = messages.get()
            # Handle close() so that the thread shuts down correctly
            if item is GeneratorExit:
                target.close()
                return
            else:
                target.send(item)
    Thread(target=run_target).start()
    try:
        while True:
            item = (yield)
            messages.put(item)
    except GeneratorExit:
        messages.put(GeneratorExit)
84. A Thread Example
• Example of hooking things up
xml.sax.parse("allroutes.xml", EventHandler(
    buses_to_dicts(
        threaded(
            filter_on_field("route","22",
                filter_on_field("direction","North Bound",
                    bus_locations()))
        ))))
• A caution: adding threads makes this example
run about 50% slower.
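The threaded() wrapper can be exercised on its own with a trivial target. This is a minimal sketch in Python 3 syntax; the collector() sink and the done event are hypothetical additions that let the main thread wait for the worker to drain the queue.

```python
from queue import Queue
from threading import Thread, Event

def coroutine(func):
    """Advance a new generator to its first yield so it can accept send()."""
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)
        return cr
    return start

@coroutine
def threaded(target):
    messages = Queue()
    def run_target():
        # Runs in the worker thread: drain the queue into the target.
        while True:
            item = messages.get()
            if item is GeneratorExit:       # sentinel posted on close()
                target.close()
                return
            target.send(item)
    Thread(target=run_target).start()
    try:
        while True:
            item = (yield)                  # runs in the sender's thread
            messages.put(item)
    except GeneratorExit:
        messages.put(GeneratorExit)

received = []
done = Event()                              # hypothetical: signals that the sink was closed

@coroutine
def collector():
    try:
        while True:
            received.append((yield))
    finally:
        done.set()

t = threaded(collector())
for i in range(5):
    t.send(i)
t.close()                   # posts the sentinel; the worker then closes the sink
done.wait(timeout=5)
```

Only the worker thread ever calls send() on the sink after it is created, which keeps the access to the target coroutine serialized.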
85. A Picture
• Here is an overview of the last example
Main Program: xml.sax.parse -> EventHandler -> buses_to_dicts
Thread:       filter_on_field -> filter_on_field -> bus_locations
86. A Subprocess Target
• Can also bridge two coroutines over a file/pipe
# coprocess.py
@coroutine
def sendto(f):
    try:
        while True:
            item = (yield)
            pickle.dump(item,f)
            f.flush()
    except GeneratorExit:       # close() raises GeneratorExit at the yield
        f.close()

def recvfrom(f,target):
    try:
        while True:
            item = pickle.load(f)
            target.send(item)
    except EOFError:
        target.close()
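A single-process sketch of the sendto()/recvfrom() pair, assuming Python 3 and an os.pipe() standing in for the connection to a real subprocess. The payloads are kept small so they fit in the pipe buffer before anything reads them; collect() is a hypothetical sink.

```python
import os
import pickle

def coroutine(func):
    """Advance a new generator to its first yield so it can accept send()."""
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)
        return cr
    return start

@coroutine
def sendto(f):
    try:
        while True:
            item = (yield)
            pickle.dump(item, f)
            f.flush()
    except GeneratorExit:       # raised at the yield when close() is called
        f.close()

def recvfrom(f, target):
    try:
        while True:
            item = pickle.load(f)
            target.send(item)
    except EOFError:            # the writer closed its end of the pipe
        target.close()

@coroutine
def collect(results):           # hypothetical sink that stores what it receives
    while True:
        results.append((yield))

read_fd, write_fd = os.pipe()
writer = os.fdopen(write_fd, "wb")
reader = os.fdopen(read_fd, "rb")

out = sendto(writer)
for item in [{"route": "22"}, {"route": "147"}]:
    out.send(item)
out.close()                     # closes the write end, so the reader sees EOF

received = []
recvfrom(reader, collect(received))
reader.close()
```

In a real deployment the read end would belong to another process, and larger payloads would need the reader to run concurrently with the writer.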
87. A Subprocess Target
• High Level Picture
sendto() [pickle.dump()] --pipe/socket--> recvfrom() [pickle.load()]
• Of course, the devil is in the details...
• You would not do this unless you can recover
the cost of the underlying communication
(e.g., you have multiple CPUs and there's
enough processing to make it worthwhile)
88. Implementation vs. Environ
• With coroutines, you can separate the
implementation of a task from its execution
environment
• The coroutine is the implementation
• The environment is whatever you choose
(threads, subprocesses, network, etc.)
89. A Caution
• Creating huge collections of coroutines,
threads, and processes might be a good way to
create an unmaintainable application (although
it might increase your job security)
• And it might make your program run slower!
• You need to carefully study the problem to
know if any of this is a good idea
90. Some Hidden Dangers
• The send() method on a coroutine must be
properly synchronized
• If you call send() on an already-executing
coroutine, your program will crash
• Example : Multiple threads sending data into
the same target coroutine (cocrash.py)
91. Limitations
• You also can't create loops or cycles
source --send()--> coroutine --send()--> coroutine --send()--> (back to the first coroutine)
• Stacked sends are building up a kind of call-stack
(send() doesn't return until the target yields)
• If you call a coroutine that's already in the
process of sending, you'll get an error
• send() doesn't suspend coroutine execution
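The cycle limitation is easy to reproduce. In this sketch (Python 3 syntax) relay() and its get_target argument are hypothetical; the target is looked up lazily only so that two coroutines can be wired into a loop after both exist.

```python
def coroutine(func):
    """Advance a new generator to its first yield so it can accept send()."""
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)
        return cr
    return start

@coroutine
def relay(get_target):
    # get_target is a hypothetical callable returning the next stage.
    while True:
        item = (yield)
        get_target().send(item)

first = relay(lambda: second)
second = relay(lambda: first)       # closes the loop: first -> second -> first

try:
    first.send("data")              # first is still inside send() when second calls it
    error = None
except ValueError as exc:
    error = str(exc)
```

The nested send() calls stack up like function calls, so the second coroutine finds the first one still running and the interpreter refuses the call.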
92. Part 5
Coroutines as Tasks
93. The Task Concept
• In concurrent programming, one typically
subdivides problems into "tasks"
• Tasks have a few essential features
• Independent control flow
• Internal state
• Can be scheduled (suspended/resumed)
• Can communicate with other tasks
• Claim : Coroutines are tasks
94. Are Coroutines Tasks?
• Let's look at the essential parts
• Coroutines have their own control flow.
@coroutine
def grep(pattern):
    # statements
    print "Looking for %s" % pattern
    while True:
        line = (yield)
        if pattern in line:
            print line,
• A coroutine is just a sequence of statements like
any other Python function
95. Are Coroutines Tasks?
• Coroutines have their own internal state
• For example : local variables
@coroutine
def grep(pattern):                      # locals: pattern, line
    print "Looking for %s" % pattern
    while True:
        line = (yield)
        if pattern in line:
            print line,
• The locals live as long as the coroutine is active
• They establish an execution environment
96. Are Coroutines Tasks?
• Coroutines can communicate
• The .send() method sends data to a coroutine
@coroutine
def grep(pattern):
    print "Looking for %s" % pattern
    while True:
        line = (yield)                  # <- send(msg)
        if pattern in line:
            print line,
• yield expressions receive input
97. Are Coroutines Tasks?
• Coroutines can be suspended and resumed
• yield suspends execution
• send() resumes execution
• close() terminates execution
98. I'm Convinced
• Very clearly, coroutines look like tasks
• But they're not tied to threads
• Or subprocesses
• A question : Can you perform multitasking
without using either of those concepts?
• Multitasking using nothing but coroutines?
99. Part 6
A Crash Course in Operating Systems
100. Program Execution
• On a CPU, a program is a series of instructions
int main() {
    int i, total = 0;
    for (i = 0; i < 10; i++)
    {
        total += i;
    }
}

cc =>

_main:
    pushl %ebp
    movl %esp, %ebp
    subl $24, %esp
    movl $0, -12(%ebp)
    movl $0, -16(%ebp)
    jmp L2
L3:
    movl -16(%ebp), %eax
    leal -12(%ebp), %edx
    addl %eax, (%edx)
    leal -16(%ebp), %eax
    incl (%eax)
L2:
    cmpl $9, -16(%ebp)
    jle L3
    leave
    ret
• When running, there is no notion of doing
more than one thing at a time (or any kind
of task switching)
101. The Multitasking Problem
• CPUs don't know anything about multitasking
• Nor do application programs
• Well, surely something has to know about it!
• Hint: It's the operating system
102. Operating Systems
• As you hopefully know, the operating system
(e.g., Linux, Windows) is responsible for
running programs on your machine
• And as you have observed, the operating
system does allow more than one process to
execute at once (i.e., multitasking)
• It does this by rapidly switching between tasks
• Question : How does it do that?
103. A Conundrum
• When a CPU is running your program, it is not
running the operating system
• Question: How does the operating system
(which is not running) make an application
(which is running) switch to another task?
• The "context-switching" problem...
104. Interrupts and Traps
• There are usually only two mechanisms that an
operating system uses to gain control
• Interrupts - Some kind of hardware related
signal (data received, timer, keypress, etc.)
• Traps - A software generated signal
• In both cases, the CPU briefly suspends what it is
doing, and runs code that's part of the OS
• It is at this time the OS might switch tasks
105. Traps and System Calls
• Low-level system calls are actually traps
• It is a special CPU instruction
read(fd,buf,nbytes)

read:
    push %ebx
    mov 0x10(%esp),%edx
    mov 0xc(%esp),%ecx
    mov 0x8(%esp),%ebx
    mov $0x3,%eax
    int $0x80            <- trap
    pop %ebx
    ...
• When a trap instruction executes, the program
suspends execution at that point
• And the OS takes over
106. High Level Overview
• Traps are what make an OS work
• The OS drops your program on the CPU
• It runs until it hits a trap (system call)
• The program suspends and the OS runs
• Repeat
run -> trap -> run -> trap -> run -> trap -> run
(the OS executes at each trap)
107. Task Switching
• Here's what typically happens when an
OS runs multiple tasks.
[Diagram: Task A runs until it traps, a task switch occurs, Task B runs until it traps, and control switches back to Task A]
• On each trap, the system switches to a
different task (cycling between them)
108. Task Scheduling
• To run many tasks, add a bunch of queues
[Diagram: a ready queue of tasks feeds the tasks running on the CPUs; traps move running tasks onto wait queues]
109. An Insight
• The yield statement is a kind of "trap"
• No really!
• When a generator function hits a "yield"
statement, it immediately suspends execution
• Control is passed back to whatever code
made the generator function run (unseen)
• If you treat yield as a trap, you can build a
multitasking "operating system"--all in Python!
110. Part 7
Let's Build an Operating System
(You may want to put on your 5-point safety harness)
111. Our Challenge
• Build a multitasking "operating system"
• Use nothing but pure Python code
• No threads
• No subprocesses
• Use generators/coroutines
112. Some Motivation
• There has been a lot of recent interest in
alternatives to threads (especially due to the GIL)
• Non-blocking and asynchronous I/O
• Example: servers capable of supporting
thousands of simultaneous client connections
• A lot of work has focused on event-driven
systems or the "Reactor Model" (e.g., Twisted)
• Coroutines are a whole different twist...
113. Step 1: Define Tasks
• A task object
# pyos1.py
class Task(object):
    taskid = 0
    def __init__(self,target):
        Task.taskid += 1
        self.tid = Task.taskid      # Task ID
        self.target = target        # Target coroutine
        self.sendval = None         # Value to send
    def run(self):
        return self.target.send(self.sendval)
• A task is a wrapper around a coroutine
• There is only one operation : run()
114. Task Example
• Here is how this wrapper behaves
# A very simple generator
def foo():
    print "Part 1"
    yield
    print "Part 2"
    yield

>>> t1 = Task(foo())        # Wrap in a Task
>>> t1.run()
Part 1
>>> t1.run()
Part 2
>>>
• run() executes the task to the next yield (a trap)
115. Step 2: The Scheduler
# pyos2.py
class Scheduler(object):
    def __init__(self):
        self.ready = Queue()
        self.taskmap = {}
    def new(self,target):
        newtask = Task(target)
        self.taskmap[newtask.tid] = newtask
        self.schedule(newtask)
        return newtask.tid
    def schedule(self,task):
        self.ready.put(task)
    def mainloop(self):
        while self.taskmap:
            task = self.ready.get()
            result = task.run()
            self.schedule(task)
116. Step 2: The Scheduler
class Scheduler(object):
    def __init__(self):
        self.ready = Queue()        # A queue of tasks that are ready to run
        self.taskmap = {}
    def new(self,target):
        newtask = Task(target)
        self.taskmap[newtask.tid] = newtask
        self.schedule(newtask)
        return newtask.tid
    def schedule(self,task):
        self.ready.put(task)
    def mainloop(self):
        while self.taskmap:
            task = self.ready.get()
            result = task.run()
            self.schedule(task)
117. Step 2: The Scheduler
class Scheduler(object):
    def __init__(self):
        self.ready = Queue()
        self.taskmap = {}
    # Introduces a new task to the scheduler
    def new(self,target):
        newtask = Task(target)
        self.taskmap[newtask.tid] = newtask
        self.schedule(newtask)
        return newtask.tid
    def schedule(self,task):
        self.ready.put(task)
    def mainloop(self):
        while self.taskmap:
            task = self.ready.get()
            result = task.run()
            self.schedule(task)
118. Step 2: The Scheduler
class Scheduler(object):
    def __init__(self):
        self.ready = Queue()
        # A dictionary that keeps track of all active tasks (each
        # task has a unique integer task ID) (more later)
        self.taskmap = {}
    def new(self,target):
        newtask = Task(target)
        self.taskmap[newtask.tid] = newtask
        self.schedule(newtask)
        return newtask.tid
    def schedule(self,task):
        self.ready.put(task)
    def mainloop(self):
        while self.taskmap:
            task = self.ready.get()
            result = task.run()
            self.schedule(task)
119. Step 2: The Scheduler
class Scheduler(object):
    def __init__(self):
        self.ready = Queue()
        self.taskmap = {}
    def new(self,target):
        newtask = Task(target)
        self.taskmap[newtask.tid] = newtask
        self.schedule(newtask)
        return newtask.tid
    # Put a task onto the ready queue. This makes it available to run.
    def schedule(self,task):
        self.ready.put(task)
    def mainloop(self):
        while self.taskmap:
            task = self.ready.get()
            result = task.run()
            self.schedule(task)
120. Step 2: The Scheduler
class Scheduler(object):
    def __init__(self):
        self.ready = Queue()
        self.taskmap = {}
    def new(self,target):
        newtask = Task(target)
        self.taskmap[newtask.tid] = newtask
        self.schedule(newtask)
        return newtask.tid
    def schedule(self,task):
        self.ready.put(task)
    # The main scheduler loop. It pulls tasks off the queue and
    # runs them to the next yield.
    def mainloop(self):
        while self.taskmap:
            task = self.ready.get()
            result = task.run()
            self.schedule(task)
121. First Multitasking
• Two tasks:
def foo():
    while True:
        print "I'm foo"
        yield

def bar():
    while True:
        print "I'm bar"
        yield
• Running them into the scheduler
sched = Scheduler()
sched.new(foo())
sched.new(bar())
sched.mainloop()
122. First Multitasking
• Example output:
I'm foo
I'm bar
I'm foo
I'm bar
I'm foo
I'm bar
• Emphasize: yield is a trap
• Each task runs until it hits the yield
• At this point, the scheduler regains control
and switches to the other task
123. Problem : Task Termination
• The scheduler crashes if a task returns
# taskcrash.py
def foo():
    for i in xrange(10):
        print "I'm foo"
        yield
...
I'm foo
I'm bar
I'm foo
I'm bar
Traceback (most recent call last):
  File "crash.py", line 20, in <module>
    sched.mainloop()
  File "scheduler.py", line 26, in mainloop
    result = task.run()
  File "task.py", line 13, in run
    return self.target.send(self.sendval)
StopIteration
124. Step 3: Task Exit
# pyos3.py
class Scheduler(object):
    ...
    def exit(self,task):
        print "Task %d terminated" % task.tid
        del self.taskmap[task.tid]
    ...
    def mainloop(self):
        while self.taskmap:
            task = self.ready.get()
            try:
                result = task.run()
            except StopIteration:
                self.exit(task)
                continue
            self.schedule(task)
125. Step 3: Task Exit
class Scheduler(object):
    ...
    # Remove the task from the scheduler's task map
    def exit(self,task):
        print "Task %d terminated" % task.tid
        del self.taskmap[task.tid]
    ...
    def mainloop(self):
        while self.taskmap:
            task = self.ready.get()
            try:
                result = task.run()
            except StopIteration:
                self.exit(task)
                continue
            self.schedule(task)
126. Step 3: Task Exit
class Scheduler(object):
    ...
    def exit(self,task):
        print "Task %d terminated" % task.tid
        del self.taskmap[task.tid]
    ...
    def mainloop(self):
        while self.taskmap:
            task = self.ready.get()
            try:
                result = task.run()
            # Catch task exit and cleanup
            except StopIteration:
                self.exit(task)
                continue
            self.schedule(task)
127. Second Multitasking
• Two tasks:
def foo():
    for i in xrange(10):
        print "I'm foo"
        yield

def bar():
    for i in xrange(5):
        print "I'm bar"
        yield

sched = Scheduler()
sched.new(foo())
sched.new(bar())
sched.mainloop()
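Putting Steps 1 to 3 together gives a scheduler that can be run as is. The sketch below uses Python 3 syntax and shorter loops than the slide; the log list and the finished attribute are hypothetical additions that record what happened instead of printing it.

```python
from queue import Queue

class Task:
    taskid = 0
    def __init__(self, target):
        Task.taskid += 1
        self.tid = Task.taskid      # Task ID
        self.target = target        # Target coroutine
        self.sendval = None         # Value to send
    def run(self):
        return self.target.send(self.sendval)

class Scheduler:
    def __init__(self):
        self.ready = Queue()
        self.taskmap = {}
        self.finished = []          # hypothetical: task IDs in order of exit
    def new(self, target):
        newtask = Task(target)
        self.taskmap[newtask.tid] = newtask
        self.schedule(newtask)
        return newtask.tid
    def exit(self, task):
        self.finished.append(task.tid)
        del self.taskmap[task.tid]
    def schedule(self, task):
        self.ready.put(task)
    def mainloop(self):
        while self.taskmap:
            task = self.ready.get()
            try:
                task.run()          # runs the task to its next yield (the "trap")
            except StopIteration:   # the task returned
                self.exit(task)
                continue
            self.schedule(task)

log = []                            # hypothetical: stands in for print

def foo():
    for i in range(3):
        log.append("I'm foo")
        yield

def bar():
    for i in range(2):
        log.append("I'm bar")
        yield

sched = Scheduler()
foo_tid = sched.new(foo())
bar_tid = sched.new(bar())
sched.mainloop()
```

The two tasks alternate until bar runs out of work; foo then keeps the scheduler busy on its own, and mainloop() returns once the task map is empty.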