SlideShare a Scribd company logo
1 of 26
Download to read offline
Python Profiling:
A. Jesse Jiryu Davis


@jessejiryudavis


MongoDB
The Glory
&
The Guts
“PyMongo is slower!
compared to the JavaScript version”
MongoDB Node.js driver:!88,000 per second
PyMongo: ! ! ! ! ! ! ! ! ! 29,000 per second
“Why Is
PyMongo Slower?”
From:!steve@mongodb.com!
To:!! jesse@mongodb.com!
CC:!! eliot@mongodb.com

Hi Jesse,!
!
Why is the Node MongoDB driver 3 times!
faster than PyMongo?!


http://dzone.com/articles/mongodb-facts-over-80000
The Python Code
# Obtain a MongoDB collection.!
import pymongo!
!
client = pymongo.MongoClient('localhost')!
db = client.random!
collection = db.randomData!
collection.remove()!
n_documents = 80000!
batch_size = 5000!
batch = []!
!
import time!
start = time.time()
The Python Code
import random!
from datetime import datetime!
!
min_date = datetime(2012, 1, 1)!
max_date = datetime(2013, 1, 1)!
delta = (max_date - min_date).total_seconds()!
The Python Code
What?!
The Python Code
for i in range(n_documents):!
date = datetime.fromtimestamp(!
time.mktime(min_date.timetuple())!
+ int(round(random.random() * delta)))!
!
value = random.random()!
document = {!
'created_on': date,!
'value': value}!
!
batch.append(document)!
if len(batch) == batch_size:!
collection.insert(batch)!
batch = []!
duration = time.time() - start!
!
print 'inserted %d documents per second' % (!
n_documents / duration)!
The Python Code
inserted 30,000 documents per second
The Node.js Code
(not shown)
The Question
Why is the Python script
3 times slower than the
equivalent Node script?
Why Profile?
• Optimization is like debugging
• Hypothesis:

“The following change will yield a
worthwhile improvement.”
• Experiment
• Repeat until fast enough
Why Profile?
Profiling is a way to

generate hypotheses.
Which Profiler?
• cProfile
• GreenletProfiler
• Yappi
Yappi
By Sümer Cip
Yappi
Compared to cProfile, it is:
!
• As fast
• Also measures functions
• Can measure CPU time, not just wall

• Can measure all threads
• Can export to callgrind
Yappi
import yappi!
!
yappi.set_clock_type('cpu')!
yappi.start(builtins=True)!
!
start = time.time()!
!
for i in range(n_documents):!
# ... same code ... !
!
duration = time.time() - start!
stats = yappi.get_func_stats()!
stats.save('callgrind.out', type='callgrind')!
Same code

as before
KCacheGrind
for index in range(n_documents):!
date = datetime.fromtimestamp(!
time.mktime(min_date.timetuple())!
+ int(round(random.random() * delta)))!
!
value = random.random()!
document = {!
'created_on': date,!
'value': value}!
!
batch.append(document)!
if len(batch) == batch_size:!
collection.insert(batch)!
batch = []!
The Python Code
one third

of the time
for index in range(n_documents):!
date = datetime.now()!
!
!
!
value = random.random()!
document = {!
'created_on': date,!
'value': value}!
!
batch.append(document)!
if len(batch) == batch_size:!
collection.insert(batch)!
batch = []!
The Python Code
The Python Code
• Before: 30,000 inserts per second
• After: 50,000 inserts per second
Why Profile?
• Generate hypotheses

• Estimate possible improvement
How Does

Profiling Work?
int callback(PyFrameObject *frame,!
int what,!
PyObject *arg);!
int start(void)!
{!
PyEval_SetProfile(callback);!
}!
PyObject *!
PyEval_EvalFrameEx(PyFrameObject *frame)!
{!
if (tstate->c_profilefunc != NULL) {!
tstate->c_profilefunc(frame,!
PyTrace_CALL,!
Py_None);!
}!
!
/* ... execute bytecode in the frame!
* until return or exception... */!
!
if (tstate->c_profilefunc != NULL) {!
tstate->c_profilefunc(frame,!
PyTrace_RETURN,!
retval);!
}!
}!
int callback(PyFrameObject *frame,!
int what,!
PyObject *arg)!
{!
switch (what) {!
case PyTrace_CALL:!
{!
PyCodeObject *cobj = frame->f_code;!
PyObject *filename = cobj->co_filename;!
PyObject *funcname = cobj->co_name;!
!
/* ... record the function call ... */!
}!
break;!
!
/* ... other cases ... */!
!
}!
}!
A. Jesse Jiryu Davis


@jessejiryudavis


MongoDB

More Related Content

What's hot

The Secrets of The FullStack Ninja - Part A - Session I
The Secrets of The FullStack Ninja - Part A - Session IThe Secrets of The FullStack Ninja - Part A - Session I
The Secrets of The FullStack Ninja - Part A - Session IOded Sagir
 
Event Loop in Javascript
Event Loop in JavascriptEvent Loop in Javascript
Event Loop in JavascriptDiptiGandhi4
 
appborg, coffeesurgeon, moof, logging-system
appborg, coffeesurgeon, moof, logging-systemappborg, coffeesurgeon, moof, logging-system
appborg, coffeesurgeon, moof, logging-systemendian7000
 
Matthew Eernisse, NodeJs, .toster {webdev}
Matthew Eernisse, NodeJs, .toster {webdev}Matthew Eernisse, NodeJs, .toster {webdev}
Matthew Eernisse, NodeJs, .toster {webdev}.toster
 
Mining crypto in browser as a bleeding edge performance challenge for the Web...
Mining crypto in browser as a bleeding edge performance challenge for the Web...Mining crypto in browser as a bleeding edge performance challenge for the Web...
Mining crypto in browser as a bleeding edge performance challenge for the Web...Denis Radin
 
When a Sassquatch and a Board get together (or how to use Grunt to chew Sass)
When a Sassquatch and a Board get together (or how to use Grunt to chew Sass)When a Sassquatch and a Board get together (or how to use Grunt to chew Sass)
When a Sassquatch and a Board get together (or how to use Grunt to chew Sass)Ricardo Castelhano
 
Building a REST API with Node.js and MongoDB
Building a REST API with Node.js and MongoDBBuilding a REST API with Node.js and MongoDB
Building a REST API with Node.js and MongoDBVivochaLabs
 
Make the prompt great again
Make the prompt great againMake the prompt great again
Make the prompt great againjtyr
 
Front End Development Automation with Grunt
Front End Development Automation with GruntFront End Development Automation with Grunt
Front End Development Automation with GruntLadies Who Code
 
What Is Async, How Does It Work, And When Should I Use It?
What Is Async, How Does It Work, And When Should I Use It?What Is Async, How Does It Work, And When Should I Use It?
What Is Async, How Does It Work, And When Should I Use It?emptysquare
 
Web development-workflow
Web development-workflowWeb development-workflow
Web development-workflowQiu Juntao
 
Nvvp streams-3
Nvvp streams-3Nvvp streams-3
Nvvp streams-3Josh Wyatt
 
Introduction to Erebos: a JavaScript client for Swarm
Introduction to Erebos: a JavaScript client for SwarmIntroduction to Erebos: a JavaScript client for Swarm
Introduction to Erebos: a JavaScript client for SwarmMiloš Mošić
 
ECMAScript 6 and the Node Driver
ECMAScript 6 and the Node DriverECMAScript 6 and the Node Driver
ECMAScript 6 and the Node DriverMongoDB
 
Build web application by express
Build web application by expressBuild web application by express
Build web application by expressShawn Meng
 
grifork - fast propagative task runner -
grifork - fast propagative task runner -grifork - fast propagative task runner -
grifork - fast propagative task runner -IKEDA Kiyoshi
 

What's hot (18)

The Secrets of The FullStack Ninja - Part A - Session I
The Secrets of The FullStack Ninja - Part A - Session IThe Secrets of The FullStack Ninja - Part A - Session I
The Secrets of The FullStack Ninja - Part A - Session I
 
Event Loop in Javascript
Event Loop in JavascriptEvent Loop in Javascript
Event Loop in Javascript
 
appborg, coffeesurgeon, moof, logging-system
appborg, coffeesurgeon, moof, logging-systemappborg, coffeesurgeon, moof, logging-system
appborg, coffeesurgeon, moof, logging-system
 
Matthew Eernisse, NodeJs, .toster {webdev}
Matthew Eernisse, NodeJs, .toster {webdev}Matthew Eernisse, NodeJs, .toster {webdev}
Matthew Eernisse, NodeJs, .toster {webdev}
 
Mining crypto in browser as a bleeding edge performance challenge for the Web...
Mining crypto in browser as a bleeding edge performance challenge for the Web...Mining crypto in browser as a bleeding edge performance challenge for the Web...
Mining crypto in browser as a bleeding edge performance challenge for the Web...
 
When a Sassquatch and a Board get together (or how to use Grunt to chew Sass)
When a Sassquatch and a Board get together (or how to use Grunt to chew Sass)When a Sassquatch and a Board get together (or how to use Grunt to chew Sass)
When a Sassquatch and a Board get together (or how to use Grunt to chew Sass)
 
Building a REST API with Node.js and MongoDB
Building a REST API with Node.js and MongoDBBuilding a REST API with Node.js and MongoDB
Building a REST API with Node.js and MongoDB
 
Make the prompt great again
Make the prompt great againMake the prompt great again
Make the prompt great again
 
Front End Development Automation with Grunt
Front End Development Automation with GruntFront End Development Automation with Grunt
Front End Development Automation with Grunt
 
Swarm@MoscowJS v2 (en)
Swarm@MoscowJS v2 (en)Swarm@MoscowJS v2 (en)
Swarm@MoscowJS v2 (en)
 
What Is Async, How Does It Work, And When Should I Use It?
What Is Async, How Does It Work, And When Should I Use It?What Is Async, How Does It Work, And When Should I Use It?
What Is Async, How Does It Work, And When Should I Use It?
 
Web development-workflow
Web development-workflowWeb development-workflow
Web development-workflow
 
Nvvp streams-3
Nvvp streams-3Nvvp streams-3
Nvvp streams-3
 
Introduction to Erebos: a JavaScript client for Swarm
Introduction to Erebos: a JavaScript client for SwarmIntroduction to Erebos: a JavaScript client for Swarm
Introduction to Erebos: a JavaScript client for Swarm
 
ECMAScript 6 and the Node Driver
ECMAScript 6 and the Node DriverECMAScript 6 and the Node Driver
ECMAScript 6 and the Node Driver
 
Build web application by express
Build web application by expressBuild web application by express
Build web application by express
 
earthquake.gem
earthquake.gemearthquake.gem
earthquake.gem
 
grifork - fast propagative task runner -
grifork - fast propagative task runner -grifork - fast propagative task runner -
grifork - fast propagative task runner -
 

Viewers also liked

Python Performance: Single-threaded, multi-threaded, and Gevent
Python Performance: Single-threaded, multi-threaded, and GeventPython Performance: Single-threaded, multi-threaded, and Gevent
Python Performance: Single-threaded, multi-threaded, and Geventemptysquare
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBrendan Gregg
 
Deep into your applications, performance & profiling
Deep into your applications, performance & profilingDeep into your applications, performance & profiling
Deep into your applications, performance & profilingFabien Arcellier
 
Smashing the bottleneck: Qt application profiling
Smashing the bottleneck: Qt application profilingSmashing the bottleneck: Qt application profiling
Smashing the bottleneck: Qt application profilingDeveler S.r.l.
 
Websockets in Node.js - Making them reliable and scalable
Websockets in Node.js - Making them reliable and scalableWebsockets in Node.js - Making them reliable and scalable
Websockets in Node.js - Making them reliable and scalableGareth Marland
 
Scaling Django with gevent
Scaling Django with geventScaling Django with gevent
Scaling Django with geventMahendra M
 
Vasiliy Litvinov - Python Profiling
Vasiliy Litvinov - Python ProfilingVasiliy Litvinov - Python Profiling
Vasiliy Litvinov - Python ProfilingSergey Arkhipov
 
What’s eating python performance
What’s eating python performanceWhat’s eating python performance
What’s eating python performancePiotr Przymus
 
Denis Nagorny - Pumping Python Performance
Denis Nagorny - Pumping Python PerformanceDenis Nagorny - Pumping Python Performance
Denis Nagorny - Pumping Python PerformanceSergey Arkhipov
 
The High Performance Python Landscape by Ian Ozsvald
The High Performance Python Landscape by Ian OzsvaldThe High Performance Python Landscape by Ian Ozsvald
The High Performance Python Landscape by Ian OzsvaldPyData
 
Boost.Python: C++ and Python Integration
Boost.Python: C++ and Python IntegrationBoost.Python: C++ and Python Integration
Boost.Python: C++ and Python IntegrationGlobalLogic Ukraine
 
Spark + Scikit Learn- Performance Tuning
Spark + Scikit Learn- Performance TuningSpark + Scikit Learn- Performance Tuning
Spark + Scikit Learn- Performance Tuning晨揚 施
 
Python profiling
Python profilingPython profiling
Python profilingdreampuf
 
Spark performance tuning - Maksud Ibrahimov
Spark performance tuning - Maksud IbrahimovSpark performance tuning - Maksud Ibrahimov
Spark performance tuning - Maksud IbrahimovMaksud Ibrahimov
 
The Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in SparkThe Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in SparkSpark Summit
 
Python performance profiling
Python performance profilingPython performance profiling
Python performance profilingJon Haddad
 
GPU Support in Spark and GPU/CPU Mixed Resource Scheduling at Production Scale
GPU Support in Spark and GPU/CPU Mixed Resource Scheduling at Production ScaleGPU Support in Spark and GPU/CPU Mixed Resource Scheduling at Production Scale
GPU Support in Spark and GPU/CPU Mixed Resource Scheduling at Production Scalesparktc
 
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul MasterCornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul MasterSpark Summit
 

Viewers also liked (20)

Python Performance: Single-threaded, multi-threaded, and Gevent
Python Performance: Single-threaded, multi-threaded, and GeventPython Performance: Single-threaded, multi-threaded, and Gevent
Python Performance: Single-threaded, multi-threaded, and Gevent
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame Graphs
 
Understanding greenlet
Understanding greenletUnderstanding greenlet
Understanding greenlet
 
Deep into your applications, performance & profiling
Deep into your applications, performance & profilingDeep into your applications, performance & profiling
Deep into your applications, performance & profiling
 
Smashing the bottleneck: Qt application profiling
Smashing the bottleneck: Qt application profilingSmashing the bottleneck: Qt application profiling
Smashing the bottleneck: Qt application profiling
 
Websockets in Node.js - Making them reliable and scalable
Websockets in Node.js - Making them reliable and scalableWebsockets in Node.js - Making them reliable and scalable
Websockets in Node.js - Making them reliable and scalable
 
Scaling Django with gevent
Scaling Django with geventScaling Django with gevent
Scaling Django with gevent
 
Vasiliy Litvinov - Python Profiling
Vasiliy Litvinov - Python ProfilingVasiliy Litvinov - Python Profiling
Vasiliy Litvinov - Python Profiling
 
What’s eating python performance
What’s eating python performanceWhat’s eating python performance
What’s eating python performance
 
Denis Nagorny - Pumping Python Performance
Denis Nagorny - Pumping Python PerformanceDenis Nagorny - Pumping Python Performance
Denis Nagorny - Pumping Python Performance
 
The High Performance Python Landscape by Ian Ozsvald
The High Performance Python Landscape by Ian OzsvaldThe High Performance Python Landscape by Ian Ozsvald
The High Performance Python Landscape by Ian Ozsvald
 
Boost.Python: C++ and Python Integration
Boost.Python: C++ and Python IntegrationBoost.Python: C++ and Python Integration
Boost.Python: C++ and Python Integration
 
Spark + Scikit Learn- Performance Tuning
Spark + Scikit Learn- Performance TuningSpark + Scikit Learn- Performance Tuning
Spark + Scikit Learn- Performance Tuning
 
Python profiling
Python profilingPython profiling
Python profiling
 
Exploiting GPUs in Spark
Exploiting GPUs in SparkExploiting GPUs in Spark
Exploiting GPUs in Spark
 
Spark performance tuning - Maksud Ibrahimov
Spark performance tuning - Maksud IbrahimovSpark performance tuning - Maksud Ibrahimov
Spark performance tuning - Maksud Ibrahimov
 
The Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in SparkThe Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in Spark
 
Python performance profiling
Python performance profilingPython performance profiling
Python performance profiling
 
GPU Support in Spark and GPU/CPU Mixed Resource Scheduling at Production Scale
GPU Support in Spark and GPU/CPU Mixed Resource Scheduling at Production ScaleGPU Support in Spark and GPU/CPU Mixed Resource Scheduling at Production Scale
GPU Support in Spark and GPU/CPU Mixed Resource Scheduling at Production Scale
 
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul MasterCornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master
 

Similar to Python Performance Profiling: The Guts And The Glory

An Introduction to Go
An Introduction to GoAn Introduction to Go
An Introduction to GoCloudflare
 
The Weather of the Century Part 3: Visualization
The Weather of the Century Part 3: VisualizationThe Weather of the Century Part 3: Visualization
The Weather of the Century Part 3: VisualizationMongoDB
 
System insight without Interference
System insight without InterferenceSystem insight without Interference
System insight without InterferenceTony Tam
 
Rapid and Scalable Development with MongoDB, PyMongo, and Ming
Rapid and Scalable Development with MongoDB, PyMongo, and MingRapid and Scalable Development with MongoDB, PyMongo, and Ming
Rapid and Scalable Development with MongoDB, PyMongo, and MingRick Copeland
 
Building Your First MongoDB Application
Building Your First MongoDB ApplicationBuilding Your First MongoDB Application
Building Your First MongoDB ApplicationTugdual Grall
 
Intro to node.js - Ran Mizrahi (27/8/2014)
Intro to node.js - Ran Mizrahi (27/8/2014)Intro to node.js - Ran Mizrahi (27/8/2014)
Intro to node.js - Ran Mizrahi (27/8/2014)Ran Mizrahi
 
Intro to node.js - Ran Mizrahi (28/8/14)
Intro to node.js - Ran Mizrahi (28/8/14)Intro to node.js - Ran Mizrahi (28/8/14)
Intro to node.js - Ran Mizrahi (28/8/14)Ran Mizrahi
 
Rapid, Scalable Web Development with MongoDB, Ming, and Python
Rapid, Scalable Web Development with MongoDB, Ming, and PythonRapid, Scalable Web Development with MongoDB, Ming, and Python
Rapid, Scalable Web Development with MongoDB, Ming, and PythonRick Copeland
 
EuroPython 2013 - FAST, DOCUMENTED AND RELIABLE JSON BASED WEBSERVICES WITH P...
EuroPython 2013 - FAST, DOCUMENTED AND RELIABLE JSON BASED WEBSERVICES WITH P...EuroPython 2013 - FAST, DOCUMENTED AND RELIABLE JSON BASED WEBSERVICES WITH P...
EuroPython 2013 - FAST, DOCUMENTED AND RELIABLE JSON BASED WEBSERVICES WITH P...Alessandro Molina
 
Future of Development and Deployment using Docker
Future of Development and Deployment using DockerFuture of Development and Deployment using Docker
Future of Development and Deployment using DockerTamer Abdul-Radi
 
Expert JavaScript Programming
Expert JavaScript ProgrammingExpert JavaScript Programming
Expert JavaScript ProgrammingYoshiki Shibukawa
 
國民雲端架構 Django + GAE
國民雲端架構 Django + GAE國民雲端架構 Django + GAE
國民雲端架構 Django + GAEWinston Chen
 
MongoDB at ZPUGDC
MongoDB at ZPUGDCMongoDB at ZPUGDC
MongoDB at ZPUGDCMike Dirolf
 
MongoDB and Ruby on Rails
MongoDB and Ruby on RailsMongoDB and Ruby on Rails
MongoDB and Ruby on Railsrfischer20
 
MongoDB Tokyo - Monitoring and Queueing
MongoDB Tokyo - Monitoring and QueueingMongoDB Tokyo - Monitoring and Queueing
MongoDB Tokyo - Monitoring and QueueingBoxed Ice
 
Eventsourcing with PHP and MongoDB
Eventsourcing with PHP and MongoDBEventsourcing with PHP and MongoDB
Eventsourcing with PHP and MongoDBJacopo Nardiello
 
Snickers: Open Source HTTP API for Media Encoding
Snickers: Open Source HTTP API for Media EncodingSnickers: Open Source HTTP API for Media Encoding
Snickers: Open Source HTTP API for Media EncodingFlávio Ribeiro
 

Similar to Python Performance Profiling: The Guts And The Glory (20)

An Introduction to Go
An Introduction to GoAn Introduction to Go
An Introduction to Go
 
The Weather of the Century Part 3: Visualization
The Weather of the Century Part 3: VisualizationThe Weather of the Century Part 3: Visualization
The Weather of the Century Part 3: Visualization
 
System insight without Interference
System insight without InterferenceSystem insight without Interference
System insight without Interference
 
Rapid and Scalable Development with MongoDB, PyMongo, and Ming
Rapid and Scalable Development with MongoDB, PyMongo, and MingRapid and Scalable Development with MongoDB, PyMongo, and Ming
Rapid and Scalable Development with MongoDB, PyMongo, and Ming
 
Building Your First MongoDB Application
Building Your First MongoDB ApplicationBuilding Your First MongoDB Application
Building Your First MongoDB Application
 
Intro to node.js - Ran Mizrahi (27/8/2014)
Intro to node.js - Ran Mizrahi (27/8/2014)Intro to node.js - Ran Mizrahi (27/8/2014)
Intro to node.js - Ran Mizrahi (27/8/2014)
 
Intro to node.js - Ran Mizrahi (28/8/14)
Intro to node.js - Ran Mizrahi (28/8/14)Intro to node.js - Ran Mizrahi (28/8/14)
Intro to node.js - Ran Mizrahi (28/8/14)
 
Rapid, Scalable Web Development with MongoDB, Ming, and Python
Rapid, Scalable Web Development with MongoDB, Ming, and PythonRapid, Scalable Web Development with MongoDB, Ming, and Python
Rapid, Scalable Web Development with MongoDB, Ming, and Python
 
EuroPython 2013 - FAST, DOCUMENTED AND RELIABLE JSON BASED WEBSERVICES WITH P...
EuroPython 2013 - FAST, DOCUMENTED AND RELIABLE JSON BASED WEBSERVICES WITH P...EuroPython 2013 - FAST, DOCUMENTED AND RELIABLE JSON BASED WEBSERVICES WITH P...
EuroPython 2013 - FAST, DOCUMENTED AND RELIABLE JSON BASED WEBSERVICES WITH P...
 
Future of Development and Deployment using Docker
Future of Development and Deployment using DockerFuture of Development and Deployment using Docker
Future of Development and Deployment using Docker
 
Gcrc talk
Gcrc talkGcrc talk
Gcrc talk
 
Expert JavaScript Programming
Expert JavaScript ProgrammingExpert JavaScript Programming
Expert JavaScript Programming
 
國民雲端架構 Django + GAE
國民雲端架構 Django + GAE國民雲端架構 Django + GAE
國民雲端架構 Django + GAE
 
MongoDB at ZPUGDC
MongoDB at ZPUGDCMongoDB at ZPUGDC
MongoDB at ZPUGDC
 
Node azure
Node azureNode azure
Node azure
 
MongoDB and Ruby on Rails
MongoDB and Ruby on RailsMongoDB and Ruby on Rails
MongoDB and Ruby on Rails
 
Mongodb
MongodbMongodb
Mongodb
 
MongoDB Tokyo - Monitoring and Queueing
MongoDB Tokyo - Monitoring and QueueingMongoDB Tokyo - Monitoring and Queueing
MongoDB Tokyo - Monitoring and Queueing
 
Eventsourcing with PHP and MongoDB
Eventsourcing with PHP and MongoDBEventsourcing with PHP and MongoDB
Eventsourcing with PHP and MongoDB
 
Snickers: Open Source HTTP API for Media Encoding
Snickers: Open Source HTTP API for Media EncodingSnickers: Open Source HTTP API for Media Encoding
Snickers: Open Source HTTP API for Media Encoding
 

Recently uploaded

Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfFerryKemperman
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
Best Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdfBest Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdfIdiosysTechnologies1
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Mater
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Cizo Technology Services
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 

Recently uploaded (20)

Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
Best Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdfBest Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdf
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commerce
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 

Python Performance Profiling: The Guts And The Glory

  • 1. Python Profiling: A. Jesse Jiryu Davis 
 @jessejiryudavis 
 MongoDB The Glory & The Guts
  • 2.
  • 3. “PyMongo is slower! compared to the JavaScript version” MongoDB Node.js driver:!88,000 per second PyMongo: ! ! ! ! ! ! ! ! ! 29,000 per second
  • 4. “Why Is PyMongo Slower?” From:!steve@mongodb.com! To:!! jesse@mongodb.com! CC:!! eliot@mongodb.com
 Hi Jesse,! ! Why is the Node MongoDB driver 3 times! faster than PyMongo?! 
 http://dzone.com/articles/mongodb-facts-over-80000
  • 5. The Python Code # Obtain a MongoDB collection.! import pymongo! ! client = pymongo.MongoClient('localhost')! db = client.random! collection = db.randomData! collection.remove()!
  • 6. n_documents = 80000! batch_size = 5000! batch = []! ! import time! start = time.time() The Python Code
  • 7. import random! from datetime import datetime! ! min_date = datetime(2012, 1, 1)! max_date = datetime(2013, 1, 1)! delta = (max_date - min_date).total_seconds()! The Python Code
  • 8. What?! The Python Code for i in range(n_documents):! date = datetime.fromtimestamp(! time.mktime(min_date.timetuple())! + int(round(random.random() * delta)))! ! value = random.random()! document = {! 'created_on': date,! 'value': value}! ! batch.append(document)! if len(batch) == batch_size:! collection.insert(batch)! batch = []!
  • 9. duration = time.time() - start! ! print 'inserted %d documents per second' % (! n_documents / duration)! The Python Code inserted 30,000 documents per second
  • 11. The Question Why is the Python script 3 times slower than the equivalent Node script?
  • 12. Why Profile? • Optimization is like debugging • Hypothesis:
 “The following change will yield a worthwhile improvement.” • Experiment • Repeat until fast enough
  • 13. Why Profile? Profiling is a way to
 generate hypotheses.
  • 14. Which Profiler? • cProfile • GreenletProfiler • Yappi
  • 16. Yappi Compared to cProfile, it is: ! • As fast • Also measures functions • Can measure CPU time, not just wall
 • Can measure all threads • Can export to callgrind
  • 17. Yappi import yappi! ! yappi.set_clock_type('cpu')! yappi.start(builtins=True)! ! start = time.time()! ! for i in range(n_documents):! # ... same code ... ! ! duration = time.time() - start! stats = yappi.get_func_stats()! stats.save('callgrind.out', type='callgrind')! Same code
 as before
  • 19. for index in range(n_documents):! date = datetime.fromtimestamp(! time.mktime(min_date.timetuple())! + int(round(random.random() * delta)))! ! value = random.random()! document = {! 'created_on': date,! 'value': value}! ! batch.append(document)! if len(batch) == batch_size:! collection.insert(batch)! batch = []! The Python Code one third
 of the time
  • 20. for index in range(n_documents):! date = datetime.now()! ! ! ! value = random.random()! document = {! 'created_on': date,! 'value': value}! ! batch.append(document)! if len(batch) == batch_size:! collection.insert(batch)! batch = []! The Python Code
  • 21. The Python Code • Before: 30,000 inserts per second • After: 50,000 inserts per second
  • 22. Why Profile? • Generate hypotheses
 • Estimate possible improvement
  • 23. How Does
 Profiling Work? int callback(PyFrameObject *frame,! int what,! PyObject *arg);! int start(void)! {! PyEval_SetProfile(callback);! }!
  • 24. PyObject *! PyEval_EvalFrameEx(PyFrameObject *frame)! {! if (tstate->c_profilefunc != NULL) {! tstate->c_profilefunc(frame,! PyTrace_CALL,! Py_None);! }! ! /* ... execute bytecode in the frame! * until return or exception... */! ! if (tstate->c_profilefunc != NULL) {! tstate->c_profilefunc(frame,! PyTrace_RETURN,! retval);! }! }!
  • 25. int callback(PyFrameObject *frame,! int what,! PyObject *arg)! {! switch (what) {! case PyTrace_CALL:! {! PyCodeObject *cobj = frame->f_code;! PyObject *filename = cobj->co_filename;! PyObject *funcname = cobj->co_name;! ! /* ... record the function call ... */! }! break;! ! /* ... other cases ... */! ! }! }!
  • 26. A. Jesse Jiryu Davis 
 @jessejiryudavis 
 MongoDB