This document discusses Python profiling to optimize code performance. It introduces the Yappi profiler, which can measure CPU time as well as wall-clock time and can profile all threads. Running the Python code under Yappi and viewing the results in KCacheGrind reveals that creating the datetime object takes a significant share of the time, about one third. Replacing that computation with datetime.now() raises throughput from about 30,000 to about 50,000 inserts per second. Profiling helps generate hypotheses about where optimizations could yield improvements.
3. “PyMongo is slower
compared to the JavaScript version”
MongoDB Node.js driver: 88,000 per second
PyMongo: 29,000 per second
4. “Why Is
PyMongo Slower?”
From: steve@mongodb.com
To: jesse@mongodb.com
CC: eliot@mongodb.com
Hi Jesse,

Why is the Node MongoDB driver 3 times
faster than PyMongo?
http://dzone.com/articles/mongodb-facts-over-80000
5. The Python Code
# Obtain a MongoDB collection.
import pymongo

client = pymongo.MongoClient('localhost')
db = client.random
collection = db.randomData
collection.remove()
11. The Question
Why is the Python script
3 times slower than the
equivalent Node script?
12. Why Profile?
• Optimization is like debugging
• Hypothesis:
“The following change will yield a
worthwhile improvement.”
• Experiment
• Repeat until fast enough
16. Yappi
Compared to cProfile, it:
• Is as fast
• Also measures functions
• Can measure CPU time, not just wall-clock time
• Can measure all threads
• Can export to callgrind format
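The features listed above can be exercised with a handful of calls. The following is a minimal sketch, assuming the third-party yappi package is installed. The workload `busy_work`, the helper `profile_to_callgrind`, and any output file name are hypothetical stand-ins, not code from the talk.

```python
def busy_work(n=20000):
    """Hypothetical workload standing in for the script being profiled."""
    return sum(i * i for i in range(n))


def profile_to_callgrind(func, out_path):
    """Run func under Yappi using the CPU clock, then save callgrind output."""
    import yappi  # third-party package, assumed installed (pip install yappi)

    yappi.clear_stats()           # start from a clean slate
    yappi.set_clock_type("cpu")   # CPU time instead of wall-clock time
    yappi.start()                 # threads are profiled by default
    try:
        func()
    finally:
        yappi.stop()

    stats = yappi.get_func_stats()
    stats.save(out_path, type="callgrind")  # a format KCacheGrind can open
    return out_path
```

Calling `profile_to_callgrind(busy_work, "callgrind.out.example")` should leave a file that can be opened in KCacheGrind.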
19. The Python Code
for index in range(n_documents):
    date = datetime.fromtimestamp(
        time.mktime(min_date.timetuple())
        + int(round(random.random() * delta)))

    value = random.random()
    document = {
        'created_on': date,
        'value': value}

    batch.append(document)
    if len(batch) == batch_size:
        collection.insert(batch)
        batch = []
Creating the date takes one third of the time
20. The Python Code
for index in range(n_documents):
    date = datetime.now()

    value = random.random()
    document = {
        'created_on': date,
        'value': value}

    batch.append(document)
    if len(batch) == batch_size:
        collection.insert(batch)
        batch = []
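The hypothesis behind this change can be tested in isolation before rerunning the whole script. The sketch below times the two ways of producing `date` with the standard-library `timeit` module. The slides do not show how `min_date` and `delta` are defined, so the values here are assumptions, and the function names are hypothetical.

```python
import random
import time
import timeit
from datetime import datetime

# Assumed setup: not shown on the slides.
min_date = datetime(2012, 1, 1)
max_date = datetime(2013, 1, 1)
delta = (max_date - min_date).total_seconds()


def random_date():
    """Date expression from the original loop."""
    return datetime.fromtimestamp(
        time.mktime(min_date.timetuple())
        + int(round(random.random() * delta)))


def current_date():
    """Replacement used in the faster loop."""
    return datetime.now()


def compare(number=100000):
    """Return the seconds each version takes for `number` calls."""
    return {
        "random_date": timeit.timeit(random_date, number=number),
        "current_date": timeit.timeit(current_date, number=number),
    }


results = compare(number=20000)
for name, seconds in results.items():
    print(f"{name}: {seconds:.4f} s")
```

Timings depend on the machine and Python version, so no expected output is shown.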
21. The Python Code
• Before: 30,000 inserts per second
• After: 50,000 inserts per second
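These figures can be checked against the profile. If the date computation accounts for about one third of the run time (the callout on slide 19), removing it entirely would give a speedup of at most 1 / (1 - 1/3) = 1.5x. A small sketch of that arithmetic follows. Treating the one-third share as exact and the replacement as free are simplifying assumptions, and `predicted_speedup` is a hypothetical helper.

```python
def predicted_speedup(fraction_removed):
    """Upper bound on speedup when a fraction of run time is eliminated."""
    return 1.0 / (1.0 - fraction_removed)


before = 30000  # inserts per second, from the slide
after = 50000   # inserts per second, from the slide

measured = after / before
predicted = predicted_speedup(1 / 3)

print(f"measured speedup:  {measured:.2f}x")
print(f"predicted ceiling: {predicted:.2f}x")
print(f"predicted rate:    {before * predicted:.0f} inserts per second")
```

The measured gain is in the same range as the estimate. Both slide figures are rounded, so an exact match is not expected.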