Crushing the Head of the Snake
Robert Brewer
Chief Architect
Crunch.io
How to Time
from timeit import Timer
>>> range(5)
[0, 1, 2, 3, 4]
>>> t = Timer("range(a)", "a = 1000000")
>>> t.timeit(1)
0.028472900390625
>>> t.timeit(100)
1.8600409030914307
>>> t.timeit(1000)
18.056041955947876
Comparing algorithms
>>> Timer("range(1000)").timeit(1000000)
>>> Timer("range(1000)").timeit()
11.392634868621826
>>> Timer("xrange(1000)").timeit()
0.20040297508239746
>>> Timer("list(xrange(1000))").timeit()
12.207480907440186
Caveat: Overhead
>>> Timer().timeit(1000000)
0.029289960861206055
Caveat: Wall time not CPU time
>>> Timer("xrange(1000)").timeit()
0.20040297508239746
>>> Timer("xrange(1000)").repeat(3)
[0.20735883712768555,
0.1968221664428711,
0.18882489204406738]
Take the minimum: the slower runs are just the fast run plus system noise.
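That advice in sketch form (Python 3 syntax, where `range()` is already lazy like the old `xrange()`):

```python
# Repeat the measurement several times and keep the fastest run;
# scheduler and GC interruptions only ever make a run slower.
from timeit import Timer

t = Timer("list(range(1000))")
times = t.repeat(repeat=5, number=10000)
best = min(times)
```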
How to Profile
>>> import mod
>>> import cProfile
>>> cProfile.run("mod.b()", sort="cumulative")
How to Profile
>>> import mod
>>> import cProfile
>>> cProfile.run("mod.b()", sort="cumulative")
(make changes to module)
>>> reload(mod)
>>> cProfile.run("mod.b()", sort="cumulative")
How to Profile
>>> cProfile.run("for i in xrange(3000): range(i).sort()",
sort="cumulative")
6002 function calls in 0.093 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(func)
1 0.019 0.019 0.093 0.093 <string>:1(<module>)
3000 0.052 0.000 0.052 0.000 {list.sort}
3000 0.022 0.000 0.022 0.000 {range}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
How to Profile
6002 function calls in 0.093 seconds
ncalls tottime percall cumtime percall filename:lineno(func)
3000 0.052 0.000 0.052 0.000 {list.sort}
3000 0.022 0.000 0.022 0.000 {range}
Example: Standard Deviation
>>> import numpy
>>> n = 100
>>> a = numpy.array(xrange(n),
dtype=float)
>>> a.std(ddof=1)
29.011491975882016
Example: Standard Deviation
>>> n = 4000000000
>>> a = numpy.array(xrange(n),
dtype=float)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: setting an array element
with a sequence.
Example: Standard Deviation
>>> n = 4000000000
>>> arr = numpy.zeros(n, dtype=float)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
MemoryError
Example: Standard Deviation
Given an array A broken into n parts a1...an, with
  Ti = Σj aij (part total), mi = Ti / |ai| (part mean),
  g = (Σi Ti) / |A| (grand mean),
  V(ai) = Σj (aij - mi)² (local variance sum),
stddev(A) = √( Σi=1..n [ V(ai) + 2·Ti·(mi - g) + (g² - mi²)·|ai| ] / (|A| - ddof) )
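A quick numeric check of that identity in pure Python (`partitioned_std` is a hypothetical name for illustration):

```python
import math

def partitioned_std(parts, ddof=0):
    # Per-part totals and lengths, then the grand mean g.
    totals = [sum(p) for p in parts]
    lengths = [len(p) for p in parts]
    glength = sum(lengths)
    g = sum(totals) / glength
    final = 0.0
    for p, T, L in zip(parts, totals, lengths):
        m = T / L
        V = sum((x - m) ** 2 for x in p)  # local variance sum
        final += V + 2 * T * (m - g) + (g ** 2 - m ** 2) * L
    return math.sqrt(final / (glength - ddof))

data = [float(x) for x in range(100)]
parts = [data[i:i + 10] for i in range(0, 100, 10)]
mu = sum(data) / len(data)
direct = math.sqrt(sum((x - mu) ** 2 for x in data) / (len(data) - 1))
# partitioned_std(parts, ddof=1) agrees with the direct computation
# (and with the a.std(ddof=1) value shown earlier)
```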
Example: Standard Deviation
def run():
points = 400000  # add four zeros for the full 4-billion run
segments = 100
part_len = points / segments
partitions = []
for p in range(segments):
part = range(part_len * p,
part_len * (p + 1))
partitions.append(part)
return stddev(partitions, ddof=1)
Example: Standard Deviation
def stddev(partitions, ddof=0):
final = 0.0
for part in partitions:
m = total(part) / length(part)
# Find the mean of the entire group.
gtotal = total([total(p) for p in partitions])
glength = total([length(p) for p in partitions])
g = gtotal / glength
adj = ((2 * total(part) * (m - g)) +
((g ** 2 - m ** 2) * length(part)))
final += varsum(part) + adj
return math.sqrt(final / (glength - ddof))
Example: Standard Deviation
2052106 function calls in 71.025 seconds
ncalls tottime percall cumtime percall filename:lineno(func)
1 0.000 0.000 71.023 71.023 stddev.py:39(run)
1 0.006 0.006 71.013 71.013 stddev.py:22(stddev)
410400 63.406 0.000 70.490 0.000 stddev.py:4(total)
100 0.341 0.003 69.178 0.692 stddev.py:15(varsum)
410601 7.076 0.000 7.076 0.000 {range}
410200 0.151 0.000 0.174 0.000 stddev.py:11(length)
820700 0.042 0.000 0.042 0.000 {len}
100 0.000 0.000 0.000 0.000 {list.append}
1 0.000 0.000 0.000 0.000 {math.sqrt}
Example: Standard Deviation
400000 in 71.025 seconds
Assuming no other effects of scale,
it will take 197.3 hours (over 8 days)
to calculate our 4 billion-row array.
Example: Standard Deviation
Can we calculate
our 4 billion-row array in
1 minute?
That’s 400,000 in 6ms.
All we need is an 11,837.5x speedup.
Optimization
Amongst Our Weaponry
Extracting loop invariants
Extracting Loop Invariants
def varsum(arr):
vs = 0
for j in range(len(arr)):
mean = (total(arr) / length(arr))
vs += (arr[j] - mean) ** 2
return vs
Extracting Loop Invariants
def varsum(arr):
vs = 0
mean = (total(arr) / length(arr))
for j in range(len(arr)):
vs += (arr[j] - mean) ** 2
return vs
Extracting Loop Invariants
52606 calls in 1.944 seconds (36x)
ncalls tottime percall cumtime percall filename:lineno(func)
1 0.000 0.000 1.942 1.942 stddev1.py:41(run)
1 0.006 0.006 1.932 1.932 stddev1.py:23(stddev)
10500 1.673 0.000 1.859 0.000 stddev1.py:4(total)
10701 0.196 0.000 0.196 0.000 {range}
100 0.062 0.001 0.081 0.001 stddev1.py:15(varsum)
10300 0.003 0.000 0.003 0.000 stddev1.py:11(length)
20900 0.001 0.000 0.001 0.000 {len}
100 0.000 0.000 0.000 0.000 {list.append}
1 0.000 0.000 0.000 0.000 {math.sqrt}
still 5.4 hrs
Extracting Loop Invariants
def stddev(partitions, ddof=0):
final = 0.0
for part in partitions:
m = total(part) / length(part)
# Find the mean of the entire group.
gtotal = total([total(p) for p in partitions])
glength = total([length(p) for p in partitions])
g = gtotal / glength
adj = ((2 * total(part) * (m - g)) +
((g ** 2 - m ** 2) * length(part)))
final += varsum(part) + adj
return math.sqrt(final / (glength - ddof))
Extracting Loop Invariants
def stddev(partitions, ddof=0):
final = 0.0
# Find the mean of the entire group.
gtotal = total([total(p) for p in partitions])
glength = total([length(p) for p in partitions])
g = gtotal / glength
for part in partitions:
m = total(part) / length(part)
adj = ((2 * total(part) * (m - g)) +
((g ** 2 - m ** 2) * length(part)))
final += varsum(part) + adj
return math.sqrt(final / (glength - ddof))
Extracting Loop Invariants
2512 function calls in 0.142 seconds (13x)
ncalls tottime percall cumtime percall filename:lineno(func)
1 0.000 0.000 0.140 0.140 stddev1.py:42(run)
1 0.000 0.000 0.136 0.136 stddev1.py:23(stddev)
100 0.063 0.001 0.082 0.001 stddev1.py:15(varsum)
402 0.064 0.000 0.071 0.000 stddev1.py:4(total)
603 0.013 0.000 0.013 0.000 {range}
400 0.000 0.000 0.000 0.000 stddev1.py:11(length)
902 0.000 0.000 0.000 0.000 {len}
100 0.000 0.000 0.000 0.000 {list.append}
1 0.000 0.000 0.000 0.000 {math.sqrt}
still 23 minutes
Amongst Our Weaponry
Use builtin Python functions
whenever possible
Use Python Builtins
def total(arr):
s = 0
for j in range(len(arr)):
s += arr[j]
return s
Use Python Builtins
def total(arr):
s = 0
for j in range(len(arr)):
s += arr[j]
return s
def total(arr):
return sum(arr)
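The win comes from moving the loop out of interpreted bytecode into C. A rough measurement of the gap (Python 3 sketch; exact numbers vary by machine):

```python
from timeit import Timer

setup = "arr = list(range(1000))"
loop_t = Timer("s = 0\nfor x in arr:\n    s += x", setup).timeit(1000)
sum_t = Timer("sum(arr)", setup).timeit(1000)
# sum() is typically several times faster than the explicit loop
```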
Use Python Builtins
2110 function calls in 0.096 seconds (1.47x)
ncalls tottime percall cumtime percall filename:lineno(func)
1 0.000 0.000 0.093 0.093 stddev1.py:39(run)
1 0.000 0.000 0.083 0.083 stddev1.py:20(stddev)
100 0.065 0.001 0.070 0.001 stddev1.py:12(varsum)
402 0.000 0.000 0.015 0.000 stddev1.py:4(total)
402 0.015 0.000 0.015 0.000 {sum}
201 0.012 0.000 0.012 0.000 {range}
400 0.000 0.000 0.000 0.000 stddev1.py:8(length)
500 0.000 0.000 0.000 0.000 {len}
100 0.000 0.000 0.000 0.000 {list.append}
1 0.000 0.000 0.000 0.000 {math.sqrt}
still 16 minutes
Use Python Builtins
def varsum(arr):
vs = 0
mean = (total(arr) / length(arr))
for j in range(len(arr)):
vs += (arr[j] - mean) ** 2
return vs
Use Python Builtins
def varsum(arr):
mean = (total(arr) / length(arr))
return sum((v - mean) ** 2
for v in arr)
Use Python Builtins
402110 function calls in 0.122 seconds
1.27x slower
ncalls tottime percall cumtime percall filename:lineno(func)
1 0.000 0.000 0.120 0.120 stddev.py:36(run)
1 0.000 0.000 0.115 0.115 stddev.py:17(stddev)
502 0.044 0.000 0.114 0.000 {sum}
100 0.000 0.000 0.106 0.001 stddev.py:12(varsum)
400100 0.070 0.000 0.070 0.000 stddev.py:14(genexpr)
402 0.000 0.000 0.011 0.000 stddev.py:4(total)
…
Amongst Our Weaponry
Reduce function calls
Reduce Function Calls
>>> Timer("sum(a)", "a = range(10)").repeat(3)
[0.15801000595092773,
0.1406857967376709,
0.14577603340148926]
>>> Timer("total(a)",
"a = range(10); total = lambda x: sum(x)"
).repeat(3)
[0.2066800594329834,
0.1998300552368164,
0.21536493301391602]
0.000000059 seconds per call
Reduce Function Calls
def variances_squared(arr):
mean = (total(arr) / length(arr))
for v in arr:
yield (v - mean) ** 2
Reduce Function Calls
def varsum(arr):
mean = (total(arr) / length(arr))
return sum( (v - mean) ** 2
for v in arr )
def varsum(arr):
mean = (total(arr) / length(arr))
return sum([(v - mean) ** 2
for v in arr])
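The two spellings compute identical values; the difference is that in CPython each item pulled from a generator expression pays a frame-resume cost much like a function call, while the list comprehension runs as one tight loop. A sketch (Python 3):

```python
from timeit import Timer

arr = [float(x) for x in range(1000)]
m = sum(arr) / len(arr)
# Same result either way...
vs_gen = sum((v - m) ** 2 for v in arr)
vs_list = sum([(v - m) ** 2 for v in arr])
# ...but timings differ; measure on your own machine:
setup = "arr = [float(x) for x in range(1000)]; m = sum(arr) / len(arr)"
gen_t = Timer("sum((v - m) ** 2 for v in arr)", setup).timeit(1000)
list_t = Timer("sum([(v - m) ** 2 for v in arr])", setup).timeit(1000)
```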
Reduce Function Calls
2010 function calls in 0.082 seconds (1.17x)
ncalls tottime percall cumtime percall filename:lineno(func)
1 0.000 0.000 0.080 0.080 stddev.py:36(run)
1 0.000 0.000 0.071 0.071 stddev.py:17(stddev)
100 0.050 0.001 0.056 0.001 stddev.py:12(varsum)
502 0.020 0.000 0.020 0.000 {sum}
402 0.000 0.000 0.016 0.000 stddev.py:4(total)
101 0.009 0.000 0.009 0.000 {range}
400 0.000 0.000 0.000 0.000 stddev.py:8(length)
400 0.000 0.000 0.000 0.000 {len}
100 0.000 0.000 0.000 0.000 {list.append}
1 0.000 0.000 0.000 0.000 {math.sqrt}
still 13+ minutes
Amongst Our Weaponry
Vector operations
with NumPy
Vector Operations
part = numpy.array(
xrange(...), dtype=float)
def total(arr):
return arr.sum()
def varsum(arr):
return (
(arr - arr.mean()) ** 2).sum()
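The vectorized versions can be sanity-checked against closed-form values (assumes numpy is installed; Python 3 sketch):

```python
import numpy as np

arr = np.arange(100, dtype=float)

def total(a):
    return a.sum()

def varsum(a):
    return ((a - a.mean()) ** 2).sum()

# total(arr) is 0+1+...+99 = 4950; varsum(arr) is n(n**2 - 1)/12 = 83325
```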
Vector Operations
3408 function calls in 0.057 seconds (1.43x)
ncalls tottime percall cumtime percall filename:lineno(func)
1 0.000 0.000 0.057 0.057 stddev1.py:37(run)
200 0.051 0.000 0.051 0.000 {numpy...array}
1 0.001 0.001 0.006 0.006 stddev1.py:18(stddev)
500 0.003 0.000 0.003 0.000 {numpy.ufunc.reduce}
100 0.001 0.000 0.003 0.000 stddev1.py:14(varsum)
400 0.000 0.000 0.003 0.000 {numpy.ndarray.sum}
300 0.000 0.000 0.002 0.000 stddev1.py:6(total)
100 0.000 0.000 0.001 0.000 {numpy.ndarray.mean}
…
still 9.5 minutes
Vector Operations
3408 function calls in 0.006 seconds (13.6x)
ncalls tottime percall cumtime percall filename:lineno(func)
1 0.001 0.001 0.006 0.006 stddev1.py:18(stddev)
500 0.003 0.000 0.003 0.000 {numpy.ufunc.reduce}
100 0.001 0.000 0.003 0.000 stddev1.py:14(varsum)
400 0.000 0.000 0.003 0.000 {numpy.ndarray.sum}
300 0.000 0.000 0.002 0.000 stddev1.py:6(total)
100 0.000 0.000 0.001 0.000 {numpy.ndarray.mean}
…
should be exactly 1 minute
Vector Operations
Let’s try 4 billion!
Bump up that N...
Vector Operations
MemoryError
Oh, yeah...
Amongst Our Weaponry
Parallelization
with
multiprocessing
Parallelization
from multiprocessing import Pool
def run():
results = Pool().map(
run_one, range(segments))
result = stddev(results)
return result
Parallelization
def run_one(i):
p = numpy.memmap(
'stddev.%d' % i, dtype=float,
mode='r', shape=(part_len,))
T, L = p.sum(), float(len(p))
m = T / L
V = ((p - m) ** 2).sum()
return T, L, V
Parallelization
def stddev(TLVs, ddof=0):
final = 0.0
totals = [T for T, L, V in TLVs]
lengths = [L for T, L, V in TLVs]
glength = sum(lengths)
g = sum(totals) / glength
for T, L, V in TLVs:
m = T / L
adj = ((2 * T * (m - g)) + ((g ** 2 - m ** 2) * L))
final += V + adj
return math.sqrt(final / (glength - ddof))
Parallelization
3734 function calls in 0.024 seconds
6x slower
ncalls tottime percall cumtime percall filename:lineno(func)
1 0.000 0.000 0.024 0.024 stddev.py:47(run)
4 0.000 0.000 0.011 0.003 threading.py:234(wait)
22 0.011 0.000 0.011 0.000 {thread.lock.acquire}
1 0.000 0.000 0.011 0.011 pool.py:222(map)
1 0.000 0.000 0.008 0.008 pool.py:113(__init__)
4 0.001 0.000 0.005 0.001 process.py:116(start)
1 0.003 0.003 0.005 0.005 stddev.py:11(stddev)
4 0.000 0.000 0.004 0.001 forking.py:115(init)
4 0.003 0.001 0.003 0.001 {posix.fork}
...
Parallelization
Could that waiting be insignificant
when we scale up to 4 billion?
Let’s try it!
Parallelization
3766 function calls in 67.811 seconds
ncalls tottime percall cumtime percall filename:lineno(func)
1 0.000 0.000 67.811 67.811 stddev.py:47(run)
4 0.000 0.000 67.747 16.930 threading.py:234(wait)
22 67.747 3.079 67.747 3.079 {thread.lock.acquire}
1 0.000 0.000 67.747 67.747 pool.py:222(map)
1 0.000 0.000 0.062 0.060 pool.py:113(__init__)
4 0.000 0.000 0.058 0.014 process.py:116(start)
4 0.057 0.014 0.057 0.014 {posix.fork}
1 0.003 0.003 0.005 0.005 stddev.py:11(stddev)
2 0.002 0.001 0.002 0.001 {sum}
SO CLOSE! 1.13 minutes
Parallelization
def run_one(i):
if i == 50:
cProfile.runctx(..., "prf.50")
>>> import pstats
>>> s = pstats.Stats("prf.50")
>>> s.sort_stats("cumulative")
<pstats.Stats instance at 0x2bddcb0>
>>> _.print_stats()
Parallelization
57 function calls in 2.804 seconds
ncalls tottime percall cumtime percall filename:lineno(func)
1 0.431 0.431 2.791 2.791 stddev.py:43(run_one)
2 0.000 0.000 2.360 1.180 numpy.ndarray.sum
2 2.360 1.180 2.360 1.180 numpy.ufunc.reduce
1 0.000 0.000 0.000 0.000 memmap.py:195(__new__)
Parallelization
def run_one(i):
p = numpy.memmap(
'stddev.%d' % i, dtype=float,
mode='r', shape=(part_len,))
T, L = p.sum(), float(len(p))
m = T / L
V = ((p - m) ** 2).sum()
return T, L, V
200 seconds / 4 cores = 50
Parallelization? Serialization!
67.8 seconds for 4 billion rows, but
-50 of those are loading data!
17.8 seconds to do the actual math.
Serialization
import bloscpack as bp
bargs = bp.args.DEFAULT_BLOSC_ARGS
bargs['clevel'] = 6
bp.pack_ndarray_file(
part, fname, blosc_args=bargs)
part = bp.unpack_ndarray_file(fname)
Serialization
Let’s try it!
I Crush Your Head!
1153 function calls in 26.166 seconds
ncalls tottime percall cumtime percall filename:lineno(func)
1 0.000 0.000 26.166 26.166 stddev_bp.py:56(run)
4 0.000 0.000 26.134 6.534 threading.py:234(wait)
22 26.134 1.188 26.134 1.188 thread.lock.acquire
1 0.000 0.000 26.133 26.133 pool.py:222(map)
1 0.000 0.000 26.133 26.133 pool.py:521(get)
1 0.000 0.000 26.133 26.133 pool.py:513(wait)
1 0.003 0.003 0.030 0.030 __init__.py:227(Pool)
1 0.000 0.000 0.021 0.021 pool.py:113(__init__)
I Crush Your Head!
With some time-tested general
programming techniques:
Extract loop invariants
Use language builtins
Reduce function calls
I Crush Your Head!
And some Python libraries
for architectural improvements:
Use NumPy for vector ops
Use multiprocessing for parallelization
Use bloscpack for compression
I Crush Your Head!
We sped up our calculation
so that it runs in:
0.003% of the time
or 27317 times faster
4.4 orders of magnitude
Crushing the Head of the Snake
Any questions?
@aminusfu
bob@crunch.io

Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
 
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottAvoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
 
Words in Space - Rebecca Bilbro
Words in Space - Rebecca BilbroWords in Space - Rebecca Bilbro
Words in Space - Rebecca Bilbro
 
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
 
Pydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica PuertoPydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica Puerto
 
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
 
Extending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will AydExtending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will Ayd
 
Measuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen HooverMeasuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen Hoover
 
What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper Seabold
 
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
 
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-WardSolving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
 
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
 

Recently uploaded

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Recently uploaded (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 

Crushing the Head of the Snake by Robert Brewer PyData SV 2014

  • 1. Crushing the Head of the Snake
       Robert Brewer
       Chief Architect
       Crunch.io
  • 2. How to Time
       from timeit import Timer
       >>> range(5)
       [0, 1, 2, 3, 4]
       >>> t = Timer("range(a)", "a = 1000000")
       >>> t.timeit(1)
       0.028472900390625
       >>> t.timeit(100)
       1.8600409030914307
       >>> t.timeit(1000)
       18.056041955947876
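The slides use Python 2; the same measurement works unchanged in Python 3 (where `range` is lazy, so the list is built with `list(range(...))`). A minimal sketch of the `Timer` pattern above:

```python
from timeit import Timer, timeit

# Time one run vs. many runs of building a million-element list.
t = Timer("list(range(a))", "a = 1000000")
one = t.timeit(number=1)        # seconds for a single execution
hundred = t.timeit(number=100)  # total seconds for 100 executions

# The module-level convenience function does the same without an explicit Timer.
total = timeit("list(range(a))", "a = 1000000", number=10)

print(one, hundred, total)
```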
  • 3. Comparing algorithms
       >>> Timer("range(1000)").timeit(1 000 000)
       >>> Timer("range(1000)").timeit()
       11.392634868621826
       >>> Timer("xrange(1000)").timeit()
       0.20040297508239746
       >>> Timer("list(xrange(1000))").timeit()
       12.207480907440186
  • 4. Caveat: Overhead
       >>> Timer().timeit(1000000)
       0.029289960861206055
  • 5. Caveat: Wall time not CPU time
       >>> Timer("xrange(1000)").timeit()
       0.20040297508239746
       >>> Timer("xrange(1000)").repeat(3)
       [0.20735883712768555,
        0.1968221664428711,
        0.18882489204406738]
       take the minimum
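"Take the minimum" is still the standard advice: `repeat()` runs the whole timing loop several times, and the smallest result is the one least distorted by other processes. A Python 3 sketch:

```python
from timeit import Timer

# repeat() returns one total per timing run; the minimum is the best
# estimate of the true cost (least interference from the OS scheduler).
results = Timer("sum(range(1000))").repeat(repeat=3, number=1000)
best = min(results)
print(best)
```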
  • 6. How to Profile
       >>> import mod
       >>> import cProfile
       >>> cProfile.run("mod.b()", sort="cumulative")
  • 7. How to Profile
       >>> import mod
       >>> import cProfile
       >>> cProfile.run("mod.b()", sort="cumulative")
       (make changes to module)
       >>> reload(mod)
       >>> cProfile.run("mod.b()", sort="cumulative")
  • 8. How to Profile
       >>> cProfile.run("for i in xrange(3000): range(i).sort()",
       ...              sort="cumulative")
       6002 function calls in 0.093 seconds
       Ordered by: cumulative time
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
            1    0.019    0.019    0.093    0.093  <string>:1(<module>)
         3000    0.052    0.000    0.052    0.000  {list.sort}
         3000    0.022    0.000    0.022    0.000  {range}
            1    0.000    0.000    0.000    0.000  {method 'disable' of
                                                   '_lsprof.Profiler' objects}
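(In Python 3, `reload` lives in `importlib.reload`.) A self-contained Python 3 sketch of the same profiling session, using the `Profile` object and `pstats` directly instead of the string form; the `work` function is an illustrative stand-in for the slide's workload:

```python
import cProfile
import io
import pstats

def work():
    # A deliberately chatty workload: 3000 list builds and sorts.
    for i in range(3000):
        sorted(list(range(i)))

profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

# Sort by cumulative time, as in the slides, and print the report.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf).sort_stats("cumulative")
stats.print_stats(5)
print(buf.getvalue())
```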
  • 9. How to Profile
       6002 function calls in 0.093 seconds
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
         3000    0.052    0.000    0.052    0.000  {list.sort}
         3000    0.022    0.000    0.022    0.000  {range}
  • 10. Example: Standard Deviation
       >>> import numpy
       >>> n = 100
       >>> a = numpy.array(xrange(n), dtype=float)
       >>> a.std(ddof=1)
       29.011491975882016
  • 11. Example: Standard Deviation
       >>> n = 4000000000
       >>> a = numpy.array(xrange(n), dtype=float)
       Traceback (most recent call last):
         File "<stdin>", line 1, in <module>
       ValueError: setting an array element with a sequence.
  • 12. Example: Standard Deviation
       >>> n = 4000000000
       >>> arr = numpy.zeros(n, dtype=float)
       Traceback (most recent call last):
         File "<stdin>", line 1, in <module>
       MemoryError
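The `MemoryError` is predictable from arithmetic alone: 4 billion float64 values at 8 bytes each need roughly 30 GiB of RAM, far beyond a typical 2014 workstation. A back-of-envelope check worth doing before any big allocation:

```python
# Estimate memory before allocating: n float64 values at 8 bytes each.
n = 4_000_000_000
bytes_per_float64 = 8
needed = n * bytes_per_float64
print(needed / 2**30, "GiB")  # about 29.8 GiB
```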
  • 14. Example: Standard Deviation
       Given an array A broken into n parts a1...an,
       each with mean āi and local variance sum
           V(ai) = Σj (aij - āi)²
       the overall standard deviation is
           √( Σi=1..n [ V(ai) + 2(Σj aij)(āi - Ā) + |ai|(Ā² - āi²) ] / (|A| - ddof) )
       where Ā is the mean of the whole of A and |A| its total length.
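The identity holds exactly, because each part's adjustment 2·Tᵢ(āᵢ − Ā) + (Ā² − āᵢ²)·|aᵢ| simplifies to |aᵢ|(āᵢ − Ā)². A pure-Python sanity check (function names here are illustrative, not from the slides):

```python
import math

def partitioned_std(parts, ddof=0):
    # Combine per-part (total, length, varsum) using the slide's identity.
    glength = sum(len(p) for p in parts)
    g = sum(sum(p) for p in parts) / glength
    final = 0.0
    for p in parts:
        m = sum(p) / len(p)
        V = sum((x - m) ** 2 for x in p)
        final += V + 2 * sum(p) * (m - g) + (g ** 2 - m ** 2) * len(p)
    return math.sqrt(final / (glength - ddof))

def direct_std(a, ddof=0):
    m = sum(a) / len(a)
    return math.sqrt(sum((x - m) ** 2 for x in a) / (len(a) - ddof))

data = [float(x) for x in range(100)]
parts = [data[i:i + 10] for i in range(0, 100, 10)]
print(partitioned_std(parts, ddof=1))  # matches slide 10's 29.011...
```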
  • 15. Example: Standard Deviation
       def run():
           points = 400000  # (0000)
           segments = 100
           part_len = points / segments
           partitions = []
           for p in range(segments):
               part = range(part_len * p,
                            part_len * (p + 1))
               partitions.append(part)
           return stddev(partitions, ddof=1)
  • 16. Example: Standard Deviation
       def stddev(partitions, ddof=0):
           final = 0.0
           for part in partitions:
               m = total(part) / length(part)
               # Find the mean of the entire group.
               gtotal = total([total(p) for p in partitions])
               glength = total([length(p) for p in partitions])
               g = gtotal / glength
               adj = ((2 * total(part) * (m - g)) +
                      ((g ** 2 - m ** 2) * length(part)))
               final += varsum(part) + adj
           return math.sqrt(final / (glength - ddof))
  • 17. Example: Standard Deviation
       2052106 function calls in 71.025 seconds
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
            1    0.000    0.000   71.023   71.023  stddev.py:39(run)
            1    0.006    0.006   71.013   71.013  stddev.py:22(stddev)
       410400   63.406    0.000   70.490    0.000  stddev.py:4(total)
          100    0.341    0.003   69.178    0.692  stddev.py:15(varsum)
       410601    7.076    0.000    7.076    0.000  {range}
       410200    0.151    0.000    0.174    0.000  stddev.py:11(length)
       820700    0.042    0.000    0.042    0.000  {len}
          100    0.000    0.000    0.000    0.000  {list.append}
            1    0.000    0.000    0.000    0.000  {math.sqrt}
  • 18. Example: Standard Deviation
       400000 in 71.025 seconds
       Assuming no other effects of scale, it will take
       197.3 hours (over 8 days) to calculate our
       4 billion-row array.
  • 19. Example: Standard Deviation
       Can we calculate our 4 billion-row array in 1 minute?
       That’s 400,000 in 6ms.
       All we need is an 11,837.5x speedup.
  • 21. Example: Standard Deviation
       2052106 function calls in 71.025 seconds
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
            1    0.000    0.000   71.023   71.023  stddev.py:39(run)
            1    0.006    0.006   71.013   71.013  stddev.py:22(stddev)
       410400   63.406    0.000   70.490    0.000  stddev.py:4(total)
          100    0.341    0.003   69.178    0.692  stddev.py:15(varsum)
       410601    7.076    0.000    7.076    0.000  {range}
       410200    0.151    0.000    0.174    0.000  stddev.py:11(length)
       820700    0.042    0.000    0.042    0.000  {len}
          100    0.000    0.000    0.000    0.000  {list.append}
            1    0.000    0.000    0.000    0.000  {math.sqrt}
  • 23. Extracting Loop Invariants
       def varsum(arr):
           vs = 0
           for j in range(len(arr)):
               mean = (total(arr) / length(arr))
               vs += (arr[j] - mean) ** 2
           return vs
  • 24. Extracting Loop Invariants
       def varsum(arr):
           vs = 0
           mean = (total(arr) / length(arr))
           for j in range(len(arr)):
               vs += (arr[j] - mean) ** 2
           return vs
  • 25. Extracting Loop Invariants
       52606 calls in 1.944 seconds (36x)
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
            1    0.000    0.000    1.942    1.942  stddev1.py:41(run)
            1    0.006    0.006    1.932    1.932  stddev1.py:23(stddev)
        10500    1.673    0.000    1.859    0.000  stddev1.py:4(total)
        10701    0.196    0.000    0.196    0.000  {range}
          100    0.062    0.001    0.081    0.001  stddev1.py:15(varsum)
        10300    0.003    0.000    0.003    0.000  stddev1.py:11(length)
        20900    0.001    0.000    0.001    0.000  {len}
          100    0.000    0.000    0.000    0.000  {list.append}
            1    0.000    0.000    0.000    0.000  {math.sqrt}
       still 5.4 hrs
  • 26. Extracting Loop Invariants
       def stddev(partitions, ddof=0):
           final = 0.0
           for part in partitions:
               m = total(part) / length(part)
               # Find the mean of the entire group.
               gtotal = total([total(p) for p in partitions])
               glength = total([length(p) for p in partitions])
               g = gtotal / glength
               adj = ((2 * total(part) * (m - g)) +
                      ((g ** 2 - m ** 2) * length(part)))
               final += varsum(part) + adj
           return math.sqrt(final / (glength - ddof))
  • 27. Extracting Loop Invariants
       def stddev(partitions, ddof=0):
           final = 0.0
           # Find the mean of the entire group.
           gtotal = total([total(p) for p in partitions])
           glength = total([length(p) for p in partitions])
           g = gtotal / glength
           for part in partitions:
               m = total(part) / length(part)
               adj = ((2 * total(part) * (m - g)) +
                      ((g ** 2 - m ** 2) * length(part)))
               final += varsum(part) + adj
           return math.sqrt(final / (glength - ddof))
  • 28. Extracting Loop Invariants
       2512 function calls in 0.142 seconds (13x)
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
            1    0.000    0.000    0.140    0.140  stddev1.py:42(run)
            1    0.000    0.000    0.136    0.136  stddev1.py:23(stddev)
          100    0.063    0.001    0.082    0.001  stddev1.py:15(varsum)
          402    0.064    0.000    0.071    0.000  stddev1.py:4(total)
          603    0.013    0.000    0.013    0.000  {range}
          400    0.000    0.000    0.000    0.000  stddev1.py:11(length)
          902    0.000    0.000    0.000    0.000  {len}
          100    0.000    0.000    0.000    0.000  {list.append}
            1    0.000    0.000    0.000    0.000  {math.sqrt}
       still 23 minutes
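The hoisting win is easy to reproduce with `timeit`: recomputing the mean inside the loop turns an O(n) pass into O(n²). A Python 3 sketch (the `varsum_naive`/`varsum_hoisted` names are illustrative):

```python
from timeit import timeit

def varsum_naive(arr):
    vs = 0.0
    for x in arr:
        mean = sum(arr) / len(arr)  # loop invariant recomputed every pass
        vs += (x - mean) ** 2
    return vs

def varsum_hoisted(arr):
    mean = sum(arr) / len(arr)      # invariant computed once, before the loop
    vs = 0.0
    for x in arr:
        vs += (x - mean) ** 2
    return vs

data = [float(x) for x in range(1000)]
slow = timeit(lambda: varsum_naive(data), number=10)
fast = timeit(lambda: varsum_hoisted(data), number=10)
print(slow / fast)  # typically a large multiple
```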
  • 29. Amongst Our Weaponry
       Use builtin Python functions whenever possible
  • 30. Use Python Builtins
       def total(arr):
           s = 0
           for j in range(len(arr)):
               s += arr[j]
           return s
  • 31. Use Python Builtins
       def total(arr):
           s = 0
           for j in range(len(arr)):
               s += arr[j]
           return s

       def total(arr):
           return sum(arr)
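The builtin wins because `sum` runs the accumulation loop in C rather than as interpreted bytecode. A measurable Python 3 sketch of the swap above:

```python
from timeit import timeit

def total_loop(arr):
    s = 0
    for x in arr:  # each iteration is interpreted bytecode
        s += x
    return s

data = list(range(10000))

loop_t = timeit(lambda: total_loop(data), number=100)
builtin_t = timeit(lambda: sum(data), number=100)
print(loop_t / builtin_t)  # builtin sum is typically several times faster
```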
  • 32. Use Python Builtins
       2110 function calls in 0.096 seconds (1.47x)
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
            1    0.000    0.000    0.093    0.093  stddev1.py:39(run)
            1    0.000    0.000    0.083    0.083  stddev1.py:20(stddev)
          100    0.065    0.001    0.070    0.001  stddev1.py:12(varsum)
          402    0.000    0.000    0.015    0.000  stddev1.py:4(total)
          402    0.015    0.000    0.015    0.000  {sum}
          201    0.012    0.000    0.012    0.000  {range}
          400    0.000    0.000    0.000    0.000  stddev1.py:8(length)
          500    0.000    0.000    0.000    0.000  {len}
          100    0.000    0.000    0.000    0.000  {list.append}
            1    0.000    0.000    0.000    0.000  {math.sqrt}
       still 16 minutes
  • 33. Use Python Builtins
       2110 function calls in 0.096 seconds (1.47x)
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
            1    0.000    0.000    0.093    0.093  stddev1.py:39(run)
            1    0.000    0.000    0.083    0.083  stddev1.py:20(stddev)
          100    0.065    0.001    0.070    0.001  stddev1.py:12(varsum)
          402    0.000    0.000    0.015    0.000  stddev1.py:4(total)
          402    0.015    0.000    0.015    0.000  {sum}
          201    0.012    0.000    0.012    0.000  {range}
          400    0.000    0.000    0.000    0.000  stddev1.py:8(length)
          500    0.000    0.000    0.000    0.000  {len}
          100    0.000    0.000    0.000    0.000  {list.append}
            1    0.000    0.000    0.000    0.000  {math.sqrt}
  • 34. Use Python Builtins
       def varsum(arr):
           vs = 0
           mean = (total(arr) / length(arr))
           for j in range(len(arr)):
               vs += (arr[j] - mean) ** 2
           return vs
  • 35. Use Python Builtins
       def varsum(arr):
           mean = (total(arr) / length(arr))
           return sum((v - mean) ** 2 for v in arr)
  • 36. Use Python Builtins
       402110 function calls in 0.122 seconds (1.27x slower)
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
            1    0.000    0.000    0.120    0.120  stddev.py:36(run)
            1    0.000    0.000    0.115    0.115  stddev.py:17(stddev)
          502    0.044    0.000    0.114    0.000  {sum}
          100    0.000    0.000    0.106    0.001  stddev.py:12(varsum)
       400100    0.070    0.000    0.070    0.000  stddev.py:14(genexpr)
          402    0.000    0.000    0.011    0.000  stddev.py:4(total)
       …
  • 37.
  • 38. Amongst Our Weaponry
       Reduce function calls
  • 39. Reduce Function Calls
       >>> Timer("sum(a)", "a = range(10)").repeat(3)
       [0.15801000595092773,
        0.1406857967376709,
        0.14577603340148926]
       >>> Timer("total(a)",
       ...       "a = range(10); total = lambda x: sum(x)").repeat(3)
       [0.2066800594329834,
        0.1998300552368164,
        0.21536493301391602]
       0.000000059 seconds per call
  • 40. Reduce Function Calls
       def variances_squared(arr):
           mean = (total(arr) / length(arr))
           for v in arr:
               yield (v - mean) ** 2
  • 41. Reduce Function Calls
       def varsum(arr):
           mean = (total(arr) / length(arr))
           return sum( (v - mean) ** 2 for v in arr )

       def varsum(arr):
           mean = (total(arr) / length(arr))
           return sum([(v - mean) ** 2 for v in arr])
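The two forms on slide 41 compute the identical number; the list comprehension just avoids the generator's per-item resume, which is the function-call overhead being shaved here. A Python 3 sketch showing the equivalence:

```python
data = [float(x) for x in range(1000)]
mean = sum(data) / len(data)

via_gen = sum((v - mean) ** 2 for v in data)     # generator: lazy, one resume per item
via_list = sum([(v - mean) ** 2 for v in data])  # list comp: one eager C-level pass

print(via_list)
```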
  • 42. Reduce Function Calls
       2010 function calls in 0.082 seconds (1.17x)
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
            1    0.000    0.000    0.080    0.080  stddev.py:36(run)
            1    0.000    0.000    0.071    0.071  stddev.py:17(stddev)
          100    0.050    0.001    0.056    0.001  stddev.py:12(varsum)
          502    0.020    0.000    0.020    0.000  {sum}
          402    0.000    0.000    0.016    0.000  stddev.py:4(total)
          101    0.009    0.000    0.009    0.000  {range}
          400    0.000    0.000    0.000    0.000  stddev.py:8(length)
          400    0.000    0.000    0.000    0.000  {len}
          100    0.000    0.000    0.000    0.000  {list.append}
            1    0.000    0.000    0.000    0.000  {math.sqrt}
       still 13+ minutes
  • 43. Amongst Our Weaponry
       Vector operations with NumPy
  • 44. Vector Operations
       part = numpy.array(xrange(...), dtype=float)

       def total(arr):
           return arr.sum()

       def varsum(arr):
           return ((arr - arr.mean()) ** 2).sum()
  • 45. Vector Operations
       3408 function calls in 0.057 seconds (1.43x)
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
            1    0.000    0.000    0.057    0.057  stddev1.py:37(run)
          200    0.051    0.000    0.051    0.000  {numpy...array}
            1    0.001    0.001    0.006    0.006  stddev1.py:18(stddev)
          500    0.003    0.000    0.003    0.000  {numpy.ufunc.reduce}
          100    0.001    0.000    0.003    0.000  stddev1.py:14(varsum)
          400    0.000    0.000    0.003    0.000  {numpy.ndarray.sum}
          300    0.000    0.000    0.002    0.000  stddev1.py:6(total)
          100    0.000    0.000    0.001    0.000  {numpy.ndarray.mean}
       …
       still 9.5 minutes
  • 46. Vector Operations
       3408 function calls in 0.057 seconds (1.43x)
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
            1    0.000    0.000    0.057    0.057  stddev1.py:37(run)
          200    0.051    0.000    0.051    0.000  {numpy...array}
            1    0.001    0.001    0.006    0.006  stddev1.py:18(stddev)
          500    0.003    0.000    0.003    0.000  {numpy.ufunc.reduce}
          100    0.001    0.000    0.003    0.000  stddev1.py:14(varsum)
          400    0.000    0.000    0.003    0.000  {numpy.ndarray.sum}
          300    0.000    0.000    0.002    0.000  stddev1.py:6(total)
          100    0.000    0.000    0.001    0.000  {numpy.ndarray.mean}
       …
       still 9.5 minutes
  • 47. Vector Operations
       3408 function calls in 0.006 seconds (13.6x)
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
            1    0.001    0.001    0.006    0.006  stddev1.py:18(stddev)
          500    0.003    0.000    0.003    0.000  {numpy.ufunc.reduce}
          100    0.001    0.000    0.003    0.000  stddev1.py:14(varsum)
          400    0.000    0.000    0.003    0.000  {numpy.ndarray.sum}
          300    0.000    0.000    0.002    0.000  stddev1.py:6(total)
          100    0.000    0.000    0.001    0.000  {numpy.ndarray.mean}
       …
       should be exactly 1 minute
  • 48. Vector Operations
       Let’s try 4 billion!
       Bump up that N...
  • 51. Parallelization
       from multiprocessing import Pool

       def run():
           results = Pool().map(run_one, range(segments))
           result = stddev(results)
           return result
  • 52. Parallelization
       def run_one(i):
           p = numpy.memmap('stddev.%d' % i, dtype=float,
                            mode='r', shape=(part_len,))
           T, L = p.sum(), float(len(p))
           m = T / L
           V = ((p - m) ** 2).sum()
           return T, L, V
  • 53. Parallelization
       def stddev(TLVs, ddof=0):
           final = 0.0
           totals = [T for T, L, V in TLVs]
           lengths = [L for T, L, V in TLVs]
           glength = sum(lengths)
           g = sum(totals) / glength
           for T, L, V in TLVs:
               m = T / L
               adj = ((2 * T * (m - g)) +
                      ((g ** 2 - m ** 2) * L))
               final += V + adj
           return math.sqrt(final / (glength - ddof))
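The fan-out/combine shape can be sketched end to end without the memmap files. This uses `multiprocessing.dummy.Pool` (a thread pool with the same API as the process `Pool` in the slides) over small in-memory parts, purely for illustration:

```python
import math
from multiprocessing.dummy import Pool  # thread pool; same API as process Pool

data = [float(x) for x in range(400)]
parts = [data[i:i + 100] for i in range(0, 400, 100)]

def run_one(part):
    # Per-worker reduction: total, length, and local variance sum.
    T = sum(part)
    L = float(len(part))
    m = T / L
    V = sum((x - m) ** 2 for x in part)
    return T, L, V

def stddev(TLVs, ddof=0):
    # Combine the per-part (T, L, V) triples, as on slide 53.
    glength = sum(L for T, L, V in TLVs)
    g = sum(T for T, L, V in TLVs) / glength
    final = 0.0
    for T, L, V in TLVs:
        m = T / L
        final += V + (2 * T * (m - g)) + ((g ** 2 - m ** 2) * L)
    return math.sqrt(final / (glength - ddof))

with Pool(4) as pool:
    results = pool.map(run_one, parts)
print(stddev(results, ddof=1))
```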
  • 54. Parallelization
       3734 function calls in 0.024 seconds (6x slower)
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
            1    0.000    0.000    0.024    0.024  stddev.py:47(run)
            4    0.000    0.000    0.011    0.003  threading.py:234(wait)
           22    0.011    0.000    0.011    0.000  {thread.lock.acquire}
            1    0.000    0.000    0.011    0.011  pool.py:222(map)
            1    0.000    0.000    0.008    0.008  pool.py:113(__init__)
            4    0.001    0.000    0.005    0.001  process.py:116(start)
            1    0.003    0.003    0.005    0.005  stddev.py:11(stddev)
            4    0.000    0.000    0.004    0.001  forking.py:115(init)
            4    0.003    0.001    0.003    0.001  {posix.fork}
       ...
  • 55. Parallelization
       Could that waiting be insignificant
       when we scale up to 4 billion?
       Let’s try it!
  • 56. Parallelization
       3766 function calls in 67.811 seconds
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
            1    0.000    0.000   67.811   67.811  stddev.py:47(run)
            4    0.000    0.000   67.747   16.930  threading.py:234(wait)
           22   67.747    3.079   67.747    3.079  {thread.lock.acquire}
            1    0.000    0.000   67.747   67.747  pool.py:222(map)
            1    0.000    0.000    0.062    0.060  pool.py:113(__init__)
            4    0.000    0.000    0.058    0.014  process.py:116(start)
            4    0.057    0.014    0.057    0.014  {posix.fork}
            1    0.003    0.003    0.005    0.005  stddev.py:11(stddev)
            2    0.002    0.001    0.002    0.001  {sum}
       SO CLOSE! 1.13 minutes
  • 57. Parallelization
       def run_one(i):
           if i == 50:
               cProfile.runctx(..., "prf.50")

       >>> import pstats
       >>> s = pstats.Stats("prf.50")
       >>> s.sort_stats("cumulative")
       <pstats.Stats instance at 0x2bddcb0>
       >>> _.print_stats()
  • 58. Parallelization
       57 function calls in 2.804 seconds
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
            1    0.431    0.431    2.791    2.791  stddev.py:43(run_one)
            2    0.000    0.000    2.360    1.180  numpy.ndarray.sum
            2    2.360    1.180    2.360    1.180  numpy.ufunc.reduce
            1    0.000    0.000    0.000    0.000  memmap.py:195(__new__)
  • 59. Parallelization
       def run_one(i):
           p = numpy.memmap('stddev.%d' % i, dtype=float,
                            mode='r', shape=(part_len,))
           T, L = p.sum(), float(len(p))
           m = T / L
           V = ((p - m) ** 2).sum()
           return T, L, V

       200 seconds / 4 cores = 50
  • 60. Parallelization? Serialization!
       67.8 seconds for 4 billion rows,
       but ~50 of those are loading data!
       17.8 seconds to do the actual math.
  • 61. Serialization
       import bloscpack as bp

       bargs = bp.args.DEFAULT_BLOSC_ARGS
       bargs['clevel'] = 6
       bp.pack_ndarray_file(part, fname,
                            blosc_args=bargs)

       part = bp.unpack_ndarray_file(fname)
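If bloscpack is not installed, the same pack-compressed/unpack round trip can be illustrated with the standard library. This sketch uses `struct` + `zlib` (a much slower codec than Blosc, so it shows the principle, not the talk's performance):

```python
import struct
import zlib

def pack_floats(values, level=6):
    # Serialize a list of floats to bytes, then compress them.
    raw = struct.pack("%dd" % len(values), *values)
    return zlib.compress(raw, level)

def unpack_floats(blob):
    # Decompress and deserialize back to a list of floats.
    raw = zlib.decompress(blob)
    return list(struct.unpack("%dd" % (len(raw) // 8), raw))

part = [float(x) for x in range(10000)]
blob = pack_floats(part)
restored = unpack_floats(blob)
print(len(blob), "compressed bytes for", len(part) * 8, "raw bytes")
```

The win on disk-bound workloads is the same as in the talk: reading fewer bytes and decompressing can beat reading raw bytes.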
  • 64. I Crush Your Head!
       1153 function calls in 26.166 seconds
       ncalls  tottime  percall  cumtime  percall  filename:lineno(func)
            1    0.000    0.000   26.166   26.166  stddev_bp.py:56(run)
            4    0.000    0.000   26.134     6.53  threading.py:234(wait)
           22   26.134    1.188   26.134    1.188  {thread.lock.acquire}
            1    0.000    0.000   26.133   26.133  pool.py:222(map)
            1    0.000    0.000   26.133   26.133  pool.py:521(get)
            1    0.000    0.000   26.133   26.133  pool.py:513(wait)
            1    0.003    0.003    0.030    0.030  __init__.py:227(Pool)
            1    0.000    0.000    0.021    0.021  pool.py:113(__init__)
  • 65. I Crush Your Head!
       With some time-tested general programming techniques:
       - Extract loop invariants
       - Use language builtins
       - Reduce function calls
  • 66. I Crush Your Head!
       And some Python libraries for architectural improvements:
       - Use NumPy for vector ops
       - Use multiprocessing for parallelization
       - Use bloscpack for compression
  • 67. I Crush Your Head!
       We sped up our calculation so that it runs in
       0.003% of the time: 27317 times faster,
       or 4.4 orders of magnitude.
  • 68. Crushing the Head of the Snake
       Any questions?
       @aminusfu
       bob@crunch.io