7. 30% chance of raising >$5M in New York
Survival function: New York NPO revenue risk estimate
[Figure: likelihood of hitting target vs. revenue ($M), with bootstrapped 95% confidence intervals]
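The bootstrapped interval on this slide can be sketched as follows. This is a minimal illustration that assumes a percentile bootstrap over an empirical survival function; the deck does not say which bootstrap variant was used, and the revenue figures and function names below are hypothetical.

```python
import random

def survival_prob(revenues, target):
    """Empirical survival function: fraction of observations above target."""
    return sum(r > target for r in revenues) / len(revenues)

def bootstrap_ci(revenues, target, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for P(revenue > target)."""
    rng = random.Random(seed)
    n = len(revenues)
    # Resample with replacement and recompute the statistic each time.
    stats = sorted(
        survival_prob(rng.choices(revenues, k=n), target) for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return stats[lo_idx], stats[hi_idx]

# Hypothetical revenues in $M, for illustration only (not the deck's data).
revenues = [0.2, 0.4, 0.5, 0.9, 1.3, 2.0, 3.1, 5.5, 6.2, 8.0]
point = survival_prob(revenues, 5.0)
low, high = bootstrap_ci(revenues, 5.0)
print(f"P(revenue > $5M) = {point:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

With only ten observations the interval is wide, which is the reason to plot it next to the point estimate.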
8. Market comparison
[Figure: P (probability of hitting target) vs. revenue ($M) for New York, Boston, and Chicago]
10% greater chance of raising $5M in New York
9.
10. Probabilistic revenue prediction model
Model steps:
1. Cluster (e.g. education, e.g. health)
2. Feature selection (e.g. # volunteers, $ fundraising)
3. Regression
4. Parameter distributions (θ1, θ2, θ3, ..., θn)
Monte Carlo simulation -> Revenue random variable (R)
Uses:
1. Visualize (probability vs. revenue ($M))
2. Compare markets
3. Find the best markets (max)
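A rough sketch of the Monte Carlo step: draw each parameter θ from its distribution, push the draw through the regression, and collect the resulting revenue samples as the random variable R. The model form (linear in log-revenue), the normal parameter distributions, and all numbers below are assumptions made for illustration; the deck does not specify them.

```python
import math
import random

# Hypothetical parameter distributions (mean, stdev) for a log-revenue regression.
# The deck does not give the real coefficients.
PARAMS = {
    "intercept": (-1.0, 0.3),
    "volunteers": (0.004, 0.001),   # effect per volunteer
    "fundraising": (0.5, 0.1),      # effect per $M spent on fundraising
}

def simulate_revenue(volunteers, fundraising, n_draws=10000, seed=0):
    """Draw theta from each parameter distribution; return revenue samples in $M."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_draws):
        theta = {name: rng.gauss(mu, sd) for name, (mu, sd) in PARAMS.items()}
        log_r = (theta["intercept"]
                 + theta["volunteers"] * volunteers
                 + theta["fundraising"] * fundraising)
        samples.append(math.exp(log_r))
    return samples

samples = simulate_revenue(volunteers=200, fundraising=1.5)
p_hit = sum(r > 1.0 for r in samples) / len(samples)
print(f"P(R > $1M) = {p_hit:.2f}")
```

The samples approximate the distribution of R, so any probability or quantile can be read off them directly.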
17. Helping NPOs decide where to grow
Consulting
Peer-driven insights
Dashboard ($10k/license)
Non-profit CEO: Where to expand to next?
Revenue prediction
18. Probabilistic revenue prediction model
Model steps:
1. Cluster (e.g. education, e.g. health)
2. Feature selection (e.g. # volunteers, $ fundraising)
3. Regression
4. Parameters (θ1, θ2, ..., θn)
Revenue random variable (R)
Uses:
1. Visualize (probability vs. revenue ($M))
2. Compare markets
3. Find the best markets (max)
= $125K
= 60GB free
23. Detroit (blue) vs. New York (red)
[Figure: lognormal distribution and survival function, probability vs. revenue ($M)]
SSE: Detroit 0.0120, New York 0.019
Data fit to a lognormal distribution.
Taking the log of the data normalizes it.
Calculating the sum of squared errors (SSE) requires us to evaluate the continuous distribution wherever we have bins.
Similar sum of squared errors.
Sampling bias? Most likely, yes.
The model predicts a greater probability of hitting high revenue targets in Detroit than in New York, which intuitively seems incorrect. The reason the model predicts this is that it has to fit the 3 high-revenue points we have for Detroit. Note that the sum of squared errors is approximately the same for both cities. The issue is sampling bias. The solution is to collect more data in an unbiased manner. The data on AWS should fix this.
mean = loc
stdev = scale
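The fit and the SSE check described on this slide might look like the sketch below. It assumes the lognormal is fit by taking the mean and standard deviation of log-revenue (the "mean = loc, stdev = scale" note) and that the SSE compares normalized histogram bars with the fitted density at bin centers. The deck does not show its actual fitting code, so the function names, bin edges, and data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def fit_lognormal(revenues):
    """Fit a normal to log(revenue): loc = mean of logs, scale = stdev of logs."""
    logs = [math.log(r) for r in revenues]
    return NormalDist(mu=mean(logs), sigma=stdev(logs))

def lognormal_pdf(dist, x):
    """Density of revenue x when log(x) follows dist."""
    return dist.pdf(math.log(x)) / x

def histogram_density(revenues, edges):
    """Normalized histogram: density per bin, so the bars integrate to 1."""
    n = len(revenues)
    return [
        sum(lo <= r < hi for r in revenues) / (n * (hi - lo))
        for lo, hi in zip(edges[:-1], edges[1:])
    ]

def sse(revenues, edges):
    """Sum of squared errors between bin densities and the fitted pdf at bin centers."""
    dist = fit_lognormal(revenues)
    centers = [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]
    observed = histogram_density(revenues, edges)
    return sum((o - lognormal_pdf(dist, c)) ** 2 for o, c in zip(observed, centers))

# Hypothetical revenues in $M and bin edges, for illustration only.
revenues = [0.2, 0.4, 0.5, 0.9, 1.3, 2.0, 3.1, 5.5, 6.2, 8.0]
edges = [0, 2, 4, 6, 8, 10]
fit = fit_lognormal(revenues)
sse_value = sse(revenues, edges)
print(f"loc = {fit.mean:.3f}, scale = {fit.stdev:.3f}, SSE = {sse_value:.4f}")
```

A similar SSE for two cities only says the curves fit their own samples equally well; it does not reveal whether either sample is biased, which is the point the slide makes about Detroit.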
24. Plug in any revenue target to get the probability
of hitting it
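Under a lognormal model, plugging in a target amounts to evaluating the survival function, P(R > t) = 1 - Φ((ln t - loc) / scale), where loc and scale are the mean and standard deviation of log-revenue. A minimal sketch, with a hypothetical function name and made-up parameter values:

```python
import math
from statistics import NormalDist

def prob_hitting_target(target_musd, loc, scale):
    """P(revenue > target) when log(revenue) ~ Normal(loc, scale); target in $M."""
    log_revenue = NormalDist(mu=loc, sigma=scale)
    return 1.0 - log_revenue.cdf(math.log(target_musd))

# Hypothetical fitted parameters, for illustration only.
for target in (1.0, 5.0, 10.0):
    p = prob_hitting_target(target, loc=0.5, scale=1.2)
    print(f"P(revenue > ${target:g}M) = {p:.2f}")
```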
25. Plug in any revenue target ($ millions) to get the probability of hitting it
Cities where you are most likely to hit your revenue target
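Ranking cities for a given target could be sketched as below, assuming each city has its own fitted lognormal parameters. The per-city (loc, scale) values are invented for illustration and are not the deck's results.

```python
import math
from statistics import NormalDist

# Hypothetical (loc, scale) of log-revenue per city; not fitted to real data.
CITY_PARAMS = {
    "New York": (0.9, 1.1),
    "Boston": (0.6, 1.0),
    "Chicago": (0.2, 1.0),
}

def rank_cities(target_musd, city_params):
    """Return (city, P(revenue > target)) pairs, most likely city first."""
    log_t = math.log(target_musd)
    probs = {
        city: 1.0 - NormalDist(mu=loc, sigma=scale).cdf(log_t)
        for city, (loc, scale) in city_params.items()
    }
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

ranking = rank_cities(5.0, CITY_PARAMS)
for city, p in ranking:
    print(f"{city}: {p:.2f}")
```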
Demo