Time Series Analysis
In [49]: import os
import datetime
import requests
import zipfile
import io
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.tsa.ar_model import AR
from sklearn.metrics import r2_score, mean_squared_error
%matplotlib inline
0. Get Data
In [50]: sorted(list(filter(lambda f: not f.startswith("."), os.listdir("."))))
In [51]: def get_data(data_url):
             with requests.get(data_url) as r:
                 with zipfile.ZipFile(io.BytesIO(r.content)) as z:
                     z.extractall()
In [52]: data_url = "http://quantquote.com/files/quantquote_daily_sp500_83986.zip"
         get_data(data_url=data_url)
data_dir = os.path.join("quantquote_daily_sp500_83986", "daily")
Out[50]: ['README.md',
'Untitled.ipynb',
'census_data.csv',
'data_science_answers.ipynb',
'data_science_raw.ipynb',
'pandas_basics.ipynb',
'pandas_basics_addntl.ipynb',
'pandas_basics_answers.ipynb',
'python_basics.ipynb',
'python_basics_addntl.ipynb',
'python_basics_answers.ipynb',
'quantquote_daily_sp500_83986',
'requirements.txt',
'time_series_analysis.ipynb',
'time_series_analysis_answers.ipynb']
Scope out data directory
In [53]: stock_csv_names = sorted(os.listdir(data_dir))
In [54]: cols = ['date', 'time', 'open', 'high', 'low_price', 'close', 'volume']
Check sample data
In [55]: df = pd.read_csv(os.path.join(data_dir, stock_csv_names[0]), names=cols)
In [56]: df.shape
In [64]: df.head()
In [58]: df.dtypes
We see that most of the typing here looks good (no object / string representations of numeric data) but we do
see that date is coming in as an int rather than a datetime.
Out[56]: (3452, 7)
Out[64]:
date time open high low_price close volume
0 1999-11-18 0 42.2076 46.3820 37.4581 39.1928 4.398181e+07
1 1999-11-19 0 39.8329 39.8885 36.9293 37.6251 1.139020e+07
2 1999-11-22 0 38.3208 40.0091 37.1613 39.9442 4.654716e+06
3 1999-11-23 0 39.4247 40.4729 37.3375 37.5138 4.268903e+06
4 1999-11-24 0 37.2262 38.9052 37.1056 38.0889 3.602367e+06
Out[58]: date int64
time int64
open float64
high float64
low_price float64
close float64
volume float64
dtype: object
In [59]: df.isnull().sum()
In [60]: df.duplicated().sum()
No missing data, and no duplicate data.
In [61]: df.date = pd.to_datetime(df.date.astype(str), infer_datetime_format=True)
Check that this is a proper time-series data set, i.e. that we're indexed on time, which in this case means we should have exactly one date per row:
In [67]: df.date.nunique()/len(df)
In [31]: df = df.set_index('date')
In [68]: df.time.nunique()
In [69]: df = df.drop('time', axis=1)
Problem
Get an iterable of DataFrames, one for each stock in our dataset, with the wrangling we did above included.
In [82]: def get_csv_path(csv_name, stock_csv_folder=data_dir):
return os.path.join(stock_csv_folder, csv_name)
Out[59]: date 0
time 0
open 0
high 0
low_price 0
close 0
volume 0
dtype: int64
Out[60]: 0
Out[67]: 1.0
Out[68]: 1
In [83]: def get_df(csv_name, cols=cols):
             df = pd.read_csv(get_csv_path(csv_name),
                              names=cols,
                              usecols=list(filter(lambda c: c != "time", cols)))
             df.date = pd.to_datetime(df.date.astype(str), infer_datetime_format=True)
             return df.set_index("date", drop=False)
In [84]: dfs_iter = (get_df(csv_name) for csv_name in stock_csv_names)
In [85]: dfs_list = list(dfs_iter) #convert generator object to list
In [89]: len(dfs_list)
1. Prices & Returns
Prices
In [90]: aapl_df = get_df("table_aapl.csv")
In [93]: pd.Series(aapl_df.index).quantile([0, 1])
In [94]: aapl_df.isnull().sum()
In [95]: aapl_df.duplicated().sum()
Out[89]: 500
Out[93]: 0.0 1998-01-02
1.0 2013-08-09
Name: date, dtype: datetime64[ns]
Out[94]: date 0
open 0
high 0
low_price 0
close 0
volume 0
dtype: int64
Out[95]: 0
In [96]: ax = aapl_df.close.plot(figsize=(11, 8))
t = ax.set_title("aapl: closing price")
A couple of observations on the above graph:
- the stock's price has increased over time
- there is a good bit of variability in between the start and end points
- it would have been nice to buy AAPL back in the 90s!
We can see clearly that there are a number of different components, if you will, to the above time series:
- there's an upward trend over time
- there look to be some periodic-ish patterns
- there's a fair amount of noise-ish stuff, too
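The decomposition plotted next makes these components concrete: seasonal_decompose fits an additive model of the form

close_t = trend_t + seasonal_t + resid_t

where the trend is a centered moving average over the chosen frequency (252 trading days, roughly one trading year), the seasonal term is the average detrended value at each position within that cycle, and the residual is whatever remains.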
In [97]: def plot_time_series_decomposition(series, freq=252):
             res = sm.tsa.seasonal_decompose(series, freq=freq)
             fig, (ax1, ax2, ax3) = plt.subplots(3, 1, figsize=(11, 8))
             p1 = res.trend.plot(ax=ax1, rot=0)
             t1 = ax1.title.set_text("trend")
             p2 = res.seasonal.plot(ax=ax2, rot=0)
             t2 = ax2.title.set_text("seasonal")
             p3 = res.resid.plot(ax=ax3, rot=0)
             t3 = ax3.title.set_text("resid")
             fig.tight_layout()
In [98]: plot_time_series_decomposition(aapl_df.close)
Returns
In [101]: aapl_df["return_gross"] = aapl_df.close.divide(aapl_df.close.shift(1))
In [104]: aapl_df["return_net"] = aapl_df.return_gross - 1
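A note on naming for the cells that follow: "simple return" is just another name for the net return, so the return_simple column used below can be defined equivalently as (a hedged sketch of the presumed definition, not an original cell):

In [ ]: # assumption: the return_simple column used below equals the net return
        aapl_df["return_simple"] = aapl_df.return_net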
In [105]: aapl_df.return_net.describe()
Returns Distribution
In [112]: ax = aapl_df.return_net.hist(figsize=(11, 8), bins=200)
t = ax.set_title("aapl: simple daily return")
In [107]: (aapl_df.return_simple < -.1).sum()
Out[105]: count 3925.000000
mean 0.001672
std 0.029774
min -0.518150
25% -0.013670
50% 0.000859
75% 0.016312
max 0.183749
Name: return_net, dtype: float64
Out[107]: 10
In [108]: aapl_df[aapl_df.return_simple < -.1]
Out[108]:
            date        open       high       low_price  close      volume        return_gross  return_simple
date
1999-01-14  1999-01-14   11.06280   11.18440    9.98318   10.07560  5.813583e+07      0.892333      -0.107667
1999-09-21  1999-09-21   17.79520   17.80980   16.77650   16.81050  1.138997e+08      0.875187      -0.124813
2000-09-29  2000-09-29   13.64490   14.10200   12.34160   12.52160  2.339738e+08      0.481850      -0.518150
2000-12-06  2000-12-06    7.11421    7.29413    6.80785    6.99264  4.577349e+07      0.845883      -0.154117
2001-07-18  2001-07-18   10.62510   10.69810    9.92974   10.10480  3.854362e+07      0.829214      -0.170786
2002-06-19  2002-06-19    8.44660    8.55844    8.20832    8.33476  5.893970e+07      0.850621      -0.149379
2002-07-17  2002-07-17    7.84362    7.87766    7.38652    7.62479  4.142656e+07      0.878923      -0.121077
2008-01-23  2008-01-23  132.46100  136.15700  122.67700  135.03800  1.181346e+08      0.880185      -0.119815
2008-09-29  2008-09-29  116.41400  116.41400   97.82880  103.12900  9.250666e+07      0.823352      -0.176648
2013-01-24  2013-01-24  451.33600  456.96900  441.77900  442.29000  4.939171e+07      0.876643      -0.123357
In [113]: ax = aapl_df.return_net.plot(figsize=(11, 8))
t = ax.set_title("aapl: net daily return")
In [114]: plot_time_series_decomposition(aapl_df.return_simple.dropna())
In [115]: aapl_df["return_simple_rolling_21_mean"] = aapl_df.return_simple.rolling(21).mean()
In [122]: ax = aapl_df.return_simple_rolling_21_mean.plot(figsize=(11, 8))
          t = ax.set_title("aapl: net daily return, rolling 21 mean (one month)")
Retain this functionality to try other windows:
In [127]: def set_and_plot_rolling_mean(window):
              col_name = f"return_simple_rolling_{window}_mean"
              aapl_df[col_name] = aapl_df.return_simple.rolling(window).mean()
              ax = aapl_df[col_name].plot(figsize=(11, 8))
              t = ax.set_title(f"aapl: net daily return, rolling {window} day mean")
In [121]: set_and_plot_rolling_mean(63)
In [123]: set_and_plot_rolling_mean(166)
In [124]: set_and_plot_rolling_mean(252)
In [126]: def set_and_plot_rolling_std(window):
              col_name = f"return_simple_rolling_{window}_std"
              aapl_df[col_name] = aapl_df.return_simple.rolling(window).std()
              ax = aapl_df[col_name].plot(figsize=(11, 8))
              t = ax.set_title(f"aapl: net daily return, rolling {window} day std")
In [128]: set_and_plot_rolling_std(63)
Plot rolling mean and std together
In [129]: [c for c in aapl_df.columns if "63" in c]
In [132]: aapl_quarterly = aapl_df[[c for c in aapl_df.columns if "63" in c]].dropna()
In [134]: aapl_quarterly = aapl_quarterly / aapl_quarterly.iloc[0]
In [135]: aapl_quarterly.iloc[0:5]
Out[129]: ['return_simple_rolling_63_mean', 'return_simple_rolling_63_std']
Out[135]:
return_simple_rolling_63_mean return_simple_rolling_63_std
date
1998-04-03 1.000000 1.000000
1998-04-06 0.973895 1.004831
1998-04-07 0.601837 0.829599
1998-04-08 0.699827 0.788331
1998-04-09 0.681255 0.784945
In [136]: ax = aapl_quarterly.plot(figsize=(11, 8))
t = ax.set_title("aapl: simple daily return, rolling 63 mean and std")
There appears to be an inverse relationship between the two; the correlation below quantifies it:
In [138]: aapl_quarterly.corr()
In [139]: aapl_quarterly.iloc[0:5]
Out[138]:
return_simple_rolling_63_mean return_simple_rolling_63_std
return_simple_rolling_63_mean 1.000000 -0.370614
return_simple_rolling_63_std -0.370614 1.000000
Out[139]:
return_simple_rolling_63_mean return_simple_rolling_63_std
date
1998-04-03 1.000000 1.000000
1998-04-06 0.973895 1.004831
1998-04-07 0.601837 0.829599
1998-04-08 0.699827 0.788331
1998-04-09 0.681255 0.784945
In [141]: plot_time_series_decomposition(aapl_df.return_simple_rolling_63_std.dropna())
Compare rolling returns and volatility with differing window lengths and quantify these relationships using
correlation.
In [142]: def compare_return_to_vol(return_col, vol_col, df=aapl_df):
              return df[[return_col, vol_col]].corr()
In [144]: compare_return_to_vol("return_simple_rolling_63_mean", "return_simple_rolling_63_std")
Out[144]:
return_simple_rolling_63_mean return_simple_rolling_63_std
return_simple_rolling_63_mean 1.000000 -0.370614
return_simple_rolling_63_std -0.370614 1.000000
In [145]: aapl_df.columns
In [148]: compare_return_to_vol("return_simple_rolling_166_mean", "return_simple_rolling_63_std")
In [149]: set_and_plot_rolling_std(166)
Out[145]: Index(['date', 'open', 'high', 'low_price', 'close', 'volume', 'return_gross',
                 'return_simple', 'return_net', 'return_simple_rolling_21_mean',
                 'return_simple_rolling_63_mean', 'return_simple_rolling_166_mean',
                 'return_simple_rolling_252_mean', 'return_simple_rolling_63_std'],
                dtype='object')
Out[148]:
return_simple_rolling_166_mean return_simple_rolling_63_std
return_simple_rolling_166_mean 1.000000 -0.280703
return_simple_rolling_63_std -0.280703 1.000000
In [151]: compare_return_to_vol("return_simple_rolling_166_mean", "return_simple_rolling_166_std")
In [152]: aapl_halflyish = aapl_df[[c for c in aapl_df.columns if "166" in c]].dropna()
In [153]: aapl_halflyish = aapl_halflyish / aapl_halflyish.iloc[0]
In [154]: ax = aapl_halflyish.plot(figsize=(11, 8))
t = ax.set_title("aapl: simple daily return, rolling 166 mean and std")
2. Multi-Stock Analysis
In [155]: len(dfs_list)
Out[151]:
return_simple_rolling_166_mean return_simple_rolling_166_std
return_simple_rolling_166_mean 1.000000 -0.349501
return_simple_rolling_166_std -0.349501 1.000000
Out[155]: 500
In [157]: os.listdir(data_dir)[:5]
In [158]: os.listdir(data_dir)[-5:]
In [159]: def get_stock_name(csv_name):
              file_name = csv_name.split(".")[0]
              return file_name.split("_")[1]
In [160]: dfs_list_indexed = [
              pd.concat({get_stock_name(csv_name): dfs_list[i]}, axis=1)
              for i, csv_name in enumerate(stock_csv_names)]
In [161]: all_stocks = dfs_list_indexed[0].join(dfs_list_indexed[1:])
In [162]: all_stocks.iloc[:5, :8]
Out[157]: ['table_dlph.csv',
'table_cat.csv',
'table_coh.csv',
'table_mcd.csv',
'table_ca.csv']
Out[158]: ['table_schw.csv',
'table_cl.csv',
'table_te.csv',
'table_vz.csv',
'table_hrs.csv']
Out[162]:
            a                                                                 aa
            date        open     high     low_price  close    volume         date        open
date
1999-11-18  1999-11-18  42.2076  46.3820    37.4581  39.1928  4.398181e+07  1999-11-18  24.5183
1999-11-19  1999-11-19  39.8329  39.8885    36.9293  37.6251  1.139020e+07  1999-11-19  24.5647
1999-11-22  1999-11-22  38.3208  40.0091    37.1613  39.9442  4.654716e+06  1999-11-22  24.6846
1999-11-23  1999-11-23  39.4247  40.4729    37.3375  37.5138  4.268903e+06  1999-11-23  25.0018
1999-11-24  1999-11-24  37.2262  38.9052    37.1056  38.0889  3.602367e+06  1999-11-24  25.0482
In [164]: all_stocks["aa"].head()
Close
In [165]: tickers = all_stocks.columns.levels[0]
tickers[:5]
In [166]: _all_close_list = [all_stocks[tick].close for tick in tickers]
In [170]: _all_close_list[0].head()
In [171]: _all_close_list = [srs.rename(tickers[i])
for i, srs in enumerate(_all_close_list)]
In [176]: all_stocks_close = _all_close_list[0].to_frame().join(_all_close_list[1:])
Out[164]:
date open high low_price close volume
date
1999-11-18 1999-11-18 24.5183 24.5183 24.0579 24.2514 2658683.414
1999-11-19 1999-11-19 24.5647 24.7581 24.2978 24.6846 3022133.556
1999-11-22 1999-11-22 24.6846 25.1450 24.4912 25.0250 4525318.956
1999-11-23 1999-11-23 25.0018 25.4351 24.9515 25.2185 5622139.724
1999-11-24 1999-11-24 25.0482 25.0482 24.6614 24.9283 3144923.734
Out[165]: Index(['a', 'aa', 'aapl', 'abbv', 'abc'], dtype='object')
Out[170]: date
1999-11-18 39.1928
1999-11-19 37.6251
1999-11-22 39.9442
1999-11-23 37.5138
1999-11-24 38.0889
Name: close, dtype: float64
In [177]: all_stocks_close.iloc[:5, :8]
In [178]: all_stocks_returns = all_stocks_close / all_stocks_close.shift(1)
Correlation
In [179]: corrs = all_stocks_returns.corr()
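A small aside: all_stocks_returns holds gross returns (price ratios), but since correlation is unchanged by adding a constant, the correlations computed here are identical to those of net returns. An illustrative check (assuming the frames built above):

In [ ]: # gross -> net is a constant shift of -1, which leaves pairwise correlations unchanged
        np.allclose(all_stocks_returns.corr(), (all_stocks_returns - 1).corr(), equal_nan=True)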
In [180]: corrs.iloc[:5, :5]
In [181]: corrs_modified = corrs.replace(1, np.nan)
In [182]: corrs_modified.iloc[:5, :5]
Out[177]:
a aa aapl abbv abc abt ace acn
date
1999-11-18 39.1928 24.2514 21.7924 NaN 3.09830 11.9023 14.3008 NaN
1999-11-19 37.6251 24.6846 22.4756 NaN 3.05589 11.8612 13.9294 NaN
1999-11-22 39.9442 25.0250 22.0185 NaN 2.95767 12.2375 14.0259 NaN
1999-11-23 37.5138 25.2185 22.6264 NaN 2.74784 12.0383 13.9294 NaN
1999-11-24 38.0889 24.9283 23.0373 NaN 2.74784 12.1964 13.3722 NaN
Out[180]:
a aa aapl abbv abc
a 1.000000 0.383146 0.361200 0.164908 0.169508
aa 0.383146 1.000000 0.283109 0.067920 0.254100
aapl 0.361200 0.283109 1.000000 0.032483 0.142794
abbv 0.164908 0.067920 0.032483 1.000000 0.282846
abc 0.169508 0.254100 0.142794 0.282846 1.000000
Out[182]:
a aa aapl abbv abc
a NaN 0.383146 0.361200 0.164908 0.169508
aa 0.383146 NaN 0.283109 0.067920 0.254100
aapl 0.361200 0.283109 NaN 0.032483 0.142794
abbv 0.164908 0.067920 0.032483 NaN 0.282846
abc 0.169508 0.254100 0.142794 0.282846 NaN
In [183]: corrs_modified.max().sort_values(ascending=False)[:5]
In [184]: corrs_modified.min().sort_values(ascending=False)[:5]
Problem
Create a price index that provides insight into the daily tendencies of our 500 stocks. We are going to do this as follows: calculate, for each day, the mean spread between open and closing prices across all 500 stocks.
Next, visualize your results. Plot on the same axes the mean and the volatility of the open-to-close price spreads. Note that you may want to normalize prior to plotting.
Additionally, find:
- the 10 days with the highest average open-close spreads
- the 10 days with the highest volatility in their open-close spreads
In [185]: _all_open_list = [all_stocks[tick].open for tick in tickers]
In [186]: _all_open_list[0].head()
In [187]: _all_open_list = [srs.rename(tickers[i])
for i, srs in enumerate(_all_open_list)]
In [188]: all_stocks_open = _all_open_list[0].to_frame().join(_all_open_list[1:])
Out[183]: eqr 0.884192
avb 0.884192
spg 0.883757
bxp 0.883757
vno 0.883234
dtype: float64
Out[184]: wyn 0.178355
pfg 0.175845
dfs 0.175066
dd 0.168540
hst 0.159443
dtype: float64
Out[186]: date
1999-11-18 42.2076
1999-11-19 39.8329
1999-11-22 38.3208
1999-11-23 39.4247
1999-11-24 37.2262
Name: open, dtype: float64
In [189]: all_stocks_open.iloc[:5, :8]
In [190]: mean_spread = (all_stocks_close - all_stocks_open).mean(axis=1)
In [191]: mean_spread.head()
In [193]: std_spread = (all_stocks_close - all_stocks_open).std(axis=1)
In [194]: std_spread.head()
In [196]: joined_spread = mean_spread.to_frame().join(std_spread)
In [197]: joined_spread = joined_spread / joined_spread.iloc[0]
Out[189]:
a aa aapl abbv abc abt ace acn
date
1999-11-18 42.2076 24.5183 22.1401 NaN 2.88847 12.1775 14.8134 NaN
1999-11-19 39.8329 24.5647 21.7608 NaN 3.08267 12.0003 14.3973 NaN
1999-11-22 38.3208 24.6846 22.3079 NaN 3.01348 12.0193 14.0259 NaN
1999-11-23 39.4247 25.0018 22.3079 NaN 2.95767 12.1585 14.2116 NaN
1999-11-24 37.2262 25.0482 22.6118 NaN 2.73445 12.0794 13.8848 NaN
Out[191]: date
1999-11-18 0.029497
1999-11-19 -0.136570
1999-11-22 -0.136851
1999-11-23 -0.344567
1999-11-24 0.187926
dtype: float64
Out[194]: date
1999-11-18 1.447925
1999-11-19 2.082665
1999-11-22 1.295058
1999-11-23 2.393124
1999-11-24 1.648351
dtype: float64
In [198]: ax = joined_spread.plot(figsize=(11, 8))
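The remaining parts of the problem fall out by sorting the two spread series (a minimal sketch, using the mean_spread and std_spread computed above):

In [ ]: # 10 days with the highest average open-close spread
        mean_spread.sort_values(ascending=False).head(10)

In [ ]: # 10 days with the most volatile open-close spreads
        std_spread.sort_values(ascending=False).head(10)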
3. Returns, by Month
Check whether AAPL posts better daily performance in certain months.
Average daily returns, by month
In [200]: aapl_df["month"] = aapl_df.date.apply(lambda d: d.month)
In [201]: daily_by_month = aapl_df[["month", "return_simple"]
].groupby("month"
).agg([np.mean, np.std])
In [202]: daily_by_month
In [203]: daily_by_month.columns = daily_by_month.columns.droplevel()
Out[202]:
return_simple
mean std
month
1 0.002190 0.034613
2 0.001037 0.025793
3 0.003630 0.027720
4 0.002755 0.030354
5 -0.000095 0.025757
6 0.000647 0.024897
7 0.003089 0.030554
8 0.001921 0.023400
9 -0.000956 0.042424
10 0.003242 0.033829
11 0.001369 0.028624
12 0.000834 0.025551
In [204]: ax = daily_by_month.plot.barh(figsize=(11, 11))
t = ax.set_title("aapl: average simple daily return, by month")
So we see that, on average, AAPL has seen negative daily returns (with higher-than-average volatility) in September. Earnings, perhaps?
Average monthly returns, by month
In [206]: aapl_df["year"] = aapl_df.date.apply(lambda d: d.year)
In [207]: monthly_by_month = aapl_df[["year", "month", "return_simple"]
].groupby(["year", "month"]
).agg([np.amin, np.amax])
In [209]: monthly_by_month = monthly_by_month.diff(axis=1)
In [211]: monthly_by_month.columns = monthly_by_month.columns.swaplevel()
In [212]: monthly_by_month = monthly_by_month["amax"]
In [213]: monthly_by_month.head()
In [218]: monthlies = monthly_by_month.groupby(level=1
).agg([np.mean, np.std])
Out[213]:
return_simple
year month
1998 1 0.262945
2 0.106502
3 0.151977
4 0.089050
5 0.076243
In [219]: ax = monthlies.plot.barh(figsize=(11, 11))
t = ax.set_title("aapl: average simple monthly return, by month")
4. Autocorrelation
In [220]: ax = aapl_df.return_simple.to_frame(
          ).join(
              aapl_df.return_simple.shift(1).rename("shifted")
          ).plot.scatter(x="shifted", y="return_simple", figsize=(11, 8))
Scrub the outlier and try again:
In [221]: t_and_t_plus_one = aapl_df.return_simple.to_frame(
          ).join(
              aapl_df.return_simple.shift(1).rename("shifted")
          )
In [222]: (t_and_t_plus_one < -.5).sum()
In [223]: for col in t_and_t_plus_one.columns:
              t_and_t_plus_one.loc[t_and_t_plus_one[col] < -.5, col] = np.nan
Out[222]: return_simple 1
shifted 1
dtype: int64
In [224]: ax = t_and_t_plus_one.plot.scatter(x="shifted", y="return_simple", figsize=(11, 8))
In [226]: aapl_autocorrs = pd.Series({
              lag: aapl_df.return_simple.autocorr(lag=lag) for lag in list(range(1, 20))
          })
In [227]: ax = aapl_autocorrs.plot.bar(figsize=(11, 8), rot=0)
t = ax.title.set_text("aapl: autocorrelation")
Problem
Extract the most significant lag values from above, and understand whether day-of-the-week makes a difference.
In [228]: aapl_df["weekday"] = aapl_df.date.apply(lambda d: d.weekday())
In [229]: aapl_df.weekday.value_counts(normalize=True).sort_index()
Out[229]: 0 0.187723
1 0.204534
2 0.205807
3 0.201732
4 0.200204
Name: weekday, dtype: float64
In [230]: aapl_df[aapl_df.weekday == 4].return_simple.head()
In [231]: aapl_df[aapl_df.weekday == 4].return_simple.autocorr(lag=1)
In [232]: for w in sorted(aapl_df.weekday.unique()):
              print(w, aapl_df[aapl_df.weekday == w].return_simple.autocorr(lag=1), sep=": ")
Correlations, with rolling window returns
In [234]: aapl_df.return_simple.to_frame(
).join(
aapl_df.return_simple_rolling_21_mean.shift(1)
).corr()
In [235]: aapl_df.return_simple.to_frame(
).join(
aapl_df.return_simple_rolling_63_mean.shift(1)
).corr()
5. Log Returns
Out[230]: date
1998-01-02 NaN
1998-01-09 0.010519
1998-01-16 -0.022928
1998-01-23 0.009838
1998-01-30 -0.009699
Name: return_simple, dtype: float64
Out[231]: 0.06890856162364511
0: 0.03307978683299217
1: 0.02142547271905113
2: 0.01132272930404902
3: -0.025362416692087088
4: 0.06890856162364511
Out[234]:
return_simple return_simple_rolling_21_mean
return_simple 1.000000 -0.004471
return_simple_rolling_21_mean -0.004471 1.000000
Out[235]:
return_simple return_simple_rolling_63_mean
return_simple 1.00000 0.01565
return_simple_rolling_63_mean 0.01565 1.00000
Taking the logarithm of prices allows for easy returns calculations. Additionally, taking the log of a data set can pull outliers in and reduce skewness:
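One handy consequence: log returns add across time, so the sum of the daily log returns recovers the log of the total gross return. A quick illustrative check (not an original cell):

In [ ]: log_returns = np.log(aapl_df.close / aapl_df.close.shift(1))
        total_from_daily = log_returns.sum()  # the NaN in the first row is skipped
        total_direct = np.log(aapl_df.close.iloc[-1] / aapl_df.close.iloc[0])
        np.isclose(total_from_daily, total_direct)  # True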
In [237]: aapl_df["close_log"] = aapl_df.close.apply(np.log)
In [238]: ax = aapl_df.close_log.plot(figsize=(11, 8))
t = ax.set_title("aapl: log closing price")
In [239]: plot_time_series_decomposition(aapl_df.close_log)
In [240]: aapl_df["return_log"] = aapl_df.close_log - aapl_df.close_log.shift(1)
In [241]: ax = aapl_df.return_log.hist(figsize=(11, 8), bins=100)
t = ax.set_title("aapl: log daily return")
In [242]: aapl_df.return_simple.skew()
In [243]: aapl_df.return_log.skew()
For skewed data with negative values, we can take the cube root instead: it preserves sign, so it separates out the positive and negative values, and it's interesting to look at each group's frequencies separately.
In [244]: aapl_df["return_cubrt"] = aapl_df.return_simple.apply(np.cbrt)
Out[242]: -1.198448451367238
Out[243]: -3.4413935608928203
In [246]: ax = aapl_df.return_cubrt.hist(figsize=(11, 8), bins=500)
t = ax.set_title("aapl: cube root of daily return")
In [247]: aapl_df["day_of_week"] = aapl_df.date.apply(lambda d: d.weekday())
In [248]: aapl_df.day_of_week.value_counts(normalize=True).sort_index()
In [249]: aapl_df["is_monday"] = aapl_df.day_of_week == 0
Out[248]: 0 0.187723
1 0.204534
2 0.205807
3 0.201732
4 0.200204
Name: day_of_week, dtype: float64
In [250]: ax = aapl_df.loc[aapl_df.is_monday == True, "return_cubrt"].hist(
              figsize=(11, 8), bins=100, density=True, alpha=.35)
          ax = aapl_df.loc[aapl_df.is_monday == False, "return_cubrt"].hist(
              ax=ax, alpha=.35, bins=100, density=True)
          t = ax.set_title("aapl: cube root of daily return")
In [251]: aapl_df[["return_simple", "is_monday"]].groupby("is_monday").agg([np.mean, np.std])
6. Autoregression
In [252]: aapl_returns_2010 = aapl_df.loc[aapl_df.date.apply(lambda d: d.year == 2010), "return_simple"]
Out[251]:
return_simple
mean std
is_monday
False 0.001286 0.029905
True 0.003341 0.029161
In [253]: train = aapl_returns_2010.loc[:datetime.date(2010, 6, 30)]
In [254]: train.head()
In [255]: train.tail()
In [256]: test = aapl_returns_2010.loc[datetime.date(2010, 7, 1):datetime.date(2010, 7, 31)]
In [257]: model = AR(train.values)
In [258]: model_fit = model.fit()
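For reference, an AR(p) model regresses each observation on its own p most recent values,

r_t = c + φ_1 · r_{t-1} + ... + φ_p · r_{t-p} + ε_t,

and since no maxlag was passed, fit() falls back to a default lag order derived from the sample size; k_ar below reports the order it used.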
In [259]: model_fit.k_ar
In [260]: model_fit.params
In [261]: params_df = pd.DataFrame(model_fit.params,
                                   index=[i for i in range(len(model_fit.params))],
                                   columns=["model_params"])
In [262]: autocorrs = aapl_autocorrs.rename("autocorr").loc[:13].to_frame()
Out[254]: date
2010-01-04 0.016518
2010-01-05 0.000701
2010-01-06 -0.016521
2010-01-07 -0.001990
2010-01-08 0.007175
Name: return_simple, dtype: float64
Out[255]: date
2010-06-24 -0.006807
2010-06-25 -0.005433
2010-06-28 0.005089
2010-06-29 -0.044515
2010-06-30 -0.020156
Name: return_simple, dtype: float64
Out[259]: 13
Out[260]: array([ 0.00246467, 0.04125616, 0.03683962, -0.25484031, 0.04155076,
0.09983197, -0.17533004, -0.02235045, -0.07667173, -0.08510721,
-0.01421416, -0.03022782, 0.04335507, 0.29210439])
In [263]: ax = autocorrs.join(params_df).plot.bar(figsize=(11, 8))
In [264]: predictions = model_fit.predict(start=len(train),
                                          end=len(train) + len(test) - 1,
                                          dynamic=False)
In [265]: predictions_df = pd.DataFrame(predictions, index=test.index, columns=["predicted"])
In [266]: ax = test.to_frame().join(predictions_df).plot(figsize=(11, 8), rot=10)
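The sklearn metrics imported at the top can put numbers on this fit (a minimal sketch, assuming the test and predictions_df objects from above):

In [ ]: eval_df = test.to_frame().join(predictions_df)
        print("r2: ", r2_score(eval_df.return_simple, eval_df.predicted))
        print("mse:", mean_squared_error(eval_df.return_simple, eval_df.predicted))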
In [ ]:
In [ ]:

More Related Content

What's hot

Engineering a robust(ish) data pipeline with Luigi and AWS Elastic Map Reduce
Engineering a robust(ish) data pipeline with Luigi and AWS Elastic Map ReduceEngineering a robust(ish) data pipeline with Luigi and AWS Elastic Map Reduce
Engineering a robust(ish) data pipeline with Luigi and AWS Elastic Map ReduceAaron Knight
 
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...InfluxData
 
Pandas+postgre sql 實作 with code
Pandas+postgre sql 實作 with codePandas+postgre sql 實作 with code
Pandas+postgre sql 實作 with codeTim Hong
 
The Ring programming language version 1.5.3 book - Part 40 of 184
The Ring programming language version 1.5.3 book - Part 40 of 184The Ring programming language version 1.5.3 book - Part 40 of 184
The Ring programming language version 1.5.3 book - Part 40 of 184Mahmoud Samir Fayed
 
«Отладка в Python 3.6: Быстрее, Выше, Сильнее» Елизавета Шашкова, JetBrains
«Отладка в Python 3.6: Быстрее, Выше, Сильнее» Елизавета Шашкова, JetBrains«Отладка в Python 3.6: Быстрее, Выше, Сильнее» Елизавета Шашкова, JetBrains
«Отладка в Python 3.6: Быстрее, Выше, Сильнее» Елизавета Шашкова, JetBrainsit-people
 
Python 내장 함수
Python 내장 함수Python 내장 함수
Python 내장 함수용 최
 
Advanced Query Parsing Techniques
Advanced Query Parsing TechniquesAdvanced Query Parsing Techniques
Advanced Query Parsing TechniquesSearch Technologies
 
Herding types with Scala macros
Herding types with Scala macrosHerding types with Scala macros
Herding types with Scala macrosMarina Sigaeva
 
Building Real Time Systems on MongoDB Using the Oplog at Stripe
Building Real Time Systems on MongoDB Using the Oplog at StripeBuilding Real Time Systems on MongoDB Using the Oplog at Stripe
Building Real Time Systems on MongoDB Using the Oplog at StripeMongoDB
 
python高级内存管理
python高级内存管理python高级内存管理
python高级内存管理rfyiamcool
 
The Ring programming language version 1.10 book - Part 44 of 212
The Ring programming language version 1.10 book - Part 44 of 212The Ring programming language version 1.10 book - Part 44 of 212
The Ring programming language version 1.10 book - Part 44 of 212Mahmoud Samir Fayed
 
[Let'Swift 2019] 실용적인 함수형 프로그래밍 워크샵
[Let'Swift 2019] 실용적인 함수형 프로그래밍 워크샵[Let'Swift 2019] 실용적인 함수형 프로그래밍 워크샵
[Let'Swift 2019] 실용적인 함수형 프로그래밍 워크샵Wanbok Choi
 

What's hot (20)

Engineering a robust(ish) data pipeline with Luigi and AWS Elastic Map Reduce
Engineering a robust(ish) data pipeline with Luigi and AWS Elastic Map ReduceEngineering a robust(ish) data pipeline with Luigi and AWS Elastic Map Reduce
Engineering a robust(ish) data pipeline with Luigi and AWS Elastic Map Reduce
 
Docopt
DocoptDocopt
Docopt
 
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
 
Pandas+postgre sql 實作 with code
Pandas+postgre sql 實作 with codePandas+postgre sql 實作 with code
Pandas+postgre sql 實作 with code
 
R and C++
R and C++R and C++
R and C++
 
The Ring programming language version 1.5.3 book - Part 40 of 184
The Ring programming language version 1.5.3 book - Part 40 of 184The Ring programming language version 1.5.3 book - Part 40 of 184
The Ring programming language version 1.5.3 book - Part 40 of 184
 
«Отладка в Python 3.6: Быстрее, Выше, Сильнее» Елизавета Шашкова, JetBrains
«Отладка в Python 3.6: Быстрее, Выше, Сильнее» Елизавета Шашкова, JetBrains«Отладка в Python 3.6: Быстрее, Выше, Сильнее» Елизавета Шашкова, JetBrains
«Отладка в Python 3.6: Быстрее, Выше, Сильнее» Елизавета Шашкова, JetBrains
 
Python 내장 함수
Python 내장 함수Python 내장 함수
Python 내장 함수
 
Welcome to python
Welcome to pythonWelcome to python
Welcome to python
 
Advanced Query Parsing Techniques
Advanced Query Parsing TechniquesAdvanced Query Parsing Techniques
Advanced Query Parsing Techniques
 
Typelevel summit
Typelevel summitTypelevel summit
Typelevel summit
 
Hadoop Pig
Hadoop PigHadoop Pig
Hadoop Pig
 
Adding CF Attributes to an HDF5 File
Adding CF Attributes to an HDF5 FileAdding CF Attributes to an HDF5 File
Adding CF Attributes to an HDF5 File
 
Herding types with Scala macros
Herding types with Scala macrosHerding types with Scala macros
Herding types with Scala macros
 
Building Real Time Systems on MongoDB Using the Oplog at Stripe
Building Real Time Systems on MongoDB Using the Oplog at StripeBuilding Real Time Systems on MongoDB Using the Oplog at Stripe
Building Real Time Systems on MongoDB Using the Oplog at Stripe
 
Computer vision
Computer vision Computer vision
Computer vision
 
python高级内存管理
python高级内存管理python高级内存管理
python高级内存管理
 
The Ring programming language version 1.10 book - Part 44 of 212
The Ring programming language version 1.10 book - Part 44 of 212The Ring programming language version 1.10 book - Part 44 of 212
The Ring programming language version 1.10 book - Part 44 of 212
 
Dotnet 18
Dotnet 18Dotnet 18
Dotnet 18
 
[Let'Swift 2019] 실용적인 함수형 프로그래밍 워크샵
[Let'Swift 2019] 실용적인 함수형 프로그래밍 워크샵[Let'Swift 2019] 실용적인 함수형 프로그래밍 워크샵
[Let'Swift 2019] 실용적인 함수형 프로그래밍 워크샵
 

Similar to Time Series Analysis Sample Code

Use of django at jolt online v3
Use of django at jolt online v3Use of django at jolt online v3
Use of django at jolt online v3Jaime Buelta
 
PPT on Data Science Using Python
PPT on Data Science Using PythonPPT on Data Science Using Python
PPT on Data Science Using PythonNishantKumar1179
 
Global Interpreter Lock: Episode I - Break the Seal
Global Interpreter Lock: Episode I - Break the SealGlobal Interpreter Lock: Episode I - Break the Seal
Global Interpreter Lock: Episode I - Break the SealTzung-Bi Shih
 
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak   CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak PROIDEA
 
Assignment 4.pdf
Assignment 4.pdfAssignment 4.pdf
Assignment 4.pdfdash41
 
Viktor Tsykunov: Azure Machine Learning Service
Viktor Tsykunov: Azure Machine Learning ServiceViktor Tsykunov: Azure Machine Learning Service
Viktor Tsykunov: Azure Machine Learning ServiceLviv Startup Club
 
Is HTML5 Ready? (workshop)
Is HTML5 Ready? (workshop)Is HTML5 Ready? (workshop)
Is HTML5 Ready? (workshop)Remy Sharp
 
Is html5-ready-workshop-110727181512-phpapp02
Is html5-ready-workshop-110727181512-phpapp02Is html5-ready-workshop-110727181512-phpapp02
Is html5-ready-workshop-110727181512-phpapp02PL dream
 
GoFFIng around with Ruby #RubyConfPH
GoFFIng around with Ruby #RubyConfPHGoFFIng around with Ruby #RubyConfPH
GoFFIng around with Ruby #RubyConfPHGautam Rege
 
Designing REST API automation tests in Kotlin
Designing REST API automation tests in KotlinDesigning REST API automation tests in Kotlin
Designing REST API automation tests in KotlinDmitriy Sobko
 
Tools for Solving Performance Issues
Tools for Solving Performance IssuesTools for Solving Performance Issues
Tools for Solving Performance IssuesOdoo
 
The Ring programming language version 1.9 book - Part 99 of 210
The Ring programming language version 1.9 book - Part 99 of 210The Ring programming language version 1.9 book - Part 99 of 210
The Ring programming language version 1.9 book - Part 99 of 210Mahmoud Samir Fayed
 
Informatics Practices (new) solution CBSE 2021, Compartment, improvement ex...
Informatics Practices (new) solution CBSE  2021, Compartment,  improvement ex...Informatics Practices (new) solution CBSE  2021, Compartment,  improvement ex...
Informatics Practices (new) solution CBSE 2021, Compartment, improvement ex...FarhanAhmade
 
Python Google Cloud Function with CORS
Python Google Cloud Function with CORSPython Google Cloud Function with CORS
Python Google Cloud Function with CORSRapidValue
 
Class 12 computer sample paper with answers
Class 12 computer sample paper with answersClass 12 computer sample paper with answers
Class 12 computer sample paper with answersdebarghyamukherjee60
 

Similar to Time Series Analysis Sample Code (20)

Profiling in Python
Profiling in PythonProfiling in Python
Profiling in Python
 
Use of django at jolt online v3
Use of django at jolt online v3Use of django at jolt online v3
Use of django at jolt online v3
 
alexnet.pdf
alexnet.pdfalexnet.pdf
alexnet.pdf
 
PPT on Data Science Using Python
PPT on Data Science Using PythonPPT on Data Science Using Python
PPT on Data Science Using Python
 
Global Interpreter Lock: Episode I - Break the Seal
Global Interpreter Lock: Episode I - Break the SealGlobal Interpreter Lock: Episode I - Break the Seal
Global Interpreter Lock: Episode I - Break the Seal
 
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak   CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
 
Assignment 4.pdf
Assignment 4.pdfAssignment 4.pdf
Assignment 4.pdf
 
Viktor Tsykunov: Azure Machine Learning Service
Viktor Tsykunov: Azure Machine Learning ServiceViktor Tsykunov: Azure Machine Learning Service
Viktor Tsykunov: Azure Machine Learning Service
 
Is HTML5 Ready? (workshop)
Is HTML5 Ready? (workshop)Is HTML5 Ready? (workshop)
Is HTML5 Ready? (workshop)
 
Is html5-ready-workshop-110727181512-phpapp02
Is html5-ready-workshop-110727181512-phpapp02Is html5-ready-workshop-110727181512-phpapp02
Is html5-ready-workshop-110727181512-phpapp02
 
GoFFIng around with Ruby #RubyConfPH
GoFFIng around with Ruby #RubyConfPHGoFFIng around with Ruby #RubyConfPH
GoFFIng around with Ruby #RubyConfPH
 
Designing REST API automation tests in Kotlin
Designing REST API automation tests in KotlinDesigning REST API automation tests in Kotlin
Designing REST API automation tests in Kotlin
 
Tools for Solving Performance Issues
Tools for Solving Performance IssuesTools for Solving Performance Issues
Tools for Solving Performance Issues
 
The Ring programming language version 1.9 book - Part 99 of 210
The Ring programming language version 1.9 book - Part 99 of 210The Ring programming language version 1.9 book - Part 99 of 210
The Ring programming language version 1.9 book - Part 99 of 210
 
Os lab final
Os lab finalOs lab final
Os lab final
 
Informatics Practices (new) solution CBSE 2021, Compartment, improvement ex...
Informatics Practices (new) solution CBSE  2021, Compartment,  improvement ex...Informatics Practices (new) solution CBSE  2021, Compartment,  improvement ex...
Informatics Practices (new) solution CBSE 2021, Compartment, improvement ex...
 
Python Google Cloud Function with CORS
Python Google Cloud Function with CORSPython Google Cloud Function with CORS
Python Google Cloud Function with CORS
 
Class 12 computer sample paper with answers
Class 12 computer sample paper with answersClass 12 computer sample paper with answers
Class 12 computer sample paper with answers
 
Deep Learning for Computer Vision: Software Frameworks (UPC 2016)
Deep Learning for Computer Vision: Software Frameworks (UPC 2016)Deep Learning for Computer Vision: Software Frameworks (UPC 2016)
Deep Learning for Computer Vision: Software Frameworks (UPC 2016)
 
Final DAA_prints.pdf
Final DAA_prints.pdfFinal DAA_prints.pdf
Final DAA_prints.pdf
 

Recently uploaded

RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxUnduhUnggah1
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 

Recently uploaded (20)

RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 

Time Series Analysis Sample Code

  • 1. 8/1/2019 Aiden Wu's Sample Code localhost:8888/nbconvert/html/Desktop/PythonRelated/python_fundamentals-master/Aiden Wu's Sample Code.ipynb?download=false 1/40 Time Series Analysis In [49]: import os import datetime import requests import zipfile import io import numpy as np import pandas as pd import matplotlib.pyplot as plt import statsmodels.api as sm from statsmodels.tsa.ar_model import AR from sklearn.metrics import r2_score, mean_squared_error %matplotlib inline 0. Get Data In [50]: sorted(list(filter(lambda f: not f.startswith("."), os.listdir(".")))) In [51]: def get_data(data_url): with requests.get(data_url) as r: with zipfile.ZipFile(io.BytesIO(r.content)) as z: z.extractall() In [52]: data_url = "http://quantquote.com/files/quantquote_daily_sp500_83986.zi p" get_data(data_url=data_url) data_dir = os.path.join("quantquote_daily_sp500_83986", "daily") Out[50]: ['README.md', 'Untitled.ipynb', 'census_data.csv', 'data_science_answers.ipynb', 'data_science_raw.ipynb', 'pandas_basics.ipynb', 'pandas_basics_addntl.ipynb', 'pandas_basics_answers.ipynb', 'python_basics.ipynb', 'python_basics_addntl.ipynb', 'python_basics_answers.ipynb', 'quantquote_daily_sp500_83986', 'requirements.txt', 'time_series_analysis.ipynb', 'time_series_analysis_answers.ipynb']
  • 2. 8/1/2019 Aiden Wu's Sample Code localhost:8888/nbconvert/html/Desktop/PythonRelated/python_fundamentals-master/Aiden Wu's Sample Code.ipynb?download=false 2/40 Scope out data directory In [53]: stock_csv_names = sorted(os.listdir(data_dir)) In [54]: cols = ['date', 'time', 'open', 'high', 'low_price', 'close', 'volume'] Check sample data In [55]: df = pd.read_csv(os.path.join(data_dir, stock_csv_names[0]),names=cols) In [56]: df.shape In [64]: df.head() In [58]: df.dtypes We see that most of the typing here looks good (no object / string representations of numeric data) but we do see that date is coming in as an int rather than a datetime. Out[56]: (3452, 7) Out[64]: date time open high low_price close volume 0 1999-11-18 0 42.2076 46.3820 37.4581 39.1928 4.398181e+07 1 1999-11-19 0 39.8329 39.8885 36.9293 37.6251 1.139020e+07 2 1999-11-22 0 38.3208 40.0091 37.1613 39.9442 4.654716e+06 3 1999-11-23 0 39.4247 40.4729 37.3375 37.5138 4.268903e+06 4 1999-11-24 0 37.2262 38.9052 37.1056 38.0889 3.602367e+06 Out[58]: date int64 time int64 open float64 high float64 low_price float64 close float64 volume float64 dtype: object
  • 3. 8/1/2019 Aiden Wu's Sample Code localhost:8888/nbconvert/html/Desktop/PythonRelated/python_fundamentals-master/Aiden Wu's Sample Code.ipynb?download=false 3/40 In [59]: df.isnull().sum() In [60]: df.duplicated().sum() No missing data, and no duplicate data. In [61]: df.date = pd.to_datetime(df.date.astype(str), infer_datetime_format=True ) check that this is a proper time-series data set, i.e. we're indexed on time, which in this case will mean that we have one date for every row: In [67]: df.date.nunique()/len(df) In [31]: df=df.set_index('date') In [68]: df.time.nunique() In [69]: df=df.drop('time', axis=1) Problem Get an iterable of DataFrames , one for each stock in our dataset, with the wrangling we did above included. In [82]: def get_csv_path(csv_name, stock_csv_folder=data_dir): return os.path.join(stock_csv_folder, csv_name) Out[59]: date 0 time 0 open 0 high 0 low_price 0 close 0 volume 0 dtype: int64 Out[60]: 0 Out[67]: 1.0 Out[68]: 1
  • 4. 8/1/2019 Aiden Wu's Sample Code localhost:8888/nbconvert/html/Desktop/PythonRelated/python_fundamentals-master/Aiden Wu's Sample Code.ipynb?download=false 4/40 In [83]: def get_df(csv_name, cols=cols): df = pd.read_csv(get_csv_path(csv_name), names=cols, usecols=list(filter(lambda c: c!= "time", cols))) df.date = pd.to_datetime(df.date.astype(str), infer_datetime_format= True) return df.set_index("date", drop=False) In [84]: dfs_iter = (get_df(csv_name) for csv_name in stock_csv_names) In [85]: dfs_list = list(dfs_iter) #convert generator object to list In [89]: len(dfs_list) 1. Prices & Returns Prices In [90]: aapl_df = get_df("table_aapl.csv") In [93]: pd.Series(aapl_df.index).quantile([0, 1]) In [94]: aapl_df.isnull().sum() In [95]: aapl_df.duplicated().sum() Out[89]: 500 Out[93]: 0.0 1998-01-02 1.0 2013-08-09 Name: date, dtype: datetime64[ns] Out[94]: date 0 open 0 high 0 low_price 0 close 0 volume 0 dtype: int64 Out[95]: 0
  • 5. 8/1/2019 Aiden Wu's Sample Code localhost:8888/nbconvert/html/Desktop/PythonRelated/python_fundamentals-master/Aiden Wu's Sample Code.ipynb?download=false 5/40 In [96]: ax = aapl_df.close.plot(figsize=(11, 8)) t = ax.set_title("aapl: closing price") A couple of observations on the above graph: the stock's price has increased over time there is a good bit of variability in between the start and end points it would have been nice to buy AAPL back in the 90s! We can see clearly that there are a number of different components, if you will, to the above time series: there's an upward trend over time there look to be some periodic-ish patterns there's a fair amount of noise-ish stuff, too
  • 6. 8/1/2019 Aiden Wu's Sample Code localhost:8888/nbconvert/html/Desktop/PythonRelated/python_fundamentals-master/Aiden Wu's Sample Code.ipynb?download=false 6/40 In [97]: def plot_time_series_decomposition(series, freq=252): res = sm.tsa.seasonal_decompose(series, freq=freq) fig, (ax1,ax2,ax3) = plt.subplots(3,1, figsize=(11,8)) p1 = res.trend.plot(ax=ax1, rot=0) t1 = ax1.title.set_text("trend") p2 = res.seasonal.plot(ax=ax2, rot=0) t2 = ax2.title.set_text("seasonal") p3 = res.resid.plot(ax=ax3, rot=0) t3 = ax3.title.set_text("resid") fig.tight_layout() In [98]: plot_time_series_decomposition(aapl_df.close) Returns In [101]: aapl_df["return_gross"] = aapl_df.close.divide(aapl_df.close.shift(1)) In [104]: aapl_df["return_net"] = aapl_df.return_gross - 1
  • 7. 8/1/2019 Aiden Wu's Sample Code localhost:8888/nbconvert/html/Desktop/PythonRelated/python_fundamentals-master/Aiden Wu's Sample Code.ipynb?download=false 7/40 In [105]: aapl_df.return_net.describe() Returns Distirbution In [112]: ax = aapl_df.return_net.hist(figsize=(11, 8), bins=200) t = ax.set_title("aapl: simple daily return") In [107]: (aapl_df.return_simple < -.1).sum() Out[105]: count 3925.000000 mean 0.001672 std 0.029774 min -0.518150 25% -0.013670 50% 0.000859 75% 0.016312 max 0.183749 Name: return_net, dtype: float64 Out[107]: 10
In [108]: aapl_df[aapl_df.return_simple < -.1]

Out[108]: (the frame is indexed on date, with the date column retained because of
          drop=False; the return_simple column is cut off in the export, so only
          its leading digits are shown)

date          open       high       low_price  close      volume        return_gross  return_simple
1999-01-14     11.06280   11.18440    9.98318   10.07560  5.813583e+07      0.892333        -0.1076
1999-09-21     17.79520   17.80980   16.77650   16.81050  1.138997e+08      0.875187        -0.1248
2000-09-29     13.64490   14.10200   12.34160   12.52160  2.339738e+08      0.481850        -0.5181
2000-12-06      7.11421    7.29413    6.80785    6.99264  4.577349e+07      0.845883        -0.1541
2001-07-18     10.62510   10.69810    9.92974   10.10480  3.854362e+07      0.829214        -0.1707
2002-06-19      8.44660    8.55844    8.20832    8.33476  5.893970e+07      0.850621        -0.1493
2002-07-17      7.84362    7.87766    7.38652    7.62479  4.142656e+07      0.878923        -0.1210
2008-01-23    132.46100  136.15700  122.67700  135.03800  1.181346e+08      0.880185        -0.1198
2008-09-29    116.41400  116.41400   97.82880  103.12900  9.250666e+07      0.823352        -0.1766
2013-01-24    451.33600  456.96900  441.77900  442.29000  4.939171e+07      0.876643        -0.1233
In [113]: ax = aapl_df.return_net.plot(figsize=(11, 8))
          t = ax.set_title("aapl: net daily return")
In [114]: plot_time_series_decomposition(aapl_df.return_simple.dropna())

In [115]: aapl_df["return_simple_rolling_21_mean"] = aapl_df.return_simple.rolling(21).mean()
In [122]: ax = aapl_df.return_simple_rolling_21_mean.plot(figsize=(11, 8))
          t = ax.set_title("aapl: net daily return, rolling 21 mean (one month)")

Retain this functionality to try other windows:

In [127]: def set_and_plot_rolling_mean(window):
              col_name = f"return_simple_rolling_{window}_mean"
              aapl_df[col_name] = aapl_df.return_simple.rolling(window).mean()
              ax = aapl_df[col_name].plot(figsize=(11, 8))
              t = ax.set_title(f"aapl: net daily return, rolling {window} day mean")
In [121]: set_and_plot_rolling_mean(63)

In [123]: set_and_plot_rolling_mean(166)
In [124]: set_and_plot_rolling_mean(252)

In [126]: def set_and_plot_rolling_std(window):
              col_name = f"return_simple_rolling_{window}_std"
              aapl_df[col_name] = aapl_df.return_simple.rolling(window).std()
              ax = aapl_df[col_name].plot(figsize=(11, 8))
              t = ax.set_title(f"aapl: net daily return, rolling {window} day std")
In [128]: set_and_plot_rolling_std(63)

Plot rolling mean and std together

In [129]: [c for c in aapl_df.columns if "63" in c]

Out[129]: ['return_simple_rolling_63_mean', 'return_simple_rolling_63_std']

In [132]: aapl_quarterly = aapl_df[[c for c in aapl_df.columns if "63" in c]].dropna()

In [134]: aapl_quarterly = aapl_quarterly / aapl_quarterly.iloc[0]

In [135]: aapl_quarterly.iloc[0:5]

Out[135]:
            return_simple_rolling_63_mean  return_simple_rolling_63_std
date
1998-04-03                       1.000000                      1.000000
1998-04-06                       0.973895                      1.004831
1998-04-07                       0.601837                      0.829599
1998-04-08                       0.699827                      0.788331
1998-04-09                       0.681255                      0.784945
In [136]: ax = aapl_quarterly.plot(figsize=(11, 8))
          t = ax.set_title("aapl: simple daily return, rolling 63 mean and std")

There is an inverse relationship between the two series:

In [138]: aapl_quarterly.corr()

Out[138]:
                               return_simple_rolling_63_mean  return_simple_rolling_63_std
return_simple_rolling_63_mean                       1.000000                     -0.370614
return_simple_rolling_63_std                       -0.370614                      1.000000

In [139]: aapl_quarterly.iloc[0:5]

Out[139]: (same five rows as Out[135] above)
In [141]: plot_time_series_decomposition(aapl_df.return_simple_rolling_63_std.dropna())

Compare rolling returns and volatility with differing window lengths, and quantify these relationships using correlation (see also the sweep sketch below).

In [142]: def compare_return_to_vol(return_col, vol_col, df=aapl_df):
              return df[[return_col, vol_col]].corr()

In [144]: compare_return_to_vol("return_simple_rolling_63_mean", "return_simple_rolling_63_std")

Out[144]:
                               return_simple_rolling_63_mean  return_simple_rolling_63_std
return_simple_rolling_63_mean                       1.000000                     -0.370614
return_simple_rolling_63_std                       -0.370614                      1.000000
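The same comparison can also be swept over several windows at once; a sketch (not an original cell, and the window list is illustrative):

    # correlation of rolling mean vs. rolling std of returns, per window;
    # Series.corr drops NaN pairs from the warm-up periods automatically
    windows = [21, 63, 126, 252]
    corr_by_window = pd.Series({
        w: aapl_df.return_simple.rolling(w).mean().corr(
               aapl_df.return_simple.rolling(w).std())
        for w in windows})
    print(corr_by_window)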
In [145]: aapl_df.columns

Out[145]: Index(['date', 'open', 'high', 'low_price', 'close', 'volume',
                 'return_gross', 'return_simple', 'return_net',
                 'return_simple_rolling_21_mean', 'return_simple_rolling_63_mean',
                 'return_simple_rolling_166_mean', 'return_simple_rolling_252_mean',
                 'return_simple_rolling_63_std'],
                dtype='object')

In [148]: compare_return_to_vol("return_simple_rolling_166_mean", "return_simple_rolling_63_std")

Out[148]:
                                return_simple_rolling_166_mean  return_simple_rolling_63_std
return_simple_rolling_166_mean                        1.000000                     -0.280703
return_simple_rolling_63_std                         -0.280703                      1.000000

In [149]: set_and_plot_rolling_std(166)
In [151]: compare_return_to_vol("return_simple_rolling_166_mean", "return_simple_rolling_166_std")

Out[151]:
                                return_simple_rolling_166_mean  return_simple_rolling_166_std
return_simple_rolling_166_mean                        1.000000                      -0.349501
return_simple_rolling_166_std                        -0.349501                       1.000000

In [152]: aapl_halflyish = aapl_df[[c for c in aapl_df.columns if "166" in c]].dropna()

In [153]: aapl_halflyish = aapl_halflyish / aapl_halflyish.iloc[0]

In [154]: ax = aapl_halflyish.plot(figsize=(11, 8))
          t = ax.set_title("aapl: simple daily return, rolling 166 mean and std")

2. Multi-Stock Analysis

In [155]: len(dfs_list)

Out[155]: 500
In [157]: os.listdir(data_dir)[:5]

Out[157]: ['table_dlph.csv', 'table_cat.csv', 'table_coh.csv', 'table_mcd.csv', 'table_ca.csv']

In [158]: os.listdir(data_dir)[-5:]

Out[158]: ['table_schw.csv', 'table_cl.csv', 'table_te.csv', 'table_vz.csv', 'table_hrs.csv']

In [159]: def get_stock_name(csv_name):
              file_name = csv_name.split(".")[0]
              return file_name.split("_")[1]

In [160]: dfs_list_indexed = [
              pd.concat({get_stock_name(csv_name): dfs_list[i]}, axis=1)
              for i, csv_name in enumerate(stock_csv_names)]

In [161]: all_stocks = dfs_list_indexed[0].join(dfs_list_indexed[1:])

In [162]: all_stocks.iloc[:5, :8]

Out[162]: (two-level columns: ticker on the first level, field on the second)

                     a                                                            aa
            date        open     high     low_price  close    volume        date        open
date
1999-11-18  1999-11-18  42.2076  46.3820  37.4581    39.1928  4.398181e+07  1999-11-18  24.5183
1999-11-19  1999-11-19  39.8329  39.8885  36.9293    37.6251  1.139020e+07  1999-11-19  24.5647
1999-11-22  1999-11-22  38.3208  40.0091  37.1613    39.9442  4.654716e+06  1999-11-22  24.6846
1999-11-23  1999-11-23  39.4247  40.4729  37.3375    37.5138  4.268903e+06  1999-11-23  25.0018
1999-11-24  1999-11-24  37.2262  38.9052  37.1056    38.0889  3.602367e+06  1999-11-24  25.0482
In [164]: all_stocks["aa"].head()

Out[164]:
            date        open     high     low_price  close    volume
date
1999-11-18  1999-11-18  24.5183  24.5183  24.0579    24.2514  2658683.414
1999-11-19  1999-11-19  24.5647  24.7581  24.2978    24.6846  3022133.556
1999-11-22  1999-11-22  24.6846  25.1450  24.4912    25.0250  4525318.956
1999-11-23  1999-11-23  25.0018  25.4351  24.9515    25.2185  5622139.724
1999-11-24  1999-11-24  25.0482  25.0482  24.6614    24.9283  3144923.734

Close

In [165]: tickers = all_stocks.columns.levels[0]
          tickers[:5]

Out[165]: Index(['a', 'aa', 'aapl', 'abbv', 'abc'], dtype='object')

In [166]: _all_close_list = [all_stocks[tick].close for tick in tickers]

In [170]: _all_close_list[0].head()

Out[170]: date
          1999-11-18    39.1928
          1999-11-19    37.6251
          1999-11-22    39.9442
          1999-11-23    37.5138
          1999-11-24    38.0889
          Name: close, dtype: float64

In [171]: _all_close_list = [srs.rename(tickers[i])
                             for i, srs in enumerate(_all_close_list)]

In [176]: all_stocks_close = _all_close_list[0].to_frame().join(_all_close_list[1:])
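The rename-and-join loop above can also be expressed as a single cross-section over the second column level; a sketch (not an original cell):

    # take the "close" field from every ticker at once; columns become tickers
    all_stocks_close_alt = all_stocks.xs("close", axis=1, level=1)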
In [177]: all_stocks_close.iloc[:5, :8]

Out[177]:
                  a       aa     aapl  abbv      abc      abt      ace  acn
date
1999-11-18  39.1928  24.2514  21.7924   NaN  3.09830  11.9023  14.3008  NaN
1999-11-19  37.6251  24.6846  22.4756   NaN  3.05589  11.8612  13.9294  NaN
1999-11-22  39.9442  25.0250  22.0185   NaN  2.95767  12.2375  14.0259  NaN
1999-11-23  37.5138  25.2185  22.6264   NaN  2.74784  12.0383  13.9294  NaN
1999-11-24  38.0889  24.9283  23.0373   NaN  2.74784  12.1964  13.3722  NaN

In [178]: all_stocks_returns = all_stocks_close / all_stocks_close.shift(1)

(These are gross returns; subtracting 1 to get net returns would not change the correlations below.)

Correlation

In [179]: corrs = all_stocks_returns.corr()

In [180]: corrs.iloc[:5, :5]

Out[180]:
             a        aa      aapl      abbv       abc
a     1.000000  0.383146  0.361200  0.164908  0.169508
aa    0.383146  1.000000  0.283109  0.067920  0.254100
aapl  0.361200  0.283109  1.000000  0.032483  0.142794
abbv  0.164908  0.067920  0.032483  1.000000  0.282846
abc   0.169508  0.254100  0.142794  0.282846  1.000000

In [181]: corrs_modified = corrs.replace(1, np.nan)

In [182]: corrs_modified.iloc[:5, :5]

Out[182]:
             a        aa      aapl      abbv       abc
a          NaN  0.383146  0.361200  0.164908  0.169508
aa    0.383146       NaN  0.283109  0.067920  0.254100
aapl  0.361200  0.283109       NaN  0.032483  0.142794
abbv  0.164908  0.067920  0.032483       NaN  0.282846
abc   0.169508  0.254100  0.142794  0.282846       NaN
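As an aside (a sketch, not one of the original cells): beyond the per-ticker extremes shown next, the single most and least correlated pairs can be read off directly:

    # stack the off-diagonal correlations into a (ticker, ticker) -> corr Series;
    # stack() drops the NaN diagonal by default
    pairs = corrs_modified.stack()
    print(pairs.idxmax(), pairs.max())  # most correlated pair
    print(pairs.idxmin(), pairs.min())  # least correlated pair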
In [183]: corrs_modified.max().sort_values(ascending=False)[:5]

Out[183]: eqr    0.884192
          avb    0.884192
          spg    0.883757
          bxp    0.883757
          vno    0.883234
          dtype: float64

In [184]: corrs_modified.min().sort_values(ascending=False)[:5]

Out[184]: wyn    0.178355
          pfg    0.175845
          dfs    0.175066
          dd     0.168540
          hst    0.159443
          dtype: float64

Problem

Create a price index that provides insight into the daily tendencies of our 500 stocks. We are going to do this as follows: calculate, for each day, the mean spread between the open and closing prices across all 500 stocks. Next, visualize your results: plot, on the same axes, the volatility of the open-close spreads. Note that you may want to normalize prior to plotting. Additionally, find:

- the 10 days with the highest average open-close spreads
- the 10 days with the highest volatility in their open-close spreads

In [185]: _all_open_list = [all_stocks[tick].open for tick in tickers]

In [186]: _all_open_list[0].head()

Out[186]: date
          1999-11-18    42.2076
          1999-11-19    39.8329
          1999-11-22    38.3208
          1999-11-23    39.4247
          1999-11-24    37.2262
          Name: open, dtype: float64

In [187]: _all_open_list = [srs.rename(tickers[i])
                            for i, srs in enumerate(_all_open_list)]

In [188]: all_stocks_open = _all_open_list[0].to_frame().join(_all_open_list[1:])
In [189]: all_stocks_open.iloc[:5, :8]

Out[189]:
                  a       aa     aapl  abbv      abc      abt      ace  acn
date
1999-11-18  42.2076  24.5183  22.1401   NaN  2.88847  12.1775  14.8134  NaN
1999-11-19  39.8329  24.5647  21.7608   NaN  3.08267  12.0003  14.3973  NaN
1999-11-22  38.3208  24.6846  22.3079   NaN  3.01348  12.0193  14.0259  NaN
1999-11-23  39.4247  25.0018  22.3079   NaN  2.95767  12.1585  14.2116  NaN
1999-11-24  37.2262  25.0482  22.6118   NaN  2.73445  12.0794  13.8848  NaN

In [190]: mean_spread = (all_stocks_close - all_stocks_open).mean(axis=1)

In [191]: mean_spread.head()

Out[191]: date
          1999-11-18    0.029497
          1999-11-19   -0.136570
          1999-11-22   -0.136851
          1999-11-23   -0.344567
          1999-11-24    0.187926
          dtype: float64

In [193]: std_spread = (all_stocks_close - all_stocks_open).std(axis=1)

In [194]: std_spread.head()

Out[194]: date
          1999-11-18    1.447925
          1999-11-19    2.082665
          1999-11-22    1.295058
          1999-11-23    2.393124
          1999-11-24    1.648351
          dtype: float64

In [196]: # the series need names before joining (DataFrame.join raises on an
          # unnamed Series)
          joined_spread = mean_spread.rename("mean_spread").to_frame(
                          ).join(std_spread.rename("std_spread"))

In [197]: joined_spread = joined_spread / joined_spread.iloc[0]
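The problem also asks for the ten most extreme days; a sketch completing that part (not shown in the original cells), using the unnormalized series:

    # 10 days with the highest average spread and the most volatile spreads
    # (one might also rank the mean spread by .abs() instead)
    print(mean_spread.nlargest(10))
    print(std_spread.nlargest(10))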
In [198]: ax = joined_spread.plot(figsize=(11, 8))

3. Returns, by Month

Check whether AAPL posts better daily performance in certain months.

Average daily returns, by month

In [200]: aapl_df["month"] = aapl_df.date.apply(lambda d: d.month)

In [201]: daily_by_month = aapl_df[["month", "return_simple"]
                           ].groupby("month"
                           ).agg([np.mean, np.std])
In [202]: daily_by_month

Out[202]:
       return_simple
                mean       std
month
1           0.002190  0.034613
2           0.001037  0.025793
3           0.003630  0.027720
4           0.002755  0.030354
5          -0.000095  0.025757
6           0.000647  0.024897
7           0.003089  0.030554
8           0.001921  0.023400
9          -0.000956  0.042424
10          0.003242  0.033829
11          0.001369  0.028624
12          0.000834  0.025551

In [203]: daily_by_month.columns = daily_by_month.columns.droplevel()
In [204]: ax = daily_by_month.plot.barh(figsize=(11, 11))
          t = ax.set_title("aapl: average simple daily return, by month")

So we see that, on average, AAPL has seen negative daily returns (with higher-than-average volatility) in September - earnings?

Average monthly returns, by month

In [206]: aapl_df["year"] = aapl_df.date.apply(lambda d: d.year)

In [207]: monthly_by_month = aapl_df[["year", "month", "return_simple"]
                             ].groupby(["year", "month"]
                             ).agg([np.amin, np.amax])

In [209]: monthly_by_month = monthly_by_month.diff(axis=1)

(diff(axis=1) over the [amin, amax] pair leaves amax - amin in the amax slot, i.e. the intramonth range of daily returns.)
In [211]: monthly_by_month.columns = monthly_by_month.columns.swaplevel()

In [212]: monthly_by_month = monthly_by_month["amax"]

In [213]: monthly_by_month.head()

Out[213]:
             return_simple
year  month
1998  1           0.262945
      2           0.106502
      3           0.151977
      4           0.089050
      5           0.076243

In [218]: monthlies = monthly_by_month.groupby(level=1
                      ).agg([np.mean, np.std])
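Strictly speaking, that intramonth range is not a monthly return. A compounded monthly return could be computed instead; a sketch (not an original cell):

    # compound the daily net returns within each (year, month) group
    monthly_compounded = (aapl_df.groupby(["year", "month"]).return_simple
                          .apply(lambda r: (1 + r).prod() - 1))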
In [219]: ax = monthlies.plot.barh(figsize=(11, 11))
          t = ax.set_title("aapl: average simple monthly return, by month")

4. Autocorrelation
In [220]: ax = aapl_df.return_simple.to_frame(
          ).join(
              aapl_df.return_simple.shift(1).rename("shifted")
          ).plot.scatter(x="shifted", y="return_simple", figsize=(11, 8))

Scrub the outlier and try again:

In [221]: t_and_t_plus_one = aapl_df.return_simple.to_frame(
                             ).join(
                                 aapl_df.return_simple.shift(1).rename("shifted")
                             )

In [222]: (t_and_t_plus_one < -.5).sum()

Out[222]: return_simple    1
          shifted          1
          dtype: int64

In [223]: for col in t_and_t_plus_one.columns:
              t_and_t_plus_one.loc[t_and_t_plus_one[col] < -.5, col] = np.nan
In [224]: ax = t_and_t_plus_one.plot.scatter(x="shifted", y="return_simple",
                                             figsize=(11, 8))

In [226]: aapl_autocorrs = pd.Series({
              lag: aapl_df.return_simple.autocorr(lag=lag)
              for lag in list(range(1, 20))
          })
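For reference: Series.autocorr(lag) is just the Pearson correlation of the series with a lagged copy of itself. A quick equivalence check, as a sketch (not an original cell):

    lag = 1
    manual = aapl_df.return_simple.corr(aapl_df.return_simple.shift(lag))
    assert np.isclose(manual, aapl_df.return_simple.autocorr(lag=lag))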
In [227]: ax = aapl_autocorrs.plot.bar(figsize=(11, 8), rot=0)
          t = ax.title.set_text("aapl: autocorrelation")

Problem

Extract the most significant lag values from above, and understand whether day-of-the-week makes a difference.

In [228]: aapl_df["weekday"] = aapl_df.date.apply(lambda d: d.weekday())

In [229]: aapl_df.weekday.value_counts(normalize=True).sort_index()

Out[229]: 0    0.187723
          1    0.204534
          2    0.205807
          3    0.201732
          4    0.200204
          Name: weekday, dtype: float64
In [230]: aapl_df[aapl_df.weekday == 4].return_simple.head()

Out[230]: date
          1998-01-02         NaN
          1998-01-09    0.010519
          1998-01-16   -0.022928
          1998-01-23    0.009838
          1998-01-30   -0.009699
          Name: return_simple, dtype: float64

In [231]: aapl_df[aapl_df.weekday == 4].return_simple.autocorr(lag=1)

Out[231]: 0.06890856162364511

In [232]: for w in sorted(aapl_df.weekday.unique()):
              print(w, aapl_df[aapl_df.weekday == w].return_simple.autocorr(lag=1),
                    sep=": ")

          0: 0.03307978683299217
          1: 0.02142547271905113
          2: 0.01132272930404902
          3: -0.025362416692087088
          4: 0.06890856162364511

Correlations, with rolling window returns

In [234]: aapl_df.return_simple.to_frame(
          ).join(
              aapl_df.return_simple_rolling_21_mean.shift(1)
          ).corr()

Out[234]:
                               return_simple  return_simple_rolling_21_mean
return_simple                       1.000000                      -0.004471
return_simple_rolling_21_mean      -0.004471                       1.000000

In [235]: aapl_df.return_simple.to_frame(
          ).join(
              aapl_df.return_simple_rolling_63_mean.shift(1)
          ).corr()

Out[235]:
                               return_simple  return_simple_rolling_63_mean
return_simple                        1.00000                        0.01565
return_simple_rolling_63_mean        0.01565                        1.00000

5. Log Returns
Taking the logarithm of prices allows for easy return calculations. Additionally, taking the log of a data set can compress outliers and reduce skewness:

In [237]: aapl_df["close_log"] = aapl_df.close.apply(np.log)

In [238]: ax = aapl_df.close_log.plot(figsize=(11, 8))
          t = ax.set_title("aapl: log closing price")
In [239]: plot_time_series_decomposition(aapl_df.close_log)

In [240]: aapl_df["return_log"] = aapl_df.close_log - aapl_df.close_log.shift(1)
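Since return_log = log(close_t) - log(close_{t-1}) = log(1 + return_net), the log return and the net return nearly coincide for small daily moves. A quick identity check, as a sketch (not an original cell; should hold up to floating-point error):

    assert np.allclose(np.log1p(aapl_df.return_net.dropna()),
                       aapl_df.return_log.dropna())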
In [241]: ax = aapl_df.return_log.hist(figsize=(11, 8), bins=100)
          t = ax.set_title("aapl: log daily return")

In [242]: aapl_df.return_simple.skew()

Out[242]: -1.198448451367238

In [243]: aapl_df.return_log.skew()

Out[243]: -3.4413935608928203

For skewed data with negative values, the cube root is another option: it preserves sign, which separates out the positive and negative values, and it is interesting to look at each group's frequencies separately.

In [244]: aapl_df["return_cubrt"] = aapl_df.return_simple.apply(np.cbrt)
In [246]: ax = aapl_df.return_cubrt.hist(figsize=(11, 8), bins=500)
          t = ax.set_title("aapl: cube root of daily return")

In [247]: aapl_df["day_of_week"] = aapl_df.date.apply(lambda d: d.weekday())

In [248]: aapl_df.day_of_week.value_counts(normalize=True).sort_index()

Out[248]: 0    0.187723
          1    0.204534
          2    0.205807
          3    0.201732
          4    0.200204
          Name: day_of_week, dtype: float64

In [249]: aapl_df["is_monday"] = aapl_df.day_of_week == 0
In [250]: ax = aapl_df.loc[aapl_df.is_monday == True, "return_cubrt"].hist(
              figsize=(11, 8), bins=100, density=True, alpha=.35)
          ax = aapl_df.loc[aapl_df.is_monday == False, "return_cubrt"].hist(
              ax=ax, alpha=.35, bins=100, density=True)
          t = ax.set_title("aapl: cube root of daily return")

In [251]: aapl_df[["return_simple", "is_monday"]].groupby("is_monday").agg([np.mean, np.std])

Out[251]:
           return_simple
                    mean       std
is_monday
False           0.001286  0.029905
True            0.003341  0.029161

6. Autoregression

In [252]: aapl_returns_2010 = aapl_df.loc[aapl_df.date.apply(lambda d: d.year == 2010),
                                          "return_simple"]
In [253]: train = aapl_returns_2010.loc[:datetime.date(2010, 6, 30)]

In [254]: train.head()

Out[254]: date
          2010-01-04    0.016518
          2010-01-05    0.000701
          2010-01-06   -0.016521
          2010-01-07   -0.001990
          2010-01-08    0.007175
          Name: return_simple, dtype: float64

In [255]: train.tail()

Out[255]: date
          2010-06-24   -0.006807
          2010-06-25   -0.005433
          2010-06-28    0.005089
          2010-06-29   -0.044515
          2010-06-30   -0.020156
          Name: return_simple, dtype: float64

In [256]: test = aapl_returns_2010.loc[datetime.date(2010, 7, 1):datetime.date(2010, 7, 31)]

In [257]: model = AR(train.values)

In [258]: model_fit = model.fit()

In [259]: model_fit.k_ar

Out[259]: 13

In [260]: model_fit.params

Out[260]: array([ 0.00246467,  0.04125616,  0.03683962, -0.25484031,  0.04155076,
                  0.09983197, -0.17533004, -0.02235045, -0.07667173, -0.08510721,
                 -0.01421416, -0.03022782,  0.04335507,  0.29210439])

In [261]: params_df = pd.DataFrame(model_fit.params,
                                   index=[i for i in range(len(model_fit.params))],
                                   columns=["model_params"])

In [262]: autocorrs = aapl_autocorrs.rename("autocorr").loc[:13].to_frame()
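A compatibility note (not in the original): newer versions of statsmodels have removed the AR class in favor of AutoReg, which takes an explicit lag order. An equivalent fit, as a sketch:

    from statsmodels.tsa.ar_model import AutoReg

    # lags=13 matches model_fit.k_ar selected above
    alt_model_fit = AutoReg(train.values, lags=13).fit()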
In [263]: ax = autocorrs.join(params_df).plot.bar(figsize=(11, 8))

Note that the fitted AR coefficients need not track the raw autocorrelations: they are estimated jointly, so each reflects a partial effect after controlling for the other lags (and the first entry of params is an intercept).

In [264]: predictions = model_fit.predict(start=len(train),
                                          end=len(train) + len(test) - 1,
                                          dynamic=False)

In [265]: predictions_df = pd.DataFrame(predictions, index=test.index, columns=["predicted"])
In [266]: ax = test.to_frame().join(predictions_df).plot(figsize=(11, 8), rot=10)
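To quantify how well the out-of-sample predictions track the held-out July returns, the metrics imported at the top of the notebook (r2_score, mean_squared_error) can be applied; a sketch (not an original cell):

    # compare out-of-sample predictions against the held-out test returns
    mse = mean_squared_error(test.values, predictions)
    r2 = r2_score(test.values, predictions)
    print(f"MSE: {mse:.6f}, R^2: {r2:.4f}")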