Application of Social
Sentiment Factors In ETF
Design
Quoc Dung Cao
Yikang Luo
Lu Qiu
Rajput Tanmay
Miao Zhou
Industry Sponsor: Social Market Analytics, Inc.
May 2, 2016
SMA PRACTICUM GROUP, UNIVERSITY OF ILLINOIS AT URBANA CHAMPAIGN
This practicum research was performed under the supervision of Dr. Jeff Blaschak and Dr. Morton
N. Lane. All market sentiment data was provided by Social Market Analytics, Inc.
The stock prices were downloaded from finance.yahoo.com.
This LaTeX template was originally created by Mathias Legrand as The Legrand Orange Book, posted at
http://www.latextemplates.com/template/the-legrand-orange-book, then edited by Andrea Hidalgo and published at
http://www.overleaf.com/articles/clustering-the-interstellar-medium/mtthgyyfrdkn#.Vw_lwMftrK4.
We made some adjustments to it for this report.
First release, April 2016
Executive Summary
Exchange traded funds (ETFs) track the performance of an index or a basket of assets. Strategic-
beta ETFs seek to enhance the performance of such index tracking funds by acquiring exposure to
specific market risk factors. This research modifies the existing S&P 500 sector tracking ETFs (SPY,
XLV, and XLY) by enhancing the highest market capitalization subset of the ETFs based on social
media sentiment while still passively investing in the remaining stocks of the ETFs.
Analysis of the lexical content of Twitter posts regarding stocks is used to create a sentiment
metric called the S-Score. We use a dynamic 95% confidence interval of the S-Score to determine a
signal for extreme values of the tweets of individual stocks. This confidence interval is constructed
using a rolling 30-day look-back interval. The interval evolves to capture the most recent
development in the S-Score and thus provides a dynamic threshold to guide weight adjustments. This allows
for a significant reduction in rebalancing frequency and transaction costs compared to a fixed
threshold strategy.
Our results indicate that sentiment enhanced ETFs yield better returns during a 2-year backtesting
interval, and have more favorable Sharpe Ratios compared to the unenhanced benchmark ETFs.
A variety of rebalancing strategies (daily, weekly, and bi-weekly adjustment) allow for
varying degrees of “active” beta management of the ETF portfolios.
Contents
1 Introduction
1.1 Social Market Analytics, Inc. (SMA)
1.2 ETF
1.3 Sentiment Score
1.4 Strategic Beta
1.5 Previous Research
2 Methodology
2.1 Data
2.1.1 Benchmarks
2.1.2 SMA data
2.2 ETF Design
2.2.1 ETF rebalancing strategy
2.2.2 Dynamic Confidence Interval
2.3 Rebalancing Frequency
2.4 Limitation
3 Strategy Selection
3.1 Algorithm
3.2 Training Period
4 Back Test
4.1 Data Sample
4.2 Transaction Cost Control
4.3 Result
5 Conclusion
6 Appendix
6.1 Algorithm Expansion
6.2 Criteria Calculation
6.3 Python Scripts
6.4 R Scripts
6.5 Suggestions For Further Research
1. Introduction
1.1 Social Market Analytics, Inc. (SMA)
SMA is a social media analytics start-up that has been based in Chicago, Illinois since 2012, providing services to
financial institutions, active traders, and investors. One of the main product suites of SMA is the
Sentiment Signature Feed. SMA Sentiment Signature Feed structures and quantifies Social Media
information and provides actionable intelligence for financial markets.
SMA’s patented product provides a Sentiment Signature for the universe of tradable equities.
The engine Extracts the right data, Evaluates the content, and Calculates the statistics customers
need for alerts, trading algorithms, and business intelligence. The Twitter feed has grown to nearly
500 million Tweets per day. The indicative Tweets used in the SMA calculations are growing even
faster at 10% per month, as professionals use Twitter for news distribution, and general business
utilization expands. No firm can ignore Twitter’s significance for the financial market.
Until now there has been no efficient method for market professionals to intelligently quantify
and measure the vast amount of information embedded in the Twitter message stream. By cleaning,
filtering, and quantifying social media data, Social Market Analytics provides traders with reliable
actionable intelligence before their competitors hear about it in traditional media.
SMA’s market sentiment metrics comprise:
• S-Score: a time-weighted representation of a security’s sentiment time series, normalized over
a standard look-back interval.
• SV-Score: a security’s indicative tweet volume time series, normalized over a standard
look-back interval.
• S-Buzz: a measure of unusual social media activity relative to a universe of stocks.
• S-Volume: the volume of indicative tweets contributing to the metrics calculations at an
observation time.
• S-Dispersion: a measurement of the diversity of tweet sources contributing to an S-Score.
• S-Mean: the weighted average of the sentiment time series for a given security.
• S-Volatility: a percent measurement of the variability of the sentiment level.
• S-Delta: the percent change in S-Score, a first-order measurement of the sentiment trend.
1.2 ETF
Exchange traded funds (ETFs) track the performance of an index or a basket of assets. With high
liquidity and low commissions, ETFs provide an attractive product for investors. ETFs are one of
the fastest growing investment products currently available in the market. Beta is the financial term
for the sensitivity of an asset’s return to the market’s return, while “alpha” is the excess return that skilled
investors can generate[6]. An ETF tracking an index such as the S&P 500 provides investors with
almost one-to-one beta exposure to the index with minimal expense for fund management.
1.3 Sentiment Score
Sentiment data from SMA are derived from Tweets captured from thousands of Twitter accounts
expressing commentary on stocks and the markets[1]. SMA servers compute metrics called S-Factors
to quantify equity sentiment for investment purposes.
S-Score Calculation: At a particular time in every 24-hour window, for N informative tweets,
the raw sentiment is calculated as follows:
\[ S(t) = \sum_{i=1}^{N} \text{sentiment\_tweet}(t_i) \]
The raw sentiment is then standardized using its mean and standard deviation over a 20-day look-back period:
\[ \text{S-Score}(t) = \frac{S(t) - \mu(S(t))}{\sigma(S(t))} \]
where \mu and \sigma represent the mean and standard deviation of the raw sentiment during the past 20
days.
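The standardization above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report does not say whether the 20-day window includes the current observation or which variance estimator SMA uses, so the sketch assumes the window is the 20 observations preceding day t and uses the population standard deviation. The function name `s_scores` is hypothetical.

```python
from statistics import mean, pstdev


def s_scores(raw, lookback=20):
    """Standardize each raw sentiment value against the `lookback` values before it.

    Assumptions (not from the report): the window excludes the current value,
    and the population standard deviation is used.
    """
    scores = []
    for t, value in enumerate(raw):
        if t < lookback:
            scores.append(None)  # not enough history yet
            continue
        window = raw[t - lookback:t]
        sigma = pstdev(window)
        # A flat window has no dispersion, so the score is undefined.
        scores.append(None if sigma == 0 else (value - mean(window)) / sigma)
    return scores


# Twenty alternating raw values (mean 1, standard deviation 1), then a reading of 3.
example = s_scores([0, 2] * 10 + [3])
```

With this toy input the final entry is two standard deviations above the window mean.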
AM/PM Data Points: In this research, we acquire two S-Score readings every market day.
The AM data are generated at 9:10 AM (EST), before the market opens, and the PM data are generated
at 2:55 PM (EST), 65 minutes before the market closes. This research relies on the S-Score to establish
the threshold for weight adjustments in the enhanced ETFs.
1.4 Strategic Beta
Since 2005, the market has witnessed the growth of a new class of ETF products known as
strategic beta, which especially appeals to investors looking for better returns than traditional ETFs[5].
The two major characteristics of a strategic-beta strategy are a non-traditional weighting methodology
and transparent, quantitative weighting methods[3]. It is a mixture of active and passive portfolio
management techniques. It is “active” in that investors can subjectively choose which risk factors
to take on, and “passive” in that they only follow a predefined index to get exposure to those factors. In
traditional passive portfolio management, investors gain exposure to market risk by investing in
SPY, the ETF that tracks the performance of the S&P 500. In strategic-beta strategies, fund managers
are more flexible in picking risk factors, for example fundamental weighting, low variance, and so on.
However, strategic-beta ETFs focused on the market sentiment have not yet been fully explored,
despite contemporary research that shows significant dependence between the Twitter sentiment and
abnormal returns during peak Twitter volume (e.g. quarterly earnings announcements) as well as
around less obvious but “event-driven” peaks[2].
1.5 Previous Research
A previous practicum project in fall 2015 studied how to enhance ETF returns using daily
Twitter sentiment data. It concluded that daily rebalancing of the portfolio was a profitable strategy
in terms of excess returns, Sharpe ratio, and turnover ratio for all of SPY, XLV, and XLY[4].
Specifically, that research optimized a fixed S-Score threshold for each ETF, shared by all of its
constituent stocks, and bought/sold a stock whenever its daily S-Score exceeded/fell below the
upper/lower bound of the threshold.
In the current research, we build on the previous findings to reduce the rebalancing frequency,
make the S-Score thresholds flexible enough to adapt to the latest sentiment development, and
model an individual threshold for each constituent of the ETF.
2. Methodology
2.1 Data
This section presents background on the data used throughout the project. It describes the
sources, choice of data, and explanation of the three ETFs and the Twitter sentiment data.
2.1.1 Benchmarks
The three chosen ETFs are SPY, XLY, and XLV. SPY is the SPDR S&P 500 ETF. XLV is the
Health Care Select Sector SPDR Fund. XLY is the Consumer Discretionary Select Sector SPDR
Fund. We chose two sector ETFs alongside the general market ETF SPY in order to investigate
the impact of Twitter sentiment on specific market sectors.
Based on previous reports and SMA’s findings, Twitter sentiment seems to have more impact
on SPY and XLY, and a much weaker impact on XLV. We hypothesize that the general public’s
familiarity with and interest in well-known companies, which are heavily represented in SPY and XLY,
leads to a higher volume of tweets about them and hence a stronger effect. In later sections
we investigate this hypothesis further, based on a representative sample of stocks in each ETF.
The daily price data for the three ETFs were obtained from Yahoo Finance using Python and the
Yahoo API. Adjusted close prices are used for all return and rebalancing calculations, since we do
not separately calculate dividends as part of the portfolio return.
The market capitalization weights of the constituent stocks were obtained from the SPDR fund
prospectus on Feb 2nd, 2016. Even though the weights of individual stocks change every day due to
individual price movements, we use the weights on Feb 2nd, 2016 as the initial weights at
the beginning of our backtesting period. This assumes that the weights do not change
significantly throughout the history of the fund. In the backtesting, we also used the
same weights to simulate the growth of the original unenhanced SPY, XLY, and XLV, to have a fair
evaluation of our ETF design.
The historical data are divided into two parts. The first part is used as the training period for choosing
the best ETF design strategy. The second part is used as the testing period for backtesting. This is a
proposed improvement over the previous research: splitting the data helps avoid the bias of back-testing
on the same dataset used for strategy design.
To test the stability of the strategy across different training and rolling periods, we use a different data
range for each ETF, depending on the IPO (initial public offering) dates of the tickers
in the portfolio. For example, the IPO date for FB is May 18th, 2012. Because FB is an important
component of our portfolio, we decided to push the start of the backtesting range for SPY
from 2010 to Jan 1st, 2014. Therefore, the backtesting period for SPY is from Jan 1st, 2014 to Feb
12th, 2016, the last date of the sentiment data. Following this approach, the data range for XLV is
from Aug 1st, 2014 to Feb 12th, 2016, while the data range for XLY is from Jan 1st, 2014 to Feb
12th, 2016.
2.1.2 SMA data
The Twitter Sentiment data was obtained from SMA through various sentiment-score metrics
(S-Score) throughout the backtesting period. To further improve the robustness of our ETF design,
we back-test the ETF design over a longer period than the previous project.
Through a simple brute-force search and regression models with one and two independent
variables, we determined the best signal among the three S-Score metrics (AM, PM, and Delta)
and their respective 1-day, 2-day, and 3-day lags. We found that 2-day AM, 3-day AM, and 1-day
PM work best for SPY, XLV, and XLY respectively. This is intuitive, since Twitter sentiment
seems to reflect the development of information quite fast, especially for big and popular stocks,
which heavily dominate XLY. XLV is the least sensitive to sentiment data, and sentiment does not
enhance its returns as strongly as it does for the others. Another likely reason the lagged data retain some
explanatory power is the timing of the PM S-Score: it is generated close to the market close, so
people may not have enough time to place orders, and institutional investors may split their trades
into multiple smaller orders.
For further research, one can access SMA’s platform through www.socialmarketanalytics.com
to obtain real-time S-Scores and dynamically back-test the ETF design.
2.2 ETF Design
2.2.1 ETF rebalancing strategy
In order to earn excess return relative to the base ETFs, we strategically rebalance our ETFs
based on Twitter sentiment signals. When rebalancing SPY, XLV, and XLY, we
find it unnecessary to consider every single stock in the ETF. Take SPY as an example: it has
500 stocks, most of which make up only a small part of the total market capitalization. Considering the
management cost and the transaction cost, we choose to rebalance only the top market-capitalization
stocks.
We choose the 15 highest market capitalization stocks in SPY, which make up about one fourth of its
total market capitalization, and the 10 highest in each of XLV and XLY, which make up about half of the
total market capitalization of each fund. Specifically, the tickers of those stocks are:
• SPY: AAPL, MSFT, XOM, JNJ, GE, BRK-B, FB, T, WFC, PG, AMZN, GOOGL, GOOG,
JPM, VZ
• XLV: JNJ, PFE, MRK, GILD, AGN, UNH, AMGN, MDT, BMY, ABBV
• XLY: AMZN, HD, DIS, CMCSA, MCD, SBUX, NKE, PCLN, LOW, TWX
The weights of these stocks in the portfolios are:
Figure 2.1: Initial Weight
The representative subset of stocks is rebalanced whenever there is an extreme value of the
sentiment score. This subset is self-financing. The remaining stocks are
passively invested in the same way as in SPY, XLV, and XLY. We call this new fund an Enhanced
ETF (EETF).
2.2.2 Dynamic Confidence Interval
We construct a dynamic confidence interval (DCI) to decide when to rebalance. Unlike the
fixed threshold used in the previous study, the upper and lower bounds of the confidence
interval dynamically reflect fluctuations in the sentiment data. For instance, when extreme
sentiment scores are observed in succession, the bounds adapt and we avoid unnecessary successive
rebalancing.
We use a 30-day look-back period and a 95% confidence interval, and adjust the weight of the
corresponding stock whenever we observe an extreme value. When the sentiment score goes above the upper
bound of the confidence interval, we increase the weight of the corresponding stock by multiplying it by
a factor:
\[ \text{factor} = e^{(\text{sentiment score} - \text{confidence interval bound}) \times \text{scale}} \tag{2.1} \]
When the score falls below the lower bound, the same formula with the lower bound gives a factor below 1, and the weight is decreased.
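As a rough illustration of how such a rolling interval could produce a signal, the sketch below flags scores that fall outside the band. The report does not spell out how the 95% interval is constructed, so the sketch assumes a normal-approximation band (rolling mean plus or minus 1.96 sample standard deviations over the 30 preceding scores); `dci_bounds` and `signal` are hypothetical names.

```python
from statistics import mean, stdev

Z_95 = 1.96  # two-sided 95% normal quantile (assumed construction of the interval)


def dci_bounds(scores, t, lookback=30):
    """Return (lower, upper) for day t from the `lookback` scores before it, or None."""
    if t < lookback:
        return None  # not enough history to build the interval
    window = scores[t - lookback:t]
    centre, spread = mean(window), stdev(window)
    return centre - Z_95 * spread, centre + Z_95 * spread


def signal(scores, t, lookback=30):
    """+1 if the score is above the upper bound, -1 if below the lower bound, else 0."""
    bounds = dci_bounds(scores, t, lookback)
    if bounds is None:
        return 0
    lower, upper = bounds
    if scores[t] > upper:
        return 1
    if scores[t] < lower:
        return -1
    return 0


history = [1.0, -1.0] * 15  # thirty toy S-Scores with mean 0
```

Because the band is recomputed every day from the latest 30 scores, a run of extreme readings enters the window and shifts the bounds, which is the adaptive behaviour the section describes.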
A simple graph of AAPL’s DCI from 2011-12-31 to 2013-03-25 is shown below:
Figure 2.2: 500 days DCI for AAPL
We include a scale coefficient in the factor formula because we do
not want the factor to become too large. A large factor would raise the transaction cost every time we
adjust the portfolio, and would also raise concentration risk, since the diversification of the portfolio is reduced.
In Figure 2.2, the time span is around 500 days. The yellow line represents the upper bound of the DCI,
the green line represents the lower bound of the DCI, and the "twitter birds" represent the sentiment
scores. The red "twitter birds" that lie outside the bounds are exactly the sentiment scores
we want to catch. The previous research group used [−0.25, 1.25] as a threshold to capture the
sentiment score, and that threshold produces more out-of-bound incidents than the DCI model.
Another advantage of this model is that a DCI can be constructed for each stock. The distribution
of sentiment scores varies a lot across stocks, so it is unreasonable to use a fixed threshold for all of them. The
DCI model adapts to the sensitivity of any ticker, which addresses this problem. Finally, the scale
in the factors is chosen based on performance during strategy selection.
In conclusion, the DCI model provides a more reasonable fit to the sentiment score and significantly
reduces the rebalancing frequency.
2.3 Rebalancing Frequency
Taking transaction cost into account, rebalancing too frequently will reduce the accumulated profit.
As a result, a lower rebalancing frequency is preferred.
We consider bi-weekly, weekly, and daily rebalancing. For each period, the sentiment score
before the period is checked to decide the following action. Monthly and
quarterly rebalancing are omitted because SMA sentiment data is based on information collected from tweets.
Over such long periods the sentiment scores are not expected to remain relevant, given the twenty-day
look-back interval used to calculate the S-Score.
Figure 2.3: 50 days DCI for AAPL
Taking AAPL as an example again, we picked a 50-day period. The sentiment score is
captured every day, and for the daily rebalancing strategy we check each day whether the sentiment score is
outside the DCI. In this particular example, the 51st, 89th, 90th, 92nd, and 99th days are the days on which
we need to adjust AAPL in our portfolio. The 5-day strategy checks the sentiment score on every
grey vertical dashed line, which only captures the 90th day’s sentiment score. The 10-day strategy
checks the sentiment score on every black vertical line, and also captures only the 90th day’s sentiment
score.
In short, we will need to rebalance our portfolio 4 times in the daily strategy, and once in each of the 5-day
and 10-day strategies.
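The effect of the checking schedule can be sketched as follows. This is an illustration only: it assumes the plotted window runs from day 51 to day 100 and that the 5-day and 10-day check lines fall on multiples of the period, neither of which is stated explicitly in the report. The function name `captured_days` is hypothetical.

```python
def captured_days(out_of_bound_days, first_day, last_day, period):
    """Days on which a scheduled check coincides with an out-of-bound score.

    Assumption: checks happen on days that are multiples of `period`.
    """
    flagged = set(out_of_bound_days)
    return [day for day in range(first_day, last_day + 1)
            if day % period == 0 and day in flagged]


# Out-of-bound days listed in the AAPL example above.
out_of_bound = [51, 89, 90, 92, 99]

weekly = captured_days(out_of_bound, 51, 100, period=5)     # 5-day strategy
biweekly = captured_days(out_of_bound, 51, 100, period=10)  # 10-day strategy
```

Under these assumptions both schedules pick up only day 90, matching the description of the grey and black check lines.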
2.4 Limitation
The main limitation of this research is the timeframe of the data, which is restricted for two
reasons. First, the sentiment score data are only available from Dec. 1, 2011, so the timeframe of the
data is about four years. Moreover, for some stocks in XLV the sentiment score is only available
from 2013, which shrinks the timeframe to three years. Second, Facebook’s IPO date is May 18,
2012. As Facebook is one of the top 10 stocks by capitalization in SPY, it may have a large influence on the
ETF. Thus the timeframe for SPY starts from May 2012 (please see 2.2.1 ETF Rebalancing Strategy
for more details). In short, the timeframe for SPY, XLY, and XLV is approximately three to four years.
3. Strategy Selection
We explored many strategies during the training period, which is the first part of the historical
data available at this point, using factors including the AM S-Score, the PM S-Score, Delta (the change
in S-Score from PM to AM), and their 1-, 2-, and 3-day lags. We then selected the optimal strategy for each
ETF based on its performance in the corresponding training period, and used these strategies in the
back test to check their robustness in the testing period. For SPY, the training period is from May
21, 2012 to December 31, 2013. For XLY, the period is from December 1, 2011 to December 31,
2013, and for XLV, it is January 2, 2013 to July 31, 2014.
We have two components to adjust during the training period. The first and most
essential is the sentiment data type. As mentioned in 2.1.2 SMA data, the AM S-Score, PM
S-Score, and Delta are our major sentiment data. Counting their 1-day, 2-day, and 3-day lags as well,
there are 12 different sentiment data series in total from which we can build strategies.
Another consideration is the scale factor in equation (2.1). Because we use an
exponential function to generate the factor value, we do not know in advance how to scale the factors.
Thus, we use a brute-force search over the candidate scales during strategy design. The essential
idea is to keep the factor value reasonably small, so that we pay less in transaction costs and
limit concentration risk, since putting too much money in one stock reduces the diversification of the
portfolio.
Thus, fixing a sentiment data type, we tested 100 scales in the range [0.1, 0.5], separately for scores
above the upper bound and below the lower bound. Then, after verifying the scales on the other
sentiment data types, we concluded that the optimal scale is 0.4 for the upper bound and 0.3 for the lower
bound. More precisely, the optimal factors are:
\[ \text{factor}_{\text{up}} = e^{(\text{sentiment score} - \text{confidence interval upper bound}) \times 0.4} \]
\[ \text{factor}_{\text{down}} = e^{(\text{sentiment score} - \text{confidence interval lower bound}) \times 0.3} \]
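A minimal sketch of the two factors, using the scales 0.4 and 0.3 reported above. The function name `weight_factor` and the convention of returning 1 when the score is inside the interval are illustrative assumptions, consistent with the later statement that the factor is 1 when no adjustment is made.

```python
import math

SCALE_UP = 0.4    # scale for scores above the upper bound (from the text)
SCALE_DOWN = 0.3  # scale for scores below the lower bound (from the text)


def weight_factor(score, lower, upper):
    """Multiplicative weight factor for one stock on one day."""
    if score > upper:
        return math.exp((score - upper) * SCALE_UP)    # greater than 1: overweight
    if score < lower:
        return math.exp((score - lower) * SCALE_DOWN)  # less than 1: underweight
    return 1.0                                         # inside the DCI: no change


up = weight_factor(2.0, lower=-1.0, upper=1.0)     # one unit above the upper bound
down = weight_factor(-2.0, lower=-1.0, upper=1.0)  # one unit below the lower bound
```

A score one unit above the upper bound multiplies the weight by e^0.4, roughly 1.49, while a score one unit below the lower bound multiplies it by e^-0.3, roughly 0.74, so the asymmetric scales make upward adjustments somewhat stronger than downward ones.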
3.1 Algorithm
The method we constructed to calculate the portfolio return uses the weights of stocks instead of
the number of shares. The first and foremost reason is that using shares may give slightly different results for
different principal amounts, since share counts have to be whole numbers. Secondly, the turnover ratio
is easier to calculate, since we only need to compute the change of weight for each stock and sum the
changes.
However, the weights change even without any rebalancing, as prices change every trading
day. We therefore fix the initial weight of each stock to be
its weight in the corresponding ETF on the first trading day. Then, on every trading day, we adjust the
weight of each stock in the portfolio according to the new price, calculate the portfolio return
as the weighted average return of the stocks, and finally compare the sentiment data against the DCI to
determine whether to rebalance on that day.
Mathematically, the updated weight is computed as follows. Suppose stock $i$ in
the portfolio has initial weight $w_i^1$, and its prices on day $t$ and day $t+1$ are $P_i^t$ and $P_i^{t+1}$. Denote the
sentiment score factor on day $t$ for stock $i$ by $f_i^t$. Then the updated weight for stock $i$ on day
$t+1$ is:
\[ w_i^{t+1} = \frac{\dfrac{P_i^{t+1}}{P_i^t}\, w_i^t f_i^t}{\sum_{j=1}^{N} \dfrac{P_j^{t+1}}{P_j^t}\, w_j^t f_j^t} \tag{3.1} \]
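And the gross portfolio return from day $t$ to day $t+1$ is:
\[ \sum_{j=1}^{N} \frac{P_j^{t+1}}{P_j^t}\, w_j^t \tag{3.2} \]
Subtracting 1 gives the net daily return reported in the tables below.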
A simple example helps to explain how the whole algorithm works. Let’s randomly pick some
data and the factor of XLV:
Eastern Date Price. JNJ Price. PFE Price. MRK
2011/12/29 57.85 18.71 32.78
2011/12/30 57.58 18.65 32.75
2012/1/3 57.85 18.94 33.28
2012/1/4 57.5 18.76 33.31
2012/1/5 57.43 18.62 33.66
2012/1/6 56.93 18.59 33.42
Table 3.1: Adjusted Close Price Figure for JNJ, PFE and MRK
As shown in Table 3.1, six days of prices for three stocks (JNJ, PFE, MRK) have been chosen.
Applying the simple return formula
\[ r = \frac{P_t}{P_{t-1}} - 1 \]
we get the following return table:
Eastern Date Return. JNJ Return. PFE Return. MRK
2011/12/29 NA NA NA
2011/12/30 -0.0046 -0.0032 -0.0008
2012/1/3 0.0046 0.0152 0.0159
2012/1/4 -0.0061 -0.0091 0.001
2012/1/5 -0.0012 -0.0078 0.0104
2012/1/6 -0.0087 -0.0014 -0.007
Table 3.2: Daily Return for JNJ, PFE and MRK
The initial market capitalization weights for the three stocks on 2011/12/29 are:
Eastern Date Weight. JNJ Weight. PFE Weight. MRK Weight Sum
2011/12/29 0.46 0.31 0.23 1
The next step is to apply the weight formula (3.1). For example, for JNJ, the updated weight
for 2011/12/30 without rebalancing is:
\[ w_{\text{JNJ}}^{2011/12/30} = \frac{\dfrac{P_{\text{JNJ}}^{2011/12/30}}{P_{\text{JNJ}}^{2011/12/29}}\, w_{\text{JNJ}}^{2011/12/29} f_{\text{JNJ}}^{2011/12/29}}{\sum_{j=1}^{3} \dfrac{P_j^{2011/12/30}}{P_j^{2011/12/29}}\, w_j^{2011/12/29} f_j^{2011/12/29}} \]
In this case, the factor is 1 because no adjustment is made to the portfolio. The updated weights
on the following days, which are driven only by price changes, are shown below:
Eastern Date Weight. JNJ Weight. PFE Weight. MRK Weight Sum
2011/12/29 0.46 0.31 0.23 1
2011/12/30 0.46 0.31 0.23 1
2012/1/3 0.45 0.31 0.23 1
2012/1/4 0.45 0.31 0.24 1
2012/1/5 0.45 0.31 0.24 1
2012/1/6 0.45 0.31 0.24 1
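One day of the weight update (3.1) and the portfolio return (3.2) can be sketched as below, using the rounded prices and initial weights from the tables above. The helper name `step` is hypothetical, and because the published prices and weights are rounded to two decimals, later rows recomputed this way may differ from the tables in the last digit.

```python
def step(weights, prices_t, prices_t1, factors=None):
    """Apply equations (3.1) and (3.2) for one day.

    Returns (new_weights, net_return). With no factors supplied, every factor
    is 1 and the weights simply drift with prices.
    """
    if factors is None:
        factors = [1.0] * len(weights)
    gross = [p1 / p0 for p0, p1 in zip(prices_t, prices_t1)]
    # Equation (3.2): weighted gross return, using the weights held over the day.
    portfolio_gross = sum(g * w for g, w in zip(gross, weights))
    # Equation (3.1): scale by price change and factor, then renormalize to 1.
    scaled = [g * w * f for g, w, f in zip(gross, weights, factors)]
    total = sum(scaled)
    new_weights = [s / total for s in scaled]
    return new_weights, portfolio_gross - 1.0


initial = [0.46, 0.31, 0.23]                # JNJ, PFE, MRK on 2011/12/29
prices_1229 = [57.85, 18.71, 32.78]
prices_1230 = [57.58, 18.65, 32.75]

drifted, day_return = step(initial, prices_1229, prices_1230)
tilted, _ = step(initial, prices_1229, prices_1230, factors=[0.87, 1.0, 1.0])
```

The renormalization in (3.1) is what keeps the subset self-financing: lowering the JNJ factor below 1 automatically raises the weights of PFE and MRK so that the total stays at 1.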
Finally, apply the factor to calculate the adjusted weight by using formula (3.1):
Eastern Date Adjw.JNJ F.JNJ Adjw.PFE F.PFE Adjw.MRK F.MRK Adjw.Sum
2011/12/29 0.46 1 0.31 1 0.23 1 1
2011/12/30 0.42 0.87 0.33 1 0.25 1 1
2012/1/3 0.42 1 0.33 1 0.25 1 1
2012/1/4 0.42 1 0.33 1 0.25 1 1
2012/1/5 0.33 0.64 0.4 1 0.28 0.89 1
2012/1/6 0.33 1 0.4 1 0.28 1 1
Table 3.3: Adjusted Weight&Factor for JNJ, PFE and MRK
In Table 3.3, "F" denotes the factor for an individual stock and "Adjw" denotes its adjusted
weight. Applying the return formula (3.2) gives:
Portfolio Return (Original) Portfolio Return (Adjusted)
NA NA
-0.003351547 -0.003351547
0.010699512 0.011146603
-0.015461372 -0.005451914
-0.000339494 -0.000347152
-0.006128506 0.004486034
Table 3.4: Portfolio Return
A graph can make the comparison of results more straightforward:
Figure 3.1: Portfolio Daily Return&Net Value
Note that this 6-day sample was randomly picked from the whole dataset; it illustrates the
concept behind the strategy. On most of these days the adjustment enhanced the positive
returns or reduced the negative returns, and it increased the portfolio net value from 0.9853 to 1.0064 over the 6
days.
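The net values quoted above can be reproduced by compounding the daily returns in Table 3.4 from a starting value of 1. The function name `net_value` is a hypothetical helper for this illustration.

```python
def net_value(returns, start=1.0):
    """Compound a sequence of net daily returns into a final net value."""
    value = start
    for r in returns:
        value *= 1.0 + r
    return value


# Daily returns from Table 3.4 (the first row is NA and is skipped).
original = [-0.003351547, 0.010699512, -0.015461372, -0.000339494, -0.006128506]
adjusted = [-0.003351547, 0.011146603, -0.005451914, -0.000347152, 0.004486034]

original_nav = net_value(original)
adjusted_nav = net_value(adjusted)
```

Rounded to four decimals, the two results are 0.9853 and 1.0064, matching the figures in the text.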
We will discuss more about the comparison between the base ETFs and the Enhanced ETFs later
in section (4.3).
3.2 Training Period
For SPY, the performance of the 12 strategies, comprising Delta, AM S-Score, PM S-Score,
and their lags, is shown in the graph below:
Figure 3.2: SPY Training Period Net Value
It is difficult to choose the best strategy just by looking at the net value. Thus, we made a
table comparing the 12 strategies on several criteria:
Figure 3.3: SPY Training Period Strategy Comparison
As the table above shows, we use the net asset value, maximum drawdown, and Sharpe ratio
as the criteria for selecting the strategies. The critical criteria in the tables are highlighted in
different colors. For more detail on these criteria, refer to chapter 6.2, Appendix-Criteria Calculation.
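For readers who want to experiment with these criteria, the sketch below uses common textbook definitions of maximum drawdown and the annualized Sharpe ratio. The report's own definitions are in Appendix 6.2 and may differ; in particular, the zero risk-free rate and the 252 trading days per year used here are assumptions, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded net value, as a fraction."""
    value, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        value *= 1.0 + r
        peak = max(peak, value)
        worst = max(worst, 1.0 - value / peak)
    return worst


def sharpe_ratio(returns, risk_free_daily=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from daily returns (assumed conventions)."""
    excess = [r - risk_free_daily for r in returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)


toy_drawdown = max_drawdown([0.10, -0.50, 0.20])  # peak 1.10, trough 0.55
toy_sharpe = sharpe_ratio([0.01, 0.03])
```

In the toy series the net value falls from 1.10 to 0.55, so the maximum drawdown is 50% even though the last day recovers part of the loss.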
By repeating the same procedure, we get the graph and table for XLV and XLY below.
Figure 3.4: XLV Training Period
Figure 3.5: XLY Training Period
From red to green, the criteria become better. For SPY and XLV, it is obvious that the
2day_am_adjusted is the best one. This strategy uses 2-day lag AM sentiment score to derive
the adjusted factor. Whereas for XLY, the best strategy uses 3-day lag AM sentiment score instead of
the 2-day lag one to calculate the factors.
As for the cause of the 2-day lag, we assume that sentiment can drag the stock
price up or down. This is not an unwarranted assumption; there is a lot of historical evidence
for it. For instance, on April 5th, 2016, news was published that MOMO’s shareholder Alibaba had joined an
executive-led buyout bid¹. From April 5th to 6th, 2016, MOMO’s stock price went up over 30%.
However, the S-Score began to increase rapidly on April 2nd. Below is a graph from
SMA’s Trader’s Dashboard:
¹ http://www.wsj.com/articles/momo-shareholder-alibaba-joins-executive-led-buyout-bid-1459973237
Figure 3.6: Monitor for MOMO
As one can see, the S-Score began to increase rapidly 2 days before April 6th, and the stock price
began to increase on April 6th. This scenario supports the rationale of our strategies from another
point of view.
4. Back Test
4.1 Data Sample
The data samples used for backtesting differ slightly because the available data sets for SPY, XLY, and XLV
differ. The back test for SPY and XLY runs from January 1, 2014 to
February 12, 2016, and that for XLV runs from August 1, 2014 to February 12, 2016.
4.2 Transaction Cost Control
Because this enhanced ETF is meant to be a strategic-beta fund, controlling transaction cost
is essential. The standard definition of the turnover ratio is:
The turnover ratio is the percentage of a mutual fund or other investment vehicle’s
holdings that have been "turned over" or replaced with other holdings in a given year.
The type of mutual fund, its investment objective and/or the portfolio manager’s investing
style will play an important role in determining its turnover ratio1.
However, in our case, since transaction costs are incurred both when purchasing and when
selling a stock, we estimate turnover as the absolute change in the holdings of our portfolio
relative to its total market capitalization. For any particular day, the turnover ratio can be
calculated by:
\[ \sum_{i=1}^{\text{stocks}} \frac{\left| \text{Original Market Capitalization}_i - \text{New Market Capitalization}_i \right|}{\text{Market Capitalization}} \tag{4.1} \]
After some derivation, we obtained the following equation:
\[ \text{Turnover} = \sum_{t=1}^{\text{days}} \sum_{k=1}^{\text{stocks}} w_k^{t} \left(1 + r_k^{t+1}\right) \left| f_k^{t+1} - 1 \right| \tag{4.2} \]
More details about these formulas are stated in the Appendix Chapter 6.1, Algorithm Expansion.
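A small sketch of the turnover calculation follows. It assumes the absolute value of f − 1 is taken, so that buys and sells both count, in line with the absolute difference in (4.1); the function names and the toy inputs are hypothetical.

```python
def daily_turnover(weights, returns, factors):
    """One day's turnover: weight after the price move times the size of the adjustment."""
    return sum(w * (1.0 + r) * abs(f - 1.0)
               for w, r, f in zip(weights, returns, factors))


def total_turnover(days):
    """Sum daily turnover over an iterable of (weights, returns, factors) tuples."""
    return sum(daily_turnover(w, r, f) for w, r, f in days)


quiet_day = ([0.5, 0.5], [0.00, 0.00], [1.0, 1.0])   # no factor differs from 1
active_day = ([0.5, 0.5], [0.10, 0.00], [1.2, 1.0])  # first stock is overweighted

turnover = total_turnover([quiet_day, active_day])
```

Days on which every factor equals 1 contribute nothing, so a strategy that checks the DCI less often accumulates turnover on fewer days.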
¹ Read more: Turnover Ratio Definition | Investopedia http://www.investopedia.com/terms/t/turnoverratio.asp
4.3 Result
The back-test strategy for SPY uses the two-day lag AM S-Score. The net value graph
is shown below, with the criteria table for SPY and enhanced SPY beneath it.
The Daily Adjusted column is the strategy that checks every trading day to decide whether
an adjustment should be made. Similarly, the 5 Days Adjust and 10 Days Adjust columns
are the strategies that check every 5 trading days and every 10 trading days.
Figure 4.1: SPY Backtest
Recall the strategy for XLV is the 2 day lag AM S-Score as well.
Figure 4.2: XLV Backtest
Finally, the strategy for XLY is using 3 day lag AM S-Score.
Figure 4.3: XLY Backtest
Considering the three tables above, a basic conclusion is easy to support. For each strategy on each
ETF, it is highly likely that the higher the rebalancing frequency, the better the performance.
Nevertheless, when the frequency goes up from every 10 trading days to every 5 trading days, the
performance improves only slightly.
Since what we are trying to construct is a strategic-beta strategy, we need
to avoid frequent trading. Therefore, balancing the trade-off between performance and
transaction cost (as measured by the turnover ratio), we conclude that the 10-day adjustment strategy enhanced the original ETF
significantly on almost every measure with an acceptable turnover ratio, and is
an appropriate rebalancing frequency for all the ETFs.
Finally, we plot the densities of the returns in order to verify our conclusion and compare the base and
enhanced ETFs.
Figure 4.4: Density Plot Comparison
Both the density graphs and the skewness values show that our strategy moves the mean slightly to the right.
In short, the strategies work for all ETFs in the backtest period. The annualized excess returns relative to the base ETFs SPY, XLV, and XLY are 4.53%, 2.27%, and 7.09%, respectively. The turnover ratios are higher than those of the original ETFs: 10.94% for SPY, 18.3% for XLV, and 22.39% for XLY.
This result gives further evidence for our hypothesis in Section 2.1.2 about the sensitivity of different ETFs to sentiment data. It also highlights that trading strategies using sentiment data should be implemented with great caution, because stocks do not respond to sentiment uniformly.
5. Conclusion
Previous studies have shown that social market sentiment data is correlated with stock returns. Building on this idea, the previous practicum project showed that daily rebalancing of a stock portfolio can improve the risk/return profile of the whole portfolio.
Our findings further show that Twitter sentiment data can enhance ETF portfolio returns. A reasonable excess return can be achieved with weekly or bi-weekly rebalancing, so reducing share turnover and transaction cost does not require giving up much excess return.
The impact of sentiment data differs for each ETF. We think this is due to the structure of the ETFs and the bias in tweets. SPY and XLY are enhanced more by sentiment than XLV, possibly because the heaviest components of SPY and XLY have a much higher market capitalization than the remaining components. In addition, the top components of SPY and XLY are better known to the public and therefore attract more market sentiment.
Given these differences, it is important to model the dynamics of the tweets in both the temporal and the cross-sectional dimension. In this study, we use a dynamic 95% confidence interval of the S-Score to determine a signal for extreme values of the tweets of individual stocks. This confidence interval is constructed from the past 30 days of data. The interval evolves to capture the most recent development in the S-Score and avoids the unnecessary rebalancing that a fixed threshold would trigger. Indeed, we can significantly reduce the rebalancing frequency because tweets can have clustering effects.
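The rolling interval described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the interval is taken to be the mean plus or minus 1.96 standard deviations of the previous 30 S-Scores, the current day is excluded from its own look-back window, and the function name `dci_signal` and the sample data are hypothetical rather than taken from the project scripts.

```python
import numpy as np

def dci_signal(scores, window=30, z=1.96):
    """Flag days whose S-Score falls outside a rolling interval.

    Assumption: the interval is mean +/- z * std of the previous
    `window` scores. Returns +1 above the upper bound, -1 below the
    lower bound, and 0 otherwise (including the first `window` days).
    """
    scores = np.asarray(scores, dtype=float)
    signal = np.zeros(len(scores), dtype=int)
    for t in range(window, len(scores)):
        past = scores[t - window:t]          # look-back window, excludes day t
        mean, std = past.mean(), past.std(ddof=1)
        if scores[t] > mean + z * std:
            signal[t] = 1
        elif scores[t] < mean - z * std:
            signal[t] = -1
    return signal

# Hypothetical data: a quiet series with one spike on day 40.
rng = np.random.default_rng(0)
scores = rng.normal(0.0, 0.1, size=60)
scores[40] = 5.0
signal = dci_signal(scores)
```

Because the bounds are recomputed every day, a spike widens the interval for the following 30 days, which is one way the clustering of extreme tweets leads to fewer rebalancing signals than a fixed threshold.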
6. Appendix
6.1 Algorithm Expansion
Weight Formula
Now suppose we have a portfolio that contains stocks A, B and C, with $S_A$, $S_B$ and $S_C$ shares respectively. Denote the price of stock $j$ on day $i$ by $P_j^i$. Thus, on day 1, the stock prices are $P_A^1$, $P_B^1$, $P_C^1$.
In more detail, the price table is constructed below.
Date     A          B          C
Day 1    $P_A^1$    $P_B^1$    $P_C^1$
Day 2    $P_A^2$    $P_B^2$    $P_C^2$
Table 6.1: Price table
Denote $M_j^i$ as the market capitalization of stock $j$ on day $i$, and $w_j^i$ as the market-capitalization weight of stock $j$ on day $i$. In other words, $w_j^i = M_j^i / M^i_{\text{portfolio}}$, where $M^i_{\text{portfolio}} = \sum_{j=1}^{\text{stock numbers}} M_j^i$. Then, the portfolio return on day 2 for the example above is:
\[
\frac{P_A^2 S_A + P_B^2 S_B + P_C^2 S_C}{M^1_{\text{portfolio}}} - 1.
\]
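We now derive this formula.
\begin{align}
\frac{P_A^2 S_A + P_B^2 S_B + P_C^2 S_C}{M^1_{\text{portfolio}}} - 1
&= \frac{P_A^2 S_A}{M^1_{\text{portfolio}}} + \frac{P_B^2 S_B}{M^1_{\text{portfolio}}} + \frac{P_C^2 S_C}{M^1_{\text{portfolio}}} - 1 \tag{6.1}\\
&= \frac{P_A^2 S_A P_A^1}{M^1_{\text{portfolio}} P_A^1} + \frac{P_B^2 S_B P_B^1}{M^1_{\text{portfolio}} P_B^1} + \frac{P_C^2 S_C P_C^1}{M^1_{\text{portfolio}} P_C^1} - 1 \tag{6.2}\\
&= \frac{P_A^2}{P_A^1} \cdot \frac{S_A P_A^1}{M^1_{\text{portfolio}}} + \frac{P_B^2}{P_B^1} \cdot \frac{S_B P_B^1}{M^1_{\text{portfolio}}} + \frac{P_C^2}{P_C^1} \cdot \frac{S_C P_C^1}{M^1_{\text{portfolio}}} - 1 \tag{6.3}\\
&= \frac{P_A^2}{P_A^1} \cdot w_A^1 + \frac{P_B^2}{P_B^1} \cdot w_B^1 + \frac{P_C^2}{P_C^1} \cdot w_C^1 - 1 \tag{6.4}
\end{align}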
Therefore, it is reasonable to generalize the result. The portfolio return on day $k$ is:
\[
\text{Portfolio Return}_k = \sum_{j=1}^{\text{stock numbers}} \frac{P_j^k}{P_j^{k-1}} \cdot w_j^{k-1} - 1 \tag{6.5}
\]
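As a quick numerical check of equation (6.5), the sketch below compares the return computed from share holdings with the return computed from the previous day's weights. The share counts and prices are hypothetical values chosen for illustration, not data from the back test.

```python
import numpy as np

# Hypothetical three-stock example (A, B, C).
shares = np.array([10.0, 20.0, 5.0])
p1 = np.array([100.0, 50.0, 200.0])    # day-1 prices
p2 = np.array([110.0, 45.0, 210.0])    # day-2 prices

m1 = shares @ p1                        # day-1 portfolio market value
w1 = shares * p1 / m1                   # day-1 weights, which sum to 1

direct = shares @ p2 / m1 - 1           # return computed from holdings
via_weights = np.sum(p2 / p1 * w1) - 1  # return computed as in equation (6.5)
```

Both routes give the same number (here 50/3000), which is what the derivation in (6.1) to (6.4) states: price ratios weighted by the previous day's weights reproduce the holdings-based return.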
The next step is the weight adjustment. As mentioned in the report, even if we make no adjustments to the portfolio, the weights change every day because prices change. For the example above, the weight of stock A on day 2 is:
\[
w_A^2 = \frac{P_A^2 S_A}{P_A^2 S_A + P_B^2 S_B + P_C^2 S_C} \tag{6.6}
\]
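The derivation is as follows:
\begin{align}
w_A^2 &= \frac{P_A^2 S_A}{P_A^2 S_A + P_B^2 S_B + P_C^2 S_C} \tag{6.7}\\
&= \frac{P_A^2 S_A}{M^2_{\text{portfolio}}} \tag{6.8}\\
&= \frac{P_A^2 S_A \, M^1_{\text{portfolio}}}{M^2_{\text{portfolio}} \, M^1_{\text{portfolio}}} \tag{6.9}\\
&= \frac{\dfrac{P_A^2 S_A}{M^1_{\text{portfolio}}}}{\dfrac{P_A^2 S_A + P_B^2 S_B + P_C^2 S_C}{M^1_{\text{portfolio}}}} \tag{6.10}\\
&= \frac{\dfrac{P_A^2 S_A P_A^1}{M^1_{\text{portfolio}} P_A^1}}{\dfrac{P_A^2}{P_A^1} \cdot \dfrac{S_A P_A^1}{M^1_{\text{portfolio}}} + \dfrac{P_B^2}{P_B^1} \cdot \dfrac{S_B P_B^1}{M^1_{\text{portfolio}}} + \dfrac{P_C^2}{P_C^1} \cdot \dfrac{S_C P_C^1}{M^1_{\text{portfolio}}}} \tag{6.11}\\
&= \frac{\dfrac{P_A^2}{P_A^1} \cdot w_A^1}{\dfrac{P_A^2}{P_A^1} \cdot w_A^1 + \dfrac{P_B^2}{P_B^1} \cdot w_B^1 + \dfrac{P_C^2}{P_C^1} \cdot w_C^1} \tag{6.12}
\end{align}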
Therefore, from equation (6.12), we can write today's weight of any single stock as a function of yesterday's weights and the price changes. We then multiply this weight by the factor score to adjust our portfolio. For a single stock $k$ on day $i$, the weight is:
\[
w_k^i = \frac{\dfrac{P_k^i}{P_k^{i-1}} \cdot w_k^{i-1} \cdot \text{factor}_k^i}{\sum_{j=1}^{\text{stock numbers}} \dfrac{P_j^i}{P_j^{i-1}} \cdot w_j^{i-1} \cdot \text{factor}_j^i} \tag{6.13}
\]
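Equation (6.13) can be sketched as a small Python function. The name `update_weights` and the sample numbers are hypothetical, and a factor of 1 for every stock is assumed to mean "no adjustment", which reduces the formula to the pure price drift of equation (6.12).

```python
import numpy as np

def update_weights(w_prev, p_prev, p_now, factor=None):
    """Next-day weights following equation (6.13).

    `factor` defaults to 1 for every stock, which reproduces the
    price-drift-only weights of equation (6.12).
    """
    w_prev = np.asarray(w_prev, dtype=float)
    growth = np.asarray(p_now, dtype=float) / np.asarray(p_prev, dtype=float)
    if factor is None:
        factor = np.ones_like(w_prev)
    raw = growth * w_prev * np.asarray(factor, dtype=float)
    return raw / raw.sum()               # renormalize so the weights sum to 1

w1 = np.array([1 / 3, 1 / 3, 1 / 3])
p1 = np.array([100.0, 50.0, 200.0])
p2 = np.array([110.0, 45.0, 210.0])

w2_drift = update_weights(w1, p1, p2)                           # no adjustment
w2_tilted = update_weights(w1, p1, p2, factor=[1.2, 1.0, 1.0])  # overweight A
```

The renormalization in the last line of the function is what keeps the adjusted subset self-financing: raising the factor of one stock lowers the weights of the others.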
Turnover Ratio
The turnover of stock $k$ on day 2 is $|S_k^2 P_k^2 - S_k^1 P_k^2|$, where $S_k^1$ and $S_k^2$ are the shares of stock $k$ held on day 1 and day 2, and $P_k^2$ is the price of stock $k$ on day 2.
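Denote MC as market capitalization. Then we can write the following:
\begin{align}
\left| S_k^2 P_k^2 - S_k^1 P_k^2 \right| &= \left| MC_k^2 - S_k^1 P_k^2 \frac{P_k^1}{P_k^1} \right| \tag{6.14}\\
&= \left| MC_k^2 - S_k^1 P_k^1 \frac{P_k^2}{P_k^1} \right| \tag{6.15}\\
&= \left| MC_k^2 - MC_k^1 \frac{P_k^2}{P_k^1} \right| \tag{6.16}\\
&= \left| MC_k^2 - MC_k^1 (1 + r_k^2) \right| \tag{6.17}\\
&= \left| MC_k^1 (1 + r_k^2) f_k^2 - MC_k^1 (1 + r_k^2) \right| \tag{6.18}
\end{align}
The last step uses the fact that the next-period market capitalization is $MC_k^2 = MC_k^1 (1 + r_k^2) f_k^2$, that is, the original MC times the growth of MC due to price movement times the growth of MC due to weight change. Factoring out $MC_k^1$ gives:
\[
\left| MC_k^1 (1 + r_k^2) f_k^2 - MC_k^1 (1 + r_k^2) \right| = \left| MC_k^1 \left( (1 + r_k^2) f_k^2 - (1 + r_k^2) \right) \right| \tag{6.19}
\]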
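Hence, to calculate the monthly turnover ratio, we have:
\[
\frac{\sum_{i=1}^{\text{days}} \sum_{k=1}^{\text{stocks}} \left| MC_k^i (1 + r_k^{i+1})(f_k^{i+1} - 1) \right|}{\frac{1}{\text{days}} \sum_{k=1}^{\text{stocks}} MC_k^i}
\approx \sum_{i=1}^{\text{days}} \sum_{k=1}^{\text{stocks}} \left| w_k^i (1 + r_k^{i+1})(f_k^{i+1} - 1) \right| \tag{6.20}
\]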
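The weight-based approximation on the right-hand side of (6.20) can be sketched as below. The function name `turnover`, the array layout (one row per day, one column per stock), and the sample values are assumptions for illustration. The absolute value is applied per term, as in the numerator of (6.20), so that buys and sells do not cancel.

```python
import numpy as np

def turnover(weights, returns, factors):
    """Sum over days and stocks of |w * (1 + r) * (f - 1)|.

    weights[i, k] : weight of stock k on day i
    returns[i, k] : return of stock k from day i to day i+1
    factors[i, k] : growth of market cap due to the weight change on day i+1
    """
    weights, returns, factors = (np.asarray(a, dtype=float)
                                 for a in (weights, returns, factors))
    return np.abs(weights * (1.0 + returns) * (factors - 1.0)).sum()

# Hypothetical two-stock, two-day example: no trade on the first day,
# then 10% of one position is moved into the other on the second day.
weights = [[0.5, 0.5], [0.5, 0.5]]
returns = [[0.0, 0.0], [0.0, 0.0]]
factors = [[1.0, 1.0], [1.1, 0.9]]
total = turnover(weights, returns, factors)
no_trade = turnover(weights, returns, [[1.0, 1.0], [1.0, 1.0]])
```

In this example each side of the trade contributes 0.05, so the total is 0.1, while a day with all factors equal to 1 contributes nothing.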
6.2 Criteria Calculation
• Start date: The first date of the data we use to do the back test.
• End date: The last date of the data we use to do the back test.
• Net Asset Value: The value on the end date of a portfolio that was worth 1 on the start date.
• Total Trading Days: The number of days the stock market is open between start date and end
date.
• Successful rate: The fraction or percentage of trading days that made a profit among the total trading days.
• Maximum drawdown: A maximum drawdown (MDD) is the maximum loss from a peak to
a trough of a portfolio, before a new peak is attained. Maximum Drawdown (MDD) is an
indicator of downside risk over a specified time period.
• Annualized Return: Annualized returns are period returns re-scaled to a period of 1 year, 252
trading days.
• Sharpe ratio: The Sharpe ratio is the average return earned in excess of the risk-free rate per
unit of volatility or total risk, which can be calculated by the formula below:
\[
\text{Sharpe ratio} = \frac{\bar{r}_p - r_f}{\sigma_p}
\]
where $\bar{r}_p$ = expected portfolio return; $r_f$ = risk-free rate; $\sigma_p$ = portfolio standard deviation
• Sortino ratio: A modification of the Sharpe ratio that differentiates harmful volatility
from general volatility by taking into account the standard deviation of negative asset
returns, called downside deviation. The Sortino ratio subtracts the risk-free rate of
return from the portfolio’s return, and then divides that by the downside deviation. A
large Sortino ratio indicates there is a low probability of a large loss.
\[
\text{Sortino ratio} = \frac{\bar{r}_p - r_f}{\sigma_d}
\]
where $\bar{r}_p$ = expected portfolio return; $r_f$ = risk-free rate; $\sigma_d$ = standard deviation of negative asset returns (downside deviation)
• Calmar ratio: The Calmar ratio is a comparison of the average annual compounded
rate of return and the maximum drawdown risk of commodity trading advisors and
hedge funds. The lower the Calmar ratio, the worse the investment performed on a risk-adjusted basis over the specified time period; the higher the Calmar ratio, the better it performed.
\[
\text{Calmar ratio} = \frac{\bar{r}_p}{\text{MDD}}
\]
where $\bar{r}_p$ = expected portfolio return; MDD is the maximum drawdown
• Skewness: Skewness describes asymmetry from the normal distribution in a set of
statistical data.
• Mean of Daily Return: The simple mathematical average of daily returns generated in
our backtesting period.
• Standard Deviation of Daily Return: The Standard Deviation of Daily Return is a
measure of how spread out the daily returns are.
Remark: The definitions of the criteria above, including maximum drawdown, Sharpe ratio, Sortino ratio, Calmar ratio, and skewness, are taken directly from Investopedia (http://www.investopedia.com).
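The criteria above can be computed from a series of daily returns roughly as follows. This is a minimal sketch, not the project's script: the function name `backtest_criteria` and the sample returns are hypothetical, the Sharpe and Sortino ratios are left on a daily basis (no annualization is assumed), the risk-free rate defaults to zero, and the Calmar ratio uses the annualized return as the numerator.

```python
import numpy as np

TRADING_DAYS = 252  # trading days per year, as in the Annualized Return definition

def backtest_criteria(daily_returns, risk_free_daily=0.0):
    """Compute the back-test criteria from a series of daily returns."""
    r = np.asarray(daily_returns, dtype=float)
    nav = np.concatenate(([1.0], np.cumprod(1.0 + r)))  # worth 1 at the start
    peak = np.maximum.accumulate(nav)                   # running maximum of NAV
    mdd = np.max(1.0 - nav / peak)                      # largest peak-to-trough loss

    annualized = nav[-1] ** (TRADING_DAYS / len(r)) - 1.0
    excess = r.mean() - risk_free_daily
    downside = r[r < 0]                                 # negative returns only
    centered = r - r.mean()

    return {
        "net_asset_value": nav[-1],
        "success_rate": np.mean(r > 0),
        "max_drawdown": mdd,
        "annualized_return": annualized,
        "sharpe": excess / r.std(ddof=1),
        "sortino": excess / downside.std(ddof=1) if len(downside) > 1 else np.nan,
        "calmar": annualized / mdd if mdd > 0 else np.nan,
        "skewness": np.mean(centered ** 3) / np.mean(centered ** 2) ** 1.5,
    }

criteria = backtest_criteria([0.01, -0.02, 0.03, -0.01, 0.02])
```

For the five sample returns, the net asset value ends near 1.0295, three of five days are profitable, and the maximum drawdown is the 2% loss on the second day.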
6.3 Python Scripts
We made a table explaining the basic function of each of our Python scripts. We also uploaded all of them to Google Drive for download.
Remark: To access the complete Python scripts, use this link: https://drive.google.com/a/illinois.edu/folderview?id=0B7yL1RhMfiOQUnk5eWYydXUzamM&usp=sharing
Figure 6.1: Python Scripts Explanation
6.4 R Scripts
We also made a table explaining our R scripts. The uploaded files can be found on Google Drive at the link below.
Figure 6.2: R Scripts Explanation
Remark: To access the complete R scripts, use this link: https://drive.google.com/folderview?id=0B6n3np7uEHXNNFdZUE9pbHZ1MGc&usp=sharing
6.5 Suggestions For Further Research
We acknowledge that the critical downsides of the non-passive trading strategy are the bid/ask spread and transaction cost. As mentioned above, we use the turnover ratio as a proxy for transaction cost.
Since stock returns are highly correlated, it is worthwhile to model the correlation structure of the stock returns together with their correlation with the tweet data. We used Principal Component Analysis (PCA) to analyze the correlation matrix of all the constituent stocks in the ETFs and found that the first five to eight components explain most of the variance in the data. Further investigation of the top factor loadings of the principal components revealed that some industries/sectors are the main drivers of the index.
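A PCA of a return correlation matrix can be sketched with NumPy alone, as below. The data here are synthetic (ten hypothetical stocks driven by one common factor plus noise) and the function name `explained_variance_ratio` is illustrative, so the numbers say nothing about the actual ETF constituents.

```python
import numpy as np

def explained_variance_ratio(returns):
    """PCA on the correlation matrix of a (days x stocks) return array."""
    corr = np.corrcoef(np.asarray(returns, dtype=float), rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]   # sorted in descending order
    return eigenvalues / eigenvalues.sum()

# Synthetic data: 10 stocks driven by one common factor plus noise.
rng = np.random.default_rng(1)
common = rng.normal(size=(500, 1))
returns = 0.8 * common + 0.6 * rng.normal(size=(500, 10))

ratio = explained_variance_ratio(returns)
cumulative = np.cumsum(ratio)   # share of variance explained by the first n components
```

Reading off the first index at which `cumulative` passes a chosen level (for example 0.8) gives the number of components needed, which is the kind of count the paragraph above reports as five to eight.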
However, the drawback of this method is that PCA picks representative stocks without regard to their market capitalization. In fact, some of the companies can be too small compared to others. If we keep their initial weights similar to those in the original ETFs, the self-financing criterion becomes a problem whenever we need to adjust the weights. For instance, we may need to cash out most of the investment in a few small stocks to increase the weight of AAPL.
Therefore, we suggest further work on using PCA to create a highly influential subset of stocks and focusing on their market sentiment. We recommend a new set of reasonable weights, probably equal initial weights for these principal stocks, as the starting point. We are aware that this weighting scheme is subjective. A brute-force search could be used to find the set of weights that is practical, enhances returns, and further reduces the turnover ratio.
Moreover, for the 5-day and 10-day strategies, we simply picked the first day as the beginning date. For example, we have 792 days of sentiment factors for XLV, so for the 10-day strategy the checking dates are 1, 11, 21, ..., 791, and for the 5-day strategy they are 1, 6, 11, ..., 791. A better approach would be to compute every possible 5-day or 10-day grid and take the average. For example, the first group of checking dates for the 10-day strategy is 1, 11, 21, ..., 791, and the second group should be 2, 12, 22, ..., 792. Thus for every day's return we get 10 versions and then take the average. We can use this rolling daily return as a criterion to design the trading strategy. The same methodology applies directly to the 5-day strategy.
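The staggered grids can be generated as in the sketch below. The helper names `checking_dates` and `average_over_offsets` are hypothetical, and `run_backtest` stands for any callable that maps a list of checking dates to a number; the actual back-test routine is not reproduced here.

```python
def checking_dates(n_days, step, offset):
    """Checking dates (1-indexed) for one staggered schedule."""
    return list(range(1 + offset, n_days + 1, step))

def average_over_offsets(n_days, step, run_backtest):
    """Average a per-schedule result over all `step` possible start days.

    `run_backtest` is a hypothetical callable that takes a list of
    checking dates and returns a number (for example, a total return).
    """
    results = [run_backtest(checking_dates(n_days, step, offset))
               for offset in range(step)]
    return sum(results) / len(results)

first = checking_dates(792, 10, 0)    # 1, 11, 21, ..., 791
second = checking_dates(792, 10, 1)   # 2, 12, 22, ..., 792
mean_checks = average_over_offsets(792, 10, len)  # average number of checks per grid
```

Averaging over all start offsets removes the dependence of the result on which day happens to be chosen as day 1.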
Bibliography
[1] J. Blaschak, A. Blinov, J. Gits, F. Harfoush, and K. Myers. Systems and methods of detecting, measuring, and extracting signatures of signals embedded in social media data streams. U.S. Patent No. 9,104,734, Aug. 2015.
[2] G. Ranco, D. Aleksovski, G. Caldarelli, M. Grcar, and I. Mozetic. The effects of Twitter sentiment on stock price returns. 2015.
[3] Jason Hsu, Vitali Kalesnik, and Feifei Li. An investor’s guide to smart beta strategies. AAII Journal.
[4] Seung Lee, Chuanning Li, Yunqian Li, and Yoshifumi Ichikawa. Sentiment enhanced ETFs. MSFE Practicum Project Report, 2015.
[5] Ari Polychronopoulos. Building a better beta: Combining fundamentals weighting, low volatility, and momentum strategies. Research Affiliates Whitepaper, Oct. 2014.
[6] Robin Wigglesworth. Fund managers ready for ’smart beta’ wars. FT.com, 08 Feb. 2016.
Wish you all the best, UIUC MSFE & SMA group

More Related Content

Similar to main

Abdul+Rahim+wong+1st+year+phd+thesis.docx.pdf
Abdul+Rahim+wong+1st+year+phd+thesis.docx.pdfAbdul+Rahim+wong+1st+year+phd+thesis.docx.pdf
Abdul+Rahim+wong+1st+year+phd+thesis.docx.pdf
drwong3
 
Abdul+Rahim+wong+1st+year+phd+thesis.docx.pdf
Abdul+Rahim+wong+1st+year+phd+thesis.docx.pdfAbdul+Rahim+wong+1st+year+phd+thesis.docx.pdf
Abdul+Rahim+wong+1st+year+phd+thesis.docx.pdf
drwong3
 
Abdul Rahim wong 1st year phd thesis.docx
Abdul Rahim wong 1st year phd thesis.docxAbdul Rahim wong 1st year phd thesis.docx
Abdul Rahim wong 1st year phd thesis.docx
drwong3
 
3 30022 assessing_yourbusinessanalytics
3 30022 assessing_yourbusinessanalytics3 30022 assessing_yourbusinessanalytics
3 30022 assessing_yourbusinessanalytics
cragsmoor123
 
Markit dividend forecasts and their value
Markit dividend forecasts and their valueMarkit dividend forecasts and their value
Markit dividend forecasts and their value
Thomas Matheson
 
Chaintech BitTalk Series Episode 3
Chaintech BitTalk Series Episode 3Chaintech BitTalk Series Episode 3
Chaintech BitTalk Series Episode 3
Francois Zhang Mingqian
 
Pratik_Nawani_Executive Summary_J025_71112110004
Pratik_Nawani_Executive Summary_J025_71112110004Pratik_Nawani_Executive Summary_J025_71112110004
Pratik_Nawani_Executive Summary_J025_71112110004
Pratik Nawani
 
EN_METAMORG_SERVICES [Modo de compatibilidad]
EN_METAMORG_SERVICES [Modo de compatibilidad]EN_METAMORG_SERVICES [Modo de compatibilidad]
EN_METAMORG_SERVICES [Modo de compatibilidad]
Luis Martín
 
EN_METAMORG_SERVICES [Modo de compatibilidad]
EN_METAMORG_SERVICES [Modo de compatibilidad]EN_METAMORG_SERVICES [Modo de compatibilidad]
EN_METAMORG_SERVICES [Modo de compatibilidad]
Luis Martín
 

Similar to main (20)

IRJET- Stock Market Prediction using Machine Learning Techniques
IRJET- Stock Market Prediction using Machine Learning TechniquesIRJET- Stock Market Prediction using Machine Learning Techniques
IRJET- Stock Market Prediction using Machine Learning Techniques
 
Abdul+Rahim+wong+1st+year+phd+thesis.docx.pdf
Abdul+Rahim+wong+1st+year+phd+thesis.docx.pdfAbdul+Rahim+wong+1st+year+phd+thesis.docx.pdf
Abdul+Rahim+wong+1st+year+phd+thesis.docx.pdf
 
Abdul+Rahim+wong+1st+year+phd+thesis.docx.pdf
Abdul+Rahim+wong+1st+year+phd+thesis.docx.pdfAbdul+Rahim+wong+1st+year+phd+thesis.docx.pdf
Abdul+Rahim+wong+1st+year+phd+thesis.docx.pdf
 
Abdul Rahim wong phd thesis.docx
Abdul Rahim wong phd thesis.docxAbdul Rahim wong phd thesis.docx
Abdul Rahim wong phd thesis.docx
 
Abdul Rahim wong 1st year phd thesis.docx
Abdul Rahim wong 1st year phd thesis.docxAbdul Rahim wong 1st year phd thesis.docx
Abdul Rahim wong 1st year phd thesis.docx
 
Use of vectors in financial graphs
Use of vectors in financial graphsUse of vectors in financial graphs
Use of vectors in financial graphs
 
IRJET - Stock Market Analysis and Prediction
IRJET - Stock Market Analysis and PredictionIRJET - Stock Market Analysis and Prediction
IRJET - Stock Market Analysis and Prediction
 
OPENING RANGE BREAKOUT STOCK TRADING ALGORITHMIC MODEL
OPENING RANGE BREAKOUT STOCK TRADING ALGORITHMIC MODELOPENING RANGE BREAKOUT STOCK TRADING ALGORITHMIC MODEL
OPENING RANGE BREAKOUT STOCK TRADING ALGORITHMIC MODEL
 
3 30022 assessing_yourbusinessanalytics
3 30022 assessing_yourbusinessanalytics3 30022 assessing_yourbusinessanalytics
3 30022 assessing_yourbusinessanalytics
 
stock price prediction using sentiment analysis
stock price prediction using sentiment analysisstock price prediction using sentiment analysis
stock price prediction using sentiment analysis
 
Markit dividend forecasts and their value
Markit dividend forecasts and their valueMarkit dividend forecasts and their value
Markit dividend forecasts and their value
 
Productivity improvement through right governance
Productivity improvement through right governanceProductivity improvement through right governance
Productivity improvement through right governance
 
Chaintech BitTalk Series Episode 3
Chaintech BitTalk Series Episode 3Chaintech BitTalk Series Episode 3
Chaintech BitTalk Series Episode 3
 
Stock Market Prediction Using Artificial Neural Network
Stock Market Prediction Using Artificial Neural NetworkStock Market Prediction Using Artificial Neural Network
Stock Market Prediction Using Artificial Neural Network
 
Pratik_Nawani_Executive Summary_J025_71112110004
Pratik_Nawani_Executive Summary_J025_71112110004Pratik_Nawani_Executive Summary_J025_71112110004
Pratik_Nawani_Executive Summary_J025_71112110004
 
Analytics
AnalyticsAnalytics
Analytics
 
Directing Intelligence in retail
Directing Intelligence in  retailDirecting Intelligence in  retail
Directing Intelligence in retail
 
IRJET - Stock Recommendation System using Machine Learning Approache
IRJET - Stock Recommendation System using Machine Learning ApproacheIRJET - Stock Recommendation System using Machine Learning Approache
IRJET - Stock Recommendation System using Machine Learning Approache
 
EN_METAMORG_SERVICES [Modo de compatibilidad]
EN_METAMORG_SERVICES [Modo de compatibilidad]EN_METAMORG_SERVICES [Modo de compatibilidad]
EN_METAMORG_SERVICES [Modo de compatibilidad]
 
EN_METAMORG_SERVICES [Modo de compatibilidad]
EN_METAMORG_SERVICES [Modo de compatibilidad]EN_METAMORG_SERVICES [Modo de compatibilidad]
EN_METAMORG_SERVICES [Modo de compatibilidad]
 

main

  • 1. Application of Social Sentiment Factors In ETF Design Quoc Dung Cao Yikang Luo Lu Qiu Rajput Tanmay Miao Zhou Industry Sponsor: Social Market Analytics, Inc. May 2, 2016
  • 2. SMA PRACTICUM GROUP, UNIVERSITY OF ILLINOIS AT URBANA CHAMPAIGN This practicum research was performed under the supervision of Dr. Jeff Blaschak and Dr. Morton N. Lane. All market sentiment data was provided by Social Market Analytics, co. Inc. The stock prices were downloaded from finance.yahoo.com. This Latex Template was originally created by Mathias Legrand, which is the template of The Legrand Orange Book posted on http://www.latextemplates.com/template/the-legrand-orange- book, then edited by Andrea Hidalgo and published on http://www.overleaf.com/articles/ clustering-the-interstellar-medium/mtthgyyfrdkn#.Vw_lwMftrK4. We made some ad- justment when using it during this report. First release, April 2016
  • 3. Executive Summary Exchange traded funds (ETFs) track the performance of an index or a basket of assets. Strategic- beta ETFs seek to enhance the performance of such index tracking funds by acquiring exposure to specific market risk factors. This research modifies the existing S&P 500 sector tracking ETFs (SPY, XLV, and XLY) by enhancing the highest market capitalization subset of the ETFs based on social media sentiment while still passively investing in the remaining stocks of the ETFs. Analysis of the lexical content of Twitter posts regarding stocks is used to create a sentiment metric called the S-Score. We use a dynamic 95% confidence interval of the S-Score to determine a signal for extreme values of the tweets of individual stocks. This confidence interval is constructed using a rolling 30-day look-back interval. The interval can evolve to capture the most recent devel- opment in S-Score, thus provides a dynamic threshold to guide weight adjustments. This allows for significant reduction in the rebalancing frequency and transactions costs compared to a fixed threshold strategy. Our results indicate that sentiment enhanced ETFs yield better returns during a 2-year backtesting interval, and have more favorable Sharpe Ratios compared to the unenhanced benchmark ETFs. A variety of strategies for rebalancing – weekly, bi-weekly as well as daily adjustment, allow for varying degrees of “active” beta management of the ETF portfolios.
  • 4.
  • 5. Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.1 Social Market Analytics, Inc.(SMA) 7 1.2 ETF 8 1.3 Sentiment Score 8 1.4 Strategic Beta 8 1.5 Previous Research 9 2 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.1 Data 11 2.1.1 Benchmarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.1.2 SMA data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.2 ETF Design 12 2.2.1 ETF rebalancing strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.2.2 Dynamic Confidence Interval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.3 Rebalancing Frequency 14 2.4 Limitation 15 3 Strategy Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.1 Algorithm 18 3.2 Training Period 21
  • 6. 6 4 Back Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 4.1 Data Sample 25 4.2 Transaction Cost Control 25 4.3 Result 26 5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 6 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 6.1 Algorithm Expansion 33 6.2 Criteria Calculation 35 6.3 Python Scripts 36 6.4 R Scripts 37 6.5 Suggestions For Further Research 38
  • 7. 1. Introduction 1.1 Social Market Analytics, Inc.(SMA) SMA is a social media start-up based in Chicago, Illinois since 2012, providing services to financial institutions, active traders, and investors. One of the main product suites of SMA is the Sentiment Signature Feed. SMA Sentiment Signature Feed structures and quantifies Social Media information and provides actionable intelligence for financial markets. SMA’s patented product provides a Sentiment Signature for the universe of tradable equities. The engine Extracts the right data, Evaluates the content, and Calculates the statistics customers need for alerts, trading algorithms, and business intelligence. The Twitter feed has grown to nearly 500 million Tweets per day. The indicative Tweets used in the SMA calculations are growing even faster at 10% per month, as professionals use Twitter for news distribution, and general business utilization expands. No firm can ignore Twitter’s significance for the financial market. Until now there has been no efficient method for market professionals to intelligently quantify and measure the vast amount of information embedded in the Twitter message stream. By cleaning, filtering, and quantifying social media data, Social Market Analytics provides traders with reliable actionable intelligence before their competitors hear about it in traditional media. SMA’s market sentiment metrics are comprised of: • S-Score: Time-weighted representation of a security’s sentiment time series, normalized over a standard look-back interval. • SV-Score: a security’s indicative tweet volume time series, normalized over a standard look- back interval • S-Buzz: a measure of unusual social media activity relative to a universe of stocks • S-Volume: is the volume of indicative tweets contributing to metrics calculations at an obser- vation time. • S-Dispersion: is a measurement of the tweet source diversity contributing to an S-Score
  • 8. 8 Chapter 1. Introduction • S-Mean: Weighted average of the sentiment time series for a given security • S-Volatility: is a percent measurement of the variability of the sentiment level. • S-Delta: is the percent change in S-Score, a first order measurement of the sentiment trend. 1.2 ETF Exchange traded funds (ETFs) track the performance of an index or a basket of assets. With high liquidity and low commissions, ETFs provide an attractive product for investors. ETFs are one of the fastest growing investment products currently available in the market. Beta is the financial term for the sensitivity of the asset’s return to the market’s return, while “alpha” is excess returns skilled investors can generate[6]. An ETF tracking an index such as S&P500 index provides investors with almost 1 to 1 beta exposure to the index with minimal expense for fund management. 1.3 Sentiment Score Sentiment data from SMA are derived from Tweets captured from thousands of Twitter accounts expressing commentary on stocks and the markets[1]. SMA servers compute metrics called S-Factors to quantify equity sentiment for investment purposes. S-Score Calculation: For a particular time in every 24-hour window, for N informative tweets, raw sentiment is calculated as follows, S(t) = N ∑ i=1 sentiment_tweet(ti) Calculating the mean and variance in 20 days look-back period, the raw sentiment is standardized: S-Score(t) = S(t)−u(S(t)) σ(S(t)) where u and σ represent the mean and standard deviation of raw sentiments during the past 20 days. AM/PM Data Points: Regarding this research, we acquire two types of S-Score every market day. The AM data are generated at 9:10 AM (EST), before the market opens and the PM data is generated at 2:55 PM (EST), 65 minutes before the market closes. This research relies on S-Score to establish the threshold of weight-adjustments for the enhanced ETFs. 
1.4 Strategic Beta Since 2005, the market has witnessed the growth of a new class of ETF products known as strategic beta, especially appealing to investors looking for the better returns than traditional ETFs[5]. Two major characteristics of strategic-beta strategy are anti-traditional weighting methodology and transparent quantitative weighting methods[3]. It is a mixture of active and passive portfo- lio management technique. By “active”, investors can subjectively choose certain risk factors to take; and by “passive”, they only follow a predefined index to get exposure to the factor. In traditional passive portfolio management, investors have exposure to market risk by investing in SPY, the ETF that tracks the performance of S&P 500. In strategic-beta strategies, fund managers
  • 9. 1.5 Previous Research 9 are more flexible in picking risk factors, for example, fundamental weighting, low variance and so on. However, strategic-beta ETFs focused on the market sentiment have not yet been fully explored, despite contemporary research that shows significant dependence between the Twitter sentiment and abnormal returns during peak Twitter volume (e.g. quarterly earnings announcements) as well as around less obvious but “event-driven” peaks[2]. 1.5 Previous Research Previous practicum project in fall 2015 performed research to enhance ETF return based on daily Twitter sentiment data. It concluded that daily rebalancing of the portfolio was a profitable strategy in excess returns, Sharpe ratio, and turnover ratio in all SPY, XLV, and XLY.[4] Specifically, the research optimized a fixed threshold of S-Score for each ETF, for all of the constituent stocks, and buy/sell the stocks whenever the daily S-Score exceeds/falls below the up- per/lower bounds of the threshold. In this current research, we improved the previous findings to reduce the rebalancing frequency, increase the flexibility of the S-Score thresholds to adapt to the latest sentiment development, and model individual thresholds for each constituents in the ETF.
  • 10.
  • 11. 2. Methodology 2.1 Data This section presents background on the data used throughout the project. It describes the sources, choice of data, and explanation of the three ETFs and the Twitter sentiment data. 2.1.1 Benchmarks The three chosen ETFs are SPY, XLY, and XLV. SPY is the SPDR S&P 500 ETF. XLV is the Health Care Select Sector SPDR Fund. XLY is the Consumer Discretionary Select Sector SPDR Fund. The motivation behind the choice of different industries ETFs and the general market ETF SPY is to control and investigate more specific impact on Twitter sentiments on different market sectors. Based on the previous reports and SMA’s findings, Twitter sentiment seems to have more impact on SPY and XLY funds, with much slighter impacts on XLV fund. We hypothesize that the familiar- ity and interest of the general public tweets bring the higher effect and possibly volume of tweets to the more well-known companies, which heavily present in SPY and XLY funds. In the later sections, we shall investigate further this hypothesis, based on a representative sample of stocks in each ETF. The daily price data for the three ETFs were obtained from Yahoo Finance using Python and Yahoo API. Adjusted closed price will be used for all return and rebalancing calculations as we will not separately calculate dividend as a part of the portfolio return. The market capitalization weights of constituent stocks were obtained from SPDR fund prospec- tus on Feb 2nd, 2016. Even though the weights of individual stocks change everyday due to individual price movement, we assume the weight on Feb 2nd, 2016 will be the initial weight used at the beginning of our backtesting period. This is under the assumption that the weights do not change significantly throughout the history of the fund. Nevertheless, in the backtesting, we also used the same weights to simulated the growth of the original unenhanced SPY, XLY, and XLV to have a fair
  • 12. 12 Chapter 2. Methodology evaluation of our ETF design. The historical data is divided into 2 parts. The first part is used as training period for choosing the best ETF design strategy. The second part is used as testing period for backtesting. This is a proposed improvement from previous research. Splitting the data help avoid bias when we back-test on the same dataset used for strategy design. As for testing stability of strategy through different training periods and rolling periods, the data range we used are different, based on the different IPO (initial public offerings) time for each tickers in the portfolio. For example, the IPO date for FB is May 18th, 2012. After taken into consideration that FB is an important component in our portfolio, we decide to push the backtesting range for SPY from 2010 to Jan 1st, 2014. Therefore, the backtesting period for SPY is from Jan 1st, 2014 to Feb 12th, 2016, which is the ending sentiment data. Following this approach, the data range for XLV is from Aug 1st, 2014 to Feb 12th, 2016, while the data range for XLY is from Jan 1st, 2014 to Feb 12th, 2016. 2.1.2 SMA data The Twitter Sentiment data was obtained from SMA through various sentiment-score metrics (S-Score) throughout the backtesting period. To further improve the robustness of our ETF design, we back-test the ETF design over a longer period than the previous project. Through simple brute force search and regression models such as single and double independent variables, we determined the best signal among the three S-Score metrics (the AM, PM, and Delta) and their respective 1-day, 2-day, and 3-day lags. We found that 2-day AM, 3-day AM, and 1-day PM work best for SPY, XLV, and XLY respectively. This is very intuitive, since Twitter sentiment seems to reflect the development of information quite fast, especially for big and popular stocks, which heavily dominate XLY. 
XLV is the least sensitive to sentiment data and sentiment also does not enhance its returns as strongly as others. Another reason the lagged data still have some explanatory power is probably due to the construction of the PM S-Score, near to the market close time and people may not have enough time to place orders or split their trades into multiple smaller orders for the case of institutional investors. For further research, one can access to SMA’s platform through www.socialmarketanalytics. com to obtain real-time S-Score to dynamically back-test the ETF design. 2.2 ETF Design 2.2.1 ETF rebalancing strategy In order to earn excess return relative to the base ETFs, we strategically rebalancing our ETFs based on Twitter sentiment signals. While considering the rebalance of SPY, XLV and XLY, we find it unnecessary to consider every single stock in the ETF. Take SPY as an example, which has 500 stocks, with each comprising a small part of the whole market capitalization. Considering the management cost and the transaction cost, we choose to rebalance the top market-capitalization stocks.
  • 13. We choose the 15 highest market capitalization stocks in SPY, which account for about one fourth of its total market capitalization, and the 10 highest in each of XLV and XLY, which account for about half of the total market capitalization. Specifically, the tickers of those stocks are:
• SPY: AAPL, MSFT, XOM, JNJ, GE, BRK-B, FB, T, WFC, PG, AMZN, GOOGL, GOOG, JPM, VZ
• XLV: JNJ, PFE, MRK, GILD, AGN, UNH, AMGN, MDT, BMY, ABBV
• XLY: AMZN, HD, DIS, CMCSA, MCD, SBUX, NKE, PCLN, LOW, TWX
The weights of these stocks in the portfolios are shown in Figure 2.1.
Figure 2.1: Initial Weight
The representative subset of stocks is rebalanced whenever there is an extreme value of the sentiment score. This subset is self-financed, that is, weight is only moved among the stocks in the subset. The remaining stocks are passively invested in the same way as in SPY, XLV, and XLY. We call this new fund an Enhanced ETF (EETF).
2.2.2 Dynamic Confidence Interval
We construct a dynamic confidence interval (DCI) to decide when to rebalance. Unlike the fixed threshold used in the previous study, the upper and lower bounds of the confidence interval dynamically reflect the fluctuation of the sentiment data. For instance, when extreme values of the sentiment score are observed successively, the bounds widen and we can avoid unnecessary successive rebalancing.
We use a 30-day look-back period and a 95% confidence interval, and we adjust the weight of the corresponding stock whenever we receive an extreme value. When the sentiment score goes above the upper bound of the confidence interval, we increase the weight of the corresponding stock by multiplying it by a factor; when the score falls below the lower bound, the same formula yields a factor below 1 and the weight is decreased:
$$ \text{factor} = e^{(\text{sentiment score} - \text{confidence interval bound}) \times \text{scale}} \tag{2.1} $$
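The rebalancing signal described above can be sketched in a few lines of Python. The report does not spell out how the 95% interval is computed, so this sketch assumes a rolling mean plus or minus 1.96 sample standard deviations over the previous 30 scores; the function names `dci_bounds` and `dci_signals` are illustrative and are not taken from the authors' scripts.

```python
from statistics import mean, stdev


def dci_bounds(history, z=1.96):
    """Lower and upper bound from a window of past S-Scores.

    Assumption: the interval is mean +/- z * sample standard deviation.
    """
    m = mean(history)
    s = stdev(history)
    return m - z * s, m + z * s


def dci_signals(scores, lookback=30, z=1.96):
    """Return +1 (above upper bound), -1 (below lower bound) or 0 per day.

    Each day is compared with bounds built from the previous `lookback`
    scores only, so the threshold moves with the recent data.
    Days without enough history get a signal of 0.
    """
    signals = []
    for t, score in enumerate(scores):
        if t < lookback:
            signals.append(0)
            continue
        lower, upper = dci_bounds(scores[t - lookback:t], z)
        if score > upper:
            signals.append(1)
        elif score < lower:
            signals.append(-1)
        else:
            signals.append(0)
    return signals
```

Because the bounds are recomputed from the trailing window, a run of extreme scores widens the interval, which is the mechanism that suppresses repeated rebalancing.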
  • 14. A simple graph of AAPL's DCI from 2011-12-31 to 2013-03-25 is shown below:
Figure 2.2: 500 days DCI for AAPL
We include a scale coefficient in the factor formula because we do not want the factor to become too large. If it did, every adjustment based on that factor would increase transaction cost and risk, since the diversification of the portfolio would be reduced.
Figure 2.2 covers around 500 days. The yellow line represents the upper bound of the DCI, the green line represents the lower bound of the DCI, and the "twitter birds" represent the sentiment score. The red "twitter birds" that fall outside the bounds are exactly the sentiment scores we want to catch. The previous research group used [−0.25, 1.25] as a fixed threshold to capture the sentiment score, and the number of out-of-bound incidents under that threshold is larger than under the DCI model.
Another advantage of this model is that a DCI can be constructed for each stock. The distribution of the sentiment score varies a lot across stocks, so it is unreasonable to use one fixed threshold for all of them. The DCI model adapts to the sensitivity of each ticker, which addresses this problem. Finally, the scale in the factor is chosen based on performance during strategy selection.
In conclusion, the DCI model provides a more reasonable fit to the sentiment score and significantly reduces the rebalancing frequency.
2.3 Rebalancing Frequency
Taking transaction cost into account, rebalancing too frequently reduces the accumulated profit. As a result, a lower rebalancing frequency is preferred.
  • 15. We consider bi-weekly, weekly, and daily rebalancing. At the start of each period, the most recent sentiment score is checked to decide whether to adjust the portfolio. Monthly and quarterly rebalancing are omitted because SMA sentiment data is based on information collected from tweets: over such a long period the sentiment scores are not expected to remain relevant, given the twenty-day look-back interval used to calculate the S-Score.
Figure 2.3: 50 days DCI for AAPL
Taking AAPL as an example again, we picked a 50-day period. The sentiment score is captured every day, and for the daily rebalancing strategy we check every day whether the sentiment score is outside the DCI. In this particular example, the 51st, 89th, 90th, 92nd, and 99th days are exactly the days on which we need to adjust AAPL in our portfolio. The 5-day strategy checks the sentiment score on every grey vertical dashed line, which only captures the 90th day's sentiment score. The 10-day strategy checks the sentiment score on every black vertical line, and also only captures the 90th day's sentiment score. In short, we will need to rebalance our portfolio 4 times in the daily strategy, and once in each of the 5-day and 10-day strategies.
2.4 Limitation
The main limitation of this research is the timeframe of the data, which is constrained by two factors. First, the sentiment score data is only available from Dec. 1, 2011, so the timeframe of the data is about four years. Moreover, for some stocks in XLV the sentiment score is only available from 2013, which shrinks the timeframe to three years. Second, Facebook's IPO date is May 18, 2012. As Facebook is one of the top 10 stocks by capitalization in SPY, it may have a larger influence on the
  • 16. ETF. Thus the timeframe for SPY starts from May 2012 (please see 2.2.1 ETF Rebalancing Strategy for more details). In short, the timeframe for SPY, XLY and XLV is approximately three to four years.
  • 17. 3. Strategy Selection
We explored many strategies during the training period, which is the first part of the historical data available at this point, using factors including the AM S-Score, the PM S-Score, Delta (the change of S-Score from PM to AM), and their 1-, 2-, and 3-day lags. We then select the optimal strategy for each ETF based on its performance in the corresponding training period, and back-test these strategies in the testing period to check their robustness. For SPY, the training period is from May 21, 2012 to December 31, 2013. For XLY, the period is from December 1, 2011 to December 31, 2013, and for XLV it is from January 2, 2013 to July 31, 2014.
We have two components to adjust during the training period. The first and most essential is the sentiment data type. As mentioned in 2.1.2 SMA data, the AM S-Score, PM S-Score and Delta are our major sentiment data. Counting their 1-day, 2-day and 3-day lags as well, there are 12 different sentiment series in total from which we can build strategies.
The second consideration is the scale factor in equation (2.1). Because we use an exponential function to generate the factor value, we do not know in advance how to scale the factors, so we use a brute-force search over the candidate values during strategy design. The essential idea is to keep the factor value reasonably small, so that we pay less in transaction cost and limit the risk from reduced diversification: if we put all the money in one stock, the portfolio is no longer diversified. Fixing a sentiment data type, we tested 100 scales from [0.1, 0.5], separately for scores higher than the upper bound and scores lower than the lower bound. After verifying the scales on the other sentiment data types, we conclude that the optimal scale is 0.4 for the upper bound and 0.3 for the lower bound.
More precisely, the optimal choice of scale factors is:
$$ \text{factor}_{\text{up}} = e^{(\text{sentiment score} - \text{confidence interval upper bound}) \times 0.4} $$
$$ \text{factor}_{\text{down}} = e^{(\text{sentiment score} - \text{confidence interval lower bound}) \times 0.3} $$
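As a small illustration of the two formulas above, the following Python sketch evaluates the factor for a given score and a given pair of bounds. The names `weight_factor`, `SCALE_UP`, `SCALE_DOWN` and `candidate_scales` are hypothetical, and the evenly spaced grid is only an assumption about how the 100 scales in [0.1, 0.5] might have been chosen.

```python
import math

SCALE_UP = 0.4    # reported optimal scale for scores above the upper bound
SCALE_DOWN = 0.3  # reported optimal scale for scores below the lower bound

# Assumption: the 100 candidate scales are evenly spaced on [0.1, 0.5].
candidate_scales = [0.1 + 0.4 * i / 99 for i in range(100)]


def weight_factor(score, lower, upper):
    """Multiplicative weight factor for one stock on one day.

    Above the upper bound the exponent is positive (factor > 1);
    below the lower bound it is negative (factor < 1);
    inside the interval the weight is left unchanged (factor = 1).
    """
    if score > upper:
        return math.exp((score - upper) * SCALE_UP)
    if score < lower:
        return math.exp((score - lower) * SCALE_DOWN)
    return 1.0
```

For example, a score one unit above the upper bound gives a factor of e^0.4, roughly 1.49, while a score one unit below the lower bound gives e^-0.3, roughly 0.74, so the asymmetric scales make downward adjustments milder than upward ones.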
  • 18. 3.1 Algorithm
The method we constructed to calculate the portfolio return uses the weights of the stocks instead of the number of shares. The first and foremost reason is that using shares may give slightly different results for different amounts of principal, since shares have to be whole numbers. Secondly, the calculation of the turnover ratio is easier, since we only need to calculate the change of weight for each stock and sum the changes.
However, the weights change even without any rebalancing, because prices change every trading day. We therefore fix the initial weight of each stock to be its weight in the corresponding ETF on the first trading day. Then, on every trading day, we update the weight of each stock in the portfolio according to the new price, calculate the return of the portfolio, which is the weighted average return of the stocks, and finally look at the sentiment data DCI to determine whether to rebalance on that day.
Mathematically, suppose stock i is in the portfolio with initial weight $w_i^1$, and its prices on day t and day t+1 are $P_i^t$ and $P_i^{t+1}$. Denote the sentiment score factor on day t for stock i by $f_i^t$. The updated weight of stock i on day t+1 is then:
$$ w_i^{t+1} = \frac{\dfrac{P_i^{t+1}}{P_i^{t}}\, w_i^{t}\, f_i^{t}}{\sum_{j=1}^{N} \dfrac{P_j^{t+1}}{P_j^{t}}\, w_j^{t}\, f_j^{t}} \tag{3.1} $$
And the portfolio return is:
$$ \sum_{j=1}^{N} \frac{P_j^{t+1}}{P_j^{t}}\, w_j^{t} - 1 \tag{3.2} $$
A simple example helps to explain how the whole algorithm works. Let us pick some data and factors from XLV:
Eastern Date   Price.JNJ   Price.PFE   Price.MRK
2011/12/29     57.85       18.71       32.78
2011/12/30     57.58       18.65       32.75
2012/1/3       57.85       18.94       33.28
2012/1/4       57.5        18.76       33.31
2012/1/5       57.43       18.62       33.66
2012/1/6       56.93       18.59       33.42
Table 3.1: Adjusted Close Price for JNJ, PFE and MRK
  • 19. As Table 3.1 shows, 6 days of prices for 3 stocks (JNJ, PFE, MRK) have been chosen. By applying the simple return formula
$$ r = \frac{P_t}{P_{t-1}} - 1 $$
we get the following return table:
Eastern Date   Return.JNJ   Return.PFE   Return.MRK
2011/12/29     NA           NA           NA
2011/12/30     -0.0046      -0.0032      -0.0008
2012/1/3       0.0046       0.0152       0.0159
2012/1/4       -0.0061      -0.0091      0.001
2012/1/5       -0.0012      -0.0078      0.0104
2012/1/6       -0.0087      -0.0014      -0.007
Table 3.2: Daily Return for JNJ, PFE and MRK
The initial market capitalization weights of the three stocks on 2011/12/29 are:
Eastern Date   Weight.JNJ   Weight.PFE   Weight.MRK   Weight Sum
2011/12/29     0.46         0.31         0.23         1
The next step is to apply the weight formula (3.1). For JNJ, for example, the updated weight for 2011/12/30 is:
$$ w_{\text{JNJ}}^{2011/12/30} = \frac{\dfrac{P_{\text{JNJ}}^{2011/12/30}}{P_{\text{JNJ}}^{2011/12/29}}\, w_{\text{JNJ}}^{2011/12/29}\, f_{\text{JNJ}}^{2011/12/29}}{\sum_{j=1}^{3} \dfrac{P_{j}^{2011/12/30}}{P_{j}^{2011/12/29}}\, w_{j}^{2011/12/29}\, f_{j}^{2011/12/29}} $$
The factor is 1 when no adjustment is made to the portfolio. The updated weights on the following days, driven only by the change of price, are then:
Eastern Date   Weight.JNJ   Weight.PFE   Weight.MRK   Weight Sum
2011/12/29     0.46         0.31         0.23         1
2011/12/30     0.46         0.31         0.23         1
2012/1/3       0.45         0.31         0.23         1
2012/1/4       0.45         0.31         0.24         1
2012/1/5       0.45         0.31         0.24         1
2012/1/6       0.45         0.31         0.24         1
Finally, apply the factor to calculate the adjusted weight by using formula (3.1):
  • 20.
Eastern Date   Adjw.JNJ   F.JNJ   Adjw.PFE   F.PFE   Adjw.MRK   F.MRK   Adjw.Sum
2011/12/29     0.46       1       0.31       1       0.23       1       1
2011/12/30     0.42       0.87    0.33       1       0.25       1       1
2012/1/3       0.42       1       0.33       1       0.25       1       1
2012/1/4       0.42       1       0.33       1       0.25       1       1
2012/1/5       0.33       0.64    0.4        1       0.28       0.89    1
2012/1/6       0.33       1       0.4        1       0.28       1       1
Table 3.3: Adjusted Weight & Factor for JNJ, PFE and MRK
In the table above, "F" denotes the factor for an individual stock and "Adjw" denotes the adjusted weight of an individual stock. By applying the return formula (3.2), we obtain:
Portfolio Return Original   Portfolio Return Adjusted
NA                          NA
-0.003351547                -0.003351547
0.010699512                 0.011146603
-0.015461372                -0.005451914
-0.000339494                -0.000347152
-0.006128506                0.004486034
Table 3.4: Portfolio Return
A graph makes the comparison of results more straightforward:
Figure 3.1: Portfolio Daily Return & Net Value
Note that this 6-day sample was picked from the whole dataset, and its result is consistent with the idea behind the strategy. The adjustment enhanced the positive return, reduced the negative return, and increased the portfolio net value from 0.9853 to 1.0064 in 6 days. We discuss the comparison between the base ETFs and the Enhanced ETFs further in Section 4.3.
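The weight update (3.1) and the portfolio return, written as a simple return as in (6.5), can be sketched in Python as follows. This is a minimal illustration rather than the authors' script: the function names are hypothetical, and the inputs are the rounded prices and weights printed in the tables above, so results only match the tables approximately.

```python
def portfolio_return(prices_t, prices_next, weights):
    """Weighted average of the stocks' simple returns from day t to day t+1."""
    gross = sum(w * p1 / p0 for w, p0, p1 in zip(weights, prices_t, prices_next))
    return gross - 1.0


def update_weights(prices_t, prices_next, weights, factors):
    """Weight formula (3.1): price drift times sentiment factor, renormalized.

    Dividing by the total keeps the weights summing to 1, which is what
    makes the adjusted subset self-financing.
    """
    raw = [p1 / p0 * w * f
           for p0, p1, w, f in zip(prices_t, prices_next, weights, factors)]
    total = sum(raw)
    return [x / total for x in raw]


# First two rows of Table 3.1 (JNJ, PFE, MRK) and the initial weights.
p_1229 = [57.85, 18.71, 32.78]
p_1230 = [57.58, 18.65, 32.75]
w_1229 = [0.46, 0.31, 0.23]

ret = portfolio_return(p_1229, p_1230, w_1229)
drift_only = update_weights(p_1229, p_1230, w_1229, [1.0, 1.0, 1.0])
adjusted = update_weights(p_1229, p_1230, w_1229, [0.87, 1.0, 1.0])
```

With these inputs `ret` is about -0.00335, in line with the first non-NA entry of Table 3.4, and applying the factor 0.87 to JNJ shifts weight from JNJ to PFE and MRK while the total stays at 1.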
  • 21. 3.2 Training Period
For SPY, the performance of the 12 strategies, which use Delta, AM S-Score, PM S-Score and their lags, is shown below:
Figure 3.2: SPY Training Period Net Value
It is difficult to choose the best strategy just by looking at the net value. Thus, we made a table comparing those 12 strategies on several criteria:
Figure 3.3: SPY Training Period Strategy Comparison
As the table above shows, we use the net asset value, maximum drawdown and Sharpe ratio as the criteria for selecting the strategies. The critical criteria in the table are labeled with different colors. For more detail on these criteria, refer to Chapter 6.2, Appendix: Criteria Calculation.
By repeating the same procedure, we get the graphs and tables for XLV and XLY below.
  • 22. Figure 3.4: XLV Training Period
  • 23. Figure 3.5: XLY Training Period
From red to green, the criteria become better. For SPY and XLV, the 2day_am_adjusted strategy is clearly the best one. This strategy uses the 2-day lag AM sentiment score to derive the adjustment factor. For XLY, by contrast, the best strategy uses the 3-day lag AM sentiment score instead of the 2-day lag one to calculate the factors.
As for the cause of the 2-day lag, we assume that sentiment can drag the stock price up or down. This assumption is not unwarranted; there is historical evidence supporting it. For instance, on April 5th, 2016, it was reported that MOMO's shareholder Alibaba had joined an executive-led buyout bid [1]. From April 5th to 6th, 2016, MOMO's stock price went up over 30%. However, the sentiment S-Score began to increase rapidly on April 2nd. Below is a graph from SMA's Trader's Dashboard:
[1] http://www.wsj.com/articles/momo-shareholder-alibaba-joins-executive-led-buyout-bid-1459973237
  • 24. Figure 3.6: Monitor for MOMO
As one can see, the S-Score began to increase rapidly 2 days before April 6th, and the stock price began to increase on April 6th. This scenario supports the rationale of our strategies from another point of view.
  • 25. 4. Back Test
4.1 Data Sample
The data samples we used for backtesting differ slightly, because the available data sets for SPY, XLY and XLV are different. The backtesting period for SPY and XLY is from January 1, 2014 to February 12, 2016, and for XLV it is from August 1, 2014 to February 12, 2016.
4.2 Transaction Cost Control
For this enhanced ETF to qualify as a strategic-beta fund, control of the transaction cost is essential. The definition of the turnover ratio is:
"The turnover ratio is the percentage of a mutual fund or other investment vehicle's holdings that have been "turned over" or replaced with other holdings in a given year. The type of mutual fund, its investment objective and/or the portfolio manager's investing style will play an important role in determining its turnover ratio." [1]
However, in our case, since a transaction cost is incurred both when purchasing a stock and when selling a stock, we consider the best estimate to be the net change of our portfolio relative to the whole market capitalization. For any particular day, the turnover ratio can be calculated by:
$$ \sum_{i=1}^{\text{stocks}} \frac{\left|\,\text{Original Market Capitalization} - \text{New Market Capitalization}\,\right|}{\text{Market Capitalization}} \tag{4.1} $$
After some derivation, we obtained the following equation, where the absolute value ensures that purchases and sales do not cancel, as in (4.1):
$$ \text{Turnover} = \sum_{t=1}^{\text{days}} \sum_{k=1}^{\text{stocks}} w_k^{t}\,(1 + r_k^{t+1})\,\left|f_k^{t+1} - 1\right| \tag{4.2} $$
More details about these formulas are given in Appendix Chapter 6.1, Algorithm Expansion.
[1] Read more: Turnover Ratio Definition | Investopedia http://www.investopedia.com/terms/t/turnoverratio.asp
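A minimal Python sketch of the turnover estimate in Section 4.2 is given below. It assumes that the absolute value of the factor change is taken, so that buys and sells both count as turnover in the spirit of (4.1); the function names and the sample inputs are illustrative only.

```python
def daily_turnover(weights, returns, factors):
    """Turnover on one day: sum over stocks of w * (1 + r) * |f - 1|.

    `weights` are the weights before the price move, `returns` the simple
    returns over the day, and `factors` the sentiment factors applied.
    """
    return sum(w * (1.0 + r) * abs(f - 1.0)
               for w, r, f in zip(weights, returns, factors))


def total_turnover(days):
    """Sum the daily turnover over an iterable of (weights, returns, factors)."""
    return sum(daily_turnover(w, r, f) for w, r, f in days)


# A day without rebalancing contributes nothing; a day on which one stock
# is scaled by 0.87 contributes w * (1 + r) * 0.13 for that stock.
quiet_day = ([0.46, 0.31, 0.23], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
active_day = ([0.46, 0.31, 0.23], [0.0, 0.0, 0.0], [0.87, 1.0, 1.0])
example = total_turnover([quiet_day, active_day])
```

Under this definition turnover accumulates only on days when at least one factor differs from 1, which is why a wider confidence interval or a less frequent checking grid lowers it.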
  • 26. 4.3 Result
The back-test strategy for SPY is adjusted by the 2-day lag AM S-Score. The net value graph is shown below, with the criteria table for SPY and the enhanced SPY underneath it. The Daily Adjusted column is the strategy that checks every trading day whether an adjustment should be made. Likewise, the 5 Days Adjust and 10 Days Adjust columns are the strategies that check every 5 trading days and every 10 trading days.
Figure 4.1: SPY Backtest
  • 27. Recall the strategy for XLV is the 2 day lag AM S-Score as well. Figure 4.2: XLV Backtest
  • 28. Finally, the strategy for XLY uses the 3-day lag AM S-Score.
Figure 4.3: XLY Backtest
The three tables above support a basic conclusion. For each strategy on each ETF, a higher rebalancing frequency generally gives better performance. Nevertheless, when the checking interval shortens from 10 trading days to 5 trading days, the performance improves only slightly. Since what we are trying to construct is a strategic-beta strategy, it is necessary for us to avoid frequent trading. Therefore, balancing excess return against turnover and the associated transaction cost, we conclude that the 10-day adjustment strategy enhanced the original ETF significantly on almost every measure with an acceptable turnover ratio, and that it is the appropriate rebalancing frequency for all the ETFs.
Finally, we plot the densities of the returns in order to verify our conclusion, and compare base and
  • 29. enhanced ETFs.
Figure 4.4: Density Plot Comparison
From both the density graphs and the skewness values we can see that our strategy moves the mean slightly to the right. In short, the strategies work for all ETFs in the backtest period. The annualized excess returns over the base ETFs SPY, XLV, and XLY are 4.53%, 2.27%, and 7.09%, respectively. The turnover ratios are higher than those of the original ETFs: 10.94% for SPY, 18.3% for XLV, and 22.39% for XLY.
From this result, we find further evidence to reinforce our hypothesis in Chapter 2.1.2 about the sensitivity of different ETFs to sentiment data. This also highlights that trading strategies
  • 30. utilizing sentiment data should be implemented with great caution, because stocks respond to sentiment in different ways.
  • 31. 5. Conclusion
Previous studies have shown that there is a correlation between social market sentiment data and stock returns. Pursuing this idea, the previous practicum project showed that rebalancing a stock portfolio daily can enhance the return/risk profile of the whole portfolio.
Our findings further indicate that Twitter sentiment data can enhance ETF portfolio returns. Reasonable excess return can be achieved through weekly or bi-weekly rebalancing, without giving up much excess return in exchange for the lower share turnover and transaction cost.
The impact of sentiment data is different for each ETF. We think this is due to the structure of the ETFs and the bias in tweets. SPY and XLY can be enhanced more through sentiment than XLV, possibly because the heaviest components of SPY and XLY are much larger in terms of market capitalization than the remaining components. In addition, the top components of SPY and XLY are better known to the public, and so attract more market sentiment.
Given the above, we see the importance of modeling the dynamics of the tweets, in both the temporal and the cross-sectional dimension. In this study, we use a dynamic 95% confidence interval of the S-Score to determine a signal for extreme values of the tweets of individual stocks. This confidence interval is constructed using the past 30 days of data. The interval can evolve to capture the most recent developments in the S-Score and avoids the unnecessary rebalancing that occurs in the fixed-threshold case. Indeed, we can significantly reduce the rebalancing frequency because tweets can have clustering effects.
  • 33. 6. Appendix
6.1 Algorithm Expansion
Weight formula
Suppose we have a portfolio that contains stocks A, B and C, held in $S_A$, $S_B$, $S_C$ shares respectively. Denote the price of stock j on day i by $P_j^i$. Thus, on day 1, the stock prices are $P_A^1$, $P_B^1$, $P_C^1$. In more detail, the price table is:
Date    A         B         C
Day 1   $P_A^1$   $P_B^1$   $P_C^1$
Day 2   $P_A^2$   $P_B^2$   $P_C^2$
Table 6.1: price table
Denote by $M_j^i$ the market capitalization of stock j on day i, and by $w_j^i$ the market capitalization weight of stock j on day i. In other words, $M_{\text{portfolio}}^i = \sum_{j=1}^{\text{stock numbers}} M_j^i$ and $w_j^i = M_j^i / M_{\text{portfolio}}^i$. Then the portfolio return on day 2 for the example above is $\frac{P_A^2 S_A + P_B^2 S_B + P_C^2 S_C}{M_{\text{portfolio}}^1} - 1$. We now derive this formula in terms of weights.
$$
\begin{aligned}
\frac{P_A^2 S_A + P_B^2 S_B + P_C^2 S_C}{M_{\text{portfolio}}^1} - 1
&= \frac{P_A^2 S_A}{M_{\text{portfolio}}^1} + \frac{P_B^2 S_B}{M_{\text{portfolio}}^1} + \frac{P_C^2 S_C}{M_{\text{portfolio}}^1} - 1 && (6.1)\\
&= \frac{P_A^2 S_A P_A^1}{M_{\text{portfolio}}^1 P_A^1} + \frac{P_B^2 S_B P_B^1}{M_{\text{portfolio}}^1 P_B^1} + \frac{P_C^2 S_C P_C^1}{M_{\text{portfolio}}^1 P_C^1} - 1 && (6.2)\\
&= \frac{P_A^2}{P_A^1}\cdot\frac{S_A P_A^1}{M_{\text{portfolio}}^1} + \frac{P_B^2}{P_B^1}\cdot\frac{S_B P_B^1}{M_{\text{portfolio}}^1} + \frac{P_C^2}{P_C^1}\cdot\frac{S_C P_C^1}{M_{\text{portfolio}}^1} - 1 && (6.3)\\
&= \frac{P_A^2}{P_A^1}\, w_A^1 + \frac{P_B^2}{P_B^1}\, w_B^1 + \frac{P_C^2}{P_C^1}\, w_C^1 - 1 && (6.4)
\end{aligned}
$$
  • 34. Therefore, it is reasonable to generalize the result. The portfolio return on day k is:
$$ \text{Portfolio Return}_k = \sum_{j=1}^{\text{stock numbers}} \frac{P_j^{k}}{P_j^{k-1}}\, w_j^{k-1} - 1 \tag{6.5} $$
The next step is the weight adjustment. As mentioned in the report, even if we do not make any adjustment to the portfolio, the weights are updated every day because prices change. For the particular example above, the weight of stock A on day 2 is:
$$ w_A^2 = \frac{P_A^2 S_A}{P_A^2 S_A + P_B^2 S_B + P_C^2 S_C} \tag{6.6} $$
The derivation is as follows:
$$
\begin{aligned}
w_A^2 &= \frac{P_A^2 S_A}{P_A^2 S_A + P_B^2 S_B + P_C^2 S_C} && (6.7)\\
&= \frac{P_A^2 S_A}{M_{\text{portfolio}}^2} && (6.8)\\
&= \frac{P_A^2 S_A \,/\, M_{\text{portfolio}}^1}{M_{\text{portfolio}}^2 \,/\, M_{\text{portfolio}}^1} && (6.9)\\
&= \frac{P_A^2 S_A \,/\, M_{\text{portfolio}}^1}{\left(P_A^2 S_A + P_B^2 S_B + P_C^2 S_C\right) / M_{\text{portfolio}}^1} && (6.10)\\
&= \frac{\dfrac{P_A^2 S_A P_A^1}{M_{\text{portfolio}}^1 P_A^1}}{\dfrac{P_A^2}{P_A^1}\cdot\dfrac{S_A P_A^1}{M_{\text{portfolio}}^1} + \dfrac{P_B^2}{P_B^1}\cdot\dfrac{S_B P_B^1}{M_{\text{portfolio}}^1} + \dfrac{P_C^2}{P_C^1}\cdot\dfrac{S_C P_C^1}{M_{\text{portfolio}}^1}} && (6.11)\\
&= \frac{\dfrac{P_A^2}{P_A^1}\, w_A^1}{\dfrac{P_A^2}{P_A^1}\, w_A^1 + \dfrac{P_B^2}{P_B^1}\, w_B^1 + \dfrac{P_C^2}{P_C^1}\, w_C^1} && (6.12)
\end{aligned}
$$
Therefore, from equation (6.12), we can write today's weight of any single stock as a function of yesterday's weights and the price changes. We then simply multiply this weight by the factor score in order to adjust our portfolio. For a single stock k on day i, the weight is:
$$ w_k^i = \frac{\dfrac{P_k^{i}}{P_k^{i-1}}\, w_k^{i-1}\, \text{factor}_k^{i}}{\sum_{j=1}^{\text{stock numbers}} \dfrac{P_j^{i}}{P_j^{i-1}}\, w_j^{i-1}\, \text{factor}_j^{i}} \tag{6.13} $$
Turnover Ratio
The turnover of stock k on day 2 is $\left| S_k^2 P_k^2 - S_k^1 P_k^2 \right|$, where $S_k^1$ and $S_k^2$ are the numbers of shares of stock k held on day 1 and day 2, and $P_k^2$ is the price of stock k on day 2.
  • 35. Denote the market capitalization by MC. Then we can write the following:
$$
\begin{aligned}
\left| S_k^2 P_k^2 - S_k^1 P_k^2 \right| &= \left| MC_k^2 - S_k^1 P_k^2\, \frac{P_k^1}{P_k^1} \right| && (6.14)\\
&= \left| MC_k^2 - S_k^1 P_k^1\, \frac{P_k^2}{P_k^1} \right| && (6.15)\\
&= \left| MC_k^2 - MC_k^1\, \frac{P_k^2}{P_k^1} \right| && (6.16)\\
&= \left| MC_k^2 - MC_k^1 (1 + r_k^2) \right| && (6.17)\\
&= \left| MC_k^1 (1 + r_k^2) f_k^2 - MC_k^1 (1 + r_k^2) \right| && (6.18)
\end{aligned}
$$
The last step uses the fact that the next-period market capitalization is $MC_k^2 = MC_k^1 (1 + r_k^2) f_k^2$, that is, the original MC times the growth of MC due to price movement times the growth of MC due to the weight change. We can then rewrite (6.18) as follows:
$$ \left| MC_k^1 (1 + r_k^2) f_k^2 - MC_k^1 (1 + r_k^2) \right| = \left| MC_k^1 \left( (1 + r_k^2) f_k^2 - (1 + r_k^2) \right) \right| \tag{6.19} $$
Hence, to calculate the monthly turnover ratio, we have:
$$ \frac{\sum_{t=1}^{\text{days}} \sum_{k=1}^{\text{stocks}} \left| MC_k^{t} (1 + r_k^{t+1}) (f_k^{t+1} - 1) \right|}{\frac{1}{\text{days}} \sum_{t=1}^{\text{days}} \sum_{k=1}^{\text{stocks}} MC_k^{t}} \approx \sum_{t=1}^{\text{days}} \sum_{k=1}^{\text{stocks}} w_k^{t} (1 + r_k^{t+1}) \left| f_k^{t+1} - 1 \right| \tag{6.20} $$
6.2 Criteria Calculation
• Start date: The first date of the data we use for the back test.
• End date: The last date of the data we use for the back test.
• Net Asset Value: The value on the end date of a portfolio that was worth 1 on the start date.
• Total Trading Days: The number of days the stock market is open between the start date and the end date.
• Successful rate: The fraction or percentage of trading days with a profit among the total trading days.
• Maximum drawdown: The maximum drawdown (MDD) is the maximum loss from a peak to a trough of a portfolio, before a new peak is attained. It is an indicator of downside risk over a specified time period.
• Annualized Return: Annualized returns are period returns re-scaled to a period of 1 year, taken as 252 trading days.
• Sharpe ratio: The Sharpe ratio is the average return earned in excess of the risk-free rate per unit of volatility or total risk, which can be calculated by the formula below:
$$ \text{Sharpe ratio} = \frac{\bar{r}_p - r_f}{\sigma_p} $$
  • 36. where $\bar{r}_p$ = expected portfolio return; $r_f$ = risk-free rate; $\sigma_p$ = portfolio standard deviation.
• Sortino ratio: A modification of the Sharpe ratio that differentiates harmful volatility from general volatility by taking into account the standard deviation of negative asset returns, called downside deviation. The Sortino ratio subtracts the risk-free rate of return from the portfolio's return, and then divides that by the downside deviation. A large Sortino ratio indicates there is a low probability of a large loss.
$$ \text{Sortino ratio} = \frac{\bar{r}_p - r_f}{\sigma_d} $$
where $\bar{r}_p$ = expected portfolio return; $r_f$ = risk-free rate; $\sigma_d$ = standard deviation of negative asset returns.
• Calmar ratio: The Calmar ratio is a comparison of the average annual compounded rate of return and the maximum drawdown risk of commodity trading advisors and hedge funds. The lower the Calmar ratio, the worse the investment performed on a risk-adjusted basis over the specified time period; the higher the Calmar ratio, the better it performed.
$$ \text{Calmar ratio} = \frac{\bar{r}_p}{MDD} $$
where $\bar{r}_p$ = expected portfolio return; MDD is the maximum drawdown.
• Skewness: Skewness describes asymmetry from the normal distribution in a set of statistical data.
• Mean of Daily Return: The simple mathematical average of the daily returns generated in our backtesting period.
• Standard Deviation of Daily Return: A measure of how spread out the daily returns are.
Remark: The definitions of the criteria above, including maximum drawdown, Sharpe ratio, Sortino ratio, Calmar ratio and skewness, are taken directly from Investopedia. One can access their website at http://www.investopedia.com.
6.3 Python Scripts
We made a table explaining the basic function of each of our Python scripts. We also uploaded all of them to Google Drive so that one can download them.
Remark: To access the complete Python scripts, click this link: https://drive.google.com/a/illinois.edu/folderview?id=0B7yL1RhMfiOQUnk5eWYydXUzamM&usp=sharing
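For readers without access to the linked scripts, the criteria of Section 6.2 can be sketched from a list of daily returns as follows. This is not the authors' code. It assumes sample standard deviations, a per-day risk-free rate that defaults to 0, ratios computed on daily (not annualized) means, downside deviation taken literally as the standard deviation of the negative returns, and the annualized return as the numerator of the Calmar ratio.

```python
from statistics import mean, stdev


def net_values(daily_returns):
    """Net asset value path of a portfolio worth 1 at the start date."""
    nav, path = 1.0, []
    for r in daily_returns:
        nav *= 1.0 + r
        path.append(nav)
    return path


def max_drawdown(daily_returns):
    """Largest peak-to-trough loss, as a fraction of the peak."""
    peak, mdd = 1.0, 0.0
    for nav in net_values(daily_returns):
        peak = max(peak, nav)
        mdd = max(mdd, (peak - nav) / peak)
    return mdd


def annualized_return(daily_returns, periods=252):
    """Compounded return re-scaled to one year of 252 trading days."""
    final_nav = net_values(daily_returns)[-1]
    return final_nav ** (periods / len(daily_returns)) - 1.0


def success_rate(daily_returns):
    """Fraction of trading days with a positive return."""
    return sum(r > 0 for r in daily_returns) / len(daily_returns)


def sharpe_ratio(daily_returns, rf=0.0):
    return (mean(daily_returns) - rf) / stdev(daily_returns)


def sortino_ratio(daily_returns, rf=0.0):
    # Needs at least two negative returns for a sample standard deviation.
    downside = [r for r in daily_returns if r < 0]
    return (mean(daily_returns) - rf) / stdev(downside)


def calmar_ratio(daily_returns):
    return annualized_return(daily_returns) / max_drawdown(daily_returns)
```

Applying these functions to the base and enhanced return series over the same dates gives a like-for-like comparison, although the numbers will only match the report's tables if the same conventions were used there.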
  • 37. Figure 6.1: Python Scripts Explanation
6.4 R Scripts
We also made a table explaining our R scripts. The uploaded files can be found on Google Drive through the link below.
  • 38. Figure 6.2: R Scripts Explanation
Remark: To access the complete R scripts, click this link: https://drive.google.com/folderview?id=0B6n3np7uEHXNNFdZUE9pbHZ1MGc&usp=sharing
6.5 Suggestions For Further Research
We acknowledge that the critical downsides of a non-passive trading strategy are the bid/ask spread and the transaction cost. As mentioned above, we use the turnover ratio as a proxy for transaction cost.
Since stock returns are highly correlated, it is worthwhile to model the correlation structure of the stock returns together with their correlation with the tweets data. We tried using Principal Component Analysis (PCA) to analyse the correlation matrix of all the constituent stocks in the ETFs and found that the first five to eight components can explain most of the variance in the data. Further investigation of the top factor loadings of the principal components revealed that some industries/sectors are
  • 39. the main drivers of the index. However, the drawback of this method is that the representative stocks picked by PCA are chosen without regard to their market capitalization. In fact, some of the companies can be too small compared to others. If we keep their initial weights as in the original ETFs, there will be problems with the self-financing criterion whenever we need to adjust the weights. For instance, we may need to cash out most of the investment in a few small stocks to increase the weight of AAPL.
Therefore, we suggest further work on using PCA to create a highly influential subset of stocks and focusing on their market sentiment. We recommend a new set of reasonable weights, probably equal initial weights for these principal stocks, as the starting point. We are aware that this weighting scheme is subjective. A brute-force search could be used to find the set of weights that is practical, enhances returns, and reduces the turnover ratio further.
Moreover, for the 5-day and 10-day strategies, we simply picked the first day as the beginning date. For example, we have 792 days of sentiment factors for XLV, so for the 10-day strategy the checking dates are days 1, 11, 21, ..., 791, and for the 5-day strategy they are days 1, 6, 11, ..., 791. A better way to do this would be to calculate the results over all the possible 5-day or 10-day grids and take the average. For example, the first group of checking dates for the 10-day strategy is 1, 11, 21, ..., 791, and the second group is 2, 12, 22, ..., 792. Thus for every day's return we get 10 values, one per group, and then take their average. We can use this rolling daily return as a criterion to design the trading strategy. The same methodology can easily be applied to the 5-day strategy.
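The grid-averaging idea suggested above can be sketched in Python as follows. The callback `run_strategy` is hypothetical: it stands for any backtest routine that takes a list of 1-based checking days and returns one daily return per day, and it is not part of the authors' scripts.

```python
def check_days(n_days, step, offset):
    """1-based checking days of one grid: offset, offset + step, ... <= n_days."""
    return list(range(offset, n_days + 1, step))


def averaged_daily_returns(n_days, step, run_strategy):
    """Average the daily returns over all `step` possible starting offsets.

    `run_strategy(days)` is a hypothetical callback that backtests the
    strategy using `days` as checking dates and returns a list of
    `n_days` daily returns.
    """
    runs = [run_strategy(check_days(n_days, step, offset))
            for offset in range(1, step + 1)]
    # zip(*runs) groups the `step` returns that belong to the same day.
    return [sum(same_day) / step for same_day in zip(*runs)]


# For 792 days and a 10-day strategy, the first two grids are
# 1, 11, ..., 791 and 2, 12, ..., 792.
first_grid = check_days(792, 10, 1)
second_grid = check_days(792, 10, 2)
```

Averaging over all offsets removes the dependence of the result on the arbitrary choice of the first checking day, at the cost of running the backtest `step` times.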
  • 41. Bibliography
[1] J. Blaschak, A. Blinov, J. Gits, F. Harfoush, and K. Myers. Systems and methods of detecting, measuring, and extracting signatures of signals embedded in social media data streams. U.S. Patent No. 9,104,734, Aug. 2015.
[2] Ranco G, Aleksovski D, Calderelli G, Grcar M, and Mozetic I. The effects of twitter sentiment on stock price returns. 2015.
[3] Jason Hsu, Vital Kalesnik, and Feifei Li. An investor's guide to smart beta strategies. AAII Journal.
[4] Seung Lee, Chuanning Li, Yunqian Li, and Yoshifumi Ichikawa. Sentiment enhanced etfs. MSFE Practicum Project Report, 2015.
[5] Ari Polychronopoulos. Building a better beta: Combining fundamentals weighting, low volatility, and momentum strategies. Research Affiliates Whitepaper, Oct. 2014.
[6] Robin Wigglesworth. Fund managers ready for 'smart beta' wars. Ft.com, 08 Feb. 2016.
Wish you all the best, UIUC MSFE & SMA group