Recorded Future News Analytics for Financial Services

Published in: Economy & Finance, Business

  1. Recorded Future
     David Moon, Global Head of Financial Services
     Bill Ladd, Chief Analytic Officer
  2. What is Recorded Future?
     3/1/2011
     We believe that the content of the web has predictive power.
     So... we’ve harvested and organized the only real-time source for past, planned and speculative events on the web.
     To... allow users to “slice-and-dice” the web to make predictions.
  3. Web is Loaded with Predictions
     Silicon Valley executives head to Vail, Colo. next week for the annual Pacific Crest Technology Leadership Forum
     Drought and malnutrition hinder next year’s development plans in Yemen...
     “Strange new Russian worm set to unleash botnet on 4/1/2012...”
     The carrier may select partners to set up a new carrier as early as next month
     “According to TechCrunch China’s new 4G network will be deployed by mid-2010”
     “... Dr Sarkar says the new facility will be operational by March 2014...”
     “2010 is the year when Iran will kick out Islam. Ya Ahura we will.”
     “...opposition organizers plan to meet on Thursday to protest...”
     “Excited to see Mubarak speak this weekend...”
  4. The RF Stack
     Application
     Daily Average of Scores
     API / FTP
     RF Scores & Aggregates
     Client Scores & Aggregates: clients can use the same underlying data to define their own
       • Scores: proprietary sentiment, momentum, event score, etc.
       • Aggregates
     Aggregates
     RF Scores – Sentiment & Momentum
     Scores
     Time
     Pub Date
     Harvest Date
     Inferred Dates
     Recorded Future Driven Linguistic Processing yields a corpus that is
       • Structured
       • Relationship driven
       • Machine-Readable
       • Back-testable
     Events
     Entities
     Entities & Events – Extracted & Normalized
     Sources
  9. Case Studies
     Liquidity Management: Predicting liquidity with media coverage.
     Short Term Trading: “Future event” study.
     Strategy Allocation: Measuring investment strategy crowdedness with online media.
     Risk Modeling: Anticipating future volatility with media sentiment and macroeconomic discussion.
  10. Case 1 – Liquidity Management: Predicting Liquidity with Momentum
      Recorded Future momentum contains predictive information for dollar volume of S&P 500 companies.
      Control for trailing market volume on a 1-day and 20-day basis.
      Use 1-day trailing momentum.
      Call:
      lm(formula = Dollarvol.1 ~ 0 + lDollarvol.1 + smaDvol.Dollarvol.1 + smaxlMo, data = seriesdf)
      Residuals:
             Min         1Q     Median         3Q        Max
      -5.039e+09 -2.215e+07 -2.284e+06  1.813e+07  1.597e+10
      Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
      lDollarvol.1        0.513193   0.003237  158.54  < 2e-16 ***
      smaDvol.Dollarvol.1 0.471645   0.003817  123.56  < 2e-16 ***
      smaxlMo             0.077162   0.015683    4.92 8.67e-07 ***
      ---
      Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
      Residual standard error: 170900000 on 72109 degrees of freedom
      Multiple R-squared: 0.8539, Adjusted R-squared: 0.8539
      F-statistic: 1.405e+05 on 3 and 72109 DF, p-value: < 2.2e-16
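The regression on this slide has no intercept (the `~ 0 +` term) and three regressors. A minimal Python sketch of that kind of fit follows, using only the standard library. The slide does not spell out how `smaDvol.Dollarvol.1` or `smaxlMo` were constructed, so `build_design` is an assumption: it uses the prior day's volume, a 20-day simple moving average of volume, and the prior day's momentum. All function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back-substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols_no_intercept(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def build_design(volume, momentum, window=20):
    """Assumed construction: regress day t volume on day t-1 volume,
    the trailing `window`-day mean volume, and day t-1 momentum."""
    rows, y = [], []
    for t in range(window, len(volume)):
        sma = sum(volume[t - window:t]) / window
        rows.append([volume[t - 1], sma, momentum[t - 1]])
        y.append(volume[t])
    return rows, y
```

Calling `ols_no_intercept(*build_design(volume, momentum))` on one company's daily series would return three coefficients in the same order as the regressors above; the slide's model pools many companies, which this sketch does not attempt.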
  11. Case 2 – Short Term Trading: Future Event Distributions
      Non-earnings related events are negative.
      We controlled for earnings and non-earnings related news.
      The study queried instances where there was advance notice of specific future events.
      Events defined as one day long with S&P 500 constituents
      These typically provided one to three days advance notice
      ~19,000 unique events satisfied these criteria
      [Figure: timeline, axis t (days), interval labeled ~1-3 days]
  12. Case 2 – Short Term Trading: News “Should” be Priced in Immediately
      “Buy the rumor, sell the news” describes earnings related events.
      Market adjusted returns increase on approach to the event day and decline thereafter.
      It does not describe non-earnings related events.
      No increase in returns on approach to event-day
      Statistically significant increase in volume (0.3σ) and decrease in market adjusted returns.
      Non-earnings related events were net negative.
      [Figure labels: Typical Publication Day, Predicted Event Day]
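The selection step behind the two Case 2 slides (instances published ahead of a dated future event, then split into earnings and non-earnings news) could be sketched as below. The record fields `published`, `event_day` and `event_type`, and the 1 to 3 day bounds used as defaults, are illustrative assumptions rather than the study's actual schema or cutoffs.

```python
from datetime import date


def advance_notice_days(published, event_day):
    """Days between publication and the predicted event day."""
    return (event_day - published).days


def select_future_events(instances, min_days=1, max_days=3):
    """Keep instances published min_days..max_days before their event day."""
    selected = []
    for inst in instances:
        lead = advance_notice_days(inst["published"], inst["event_day"])
        if min_days <= lead <= max_days:
            selected.append({**inst, "lead_days": lead})
    return selected


def split_by_earnings(instances):
    """Separate earnings-related events from everything else."""
    earnings = [i for i in instances if i["event_type"] == "earnings"]
    other = [i for i in instances if i["event_type"] != "earnings"]
    return earnings, other
```

Each group can then be aligned on its event day to compare market adjusted returns and volume before and after the event.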
  13. Case 3 – Strategy Allocation: Quantifying Strategy Crowdedness
      Recorded Future data yielded an inverse correlation between the performance of a momentum strategy and the business media’s discussion of momentum.
      The study introduced a synthetic linguistic score.
      Relied on standard API queries
      Scored fragments based on momentum-related terms
      Increased discussion of momentum-related trading correlated with declining returns.
      Inverse correlation with $NAV/share of momentum mutual fund
      Monthly correlation of -0.56 over the past year
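A rough sketch of the two ingredients named on this slide: scoring text fragments by momentum-related terms, and correlating a monthly score series with a fund's NAV per share. The term list is hypothetical (the slide does not publish the one used), and plain substring counting stands in for whatever linguistic scoring the study applied.

```python
import math

# Hypothetical term list, for illustration only.
MOMENTUM_TERMS = ("momentum", "trend following", "relative strength")


def fragment_score(fragment, terms=MOMENTUM_TERMS):
    """Count occurrences of momentum-related terms in one text fragment."""
    text = fragment.lower()
    return sum(text.count(term) for term in terms)


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Summing `fragment_score` by month and passing that series to `pearson` together with month-end NAV per share gives a monthly correlation of the kind quoted on the slide.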
  14. Case 4 – Risk Modeling: Volatility Forecasting Methodology
      Data Extraction
      Extract all references to S&P 500 Companies from Recorded Future’s structured content database from January 1, 2009 to December 9, 2010.
      Includes synonyms (IBM vs. International Business Machines, etc.)
      Reduce to only mentions on “Blog” sources.
      Compute sentiment and momentum of text surrounding references to the Index over the time period.
      Data Aggregation
      Compute daily series of count-weighted mean sentiment and momentum.
      Modeling
      Calculate exponential moving averages of these values over a 26-day trailing window.
      Regress against 1-month forward realized volatility of S&P 500.
      Model Assessment
      Economic evaluation of model parameters – do they make sense?
      Comparison to other volatility metrics – how does the signal compare?
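The aggregation and smoothing steps above can be sketched in a few lines of Python. The slide does not define its exponential moving average, so the common span convention `alpha = 2 / (span + 1)`, seeded with the first observation, is an assumption here, as are the function names.

```python
def count_weighted_mean(scores, counts):
    """Daily mean score, weighting each company's score by its mention count."""
    total = sum(counts)
    if total == 0:
        return None  # no mentions that day
    return sum(s * c for s, c in zip(scores, counts)) / total


def ema(values, span=26):
    """Exponential moving average with alpha = 2 / (span + 1) (assumed convention)."""
    alpha = 2.0 / (span + 1)
    smoothed, prev = [], None
    for v in values:
        prev = v if prev is None else alpha * v + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed
```

The smoothed momentum and negative-sentiment series would then serve as regressors against forward realized volatility.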
  15. Case 4 – Risk Modeling: Model Summary
      Call:
      lm(formula = spyvol ~ vix + emamo + emaneg, data = blogus)
      Residuals:
             Min         1Q     Median         3Q        Max
      -0.0087503 -0.0020655 -0.0004415  0.0020463  0.0100361
      Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
      (Intercept) -1.237e-02  2.460e-03  -5.028 7.03e-07 ***
      vix          3.938e-04  2.511e-05  15.681  < 2e-16 ***
      emamo        2.337e-02  8.164e-03   2.863  0.00439 **
      emaneg       3.204e-01  3.631e-02   8.824  < 2e-16 ***
      ---
      Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
      Residual standard error: 0.003263 on 478 degrees of freedom
        (25 observations deleted due to missingness)
      Multiple R-squared: 0.6867, Adjusted R-squared: 0.6848
      F-statistic: 349.3 on 3 and 478 DF, p-value: < 2.2e-16
      Regressors are the VIX value and 28-day EMAs of average momentum and negative sentiment in text surrounding S&P 500 companies.
      Controlling for VIX, an increase in chatter around S&P 500 companies and an increase in negative sentiment around S&P 500 companies lead increases in one-month forward realized volatility.
      Positive sentiment is NOT a statistically significant term in this model. Volatility driven by fear, not euphoria?
      R-squared of 0.68 is respectable compared to VIX’s ability to predict 1-month forward volatility (R-squared 0.63).
      RF data is orthogonal to market data: controlling for VIX leads to models with R-squared > 0.63.
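To make the fitted model concrete, the coefficient estimates from the summary above can be applied directly. This is only a sketch: the function name is hypothetical, and the slide does not state the units of `spyvol` or of the sentiment and momentum inputs, so any numbers passed in are illustrative.

```python
# Coefficient estimates copied from the lm() summary on the slide.
COEF = {
    "intercept": -1.237e-02,
    "vix": 3.938e-04,
    "emamo": 2.337e-02,
    "emaneg": 3.204e-01,
}


def predict_spyvol(vix, emamo, emaneg, coef=COEF):
    """Linear prediction: intercept + vix, momentum EMA and negative-sentiment EMA terms."""
    return (
        coef["intercept"]
        + coef["vix"] * vix
        + coef["emamo"] * emamo
        + coef["emaneg"] * emaneg
    )
```

Because the `emaneg` coefficient is positive, holding VIX and momentum fixed while raising the negative-sentiment EMA raises the predicted forward volatility, which is the relationship the slide describes.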
  16. Getting Started – Data & Aggregates
      Data Instances
      Sources & Documents
      Entities & Events
      Canonical events
      Entity identifiers: tickers, industry taxonomy
      Time
      Publication Date
      Event Date
      Calculated Scores
      Momentum, Sentiment
      Aggregates
      US equities aggregates
      Daily composite momentum and sentiment scores for constituents of the Russell 3000
      Custom aggregates built on data elements
      [Diagram labels: Canonical info, Sentiment, Momentum, Event time, Co-occurring entities, Source metadata, Document metadata, RF State Data, Entity information]
  17. Access – Historical & Live Data
      Recorded Future Web Service API
      Recorded Future FTP Archive
      Data Formats – JSON, CSV
      Historical Data Delivery – API, FTP
      API – Historical results from raw data via web-service calls
      FTP – Files of aggregates, and bulk history
      Live Data Delivery – API
      Customized calls – as frequently as intra-day
      RF Aggregates – calculated daily
      [Diagram labels: JSON HTTP Request; JSON/CSV Response; FTP Request; .zip archive (csv aggregates, json/tsv instances); Historical Batch Download; Live Download; Load RF Data; RF Customer Analytic Environment (R, Matlab, Java, Python, Excel, etc.)]
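As an illustration of the "Load RF Data" step in a Python analytic environment, the sketch below parses a CSV of daily aggregates into per-ticker series. The column names `date`, `ticker`, `momentum` and `sentiment` are assumptions made for this example; the slide only says that aggregates are delivered as CSV, not what the file layout is.

```python
import csv
import io
from collections import defaultdict


def load_aggregates(csv_text):
    """Parse aggregate rows into {ticker: {date: {"momentum": .., "sentiment": ..}}}.

    Column names are hypothetical; adjust them to the actual file header.
    """
    series = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        series[row["ticker"]][row["date"]] = {
            "momentum": float(row["momentum"]),
            "sentiment": float(row["sentiment"]),
        }
    return dict(series)
```

For a file on disk, the same function can be fed the result of reading the file into a string.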
  18. Applications – Slicing the Data
      Case Studies, revisited
      Liquidity Management: Pull aggregate Day/Company momentum data for S&P 500
      Short Term Trading: Pull instance data for S&P 500 companies where publish date is before event date
      Strategy Allocation: Pull instance data where document category is “Business/Finance” and score fragments based on word/phrase choice
      Risk Modeling: Pull aggregate momentum and sentiment data for the S&P 500 companies for a specified time period
      Different slices entail unique media-analytic feeds
  19. Summary
      Recorded Future provides the world’s only real-time source of past, planned and speculative events.
      Designed for clients to create unique media-analytic feeds via web-services API and FTP access.
      Applied to liquidity planning, short term trading, strategy allocation, risk modeling, among other scenarios
