Machine learning for factor investing - Tony Guida
https://quspeakerseries5.splashthat.com/
Topic: Machine Learning for Factor Investing: case study on "Trees"
In this presentation, Tony will first introduce the concept of supervised learning. Then he will cover the practitioner angle for constructing non linear multi factor signals using stock characteristics. He will show the added value of ML based signals over traditional linear stale factors blend in equity.
This master class is derived from Guillaume Coqueret and Tony Guida's latest book "Machine Learning for Factor Investing"
In All About Factors, we cover the basics of what factors are, where we expect them to derive their excess returns from, their advantages and disadvantages and if there is indeed any merit to this approach or if it just another Wall Street marketing gimmick.
After covering the commonly accepted factors basics, we discuss expectations for factor investing, the theory as to why short-term pain must be present for long-term return, and some key considerations in moving from the academic research to creating investible portfolios.
Also explored is the current on-going debate between industry titans Rob Arnott (Research Affiliates) and Cliff Asness (AQR) as to the efficacy of using valuation-based spreads to time factor exposures.
Lastly, we look at some different methods that a retail investor can utilize smart-beta investing, by highlighting some of the current industry techniques for diversifying factor exposures and building a multi-factor portfolio.
This presentation is aimed to answer the one of the most fundamental question of Trading: "How much to trade?" You might have decided what to trade but the question how much to trade raises a critical issue which needs to be handled using Statistics. This presentation take its audience through the various money management techniques and position sizing algorithms which come handy for every trader.
Through examining their nature and mechanisms, identifying their spin-offs and analyzing their performance, this presentation is designed to discuss what to look out for when conduct due diligence on different hedge fund strategies.
Planning a Start-up? Our private equity investment PowerPoint presentation slide is just what you need. These equity-based crowdfunding PPT templates will fill the gap between the investors and your company. Download from here: https://www.slideteam.net/private-equity-investment-deck-powerpoint-presentation-slides.html
In All About Factors, we cover the basics of what factors are, where we expect them to derive their excess returns from, their advantages and disadvantages and if there is indeed any merit to this approach or if it just another Wall Street marketing gimmick.
After covering the commonly accepted factors basics, we discuss expectations for factor investing, the theory as to why short-term pain must be present for long-term return, and some key considerations in moving from the academic research to creating investible portfolios.
Also explored is the current on-going debate between industry titans Rob Arnott (Research Affiliates) and Cliff Asness (AQR) as to the efficacy of using valuation-based spreads to time factor exposures.
Lastly, we look at some different methods that a retail investor can utilize smart-beta investing, by highlighting some of the current industry techniques for diversifying factor exposures and building a multi-factor portfolio.
This presentation is aimed to answer the one of the most fundamental question of Trading: "How much to trade?" You might have decided what to trade but the question how much to trade raises a critical issue which needs to be handled using Statistics. This presentation take its audience through the various money management techniques and position sizing algorithms which come handy for every trader.
Through examining their nature and mechanisms, identifying their spin-offs and analyzing their performance, this presentation is designed to discuss what to look out for when conduct due diligence on different hedge fund strategies.
Planning a Start-up? Our private equity investment PowerPoint presentation slide is just what you need. These equity-based crowdfunding PPT templates will fill the gap between the investors and your company. Download from here: https://www.slideteam.net/private-equity-investment-deck-powerpoint-presentation-slides.html
Given the recent financial crisis and the extended impact on global credit market and liquidity, it is imperative that financial institutions strengthen their market risk management capabilities to effectively meet compelling business objectives and challenges which include portfolio pricing and portfolio exposure management
"Active Learning in Trading Algorithms" by David Fellah, Head of the EMEA Lin...Quantopian
Presented at QuantCon Singapore 2016, Quantopian's quantitative finance and algorithmic trading conference, November 11th.
Institutional orders generally exceed the absorption capacity in the immediate order book and are frequently split horizontally over time and vertically over price. The task of splitting apart a meta-order is achieved through a sequence of market transactions performed by trading algorithms, causing market impact. Consequently a great deal of research is spent on understanding market impact and its role in algorithm design in order to reduce it.
In this presentation, we discuss an application of Deep Reinforcement Learning to minimise transaction costs across a diverse range of instruments. We first discuss high-frequency market impact and its role in optimal planning for single-position and portfolio trading. We then discuss examples of how machine learning is used in short-term forecasting to augment order placement decisions.
Finally, we discuss how the algorithm considers these effects jointly, how it optimizes a dynamic policy, and how it improves performance against surrogate hand-tuned algorithms.
LTCM's case in 1998 would be presented in this slide.
It's my mid-term project at NCCU, Money and Banking department, for the course of " Option- Theory and Application. "
Counterparty Credit Risk and CVA under Basel IIIHäner Consulting
Financial institutions which apply for an IMM waiver under Basel III need to fullfill a broad set of requirements. We present the quantitative, organizational and operational implications and provide some hand-on guidance how to fulfill the regulatory requirements.
Basel, IRB, IFRS 9 , Stress Testing, CCAR, DFAST, Loss given default, Exposure at default, Prepayments, Scorecards, Margin of Conservatism, Age Period Cohort analysis, Survival analysis, Segmentation
Challenges in Practical Market Risk Management - a presentation by Anshuman Prasad, Director, Risk and Analytics at CRISIL GR&A made at the 15th Annual GARP Risk Management Convention, New York.
Ted Alexander of Magellan Asset Management discusses the investment implications of 8 predictions in artificial intelligence, with a focus on healthcare.
Ted delivered his presentation at 'The Future of Financial Advice', the Booster Financial Adviser Conference 2016 in Wellington, New Zealand on 4 November 2016.
Given the recent financial crisis and the extended impact on global credit market and liquidity, it is imperative that financial institutions strengthen their market risk management capabilities to effectively meet compelling business objectives and challenges which include portfolio pricing and portfolio exposure management
"Active Learning in Trading Algorithms" by David Fellah, Head of the EMEA Lin...Quantopian
Presented at QuantCon Singapore 2016, Quantopian's quantitative finance and algorithmic trading conference, November 11th.
Institutional orders generally exceed the absorption capacity in the immediate order book and are frequently split horizontally over time and vertically over price. The task of splitting apart a meta-order is achieved through a sequence of market transactions performed by trading algorithms, causing market impact. Consequently a great deal of research is spent on understanding market impact and its role in algorithm design in order to reduce it.
In this presentation, we discuss an application of Deep Reinforcement Learning to minimise transaction costs across a diverse range of instruments. We first discuss high-frequency market impact and its role in optimal planning for single-position and portfolio trading. We then discuss examples of how machine learning is used in short-term forecasting to augment order placement decisions.
Finally, we discuss how the algorithm considers these effects jointly, how it optimizes a dynamic policy, and how it improves performance against surrogate hand-tuned algorithms.
LTCM's case in 1998 would be presented in this slide.
It's my mid-term project at NCCU, Money and Banking department, for the course of " Option- Theory and Application. "
Counterparty Credit Risk and CVA under Basel IIIHäner Consulting
Financial institutions which apply for an IMM waiver under Basel III need to fullfill a broad set of requirements. We present the quantitative, organizational and operational implications and provide some hand-on guidance how to fulfill the regulatory requirements.
Basel, IRB, IFRS 9 , Stress Testing, CCAR, DFAST, Loss given default, Exposure at default, Prepayments, Scorecards, Margin of Conservatism, Age Period Cohort analysis, Survival analysis, Segmentation
Challenges in Practical Market Risk Management - a presentation by Anshuman Prasad, Director, Risk and Analytics at CRISIL GR&A made at the 15th Annual GARP Risk Management Convention, New York.
Ted Alexander of Magellan Asset Management discusses the investment implications of 8 predictions in artificial intelligence, with a focus on healthcare.
Ted delivered his presentation at 'The Future of Financial Advice', the Booster Financial Adviser Conference 2016 in Wellington, New Zealand on 4 November 2016.
How to Invest in AI - Top 10 Artificial Intelligence StocksNgoc Truong
Macrovue‘s Webinar: How To Investing in Artificial Intelligence - Top 10 AI Stock Picks
Macrovue, the world's first global thematic investment platform giving Australians the ability to invest in international thematic share portfolios.
In this presentation, you will explore:
• The impact AI will have on the global economy
• The companies at the forefront of AI technology
• Why now is a good time to invest in AI technology
• An overview of some of the AI stocks in the portfolio
• Our stock selection criteria and research methodology.
The Macrovue Investment team has researched and constructed a portfolio focused on the five main AI technology systems in practice now.
These 10 companies are the early AI adopters that combine a strong digital capability with proactive strategies that have higher profit margins and are likely to widen the performance gap with other firms in the future.
Sales and marketing teams in enterprises have too many leads to pursue but have limited time and budget at their disposal. To build a strong sales pipeline, marketers should target their prospects with the right content to engage their interests and nurture them before handing them off to their sales teams. Prioritizing the right deals for sales team requires effective strategies for scoring leads and accurately forecasting opportunities help them identify issues early to meet their targets. In this talk, we will look under the hood of the machine learning pipelines in Salesforce Einstein that help sales and marketing teams win more deals. Specifically, we'll look at the problem of scoring prospects based on their engagement so that marketers know when they are ready to buy. Next, we will share our journey on model interpretability in providing actionable insights with our predictions. Finally, we will describe how we generate scores and insights for all customers through a model tournament, so that enterprises and small businesses alike can reap the benefits of machine learning.
A Vision for Quantitative Investing in the Data Economy by Michael Beal at Qu...Quantopian
Quantitative Investors have long been charged with an exhilarating challenge - to derive insight from data. To support this ardor, a plethora of traditional data and technology vendors have entrenched themselves as critical partners in our pursuit of Alpha.
Over the last decade, a new partner in the pursuit of “automated truth from data” has emerged. Billions of dollars in Venture Capital funding have created an ecosystem of “Big Data”, “Cognitive Intelligence”, “Cloud Technology”, etc. companies seeking to extract information from anything and everything (e.g. unstructured text, sensors, satellites, etc.). This “Data Revolution” began in California and is now blossoming globally.
As “Silicon Alley” brings financial technology to the mainstream, what new opportunities await the ambitious? What disruptions threaten the complacent? And which historical analogs best illuminate the path forward for Quantitative Investors in the “Data Economy”?
Private Equity: Powering Alpha Via AI, Analytics & AutomationCognizant
Embedding a data-driven approach that relies on the latest digital technologies, tools and techniques can help to increase the value of portfolio companies and enable them to transform – which can be critical while formulating exit strategies.
Stories from the Financial Service AI Trenches: Lessons Learned from Building...Databricks
EY helps clients establish their data- and AI-driven transformation strategies, operationalise their AI governance frameworks, as well as build and monitor AI solutions. In this presentation we discuss how we have approached the nuances of building AI solutions in financial services, and how a highly-regulated industry meets innovation with experiment-driven emerging technologies.
Build an AI Roadmap and Win the Consumer Goods Intelligence RaceGib Bassett
My presentation from Salesforce Connections 2018. In it, I describe why it's important to think about a use case driven strategy for advanced analytics and AI.
Bootstrap, Angel or Venture: Determining the Right Financing Strategy for You...Judy Loehr
This presentation was shared at Dreamforce 2016 to help early-stage cloud business application startup teams understand how investors will evaluate their markets so they can plan the right financing strategy from the beginning.
Thomson Reuters is pleased to be a sponsor for this years A-Team Entity Data and Applications Directory. This special publication lists all the major suppliers of regulatory and risk data services, covering areas such as:
FATCA, Solvency, EMIR, Dodd-Frank, UCITS, LEI, Counterparty Risk and so much more.
Similar to Machine learning for factor investing (20)
Uniform Legal Framework for AI: The EU AI Act establishes a uniform legal framework for the development, marketing, and use of artificial intelligence systems within the EU, aimed at promoting trustworthy and human-centric AI while ensuring a high level of health, safety, and fundamental rights protection.
Risk-Based Approach: The regulation adopts a risk-based approach, classifying AI systems based on the level of risk they pose, from minimal to unacceptable risk, with stringent requirements for high-risk AI systems, particularly those impacting health, safety, and fundamental rights.
Prohibitions for Certain AI Practices: Unacceptable risk practices, such as manipulative social scoring and real-time biometric identification in public spaces without justification, are prohibited to protect individual rights and freedoms.
Mandatory Requirements for High-Risk AI Systems: High-risk AI systems must comply with mandatory requirements before they can be marketed, put into service, or used within the EU. These requirements include transparency, data governance, technical documentation, and human oversight to ensure safety and compliance with fundamental rights.
Conformity Assessment and Compliance: Providers of high-risk AI systems must undergo a conformity assessment procedure to demonstrate compliance with the mandatory requirements. This includes maintaining technical documentation and conducting risk management activities.
Transparency Obligations: AI systems must be transparent, providing users with information about the AI system's capabilities, limitations, and the purpose for which it is intended, ensuring informed use of AI technologies.
Market Surveillance: The EU AI Act establishes mechanisms for market surveillance to monitor and enforce compliance, with the European Artificial Intelligence Board (EAIB) playing a central role in coordinating activities across member states.
Protection of Fundamental Rights: The Act emphasizes the protection of fundamental rights, including privacy, non-discrimination, and consumer rights, with specific provisions to safeguard these rights in the context of AI use.
Innovation and SME Support: The regulation aims to foster innovation and support small and medium-sized enterprises (SMEs) through regulatory sandboxes and by reducing administrative burdens for low and minimal risk AI applications.
Global Impact and Alignment: While the EU AI Act directly applies to the EU market, its global impact is significant, influencing international standards and practices in AI development and use. Financial industry professionals worldwide should be aware of these regulations as they may affect global operations and international collaborations.
The financial industry is witnessing an emerging trend of Large Language Models (LLMs) applications to improve operational efficiency. This article, based on a round table discussion hosted by TruEra and QuantUniversity in New York in May 2023, explores the potential use cases of LLMs in financial institutions (FIs), the risks to consider, approaches to manage these risks, and the implications for people, skills, and ways of working. Frontline personnel from Data and Analytics/AI teams, Model Risk, Data Management, and other roles from fifteen financial institutions devoted over two hours to discussing the LLM opportunities within their industry, as well as strategies for mitigating associated risks.
The discussions revealed a preference for discriminative use cases over generative ones, with a focus on information retrieval and operational automation. The necessity for a human-in-the-loop was emphasized, along with a detailed discourse on risks and their mitigation.
PYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALSQuantUniversity
Join CFA Institute and QuantUniversity for an information session about the upcoming CFA Institute Professional Learning course: Python and Data Science for Investment professionals.
Learn how artificial intelligence (AI) and machine learning are revolutionizing industries — this course will introduce key concepts and illustrate the role of machine learning, data science techniques, and AI through examples and case studies from the investment industry. The presentation uses simple mathematics and basic statistics to provide an intuitive understanding of machine learning, as used by firms, to augment traditional decision making.
https://quforindia.splashthat.com/
Mathematical Finance & Financial Data Science Seminar
AI and machine learning are entering every aspect of our life. Marketing, autonomous driving, personalization, computer vision, finance, wearables, travel are all benefiting from the advances in AI in the last decade. As more and more AI applications are being deployed in enterprises, concerns are growing about potential "AI accidents" and the misuse of AI. With increased complexity, some are questioning whether the models actually work! As the debate about fairness, bias, and privacy grow, there is increased attention to understanding how the models work and whether the models are thoroughly tested and designed to address potential issues.
The area "Responsible AI" is fast emerging and becoming an important aspect of the adoption of machine learning and AI products in the enterprise. Companies are now incorporating formal ethics reviews, model validation exercises, and independent algorithmic auditing to ensure that the adoption of AI is transparent and has gone through formal validation phases.
In this talk, Sri will introduce Algorithmic auditing and discuss why Algorithmic auditing will be a formal process industries using AI will need. Sri will also discuss the emerging risks in the adoption of AI and discuss how QuSandbox, his company is building, will address the emerging needs of formal Algorithmic auditing practices in enterprises.
Machine Learning: Considerations for Fairly and Transparently Expanding Acces...QuantUniversity
Machine Learning: Considerations for Fairly and Transparently Expanding Access to Credit
With Raghu Kulkarni and Steve Dickerson
Recently, machine learning has been used extensively in credit decision making. As ML proliferates the industry, issues of considerations for fair and transparent access to credit decision making is becoming important.
In this talk, Dr.Raghu Kulkarni and Dr.Steven Dickerson from Discover Financial Services will share their experiences at Discover. The talk will include:
- An overview of how ML models are used across financial life cycle
- Practical problems practitioners run into and why explainability and bias detection becomes important.
References:
1- https://www.h2o.ai/resources/white-paper/machine-learning-considerations-for-fairly-and-transparently-expanding-access-to-credit/
2- https://arxiv.org/abs/2011.03156
Seeing what a gan cannot generate: paper reviewQuantUniversity
Seeing what a GAN cannot Generate Paper review: Bau, David et al. “Seeing What a GAN Cannot Generate.” 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (2019): 4501-4510.
Machine Learning in Finance: 10 Things You Need to Know in 2021QuantUniversity
Machine Learning and AI has revolutionized Finance! In the last five years, innovations in computing, technology and business models have created multiple products and services in Fintech prompting organizations to prioritize their data and AI strategies. What will 2021 bring and how should you prepare for it? Join Sri Krishnamurthy,CFA as we kickoff the QuantUniversity’s Winter school 2021. We will introduce you to the upcoming programs and have a masterclass on 10 innovations in AI and ML you need to know in 2021!
Markowitz portfolio optimization is optimal in theory, however, when applied in practice it often fails catastrophically. Usually, this is addressed by various simplifications to increase robustness. In this talk I will make the case that the reason this theory fails in practice is because uncertainty in the parameter estimation is not taken into account. By using Bayesian statistics we can fix Markowitz and retain all its desirable properties while still having a robustness technique that can be easily extended. This talk is geared at intermediate and will give a general introduction to Bayesian modeling using PyMC3 and focus on application and code examples rather than theory.
With Alternative Data becoming more and more popular in the industry, quants are eager to adopt them into their investment processes. However, with a plethora of options, API standards, trying and evaluating datasets is a major hindrance to adoption of datasets.
Join Yaacov, Sri, James and Brad discuss the opportunities, pitfalls and challenges of Alternative Data and its adoption in finance
A Unified Framework for Model Explanation
Ian Covert, University of Washington
Explainable AI is becoming increasingly important, but the field is evolving rapidly and requires better organizing principles to remain manageable for researchers and practitioners. In this talk, Ian will discuss a new paper that unifies a large portion of the literature using a simple idea: simulating feature removal. The new class of "removal-based explanations" describes 20+ existing methods (e.g., LIME, SHAP) and reveals underlying links with psychology, game theory and information theory.
Practical examples will be presented and available on the Qu.Academy site
Reference:
Explaining by Removing: A Unified Framework for Model Explanation
Ian Covert, Scott Lundberg, Su-In Lee
https://arxiv.org/abs/2011.14878
Machine Learning Interpretability -
Self-Explanatory Models: Interpretability, Diagnostics and Simplification
With Agus Sudjianto, Wells Fargo
The deep neural networks (DNNs) have achieved great success in learning complex patterns with strong predictive power, but they are often thought of as "black box"models without a sufficient level of transparency and interpretability. It is important to demystify the DNNs with rigorous mathematics and practical tools, especially when they are used for mission-critical applications. This talk aims to unwrap the black box of deep ReLU networks through exact local linear representation, which utilizes the activation pattern and disentangles the complex network into an equivalent set of local linear models (LLMs). We develop a convenient LLM-based toolkit for interpretability, diagnostics, and simplification of a pre-trained deep ReLU network. We propose the local linear profile plot and other visualization methods for interpretation and diagnostics, and an effective merging strategy for network simplification. The proposed methods are demonstrated by simulation examples, benchmark datasets, and a real case study in credit risk assessment. The paper that will be presented in this talk can be found here.
Qu speaker series 14: Synthetic Data Generation in FinanceQuantUniversity
In this master class, Stefan shows how to create synthetic time-series data using generative adversarial networks (GAN). GANs train a generator and a discriminator network in a competitive setting so that the generator learns to produce samples that the discriminator cannot distinguish from a given class of training data. The goal is to yield a generative model capable of producing synthetic samples representative of this class. While most popular with image data, GANs have also been used to generate synthetic time-series data in the medical domain. Subsequent experiments with financial data explored whether GANs can produce alternative price trajectories useful for ML training or strategy backtests.
In 2009 author and motivational speaker Simon Sinek delivered the now-classic TED talk “Start with why”. Viewed by over 28 million people, “Start with Why” is the third most popular TED video of all time and it teaches us that great leaders and companies inspire us to take action by focusing on the WHY over the “what” or the “how”. In this talk we’ll ask how applied data and computational scientists can use the power of WHY to frame problems, inspire others, and give them answers to business questions they might never think of asking.
Bio
Jessica Stauth is a Managing Director in Fidelity Labs, an internal startup incubator with a mission to create new fintech businesses that drive growth for the firm. Dr. Stauth previously held roles as Managing Director of Portfolio Management, Research, and Trading at Quantopian, a crowd-sourced systematic hedge fund based in Boston, Director of Quant Product Strategy for Thomson Reuters (now Refinitiv), and as a Senior Quant Researcher at the StarMine Corporation, where she built global stock selection models including the design and implementation of the StarMine Short Interest model. Dr. Stauth holds a PhD in Biophysics from UC Berkeley, where her research focused on computational neuroscience.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
As Europe's leading economic powerhouse and the fourth-largest hashtag#economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like hashtag#Russia and hashtag#China, hashtag#Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in hashtag#cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to hashtag#AdvancedPersistentThreats (hashtag#APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
1. Qu Speaker Series
Machine Learning for Factor Investing
An Afternoon with Tony Guida
2020 Copyright QuantUniversity LLC.
Hosted By:
Sri Krishnamurthy, CFA, CAP
sri@quantuniversity.com
www.qu.academy
08/05/2020
Online
https://quspeakerseries5.spla
shthat.com/
2. 2
QuantUniversity
• Boston-based Data Science, Quant
Finance and Machine Learning
training and consulting advisory
• Trained more than 1000 students in
Quantitative methods, Data Science
and Big Data Technologies using
MATLAB, Python and R
• Building a platform for AI
and Machine Learning Exploration
and Experimentation
8. 8
• Tony Guida is a Quantitative Portfolio Manager and researcher. Tony’s work is focused primarily on
extracting market inefficiencies from different sources from traditional fundamentals, market
signals, alternative data, and machine learning. His expertise is in mid to low frequency in equities.
Tony started his career at Unigestion in 2006 where he joined the quantitative equity low volatility
team to work as a research analyst. He evolved into a member of the research and investment
committee for Minimum Variance Strategies, where he led the factor investing research group for
institutional clients.
• In 2015, he moved to Edhec Risk Scientific Beta as a Senior Consultant for Risk allocation and factor
strategies before going to a major UK pension fund in 2016 to build the in-house systematic equity,
co-managing 8 billion GBP as a senior quantitative portfolio manager. He joined RAM-Active
Investments in January 2019.
• Tony holds a Bachelor and Master degrees in Econometry and Finance from the University of Savoy
France, is the editor-in-chief and founder for the Journal of Machine Learning in Finance. Tony co-
wrote and edited book “Big Data and Machine Learning in Quantitative Investment” Wiley 2018, co-
wrote “Machine Learning for Factor Investing” Chapman Hall/ CRC.
Tony Guida
9. Machine Learning Ensemble for Factor Investing
For professional investors only - promotional material
Tony Guida
10. This material is addressed to professional clients for informative purposes only. It is neither an offer nor an invitation to buy or sell investment products and may not be interpreted as investment
advice. It is not intended to be distributed, published or used in a jurisdiction where such distribution, publication or use is forbidden, and is not intended for any person or entity to whom or
to which it would be illegal to address such a material. In particular, investment products are not offered for sale in the United States or its territories and possessions, nor to any US person
(citizens or residents of the United States of America). The opinions herein do not consider individual clients' circumstances, objectives, or needs. Before entering into any transaction, clients
are advised to form their own opinion and consult professional advisors to obtain an independent review of the specific risks incurred (tax, financial etc.). Upon request, RAM AI Group is
available to provide more information to clients on risks associated with investments. The information and analysis contained herein are based on sources deemed reliable. However, RAM AI
Group does not guarantee their accuracy, correctness or completeness, and it does not accept any liability for any loss or damage resulting from their use. All information and assessments
are subject to change without notice. Changes in exchange rates may cause the NAV per share in the investor's base currency to fluctuate. There is no guarantee to get back the full amount
invested. Past performances, whether actual or back-tested, are not necessarily indicative of future performance. Without prejudice of the due addressee’s own analysis, RAM understands
that this communication should be regarded as a minor non-monetary benefit according to MIFID regulations. Clients are invited to base their investment decisions on the most recent
prospectus, key investor information document (KIID) and financial reports which contain additional information relating to the investment product. These documents are available free of
charge from the SICAV’s and Management Company’s registered offices, its representative and distributor in Switzerland, RAM Active Investments S.A. and at Macard Stein & Co AG, Paying
and Information Agent in Germany; and at RAM Active Investments (Europe) SA – Succursale Milano in Italy. This marketing material has not been approved by any financial Authority, it is
confidential and addressed solely to its intended recipient; its total or partial reproduction and distribution are prohibited. Issued in Switzerland by RAM Active Investments S.A. which is
authorised and regulated in Switzerland by the Swiss Financial Market Supervisory Authority (FINMA). Issued in the European Union and the EEA by the Management Company RAM Active
Investments (Europe) S.A., 51 av. John F. Kennedy L-1855 Luxembourg, Grand Duchy of Luxembourg. The reference to RAM AI Group includes both entities, RAM Active Investments S.A.
and RAM Active Investments (Europe) S.A.
Important Information
11. Given the exponential increase in data availability, the obvious
temptation of any asset manager is to try to infer future returns from
the abundance of attributes available at the firm level.
Current computational power allows to “test” almost all types of new
characteristics/signals.
Sharing knowledge effect. Cross fertilization between Hard science and
finance is increasing (implicitly and explicitly).
A need to innovate. Legacy approach for constructing Style/Factor
equity portfolio has been delivering less return than 10 years ago.
Why this topic?
12. To test more characteristics/signals
To leverage on non-linear complex patterns, rule based
To adapt and identify to trends by re-running models
To ensemble more models, wisdom of the crowd
To be less biased than trad. dogmatic quant. approach
What can we expect from ML in Factor Investing?
19. Rolling Windows for training ( case for 12M forward)
In this example we use a rolling window of 60 months to predict the 12M forward performance of a
stock.
Training set (80%)
Test
set(20%)
12M Holdout
to avoid look
ahead bias
Dec 99 Dec 04 1st Prediction using trained model, Dec 05
TRAINING (60 Months)
Training set (80%)
Test
set(20%)
12M Holdout
to avoid look
ahead bias
Jan 05 2nd Prediction using trained model, Jan 06Jan 00
Feb 00 […………………] [……]
20. Hyperparameters:
13
• The learning rate, 𝜂: it is the step size shrinkage used in update to prevents overfitting.
After each boosting step, we can directly get the weights of new features and 𝜂 actually
shrinks the feature weights to make the boosting process more conservative.
• The maximum depth: it is the longest path (in terms of node) from the root to a leaf of the
tree. Increasing this value will make the model more complex and more likely to be
overfitting.
• Regression 𝝀: it is the 𝐿2 regularization term on weights (mentioned in the technical
section) and increasing this value will make model more conservative.
• gamma: minimum loss reduction required to make a further partition on a leaf node of the
tree. The larger, the more conservative the algorithm will be.
model max_depth eta round eval_metric subsample
XGB 5 1% 150 error 0.8
col_by_sample
0.8
22. We will only predict 12M future performance
We will use a boosted tree classification ML model
Our Investment universe is composed of global stocks (~2000)
Full dataset from Dec-1999 until April-2020.
Dataset contains alt and traditional data.
Data engineering from Exploratory Data Analysis (EDA) is based on:
o Training on tails (extreme quantile from Label/fit cross section) training
o Outliers removals
o Low-coverage instance (row) removal
o Low-coverage feature (column) removal
• (~ 400) features, monthly normalised in percentile
• We use a rolling window of 5 years- 80% Training 20% Testing
• Compare the results according to 3 angles : accuracy, interpretability and out of
sample performance analytics where we will dig into the data engineering.
Objective, Data and Protocol
Paper name
24. First things first: L/S performance from ML models
Performance is Gross of TC and gross of management fees. Backtested performance in USD using London fixing. It is neither an offer nor an invitation to buy or sell investment products and may
not be interpreted as investment advice.
Source: RAM, Bloomberg, Factset, RAM’s alternative data providers.
31. Turnover analysis for both models
BASE_EDA Average of D1 Average of D2 Average of D3 Average of D5 Average of D4 Average of D6 Average of D7 Average of D8 Average of D9 Average of D10
2006 234% 476% 584% 611% 603% 603% 613% 546% 464% 279%
2007 293% 618% 712% 791% 772% 782% 752% 718% 635% 232%
2008 266% 581% 714% 760% 746% 750% 765% 750% 673% 364%
2009 383% 669% 800% 843% 845% 847% 835% 792% 713% 398%
2010 353% 650% 744% 741% 782% 760% 746% 684% 650% 290%
2011 255% 637% 781% 883% 812% 845% 793% 710% 633% 360%
2012 168% 535% 745% 856% 823% 852% 785% 679% 563% 380%
2013 229% 586% 763% 857% 856% 852% 814% 721% 636% 380%
2014 392% 653% 700% 752% 721% 747% 747% 688% 530% 364%
2015 314% 554% 662% 674% 694% 644% 565% 577% 445% 341%
2016 309% 583% 700% 731% 697% 677% 678% 655% 549% 322%
2017 234% 465% 620% 556% 619% 564% 629% 656% 568% 355%
2018 353% 640% 756% 745% 783% 723% 726% 705% 640% 320%
2019 429% 734% 772% 821% 807% 794% 779% 731% 685% 330%
2020 (as of end of April) 379% 704% 779% 806% 811% 819% 804% 782% 682% 335%
BASE_NoEDA Average of D1 Average of D2 Average of D3 Average of D5 Average of D4 Average of D6 Average of D7 Average of D8 Average of D9 Average of D10
2006 688% 513% 622% 690% 654% 729% 715% 648% 586% 356%
2007 344% 653% 775% 791% 746% 804% 772% 708% 607% 282%
2008 238% 527% 790% 754% 748% 843% 804% 723% 614% 339%
2009 270% 591% 889% 846% 905% 866% 826% 776% 716% 407%
2010 586% 757% 783% 897% 870% 939% 903% 854% 731% 366%
2011 755% 739% 859% 877% 906% 898% 838% 795% 681% 394%
2012 425% 728% 618% 803% 743% 787% 821% 778% 631% 471%
2013 535% 791% 739% 789% 816% 794% 791% 661% 577% 319%
2014 730% 597% 663% 719% 698% 739% 753% 674% 600% 307%
2015 760% 677% 685% 662% 670% 621% 608% 544% 503% 279%
2016 659% 559% 609% 516% 606% 476% 530% 524% 464% 229%
2017 628% 610% 661% 605% 650% 580% 531% 569% 552% 324%
2018 588% 554% 592% 670% 626% 712% 655% 683% 650% 368%
2019 779% 781% 776% 793% 766% 844% 804% 740% 623% 329%
2020 (as of end of April) 863% 689% 751% 887% 818% 828% 821% 783% 605% 287%
32. Rolling 12 months performance
Performance is Gross of TC and gross of management fees. Backtested performance in USD using London fixing. It is neither an offer nor an invitation to buy or sell investment products and may
not be interpreted as investment advice.
Source: RAM, Bloomberg, Factset, RAM’s alternative data providers.
34. Data engineering: What’s the impact of NOT doing it
Portfolio Long
D10/Short D1. Legs
equal weighted.
Cash Neutral.
Performance
p.a. (USD
gross of TC
and mgmt.
fees)
Data
engineering
Volatility Sharpe (
risk free
assumed
zero )
Run time
(seconds)
Turnover
(Yearly
average 2-
ways)
BASE_EDA 10.4% ALL 10.6% 0.99 720 362%
BASE_NoEDA 4.2% NONE 10.8% 0.38 4200 450%
BASE_NoEDA1 6.7% ONLY TAILS 12.1% 0.55 1120 469%
BASE_NoEDA2 4.8% ONLY
COVERAGE
10.7% 0.45 3800 497%
BASE_NoEDA3 5.0% ONLY
OUTLIERS
9.72% 0.51 3750 518%
35. What tails training does on accuracy
Source: Hypothetical exercice based on a different yet similar datasets (World including EM)
38. We will predict 1M future performance
We will use a boosted tree classification ML model
Our Investment universe is composed of global stocks including EM (~1700)
Full dataset from Dec-1999 until May-2020. Style datasets are subpart of the global.
Stocks are filtered according absolute and relative metrics for MCAP and ADV.
Data engineering for training is based on:
o Training on tails (extreme quantile from Label/fit cross section) training
o Outliers removals (from label and features)
o Low-coverage instance (row) removal
o Low-coverage feature (column) removal
• (~ 200) features, monthly normalised in percentile
• We use a rolling window of 5 years- 80% Training 20% Testing
From the ML output (probability of outperforming) we create a signal and we
construct portfolio from top/bottom decile (around 150 stocks in each portfolio).
Protocol for ML
39. Implementation ML vs trad. Signal blending
EW portfolios based on ML alpha. Selecting top decile.
Comparing against simple linear average of composite factor MF made of
Profitability, Volatility, Value, Momentum and Size. TRAD VS MODERN
Comparing against linear average of the top 15 most important feature.
PROXY FOR NON LINEARITY ADDED VALUE.
59. Machine Learning model debate: Deep Learning vs RestOfML,
Python vs R, Classification vs Regression etc..
What this presentation was NOT about
What type of dataset to use or not.
A lecture on “ML how to” for hyperparameters.
A full portfolio construction/optimisation with investment’s
constraints + risk management, trading implementation
constraints..
60. 53
Conclusion
• Machine learning is not new but a “new” way for doing research today.
• ML used with traditional data proved to add a non-linear adaptative
component to alpha prediction
• Matter of survival to be capable on onboarding, analysing and implementing
• Big/alt data in the investment toolbox.
61.
62. This material is addressed to professional clients for informative purposes only. It is neither an offer nor an invitation to buy or sell investment products and may not be interpreted as investment
advice. It is not intended to be distributed, published or used in a jurisdiction where such distribution, publication or use is forbidden, and is not intended for any person or entity to whom or
to which it would be illegal to address such a material. In particular, investment products are not offered for sale in the United States or its territories and possessions, nor to any US person
(citizens or residents of the United States of America). The opinions herein do not consider individual clients' circumstances, objectives, or needs. Before entering into any transaction, clients
are advised to form their own opinion and consult professional advisors to obtain an independent review of the specific risks incurred (tax, financial etc.). Upon request, RAM AI Group is
available to provide more information to clients on risks associated with investments. The information and analysis contained herein are based on sources deemed reliable. However, RAM AI
Group does not guarantee their accuracy, correctness or completeness, and it does not accept any liability for any loss or damage resulting from their use. All information and assessments
are subject to change without notice. Changes in exchange rates may cause the NAV per share in the investor's base currency to fluctuate. There is no guarantee to get back the full amount
invested. Past performances, whether actual or back-tested, are not necessarily indicative of future performance. Without prejudice of the due addressee’s own analysis, RAM understands
that this communication should be regarded as a minor non-monetary benefit according to MIFID regulations. Clients are invited to base their investment decisions on the most recent
prospectus, key investor information document (KIID) and financial reports which contain additional information relating to the investment product. These documents are available free of
charge from the SICAV’s and Management Company’s registered offices, its representative and distributor in Switzerland, RAM Active Investments S.A. and at Macard Stein & Co AG, Paying
and Information Agent in Germany; and at RAM Active Investments (Europe) SA – Succursale Milano in Italy. This marketing material has not been approved by any financial Authority, it is
confidential and addressed solely to its intended recipient; its total or partial reproduction and distribution are prohibited. Issued in Switzerland by RAM Active Investments S.A. which is
authorised and regulated in Switzerland by the Swiss Financial Market Supervisory Authority (FINMA). Issued in the European Union and the EEA by the Management Company RAM Active
Investments (Europe) S.A., 51 av. John F. Kennedy L-1855 Luxembourg, Grand Duchy of Luxembourg. The reference to RAM AI Group includes both entities, RAM Active Investments S.A.
and RAM Active Investments (Europe) S.A.
Important Information
65. Thank you!
Sri Krishnamurthy, CFA, CAP
Founder and CEO
QuantUniversity LLC.
srikrishnamurthy
www.QuantUniversity.com
Contact
Information, data and drawings embodied in this presentation are strictly a property of QuantUniversity LLC. and shall not be
distributed or used in any other publication without the prior written consent of QuantUniversity LLC.
11