Pragmatic
Algorithmic Auditing 1.0
Copyright 2021 QuantUniversity LLC.
Presented By:
Sri Krishnamurthy, CFA, CAP
sri@quantuniversity.com
www.quantuniversity.com
NYU Mathematical Finance
and Data Science Seminar
April 20, 2021
Speaker bio
• Advisory and Consultancy for Financial
Analytics
• Prior experience at MathWorks, Citigroup
and Endeca, and with 25+ financial services
and energy customers
• Columnist for the Wilmott Magazine
• Author of forthcoming book
“Pragmatic AI and ML in Finance”
• Teaches AI/ML and Fintech-related topics in
the MS and MBA programs at Northeastern
University, Boston
• Reviewer: Journal of Asset Management
Sri Krishnamurthy
Founder and CEO
QuantUniversity
QuantUniversity
• Boston-based Data Science, Quant
Finance and Machine Learning
training and consulting advisory
• Trained more than 1000 students in
Quantitative methods, Data Science
and Big Data Technologies using
MATLAB, Python and R
• Building a platform for AI
and Machine Learning
Experimentation
1. Algorithmic Auditing – Introduction
2. Algo auditing Frameworks
3. 5 things to note when auditing an algorithm
1. Use case
2. Data
3. Model
4. Environment
5. Process
4. Case study
Agenda
Introduction
Interest in Machine learning continues to grow
https://www.wipo.int/edocs/pubdocs/en/wipo_pub_1055.pdf
MACHINE LEARNING AND AI ARE REVOLUTIONIZING FINANCE
Machine Learning & AI in finance: A paradigm shift

Quant: Stochastic Models, Factor Models, Optimization, Risk Factors, P/Q Quants, Derivative Pricing, Trading Strategies, Simulations, Distribution Fitting

Data Scientist: Real-time Analytics, Predictive Analytics, Machine Learning, RPA, NLP, Deep Learning, Computer Vision, Graph Analytics, Chatbots, Sentiment Analysis, Alternative Data
Algorithm Audits in the news
Machine Learning Workflow

Stages: Data Scraping/Ingestion → Data Exploration → Data Cleansing and Processing → Feature Engineering → Modeling → Model Evaluation & Tuning → Model Selection → Model Deployment/Inference

• Modeling (supervised): Regression, KNN, Decision Trees, Naive Bayes, Neural Networks, Ensembles
• Modeling (unsupervised): Clustering, PCA, Autoencoder
• Evaluation & tuning: RMSE, MAPE, MAE, Confusion Matrix, Precision/Recall, ROC; hyper-parameter tuning, parameter grids; AutoML, model validation, interpretability
• Deployment/Inference: SW: Web/REST API; HW: GPU, Cloud; monitoring
• Orchestration: Robotic Process Automation (RPA) (microservices, pipelines)

Roles: Data Engineer/DevOps Engineer; Data Scientist/Quants; Software/Web Engineer; Risk Management/Compliance (all stages); Analysts & Decision Makers
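The workflow stages above can be sketched, in miniature, as composable Python steps. Everything below is an illustrative assumption: the toy loan records, the threshold "model", and the function names are not part of the original workflow, only a way to see the stages connect.

```python
# A miniature sketch of the workflow stages as composable steps.
# The toy dataset, the threshold "model", and all names are illustrative.

def ingest():
    # Data Scraping / Ingestion: here, a hard-coded toy loan dataset
    return [{"income": 50, "age": 30, "default": 0},
            {"income": 20, "age": 45, "default": 1},
            {"income": None, "age": 25, "default": 0}]

def clean(records):
    # Data Cleansing: drop records with missing fields
    return [r for r in records if all(v is not None for v in r.values())]

def engineer(records):
    # Feature Engineering: add a simple derived feature
    for r in records:
        r["income_to_age"] = r["income"] / r["age"]
    return records

def train(records):
    # Modeling (supervised): a trivial threshold "model"
    threshold = sum(r["income"] for r in records) / len(records)
    return lambda r: 1 if r["income"] < threshold else 0

def evaluate(model, records):
    # Model Evaluation: plain accuracy on the training records
    return sum(model(r) == r["default"] for r in records) / len(records)

rows = engineer(clean(ingest()))
model = train(rows)
accuracy = evaluate(model, rows)
```

An auditor can walk such a pipeline stage by stage, which is exactly how the checklist later in this deck is organized.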
• Algorithmic auditing is a structured process, conducted internally or
by a qualified independent third party, that involves:
▫ Verifying and/or validating the working of the algorithm, along with the
data, model, environment and process, in the context of the use cases in
which the algorithm is intended to be used.
▫ Identifying issues that are clearly articulated and scoped for the
algorithm. Criteria could include:
– Bias, fairness, discrimination, explainability, interpretability, etc.
▫ Documenting the algorithm’s behavior and uses as observed and
evaluated by a qualified individual.
▫ Recommending mitigation, control and elimination of noted risks.
Algorithmic Auditing
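Criteria such as bias and fairness can be made concrete with simple metrics. As one illustration (the groups and predictions below are made up for the sketch), demographic parity compares selection rates across a protected attribute:

```python
# Illustrative fairness check: demographic parity on toy predictions.
# The predictions, groups, and the choice of metric are assumptions.

predictions = [1, 0, 1, 1, 0, 0, 1, 0]                  # 1 = approved
groups      = ["A", "A", "A", "A", "B", "B", "B", "B"]  # protected attribute

def selection_rate(preds, grps, group):
    # Fraction of positive outcomes within one group
    members = [p for p, g in zip(preds, grps) if g == group]
    return sum(members) / len(members)

rate_a = selection_rate(predictions, groups, "A")  # 3/4 = 0.75
rate_b = selection_rate(predictions, groups, "B")  # 1/4 = 0.25
parity_gap = abs(rate_a - rate_b)                  # 0.5: a gap worth flagging
```

Demographic parity is only one of many fairness criteria; an audit should state which criterion was chosen and why, since different criteria can conflict.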
Review this:
https://www2.deloitte.com/content/dam/insights/us/articles/4767_FoW-in-govt/DI_Algorithm-auditor.pdf
How would you structure an algorithmic audit?
• Typically after the model is done
• Independence
• Subject matter expertise
• Disclosure
• Legal/Industry standard
External
• During or before model deployment
• Due diligence
• Policy – Model risk/Governance
• Best practices
• Workflow
Internal
Why an algorithmic audit?
• Fraud detection
• Credit decision
• Facial recognition
Potential Systemic issues
• Blackbox models
• Vendor models
• Proprietary models
Transparency
• Methods used for the decision making process
• Security and Privacy
• Use of variables and data
Accountability
• SMACTR (Google, Partnership on AI)
• SAI: Supreme Audit Institutions (Finland, Germany, the Netherlands,
Norway and the UK)
• ICO: UK
• TUV Austria: Trusted Artificial Intelligence white paper
AI Auditing frameworks
• Scoping
• Mapping
• Artifact Collection
• Testing
• Reflection
SMACTR
Ref: “Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing”, https://arxiv.org/abs/2001.00973
Toward Trustworthy AI Development: Mechanisms for
Supporting Verifiable Claims
Ref: https://arxiv.org/pdf/2004.07213.pdf
Auditing machine learning algorithms: A white paper for
public auditors
https://www.auditingalgorithms.net/executive-summary.html
Main general problem areas and risks:
• Developers of ML algorithms will often focus on optimising specific numeric
performance metrics. As a result, there is a high risk that requirements of
compliance, transparency and fairness are neglected.
• Product owners within the auditee organisation might not communicate their
requirements well to ML developers, leading to ML algorithms that could, in a worst
case scenario, increase costs and make routine tasks more time-consuming.
• Auditee organisations often lack the resources and competence to develop ML
applications internally and thus rely on consultants or procure ready-made solutions
from commercial businesses. This increases the risk of using ML without the
understanding necessary both for ML-based production/maintenance and
compliance requirements.
• There is significant uncertainty among public-sector entities in the MoU member
states about the use of personal data in ML models. While the data protection
agencies have begun to issue guidelines, organisational regulatory structures are not
necessarily in place and accountability tends to be unclarified.
• Auditors need a good understanding of the high-level principles of ML
algorithms and up-to-date knowledge of the rapid technical
developments in this field; this is sufficient to perform a baseline audit
by reviewing the respective documentation of an ML system.
• For a thorough audit that includes substantial tests, auditors need to
understand common coding languages and model implementations, and
be able to use appropriate software tools.
• ML-related IT infrastructure often includes cloud-based solutions due to
the high demand for computing power. Therefore, auditors need a basic
understanding of cloud services for this kind of audit work.
• This paper reaches the following conclusions and recommendations for SAIs:
• SAIs should be able to audit ML-based AI applications in order to fulfil their statutory
mission and to assess whether use of ML contributes to efficient and effective public
services, in compliance with relevant rules and regulations.
• ML audits require special auditor knowledge and skills, and SAIs should build up the
competence of their auditors.
• The ML audit catalogue and helper tool proposed in this paper have been tested in
our case studies and may be used as templates. They are living documents and thus
should be refined by application to more cases and to more diverse cases, and
consistently updated with new AI research results.
• SAIs should build up their capacities to perform more ML audit work.
• The authors hope that the guidance and good practices provided within this paper,
alongside the audit helper tool, will enable the international audit community to
begin auditing ML.
ICO - UK
https://ico.org.uk/media/for-organisations/guide-to-data-protection/key-data-protection-themes/guidance-on-ai-and-data-protection-0-0.pdf
The framework:
• gives us a clear methodology to audit AI applications and ensure
they process personal data fairly, lawfully and transparently;
• ensures that the necessary measures are in place to assess and
manage risks to rights and freedoms that arise from AI;
• and supports the work of our investigation and assurance teams
when assessing the compliance of organisations using AI.
The framework output:
• Auditing tools and procedures which our investigation and
assurance teams will use when assessing the compliance of
organisations using AI. The specific auditing and investigation
activities they undertake vary, but can include off-site checks, on-
site tests and interviews, and in some cases the recovery and
analysis of evidence, including AI systems themselves.
• This detailed guidance on AI and data protection for organisations,
which outlines our thinking.
• A toolkit designed to provide further practical support to
organisations auditing the compliance of their own AI systems
TUV-Austria
https://www.tuv.at/loesungen/digital-services/trusted-ai/
Questions to ask:
• Do we really need this algorithm?
• How will this algorithm be used?
• Who/What will it affect?
1. Use cases are important
Things to think about:
• How much data do we have?
• How will this affect the model?
• Do we have enough data?
• Are there privacy concerns?
2. Don’t forget the data
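A few of these data questions can be checked mechanically before any modeling starts. A minimal sketch, using an illustrative toy dataset and an assumed list of sensitive fields:

```python
# Quick data-audit checks in plain Python: volume, missingness, and
# presence of potentially sensitive fields. The dataset is illustrative.

records = [
    {"age": 34, "zip": "02139", "income": 72000},
    {"age": None, "zip": "02139", "income": 51000},
    {"age": 29, "zip": "10001", "income": None},
]

n = len(records)  # do we have enough data for the intended model?
missing_by_field = {
    field: sum(1 for r in records if r[field] is None)
    for field in records[0]
}

# Fields like ZIP code can proxy for protected attributes: a privacy flag.
SENSITIVE_FIELDS = {"zip"}
privacy_flags = [f for f in records[0] if f in SENSITIVE_FIELDS]
```

In a real audit these checks would run against the full training and production datasets, and the sensitive-field list would come from the organisation's data-protection policy.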
Not all scenarios have played out:
• Stress scenarios
• What-if scenarios
Challenges with real datasets
Figure ref: http://www.actuaries.org/CTTEES_SOLV/Documents/StressTestingPaper.pdf
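A what-if scenario can be as simple as re-scoring the model under shocked inputs. The toy scoring function and the particular shocks (a 30% income drop, 50% more debt) below are assumptions chosen for illustration:

```python
# Stress-test sketch: compare a model's output under a baseline and a
# stressed scenario. The scoring function and shocks are illustrative.

def score(income, debt):
    # Toy credit score: higher income and lower debt raise the score,
    # clipped to the [0, 1] range.
    return max(0.0, min(1.0, 0.5 + 0.01 * income - 0.02 * debt))

base = score(income=40, debt=10)                   # baseline scenario
stressed = score(income=40 * 0.7, debt=10 * 1.5)   # shocked scenario
impact = base - stressed                           # sensitivity to the shock
```

Running a grid of such scenarios shows how the model behaves in regions the historical data never covered, which is exactly the gap stress scenarios are meant to probe.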
Questions to ask:
• Blackbox/Whitebox?
• Does the model work?
• How do we handle imbalanced classes?
• Is it fair/biased?
• Can you explain the model?
3. Model Audit
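The imbalanced-class question is worth making concrete: on a toy set that is 90% negatives, a model that always predicts the majority class scores 90% accuracy yet has zero recall. A minimal sketch (the data is made up):

```python
# Model-audit sketch: confusion-matrix counts and recall on an
# imbalanced toy dataset, showing why accuracy alone misleads.

y_true = [0] * 9 + [1] * 1   # 90% negatives, 10% positives
y_pred = [0] * 10            # degenerate model: always predict majority

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))

accuracy = (tp + tn) / len(y_true)              # 0.9: looks fine
recall = tp / (tp + fn) if (tp + fn) else 0.0   # 0.0: the audit red flag
```

This is why a model audit should look at the full confusion matrix and class-sensitive metrics (precision/recall, ROC), not a single headline number.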
4. Environment Audit: Where will the model run?
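An environment audit starts with recording where the model actually runs, so results can be reproduced later. A standard-library sketch of a minimal environment manifest (the fields chosen here are illustrative; a real manifest would also capture library versions and hardware such as GPUs):

```python
# Environment-audit sketch: capture a minimal manifest of the runtime
# environment using only the Python standard library.
import platform
import sys

environment_manifest = {
    "python_version": sys.version.split()[0],  # e.g. "3.11.4"
    "os": platform.system(),                   # e.g. "Linux"
    "machine": platform.machine(),             # e.g. "x86_64"
}
```

Storing such a manifest alongside each trained model makes "where will the model run?" an answerable, auditable question rather than tribal knowledge.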
5. Pipeline audit
(Machine Learning Workflow diagram; see the workflow slide above.)
The Algo-Audit Checklist
(Machine Learning Workflow diagram; see the workflow slide above.)
Thank you!
Sri Krishnamurthy, CFA, CAP
Founder and CEO
QuantUniversity LLC.
srikrishnamurthy
www.QuantUniversity.com
Contact
Information, data and drawings embodied in this presentation are strictly a property of QuantUniversity LLC. and shall not be
distributed or used in any other publication without the prior written consent of QuantUniversity LLC.