The document describes an algebraic attack on the KeeLoq block cipher, which is used in car keyless entry systems and has 528 rounds with a 64-bit key. The attack models the encryption rounds as a system of equations and uses a SAT solver to recover the key from known plaintext-ciphertext pairs; the algebraic normal form (ANF) equations are converted to the conjunctive normal form (CNF) that the SAT solver requires.
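The ANF-to-CNF step can be illustrated with the standard Tseitin encoding, which turns a single AND constraint into three CNF clauses. The sketch below is not from the document; it just verifies that encoding by brute force:

```python
from itertools import product

def tseitin_and(x, y, z):
    """CNF clauses (lists of signed literals) encoding z <-> (x AND y)."""
    return [[-x, -y, z], [x, -z], [y, -z]]

def satisfies(clauses, assign):
    # A literal v is true when assign[|v|] matches its sign.
    return all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses)

# Check the encoding against the full truth table of AND.
clauses = tseitin_and(1, 2, 3)
for x, y, z in product([False, True], repeat=3):
    assert satisfies(clauses, {1: x, 2: y, 3: z}) == (z == (x and y))
print("Tseitin AND encoding verified")
```

A full attack applies this gate-by-gate to the cipher's round function and hands the resulting clauses to a SAT solver.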
This is a good question set for the CCAT exam and also for the CCEE.
For more details, please visit:
http://acts.cdac.in
http://cdacguru.wordpress.com
http://fb.com/cdacguru
The document discusses attribute-based encryption (ABE) schemes, including Key-Policy ABE (KP-ABE) and Ciphertext-Policy ABE (CP-ABE). It defines the components of KP-ABE and CP-ABE, including setup, encryption, key generation, and decryption algorithms. It also describes the security models and proves the selective security of the GPSW KP-ABE scheme and correctness of the Waters CP-ABE scheme under the decisional bilinear Diffie-Hellman assumption. The document outlines the KP-ABE and CP-ABE constructions and security proofs in detail.
Predicting organic reaction outcomes with Weisfeiler-Lehman network (Kazuki Fujikawa)
This document discusses neural message passing networks for modeling quantum chemistry. It defines message passing networks as having message functions that compute messages from neighboring node states, vertex update functions that update node states based on accumulated messages, and a readout function that produces an output for the full graph. It provides examples of specific message, update, and readout functions used in existing message passing models like interaction networks and molecular graph convolutions.
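A minimal sketch of one message-passing round, with the simplest possible choices for each of the three functions (sum the neighbor states, add them to the node state, sum over nodes). The graph and values are purely illustrative:

```python
# Toy one-round message passing on a small undirected graph.
adj = {0: [1, 2], 1: [0], 2: [0]}      # adjacency lists
h = {0: 1.0, 1: 2.0, 2: 3.0}           # initial node states

def mp_round(adj, h):
    msgs = {v: sum(h[u] for u in adj[v]) for v in adj}   # message function
    return {v: h[v] + msgs[v] for v in adj}              # vertex update

def readout(h):
    return sum(h.values())             # graph-level output

h1 = mp_round(adj, h)
print(h1, readout(h1))
```

Real models replace each function with a learned neural network, but the dataflow is the same.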
Introducing R package ESG at Rmetrics Paris 2014 conference (Thierry Moudiki)
The document describes the ESG package in R, which provides tools for generating economic scenarios for use in insurance valuation and capital requirements calculations. It discusses the available risk factors in ESG, including nominal interest rates, equity returns, property returns, and corporate bond returns. It also outlines the package's object-oriented structure and provides examples of how to use ESG to simulate scenarios and calculate insurance liabilities and capital requirements.
This document discusses the implementation of digital filters in fixed-point arithmetic on embedded systems. It presents the need for methodology and tools to design fixed-point embedded filter systems. The key steps are: 1) choosing a filter algorithm, 2) rounding coefficients to fixed-point, and 3) implementing the algorithm. Optimal implementations minimize degradation from quantization errors while meeting resource constraints. The document outlines a global flow from filter design to code generation and optimization.
NIPS2017 Few-shot Learning and Graph Convolution (Kazuki Fujikawa)
The document discusses meta-learning and prototypical networks for few-shot learning. It introduces prototypical networks, which learn a metric space such that classification can be performed by finding the nearest class prototype to a query example in embedding space. The document summarizes results on few-shot image classification benchmarks like Omniglot and miniImageNet, finding that prototypical networks achieve state-of-the-art performance.
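The nearest-prototype rule described here can be sketched in a few lines; the "embeddings" below are toy 2-D vectors rather than learned features:

```python
import math

def prototype(examples):
    """Class prototype = mean of the support embeddings."""
    dim = len(examples[0])
    return [sum(e[i] for e in examples) / len(examples) for i in range(dim)]

def classify(query, protos):
    """Label of the nearest prototype under Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(protos, key=lambda label: dist(query, protos[label]))

support = {"cat": [[0.0, 0.0], [0.2, 0.0]], "dog": [[1.0, 1.0], [1.0, 0.8]]}
protos = {label: prototype(exs) for label, exs in support.items()}
print(classify([0.1, 0.1], protos))
```

In the actual method, the embedding function is trained so that this simple rule works well on held-out classes.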
Kirby Urner discusses using Python to teach mathematics through programming and storytelling. Some key ideas include using Python to demonstrate mathematical concepts like functions, objects, algorithms, and data structures. Examples shown include generating sequences, animating polyhedral numbers, and building mathematical objects in Python. The document concludes that programming in Python can help build a stronger understanding of mathematics compared to specialized learning languages.
This document discusses speaker diarization, which is the process of segmenting an audio stream into homogeneous segments according to speaker identity. It covers feature extraction methods like MFCCs, segmentation using Bayesian Information Criteria to compare Gaussian mixture models, and clustering algorithms like k-means and hierarchical agglomerative clustering. Dendrogram visualizations are used to identify natural speaker clusters. The overall goal is to partition audio recordings of discussions or debates into homogeneous segments to attribute speech segments to individual speakers.
The document describes Identity-Based Encryption (IBE), including the key algorithms involved: Setup, Extract, Encrypt, Decrypt. It explains that IBE allows encrypting messages for arbitrary string identities like email addresses, without needing a public key. The PKG runs Setup to generate parameters and a master key, and Extract to generate private keys from identities. Encrypt uses an identity and message to create a ciphertext, while Decrypt uses the private key to recover the message. Applications include key revocation and delegation. Security relies on bilinear pairings on elliptic curves.
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME (HONGJOO LEE)
A 45-minute talk about collecting home network performance measures, analyzing and forecasting time series data, and building an anomaly detection system.
In this talk, we will go through the whole process of data mining and knowledge discovery. Firstly we write a script to run speed test periodically and log the metric. Then we parse the log data and convert them into a time series and visualize the data for a certain period.
Next we conduct some data analysis: finding trends, forecasting, and detecting anomalous data. Several statistical and deep learning techniques are used for the analysis, including ARIMA (Autoregressive Integrated Moving Average) and LSTM (Long Short-Term Memory).
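As a much simpler stand-in for the ARIMA/LSTM models the talk uses, a rolling z-score baseline illustrates the anomaly-detection step; the speed values are made up:

```python
import statistics

def anomalies(series, window=5, threshold=3.0):
    """Flag indices whose value deviates from the trailing-window mean
    by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(series)):
        hist = series[i - window:i]
        mu, sd = statistics.mean(hist), statistics.pstdev(hist)
        if sd > 0 and abs(series[i] - mu) > threshold * sd:
            flagged.append(i)
    return flagged

speeds = [94, 95, 93, 96, 94, 95, 40, 94, 96]   # Mbps, with one dropout
print(anomalies(speeds))
```

The model-based approaches in the talk improve on this by forecasting the expected value instead of assuming a flat mean.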
Cloud computing is an ever-growing field in today's era. With the accumulation of data and the advancement of technology, a large amount of data is generated every day. Storage, availability, and security of the data are major concerns in the field of cloud computing. This paper focuses on homomorphic encryption, which is widely used to secure data in the cloud. Homomorphic encryption is a technique of encryption in which specific operations can be carried out on the encrypted data. The data is stored on a remote server, and the task is to operate on the encrypted data. There are two types of homomorphic encryption: fully homomorphic encryption and partially homomorphic encryption. Fully homomorphic encryption allows arbitrary computation on the ciphertext in a ring, while partially homomorphic encryption allows only addition or multiplication operations to be carried out on the ciphertext. Homomorphic encryption plays a vital role in cloud computing, as companies' encrypted data is stored in a public cloud, taking advantage of the cloud provider's services. Various algorithms and methods of homomorphic encryption that have been proposed are discussed in this paper.
This document contains a 20 question mock exam for the GATE exam. It provides instructions that each question is worth 1 mark, unanswered questions receive 0 marks and incorrect answers receive negative marks. It then lists 20 multiple choice questions related to computer science topics like operating systems, algorithms, data structures, computer networks and formal languages. For each question there are 4 possible answer choices and space to write the answer.
This document discusses formatting bits to better implement signal processing algorithms with integer arithmetic. It begins by introducing the context and objectives, which is to develop a methodology and tools to implement embedded filter algorithms using only integer arithmetic while controlling errors. It then discusses fixed-point arithmetic and how filters can be implemented using sum-of-products operations. The objective is, given a bound on the final error, to find an implementation that reduces bit usage while controlling output error. The document proposes a two-step bit formatting method that first formats the most significant bits using Jackson's rule, then determines the minimum number of least significant bits that need to be kept to ensure faithful rounding of the final result.
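The coefficient-rounding part of this flow can be sketched with Q15 fixed-point format; the coefficients below are illustrative, not taken from the document:

```python
# Round filter coefficients to Q15 and measure the quantization error.
FRAC_BITS = 15
SCALE = 1 << FRAC_BITS       # Q15: 1 sign bit, 15 fractional bits

def to_q15(x):
    return round(x * SCALE)  # stored as a plain integer

def from_q15(i):
    return i / SCALE

coeffs = [0.1, 0.15, 0.5, 0.15, 0.1]        # example FIR taps in [-1, 1)
quantized = [to_q15(c) for c in coeffs]
errors = [abs(c - from_q15(v)) for c, v in zip(coeffs, quantized)]
print(quantized)
print(max(errors) <= 1 / (2 * SCALE))       # rounding error <= half an LSB
```

Bounding this per-coefficient error is what lets the method propagate a guaranteed bound to the filter output.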
Elliptic curve cryptography (ECC) uses elliptic curves over finite fields for encryption, digital signatures, and key exchange. It provides the same security as RSA or discrete logarithm schemes but with smaller key sizes (e.g. 256-bit ECC vs. 3072-bit RSA). ECC algorithms are also faster and use less energy than other schemes. While ECC offers advantages, security relies on using cryptographically strong elliptic curves and there is no deterministic method to encode messages as curve points.
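The group operation behind all of this can be made concrete with point addition and doubling on a tiny textbook curve, y^2 = x^3 + 2x + 2 over GF(17) — far too small for real security, purely for illustration:

```python
# Affine point addition/doubling on y^2 = x^3 + 2x + 2 over GF(17).
P_MOD, A = 17, 2

def ec_add(P, Q):
    if P is None: return Q          # None is the point at infinity
    if Q is None: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if P == Q:                      # tangent slope for doubling
        s = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:                           # chord slope for addition
        s = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    return (x3, (s * (x1 - x3) - y1) % P_MOD)

G = (5, 1)
print(ec_add(G, G))     # 2G = (6, 3) on this curve
```

Repeated addition (scalar multiplication) of a base point like G is the one-way operation that ECC key exchange and signatures rely on.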
Lattice-Based Cryptography: Cryptanalysis of Compact-LWE (Priyanka Aash)
Destructive and constructive methods in lattice-based cryptography will be discussed. Topic 1: Cryptanalysis of Compact-LWE Authors: Jonathan Bootle; Mehdi Tibouchi; Keita Xagawa Topic 2: Two-message Key Exchange with Strong Security from Ideal Lattices Authors: Zheng Yang; Yu Chen; Song Luo
(Source: RSA Conference USA 2018)
This document summarizes Chapter 13 of Liang's Introduction to Java Programming textbook. It discusses graphics and drawing in Java, including:
- Java's coordinate system and how each GUI component has its own coordinate space
- Using the Graphics class to draw strings, lines, rectangles, ovals, arcs and polygons
- Overriding the paintComponent method to draw on a component
- Examples of drawing different shapes and using classes like FigurePanel and MessagePanel
B.Sc.IT: Semester - VI (April - 2015) [IDOL - Revised Course | Question Paper] (Mumbai B.Sc.IT Study)
Discrete Logarithmic Problem - Basis of Elliptic Curve Cryptosystems (NIT Sikkim)
ECC was developed in 1985 independently by Neal Koblitz and Victor Miller. Both men saw the application of the elliptic curve discrete log problem (ECDLP) as a replacement for the conventional discrete log problem (DLP) used in DSA, and for the integer factorization problem found in RSA. For both of those problems, sub-exponential solutions have been found; the same cannot be said for the ECDLP. In addition to offering increased security for a smaller key size, the point addition and doubling operations can be optimized successfully on a mobile platform. ECC offers a viable replacement for the most common public-key cryptography algorithms on mobile devices.
In this paper, we present a complete digital signature message stream, just the way the RSA digital signature scheme does it. We focus on operations with large numbers, because operating with large numbers is the essence of RSA and cannot be conveyed by the usual illustrative examples with small numbers [1].
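Python's arbitrary-precision integers make the large-number arithmetic easy to sketch. The hash-then-sign flow below uses an illustrative small key (real keys are 2048 bits or more), not one from the paper:

```python
import hashlib

# Textbook RSA sign/verify with a toy key; pow() handles the
# large-number modular arithmetic.
p, q = 1009, 2003
n, phi = p * q, (p - 1) * (q - 1)
e = 65537
d = pow(e, -1, phi)               # private exponent (Python 3.8+)

def h(msg):                       # hash reduced into Z_n
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

def sign(msg):        return pow(h(msg), d, n)
def verify(msg, sig): return pow(sig, e, n) == h(msg)

sig = sign(b"hello")
print(verify(b"hello", sig))
```

Only the key size changes between this toy and a production signature stream; the modular exponentiations are identical in form.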
20101017 program analysis_for_security_livshits_lecture02_compilers (Computer Science Club)
This document provides an introduction and overview of compiler optimization techniques, including:
1) Flow graphs, constant folding, global common subexpressions, induction variables, and reduction in strength.
2) Data-flow analysis basics like reaching definitions, gen/kill frameworks, and solving data-flow equations iteratively.
3) Pointer analysis using Andersen's formulation to model references between local variables and heap objects. Rules are provided to represent points-to relationships.
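The first optimization listed, constant folding, can be sketched over Python's own AST; the function name and scope here are my own, not the lecture's:

```python
import ast
import operator

# Foldable binary operators.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.FloorDiv: operator.floordiv}

def fold_constants(expr):
    """Constant-fold an arithmetic expression string (Python 3.9+ for
    ast.unparse); variables are left untouched."""
    tree = ast.parse(expr, mode="eval")

    class Folder(ast.NodeTransformer):
        def visit_BinOp(self, node):
            self.generic_visit(node)          # fold sub-expressions first
            if (type(node.op) in OPS
                    and isinstance(node.left, ast.Constant)
                    and isinstance(node.right, ast.Constant)):
                return ast.Constant(OPS[type(node.op)](node.left.value,
                                                       node.right.value))
            return node

    return ast.unparse(Folder().visit(tree))

print(fold_constants("x + 2 * 3 + (4 - 1)"))
```

A production compiler does the same rewrite on its IR, typically combined with the data-flow analyses described above to fold variables proven constant.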
Atomic algorithm and the servers' use to find the Hamiltonian cycles (IJERA Editor)
Inspired by the movement of the particles in the atom, I demonstrated in [5] the existence of a polynomial algorithm of order O(n^3) for finding Hamiltonian cycles in a graph with basis E = {x_0, ..., x_{n-1}}. In this article I give an improvement, in space and in time, of that algorithm. Several methods exist to find Hamiltonian cycles, such as the Monte Carlo method, dynamic programming, or DNA computing; unfortunately, they are either expensive or slow to execute. Hence the idea of using multiple servers to solve this problem: each point x_i in the graph is considered as a server, and each server x_i communicates with every server x_j to which it is connected. Finally, the server x_0 receives and displays the Hamiltonian cycles if they exist.
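For comparison with the server-based scheme, a plain backtracking search (one of the conventional baselines, not the atomic algorithm itself) can be sketched as:

```python
def hamiltonian_cycle(adj):
    """Backtracking search; returns a Hamiltonian cycle as a vertex list
    starting and ending at vertex 0, or None if none exists."""
    n = len(adj)
    path = [0]
    visited = {0}

    def extend():
        if len(path) == n:
            return path[0] in adj[path[-1]]      # can we close the cycle?
        for v in adj[path[-1]]:
            if v not in visited:
                path.append(v); visited.add(v)
                if extend():
                    return True
                path.pop(); visited.remove(v)    # backtrack
        return False

    return path + [0] if extend() else None

# 4-cycle 0-1-2-3-0 with a chord 0-2.
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
print(hamiltonian_cycle(adj))
```

This runs in exponential time in the worst case, which is exactly the cost the paper's distributed approach aims to avoid.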
My presentation at University of Nottingham "Fast low-rank methods for solvin..." (Alexander Litvinenko)
Overview of my (with co-authors) low-rank tensor methods for solving PDEs with uncertain coefficients. Connection with Bayesian Update. Solving a coupled system: stochastic forward and stochastic inverse.
Nonlinear analysis of fixed support beam with hinge by hinge method in c prog... (Salar Delavar Qashqai)
This C program performs a nonlinear analysis of a fixed support beam with hinge using the hinge method. It includes functions for importing data, generating stiffness matrices, performing matrix operations like inversion and multiplication, calculating element internal forces, and outputting results to text, MATLAB, and Excel files. The main function calls the various analysis functions and outputs initial data, analysis reports, and any error messages.
Tensorflow in practice by Engineer - donghwi cha (Donghwi Cha)
- This is an introduction to the machine learning framework TensorFlow, covering key concepts like computation graphs, operations, sessions, training, replication, and clustering.
- Key aspects discussed include how Tensorflow executes operations as a static computation graph, uses sessions to run graphs and tensors to hold values, and supports data parallelism through replication across devices/workers.
- The document provides examples of building neural network models in Tensorflow and discusses techniques for training models like backpropagation and distributing training using data parallelism.
To describe the dynamics taking place in networks that structurally change over time, we propose an approach to search for attributes whose value changes impact the topology of the graph. In several applications, it appears that the variations of a group of attributes are often followed by some structural changes in the graph that one may assume they generate. We formalize the triggering pattern discovery problem as a method jointly rooted in sequence mining and graph analysis. We apply our approach on three real-world dynamic graphs of different natures - a co-authoring network, an airline network, and a social bookmarking system - assessing the relevancy of the triggering pattern mining approach.
A Signature Algorithm Based On Chaotic Maps And Factoring Problems (Sandra Long)
This document describes a new digital signature algorithm based on chaotic maps and factorization problems. It consists of three main phases:
1) System initialization which defines parameters like cryptographic hash function, large prime numbers p and q, element a of order n in GF(p), and multiplicative group G generated by a.
2) Key generation where the signer selects private keys d, x and computes public keys e, y using chaotic maps and modular arithmetic.
3) Signature generation where the signer selects a random number r, computes intermediate values using chaotic maps and factorization, and outputs the signature (v1, v2, S) for the hashed message. The security relies on the difficulty of simultaneously solving
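Chaotic-map schemes of this kind typically rest on the semigroup property of Chebyshev polynomials, T_r(T_s(x)) = T_{rs}(x) mod p. A small sketch of that property, with illustrative parameters not taken from the paper:

```python
def cheb(n, x, p):
    """Chebyshev polynomial T_n(x) mod p via the recurrence
    T_0 = 1, T_1 = x, T_n = 2*x*T_{n-1} - T_{n-2}."""
    t0, t1 = 1, x % p
    if n == 0:
        return t0
    for _ in range(n - 1):
        t0, t1 = t1, (2 * x * t1 - t0) % p
    return t1

# Semigroup property: composing maps commutes with multiplying indices.
p, x, r, s = 1009, 7, 5, 11
lhs = cheb(r, cheb(s, x, p), p)
rhs = cheb(r * s, x, p)
print(lhs == rhs)
```

This commutativity is what lets two parties (or a signer and verifier) arrive at the same value from different orderings of secret indices.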
Fast Identification of Heavy Hitters by Cached and Packed Group Testing (Rakuten Group, Inc.)
The document summarizes a research paper on efficiently identifying heavy hitters in data streams using cached and packed group testing techniques. The paper proposes using packed bidirectional counter arrays to implement the operations of combinatorial group testing (CGT) in constant time. This improves the time complexity of CGT for updating frequencies and querying heavy hitters from O(log(n)) to O(1), eliminating dependency on the size of the data universe n. Experimental results show the proposed method achieves competitive precision, update throughput, and query throughput compared to existing CGT and hierarchical count-min sketch approaches.
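As a reference point for the count-min-style baselines mentioned above, a minimal count-min sketch (parameters and hashing choices are illustrative) looks like this:

```python
import random

class CountMin:
    """Minimal count-min sketch: point queries take the minimum over
    one counter per row, so estimates never undercount."""
    def __init__(self, width=256, depth=4, seed=42):
        rnd = random.Random(seed)
        self.width = width
        self.salts = [rnd.getrandbits(32) for _ in range(depth)]
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        for row, salt in enumerate(self.salts):
            yield row, hash((salt, item)) % self.width

    def update(self, item, count=1):
        for row, col in self._cells(item):
            self.table[row][col] += count

    def query(self, item):
        return min(self.table[row][col] for row, col in self._cells(item))

cm = CountMin()
for _ in range(1000):
    cm.update("heavy")
cm.update("light")
print(cm.query("heavy") >= 1000, cm.query("light") >= 1)
```

The paper's contribution is making the group-testing bookkeeping on top of such counters O(1) per update and query, independent of the universe size.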
The document proposes a "blind coupon mechanism" (BCM) to spread signals or rumors quickly in a network while preventing an adversary from identifying the source or presence of the signal. The BCM uses an abstract group structure and instantiates it using elliptic curves over Z_n or bilinear groups. It allows processes to spread coupons by continually broadcasting and combining received coupons with their own, in a way that an adversary cannot distinguish dummy from signal coupons or forge new signal coupons.
This document contains a 20 question multiple choice quiz on computer science topics. The questions cover areas like algorithms, data structures, complexity analysis, logic, automata theory and databases. Sample questions ask about the minimum number of multiplications needed to evaluate a polynomial, the expected value of the smallest number in a random sample, and the recovery procedure after a database system crash during transaction logging.
This presentation provides an overview of decision-making in organisations and introduces a new language called ESL that uses actors to create an executable model that can be analysed. A number of small examples of ESL are shown. The presentation concludes with a larger case study that addresses the recent demonetisation event in India.
This document discusses algorithms and analysis of algorithms. It covers key concepts like time complexity, space complexity, asymptotic notations, best case, worst case and average case time complexities. Examples are provided to illustrate linear, quadratic and logarithmic time complexities. Common sorting algorithms like quicksort, mergesort, heapsort, bubblesort and insertionsort are summarized along with their time and space complexities.
This document discusses pointcuts and static analysis in aspect-oriented programming. It provides an example of using aspects to ensure thread safety in Swing by wrapping method calls in invokeLater. It proposes representing pointcuts as relational queries over a program representation, and rewriting pointcuts as Datalog queries for static analysis. Representing programs and pointcuts relationally in this way enables precise static analysis of crosscutting concerns.
The document summarizes key concepts from Chapter 8 of the textbook "Fundamentals of Multimedia" on lossy compression algorithms. It introduces lossy compression and discusses distortion measures, rate-distortion theory, quantization techniques including uniform, non-uniform, and vector quantization. It also covers transform coding techniques such as the discrete cosine transform and its use in image compression standards to remove spatial redundancies by transforming pixel values into frequency coefficients.
This document presents a framework for verifying the safety of classification decisions made by deep neural networks. It defines safety as the network producing the same output classification for an input and any perturbations of that input within a bounded region. The framework uses satisfiability modulo theories (SMT) to formally verify safety by attempting to find an adversarial perturbation that causes misclassification. It has been tested on several image classification networks and datasets. The framework provides a method to automatically verify safety properties of deep neural networks.
This document appears to be an exam for a course on machine learning or artificial intelligence. It contains instructions for taking the exam, which is closed book and does not allow calculators. It then lists 6 multiple choice or short answer questions worth various point values that assess knowledge of topics like linear separators, sources of error in machine learning models, designing neural network architectures, and radial basis feature transformations.
Noise Contrastive Estimation-based Matching Framework for Low-Resource Securi... (Tu Nguyen)
Tactics, Techniques and Procedures (TTPs) represent sophisticated attack patterns in the cybersecurity domain, described encyclopedically in textual knowledge bases. Identifying TTPs in cybersecurity writing, often called TTP mapping, is an important and challenging task. Conventional learning approaches often target the problem in the classical multi-class or multilabel classification setting. This setting hinders the learning ability of the model due to a large number of classes (i.e., TTPs), the inevitable skewness of the label distribution and the complex hierarchical structure of the label space. We formulate the problem in a different learning paradigm, where the assignment of a text to a TTP label is decided by the direct semantic similarity between the two, thus reducing the complexity of competing solely over the large labeling space. To that end, we propose a neural matching architecture with an effective sampling-based learn-to-compare mechanism, facilitating the learning process of the matching model despite constrained resources.
This document discusses various natural language processing APIs and techniques:
1. It discusses end-to-end APIs that can perform tasks like question answering without requiring the specification of rules or patterns. Examples of applications that can use these APIs are chatbots and FAQ systems.
2. It also discusses using domain-specific languages like SQL within APIs to query databases and knowledge bases. Sequence-to-sequence models are mentioned for translating natural language to structured queries.
3. Various natural language processing tools and techniques are mentioned that can be used as part of APIs, such as word embeddings, parsers, named entity recognition, and semantic role labeling.
Köhler, Sven, Bertram Ludäscher, and Yannis Smaragdakis. 2012. “Declarative Datalog Debugging for Mere Mortals.” In Datalog in Academia and Industry, edited by Pablo Barceló and Reinhard Pichler, 111–22. Lecture Notes in Computer Science 7494. Springer Berlin Heidelberg. doi:10.1007/978-3-642-32925-8_12.
Abstract. Tracing why a “faulty” fact A is in the model M = P(I) of program P on input I quickly gets tedious, even for small examples. We propose a simple method for debugging and “logically profiling” P by generating a provenance-enriched rewriting P̂, which records rule firings according to the logical semantics. The resulting provenance graph can be easily queried and analyzed using a set of predefined and ad-hoc queries. We have prototypically implemented our approach for two different Datalog engines (DLV and LogicBlox), demonstrating the simplicity, effectiveness, and system-independent nature of our method.
Scalable inference for a full multivariate stochastic volatility (SYRTO Project)
P. Dellaportas (UCL, London), A. Plataniotis (AUEB, Athens) and M. Titsias (AUEB, Athens)
Final SYRTO Conference - Université Paris 1 Panthéon-Sorbonne
February 19, 2016
This document contains a practice sheet with various function-related problems and questions. Some of the problems involve writing functions to perform tasks like calculating x^n, total bill amount, quadratic equation roots, and more. Other questions ask about function concepts like call by value, return types, function definitions, passing arguments, and errors in function code.
This document describes the solutions and questions for a midterm exam in 6.036: Spring 2018. It provides instructions for taking the exam such as writing your name on each page and coming to the front to ask questions. The exam consists of 6 multiple choice questions worth a total of 100 points. Question 1 involves linear classification and calculating margins. Question 2 asks about sources of error in machine learning models. Question 3 involves choosing appropriate representations and loss functions for different prediction problems. Question 4 introduces radial basis features for nonlinear classification. Question 5 discusses shortcut connections in neural networks.
This document summarizes a new method for designing robust fuzzy observers for nonlinear systems based on Takagi-Sugeno fuzzy models. The method uses linear matrix inequalities to design observers that minimize the H-norm of the closed loop system, providing a measure of robustness and disturbance attenuation. The observer design method is similar to existing parallel distributed compensation controller design methods, making it possible to adapt controller design techniques for observer design. The observer estimates system states and outputs based on measured outputs and system inputs while attenuating effects of disturbances and uncertainties.
Similar to Security of Artificial Intelligence (20)
This document provides an introduction to Bayesian analysis and probabilistic modeling. It begins with an overview of Bayes' theorem and common probability distributions used in Bayesian modeling like the Bernoulli, binomial, beta, Dirichlet, and multinomial distributions. It then discusses how these distributions can be used in Bayesian modeling for problems like estimating probabilities based on observed data. Specifically, it explains how conjugate prior distributions allow the posterior distribution to be of the same family as the prior. The document concludes by discussing how neural networks can quantify classification uncertainty by outputting evidence for different classes modeled with a Dirichlet distribution.
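The conjugacy mentioned in this summary can be illustrated in a few lines: with a Beta prior on a Bernoulli (coin-flip) parameter, updating on observed counts just adds them to the prior parameters, so the posterior stays Beta. The numbers below are made up for illustration:

```python
# Conjugate-prior sketch: a Beta(a, b) prior on a coin's heads
# probability, updated with binomial data, stays Beta.
def beta_update(a: float, b: float, heads: int, tails: int):
    """Posterior parameters after observing `heads` and `tails`."""
    return a + heads, b + tails

a, b = 2.0, 2.0                 # weak prior belief in a fair coin
a, b = beta_update(a, b, heads=7, tails=3)
posterior_mean = a / (a + b)    # (2 + 7) / (2 + 2 + 7 + 3) = 9/14
print(a, b, round(posterior_mean, 3))
```

This closed-form update is exactly why conjugate priors are convenient: no numerical integration is needed to obtain the posterior.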
Argumentation and Machine Learning: When the Whole is Greater than the Sum of... (Federico Cerutti)
Tutorial at IJCAI 2019
Argumentation technology is a rich interdisciplinary area of research that has emerged as one of the most promising paradigms for common sense reasoning and conflict resolution. In this tutorial I will explore the elements underpinning the vast majority of the approaches in argumentation theory: this brings to light the connections among the various disciplines involved in argumentation theory, from epistemology, to law studies, to complexity theory. I will discuss the most recent real-world research grade prototypes, which present innovative ways for applying well-established theories, and enlarge the scope of applications for argumentation theory, from legal reasoning to sense-making in intelligence analysis. I will then discuss how machine learning approaches are useful for addressing both the knowledge acquisition problem as well as the identification of the most suitable algorithms for argumentative reasoning. The knowledge acquisition problem in argumentation theory is mostly an instance of argument mining tasks, actively studied by researchers both in the argumentation community, and in the natural language processing community. I will discuss the current stage of algorithms for computing semantics extensions—sets of collectively acceptable arguments—of argumentation frameworks, and show the results of recent investigations on the use of machine learning techniques for improving the performance of argumentation algorithms.
Finally, I will discuss the current state-of-the-art approaches using argumentation as part of their architecture. Some of them leverage argumentation technology as a regulariser in learning. Most use argumentation to support explainability and algorithmic accountability. With this tutorial the attendees will acquire a deep and comprehensive understanding of the state-of-the-art of technological capabilities of argumentation technology, and of the synergy already envisaged between it and machine learning. This is particularly important given the current interest from research funding agencies in argument mining and explainable AI.
Human-Argumentation Experiment Pilot 2013: Technical Material (Federico Cerutti)
Technical appendix to the paper: "Formal Arguments, Preferences, and Natural Language Interfaces to Humans: an Empirical Evaluation" by Federico Cerutti, Nava Tintarev, Nir Oren, ECAI 2014, Pages 207 - 212,
DOI: 10.3233/978-1-61499-419-0-207.
http://ebooks.iospress.nl/volumearticle/36941
Abstract of the paper:
It has been claimed that computational models of argumentation provide support for complex decision making activities in part due to the close alignment between their semantics and human intuition. In this paper we assess this claim by means of an experiment: people's evaluation of formal arguments --- presented in plain English --- is compared to the conclusions obtained from argumentation semantics. Our results show a correspondence between the acceptability of arguments by human subjects and the justification status prescribed by the formal theory in the majority of the cases. However, post-hoc analyses show that there are some significant deviations, which appear to arise from implicit knowledge regarding the domains in which evaluation took place. We argue that in order to create argumentation systems, designers must take implicit domain specific knowledge into account.
Probabilistic Logic Programming with Beta-Distributed Random Variables (Federico Cerutti)
by Federico Cerutti; Lance Kaplan; Angelika Kimmig; Murat Sensoy
Paper accepted at AAAI2019
We enable aProbLog—a probabilistic logical programming approach—to reason in presence of uncertain probabilities represented as Beta-distributed random variables. We achieve the same performance of state-of-the-art algorithms for highly specified and engineered domains, while simultaneously we maintain the flexibility offered by aProbLog in handling complex relational domains. Our motivation is that faithfully capturing the distribution of probabilities is necessary to compute an expected utility for effective decision making under uncertainty: unfortunately, these probability distributions can be highly uncertain due to sparse data. To understand and accurately manipulate such probability distributions we need a well-defined theoretical framework that is provided by the Beta distribution, which specifies a distribution of probabilities representing all the possible values of a probability when the exact value is unknown.
Supporting Scientific Enquiry with Uncertain Sources (Federico Cerutti)
In this paper we propose a computational methodology for assessing the impact of trust associated to sources of information in scientific enquiry activities—i.e. relating relevant information and forming logical conclusions, as well as identifying gaps in information in order to answer a given query. Often trust in the source of information serves as a proxy for evaluating the quality of the information itself, especially in cases of information overload. We show how our computational methodology supports human analysts in situational understanding, as well as highlighting issues that demand further investigation.
This document provides an introduction to formal argumentation theory and summarizes several key concepts:
- It discusses argumentation frameworks consisting of a set of arguments and attacks between arguments. Various semantics are used to identify acceptable sets of arguments.
- Some important properties of semantics are outlined, including conflict-freeness, admissibility, strong admissibility, reinstatement, I-maximality, and directionality. Different semantics satisfy different combinations of these properties.
- References are provided for works on argumentation semantics by Dung, Baroni et al., and others that formally define argumentation frameworks and semantics.
Handout: Argumentation in Artificial Intelligence: From Theory to Practice (Federico Cerutti)
Handouts for the IJCAI 2017 tutorial on Argumentation. This document is a collection of technical definitions as well as examples of various topics addressed in the tutorial. It is not supposed to be an exhaustive compendium of twenty years of research in argumentation theory.
This material is derived from a variety of publications from many researchers who hold the copyright and any other intellectual property of their work. Original publications are thoroughly cited and reported in the bibliography at the end of the document. Errors and misunderstandings rest with the author of this tutorial: please send an email to federico.cerutti@acm.org for reporting any.
Argumentation in Artificial Intelligence: From Theory to Practice (Federico Cerutti)
Argumentation technology is a rich interdisciplinary area of research that, in the last two decades, has emerged as one of the most promising paradigms for commonsense reasoning and conflict resolution in a great variety of domains.
In this tutorial we aim at providing PhD students, early stage researchers, and experts from different fields of AI with a clear understanding of argumentation in AI and with a set of tools they can start using in order to advance the field.
Part 1 of 2
Handout for the course Abstract Argumentation and Interfaces to Argumentative... (Federico Cerutti)
This document provides an overview of abstract argumentation frameworks and semantics. It begins with definitions of Dung's argumentation framework (AF), including concepts like conflict-free sets, acceptable arguments, and admissible sets. It then covers properties that argumentation semantics can satisfy, like being conflict-free or reinstating acceptable arguments. Several semantics are defined, like complete, grounded, preferred and stable extensions. The document also discusses labelling-based representations of semantics and computational properties of decision problems for different semantics. In the second half, it outlines implementations, ranking-based semantics, argumentation schemes, semantic web argumentation, and natural language interfaces for argumentation systems.
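As a concrete illustration of the semantics this handout defines, the grounded extension of a small (hypothetical) framework can be computed by iterating Dung's characteristic function to its least fixed point; a minimal sketch:

```python
# Minimal sketch of Dung's grounded semantics: iterate the
# characteristic function F(S) = {a | every attacker of a is attacked
# by S} to its least fixed point. The example AF below is made up.
def grounded_extension(arguments, attacks):
    attackers = {a: {b for (b, c) in attacks if c == a} for a in arguments}
    S = set()
    while True:
        # a is acceptable w.r.t. S if each attacker of a is attacked by S
        nxt = {a for a in arguments
               if all(any((s, b) in attacks for s in S) for b in attackers[a])}
        if nxt == S:
            return S
        S = nxt

# a attacks b, b attacks c: a is unattacked, a defends c against b
args = {"a", "b", "c"}
atts = {("a", "b"), ("b", "c")}
print(sorted(grounded_extension(args, atts)))   # ['a', 'c']
```

The fixed-point iteration mirrors the reinstatement intuition: unattacked arguments are accepted first, and arguments they defend are reinstated in later rounds.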
Argumentation in Artificial Intelligence: 20 years after Dung's work. Left ma... (Federico Cerutti)
Handouts for the IJCAI 2015 tutorial on Argumentation.
This document is a collection of technical definitions as well as examples of various topics addressed in the tutorial. It is not supposed to be an exhaustive compendium of twenty years of research in argumentation theory.
This material is derived from a variety of publications from many researchers who hold the copyright and any other intellectual property of their work. Original publications are thoroughly cited and reported in the bibliography at the end of the document. Errors and misunderstandings rest with the author of this tutorial: please send an email to federico.cerutti@acm.org for reporting any.
Argumentation in Artificial Intelligence: 20 years after Dung's work. Right m... (Federico Cerutti)
Handouts for the IJCAI 2015 tutorial on Argumentation.
This document is a collection of technical definitions as well as examples of various topics addressed in the tutorial. It is not supposed to be an exhaustive compendium of twenty years of research in argumentation theory.
This material is derived from a variety of publications from many researchers who hold the copyright and any other intellectual property of their work. Original publications are thoroughly cited and reported in the bibliography at the end of the document. Errors and misunderstandings rest with the author of this tutorial: please send an email to federico.cerutti@acm.org for reporting any.
A tutorial by Federico Cerutti
http://scienceartificial.com
Slides of the tutorial given at IJCAI 2015 http://ijcai-15.org/
Website for the tutorial: http://scienceartificial.com/IJCAI2015tutorial
Algorithm Selection for Preferred Extensions Enumeration (Federico Cerutti)
The document discusses algorithms for enumerating preferred extensions in abstract argumentation frameworks. It compares the performance of four algorithms: AspartixM, NAD-Alg, PrefSAT, and SCC-P. It finds that algorithm selection based on graph features can accurately predict runtime, with up to 80% accuracy in classification, and improves performance over a single best solver by 2-3 times. Key discriminating features include density, number of arguments, number of strongly connected components, and features related to computing graph properties.
Computational trust mechanisms aim to produce a trust rating from both direct and indirect information about agents' behaviour. Jøsang's Subjective Logic has been widely adopted as the core of such systems via its fusion and discount operators. Recently we proposed an operator for discounting opinions based on geometrical properties, and, continuing this line of investigation, this paper describes a new geometry-based fusion operator. We evaluate this fusion operator together with our geometric discount operator in the context of a trust system, and show that our operators outperform those originally described by Jøsang. A core advantage of our work is that these operators can be used without modifying the remainder of the trust and reputation system.
In this paper we describe a decision process framework allowing an agent to decide what information it should reveal to its neighbours within a communication graph in order to maximise its utility. We assume that these neighbours can pass information onto others within the graph, and that the communicating agent gains and loses utility based on the information which can be inferred by specific agents following the original communicative act. To this end, we construct an initial model of information propagation and describe an optimal decision procedure for the agent.
This paper presents a novel SAT-based approach for the computation of extensions in abstract argumentation, with focus on preferred semantics, and an empirical evaluation of its performances. The approach is based on the idea of reducing the problem of computing complete extensions to a SAT problem and then using a depth-first search method to derive preferred extensions. The proposed approach has been tested using two distinct SAT solvers and compared with three state-of-the-art systems for preferred extension computation. It turns out that the proposed approach delivers significantly better performances in the large majority of the considered cases.
Cerutti--Introduction to Argumentation (seminar @ University of Aberdeen) (Federico Cerutti)
The document discusses argumentation theory and non-monotonic logics. It introduces argumentation frameworks, which represent arguments and the attacks between them. It describes different types of arguments and attacks. It also covers argumentation semantics, which evaluate arguments within a framework to determine which arguments are justified. Various semantics are examined, including complete semantics and the labelling approach. Examples using abstract frameworks and logic programming are provided to illustrate key concepts in argumentation theory.
8. Continuously check that the radar works and its measure: if the radar identifies a car less than 50 meters away, compute the speed of the car in front of you and adapt your speed.
[Flowchart: Start → Radar Check and Input → if ∃car(X) s.t. distance(X) < 50 then Adapt speed (yes), else loop back to Radar Check (no)]
while (1) {
    if (radarcheck() && radar() < 50) {
        adapt();
    }
}
10. // bool v, w, x, y, z, radarcheck;
radarcheck = false;
if (w && x && (!x || y) && y && z && (!(x && y & z) || w) && (!x || !w)) {
    goto A7630;
} else {
    goto A2092;
}
A231: goto A0928;
A7630: radarcheck = true;
A2092: goto A231;
A0928: assert(radarcheck == false);
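Whether the assertion above can ever fail reduces to the satisfiability of the branch guard. A brute-force check over all 16 assignments (a sketch of what a SAT-based model checker decides symbolically) shows the guard is unsatisfiable, so the `goto A7630` branch is dead and `radarcheck` stays false:

```python
from itertools import product

# Brute-force the guard from the obfuscated snippet above: if no
# assignment satisfies it, the `goto A7630` branch is dead code and the
# assertion radarcheck == false always holds.
def guard(w, x, y, z):
    return (w and x and ((not x) or y) and y and z
            and ((not (x and (y & z))) or w) and ((not x) or (not w)))

models = [(w, x, y, z)
          for w, x, y, z in product([False, True], repeat=4)
          if guard(w, x, y, z)]
print(models)   # [] -- the guard is unsatisfiable
```

The conflict is easy to see by hand as well: the guard requires both w and x to be true while also requiring (!x || !w).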
12. !(w == FALSE || x == FALSE || y == FALSE || z == FALSE ||
    !(w == FALSE) && !(x == FALSE) || !(x == FALSE) && y == FALSE ||
    !(((signed int) y & (signed int) z) == 0) && !(x == FALSE) && w == FALSE
)

!(w == FALSE || x == FALSE || y == FALSE || z == FALSE ||
    !(w == FALSE) && !(x == FALSE) || !(x == FALSE) && y == FALSE ||
    !(y == FALSE || z == FALSE) && !(x == FALSE) && w == FALSE
)

¬(¬w ∨ ¬x ∨ ¬y ∨ ¬z ∨ w ∧ x ∨ x ∧ ¬y ∨ ¬(¬y ∨ ¬z) ∧ x ∧ ¬w)
14. “[. . . ] Particularly vexing is the realisation that the error came from a piece of the software that was not needed. The software involved is part of the Inertial Reference System. [. . . ] After takeoff [. . . ] this computation is useless. In the Ariane 5 flight, however, it caused an exception, which was not caught and—boom. The exception was due to a floating-point error during a conversion from a 64-bit floating-point value [. . . ] to a 16-bit signed integer. [. . . ] There was no explicit exception handler to catch the exception, so it followed the usual fate of uncaught exceptions and crashed the entire software, hence the onboard computers, hence the [(500 million USD, ed.)] mission.”
https://www.flickr.com/photos/48213136@N06/8958839420
J.-M. Jézéquel and B. Meyer, “Design by contract: the lessons of Ariane,” in Computer, vol. 30, no. 1, pp. 129-130, Jan. 1997, doi: 10.1109/2.562936.
15. “We have proved that the initial boot code running in data centers at Amazon Web Services is memory safe [using, ed.] the C Bounded Model Checker (CBMC).”
https://aws.amazon.com/security/provable-security/
http://tiny.cc/AWSINGBS20
B. Cook, et al., “Model checking boot code from AWS data centers,” in CAV 2018, pp. 457-486, 2018
Federico Cerutti has had no interaction with AWS.
19. (Simplified) Needham-Schroeder Protocol
1: A → B: {Na, A}Kb
2: B → A: {Na, Nb}Ka
3: A → B: {Nb}Kb
A is authenticated with B only if A has sent a fresh challenge nonce encrypted with an
appropriate key to B.
B must reply to A’s challenge with the same nonce, again encrypted with a key so that only A
can decrypt it.
All must happen in the right order.
R. M. Needham and M. D. Schroeder. 1978. Using encryption for authentication in large networks of computers. Commun. ACM 21, 12 (Dec. 1978), 993–999.
J. P. Delgrande, T. Grote, and A. Hunter. 2009. A General Approach to the Verification of Cryptographic Protocols Using Answer Set Programming. In LPNMR ’09, 355–367.
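The message flow above can be mimicked with a toy script in which "encryption" is just a tagged value that only the holder of the matching private key may open. This is purely illustrative plumbing (no real cryptography; all names are made up), but it shows the two nonce round-trips the prose describes:

```python
# Toy model of the simplified Needham-Schroeder exchange above.
# "Encryption" is a tagged tuple openable only with the matching key.
def enc(payload, pubkey):
    return ("enc", pubkey, payload)

def dec(ciphertext, privkey):
    tag, pubkey, payload = ciphertext
    assert tag == "enc" and pubkey == "pub_" + privkey, "wrong key"
    return payload

# 1: A -> B: {Na, A}Kb
m1 = enc(("Na", "A"), "pub_Kb")
na, sender = dec(m1, "Kb")            # B learns A's fresh nonce

# 2: B -> A: {Na, Nb}Ka
m2 = enc((na, "Nb"), "pub_Ka")
na_echo, nb = dec(m2, "Ka")           # A checks its challenge came back
assert na_echo == "Na"

# 3: A -> B: {Nb}Kb
m3 = enc((nb,), "pub_Kb")
(nb_echo,) = dec(m3, "Kb")
assert nb_echo == "Nb"                # B checks its challenge came back
print("handshake completed between", sender, "and B")
```

Note that this toy run only shows the honest interleaving; the ASP encoding that follows is precisely what lets a solver search for dishonest interleavings as well.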
20. authenticated(A, B, T) :- send(A, B, enc(M1, K1), T1),
        fresh(A, nonce(A, Na), T1),
        part_m(nonce(A, Na), M1),
        key_pair(K1, Kinv1), has(A, K1, T1),
        has(B, Kinv1, T1),
        not has(C, Kinv1, T1) : agent(C) : C != B,
        send(B, A, enc(M2, K2), T2),
        receive(A, B, enc(M2, K2), T),
        part_m(nonce(A, Na), M2),
        key_pair(K2, Kinv2), has(B, K2, T1),
        has(A, Kinv2, T1),
        not has(C, Kinv2, T1) : agent(C) : C != A,
        T1 < T2, T2 < T.

receive(A, B, M, T+1) :- send(B, A, M, T),
        not intercept(M, T+1).
J. P. Delgrande, T. Grote, and A. Hunter. 2009. A General Approach to the Verification of Cryptographic Protocols Using Answer Set Programming. In LPNMR ’09, 355–367.
21. believes(A, completed(A, B), T) :- send(A, B,
            enc(m(nonce(C, Na), principal(A)), pub_key(B)),
            T1),
        receive(A, B, enc(m(nonce(C, Na), nonce(D, Nb)), pub_key(A)), T2),
        send(A, B, enc(m(nonce(D, Nb)), pub_key(B)), T3),
        not intruder(A),
        T1 < T2,
        T2 <= T3,
        T3 <= T,
        A != B,
        C != D.

believes(A, authenticated(A, B), T) :- believes(A, completed(A, B), T),
        A != B.
J. P. Delgrande, T. Grote, and A. Hunter. 2009. A General Approach to the Verification of Cryptographic Protocols Using Answer Set Programming. In LPNMR ’09, 355–367.
22. Attacker's capabilities
It can intercept messages, and it receives the messages that it intercepts:

0 { intercept(M, T+1) } 1 :- send(A, B, M, T).
receive(I, A, M, T+1) :- send(A, B, M, T),
        intercept(M, T+1).

It can send messages whenever it wants to, also faking the sender id:

1 { receive(A, B, M, T+1) : principal(B) } 1 :- send(I, A, M, T).
J. P. Delgrande, T. Grote, and A. Hunter. 2009. A General Approach to the Verification of Cryptographic Protocols Using Answer Set Programming. In LPNMR ’09, 355–367.
23. Goals
Agents
They should both believe themselves to be authenticated, and they should both actually be:

goal(A, B, T) :- authenticated(A, B, T),
        believes(A, authenticated(A, B), T),
        authenticated(B, A, T),
        believes(B, authenticated(B, A), T).

Attacker
An agent believes itself to be authenticated when this is not actually the case:

attack :- believes(A, authenticated(A, B), T),
        not authenticated(A, B, T).
J. P. Delgrande, T. Grote, and A. Hunter. 2009. A General Approach to the Verification of Cryptographic Protocols Using Answer Set Programming. In LPNMR ’09, 355–367.
24. Problems encoded in this way can be solved using search algorithms very similar to those used by SAT solvers.
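For intuition, the core of such a SAT-style search is the DPLL procedure: unit propagation plus case splitting with backtracking. A minimal, unoptimised sketch (modern solvers add watched literals, clause learning and branching heuristics on top of this):

```python
# Minimal DPLL sketch: clauses are frozensets of integer literals
# (positive = variable true, negative = variable false).
def dpll(clauses, assignment=()):
    # drop clauses already satisfied by the current assignment
    clauses = [c for c in clauses if not any(l in assignment for l in c)]
    if not clauses:
        return assignment                      # all clauses satisfied
    # remove literals falsified by the current assignment
    clauses = [frozenset(l for l in c if -l not in assignment)
               for c in clauses]
    if any(not c for c in clauses):
        return None                            # empty clause: conflict
    for c in clauses:                          # unit propagation
        if len(c) == 1:
            (l,) = c
            return dpll(clauses, assignment + (l,))
    l = next(iter(clauses[0]))                 # split on some literal
    return (dpll(clauses, assignment + (l,))
            or dpll(clauses, assignment + (-l,)))

# (w or not x) and (x or y) and (not w or not y), with w=1, x=2, y=3
cnf = [frozenset({1, -2}), frozenset({2, 3}), frozenset({-1, -3})]
model = dpll(cnf)
print(model is not None)   # True: the formula is satisfiable
```

Answer-set solvers such as clingo use conflict-driven refinements of exactly this search loop.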
27. What about probabilities?
An emerging paradigm: probabilistic logic programming
Just as an example
https://dtai.cs.kuleuven.be/problog/tutorial/various/09_airflap.html
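The distribution semantics underlying ProbLog-style programs can be sketched by brute-force enumeration of possible worlds: each independent probabilistic fact is true or false in a world, a world's weight is the product of the corresponding probabilities, and a query's probability is the total weight of the worlds where it holds. The two probabilistic facts and the alarm rule below are a made-up mini-program, not the airflap example:

```python
from itertools import product

# Possible-worlds sketch of probabilistic logic programming for the
# (hypothetical) program:
#   0.6::burglary.  0.2::earthquake.
#   alarm :- burglary.  alarm :- earthquake.
facts = {"burglary": 0.6, "earthquake": 0.2}

def query_alarm():
    total = 0.0
    for world in product([True, False], repeat=len(facts)):
        truth = dict(zip(facts, world))
        weight = 1.0                       # product of fact probabilities
        for f, p in facts.items():
            weight *= p if truth[f] else (1 - p)
        if truth["burglary"] or truth["earthquake"]:   # the alarm rules
            total += weight
    return total

print(round(query_alarm(), 4))   # 0.68 = 1 - 0.4 * 0.8
```

ProbLog computes the same quantity without enumerating worlds, by compiling the program into a tractable circuit; the enumeration above is only a semantic sketch.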
28. These are Knowledge-based approaches
The solving algorithms are general purpose
The real value is in the encoded knowledge
29. For the avoidance of doubt, this is knowledge
w ∧ x ∧ y ∧ z ∧ (y ∨ ¬x) ∧ (¬w ∨ ¬x) ∧ (w ∨ ¬x ∨ ¬y ∨ ¬z)
. . .
authenticated(A, B, T) :-
    send(A, B, enc(M1, K1), T1),
    fresh(A, nonce(A, Na), T1),
    part_m(nonce(A, Na), M1),
    key_pair(K1, Kinv1),
    has(A, K1, T1),
    has(B, Kinv1, T1),
    not has(C, Kinv1, T1) : agent(C) : C != B,
    send(B, A, enc(M2, K2), T2),
    receive(A, B, enc(M2, K2), T),
    part_m(nonce(A, Na), M2),
    key_pair(K2, Kinv2),
    has(B, K2, T1),
    has(A, Kinv2, T1),
    not has(C, Kinv2, T1) : agent(C) : C != A,
    T1 < T2, T2 < T.
. . .
. . .
0.7::wind(weak);
0.3::wind(strong).
0.25::wind_effect(T, -1);
0.5::wind_effect(T, 0);
0.25::wind_effect(T, 1) :-
    wind(weak).
. . .
flap_position(Time, Pos) :-
    Time > 0,
    attempted_flap_position(Time, Pos),
    legal_flap_position(Pos).
. . .
29
32. Agenda
Introduction to AI (for security)
• GOFAI
• ML
Guidelines
• Rules
• Guidelines for creating and using AI
Security of AI
• GOFAI
• ML
32
33. Training set: N observations of a real-valued input variable x, x ≡ (x1, . . . , xN)^T, together with corresponding observations of the values of the real-valued target variable t, denoted t ≡ (t1, . . . , tN)^T.
The goal is to find a function y(x, w) as close as possible to the original function f(x) from which we obtained the training set.
33
Fig. 1.2 of C. M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer-Verlag.
© 2006 C. M. Bishop. Permission is given to reproduce the figures for non-commercial purposes including education and research.
34. Let’s approximate with a polynomial with degree M
[Fig. 1.4a–b: polynomial fits of the data (x, t) with M = 0 and M = 1]
34
Fig. 1.4a and 1.4b of C. M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer-Verlag.
35. Let’s approximate with a polynomial with degree M
[Fig. 1.4c–d: polynomial fits of the data (x, t) with M = 3 and M = 9]
35
Fig. 1.4c and 1.4d of C. M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer-Verlag.
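The behaviour in these figures can be reproduced in a few lines. This is only a sketch assuming Bishop's setup (N = 10 noisy samples of sin(2πx)); the noise level and the random seed are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# N = 10 noisy observations of t = sin(2*pi*x) (assumed setup).
N = 10
x = np.linspace(0, 1, N)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, N)

def fit_rmse(M):
    """Fit a degree-M polynomial by least squares; return training RMSE."""
    w = np.polyfit(x, t, M)
    return np.sqrt(np.mean((np.polyval(w, x) - t) ** 2))

# Training error decreases monotonically with M; with M = N - 1 = 9 the
# polynomial interpolates the data (training error ~ 0) but oscillates
# wildly between the points: overfitting.
print(fit_rmse(0), fit_rmse(1), fit_rmse(3), fit_rmse(9))
```

The training error alone is therefore a misleading model-selection criterion, which is what motivates the validation set on the next slide.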
36. Training set: learn the parameters
Validation set: optimise model complexity
Test set: get the performance of the final model
36
37. Regression: y(x, w) ∈ R^n
Classification: y(x, w) ∈ N
Supervised learning: training/validation/test set contain observations of the target variable
Unsupervised learning: no observations of the target variable
Semi-supervised learning: few target variable observations; between supervised and unsupervised
Self-supervised learning: the system uses some automatic techniques for creating some
labelling and (hopefully) improves with time
Online learning: data available in sequence
Reinforcement learning: no training/validation/test provided, just a reward function and
the ability to learn from mistakes
. . .
37
39. Let X (resp. Y) be a discrete random variable that can take values xi with i = 1, . . . , M (resp. yj with j = 1, . . . , L).
The probability that X will take the value xi and Y will take the value yj is written
p(X = xi , Y = yj ) and is called the joint probability of X = xi and Y = yj .
39
40. Sum rule or marginalisation:
p(X = xi) = Σ_{j=1}^{L} p(X = xi, Y = yj)
Product rule:
p(X = xi, Y = yj) = p(Y = yj | X = xi) p(X = xi)
Sum and product rules apply to general random variables, not only discrete ones.
40
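A tiny numeric check of the two rules, on a hypothetical 3×2 joint table (the probability values are made up for illustration):

```python
import numpy as np

# Hypothetical joint distribution p(X, Y) over M = 3 values of X and
# L = 2 values of Y (rows: x_i, columns: y_j); entries sum to 1.
p_xy = np.array([[0.10, 0.20],
                 [0.15, 0.25],
                 [0.05, 0.25]])

# Sum rule: marginalise Y out to obtain p(X).
p_x = p_xy.sum(axis=1)

# Product rule: p(X = x_i, Y = y_j) = p(Y = y_j | X = x_i) p(X = x_i).
p_y_given_x = p_xy / p_x[:, None]
reconstructed = p_y_given_x * p_x[:, None]

print(p_x)                               # marginal of X
print(np.allclose(reconstructed, p_xy))  # product rule recovers the joint
```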
41. [Fig. 1.11: a joint distribution p(X, Y) over two variables, with the marginals p(X) and p(Y) and the conditional p(X | Y = 1)]
41
Fig. 1.11a–1.11d of C. M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer-Verlag.
42. The weighted average of the function f(x) under a probability distribution p(x), or expectation of f(x), is:
E[f] = Σ_x p(x) f(x)  (discrete case)
E[f] = ∫ p(x) f(x) dx  (continuous case)
It can be approximated from N points drawn from the distribution:
E[f] ≃ (1/N) Σ_{n=1}^{N} f(xn)
In the case of functions of several variables:
Ex[f(x, y)] = Σ_x p(x) f(x, y)
42
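The sample-based approximation is easy to see numerically. A sketch with a made-up three-valued distribution and f(x) = x²:

```python
import numpy as np

rng = np.random.default_rng(0)

# Discrete example: X takes values 0, 1, 2 with probabilities p(x).
values = np.array([0.0, 1.0, 2.0])
p = np.array([0.2, 0.5, 0.3])

def f(x):
    return x ** 2

# Exact expectation: E[f] = sum_x p(x) f(x) = 0.2*0 + 0.5*1 + 0.3*4 = 1.7
exact = np.sum(p * f(values))

# Monte Carlo approximation from N draws: E[f] ~= (1/N) sum_n f(x_n)
samples = rng.choice(values, size=100_000, p=p)
estimate = f(samples).mean()

print(exact, estimate)  # the estimate approaches the exact value as N grows
```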
44. For two random variables, the covariance expresses the extent to which they vary together, and
is defined by:
cov[x, y] = Ex,y [xy] − E[x]E[y]
In the case of vectors of random variables:
cov[x, y] = Ex,y[x y^T] − E[x] E[y^T]
Note that cov[x] ≡ cov[x, x].
44
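A quick empirical check of the scalar identity, with a made-up pair of correlated variables (the dependence y = 0.5x + noise is purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Empirical check of cov[x, y] = E[xy] - E[x]E[y] on correlated samples.
n = 200_000
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)   # y depends on x, so cov[x, y] > 0

cov_def = (x * y).mean() - x.mean() * y.mean()   # the definition above
cov_np = np.cov(x, y, bias=True)[0, 1]           # NumPy's biased estimate

print(cov_def, cov_np)   # both close to the true value 0.5
```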
47. Suppose we randomly pick one of the boxes and from that
box we randomly select an item of fruit, and having observed
which sort of fruit it is we replace it in the box from which it
came.
We could imagine repeating this process many times. Let us
suppose that in so doing we pick the red box 40% of the time
and we pick the blue box 60% of the time, and that when we
remove an item of fruit from a box we are equally likely to
select any of the pieces of fruit in the box.
We are told that a piece of fruit has been selected and it is an orange.
Which box did it come from?
47
Fig. 1.9 of C. M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer-Verlag.
48. p(B = r | F = o) = p(F = o | B = r) p(B = r) / p(F = o)
= (6/8 · 4/10) / (6/8 · 4/10 + 1/4 · 6/10)
= (3/4 · 2/5) · (20/9)
= 2/3
48
Fig. 1.9 of C. M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer-Verlag.
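The same computation in exact arithmetic. The box contents are assumed from Bishop's Fig. 1.9 example (red box: 6 oranges out of 8 fruit; blue box: 1 orange out of 4 fruit):

```python
from fractions import Fraction

# Priors and likelihoods from the fruit-box example (assumed contents):
# p(B=r) = 4/10, p(B=b) = 6/10, p(F=o|B=r) = 6/8, p(F=o|B=b) = 1/4.
p_r, p_b = Fraction(4, 10), Fraction(6, 10)
p_o_r, p_o_b = Fraction(6, 8), Fraction(1, 4)

p_o = p_o_r * p_r + p_o_b * p_b   # total probability of drawing an orange
p_r_o = p_o_r * p_r / p_o         # Bayes' theorem

print(p_r_o)  # 2/3: the orange most likely came from the red box
```

Using `Fraction` keeps every step exact, so the result is literally 2/3 rather than 0.666....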
49. The goal in classification is to take an input vector x and to assign it to one of K discrete
classes Ck .
The input space is thus divided into decision regions whose boundaries are called decision
boundaries or decision surfaces.
Consider first the case of two classes. The posterior probability for class C1 can be written as:
p(C1|x) = p(x|C1)p(C1) / (p(x|C1)p(C1) + p(x|C2)p(C2)) = 1 / (1 + exp(−a)) = σ(a)
with
a = ln [ p(x|C1)p(C1) / (p(x|C2)p(C2)) ]
and σ(a) is the logistic sigmoid function defined by
σ(a) = 1 / (1 + exp(−a))
In the case of more than two classes we obtain the softmax function.
49
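The algebra above can be checked directly: feeding the log-odds through the sigmoid must give the normalised posterior. A small sketch (the likelihood-times-prior values 0.3 and 0.1 are made up):

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid: sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + np.exp(-a))

def softmax(a):
    """Multi-class generalisation; subtract the max for numerical stability."""
    e = np.exp(a - np.max(a))
    return e / e.sum()

# Two-class posterior via the log-odds a = ln[p(x|C1)p(C1) / (p(x|C2)p(C2))].
num, den = 0.3, 0.1            # hypothetical p(x|Ck)p(Ck) values
a = np.log(num / den)
print(sigmoid(a))              # equals num / (num + den) = 0.75

# For K > 2 classes, softmax of the log of each p(x|Ck)p(Ck) gives the
# same normalised posteriors.
print(softmax(np.log([0.3, 0.1, 0.1])))  # [0.6, 0.2, 0.2]
```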
51. The neural network model is a nonlinear function from a set of input variables {xi } to a set of
output variables {yk } controlled by a vector w of adjustable parameters.
[Fig. 5.1: a two-layer network with inputs x0, . . . , xD, hidden units z0, . . . , zM, outputs y1, . . . , yK, and weights w(1), w(2)]
Hidden units zj = h(aj), with activations
aj = Σ_{i=1}^{D} w(1)_ji xi + w(1)_j0
Assuming a sigmoid output function:
yk(x, w) = σ( Σ_{j=1}^{M} w(2)_kj h( Σ_{i=1}^{D} w(1)_ji xi + w(1)_j0 ) + w(2)_k0 )
51
Fig. 5.1 of C. M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer-Verlag.
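The formula above is just two matrix products with a nonlinearity in between. A minimal sketch with hypothetical dimensions and random weights, using tanh as the hidden nonlinearity h:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Hypothetical dimensions: D inputs, M hidden units, K outputs.
D, M, K = 3, 4, 2
W1 = rng.normal(size=(M, D)); b1 = rng.normal(size=M)   # w(1)_ji, w(1)_j0
W2 = rng.normal(size=(K, M)); b2 = rng.normal(size=K)   # w(2)_kj, w(2)_k0

def forward(x):
    """y_k(x, w) = sigma( sum_j w(2)_kj h(a_j) + w(2)_k0 ),
    with a_j = sum_i w(1)_ji x_i + w(1)_j0 and h = tanh."""
    a = W1 @ x          # hidden activations a_j
    z = np.tanh(a)      # hidden units z_j = h(a_j)
    return sigmoid(W2 @ z + b2)

y = forward(rng.normal(size=D))
print(y)   # K sigmoid outputs, each in (0, 1)
```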
52. The nonlinear function h(·) can be a sigmoidal function such as the logistic sigmoid, but also the rectified linear unit (ReLU):
ReLU(x) = max {0, x}
52
53. Given an integer q ≥ 1, let K ⊂ R^q be a compact space,* f : K → R be continuous, and h : R → R be continuous but not polynomial. Then for every ε > 0 there exist N ≥ 1, ak, bk ∈ R, and wk ∈ R^q such that:
max_{x∈K} | f(x) − Σ_{k=1}^{N} ak h(wk · x + bk) | ≤ ε
Pinkus, A. (1999). "Approximation theory of the MLP model in neural networks." In: Acta Numerica, pp. 143-195. doi:10.1017/S0962492900002919.
∗
A compact space contains all its limit points and has all its points lying within some fixed distance of each
other.
53
54. Parameters Optimisation
Given a training set comprising a set of input vectors {xn}, n = 1, . . . , N, and a corresponding
set of target vectors {tn}, we want to minimise an error function E(w).
54
[Fig. 5.5: the error surface E(w) over weight space (w1, w2), with a local minimum wA, the global minimum wB, and a point wC at which the local gradient ∇E is shown]
First note that if we make a small step in weight space from w to w + δw, then the change in the error function is δE ≃ δw^T ∇E(w), where the vector ∇E(w) points in the direction of greatest rate of increase of the error function.
w^(τ+1) = w^(τ) + Δw^(τ)
where τ labels the iteration step.
The simplest approach to using gradient information is to choose the weight update so that:
w^(τ+1) = w^(τ) − η ∇E(w^(τ))
where the parameter η > 0 is the learning rate.
55
Fig. 5.5 of C. M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer-Verlag.
56. Batch methods: at each step the weight vector is moved in the direction of the greatest rate of
decrease of the error function, and so this approach is known as gradient descent or steepest
descent.
On-line version of gradient descent, a.k.a. sequential gradient descent or stochastic gradient descent: error functions based on maximum likelihood for a set of independent observations comprise a sum of terms, one for each data point:
E(w) = Σ_{n=1}^{N} En(w)
Update the weight vector based on one data point at a time, so that:
w^(τ+1) = w^(τ) − η ∇En(w^(τ))
56
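A minimal sketch of the stochastic update rule on a least-squares problem; the linear model, data sizes, learning rate, and true weights are all illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# SGD on a least-squares error E(w) = sum_n E_n(w),
# with E_n(w) = 0.5 * (w^T x_n - t_n)^2 for a linear model.
N, D = 500, 3
w_true = np.array([1.0, -2.0, 0.5])          # hypothetical target weights
X = rng.normal(size=(N, D))
t = X @ w_true + rng.normal(0, 0.01, N)      # targets with small noise

w = np.zeros(D)
eta = 0.05                                   # learning rate
for epoch in range(20):
    for n in rng.permutation(N):             # one data point per update
        grad_n = (w @ X[n] - t[n]) * X[n]    # gradient of E_n at w
        w = w - eta * grad_n                 # w(tau+1) = w(tau) - eta * grad

print(w)   # close to w_true after a few passes over the data
```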
The value of δ for a particular hidden unit can be obtained by propagating the δ's backward from units higher up in the network.
[Fig. 5.7: hidden unit zj, fed by unit zi through weight wji and feeding higher units with errors δk through weights wkj; δj is computed from the δk]
57
Fig. 5.7 of C. M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer-Verlag.
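The backward propagation of the δ's can be sketched for one hidden layer with h = tanh and a sum-of-squares error (so δk = yk − tk and δj = h′(aj) Σk wkj δk); all shapes and values here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-hidden-layer network with linear outputs.
D, M, K = 3, 4, 2
W1 = rng.normal(size=(M, D))
W2 = rng.normal(size=(K, M))

x = rng.normal(size=D)
tgt = rng.normal(size=K)

# Forward pass.
a = W1 @ x          # hidden activations a_j
z = np.tanh(a)      # hidden units z_j = h(a_j)
y = W2 @ z          # linear outputs (sum-of-squares error)

# Output deltas: delta_k = y_k - t_k.
delta_k = y - tgt
# Hidden deltas, propagated backward:
# delta_j = h'(a_j) * sum_k w_kj delta_k, with h'(a) = 1 - tanh(a)^2.
delta_j = (1 - np.tanh(a) ** 2) * (W2.T @ delta_k)

# Gradients of E with respect to the weights.
grad_W2 = np.outer(delta_k, z)   # dE/dw(2)_kj = delta_k * z_j
grad_W1 = np.outer(delta_j, x)   # dE/dw(1)_ji = delta_j * x_i
```

A finite-difference check on one weight confirms that these deltas produce the correct gradient.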
59. Generative models
Determine p(x|Ck ) for each
class Ck individually; then
infer the prior class
probabilities; then use
Bayes’ theorem to find the
posterior probabilities.
Alternatively, obtain the
posterior probabilities from
the joint distribution
p(x, Ck ).
Analogously for regression.
Discriminative models
Model directly the posterior
class probabilities p(Ck |x)
(analogously p(t|x))
without computing the joint
distribution.
Direct
Find a discriminant function
f (x) which maps each input
x directly onto a class label.
Analogously, find a
regression function y(x)
directly from the training
data.
59
60. [Fig. 1.27: left, the class-conditional densities p(x|C1) and p(x|C2) modelled by a generative approach; right, the posterior probabilities p(C1|x) and p(C2|x) modelled by a discriminative/direct approach]
60
Fig. 1.27a-b of C. M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer-Verlag.
61. [Fig. 1.26: the posterior probabilities p(C1|x) and p(C2|x) with a threshold θ; inputs where the larger posterior falls below θ lie in the reject region]
61
Fig. 1.26 of C. M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer-Verlag.
62. The no free lunch theorem for machine learning†
states that, averaged over all possible data
generating distributions, every classification (resp. regression) model has the same error rate
when dealing with previously unobserved points. In other words, no model is universally any
better than any other.
These results hold only when we average over all possible data generating distributions.
However, if we can make assumptions about the kinds of distributions we encounter in
real-world applications, then we can design models that perform well on these distributions.
†
Wolpert, D. H. and Macready, W. G. (1997). “No free lunch theorems for optimization.” In: IEEE
transactions on evolutionary computation 1.1, pp. 67-82.
62
64. [Diagram: a hybrid neuro-symbolic pipeline for complex event processing; the predicted complex event is compared against the ground truth and the error is used for gradient back-propagation through the neural components]
M. Roig Vilamala, H. Taylor, T. Xing, L. Garcia, M. Srivastava, L. Kaplan, A. Preece, A. Kimmig, F. Cerutti.
A Hybrid Neuro-Symbolic Approach for Complex Event Processing. ICLP 2020.
https://arxiv.org/abs/2009.03420
65. Agenda
Introduction to AI (for security)
• GOFAI
• ML
Guidelines
• Rules
• Guidelines for creating and using AI
Security of AI
• GOFAI
• ML
65
67. GDPR–AI: Conceptual framework
4(1) Personal data
Identification • Re-identification (e.g. pseudo-anonymity) • Identifiability (e.g. fusion with external data sources)
4(2) Profiling
Inferred personal data is personal data • Data subjects have the right to rectification independently of whether the inferred information is verifiable or statistical
4(11) Consent
Specificity • Granularity • Freedom (and the problem of clear imbalance)
67
https://www.europarl.europa.eu/thinktank/en/document.html?reference=EPRS_STU(2020)641530
68. GDPR–AI: Data protection principles
5(1)(a) Fairness, transparency
Data subjects should not be misled • Inference should also be fair (verifiable, etc.)
5(1)(b) Purpose limitation
New purposes for data must be compatible • Personal data in training set/model
• Personal data affecting personalised inferences
5(1)(c) Data minimisation
5(1)(d) Accuracy
Personal data must be accurate
5(1)(e) Storage limitation
68
69. GDPR–AI: Information duties
13/14 Data subjects need to receive relevant information
Information on automated decision-making
Existence of automated decision-making, including profiling • Meaningful information
about the logic involved and the envisaged consequences of such processing for the data
subject
69
70. GDPR–AI: Data subjects’ rights
15 The right to access
A little ambiguous and it seems not to imply the need for individualised explanation of
automated assessment and decisions
17 The right to be forgotten
Delete data used for constructing a model, albeit without deleting the model itself if it is shown to no longer contain personal data
21 The right to object
Objecting to profiling and direct marketing • Objecting to research and statistical purposes (except for reasons of public interest)
70
71. GDPR–AI: Automated decision-making
22(1-2) Prohibition of automated decisions
The data subject shall have the right not to be subject to a decision based solely
on automated processing, including profiling, which produces legal effects concerning
him or her or similarly significantly affects him or her.
Exceptions: Necessary for entering into or performing a contract • Authorised by
EU/member state law with counterbalances • Based on data subject’s explicit consent
22(3) Safeguard measures
the data controller shall implement suitable measures to safeguard the data subject's rights and freedoms and legitimate interests, at least the right to obtain human intervention on the part of the controller, to express his or her point of view and to contest the decision
22(4) No automated decision on sensitive data
Unless with explicit consent or for reasons of substantial public interest
71
72. GDPR–AI: Privacy
24 Responsibility of data controller
implement appropriate technical and organisational measures to ensure and to be
able to demonstrate that processing is performed in accordance with this Regulation
25(1) Data protection by design and privacy by default
25(2) Data minimisation
35-36 Data protection impact assessment
In the presence of high risk, consult the supervisory authority (the national data protection authority)
37 Need for data protection officers
40-43 Codes of conduct and certification
72
73. Agenda
Introduction to AI (for security)
• GOFAI
• ML
Guidelines
• Rules
• Guidelines for creating and using AI
Security of AI
• GOFAI
• ML
73
74. 1. IEEE Ethics in Action and IEEE 7000
The IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems. Ethically
Aligned Design: A Vision for Prioritizing Human Well-being with Autonomous and
Intelligent Systems, First Edition. IEEE, 2019.
https://ethicsinaction.ieee.org/
2. EU coordinated plan 2018 and whitepaper 2020
https://ec.europa.eu/digital-single-market/en/news/
coordinated-plan-artificial-intelligence
https://ec.europa.eu/info/publications/
white-paper-artificial-intelligence-european-approach-excellence-and-trust_
en
74
76. IEEE Global Initiative on Ethics of A/IS: Principles
1. Human Rights: A/IS‡
shall be created and operated to respect, promote, and protect
internationally recognized human rights.
2. Well-being: A/IS creators shall adopt increased human well-being as a primary success
criterion for development.
3. Data Agency: A/IS creators shall empower individuals with the ability to access and
securely share their data, to maintain people’s capacity to have control over their identity.
4. Effectiveness: A/IS creators and operators shall provide evidence of the effectiveness and
fitness for purpose of A/IS.
‡
Autonomous and Intelligent Systems
76
77. IEEE Global Initiative on Ethics of A/IS: Principles (cont.)
5. Transparency: The basis of a particular A/IS decision should always be discoverable.
6. Accountability: A/IS shall be created and operated to provide an unambiguous rationale
for all decisions made.
7. Awareness of Misuse: A/IS creators shall guard against all potential misuses and risks of
A/IS in operation.
8. Competence: A/IS creators shall specify and operators shall adhere to the knowledge and
skill required for safe and effective operation.
77
79. IEEE Global Initiative on Ethics suggests adoption of Millian ethics to overcome general
assumptions of anthropomorphism in A/IS
Determinism: human actions follow necessarily from antecedent conditions and psychological
laws. Laplace’s demon could predict perfectly human behaviour.
However, for example:
1. Antecedent conditions include the education we received
2. We can therefore modify our future actions by self-education
Doctrine of free will: “our will, by influencing some of our circumstances, can modify our
future habits or capabilities of willingӤ
§
J. S. Mill. “Autobiography”. 1873.
79
80. Which Ethics
Traditional ethics: virtues or moral character (Plato, Aristotle); deontology or duties and
rules (Kant); utilitarianism or consequence of actions (Mill)
Feminist ethics or ethics of care (Noddings): should we care for A/IS?
Globally diverse traditions: Buddhist/Ubuntu/Shinto-Influenced Ethical tradition and their
role in A/IS
80
81. Policy framework
1. Ensure that A/IS support, promote, and enable internationally recognised legal norms.
2. Develop government expertise in A/IS.
3. Ensure governance and ethics are core components in A/IS research, development, acquisition, and use.
4. Create policies for A/IS to ensure public safety and responsible A/IS design.
5. Educate the public on the ethical and societal impacts of A/IS.
81
82. 7000. Model Process for Addressing Ethical Concerns During System Design
7001. Transparency of Autonomous Systems
7002. Data Privacy Process
7003. Algorithmic Bias Considerations
7004. Standard on Child and Student Data Governance
7005. Standard on Employer Data Governance
7006. Standard on Personal Data AI Agent Working Group
7007. Ontological Standard for Ethically driven Robotics and Automation Systems
7008. Standard for Ethically Driven Nudging for Robotic, Intelligent and Autonomous Systems
7009. Standard for Fail-Safe Design of Autonomous and Semi-Autonomous Systems
7010. IEEE Recommended Practice for Assessing the Impact of Autonomous and Intelligent
Systems on Human Well-Being
7011. Standard for the Process of Identifying & Rating the Trust-worthiness of News Sources
7012. Standard for Machine Readable Personal Privacy Terms
7013. Standard for Inclusion and Application Standards for Automated Facial Analysis
Technology
7014. Standard for Ethical considerations in Emulated Empathy in Autonomous and Intelligent
Systems
82
https://ethicsinaction.ieee.org/p7000/
83. IEEE 7010-2020. Well-being Impact Assessment: an iterative process
1. Internal analysis and user and stakeholder engagement
The nature of the AI system • The needs it meets or problems it solves • Who the
users (intended and unintended) are • Who the broader stakeholders might be • The
likelihood of possible positive and negative impacts, and how can they be mitigated.
2. Development and refinement of well-being indicators dashboard
Twelve domains: affect, community, culture, education, economy, environment, health,
human settlements, government, psychological/mental well-being, and work.
3. Data planning and collection
Collection of both baseline data and data over time, allowing changes in well-being
indicators to be assessed over time
4. Data analysis and improvement to AI
Analysis helps determine if an AI does have negative impacts, or if efforts to mitigate
negative impacts or increase positive impacts are successful. Importantly, analysis then
feeds into improvements to AI design, development, assessment, monitoring, and
management.
5. Iteration
83
https://arxiv.org/abs/2005.06620
85. “
For AI made in Europe one key principle will be ethics by design whereby ethical and
legal principles, on the basis of the General Data Protection Regulation, competition
law compliance, absence of data bias are implemented since the beginning of the design
process. When defining the operational requirements, it is also important to take into
account the interactions between humans and AI systems.
Another key principle will be security by design, whereby cybersecurity, the protection of victims and the facilitation of law enforcement activities should be taken into
account from the beginning of the design process. ,,
85
European Commission, 2018. Annex to COM(2018) 795 final - Coordinated Plan on Artificial Intelligence
86. AI systems need to be human-centric, resting on a commitment to their use in the service of
humanity and the common good, with the goal of improving human welfare and freedom. We
therefore identify Trustworthy AI as our foundational ambition, since human beings and
communities will only be able to have confidence in the technology’s development.
86
EUROPEAN COMMISSION, 2019. High-Level Expert Group on Artificial Intelligence.
87. Trustworthy AI has three components, which should be met throughout the system’s entire life
cycle:
1. it should be lawful, complying with all applicable laws and regulations;
2. it should be ethical, ensuring adherence to ethical principles and values; and
3. it should be robust, both from a technical and social perspective, since, even with good
intentions, AI systems can cause unintentional harm.
Even if an ethical purpose is ensured, individuals and society must also be confident that
AI systems will not cause any unintentional harm. Such systems should perform in a safe,
secure and reliable manner, and safeguards should be foreseen to prevent any unintended
adverse impacts. It is therefore important to ensure that AI systems are robust.
87
88. Ethical Principles
• Respect for human autonomy
• Prevention of harm
AI systems should neither cause nor exacerbate harm or otherwise adversely affect human
beings. [...] They must be technically robust and it should be ensured that they are not
open to malicious use. Note that the principle of prevention of harm and the principle of
human autonomy may be in conflict.
• Fairness
• Explicability
Explicability is crucial for building and maintaining users’ trust in AI systems. This means
that processes need to be transparent, the capabilities and purpose of AI systems openly
communicated, and decisions—to the extent possible—explainable to those directly and
indirectly affected. Without such information, a decision cannot be duly contested. An
explanation as to why a model has generated a particular output or decision (and what
combination of input factors contributed to that) is not always possible.
88
89. Requirements of Trustworthy AI
Human agency and oversight
Technical robustness and safety
Privacy and data governance
Transparency
Diversity, non-discrimination, and fairness
Societal and environmental wellbeing
Accountability
89
90. Technical Robustness and Safety: Resilience to attack and security
AI systems, like all software systems, should be protected against vulnerabilities that can allow
them to be exploited by adversaries, e.g. hacking.
Attacks may target the data (data poisoning), the model (model leakage) or the underlying
infrastructure, both software and hardware. If an AI system is attacked, e.g. in adversarial
attacks, the data as well as system behaviour can be changed, leading the system to make
different decisions, or causing it to shut down altogether.
Systems and data can also become corrupted by malicious intention or by exposure to
unexpected situations.
Insufficient security processes can also result in erroneous decisions or even physical harm.
For AI systems to be considered secure, possible unintended applications of the AI system (e.g.
dual-use applications) and potential abuse of the system by malicious actors should be taken
into account, and steps should be taken to prevent and mitigate these.
90
91. Technical Robustness and Safety: Fallback plan and general safety
AI systems should have safeguards that enable a fallback plan in case of problems. This can
mean that AI systems switch from a statistical to rule-based procedure, or that they ask for a
human operator before continuing their action.
It must be ensured that the system will do what it is supposed to do without harming living
beings or the environment. This includes the minimisation of unintended consequences and
errors.
In addition, processes to clarify and assess potential risks associated with the use of AI systems,
across various application areas, should be established. The level of safety measures required
depends on the magnitude of the risk posed by an AI system, which in turn depends on the
system’s capabilities.
Where it can be foreseen that the development process or the system itself will pose
particularly high risks, it is crucial for safety measures to be developed and tested proactively.
91
92. Technical Robustness and Safety: Accuracy
Accuracy pertains to an AI system’s ability to make correct judgements, for example to
correctly classify information into the proper categories, or its ability to make correct
predictions, recommendations, or decisions based on data or models.
An explicit and well-formed development and evaluation process can support, mitigate and
correct unintended risks from inaccurate predictions. When occasional inaccurate predictions
cannot be avoided, it is important that the system can indicate how likely these errors are. A
high level of accuracy is especially crucial in situations where the AI system directly affects
human lives.
92
93. Technical Robustness and Safety: Reliability and Reproducibility
It is critical that the results of AI systems are reproducible, as well as reliable. A reliable AI
system is one that works properly with a range of inputs and in a range of situations. This is
needed to scrutinise an AI system and to prevent unintended harms.
Reproducibility describes whether an AI experiment exhibits the same behaviour when repeated
under the same conditions. This enables scientists and policy makers to accurately describe
what AI systems do. Replication files can facilitate the process of testing and reproducing
behaviours.
93
94. Transparency: Traceability
The data sets and the processes that yield the AI system’s decision, including those of data
gathering and data labelling as well as the algorithms used, should be documented to the best
possible standard to allow for traceability and an increase in transparency.
This also applies to the decisions made by the AI system.
This enables identification of the reasons why an AI-decision was erroneous which, in turn,
could help prevent future mistakes. Traceability facilitates auditability as well as explainability.
94
95. Transparency: Explainability
Explainability concerns the ability to explain both the technical processes of an AI system and
the related human decisions (e.g. application areas of a system).
Technical explainability requires that the decisions made by an AI system can be understood
and traced by human beings.
Moreover, trade-offs might have to be made between enhancing a system’s explainability (which
may reduce its accuracy) or increasing its accuracy (at the cost of explainability). Whenever an
AI system has a significant impact on people’s lives, it should be possible to demand a suitable
explanation of the AI system’s decision-making process. Such explanation should be timely and
adapted to the expertise of the stakeholder concerned (e.g. layperson, regulator or researcher).
In addition, explanations of the degree to which an AI system influences and shapes the
organisational decision-making process, design choices of the system, and the rationale for
deploying it, should be available (hence ensuring business model transparency).
95
96. Transparency: Communication
AI systems should not represent themselves as humans to users; humans have the right to be
informed that they are interacting with an AI system. This entails that AI systems must be
identifiable as such.
In addition, the option to decide against this interaction in favour of human interaction should
be provided where needed to ensure compliance with fundamental rights. Beyond this, the AI
system’s capabilities and limitations should be communicated to AI practitioners or end-users in
a manner appropriate to the use case at hand. This could encompass communication of the AI
system’s level of accuracy, as well as its limitations.
96
97. Initially: Make clear what the system can do • Make clear how well the system can do
what it can do
During interaction: Time services based on context • Show contextually relevant
information • Match relevant social norms • Mitigate social biases
When wrong: Support efficient invocation • Support efficient dismissal • Support
efficient correction • Scope services, when in doubt • Make clear why the system did
what it did
Over time: Remember recent interactions • Learn from user behaviour • Update and
adapt cautiously • Encourage granular feedback • Convey the consequences of user
actions • Provide global controls • Notify users about changes
https://aka.ms/aiguidelines
§
S. Amershi et al., “Guidelines for Human-AI Interaction,” CHI 2019
97
98. “
Even in simple collaboration scenarios, e.g. those in which an AI system assists a
human operator with predictions, the success of the team hinges on the human correctly
deciding when to follow the recommendations of the AI system and when to override
them. [. . . ]
Extracting benefits from collaboration with the AI system depends on the human
developing insights (i.e., a mental model) of when to trust the AI system with its
recommendations. [. . . ]
If the human mistakenly trusts the AI system in regions where it is likely to err,
catastrophic failures may occur. ,,
§
Bansal, Gagan, et al. “Beyond Accuracy: The Role of Mental Models in Human-AI Team Performance.”
AAAI Conference on Human Computation and Crowdsourcing. 2019.
98
99. Misclassification of the white side of a trailer as bright sky: this caused a car operating with
automated vehicle control systems (level 2) to crash against a tractor-semitrailer truck near
Williston, Florida, USA on 7th May 2016.
The car driver died due to the injuries sustained.
The car manufacturer stated that the “camera failed to recognize the white truck against a
bright sky.Ӧ
¶
http://tiny.cc/2tb4uy
99
105. (1) 1 = 0
(2) 3 = 0 (multiplying both sides of (1) by 3)
(3) π = 0 (multiplying both sides of (1) by π)
(4) π = 3 (from (2) and (3))
105
106. “
If I am a rock (r) then light travels at 1 metre per second (l) ,,
(1) r ⊃ l ≡ ¬r ∨ l
(2) r ≡ ⊥ (I am not a rock)
(3) l ≡ ⊤ makes (1) true
r  l  |  ¬r  |  ¬r ∨ l
⊤  ⊤  |  ⊥   |  ⊤
⊤  ⊥  |  ⊥   |  ⊥
⊥  ⊤  |  ⊤   |  ⊤
⊥  ⊥  |  ⊤   |  ⊤
106
107. “
All the elephants in the room are pink ,,
(1) ∀X elephant(X) ⊃ pink(X)
(2) If elephant(X) ≡ ⊥ for every X (there are no elephants in the room), then (1) is trivially true (vacuous truth)
107
108. “
Fish identified previously from Sri Lanka as P. amphibius (Pethiyagoda 1991), are
now recognized as an endemic species, P. kamalika (Silva et al. 2008), which is re-
stricted to the wet zone. ,,
[Diagram: FishXYZ labelled as P. amphibius (Pethiyagoda, 1991)]
¶
Bahir, M., & Gabadage, D. (2009). Taxonomic and scientific inaccuracies in a consultancy report on
biodiversity: a cautionary note. Journal of Threatened Taxa, 1(6), 317-322.
108
109. “
Fish identified previously from Sri Lanka as P. amphibius (Pethiyagoda 1991), are
now recognized as an endemic species, P. kamalika (Silva et al. 2008), which is re-
stricted to the wet zone. ,,
[Diagram: FishXYZ labelled as P. amphibius (Pethiyagoda, 1991) and as P. kamalika (Silva et al. 2008)]
¶
Bahir, M., & Gabadage, D. (2009). Taxonomic and scientific inaccuracies in a consultancy report on
biodiversity: a cautionary note. Journal of Threatened Taxa, 1(6), 317-322.
109
110. “
P. ticto (Hamilton, 1822) was described from Bengal. Deraniyagala (1956) gave a
name to the P. ticto like fish in Sri Lanka, describing it as P. ticto melanomaculatus.
At present P. ticto melanomaculatus is not recognized as a valid taxon (Pethiyagoda
1991). Recent molecular investigations show marked differences between P. ticto and
P. melanomaculatus (Meegaskumbura et al. 2008). ,,
[Diagram: P. ticto and P. ticto melanomaculatus (Deraniyagala 1956)]
¶
Bahir, M., & Gabadage, D. (2009). Taxonomic and scientific inaccuracies in a consultancy report on
biodiversity: a cautionary note. Journal of Threatened Taxa, 1(6), 317-322.
110
111. “
P. ticto (Hamilton, 1822) was described from Bengal. Deraniyagala (1956) gave a
name to the P. ticto like fish in Sri Lanka, describing it as P. ticto melanomaculatus.
At present P. ticto melanomaculatus is not recognized as a valid taxon (Pethiyagoda
1991). Recent molecular investigations show marked differences between P. ticto and
P. melanomaculatus (Meegaskumbura et al. 2008). ,,
[Diagram: P. ticto and P. ticto melanomaculatus (Deraniyagala 1956); Pethiyagoda, 1991]
¶
Bahir, M., & Gabadage, D. (2009). Taxonomic and scientific inaccuracies in a consultancy report on
biodiversity: a cautionary note. Journal of Threatened Taxa, 1(6), 317-322.
111
112. “
P. ticto (Hamilton, 1822) was described from Bengal. Deraniyagala (1956) gave a
name to the P. ticto like fish in Sri Lanka, describing it as P. ticto melanomaculatus.
At present P. ticto melanomaculatus is not recognized as a valid taxon (Pethiyagoda
1991). Recent molecular investigations show marked differences between P. ticto and
P. melanomaculatus (Meegaskumbura et al. 2008). ,,
[Diagram: P. ticto and P. ticto melanomaculatus (Deraniyagala 1956); Pethiyagoda, 1991; Meegaskumbura et al., 2008]
¶
Bahir, M., & Gabadage, D. (2009). Taxonomic and scientific inaccuracies in a consultancy report on
biodiversity: a cautionary note. Journal of Threatened Taxa, 1(6), 317-322.
112
114. Agenda
Introduction to AI (for security)
• GOFAI
• ML
Guidelines
• Rules
• Guidelines for creating and using AI
Security of AI
• GOFAI
• ML
114
115. Training Phase Attacks
Attacks at training time attempt to influence or corrupt the model directly by altering the
dataset used for training.
• Data Injection: The adversary has no access to the training data nor to the learning
algorithm, but has the ability to add new data to the training set. He can corrupt the
target model by inserting adversarial samples into the training dataset.
• Data Modification: The adversary has no access to the learning algorithm but has full
access to the training data. He poisons the training data directly by modifying it before
it is used for training the target model.
• Logic Corruption: The adversary has the ability to meddle with the learning algorithm;
these attacks are referred to as logic corruption. It is very difficult to design a
counter-strategy against adversaries who can alter the learning logic and thereby
control the model itself.
115
116. Testing Phase Attacks
Adversarial attacks at testing time do not tamper with the targeted model but rather force
it to produce incorrect outputs. The effectiveness of such attacks is determined mainly by the
amount of information available to the adversary about the model.
Testing phase attacks can be broadly classified into either White-Box or Black-Box attacks.
116
117. Testing Phase Attacks: White-Box Attacks
In a white-box attack on a machine learning model, the adversary has total knowledge of the
model used for classification (e.g., the type of neural network and its number of layers).
The attacker knows the algorithm used in training (e.g., gradient-descent optimization) and
can access the training data distribution.
He also knows the parameters of the fully trained model.
The adversary uses the available information to identify the region of feature space where the
model is vulnerable, i.e., where it has a high error rate.
The model is then exploited by altering an input with an adversarial-example crafting method
(more on this later).
117
118. Black-box Attacks: transferability
Adversarial sample transferability is the property that adversarial samples produced by training on a
specific model can affect another model, even if the two have different architectures.
In black-box attacks, the adversary does not have access to the target model F, and thus trains a
substitute model F′ locally to generate adversarial examples X + δX, which can then be transferred
to the victim model.
1. Intra-technique transferability: models F and F′ are both trained using the same machine
learning technique (e.g. both are NNs, or both are SVMs).
2. Cross-technique transferability: the learning techniques of F and F′ differ; for example, F is a
neural network and F′ is an SVM.
The attacks have been shown to generalize to non-differentiable target models, like SVMs. Therefore,
differentiable models such as neural networks or logistic regression can be used to learn a substitute
model for models trained with SVMs or nearest neighbours.
¶
Papernot, Nicolas, Patrick McDaniel, and Ian Goodfellow. “Transferability in machine learning: from
phenomena to black-box attacks using adversarial samples.” arXiv preprint arXiv:1605.07277 (2016).
118
119. Testing Phase Attacks: Black-Box Attacks
Non-Adaptive Black-Box Attack
For a target model f, a non-adaptive black-box adversary only gets access to the target
model’s training data distribution µ.
The adversary then chooses a training procedure for a model architecture f′ and trains a local
model over samples from the data distribution µ to approximate the model learned by the
target classifier.
The adversary crafts adversarial examples on the local model f′ using white-box attack
strategies and applies these crafted inputs to the target model to force misclassifications.
119
120. Testing Phase Attacks: Black-Box Attacks
Adaptive Black-Box Attack
For a target model f, an adaptive black-box adversary has no information about the training
process but can access the target model as an oracle (analogous to a chosen-plaintext attack
in cryptography).
The adversary issues adaptive oracle queries to the target model to label a carefully selected
dataset, i.e., for any arbitrarily chosen x the adversary obtains its label y by querying the
target model f.
The adversary then chooses a training procedure train′ and a model architecture f′ to train a
surrogate model over the tuples (x, y) obtained from querying the target model.
The surrogate model then produces adversarial samples by following white-box attack
techniques, forcing the target model to misclassify malicious data.
120
121. Testing Phase Attacks: Black-Box Attacks
Strict Black-Box Attack
A black-box adversary sometimes may not have access to the data distribution µ but has the
ability to collect input-output pairs (x, y) from the target classifier.
However, he cannot change the inputs to observe the changes in output, as in the adaptive
attack procedure.
This strategy is analogous to the known-plaintext attack in cryptography and is most likely to
succeed given a large set of input-output pairs.
121
122. Adversary Goals
• Confidence Reduction: The adversary tries to reduce the confidence of the prediction of the
target model. For example, a legitimate image of a ‘stop’ sign is predicted with lower
confidence, i.e., with a lower class-membership probability.
• Misclassification: The adversary tries to alter the output classification of an input example
to any class different from the original class. For example, a legitimate image of a ‘stop’
sign will be predicted as any other class different from the class of stop sign.
• Targeted Misclassification: The adversary tries to produce inputs that force the output of
the classification model to be a specific target class. For example, any input image to the
classification model will be predicted as a class of images having ‘go’ sign.
• Source/Target Misclassification: The adversary attempts to force the output of
classification for a specific input to be a particular target class. For example, the input
image of ‘stop’ sign will be predicted as ‘go’ sign by the classification model.
122
123. • Exploratory Attack: These attacks do not influence the training dataset. Given black-box
access to the model, they try to gain as much knowledge as possible about the learning
algorithm of the underlying system and the patterns in the training data.
• Evasion Attack: This is the most common type of attack in the adversarial setting. The
adversary tries to evade the system by adjusting malicious samples during testing phase.
This setting does not assume any influence over the training data.
• Poisoning Attack: This type of attack, known as contamination of the training data, takes
place during the training time of the machine learning model. An adversary tries to poison
the training data by injecting carefully designed samples to compromise the whole learning
process eventually.
123
124. Exploratory Attacks: Model Inversion Attack
Fredrikson et al. consider a linear regression model f that predicts drug dosage using patient
information, medical history, and genetic markers; given white-box access to the model f and
an instance of data (X = {x1, x2, . . . , xn}, y), model inversion infers the genetic marker x1.
¶
Fredrikson, Matthew, et al. “Privacy in pharmacogenetics: An end-to-end case study of personalized
warfarin dosing.” 23rd USENIX Security Symposium (USENIX Security 14). 2014.
124
125. An attacker can produce a recognizable image of a person, given only API access to a facial
recognition system and the name of the person whose face is recognized by it.
¶
Fredrikson, Matt, Somesh Jha, and Thomas Ristenpart. “Model inversion attacks that exploit confidence
information and basic countermeasures.” Proceedings of the 22nd ACM SIGSAC Conference on Computer and
Communications Security. 2015.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted
without fee provided that copies are not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation.
125
127. Poisoning Attacks: Adversarial Examples Generation
• Label Manipulation: The adversary has the capability to modify the training labels only,
and he obtains the most vulnerable label given the full or partial knowledge of the learning
model.
A basic approach is to randomly perturb the labels, i.e., select new labels for a subset of
training data by picking from a random distribution.
• Input Manipulation: In this scenario, the adversary is more powerful and can corrupt the
input features of training points analyzed by the learning algorithm, in addition to its
labels.
This scenario also assumes that the adversary has the knowledge of the learning algorithm.
127
129. Evasion Attacks
• White-box attacks: two steps
1. Direction Sensitivity Estimation
2. Perturbation Selection
• Black-box attacks
129
130. White Box attacks. Step 1: Direction Sensitivity Estimation
The adversary evaluates the sensitivity of a class change to each input feature by identifying
directions in the data manifold around the sample X in which the model F is most sensitive
and most likely to produce a class change.
130
131. Fast Gradient Method (FGM): calculates the gradient of the cost function with respect to the
input of the neural network. The adversarial examples are generated using the following
equation:
X∗ = X + ε · sign(∇X J(X, ytrue))
Here, J is the cost function of the trained model, ∇X denotes its gradient with respect to a
normal sample X with correct label ytrue, and ε denotes the input variation parameter which
controls the perturbation’s amplitude.
¶
I. Goodfellow, J. Shlens, C. Szegedy 2015 Explaining and Harnessing Adversarial Examples. In ICLR 2015.
https://arxiv.org/abs/1412.6572
131
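As an illustration, the FGM update can be sketched for a linear softmax classifier, where the gradient of the cross-entropy cost with respect to the input is available in closed form. The function and parameter names below are illustrative, not from the cited paper; for a deep network the gradient would instead come from backpropagation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def fgm_attack(W, b, x, y_true, eps):
    """X* = X + eps * sign(grad_X J(X, y_true)), with J the cross-entropy
    cost of a linear softmax classifier with logits W @ x + b."""
    p = softmax(W @ x + b)
    p[y_true] -= 1.0              # dJ/dlogits = softmax(logits) - onehot(y_true)
    grad_x = W.T @ p              # chain rule through the linear layer
    return x + eps * np.sign(grad_x)
```

Note that the perturbation of every input dimension is bounded by ε, since only the sign of the gradient is used.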
132. White Box attacks. Step 2: Perturbation Selection
The adversary then exploits the knowledge of sensitive information to select a perturbation δX
among the input dimensions in order to obtain an adversarial perturbation which is most
efficient.
132
Perturb all the input dimensions by a small quantity in the direction of the sign of the
gradient calculated using the FGM method.
This method efficiently minimizes the Euclidean distance between the original and the
corresponding adversarial samples.
133
134. Perturb selected input dimensions: select only a limited number of input dimensions to
perturb, by identifying which combination of input dimensions, if perturbed, will contribute to
the adversarial goals.
This method effectively reduces the number of input features perturbed while crafting
adversarial examples.
To choose the input dimensions that form the perturbation, all the dimensions are sorted in
decreasing order of their contribution to the adversarial goal.
Input components are added to the perturbation δx in decreasing order until the resulting
sample x∗ = x + δx is misclassified by the model F.
134
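A minimal sketch of this greedy selection, using the magnitude of a precomputed gradient as the measure of each dimension's contribution and assuming a `predict` callable that returns the model's class; all names here are hypothetical:

```python
import numpy as np

def greedy_perturbation(x, grad, predict, y_true, step, max_dims):
    """Perturb one input dimension at a time, in decreasing order of
    |gradient|, until the model's predicted class changes."""
    x_adv = x.copy()
    for d in np.argsort(-np.abs(grad))[:max_dims]:   # most influential first
        x_adv[d] += step * np.sign(grad[d])
        if predict(x_adv) != y_true:
            break
    return x_adv
```

Stopping at the first misclassification keeps the number of perturbed features as small as possible, which is the point of this strategy.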
135. LISA-CNN could interpret this as Speed Limit 45
¶
Kevin Eykholt, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash,
Tadayoshi Kohno, Dawn Song Robust Physical-World Attacks on Deep Learning Visual Classification Computer
Vision and Pattern Recognition (CVPR 2018)
Permission is granted to use and reproduce the images for publications or for research with acknowledgement
135
136.
137. Adversarial Training
Inject adversarial examples into the training set: the defender generates many adversarial
examples and augments the training data of the targeted model with these perturbed samples.
137
138. Gradient Hiding
A natural defense against gradient-based attacks and attacks using adversarial crafting
methods such as FGM consists in hiding information about the model’s gradient from the
adversary.
For instance, if the model is non-differentiable (e.g., an SVM or a nearest-neighbour
classifier), gradient-based attacks are rendered ineffective.
However, this defense is easily fooled by learning a surrogate black-box model that has a
gradient and crafting examples on it (cf. adversarial sample transferability).
138
139. Feature Squeezing
By reducing the complexity of the data representation, adversarial perturbations disappear
because of the reduced sensitivity.
Examples: reducing the quantization levels, or the sampling frequency.
Though these techniques work well in preventing adversarial attacks, they have the collateral
effect of worsening the accuracy of the model on clean examples.
139
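For instance, quantization-level reduction (bit-depth squeezing) for inputs in [0, 1] can be sketched in a couple of lines; the function name is illustrative:

```python
import numpy as np

def squeeze_bit_depth(x, bits):
    """Quantize inputs in [0, 1] to 2**bits levels; perturbations smaller
    than half a quantization step are rounded away."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels
```

With `bits=1` every pixel collapses to 0 or 1, which destroys small perturbations but also discards legitimate detail, illustrating the accuracy trade-off above.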
140. Blocking the Transferrability: Null Labelling
The main idea behind the proposed approach is to augment a new NULL label in the dataset
and train the classifier to reject the adversarial examples by classifying them as NULL.
Three steps:
1. Initial training of the target classifier on the clean dataset;
2. Computing the NULL probabilities: the probability of belonging to the NULL class is then
calculated, using a function f, for adversarial examples generated with different amounts
of perturbation;
3. Adversarial training: the classifier is then re-trained on each clean sample along with
differently perturbed versions of that sample. The label for the training data is decided
based on the NULL probabilities obtained in the previous step.
¶
Hosseini, Hossein, et al. “Blocking transferability of adversarial examples in black-box learning systems.”
arXiv preprint arXiv:1703.04318 (2017).
140
141. Uncertainty-Awareness
Change the loss function so as to output pieces of evidence in favour of different classes,
which are then combined through a Bayesian update resulting in a Dirichlet distribution.
¶
Sensoy, Murat, Lance Kaplan, and Melih Kandemir. “Evidential deep learning to quantify classification
uncertainty.” Advances in Neural Information Processing Systems. 2018.
141
142. Given the parameters of our model w, we can capture our assumptions about w, before
observing the data, in the form of a prior probability distribution p(w). The effect of the
observed data D = {t1, . . . , tN} is expressed through the conditional p(D|w), hence Bayes'
theorem takes the form:

p(w|D) = p(D|w) p(w) / p(D)

posterior ∝ likelihood · prior

p(D) = ∫ p(D|w) p(w) dw

The denominator p(D) ensures that the posterior distribution on the left-hand side is a valid
probability density and integrates to one.
142
143. Frequentist paradigm
• w is considered to be a fixed parameter,
whose value is determined by some form
of estimator, e.g. maximum likelihood,
in which w is set to the value that
maximises p(D|w)
• Error bars on this estimate are obtained
by considering the distribution of possible
data sets D.
• The negative log of the likelihood
function is called an error function: the
negative log is a monotonically decreasing
function hence maximising the likelihood
is equivalent to minimising the error.
Bayesian paradigm
• There is only one single data set D (the
one observed) and the uncertainty in the
parameters is expressed through a
probability distribution over w.
• The inclusion of prior knowledge arises
naturally: suppose that a fair-looking coin
is tossed three times and lands heads
each time. A classical maximum
likelihood estimate of the probability of
landing heads would give 1.
There are cases where you want to reduce
the dependence on the prior, hence using
noninformative priors.
143
144. Binary variable: Bernoulli
Let us consider a single binary random variable x ∈ {0, 1}, e.g. a coin flip, not necessarily fair,
hence the probability is conditioned on a parameter 0 ≤ µ ≤ 1:

p(x = 1|µ) = µ

The probability distribution over x is known as the Bernoulli distribution:

Bern(x|µ) = µ^x (1 − µ)^(1−x)

E[x] = µ
144
Now suppose that we have a data set of observations x = (x1, . . . , xN)^T drawn independently
from a Bernoulli distribution (iid) whose mean µ is unknown, and we would like to determine
this parameter from the data set.

p(D|µ) = Π_{n=1}^{N} p(xn|µ) = Π_{n=1}^{N} µ^xn (1 − µ)^(1−xn)

Let us maximise the log-likelihood to identify the parameter (the log simplifies the product
and reduces the risk of underflow):

ln p(D|µ) = Σ_{n=1}^{N} ln p(xn|µ) = Σ_{n=1}^{N} { xn ln µ + (1 − xn) ln(1 − µ) }
145
146. The log likelihood depends on the N observations xn only through their sum Σn xn, hence the
sum provides an example of a sufficient statistic for the data under this distribution:
no other statistic that can be calculated from the same sample provides any additional
information as to the value of the parameter.
146
147. Setting the derivative of the log likelihood to zero:

d/dµ ln p(D|µ) = 0

Σ_{n=1}^{N} ( xn/µ − (1 − xn)/(1 − µ) ) = 0

Σ_{n=1}^{N} (xn − µ) / ( µ(1 − µ) ) = 0

Σ_{n=1}^{N} xn = Nµ

µML = (1/N) Σ_{n=1}^{N} xn

aka the sample mean. Risk of overfitting: consider tossing the coin three times and getting
heads each time.
147
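The closed-form estimator above is just the sample mean; a two-line sketch makes the overfitting risk concrete (three heads in three tosses give µML = 1):

```python
import numpy as np

def bernoulli_mle(x):
    """Maximum-likelihood estimate of mu for iid Bernoulli data: the sample mean."""
    return float(np.mean(x))
```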
To develop a Bayesian treatment of the overfitting problem of the maximum likelihood
estimator for the Bernoulli, note that the likelihood takes the form of a product of factors of
the form µ^x (1 − µ)^(1−x). If we choose a prior proportional to powers of µ and (1 − µ),
then the posterior distribution, proportional to the product of the prior and the likelihood,
will have the same functional form as the prior. This property is called conjugacy.
148
149. Binary variables: Beta distribution

Beta(µ|a, b) = Γ(a + b) / ( Γ(a) Γ(b) ) · µ^(a−1) (1 − µ)^(b−1)

with

Γ(x) ≡ ∫_0^∞ u^(x−1) e^(−u) du

E[µ] = a / (a + b)

var[µ] = ab / ( (a + b)² (a + b + 1) )

a and b are hyperparameters controlling the distribution of the parameter µ.
149
150. [Plots of Beta(µ|a, b) for (a, b) = (0.1, 0.1), (1, 1), (2, 3), (8, 4)]
150
Fig. 2.2a-d of C. M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer-Verlag.
c 2006 C. M. Bishop. Permission is given to reproduce the figures for non-commercial purposes including education and research.
151. Considering a beta distribution prior and the binomial likelihood function, and given
l = N − m:

p(µ|m, l, a, b) ∝ µ^(m+a−1) (1 − µ)^(l+b−1)

Hence p(µ|m, l, a, b) is another beta distribution, and we can rearrange the normalisation
coefficient as follows:

p(µ|m, l, a, b) = Γ(m + a + l + b) / ( Γ(m + a) Γ(l + b) ) · µ^(m+a−1) (1 − µ)^(l+b−1)

[Plots: prior, likelihood function, and posterior over µ]
151
Fig. 2.3a-c of C. M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer-Verlag.
c 2006 C. M. Bishop. Permission is given to reproduce the figures for non-commercial purposes including education and research.
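By conjugacy, the update above reduces to adding the counts to the hyperparameters; a minimal sketch (the function name is illustrative):

```python
def beta_posterior(a, b, m, l):
    """Beta(a, b) prior plus m observations of x=1 and l observations of x=0
    gives a Beta(a + m, b + l) posterior; also return the posterior mean."""
    a_post, b_post = a + m, b + l
    return a_post, b_post, a_post / (a_post + b_post)
```

Unlike the maximum-likelihood estimate, three heads in a row after a Beta(2, 2) prior give a posterior mean of 5/7, not 1.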
152. Epistemic vs Aleatoric uncertainty
Aleatoric uncertainty
Variability in the outcome of an experiment which is due to inherently random effects (e.g.
flipping a fair coin): no additional source of information, short of Laplace's demon, can
reduce such variability.
Epistemic uncertainty
Epistemic state of the agent using the model,
hence its lack of knowledge that—in
principle—can be reduced on the basis of
additional data samples.
It is a general property of Bayesian learning
that, as we observe more and more data, the
epistemic uncertainty represented by the
posterior distribution will steadily decrease
(the variance decreases).
152
153. Multinomial variables: categorical distribution
Let us suppose we roll a die with K = 6 faces. An observation of this variable x equivalent to
x3 = 1 (e.g. the number 3 face up) can be written:

x = (0, 0, 1, 0, 0, 0)^T

Note that such vectors must satisfy Σ_{k=1}^{K} xk = 1.

p(x|µ) = Π_{k=1}^{K} µk^xk

where µ = (µ1, . . . , µK)^T, and the parameters µk are such that µk ≥ 0 and Σk µk = 1.
A generalisation of the Bernoulli.
153
154. p(D|µ) = Π_{n=1}^{N} Π_{k=1}^{K} µk^xnk

The likelihood depends on the N data points only through the K quantities

mk = Σn xnk

which represent the number of observations of xk = 1 (e.g. with k = 3, the third face of the
die). These are called the sufficient statistics for this distribution.
154
155. Finding the maximum likelihood requires a Lagrange multiplier λ, maximising

Σ_{k=1}^{K} mk ln µk + λ ( Σ_{k=1}^{K} µk − 1 )

Hence

µk^ML = mk / N

which is the fraction of the N observations for which xk = 1.
155
156. Multinomial variables: the Dirichlet distribution
The Dirichlet distribution is the generalisation of the beta distribution to K dimensions.

Dir(µ|α) = Γ(α0) / ( Γ(α1) · · · Γ(αK) ) · Π_{k=1}^{K} µk^(αk−1)

such that Σk µk = 1, α = (α1, . . . , αK)^T, αk ≥ 0, and

α0 = Σ_{k=1}^{K} αk
156
157. Considering a Dirichlet distribution prior and the categorical likelihood function, the posterior
is then:

p(µ|D, α) = Dir(µ|α + m) = Γ(α0 + N) / ( Γ(α1 + m1) · · · Γ(αK + mK) ) · Π_{k=1}^{K} µk^(αk+mk−1)

The uniform prior is given by Dir(µ|1) and Jeffreys’ non-informative prior is given by
Dir(µ|(0.5, . . . , 0.5)^T).
The marginals of a Dirichlet distribution are beta distributions.
157
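As in the beta-Bernoulli case, conjugacy makes the update a simple vector addition; a minimal sketch:

```python
import numpy as np

def dirichlet_posterior(alpha, counts):
    """Dir(alpha) prior plus category counts m_k gives Dir(alpha + m)."""
    return np.asarray(alpha, dtype=float) + np.asarray(counts, dtype=float)
```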
158. From Evidence to Dirichlet
Let us now assume a Dirichlet distribution over K classes that is the result of a Bayesian
update with N observations, starting from a uniform prior:

Dir(µ | α) = Dir(µ | e1 + 1, e2 + 1, . . . , eK + 1)

where ek is the number of observations (evidence) for class k, and Σk ek = N.
158
159. Dirichlet and Epistemic Uncertainty
The epistemic uncertainty associated with a Dirichlet distribution Dir(µ | α) is given by

u = K / S

with K the number of classes and S = α0 = Σ_{k=1}^{K} αk the Dirichlet strength.
Note that if the Dirichlet has been computed as the result of a Bayesian update from a
uniform prior, 0 ≤ u ≤ 1, and u = 1 implies that we are considering the uniform distribution
(an extreme case of Dirichlet distribution).
Let us denote µk = αk / S.
159
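Combining the last two slides, a short sketch computing u = K/S from raw per-class evidence, assuming the uniform-prior update of slide 158 (function names are illustrative):

```python
import numpy as np

def dirichlet_from_evidence(e):
    """alpha_k = e_k + 1: Bayesian update of a uniform Dir(1) prior."""
    return np.asarray(e, dtype=float) + 1.0

def epistemic_uncertainty(alpha):
    """u = K / S, with S the Dirichlet strength sum_k alpha_k."""
    return len(alpha) / float(np.sum(alpha))
```

With no evidence at all, u = K/K = 1 (the uniform distribution); accumulating evidence increases S and drives u towards 0.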
160. Loss function
If we then consider Dir(µi | αi) as the prior for a multinomial p(yi | µi), we can then compute
the expected squared error (aka Brier score):

E[ ||yi − µi||² ] = Σ_{k=1}^{K} E[ yi,k² − 2 yi,k µi,k + µi,k² ]
= Σ_{k=1}^{K} ( yi,k² − 2 yi,k E[µi,k] + E[µi,k²] )
= Σ_{k=1}^{K} ( yi,k² − 2 yi,k E[µi,k] + E[µi,k]² + var[µi,k] )
= Σ_{k=1}^{K} ( (yi,k − E[µi,k])² + var[µi,k] )
= Σ_{k=1}^{K} ( (yi,k − αi,k/Si)² + αi,k (Si − αi,k) / ( Si² (Si + 1) ) )
= Σ_{k=1}^{K} ( (yi,k − µi,k)² + µi,k (1 − µi,k) / (Si + 1) )

with µi,k = αi,k/Si. The loss over a batch of training samples is the sum of the loss for each
sample in the batch.
Sensoy, Murat, Lance Kaplan, and Melih Kandemir. “Evidential deep learning to quantify classification
uncertainty.” Advances in Neural Information Processing Systems. 2018.
160
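The final expression translates directly into NumPy for a single sample (one-hot label y, Dirichlet parameters alpha; the function name is illustrative):

```python
import numpy as np

def edl_expected_mse(y, alpha):
    """Expected Brier score under Dir(alpha):
    sum_k (y_k - p_k)^2 + p_k (1 - p_k) / (S + 1), with p = alpha / S."""
    alpha = np.asarray(alpha, dtype=float)
    S = alpha.sum()
    p = alpha / S
    return float(np.sum((np.asarray(y) - p) ** 2 + p * (1 - p) / (S + 1)))
```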
161. Learning to say “I don’t know”
To avoid generating evidence for all the classes when the network cannot classify a given
sample (epistemic uncertainty), we introduce a term in the loss function that penalises the
divergence from the uniform distribution:

L = Σ_{i=1}^{N} E[ ||yi − µi||² ] + λt Σ_{i=1}^{N} KL( Dir(µi | α̃i) || Dir(µi | 1) )

where:
• λt is another hyperparameter; the suggestion is to make it depend on the number of training
epochs, e.g. λt = min(1, t/CONST) with t the current training epoch, so that the effect of
the KL divergence is gradually increased, avoiding premature convergence to the uniform
distribution in the early epochs when the learning algorithm still needs to explore the
parameter space;
• α̃i = yi + (1 − yi) ⊙ αi are the Dirichlet parameters the neural network, in a forward pass,
has put on the wrong classes, and the idea is to minimise them as much as possible.
Sensoy, Murat, Lance Kaplan, and Melih Kandemir. “Evidential deep learning to quantify classification
uncertainty.” Advances in Neural Information Processing Systems. 2018.
161
162. KL recap
Consider some unknown distribution p(x), and suppose that we have modelled it using q(x).
If we use q(x) instead of p(x) to represent the true values of x, the average additional amount
of information required is:

KL(p||q) = − ∫ p(x) ln q(x) dx − ( − ∫ p(x) ln p(x) dx )
= − ∫ p(x) ln ( q(x) / p(x) ) dx
= − E[ ln ( q(x) / p(x) ) ]

This is known as the relative entropy, or Kullback-Leibler (KL) divergence, between the
distributions p(x) and q(x).
Properties:
• KL(p||q) ≢ KL(q||p) in general (the divergence is not symmetric);
• KL(p||q) ≥ 0, and KL(p||q) = 0 if and only if p = q.
162
163. KL( Dir(µi | α̃i) || Dir(µi | 1) ) =

ln [ Γ( Σ_{k=1}^{K} α̃i,k ) / ( Γ(K) Π_{k=1}^{K} Γ(α̃i,k) ) ] + Σ_{k=1}^{K} (α̃i,k − 1) [ ψ(α̃i,k) − ψ( Σ_{j=1}^{K} α̃i,j ) ]

where ψ(x) = d/dx ln( Γ(x) ) is the digamma function.
Sensoy, Murat, Lance Kaplan, and Melih Kandemir. “Evidential deep learning to quantify classification
uncertainty.” Advances in Neural Information Processing Systems. 2018.
163
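This closed form can be sketched in code using the log-Gamma and digamma functions; the sketch below assumes `scipy.special.gammaln` and `scipy.special.digamma` are available:

```python
import numpy as np
from scipy.special import gammaln, digamma

def kl_dir_uniform(alpha):
    """KL( Dir(alpha) || Dir(1) ) for a K-dimensional Dirichlet."""
    alpha = np.asarray(alpha, dtype=float)
    K, S = alpha.size, alpha.sum()
    log_norm = gammaln(S) - gammaln(K) - gammaln(alpha).sum()
    return float(log_norm + np.sum((alpha - 1.0) * (digamma(alpha) - digamma(S))))
```

As a sanity check, the divergence is zero when alpha is the all-ones vector (the uniform Dirichlet compared with itself) and positive otherwise.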
165. EDL + GAN for adversarial training
M. Sensoy, L. Kaplan, F. Cerutti, M. Saleki, “Uncertainty-Aware Deep Classifiers using Generative Models.”
AAAI 2020
165
166. VAE + GAN
[Figure: VAE + GAN architectures (generator G, discriminators D, D′); original training samples (top) and generated samples]
Sensoy, Murat, Lance Kaplan, and Melih Kandemir. “Evidential deep learning to quantify classification
uncertainty.” Advances in Neural Information Processing Systems. 2018.
166
167. Robustness against FGS
Sensoy, Murat, Lance Kaplan, and Melih Kandemir. “Evidential deep learning to quantify classification
uncertainty.” Advances in Neural Information Processing Systems. 2018.
167
168. Anomaly detection
(mnist) (cifar10)
Sensoy, Murat, Lance Kaplan, and Melih Kandemir. “Evidential deep learning to quantify classification
uncertainty.” Advances in Neural Information Processing Systems. 2018.
168
170. Conclusions
Introduction to AI (for security)
• GOFAI
• ML
Guidelines
• Rules
• Guidelines for creating and using AI
Security of AI
• GOFAI
• ML
170