The document summarizes key concepts from chapter 2 of the lecture slides on linear algebra for deep learning. It defines scalars as single numbers and vectors as 1-D arrays of numbers that can be indexed. Matrices are 2-D arrays of numbers that are indexed with two numbers. Tensors generalize this to arrays with more dimensions. The document also discusses matrix operations like transpose, dot product, and inversion which are important for solving systems of linear equations. It introduces norms as functions to measure the size of vectors.
2. (Goodfellow 2016)
About this chapter
• Not a comprehensive survey of all of linear algebra
• Focused on the subset most relevant to deep learning
• Larger subset: e.g., Linear Algebra by Georgi Shilov
3. (Goodfellow 2016)
Scalars
• A scalar is a single number
• Integers, real numbers, rational numbers, etc.
• We denote it with italic font:
a, n, x
4. (Goodfellow 2016)
Vectors
• A vector is a 1-D array of numbers:
• Can be real, binary, integer, etc.
• Example notation for type and size:
x ∈ R^n

The numbers are arranged in order. We can identify each individual number by its index in that ordering. Typically we give vectors lower case names written in bold typeface, such as x. The elements of the vector are identified by writing its name in italic typeface, with a subscript. The first element of x is x_1, the second element is x_2 and so on. We also need to say what kind of numbers are stored in the vector. If each element is in R, and the vector has n elements, then the vector lies in the set formed by taking the Cartesian product of R n times, denoted as R^n. When we need to explicitly identify the elements of a vector, we write them as a column enclosed in square brackets:

x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}   (2.1)

We can think of vectors as identifying points in space, with each element giving the coordinate along a different axis.

Sometimes we need to index a set of elements of a vector. In this case, we define a set containing the indices and write the set as a subscript. For example, to access x_1, x_3 and x_6, we define the set S = {1, 3, 6} and write x_S. We use the − sign to index the complement of a set. For example x_{−1} is the vector containing all elements of x except for x_1, and x_{−S} is the vector containing all of the elements of x except for x_1, x_3 and x_6.

Matrices: A matrix is a 2-D array of numbers, so each element is identified by two indices instead of just one. We usually give matrices upper-case variable names with bold typeface, such as A. If a real-valued matrix A has a height of m and a width of n, then we say that A ∈ R^{m×n}.
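The set-indexing convention above maps directly onto NumPy's integer and boolean indexing. A minimal sketch (variable names are illustrative, and note NumPy is 0-based where the book's subscripts are 1-based):

```python
import numpy as np

# A vector in R^6; the book's x_1..x_6 correspond to indices 0..5 here.
x = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])

# Book notation x_S with S = {1, 3, 6} -> 0-based indices {0, 2, 5}.
S = [0, 2, 5]
x_S = x[S]                        # elements x_1, x_3, x_6

# Book notation x_{-S}: all elements of x except those indexed by S.
mask = np.ones(len(x), dtype=bool)
mask[S] = False
x_not_S = x[mask]                 # elements x_2, x_4, x_5
```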
5. (Goodfellow 2016)
Matrices
• A matrix is a 2-D array of numbers:
• Example notation for type and shape:
A ∈ R^{m×n}

A = \begin{bmatrix} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \\ A_{3,1} & A_{3,2} \end{bmatrix}
\quad \Rightarrow \quad
A^\top = \begin{bmatrix} A_{1,1} & A_{2,1} & A_{3,1} \\ A_{1,2} & A_{2,2} & A_{3,2} \end{bmatrix}

(Slide labels: rows run horizontally, columns run vertically.)

The transpose of the matrix can be thought of as a mirror image across the main diagonal. When we need to explicitly identify the elements of a matrix, we write them as an array enclosed in square brackets:

\begin{bmatrix} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \end{bmatrix}.   (2.2)

Sometimes we may need to index matrix-valued expressions that are not just a single letter. In this case, we use subscripts after the expression, but do not convert anything to lower case. For example, f(A)_{i,j} gives element (i, j) of the matrix computed by applying the function f to A.

Tensors: In some cases we will need an array with more than two axes. In the general case, an array of numbers arranged on a regular grid with a variable number of axes is known as a tensor. We denote a tensor named "A" with this typeface: A. We identify the element of A at coordinates (i, j, k) by writing A_{i,j,k}.
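The two-index convention for matrices corresponds to 2-D array indexing in NumPy. A small sketch (the concrete values are illustrative; indices are 0-based):

```python
import numpy as np

# A real-valued matrix A in R^{3x2}: height m = 3, width n = 2.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

m, n = A.shape                    # shape gives (height, width) = (3, 2)
a_12 = A[0, 1]                    # the book's A_{1,2}: row 1, column 2
row_1 = A[0, :]                   # the book's A_{1,:}, the first row
col_2 = A[:, 1]                   # the book's A_{:,2}, the second column
```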
6. (Goodfellow 2016)
Tensors
• A tensor is an array of numbers that may have
• zero dimensions, and be a scalar
• one dimension, and be a vector
• two dimensions, and be a matrix
• or more dimensions.
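This hierarchy is visible in NumPy's `ndim` attribute, where scalars, vectors, and matrices are simply tensors of 0, 1, and 2 dimensions. A quick illustrative sketch:

```python
import numpy as np

scalar = np.array(5.0)            # 0-D tensor: a scalar
vector = np.array([1.0, 2.0])     # 1-D tensor: a vector
matrix = np.eye(2)                # 2-D tensor: a matrix
tensor3 = np.zeros((2, 3, 4))     # 3-D tensor; A_{i,j,k} is tensor3[i, j, k]

dims = (scalar.ndim, vector.ndim, matrix.ndim, tensor3.ndim)
```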
7. (Goodfellow 2016)
Matrix Transpose
A = \begin{bmatrix} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \\ A_{3,1} & A_{3,2} \end{bmatrix}
\quad \Rightarrow \quad
A^\top = \begin{bmatrix} A_{1,1} & A_{2,1} & A_{3,1} \\ A_{1,2} & A_{2,2} & A_{3,2} \end{bmatrix}

Figure 2.1: The transpose of the matrix can be thought of as a mirror image across the main diagonal.

One important operation on matrices is the transpose. The transpose of a matrix is the mirror image of the matrix across a diagonal line, called the main diagonal, running down and to the right, starting from its upper left corner. See Fig. 2.1 for a graphical depiction of this operation. We denote the transpose of a matrix A as A^\top, and it is defined such that

(A^\top)_{i,j} = A_{j,i}.   (2.3)

Vectors can be thought of as matrices that contain only one column. The transpose of a vector is therefore a matrix with only one row.

Matrix operations have many useful properties that make mathematical analysis more convenient. For example, matrix multiplication is distributive:

A(B + C) = AB + AC.   (2.6)

It is also associative:

A(BC) = (AB)C.   (2.7)

Matrix multiplication is not commutative (the condition AB = BA does not always hold), unlike scalar multiplication. However, the dot product between two vectors is commutative:

x^\top y = y^\top x.   (2.8)

The transpose of a matrix product has a simple form:

(AB)^\top = B^\top A^\top.   (2.9)

This allows us to demonstrate Eq. 2.8, by exploiting the fact that the value of such a product is a scalar and therefore equal to its own transpose.
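Both identities, Eq. 2.8 (the dot product commutes) and Eq. 2.9 (transpose of a product), are easy to check numerically. A hedged sketch using randomly drawn matrices (the shapes and seed are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))
x = rng.standard_normal(5)
y = rng.standard_normal(5)

# (AB)^T = B^T A^T   (Eq. 2.9)
transpose_rule_holds = np.allclose((A @ B).T, B.T @ A.T)

# x^T y = y^T x      (Eq. 2.8): the dot product is commutative
dot_commutes = np.isclose(x @ y, y @ x)
```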
8. (Goodfellow 2016)
Matrix (Dot) Product
The matrix product of matrices A and B is a third matrix C. In order
for this product to be defined, A must have the same number of columns as B has
rows. If A is of shape m × n and B is of shape n × p, then C is of shape m × p.
We can write the matrix product just by placing two or more matrices together,
e.g.

C = AB.   (2.4)

The product operation is defined by

Ci,j = Σk Ai,k Bk,j.   (2.5)

Note that the standard product of two matrices is not just a matrix containing
the product of the individual elements. Such an operation exists and is called the
element-wise product or Hadamard product, and is denoted as A ⊙ B.

The dot product between two vectors x and y of the same dimensionality is the
matrix product x^T y. We can think of the matrix product C = AB as computing
Ci,j as the dot product between row i of A and column j of B.
[Diagram: an m × n matrix times an n × p matrix gives an m × p matrix; the inner dimensions n must match.]
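The definition Ci,j = Σk Ai,k Bk,j can be checked against an explicit triple loop. A sketch with NumPy (the helper `matmul_loops` is hypothetical, written just for this comparison):

```python
import numpy as np

def matmul_loops(A, B):
    """Matrix product via the definition C[i, j] = sum_k A[i, k] * B[k, j]."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "inner dimensions must match"
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]
    return C

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 3))   # m x n
B = rng.standard_normal((3, 4))   # n x p
assert np.allclose(matmul_loops(A, B), A @ B)   # the result C is m x p

# The Hadamard (element-wise) product is a different operation
# and requires operands of the same shape:
D = rng.standard_normal((2, 3))
assert np.allclose(A * D, D * A)
```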
9. (Goodfellow 2016)
Identity Matrix
I3 = [ 1 0 0
       0 1 0
       0 0 1 ]
Figure 2.2: Example identity matrix: This is I3.
A2,1x1 + A2,2x2 + · · · + A2,nxn = b2   (2.17)
. . .   (2.18)
Am,1x1 + Am,2x2 + · · · + Am,nxn = bm.   (2.19)

Matrix-vector product notation provides a more compact representation for
equations of this form.
Inverse Matrices

Linear algebra offers a powerful tool called matrix inversion that allows us to
analytically solve Ax = b for many values of A. To describe matrix inversion, we
first need to define the concept of an identity matrix. An identity matrix is a
matrix that does not change any vector when we multiply a vector by that matrix.
We denote the identity matrix that preserves n-dimensional vectors as In.
Formally, In ∈ R^(n×n), and

∀x ∈ R^n, In x = x.   (2.20)

The structure of the identity matrix is simple: all of the entries along the main
diagonal are 1, while all of the other entries are zero. See Fig. 2.2 for an example.
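The defining property In x = x is easy to verify numerically; a minimal sketch with NumPy (the test vector is arbitrary):

```python
import numpy as np

I3 = np.eye(3)                    # the identity matrix I_3, as in Fig. 2.2
x = np.array([2.0, -1.0, 5.0])
assert np.allclose(I3 @ x, x)     # I_n x = x for every x  (Eq. 2.20)
```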
10. (Goodfellow 2016)
Systems of Equations
We list only a few of the useful properties of the matrix product here, but
the reader should be aware that many more exist.

We now know enough linear algebra notation to write down a system of linear
equations:

Ax = b   (2.11)

where A ∈ R^(m×n) is a known matrix, b ∈ R^m is a known vector, and x ∈ R^n is a
vector of unknown variables we would like to solve for. Each element xi of x is one
of these unknown variables. Each row of A and each element of b provide another
constraint. We can rewrite Eq. 2.11 as:

A1,:x = b1   (2.12)
A2,:x = b2   (2.13)
. . .   (2.14)
Am,:x = bm   (2.15)

or, even more explicitly, as:

A1,1x1 + A1,2x2 + · · · + A1,nxn = b1   (2.16)
11. (Goodfellow 2016)
Solving Systems of Equations
• A linear system of equations can have:
• No solution
• Many solutions
• Exactly one solution: this means multiplication by
the matrix is an invertible function
12. (Goodfellow 2016)
Matrix Inversion
• Matrix inverse:
• Solving a system using an inverse:
• Numerically unstable, but useful for abstract analysis
The matrix inverse of A is denoted as A^(-1), and it is defined as the matrix
such that

A^(-1) A = In.   (2.21)

We can now solve Eq. 2.11 by the following steps:

Ax = b   (2.22)
A^(-1) Ax = A^(-1) b   (2.23)
In x = A^(-1) b   (2.24)
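As the slide notes, applying an explicit inverse is numerically unstable, so in practice one calls a solver directly. A small sketch contrasting the two routes (the system here is made up for illustration):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

# Abstract route: x = A^{-1} b  (Eqs. 2.22-2.24)
x_inv = np.linalg.inv(A) @ b

# Preferred route: a direct solver, which never forms A^{-1} explicitly
x_solve = np.linalg.solve(A, b)

assert np.allclose(A @ x_solve, b)    # x really solves the system
assert np.allclose(x_inv, x_solve)    # both routes agree on this well-conditioned A
```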
13. (Goodfellow 2016)
Invertibility
• Matrix can’t be inverted if…
• More rows than columns
• More columns than rows
• Redundant rows/columns (“linearly dependent”,
“low rank”)
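The "redundant rows" case can be demonstrated directly: a square matrix with linearly dependent rows has deficient rank and cannot be inverted. A sketch (the matrix is a toy example):

```python
import numpy as np

# A square matrix with a redundant (linearly dependent) row is singular:
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])          # second row = 2 * first row
assert np.linalg.matrix_rank(A) == 1  # "low rank": rank < 2

try:
    np.linalg.inv(A)                  # inversion fails for a singular matrix
except np.linalg.LinAlgError:
    print("A is singular and cannot be inverted")
```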
14. (Goodfellow 2016)
Norms
• Functions that measure how “large” a vector is
• Similar to a distance between zero and the point
represented by the vector
The Lp norm is given by

||x||p = ( Σi |xi|^p )^(1/p)

for p ∈ R, p ≥ 1.

Norms, including the Lp norm, are functions mapping vectors to non-negative
values. On an intuitive level, the norm of a vector x measures the distance from
the origin to the point x. More rigorously, a norm is any function f that satisfies
the following properties:

• f(x) = 0 ⇒ x = 0
• f(x + y) ≤ f(x) + f(y) (the triangle inequality)
• ∀α ∈ R, f(αx) = |α| f(x)

The L2 norm, with p = 2, is known as the Euclidean norm. It is simply the
Euclidean distance from the origin to the point identified by x.
15. (Goodfellow 2016)
• Lp norm
• Most popular norm: L2 norm, p=2
• L1 norm, p=1:
• Max norm, infinite p:
Norms
Sometimes we need to measure the size of a vector. In machine learning, we
usually measure the size of vectors using a function called a norm. Formally, the
Lp norm is given by

||x||p = ( Σi |xi|^p )^(1/p).
The squared L2 norm can be undesirable because it increases very slowly near the
origin. In several machine learning applications, it is important to discriminate
between elements that are exactly zero and elements that are small but nonzero.
In these cases, we turn to a function that grows at the same rate in all locations,
but retains mathematical simplicity: the L1 norm. The L1 norm may be simplified to

||x||1 = Σi |xi|.   (2.31)

The L1 norm is commonly used in machine learning when the difference between
zero and nonzero elements is very important. Every time an element of x moves
away from 0 by ε, the L1 norm increases by ε.

We sometimes measure the size of the vector by counting its number of nonzero
elements. Some authors refer to this function as the "L0 norm," but this is incorrect
terminology. The number of non-zero entries in a vector is not a norm, because
scaling the vector by α does not change the number of nonzero entries. The L1
norm is often used as a substitute for the number of nonzero entries.
One other norm that commonly arises in machine learning is the L∞ norm,
also known as the max norm. This norm simplifies to the absolute value of the
element with the largest magnitude in the vector,

||x||∞ = maxi |xi|.   (2.32)
Sometimes we may also wish to measure the size of a matrix. In the context
of deep learning, the most common way to do this is with the otherwise obscure
Frobenius norm

||A||F = ( Σi,j Ai,j² )^(1/2),

which is analogous to the L2 norm of a vector.
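All of these norms are available in NumPy; a quick sketch with a vector chosen so the values are easy to verify by hand:

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0])

assert np.isclose(np.linalg.norm(x, 1), 7.0)        # L1 norm: sum of |x_i|  (2.31)
assert np.isclose(np.linalg.norm(x, 2), 5.0)        # L2 (Euclidean) norm
assert np.isclose(np.linalg.norm(x, np.inf), 4.0)   # max norm  (2.32)

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
# Frobenius norm: square root of the sum of squared entries
assert np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(30.0))
```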
16. (Goodfellow 2016)
• Unit vector:
• Symmetric Matrix:
• Orthogonal matrix:
Special Matrices and Vectors
If we can express a machine learning algorithm in terms of arbitrary matrices, we
can often obtain a less expensive (and less descriptive) algorithm by restricting
some matrices to be diagonal. Not all diagonal matrices need be square. It is
possible to construct a rectangular diagonal matrix. Non-square diagonal matrices
do not have inverses but it is still possible to multiply by them cheaply. For a
non-square diagonal matrix D, the product Dx will involve scaling each element
of x, and either concatenating some zeros to the result if D is taller than it is
wide, or discarding some of the last elements of the vector if D is wider than it
is tall.

A symmetric matrix is any matrix that is equal to its own transpose:

A = A^T.   (2.35)

Symmetric matrices often arise when the entries are generated by some function of
two arguments that does not depend on the order of the arguments. For example,
if A is a matrix of distance measurements, with Ai,j giving the distance from point
i to point j, then Ai,j = Aj,i because distance functions are symmetric.

A unit vector is a vector with unit norm:

||x||2 = 1.   (2.36)

A vector x and a vector y are orthogonal to each other if x^T y = 0. If both
vectors have nonzero norm, this means that they are at a 90 degree angle to each
other. In R^n, at most n vectors may be mutually orthogonal with nonzero norm.
If the vectors are not only orthogonal but also have unit norm, we call them
orthonormal.

An orthogonal matrix is a square matrix whose rows are mutually orthonormal
and whose columns are mutually orthonormal:

A^T A = A A^T = I,   (2.37)

which implies that

A^(-1) = A^T,   (2.38)

so orthogonal matrices are of interest because their inverse is very cheap to compute.
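A convenient way to obtain an orthogonal matrix for experiments is the QR factorization of a random matrix; a sketch checking Eqs. 2.37 and 2.38:

```python
import numpy as np

rng = np.random.default_rng(2)
# QR factorization of a random square matrix yields an orthogonal Q
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))

I = np.eye(4)
assert np.allclose(Q.T @ Q, I)              # columns are mutually orthonormal (2.37)
assert np.allclose(Q @ Q.T, I)              # rows are mutually orthonormal
assert np.allclose(np.linalg.inv(Q), Q.T)   # A^{-1} = A^T  (2.38)
```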
17. (Goodfellow 2016)
Eigendecomposition
• Eigenvector and eigenvalue:
• Eigendecomposition of a diagonalizable matrix:
• Every real symmetric matrix has a real, orthogonal
eigendecomposition:
One of the most widely used kinds of matrix decomposition is called eigen-
decomposition, in which we decompose a matrix into a set of eigenvectors and
eigenvalues. An eigenvector of a square matrix A is a non-zero vector v such that
multiplication by A alters only the scale of v:

Av = λv.   (2.39)

The scalar λ is known as the eigenvalue corresponding to this eigenvector. (One
can also find a left eigenvector such that v^T A = λ v^T, but we are usually concerned
with right eigenvectors.)

If v is an eigenvector of A, then so is any rescaled vector sv for s ∈ R, s ≠ 0.
Moreover, sv still has the same eigenvalue. For this reason, we usually only look
for unit eigenvectors.
Figure 2.3: An example of the effect of eigenvectors and eigenvalues. Here, we have
a matrix A with two orthonormal eigenvectors, v(1) with eigenvalue λ1 and v(2) with
eigenvalue λ2. (Left) We plot the set of all unit vectors u ∈ R^2 as a unit circle. (Right)
We plot the set of all points Au. By observing the way that A distorts the unit circle,
we can see that it scales space in direction v(i) by λi.
We may form a matrix V with one eigenvector per column: V = [v(1), . . . , v(n)].
Likewise, we can concatenate the eigenvalues to form a vector λ = [λ1, . . . , λn]^T.
The eigendecomposition of A is then given by

A = V diag(λ) V^(-1).   (2.40)

We have seen that constructing matrices with specific eigenvalues and eigenvec-
tors allows us to stretch space in desired directions. However, we often want to
decompose matrices into their eigenvalues and eigenvectors. Doing so can help us
analyze certain properties of the matrix, much as decomposing an integer into
its prime factors can help us understand the behavior of that integer.

Not every matrix can be decomposed into eigenvalues and eigenvectors. In some
cases, the decomposition exists, but may involve complex rather than real numbers.
In this book, we usually need to decompose only a specific class of matrices that
have a simple decomposition. Specifically, every real symmetric matrix can be
decomposed into an expression using only real-valued eigenvectors and eigenvalues:

A = Q Λ Q^T,   (2.41)

where Q is an orthogonal matrix composed of eigenvectors of A, and Λ is a
diagonal matrix. The eigenvalue Λi,i is associated with the eigenvector in column i
of Q.
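For a real symmetric matrix, `np.linalg.eigh` returns exactly the real, orthogonal decomposition of Eq. 2.41. A sketch with a small symmetric matrix chosen for illustration:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # real symmetric

# eigh returns real eigenvalues and an orthogonal matrix Q of eigenvectors
lam, Q = np.linalg.eigh(A)

# A = Q diag(lambda) Q^T  (Eq. 2.41)
assert np.allclose(Q @ np.diag(lam) @ Q.T, A)
assert np.allclose(Q.T @ Q, np.eye(2))      # Q is orthogonal

# Each column of Q satisfies A v = lambda v  (Eq. 2.39)
for i in range(2):
    assert np.allclose(A @ Q[:, i], lam[i] * Q[:, i])
```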
19. (Goodfellow 2016)
Singular Value Decomposition
• Similar to eigendecomposition
• More general; matrix need not be square
The singular value decomposition is more generally applicable. Every real matrix
has a singular value decomposition, but the same is not true of the eigenvalue
decomposition. For example, if a matrix is not square, the eigendecomposition is
not defined, and we must use a singular value decomposition instead.

Recall that the eigendecomposition involves analyzing a matrix A to discover
a matrix V of eigenvectors and a vector of eigenvalues λ such that we can rewrite
A as

A = V diag(λ) V^(-1).   (2.42)

The singular value decomposition is similar, except this time we will write A
as a product of three matrices:

A = U D V^T.   (2.43)

Suppose that A is an m × n matrix. Then U is defined to be an m × m matrix,
D to be an m × n matrix, and V to be an n × n matrix.
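The shapes in Eq. 2.43 can be verified with `np.linalg.svd`; note that NumPy returns the singular values as a vector, so the m × n diagonal matrix D must be rebuilt explicitly. A sketch on a random 5 × 3 matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))   # m x n with m = 5, n = 3

# full_matrices=True gives U: m x m and V: n x n; s holds the singular values
U, s, Vt = np.linalg.svd(A, full_matrices=True)
assert U.shape == (5, 5) and Vt.shape == (3, 3)

# Rebuild the m x n diagonal matrix D and check A = U D V^T  (Eq. 2.43)
D = np.zeros((5, 3))
D[:3, :3] = np.diag(s)
assert np.allclose(U @ D @ Vt, A)
```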
20. (Goodfellow 2016)
Moore-Penrose Pseudoinverse
• If the equation has:
• Exactly one solution: this is the same as the inverse.
• No solution: this gives us the solution with the
smallest error
• Many solutions: this gives us the solution with the
smallest norm of x.
When A has more columns than rows, then solving a linear equation using the
pseudoinverse provides one of the many possible solutions. Specifically, it provides
the solution x = A^+ y with minimal Euclidean norm ||x||2 among all possible
solutions.

When A has more rows than columns, it is possible for there to be no solution.
In this case, using the pseudoinverse gives us the x for which Ax is as close as
possible to y in terms of Euclidean norm ||Ax − y||2.
21. (Goodfellow 2016)
Computing the Pseudoinverse
The Moore-Penrose pseudoinverse allows us to make some headway in these
cases. The pseudoinverse of A is defined as a matrix

A^+ = lim(α→0⁺) (A^T A + αI)^(-1) A^T.   (2.46)

Practical algorithms for computing the pseudoinverse are not based on this defini-
tion, but rather the formula

A^+ = V D^+ U^T,   (2.47)

where U, D and V are the singular value decomposition of A, and the pseudoinverse
D^+ of a diagonal matrix D is obtained by taking the reciprocal of its non-zero
elements then taking the transpose of the resulting matrix.

The SVD allows the computation of the pseudoinverse: take the reciprocal of the
non-zero entries of D, transpose, and multiply as in Eq. 2.47.
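Both behaviors from the previous slide can be demonstrated with `np.linalg.pinv` (the random systems here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

# More rows than columns: usually no exact solution;
# x = A^+ y minimizes ||Ax - y||_2, matching least squares.
A = rng.standard_normal((6, 3))
y = rng.standard_normal(6)
x = np.linalg.pinv(A) @ y
x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)
assert np.allclose(x, x_lstsq)

# More columns than rows: many solutions;
# the pseudoinverse picks the one with minimal ||x||_2.
B = rng.standard_normal((3, 6))
z = rng.standard_normal(3)
w = np.linalg.pinv(B) @ z
assert np.allclose(B @ w, z)   # w is an exact solution
```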
22. (Goodfellow 2016)
Trace
The trace operator gives the sum of all of the diagonal entries of a matrix:

Tr(A) = Σi Ai,i.   (2.48)

The trace operator is useful for a variety of reasons. Some operations that are
difficult to specify without resorting to summation notation can be specified using
matrix products and the trace operator.

The trace operator also allows us to manipulate an expression using many useful
identities. For example, the trace operator is invariant to the transpose operator:

Tr(A) = Tr(A^T).   (2.50)

The trace of a square matrix composed of many factors is also invariant to
moving the last factor into the first position, if the shapes of the corresponding
matrices allow the resulting product to be defined:

Tr(ABC) = Tr(CAB) = Tr(BCA)   (2.51)

or more generally,

Tr(Π(i=1..n) F(i)) = Tr(F(n) Π(i=1..n−1) F(i)).   (2.52)

This invariance to cyclic permutation holds even if the resulting product has a
different shape. For example, for A ∈ R^(m×n) and B ∈ R^(n×m), we have
Tr(AB) = Tr(BA), even though AB ∈ R^(m×m) and BA ∈ R^(n×n).
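The cyclic-permutation property, including the case where the two products have different shapes, can be checked with `np.trace` (random matrices for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)

# Tr(A) = Tr(A^T): the trace is invariant to transposition  (2.50)
S = rng.standard_normal((3, 3))
assert np.isclose(np.trace(S), np.trace(S.T))

# Cyclic permutation holds even when the product shapes differ:
# AB is 3x3 while BA is 4x4, yet the traces agree.
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 3))
assert np.isclose(np.trace(A @ B), np.trace(B @ A))
```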
23. (Goodfellow 2016)
Learning linear algebra
• Do a lot of practice problems
• Start out with lots of summation signs and indexing
into individual entries
• Eventually you will be able to mostly use matrix
and vector product notation quickly and easily