Transformations and matrix multiplication

Compositions of linear transformations 1

Now let's discuss: how do we represent a composition of linear transformations, and is that composition itself still a linear transformation?

Suppose we have the spaces $\mathbb{R}^n$, $\mathbb{R}^m$, and $\mathbb{R}^l$, and two linear transformation functions $S$ and $T$:

$$
\begin{aligned}
&S: X \to Y \mid X \subseteq \mathbb{R}^n,\ Y \subseteq \mathbb{R}^m \\
&S(\vec{x}) = \mathbf{A}\vec{x} \mid \mathbf{A} \text{ is } m \times n \\\\
&T: Y \to Z \mid Y \subseteq \mathbb{R}^m,\ Z \subseteq \mathbb{R}^l \\
&T(\vec{x}) = \mathbf{B}\vec{x} \mid \mathbf{B} \text{ is } l \times m
\end{aligned}
$$

How do we define, in one step, the map from $X$ all the way to $Z$, i.e. the transformation that first applies $S$ to a vector and then applies $T$?

$$
T \circ S: X \to Z \quad \text{(the composition of } T \text{ with } S \text{)} \\
(T \circ S)(\vec{x}) = T(S(\vec{x}))
$$

Is the map defined this way still a linear transformation? We check both conditions:

  • Condition 1 (additivity) holds:

$$
\begin{aligned}
(T \circ S)(\vec{x} + \vec{y}) &= T(S(\vec{x}+\vec{y})) \\
&= T(S(\vec{x})+S(\vec{y})) \\
&= T(S(\vec{x})) + T(S(\vec{y})) \\
&= (T \circ S)(\vec{x}) + (T \circ S)(\vec{y})
\end{aligned}
$$
  • Condition 2 (scalar multiplication) holds:

$$
\begin{aligned}
(T \circ S)(c\vec{x}) &= T(S(c\vec{x})) \\
&= T(cS(\vec{x})) \\
&= cT(S(\vec{x})) \\
&= c(T \circ S)(\vec{x})
\end{aligned}
$$

So we can indeed merge the two transformations $S$ and $T$ and express the composition with a single matrix:

$$
(T \circ S)(\vec{x}) = \mathbf{C}\vec{x} \mid \mathbf{C} \text{ is } l \times n
$$
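As a quick numerical sanity check, here is a minimal NumPy sketch (my own illustration, with arbitrarily chosen shapes and random data, not part of the original notes) that verifies both linearity conditions for $T \circ S$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: S maps R^4 -> R^3 (A is 3x4), T maps R^3 -> R^2 (B is 2x3)
A = rng.integers(-3, 4, size=(3, 4))
B = rng.integers(-3, 4, size=(2, 3))

S = lambda v: A @ v   # S(x) = Ax
T = lambda v: B @ v   # T(x) = Bx

x, y = rng.standard_normal(4), rng.standard_normal(4)
c = 2.5

# Condition 1: (T∘S)(x + y) == (T∘S)(x) + (T∘S)(y)
assert np.allclose(T(S(x + y)), T(S(x)) + T(S(y)))
# Condition 2: (T∘S)(c·x) == c·(T∘S)(x)
assert np.allclose(T(S(c * x)), c * T(S(x)))
```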

Compositions of linear transformations 2

Let's now look more closely at how the composition of $S$ and $T$ is converted into a matrix.

$$
T(\vec{x}) = \mathbf{B}\vec{x} \mid \mathbf{B} \text{ is } l \times m \\
S(\vec{x}) = \mathbf{A}\vec{x} \mid \mathbf{A} \text{ is } m \times n \\
(T \circ S)(\vec{x}) = \mathbf{B}(\mathbf{A}\vec{x}) = \mathbf{C}\vec{x}
$$

How do we find this $\mathbf{C}$? First of all, what size will it be? As noted above, it must be $l \times n$.

Since the vector being transformed starts out in $\mathbb{R}^n$, let's first write down the $n \times n$ identity matrix $\mathbf{I_n}$:

$$
\mathbf{I_n} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}
$$

Our $\mathbf{C}$ then equals the composition applied to each column of $\mathbf{I_n}$, i.e. to each standard basis vector:

$$
\mathbf{C} = \begin{bmatrix} \mathbf{B}\left(\mathbf{A}\begin{bmatrix} 1\\0\\\vdots\\0 \end{bmatrix}\right) & \mathbf{B}\left(\mathbf{A}\begin{bmatrix} 0\\1\\\vdots\\0 \end{bmatrix}\right) & \cdots & \mathbf{B}\left(\mathbf{A}\begin{bmatrix} 0\\0\\\vdots\\1 \end{bmatrix}\right) \end{bmatrix}
$$

This looks complicated, but if we unpack $\mathbf{A}$, the product $\mathbf{A}\vec{x}$ works like this:

$$
\mathbf{A}\vec{x} = \begin{bmatrix} \vec{a_1} & \vec{a_2} & \cdots & \vec{a_n} \end{bmatrix} \begin{bmatrix} x_1\\x_2\\\vdots\\x_n \end{bmatrix} = x_1\vec{a_1}+x_2\vec{a_2}+\cdots+x_n\vec{a_n}
$$

So in each column of $\mathbf{C}$, $\mathbf{A}$ only picks out its corresponding column ($\mathbf{A}\vec{e_i} = \vec{a_i}$), and the resulting matrix is exactly the product $\mathbf{BA}$!

$$
\mathbf{C} = \begin{bmatrix} \mathbf{B}(\vec{a_1}) & \mathbf{B}(\vec{a_2}) & \cdots & \mathbf{B}(\vec{a_n}) \end{bmatrix} = \mathbf{BA}
$$
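Here is a small NumPy sketch of that column-by-column construction (again an illustration with hypothetical random matrices): feeding each standard basis vector $\vec{e_i}$ through $\mathbf{A}$ and then $\mathbf{B}$ reproduces exactly the columns of `B @ A`:

```python
import numpy as np

rng = np.random.default_rng(1)
l, m, n = 2, 3, 4
B = rng.integers(-3, 4, size=(l, m))   # T(x) = Bx, l x m
A = rng.integers(-3, 4, size=(m, n))   # S(x) = Ax, m x n

# Column i of C is B(A(e_i)), where e_i is the i-th column of I_n
I_n = np.eye(n)
C = np.column_stack([B @ (A @ I_n[:, i]) for i in range(n)])

assert np.allclose(C, B @ A)   # C is exactly the matrix product BA
```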

Matrix product examples

We now know that multiplying $\mathbf{B}$ by $\mathbf{A}$ really means composing the two transformations $S$ and $T$.

Let's work through a concrete example.

$$
\underset{2\times3}{\mathbf{B}} = \begin{bmatrix} 1&-1&2 \\ 0&-2&1 \end{bmatrix}, \quad \underset{3\times4}{\mathbf{A}} = \begin{bmatrix} 1&0&1&1 \\ 2&0&1&-1 \\ 3&1&0&2 \end{bmatrix}
$$

So $\mathbf{BA}$ equals

$$
\begin{aligned}
\mathbf{BA} &= \begin{bmatrix} \mathbf{B}(\vec{a_1}) & \mathbf{B}(\vec{a_2}) & \mathbf{B}(\vec{a_3}) & \mathbf{B}(\vec{a_4}) \end{bmatrix} \\
&= \begin{bmatrix} \mathbf{B}\begin{bmatrix}1\\2\\3\end{bmatrix} & \mathbf{B}\begin{bmatrix}0\\0\\1\end{bmatrix} & \mathbf{B}\begin{bmatrix}1\\1\\0\end{bmatrix} & \mathbf{B}\begin{bmatrix}1\\-1\\2\end{bmatrix} \end{bmatrix} \\
&= \begin{bmatrix} 5&2&0&6 \\ -1&1&-2&4 \end{bmatrix}
\end{aligned}
$$

For a more detailed walkthrough of the arithmetic, see the videos linked at the end.

Notice that the computation goes through smoothly because the product $\mathbf{BA}$ is well defined: multiplying a 2 × 3 matrix by a 3 × 4 matrix yields a 2 × 4 matrix.

If we swap $\mathbf{B}$ and $\mathbf{A}$, however, the product $\mathbf{AB}$ cannot be computed at all, since the inner dimensions (4 and 2) don't match.

This points to an important concept: matrix multiplication is NOT commutative.

$$
\mathbf{AB} \neq \mathbf{BA}
$$
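We can double-check the concrete example above, and the shape obstruction for $\mathbf{AB}$, with a short NumPy snippet (just an illustrative sketch):

```python
import numpy as np

B = np.array([[1, -1, 2],
              [0, -2, 1]])           # 2 x 3
A = np.array([[1, 0, 1,  1],
              [2, 0, 1, -1],
              [3, 1, 0,  2]])        # 3 x 4

print(B @ A)    # [[ 5  2  0  6]
                #  [-1  1 -2  4]]  -- matches the hand computation above

try:
    A @ B                            # inner dimensions (4 vs 2) don't match
except ValueError as err:
    print("AB is undefined:", err)
```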

Matrix product associativity

So is matrix multiplication associative? Let's look at a chain of more transformations:

$$
H(\vec{x}) = \mathbf{A}\vec{x} \\
G(\vec{x}) = \mathbf{B}\vec{x} \\
F(\vec{x}) = \mathbf{C}\vec{x}
$$

Transforming by $F$ first, then by $G$, and finally by $H$ can be written as:

$$
\begin{aligned}
((H \circ G) \circ F)(\vec{x}) &= (H \circ G)(F(\vec{x})) \\
&= H(G(F(\vec{x}))) \\
&= H((G \circ F)(\vec{x})) \\
&= (H \circ (G \circ F))(\vec{x})
\end{aligned}
$$

And we find that

$$
((H \circ G) \circ F)(\vec{x}) = (H \circ (G \circ F))(\vec{x}) = (H \circ G \circ F)(\vec{x})
$$

In other words, matrix multiplication IS associative:

$$
(\mathbf{AB})\mathbf{C} = \mathbf{A}(\mathbf{BC}) = \mathbf{ABC}
$$

Simply put, the parentheses can be dropped.
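A quick numeric spot-check of associativity (illustrative only; the shapes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 5))
C = rng.standard_normal((5, 4))

# Either parenthesization gives the same 2 x 4 result
assert np.allclose((A @ B) @ C, A @ (B @ C))
```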

Distributive property of matrix products

Finally, let's see whether matrix multiplication is distributive over addition!

$$
\mathbf{A} \text{ is } k \times m, \quad \mathbf{B} \text{ is } m \times n, \quad \mathbf{C} \text{ is } m \times n
$$

Let's multiply it out, column by column:

$$
\begin{aligned}
\mathbf{A}(\mathbf{B}+\mathbf{C}) &= \mathbf{A} \begin{bmatrix} \vec{b_1}+\vec{c_1} & \vec{b_2}+\vec{c_2} & \cdots & \vec{b_n}+\vec{c_n} \end{bmatrix} \\
&= \begin{bmatrix} \mathbf{A}(\vec{b_1}+\vec{c_1}) & \mathbf{A}(\vec{b_2}+\vec{c_2}) & \cdots & \mathbf{A}(\vec{b_n}+\vec{c_n}) \end{bmatrix} \\
&= \begin{bmatrix} \mathbf{A}\vec{b_1}+\mathbf{A}\vec{c_1} & \mathbf{A}\vec{b_2}+\mathbf{A}\vec{c_2} & \cdots & \mathbf{A}\vec{b_n}+\mathbf{A}\vec{c_n} \end{bmatrix} \\
&= \begin{bmatrix} \mathbf{A}\vec{b_1} & \mathbf{A}\vec{b_2} & \cdots & \mathbf{A}\vec{b_n} \end{bmatrix} + \begin{bmatrix} \mathbf{A}\vec{c_1} & \mathbf{A}\vec{c_2} & \cdots & \mathbf{A}\vec{c_n} \end{bmatrix} \\
&= \mathbf{AB} + \mathbf{AC}
\end{aligned}
$$

So matrix multiplication IS distributive over matrix addition.
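And the matching spot-check for distributivity (same caveat: arbitrary random matrices):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))   # k x m
B = rng.standard_normal((3, 5))   # m x n
C = rng.standard_normal((3, 5))   # m x n

assert np.allclose(A @ (B + C), A @ B + A @ C)
```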

To summarize:

  • Matrix multiplication is NOT commutative:

    $$\mathbf{AB} \neq \mathbf{BA}$$
  • Matrix multiplication IS associative:

    $$\mathbf{A}(\mathbf{BC}) = (\mathbf{AB})\mathbf{C} = \mathbf{ABC}$$
  • Matrix multiplication IS distributive over addition:

    $$\mathbf{A}(\mathbf{B+C}) = \mathbf{AB}+\mathbf{AC}$$

https://youtu.be/BuqcKpe5ZQs
https://youtu.be/x1z0hOyjapU
https://youtu.be/Hhc96U_HvQE
https://youtu.be/oMWTMj78cwc
https://youtu.be/f_DTiXZpb8M