Inverse of a function
我們先複習一下 function
f : X → Y f ( a ) = b \begin{aligned}
&f: X \to Y \\
&f(a) = b
\end{aligned} f : X → Y f ( a ) = b 還有 Identity function
I x : X → X I x ( a ) = a ∣ a ∈ X \mathbf{I_x}: X \to X \\
\mathbf{I_x}(a) = a \mid a\in X I x : X → X I x ( a ) = a ∣ a ∈ X 任何值經過 identity function 運算後,還是自己
看起來很無用,但他可以幫助我們很多,例如在底下的 inverse function 的證明中就可以用到他
Invertible
這邊講的是可逆函數 (invertible function),
指的是 function 能透過對應的 inverse function 將值運算回自己
f : X → Y f − 1 : Y → X \begin{aligned}
f&:X\to Y\\
f^{-1}&:Y\to X
\end{aligned} f f − 1 : X → Y : Y → X 所以一個 function 可逆我們定義為
f : X → Y invertible ⟺ there exists f − 1 : Y → X s.t. f − 1 ∘ f = I x and f ∘ f − 1 = I y \begin{aligned}
f:X\to Y \text{ invertible } \iff &\text{there exists } f^{-1}:Y\to X\\
&\text{s.t. }f^{-1} \circ f = \mathbf{I_x} \\
&\text{and } f\circ f^{-1} = \mathbf{I_y}
\end{aligned} f : X → Y invertible ⟺ there exists f − 1 : Y → X s.t. f − 1 ∘ f = I x and f ∘ f − 1 = I y 為什麼 function 和 inverse function 的組合會變成 Identity function
因為 f 先做,會從 X 變成 Y,這時 f-1 從 Y 再變回 X,就像在 X 執行 Identity function 一樣
f − 1 ∘ f : ( X → Y ) → X f^{-1}\circ f: (X\to Y) \to X f − 1 ∘ f : ( X → Y ) → X 另外一邊也是,換用函數方法表示
f ∘ f − 1 : f ( f − 1 ( b ) ) = b ∣ b ∈ Y f\circ f^{-1}: f(f^{-1}(b)) = b \mid b \in Y f ∘ f − 1 : f ( f − 1 ( b )) = b ∣ b ∈ Y 那任何 function 只要為 invertible ,他對應的 inverse function 是唯一的嗎?
Is f − 1 unique ? \text{ Is } f^{-1} \text{ unique ?} Is f − 1 unique ? 我們假設 f 有兩個 inverse function g 和 h
f : X → Y g : Y → X h : Y → X f: X \to Y\\
g: Y \to X\\
h: Y \to X f : X → Y g : Y → X h : Y → X 也就是說
g ∘ f = I x f ∘ g = I y h ∘ f = I x f ∘ h = I y \begin{aligned}
g\circ f = \mathbf{I_x}\\
f\circ g = \mathbf{I_y}\\\\
h\circ f = \mathbf{I_x}\\
f\circ h = \mathbf{I_y}\\
\end{aligned} g ∘ f = I x f ∘ g = I y h ∘ f = I x f ∘ h = I y 現在我們將 g 表示為跟 identity function 的 composition
沒有什麼問題,等於 g 讓 Y 變到 X 再變到 X 一樣
g = I x ∘ g = ( h ∘ f ) ∘ g = h ∘ ( f ∘ g ) = h ∘ I y = h \begin{aligned}
g &= \mathbf{I_x}\circ g\\
&= (h\circ f)\circ g \\
&= h \circ (f \circ g) \\
&= h\circ \mathbf{I_y} \\
&= h
\end{aligned} g = I x ∘ g = ( h ∘ f ) ∘ g = h ∘ ( f ∘ g ) = h ∘ I y = h 第二和三行:Ix 等於 h 和 f 的 composition ,而我們學過 transformation 是有 associative 的
第四行:f 和 g 等於 Iy,表示先從 Y 轉到 Y,再從 Y 轉到 X,跟直接 h 是一樣的
第五行:所以 g = h 表示 inverse function 是 unique 的
Invertibility implies a unique solution to f(x)=y
上面證明只要 function 是 invertible ,其對應的 inverse function 就是 unique
現在我們想知道,若 function 是 invertible 那麼 x 是唯一從 f 變成 y 的值嗎?
我們先假設任何 y ,都有一個唯一的 x 經過 f(x) = y ,然後稱這個假設為函數 S
S : Y → X S ( y ) : The unique solution in x to f ( x ) = y S: Y\to X \\
S(y): \text{ The unique solution in } x \text{ to } f(x) = y S : Y → X S ( y ) : The unique solution in x to f ( x ) = y 現在有任意一個變數 b 在 Y subset 裡面, S(b) 代表的就是 X 中那一個會變成 b 的唯一 x
S ( b ) = the unique solution to f ( x ) = b ⇒ f ( S ( b ) ) = b S(b) = \text{the unique solution to }f(x) =b\\
\Rightarrow f(S(b)) = b S ( b ) = the unique solution to f ( x ) = b ⇒ f ( S ( b )) = b 我們將這個式子展開,發現 f 和 S 可以表示為 Iy ,因為先從 Y 到 X 再回 Y
f ( S ( b ) ) = ( f ∘ S ) ( b ) = I y b = b \begin{aligned}f(S(b)) &= (f \circ S)(b)\\
&= \mathbf{I_y} b\\
&= b
\end{aligned} f ( S ( b )) = ( f ∘ S ) ( b ) = I y b = b 先把這個記起來,我們反過來看另外一邊
若有 a 透過 f(a) 從 X 變到 Y,那麼對 f(a) 做 S transformation 能變回 a 嗎?
因為我們早已定義 S 會幫我們找到唯一的 x 值,所以很明顯要找的 x 就是 a
S ( f ( a ) ) = the unique solution to f ( x ) = f ( a ) ⇒ x = a S(f(a)) = \text{ the unique solution to } f(x) = f(a) \Rightarrow x = a S ( f ( a )) = the unique solution to f ( x ) = f ( a ) ⇒ x = a 我們也將這個式子展開
S ( f ( a ) ) = ( S ∘ f ) ( a ) = I x a = a \begin{aligned}
S(f(a)) &= (S \circ f)(a)\\
&= \mathbf{I_x}a\\
&= a
\end{aligned} S ( f ( a )) = ( S ∘ f ) ( a ) = I x a = a 我們發現 f 和 S 正是 invertible 的結果
( f ∘ S ) = I y ( S ∘ f ) = I x (f\circ S) = \mathbf{I_y}\\
(S\circ f) = \mathbf{I_x}\\ ( f ∘ S ) = I y ( S ∘ f ) = I x 所以結論:若 f 為 invertible ,任何 y 都是由唯一的 x 經過 f 而取得
f : X → Y invertible ⟺ ∀ y ∈ Y : ∃ unique solution x to f ( x ) = y f:X\to Y \text{ invertible} \iff \forall y\in Y:\exist \text{ unique solution } x \text{ to } f(x) =y f : X → Y invertible ⟺ ∀ y ∈ Y : ∃ unique solution x to f ( x ) = y Surjective (onto) and injective (one-to-one) functions
Surjective (onto)
所有的 y 至少都會有一個 x map 到他,符合 f(x) = y
Every y ∈ Y ∃ at LEAST one x ∈ such that f ( x ) = y \text{Every } y \in Y \,\exist \text{ at LEAST one }x\in \text{ such that } f(x) = y Every y ∈ Y ∃ at LEAST one x ∈ such that f ( x ) = y 我們知道 Image 不一定要 map 到所有的 Y,但 onto 將會 map 到整個 Y
I m ( f ) = Y = range ( f ) Im(f) = Y = \text{range}(f) I m ( f ) = Y = range ( f ) 之前學過:Range 代表的是 x map 到 y 的 subset 範圍
舉個例子,所有的 Y 都至少有一人 map 到他
Injective (one-to-one)
所有的 y 只要有人 map 到他,都會是唯一,而上面例子中, 4 和 5 就不符合 one-to-one
For any y ∈ Y at MOST one x ∈ X such that f ( x ) = y \text{For any }y \in Y \text{ at MOST one }x\in X \text{ such that } f(x) = y For any y ∈ Y at MOST one x ∈ X such that f ( x ) = y 注意的是,可以有 y 不被任何人 map 到,但不可以一次有兩個以上的人 map 到 y
Relating invertibility to being onto and one-to-one
現在我們可以重新定義 invertible,我們知道 invertible 原本的定義如下
f : X → Y invertible ⟺ ∀ y ∈ Y : ∃ unique solution x to f ( x ) = y f:X\to Y \text{ invertible} \iff \forall y\in Y:\exist \text{ unique solution } x \text{ to } f(x) =y f : X → Y invertible ⟺ ∀ y ∈ Y : ∃ unique solution x to f ( x ) = y 首先 所有的 y 都有一個 unique solution x 來對應他,這正好就是 surjective (onto) 的定義
再來 所有的 y 都有一個 unique solution x 來對應他,這正好也是 injective (one-to-one) 的定義
所以為了滿足 f is invertible, f 必須要滿足 surjective 以及 injective 兩者
Determining whether a transformation is onto
前面都是直接用定義和圖案來觀察 transformation 是否符合 onto,
我們知道 transformation 還可以轉為 matrix vector product
x ⃗ ∈ R n T ( x ⃗ ) = A x ⃗ = b ⃗ ∈ R m \begin{aligned}
&\vec{x} \in \mathbb{R}^n \\
&T(\vec{x}) = \mathbf{A}\vec{x} = \vec{b} \in \mathbb{R}^m
\end{aligned} x ∈ R n T ( x ) = A x = b ∈ R m 用 matrix 來表示 transformation 時的 onto 定義為
for every b ⃗ ∈ R m ∃ at least one solution x ⃗ to A x ⃗ = b ⃗ where x ⃗ ∈ R n \begin{aligned}
&\text{for every }\vec{b} \in \mathbb{R}^m \\
&\exist \text{ at least one solution } \vec{x} \text{ to } \mathbf{A}\vec{x} = \vec{b}\\
&\text{where } \vec{x} \in \mathbb{R}^n
\end{aligned} for every b ∈ R m ∃ at least one solution x to A x = b where x ∈ R n 我們知道 Ax 相乘其實就是 A 的 linear combination
而這個 combination 將會生成任意的 b in Rm
A = [ a 1 ⃗ a 2 ⃗ ⋯ a n ⃗ ] , x ⃗ = [ x 1 x 2 ⋮ x n ] A x ⃗ = x 1 a 1 ⃗ + x 2 a 2 ⃗ + ⋯ + x n a n ⃗ \begin{aligned}
&\mathbf{A} = \begin{bmatrix} \vec{a_1}&\vec{a_2}&\cdots&\vec{a_n} \end{bmatrix},
\vec{x} = \begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix} \\
&\mathbf{A}\vec{x} = x_1\vec{a_1} + x_2\vec{a_2} + \cdots + x_n\vec{a_n}
\end{aligned} A = [ a 1 a 2 ⋯ a n ] , x = x 1 x 2 ⋮ x n A x = x 1 a 1 + x 2 a 2 + ⋯ + x n a n 所以要能夠滿足 onto Rm ,代表 A 的 column space 必須要能夠 span Rm
s p a n ( a 1 ⃗ , a 2 ⃗ , ⋯ , a n ⃗ ) = R m C ( A ) = R m \begin{aligned}
span(\vec{a_1},\vec{a_2}, \cdots, \vec{a_n} ) &= \mathbb{R}^m \\
C(\mathbf{A}) &= \mathbb{R}^m
\end{aligned} s p an ( a 1 , a 2 , ⋯ , a n ) C ( A ) = R m = R m 若 Ax 沒有 span 整個 Rm,代表沒有 onto
沒有 onto ,代表 Ax = b 的 [A | b] 化簡為 [R (rref) | c] 時
會有一行全為 0 但結果不為 0 (no solution)
[ b 1 A ⋮ b n ] = [ 1 0 ⋯ 0 0 1 ⋯ 0 0 0 ⋯ 0 2 b 1 + 3 b 2 + ⋯ ] = Can’t span R m \begin{bmatrix}\begin{array}{c|c} &b_1 \\\mathbf{A}&\vdots\\&b_n\end{array}\end{bmatrix}
=
\begin{bmatrix}\begin{array}{cccc|c}
1&0&\cdots&0& \\
0&1&\cdots&0&\\
0&0&\cdots&0&2b_1+3b_2+\cdots\end{array}\end{bmatrix} = \text{Can't span } \mathbb{R}^m A b 1 ⋮ b n = 1 0 0 0 1 0 ⋯ ⋯ ⋯ 0 0 0 2 b 1 + 3 b 2 + ⋯ = Can’t span R m 若 Ax 可以 span 整個 Rm,代表 onto
有 onto ,代表 rref(A) 必須在每一個 row 都出現 pivot entry,也就是每一行都是 pivot column
T is Onto ⟺ C ( A ) = R m ⟺ m pivot entries ⟺ m pivot columns ⟺ Rank ( A ) = dim ( C ( A ) ) = m \begin{aligned}
T \text{ is Onto} &\iff C(\mathbf{A}) = \mathbb{R}^m\\
&\iff m \text{ pivot entries} \\
&\iff m \text{ pivot columns} \\
&\iff \text{Rank}(\mathbf{A}) = \text{dim}(C(\mathbf{A})) = m
\end{aligned} T is Onto ⟺ C ( A ) = R m ⟺ m pivot entries ⟺ m pivot columns ⟺ Rank ( A ) = dim ( C ( A )) = m 在矩陣上就是 A 變到 rref(A) 之後,每一個 row 都有一個 leading one
A = [ a 1 ⃗ a 2 ⃗ ⋯ a m ⃗ ] ↓ rref ( A ) ↓ R = [ 1 1 0 ⋯ 0 0 1 2 ⋯ 0 0 0 ⋯ 1 m ] ⇒ span ( R m ) ⇒ Onto \begin{aligned}
&\mathbf{A} = &&\begin{bmatrix} \vec{a_1}&\vec{a_2}&\cdots&\vec{a_m}\end{bmatrix}\\\\
&\downarrow \text{rref}(\mathbf{A}) &&\downarrow\\\\
&\mathbf{R} = &&\begin{bmatrix}
1_1&0&\cdots&0\\
0&1_2&\cdots&0\\
0&0&\cdots&1_m\\
\end{bmatrix} \Rightarrow \text{span}(\mathbb{R}^m) \Rightarrow \text{Onto}
\end{aligned} A = ↓ rref ( A ) R = [ a 1 a 2 ⋯ a m ] ↓ 1 1 0 0 0 1 2 0 ⋯ ⋯ ⋯ 0 0 1 m ⇒ span ( R m ) ⇒ Onto 舉個例子,來確認 S 這個 transformation 是否符合 onto
S = R 2 → R 3 , S ( x ⃗ ) = [ 1 2 3 4 5 6 ] x ⃗ S = \mathbb{R}^2 \to \mathbb{R}^3, S(\vec{x}) =
\begin{bmatrix} 1&2\\3&4\\5&6\end{bmatrix}
\vec{x} S = R 2 → R 3 , S ( x ) = 1 3 5 2 4 6 x 我們將 S matrix 化簡為 rref,發現只有 2 pivot entries
[ 1 2 3 4 5 6 ] → [ 1 2 0 2 0 4 ] → [ 1 2 0 1 0 4 ] → [ 1 0 0 1 0 0 ] ⇒ rank ( [ 1 2 3 4 5 6 ] ) = 2 \begin{bmatrix} 1&2\\3&4\\5&6\end{bmatrix} \to
\begin{bmatrix} 1&2\\0&2\\0&4\end{bmatrix} \to
\begin{bmatrix} 1&2\\0&1\\0&4\end{bmatrix} \to
\begin{bmatrix} 1&0\\0&1\\0&0\end{bmatrix} \Rightarrow
\text{rank}\left(\begin{bmatrix} 1&2\\3&4\\5&6\end{bmatrix} \right) = 2 1 3 5 2 4 6 → 1 0 0 2 2 4 → 1 0 0 2 1 4 → 1 0 0 0 1 0 ⇒ rank 1 3 5 2 4 6 = 2 所以 S 的 rank 為 2,S 不是 onto,也代表 S 不可能是 invertible
Exploring the solution set of Ax = b
我們來看一個 linear transformation
T : R 2 → R 2 T ( x ⃗ ) = [ 1 − 3 − 1 3 ] [ x 1 x 2 ] = [ b 1 b 2 ] ⇒ [ 1 − 3 b 1 − 1 3 b 2 ] = [ 1 − 3 b 1 0 0 b 1 + b 2 ] \begin{aligned}
&T: \mathbb{R}^2 \to \mathbb{R}^2 \\
&T(\vec{x}) = \begin{bmatrix} 1&-3\\-1&3 \end{bmatrix}
\begin{bmatrix} x_1\\x_2 \end{bmatrix} =
\begin{bmatrix} b_1\\b_2 \end{bmatrix} \\
\Rightarrow
&\begin{bmatrix} \begin{array}{cc|c} 1 & -3 &b_1 \\-1&3&b_2 \end{array} \end{bmatrix} =
\begin{bmatrix} \begin{array}{cc|c} 1 & -3 &b_1 \\0&0&b_1+b_2 \end{array} \end{bmatrix}
\end{aligned} ⇒ T : R 2 → R 2 T ( x ) = [ 1 − 1 − 3 3 ] [ x 1 x 2 ] = [ b 1 b 2 ] [ 1 − 1 − 3 3 b 1 b 2 ] = [ 1 0 − 3 0 b 1 b 1 + b 2 ] 我們發現這個 transformation 已經不可能是 onto
但我們可以知道,要讓 b 有結果, b1 + b2 必須等於 0
b 1 + b 2 = 0 → b 1 = − b 2 b_1 + b_2 = 0 \rightarrow b_1 = -b_2 b 1 + b 2 = 0 → b 1 = − b 2 也就是說任何 x 經過 T 一定要在 b1 = -b2 直線上的任何一點,才是有 solution
因為只有變成這條直線,沒有 span 整個 R2 ,所以 T 當然是沒有 onto 的
但我們可以找找看到底 domain 在哪裡
首先必須先遵守 b1 + b2 = 0 的原則,並且從第一列我們得到解
x 1 − 3 x 2 = b 1 x 1 = b 1 + 3 x 2 x_1-3x_2 = b_1 \\
x_1 = b_1 + 3x_2 x 1 − 3 x 2 = b 1 x 1 = b 1 + 3 x 2 也就是說只有 x vector (domain) 等於以下時,才會轉換到 b1 + b2 = 0 的直線上
x ⃗ = [ x 1 x 2 ] = [ b 1 0 ] + x 2 [ 3 1 ] \vec{x} = \begin{bmatrix} x_1\\x_2 \end{bmatrix} =
\begin{bmatrix} b_1\\0 \end{bmatrix}+
x_2 \begin{bmatrix} 3\\1 \end{bmatrix} x = [ x 1 x 2 ] = [ b 1 0 ] + x 2 [ 3 1 ] 例如當 b 為 [5, -5] 時,domain 為
x ⃗ = [ 5 0 ] + x 2 [ 3 1 ] \vec{x} =
\begin{bmatrix} 5\\0 \end{bmatrix}+
x_2 \begin{bmatrix} 3\\1 \end{bmatrix} x = [ 5 0 ] + x 2 [ 3 1 ] 淺藍色的直線就是我們的 x vector (domain),上面的任何一點經過 T 都會 map 到 [5, -5]
當 b = [4, -4] 時, domain 就是 [4, 0] + x2 * [3, 1]
我們很好的 visualize 這個 transformation
那若我們要 map 到 b = [0, 0] 會發生什麼事
這時候 x 代表的就是 null space of A 了 ! 在這個例子就是 [3, 1] 乘以任何 scalar
A x ⃗ = 0 ⃗ x ⃗ = [ 0 0 ] + x 2 [ 3 1 ] ∈ N ( A ) \begin{aligned}
&\mathbf{A}\vec{x} = \vec{0} \\
&\vec{x} =
\begin{bmatrix} 0\\0 \end{bmatrix}+
x_2 \begin{bmatrix} 3\\1 \end{bmatrix} \in N(\mathbf{A})
\end{aligned} A x = 0 x = [ 0 0 ] + x 2 [ 3 1 ] ∈ N ( A ) 所以我們可以把 solution set 統整為某個 particular vector 和 null space 的組合
Assuming A x ⃗ = b ⃗ has a solution The solution set = { x p ⃗ } ∪ N ( A ) \begin{aligned}
&\text{Assuming } \mathbf{A}\vec{x} = \vec{b} \text{ has a solution} \\
&\text{The solution set = }
\begin{Bmatrix} \vec{x_p} \end{Bmatrix}\cup N(\mathbf{A})
\end{aligned} Assuming A x = b has a solution The solution set = { x p } ∪ N ( A ) 會在之後證明
而我們想要讓一個 transformation 能夠 one-to-one => 最多一個解
那這個 solution set 裡面的 null space 勢必要 empty 、空向量、zero vector
Matrix condition for one-to-one transformation
我們知道要求 null space of A 的方法
N ( A ) = A x ⃗ = 0 ⃗ [ A ∣ 0 ⃗ ] ⇒ [ rref ( A ) ∣ 0 ⃗ ] N(\mathbf{A}) = \mathbf{A}\vec{x} = \vec{0} \\
\begin{bmatrix} \mathbf{A} \mid \vec{0} \end{bmatrix} \Rightarrow
\begin{bmatrix} \text{rref}(\mathbf{A}) \mid \vec{0} \end{bmatrix} N ( A ) = A x = 0 [ A ∣ 0 ] ⇒ [ rref ( A ) ∣ 0 ] 我們會得到類似這樣的 homogeneous solution
x ⃗ = a n 1 ⃗ + b n 2 ⃗ + ⋯ + c n n ⃗ N ( A ) = span ( n 1 ⃗ , n 2 ⃗ , ⋯ , n n ⃗ ) \vec{x} = a\vec{n_1} + b\vec{n_2} + \cdots + c\vec{n_n} \\
N(\mathbf{A}) = \text{span}(\vec{n_1}, \vec{n_2}, \cdots, \vec{n_n}) x = a n 1 + b n 2 + ⋯ + c n n N ( A ) = span ( n 1 , n 2 , ⋯ , n n ) 而要求得 b 有解,也就是要取得 inhomogeneous solution 的方法
A x ⃗ = b ⃗ [ A ∣ b ⃗ ] ⇒ [ rref ( A ) ∣ b ⃗ ′ ] \mathbf{A}\vec{x} = \vec{b} \\
\begin{bmatrix} \mathbf{A} \mid \vec{b} \end{bmatrix} \Rightarrow
\begin{bmatrix} \text{rref}(\mathbf{A}) \mid \vec{b}' \end{bmatrix} A x = b [ A ∣ b ] ⇒ [ rref ( A ) ∣ b ′ ] 我們會得到 particular solution + homogeneous solution
x ⃗ = b ⃗ ′ + a n 1 ⃗ + b n 2 ⃗ + ⋯ + c n ⃗ = x p ⃗ + x h ⃗ \begin{aligned}
\vec{x} &= \vec{b}' + a\vec{n_1} + b\vec{n_2} + \cdots + c\vec{n} \\
&= \vec{x_p} + \vec{x_h}
\end{aligned} x = b ′ + a n 1 + b n 2 + ⋯ + c n = x p + x h 現在我們來證明這個 claim
Any solution to the inhomogeneous system A x ⃗ = b ⃗ will take the form x p ⃗ + x h ⃗ \text{Any solution to the inhomogeneous system } \mathbf{A}\vec{x} = \vec{b} \\
\text{ will take the form } \vec{x_p} + \vec{x_h} Any solution to the inhomogeneous system A x = b will take the form x p + x h 將 xp + xh 帶入 A transformation,果然為 x 的解
Is x p ⃗ + x h ⃗ a solution to A x ⃗ = b ⃗ ? A ( x p ⃗ + x h ⃗ ) = A x p ⃗ + A x h ⃗ = b ⃗ + 0 ⃗ = b ⃗ \begin{aligned}
&\text{Is }\vec{x_p} + \vec{x_h} \text{ a solution to }\mathbf{A}\vec{x} = \vec{b} \text{ ?} \\\\
&\mathbf{A}(\vec{x_p} + \vec{x_h}) =
\mathbf{A}\vec{x_p} + \mathbf{A}\vec{x_h} = \vec{b}+\vec{0} = \vec{b}
\end{aligned} Is x p + x h a solution to A x = b ? A ( x p + x h ) = A x p + A x h = b + 0 = b 因為 A*xp 為 Ax = b 的特定解,所以為 b 向量
因為 A*xh 為 Ax = 0 的解,所以為 0 向量
利用任何 x 來減掉 xp ,最後也可以導出 xp + xh 為 x 的解
Is any solution x ⃗ to A x ⃗ = b ⃗ takes the form x ⃗ = x p ⃗ + x h ⃗ ? Assume x ⃗ is any solution to A x ⃗ = b ⃗ A ( x ⃗ − x p ⃗ ) = A x ⃗ − A x p ⃗ = b ⃗ − b ⃗ = 0 ⃗ \begin{aligned}
&\text{Is any solution }\vec{x} \text{ to }\mathbf{A}\vec{x}=\vec{b}
\text{ takes the form } \vec{x} = \vec{x_p} + \vec{x_h} \text{ ?} \\\\
&\text{Assume }\vec{x} \text{ is any solution to }\mathbf{A}\vec{x} = \vec{b}\\
&\mathbf{A}(\vec{x} - \vec{x_p}) = \mathbf{A}\vec{x} - \mathbf{A}\vec{x_p} = \vec{b} - \vec{b} = \vec{0}
\end{aligned} Is any solution x to A x = b takes the form x = x p + x h ? Assume x is any solution to A x = b A ( x − x p ) = A x − A x p = b − b = 0 也就是說
( x ⃗ − x p ⃗ ) is a solution to A x ⃗ = 0 ⃗ ( x ⃗ − x p ⃗ ) is a member of N ( A ) ( x ⃗ − x p ⃗ ) = x h ⃗ x ⃗ = x p ⃗ + x h ⃗ (\vec{x}-\vec{x_p}) \text{ is a solution to }\mathbf{A}\vec{x} = \vec{0} \\
(\vec{x}-\vec{x_p}) \text{ is a member of }N(\mathbf{A})\\
(\vec{x}-\vec{x_p}) = \vec{x_h} \\
\vec{x} = \vec{x_p} + \vec{x_h} ( x − x p ) is a solution to A x = 0 ( x − x p ) is a member of N ( A ) ( x − x p ) = x h x = x p + x h 證明完所有 x 皆可以化為 particular solution + homogeneous solution 後,我們來看 one-to-one
Under one-to-one condition: Any solution = { x p ⃗ + x h ⃗ ∣ x h ⃗ ∈ N ( A ) } can only be 1 solution ⇒ N ( A ) = { 0 ⃗ } \begin{aligned}
\text{Under one-to-one condition:}\\
\text{Any solution} &=
\begin{Bmatrix}\vec{x_p} + \vec{x_h}\mid \vec{x_h} \in N(\mathbf{A}) \end{Bmatrix} \text{ can only be 1 solution}\\
&\Rightarrow N(\mathbf{A}) = \begin{Bmatrix} \vec{0} \end{Bmatrix}
\end{aligned} Under one-to-one condition: Any solution = { x p + x h ∣ x h ∈ N ( A ) } can only be 1 solution ⇒ N ( A ) = { 0 } 也就是說要符合 one-to-one 的話,Ax = 0 (null space) 只能有 0
[ a 1 ⃗ a 2 ⃗ ⋯ a n ⃗ ] [ x 1 x 2 ⋮ x n ] = [ 0 0 ⋮ 0 ] = x 1 a 1 ⃗ + x 2 a 2 ⃗ + ⋯ + x n a n ⃗ = 0 ⃗ \begin{bmatrix}\vec{a_1}&\vec{a_2}&\cdots&\vec{a_n}\end{bmatrix}
\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix} =
\begin{bmatrix}0\\0\\\vdots\\0\end{bmatrix} =
x_1\vec{a_1} + x_2\vec{a_2} + \cdots+ x_n\vec{a_n} = \vec{0} [ a 1 a 2 ⋯ a n ] x 1 x 2 ⋮ x n = 0 0 ⋮ 0 = x 1 a 1 + x 2 a 2 + ⋯ + x n a n = 0 因為只能有 0 向量,所以
x 1 , x 2 , ⋯ , x n = 0 x_1, x_2, \cdots, x_n = 0 x 1 , x 2 , ⋯ , x n = 0 好像似曾相識,代表在 one-to-one 時:
C ( A ) = span ( a 1 ⃗ , a 2 ⃗ , ⋯ , a n ⃗ ) ⇒ a 1 ⃗ , a 2 ⃗ , ⋯ , a n ⃗ are linear independence ⇒ a 1 ⃗ , a 2 ⃗ , ⋯ , a n ⃗ are a basis for C ( A ) ⇒ dim ( C ( A ) ) = n ⇒ Rank ( A ) = n \begin{aligned}
C(\mathbf{A}) &= \text{span}(\vec{a_1}, \vec{a_2}, \cdots, \vec{a_n})\\
&\Rightarrow \vec{a_1}, \vec{a_2}, \cdots, \vec{a_n} \text{ are linear independence}\\
&\Rightarrow \vec{a_1}, \vec{a_2}, \cdots, \vec{a_n} \text{ are a basis for } C(\mathbf{A}) \\
&\Rightarrow \text{dim}(C(\mathbf{A})) = n \\
&\Rightarrow \text{Rank}(\mathbf{A}) = n
\end{aligned} C ( A ) = span ( a 1 , a 2 , ⋯ , a n ) ⇒ a 1 , a 2 , ⋯ , a n are linear independence ⇒ a 1 , a 2 , ⋯ , a n are a basis for C ( A ) ⇒ dim ( C ( A )) = n ⇒ Rank ( A ) = n xh 也會變成空的,而 any solution 只對應到 particular solution,所以為 one-to-one
Any solution = { x p ⃗ } ⇒ one-to-one \text{Any solution} = \begin{Bmatrix}\vec{x_p}\end{Bmatrix}
\Rightarrow
\text{one-to-one} Any solution = { x p } ⇒ one-to-one Simplifying conditions for invertibility
現在來整理一下 invertible 的條件,並簡化 invertible 的敘述
我們要知道以下這個 transformation 是否 invertible
T : R n → R m T ( x ⃗ ) = A x ⃗ ( A is m × n ) T: \mathbb{R}^n \to \mathbb{R}^m\\
T(\vec{x}) = \mathbf{A} \vec{x} \,\, ( \mathbf{A} \text{ is } m \times n) T : R n → R m T ( x ) = A x ( A is m × n ) 我們要觀察他有沒有同時滿足 onto 和 one-to-one
If T is invertible: onto ⇒ Rank ( A ) = m one-to-one ⇒ Rank ( A ) = n \begin{aligned}
\text{If } T \text{ is invertible: }\\
\text{onto } &&\Rightarrow \text{Rank}(\mathbf{A})= m \\
\text{one-to-one } &&\Rightarrow \text{Rank}(\mathbf{A}) = n
\end{aligned} If T is invertible: onto one-to-one ⇒ Rank ( A ) = m ⇒ Rank ( A ) = n 所以 T 要為 invertible,代表他的 matrix A 必須要
Rank ( A ) = m = n \text{Rank}(\mathbf{A}) = m = n Rank ( A ) = m = n 也就是說 A 會是一個 n × n 的 square matrix
而且 A 的 reduced row echelon form 會是一個 n 階的 identity matrix
因為每一行都必須是 pivot column
A = [ a 1 ⃗ a 2 ⃗ ⋯ a n ⃗ ] n × n rref ( A ) = [ 1 0 ⋯ 0 0 1 ⋯ 0 0 0 ⋯ 1 ] n × n = I n \mathbf{A} = \begin{bmatrix} \vec{a_1} & \vec{a_2} & \cdots & \vec{a_n}\end{bmatrix}_{n\times n}
\\
\text{rref}(\mathbf{A}) =
\begin{bmatrix} 1&0&\cdots&0\\ 0&1&\cdots&0\\0&0&\cdots&1\end{bmatrix}_{n \times n} =\mathbf{I_n} A = [ a 1 a 2 ⋯ a n ] n × n rref ( A ) = 1 0 0 0 1 0 ⋯ ⋯ ⋯ 0 0 1 n × n = I n 所以我們得到結論
T : R n → R m T ( x ⃗ ) = A x ⃗ ∣ A m × n T is invertible only if rref ( A ) = I n \begin{aligned}
&T:\mathbb{R}^n\to \mathbb{R}^m \\
&T(\vec{x}) = \mathbf{A}\vec{x} \mid \mathbf{A}_{m \times n }\\
&T \text{ is invertible only if } \text{rref}(\mathbf{A}) = \mathbf{I_n}
\end{aligned} T : R n → R m T ( x ) = A x ∣ A m × n T is invertible only if rref ( A ) = I n Showing that inverses are linear
T : R n → R n ∣ T ( x ⃗ ) = A x ⃗ rref ( A ) = I n T: \mathbb{R}^n \to \mathbb{R}^n \mid T(\vec{x}) = \mathbf{A}\vec{x} \\
\text{rref}(\mathbf{A}) = \mathbf{I_n} T : R n → R n ∣ T ( x ) = A x rref ( A ) = I n 若 T 要 invertible ,那 T 和 inverse 的組合等於在 Rn 不變的 transformation
T is invertible ⟺ ∃ some T − 1 s.t. T − 1 ∘ T = I R n T ∘ T − 1 = I R n \begin{aligned}
T \text{ is invertible} \iff& \exist \text{ some }T^{-1} \text{ s.t.}\\
&T^{-1} \circ T = \mathbf{I_{\mathbb{R}^n}} \\
&T \circ T^{-1} = \mathbf{I_{\mathbb{R}^n}}
\end{aligned} T is invertible ⟺ ∃ some T − 1 s.t. T − 1 ∘ T = I R n T ∘ T − 1 = I R n 我們知道 T 已經是 linear transformation 了,現在想知道 inverse function 是不是 linear
Linear transformation 的第一條成立
( T ∘ T − 1 ) ( a ⃗ + b ⃗ ) = T ( T − 1 ( a ⃗ + b ⃗ ) ) = T ( T − 1 ( a ⃗ ) ) + T ( T − 1 ( b ⃗ ) ) = T ( T − 1 ( a ⃗ ) + T − 1 ( b ⃗ ) ) apply T ( x ⃗ + y ⃗ ) = T ( x ⃗ ) + T ( y ⃗ ) T − 1 T ( T − 1 ( a ⃗ + b ⃗ ) ) = T − 1 T ( T − 1 ( a ⃗ ) + T − 1 ( b ⃗ ) ) multiply T − 1 to both side ( T − 1 ∘ T ) ( T − 1 ( a ⃗ + b ⃗ ) ) = ( T − 1 ∘ T ) ( T − 1 ( a ⃗ ) + T − 1 ( b ⃗ ) ) ( I R n ) ( T − 1 ( a ⃗ + b ⃗ ) ) = ( I R n ) ( T − 1 ( a ⃗ ) + T − 1 ( b ⃗ ) ) T − 1 ( a ⃗ + b ⃗ ) = T − 1 ( a ⃗ ) + T − 1 ( b ⃗ ) \begin{aligned}
(T \circ T^{-1})(\vec{a} + \vec{b}) &= T(T^{-1}(\vec{a}+\vec{b})) \\
&= T(\color{red}{T^{-1}(\vec{a})}) + T(\color{red}{T^{-1}(\vec{b})}) \\
&= T(T^{-1}(\vec{a})+ T^{-1}(\vec{b}))
& \text{apply }T(\vec{x}+\vec{y}) = T(\vec{x})+T(\vec{y}) \\\\
\color{red}{T^{-1}}T(T^{-1}(\vec{a}+\vec{b})) &=
\color{red}{T^{-1}}T(T^{-1}(\vec{a})+ T^{-1}(\vec{b}))
& \text{ multiply } T^{-1} \text{ to both side} \\
(T^{-1}\circ T)(T^{-1}(\vec{a}+\vec{b})) &=
(T^{-1}\circ T)(T^{-1}(\vec{a})+ T^{-1}(\vec{b})) \\
(\mathbf{I}_{\mathbb{R^n}})(T^{-1}(\vec{a}+\vec{b})) &=
(\mathbf{I}_{\mathbb{R^n}})(T^{-1}(\vec{a})+ T^{-1}(\vec{b}))\\
T^{-1}(\vec{a}+\vec{b}) &=
T^{-1}(\vec{a})+ T^{-1}(\vec{b})
\end{aligned} ( T ∘ T − 1 ) ( a + b ) T − 1 T ( T − 1 ( a + b )) ( T − 1 ∘ T ) ( T − 1 ( a + b )) ( I R n ) ( T − 1 ( a + b )) T − 1 ( a + b ) = T ( T − 1 ( a + b )) = T ( T − 1 ( a ) ) + T ( T − 1 ( b ) ) = T ( T − 1 ( a ) + T − 1 ( b )) = T − 1 T ( T − 1 ( a ) + T − 1 ( b )) = ( T − 1 ∘ T ) ( T − 1 ( a ) + T − 1 ( b )) = ( I R n ) ( T − 1 ( a ) + T − 1 ( b )) = T − 1 ( a ) + T − 1 ( b ) apply T ( x + y ) = T ( x ) + T ( y ) multiply T − 1 to both side Linear transformation 的第二條也成立
( T ∘ T − 1 ) ( c a ⃗ ) = c a ⃗ apply a ⃗ = ( T ∘ T − 1 ) a ⃗ = c ( T ∘ T − 1 ) ( a ⃗ ) ( T ∘ T − 1 ) ( c a ⃗ ) = c ( T ∘ T − 1 ) ( a ⃗ ) T ( T − 1 ( c a ⃗ ) ) = c T ( T − 1 ( a ⃗ ) ) = T ( c T − 1 ( a ⃗ ) ) apply T ( c x ⃗ ) = c T ( x ⃗ ) T − 1 ( T ( T − 1 ( c a ⃗ ) ) ) = T − 1 ( T ( c T − 1 ( a ⃗ ) ) ) multiply T − 1 to both side ( T − 1 ∘ T ) ( T − 1 ( c a ⃗ ) ) = ( T − 1 ∘ T ) ( c T − 1 ( a ⃗ ) ) T − 1 ( c a ⃗ ) = c T − 1 ( a ⃗ ) \begin{aligned}
(T\circ T^{-1})(c\vec{a}) &= c\vec{a} &
\text{apply } \vec{a} = (T\circ T^{-1})\vec{a}\\
&= c(T \circ T^{-1})(\vec{a}) \\\\
(T\circ T^{-1})(c\vec{a})
&= c(T \circ T^{-1})(\vec{a})\\
T(T^{-1}(c\vec{a}))
&= \color{red}{cT(T^{-1}(\vec{a})) = T(cT^{-1}(\vec{a}))}
& \text{apply }T(c\vec{x}) = cT(\vec{x})
\\
T^{-1}(T(T^{-1}(c\vec{a})))&= T^{-1}(T(cT^{-1}(\vec{a})))
& \text{ multiply } T^{-1} \text{ to both side} \\
(T^{-1}\circ T)(T^{-1}(c\vec{a})) &=
(T^{-1}\circ T)(cT^{-1}(\vec{a}))\\
T^{-1}(c\vec{a})&=
cT^{-1}(\vec{a})
\end{aligned} ( T ∘ T − 1 ) ( c a ) ( T ∘ T − 1 ) ( c a ) T ( T − 1 ( c a )) T − 1 ( T ( T − 1 ( c a ))) ( T − 1 ∘ T ) ( T − 1 ( c a )) T − 1 ( c a ) = c a = c ( T ∘ T − 1 ) ( a ) = c ( T ∘ T − 1 ) ( a ) = c T ( T − 1 ( a )) = T ( c T − 1 ( a )) = T − 1 ( T ( c T − 1 ( a ))) = ( T − 1 ∘ T ) ( c T − 1 ( a )) = c T − 1 ( a ) apply a = ( T ∘ T − 1 ) a apply T ( c x ) = c T ( x ) multiply T − 1 to both side 我們可以確立當 T 是 linear 時, inverse 也會是 linear
而且是 linear 就代表,inverse function 也能使用 matrix vector product 來表示!
更厲害的是我們知道 T 和 inverse 的組合會得到 identity matrix of n
( T − 1 ∘ T ) ( x ⃗ ) = A − 1 A x ⃗ = I R n x ⃗ = I n x ⃗ ( T ∘ T − 1 ) ( x ⃗ ) = A A − 1 x ⃗ = I R n x ⃗ = I n x ⃗ (T^{-1} \circ T)(\vec{x}) =
\mathbf{A}^{-1}\mathbf{A}\vec{x} =
\mathbf{I_\mathbb{R^n}}\vec{x} =
\mathbf{I_n}\vec{x}
\\
(T \circ T^{-1})(\vec{x}) =
\mathbf{A}\mathbf{A}^{-1}\vec{x} =
\mathbf{I_\mathbb{R^n}}\vec{x} =
\mathbf{I_n}\vec{x} ( T − 1 ∘ T ) ( x ) = A − 1 A x = I R n x = I n x ( T ∘ T − 1 ) ( x ) = A A − 1 x = I R n x = I n x 所以 matrix A 和他的 inverse matrix 也會得到 identity matrix of n
A − 1 A = A A − 1 = I n \mathbf{A}^{-1}\mathbf{A} = \mathbf{A}\mathbf{A}^{-1} = \mathbf{I_n} A − 1 A = A A − 1 = I n 可以說是 matrix multiplication 中,交換律 (commutative) 的例外
要怎麼求出 A 的 inserse matrix,我們留到下一章節