Computing Parameters Analytically


Normal Equation

There is another way to find the parameters that minimize the cost function without going through iterations.

This method is called the Normal Equation.

Arrange the features and results of the training set into matrices,

and the normal equation formula yields the optimal solution directly:

$$\theta = (X^TX)^{-1}X^Ty, \quad \theta \in \mathbb{R}^{n+1}$$
  • Take house-price prediction as an example:

    • The features, together with an extra x0 = 1 column, form the matrix X

    • The known results form the vector y

  • In Octave, the normal equation is implemented with the following expression:

    pinv(X'*X)*X'*y
    • pinv computes the (pseudo)inverse of a matrix

    • X' computes the transpose of X

  • The normal equation does not require feature scaling

  • Gradient descent and the normal equation compare as follows:

| Gradient Descent | Normal Equation |
| --- | --- |
| Needs to choose α (learning rate) | No α needed |
| Needs many iterations | No iterations |
| O(kn²) per run | O(n³) to compute (XᵀX)⁻¹ |
| Works well even when n is large | Slow when n is large |

In practice the normal equation cannot handle a very large number of features n.

Once n exceeds about 10,000, it is better to use gradient descent instead of the normal equation.
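The whole procedure can be sketched in a few lines of NumPy (the course uses Octave, but the formula is identical; the house sizes and prices below are made-up illustration data):

```python
import numpy as np

# Training set: house sizes (sq. ft.) and prices -- made-up numbers for illustration
sizes = np.array([2104.0, 1416.0, 1534.0, 852.0])
y = np.array([460.0, 232.0, 315.0, 178.0])

# Design matrix X: prepend the x0 = 1 column to every training example
X = np.column_stack([np.ones_like(sizes), sizes])

# Normal equation: theta = (X^T X)^{-1} X^T y
# pinv plays the same role as Octave's pinv(X'*X)*X'*y
theta = np.linalg.pinv(X.T @ X) @ X.T @ y

# theta[0] is the intercept, theta[1] the slope; predictions are X @ theta
predictions = X @ theta
```

No learning rate, no iterations, and no feature scaling are involved; the cost is dominated by inverting the (n+1)×(n+1) matrix XᵀX.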

Normal Equation Noninvertibility

  • What if X^TX is not invertible when computing the normal equation?

    • In fact, Octave offers two functions for inverting a matrix: inv and pinv

    • pinv returns a (pseudo)inverse whether or not the matrix is invertible

  • If X^TX is not invertible, the likely causes are:

    • Redundant features (features that are linearly dependent on each other)

    • Too many features relative to the number of training examples (m ≤ n)

Fixing either of these issues first usually restores invertibility and speeds up the computation.
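The redundant-feature case can be seen concretely in a small sketch (hypothetical data): duplicating a feature as a rescaled copy makes X^TX singular, yet pinv still returns a usable least-squares solution, which is exactly why the course recommends it over inv:

```python
import numpy as np

sizes = np.array([2104.0, 1416.0, 1534.0, 852.0])
y = np.array([460.0, 232.0, 315.0, 178.0])

# Redundant feature: size in square meters is just a rescaling of size in sq. ft.
X = np.column_stack([np.ones_like(sizes), sizes, sizes * 0.0929])

XtX = X.T @ X
# X^T X has rank 2 but shape 3x3, so a true inverse does not exist
assert np.linalg.matrix_rank(XtX) < XtX.shape[0]

# pinv (the Moore-Penrose pseudoinverse) still yields a least-squares solution
theta = np.linalg.pinv(XtX) @ X.T @ y
```

Dropping the redundant column (or collecting more examples when m ≤ n) restores invertibility, at which point inv and pinv agree.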