Setting up your Machine Learning Application

Train / Dev / Test sets

  • In the earlier machine learning era, datasets were small (roughly 100 to 10,000 examples)

    • a 60 / 20 / 20 split into train / dev / test sets worked well

    • dev set = cross validation set = development set (different names for the same thing)

  • In today's big data era (say 1,000,000 examples)

    • the dev set only needs to be large enough to compare different algorithms (10,000 examples is plenty)

    • the test set only needs to estimate how well the final classifier performs (10,000 examples is plenty)

    • so the split ratios nowadays are often more like

      • 98 / 1 / 1, or even 99.5 / 0.25 / 0.25
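The ratio-based split above can be sketched in a few lines of NumPy. The function name `split_dataset`, the default ratios, and the seed are illustrative choices, not part of the original notes:

```python
import numpy as np

def split_dataset(X, y, ratios=(0.98, 0.01, 0.01), seed=0):
    """Shuffle the data, then split it into train / dev / test sets by ratio."""
    assert abs(sum(ratios) - 1.0) < 1e-9, "ratios must sum to 1"
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))                  # shuffled example indices
    n_train = int(len(X) * ratios[0])
    n_dev = int(len(X) * ratios[1])
    train, dev, test = np.split(idx, [n_train, n_train + n_dev])
    return (X[train], y[train]), (X[dev], y[dev]), (X[test], y[test])

# With 1,000,000 examples, ratios=(0.98, 0.01, 0.01) leaves 10,000 examples
# each for the dev and test sets, which is plenty for comparing algorithms.
```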

Mismatched train / test distribution

  • During development, the train data and the dev/test data should come from the same distribution

  • For example, if the train data consists of high-resolution images crawled from the web

  • while the test data consists of low-resolution photos taken on users' phones

  • the mismatch hurts test performance and slows down the whole development cycle

Bias / Variance

  • Suppose we are training a cat-photo classifier, and the human recognition error rate is 0%

| Result | Train set error | Dev set error |
| --- | --- | --- |
| high variance | 1% | 11% |
| high bias | 15% | 16% |
| high on both | 15% | 30% |
| low on both | 0.5% | 1% |

  • good train performance but much worse dev performance → overfitting (high variance)

  • poor train performance and similarly poor dev performance → underfitting (high bias)

  • poor train performance and even worse dev performance → underfitting and overfitting at once (high bias and high variance)

  • good performance on both → just right

  • these judgments assume a human error rate of 0% (called the optimal error, or Bayes error)

  • the conclusions shift as the optimal error changes (e.g. if the photos are so blurry that even humans err 15%, then a 15% train error no longer signals high bias)
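The four cases in the error table can be captured by a small helper. The function name `diagnose` and the 5% gap threshold are made-up illustrations; in practice the threshold depends on the problem and on the optimal error:

```python
def diagnose(train_err, dev_err, optimal_err=0.0, gap=0.05):
    """Rough bias/variance diagnosis from error rates (as fractions, not %).

    gap is a hypothetical threshold for what counts as a 'large' difference.
    """
    high_bias = (train_err - optimal_err) > gap    # poor fit to the training set
    high_variance = (dev_err - train_err) > gap    # poor generalization to dev set
    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias (underfitting)"
    if high_variance:
        return "high variance (overfitting)"
    return "low bias and low variance"

# Reproducing the table above (optimal error = 0%):
# diagnose(0.01, 0.11)   -> "high variance (overfitting)"
# diagnose(0.15, 0.16)   -> "high bias (underfitting)"
# diagnose(0.15, 0.30)   -> "high bias and high variance"
# diagnose(0.005, 0.01)  -> "low bias and low variance"
```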

Basic Recipe for Machine Learning

  • The bias and variance diagnosis points directly to the corresponding remedies

  • High bias (bad training set performance)?

    • build a bigger neural network

    • train longer

    • (NN architecture search) (may not help)

  • High variance (bad dev set performance)?

    • get more data

    • Regularization

    • (NN architecture search) (may not help)
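The recipe above can be sketched as a decision function; the order matters, since high bias is addressed before high variance. The name `recipe_actions` and the 2% tolerance are hypothetical:

```python
def recipe_actions(train_err, dev_err, target_err=0.0, tol=0.02):
    """Suggest next steps following the basic recipe (thresholds are illustrative)."""
    if train_err - target_err > tol:
        # high bias: the model cannot even fit the training set
        return ["build a bigger network", "train longer",
                "try NN architecture search (may not help)"]
    if dev_err - train_err > tol:
        # high variance: the model fails to generalize to the dev set
        return ["get more data", "add regularization",
                "try NN architecture search (may not help)"]
    return ["done"]
```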

Bias vs. Variance tradeoff

  • In the classical machine learning era

    • lowering bias raised variance, and vice versa

  • But in the big data deep learning era

    • building a bigger neural network can reduce bias without hurting the other factors

    • getting more data can reduce variance without hurting the other factors

      • provided you have a good regularization implementation

    • this is one reason deep learning performs so well on supervised learning tasks

