Academic
  • Introduction
  • Artificial Intelligence
    • Introduction
    • AI Concepts, Terminology, and Application Areas
    • AI: Issues, Concerns and Ethical Considerations
  • Biology
    • Scientific Method
    • Chemistry of Life
    • Water, Acids, Bases
    • Properties of carbon
    • Macromolecules
    • Energy and Enzymes
    • Structure of a cell
    • Membranes and transport
    • Cellular respiration
    • Cell Signaling
    • Cell Division
    • Classical and molecular genetics
    • DNA as the genetic material
    • Central dogma
    • Gene regulation
  • Bioinformatics
    • Bioinformatics Overview
  • Deep Learning
    • Neural Networks and Deep Learning
      • Introduction
      • Logistic Regression as a Neural Network
      • Python and Vectorization
      • Shallow Neural Network
      • Deep Neural Network
    • Improving Deep Neural Networks
      • Setting up your Machine Learning Application
      • Regularizing your Neural Network
      • Setting up your Optimization Problem
      • Optimization algorithms
      • Hyperparameter, Batch Normalization, Softmax
    • Structuring Machine Learning Projects
    • Convolutional Neural Networks
      • Introduction
    • Sequence Models
      • Recurrent Neural Networks
      • Natural Language Processing & Word Embeddings
      • Sequence models & Attention mechanism
  • Linear Algebra
    • Vectors and Spaces
      • Vectors
      • Linear combinations and spans
      • Linear dependence and independence
      • Subspaces and the basis for a subspace
      • Vector dot and cross products
      • Matrices for solving systems by elimination
      • Null space and column space
    • Matrix transformations
      • Functions and linear transformations
      • Linear transformation examples
      • Transformations and matrix multiplication
      • Inverse functions and transformations
      • Finding inverses and determinants
      • More Determinant Depth
  • Machine Learning
    • Introduction
    • Linear Regression
      • Model and Cost Function
      • Parameter Learning
      • Multivariate Linear Regression
      • Computing Parameters Analytically
      • Octave
    • Logistic Regression
      • Classification and Representation
      • Logistic Regression Model
    • Regularization
      • Solving the Problem of Overfitting
    • Neural Networks
      • Introduction of Neural Networks
      • Neural Networks - Learning
    • Improve Learning Algorithm
      • Advice for Applying Machine Learning
      • Machine Learning System Design
    • Support Vector Machine
      • Large Margin Classification
      • Kernels
      • SVM in Practice
  • NCKU - Artificial Intelligence
    • Introduction
    • Intelligent Agents
    • Solving Problems by Searching
    • Beyond Classical Search
    • Learning from Examples
  • NCKU - Computer Architecture
    • First Week
  • NCKU - Data Mining
    • Introduction
    • Association Analysis
    • FP-growth
    • Other Association Rules
    • Sequence Pattern
    • Classification
    • Evaluation
    • Clustering
    • Link Analysis
  • NCKU - Machine Learning
    • Probability
    • Inference
    • Bayesian Inference
    • Introduction
  • NCKU - Robotic Navigation and Exploration
    • Kinetic Model & Vehicle Control
    • Motion Planning
    • SLAM Back-end (I)
    • SLAM Back-end (II)
    • Computer Vision / Multi-view Geometry
    • Lie group & Lie algebra
    • SLAM Front-end
  • Python
    • Numpy
    • Pandas
    • Scikit-learn
      • Introduction
      • Statistic Learning
  • Statstics
    • Quantitative Data
    • Modeling Data Distribution
    • Bivariate Numerical Data
    • Probability
    • Random Variables
    • Sampling Distribution
    • Confidence Intervals
    • Significance tests
Powered by GitBook
On this page
  • A New Golden Age for Computer Architecture
  • 歷史
  • 挑戰
  • 未來
  • Great Ideas in Computer Architecture
  • Abstraction (Layers of Representation/Interpretation)
  • Moore’s Law
  • Principle of Locality/Memory Hierarchy
  • Parallelism
  • Performance Measurement & Improvement
  • Dependability via Redundancy
  • Number representation
  • Floating point
  • Memory management

Was this helpful?

  1. NCKU - Computer Architecture

First Week

PreviousNCKU - Computer ArchitectureNextNCKU - Data Mining

Last updated 5 years ago

Was this helpful?

A New Golden Age for Computer Architecture

中文文章

歷史

  • IBM 藉由 micro instruction 實現 讓單個 ISA 解決他所有的產品

  • 設計 Control unit 為一大挑戰

  • 此時的 ISA 還是為

  • 電晶體及 RAM 的竄起,CISC 指令能夠更加複雜,出現了 及 (還是 CISC)

  • 出現了幾乎可以取代 CISC 的

  • RISC = RAM for cache, simple ISA, better chip integration, better load-store ISAs

  • 出現 RISC 贏過 CISC 的公式 (4X 快)

    TimeProgram=InstructionsProgram×Clock cyclesInstructions×TimeClock cycles\frac{\text{Time}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instructions}} \times \frac{\text{Time}}{\text{Clock cycles}}ProgramTime​=ProgramInstructions​×InstructionsClock cycles​×Clock cyclesTime​
  • 在 PC 時代還是 x86 主宰市場,但在後 PC 時代已經是 RISC 的天下 (99% Processors)

  • 出現了 說不定可以成為下一個 RISC

  • 但 VLIW 被證實有太多 issues,不過還是活躍於 市場

  • CISC 已經 30 年沒有突破,VLIW 是 15 年,目前還是 RISC 的天下

挑戰

  • End of the uniprocessor era

  • 市場改變 => 從 PC/Server 變成 IoT/Mobile/Clouds

  • 資訊安全如 Meltdown & Spectre

未來

  • Software Programmer 在撰寫程式前更需要知道硬體運作

  • 或是找出方法讓 Python 也可以 run 的跟 C with compiler + HW 一樣

    • DSA 不是針對單個應用,而是整個應用 domain

    • 需要有該應用 domain 的深度認識

    • 現有例子為 : neural network processors for ML (TPU), GPUs for graphics, Programmable network switches

    • 使用相應的語言來處理適合的 task

    • OpenGL, Tensorflow, P4

Great Ideas in Computer Architecture

Abstraction (Layers of Representation/Interpretation)

High-level language program 如何藉由 Compiler, Assembler, Machine interpretation, Architecture implementation 來實現

以及其中牽涉到的 physics, electrons and quarks 等基礎原理是怎麼實作進來的

Moore’s Law

David House 提出 每 18 個月晶片的效能將會提高一倍

Principle of Locality/Memory Hierarchy

Jim Gray 定義了一系列譬喻 : "How far away is the data ?"

  • Register : Head

  • On-chip cache : Room

  • On-board cache : Campus

  • Memory : Sacramento (薩克拉門托城市)

  • Disk : Pluto (冥王星)

  • Tape : Andromeda (仙女座)

而在 memory hierarchy 中,金字塔越往下代表 Cheaper, Bigger, Slower

  • CPU

    • Processor register

  • CPU Cache

    • L1, L2, L3

  • Physical memory

    • RAM

  • Solid state memory

    • SSD, Flash

  • Virtual memory

    • Hard drives

Parallelism

    • like Google docs

要考慮工作中的 Serial part 是不能平行的 !

Performance Measurement & Improvement

Great Performance measurement gives you great performance.

所以需要一個好的 benchmark 來測試效能 !

Dependability via Redundancy

利用 Redundancy 讓系統在某一個區塊錯誤時不會整個崩潰

在 datacenter, disks (RAID), memory bits (ECC) 都可以看到這樣的作法

Number representation

Bits can represent anything

  • Characters

  • Logical values

  • Pi

Base 10

Base 2

Base 16

Decimal to Binary

將二進位由大到小從左到右寫出

觀察是否可以被最大的減去,由左往右直到 1 為止 :

Decimal to Hexadecimal

跟上面差不多 :

Binary to Hexadecimal

將 二進位補滿 4 個數一組,再轉換每組數字 :

Hexadecimal to Binary

跟上面相反,把數字拆開成為 4 個一組的二進位數,再把前面多餘的 0 刪掉 :

Binary Operations

Add, subtract is same as decimal.

Binary Overflow

當兩個 binary 數值相加超過 hardware 能承受最大的 bits (11...1) 時,就會發生 Overflow

Binary Negative numbers

Sign and Magnitude

  • 在原本沒有負數的 unsigned numbers 取他的最左 bit 做為正負號 (0001 => 1001)

  • 若有 N bits 則範圍為 (減一是因為 0)

  • 因為要保證最左不參與加減法,所以電路複雜

  • 而且這樣會有兩個 0 (負的跟正的各一)

Ones' complement

  • 將所有的數字反轉 (0001 => 1110)

  • 若有 N bits 則範圍為 (減一是因為 0)

  • 電路設計變簡單

  • 需要循環進位的電路 (將 overflow 加回最低位)

  • 一樣有兩個 0 (正負各一)

Two's complement

  • 利用在 negative, complement + 1 來解決兩個 0 的 "Overlap" (0001 => 1110 => 1111)

  • 範圍為

  • 因為解決了 0 的問題,看起來像是負數比正數還要多一個

  • 負數可以用二的 power 來表示,公式為

  • 例如 1101 :

Floating point

Fraction representation

representation 方式跟 decimal 差不多 : 而轉換方式如下 :

加法 :

multiplication

Scientific Notation

Floating Point Representation

IEEE 754 Floating Point Standard

Representation for ± ∞

Representation for 0

Special Numbers

NaN

Memory management

struct

linked list

global

3 pools

memory management

stack

Heap

malloc

3 way find free space first, next, best fit

End of and => Power 吃重、Transistor 不再進化

Hardware 只剩 這條路可走 : 針對應用領域做最佳化的處理器架構

搭配的是

制定全新的開源 ISA 基礎 :

但 ! 指出 : 並不是平行化的越多,如 X 倍,執行速度就 X 倍

N bits  ⟺   at most 2N thingsN \text{ bits} \iff \text{ at most } 2^N \text{ things}N bits⟺ at most 2N things
327110=(3×103)+(2×102)+(7×101)+(1×100)3271_{10} = (3 \times 10^3) + (2 \times 10^2) + (7 \times 10^1) + (1 \times 10^0)327110​=(3×103)+(2×102)+(7×101)+(1×100)
11012=(1×23)+(1×22)+(0×21)+(1×20)=8+4+0+1=13\begin{aligned} 1101_2 &= (1 \times 2^3) + (1 \times 2^2) + (0 \times 2^1) + (1 \times 2^0)\\ &= 8 + 4 + 0 + 1 \\ &= 13 \end{aligned}11012​​=(1×23)+(1×22)+(0×21)+(1×20)=8+4+0+1=13​
0xA5=A516=(10×161)+(5×160)=160+5=165\begin{aligned} 0xA5 = A5_{16} &= (10 \times 16^1) + (5 \times 16^0)\\ &= 160 + 5 \\ &= 165 \end{aligned}0xA5=A516​​=(10×161)+(5×160)=160+5=165​
e.g., 11110 to hex : 11110=000111100001=11110=8+4+2=14=E11110=1E\begin{aligned} \text{e.g., 11110 to hex : }&\\ 11110 &= 0001 1110\\ 0001 &= 1\\ 1110 &= 8 + 4 + 2 = 14 = E\\ 11110 &= 1E \end{aligned}e.g., 11110 to hex : 111100001111011110​=00011110=1=8+4+2=14=E=1E​
e.g., 1E to binary : 1=0001E=14=8+4+2+0=11101E=00011110=11110\begin{aligned} \text{e.g., 1E to binary : }&\\ 1 &= 0001\\ E &= 14 = 8 + 4 + 2 + 0 = 1110\\ 1E &= 0001 1110\\ &= 11110 \end{aligned}e.g., 1E to binary : 1E1E​=0001=14=8+4+2+0=1110=00011110=11110​

−(2N−1−1)∼(2N−1−1)-(2^{N-1} - 1) \sim (2^{N-1} - 1)−(2N−1−1)∼(2N−1−1)

−(2N−1−1)∼(2N−1−1)-(2^{N-1} - 1) \sim (2^{N-1} - 1)−(2N−1−1)∼(2N−1−1)

−(2N−1)∼(2N−1−1)-(2^{N-1}) \sim (2^{N-1}-1)−(2N−1)∼(2N−1−1)

d31×−(231)+d30×230+⋯+d1×21+d0×20d_{31} \times {\color{red}-(2^{31})} + d_{30} \times 2^{30} + \cdots + d_1 \times 2^1 + d_0 \times 2^0d31​×−(231)+d30​×230+⋯+d1​×21+d0​×20

=1×−(23)+1×22+0×21+1×20=−(23)+22+0+20=−8+4+0+1=−3\begin{aligned} &= 1 \times -(2^{3}) + 1 \times 2^2 + 0 \times 2^1 + 1 \times 2^0\\ &= -(2^3) + 2^2 + 0 + 2^0\\ &= -8 + 4 + 0 + 1\\ &= -3 \end{aligned}​=1×−(23)+1×22+0×21+1×20=−(23)+22+0+20=−8+4+0+1=−3​

10.10102=1×21+1×2−1+1×2−3=2.6251010.1010_2 = 1\times2^1 + 1\times2^{-1} + 1\times2^{-3} = 2.625_{10}10.10102​=1×21+1×2−1+1×2−3=2.62510​

Dennard scaling
Moore’s Law
Domain Specific Architectures (DSAs)
Domain-specific language (DSL)
RISC-V
Video
Slide
Pipeline
Fork and join
Data parallelism
Ahdahl's law
Slide
補充 - 解讀計算機編碼
Slide
補充 - 數值系統
Slide
Video
https://www.jiqizhixin.com/articles/2017-10-31-15
https://www.jiqizhixin.com/articles/2019-01-30-12
Article
Notes
Microprogramming
CISC 架構
minicomputer
Microprocessor
RISC
VLIW
Embedded DSP