First Week

A New Golden Age for Computer Architecture

中文文章
https://www.jiqizhixin.com/articles/2017-10-31-15
https://www.jiqizhixin.com/articles/2019-01-30-12

歷史

IBM 藉由 micro instruction 實現 Microprogramming 讓單個 ISA 解決他所有的產品
設計 Control unit 為一大挑戰
此時的 ISA 還是為 CISC 架構
電晶體及 RAM 的竄起，CISC 指令能夠更加複雜，出現了 minicomputer 及 Microprocessor (還是 CISC)
出現了幾乎可以取代 CISC 的 RISC
RISC = RAM for cache, simple ISA, better chip integration, better load-store ISAs
出現 RISC 贏過 CISC 的公式 (4X 快)
$\frac{\text{Time}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instructions}} \times \frac{\text{Time}}{\text{Clock cycles}}$
在 PC 時代還是 x86 主宰市場，但在後 PC 時代已經是 RISC 的天下 (99% Processors)
出現了 VLIW 說不定可以成為下一個 RISC
但 VLIW 被證實有太多 issues，不過還是活躍於 Embedded DSP 市場
CISC 已經 30 年沒有突破，VLIW 是 15 年，目前還是 RISC 的天下

挑戰

End of Dennard scaling and Moore’s Law => Power 吃重、Transistor 不再進化
End of the uniprocessor era
市場改變 => 從 PC/Server 變成 IoT/Mobile/Clouds
資訊安全如 Meltdown & Spectre

未來

Software Programmer 在撰寫程式前更需要知道硬體運作
或是找出方法讓 Python 也可以 run 的跟 C with compiler + HW 一樣
Hardware 只剩 Domain Specific Architectures (DSAs) 這條路可走 : 針對應用領域做最佳化的處理器架構
- DSA 不是針對單個應用，而是整個應用 domain
- 需要有該應用 domain 的深度認識
- 現有例子為 : neural network processors for ML (TPU), GPUs for graphics, Programmable network switches
搭配的是 Domain-specific language (DSL)
- 使用相應的語言來處理適合的 task
- OpenGL, Tensorflow, P4
制定全新的開源 ISA 基礎 : RISC-V

Great Ideas in Computer Architecture

Abstraction (Layers of Representation/Interpretation)

High-level language program 如何藉由 Compiler, Assembler, Machine interpretation, Architecture implementation 來實現

以及其中牽涉到的 physics, electrons and quarks 等基礎原理是怎麼實作進來的

Moore’s Law

David House 提出每 18 個月晶片的效能將會提高一倍

Principle of Locality/Memory Hierarchy

Jim Gray 定義了一系列譬喻 : "How far away is the data ?"

Register : Head
On-chip cache : Room
On-board cache : Campus
Memory : Sacramento (薩克拉門托城市)
Disk : Pluto (冥王星)
Tape : Andromeda (仙女座)

而在 memory hierarchy 中，金字塔越往下代表 Cheaper, Bigger, Slower

CPU
- Processor register
CPU Cache
- L1, L2, L3
Physical memory
- RAM
Solid state memory
- SSD, Flash
Virtual memory
- Hard drives

Parallelism

但 ! Ahdahl's law 指出 : 並不是平行化的越多，如 X 倍，執行速度就 X 倍

要考慮工作中的 Serial part 是不能平行的 !

Performance Measurement & Improvement

Great Performance measurement gives you great performance.

所以需要一個好的 benchmark 來測試效能 !

Dependability via Redundancy

利用 Redundancy 讓系統在某一個區塊錯誤時不會整個崩潰

在 datacenter, disks (RAID), memory bits (ECC) 都可以看到這樣的作法

Number representation

Bits can represent anything

Characters
Logical values
Pi
$N \text{ bits} \iff \text{ at most } 2^N \text{ things}$

Base 10

3271_{10} = (3 \times 10^3) + (2 \times 10^2) + (7 \times 10^1) + (1 \times 10^0)

Base 2

\begin{aligned} 1101_2 &= (1 \times 2^3) + (1 \times 2^2) + (0 \times 2^1) + (1 \times 2^0)\\ &= 8 + 4 + 0 + 1 \\ &= 13 \end{aligned}

Base 16

\begin{aligned} 0xA5 = A5_{16} &= (10 \times 16^1) + (5 \times 16^0)\\ &= 160 + 5 \\ &= 165 \end{aligned}

Decimal to Binary

將二進位由大到小從左到右寫出

觀察是否可以被最大的減去，由左往右直到 1 為止 :

Decimal to Hexadecimal

跟上面差不多 :

Binary to Hexadecimal

將二進位補滿 4 個數一組，再轉換每組數字 :

\begin{aligned} \text{e.g., 11110 to hex : }&\\ 11110 &= 0001 1110\\ 0001 &= 1\\ 1110 &= 8 + 4 + 2 = 14 = E\\ 11110 &= 1E \end{aligned}

Hexadecimal to Binary

跟上面相反，把數字拆開成為 4 個一組的二進位數，再把前面多餘的 0 刪掉 :

\begin{aligned} \text{e.g., 1E to binary : }&\\ 1 &= 0001\\ E &= 14 = 8 + 4 + 2 + 0 = 1110\\ 1E &= 0001 1110\\ &= 11110 \end{aligned}

Binary Operations

Add, subtract is same as decimal.

Binary Overflow

當兩個 binary 數值相加超過 hardware 能承受最大的 bits (11...1) 時，就會發生 Overflow

Binary Negative numbers

Sign and Magnitude

在原本沒有負數的 unsigned numbers 取他的最左 bit 做為正負號 (0001 => 1001)
若有 N bits 則範圍為 (減一是因為 0)
$-(2^{N-1} - 1) \sim (2^{N-1} - 1)$
因為要保證最左不參與加減法，所以電路複雜
而且這樣會有兩個 0 (負的跟正的各一)

Ones' complement

將所有的數字反轉 (0001 => 1110)
若有 N bits 則範圍為 (減一是因為 0)
$-(2^{N-1} - 1) \sim (2^{N-1} - 1)$
電路設計變簡單
需要循環進位的電路 (將 overflow 加回最低位)
一樣有兩個 0 (正負各一)

Two's complement

利用在 negative, complement + 1 來解決兩個 0 的 "Overlap" (0001 => 1110 => 1111)
範圍為
$-(2^{N-1}) \sim (2^{N-1}-1)$
因為解決了 0 的問題，看起來像是負數比正數還要多一個
負數可以用二的 power 來表示，公式為
$d_{31} \times {\color{red}-(2^{31})} + d_{30} \times 2^{30} + \cdots + d_1 \times 2^1 + d_0 \times 2^0$
例如 1101 :
$\begin{aligned} &= 1 \times -(2^{3}) + 1 \times 2^2 + 0 \times 2^1 + 1 \times 2^0\\ &= -(2^3) + 2^2 + 0 + 2^0\\ &= -8 + 4 + 0 + 1\\ &= -3 \end{aligned}$

Floating point

Fraction representation

representation 方式跟 decimal 差不多 : 而轉換方式如下 :

10.1010_2 = 1\times2^1 + 1\times2^{-1} + 1\times2^{-3} = 2.625_{10}

加法 :

multiplication

Scientific Notation

Floating Point Representation

IEEE 754 Floating Point Standard

Representation for ± ∞

Representation for 0

Special Numbers

NaN

Memory management

struct

linked list

global

3 pools

memory management

stack

Heap

malloc

3 way find free space first, next, best fit

PreviousNCKU - Computer Architecture NextNCKU - Data Mining

Last updated 5 years ago

Was this helpful?