#### **Principles of Computer Architecture**

#### Miles Murdocca and Vincent Heuring

#### **Chapter 7: Memory**

Principles of Computer Architecture by M. Murdocca and V. Heuring


#### **Chapter Contents**

- 7.1 The Memory Hierarchy
- 7.2 Random Access Memory
- 7.3 Chip Organization
- 7.4 Commercial Memory Modules
- 7.5 Read-Only Memory
- 7.6 Cache Memory
- 7.7 Virtual Memory
- 7.8 Advanced Topics
- 7.9 Case Study: Rambus Memory
- 7.10 Case Study: The Intel Pentium Memory System

# **The Memory Hierarchy**




© 1999 M. Murdocca and V. Heuring

# **Functional Behavior of a RAM Cell**







# A Four-Word Memory with Four Bits per Word in a 2D Organization




# A Simplified Representation of the Four-Word by Four-Bit RAM




# Two Four-Word by Four-Bit RAMs are Used in Creating a Four-Word by Eight-Bit RAM




# Two Four-Word by Four-Bit RAMs Make up an Eight-Word by Four-Bit RAM




#### **Single-In-Line Memory Module**

Adapted from (Texas Instruments, MOS Memory: Commercial and Military Specifications Data Book, Texas Instruments, Literature Response Center, P.O. Box 172228, Denver, Colorado, 1991.)


Pin nomenclature:

| Pin             | Function              |
|-----------------|-----------------------|
| A0-A9           | Address Inputs        |
| CAS             | Column-Address Strobe |
| DQ1-DQ8         | Data In/Data Out      |
| NC              | No Connection         |
| RAS             | Row-Address Strobe    |
| V<sub>cc</sub>  | 5-V Supply            |
| V<sub>ss</sub>  | Ground                |
| W               | Write Enable          |




# **A ROM Stores Four Four-Bit Words**




### A Lookup Table (LUT) Implements an Eight-Bit ALU




# Placement of Cache in a Computer System



• The *locality principle*: a recently referenced memory location is likely to be referenced again (*temporal locality*); a neighbor of a recently referenced memory location is likely to be referenced (*spatial locality*).


# An Associative Mapping Scheme for a Cache Memory




# Associative Mapping Example

• Consider how an access to memory location $(A035F014)_{16}$ is mapped to the cache for a $2^{32}$ word memory. The memory is divided into $2^{27}$ blocks of $2^5 = 32$ words per block, and the cache consists of $2^{14}$ slots:

| Tag     | Word   |
|---------|--------|
| 27 bits | 5 bits |

• If the addressed word is in the cache, it will be found in word $(14)_{16}$ of a slot that has tag $(501AF80)_{16}$, which is made up of the 27 most significant bits of the address. If the addressed word is not in the cache, then the block corresponding to tag field $(501AF80)_{16}$ is brought into an available slot in the cache from the main memory, and the memory reference is then satisfied from the cache.

| Tag                         | Word  |
|-----------------------------|-------|
| 101000000011010111110000000 | 10100 |


### **Replacement Policies**

- When there are no available slots in which to place a block, a *replacement policy* is implemented. The replacement policy governs the choice of which slot is freed up for the new block.
- Replacement policies are used for associative and set-associative mapping schemes, and also for virtual memory.
- Least recently used (LRU)
- First-in/first-out (FIFO)
- Least frequently used (LFU)
- Random
- Optimal (used for analysis only: look backward in time and reverse-engineer the best possible strategy for a particular sequence of memory references)
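The difference between two of the listed policies can be seen on a short reference string. The following is a minimal sketch, assuming a fully associative cache that tracks only block numbers; the function name `count_misses` and the reference string are hypothetical and chosen for illustration.

```python
from collections import OrderedDict


def count_misses(refs, n_slots, policy):
    """Count misses for a sequence of block references.

    policy is "LRU" or "FIFO". The OrderedDict keeps blocks ordered from
    the next victim (front) to the most recently inserted or used (back).
    """
    cache = OrderedDict()
    misses = 0
    for block in refs:
        if block in cache:
            if policy == "LRU":
                # A hit refreshes the block's age under LRU only.
                cache.move_to_end(block)
            continue
        misses += 1
        if len(cache) == n_slots:
            cache.popitem(last=False)  # evict the block at the front
        cache[block] = True
    return misses


refs = [0, 1, 2, 0, 3, 0]  # hypothetical block reference string
lru_misses = count_misses(refs, 3, "LRU")
fifo_misses = count_misses(refs, 3, "FIFO")
print(lru_misses, fifo_misses)
```

For this particular string, FIFO evicts block 0 just before it is needed again, whereas LRU keeps it because it was recently referenced.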


### A Direct Mapping Scheme for Cache Memory




# **Direct Mapping Example**

• For a direct mapped cache, each main memory block can be mapped to only one slot, but each slot can receive more than one block. Consider how an access to memory location $(A035F014)_{16}$ is mapped to the cache for a $2^{32}$ word memory. The memory is divided into $2^{27}$ blocks of $2^5 = 32$ words per block, and the cache consists of $2^{14}$ slots:

| Tag     | Slot    | Word   |
|---------|---------|--------|
| 13 bits | 14 bits | 5 bits |

• If the addressed word is in the cache, it will be found in word  $(14)_{16}$  of slot  $(2F80)_{16}$ , which will have a tag of  $(1406)_{16}$ .
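The field values quoted in the mapping examples can be checked by shifting and masking the address. The sketch below is illustrative only; the helper `split_address` and the field names are hypothetical, while the field widths are those given in the examples.

```python
def split_address(address, field_widths):
    """Split an address into named fields.

    field_widths is a list of (name, width) pairs ordered from the most
    significant field to the least significant field.
    """
    fields = {}
    shift = sum(width for _, width in field_widths)
    for name, width in field_widths:
        shift -= width
        fields[name] = (address >> shift) & ((1 << width) - 1)
    return fields


ADDRESS = 0xA035F014

# Direct mapping: 13-bit tag, 14-bit slot, 5-bit word.
direct = split_address(ADDRESS, [("tag", 13), ("slot", 14), ("word", 5)])

# Associative mapping: 27-bit tag, 5-bit word.
associative = split_address(ADDRESS, [("tag", 27), ("word", 5)])

print({name: hex(value) for name, value in direct.items()})
print({name: hex(value) for name, value in associative.items()})
```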




# A Set Associative Mapping Scheme for a Cache Memory




# **Set-Associative Mapping Example**


• Consider how an access to memory location $(A035F014)_{16}$ is mapped to the cache for a $2^{32}$ word memory. The memory is divided into $2^{27}$ blocks of $2^5 = 32$ words per block, there are two blocks per set, and the cache consists of $2^{14}$ slots:
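The field layout for this example can be derived from the stated parameters. What follows is a worked sketch under those parameters rather than a reproduction of the original figure; the field names Tag, Set, and Word are assumed by analogy with the other mapping examples.

$$\text{No. sets} = \frac{2^{14}\ \text{slots}}{2\ \text{slots per set}} = 2^{13} \quad\Rightarrow\quad \text{Set} = 13\ \text{bits},\quad \text{Word} = 5\ \text{bits},\quad \text{Tag} = 32 - 13 - 5 = 14\ \text{bits}$$

$$(A035F014)_{16} = \underbrace{10100000001101}_{\text{Tag}\ =\ (280D)_{16}}\ \underbrace{0111110000000}_{\text{Set}\ =\ (0F80)_{16}}\ \underbrace{10100}_{\text{Word}\ =\ (14)_{16}}$$

Under this layout, a hit would find the addressed word in word $(14)_{16}$ of one of the two slots in set $(0F80)_{16}$, in the slot whose tag is $(280D)_{16}$.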



#### **Cache Read and Write Policies**




# **Hit Ratios and Effective Access Times**

• Hit ratio and effective access time for single level cache:

$$\text{Hit ratio} = \frac{\text{No. times referenced words are in cache}}{\text{Total number of memory accesses}}$$

$$\text{Eff. access time} = \frac{(\text{No. hits})(\text{Time per hit}) + (\text{No. misses})(\text{Time per miss})}{\text{Total number of memory accesses}}$$

• Hit ratios and effective access time for multi-level cache:

$$H_1 = \frac{\text{No. times accessed word is in on-chip cache}}{\text{Total number of memory accesses}}$$

$$H_2 = \frac{\text{No. times accessed word is in off-chip cache}}{\text{No. times accessed word is not in on-chip cache}}$$

$$T_{EFF} = \frac{(\text{No. on-chip cache hits})(\text{On-chip cache hit time}) + (\text{No. off-chip cache hits})(\text{Off-chip cache hit time}) + (\text{No. off-chip cache misses})(\text{Off-chip cache miss time})}{\text{Total number of memory accesses}}$$


# **Direct Mapped Cache Example**

• Compute hit ratio and effective access time for a program that executes from memory locations 48 to 95, and then loops 10 times from 15 to 31.

• The direct mapped cache has four 16-word slots, a hit time of 80 ns, and a miss time of 2500 ns. Load-through is used. The cache is initially empty.




### **Table of Events for Example Program**

| Event            | Location | Time                       | Comment                        |
|------------------|----------|----------------------------|--------------------------------|
| 1 miss           | 48       | 2500 ns                    | Memory block 3 to cache slot 3 |
| 15 hits          | 49-63    | 80 ns × 15 = 1200 ns       |                                |
| 1 miss           | 64       | 2500 ns                    | Memory block 4 to cache slot 0 |
| 15 hits          | 65-79    | 80 ns × 15 = 1200 ns       |                                |
| 1 miss           | 80       | 2500 ns                    | Memory block 5 to cache slot 1 |
| 15 hits          | 81-95    | 80 ns × 15 = 1200 ns       |                                |
| 1 miss           | 15       | 2500 ns                    | Memory block 0 to cache slot 0 |
| 1 miss           | 16       | 2500 ns                    | Memory block 1 to cache slot 1 |
| 15 hits          | 17-31    | 80 ns × 15 = 1200 ns       |                                |
| 9 hits           | 15       | 80 ns × 9 = 720 ns         | Last nine iterations of loop   |
| 144 hits         | 16-31    | 80 ns × 144 = 11,520 ns    | Last nine iterations of loop   |
| Total hits = 213 |          | Total misses = 5           |                                |


# Calculation of Hit Ratio and Effective Access Time for Example Program

$$\text{Hit ratio} = \frac{213}{218} = 97.7\%$$

$$\text{Effective access time} = \frac{(213)(80\ \text{ns}) + (5)(2500\ \text{ns})}{218} = 136\ \text{ns}$$
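The hit and miss counts can be reproduced by replaying the address trace through a small model of the cache. This is a sketch under the example's stated parameters (four 16-word slots, 80 ns hit time, 2500 ns miss time, initially empty cache); the function name is hypothetical, and storing the whole block number in place of a separate tag field is a simplification.

```python
def simulate_direct_mapped(addresses, n_slots=4, words_per_slot=16):
    """Return (hits, misses) for a direct mapped cache that starts empty."""
    resident = [None] * n_slots  # block number currently held by each slot
    hits = misses = 0
    for address in addresses:
        block = address // words_per_slot
        slot = block % n_slots  # each block maps to exactly one slot
        if resident[slot] == block:
            hits += 1
        else:
            misses += 1
            resident[slot] = block  # bring the block into its slot
    return hits, misses


# Locations 48 to 95 once, then the loop over 15 to 31 ten times.
trace = list(range(48, 96)) + list(range(15, 32)) * 10

hits, misses = simulate_direct_mapped(trace)
hit_ratio = hits / len(trace)
effective_ns = (hits * 80 + misses * 2500) / len(trace)
print(hits, misses, round(hit_ratio * 100, 1), round(effective_ns))
```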


# Neat Little LRU Algorithm

• A sequence is shown for the Neat Little LRU Algorithm for a cache with four slots. Main memory blocks are accessed in the sequence: 0, 2, 3, 1, 5, 4.
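The sequence can be traced in code. The sketch below assumes the matrix formulation in which each access to a slot writes 1s into that slot's row and then 0s into that slot's column, so that the row containing all 0s identifies the least recently used slot. Choosing the lowest-numbered all-zero row while the cache is filling is an assumption of this sketch, so the slot assignments may differ from the original figure.

```python
def neat_little_lru(block_sequence, n_slots=4):
    """Return (final slot contents, evicted blocks in order)."""
    matrix = [[0] * n_slots for _ in range(n_slots)]
    slots = [None] * n_slots
    evicted = []
    for block in block_sequence:
        if block in slots:
            slot = slots.index(block)  # hit: reuse the block's slot
        else:
            # Miss: the all-zero row is an empty or least recently used slot.
            slot = next(i for i, row in enumerate(matrix) if not any(row))
            if slots[slot] is not None:
                evicted.append(slots[slot])
            slots[slot] = block
        # Record the access: 1s across the row, then 0s down the column.
        matrix[slot] = [1] * n_slots
        for row in matrix:
            row[slot] = 0
    return slots, evicted


final_slots, evicted_blocks = neat_little_lru([0, 2, 3, 1, 5, 4])
print(final_slots, evicted_blocks)
```

With this sequence, the first four accesses fill the cache, block 5 then replaces block 0, and block 4 replaces block 2, which matches least recently used order.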




#### **Overlays**

• A partition graph for a program with a main routine and three subroutines:




# **Virtual Memory**

- Virtual memory is stored in a hard disk image. The physical memory holds a small number of virtual pages in physical page frames.
- A mapping between a virtual and a physical memory:




#### Page Table

The page table maps between virtual memory and physical memory.




# Using the Page Table

• A virtual address is translated into a physical address:
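The translation step can be sketched as a table lookup followed by concatenation of the page frame number with the unchanged offset. Everything specific here is hypothetical: the 1024-word page size, the page table contents, and the names `translate` and `PageFault` are assumptions for illustration, not values from the original figure.

```python
OFFSET_BITS = 10             # assumed page size of 2**10 = 1024 words
PAGE_SIZE = 1 << OFFSET_BITS


class PageFault(Exception):
    """Raised when the referenced virtual page is not in physical memory."""


# Hypothetical page table, indexed by virtual page number.
page_table = [
    {"present": True, "frame": 1},
    {"present": False, "frame": None},  # page resides on disk only
    {"present": True, "frame": 3},
    {"present": True, "frame": 0},
]


def translate(virtual_address):
    """Translate a virtual address into a physical address."""
    page = virtual_address >> OFFSET_BITS
    offset = virtual_address & (PAGE_SIZE - 1)
    entry = page_table[page]
    if not entry["present"]:
        raise PageFault(page)
    # The offset within the page is carried over unchanged.
    return (entry["frame"] << OFFSET_BITS) | offset


physical = translate(2 * PAGE_SIZE + 5)  # virtual page 2, offset 5
print(physical)
```

A reference to a page whose present bit is clear raises `PageFault`, which stands in for the operating system bringing the page in from disk.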




#### Segmentation

• A segmented memory allows two users to share the same word processor code, with different data spaces:




#### Fragmentation




#### **Translation Lookaside Buffer**

• An example TLB holds 8 entries for a system with 32 virtual pages and 16 page frames.

| Valid | Virtual page number | Physical page number |
|-------|---------------------|----------------------|
| 1     | 01001               | 1100                 |
| 1     | 10111               | 1001                 |
| 0     |                     |                      |
| 0     |                     |                      |
| 1     | 01110               | 0000                 |
| 0     |                     |                      |
| 1     | 00110               | 0111                 |
| 0     |                     |                      |
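A lookup in this example TLB can be sketched as a search over the valid entries. The entries below are taken from the table; the function name `tlb_lookup` and the use of `None` to signal a TLB miss are assumptions of this sketch, and the loop only emulates what the hardware would do with parallel comparisons.

```python
# (valid, virtual page number, physical page number), one tuple per entry.
TLB = [
    (1, 0b01001, 0b1100),
    (1, 0b10111, 0b1001),
    (0, None, None),
    (0, None, None),
    (1, 0b01110, 0b0000),
    (0, None, None),
    (1, 0b00110, 0b0111),
    (0, None, None),
]


def tlb_lookup(virtual_page):
    """Return the physical page number, or None on a TLB miss."""
    for valid, page, frame in TLB:
        if valid and page == virtual_page:
            return frame
    return None  # miss: the page table in memory must be consulted


print(tlb_lookup(0b01001))  # present in the TLB
print(tlb_lookup(0b00001))  # not present
```

Because frame 0 is a legitimate result, callers should test the return value against `None` rather than for truthiness.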


#### **3-Variable Decoder**

• A conventional decoder is not extensible to large sizes because each address line drives twice as many logic gates for each added address line.




#### **Tree Decoder - 3 Variables**

• A tree decoder is more easily extended to large sizes because fan-in and fan-out are managed by adding deeper levels.
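The level-by-level structure can be modelled in a few lines. This is a behavioral sketch, not a gate-accurate model of the original figure: each level consumes one address bit (most significant first, an assumed ordering) and splits every enable signal from the previous level into two, using only two-input AND operations. The name `tree_decode` is hypothetical.

```python
def tree_decode(address_bits):
    """Return a one-hot list of 2**n outputs for n address bits (MSB first)."""
    outputs = [1]  # a single enable signal at the root of the tree
    for bit in address_bits:
        next_level = []
        for enable in outputs:
            next_level.append(enable & (1 - bit))  # branch taken when bit is 0
            next_level.append(enable & bit)        # branch taken when bit is 1
        outputs = next_level
    return outputs


one_hot = tree_decode([1, 0, 1])  # address 101 in binary selects output 5
print(one_hot)
```

Adding an address line adds one more level to the tree, while every AND in the model still has exactly two inputs.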




# **Tree Decoding – One Level at a Time**

• A decoding tree for a 16-word random access memory:




# Content Addressable Memory – Addressing

• Relationships between random access memory and content addressable memory:

Random access memory:

| Address (32 bits) | Value (32 bits) |
|-------------------|-----------------|
| 0000A000          | 0F0F0000        |
| 0000A004          | 186734F1        |
| 0000A008          | 0F000000        |
| 0000A00C          | FE681022        |
| 0000A010          | 3152467C        |
| 0000A014          | C3450917        |
| 0000A018          | 00392B11        |
| 0000A01C          | 10034561        |

Content addressable memory:

| Field1 (12 bits) | Field2 (4 bits) | Field3 (8 bits) |
|------------------|-----------------|-----------------|
| 000              | A               | 9E              |
| 011              | 0               | F0              |
| 149              | 7               | 01              |
| 091              | 4               | 00              |
| 000              | E               | FE              |
| 749              | C               | 6E              |
| 000              | 0               | 50              |
| 575              | 1               | 84              |
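The contrast with address-based access can be sketched by searching the content addressable memory on a field value instead of indexing it by location. The data below comes from the table, read as hexadecimal (an assumption); the function name `cam_search` is hypothetical, and the sequential loop only emulates what a CAM would do by comparing all words at once.

```python
# Each word is (Field1, Field2, Field3), with values read as hexadecimal.
CAM = [
    (0x000, 0xA, 0x9E),
    (0x011, 0x0, 0xF0),
    (0x149, 0x7, 0x01),
    (0x091, 0x4, 0x00),
    (0x000, 0xE, 0xFE),
    (0x749, 0xC, 0x6E),
    (0x000, 0x0, 0x50),
    (0x575, 0x1, 0x84),
]


def cam_search(field_index, key):
    """Return the positions of all words whose chosen field equals key."""
    return [i for i, word in enumerate(CAM) if word[field_index] == key]


matches = cam_search(0, 0x000)  # every word whose Field1 is 000
print(matches, [hex(CAM[i][2]) for i in matches])
```

A single search can match several words, which is why the result is a list rather than one value.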


# **Overview of CAM**

• Source: (Foster, C. C., Content Addressable Parallel Processors, Van Nostrand Reinhold Company, 1976.)




# **Addressing Subtrees for a CAM**




# **Block Diagram of Dual-Read RAM**




# **Rambus Memory**

• Rambus technology on the Nintendo 64 motherboard (top left and bottom right) enables cost savings over the conventional Sega Saturn motherboard design (bottom left). (Photo source: Rambus, Inc.)








#### **The Intel Pentium Memory System**


