Computer Architecture - Final Study Set

DRAM

(Dynamic Random Access Memory) - Main Memory ● Dynamic: since capacitors leak charge, it cannot retain data, even when powered on, without being "refreshed" regularly (thousands of times per second) ● Volatile: loses data when there is no power ● Advantages: high density (single-transistor cells) → greater memory capacity, cheaper to manufacture ● Disadvantages: slower access, higher power consumption ● Typical use: system memory, video/graphics memory

SRAM

(Static Random Access Memory) - cache ● Requires constant power supply to function ● Static: no action (i.e., refreshing) required to keep content intact; lasts until power is turned off ● Volatile: loses data when there is no power ● Advantages: fast access speed, lower power consumption ● Disadvantages: low density (six-transistor cells) → lower memory capacity and higher manufacturing cost ● Typical use: CPU caches (internal memory)

Spatial Locality

(locality in space) ● If a data location is referenced, data locations with nearby addresses will tend to be referenced soon. ● Move entire blocks consisting of contiguous words to the faster, upper levels of memory.

Temporal Locality

(locality in time) ● If a data location is referenced then it is likely to be referenced again soon ● Keep the most recently accessed data items in the faster memory closer to the processor
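
A minimal Python sketch (the array name and size are illustrative, not from the study set) showing both kinds of locality in one loop:

```python
# One loop exhibiting both kinds of locality.
data = list(range(1024))   # illustrative array

total = 0
for i in range(len(data)):
    total += data[i]   # spatial locality: data[i + 1] lives at a nearby address
    # temporal locality: `total` and `i` are re-referenced on every iteration
```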

Advantages of VM

- Allows multiple processes to run concurrently - data sharing - data security - processes can be larger than physical memory - data can be placed anywhere in memory

Disadvantages of VM

- Slower/longer memory access time - more difficult implementation

Memory Hierarchy

- Registers - On-Chip L1 Cache (SRAM) - Off-Chip L2 Cache (SRAM) - Main Memory (DRAM) - Local Secondary Storage (Local Disks) - Remote Secondary Storage (Distributed file systems, web servers)

The 3 Types of Pipeline Hazards

1. Structural Hazard 2. Data Hazard 3. Control Hazard

Two Main types of Bus Snooping Protocols

1. Write Invalidate 2. Write Update

Write Update

A Bus Snooping Protocol that immediately updates all copies in all caches. The writing processor broadcasts the new data over the bus; all caches that contain copies of the data are then updated.

Write-Invalidate

A Bus Snooping Protocol. The writing processor renders all copies in the caches of all other processors invalid before it changes its local copy. This is done by sending an invalidation signal over the bus, which causes all other caches to check for (and invalidate) their copy of the block. Once the other copies have been invalidated, the local copy can be updated freely until another processor requests the block.
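
A hedged Python sketch of write-invalidate snooping; the class and method names are illustrative, and only a valid/invalid flag is modeled (no full MESI-style protocol):

```python
# Toy write-invalidate model: a write broadcasts an invalidation over the
# bus before the local copy is updated.
class Bus:
    def __init__(self):
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def broadcast_invalidate(self, addr, source):
        for cache in self.caches:
            if cache is not source:
                cache.snoop_invalidate(addr)

class SnoopingCache:
    def __init__(self, bus):
        self.lines = {}       # block address -> (value, valid flag)
        self.bus = bus
        bus.attach(self)

    def write(self, addr, value):
        self.bus.broadcast_invalidate(addr, source=self)  # invalidate others first
        self.lines[addr] = (value, True)                  # then update local copy

    def snoop_invalidate(self, addr):
        if addr in self.lines:
            value, _ = self.lines[addr]
            self.lines[addr] = (value, False)             # mark our copy invalid

bus = Bus()
p0, p1 = SnoopingCache(bus), SnoopingCache(bus)
p1.lines[0x10] = ("old", True)   # P1 holds a copy of block 0x10
p0.write(0x10, "new")            # P0's write invalidates P1's copy
print(p1.lines[0x10])            # ('old', False)
```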

Page Table

A VM structure that stores the mapping between virtual addresses and physical addresses, with one page table entry for every virtual page. The virtual page number is used as an index into the table; each entry stores the base address of that page in physical memory. The page table is managed by the operating system and is located in main memory.

Fully Associative Cache

A cache structure in which a block can be placed in any location in the cache. On a lookup, the system must check the address against all the entries to determine a hit. Disadvantages: expensive, may increase hit time.

Direct Mapped Cache

A cache structure in which each memory location is mapped to exactly one location in the cache, determined by the address of the memory block. Disadvantage: fixed placement; only one possible cache location per block.
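
A hedged Python sketch of the direct-mapped address breakdown; the block size and block count are example values, not from the study set:

```python
# Splitting a byte address into tag / index / offset fields.
BLOCK_SIZE = 16   # bytes per block -> 4 offset bits
NUM_BLOCKS = 64   # blocks in cache -> 6 index bits

OFFSET_BITS = BLOCK_SIZE.bit_length() - 1   # log2(16) = 4
INDEX_BITS = NUM_BLOCKS.bit_length() - 1    # log2(64) = 6

def split_address(addr):
    offset = addr & (BLOCK_SIZE - 1)                    # low-order bits
    index = (addr >> OFFSET_BITS) & (NUM_BLOCKS - 1)    # middle-order bits
    tag = addr >> (OFFSET_BITS + INDEX_BITS)            # high-order bits
    return tag, index, offset

# Every memory block maps to exactly one cache line (its index).
print(split_address(0x1A2B4))   # (104, 43, 4)
```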

MISD (Multiple Instruction, Single Data)

A computer with multiple processors that can perform different operations (multiple instructions) on the same data (single data)

SIMD (Single Instruction, Multiple Data)

A computer with multiple processors. All processors simultaneously execute the same instruction on different data. Examples include vector processors and GPUs. Requires a single instruction stream but multiple data streams.
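
A hedged illustration of the SIMD idea in Python, assuming NumPy is available: one logical operation is applied across many data elements at once (NumPy may use real SIMD instructions internally, but this is an analogy, not a guarantee):

```python
import numpy as np

a = np.arange(8)                # [0 1 2 3 4 5 6 7]
b = np.ones(8, dtype=a.dtype)   # [1 1 1 1 1 1 1 1]
c = a + b                       # one operation, eight data elements
print(c)                        # [1 2 3 4 5 6 7 8]
```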

Pipelining

A datapath technique that splits instruction execution into separate steps (stages), each taking a single clock cycle. The processor starts fetching and executing the next instruction before the current one has completed.

Swap Space

A designated area of the hard drive that acts as an extension of RAM for virtual memory.

Physical Page

A fixed-length contiguous block of physical memory, described by a single entry in the page table.

Virtual Page

A fixed-length contiguous block of virtual memory, described by a single entry in the page table.

Amdahl's law

A formula used to find the maximum improvement possible by just improving a particular part of a system. Often used in parallel computing to predict the theoretical speedup when using multiple processors.

Virtual Memory (VM)

A method of indirection that allows the execution of processes requiring more memory than the actual size of RAM, done by reserving some hard-disk space to be used as memory. VM maps the virtual address generated by the CPU to its actual physical address, giving the program the illusion of essentially unlimited physical memory.

MIMD (Multiple Instruction, Multiple Data)

A parallel computer with multiple processors executing multiple instructions on multiple data items all at once. Requires multiple instruction streams and multiple data streams. Example: multicore processors (mid-2000s onward)

Cache Coherency

A shared-memory architecture problem: copies of the same memory block can exist in multiple caches and become inconsistent. Write-through and write-back policies alone do not fix this problem.

Translation Lookaside Buffer (TLB)

A small, special page-table cache that keeps track of recently used address mappings to avoid doing a page-table lookup in RAM. When a VA is generated, the TLB is checked for the physical-address mapping; if it is not there, the page table in memory is accessed and the entry is brought into the TLB (the same process as any other cache). To be fast, the TLB must be small; it is often fully associative (not mandatory).
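
A minimal Python sketch of a TLB sitting in front of a page table; the dict-based fully-associative representation, the page size, and all mappings are illustrative assumptions:

```python
PAGE_OFFSET_BITS = 12                    # assume 4 KiB pages

page_table = {0x1: 0x7, 0x2: 0x3}        # VP# -> PP#, maintained by the OS
tlb = {}                                 # small cache of recent mappings

def translate(vaddr):
    vpn = vaddr >> PAGE_OFFSET_BITS
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    if vpn in tlb:                       # TLB hit: no page-table access
        ppn = tlb[vpn]
    else:                                # TLB miss: consult the page table
        ppn = page_table[vpn]            # a missing entry here = page fault
        tlb[vpn] = ppn                   # bring the entry into the TLB
    return (ppn << PAGE_OFFSET_BITS) | offset

print(hex(translate(0x1ABC)))            # 0x7abc (offset unchanged)
```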

Speculation

Advanced pipelining technique that computes what the result would be from executing the next instruction(s). If the branch prediction was correct, the result is used; otherwise it is discarded (speculate on the outcome and execute the program as if the guesses are correct). Work is done before it is known whether it is needed, to prevent the delay that would result if the work were needed but not yet done. Improves performance by reducing the impact of conditional branches. Speculation may be done in the compiler or by hardware.

Out of Order Execution (OOE)

Advanced pipelining technique that uses in-order instruction fetch but allows an instruction to begin execution as soon as its data operands are available (assuming resources are available). Commit is still performed in order. Can introduce WAR and WAW data hazards. Requires multiple functional units and various scheduling algorithms.

Superscalar

An approach to increasing the potential amount of instruction-level parallelism by replicating internal components (duplicate hardware) so that multiple instructions can be launched in every pipeline stage. Results in decreased CPI. *Limitations on which instructions can be issued in parallel*

Branch Prediction

An approach to solving control hazards by assuming a given outcome and proceeding without waiting to see the actual branch outcome. Performance depends on the accuracy of the prediction and how soon you can check the prediction. Two approaches: Static Prediction & Dynamic Prediction.
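
A hedged Python sketch of a classic dynamic-prediction scheme, the 2-bit saturating counter (one counter per branch; all names are illustrative):

```python
# 2-bit saturating counter: it takes two consecutive mispredictions to
# flip the prediction direction.
STRONG_NT, WEAK_NT, WEAK_T, STRONG_T = 0, 1, 2, 3

class TwoBitPredictor:
    def __init__(self):
        self.state = WEAK_T                 # initial guess: taken

    def predict(self):
        return self.state >= WEAK_T         # predict taken in states 2 and 3

    def update(self, taken):
        if taken:
            self.state = min(self.state + 1, STRONG_T)
        else:
            self.state = max(self.state - 1, STRONG_NT)

p = TwoBitPredictor()
for outcome in [True, True, False, True]:   # actual branch outcomes
    print(p.predict(), outcome)             # prediction vs. reality
    p.update(outcome)
```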

Structural Hazard

An attempt by two different instructions to use the same resource at the same time. Solved by duplicating hardware, stalling, or splitting a resource's use across the two halves of the clock cycle (e.g., writing the register file in the first half and reading it in the second).

Control Hazard

An attempt to make a decision about program control flow before the condition has been evaluated and the new PC target address calculated (branch instructions). Possible Approaches: Stall, Move Branch Decision Point, Delay Decision, Predict.

Data Hazard

An attempt to use data before it is ready (an instruction's source operand(s) are produced by a prior instruction still in the pipeline). Types: RAW (the only one that arises in the basic in-order pipeline), WAR, WAW. Solved through forwarding and stalling.

Side Channel Attack

An unintended pathway that leaks information from one entity to another. Can allow an attacker to infer information by observing non-functional characteristics of the system (e.g., execution time or memory consumed).
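
A hedged Python illustration of one such channel, execution time: an early-exit string comparison leaks how many leading characters of a guess match a secret. The secret, the per-character delay, and all names are contrived for demonstration:

```python
import time

SECRET = "hunter2"

def insecure_equals(guess, secret=SECRET):
    for g, s in zip(guess, secret):
        if g != s:
            return False         # early exit: elapsed time reveals progress
        time.sleep(0.001)        # exaggerate per-character work
    return len(guess) == len(secret)

for guess in ["xunter2", "huxter2", "hunter2"]:
    start = time.perf_counter()
    insecure_equals(guess)
    elapsed = time.perf_counter() - start
    print(guess, f"{elapsed:.4f}s")   # longer time -> closer guess
```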

Tag Bits

Bits that uniquely identify each block that can map to a particular cache location; the high-order address bits left after removing the index and offset bits.

Why is branch prediction important?

Branch prediction is essentially an optimization problem where the emphasis is on achieving the lowest misprediction rate, low power consumption, and low complexity with minimum resources. It is the most efficient solution to control hazards.

Deep Pipelines

Clock speed can be increased if the pipeline stages are subdivided → deeper pipelines with more but shorter stages. Each instruction takes more clock cycles to complete, and the processor still completes at most one instruction per cycle, BUT there are more cycles per second. Shorter stages mean higher clock frequencies → higher throughput. *Limitations on pipeline depth (register overhead, hazard penalties)*

Flynn's Taxonomy

Classification scheme for parallel architectures (SISD, MISD, SIMD, MIMD)

Moore's Law

Intel co-founder Gordon Moore's prediction that "the number of transistors incorporated in a chip will approximately double every 24 months."

Meltdown

Exploit of out-of-order execution that allows a user to "read" cached kernel data. Breaks the isolation between user applications and the operating system.

Spectre

Exploit of speculative execution/branch prediction that allows a user to "read" cached data from other applications. Breaks memory isolation between processes.

Memory Hierarchy (Memory vs. Speed vs. Cost)

Higher in Pyramid: smaller, faster, costlier per byte. Lower in Pyramid: larger, slower, cheaper per byte

Offset Bits

Identify each unique byte contained in the block; the # of bits required depends on the # of bytes in the block (based on low-order bits).

Page Faults

A miss in the page table: the entry is not valid, i.e., the page is not in RAM. Many VM design choices are motivated by the high cost of a page fault; a page fault serviced from disk takes millions of clock cycles to process.

Bus Snooping

The most popular class of cache-coherence protocols. All caches on the bus monitor (snoop) the bus to determine whether they hold a copy of the requested data block. Every cache keeps the sharing status of every block of physical memory it holds.

Motivations For Virtual Memory

Not enough RAM & program isolation

Amdahl's Law Equation

P: proportion of the program that can run in parallel. (1 - P): the proportion that remains serial. N: # of processors/cores. Max Speedup = 1 / [(1 - P) + P/N]
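
A worked Python example of the formula above; the P and N values are made up for illustration:

```python
# Amdahl's law: the serial fraction caps the achievable speedup.
def max_speedup(p_parallel, n_cores):
    return 1.0 / ((1 - p_parallel) + p_parallel / n_cores)

print(round(max_speedup(0.90, 8), 2))       # 4.71 with 8 cores
print(round(max_speedup(0.90, 10**6), 2))   # ~10.0: the serial 10% caps speedup
```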

How does pipelining improve performance?

Pipelining improves throughput and instruction-level parallelism, both of which improve performance; it does not shorten the latency of any single instruction.
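
A hedged back-of-the-envelope sketch in Python; the stage count and stage time are made-up example numbers:

```python
# Pipelining leaves per-instruction latency alone but shrinks the
# interval between instruction completions.
STAGES = 5
STAGE_TIME_NS = 2

latency_ns = STAGES * STAGE_TIME_NS   # 10 ns from start to completion
print(f"latency: {latency_ns} ns per instruction (unchanged by pipelining)")
print(f"unpipelined: one instruction completes every {latency_ns} ns")
print(f"pipelined (once full): one completes every {STAGE_TIME_NS} ns "
      f"-> ~{STAGES}x throughput")
```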

Shared Memory

The communication mechanism required by shared-memory multiprocessors for parallel problem-solving: processors communicate by reading and writing shared variables in a single shared address space.

SISD (Single Instruction, Single Data)

A single uni-core processor that executes a single instruction stream to operate on data stored in a single memory. Used in older-generation computers.

Forwarding

Take the result from the earliest point that it exists in any of the pipeline stage registers and forward it to the functional units that need it that cycle.

VA translation to PA

The VA size is determined by the computer architecture, whereas the PA size is determined by the RAM and the physical page #. Both share the same page offset, which plays no part in translation: only the VP # is needed to look up the PP # in the page table. The PP # plus the unchanged page offset form the physical address, which is then broken down into the tag, index, and offset for the cache.

Set Associative Cache

The cache is partitioned into sets of entries. Each memory location can only be stored in its assigned set. It can be stored in any cache entry in that set. On a lookup, the system needs to check the address against all the entries in its set to determine if there is a cache hit.
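
A hedged Python sketch of a 2-way set-associative lookup; the sizes are illustrative, block addresses are used directly (no byte offsets), and FIFO eviction stands in for a real replacement policy:

```python
NUM_SETS = 4
WAYS = 2
cache = [[] for _ in range(NUM_SETS)]   # cache[set] = list of (tag, data)

def lookup(block_addr):
    set_index = block_addr % NUM_SETS   # a block maps to exactly one set
    tag = block_addr // NUM_SETS
    for stored_tag, data in cache[set_index]:
        if stored_tag == tag:           # check every entry in the set
            return data                 # hit
    return None                         # miss

def insert(block_addr, data):
    set_index = block_addr % NUM_SETS
    tag = block_addr // NUM_SETS
    if len(cache[set_index]) >= WAYS:
        cache[set_index].pop(0)         # evict the oldest entry
    cache[set_index].append((tag, data))

insert(0x12, "A"); insert(0x16, "B")    # both land in set 2
print(lookup(0x12), lookup(0x16))       # A B
```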

Virtual Address Space

The complete range of addresses that the CPU thinks it has access to in RAM. Based on the computer architecture (e.g., a 32-bit architecture => the CPU sees 2^32 possible addresses).

Processor-Memory Performance Gap

The performance increase of processors over time is much greater than the performance increase of memory over time. This gap grows 50% each year with processor performance doubling every 1.5 years and memory performance only doubling every 10 years.

Physical Address Space

The range of actual addresses the processor has access to.

Pipelining, speculation, out-of-order execution Vulnerability

The user-privilege check is only done when an instruction commits, not while it executes out of order or along a speculated branch. Also, the speculative effects are wiped, but the cache contents are not.

Latency

Time from instruction start to completion, NOT reduced by pipelining.

Throughput

Total work done in a given time, increased by pipelining.

Index Bits

Used for determining block placement in the cache. # of bits required depends on the number of blocks in the cache (based on middle-order bits).

How is VM used to run a program larger than physical memory?

VM uses main memory as a "cache" for secondary storage.

How is isolation achieved using VM?

Virtual memory ensures a program can only read/write the portions of main memory assigned to it, and prevents programs from writing over each other.

Why the switch to multicore?

Diminishing single-processor performance gains, due to: ○ Power Wall - Moore's Law & Dennard scaling no longer apply. ○ Multicore chips are not necessarily faster than single-core chips, but overall performance improves by handling more work in parallel.

How to Avoid Page Faults

● Larger page size ● Organizations that reduce the page fault rate (e.g., increased associativity) ● Handling page faults through software ● Using write-back (add a dirty bit)

Side Channel Attack Categories

● Timing Attacks ● Power Analysis ● Acoustic Analysis ● Electromagnetic Analysis

