CSC 364 Final Review

how does associative mapping for cache work [6.2 p18 ex]

each main memory block can be loaded into any cache line, where cache control logic interprets the memory address as tag and word fields, where the tag uniquely identifies a memory block. (Therefore, there is no way to determine the cache line from the memory address, because the exact line varies by block) [tag and memory block are still both stored in the line together, but the tag is significantly larger than with direct mapping]

how does direct mapping for cache work [6.2 p12-15 diagrams and ex]

each main memory block maps into a unique line of cache, where cache line i receives block based on main memory block # and # lines in cache (i = j % m)
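
The i = j % m rule can be sketched in Python (the line count m = 128 is an assumed example value, not from the slides):

```python
m = 128  # number of cache lines (assumed example value)

def cache_line(j, m):
    """Direct mapping: main memory block j always maps to line i = j % m."""
    return j % m

# Blocks j, j + m, j + 2m, ... all contend for the same cache line:
assert cache_line(5, m) == 5
assert cache_line(5 + m, m) == 5
assert cache_line(5 + 2 * m, m) == 5
```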

what is the random access memory access method

every addressable location in memory has a unique, physically wired addressing mechanism, where any location can be selected at random and directly addressed and accessed. Time to access a given location is independent of sequence of prior accesses (constant). [main memory and some cache system use this]

explain the concept of virtual memory

facility that allows programs to address memory logically, regardless of the amount of physical main memory available. This allows instructions to contain virtual memory addresses instead of physical ones, where read/write commands to main memory use a memory management unit (MMU) to translate virtual to physical addresses when needed

what are the functions of the QPI link layer

flow control (prevents sending data faster than receiver can process, and clears buffers for more incoming data) and error control (detects and recovers from bit errors, and isolates higher layers from experiencing bit errors (though bits transmitted at physical layer are occasionally changed during transmission due to noise, etc))

how does SDRAM work to read/write compared to DRAM

for DRAM, the processor presents addresses and control signals to memory to indicate that a set of data at an address should be read from/written into DRAM. After a delay (the access time) the DRAM reads/writes the data, but during the delay the DRAM performs internal functions (activating capacitors, sensing data, routing data out through output buffers), causing the processor to wait. SDRAM, however, is synchronized to a clock, so the processor's instruction and address are latched by the SDRAM, which replies after a set number of clock pulses, allowing the processor to do other work while waiting

describe the structure/states of SRAM [diagram 6.3 p6]

four transistors cross connect to produce stable logic state. Logic state 1 [C1 high/C2 low; T1/T4 off, T2/T3 on] and logic state 0 [C1 low/C2 high; T1/T4 on, T2/T3 off], where both states are stable as long as direct current (DC) voltage applied

describe the two categories of semiconductor memory errors

hard failure (permanent physical defect preventing memory cell(s) from reliably storing data (stuck or switching uncontrollably), unfixable), vs soft errors (random, nondestructive event altering memory cell(s) contents w/out damaging the cell(s), caused by power supply issues or alpha particles)

what is the peripheral component interconnect (PCI)

high bandwidth, processor-independent bus that can function as a mezzanine or peripheral bus to deliver better system performance for high speed I/O subsystems (ex: graphic displays adapters, network interface controllers, and disk controllers)

how does increasing the memory block (and therefore cache line) size for cache impact the "hit" rate

hit ratio, as block size increases, will also increase at first because data near a referenced word (aka in the same block) are likely to be referenced in the near future; however, making blocks too large causes the probability of using the newly fetched data to become less than the probability of reusing the information that has to be replaced [larger blocks reduce the # of blocks the cache can hold, and each additional word in a block is further from the initially requested word and less likely to be needed] [8-64 bytes is optimum]

when a cache block is set to be replaced, what two cases must be considered

if the old cache block has not been altered (simply overwrite it) or if 1+ write operations have been performed on a word in that cache line (write out that line of cache to main memory before overwriting it with new block)

what is the associative memory access method

instead of address, a word is retrieved based on a portion of its contents, where each location has its own addressing mechanism and retrieval time is constant (independent of location or prior accesses)

one of the most critical system bottlenecks when using high-performance processors is __, where DRAM is constrained in this way by __

interface to main internal memory; constrained by its internal architecture and its interface to processor's memory bus

how is the capacity memory characteristic represented

internal memory capacity expressed in words (8, 16, 32 bit), external is expressed in bytes (megabytes, gigabytes, etc)

what is the direct access memory access method

involves a shared read-write mechanism where each record (unit of data) has a unique address (physical location), so the general location of the data can be found followed by a linear access to get the rest of the way (access time varies)

what does the address bus do and what is the importance of its width

it designates source or destination of data from data bus (8, 16, or 32 bit word size for the addresses), and used to address I/O ports (higher order bits select module, lower select memory location or I/O port in module). Width determines max memory capacity of the system

what is the key characteristic of a bus, and why/how does this limit their use

it is shared transmission medium, so signals transmitted by one device can be received by any other, meaning multiple transmissions on bus at same time will overlap and be garbled (one transmit at a time)

what is the difference between large and small caches

larger caches have a larger number of gates involved in addressing, so they are slower than small ones

functions of QPI routing layer

layer defined by firmware, used to describe the possible paths a packet can follow and determine which available system interconnection path it should traverse

describe the 3 most common replacement algorithms for cache when associative or set associative mapping were used

least recently used [(LRU) most effective, replace the block with the least recent reference], first-in-first-out [(FIFO) round-robin or circular buffer to replace block that has been in cache longest], least frequently used [(LFU) associate a counter with each line to replace the block in the set with the fewest references total]
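
A minimal Python sketch of LRU for one cache set (the set capacity and block tags are illustrative assumptions, not from the slides):

```python
from collections import OrderedDict

class LRUSet:
    """One cache set with LRU replacement; insertion order tracks recency."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()            # tag -> data, oldest first

    def access(self, tag, data=None):
        if tag in self.lines:                 # hit: mark most recently used
            self.lines.move_to_end(tag)
            return True
        if len(self.lines) >= self.capacity:  # miss on full set: evict LRU
            self.lines.popitem(last=False)
        self.lines[tag] = data
        return False

s = LRUSet(2)
s.access("A"); s.access("B")
s.access("A")                                 # "A" is now most recent
s.access("C")                                 # evicts "B", least recently used
assert "B" not in s.lines and "A" in s.lines
```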

where are the locations of caches when using a multilevel cache and their differences

level 1 internal, on-chip cache is located on same chip as processor due to logic density of microchips getting smaller and eliminating bus use for cache access; other cache are typically external and accessed over bus, though it is possible to have two or three on the processor with some modern chips

how does a multi-level cache organization work

level 1 through n, where level 1 cache is closest to processor (fastest) and level n cache is furthest away (slowest, closest to main memory)

the key to success of memory hierarchy is in decreasing frequency of access, which is based on the principle of __. How does that principle work

locality of reference: during program execution, memory references by the processor cluster (loops and subroutines that repeat, or table/array operations involving access to a clustered set of data words). While the clusters in use change over long periods, over a short time the processor primarily works with fixed clusters, so these frequently used clusters of instructions/data are stored in the higher levels of the memory hierarchy for shorter access time (data organized so the % of accesses to the higher levels is larger than to the lower levels, so the most-requested data is fastest to access)

what is the location memory characteristic

location refers to if memory is internal (processor registers, cache, main memory) or external (peripheral store devices via I/O controllers)

list the key characteristics of the computer memory system

location, capacity, unit of transfer, access method, performance, physical type/characteristics, organization

what is the difference between logical and physical caches [6.2 p9 diagram]

logical cache stores virtual addresses, so processor accesses cache directly, without going through MMU. Physical cache stores data using main memory physical addresses

what four address spaces does PCIe transaction layer support

memory (main memory and PCIe I/O devices), I/O (reserved memory address ranges for legacy PCI devices), configuration (read/write configuration registers associated with I/O devices), message (control signals for interrupts, error handling, power management)

how does addressing the cache work when direct mapping was used to store blocks

memory address (s+w bits) is divided into fields. The w least significant bits are the word/byte field (specify which of the w words in the block to access). The other s bits specify one of 2^s blocks of memory, where r of the s bits is the line field (specify which of 2^r cache lines the block is stored in), and the other s-r bits (most significant bits) are the tag field (specify which block of the 2^(s-r) blocks that can be mapped to that cache line is mapped there)
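
This field split can be sketched with bit operations (the widths w = 2, r = 14, s = 22 are assumed example values):

```python
w = 2    # 2^w = 4 words per block
r = 14   # 2^r = 16384 cache lines
s = 22   # 2^s main memory blocks, so the tag is s - r = 8 bits

def split_address(addr):
    """Split an (s + w)-bit address into direct-mapping fields."""
    word = addr & ((1 << w) - 1)             # w least significant bits
    line = (addr >> w) & ((1 << r) - 1)      # next r bits: cache line
    tag  = addr >> (w + r)                   # s - r most significant bits
    return tag, line, word

# Two addresses differing only in the tag map to the same cache line:
a1 = (0x12 << (r + w)) | (0x1234 << w) | 0b01
a2 = (0x34 << (r + w)) | (0x1234 << w) | 0b01
assert split_address(a1)[1] == split_address(a2)[1] == 0x1234
assert split_address(a1)[0] != split_address(a2)[0]
```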

what is the sequential access memory access method

memory organized into data units (records), where access must be made in a specific linear sequence, so access time varies

what are some typical control lines on control bus

memory read/write, I/O read/write, bus grant/request, interrupt request, clock, reset

what five types of transfers must the interconnection structure support

memory-to-processor (processor reads instruction/data from memory), processor-to-memory (processor writes unit of data to memory), I/O-to-processor (read data via I/O module), processor-to-I/O (send data to I/O device), I/O to or from memory (I/O module use direct memory access without processor)

read only memory (ROM) is nonvolatile memory typically used for applications in

microprogramming, system programs, function tables, library subroutines (for frequently used functions)

when a cache block is set to be replaced but needs to be saved first, what two problems must be considered

multiple devices may have main memory access (an I/O device may have modified the main memory data separately from the equivalent cache block being modified) or multiple processors can be attached to the same bus and each processor has a separate cache (altering a word in one cache can invalidate the others)

what are the three main characteristics of Intel's QuickPath interconnect (QPI) (point-to-point interconnection)

multiple direct connections (direct pairwise connections b/t system components to eliminate need for arbitration found in shared transmission systems), layered protocol architecture (layered protocol architecture for processor interconnections instead of control signals), packetized data transfer (data sent as packets with control headers and error control codes instead of a raw data stream)

what are the cache size rules

must be small enough so overall average cost per bit is close to that of main memory alone, but large enough so overall access time is close to that of cache alone

when referring to direct mapping with a memory address divided into s+w bits, what are these equal to: number of addressable units, block/line size, # cache lines, cache size

number of addressable units = 2^(s+w) words or bytes; block/line size = 2^w words or bytes; # cache lines = 2^r (r= length of line field of address); cache size = 2^(r+w) words or bytes

what is the point of a replacement algorithm for cache, and what other cache characteristic does it depend on

once cache is filled and a new block is brought in, an existing block must be replaced depending on the type of mapping used. For direct, every block has a specific line so there is no choice, but for associative and set associative mapping, need a replacement algorithm

what are the units of transfer used by physical, link, and protocol layers of QPI architecture [diagram lesson 5 p12]

physical (20-bit Phit), link (80-bit flit), protocol (packets comprised of integral number of flits)

what is PCI express

point-to-point interconnection scheme intended to replace PCI bus schemes. Key requirement is high capacity to support needs of higher data rate I/O devices (ex: gigabit ethernet), and has requirement dealing with need to support time dependent data streams

how does the cache read operation work [6.2 p5 flowchart of process]

processor generates read address (RA) of word to be read, where if the word is in cache it is fetched and delivered to processor. If not, block is retrieved from main memory, where it is [these two occur simultaneously] loaded into cache line and delivered to CPU

describe the two write policies for when writing over a data in cache

write through (simplest way; apply all write operations to both cache and main memory so main memory is always valid; any other processor-cache modules can monitor traffic to maintain their caches as well, but disadvantage b/c memory traffic gets heavy) and write back (only update cache to minimize writes to memory; "use" bit set if updated in cache, where if that line of cache is being replaced, main memory is only updated if the use bit is 1; disadv b/c portions of main memory can be invalid, so I/O accesses must go through the cache, requiring complex circuitry and creating a potential bottleneck)
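
A toy Python sketch contrasting the two policies on a single line (the dicts stand in for cache/main memory; addresses and values are made up):

```python
memory = {0x100: 7}
cache  = {}                                   # addr -> {"data": ..., "dirty": bool}

def write_through(addr, value):
    cache[addr] = {"data": value, "dirty": False}
    memory[addr] = value                      # main memory always valid

def write_back(addr, value):
    cache[addr] = {"data": value, "dirty": True}   # only set the "use"/dirty bit

def evict(addr):
    line = cache.pop(addr)
    if line["dirty"]:                         # write out only if modified
        memory[addr] = line["data"]

write_back(0x100, 42)
assert memory[0x100] == 7                     # memory stale until eviction
evict(0x100)
assert memory[0x100] == 42

write_through(0x200, 9)
assert memory[0x200] == 9                     # memory updated immediately
```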

how does electrically erasable PROM work

(EEPROM) read-mostly memory able to be written to w/out erasing all contents, but the write takes several hundred microseconds longer than read. More expensive and less dense (fewer bits per chip) than EPROM, but advantage in being updated in a standard fashion

how does optically erasable PROM work

(EPROM) read/written to electrically like PROM, but to write, all storage cells must first be erased to their initial state by exposing the packaged chip to ultraviolet radiation (more expensive than PROM, but advantage of being update-able)

how does DDR SDRAM improve upon the data rate compared to SDRAM

(double data rate SDRAM) data transfer synchronized w/both rise and fall of clock (SDRAM is only rise); uses higher clock rate on bus to increase transfer rate; prefetch buffering scheme used (buffer is a memory cache located on the SDRAM chip that allows it to preposition bits to place them on data bus faster)

describe the basics of how DRAM works

(dynamic random access memory) made of cells that store data as charge (1) or lack of (0) on capacitor. Dynamic b/c capacitors discharge naturally, so charges must be refreshed periodically

what is the access time memory performance parameter

(latency) for RAM, the time it takes to perform a read/write operation; for non RAM (random access memory), time it takes to position the read-write mechanism at the desired location

what is PROM and how does it differ from standard ROM

(programmable ROM) is less expensive than ROM. Nonvolatile and only written to once, used when small # ROMs w/particular memory content is needed. Flexible and convenient b/c writing is performed electronically and can be performed by supplier or customer

what is external/nonvolatile memory used for

(secondary (auxiliary) memory) used to store programs and data files, visible to programmer in terms of files and records (not words or bytes)

describe the basics of how SRAM works

(static random access memory) digital device using logic similar to processor, storing binary values with flip-flop logic gate and holding the data as long as power supplied (no need to recharge constantly b/c no capacitor)

what is the general concept of SDRAM

(synchronous DRAM) exchanges data w/processor synchronized to external clock signal (run at full speed of processor/memory bus w/out imposing wait states).

describe the QPI four-layered architecture [diagram lesson 5 p12]

[bottom to top] physical (actual wire carrying signals, and circuitry and logic to support ancillary features required to move 1's and 0's), link (responsible for reliable transmission and flow control), routing (provides framework for directing packets through fabric), protocol (high-level set of rules for exchanging packets of data b/t devices)

what are the layers of the PCIe interconnection system [lesson 5 p17]

[bottom to top] physical (actual wires carrying signals, and circuitry logic to support ancillary features required to transmit/receive 1's and 0's), data link (responsible for reliable transmission and flow control), transaction (generate/consume data packets used to implement load/store data transfer mechanisms; also manages flow control of packets b/t two components on a link)

describe memory hierarchy in terms of RAM; and what does going from top to bottom change [diagram 6.3 p18]

[top to bottom] Static RAM (SRAM, used as cache), DRAM (off-chip main memory), flash memory (NAND flash, nonvolatile), hard disk (external storage); moving top to bottom, have decrease in performance and endurance, as well as decreasing cost per bit and increasing capacity/density

what are the memory hierarchy components [diagram 6.1 p11] and what changes occur moving from top to bottom of pyramid

[top to bottom] inboard (registers, cache, main memory), outboard (magnetic disks), off-line (magnetic tape); from top to bottom, there is a decreasing cost per bit and frequency of memory access by processor, but an increasing capacity and access time

how does unit of transfer memory characteristic apply to internal and external memory

[unit of transfer relates to capacity] for internal, the unit of transfer equals the number of electrical lines in/out memory module (=# bits read out/written into memory at a time); for external, data is transferred in units larger than words, called blocks (64, 128, 256 bits)

what is read mostly memory and the 3 most common forms of it

a ROM variation used when read operations are more frequent than writes, but still need memory to be nonvolatile (optically erasable PROM (EPROM), electrically erasable PROM (EEPROM), and flash memory)

computers consist of a set of components or modules (CPU, memory, I/O), that communicate with each other, so a computer is essentially...

a network of basic modules connected by paths, where the collection of paths is called the interconnection structure

how does set associative mapping for cache work (including the memory address information)

compromise b/t direct and associative mapping, where cache control logic interprets a memory address of (s+w) bits into tag (s-d bits), set (d bits), and word (w bits) fields. s bits of tag and set represent one of 2^s memory blocks, while the d set bits specify one of 2^d sets that the cache is divided into (each set contains a # of lines, and a given block maps to any line in a given set). Tag is much smaller than in associative mapping, and is only compared to other tags within the single set that the block can be mapped to
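
The set-associative split can be sketched the same way (d = 13 set bits, w = 2 word bits, s = 22 are assumed example values):

```python
w = 2    # 2^w = 4 words per block
d = 13   # 2^d sets in the cache
s = 22   # 2^s main memory blocks, so the tag is s - d = 9 bits

def split_set_assoc(addr):
    """Split an (s + w)-bit address into set-associative fields."""
    word = addr & ((1 << w) - 1)
    set_idx = (addr >> w) & ((1 << d) - 1)   # which set (any line within it)
    tag = addr >> (w + d)                    # compared only within that set
    return tag, set_idx, word

# Blocks with equal set bits land in the same set, distinguished by tag:
a1 = (0x05 << (d + w)) | (0x0AAA << w)
a2 = (0x1F << (d + w)) | (0x0AAA << w)
assert split_set_assoc(a1)[1] == split_set_assoc(a2)[1] == 0x0AAA
assert split_set_assoc(a1)[0] != split_set_assoc(a2)[0]
```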

how many lines can the data bus have, and why is the width of the data bus important

consists of 32, 64, or 128 one-bit lines for moving data b/t system modules. Width of data bus is key in determining overall system performance (ex: 32-bit data bus but 64-bit instructions means memory must be accessed twice per instruction cycle)

list various cache characteristics that differentiate different cache architectures

cache addresses, cache size, mapping function, replacement algorithms, write policy, line size, number of caches

functions of QPI protocol layer

cache coherency protocol, which deals with ensuring main memory values held in multiple caches are consistent

main memory consists of up to 2^n words, each with unique n-bit addresses. How does the build of the cache relate to this [6.2 p4 diagram]

cache consists of m blocks (called lines), where each line consists of K words plus a small tag (portion of main memory address correlating to the block), and control bits to indicate if line has been modified since entering cache. # cache lines << # main memory blocks

describe the organization of the cache in relation to the other components it is connected to and interacts with [6.2 p6 diagram]

cache is connected to processor through address, data, and control lines [data and address lines connect to data and address buffers, which are attached to system bus for main memory accesses if "miss" to send address to main memory then retrieve data to send to both cache and processor]. If "hit", data and address buffers are disabled, no system bus needed

what are the three factors of design constraints in the memory hierarchy and their relations to each other

capacity (how much), access time (how fast), cost (how expensive). Faster access time=greater cost/bit; greater capacity= both smaller cost per bit and slower access time

how does the computer's memory module work and what are the inputs and outputs of it

consists of N words of equal length assigned unique numerical addresses (0 to N-1). Inputs [read and write control signals, address (location of read/write operation), data (to store if a write command)] and Outputs [data (if read command)]

how does the computer's I/O module work and what are the inputs and outputs of it

control 1+ external devices (each interface to external device is a port with address 0 to M-1) using its two operations, read and write. Inputs [read and write control signals, address (port/specified device), internal and external data] and Outputs [internal data, external data, interrupt signals] (external data paths for inputs and outputs of data with another external device)

what does control bus do

control lines transmit control signals with command and timing information among system modules to control access/use of data and address buses to avoid collisions (timing signals indicate validity of data/address info, and command signals specify operations to perform)

what are data packets generated and consumed by data link and transaction layers called

data link (data link layer packets, DLLPs), transaction (transaction layer packets, TLPs, with 32- or 64-bit addressing)

how does the computer's interconnection structure work with its modules (what are the three main modules)

designed based on the exchanges that must be made among modules, where the types of exchanges are dictated by the forms of inputs and outputs of each module (three modules: memory, I/O, processor)

what is the purpose of cache memory (as in what is it designed to do)

designed to combine memory access time of expensive, high speed memory (registers) with the large memory size of less expensive, lower-speed memory (main memory)

why do buses consist of multiple communication lines

each line can only transmit signals to represent a binary 1 or 0, so several lines are needed to transmit several bits at once

modern main memory systems are normally implemented with semiconductor chips. What is the basic unit of semiconductor memory and its properties

the memory cell, which exhibits two stable states and can be written to (to set the state) or read (to sense the state)

what is the purpose of using mapping functions

to map blocks of main memory into cache lines (b/c # cache lines << # memory blocks), and to determine which main memory block currently occupies a cache line

what are the two main functions of the PCIe transaction layer

to receive read/write requests from software above the layer and create request packets for transmission to a destination via the link layer (below it)

how do sending and requesting data operations of bus work

to send data to another module, obtain use of bus and transfer data through it. To request data from another module, obtain bus use, transfer request to other module via appropriate control and address lines, then wait for module to send the data

what are the differences between unified and split caches

unified cache stores references to both data and instructions (higher hit rate, balances instruction and data fetches automatically, and only one cache is needed); while split involves using two caches to separate instructions and data (both exist at same level, like level 1) (eliminates contention b/t the instruction fetch unit and the execution unit b/c there are two separate caches to access, which is important for pipelining) [typically split used for level 1 and unified for higher levels]

during flow control function of QPI link layer, what is a credit scheme

used to control flow of data, where during initialization, sender is given a set number of credits to send flits to a receiver. When flit is sent to receiver, sender decrements its credit count. When buffer is freed at receiver, credit is returned to sender from that buffer. (receiver controls pace of data transmitted over QPI link)
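
A hedged Python sketch of the credit scheme (the buffer count and flit labels are illustrative):

```python
class Link:
    """One direction of a QPI-style link with credit-based flow control."""
    def __init__(self, receiver_buffers):
        self.credits = receiver_buffers       # credits granted at initialization
        self.rx_buffer = []

    def send_flit(self, flit):
        if self.credits == 0:
            return False                      # no credits: sender must wait
        self.credits -= 1                     # spend one credit per flit sent
        self.rx_buffer.append(flit)
        return True

    def receiver_consume(self):
        self.rx_buffer.pop(0)                 # receiver frees a buffer...
        self.credits += 1                     # ...and returns its credit

link = Link(receiver_buffers=2)
assert link.send_flit("f1") and link.send_flit("f2")
assert not link.send_flit("f3")               # out of credits: blocked
link.receiver_consume()
assert link.send_flit("f3")                   # credit returned, can send again
```

The receiver never overflows because the sender can never have more flits in flight than the receiver has buffers, which is exactly the "receiver controls the pace" property.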

what types of memory are used by the inboard memory

uses volatile memory, implemented with semiconductor tech (three levels varying in speed and cost, each with different semiconductor memory types)

what are the physical characteristics of data storage and some examples of each

volatile (info decays naturally or is lost when electrical power is off) vs nonvolatile (info once recorded remains without deterioration until deliberately changed, regardless of electrical power); ex: magnetic-surface memory (nonvolatile), semiconductor memory (either), nonerasable memory (can't be altered outside destruction; semiconductor memory of this type is ROM)

describe the two distinct types of flash memory [diagram 6.3 p14]

NOR flash memory [basic unit of access is the bit (memory cell), where cells are connected in parallel to the bit line to read/write/erase each individually. Bit line goes low if any memory cell is turned on by its unique word line (hence NOR)]; NAND flash memory [organized in transistor arrays w/16 or 32 in series, where the bit line goes low only if all of the word's transistors are turned on (hence NAND)]

what are the three performance parameters of the performance memory characteristic

access time (latency), memory cycle time, transfer rate

what is the memory cycle time memory performance parameter

access time plus additional time before 2nd access can commence. Additional time may be required for transients to die out on signal lines or to regenerate data if they are read destructively. [concerned with system bus, not processor]

describe the DRAM cell structure and (read/write) operations [diagram 6.3 p4]

address line is active when read/write command is to be conducted, where voltage in the address line closes the transistor (acts like a switch). Write operation occurs when voltage signal applied to bit line (high vs low voltage 1 vs 0), and signal applied to address line to transfer charge to capacitor. Read operation occurs when address line is selected to turn on transistor, allowing capacitor to feed out its stored charge to bit line and to a sense amplifier [capacitor must be recharged] [sense amplifier compares capacitor voltage to reference to get 1 or 0]

describe the cell operation for SRAM [diagram 6.3 p6]

address line open/close switch to control transistors T5/T6, where a signal on address line switches both transistors on to allow read/write. [write applies desired bit value to line B and complement to !B to force transistors T1-T4 into proper state] [read gets bit value from line B]

adv and disadv to associative mapping for cache

adv (flexibility in which block to replace when reading a new block into cache); disadv (to check if a block is already in cache, every cache line must be checked simultaneously for a match, which requires complex circuitry)

advantage and disadvantage to direct mapping cache

adv (simple and inexpensive to implement); disadv (fixed cache location for any block, so if two blocks that map to the same line are consistently referenced, the blocks will be continually swapped, causing a low hit rate (called thrashing))

for soft errors, what is the process for preparing to and checking for error corrections [diagram 6.3 p21] [also review hw 4 hamming code and slides 22-25]

before data stored in memory, a calculation (f) is performed to produce a code, which is stored with the data in memory (M bit word + k bit code). When data is read, the memory M bits are put back into f to get the code again, and the original and new versions of the code are compared. This comparison yields 3 results [no error detected so data sent out; error detected and can be fixed so is passed into corrector before sending it out; error detected but not fixable, so report this]
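
As one concrete (assumed) choice of the function f, a Hamming(7,4) code illustrates the store/recompute/compare flow and single-bit correction:

```python
def encode(d):
    """f: compute parity bits over 4 data bits; store the 7-bit codeword."""
    p1 = d[0] ^ d[1] ^ d[3]
    p2 = d[0] ^ d[2] ^ d[3]
    p3 = d[1] ^ d[2] ^ d[3]
    # codeword positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]

def check_and_correct(cw):
    """On read, recompute the checks and compare (the syndrome)."""
    c = cw[:]
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3       # 0 = no error; else error position
    if syndrome:
        c[syndrome - 1] ^= 1              # corrector: flip the bad bit back
    return syndrome, [c[2], c[4], c[5], c[6]]

stored = encode([1, 0, 1, 1])             # M bits + k check bits in memory
stored[4] ^= 1                            # soft error flips one stored bit
syndrome, data = check_and_correct(stored)
assert syndrome == 5 and data == [1, 0, 1, 1]   # error detected and fixed
```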

compare SRAM and DRAM

both volatile (require continuous power to preserve bits); Dynamic memory cell is simpler/smaller than static (more dense, meaning more cells in smaller area and less expensive), but it requires refresh circuitry (worthwhile for larger memories b/c of smaller cost of DRAM cells at high density). Static is faster than DRAM, used for small/fast memory like cache

what are the two main types of interconnection structures

bus (various multiple-bus structures) and point-to-point (with packetized data transfer)

what is the transfer rate memory performance parameter

rate at which data can be transferred in/out memory unit. For RAM, it is 1/(cycle time)

how does the computer's processor module work and what are the inputs and outputs of it

read in instructions and data, process them, and produce data to output. Inputs [instructions, data, interrupt signals] and outputs [address, control signals, data]

how does flash memory work (also in relation to other memory types)

read-mostly memory that is intermediate b/t EPROM and EEPROM in cost/functionality. Takes seconds to erase all contents (faster than EPROM) and can also erase individual blocks. Microchip organized so a section of memory cells is erased in a single action (hence "flash"). High density like EPROM b/c only one transistor per bit

describe the parts of the inboard memory in the memory hierarchy

registers (in CPU, faster, smallest, and most expensive; few dozen-100s), cache (improve performance by staging movement of data b/t main memory and CPU), main memory (principal internal memory system, where each location has unique address and extended with cache (smaller but faster))

how does split transaction technique for PCIe transaction layer work

request packet is sent out by source PCIe device (waits for completion packet response). Completion following a request is initiated by completer only when it has data and/or status ready for delivery. Each packet has unique identifier that enables completion packets to be directed to correct originator. (completion and request occur at different times, unlike bus method that need both lines at same time)

the semiconductor memory cell has three functional terminals capable of carrying an electrical signal, which are [6.3 p2 diagram]

select terminal (select the memory cell for read/write), control terminal (indicate read or write), data in or sense (data in terminal writes to cell a 0 or 1 with electrical signal; or sense terminal outputs state of cell)

what are the most common forms of physical memory types

semiconductor memory, magnetic surface memory, optical, magneto-optical

what are the four types of memory access methods (list them)

sequential access, direct access, random access, associative (first two external memory, last two are internal memory)

what does the emerging technologies memory hierarchy of RAM look like from top to bottom [diagram 6.3 p19]

spin-transfer torque (STT-RAM, either cache or main memory), phase-change (PCRAM, replace/supplement DRAM for main memory), resistive (ReRAM, replace/supplement secondary storage and main memory)

what connects the major computer components

the system bus connects them (CPU, memory, I/O)

how does the cache work in relation to the CPU and main memory

the cache contains a copy of portions of main memory, where when processor attempts to read a word of memory, a check is made to see if the word is in cache ("hit", cache word transfer to processor) or not ("miss", main memory block transfer (block of fixed number of words) to cache, which delivers word to processor)

