Computer Architecture Exam 1
This early computer used the stored program concept; its logical design was described in a 1945 report written by John von Neumann
EDVAC
The performance of the DEC VAX 11/780 computer was approximately _____ MIPS.
1
A memory system for a pipelined vector processing unit (VPU) is designed using 10ns RAM devices using a 4-way low-order interleave. What would be the effective time per main memory access under ideal conditions?
10ns/4 = 2.5ns
A computer that used (or will use) ferrite core technology in its main memory system would most likely have a manufacture date of:
1965
With what generation of computers is von Neumann most closely associated? Which machine(s) did he have a role in developing?
1st generation; EDVAC
A computer with a CPU clock cycle time of 0.5 ns runs at _____ MHz.
2000
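A quick way to check this kind of conversion is a small helper; the function name below is made up for illustration. It relies only on f = 1/T, so a period given in nanoseconds converts to MHz as 1000/T.

```python
def cycle_time_to_mhz(cycle_time_ns: float) -> float:
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    # f (Hz) = 1 / T (s). With T in ns, f (MHz) = 1000 / T.
    return 1000.0 / cycle_time_ns


print(cycle_time_to_mhz(0.5))  # 2000.0
```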
Kilo is 10 to the power of _____.
3
Define and Describe virtual memory
A method of abstracting the memory space so that each program appears to be the sole program running and to have access to the entire memory space.
Write-through [3]
A policy whereby writes to cached locations update main memory immediately/at the same time
Write-back
A policy whereby writes to cached locations update main memory when the line is displaced
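The difference between the two write policies can be seen by counting main-memory writes for a short trace of stores. This is a toy sketch under stated assumptions, not a model of any real cache: the cache is direct-mapped with one word per line, uses write-allocate, and flushes dirty lines at the end of the trace. The function name and the trace are invented for illustration.

```python
def count_memory_writes(write_addresses, num_lines, policy):
    """Count main-memory writes for a trace of store addresses.

    Toy model (assumption): direct-mapped cache, one word per line,
    write-allocate, dirty lines flushed when the trace ends.
    """
    lines = [None] * num_lines   # address currently held by each line
    dirty = [False] * num_lines  # only meaningful for write-back
    memory_writes = 0
    for addr in write_addresses:
        i = addr % num_lines
        if lines[i] != addr:
            # Miss: the current occupant of the line is displaced.
            if policy == "write-back" and dirty[i]:
                memory_writes += 1  # write the displaced line back
            lines[i] = addr
            dirty[i] = False
        if policy == "write-through":
            memory_writes += 1      # memory is updated on every store
        else:
            dirty[i] = True         # memory update is deferred
    if policy == "write-back":
        memory_writes += sum(dirty)  # final flush of dirty lines
    return memory_writes


trace = [0, 0, 0, 4, 0, 1, 1]
print(count_memory_writes(trace, 4, "write-through"))  # 7
print(count_memory_writes(trace, 4, "write-back"))     # 4
```

Repeated stores to the same cached location are where write-back saves main-memory traffic.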
Benchmark [3]
A program or set of programs used to compare the performance of various computer systems
Name at least one advantage of a virtually addressed cache.
A translation to a physical address does not have to occur in order to check the cache for a hit or miss.
Princeton Architecture [2]
A type of computer system design in which the CPU uses a common memory interface/bus for accessing instructions and data operands
Harvard Architecture [2]
A type of computer system design in which the CPU uses separate memory interfaces for accessing instructions and data operands
SAM
A type of memory organization that uses relative addressing to locate the next item to be accessed
Did Aiken's work directly address the problem of the von Neumann bottleneck? If so, tell how; If not, describe how his work contributed in some other way to the field of computer science.
Aiken solved the problem of the bottleneck with the Harvard Architecture. This gave the data and instructions their own respective memory and bus.
The _____ was a computer kit that made its appearance during the fourth generation of computing. Approximately 10,000 of these kits were sold to hobbyists who built their own computers.
Altair
Compatibility
An architectural attribute that expresses the support provided for "legacy" or other architectures
Ease of use [2]
An architectural quality attribute that reflects how well a given instruction set architecture supports systems programming
Expandability
An architectural quality attribute that reflects the ability of designers to create compatible implementations covering a range of price/performance points
Generality [2]
An architectural quality attribute that reflects the range of applications for which a machine is suitable // an architectural attribute that expresses the support provided for a wide variety of applications
Which of these virtual memory techniques uses a table(s) to map the addresses of variable-size regions of memory?
Both B and C: Segmentation, segmentation with paging
How are paging and segmentation similar?
Both suffer from page faults that occur when a segment or page is not resident in memory and must be retrieved from the disk.
The _____ was a supercomputer of the fourth generation; it was the product of a new company started by an engineer who formerly worked at Control Data Corporation.
CRAY-1
The dynamic RAM (DRAM) that is commonly used for main memory in computers is essentially a large array of _____ plus the circuitry used to decode, read, and write the information.
Capacitors
A(n) _____ is an image of the contents of main memory at the time when a particular program crashed.
Core dump
What is not part of the von Neumann execution cycle as discussed in class?
Decode operand register
The _____ of a storage device refers to the amount of information that can be stored in a given physical space or volume.
Density
A controller for a four-way set-associative cache that uses a(n) _____ replacement algorithm would replace the line that has been resident in cache for the longest time.
FIFO
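A minimal sketch of FIFO replacement within one four-way set, assuming lines are identified by their tags (the function name and the tag sequence are hypothetical). Note that a hit does not change the replacement order, which is what distinguishes FIFO from LRU.

```python
from collections import deque


def simulate_fifo_set(tags, ways=4):
    """Return the tags evicted from a single cache set under FIFO."""
    resident = deque()  # oldest resident line sits at the left
    evicted = []
    for tag in tags:
        if tag in resident:
            continue  # hit: FIFO order is not updated
        if len(resident) == ways:
            evicted.append(resident.popleft())  # replace the oldest line
        resident.append(tag)
    return evicted


print(simulate_fifo_set([1, 2, 3, 4, 1, 5, 6]))  # [1, 2]
```

Tag 1 is evicted first even though it was just referenced, because it has been resident the longest.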
Which generation of computers is Aiken most closely associated with?
First generation
Whetstones are associated with _____.
Floating point
Which line(s) of the direct-mapped cache could main memory location 1E0027A map into? (Give the line number(s), which will be in the range of 0 to (n-1) if there are n lines in the cache.) Give the memory address (in hexadecimal) of another location that could not reside in cache at the same time as this one (if such a location exists)
From Solution manual in book: To answer this question, we need to write the memory address in binary. 1E0027A hexadecimal equals 01111000000000001001111010 binary. We can break this down into a tag of 011110, an index of 00000000001001 and a byte offset within the line of 111010. In a direct-mapped cache, the binary index tells us the number of the only line that can contain the given memory location. So, this location can only reside in line 1001 (binary) = 9 decimal. Any other memory location with the same index but a different tag could not reside in cache at the same time as this one. One example of such a location would be the one at address 2F0027A.
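The breakdown above can be reproduced with shifts and masks. This sketch assumes the field widths used in that answer (6-bit tag, 14-bit index, 6-bit byte offset); the function and constant names are invented for illustration.

```python
# Field widths taken from the worked answer above.
TAG_BITS, INDEX_BITS, OFFSET_BITS = 6, 14, 6


def split_address(addr):
    """Split a 26-bit address into (tag, index, offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


print(split_address(0x1E0027A))  # (30, 9, 58)  -> tag 011110, line 9
print(split_address(0x2F0027A))  # (47, 9, 58)  -> same line, different tag
```

Both addresses map to line 9 but carry different tags, so in a direct-mapped cache they cannot be resident at the same time.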
Which of the following cache mapping strategies would be likely to achieve the highest hit ratio given the same program and data set?
Fully associative
At what university did Howard Aiken work?
Harvard University
A main memory system is designed using 10 ns RAM devices using an 8-way low-order interleave. Other than increased hardware cost and complexity, are there any potential disadvantages of using a low-order interleaved memory design? If so, discuss one such disadvantage and the circumstances under which it might be significant.
If the system has multiple processors, low-order interleaving would be bad because a single processor has the capability of using all of the bandwidth. It would have to be halted for another processor to run.
Key register [2]
In an associative memory, this determines the bit positions where matching is significant
Argument register
In an associative memory, this holds the value being searched for
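The roles of the two registers can be sketched in software: the argument register holds the search value and the key register masks which bit positions must match. Real associative memory compares all words in parallel in hardware; the loop here is only a sequential stand-in, and the function name and stored words are hypothetical.

```python
def associative_search(words, argument, key):
    """Return indices of stored words that match `argument`
    in every bit position where `key` has a 1."""
    return [
        i for i, word in enumerate(words)
        if ((word ^ argument) & key) == 0  # differing bits outside the key are ignored
    ]


stored = [0b1010_1100, 0b1010_0011, 0b0110_1100]
# Match on the upper four bits only.
print(associative_search(stored, argument=0b1010_0000, key=0b1111_0000))  # [0, 1]
```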
Name at least one advantage of a physically addressed cache.
It does not have to be flushed between context switches.
Suppose you are considering purchasing this system to run a program(s) that makes extensive use of floating-point operations on mostly vector and matrix operands consisting of real-number data. Of the widely-used benchmarks discussed in class, other than SPECfp, which do you think would be the best to use in comparing this system to others you are considering? (feel free to offer reasons if you think I may not agree with your choice.)
Linpack benchmark
How are paging and segmentation different?
Main difference is that paging has fixed sizes while segmentation has varying sizes. These may be combined by putting pages in segments. Also, paging suffers from internal fragmentation while segmentation suffers from external fragmentation.
Which machine(s) did Howard Aiken have a role in developing?
Mark-1 and Mark-2
MFLOPS stands for _____.
Millions of Floating Point Operations per second
Would you expect the hit ratio to remain constant in this computer system? If so, explain why; if not, name at least three factors that might increase or decrease the hit ratio.
No, it would not remain constant. For example: 1. The program has just begun, so its instructions are not yet cached. 2. The program jumps to a part of the code that has not yet been cached. 3. The program loads a file that has not yet been cached. Other factors: 1. How long the program has been running. 2. How many programs are in memory. 3. The type of data being operated on.
A(n) _____ memory is a random-access memory that can be accessed by word slice or by bit slice.
Orthogonal
Name and briefly describe the two principal approaches to implementing virtual memory systems.
Paging and segmentation. Paging uses fixed-size pages whose size normally depends on hardware considerations (e.g. hard drive sector size). Segmentation uses variable-size segments whose sizes normally depend on software considerations (e.g. length of code and data types).
In a system that has an on-chip MMU and off-chip cache, tags would be part of the _____ address.
Physical
What university did the chief architect of the von Neumann execution cycle work at?
Princeton University
What was the principal difference between the machine developed by Princeton's researchers vs. the one developed at Harvard? Why was this significant?
Princeton's architecture suffers from the "von Neumann bottleneck", where instructions and data are held in the same memory and share a common bus. The Harvard design eliminates this problem by giving each its own memory and bus. This is faster, and it is implemented in the caches of modern computers: the portion of cache that holds instructions and the portion that holds data can transfer to the CPU at the same time.
The Princeton and Harvard architectures were competing computer designs developed at two American universities. Name the prominent computing researcher, the generation of machines, and the specific machine most commonly associated with the introduction of each approach.
Princeton: John von Neumann, 1st generation, EDVAC. Harvard: Howard Aiken, 1st generation, Mark-1.
Advantage of virtual memory?
Programmers don't have to worry about corrupting the memory of another running program, because each program's memory is invisible to the others.
A(n) _____ is the basic unit of information transfer between a cache and the slower memory that sits behind it.
Refill line
An early computer designed by researchers at Harvard avoided the "von Neumann bottleneck" using which of these approaches?
Separate memories for instructions and data
Howard Aiken is a major figure in the history of digital computing. At what university did he work? With which generation of computers is Aiken most closely associated? Which machines did he have a role in developing?
He worked at Harvard. He was involved with the first generation. He developed the Mark I and Mark II.
A memory system for a pipelined vector processing unit (VPU) is designed using 10ns RAM devices using a 4-way low-order interleave. What would constitute "ideal conditions"? In other words, under what circumstances could the access time you just calculated be achieved?
The "ideal conditions" could include all addresses to be accessed sequentially
A memory system for a pipelined vector processing unit (VPU) is designed using 10ns RAM devices using a 4-way low-order interleave. What would constitute "worst-case conditions"? In other words, under what circumstances would memory accesses be the slowest? What would the access time be in this worst-case scenario?
The "worst-case conditions" might include all addresses being accessed in multiples of four, which would yield an access time of 10ns.
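The ideal and worst cases can be illustrated with a rough timing model. The assumptions here are mine, not from the course: address a lives in bank a mod 4, each bank is busy for the full 10 ns device time after an access starts, and a new access can be issued at most every 10/4 = 2.5 ns. The function name is hypothetical.

```python
def average_access_interval(addresses, banks=4, device_ns=10.0):
    """Steady-state time between successive accesses, in ns (toy model)."""
    issue_interval = device_ns / banks  # best-case spacing between accesses
    bank_free_at = [0.0] * banks        # time at which each bank is idle again
    next_issue = 0.0
    starts = []
    for addr in addresses:
        bank = addr % banks             # low-order interleave
        start = max(next_issue, bank_free_at[bank])  # wait if the bank is busy
        bank_free_at[bank] = start + device_ns
        next_issue = start + issue_interval
        starts.append(start)
    return (starts[-1] - starts[0]) / (len(starts) - 1)


sequential = list(range(100))        # stride 1: banks 0, 1, 2, 3, 0, ...
stride_four = list(range(0, 400, 4)) # stride 4: every access hits bank 0
print(average_access_interval(sequential))   # 2.5
print(average_access_interval(stride_four))  # 10.0
```

Sequential addresses rotate through all four banks and reach 2.5 ns per access, while a stride of four keeps hitting the same bank and falls back to the full 10 ns.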
What is the "von Neumann bottleneck"? Name and describe an approach that has been used to help alleviate this bottleneck.
The bottleneck is the single bus between the CPU and main memory, through which both data and instructions are accessed. The Harvard Architecture fixes this by including two memories, one for data and one for instructions. Each memory then has its own bus to the CPU.
What is the von Neumann execution cycle and what does it do? Describe the steps in this cycle.
The cycle describes the steps required to perform an operation (a machine-language instruction). The CPU accesses main memory through a single bus to collect the instruction and its data. 1. The instruction is fetched and decoded. 2. Operands are addressed and fetched. 3. The operation is performed and the result is stored in memory or a register. The cycle then repeats.
Physical address
The memory address that is used to access the main memory hardware
Which line(s) of the fully associative cache could main memory location 1E0027A map into? (Give the line number(s), which will be in the range of 0 to (n-1) if there are n lines in the cache.) Give the memory address (in hexadecimal) of another location that could not reside in cache at the same time as this one (if such a location exists)
The memory location [hexadecimal here] could be mapped to any of the lines from 0 to (n-1). There are no other addresses that could not reside in cache at the same time as [hexadecimal] because any address can be mapped to any line in a fully associative memory.
Name and describe the most significant performance-related problem associated with the von Neumann architecture. Did Aiken's work directly address this problem? If so, tell how; if not, describe how this work contributed in some way to the field of computer architecture.
The most significant problem was the von Neumann bottleneck. Aiken's work did directly address the bottleneck by developing two buses and two memories, one for data and one for instructions.
Index
The portion of a memory address that determines in which cache line a given main memory location's contents may be cached
Tag [2]
The portion of a memory address that must be checked to determine whether or not a given memory access results in a cache "hit"
Locality of Reference [4]
The principle that allows hierarchical storage systems to function at close to the speed of the faster, smaller level(s)
Hit ratio
The probability of avoiding a main memory access by finding the desired information in cache // The fraction of the total number of memory references that do not require main memory to be accessed. Hit ratio: Ph = number of hits / total number of memory references, or Ph = number of hits / (number of hits + number of misses)
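As a small worked example of the formula, with invented hit and miss counts and a hypothetical function name:

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Ph = hits / (hits + misses)."""
    return hits / (hits + misses)


# 950 of 1000 references found in cache.
print(hit_ratio(950, 50))  # 0.95
```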
Disadvantage of virtual memory?
The translation process between virtual and physical addresses adds overhead in terms of time.
Name and describe the most significant performance-related problem associated with the von Neumann architecture.
The von Neumann bottleneck. This problem arises when data and instructions share a common memory and bus.
What is the von Neumann execution cycle?
The von Neumann execution cycle is the sequence of steps that occur and repeat while a program is running.
What is the principal difference between the machine(s) you named above (EDVAC vs Mark-1) vs. most computing machines that had previously been constructed? Why was this significant?
These machines held data and instructions in memory, so they could perform a variety of functions by changing the software. Previous machines were hard-wired to perform one function. This change allowed computers to be used for more tasks.
Delayed page fault [2]
This can occur during the execution of a string or vector instruction when part of an operand is present in physical main memory and the rest is not
Segment Fault
This happens when a location within a variable-sized region of a program's logical memory is referenced, but it is not currently mapped to a location in physical main memory
Integrated circuit [2]
This technological development was an important factor in moving from "second-generation" to "third-generation" computers
VLSI
This technological development was an important factor in moving from "third-generation" to "fourth-generation" computers
Which of the following is not one of the computer architectural quality factors described in our textbook?
User-friendliness
Linpack is associated with _____.
Vectors, matrices
Describe a situation in which the choice of one approach vs the other might be forced upon a system designer regardless of the relative advantages or disadvantages (virtually addressed vs physically addressed)
When the cache is located off chip.
Which of the following would be best suited as a benchmark for applications that do mathematical calculations on floating-point numbers that are not elements of vectors or arrays?
Whetstones
Mega is 10 to the power of _____.
6
Giga is 10 to the power of ____.
9
The functionality provided in hardware by an associative memory is most similar to which of the following software concepts?
A Database query
MMU [2]
A hardware device that handles the details of address translation in a system with virtual memory
TLB [4]
A type of cache used to hold virtual-to-physical address translation information
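How a TLB sits in front of the page table during address translation can be sketched as follows. This is a simplified illustration, not how any particular MMU works: the 4096-byte page size, the dictionary-based page table, and the unbounded TLB are all assumptions, and the names are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def translate(vaddr, page_table, tlb):
    """Return (physical_address, tlb_hit) for a virtual address."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)  # virtual page number, byte offset
    if vpn in tlb:
        return tlb[vpn] * PAGE_SIZE + offset, True  # translation was cached
    frame = page_table[vpn]  # a missing entry here would correspond to a page fault
    tlb[vpn] = frame         # cache the translation for later references
    return frame * PAGE_SIZE + offset, False


page_table = {0: 5, 1: 2}  # virtual page -> physical frame
tlb = {}
print(translate(0x1234, page_table, tlb))  # (8756, False): 0x2234, TLB miss
print(translate(0x1238, page_table, tlb))  # (8760, True):  0x2238, TLB hit
```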
Dhrystones are associated with _____.
Integer Arithmetic
Dirty bit
This is set to indicate that the contents of a faster memory subsystem have been modified since they were initialized from a lower level of the memory hierarchy
Valid bit [2]
This is set to indicate that the contents of a faster memory subsystem have been properly initialized from a lower level of the memory hierarchy // This is set to indicate that the contents of a faster memory subsystem have been properly initialized and are not "garbage data" from before reset or belonging to another process
Transistor
This technological development was an important factor in moving from "first-generation" to "second-generation" computers
A memory system for a pipelined vector processing unit (VPU) is designed using 10ns RAM devices using a 4-way low-order interleave. When ideal conditions exist, we would like the VPU to be able to access memory every clock cycle with no "wait states" (that is, without any cycles wasted waiting for memory to respond). Given this requirement, what is the highest VPU bus clock frequency that can be used with this memory system?
f = 1/2.5ns = 0.4GHz = 400 MHz