COMP 7300 Exam 2
NAND flash memory vs NOR flash memory
- NOR flash memory provides high-speed random access. It can read and write data to specific locations, and can reference and retrieve a single byte. - NAND reads and writes in small blocks. - NAND provides higher bit density than NOR and greater write speed. - NAND flash does not provide a random-access external address bus, so the data must be read on a blockwise basis (also known as page access), where each block holds hundreds to thousands of bits. - NAND memory has made some inroads, but NOR remains the dominant technology for internal memory. It is ideally suited for microcontrollers where the amount of program code is relatively small and a certain amount of application data does not vary. - NAND memory is better suited for external memory, such as USB flash drives, memory cards (in digital cameras, MP3 players, etc.), and in what are known as solid-state disks (SSDs).
NAND flash memory
- Organized in transistor arrays, with 16 or 32 transistors in series. - The bit line goes low only if all the transistors in the corresponding word lines are turned on. - This is similar in function to a NAND logic gate.
Hard errors
- Permanent physical defect - Memory cell or cells affected cannot reliably store data but become stuck at 0 or 1 or switch erratically between 0 and 1 - Can be caused by: - Harsh environmental abuse - Manufacturing defects - Wear
Daisy chain (hardware poll, vectored)
- The interrupt acknowledge line is daisy-chained through the modules - Vector: the address of the I/O module or some other unique identifier - The processor uses the vector as a pointer to the appropriate device-service routine, avoiding the need to execute a general interrupt-service routine first
Access Time
- The sum of the seek time and the rotational delay - The time it takes to get into position to read or write
Software poll
- When the processor detects an interrupt, it branches to an interrupt-service routine whose job is to poll each I/O module to determine which module caused the interrupt - Time consuming
Universal Serial Bus (USB)
- Widely used for peripheral connections - Is the default interface for slower-speed devices - Also commonly used for high-speed I/O
EEPROM
- Electrically erasable programmable read-only memory (EEPROM) - Can be written into at any time without erasing prior contents; only the byte or bytes addressed are updated - Combines the advantage of non-volatility with the flexibility of being updatable in place - More expensive than EPROM
Hamming Codes
A Hamming code is an error-detecting and error-correcting code. It can detect up to two-bit errors or correct one-bit errors. Figure 5.7, pg. 176
A dynamic RAM has refresh cycle of 32 times per msec. Each refresh operation requires 100 nsec and a memory cycle requires 250 nsec. What percentage of memory's total operating time is required for refreshes? (A) 0.64 (B) 0.96 (C) 2.00 (D) 0.32
(D) 0.32 In 1 ms, the time devoted to refresh is => refresh = number of refresh cycles × time per refresh operation = 32 × 100 ns = 3200 ns. The fraction of time devoted to memory refresh is => refresh / 1 ms = (3.2 × 10^-6 s)/(10^-3 s) = 0.0032 = 0.32%. The 250 ns memory cycle time is not needed for this part.
Soft errors
- Random, non-destructive event that alters the contents of one or more memory cells - No permanent damage to memory - Can be caused by: - Power supply problems - Alpha particles
SDRAM
- Synchronous DRAM (SDRAM) - One of the most widely used forms of DRAM - Exchanges data with the processor synchronized to an external clock signal and running at the full speed of the processor/memory bus without imposing wait states - With synchronous access, the DRAM moves data in and out under control of the system clock - The processor or other master issues the instruction and address information, which is latched by the DRAM - The DRAM then responds after a set number of clock cycles - Meanwhile, the master can safely do other tasks while the SDRAM is processing
RAID level 3 application
- Video production and live-streaming - Image Editing - Video Editing - Prepress applications - Any applications requiring high throughput
A computer consists of a processor and an I/O device D connected to main memory M via a shared bus with a data bus width of one word. The processor can execute a max of 4 MIPS. An average instruction requires 5 machine cycles, 3 of which use the memory bus. A memory read/write operation uses 1 machine cycle. Suppose that the processor is continuously executing "background" programs that require 90% of the instruction rate but not any I/O instructions. Assume that one processor cycle equals one bus cycle. Now suppose the I/O device is used to transfer very large amounts of data between M and D. - Estimate the same rate if DMA is used.
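With a maximum of 4 × 10^6 instructions per second, the 10% of instruction slots left idle free all 5 of their machine cycles, and each background instruction leaves 2 of its 5 cycles without bus use. The number of machine cycles per second available to DMA is therefore 4 × 10^6 × (0.10 × 5 + 0.90 × 2) = 9.2 × 10^6. If we assume that the DMA module can use all these cycles, and ignore any setup or status-checking time, then this value (9.2 × 10^6 words per second) is the maximum I/O transfer rate.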
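The cycle count behind this estimate can be checked with a short calculation. This is only a sketch under the problem's stated numbers: it assumes idle instruction slots leave all of their machine cycles free and that each background instruction leaves its non-bus cycles free for DMA. The function and parameter names are illustrative, not from the course material.

```python
def dma_max_words_per_sec(mips, cycles_per_instr, bus_cycles_per_instr,
                          background_fraction):
    """Upper bound on DMA transfers per second, one word per free bus cycle."""
    instr_slots = mips * 1_000_000
    # Cycles freed by instruction slots the background programs do not use.
    idle = (1 - background_fraction) * cycles_per_instr
    # Cycles inside background instructions that do not touch the bus.
    free_in_background = background_fraction * (cycles_per_instr - bus_cycles_per_instr)
    return instr_slots * (idle + free_in_background)


rate = dma_max_words_per_sec(mips=4, cycles_per_instr=5,
                             bus_cycles_per_instr=3, background_fraction=0.90)
print(f"about {rate / 1e6:.1f} million words per second")
```

Setup and status-checking overhead are ignored, so a real transfer rate would be lower.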
PCI Express
PCI Express is a high-speed bus system for connecting peripherals of a wide variety of types and speeds.
SATA
Serial Advanced Technology Attachment is an interface for disk storage systems. - Widely used in desktop computers, and in industrial and embedded applications.
CD
Compact Disk. A non-erasable disk that stores digitized audio information. The standard system uses 12-cm disks and can record more than 60 minutes of uninterrupted playing time.
Direct memory access (DMA)
The I/O module and main memory exchange data directly without processor involvement
Cylinder
The set of all the tracks in the same relative position on the platter
Isolated I/O
o Separate address spaces o Need I/O or memory select lines o Special commands for I/O § Limited set
Rotational Delay
· The time it takes for the beginning of the sector to reach the head
Tapes
· Use the same reading and recording techniques as disk systems · Medium is flexible polyester tape coated with magnetizable material · Coating may consist of particles of pure metal in special binders or vapor-plated metal films · Data on the tape are structured as a number of parallel tracks running lengthwise · Serial recording o Data are laid out as a sequence of bits along each track · Data are read and written in contiguous blocks called physical records · Blocks on the tape are separated by gaps referred to as inter-record gaps
Seek time
On a movable-head system, the time it takes to position the head at the track
Track
Data are organized on the platter in a concentric set of rings called tracks
RAID level 2
Redundant via Hamming code: an error-correcting code is calculated across corresponding bits on each data disk, and the bits of the code are stored in the corresponding bit positions on multiple parity disks.
What is burst mode?
SDRAM employs a burst mode to eliminate the address setup time and row and column line pre-charge time after the first access. In burst mode, a series of data bits can be clocked out rapidly after the first bit has been accessed. This mode is useful when all the bits to be accessed are in sequence and in the same row of the array as the initial access.
RAID level 6
Striping with double parity: Block-interleaved dual distributed parity; two different parity calculations are carried out and stored in separate blocks on different disks.
Sector
Data are transferred to and from the disk in sectors
RAID Level 2
Each bit of data word is written to a data disk drive (4 in this example: 0 to 3). - Each data word has its Hamming Code ECC word recorded on the ECC disks. - On Read, the ECC code verifies correct data or corrects single disk errors. Characteristics: · Makes use of a parallel access technique · In a parallel access array all member disks participate in the execution of every I/O request · Spindles of the individual drives are synchronized so that each disk head is in the same position on each disk at any given time · Data striping (bit level striping) is used: o Strips are very small, often as small as a single byte or word Performance: · An error-correcting code is calculated across corresponding bits on each data disk and the bits of the code are stored in the corresponding bit positions on multiple parity disks · Typically, a Hamming code is used, which is able to correct single-bit errors and detect double-bit errors · The number of redundant disks is proportional to the log of the number of data disks · Would only be an effective choice in an environment in which many disk errors occur
Programmed I/O
o Data are exchanged between the processor and the I/O module o Processor executes a program that gives it direct control of the I/O operation o When the processor issues a command, it must wait until the I/O operation is complete o If the processor is faster than the I/O module, this is wasteful of processor time
InfiniBand
· I/O specification aimed at the high-end server market · Heavily relied on by IBM Enterprise series of mainframes · Standard describes an architecture and specifications for data flow among processors and intelligent I/O devices · Has become a popular interface for storage area networking and other large storage configurations · Enables servers, remote storage, and other network devices to be attached in a central fabric of switches and links · The switch-based architecture can connect up to 64,000 servers, storage systems, and networking devices
Transfer Time
· Once the head is in position, the read or write operation is then performed as the sector moves under the head · This is the data transfer portion of the operation
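The disk timing terms in these cards (seek time, rotational delay, transfer time) combine into one average access estimate. The sketch below assumes the usual textbook simplifications: average rotational delay is half a revolution, and transfer time is the fraction of a revolution needed to pass the requested bytes under the head. The drive parameters in the example are illustrative values, not from these notes.

```python
def avg_access_ms(seek_ms, rpm, bytes_to_transfer, bytes_per_track):
    """Average time to read or write: seek + rotational delay + transfer."""
    rev_ms = 60_000 / rpm                     # time for one full revolution
    rotational_delay_ms = rev_ms / 2          # on average, half a revolution
    transfer_ms = rev_ms * bytes_to_transfer / bytes_per_track
    return seek_ms + rotational_delay_ms + transfer_ms


# Hypothetical drive: 4 ms average seek, 15,000 rpm, 500 sectors of 512 bytes per track.
one_sector = avg_access_ms(seek_ms=4, rpm=15_000,
                           bytes_to_transfer=512, bytes_per_track=500 * 512)
print(f"{one_sector:.3f} ms")
```

For a single sector the transfer term is tiny, so the total is dominated by positioning (seek plus rotational delay).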
Device Identification - ways to find which device caused an interrupt
• Multiple interrupt lines • Software poll • Daisy chain (hardware poll, vectored) • Bus arbitration (vectored)
Consider a dynamic RAM that must be given a refresh cycle 50 times per millisecond. Each refresh operation requires 120 ns; a memory cycle requires 200 ns. What percentage of the memory's total operating time must be given to refreshes?
Given: each refresh operation takes 120 ns, and the DRAM is refreshed 50 times per ms. Hence, the time spent on memory refreshes per millisecond => 50 × 120 ns = 6000 ns. Percentage of time spent on memory refreshes per millisecond => 6000 ns / 1 ms = 6000 ns / 10^6 ns = 0.006 = 0.6%
What is Hamming code? Give examples.
Hamming code is an error-detecting and error-correcting code. Hamming codes can detect up to two-bit errors or correct one-bit errors. For example, if 4-bit information is to be transmitted, then n = 4. The number of redundant bits is determined by trial and error using 2^P ≥ n + P + 1, where P = number of parity bits and n = number of data bits. Let P = 2: 2^2 ≥ 4 + 2 + 1 would require 4 ≥ 7, which is false. So choose P = 3: 2^3 ≥ 4 + 3 + 1 gives 8 ≥ 8, which holds. Hence 3 parity bits are needed, and the transmitted codeword is 4 + 3 = 7 bits long.
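The trial-and-error search for P can be sketched as a small loop. This assumes the single-error-correcting condition 2^P ≥ n + P + 1 quoted in the card; a code that also detects double errors would typically add one more overall parity bit, which this sketch does not model. The function name is illustrative.

```python
def min_parity_bits(n_data_bits):
    """Smallest P satisfying 2**P >= n + P + 1."""
    p = 0
    while 2 ** p < n_data_bits + p + 1:
        p += 1
    return p


for n in (4, 8, 64):
    print(n, "data bits need", min_parity_bits(n), "parity bits")
```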
DMA breakpoints - Figs. 7.15
The Intel 8237A DMA controller interfaces to the 80x86 family of processors and to DRAM memory to provide a DMA capability. Figure 7.15 indicates the location of the DMA module. When the DMA module needs to use the system buses (data, address, and control) to transfer data, it sends a signal called HOLD to the processor. The processor responds with the HLDA (hold acknowledge) signal, indicating that the DMA module can use the buses. For example, if the DMA module is to transfer a block of data from memory to disk, it will do the following: 1. The peripheral device (such as the disk controller) will request the service of DMA by pulling DREQ (DMA request) high. 2. The DMA will put a high on its HRQ (hold request), signaling the CPU through its HOLD pin that it needs to use the buses. 3. The CPU will finish the present bus cycle (not necessarily the present instruction) and respond to the DMA request by putting high on its HLDA (hold acknowledge), thus telling the 8237 DMA that it can go ahead and use the buses to perform its task. HOLD must remain active high as long as DMA is performing its task. 4. DMA will activate DACK (DMA acknowledge), which tells the peripheral device that it will start to transfer the data. 5. DMA starts to transfer the data from memory to peripheral by putting the address of the first byte of the block on the address bus and activating MEMR, thereby reading the byte from memory into the data bus; it then activates IOW to write it to the peripheral. Then DMA decrements the counter and increments the address pointer and repeats this process until the count reaches zero and the task is finished. 6. After the DMA has finished its job it will deactivate HRQ, signaling the CPU that it can regain control over its buses.
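Step 5 of the sequence above is essentially a counted copy loop. The sketch below models only that loop in software; it is a simplification for study purposes, not a model of the real 8237A registers or signal timing. The names `dma_block_transfer`, `memory`, and `peripheral` are hypothetical.

```python
def dma_block_transfer(memory, start_addr, count):
    """Mimic step 5: read each byte from memory and hand it to the peripheral."""
    peripheral = []        # stands in for the device receiving the data
    address = start_addr   # address pointer
    remaining = count      # counter, decremented once per byte moved
    while remaining > 0:
        byte = memory[address]    # MEMR: memory places the byte on the data bus
        peripheral.append(byte)   # IOW: the peripheral latches the byte
        address += 1              # increment the address pointer
        remaining -= 1            # decrement the counter
    return peripheral


memory = bytes(range(16))
print(dma_block_transfer(memory, start_addr=4, count=3))
```

The handshake signals (DREQ, HRQ, HLDA, DACK) bracket this loop; they are omitted here because they carry no data.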
RAID Level 1 Applications
Applications: 1. Accounting 2. Payroll 3. Financial
RAID level 3
Bit-interleaved parity: similar to level 2, but instead of an error-correcting code, a simple parity bit is computed for the set of individual bits in the same position on all of the data disks.
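The parity idea can be illustrated with XOR: the parity strip is the bitwise XOR of the data strips, and any single lost strip is the XOR of the survivors with the parity. This is a minimal sketch that works on whole bytes for readability, whereas RAID 3 interleaves at a much finer grain; the strip contents are made-up example values.

```python
from functools import reduce
from operator import xor


def parity_strip(strips):
    """XOR the bytes in the same position across all strips."""
    return bytes(reduce(xor, column) for column in zip(*strips))


data = [b"\x0f\xa0", b"\x33\x55", b"\xf0\x0f"]   # three hypothetical data disks
parity = parity_strip(data)

# Suppose disk 1 fails: rebuild it from the surviving disks plus parity.
rebuilt = parity_strip([data[0], data[2], parity])
print(rebuilt == data[1])
```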
RAID level 4
Block-interleaved parity: a bit-by-bit parity strip is calculated across corresponding strips on each data disk, and the parity bits are stored in the corresponding strip on the parity disk.
When a DMA module takes control of the bus, and while it retains control of the bus, what does the processor do?
When the processor wishes to read or write a block of data, it issues a command to the DMA module, by sending to the DMA module the following information: ■ Whether a read or write is requested, using the read or write control line between the processor and the DMA module. ■ The address of the I/O device involved, communicated on the data lines. ■ The starting location in memory to read from or write to, communicated on the data lines, and stored by the DMA module in its address register. ■ The number of words to be read or written, again communicated via the data lines, and stored in the data count register. The processor then continues with other work. It has delegated this I/O operation to the DMA module. The DMA module transfers the entire block of data, one word at a time, directly to or from memory, without going through the processor. When the transfer is complete, the DMA module sends an interrupt signal to the processor. Thus, the processor is involved only at the beginning and end of the transfer (Figure 7.4c).
Consider a dynamic RAM that must be given a refresh cycle 64 times per millisecond. Each refresh operation requires 150 ns; a memory cycle requires 250 ns. What is the max number of memory cycles in 1ms that can be used for reads/writes (refresh cycles are not part of memory accesses?)
(from previous problem) Refresh percentage: ([# of refresh cycles] × [time per refresh operation]) / (total time) × 100 = (64 × 150 ns) / (10^6 ns) × 100 = 0.96%. Percentage of time available for reads/writes: 100% − 0.96% = 99.04%. (given) Time: 1 ms = 10^6 ns. (given) Memory cycle time = 250 ns. Time available for memory accesses: (99.04 / 100) × 10^6 ns = 990,400 ns. Max memory cycles: 990,400 ns / 250 ns = 3961.6, so at most 3961 complete memory cycles fit in 1 ms.
DDR RAM
- Double Data Rate (DDR) Synchronous Dynamic Random Access Memory (SDRAM) - A common type of memory used as RAM for most every modern processor. - Numerous companies make DDR chips, which are widely used in desktop computers and servers. DDR achieves higher data rates in three ways: - First, the data transfer is synchronized to both the rising and falling edge of the clock, rather than just the rising edge - Second, DDR uses higher clock rate on the bus to increase the transfer rate - Third, a buffering scheme is used
Bus arbitration (vectored)
- An I/O module must first gain control of the bus before it can raise the interrupt request line - When the processor detects the interrupt, it responds on the interrupt acknowledge line - Then the requesting module places its vector on the data lines
Multiple interrupt lines
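- Provide multiple interrupt lines between the processor and the I/O modules - Most straightforward approach to the problem - However, it is impractical to dedicate more than a few bus lines or processor pins to interrupt lines - Consequently, even if multiple lines are used, it is likely that each line will have multiple I/O modules attached to it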
Thunderbolt
- Fastest peripheral connection technology to become available for general-purpose use - Developed by Intel with collaboration from Apple - The technology combines data, video, audio, and power into a single high-speed connection for peripherals such as hard drives, RAID arrays, video-capture boxes, and network interfaces
Flash
- Intermediate between EPROM and EEPROM in both cost and functionality - Uses an electrical erasing technology, does not provide byte-level erasure - Microchip is organized so that a section of memory cells are erased in a single action or "flash"
Interleaved Memory - Banks
- Main memory is composed of a collection of DRAM chips - A number of chips can be grouped together to form a memory bank - Each bank is independently able to service a memory read or write request - K banks can service K requests simultaneously, increasing memory read or write rates by a factor of K - If consecutive words of memory are stored in different banks, then the transfer of a block of memory is speeded up.
Refresh Operation Quantitative Problems (?)
- All DRAMs require a refresh operation. - Refresh Operation: A simple technique for refreshing is, in effect, to disable the DRAM chip while all data cells are refreshed. - The refresh counter steps through all the row values. For each row, the output lines from the refresh counter are supplied to the row decoder and the RAS line is activated. - The data are read out and written back into the same location. - This causes each cell in the row to be refreshed. Why is DRAM refreshed? - A DRAM cell is composed of an access transistor and a capacitor. Data is stored in the capacitor as electrical charge, but electrical charge leaks over time. Therefore, DRAM must be refreshed periodically to preserve the stored data. Refresh negatively impacts DRAM performance and power dissipation.
Ethernet
- Predominant wired networking technology - Has become essential for supporting personal computers, workstations, servers, and massive data storage devices in organizations large and small - Began as an experimental bus-based system - Has moved from bus-based to switch-based - Data rate has periodically increased by an order of magnitude - There is a central switch with all the devices connected directly to the switch
NOR flash memory
- The basic unit of access is a bit, referred to as a memory cell - Cells in NOR flash are connected in parallel to the bit lines so that each cell can be read, written, or erased individually. - If any memory cell of the device is turned on by the corresponding word line, the bit line goes low. - This is similar in function to a NOR logic gate.
FireWire
- Was developed as an alternative to small computer system interface (SCSI) to be used on smaller systems, such as personal computers, workstations, and servers - Objective was to meet the increasing demands for high I/O rates while avoiding the bulky and expensive I/O channel technologies developed for mainframe and supercomputer systems - IEEE standard 1394, for a High Performance Serial Bus
WiFi
- Predominant wireless Internet access technology - Now connects computers, tablets, smart phones, and other electronic devices such as video cameras, TVs, and thermostats - In the enterprise, has become an essential means of enhancing worker productivity and network effectiveness - Public hotspots have expanded dramatically to provide free Internet access in most public places
What is a parity bit? Give examples.
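A parity bit, or check bit, is a bit added to the end of a string of binary code that indicates whether the number of bits in the string with the value one is even or odd. Parity bits are used as the simplest form of error-detecting code. The two types of parity checking are: - Even parity: the parity bit is chosen so that the total number of 1 bits in the message, including the parity bit, is even. - Odd parity: the parity bit is chosen so that the total number of 1 bits in the message, including the parity bit, is odd. For example, the 7-bit string 1011001 contains four 1s, so the parity bit is 0 under even parity (10110010) and 1 under odd parity (10110011).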
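Computing a parity bit amounts to counting the 1s in the string. The helper below is a minimal sketch that takes the data as a string of '0' and '1' characters; the function name and the sample word are illustrative.

```python
def parity_bit(bits, even=True):
    """Return the bit to append so the total count of 1s is even (or odd)."""
    ones_are_odd = bits.count("1") % 2       # 1 when the data has an odd number of 1s
    bit = ones_are_odd if even else 1 - ones_are_odd
    return str(bit)


word = "1011001"                              # contains four 1s
print(word + parity_bit(word))                # even parity
print(word + parity_bit(word, even=False))    # odd parity
```

A single flipped bit changes the count of 1s by one, so the receiver's recomputed parity no longer matches; two flipped bits cancel out and go undetected.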
Parity
A technique that checks whether data has been lost or written over when it is moved from one place in storage to another or when it is transmitted between computers.
CD-RW
CD Rewritable. Similar to a CD-ROM, but the user can erase and rewrite to the disk multiple times. · Can be repeatedly written and overwritten · Phase change disk uses a material that has two significantly different reflectivities in two different phase states · Amorphous state: molecules exhibit a random orientation that reflects light poorly · Crystalline state: has a smooth surface that reflects light well · A beam of laser light can change the material from one phase to the other · Disadvantage: the material eventually and permanently loses its desirable properties · Advantage: it can be rewritten
Does the hard drive/CDROM run on CLV or CAV?
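A CD-ROM drive operates in CLV (constant linear velocity) mode. A magnetic hard disk drive rotates at constant angular velocity (CAV).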
Very briefly discuss flash memory (< 1 page). What is the mechanism used to increase its lifespan?
Charge trapping / de-trapping technology is key to improving flash memory wear characteristics. To gain the maximum use from a flash memory, a process called wear leveling is often used. There are three main approaches to wear leveling: - No wear leveling - Dynamic wear leveling - Static wear leveling
Draw the diagram of a typical 16 Mb DRAM (4M X 4) and explain its action.
Dynamic RAM (DRAM) is made with cells that store data as charge on capacitors. Because capacitors have a natural tendency to discharge, DRAMs require periodic charge refreshing to maintain data storage. Logically, the memory array is organized as four square arrays of 2048 × 2048 elements, which are connected by both horizontal (row) and vertical (column) lines. Each horizontal line connects to the Select terminal of each cell in its row; each vertical line connects to the Data-In/Sense terminal of each cell in its column. Address lines supply the address of the word to be selected. Here, 11 address lines are needed to select one of 2048 rows. The selection is performed using a row decoder. An additional 11 address lines select one of 2048 columns of 4 bits per column. The 22 required address lines for a 2048 × 2048 array are passed through select logic external to the chip and multiplexed onto the 11 address lines. First, 11 address signals are passed to the chip to define the row address of the array, and then the other 11 address signals are presented for the column address. These signals are accompanied by row address select (RAS) and column address select (CAS) signals to provide timing to the chip. Four data lines are used for the input and output of 4 bits to and from a data buffer. The write enable (WE) and output enable (OE) pins determine whether a write or read operation is performed. On input (write), the bit driver of each bit line is activated for a 1 or 0 according to the value of the corresponding data line. On output (read), the value of each bit line is passed through a sense amplifier and presented to the data lines. The row line selects which row of cells is used for reading or writing. Because only 4 bits are read/written to this DRAM, there must be multiple DRAMs connected to the memory controller to read/write a word of data to the bus. The figure includes refresh circuitry. All DRAMs require a refresh operation.
A simple technique for refreshing is, in effect, to disable the DRAM chip while all data cells are refreshed. The refresh counter steps through all the row values. For each row, the output lines from the refresh counter are supplied to the row decoder and the RAS line is activated. The data are read out and written back into the same location. This causes each cell in the row to be refreshed.
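The address multiplexing described above (22 address bits presented as two 11-bit halves) can be sketched as a bit split. This assumes the row address occupies the high-order 11 bits and the column address the low-order 11 bits; the actual assignment depends on the memory controller, and the names below are illustrative.

```python
ROW_BITS = 11
COL_BITS = 11


def split_address(addr):
    """Split a 22-bit cell address into the (row, column) pair sent with RAS, then CAS."""
    if not 0 <= addr < 1 << (ROW_BITS + COL_BITS):
        raise ValueError("address out of range for a 4M x 4 DRAM")
    row = addr >> COL_BITS                  # sent first, latched by RAS
    col = addr & ((1 << COL_BITS) - 1)      # sent second, latched by CAS
    return row, col


print(split_address((5 << 11) | 7))
print(split_address(2 ** 22 - 1))
```

Each (row, column) pair selects 4 bits, one from each of the four 2048 × 2048 arrays.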
Why is the capacity of DVD more than a CD?
DVDs can store more information than compact discs (CDs) because their pits are smaller and placed closer together.
What are the differences among EPROM, EEPROM?
Differences among EPROM and EEPROM: Read-Only Memory (ROM) contains a permanent pattern of data that cannot be changed. A ROM is nonvolatile, i.e. no power source is required to maintain the bit values in memory. A variation on read-only memory is the read-mostly memory, which is useful for applications in which read operations are far more frequent than write operations but for which nonvolatile storage is required. There are three common forms of read-mostly memory: EPROM, EEPROM, and flash memory. Optically Erasable Programmable Read-Only Memory (EPROM) is read and written electrically. However, before a write operation, all the storage cells must be erased to the same initial state by exposure of the packaged chip to ultraviolet radiation. This erasure process can be performed repeatedly; each erasure can take as much as 20 minutes to perform. Thus, the EPROM can be altered multiple times and holds its data virtually indefinitely. Electrically Erasable Programmable Read-Only Memory (EEPROM) is a read-mostly memory that can be written into at any time without erasing prior contents; only the byte or bytes addressed are updated. The write operation takes considerably longer than the read operation, on the order of several hundred microseconds per byte. EEPROM combines the advantage of non-volatility with the flexibility of being updatable-in-place, using ordinary bus control, address, and data lines. EEPROM is more expensive than EPROM and also is less dense, supporting fewer bits per chip
DMA breakpoints - Figs. 7.13
Figure 7.13: DMA and Interrupt Breakpoints During an Instruction Cycle - Shows where in the instruction cycle the processor may be suspended. - In each case, the processor is suspended just before it needs to use the bus. - The DMA module then transfers one word and returns control to the processor. Note that this is not an interrupt; the processor does not save a context and do something else. Rather, the processor pauses for one bus cycle. The overall effect is to cause the processor to execute more slowly. Nevertheless, for a multiple-word I/O transfer, DMA is far more efficient than interrupt-driven or programmed I/O.
Very briefly discuss flash memory (< 1 page). How is its lifespan limited?
Flash memory has a finite lifetime: each cell tolerates only a limited number of erase/write cycles. This means that flash memory reliability and life are issues that need to be accounted for when considering its use.
How do the following work: flash memory?
Flash memory is a type of semiconductor memory. It is intermediate between EPROM and EEPROM in both cost and functionality. Like EEPROM, flash memory uses an electrical erasing technology. An entire flash memory can be erased in one or a few seconds, which is much faster than EPROM. In addition, it is possible to erase just blocks of memory rather than an entire chip. Flash memory gets its name because the microchip is organized so that a section of memory cells are erased in a single action or "flash." However, flash memory does not provide byte-level erasure. Like EPROM, flash memory uses only one transistor per bit, and so achieves the high density (compared with EEPROM) of EPROM. Conventional memory technologies, e.g., DRAM and NAND flash, exploit the capacitance from electrical charge stored in the memory cell. This restricts how far storage density can scale as cell area shrinks. To overcome this limitation, new resistance-based memory technologies, e.g., phase-change RAM (PRAM), magnetic RAM (MRAM), or resistive RAM (ReRAM), are being developed. In comparison with conventional capacitance-based memory technologies, these nonvolatile memory technologies are faster and can more easily accommodate scaling to sub-20-nm nodes.
Very briefly discuss flash memory (< 1 page).
Flash memory is an electronic non-volatile computer memory storage medium that can be electrically erased and reprogrammed.
Consider a dynamic RAM that must be given a refresh cycle 50 times per millisecond. Each refresh operation requires 120 ns; a memory cycle requires 200 ns. What is the maximum number of memory cycles in 1ms that can be used for reads/writes (refresh cycles are not part of memory accesses)?
From above, 0.6% of the time is spent on memory refresh. This implies 99.4% of the time is available for memory access i.e. read/write operations. Hence, per 1 msec, the maximum time available for memory read/write operations => 99.4/ 100 × 1ms = 99.4 × 10^6 / 100 ns = 994000 ns. Given, Memory cycle duration is 200 ns. Considering each memory operation can be completed in one memory cycle the maximum number of memory read/write operations that can be performed in 1 msec duration => 994000 / 200 = 4970
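The refresh problems in this set all follow the same two steps, which can be sketched as one helper. It assumes a 1 ms window and counts only whole memory cycles; the function name is illustrative.

```python
def refresh_stats(refreshes_per_ms, refresh_ns, cycle_ns):
    """Return (percent of time spent refreshing, max whole memory cycles per ms)."""
    window_ns = 1_000_000                         # 1 ms expressed in ns
    refresh_total_ns = refreshes_per_ms * refresh_ns
    percent = 100 * refresh_total_ns / window_ns
    max_cycles = (window_ns - refresh_total_ns) // cycle_ns
    return percent, max_cycles


for params in ((50, 120, 200), (64, 150, 250), (32, 100, 250)):
    print(params, refresh_stats(*params))
```

Note that the memory cycle time matters only for the second value; the refresh percentage depends on the refresh count and the refresh operation time alone.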
A DMA module is transferring characters to memory using cycle stealing from a device transmitting at 19,200 bps. The processor is fetching instructions at the rate of 4.5 million instructions per second (MIPS). By how much will the processor be slowed down due to the DMA activity?
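Given: 8 bits per character (byte); processor rate 4.5 MIPS; device transmitting at 19,200 bps. Calculation: the DMA module transfers characters at 19,200 / 8 = 2400 characters per second, so it steals 2400 cycles per second. Assuming each instruction fetch takes one cycle, the processor performs 4.5 × 10^6 fetch cycles per second. Fraction of cycles stolen = 2400 / (4.5 × 10^6) = 5.33 × 10^-4 (a dimensionless ratio, not a time). Percentage slowdown = [2400 / (4.5 × 10^6)] × 100 = 0.0533%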
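The same ratio can be written as a small function. This sketch assumes one stolen cycle per character transferred and one cycle per instruction fetch, matching the simplification used in the answer; the names are illustrative.

```python
def cycle_steal_slowdown_percent(device_bps, bits_per_char, mips):
    """Percentage of processor cycles lost to DMA cycle stealing."""
    chars_per_sec = device_bps / bits_per_char     # one stolen cycle per character
    cycles_per_sec = mips * 1_000_000              # one cycle per instruction fetch
    return 100 * chars_per_sec / cycles_per_sec


print(f"{cycle_steal_slowdown_percent(19_200, 8, 4.5):.4f}%")
```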
Consider a dynamic RAM that must be given a refresh cycle 64 times per ms. Each refresh operation requires 150 ns; a memory cycle requires 250 ns. What percentage of the memory's total operating time must be given to refreshes?
In 1 ms, the time devoted to refresh is => refresh = refresh cycle * refresh operation = 64 × 150 ns = 9600 ns. The fraction of time devoted to memory refresh is => refresh/ 1ms = (9.6 × 10^-6 s)/10^-3 s = 0.0096, which is approximately 1%.
Consider a dynamic RAM that must be given a refresh cycle 64 times per millisecond. Each refresh operation requires 150 ns; a memory cycle requires 250 ns. What % of the memory's total operating time must be given to refreshes?
In 1 ms, the time devoted to refresh is => refresh = number of refresh cycles × time per refresh operation = 64 × 150 ns = 9600 ns. The fraction of time devoted to memory refresh is => refresh / 1 ms = (9600 × 10^-9 s)/(10^-3 s) = 0.0096 = 0.96%
How do the following work: STT-RAM?
MRAM uses tunneling resistance that depends on the relative magnetization directions of ferromagnetic electrodes. It is an attractive memory option because of its superior scalability, speed, and power consumption. Depending on the writing technique, MRAM is classified as either field switching or spin-transfer torque (STT) MRAM, the latter type being of more interest because of its simple structure and excellent scalability. Two factors decide how far MRAM can shrink: the critical current density, which is the current density for magnetic switching, and the thermal stability factor, which is closely correlated with retention. Critical current density should be as low as possible to fully exploit MRAM's low power consumption and high-density advantages. In contrast, the thermal stability factor should be as high as possible to guarantee reliable device functionality. Adjusting process parameters requires a tradeoff between the two. Several challenges remain for sub-20-nm MTJ cells and need further development effort on optimizing the ferromagnetic electrode's composition, upgrading the MTJ stack scheme, and developing advanced MTJ patterning technology.
What do you understand by interleaved memory?
Main memory is composed of a collection of DRAM memory chips. A number of chips can be grouped together to form a memory bank. It is possible to organize the memory banks in a way known as interleaved memory. Each bank is independently able to service a memory read or write request, so that a system with K banks can service K requests simultaneously, increasing memory read or write rates by a factor of K. If consecutive words of memory are stored in different banks, then the transfer of a block of memory is speeded up. The interleaved memory system is most effective when the number of memory banks is equal to or an integer multiple of the number of words in a cache line. Shown in the picture below is an example for 2-way interleaved memory organization and accesses to memory blocks over a period of time.
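Storing consecutive words in different banks is commonly done by using the low-order address bits as the bank number. The sketch below assumes that low-order interleaving scheme; other mappings exist, and the function name is illustrative.

```python
def bank_and_offset(word_addr, num_banks):
    """Map a word address to (bank number, word index within that bank)."""
    return word_addr % num_banks, word_addr // num_banks


# 2-way interleaving: consecutive words alternate between bank 0 and bank 1.
for addr in range(6):
    print(addr, bank_and_offset(addr, num_banks=2))
```

Because neighboring addresses land in different banks, a block read can start the next access in one bank while the previous bank is still completing its cycle.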
RAID level 1
Mirroring: Mirrored; every disk has a mirror disk containing the same data.
RAID Level 2 Application
No commercial implementations exist; RAID level 2 is not commercially viable.
How do the following work: PCRAM?
PCRAM (phase-change RAM, also written PRAM) is recommended for standard interface or storage memory applications because of its non-volatility, relatively fast operational speed, and scalability.
Interrupt-driven I/O
Processor issues an I/O command, continues to execute other instructions, and is interrupted by the I/O module when the latter has completed its work
PROM
Programmable ROM (PROM) - Less expensive alternative to ROM when only a small number of chips with a particular memory content is needed - Nonvolatile and may be written into only once - Writing process is performed electrically and may be performed by supplier or customer at a time later than the original chip fabrication - Special equipment is required for the writing process - Provides flexibility and convenience - ROM remains attractive for high-volume production runs
Differences between Programmed I/O, Interrupt-Driven I/O and DMA?
Programmed I/O: The processor issues an I/O command, on behalf of a process, to an I/O module; that process then busy-waits for the operation to be completed before proceeding. Interrupt-driven I/O: The processor issues an I/O command on behalf of a process, continues to execute subsequent instructions, and is interrupted by the I/O module when the latter has completed its work. The subsequent instructions may be in the same process, if it is not necessary for that process to wait for the completion of the I/O. Otherwise, the process is suspended pending the interrupt and other work is performed. Direct memory access (DMA): A DMA module controls the exchange of data between main memory and an I/O module. The processor sends a request for the transfer of a block of data to the DMA module and is interrupted only after the entire block has been transferred.
RAID Level 1
Provides redundancy through a process called disk mirroring. Characteristics: · Redundancy is achieved by the simple expedient of duplicating all the data · Data striping is used, but each logical strip is mapped to two separate physical disks so that every disk in the array has a mirror disk that contains the same data · Can also be implemented without data striping, although this is less common Positive Aspects: · A read request can be serviced by either of the two disks that contain the requested data · There is no "write penalty" · Recovery from a failure is simple; when a drive fails, the data can be accessed from the second drive · Provides a real-time copy of all data · Can achieve high I/O request rates if the bulk of the requests are reads Disadvantage: · The principal disadvantage is the cost Applications: · Good choice for applications that require high performance and high availability, such as transactional applications, email, and operating systems.
How do the following work: ReRAM ?
ReRAM operates mainly by field-induced ion migration, so it has relatively slow speed and limited endurance, but also relatively good data retention and a simple cell structure that is easy to manufacture. The potentially low cost of fabricating ReRAM makes it attractive for large-scale integration. A 3D ReRAM architecture has been developed that provides higher bit density than planar structures. Though ReRAM has reliability advantages, low operating voltage, and random access, its power consumption is higher because it has fewer parallel working cells. In addition, for ReRAM to become a major memory product, researchers must find a way to better control its stochastic nature and variability. ReRAM's production cost and reliability are critical factors in its adoption in the memory market.
ROM
Read Only Memory - Contains a permanent pattern of data that cannot be changed or added to - No power source is required to maintain the bit values in memory - Data or program is permanently in main memory and never needs to be loaded from a secondary storage device - Data is actually wired into the chip as part of the fabrication process Disadvantages of this: - No room for error; if one bit is wrong, the whole batch of ROMs must be thrown out - Data insertion step includes a relatively large fixed cost
Draw the diagram of an SRAM cell and explain its action.
Static RAM (SRAM) is a digital device that stores binary values using flip-flop logic-gate configurations. SRAM holds data as long as power is supplied to it and does not need refresh circuitry. Four transistors (T1, T2, T3, T4) are cross-connected in an arrangement that produces a stable logic state. In logic state 1, point C1 is high and point C2 is low; in this state, T1 and T4 are off and T2 and T3 are on. In logic state 0, point C1 is low and point C2 is high; in this state, T1 and T4 are on and T2 and T3 are off. Both states are stable as long as the direct current (DC) voltage is applied. The SRAM address line controls two transistors (T5 and T6). When a signal is applied to this line, the two transistors are switched on, allowing a read or write operation. For a write operation, the desired bit value is applied to line B, while its complement is applied to line B-bar. This forces the four transistors (T1, T2, T3, T4) into the proper state. For a read operation, the bit value is read from line B.
SRAM/DRAM distinction
A dynamic RAM (DRAM) is made with cells that store data as charge on capacitors. A static RAM (SRAM) is a digital device that uses the same logic elements used in the processor. Both are volatile: power must be continuously supplied to the memory to preserve the bit values. Dynamic cell - Simpler to build, smaller - More dense (smaller cells = more cells per unit area) - Less expensive - Requires supporting refresh circuitry - Tends to be favored for large memory requirements - Used for main memory Static cell - Faster - Used for cache memory (both on and off chip) ■ Static RAM (SRAM): SRAM provides rapid access time, but is the most expensive and the least dense (bit density). SRAM is suitable for cache memory. ■ Dynamic RAM (DRAM): Cheaper, denser, and slower than SRAM, DRAM has traditionally been the choice for off-chip main memory.
SCSI
Small Computer System Interface: a once-common standard for connecting peripheral devices (disks, modems, printers, etc.) to small and medium-sized computers.
RAID level 5
Striping with parity: Block-interleaved distributed parity; similar to level 4 but distributes the parity strips across all disks.
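A minimal sketch of what "distributed parity" means in practice. The rotation rule below (parity moves one disk to the left on each successive stripe) is an assumption made for illustration; real arrays use various layouts, and the function name `parity_disk` is hypothetical.

```python
def parity_disk(stripe, num_disks):
    """Index of the disk holding the parity strip for a given stripe.

    Assumed layout: parity starts on the last disk and rotates left by one
    disk per stripe, wrapping around.
    """
    return (num_disks - 1 - stripe) % num_disks

# Assumed example: a 4-disk array, first five stripes.
for stripe in range(5):
    row = ["P" if disk == parity_disk(stripe, 4) else "D" for disk in range(4)]
    print(f"stripe {stripe}: {' '.join(row)}")
```

Rotating the parity strip spreads parity updates over all disks instead of concentrating them on one dedicated parity disk as in level 4.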
RAID level 0
Striping: non-redundant
What is a DDR SDRAM? How does an SDRAM differ from a DRAM?
Synchronous DRAM (SDRAM) exchanges data with the processor synchronized to an external clock signal, running at the full speed of the processor/memory bus without imposing wait states. With synchronous access, SDRAM moves data in and out under control of the system clock. The processor or other master issues the instruction and address information, which is latched by the SDRAM. The SDRAM then responds after a set number of clock cycles. Meanwhile, the master can safely do other tasks while the SDRAM is processing the request. A newer version of SDRAM, referred to as Double-Data-Rate SDRAM (DDR-SDRAM), can send data twice per clock cycle, once on the rising edge of the clock pulse and once on the falling edge. There have been two generations of improvement to the DDR technology. DDR2 increases the data transfer rate by increasing the operational frequency of the RAM chip and by increasing the prefetch buffer from 2 bits to 4 bits per chip. The prefetch buffer is a memory cache located on the RAM chip, which enables the RAM chip to preposition bits to be placed on the data bus as rapidly as possible. DDR3 increases the prefetch buffer size to 8 bits. Theoretically, a DDR module can transfer data at a clock rate in the range of 200 to 600 MHz; a DDR2 module transfers at a clock rate of 400 to 1066 MHz; and a DDR3 module transfers at a clock rate of 800 to 1600 MHz. In practice, somewhat smaller rates are achieved.
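A rough sketch of what "twice per clock cycle" means for peak throughput. The 64-bit module width is an assumption (it is not stated above), the clock figure is treated as the bus clock, and the function name is hypothetical. The result is a theoretical peak, not an achieved rate.

```python
def peak_bytes_per_second(bus_clock_hz, bus_width_bits=64, transfers_per_clock=2):
    """Theoretical peak rate: clock x transfers per clock x bytes per transfer.

    transfers_per_clock=2 models DDR (rising and falling edge);
    transfers_per_clock=1 models single-data-rate SDRAM.
    bus_width_bits=64 is an assumed module width.
    """
    return bus_clock_hz * transfers_per_clock * bus_width_bits // 8

clock = 200_000_000  # assumed 200 MHz bus clock
print("SDR peak:", peak_bytes_per_second(clock, transfers_per_clock=1), "bytes/s")
print("DDR peak:", peak_bytes_per_second(clock), "bytes/s")
```

At the same clock, the double-data-rate case moves twice as many bytes per second as the single-data-rate case.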
When a device interrupt occurs, discuss different ways the processor can determine which device issued the interrupt.
The occurrence of an interrupt triggers a number of events, both in the processor hardware and in software: 1. The device issues an interrupt signal to the processor. 2. The processor finishes execution of the current instruction before responding to the interrupt, as indicated in Figure 3.9. 3. The processor tests for an interrupt, determines that there is one, and sends an acknowledgment signal to the device that issued the interrupt. The acknowledgment allows the device to remove its interrupt signal. 4. The processor now needs to prepare to transfer control to the interrupt routine. To begin, it needs to save information needed to resume the current program at the point of interrupt. The minimum information required is (a) the status of the processor, which is contained in a register called the program status word (PSW), and (b) the location of the next instruction to be executed, which is contained in the program counter. These can be pushed onto the system control stack. 5. The processor now loads the program counter with the entry location of the interrupt-handling program that will respond to this interrupt. Depending on the computer architecture and operating system design, there may be a single program; one program for each type of interrupt; or one program for each device and each type of interrupt. If there is more than one interrupt-handling routine, the processor must determine which one to invoke. This information may have been included in the original interrupt signal, or the processor may have to issue a request to the device that issued the interrupt to get a response that contains the needed information.
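Steps 4 and 5 can be sketched as follows. This is a toy model, not any real processor's mechanism: the `cpu` dictionary, the `vector_table` contents, and the function names are hypothetical, and the vector is assumed to have been supplied by the interrupting device.

```python
control_stack = []                       # models the system control stack
vector_table = {1: 0x0400, 2: 0x0800}    # hypothetical: device vector -> handler entry

def take_interrupt(cpu, vector):
    """Save the PSW and program counter, then jump to the handler for this vector."""
    control_stack.append(cpu["psw"])
    control_stack.append(cpu["pc"])
    cpu["pc"] = vector_table[vector]

def return_from_interrupt(cpu):
    """Restore the saved state in reverse order so the interrupted program resumes."""
    cpu["pc"] = control_stack.pop()
    cpu["psw"] = control_stack.pop()

example_cpu = {"psw": 0x2, "pc": 0x0100}
take_interrupt(example_cpu, 2)
print(hex(example_cpu["pc"]))            # now at the handler entry for vector 2
return_from_interrupt(example_cpu)
print(hex(example_cpu["pc"]))            # back at the interrupted instruction stream
```

With a vector, the processor indexes straight into the table; without one, a single general routine would have to poll the devices to find the source.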
Rotational delay
The time it takes for the beginning of the sector to reach the head
Major functions of I/O modules
An I/O module has two major functions (Figure 7.1): • Interface to the processor and memory via the system bus or central switch • Interface to one or more peripheral devices by tailored data links
Different ways of mapping I/O addresses to the processor address space.
With memory-mapped I/O, there is a single address space for memory locations and I/O devices. The processor treats the status and data registers of I/O modules as memory locations, and uses the same machine instructions to access both memory and I/O devices. With isolated I/O, a command specifies whether the address refers to a memory location or an I/O device. The full range of addresses may be available for both.
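The two schemes can be contrasted with a toy address decoder. Everything here is an illustrative assumption: the 256-byte memory, the register addresses, and the function names are hypothetical and do not describe a real bus.

```python
MEMORY = bytearray(256)                      # toy main memory, all zeros

# Memory-mapped I/O: device registers occupy addresses in the one shared space.
DEVICE_REGISTERS = {0xF0: 0x01, 0xF1: 0x00}  # hypothetical status and data registers

def memory_mapped_read(address):
    """One address space: the address alone decides memory versus device."""
    if address in DEVICE_REGISTERS:
        return DEVICE_REGISTERS[address]
    return MEMORY[address]

# Isolated I/O: a separate port space, selected by the command itself.
IO_PORTS = {0x00: 0x01, 0x01: 0x00}          # hypothetical ports

def isolated_read(address, is_io):
    """Two address spaces: is_io plays the role of the memory/I-O command line."""
    return IO_PORTS[address] if is_io else MEMORY[address]

print(memory_mapped_read(0xF0))              # device status register
print(isolated_read(0x00, is_io=True))       # port 0
print(isolated_read(0x00, is_io=False))      # memory location 0, same address
```

In the isolated case the same numeric address names two different things, which is why the full address range can be available to both memory and I/O.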
Memory mapped I/O
- Devices and memory share an address space - I/O looks just like memory read/write - No special commands for I/O - Large selection of memory access commands available
Cyclic redundancy check
· A technique used to detect errors in digital data. · As a type of checksum, the CRC produces a fixed-length value computed from the contents of a file or larger data set. · In use, CRC is a hash function, common in digital telecommunications networks and storage devices such as hard disk drives, that detects accidental changes to raw computer data.
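A minimal sketch of a CRC computation. No particular polynomial is named above, so this assumes the common CRC-32 variant implemented by Python's standard `zlib.crc32`; the bit-by-bit function `crc32_bitwise` is an illustrative name and is checked against the library.

```python
import zlib

def crc32_bitwise(data):
    """Bit-by-bit CRC-32 (reflected form, polynomial constant 0xEDB88320)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF

message = b"123456789"
checksum = crc32_bitwise(message)
print(hex(checksum), checksum == zlib.crc32(message))

# An accidental change to the data yields a different checksum.
corrupted = b"123456780"
print(crc32_bitwise(corrupted) != checksum)
```

The sender stores or transmits the checksum with the data; the receiver recomputes it and treats a mismatch as evidence of corruption.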
RAID Level 3
· Consists of byte-level striping with dedicated parity Redundancy: · Requires only a single redundant disk, no matter how large the disk array · Employs parallel access, with data distributed in small strips · Instead of an error-correcting code, a simple parity bit is computed for the set of individual bits in the same position on all the data disks · Can achieve very high data transfer rates Performance: · In the event of a drive failure, the parity drive is accessed and data is reconstructed from the remaining devices · Once the failed drive is replaced, the missing data can be restored on the new drive and operation resumed · In the event of a disk failure, all the data are still available in what is referred to as reduced mode · Return to full operation requires that the failed disk be replaced and the entire contents of the failed disk be regenerated on the new disk · In a transaction-oriented environment, performance suffers
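Parity-based reconstruction can be sketched with XOR, which computes the parity of the bits in the same position across disks. The disk contents and the function name `parity` are made up for illustration.

```python
from functools import reduce

def parity(strips):
    """XOR the bytes at each position across equal-length strips."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))

# Assumed contents of three data disks (two bytes each).
data = [b"\x0f\xa0", b"\x33\x55", b"\xf0\x0c"]
parity_strip = parity(data)              # what the dedicated parity disk holds

# Suppose the second data disk fails: XOR the survivors with the parity strip.
rebuilt = parity([data[0], data[2], parity_strip])
print(rebuilt == data[1])
```

The same operation serves both purposes: it generates the parity strip on writes and regenerates a lost strip from the remaining disks after a failure.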
A computer consists of a processor and an I/O device D connected to main memory M via a shared bus with a data bus width of one word. The processor can execute a max of 4 MIPS. An average instruction requires 5 machine cycles, 3 of which use the memory bus. A memory read/write operation uses 1 machine cycle. Suppose that the processor is continuously executing "background" programs that require 90% of the instruction rate but not any I/O instructions. Assume that one processor cycle equals one bus cycle. Now suppose the I/O device is used to transfer very large amounts of data between M and D. - If programmed I/O is used and each one-word transfer requires the processor to execute 4 instructions, estimate the max I/O data transfer rate in words/sec possible through D.
· The processor can devote only 10% of its time to I/O. · Thus, the maximum I/O instruction execution rate is 10^6 × 0.10 = 100,000 instructions per second. · Since each one-word transfer requires the processor to execute 4 instructions, the I/O transfer rate is 100,000/4 = 25,000 words/second. · The number of machine cycles available for DMA control is 10^6 × (0.10 × 5 + 0.90 × 4) = 4.1 × 10^6