CIS 315
Shared Memory Multiprocessors, Cont.
*-Uniform memory access (UMA) >* -Equal time for all memory accesses. -A single "common" memory space is connected to a group of processors via a bus or network. -As the number of processors increases, overall system performance degrades because bus bandwidth becomes saturated, or system cost increases due to network dynamics. *-Non-uniform memory access (NUMA) >* -Each processor has its own piece of a memory space that is distributed across processors. -Uses special hardware controllers (snoopy cache controllers) to manage cache memory with either write-through or write-back protocols.
Storage Area Networks (SANs)
-SANs & cloud computing services expand on the I/O architectures to address more scalable data storage needs. -A SAN is a dedicated network of storage devices, while network attached storage is a file-server-based storage scheme. -SANs allow faster access to large storage resources but do not provide the elasticity (allocation & de-allocation on demand) or redundancy that cloud storage does.
PPT Slides, Chapter 9
-1-3. -4-6. -7. -8-12. -13-19. -20-26. -30-37. -64-67.
PPT Slides, Chapter 13
-1-4. -5-8. -10. -13-18. -20-22. -30-34. -36-37.
Parallel & Multiprocessor Architectures > Vector Processors, Cont.
-2 types (based on how instructions access data) > *-Register - register >* -Registers are source & destination points. -Requires long data streams to be broken into fixed length segments. *-Memory - memory >* -Data from memory is routed directly to the ALU & results are sent back to memory. -Longer startup time due to memory latency.
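The point about breaking long data streams into fixed length segments can be sketched as follows. This is only an illustration: the register length of 64 elements and the names `strip_mine` and `vector_add` are assumptions, not something given in the notes.

```python
# Assumed vector register length (hypothetical; real machines vary).
VECTOR_REGISTER_LENGTH = 64


def strip_mine(n, segment_length=VECTOR_REGISTER_LENGTH):
    """Yield (start, stop) index pairs that cut n elements into fixed-length segments."""
    for start in range(0, n, segment_length):
        yield start, min(start + segment_length, n)


def vector_add(a, b, segment_length=VECTOR_REGISTER_LENGTH):
    """Add two long vectors one register-sized segment at a time."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    result = []
    for start, stop in strip_mine(len(a), segment_length):
        reg_a = a[start:stop]                          # load segment into "register" A
        reg_b = b[start:stop]                          # load segment into "register" B
        reg_c = [x + y for x, y in zip(reg_a, reg_b)]  # one vector operation
        result.extend(reg_c)                           # store the segment back
    return result


print(list(strip_mine(150)))  # segments for a 150-element vector
```

The last segment is shorter than the others, which is why the loop clamps `stop` to the vector length.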
ATA
-AT attachment. -Parallel architecture that supports both the original internal 16 bit AT interface cards & 32 bit interfaces for a maximum of 4 disk drives or other I/O devices.
Other I/O Architectures
-Original PC/AT I/O architecture variants that support smaller storage models & that can complement SCSI include > ATA, SATA, PCI, & USB.
SCSI-2
-Parallel architecture that supports a single bus capable of managing a variety of disk speeds with no re-cabling needed. -Disk drives are daisy chained (input of 1 drive is output of another drive) & communicate directly with one another. -CPU communicates with an asynchronous protocol based controller that manages I/O commands.
PCI
-Peripheral component interconnect. -An extension of the system data bus capable of negotiating bus speeds & data transfers with no CPU intervention. -Supports 64-bit buses & helps maximize transfer rates. -264 MBps for a 32-bit system & 528 MBps for a 64-bit system.
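The two transfer rates above follow from bus width times clock rate. A minimal sketch, assuming a 66 MHz bus clock (the clock is not stated in the notes; it is the value that reproduces the quoted figures) and a hypothetical helper name:

```python
def peak_bandwidth_mbps(bus_width_bits, clock_mhz):
    """Peak transfer rate in MBps: bytes moved per cycle times cycles per microsecond."""
    bytes_per_cycle = bus_width_bits // 8
    return bytes_per_cycle * clock_mhz


ASSUMED_CLOCK_MHZ = 66  # assumption, not from the notes

print(peak_bandwidth_mbps(32, ASSUMED_CLOCK_MHZ))  # 264
print(peak_bandwidth_mbps(64, ASSUMED_CLOCK_MHZ))  # 528
```

These are peak figures; sustained rates on a real bus would be lower.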
General Main Idea, Chapter 9
-Previous chapters focused on uni-processor systems from a programmer's perspective to help us understand how hardware contributes to overall system performance. -Goes beyond the Von Neumann model to investigate how reduced instruction set computing contributes to overall system performance. -Performance improvements on a single processor are limited by physical constraints. -Looks at how to distribute computational load with parallel & multiprocessor architectures.
Flynn's Taxonomy
-RISC & CISC have evolved toward each other, so we use Flynn's Taxonomy to help categorize architectures. -Categories are based on the number of instruction streams & the number of data streams that flow into the processor. -SISD > single instruction, single data stream; used with uni-processor systems. -SIMD > single instruction, multiple data streams; the same instruction is applied to many data items. -MISD > multiple instructions, single data stream; multiple instructions on the same data. -MIMD > multiple instructions, multiple data streams; independent instruction & data streams.
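Since the taxonomy depends only on two counts, the naming rule can be written as a small function. This is just a memory aid; the function name `flynn_category` is made up for illustration.

```python
def flynn_category(instruction_streams, data_streams):
    """Return the Flynn category name for the given stream counts."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    first = "S" if instruction_streams == 1 else "M"   # single or multiple instruction
    second = "S" if data_streams == 1 else "M"         # single or multiple data
    return f"{first}I{second}D"


print(flynn_category(1, 1))  # SISD, a uni-processor
print(flynn_category(4, 4))  # MIMD, independent instruction & data streams
```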
Summary of Coverage, Chapter 9
-RISC vs CISC architectures. -Flynn's taxonomy. -Parallel & multiprocessor architectures.
SCSI-3
-SAM-3. -Layered interface with separate physical connections for transport protocols & interface commands that support both serial & parallel I/O. -SAM-3 serial protocols include > serial storage architecture (SSA), serial bus or IEEE-1394 or Firewire, serial attached SCSI, internet SCSI (iSCSI), & fibre channel (FC).
SATA
-Serial ATA. -Addressed the bottleneck of ATA & heat concerns as processor speeds increased with a thinner (4 data lines & 3 ground wires) cabling scheme. -Supports 32-bit CRC error checking for all bits versus data-only error checking in ATA. -Concurrent data transfers are possible because of the point-to-point configuration model.
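To see what a 32-bit CRC check does, here is a minimal sketch using Python's `zlib.crc32`. This is a stand-in only: `zlib` implements the common CRC-32 variant, whose parameters are not claimed to match the ones SATA uses, and the frame layout and function names are hypothetical.

```python
import zlib


def make_frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload (illustrative frame layout)."""
    crc = zlib.crc32(payload)
    return payload + crc.to_bytes(4, "big")


def frame_is_intact(frame: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the received value."""
    payload, received = frame[:-4], int.from_bytes(frame[-4:], "big")
    return zlib.crc32(payload) == received


frame = make_frame(b"sector data")
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one bit in transit

print(frame_is_intact(frame))      # True
print(frame_is_intact(corrupted))  # False
```

The receiver never needs the original data to detect the flipped bit; it only recomputes the checksum.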
Less Complex Instructions Means
-Shorter clock cycles. -Fewer transistors needed. -Cheaper manufacturing costs. -Allows for more space on circuit board for other chips. -Supports easier instruction pipe-lining because control is hardwired. -Compiler has to manage instruction complexity.
Summary of Coverage, Chapter 13
-Small computer system interface (SCSI). -AT attachment (ATA), serial ATA (SATA), peripheral component interconnect (PCI), universal serial bus (USB). -Storage area networks & cloud storage.
Note 5
-Storage area network & cloud computing concepts deal more with network concepts than with I/O architectures. But it is important to read & understand the comparison as follows...
Shared Memory Multiprocessors
-Supports symmetric multiprocessors type MIMD systems. -Very large memory space utilized & accessed by all processors.
Formula for Difference Between RISC & CISC
-Time/Program = Time/Cycle x Cycles/Instruction x Instructions/Program
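A quick worked example of the formula. All the numbers below are invented for illustration; they are not measurements of any real CISC or RISC machine.

```python
def cpu_time(cycle_time_ns, cycles_per_instruction, instruction_count):
    """Time/Program = Time/Cycle x Cycles/Instruction x Instructions/Program (in ns)."""
    return cycle_time_ns * cycles_per_instruction * instruction_count


# Hypothetical CISC-style profile: fewer instructions, more cycles each, longer cycle.
cisc_ns = cpu_time(cycle_time_ns=2.0, cycles_per_instruction=4.0, instruction_count=1_000_000)

# Hypothetical RISC-style profile: more instructions, fewer cycles each, shorter cycle.
risc_ns = cpu_time(cycle_time_ns=1.0, cycles_per_instruction=1.5, instruction_count=2_000_000)

print(cisc_ns)  # 8000000.0
print(risc_ns)  # 3000000.0
```

CISC attacks the last factor (instructions per program), while RISC attacks the first two (cycle time and cycles per instruction), so which one wins depends on the actual values.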
USB
-Universal serial bus. -Connects to many peripheral devices of different varieties through an adapter card in a plug-&-play type fashion. -Plug-&-play achieved by publishing device driver software & using a host-device protocol pertinent to the device. -Three generations...up to 3.1. -Supports 4 different transfer modes (control, isochronous, interrupt, & bulk) using a 4-conductor cabling system (2 for data transfer, 1 for power, 1 for ground). -Not really a bus; it is a serial peripheral interface that connects to a microcomputer expansion bus. -Low power consumption, good for laptops & handheld computers.
Other I/O Connections
-Variations to this bus include the AT attachment (ATA), ATAPI, FastATA, & EIDE. -Ultra ATA supports a burst transfer rate of 133 MBps. -AT buses are too slow for today's systems.
Note 2
-While most architectures use a combination of RISC & CISC, modern mobile architectures typically use RISC.
Note
-With computer performance measured by program execution time, it's proportional to clock cycle time, number of clock cycles per instruction, & number of instructions per program.
Parallel & Multiprocessor Architectures > Vector Processors
-A SIMD processor (or supercomputer = CRAY) that performs operations on data matrices. -Supports applications for medical diagnosing & image processing. -Uses pipe-lining to overlap arithmetic operations, which means less decoding & memory usage, & data can be pre-fetched in pairs.
SSA
-Associated with IBM during the 1992 - 1996 time frame. -Attempt to break from parallel SCSI to better support increased throughput while also being backward compatible with SCSI-2 protocols. -Multiple disks & multiple hosts connected with 2 twisted pairs of copper wire. -Supports signals transmitted in opposite directions (full duplex) because of the 2 twisted pair cabling.
IEEE-1394/Firewire
-Associated with Apple during the 1980s. -A self-configuring peer-to-peer storage network that supports plug-and-play (i.e., hot plugging) with both asynchronous & isochronous data transfer modes. -6-conductor cable, 4 for data & control, 2 for power. -Up to 15 ft of cable between each device. -Up to 63 daisy chained devices.
Fibre Channel (FC)
-Associated with a technology consortium that included HP, IBM, & SUN Microsystems. -Objective is focused on pt. to pt. protocols to support high-speed interfaces to storage devices. -Components are costly & require specialized training. -iSCSI is a good alternative.
Note 4
-Compared with an approach that uses very long instruction word (VLIW) processors. These depend completely on a compiler to produce a single long instruction from many independent instructions. In this type of system, compiler gives direction to an execution unit. -Best approach because compiler can better identify instruction dependencies.
Parallel & Multiprocessor Architectures > Superscalar Architectures, Cont.
-Components that support super-pipe-lining include > -Execution units - consist of several floating point & integer adders & multipliers that work independently so several instructions can be run in parallel. -Instruction fetch unit - retrieves multiple instructions simultaneously from memory. -Decoding unit - determines which instructions can be run in parallel.
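The decoding unit's job of deciding which instructions can run in parallel can be sketched with a simple register-dependency check. This is a toy in-order model, not how any particular processor does it; the `Instr` type, the `issue_groups` function, and the issue width of 2 are all assumptions made for illustration.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    op: str       # operation name
    dest: str     # register written
    srcs: tuple   # registers read


def issue_groups(program, width=2):
    """Group instructions, in order, into sets that can issue in the same cycle."""
    if width < 1:
        raise ValueError("width must be at least 1")
    groups, current = [], []
    for instr in program:
        conflict = any(
            prev.dest in instr.srcs       # read after write
            or prev.dest == instr.dest    # write after write
            or instr.dest in prev.srcs    # write after read
            for prev in current
        )
        if conflict or len(current) == width:
            groups.append(current)        # close the group and start a new one
            current = []
        current.append(instr)
    if current:
        groups.append(current)
    return groups


program = [
    Instr("add", "r1", ("r2", "r3")),
    Instr("mul", "r4", ("r5", "r6")),   # independent of the add
    Instr("sub", "r7", ("r1", "r4")),   # needs both earlier results
    Instr("add", "r8", ("r7", "r2")),   # needs the sub result
]
print([[i.op for i in g] for g in issue_groups(program)])
# [['add', 'mul'], ['sub'], ['add']]
```

Only the first two instructions share a cycle here; the dependent ones each wait for the previous group.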
CISC (Complex Instruction Set Computing)
-Designed to have complex instructions to support smaller programs that require less storage, thus addressing the high cost of memory. -Supports both small & large, complex instructions, so it's a variable length instruction set. -Includes a large number of instructions that access memory & that require a varying # of clock cycles. -Requires microcode-based control units that interpret instructions as they are fetched from memory; takes time.
Note 3
-Difference between two architectures is how memory is used & how communication is managed, best managed by OS software.
RISC Chips
-Have capacity to support hundreds of registers. -Allows for better support of procedure calls & parameter passing, since these tasks require data to be stored & retrieved from memory before & after execution. -Key is to have register windows overlapped so that procedure calls & parameter passing equate to shifting from one register window to another. -Register windows consist of 16 register sets of 32 registers each. -Types of register windows include > global, local, input, & output.
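The overlap idea can be sketched by mapping each window's 32 visible registers onto one physical register file. The split into 8 global, 8 input, 8 local, & 8 output registers, the register numbering, and the circular arrangement are assumptions for this sketch; the notes only give 16 sets of 32 registers.

```python
NUM_WINDOWS = 16                     # from the notes: 16 register sets
GROUP = 8                            # assumed size of each register group
WINDOWED = NUM_WINDOWS * 2 * GROUP   # physical registers shared by all windows


def physical_index(window, reg):
    """Map visible register 0..31 of a window to a physical register index.

    Assumed numbering: 0-7 global, 8-15 input, 16-23 local, 24-31 output.
    """
    if not 0 <= reg < 4 * GROUP:
        raise ValueError("visible registers are numbered 0..31")
    if reg < GROUP:
        return reg                              # globals are shared by every window
    offset = reg - GROUP                        # 0..23 within the window
    base = (window % NUM_WINDOWS) * 2 * GROUP   # each call advances by 16 registers
    return GROUP + (base + offset) % WINDOWED   # wrap around after the last window


# The caller's first output register is the callee's first input register.
print(physical_index(0, 24) == physical_index(1, 8))   # True
# Local registers are private to each window.
print(physical_index(0, 16) == physical_index(1, 16))  # False
```

Because the caller's outputs and the callee's inputs are the same physical registers, passing parameters only requires switching windows rather than copying values to memory.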
MIMD
-Includes 2 types of parallel architectures > -SMP > symmetric multiprocessors that use a shared memory & memory based communication model. -MPP > massively parallel processors that use a distributed memory & network based communication model. -MISD > multiple instruction streams, single data stream.
CISC, Cont.
-Increases performance by reducing # of instructions per program. -Relies on microcode to address instruction complexity by interpreting each instruction as it is fetched from memory & telling the processor how to execute it. -Because of the additional time spent decoding instructions & the variable length instructions, CISC has poor support for pipelining.
RISC (Reduced Instruction Set Computing)
-Instruction complexity is what is reduced: a minimal set of instructions for data movement, ALU operations, & branching, with few instructions permitted to access memory. -Instructions are fixed in length & thus time needed for execution is constant. -Increases performance by shortening clock cycles. -Accesses memory only with explicit load & store instructions.
Parallel & Multiprocessor Architectures > Superscalar Architectures
-Instruction-level parallelism type design that uses special hardware to support super-pipe-lining (multiple instructions to be executed in each cycle) & a compiler for scheduling operations that best leverage resources to improve system performance. -Super-pipe-lining happens when a pipeline has stages that require less than half a clock cycle to complete.
Internet SCSI (iSCSI)
-Leverages internet & LAN protocols to support fast & reliable SCSI commands & data transfers. -Virtually eliminates distance constraints associated with FC.
SCSI
-Main objective is a self-managing interface that allows the CPU to focus more on computational tasks & less on I/O tasks. -Originally introduced in 1981 & established as an ANSI standard in 1986. -Original SCSI-1 defined commands, transport protocol & physical connections to link 7 drives to the CPU. -Uses a parallel cable that may contain as many as 68 pins. -Dual loop topology provided throughput at 40 MBps (simplex) or 80 MBps (duplex).
General Main Idea, Chapter 13
-Massive data storage needs are driving a move away from centralized storage models connected directly to the host system. -Various I/O architectures that support both enterprise-wide & smaller storage schemes include > SCSI, ATA, SATA, PCI, & USB. -SCSI-3 used in at least 70% of enterprise-wide systems. -Majority of focus is placed on basic SCSI concepts & terms.