Chapter 1 Formulas

What is a workload?

A set of programs run on a computer that is either the actual collection of applications run by a user or constructed from real programs to approximate such a mix. A typical workload specifies both the programs and relative frequencies.

Levels of Architecture (Outside going in)

Application Software -> System Software -> Hardware

What are the five components of a computer?

Datapath/Databus, Input, Output, Memory, Control
Mnemonics: MODIC, CODMI, MIDOC
Control (brain) + Datapath (brawn) = Processor
Input + Output = Devices

Amdahl's Law

The performance enhancement possible with a given improvement is limited by the amount that the improved feature is used.

What's the problem with assemblers?

They're specific to each machine you're on. As you move through different processors, each processor is not backward compatible with the others, save for a break between complex and reduced instruction sets.

How to calculate execution time in "cycles per program"?

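Time = seconds/program = (Instructions/Program) × (Clock Cycles/Instruction) × (Seconds/Clock Cycle)
Cycles per program is just the first two factors: (Instructions/Program) × (Clock Cycles/Instruction).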

Geometric Mean Formula

https://cdn1.byjus.com/wp-content/uploads/2019/02/Geometric-Mean-Formula.png
Geometric mean = $\sqrt[n]{\prod_{i=1}^{n} \text{Execution time ratio}_i}$
"Execution time ratio_i is the execution time, normalized to the reference computer, for the ith program of a total of n in the workload."
$\prod_{i=1}^{n} a_i$ means the product a_1 × a_2 × ... × a_n.
Used in comparing SPECratios. It gives the same relative answer no matter what computer is used to normalize results, unlike an arithmetic mean.
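
A small Python sketch of the "same relative answer no matter what computer is used to normalize" claim. The machine names and execution times are made-up illustrative values, not SPEC data, and the helper names are hypothetical.

```python
import math

def geometric_mean(ratios):
    """nth root of the product of n ratios."""
    return math.prod(ratios) ** (1.0 / len(ratios))

# Hypothetical execution times (seconds) for two programs on four machines.
times = {
    "ref_a": [10.0, 40.0],
    "ref_b": [20.0, 10.0],
    "x": [5.0, 20.0],
    "y": [10.0, 5.0],
}

def normalized_score(machine, reference):
    """Geometric mean of reference time / machine time over all programs."""
    ratios = [r / t for r, t in zip(times[reference], times[machine])]
    return geometric_mean(ratios)

def relative_score(machine, other, reference):
    """How `machine` compares to `other` when both are normalized to `reference`."""
    return normalized_score(machine, reference) / normalized_score(other, reference)

# The x-versus-y comparison comes out the same with either reference machine.
rel_with_ref_a = relative_score("x", "y", "ref_a")
rel_with_ref_b = relative_score("x", "y", "ref_b")
print(rel_with_ref_a, rel_with_ref_b)
```

Swapping `geometric_mean` for an arithmetic mean in this sketch makes the two printed values disagree, which is the reason the geometric mean is used for SPECratios.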

More stuff

pg 53 and A6 PPT slides. Review Section 1.7 from the book. Also, I don't think many of the early Chapter 1 or slide concepts have really been covered here in detail. I certainly haven't covered assembly language here.

Symbols

∝ means "is proportional to" (direct proportionality)
× means multiplication

How do you calculate speedup, and what do you want from it?

(Old Execution Time)/(New Execution Time) = speedup. You want Speedup > 1

What determines Instruction count per program and average cycles per instruction?

- IC per program is determined by program, ISA, and compiler
- Average cycles per instruction is determined by the CPU hardware. If different instructions have different CPI, average CPI is affected by instruction mix.

How do [filler] affect the system

- Algorithm: determines the number of operations executed
- Programming language, compiler, architecture: determine the number of machine instructions executed per operation
- Processor and memory system: determine how fast instructions are executed
- I/O system (including OS): determines how fast I/O operations are executed

Performance depends on

- Algorithm: affects IC, possibly CPI
- Programming language: affects IC, CPI
- Compiler: affects IC, CPI
- Instruction set architecture: affects IC, CPI, clock rate (the PowerPoint writes "T subscript c" instead of clock rate; T_c is the clock cycle time, which is 1 / clock rate)
Also present on page 39 with more detailed explanations of why. IC on this card is instruction count, not integrated chip. Performance is also affected by the hardware, though that is not specifically listed here.

Why is parallel programming HARD?

1) By definition, it's programming for performance. It needs to be fast AND useful, which divides focus and forces tradeoffs. If you don't care about performance, program sequentially.
2) The overhead of scheduling and coordinating the divided slices of an application, so that each processor has roughly the same amount to do at the same time, can negate or override the potential benefits of parallel programming.
Sub-tasks must be *scheduled*, the work has to be *balanced evenly* to achieve a desired speedup, and it is important to *reduce communication and synchronization overhead*. Scheduling, load balancing, time for synchronization, and overhead for communication between parties are all challenges for parallel programming.

What problems exist with MIPS?

1) MIPS specifies the instruction execution rate but does not take into account the capabilities of instructions. We cannot compare computers with different instruction sets as the instruction count will certainly differ. 2) MIPS varies between programs on the same computer; thus a computer cannot have a single MIPS rating.

How to improve performance? (CPU Time)

1) Reduce the number of clock cycles 2) Increase the clock rate 3) Hardware designer must often trade off clock rate against cycle count

Eight Great Ideas

1. Performance via Parallelism
2. Hierarchy of Memories
3. Abstraction
4. Common Case Fast
5. Dependability via Redundancy
6. Performance via Prediction
7. Design for Moore's Law
8. Performance via Pipelining

Clock Cycles (per program) =

= Instruction Count × Cycles Per Instruction = Instruction Count × CPI

What are response time and throughput and how are they affected by 1) replacing a processor with a faster one or by 2) adding more processors

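Response time is the time to complete one task; throughput is the total work done per unit of time. A faster processor will increase throughput and decrease response time. More processors will increase throughput. Response time for a single task does not improve, unless tasks were waiting in a queue, in which case the extra throughput can shorten the wait.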

What is a benchmark?

A program selected for use in comparing/measuring CPU performance. Benchmarks are usually standardized and may come in sets/bundles for testing. Having standards for comparison helps fairly compare computers in the market. They are changed every so often, partly so that systems can't be tuned to one known suite. Running a representative set of programs gives an average execution time. You want your benchmarks to do different things depending on what you're testing for (slide 50).

The Power Wall

After a point, we can't reduce voltage, and we can't remove heat. What can we do?

What is a profiler?

Allows you to find "hotspots" in code: statements/functions executed many times. Helps to find where the bottlenecks are. Some bottlenecks you just can't avoid, e.g. I/O and system processes. Fairly complex stuff. Tricks exist for speeding things up, but they may not work in general. Doing things in a different way with the same result can optimize memory use, speed, and performance. Loops vs recursion: some things you can do with loops rather than recursion, some you can't.

What is pipelining?

An elegant technique that runs programs faster by overlapping the execution of instructions

What are the problems with Benchmarks?

Benchmarks aren't fully representative and can cause non-optimal design decisions, such as optimizing performance for the wrong suite. SPEC is the most famous, but it represents a mix of UNIX programs; if designing an x86 architecture, you should use a mix of Windows apps.
People cheat, which is why benchmarks need to be changed every so often. If people know the benchmark ahead of time, they can build in special cases to pass it and still not perform well outside that benchmark.
Benchmarks also can't capture applications that are user based.

(T/F) You can use a subset of the performance equation as a performance metric

For these T/F cards, it's not enough to just say T or F. You need to remember the REASON it's T or F as well.
False. I'll need to add a more detailed reason later. Clock rate, instruction count, or CPI alone aren't enough. Using two of the three may be valid in a limited context, but it's often misused. Most attempted alternatives to measuring time have had one problem or another, even MIPS.

Weighted Average CPI

CPI = $\sum_{i=1}^{n} \left( \text{CPI}_i \times \frac{\text{Instruction Count}_i}{\text{Instruction Count}} \right)$
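
A minimal Python sketch of the weighted average, assuming each instruction class is given as a (CPI_i, Instruction Count_i) pair. The function name and the instruction mix are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def weighted_cpi(classes):
    """Average CPI for a list of (cpi_i, instruction_count_i) pairs."""
    total_instructions = sum(count for _, count in classes)
    return sum(cpi * count / total_instructions for cpi, count in classes)

# Hypothetical mix: 50% of instructions at CPI 1, 30% at CPI 2, 20% at CPI 4.
mix = [(1, 500), (2, 300), (4, 200)]
average_cpi = weighted_cpi(mix)
print(average_cpi)  # about 1.9 = 0.5*1 + 0.3*2 + 0.2*4
```

Each class contributes its CPI scaled by its share of the instruction count, so a rarely used slow instruction moves the average less than a common one.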

Give me Integrated Chip Cost Maths

Cost per die = Cost per wafer / (Dies per wafer × Yield)
Dies per wafer ≈ Wafer area / Die area
Die area ≈ Wafer area / Dies per wafer
Yield = (1 + (Defects per area × Die area / 2))^(-2)
or Yield = 1 / (1 + (Defects per area × Die area / 2))^2
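
The cost formulas above chain together, which a short Python sketch can make concrete. The wafer cost, areas, and defect density below are invented illustrative numbers, and the function names are hypothetical.

```python
def dies_per_wafer(wafer_area, die_area):
    """Approximation from the card: ignores dies lost at the wafer edge."""
    return wafer_area / die_area

def die_yield(defects_per_area, die_area):
    """Yield = 1 / (1 + defects_per_area * die_area / 2)^2."""
    return 1.0 / (1.0 + defects_per_area * die_area / 2.0) ** 2

def cost_per_die(cost_per_wafer, wafer_area, die_area, defects_per_area):
    """Wafer cost spread over the dies that actually work."""
    good_dies = dies_per_wafer(wafer_area, die_area) * die_yield(defects_per_area, die_area)
    return cost_per_wafer / good_dies

# Hypothetical inputs: areas in mm^2, defects per mm^2, cost in arbitrary currency.
small_die = cost_per_die(5000.0, 70000.0, 100.0, 0.01)
large_die = cost_per_die(5000.0, 70000.0, 200.0, 0.01)
print(small_die, large_die)
```

With these assumed numbers, doubling the die area more than doubles the cost per die, because fewer dies fit on the wafer and the yield drops at the same time.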

Components of Performance and How they are measured

CPU Execution Time for a program: Seconds for the program
Instruction Count: Instructions executed for the program
Clock Cycles per Instruction (CPI): Average number of clock cycles per instruction
Clock Cycle Time: Seconds per clock cycle
Clock Rate: Clock cycles per second

Without clock cycles, what are the formulas for CPU Time?

CPU Time (s/p) = Instruction Count (pp) × CPI × Clock Cycle Time (s/c)
CPU Time = (Instruction Count × CPI) / Clock Rate (c/s)

CPU Time as a big formula

CPU Time (spp) = (Instructions /Program) × (Clock Cycles / Instruction) × (Seconds / Clock Cycle)

"The classic CPU Time Formula"

CPU Time = Instruction Count × CPI × Clock Cycle Time or CPU Time = (Instruction Count × CPI) / (Clock Rate)
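
Both forms of the classic formula give the same answer because clock cycle time is 1 / clock rate. A minimal Python sketch, with hypothetical function names and made-up input values:

```python
def cpu_time_from_cycle_time(instruction_count, cpi, clock_cycle_time):
    """Seconds per program = instructions * cycles/instruction * seconds/cycle."""
    return instruction_count * cpi * clock_cycle_time

def cpu_time_from_clock_rate(instruction_count, cpi, clock_rate):
    """Seconds per program = (instructions * cycles/instruction) / (cycles/second)."""
    return instruction_count * cpi / clock_rate

# Hypothetical program: 2 billion instructions, CPI of 1.5, 2 GHz clock.
instruction_count = 2e9
cpi = 1.5
clock_rate = 2e9                    # cycles per second
clock_cycle_time = 1.0 / clock_rate # seconds per cycle

time_a = cpu_time_from_cycle_time(instruction_count, cpi, clock_cycle_time)
time_b = cpu_time_from_clock_rate(instruction_count, cpi, clock_rate)
print(time_a, time_b)  # both about 1.5 seconds
```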

What is a microprocessor?

A microprocessor is a processor on a single chip. Chips with multiple processors are multicore microprocessors: companies often refer to processors as "cores" and to such chips as "multicore microprocessors" to reduce confusion between the word "processor" and the word "microprocessor." A quad-core microprocessor is a chip that contains four processors or four cores, for example. Multicore microprocessors are more focused on increasing throughput than response time.

If different instruction classes take different numbers of cycles, how to calculate?

Clock Cycles = $\sum_{i=1}^{n} (\text{CPI}_i \times \text{Instruction Count}_i)$
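
A short Python sketch of the per-class sum, used here to compare two code sequences. The class names, CPI values, and instruction counts are illustrative assumptions, and `total_clock_cycles` is a hypothetical helper name.

```python
def total_clock_cycles(cpi_by_class, count_by_class):
    """Sum of CPI_i * Instruction Count_i over every instruction class."""
    return sum(cpi_by_class[name] * count for name, count in count_by_class.items())

# Assumed CPI for three instruction classes.
cpi_by_class = {"A": 1, "B": 2, "C": 3}

# Two hypothetical code sequences, given as instruction counts per class.
sequence_1 = {"A": 2, "B": 1, "C": 2}  # 5 instructions
sequence_2 = {"A": 4, "B": 1, "C": 1}  # 6 instructions

cycles_1 = total_clock_cycles(cpi_by_class, sequence_1)
cycles_2 = total_clock_cycles(cpi_by_class, sequence_2)
cpi_1 = cycles_1 / sum(sequence_1.values())
cpi_2 = cycles_2 / sum(sequence_2.values())
print(cycles_1, cycles_2, cpi_1, cpi_2)  # 10 9 2.0 1.5
```

Sequence 2 executes more instructions but needs fewer clock cycles, which is why instruction count alone does not tell you which sequence is faster.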

How do I calculate Clock Cycles per program? (aka: Clock Cycle *Count*)

Clock Cycles per program (cpp) = Instruction Count (ipp) × Cycles per Instruction (cpi)
Clock Cycle Count (cpp) = CPU Time (s/p) × Clock Rate (c/s)
The second one is derived from a CPU Time formula.
Observation: counts are usually per program (CCC, IC, etc.), rates are almost always per second, and a time usually has seconds in the numerator, with the word before "time" saying what the denominator is (e.g. Clock Cycle time, CPU Time (kind of)).

What are the units used in a bunch of this clock cycle and CPU time stuff.

Clock rate is in Hz, usually quoted in GHz. Clock rate is how many cycles can be done per second. "One GHz represents 1 billion cycles per second" (10^9 cycles / 1 second).
Clock cycles (JUST the cycles and not per anything) are in cycles.
CPU time is in seconds (per program).
Clock cycle time is also in some form of seconds (per cycle): the time it takes to complete one clock cycle.
Instructions are measured in instructions.
CPI is cycles per instruction (cycles/instruction).

How to calculate CPU Time?

Units:
Clock cycle count is in cycles per program (cpp), or just cycles (c).
Clock cycle *time* is in seconds per cycle (spc).
CPU time is in seconds per program (spp).
Clock *rate* is in cycles per second (cps), or in GHz (10^9 cycles / 1 second).
Formulas:
CPU Time = CPU Clock Cycle *Count* × CPU Clock Cycle *Time*
CPU Time = CPU Clock Cycle *Count* / Clock Rate
CPU Time = (Instruction Count × CPI) / Clock Rate
CPU Time = (Instruction Count × CPI) × CPU Clock Cycle *Time*
These agree because CPU Clock Cycle *Time* = 1 / Clock Rate, and Clock Rate = 1 / CPU Clock Cycle *Time*.

What is CPI?

Clock cycles per instruction

Be careful with and be sure to include your "pers"

Clock cycles per second, clock cycles per program, and clock cycles per instruction are all very different things. The same goes for seconds.

Elapsed Time vs CPU Time vs Whatever other kind of times

Elapsed time = wall-clock time = response time. Everything is considered. Absolutely everything. This means if your CPU is doing multiple things, they are not excluded. This is what the user sees.
CPU time focuses just on the time spent actually working on the task requested. It discounts I/O time and other jobs' shares.
CPU Time = Execution Time
CPU Time = User Time + System Time
Response time is how long it takes to do a task (a bit vague honestly).
Throughput is total work done per unit of time.
Different programs are affected differently by CPU and system performance.

What is instruction level parallelism, and an example of it?

Example: Pipelining
What is it?: It's where the parallel nature of the hardware is abstracted away so the programmer and compiler can think of the hardware as executing instructions sequentially. Explicitly forcing programmers to write their programs to be parallel has, historically, been largely unsuccessful.

Amdahl's Law Equation

Execution Time after Improvement = (Execution Time affected by improvement / Amount of Improvement) + Execution Time unaffected
Execution Time New = Execution Time Old × ((1 - Fraction Enhanced) + (Fraction Enhanced / Speedup Enhanced))
Fraction Enhanced is the fraction of the program's execution time that is being enhanced. Speedup (regular) is the overall speedup. Speedup Enhanced is the speedup of just the enhanced part.
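
A minimal Python sketch of the equation above. The function names and the numbers (a 100 second program, 80% of it sped up 4 times) are assumptions picked for easy arithmetic.

```python
def time_after_improvement(time_old, fraction_enhanced, speedup_enhanced):
    """Execution Time New = Old * ((1 - fraction) + fraction / speedup_enhanced)."""
    return time_old * ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

def overall_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup = Execution Time Old / Execution Time New."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

# Hypothetical case: 80% of a 100 s program is made 4 times faster.
new_time = time_after_improvement(100.0, 0.8, 4.0)  # about 40 s
speedup = overall_speedup(0.8, 4.0)                 # about 2.5

# Even an enormous speedup of the enhanced part cannot beat 1 / (1 - fraction).
speedup_limit = overall_speedup(0.8, 1e12)          # just under 5
print(new_time, speedup, speedup_limit)
```

The last line is the point of Amdahl's Law: the 20% that is not enhanced caps the overall speedup at 5, however fast the enhanced part becomes.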

The only reliable measure of performance is...

Execution time, the product of IC, CPI, and clock cycle time. Seconds per program = Instructions per Program × Clock Cycles per Instruction × Seconds per Clock Cycle. Individually, these parts cannot tell you performance on their own.

SPEC Java Benchmark (SPECJBB2005)

Exercises: processors, caches, main memory; and JVM, compiler, garbage collector, and pieces of the OS. Performance is measured in throughput, and the units are business operations per second.

(T/F) The improvement of one aspect of a computer will increase overall performance by an amount proportional to the size of improvement.

False. This is not necessarily true. The opportunity for improvement is affected by how much time the event consumes. Common case fast. Amdahl's law.

(T/F) Computers at low utilization use little power

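False (pg 50). The book lists this as a fallacy: even lightly loaded servers still draw a substantial fraction of their peak power.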

What is the MIPS formula? (both versions)

First: MIPS = Instruction Count / (Execution Time × 10^6)
Second: MIPS = Clock Rate / (CPI × 10^6)
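
A short Python sketch showing that the two MIPS formulas agree, and why a higher MIPS rating does not have to mean a faster program. The two "compilers" and all of their numbers are hypothetical, as are the function names.

```python
def execution_time(instruction_count, cpi, clock_rate):
    """CPU time in seconds = (instructions * CPI) / clock rate."""
    return instruction_count * cpi / clock_rate

def mips_from_time(instruction_count, exec_time):
    """First version: Instruction Count / (Execution Time * 10^6)."""
    return instruction_count / (exec_time * 1e6)

def mips_from_rate(clock_rate, cpi):
    """Second version: Clock Rate / (CPI * 10^6)."""
    return clock_rate / (cpi * 1e6)

clock_rate = 4e9  # assumed 4 GHz machine

# Same program built two hypothetical ways on the same machine.
time_a = execution_time(4e9, 2.0, clock_rate)  # fewer, slower instructions
time_b = execution_time(6e9, 1.5, clock_rate)  # more, faster instructions

mips_a = mips_from_time(4e9, time_a)
mips_b = mips_from_time(6e9, time_b)
print(time_a, mips_a, mips_from_rate(clock_rate, 2.0))
print(time_b, mips_b, mips_from_rate(clock_rate, 1.5))
```

Under these assumptions version B has the higher MIPS rating but the longer execution time, because it executes more instructions to do the same work.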

High Level to Machine code

High Level lang is processed by a compiler. The compiler either directly turns it into machine code or compiles it into assembly language. An assembler assembles assembly language into machine code.

What is the direction of confidentiality of abstraction?

Lower levels hide details from the levels above them. This is the inverse of a rank hierarchy in an organization: Junior Researchers on top and O5-Council on bottom. Underground and surface level? Gotta dig deep to uncover secrets. Whatever helps you to remember.

What is MIPS?

Millions of Instructions per Second. A way of measuring program execution speed in millions of instructions executed per second. MIPS = IC / (execution time × 10^6). MIPS ratings are inversely proportional to execution time, so bigger numbers mean faster computers.

Change in Power (ratio)

New Power / Old Power

How do I get Performance?

Performance = 1/Execution Time

Relative Performance?

Performance x / Performance y = Execution Time y / Execution Time x = n
X is n times faster than Y.
Note the flip: X is on top in the performance ratio but on the bottom in the execution time ratio, because performance = 1 / execution time.
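
A tiny Python sketch of the ratio, to check the flip between performance and execution time. The machine timings are made up, and the function names are hypothetical.

```python
def performance(execution_time):
    """Performance = 1 / execution time."""
    return 1.0 / execution_time

def times_faster(execution_time_x, execution_time_y):
    """n such that X is n times faster than Y."""
    return performance(execution_time_x) / performance(execution_time_y)

# Hypothetical timings: X runs the program in 10 s, Y runs it in 15 s.
n = times_faster(10.0, 15.0)
print(n)  # about 1.5, the same as 15 / 10
```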

Power Ratio

Power New/Power old

Power formula? (for IC CMOS technology)

Powerpoint: Power = Capacitive load × (Voltage^2) × Frequency
Textbook:
Energy ∝ Capacitive load × (Voltage^2) (energy of a pulse, i.e. a 0 → 1 → 0 or 1 → 0 → 1 transition)
Energy ∝ (1/2) × Capacitive load × (Voltage^2) (energy of a single transition)
Power ∝ (1/2) × Capacitive load × (Voltage^2) × Frequency Switched
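
A minimal Python sketch that plugs the power relation into a New Power / Old Power ratio. The 15% reductions are assumed illustrative values, and `dynamic_power` is a hypothetical helper; since the formula is a proportionality, only the ratio is meaningful.

```python
def dynamic_power(capacitive_load, voltage, frequency):
    """Proportional to 1/2 * capacitive load * voltage^2 * frequency switched."""
    return 0.5 * capacitive_load * voltage ** 2 * frequency

# Old processor, in arbitrary normalized units.
old_power = dynamic_power(1.0, 1.0, 1.0)

# Assumed new processor: 85% of the capacitive load, with voltage and
# frequency each reduced by 15%.
new_power = dynamic_power(0.85, 0.85, 0.85)

power_ratio = new_power / old_power
print(power_ratio)  # about 0.52, which is 0.85 ** 4
```

Voltage is squared, so lowering it pays off twice, and the 1/2 constant cancels in the ratio.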

How have microprocessors affected programming?

Programmers now have to write programs to take advantage of the multiple processors to improve performance of their code as the number of cores increases. Used to not be so, but this is how it is now (see pg 43).

SPEC Power Benchmark

SPEC also made a power benchmark. There's a chart on page 48. It reports power consumption of servers at different workload levels, divided into 10% increments, over a period of time.

What is SPEC/SPECratio

SPEC: System Performance Evaluation Cooperative. An effort funded and supported by a number of computer vendors to create standard sets of benchmarks for modern computer systems. SPEC has made a lot of benchmarks.
SPECratio: a single number used to summarize all 12 integer benchmarks. "Dividing the execution time of a reference processor by the execution time of the measured computer normalizes the execution time measurements. This normalization yields the measure known as the SPECratio. Bigger numbers mean better performance. SPEC ratio is the inverse of execution time."
A CINT2006 or CFP2006 measurement is obtained by taking the geometric mean of the SPECratios.

"Comparing Code Segments"

See pg 37

Speedup Equation

Speedup = Performance new / Performance old
= Execution Time old / Execution Time new
= 1 / ((1 - Fraction Enhanced) + (Fraction Enhanced / Speedup Enhanced))
The last form is just Amdahl's Law with both sides divided by Execution Time old and then raised to the power of negative one.

What is the Instruction Set Architecture. Assembly language? Stuff from slide 19-20 of the first ppt?

TBA

"overall ssj_ops per watt"

To simplify marketing of computers, SPEC boils down the things exercised by its Java benchmark to one number. The formula is
overall ssj_ops per watt = $\left( \sum_{i=0}^{10} \text{ssj\_ops}_i \right) / \left( \sum_{i=0}^{10} \text{power}_i \right)$
where ssj_ops_i is performance at each 10% increment and power_i is power consumed at each performance level.
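
A short Python sketch of the summary number, assuming eleven measurement levels (i = 0 to 10, that is 0% to 100% load in 10% steps). The throughput and power readings are invented for illustration and are not real SPEC results; the function name is hypothetical.

```python
def overall_ssj_ops_per_watt(ssj_ops, power):
    """Sum of ssj_ops over all load levels divided by sum of power over all levels."""
    if len(ssj_ops) != len(power):
        raise ValueError("need one power reading per load level")
    return sum(ssj_ops) / sum(power)

# Hypothetical server: index 0 is idle (0% load), index 10 is 100% load.
ssj_ops = [level * 10000 for level in range(11)]     # business operations per second
power = [100 + 15 * level for level in range(11)]    # watts

summary = overall_ssj_ops_per_watt(ssj_ops, power)
print(summary)  # about 285.7 with these made-up readings
```

Note that the idle level contributes power but no operations, so a machine that draws a lot of power while idle is penalized in the summary.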

(T/F) Designing for performance and designing for energy efficiency are related goals.

True. Since energy is power multiplied by time, it is often the case that hardware/software optimizations that take less time save energy overall, even if the optimization takes a bit more energy when it is used. Why? The running program is not the only thing consuming power. Other things on the system are still running as the program does its thing.

For studying

Try homework 1. That's some really dang good practical application.

