QM+C

Poisson Distribution

a discrete probability distribution describing the likelihood of a particular number of independent events occurring within a particular interval, given the average rate at which events occur
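
As a rough illustration of the definition, the sketch below evaluates the standard Poisson formula Pr(X = k) = e^(-λ) λ^k / k!. The function name `poisson_pmf` and the rate of 3 events per interval are assumptions made up for this example.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Pr(X = k) when events occur independently at average rate lam per interval."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Hypothetical rate: on average 3 events per interval.
for k in range(5):
    print(k, round(poisson_pmf(k, 3.0), 4))
```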

The Handshaking lemma

In any graph, the sum of the degrees of all the vertices is twice the number of edges. Proof: each edge joins two vertices and thus contributes to the degree of two vertices. To use this result we apply the following corollaries (a corollary is an easy consequence of a lemma or theorem). Handshaking lemma corollaries: i) For any graph the sum of the vertex degrees must be an even number. ii) For any graph there must be an even number of vertices of odd degree. iii) If the graph G contains n vertices and is regular of degree r then G has nr/2 edges.
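
The lemma and its second corollary can be checked on a small example. This is only a sketch: the edge list and the helper name `degrees` are invented for illustration, and a loop is counted twice towards its vertex.

```python
from collections import Counter

def degrees(edges):
    """Degree of each vertex; a loop (v, v) contributes 2 to deg(v)."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

# Hypothetical graph with one loop at d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "d")]
deg = degrees(edges)
print(sum(deg.values()), 2 * len(edges))            # the two totals agree
print([v for v, d in deg.items() if d % 2 == 1])    # an even number of odd vertices
```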

theoretical probability

an ideal we can't see

Digraph

a non empty set of elements called the vertex set and a list of ordered pairs of these vertices called arcs

graph

a non empty set of elements called vertices (singular: vertex) and a list of unordered pairs of these vertices called edges.

Eulerian Trail

a trail that traverses every edge exactly once; it is not necessarily a circuit

one time pad

a polyalphabetic substitution cipher in which the cipher key is at least as long as the message and (importantly) the key is only used once.

discrete random variable

a random variable that may assume either a finite number of values or an infinite sequence of values. Denoted by a small letter

empty set

a set with no elements; extremely important. Represented by empty braces {} or by ∅ (an o with a slash through it).

line of best fit

a smooth line that reflects the general pattern in a graph

Caesar Cipher

a technique for encryption that shifts the alphabet by some number of characters
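
A minimal sketch of the shift, assuming a 26-letter English alphabet and leaving any other character unchanged. The function name `caesar` is chosen for this example only.

```python
import string

def caesar(text: str, shift: int) -> str:
    """Shift each letter by `shift` places, wrapping around the alphabet."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)  # spaces and punctuation are not encrypted
    return "".join(out)

secret = caesar("abc xyz", 3)
print(secret, caesar(secret, -3))
```

Decryption is the same operation with the shift negated.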

Circuit

a trail that starts and ends at the same vertex

independent variable

a variable (often denoted by x ) whose variation does not depend on that of another.

Min Weight Spanning Tree

a weighted graph in which the sum of the weights on the edges of the tree is as small as possible over all the possible spanning trees.

parity bit

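an extra bit appended so that the number of 1s in the word has a fixed parity. With even parity, add 1 if the count of 1s is odd and add 0 if it is even; odd parity reverses this (add 1 if it's even, add 0 if it's odd).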

Lossy Compression

allows for a small loss of data consistent with little loss of 'information' in the content. data compression techniques in which some amount of data is lost. This technique attempts to eliminate redundant information.

linear regression

an algorithm to find a precise line of fit for a set of data

Exponential dist

parameterised by the service rate μ (mu), so the average service time is 1/μ; gives the probability that service will exceed time t: Pr(T > t) = e^(-μt)

Incident

an edge (e) is incident to vertices v and w if and only if e = vw

The 10 digit ISBN

an example of an error detection code: it can detect one error (and the interchange of two symbols), but not more.
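
The card does not spell out the check itself. The sketch below uses the standard ISBN-10 rule (weights 10 down to 1, total divisible by 11, with X standing for 10 in the last position). The function name is an assumption, and 0-306-40615-2 is a commonly quoted valid example.

```python
def isbn10_is_valid(isbn: str) -> bool:
    """True if the weighted digit sum (weights 10..1) is divisible by 11."""
    chars = [c for c in isbn if c not in "- "]
    if len(chars) != 10:
        return False
    total = 0
    for weight, c in zip(range(10, 0, -1), chars):
        if c in "Xx" and weight == 1:
            value = 10          # X is only allowed as the check symbol
        elif c.isdigit():
            value = int(c)
        else:
            return False
        total += weight * value
    return total % 11 == 0

print(isbn10_is_valid("0-306-40615-2"))  # valid
print(isbn10_is_valid("0-306-40615-3"))  # single error detected
print(isbn10_is_valid("0-036-40615-2"))  # two symbols interchanged, detected
```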

Non-Linear Feedback shift registers

an improvement on the LFSR, since an LFSR is susceptible to a known-plaintext attack.

Transposition ciphers

rearrangement of the symbols in the plaintext.

simple queue m/m/1

arrival rate λ (lambda); mean service time μ^-1 = 1/μ (the reciprocal of the service rate μ, mu)

Feistel Function

based around different Boolean functions.

collision

because the digest is generally smaller than the message there are different messages that will hash to the same value; i.e. there are distinct messages $m_1$ and $m_2$ such that $h(m_1) = h(m_2)$. This is called a collision.

Complete Bipartite

bipartite graph in which every member of one of the disjoint subsets is joined to every member of the other subset. These graphs are denoted Kp,q where p and q are the numbers of vertices in each of the two sets

In-Degree Sequence

bracketed list of the in-degrees of all the vertices in a digraph written in non-decreasing order.

Out-Degree Sequence

bracketed list of the out- degrees of all the vertices in a digraph written in non-decreasing order.

hill cipher

break the message into vectors; encrypt: key matrix * vector; decipher: inverse key matrix * encrypted vector

Gray code

can be used to minimise errors in the representation of the orientation of the platform. Using codes that differ by a single digit to represent 'neighbouring' orientations will usually result in a read value within one position when a misread of a code occurs. e.g. 000 001 011 111 101 100 110 010
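
A quick check that the sequence on this card really has the single-digit property, including the wrap-around from the last code back to the first. The helper names are assumptions for this sketch.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def is_cyclic_gray(codes) -> bool:
    """True if every code differs from its successor (cyclically) in one digit."""
    return all(
        hamming(codes[i], codes[(i + 1) % len(codes)]) == 1
        for i in range(len(codes))
    )

card_codes = ["000", "001", "011", "111", "101", "100", "110", "010"]
print(is_cyclic_gray(card_codes))
```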

two-tailed test

can fail either way: a test of the null hypothesis where the alternative hypothesis is not expressed directionally

one-tailed test

can only fail one way

Arc-list

collection of all the arcs of a digraph is called the arc-list and is denoted A(D).

Hamiltonian cycle

A cycle in a graph is a Hamiltonian cycle if and only if it visits all the vertices in the graph and, apart from the start/terminal vertex of the cycle, it visits each vertex exactly once. A graph that contains a Hamiltonian cycle is called Hamiltonian.

relative frequency

the fraction or percent of the time that an event occurs in an experiment; divide all totals by the sample size. If the sample size is 100, use %.

Shortest path

doesn't have to pass through all nodes; the least-cost route from a to z is the shortest path

Mono-alphabetic substitution

each letter of the alphabet is replaced by a single cipher symbol (mod 27 if a symbol is added?)

beaufort table

essentially the same as the Vigenère table except the rows are reversed: z, y, x, ...

type 2 error

failing to reject a false null hypothesis (false negative)

inverse functions

functions that undo each other; the inverse of f is written f^-1

Tell me three times system

gives more information: it is more likely that two copies are right than that two are wrong

Samples

glimpses of a bigger picture

Regular

A graph G is regular if and only if all the vertices of G have the same degree. If a regular graph is such that all the vertices have degree r then the graph is regular of degree r.

Bipartite

A graph in which the vertex set can be partitioned into two disjoint subsets A, B in such a way that each edge of the graph joins an element of A to an element of B.

Vertex colouring

A vertex colouring of a graph without loops is an assignment of colours (usually represented by integers) to the vertices of the graph in such a way that two adjacent vertices have NOT been assigned the same colour (integer). The chromatic number of a graph is the minimum number of colours (numbers) required to produce a vertex colouring of the graph. We denote the chromatic number of a graph G by χ(G).

lfsr

helps generate a sequence of pseudo random numbers.

complement

if A is an event, Not A is also an event. This is known as the complement.

coprime

if a pair has only the common divisor of 1

xor

⊕ (looks like a target symbol)

test statistic

z = (x̄ - μ) / (σ / √n): the sample mean minus the hypothesised mean (mu), divided by the population standard deviation over the square root of the sample size
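
A worked instance of the statistic, assuming the population standard deviation is known. The function name and the numbers (sample mean 52, hypothesised mean 50, standard deviation 10, sample size 25) are invented for illustration.

```python
import math

def z_statistic(sample_mean, hypothesised_mean, population_sd, n):
    """(sample mean - hypothesised mean) / (population s.d. / sqrt(n))."""
    return (sample_mean - hypothesised_mean) / (population_sd / math.sqrt(n))

# Hypothetical data: (52 - 50) / (10 / 5)
print(z_statistic(52, 50, 10, 25))
```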

chi-squared test

a measure for categorical data: (observed - expected)^2 / expected is calculated for each value in the table, then all are added together to give the chi-squared value
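
The sum described above, as a short sketch. The observed and expected counts are made up for this example.

```python
def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for three categories, each expected 20 times.
print(chi_squared([18, 22, 20], [20, 20, 20]))
```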

Continuous data

measured data, often time . Data that can take on any value. There is no space between data values for a given domain. Graphs are represented by solid lines. data is continuous because it can theoretically take any value in a range (think numbers with decimal places). Continuous data is an idealised c

element

members of a set represented by small letter each element should only be listed once.

send and send again system

method of error detection

Transportation problem

minimise the total cost (a12 + a13 + a14 etc.) subject to any constraints for the path; remember the non-negativity condition: all values >= 0

Simple finite Capacity Queue M/M/1/K

modelling assumptions of the M/M/1/K are identical to M/M/1 with one exception: we assume that there is a finite capacity to the system. We denote the system capacity by K (some authors use C).

observation

n

intersection

∩ commonalities.

binomial coefficient

n choose k: n = number of trials, k = number of successes

Sample

A subset of the population. In a random sample a 'small' number of the population are drawn at random and (only) their data is collected. We tacitly assume that the sample is sufficiently large so as to be representative of the whole population, at least from the point of view of drawing conclusions about the population.

Null Hypothesis (H0)

no change in mean

Determining probability from physical characteristics

no. of ways it can occur / total no. of possible outcomes

sequence

ordered collection of objects

mu (μ)

population mean

Pr(A and B)

Pr(A) x Pr(B), provided A and B are independent

The OR rule

pr(a)+pr(b)-Pr (A and B)

Frequency continuous

ranges, e.g. 10-20, 20-30, 30-40 or 10-<20, 20-<30, 30-<40 or 10-15, 16-20, 21-25

Lossless Compression Algorithms

records data perfectly: the original file can be reproduced exactly. One example is a mathematical formula for image compression that assumes that the likely value of a pixel can be inferred from the values of surrounding pixels. *These algorithms are not always effective in reducing the size of a file.

sets

represented using capital letters

Parity Check Matrix

row = pcm column

std dev

s

variance

s^2

Pseudo-random number generator

seed: a truly random number. Multiply the seed by itself, then output the middle digits of the result; this output is then used as the next seed.
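
The procedure on this card matches what is usually called the middle-square method. The sketch below assumes a 4-digit seed and zero-pads the square to 8 digits before taking the middle 4. Both choices, and the function name, are assumptions for illustration.

```python
def middle_square(seed: int, digits: int = 4):
    """Yield pseudo-random numbers: square, keep the middle digits, repeat."""
    value = seed
    while True:
        squared = str(value * value).zfill(2 * digits)  # pad to 2*digits digits
        start = digits // 2
        value = int(squared[start:start + digits])      # middle `digits` digits
        yield value

gen = middle_square(1234)          # hypothetical seed
print([next(gen) for _ in range(3)])
```

Sequences produced this way tend to fall into short cycles, so the method is mainly of historical and teaching interest.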

Queueing theory

service rate μ (mu); mean rate of arrival λ (lambda); number of servers (n); time (t); wait (W); queue (q); client (x); average number of customers (L); server utilisation ρ (rho); for a stable queue the arrival rate must be less than the service rate

co domain set

set containing all the assigned values.

range of a random variable

set of all possible values of the random variables

Vertex-Set

set of vertices of a digraph D is called the vertex-set and is denoted V(D)

α (alpha)

significance level

Empirical

something we observe in actual data

Variance

standard deviation squared

σ (sigma, the o with a curve thing)

std deviation

Markov chain transition matrix

takes the information and puts the inputs and outputs into a matrix: rows are inputs (current state), columns are outputs (next state); the numbers in each row sum to 1

z-score

the distance between the mean of a distribution and a data point, measured in standard deviations

the identity function

the domain and the codomain of the identity function are the same. (id)

Median

the middle score in a distribution; half the scores are above it and half are below it

Mode

the most frequently occurring score(s) in a distribution

Dispersion

the pattern of spacing of values in a data set or of a population within an area, i.e. how spread out they are

Homophonic substitution cipher

the plaintext alphabet is mapped into a larger ciphertext alphabet. This allows more than one symbol to be associated with the plaintext symbol. In particular common letters such as e can be replaced by one of several options.

cumulative probability distribution

the probability that the random variable is less than or equal to a particular value

relative frequency in probability

the proportion of times an outcome would occur in the long run

rate of code

the ratio of useful information in a codeword to the total codeword length: k/n

domain set

the set containing the elements to be assigned values by the rule.

Distributions

the shape of data

cumulative frequency

the sum of the frequencies for that class and all previous classes

Primary function of statistics

to conclude information about a population through analysis of samples. It is the study of relationships between populations and the samples drawn from them that forms the theoretical underpinning of the science of statistics.

positive skew

peak to the left, with the long tail stretching to the right

negative skew

peak to the right, with the long tail stretching to the left

bivariate data

two measurements taken from the same entity. independent (x): input; dependent (y): response

Bimodal data

two peaks (camel); often two sets of data measured together (disguised as one)

sample statistics

uncertain and random

error syndrome

vector that tells you which equations are not satisfied in a received codeword

AND

⊗ (a circled X, like the X-Men symbol)

Function Composition

you can potentially construct a new function from two existing functions when all the values produced by the 'first' function are contained in the domain of the second.

Run length Encoding (RLE)

The simplest compression technique: a compression algorithm that represents an image in terms of the lengths of runs of identical pixels (think Ross' cyber escape room).
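
A small sketch of the idea on a string of pixel symbols rather than a real image. The function names and the sample row are assumptions for this example.

```python
from itertools import groupby

def rle_encode(data: str):
    """Replace each run of identical symbols with a (symbol, run length) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]

def rle_decode(pairs) -> str:
    """Rebuild the original string from (symbol, run length) pairs."""
    return "".join(symbol * count for symbol, count in pairs)

row = "WWWWBBWWW"                  # hypothetical row of white/black pixels
encoded = rle_encode(row)
print(encoded, rle_decode(encoded) == row)
```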

normal distribution

Bell curve. Unimodal (only one peak); the mean, mode and median are the same because the normal distribution is symmetric. Mean: where the centre of the distribution is. S.D.: how thin or squished the distribution is; roughly the typical distance between a point and the mean.

Real numbers

R numbers with a point

uniform distribution

each value has the same frequency (equally likely)

factorials

!

Probability theory

"the science of dealing with uncertainty" Event(informal.): an occurrence of interest to the analyst. The probability of an event A written Pr(A): a measure of the likelihood that the event will occur.

set of words

*

The 13 digit ISBN

- start with the 9 digit code which makes up the 10 digit ISBN without the checksum.

bayesian stats

- updating -changing stats

hash function

Accepts an input message of any length and generates, through a one-way operation, a fixed-length output. Simple example: divide the message into vectors of the same size (n), e.g. (1010); pad when necessary; add all the vectors mod 2.

max flow

edges have flow values and a maximum capacity; send as much as you can without exceeding capacity (flow <= capacity)

Min spanning tree

gets rid of unnecessary arcs, while ensuring nodes are still connected

specifying a function

requires two sets and a rule that assigns a *unique* element in the second set to each element in the first.

frequency discrete

0 1 2 3 4 5 6 etc.

properties of a good hash

1. It is easy to compute the hash value for each message (desirable for efficiency in implementation) 2. Given a hash it is impractical to generate a message with that hash value. 3. Any changes to the message will almost certainly result in a changed hash value 4. It is infeasible to find two different messages with the same hash

playfair cipher

5x5 square. Letters in different rows and columns: use the opposite corners of the rectangle. Letters in the same row: take the letter to the right. Letters in the same column: take the letter below.

Degree Sequence

A bracketed list of the degrees of all the vertices in a graph written in non-decreasing order.

Causation

A cause and effect relationship in which one variable controls the changes in another variable.

Eulerian Graph

A circuit in a graph is Eulerian if and only if it traverses every edge in the graph once. (Note the fact that it is a circuit implies that it must start and terminate at the same vertex). A graph that contains an Eulerian circuit is called Eulerian.

Cycle

A circuit in which the only vertex to appear twice is the start/end vertex.

unique decomposability

A code is uniquely decomposable (UD) if any string of codewords corresponds to a unique message.

Linear feedback Shift Registers

A commonly used procedure for generating (psuedo) random bits is to use feedback shift registers.

corollary

A connected graph G contains an Eulerian path if and only if there are exactly two vertices of odd degree

cycle

A cycle graph is composed of a single cycle. The cycle graphs are regular of degree 2 and are denoted by Cn where n is the number of vertices

(A|B)

A depends on B

Weighted graph

A graph in which there is a number associated with each edge [its weight].

Connected

A graph is connected if there is a path from every vertex to every other vertex. Otherwise it is disconnected

null

A graph with no edges. The null graph with n vertices is denoted Nn and is regular of degree 0

Eulerian graph

A connected graph with no odd vertices

Labelled

A labelled graph is one in which every vertex has been assigned an identifier. A graph in which no vertex has been assigned an identifier is called unlabelled

Incidence matrix

A matrix representing the edges in a graph If a graph has n vertices and m edges; then the incidence matrix is an n x m matrix. The rows identify the vertices and the columns the edges.

Correlation

A measure of the relationship between two variables

Path

A non empty sequence of vertices such that between consecutive pairs in the sequence there is an edge, forms a walk. A walk in which no edge is repeated is called a trail. A trail in which no vertex is repeated is called a path.

Hamiltonian path

A path that contains all the vertices of the graph

Transposition Matrices

A permutation matrix is a square matrix in which a 1 appears precisely once in each row and column; the remaining elements are all zero. The matrix can be any size

Population

A population is the set of all possible measurements of a defined type. Note that we are referring to the set (population) of data values (not the actual items that gave rise to the data.)

Markov Chain

A process in which, from any one time to another, the probability of moving from any given value on a measure to another value stays the same

Spanning Tree

A spanning tree for a connected graph G is a connected subgraph on all the vertices of G which is also a tree.

Interpreting probabilities

All probabilities lie between 0 and 1 and have the following interpretation:
Pr(Event) = 0: the event cannot occur
Pr(Event) near 0: the event is unlikely
Pr(Event) near 0.5: the event is as likely to happen as not
Pr(Event) near 1: the event is likely to happen
Pr(Event) = 1: the event must occur
You may prefer to multiply a probability by 100 and interpret the result as the percentage of times the event will happen.

Tree

Any connected graph with no cycles. The tree with n nodes has (n-1) edges. (Note the star graphs and path graphs are examples of trees).

Concatenation

Attaching codewords in a message side-by-side. the error in this is that the code may not be able to be decoded properly

Frequency

Data in its originally collected form is referred to as raw data. Prior to analysis raw data needs to be organised into a manageable form. Typically this involves ordering and/or grouping it. To organise the data we consider the range of values that it can take and divide this up into a manageable number of smaller ranges

∈ and ∉ (E and slash E)

∈ denotes element membership, for example x ∈ A means x is an element of set A. *Note it isn't actually an E, it just looks like one. ∉ (slash E) = not a member of.

mutually exclusive

Events that cannot occur at the same time.

Bayes' Theorem

Expansion of conditional probabilities The probability of an event occurring based upon other event probabilities.

In-Degree

If D is a digraph with vertex v then the in-degree of v, denoted in-deg(v), is the number of loops incident to v plus the number of remaining arcs incident to v. (Note that in-deg(v) is obvious from a pictorial representation of the graph: it is the number of arrowed lines pointing towards the vertex.)

Out-Degree

If D is a digraph with vertex v then the out-degree of v, denoted out-deg(v), is the number of loops incident to v plus the number of remaining arcs incident from v.

Vertex degree

If G is a graph with vertex v then the degree of v, denoted deg(v), is twice the number of loops incident to v plus the number of remaining edges incident to v. (Note deg(v) is obvious from a pictorial representation of the graph: it is the number of lines entering v.)

Multiple Edge

If a pair of vertices has more than one edge connecting them then the edges are referred to as multiple edges.

LOOP

If a vertex has an edge from itself to itself then this edge is called a loop. A graph without loops or multiple edges is called simple.

The hand shaking (Di)Lemma

In any digraph the sum of the out-degrees is equal to the sum of the in-degrees is equal to the number of arcs

natural numbers

N: numbers used for counting; positive

mean of binomial dist

n*p, where n = number of trials and p = probability of success

Discrete data

Numerical data values that can be obtained from counting - usually whole numbers.

The Not rule

Pr(not A) = 1 - Pr(A)

Poisson Distribution

Probability distribution for the number of arrivals during each time period

prefix free codes

Rather than struggle to find UD codes directly, we look for prefix free (PF) codes, which are easy to find, since PF implies UD. A prefix free (PF) code requires that no code member be the prefix of another code member.
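
The prefix-free condition is simple to test directly. A minimal sketch, with the function name and both sample codes invented for illustration:

```python
def is_prefix_free(codewords) -> bool:
    """True if no codeword is a prefix of a different codeword."""
    return not any(
        a != b and b.startswith(a)
        for a in codewords
        for b in codewords
    )

print(is_prefix_free(["0", "10", "110", "111"]))  # prefix free
print(is_prefix_free(["0", "01", "11"]))          # "0" is a prefix of "01"
```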

Adjacency Matrix

Records the number of direct links between vertices. The row sum is the degree of the vertex represented by the row (same for the column sum). Because a simple graph has no loops there are always zeros on the main diagonal. Example:
0 1 1 1
1 0 1 1
1 1 0 1
1 1 1 0

Type 1 error

Rejecting the null hypothesis when it is true (false positive)

Hill Cipher

Square key matrix operating with modular arithmetic. The message is broken down into number vectors ( , ), each then multiplied by the key matrix using the mod. To decipher, use the inverse of the key matrix (key matrix^-1, also taken using the mod).
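
A sketch of the 2x2 case, assuming A=0 ... Z=25, arithmetic mod 26, uppercase text of even length, and a key whose determinant has an inverse mod 26. The key [[3, 3], [2, 5]] and the function names are illustrative choices, and `pow(x, -1, m)` needs Python 3.8 or later.

```python
def inverse_key(key, m=26):
    """Inverse of a 2x2 key matrix mod m (fails if the determinant has no inverse)."""
    (a, b), (c, d) = key
    det_inv = pow(a * d - b * c, -1, m)   # modular inverse of the determinant
    return ((det_inv * d % m, -det_inv * b % m),
            (-det_inv * c % m, det_inv * a % m))

def hill(text, key, m=26):
    """Multiply each pair of letters (as a vector) by the key matrix mod m."""
    (a, b), (c, d) = key
    nums = [ord(ch) - ord("A") for ch in text]
    out = []
    for x, y in zip(nums[0::2], nums[1::2]):
        out.append((a * x + b * y) % m)
        out.append((c * x + d * y) % m)
    return "".join(chr(n + ord("A")) for n in out)

key = ((3, 3), (2, 5))                    # hypothetical key matrix
cipher = hill("HELP", key)
print(cipher, hill(cipher, inverse_key(key)))
```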

Little's Law

States a mathematical relationship between throughput rate, flow time, and the amount of work-in-process inventory
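
In queueing notation the law is usually written L = λW. The sketch below checks it against the standard M/M/1 results ρ = λ/μ, L = ρ/(1 - ρ) and W = 1/(μ - λ). The function name and the rates are assumptions for illustration.

```python
def mm1_metrics(arrival_rate: float, service_rate: float):
    """Utilisation rho, mean number in system L, mean time in system W for M/M/1."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: need arrival_rate < service_rate")
    rho = arrival_rate / service_rate
    L = rho / (1 - rho)
    W = 1 / (service_rate - arrival_rate)
    return rho, L, W

# Hypothetical rates: 2 arrivals and 5 services per unit time.
rho, L, W = mm1_metrics(2.0, 5.0)
print(rho, L, 2.0 * W)   # L and lambda * W agree, as Little's Law states
```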

Huffman Coding

The Huffman code can reduce the amount of space required to store a file and it is straightforward to decode since it is PF. Huffman codes are optimal in the sense that no other lossless fixed-to-variable length code has a lower average rate. *learn example* Huffman coding (algorithm). Definition: a minimal variable-length character coding based on the frequency of each character. First, each character becomes a one-node binary tree, with the character as the only node. The character's frequency is the tree's frequency.
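
The card stops after the first step. The sketch below follows the usual continuation (repeatedly merge the two least frequent trees) and should be read as an illustration, not as the course's own worked example. The function name, the tie-breaking counter and the frequencies are assumptions.

```python
import heapq
from itertools import count

def huffman_code(freqs):
    """Map each symbol to a binary codeword by merging the two rarest trees."""
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:                       # a lone symbol still needs one bit
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # least frequent tree
        f2, _, right = heapq.heappop(heap)   # next least frequent tree
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

codes = huffman_code({"a": 5, "b": 2, "c": 1, "d": 1})  # hypothetical frequencies
print(codes)
```

Frequent symbols end up with short codewords and rare ones with long codewords, and the result is prefix free.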

laws of probability

The basis for hypothesis testing and confidence interval estimation.

Edge - List

The collection of all the edges of a graph is called the edge-list and is denoted E(G)

interquartile range

The difference between the upper and lower quartiles.

Edge Connectivity

The edge connectivity of a connected graph is the smallest number of edges that can be removed from the graph and cause it to become disconnected. We denote the edge connectivity of a graph G by λ(G). [Spoken: lambda of G]

alternative hypothesis

The hypothesis that states there is a difference between two or more sets of data. - significant difference of mean

The key distribution problem.

The key distribution problem refers to the problem of distributing keys without a safe channel; and if you have a safe channel why the need to distribute keys. Public keys are available to anyone (and hence easily distributed) but are of no use in decryption: this requires the corresponding private key, which is held by one 'person' only.

significance level

The probability of a Type I error. A benchmark against which the P-value is compared to determine if the null hypothesis will be rejected. See also alpha.

Vertex -Set

The set of vertices of a graph G is called the vertex-set and is denoted V(G)

vertex connectivity

The vertex connectivity of a connected graph (with the exception of complete graphs - why?) is the smallest number of vertices that can be removed from the graph with their incident edges and cause it to become disconnected. We denote the vertex connectivity of the graph G as κ(G). [Spoken: kappa of G]

Polyalphabetic Cipher

The way you scramble the alphabet actually changes throughout the message Example: Vigenère cipher

Minimum distance of a code

The weight of a binary codeword is the sum of the bits in the word. We denote the weight as a function w. For linear codes the metric used is called the Hamming distance. For two codewords x and y taken from a linear (n,k) code, we represent the Hamming distance mathematically by $d(x, y)$. It is calculated by counting the number of positions in which the two codewords differ. The minimum distance of a code is the smallest value of the Hamming distance taken over all possible pairs of distinct codewords. The minimum distance is represented using the symbol δ (delta, the hooked o thing).
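
These three quantities translate directly into code. A sketch with codewords as bit strings; the function names and the small even-weight code are assumptions for illustration.

```python
from itertools import combinations

def weight(word: str) -> int:
    """Sum of the bits in a binary codeword."""
    return word.count("1")

def hamming_distance(x: str, y: str) -> int:
    """Number of positions in which two codewords differ."""
    return sum(a != b for a, b in zip(x, y))

def minimum_distance(code) -> int:
    """Smallest Hamming distance over all pairs of distinct codewords."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))

code = ["000", "011", "101", "110"]   # hypothetical even-weight code
print(weight("011"), hamming_distance("011", "101"), minimum_distance(code))
```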

code properties

There are three properties of codes that are important when considering the application to which the code is to be put: Economy, Reliability and Security. By economy we mean reducing the lengths of communications as much as possible without loss of meaning in the content. Economy is obtained using compression techniques. This is clearly important in an age where gigabytes of data are moving around the internet. By reliability we primarily mean the detection of errors in transmitted messages. This can be done using parity check matrices or hash functions. But reliability can also include the correction of detected errors. By security we mean ensuring that our messages are not meaningful to third parties.

Graph Isomorphism

Two graphs that have the same structure

adjacent

Two vertices u and v are adjacent if and only if there is an edge between them. i.e. (uv) is in the edge list.

union

U a combo

Polyalphabetic

Uses more than one alphabet to defeat frequency analysis: the shift changes through the message. The longer the shift word, the stronger the cipher; the extreme case is the one time pad.

parity code for error correction

We can use two parity bits to identify the location of an error (and since we are using binary) we can then correct it. This procedure is based on blocking the codes

The Kraft-McMillan Number

When we are using variable length strings it is convenient to have a test of when particular assignments of codewords should be rejected. (K)
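
The card does not give the formula. The usual definition for a code over an r-symbol alphabet with codeword lengths l_1, ..., l_n is K = r^(-l_1) + ... + r^(-l_n), and a set of lengths with K > 1 cannot belong to any uniquely decomposable code. A sketch, with the function name and the sample lengths chosen for illustration:

```python
from fractions import Fraction

def kraft_mcmillan_number(lengths, radix: int = 2) -> Fraction:
    """K = sum of radix^(-length) over all codeword lengths (exact arithmetic)."""
    return sum(Fraction(1, radix ** length) for length in lengths)

print(kraft_mcmillan_number([1, 2, 3, 3]))  # K = 1: lengths are acceptable
print(kraft_mcmillan_number([1, 1, 2]))     # K > 1: reject these lengths
```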

set of integers

Z natural extension of natural numbers negatives + positive

Cube

a bipartite graph that has n = 2^k vertices (for some positive integer k) and is regular of degree k. The cube graph regular of degree k is denoted Qk

list

a collection of objects in which repetitions are allowed.

Star

a complete bipartite graph in which there is only one element in one of the subsets and is denoted K1,s

T- Dist

a continuous probability distribution that is unimodal; a useful way to represent a sampling distribution.

Probability Tree

a diagram that can be used to calculate the probabilities of combinations of events resulting from multiple random trials

Jackson Networks

in queueing theory, a discipline within the mathematical theory of probability, a Jackson network is a class of queueing network where the equilibrium distribution is particularly simple to compute as the network has a product-form solution

