CS400 exam one

treap

A binary search tree in which every key is also assigned a random priority. It must maintain two properties: - Keys are in order, as in a typical binary search tree. - Priorities are in heap order, that is, no node has an ancestor of lower priority. O(log N) expected time for all operations, O(N) worst case.

Trie

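A tree keyed by characters, so each path down from the root spells a prefix. A nil leaf marks a completed word/phrase when traversing. Store your current place in the tree so that, as you build words one character at a time, you don't have to search from scratch.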
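A minimal sketch of the idea, not a definitive implementation. The names `TrieSketch`, `Node`, `step`, and `endOfWord` are made up for illustration, and a boolean end-of-word flag stands in for the nil-leaf marker the card describes. The `step` method shows the "store your place" trick: it advances one character from a saved node instead of restarting at the root.

```java
import java.util.HashMap;
import java.util.Map;

class TrieSketch {
    static class Node {
        Map<Character, Node> children = new HashMap<>();
        boolean endOfWord; // assumption: a flag instead of a nil leaf
    }

    final Node root = new Node();

    // Adds a word one character at a time, creating nodes as needed.
    void insert(String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            cur = cur.children.computeIfAbsent(c, k -> new Node());
        }
        cur.endOfWord = true;
    }

    // Advances one character from a saved position; null means no such prefix.
    Node step(Node from, char c) {
        return from == null ? null : from.children.get(c);
    }

    boolean contains(String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            cur = step(cur, c);
            if (cur == null) return false;
        }
        return cur.endOfWord;
    }
}
```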

time complexity of insert/delete/lookup for balanced tree vs. unbalanced tree

O(N) if unbalanced vs. O(log N) if balanced

double hashing. general formula? Why do you want a prime (ish) table size?

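Open addressing that uses a second hash function to compute the step size (ss) for probing. This hash function is completely independent of the first, and a single key will always yield the same step size. Probe sequence: Hk, Hk + ss*1, Hk + ss*2, Hk + ss*3, ... (each mod table size). If the step size shares a factor with the table size (for example, it is a multiple of it), you'll jump to the same spots over and over and may never be able to place the key. With a prime table size, every step size from 1 to TS - 1 eventually visits every slot.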
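The probe sequence can be sketched as follows. This is an illustration under stated assumptions: the secondary hash `1 + key mod (TS - 1)` is one common choice and not necessarily the one used in the course, and the names `DoubleHashProbe`, `stepSize`, `probe`, and `insert` are hypothetical. The table size is assumed to be a prime of at least 3.

```java
class DoubleHashProbe {
    // Assumed secondary hash: always in the range 1..(tableSize - 1), never 0.
    static int stepSize(int key, int tableSize) {
        return 1 + Math.floorMod(key, tableSize - 1);
    }

    // Slot examined on the i-th probe (i = 0 is the home slot).
    static int probe(int key, int i, int tableSize) {
        int home = Math.floorMod(key, tableSize);
        return Math.floorMod(home + i * stepSize(key, tableSize), tableSize);
    }

    // Places key in the first free slot on its probe sequence.
    // Returns the slot used, or -1 if no free slot was found.
    static int insert(Integer[] table, int key) {
        for (int i = 0; i < table.length; i++) {
            int idx = probe(key, i, table.length);
            if (table[idx] == null) {
                table[idx] = key;
                return idx;
            }
        }
        return -1;
    }
}
```

With a table of size 7, keys 10, 17, and 3 all have home slot 3 but get different step sizes (5, 6, and 4), so they spread out instead of piling up next to each other.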

B - tree: what does order m mean? what does branching factor mean in terms of m? what is the min numChildren for nodes? for root? what is the min numKeys and max numKeys for nodes? for root?

order m = branching factor = max numChildren. min children: - leaf = 0 - root = 2 (one on either side of its key) - internal = ceiling(m/2). min keys: - root = 1 - all others = ceiling(m/2) - 1. max keys: all = m - 1

Quadratic Probing

Probes at an offset equal to the square of the probe number: hash, hash + 1, hash + 4, hash + 9, hash + 16, etc. (each mod table size). Causes secondary clustering. Not guaranteed to find an open slot unless the table is at least 1/2 empty (and the table size is prime).
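A small sketch of that probe sequence, assuming a plain boolean array marks which slots are taken. The names `QuadraticProbe`, `probe`, and `findFreeSlot` are hypothetical.

```java
class QuadraticProbe {
    // Slot examined on the i-th probe: home, home + 1, home + 4, home + 9, ...
    static int probe(int home, int i, int tableSize) {
        return (home + i * i) % tableSize;
    }

    // Returns the first free slot on the probe sequence, or -1 if none is found
    // within tableSize probes (which can happen even when free slots exist).
    static int findFreeSlot(boolean[] occupied, int home) {
        for (int i = 0; i < occupied.length; i++) {
            int idx = probe(home, i, occupied.length);
            if (!occupied[idx]) return idx;
        }
        return -1;
    }
}
```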

git diff

Command to compare the files in the staging area with the files in the working directory, i.e. it shows the changes you have not staged yet. Useful for checking for conflicts before committing.

convert 57 to binary

Divide the number by 2. Keep the integer quotient for the next iteration and the remainder as the next binary digit. Repeat until the quotient is equal to 0. The binary number is the list of remainders in reverse (111001).
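The repeated-division steps can be sketched in Java as below. The class and method names are made up for illustration, and the sketch assumes a non-negative input.

```java
class ToBinary {
    // Converts a non-negative int to its binary string by repeated division by 2.
    static String toBinary(int n) {
        if (n == 0) return "0";
        StringBuilder bits = new StringBuilder();
        while (n > 0) {
            bits.append(n % 2); // remainder is the next digit, least significant first
            n /= 2;             // integer quotient feeds the next iteration
        }
        return bits.reverse().toString(); // remainders are read in reverse
    }
}
```

For 57 the remainders come out as 1, 0, 0, 1, 1, 1, which reversed gives 111001.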

weighting

Emphasizing some parts of the key over others by multiplying each part by a different weight, e.g. p1 * 11^1, p2 * 11^2, p3 * 11^3, ... The weighted parts are then added together in folding.

Rehashing (when to do it, how its done, time complexity)

Expanding the table: double the table size, then move to the closest prime number (or just subtract 1 from the doubled number if the nearest prime is far away, which can happen with large table sizes). Rehash each element for the new table size. Done when the load factor goes over a threshold, generally ~0.7. Time complexity = O(N).

Is it better to show all results (passes and fails), just those that pass, or just those that fail?

For automated testing, we really only need to see results for tests that fail. Of course, you must make sure that other tests are being run, maybe use a test counter or some other way to show that a test has been run.

balanced vs. height balanced

HB = for every node, left subtree height - right subtree height is -1, 0 or 1. B = the height of the tree as it grows is bounded by O(log N). You can't tell B from a picture: a snapshot of the tree in time will not tell you if it is true; you must know with certainty how the tree will grow.

Height balanced and/or balanced: RBT, AVL

HB: AVL. B: AVL and RBT. If a tree maintains HB, it will be B.

left rotate algorithm

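Store the current node and parent node (x is passed in as the current node) along with x's right child z. Cut the relevant parent/child ties, then reassign them: z's left subtree becomes x's right subtree, and x becomes z's left child. Return the new root of this subtree (z). Before: x with right child z. After: z with left child x.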
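A minimal sketch of a left rotation, assuming x is the node being rotated around and z is its right child, as the card's text describes. The `Node` class and the name `rotateLeft` are hypothetical, and updating the parent's link is left to the caller, which reattaches the returned node.

```java
class RotateSketch {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Rotates left around x and returns z, the new root of this subtree.
    // Assumes x.right is not null.
    static Node rotateLeft(Node x) {
        Node z = x.right;  // z moves up
        x.right = z.left;  // z's old left subtree becomes x's right subtree
        z.left = x;        // x moves down to become z's left child
        return z;
    }
}
```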

Unit Testing

Tests individual units or pieces of code in a system, e.g. a method or a piece of functionality within that method. Here: unit = data structure, unit test = one of many tests of its functionality.

load factor

The fraction of the table's capacity that is filled (number of items / table size). ~0.7 is the general limit.

what will java return if you call hashCode() on an int?

the int itself

git checkout -- <filename>

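Unmodifies a modified file: discards the unstaged changes in the working directory and restores the last staged version, which is the last HEAD commit if nothing was staged.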

end to end test

used to test whether the flow of an application right from start to finish is behaving as expected (problem: hard to execute all possible code paths)

trivial hash function (use case)

using the key itself as the hash code, if data is discrete and spaced out over a reasonable range (ie ints under 100)

full hash table

when LF > LF threshold

does hashcode need to result in an int

yes, that's the whole point, you need it to give you an index

add a type parameter (do both generic and specific) to the declaration of a new list

List x = new LinkedList(); - raw type, can contain any objects. List<Integer> x = new LinkedList<Integer>(); - contains only Integers, which makes it easier for the compiler to detect errors. List<K> x = new LinkedList<K>(); - generic, where K is a type parameter of the enclosing class or method.

primary clustering

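Long runs of occupied slots that build up under linear probing: any key that hashes anywhere into a run gets placed at its end, so the run keeps growing. (Many elements hashing to the same location and then following the same probe sequence is secondary clustering.)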

number of expected collisions (trying to place a key where one already exists) where N is number of keys and M is number of indexes to place them at

N(N-1)/(2M)
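As a quick worked example of the formula, assuming keys are spread uniformly at random over the indexes (the class and method names are made up for illustration): 10 keys in 100 slots gives 10 * 9 / 200 = 0.45 expected collisions.

```java
class CollisionEstimate {
    // Expected number of colliding pairs for n keys placed uniformly into m slots.
    static double expectedCollisions(int n, int m) {
        return (double) n * (n - 1) / (2.0 * m);
    }
}
```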

complexity of BST print, lookup, insert, delete

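O(H) for lookup, insert, and delete, where H is the height of the tree; we want H to be O(log N). Print visits every node, so it is O(N).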

complexity of lookup, insert, and delete in a B tree

O(logN) with a base of b where b is the branching factor and N is the # of nodes

complexity of insert and delete in a red black tree

O(log2 N)

cp -r

Recursively copy directories

linear probing, general implementation

Step size is 1. Find the index, and keep incrementing by one until you find a free space. - tends toward primary clustering, but will always find a spot

Black box test

Tester has no prior knowledge of the implementation and compares expected results with actual results through the interface only. Drawback: it is hard to know where problems originate, so many unit tests are needed. Benefits: anyone can write the tests; they just need to know the interface.

BST delete and time complexity

delete(key):
    node = lookup(key)
    if (isLeaf) >> just delete
    if (hasOneChild) >> replace with child
    if (hasTwoKids) >> replace with in-order successor, then delete in-order successor
worst case O(N)

techniques for generating a hashcode

extraction (break the key up into parts), weighting (weigh some parts to be more important), folding (combine the weighted values back into an int)

cp filename filename cp -r Src_file1 Src_file2 Src_file3 Dest_directory cp -r directory1 directory2

filename filename: copies the first file's contents into the second, overwriting it; if the second one doesn't exist, creates it. Multiple source files then directory: copies all files to the directory; the command must end with a directory name if multiple files are to be copied. directory directory: if the second doesn't exist, creates it and copies the first into it; if the second does exist, the first becomes a subdirectory of the second.

repository (repo)

The files, tracking data, configurations, etc. of a project that is being tracked

Stages of Team Development

forming (getting to know each other, polite, strong leader needed); storming (conflict); teams may cycle between these; norming (resolve, bond); performing (hard work, no friction)

checking out

get an earlier version of files from your repo into your local working directory (git checkout abdf)

how to propose changes in git

git add <filename>

create a working copy of a local repository

git clone /path/to/repository

commit changes to the head

git commit -m "Commit message"

how do you "commit" ? what does this mean, what step in the process is it?

git commit -m "Commit message" this DOES NOT put it in your remote repo, but it does commit it to the HEAD

create a new repository

git init

to study repository history

git log

send changes from HEAD to your remote repository

git push origin master, where "master" is whatever branch you're pushing to

displays the state of the working directory and the staging area

git status

two steps to go from key to hash index

hashCode() generates a number, and that number is modded by the table size (TS) to get the index: hash_index = hash_code % TS
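The two steps can be sketched in Java as below. One detail the card does not mention: `hashCode()` can be negative, and Java's `%` would then give a negative index, so this sketch uses `Math.floorMod`. The class and method names are hypothetical.

```java
class HashIndex {
    // Step 1: hashCode() produces an int. Step 2: mod by table size to get the index.
    static int indexFor(Object key, int tableSize) {
        return Math.floorMod(key.hashCode(), tableSize); // never negative
    }
}
```

For example, the Integer 57 hashes to 57, which lands at index 2 in a table of size 11.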

hashtable delete(k key)

hashtable[hash(key)] = null; (assuming no collisions)

hashtable insert(k key, d data)

hashtable[hash(key)] = data; (assuming no collisions)

points to the last commit you made (current node reference)

head

calculate balance factor code. what is the balance factor of a node (pos neg)

height of left subtree - height of right subtree. Positive means left-heavy, negative means right-heavy.

height of subtree code

int maxDepth(Node node) {
    if (node == null) return 0;
    // compute the depth of each subtree
    int lDepth = maxDepth(node.left);
    int rDepth = maxDepth(node.right);
    // use the larger one
    return Math.max(lDepth, rDepth) + 1;
}
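A minimal sketch tying subtree height to the balance factor (left height minus right height) and the height-balanced check. The `Node` class and the method names are hypothetical, and recomputing heights on every call is chosen for clarity, not speed; an AVL tree would normally cache the height in each node.

```java
class BalanceSketch {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Height of an empty subtree is 0; a single node has height 1.
    static int height(Node n) {
        if (n == null) return 0;
        return 1 + Math.max(height(n.left), height(n.right));
    }

    // Positive: left-heavy. Negative: right-heavy.
    static int balanceFactor(Node n) {
        return height(n.left) - height(n.right);
    }

    // Height balanced: every node has a balance factor of -1, 0, or 1.
    static boolean isHeightBalanced(Node n) {
        if (n == null) return true;
        return Math.abs(balanceFactor(n)) <= 1
                && isHeightBalanced(n.left)
                && isHeightBalanced(n.right);
    }
}
```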

branching factor

in a search tree, the number of children of a given node. Often, the branching factors of individual nodes will vary, so an average value may be used. To guarantee a branching factor of 2 to 4, each internal node must store 1 to 3 keys.

where and why does .hashCode() use an XOR

When dealing with doubles (and longs). The 64 bits of the value are split into two 32-bit halves, the upper half is shifted down on top of the lower half, and the two halves are XORed to produce the int hash code. Why XOR: if you fill out an XOR truth table, the result is true 50% of the time, whereas OR is true 3/4 of the time and AND only 1/4, so XOR keeps the output bits evenly distributed.
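The fold-and-XOR step can be written out by hand. This sketch follows the formula documented for `Double.hashCode`; the class and method names are made up for illustration.

```java
class DoubleHashDemo {
    // Folds the 64-bit representation of a double into 32 bits with XOR.
    static int hashOfDouble(double value) {
        long bits = Double.doubleToLongBits(value); // the raw 64 bits
        return (int) (bits ^ (bits >>> 32));        // upper half XOR lower half
    }
}
```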

BST Insert and time complexity

insert(parent, node, key):
    if (node == null) {
        if (key < parent.key) parent.left = new Node(key);
        else parent.right = new Node(key);
        return;
    }
    if (key < node.key) return insert(node, node.left, key);
    if (key > node.key) return insert(node, node.right, key);
best O(1), worst O(N)

how to compile and run in linux with the arguments "10 1 2"

javac *.java
java MyProgram 10 1 2

redirecting output to a file after compiling

javac *.java
java TestPQ
java TestPQ PQ01 PQ02 PQ03 MyPQ > results.txt

List<String> ls = new ArrayList<String>(); // 1
List<Object> lo = ls; // 2
lo.add(new Object()); // 3
String s = ls.get(0); // 4
Why won't this work?

Line 2 does not compile: a List<String> is not a List<Object>. If it were allowed, line 3 would put a plain Object into a list of Strings, and line 4 would then attempt to assign that Object to a String. In general, if Foo is a subtype (subclass or subinterface) of Bar, and G is some generic type declaration, it is not the case that G<Foo> is a subtype of G<Bar>. Instead, the supertype of all kinds of Collection<T> is Collection<?>.
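A short sketch of the wildcard in use: a method that takes `Collection<?>` accepts a `List<String>`, a `List<Integer>`, and so on, but can only read the elements as `Object`. The class and method names are hypothetical.

```java
import java.util.Collection;

class WildcardDemo {
    // Accepts a collection of any element type; elements are read as Object.
    static int countNonNull(Collection<?> items) {
        int count = 0;
        for (Object item : items) {
            if (item != null) count++;
        }
        return count;
    }
}
```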

ls

list files in current directory

list files in current directory - command line

ls

the default branch when you create a repository

master

B+ Tree

1) Maintain a copy of all keys in the leaves of the tree. 2) Create a linked-list out of the leaf nodes of the tree. 3) all data actually stored in leaf nodes. internal nodes simply act as a road map to get near to desired value 4) advantageous for range queries 5) insertion and deletion behaves similarly to b-tree

in order traversal method and what it does

// print in ascending order!
inOrder(Node n)
    if (n != null)
        inOrder(n.left)
        print(n)
        inOrder(n.right)

pre order traversal method and what it does

// print self, then the immediate left child, all the way down, then the right children on the way back up
preOrder(Node n)
    if (n != null)
        print(n)
        preOrder(n.left)
        preOrder(n.right)

perfect hashing (use case/when is it possible)

-zero collisions -best when there are few inserts and deletes, i.e. static data like a dictionary -constant search time, O(1) even in the worst case, because the HF returns the correct HI every time

post order traversal recursive method and what it will print

// far left leaf node, its sibling, then its parent. It then repeats this pattern on its parent's sibling, always visiting the root of each subtree last
void printPostorder(Node node) {
    if (node == null) return;
    printPostorder(node.left);
    printPostorder(node.right);
    print(node);
}
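The three traversals differ only in where the node itself is visited. A runnable sketch, assuming a hypothetical `Node` class with int keys and collecting keys into a list instead of printing so the order is easy to check:

```java
import java.util.List;

class TraversalSketch {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static void inOrder(Node n, List<Integer> out) {
        if (n == null) return;
        inOrder(n.left, out);
        out.add(n.key);          // visit between the subtrees: ascending order in a BST
        inOrder(n.right, out);
    }

    static void preOrder(Node n, List<Integer> out) {
        if (n == null) return;
        out.add(n.key);          // visit before the subtrees
        preOrder(n.left, out);
        preOrder(n.right, out);
    }

    static void postOrder(Node n, List<Integer> out) {
        if (n == null) return;
        postOrder(n.left, out);
        postOrder(n.right, out);
        out.add(n.key);          // visit after the subtrees: root comes last
    }
}
```

For a tree with root 2, left child 1, and right child 3, the orders are 1 2 3 (in), 2 1 3 (pre), and 1 3 2 (post).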

".." vs "."

"cd ../folder/folder/folder" lets you navigate there without knowing the whole path. "cd .." goes up one level in the directory tree. Basically, .. represents the parent directory and . represents the cwd.

Version

(aka revision, or COMMIT) - name (or number) for a given copy

local repository

- Stored on local computer.

properties of a good hash function

- deterministic (if you put the key in the HF, it returns the same thing every time ie not dependent on date/time/random) - SHOULD achieve relatively uniform distribution (clusters lead to worse clusters) - SHOULD minimize collisions (mostly mapping uniquely) - ALSO a problem if all values seem to be entered equidistant (clustering) - SHOULD be fast and easy for a COMPUTER to compute

max/min heap insert and delete

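-binary trees, with no ordering between left and right children -min: child always greater than parent, min at top -max: child always less than parent, max at top -insert: always add at the bottom level, from left to right. Say it's a max heap: if you add a key that is greater than its parent, swap it with the parent, all the way up until it's not anymore -delete (remove the top): move the last node on the bottom level to the root, then swap it down with its larger child (max heap) or smaller child (min heap) until heap order is restored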
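A compact array-backed max heap sketch. The card only spells out insert; the `removeMax` method here is the standard remove-the-root procedure, included as an assumption about what "delete" means on this card. All names are hypothetical, and `removeMax` assumes the heap is not empty.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class MaxHeapSketch {
    private final List<Integer> items = new ArrayList<>();

    void insert(int key) {
        items.add(key);                       // next open spot on the bottom level
        int child = items.size() - 1;
        while (child > 0) {
            int parent = (child - 1) / 2;
            if (items.get(child) <= items.get(parent)) break;
            Collections.swap(items, child, parent); // swap up past a smaller parent
            child = parent;
        }
    }

    int removeMax() {
        int max = items.get(0);
        int last = items.remove(items.size() - 1); // take the last bottom-level node
        if (!items.isEmpty()) {
            items.set(0, last);                    // move it to the root
            siftDown(0);
        }
        return max;
    }

    private void siftDown(int parent) {
        int size = items.size();
        while (true) {
            int left = 2 * parent + 1, right = left + 1, largest = parent;
            if (left < size && items.get(left) > items.get(largest)) largest = left;
            if (right < size && items.get(right) > items.get(largest)) largest = right;
            if (largest == parent) return;
            Collections.swap(items, parent, largest); // swap down with the larger child
            parent = largest;
        }
    }
}
```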

cp vs. scp vs. pscp

-cp: used to copy a file, files, or directories to a new location on disk, optionally with a different name. Generally it's "cp source destination". -scp: secure copy over SSH, available on windows and linux; copies to a remote directory without first opening an interactive connection: "scp sourcefile remotedestination". It can also copy within one local machine, where it basically functions the same as cp. -pscp: putty's scp command; allows windows users to do the above without launching putty.

RBT insert

1. If empty, new root set black
2. Create node, color red
    a. If parent is black, done
    b. If parent is red, check uncle color
        i. If uncle is black/null
            1. Straight line (left left or right right): rotate to fix the straight line, switch colors of parent and gpa
            2. Triangle (left right or right left): do the corresponding rotations and recolor like normal, swap parent and gpa
        ii. If uncle is red
            1. Push black down from grandparent: gpa is red, uncle and parent black
            2. If grandparent is the root node, make it black and exit
            3. Go up to gpa and great gpa and so on, making fixes if needed
3. Always move back up the line and check these properties as you back up

B Tree Delete

1. leaf node
    - steal from sibling
    - merge with sibling and parent if none can be spared (may move all the way up to the root)
2. interior
    - steal in-order successor or in-order predecessor
    - if not possible, delete and combine children
    - if ever something is not possible, combine with sibling and parent
Order to try: 1. steal in-order successor or predecessor (if children) 2. delete, combine children 3. steal from sibling 4. combine with sibling and parent

three steps to rehashing

1. Create a new table: double the table size and move to the "nearest" prime; that may be far away, so tablesize * 2 - 1 is probably good enough. 2. Rehash all keys into the new table. 3. Reassign the hash table pointer to this new table.
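The three steps above can be sketched as follows. Assumptions are labeled: the table holds Integer keys only, collisions are resolved with linear probing, and the new size is the first prime at or above double the old size. The names `RehashSketch`, `nextPrime`, `put`, and `rehash` are hypothetical.

```java
class RehashSketch {
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // First prime at or above n.
    static int nextPrime(int n) {
        while (!isPrime(n)) n++;
        return n;
    }

    // Linear probing insert (assumes the table has at least one free slot).
    static void put(Integer[] table, int key) {
        int idx = Math.floorMod(key, table.length);
        while (table[idx] != null) idx = (idx + 1) % table.length;
        table[idx] = key;
    }

    // Steps 1 and 2: build the bigger table and rehash every key into it.
    // Step 3 is the caller reassigning its table reference to the result.
    static Integer[] rehash(Integer[] old) {
        Integer[] bigger = new Integer[nextPrime(old.length * 2)];
        for (Integer key : old) {
            if (key != null) put(bigger, key); // index is recomputed with the new size
        }
        return bigger;
    }
}
```

Keys 10, 17, and 3 all collide at index 3 in a table of size 7, but after rehashing into size 17 they sit at indexes 10, 0, and 3.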

what are the three cases of BST delete when you've found the key to delete

1. no children, just delete 2. one child, replace with child 3. two children, replace with in order successor and delete in order successor
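A recursive sketch of the three cases, assuming a hypothetical `Node` class with int keys. Returning the (possibly new) subtree root from each call lets the parent relink without tracking parent pointers; the no-children and one-child cases collapse into the same two lines.

```java
class BstDeleteSketch {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node insert(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        return n;
    }

    // Deletes key from the subtree rooted at n and returns the new subtree root.
    static Node delete(Node n, int key) {
        if (n == null) return null;               // key not found
        if (key < n.key) {
            n.left = delete(n.left, key);
        } else if (key > n.key) {
            n.right = delete(n.right, key);
        } else {
            if (n.left == null) return n.right;   // cases 1 and 2: no child or right child only
            if (n.right == null) return n.left;   // case 2: left child only
            Node succ = n.right;                  // case 3: find the in-order successor
            while (succ.left != null) succ = succ.left;
            n.key = succ.key;                     // replace with the successor's key
            n.right = delete(n.right, succ.key);  // then delete the successor
        }
        return n;
    }
}
```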

AVL insert algorithm

1. recursive BST insert 2. check balance factor, do rotation ( ll, rl, lr, OR rr ) 3. exit method (all ancestors will be recursively checked because of recursive BST insert)

convert binary 111001 to decimal

1⋅2^5 + 1⋅2^4 + 1⋅2^3 + 0⋅2^2 + 0⋅2^1 + 1⋅2^0 = 57

when weighing characters in string for HF, what is the suggested number to weigh them by?

31, i.e. C1 * p^1 + C2 * p^2 + C3 * p^3 . . . .

Integration Testing

After unit testing, integration testing is done to see that the modules communicate the necessary data between and among themselves and that all modules work together smoothly.

generic list that can include any objects of the subtype comparable

LinkedList<K extends Comparable<K>> (written in the class declaration, e.g. class LinkedList<K extends Comparable<K>>)

three trees of your local repository

Working Directory which holds the actual files. Index which acts as a staging area HEAD which points to the last commit you've made.

ASCII code

a code that defines how keyboard characters are encoded into digital strings of ones and zeros - important for exam: given A you should be able to say what X is (add number of letters between, goes up in order)

code coverage

a measure of how many parts of a program have been tested. hard to create tests and run the program so that all paths execute

complete (empty tree, one node)

All levels are full except possibly the last level, and the nodes on the last level are pushed as far to the left as possible. An empty tree and a one-node tree both count as complete.

avl vs redblack advantages

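AVL: stricter balance, so potentially many rotations (a delete may rotate at several levels on the way up). Red-black: looser balance, an insert needs a maximum of two rotations. AVL is better for lookup; RB is better for many inserts and deletes.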

left bias/right bias

B-tree: when the order is odd, each full node has an even number of keys, so there is no single middle key. The bias decides whether you promote the left-of-middle or right-of-middle key when splitting.

what do you do if an insert to RBT gets you a double red and the uncle is red? if the uncle is black?

black - do an AVL-style rotation and recolor parent and grandparent, then move up the tree and check for broken rules. red - push blackness down from the gpa to its children (parent and uncle) and make the gpa red, then move up the tree and check for violations (if the gpa is the root node, it must be turned black).

extraction

break up number into pieces which will then be added together via folding

BST - search() and best/worst case time complexity

lookup(node, key) {
    if (node == null) return null; // key not found
    if (key == node.key) return node;
    if (key < node.key) return lookup(node.left, key);
    return lookup(node.right, key);
}
best = O(1), worst = O(N) (straight line)

cat

Can be used for a lot of things: -cat filename shows the contents of the file -cat filename filename shows the contents of both -cat > newFile creates the file newFile -options can also be added to specify how copying and showing contents is done

rm

Can delete any file or directory. On a directory it won't do anything unless you include -d (delete empty directory) or -r (delete directory and all contents).

cd

change directory. "cd ../directory/directory"

> and >> (git)

command > file, where command = ls -al, cat, tree, etc. > writes the command's output to the file, overwriting what is there. >> appends the command's output to the file.

javac

Compiles source files, which is NECESSARY before running them. Either "javac *.java" for everything in the directory, or list the files one by one after javac.

Suggested method for hashing String keys, reasoning

Convert each character of the string to its ASCII code Ci, then compute C0 * p^i + C1 * p^(i-1) + C2 * p^(i-2) + ... (where i is the index of the last character) and mod by the table size, which should be prime or prime-ish. Letters must be weighted differently by position, because otherwise CAT would get the same hash as ACT. Generally p = 31 if you're considering only lowercase letters, otherwise the next prime, 53, for upper and lower.
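A sketch of the weighted sum using Horner's rule, which evaluates the same polynomial without computing powers explicitly. With p = 31 this matches the formula documented for Java's own `String.hashCode()`. The class and method names are hypothetical.

```java
class StringHashSketch {
    // Computes C0*p^(n-1) + C1*p^(n-2) + ... + C(n-1) for the n characters of s.
    static int hash(String s, int p) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = p * h + s.charAt(i); // int overflow simply wraps around
        }
        return h;
    }

    // Mod by the table size; floorMod keeps the index non-negative after overflow.
    static int index(String s, int tableSize) {
        return Math.floorMod(hash(s, 31), tableSize);
    }
}
```

Because position matters, "CAT" hashes to 66486 while "ACT" hashes to 64626.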

check out a repo, what's it mean, how to do it locally and remotely? (github)

Create a working copy of a repo. Local: git clone /path/to/repository. Remote: git clone username@host:/path/to/repository

pwd

print working directory, in form of /src/src/fold/dir/

what does "adding to the index" mean? how do you do it? what step in the process is it?

Proposing changes; it is the first step after you edit. git add <filename> or git add *

update your local repository to the most recent commit

git pull

checking in

put new and changed local working directory files into repo (git commit)

AVL Delete

recursive BST delete, followed by checking node for balance and doing relevant rotation (ll rr lr OR rl)

max height of red black tree

A red black tree with N nodes has a max height of 2 * log2(N + 1)

rmdir

Removes a directory only if it is empty and gives an error if it is not. You must use rm -r on a non-empty directory if you still want to delete it.

remote repository

repository on a different computer or network

white box testing

Requires knowledge of the implementation, and access to all fields of the program that are being tested

complexity of rehashing and resizing

resize - O(1), rehash - O(N)

naive expand

resizing and putting elements at their same index. -creates clusters -rehashing leads to diff indexes (% table size)

hashtable lookup(k key)

return hashtable[hash(key)];

commit

saves and names changes as a new "version" in the repository

bit shift in hashing (right vs left) what does it do to a number if you shift left 4 times? why represent numbers this way in hash code?

shift right - slide one place to the right, i.e. divide by 2 and drop the remainder (1101 -> 110). shift left - slide one place to the left, doubling the number (1101 -> 11010). shift left 4 times = n * 2^4 = n * 16. These are easy operations for a computer to do quickly. They can make numbers very large, which are then modded by the table size.
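The same shifts in Java, as a quick check of the arithmetic. The class and method names are made up for illustration, and the unsigned shift `>>>` is used so the examples hold for non-negative values.

```java
class ShiftDemo {
    // Each left shift doubles the value (as long as it does not overflow).
    static int shiftLeft(int n, int places) {
        return n << places;
    }

    // Each right shift halves the value and drops the remainder.
    static int shiftRight(int n, int places) {
        return n >>> places;
    }
}
```

Starting from 1101 (13): one shift left gives 11010 (26), one shift right gives 110 (6), and four shifts left give 13 * 16 = 208.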

