Graphing Algorithms Final Exam

graph G { a -- b; b -- c; b -- d; } What does this graph look like?

        c
       /
a --- b
       \
        d

Steps for maximum bipartite matching using max-flow

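1. Add a source s and a sink t; connect s to every node in set A and every node in set B to t.
2. Give every edge a capacity of 1.
3. Add residual (backward) edges.
4. Repeatedly find augmenting paths from s to t; the saturated A-to-B edges of the resulting max flow form the maximum matching.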
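The steps above can be sketched roughly as follows. This is a minimal illustration, not a tuned implementation: the function name `max_bipartite_matching`, the sentinel nodes `S` and `T`, and the use of a plain DFS to find augmenting paths are all assumptions made for the example. It also assumes that left and right node names are distinct.

```python
from collections import defaultdict

def max_bipartite_matching(left, right, pairs):
    """Return a maximum matching as a dict {left_node: right_node}."""
    S, T = ("source",), ("sink",)          # hypothetical sentinel nodes
    cap = defaultdict(int)                 # residual capacity per directed edge
    adj = defaultdict(set)

    def add_edge(u, v):
        cap[(u, v)] = 1                    # unit capacity on every edge
        adj[u].add(v)
        adj[v].add(u)                      # residual edge v -> u starts at capacity 0

    for a in left:
        add_edge(S, a)
    for b in right:
        add_edge(b, T)
    for a, b in pairs:
        add_edge(a, b)

    def augment(u, seen):
        # DFS for one augmenting path; push one unit of flow along it.
        if u == T:
            return True
        seen.add(u)
        for v in adj[u]:
            if v not in seen and cap[(u, v)] > 0 and augment(v, seen):
                cap[(u, v)] -= 1
                cap[(v, u)] += 1
                return True
        return False

    while augment(S, set()):
        pass

    # A left-right edge carries flow exactly when its residual capacity is 0.
    return {a: b for a, b in pairs if cap[(a, b)] == 0}

print(max_bipartite_matching(["a1", "a2"], ["b1", "b2"],
                             [("a1", "b1"), ("a2", "b1"), ("a2", "b2")]))
```

In the usage line, a1 can only be matched to b1, so any maximum matching has to pair a2 with b2 even if the search first tries a2 with b1; the residual edge is what lets the algorithm undo that choice.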

Louvain Steps

1. Assign each node to its own cluster.
2. For each node, calculate the change in modularity from moving it into each neighboring community.
3. If the best change is positive, move the node into that community.
4. Repeat until a full pass over all nodes in the graph makes no change to the clustering.
5. Create a new graph in which all nodes of each community become a single node, keeping multi-edges between communities (a multigraph).
6. Repeat from step 2 until there is no change.

KNN Search Steps

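1. Prune: if the heap Q already holds k points (k is the number of neighbors being looked for) and no point in this node's ball can be closer than Q.peek(), return without searching it.
2. Check whether the node is a leaf.
3. If it is, loop through its data and add the data points that are closer than Q.peek() (or any point while Q holds fewer than k), popping the furthest point when Q exceeds k.
4. If it is not a leaf node, call KNN_search on both the left and right child.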

Lloyd's Algorithm Steps

1. Choose K cluster centers randomly.
2. Assign each data point to the cluster center it is closest to.
3. Update each cluster center to the mean location of the data points assigned to it.
4. Repeat steps 2 and 3 until the assignments stop changing.
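A small pure-Python sketch of these steps is shown below. The function name `lloyd`, the `iters` cap, the fixed `seed`, and the choice to leave an empty cluster's center where it was are assumptions made for illustration.

```python
import random

def lloyd(points, k, iters=100, seed=0):
    """Lloyd's algorithm on a list of equal-length tuples. Returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)                  # random initial centers
    labels = None
    for _ in range(iters):
        # Assign each point to its closest center (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:                     # converged: no assignment changed
            break
        labels = new_labels
        # Move each center to the mean of its assigned points.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:                              # keep the old center if a cluster is empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(lloyd(pts, 2))
```

On this toy data the two groups are far enough apart that the same split is recovered from any starting pair of centers; on harder data the result depends on the initialization.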

How to find minimum cut using max flow?

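1. Find the max flow.
2. Traverse in a breadth-first fashion from s, following only unsaturated (residual) edges; when we encounter a saturated edge we do not continue across it. The saturated edges that lead from a reached node to an unreached node form the cut.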
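The two steps can be sketched as below, under the assumption that the graph is given as a dict mapping `(u, v)` to a capacity. The name `min_cut` and this input format are illustrative choices; the max-flow part uses BFS augmenting paths (the Edmonds-Karp variant).

```python
from collections import defaultdict, deque

def min_cut(edges, s, t):
    """edges: {(u, v): capacity}. Returns (max_flow_value, list of cut edges)."""
    res = defaultdict(int)                 # residual capacity = capacity - flow
    adj = defaultdict(set)
    for (u, v), c in edges.items():
        res[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)                      # backward residual edge

    def bfs_parents():
        # BFS from s along edges that still have residual capacity.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and res[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:                            # step 1: find the max flow
        parent = bfs_parents()
        if t not in parent:
            break
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[e] for e in path)
        for u, v in path:
            res[(u, v)] -= bottleneck
            res[(v, u)] += bottleneck
        flow += bottleneck

    reachable = set(parent)                # step 2: nodes still reachable from s
    cut = [(u, v) for (u, v) in edges if u in reachable and v not in reachable]
    return flow, cut

caps = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
print(min_cut(caps, "s", "t"))
```

The capacities of the returned cut edges add up to the max-flow value, which is the max-flow min-cut theorem in action.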

Kmeans++

1. The first cluster center is chosen uniformly at random from the data points.
2. Each cluster center after that is chosen at random among the data points, weighted by the square of each point's distance to its closest already-chosen cluster center.
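A minimal sketch of this seeding rule, assuming points are tuples and using the standard library's `random.choices` for the weighted draw. The function name `kmeans_pp_init` and the fixed `seed` are illustrative.

```python
import random

def kmeans_pp_init(points, k, seed=0):
    """Pick k initial centers from `points` (tuples) using k-means++ seeding."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]                   # first center: uniform random
    while len(centers) < k:
        # Squared distance from each point to its closest already-chosen center.
        d2 = [min(sum((x - y) ** 2 for x, y in zip(p, c)) for c in centers)
              for p in points]
        # Next center: sampled with probability proportional to d2.
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(kmeans_pp_init(pts, 3))
```

A point that is already a center has weight 0 and is never drawn again, so the sketch assumes there are at least k distinct points; otherwise all weights become zero and `random.choices` raises an error.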

Ball Tree steps

1. If the data has length 1, it is a leaf node: create a new ball tree with the single point as its center.
2. Otherwise, create a new ball tree node, set its center and radius based on the data, split the data in two, and recursively create the left and right ball trees.

Spectral Clustering

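1. Build the graph Laplacian L = D - A, where A is the adjacency matrix and D is the diagonal matrix with the row sums of A (the node degrees) as its values.
2. The number of eigenvalues of L equal to 0 matches the number of connected components, so with k components the lowest k eigenvalues are 0.
3. In practice, the eigenvectors of the k lowest eigenvalues are used as coordinates for the nodes, which are then clustered (for example with k-means).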
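The link between zero eigenvalues and connected components can be checked by hand without an eigensolver: the indicator vector of each component is mapped to the zero vector by L. The sketch below uses plain lists; the helper names `laplacian` and `mat_vec` are made up for the example.

```python
def laplacian(adj_matrix):
    """Return L = D - A for a symmetric 0/1 adjacency matrix (list of lists)."""
    n = len(adj_matrix)
    return [
        [(sum(adj_matrix[i]) if i == j else 0) - adj_matrix[i][j] for j in range(n)]
        for i in range(n)
    ]

def mat_vec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

# Two components: a triangle {0, 1, 2} and a single edge {3, 4}.
A = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
L = laplacian(A)

# Each component's indicator vector is an eigenvector with eigenvalue 0,
# so two components give two independent zero eigenvectors.
triangle = [1, 1, 1, 0, 0]
edge = [0, 0, 0, 1, 1]
print(mat_vec(L, triangle), mat_vec(L, edge))
```

A vector that is not constant on a component, such as `[1, 0, 0, 0, 0]`, is not mapped to zero.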

Color optimization: Local Search

1. Start with some coloring.
2. Make a local move whenever it improves the solution.
3. Use an indirect objective function to guide local moves toward deleting a color.

Graph coloring Greedy Steps

1. Take the nodes in some order.
2. For each node, assign it the smallest color that is not currently used by any of its neighbors.
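As a rough illustration of the greedy steps, assuming the graph is an adjacency dict and colors are integers starting at 0 (the name `greedy_coloring` and the optional `order` argument are illustrative):

```python
def greedy_coloring(adj, order=None):
    """adj: {node: iterable of neighbors}. Returns {node: color index from 0}."""
    color = {}
    for node in (order or list(adj)):
        used = {color[n] for n in adj[node] if n in color}
        c = 0
        while c in used:                   # smallest color not used by a neighbor
            c += 1
        color[node] = c
    return color

# A 5-cycle needs 3 colors.
cycle = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
print(greedy_coloring(cycle))   # {0: 0, 1: 1, 2: 0, 3: 1, 4: 2}
```

The result is always a valid coloring, but the number of colors used depends on the node order, which is why heuristics such as DSatur choose the order more carefully.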

A face is bounded by

At least 3 edges (in a simple planar graph with at least 3 vertices)

All planar graphs are at most how-many colorable?

4 (the four color theorem)

hill-climbing strategy

A commonly used strategy in problem solving. If people use this strategy, then whenever their efforts toward solving a problem give them a choice, they will choose the option that carries them closer to the goal.

What is Rand Index used for?

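A metric that tells how similar or how different two clusterings are, computed over all pairs of data points: RI = (a + b) / (a + b + c + d), where
a: pairs in the same cluster in both clusterings (same, same)
b: pairs in different clusters in both clusterings (diff, diff)
c: pairs in the same cluster in the first but different clusters in the second (same, diff)
d: pairs in different clusters in the first but the same cluster in the second (diff, same)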
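A short sketch of the pair-counting definition, assuming each clustering is given as a list of labels indexed by data point (the function name `rand_index` is illustrative):

```python
from itertools import combinations

def rand_index(labels_x, labels_y):
    agree = total = 0
    for i, j in combinations(range(len(labels_x)), 2):
        same_x = labels_x[i] == labels_x[j]
        same_y = labels_y[i] == labels_y[j]
        agree += same_x == same_y          # counts a (same, same) and b (diff, diff)
        total += 1
    return agree / total

print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))   # 1.0: same grouping, labels renamed
print(rand_index([0, 0, 1, 1], [0, 1, 0, 1]))   # 2 of 6 pairs agree
```

The first call shows that only the grouping matters, not the label names.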

Edmonds-Karp Algorithm

A variation of the Ford-Fulkerson algorithm that uses BFS instead of DFS to find augmenting paths, which ensures strongly polynomial time: O(VE^2)

What is Graph Coloring?

Assigning a color to each node such that no two adjacent nodes have the same color. Example: Sudoku can be modeled as a graph coloring problem.

Clustree

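Creates a tree/graph showing how data points move between clusters as the number of clusters increases, to help visualize the best number of clusters. It is a bad sign when a new cluster is formed from points of two different clusters.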

silhouette score

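Measures how well each data point fits its cluster: each data point should be close to the other points in the same cluster and far from points in different clusters. For a point with mean distance a to its own cluster and mean distance b to the nearest other cluster, s = (b - a) / max(a, b), which ranges from -1 (poor) to 1 (good).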

Face

Each region that a planar drawing of the graph subdivides the plane into, including the outer unbounded region

kmeans

Find k cluster centers in multi-dimensional space such that the sum of the squared distances from data points to their closest cluster center is minimized. The problem is NP-hard.

What does karger's do?

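Finds a global min-cut: a smallest set of edges whose removal increases the number of connected components. It is a randomized algorithm, so a single run only finds the min-cut with some probability.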

Ball trees

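Each node's ball (center and radius) gives a quick test for whether that region of space has to be searched or not: if the ball is farther from the query than the current k-th nearest neighbor, the whole subtree can be skipped.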

What does Louvain method do?

Graph clustering, also known as community detection

Planar Graph

Graph that can be drawn in the plane without crossing edges

Which approach does the louvain method use?

Greedy agglomerative clustering approach in which every node starts out as its own cluster

KNN Search

Holds a max-heap of the k closest points found so far. It must be a max-heap because when you pop, you want to remove the furthest of the current closest points so that a closer point can replace it.
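The sketch below puts the ball tree and the max-heap search together. Several details are assumptions, since the cards do not specify them: the center is the mean of the points, the split is at the median along the widest dimension, and the max-heap is emulated with `heapq` by storing negated distances. All names are illustrative.

```python
import heapq
import math

class BallTree:
    def __init__(self, points):
        dims = range(len(points[0]))
        self.center = tuple(sum(p[d] for p in points) / len(points) for d in dims)
        self.radius = max(math.dist(self.center, p) for p in points)
        self.points, self.left, self.right = points, None, None
        if len(points) > 1:
            # Split at the median along the dimension with the largest spread.
            d = max(dims, key=lambda i: max(p[i] for p in points) - min(p[i] for p in points))
            ordered = sorted(points, key=lambda p: p[d])
            mid = len(ordered) // 2
            self.left, self.right = BallTree(ordered[:mid]), BallTree(ordered[mid:])

def knn_search(node, query, k, heap):
    """heap holds (-distance, point); heap[0] is the furthest of the best k so far."""
    # Prune: nothing inside this ball can beat the current k-th nearest neighbor.
    if len(heap) == k and math.dist(query, node.center) - node.radius >= -heap[0][0]:
        return
    if node.left is None:                          # leaf node
        for p in node.points:
            dist = math.dist(query, p)
            if len(heap) < k:
                heapq.heappush(heap, (-dist, p))
            elif dist < -heap[0][0]:
                heapq.heapreplace(heap, (-dist, p))   # drop the furthest, add p
    else:
        knn_search(node.left, query, k, heap)
        knn_search(node.right, query, k, heap)

def k_nearest(points, query, k):
    heap = []
    knn_search(BallTree(points), query, k, heap)
    return [p for _, p in sorted(heap, reverse=True)]  # closest first

pts = [(0, 0), (1, 1), (2, 2), (5, 5), (6, 5), (9, 9), (3, 0)]
print(k_nearest(pts, (1.2, 0.9), 3))   # [(1, 1), (2, 2), (0, 0)]
```

Visiting the child whose ball is closer to the query first would usually prune more; the sketch always visits the left child first to stay short.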

A* and its comparison to Dijkstra

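In Dijkstra's algorithm, nodes are expanded in order of their distance from the source node, whereas in A*, nodes are expanded in order of distance from the source node plus a heuristic estimate of the remaining distance to the goal. With a heuristic of 0, A* behaves like Dijkstra's algorithm.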
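The difference comes down to the priority used in the queue. In the sketch below the same function acts as Dijkstra's algorithm when the heuristic `h` is 0 and as A* when `h` is an admissible estimate. The names `a_star`, `grid_neighbors`, and `manhattan`, and the 5x5 grid, are made up for the example.

```python
import heapq

def a_star(neighbors, start, goal, h=lambda n: 0):
    """neighbors(n) yields (next_node, edge_cost). Returns the shortest path cost or None."""
    g = {start: 0}                              # best known distance from the source
    frontier = [(h(start), start)]              # priority = g + h
    while frontier:
        f, node = heapq.heappop(frontier)
        if node == goal:
            return g[node]
        if f > g[node] + h(node):               # stale queue entry, skip it
            continue
        for nxt, cost in neighbors(node):
            new_g = g[node] + cost
            if new_g < g.get(nxt, float("inf")):
                g[nxt] = new_g
                heapq.heappush(frontier, (new_g + h(nxt), nxt))
    return None

def grid_neighbors(n):
    # 4-connected moves on a 5x5 grid, each with cost 1.
    x, y = n
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= x + dx < 5 and 0 <= y + dy < 5:
            yield (x + dx, y + dy), 1

goal = (4, 4)
manhattan = lambda n: abs(n[0] - goal[0]) + abs(n[1] - goal[1])
print(a_star(grid_neighbors, (0, 0), goal))             # Dijkstra (h = 0): 8
print(a_star(grid_neighbors, (0, 0), goal, manhattan))  # A*: 8
```

Both calls return the same cost; the heuristic only changes the order in which nodes are expanded, and typically how many of them are.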

Hopcroft-Karp Pseudocode

M <- empty set
repeat
    G' <- alternating level graph
    P <- maximal set of vertex-disjoint shortest augmenting paths
    M <- M xor (edges in P)
until P is empty

Perfect Matching

A matching that covers every vertex of the graph

Chromatic number

The minimum number of colors needed to properly color a graph

What does Lloyd's Algorithm do?

Heuristically optimizes the k-means objective; it converges to a local optimum, not necessarily the global one

Strongly Polynomial

An algorithm whose running time relies only on the number of input data points, not on their values, is said to be strongly polynomial

Weakly Polynomial Algorithm

An algorithm whose running time relies on the values of the input data rather than just the number of input data points

DSatur Heuristic

Similar to greedy coloring, but the next node to color is always the uncolored node whose neighbors use the largest number of different colors
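A compact sketch of the heuristic, assuming an adjacency dict of sets. Breaking ties by highest degree is a common convention but is an assumption here, as is the name `dsatur`.

```python
def dsatur(adj):
    """adj: {node: set of neighbors}. Returns {node: color index from 0}."""
    color = {}
    while len(color) < len(adj):
        def saturation(n):
            # Number of different colors already used by n's neighbors.
            return len({color[m] for m in adj[n] if m in color})
        # Next node: highest saturation, ties broken by highest degree.
        node = max((n for n in adj if n not in color),
                   key=lambda n: (saturation(n), len(adj[n])))
        used = {color[m] for m in adj[node] if m in color}
        color[node] = next(c for c in range(len(adj)) if c not in used)
    return color

# A 6-cycle is bipartite, so 2 colors suffice.
cycle = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
print(dsatur(cycle))
```

Recomputing every saturation value on each pass keeps the code short; a practical version would update them incrementally.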

Graph Face

A region of the plane that is enclosed by the graph's edges

Modularity formula

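For each community c, take the fraction of edges that fall within c and subtract the expected fraction of edges within c if edges were placed at random with the same node degrees, then sum over all communities: Q = sum over c of [ L_c / m - (d_c / 2m)^2 ], where m is the total number of edges, L_c is the number of edges inside c, and d_c is the sum of the degrees of the nodes in c.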
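One way to compute modularity for an unweighted, undirected graph is sketched below. It uses the standard per-community form Q = sum over c of [ L_c / m - (d_c / 2m)^2 ], with m the number of edges, L_c the number of edges inside community c, and d_c the total degree of the nodes in c. The function name and input format are illustrative.

```python
def modularity(edges, community):
    """edges: list of undirected (u, v) pairs; community: {node: community label}."""
    m = len(edges)
    inside = {}   # L_c: edges with both endpoints in community c
    degree = {}   # d_c: sum of the degrees of the nodes in community c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    return sum(inside.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())

# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(edges, split))                        # 5/14, about 0.357
print(modularity(edges, {n: "x" for n in range(6)}))   # 0.0 for one big community
```

Splitting the graph at the bridge scores higher than lumping everything together, which is the kind of improvement each Louvain move looks for.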

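Kempe Chaining

Swapping the two colors along a chain of neighboring nodes that use those two colors, repeating until no change is needed

Why using DFS for Ford-Fulkerson is not optimal

DFS tends to find longer augmenting paths, and the longer the augmenting paths we find, the more likely we are to use an edge with a small bottleneck value, so each augmentation adds little flow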

Divisive

Top-down approach: start with all data points in one cluster and split recursively

Augmenting Path

A path from an unmatched node to an unmatched node that alternates between edges not in the current matching and edges in the current matching

Karger's steps

Works by the contraction operation: to contract an edge, we merge the two nodes that are incident on that edge into a single node, deleting self-loops but retaining multi-edges. Edges are selected at random and contracted, and this is repeated until only 2 nodes remain; the edges between those 2 nodes form the cut.
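The contraction procedure can be sketched as follows. Because a single run is randomized, the sketch repeats it and keeps the smallest cut found; the number of `trials`, the fixed `seed`, and the function names are illustrative choices. The input graph is assumed to be connected.

```python
import random

def karger_cut(edges, rng):
    """One run of random contraction. edges: list of (u, v). Returns the cut size found."""
    label = {n: n for e in edges for n in e}    # label[n] = merged node that n belongs to
    remaining = len(label)
    edges = list(edges)
    while remaining > 2:
        u, v = rng.choice(edges)                # pick a random remaining edge
        lu, lv = label[u], label[v]
        for n in label:                         # contract: merge v's group into u's group
            if label[n] == lv:
                label[n] = lu
        remaining -= 1
        # Delete self-loops; parallel (multi) edges are kept.
        edges = [(a, b) for a, b in edges if label[a] != label[b]]
    return len(edges)

def karger_min_cut(edges, trials=200, seed=0):
    rng = random.Random(seed)
    return min(karger_cut(edges, rng) for _ in range(trials))

# Two triangles joined by one bridge edge: the global min-cut is that single edge.
bridge_graph = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(karger_min_cut(bridge_graph))
```

Each run returns a valid cut, so the minimum over many runs can only get closer to the true min-cut; more trials trade time for a higher chance of finding it.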

s-t cut

a cut that disconnects every path from s to t.

flow graph

directed graph with source and sink nodes where edges have a flow and capacity

Beam Search

At each stage of building your sub-solution, you branch in a BFS manner and then keep the top k scoring sub-solutions (or the sub-solutions with the best lower bound) for the next step. If k is sufficiently large, the sub-solution that leads to the global solution is likely to be retained.

Agglomerative

Bottom-up approach: start with each data point as its own cluster and merge

Problem with Lloyd's algorithm

It can fall into local optima quite easily. To overcome this, restart the algorithm several times, picking different random initial cluster centers each time. Then pick the final clustering that minimizes the sum of squared distances from observations to their cluster centers.

K Colorable

A graph is k-colorable if it can be properly colored with at most k colors

Alternating Level graph

Created by a breadth-first search starting at the unmatched nodes in set A, alternating between using edges not in the matching and edges in the matching. It ends at the first level in which we find at least one unmatched node.

saturated edge

edge where capacity - flow = 0

Different ways to determine the number of clusters in your data?

elbow plots, silhouette score, and clustree

Branch and bound

eliminates groups of trees from consideration upon discovering that all their members are worse than the best tree found so far

Why are planar graphs sparse?

For a given number of nodes, the number of edges a planar graph can have is limited: edges <= 3v - 6 (for a simple planar graph with v >= 3). For example, K5 has 10 edges but 3(5) - 6 = 9, so K5 cannot be planar.

residual edge

for every directed edge <x, y>, if there is no edge <y, x>, then we create it with capacity 0 and flow 0.

Ford-Fulkerson pseudocode

for each edge e = (x, y) in G.edges:
    e.flow <- 0
    if (y, x) is not in G.edges:
        G.add_edge(y, x, capacity = 0, flow = 0)
while there is a path p from s to t such that every edge has capacity - flow > 0:
    bottleneck_value <- min(capacity - flow among all edges in path p)
    for each edge e = (x, y) in path p:
        e.flow <- e.flow + bottleneck_value
        backwards_edge <- G.edges.get(y, x)
        backwards_edge.flow <- backwards_edge.flow - bottleneck_value

Intuition of Louvain

A good clustering of a graph is one in which there are more edges between nodes within the same cluster than we would expect if the connections were random (modularity)

Kuratowski's theorem

graph is planar if and only if it does not contain a subgraph which is a subdivision of K5 or K3,3

Feasible

if it obeys the constraints of the problem

What does Hopcroft-Karp solve?

maximum cardinality matching for unweighted bipartite graphs

Max Flow min cut theorem

the maximum value of an s-t flow is equal to the minimum capacity over all s-t cuts

What does the Hungarian method solve?

the assignment problem: maximum weighted (equivalently, minimum cost) perfect matching in a weighted bipartite graph

augmenting path

path from source to sink along edges that have not been saturated

Hierarchical clustering

Seeks either to build up clusters by combining data points and clusters that are close to one another into larger and larger clusters (agglomerative), or to find splits of the dataset, recursively splitting clusters until each data point is its own cluster (divisive).

Adjusted Rand Index

subtracts off expected level of similarity between two clusterings

Feasibility

A search may travel through infeasible solutions, discouraged by a penalty (or an acceptance probability) added to the objective function, as long as it ends up with a feasible one.

Elbow Plots

The x-axis is the number of clusters and the y-axis is the loss function, which can be based on modularity or on the sum of squared distances. The loss always goes down as you increase the number of clusters, but it decreases by much less once you go past the true number of clusters.

