Info

Notes I wrote down while taking the Graph Theory by William Fiest course on Youtube.

Graph Theory Introduction

link to video

Types of graphs

Special Graphs:

Representing Graphs in computers

Adjacency Matrix

Using an adjacency matrix, we can represent graphs inside 2-dimensional arrays.

ProsCons
Space efficient for representing dense graphsRequire space
Edge weight lookup is Iterating over all edges takes time
Simplest graph representation

Adjacency List

An adjacency list is a way to represent a graphs as a map from nodes to lists of edges.

Example of an adjacency list representing graphs

ProsCons
Space efficient for representing sparse graphsLess space efficient for denser graphs
Iterating over all edges is efficientEdge weight lookup is
Slightly more complex graph representation

Edge List

An edge list is a way to represent a graph simply as an unordered list of edges. Assume the notation for any triplet means: “the cost from node to node is “.

Example:

[(C, A, 4), (A, C, 1), (B, C, 6), (A, B, 4), (C, B, 1), (C, D, 2)]
ProsCons
Space efficient for representing sparse graphsLess space efficient for denser graphs
Iterating over all edges is efficientEdge weight lookup is
Very simple structure

Common Graph Theory Problems

link to video

Identifying a problem:

  • Is the graph directed or undirected?
  • Are the edges of the graph weighted?
  • Is the graph I will encounter likely to be sparse or dense with edges?
  • Should I use an adjacency matrix, adjacency list, an edge list or other structure to represent the graph efficiently?

Common problems:

  • Shortest path problem: BFS, Dijkstra’s algorithm, Bellman-Ford, Floyd-Warsahll, A* and many more
  • Connectivity: Union find data structure or any search algorithm (BFS, DFS)
  • Negative cycles: Bellman-Ford and Floyd-Warshall
  • Strongly connected components (SCCs can be thought of as self-contained cycles within a directed graph where every vertex in a givencycle can reach every other vertex in the same cycle): Tarjan’s and Kosaraju’s algorithms
  • Traveling Salesman Problem: Held-Karp, branch and bound and many approximation algorithms
  • Bridges
  • Cut Vertex
  • Minimum spanning tree (a MST is a subset of the edges of a connected, edge-weighted graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight): Kruskal’s, Prim’s & Borůvka’s algorithm
  • Network flow: max flow Ford-Fulkerson, Edmonds-Karp & Dinic’s algorithm

Bridges & cut vertices are important in graph theory because they often hint at weak points, bottlenecks or vulnerabilities in a graph

link to video

Time complexity:

Simple traversal:

# Global or class scope variables
n = number of nodes in the graph
g = adjacency list representing graph
visited = [false, ..., false] #size n
 
def dfs(at):
	if visited[at]: return
	visited[at] = true
 
	neighbours = g[at]
	for next in neighbours:
		dfs(next)
 
# Start DFS at node zero
start_node = 0
dfs(start_node)

Finding components:

# Global or class scope variables
n = number of nodes in the graph
g = adjacency list representing the graph
count = 0
components = empty integer array #size n
visited = [false, ..., false] #size n
 
def find_components():
	for(i = 0; i < n; i++):
		if !visited[i]:
			count++
			dfs(i)
	return (count, components)
 
def dfs(at):
	visited[at] = true
	components[at] = count
	for next in g[at]:
		if !visited[next]:
			dfs(next)

What else can DFS do?

  • compute a graph’s minimum spanning tree
  • detect and find cycles in a graph
  • check if a graph is bipartite
  • find strongly connected components
  • topologically sort the nodes of a graph
  • find bridges and cut vertices
  • find augmenting paths in a flow network
  • generate mazes

link to video

Time complexity:

Particularly useful for one thing: finding the shortest path on unweighted graphs.

A BFS starts at some arbitrary node of a graph and explores the neighbour nodes first, before moving to the next level neighbours. It uses a queue data structure to track which node to visit next. Upon reaching a new node the algorithm adds it to the queue to visit it later.

# Global/class scope variables
n = number of nodes in the graph
g = adjacency list representing unweighted graph
 
# s = start node, e = end node, and 0 <= e, s < n
function bfs(s, e):
	# Do a BFS starting at node s
	prev = solve(e)
 
	# Return constructed path from s -> e
	return reconstructPath(s, e, prev)
 
function solve(s):
	q = queue data structure with enqueue and dequeue
	q.enqueue(s)
 
	visited = [false, ..., false] # size n
	visited[s] = true
 
	prev = [null, ..., null] # size n
 
	while !q.isEmpty(): 
		node = q.dequeue()
		neighbours = g.get(node)
 
		for(next : neighbours):
			if !visited[next]:
				q.enqueue(next)
				visited[next] = true
				prev[next] = node
	return prev
 
function reconstructPath(s, e, prev):
	# Reconstruct path going backwards from e
	path = []
	for (at = e; at != null; at = prev[at]):
		path.add(at)
 
	path.reverse()
 
	# If s and e are connected return path
	if path[0] == s:
		return path
	return []

BFS grid shortest path

link to video

Problem: You are trapped in a 2D dungeon and need to find the quickest way out! The dungeon is composed of unit cubes which may or may not be filled with rock. It takes one minute to move one unit north, south, east, west. You cannot move diagonally and the maze is surrounded by solid rock on all sides.

Is an escape possible? If yes, how long will it take?


The dungeon has a size of and you start at cell and there’s an exist at cell . A cell full of rock is indicated by a and empty cells are represented by a

Example of the dungeon

# Global/class scope variables
R, C = ... # R = number of rows, C = number of columns
m = ... # Input character matrix of size RxC
sr, sc = ... # 'S' symbo row and column values
rq, cq = ... # Empty Row Queue (RQ) and Column Queue (CQ)
 
# Variables used to track the number of steps taken.
move_count = 0
nodes_left_in_layer = 1
nodes_in_next_layer = 0
 
# Variables used to track whether the 'E' character
# ever gets reached during the BFS.
reached_end = false
 
# RxC matrix of false values used to track whether
# the node at position (i, j) has been visited.
visited = ...
 
# North, south, east, west direction vectors.
dr = [-1, +1, 0, 0]
dc = [0, 0, +1, -1]
 
function solve():
	rq.enqueue(sr)
	cq.enqueue(sc)
	visited[sr][sc] = true
	while rq.size() > 0: # or cq.size() > 0
		r = rq.dequeue()
		c = cq.dequeue()
		
		if m[r][c] == 'E':
			reached_end = true
			break
			
		explore_neighbours(r, c)
		nodes_left_in_layer--
		
		if nodes_left_in_layer == 0:
			nodes_left_in_layer = nodes_in_next_layer
			nodes_in_next_layer = 0
			move_count++
			
	if reached_end:
		return move_count
	return -1
 
function explore_neighbours(r, c):
	for(i = 0; i < 4; i++):
		rr = r + dr[i]
		cc = c + dc[i]
 
		# Skip out of bounds locations
		if rr < 0 or cc < 0: continue
		if rr >= R or cc >= C: continue
 
		# Skip visited locations or blocked cells
		if visited[rr][cc]: continue
		if m[rr][cc] == '#': continue
 
		rq.enqueue(rr)
		cq.enqueue(cc)
		visited[rr][cc] = true
		nodes_in_next_layer++
	

Topological Sort Algorithm

link to video

Topological ordering is not unique.

# assumption: graph is stored as adjacency list
function topsort(graph):
	N = graph.numberOfNodes()
	V = [false, ..., false] # Length N
	ordering = [0, ..., 0] # Length N
	i = N - 1 # Index for ordering array
 
	for (at = 0; at < N; at++):
		if V[at] == false: 
			i = dfs(i, at, V, ordering, graph)
 
	return ordering
 
function dfs(i, at, V, ordering, graph):
	V[at] = true
	
	edges = graph.getEdgesOutFromNode(at)
	for edge in edges:
		if V[edge.to] == false:
			dfs(i, edge.to, V, ordering, graph)
 
	ordering[i] = at
	return i - 1

Shortest/Longest path on a Directed Acyclic Graph

link to video

The Single Source Shortest Path (SSSP) problem can be solved efficiently on a DAG in time. This is due to the fact that the nodes can be ordered in a topological ordering via topsort and processed sequentially.