Hello people…! In this post, I will talk about Graph Theory Basics, what are its terminologies, types and implementations in C. Graphs are difficult to code, but they have the most interesting real-life applications. When you want to talk about the real-life applications of graphs, you just cannot resist talking about the Facebook’s Graph Search! Now, I don’t really know that algorithm, but it uses graphs to find out your closest friends, or any other associations you have with the other users.
Other applications include finding the shortest paths, and by shortest path, I mean in the “universal sense”, it could be the shortest path of a Traveling Salesman, or the shortest path to win Snake and Ladder or even the shortest way to solve the Rubik’s Cube…! Amazing, right…?!
You can continue to read this post here to learn about Graphs, or watch this tutorial video (ad-free) –
or, watch the same video on Theory of Coding’s YouTube channel here (contains ads). Now, let’s continue.
Graphs are much clear when defined in mathematical terms. A Graph G (V, E), consists of two sets,
- Set of Vertices, V
- Set of Edges, E
As the name suggest, V, is the set of vertices or, the set of all nodes in a given graph and E, is the set of all the edges between these nodes, or the associations between them. So, |V| denotes the number of nodes in a graph and |E| denotes the number of edges. Let us take up a small example to understand these terms –
As you can see, we have two types of graphs, Directed and Undirected Graphs. In Undirected Graphs, the direction for an edge is not defined. This is generally used to indicate that the edge is actually bi-directional in nature, i.e., one can go from V1 β V2, as well as from V2 β V1. This is used to represent the graph where the states (nodes) are re-doable, such as, in a Rubik’s Cube, you can go from one configuration of the cube to the other as well as the vice-versa.
The other type, the Directed Graph restricts the… “traversal”, if you say… to only one direction. For example, in the Snakes and Ladders game, you can play dice and go from Position 5 β Position 10, but you can’t roll the dice such that it gets you from Position 10 β Position 5… That’s obvious…! So, it really depends on the requirement of the situation, which graph you choose. Generally, we only have to deal with graphs which are purely Directed or purely Undirected. We do not talk about the hybrid type. Some of the other type of graphs are shown below –
In Disconnected Graphs, there will be at least one pair of vertices V1 and V2 β V, which doesn’t have a path. In the above example, vertex 2 and 4 are not reachable from each other. Similarly, vertex 1 and 5 are not reachable from each other. In this graph, vertices 1, 2, 3 for a Connected Component and vertices 4, 5, 6, 7 form another Connected Component.
In Weighted Graphs, the edges are given “weights”. To understand a Weighted Graph, you can think of the vertices as cities and the edges as the distance between them (so they will have some value). You could be asked the shortest path between two cities. So, in the context of a Weighted graph, the shortest path may not be the one with least edges. One must keep that in mind.
There are plenty of other graph types, like Simple Graphs which do not contain any loops. A loop is an edge from a vertex to itself. The graphs which do contain loops are Non-Simple Graphs.
Now, we will look at the way the graphs are implemented. There are mainly two ways to implement graphs, they are –
- Adjacency Matrix
- Adjacency List
Adjacency Matrix
It is a very simple representation where we take a two-dimensional matrix of the size |V+1| Γ |V+1|, i.e., the declaration in C would look like,
int adjMat[V + 1][V + 1];
As the arrays are zero-indexed, we generally prefer to keep the sizes V + 1. Now, initially all the values would be zeroes. We put ‘1’ in an element of the two-dimensional array, adjMat[i][j], if there exists an edge between Vi β Vj. Remember, arrays are zero-indexed. Let us take an example to understand this,
I hope it is clear from the example, how we can represent the graph using an Adjacency Matrix. It is very easy to code. All you have to do is create a two-dimensional matrix and assign the values. For a weighted graph, all you have to do is assign the weights of the edges instead of assigning value 1 in the cells –
Adjacency List
It is an array of pointers of size |V| + 1, where each pointer points to a linked list. The linked list holds the nodes which are adjacent to the ith vertex. By adjacent, we mean those vertices that can be accessed from ith node by making a single move. It is a little complex implementation if you are uncomfortable with linked lists. But it is this implementation that we will use for Graph Search Algorithms. This is because we can easily and efficiently know which of the vertices of V are neighbors of a given vertex. And that is what you want, if you are to do a Graph Search. Now, let us take an example to understand the concept of Adjacency Lists.
For the above illustrated graph which has 4 nodes, we create an array of size 5 (for easy 1-indexing), where each element in the array is a pointer to a linked list. Hence, the building block of our Adjacency List is an array of pointers. These pointers will each act as the head pointers of a linked list. Each linked list node is simple, it contains an integer and a next pointer.
Like I stated before, the linked list will hold the list of vertices that are adjacent to the ith vertex. From vertex 3, you can go to vertex 2 and 4. Hence the linked list corresponding to vertex 3 in the Adjacency List has elements 2 and 4.
Similarly, for a weighted graph, we will just have an extra integer in the linked list node to store the weight of the edge –
I hope this makes it clear as to how we use the Adjacency List to implement a Graph. As I mentioned before, it is one of the most versatile implementations of the Graph Data Structure. Now, try to code the implementation in C, or any language you like. Please try this at least once by yourself so that you can get brain deep into the Graph Data Structure. It is only a linked list. And one more thing, don’t get confused between pointer-to-an-array and array-of-pointers…! So, if we put the value of |V| + 1 in a variable ‘vertices‘, now the declaration of our Adjacency List would be,
struct node * adjacency_list[vertices];
Remember this is an “array of pointers”, the declaration for “pointer to an array” would look like,
struct node (* adjcency_list)[vertices];
This is due to operator precedence. Some people make mistakes here, so, watch out! If you want to code in C++, we can have a vector where each element of the vector can hold another vector or a list, so, the declaration would be, like,
vector<list<int>> adjacencyList(vertices + 1);
Try to code for an hour or two… If you don’t get the code, it’s ok…! I’ve put my C code below. Those who got the code right… You are fabulous…! π … But please take a minute to go through my code, I might have written a slightly better code than yours.
Comparison between Adjacency Matrix and Adjacency List –
Adjacency Matrix | Adjacency List |
---|---|
For a Directed Graph, it consumes O(|V|2) space which is often under-utilized in the implementation. For an Undirected Graph also, it consumes O(|V|2) space which is also under-utilized as the generated matrix is symmetric about diagonal and values just repeat. | For a Directed Graph it consumes O(|V| + |E|) space which is less, and is utilized optimally. For an Undirected Graph it consumes O(|V| + 2 β |E|) space, which is O(|V| + |E|), which is less. |
As the memory allocation is contiguous, data access is slightly faster. | It is basically a Linked List, so memory allocation is not contiguous, hence traversal is slow as as compared to the traversal in an array. However, this can be eliminated if we use a Vector of C++ STL Library in the place of the Linked List. But C++ STL Vector is not recommended for graphs that keep growing. |
Inserting an edge takes O(1) time, i.e., constant time. Therefore, inserting |E| edges take O(|E|) time, as direct access is available. | Inserting an edge can take O(|E|) in the worst case, which is, there is an edge between every pair of vertices, one must traverse the whole Linked List over-and-over causing insertion of |E| nodes to take O(|E|2) time. However, if one follows Head Insertion, inserting a node will take O(1) time, and inserting |E| nodes will take O(|E|) time. |
Deleting an edge is very simple, we just set the corresponding element value to 0, which takes O(1) time. | Deleting an edge is time-taking as, one would have to traverse the Linked List, which is slow (non-contiguous). In the worst case, it takes O(|E|) time. |
With the knowledge of Adjacency Matrix and Adjacency List alone you won’t be able to do the competitive programming questions, because you do not know how to explore your graph to get the desired result. For this, you need to know the Breadth First Search (BFS) and the Depth First Search (DFS). This post is only to give an introduction. Feel free to comment your doubts..! Keep practicing…. Happy Coding…! π
42 thoughts on “Graph Theory Basics”
In the below statement there is a small mistake.it should be “In UnDirected Graphs the direction……..”
“In Directed Graphs the direction for and edge is not defined. This is generally used to indicate that the edge is actually bi-directional in nature, i.e, one can go from Node 1 to Node 2 as well as Node 2 to Node 1.”
Thanks a lot Sai Avinash. It’s a silly mistake. Thanks for correcting me…! ☺
Can you tell more about what are you doing in the ‘add’ function? And how the returned value is treated by ‘adjacency_list[v]’?
Sure..! Now, if you look at the Adjacency List diagram, you will see that adjacency_list[v] stores the address of the first node. That is, it acts as a head pointer to the linked list. Now, when we call add() function, we pass this value, the address of the first node to the head variable. So, now, head points to the first node.
So inside the add() function, the situation looks like –
Now, we create a new node and assign put some values into it –
The blue part of the node indicates the next variable whereas the white part indicates the val variable. Now, we assign head to p->next. Then, p is returned. So, the situation becomes –
As you can see, p->next is pointing to head and the value of p is assigned to adjacency_list[v], which is the address of the newly allocated node. So, essentially, the new node is at the starting of the previous linked list. This is head insertion. This is what we do in add() function. Take your time and think properly for a while and I’m sure you’ll understand what’s happening…! Feel free to ask if you have any more doubts… π
There is typo that I found π . As you said you have used Camel Case naming convention . So the Function that you have declared “AddEdge()” , I think it should be addEdge ().
The functions follow Pascal Case sir..! π
Thanks for the article !!. Helped me in understanding graphs.
Thank you Neeraj..! π … I’m glad my post could help… π
can you please tell me how to create adjacency list of character
Number of nodes = 5
A,G,T,a,b (nodes)
Number of edges = 4;
there is a edge between A and G;
there is a edge between A and a;
there is a edge between T and G;
there is a edge between a and b;
Hi, Pravin..! π … Here’s the modified code to create a graph of character nodes using adjacency list. As you have mentioned in your input that you want to create nodes for upper and lower case letters, I created adjacency list just big enough to accommodate them (26 + 26)… And then I map the upper case letters to the first 26 indices and the lower case letters to the last 26 indices… Try to run the code which I have given you… I’m sure you’ll understand… Feel free to comment if you have any more doubts π
thank you
But I have two more doubt
1) what if I add two more edges of integer say
1->2 and 2->3 in the same modified program by you.?
2) And can’t I eliminate printing of which I don’t have nodes in my adjacency list as I am not having node Z but in code provided by you printing the node of which I don’t have use.
You can have nodes for integers too… You could start using adjacency list index 52 onwards for your integers… Then your integer nodes get mapped to (integerValue + 52)th index… The point is you can have anything in your adjacency list… You only need to know how to map them properly. Regarding your second point… You can just cancel the inner loop by putting a condition before it… Something like –
With this, you can avoid printing the nodes which don’t have edges… But, in that case, ‘G’ won’t get printed for the test case in your previous comment… If you want that to be printed too, then you’ll have to keep a track of what was entered. Then your code gets clumsy and uses more memory… It’s your wish… Feel free to comment if you have any more doubts π
Hi Vamsi,
Thanks for the tutorial.
Is there any specific reason for using vecor< list <pair > > > for adjacency list?
we could use array of vectors like vector< pair> adjList[nodes +1];
or vector<vector>>
Yes, you can do that… I used a vector of lists because I wanted it to resemble an adjacency list “visually” as much as possible. In the days of C, we learn adjacency list as an array pointers, pointing to linked lists, so here, the vector resembles that array, and the list resembles the linked list… But then, you can implement it as you seem fit… π
can you tell how the complexity of inserting nodes in adjacency list would be |E|^2 in worst case.
Consider a complete graph (graph where there is is an edge between a pair of distinct vertices)… So in the linked list corresponding to a vertex, you would have |E| entries… As I stated in my post, if you follow head insertion, inserting |E| items into a linked list by head insertion takes O(|E|) time (because head insertion takes O(1) time and we are inserting O(|E|) elements)… Now, if you want that list to be sorted, you would have to traverse the linked list for the appropriate position to insert. In the worst case, you would traverse the whole list every time and insert it at the tail… Think of it as |E| tail insertions which will take O(|E|2) time… Logically, the second method is a little like insertion sort over |E| items. Think about it for a while, I’m sure you’ll get it π
Sir,i guess in undirected graph set of edges will be reflexive,i.e. both{1,2} and {2,1} will be different edges as it is undirected,it can go either way.Correct me if I am wrong.Thanks in advance π
Sorry for the late reply… Yes, you are perfectly right! π
Great Tutorial for graph beginners.Kudos to you and your team for such an elaborative and thorough tutorial. π
Thanks a lot! π
Hello, thank you for this introduction, helped me a lot. Could you add a compiler and execute on your page if possible? I tried in other sites but couldnt get it to work
Okay that’s an interesting idea… I will try to do that… Btw… Which piece of code gave you trouble in running??
The C one just above.. I guess the online compilers I used might have had issues., since the code is completely correct
Hi Vamsi,
thank you for this great tutorial on Graph theory, since I’m novice to programming I’m still trying to grasp my mind over this, but I believe I’m on the track π
I’ve just wanted to notice you that the link to:
” Adjacency List in C# ” (http://theoryofprogramming.azurewebsites.net/adjacency-list-in-c-sharp/) is broken, but that it can be reached through search at the following link http://theoryofprogramming.azurewebsites.net/adjacency-list-implementation-in-c-sharp/
That’s it for now and I’m sure going to spend a lots of time on these pages. And thanks once more to both of you guys!
should we not do newhead.next=null and currenthead.next=newnode
so that it matches ur diagram
That is not exactly head insertion.
Thank you so much, this is one of the best tutorial I’ve ever read. good job
Thanks a lot! π
i tried by taking vertex value as character type but the output is not as expected.
I have done the necessary changes(for converting integer type to character type )
please help.
Also notify wherever fflush() is required…..
Thank you in advance.. π
Where you’ve written:
list< vector > adjacencyList;
Shouldn’t it be:
vector< list > adjacencyList(vertices);
Because this is an array (vector) of |V|+1 (verticies) linked lists (list)?
Yes, you are right. Corrected my post. Thanks for pointing it out. π
Hi Vamsi, I’m learning algorithms and i face problems when it comes to graphs and other dynamic programming problems. I’m well versed with Java array and String algorithm. Kindly provide some tips for me. My email is srivastav.varun12@gmail.com. If you can drop me your contact info, I can get in touch with you.
Sure, check your inbox π
Thanks a Lot Vamsi! It really helped!
I’m glad I could help you! π
very clear and useful tutorial,if you have any tips related to DS please mail me (sudheerkonagalla@gmail.com)
Thanks vamsi.
Your C language Graph-implementation program is not printing valid results for undirected graph. An edge from 1 to 4 should appear in both adjacency list of vertex 1 and 4 . However it is not the case in your program.
By the way, the very first undirected graph diagram in your tutorial where you have written edges as subsets of vertices. An edge from 1 to 2 for undirected graph will have two sets (1,2) and (2,1). Again the latter set is not mentioned.
Well, I’ve tested the C code and it is printing both edges, could you please try again? And in the diagram I’ve represented the edge as
{1, 2}
without arrows, as compared to the 2nd diagram. By dropping the use of arrows I meant to convey that it is a undirected edge.Hey, thanks for the tutorial, it’s really helpful. But, in the post above, there is no example of an Adjacency List after you run the program. So, I just kinda confused right now, because you know, I don’t know what I have to input.
Hi, for the C program you can try this input –
3
3
1 2
2 3
3 1
It should give you this output –
Enter the Number of Vertices -
Enter the Number of Edges -
Adjacency List -
adjacencyList[1] -> 3 -> 2 -> NULL
adjacencyList[2] -> 3 -> 1 -> NULL
adjacencyList[3] -> 1 -> 2 -> NULL