A common difference between data in the Euclidean domain and graphs is the presence of neighbouring nodes. (Can we call Euclidean points disconnected nodes?)
The BIG question here is: how do GNNs take care of a node's neighbours?
One way is to use an aggregate of the neighbouring features.
Let’s say we have a simple graph with three nodes. (in the above figure)
Their adjacency matrix looks like this
Say the features of each node are as follows:
If we multiply the adjacency matrix with the feature matrix, each node gets the sum of its neighbours' features.
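This step can be sketched in a few lines of NumPy. The adjacency matrix and features below are made-up values for a hypothetical three-node graph (the figure above isn't reproduced here, so the exact edges are an assumption):

```python
import numpy as np

# Hypothetical 3-node graph: node 0 is connected to nodes 1 and 2.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)

# One 2-dimensional feature vector per node (made-up values).
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Row i of A @ X is the sum of the feature vectors of node i's neighbours.
neighbour_sum = A @ X
print(neighbour_sum)
# Node 0 gets [3+5, 4+6] = [8, 10]; nodes 1 and 2 each get node 0's [1, 2].
```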
But we have a problem: the features of the node itself are not included. We have a hack to include them: add the identity matrix to the adjacency matrix, which gives every node a self-loop:
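Continuing the same hypothetical three-node example, the self-loop trick is just one extra line:

```python
import numpy as np

# Same made-up 3-node graph and features as before.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Adding the identity puts a 1 on the diagonal: a self-loop for every node.
A_hat = A + np.eye(3)

# Now each node's own features are included in the sum.
self_and_neighbours = A_hat @ X
print(self_and_neighbours)
# Node 0 now gets [1+3+5, 2+4+6] = [9, 12].
```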
Summing features this way changes their scale: a node with many neighbours ends up with much larger values. How do we deal with this? By normalizing with the inverse of the degree matrix (a diagonal matrix recording how many neighbours each node has), which turns the sum into a mean.
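For the same hypothetical example, the degree normalization looks like this:

```python
import numpy as np

# Same made-up 3-node graph (with self-loops) and features as before.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
A_hat = A + np.eye(3)

# Degree matrix: each diagonal entry counts a node's connections (self-loop included).
D_hat = np.diag(A_hat.sum(axis=1))

# D^-1 @ A_hat @ X averages each node's own and neighbouring features.
normalized = np.linalg.inv(D_hat) @ A_hat @ X
print(normalized)
# Node 0 has degree 3, so its sum [9, 12] becomes the mean [3, 4].
```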
Now what? Multiply it with a learnable weight matrix, use an activation function and your GCN is ready!
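Putting all the steps together, a minimal single-layer GCN sketch might look like the following. The weight values are arbitrary placeholders, and ReLU is just one possible choice of activation:

```python
import numpy as np

def gcn_layer(A, X, W):
    """One GCN layer: mean-aggregate self + neighbour features, project, apply ReLU."""
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))     # inverse degree matrix
    return np.maximum(0, D_inv @ A_hat @ X @ W)  # normalize, project, ReLU

# Same hypothetical 3-node graph and features as above.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Arbitrary example weights; in practice W is learned by gradient descent.
W = np.array([[1.0, -1.0],
              [1.0, -1.0]])

H = gcn_layer(A, X, W)
print(H)
# The second column is negative before ReLU, so it is clipped to zero.
```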
There are many tweaks we can make.
- The aggregation function used for the neighbouring features is simply addition; we can try some other permutation-invariant operation (a fancy term for an operation whose result doesn't depend on the order of its inputs), such as mean, max, or a weighted sum.
- We can also change how the degree normalization is computed, using square roots on both sides of the adjacency matrix (so-called symmetric normalization).
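The square-root tweak from the list above can be sketched like this, again on the hypothetical three-node graph. Instead of dividing each row by the node's degree, we scale by the square roots of both endpoints' degrees:

```python
import numpy as np

# Same made-up 3-node graph with self-loops added.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
A_hat = A + np.eye(3)

# D^{-1/2} @ A_hat @ D^{-1/2}: each edge (i, j) is scaled
# by 1 / sqrt(deg(i) * deg(j)) instead of 1 / deg(i).
degrees = A_hat.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(degrees))
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt
print(A_norm)
```

Unlike the plain inverse-degree version, this normalized matrix stays symmetric, and high-degree neighbours are down-weighted on both ends of the edge.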
Here each node just ends up with a new feature vector; various tasks like node classification can be performed on top of it. We can also stack other layers after this GCN layer.
I will be explaining the mathematics of an algorithm dealing with the entire graph in the next article. Stay tuned.
Don’t forget to share your wonderful feedback!