The journey to deep learning on graphs: How GCNs overcame transductive limitations and the over-smoothing problem
Day 4, and we’re diving into one of my favourite topics: graphs. Social networks, molecules, the internet, recommendation engines — so much of the world is just nodes and edges. For a long time, machine learning struggled with this kind of data. Standard models want neat, fixed-size vectors, but graphs are messy, complex, and irregular. Today, I want to walk through the journey of how we learned to apply deep learning to graphs, starting from the old ways and building up to the models we use today.
The world before deep learning on graphs
Before GNNs became the standard, if you wanted to do machine learning on a graph, your options were pretty limited, and most of the methods fell into two buckets:
First, you had classic graph-based algorithms. For example: Label Propagation. The idea was simple: you have a few nodes with labels (like a few users you know are fraudulent) and a whole lot of nodes without labels. The algorithm assumes that “you are who you hang out with,” so it iteratively passes labels to neighboring nodes based on similarity. It relied on a very strong assumption that connected nodes should be similar, which isn’t always true. Plus, these methods didn’t really learn features; they just spread existing information…
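To make the "you are who you hang out with" intuition concrete, here is a minimal sketch of label propagation in plain Python. The function name, the edge-list/dict interface, and the toy graph are my own illustrative choices, not from any particular library: seed nodes keep their known labels clamped, while every other node repeatedly averages its neighbors' label distributions until the labels have spread.

```python
def label_propagation(edges, labels, n_iters=50):
    """Semi-supervised label propagation on an undirected graph.

    edges  : list of (u, v) pairs
    labels : dict mapping a few "seed" nodes to their known class labels
    Returns a dict mapping every node to its predicted label.
    """
    # Build adjacency lists from the edge list.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)

    classes = sorted(set(labels.values()))

    # Each node holds a probability distribution over classes;
    # seed nodes are one-hot, everyone else starts uniform.
    def seed_dist(node):
        return {c: float(labels[node] == c) for c in classes}

    dist = {
        n: seed_dist(n) if n in labels
        else {c: 1.0 / len(classes) for c in classes}
        for n in neighbors
    }

    for _ in range(n_iters):
        new = {}
        for n, nbrs in neighbors.items():
            if n in labels:
                # Clamp: labeled seeds never change.
                new[n] = seed_dist(n)
            else:
                # Unlabeled nodes average their neighbors' distributions.
                new[n] = {
                    c: sum(dist[m][c] for m in nbrs) / len(nbrs)
                    for c in classes
                }
        dist = new

    # Predict the most probable class for every node.
    return {n: max(dist[n], key=dist[n].get) for n in neighbors}
```

On a tiny path graph `0–1–2–3` where only node 0 is known to be "ok" and node 3 "fraud", the labels diffuse inward: node 1 ends up "ok" and node 2 "fraud". Note that the algorithm never learns anything from node features; it only smooths the labels you already have across the edges, which is exactly the limitation described above.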