https://towardsdatascience.com/how-to-do-deep-learning-on-graphs-with-graph-convolutional-networks-7d2250723780
https://github.com/dmlc/dgl
https://github.com/dglai/DGL-GTC2019/blob/master/slides.pptx
http://tkipf.github.io/misc/SlidesCambridge.pdf
GCN
G = (V, E)
Input: H^(0) = X ∈ [N × F^0]
N = # of nodes
F^0 = # of input features per node
Adjacency matrix
representation of the graph structure
A ∈ [N × N]
Output
H^(l+1) = f(H^(l), A) ∈ [N × F^(l+1)]
Propagation
Sum Rule
sum up feature representations of the neighbors of the ith node
f(H^(l), A) = σ(A ⋅ H^(l) ⋅ W^(l))
aggregate(A, X)_i = Σ_{j=1..N} A_(i,j) ⋅ X_j
W^(l) = weight matrix ∈ [F^l × F^(l+1)]
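A minimal NumPy sketch of the sum rule and one propagation step; the 4-node undirected toy graph, the random weights, and ReLU as σ are illustrative assumptions, not part of the notes:

```python
import numpy as np

# Toy undirected graph with N = 4 nodes: adjacency matrix A (N x N)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)

# Node features X = H^(0), shape [N, F^0] with F^0 = 2
X = np.array([[i, -i] for i in range(4)], dtype=float)

# Sum rule: row i of A @ X sums the feature vectors of node i's neighbors
agg = A @ X

# One propagation step f(H, A) = sigma(A @ H @ W),
# with a randomly initialized weight matrix W of shape [F^0, F^1]
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 3))
H1 = np.maximum(0.0, A @ X @ W)   # ReLU; H1 has shape [N, F^1] = [4, 3]
```

Note that row i of `agg` contains only the neighbors' features: without self-loops a node's own representation is dropped, which motivates the Â = A + I fix below.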
Mean Rule: Self-loop & normalization
Â = A + I (self-loops, so a node's own features are included in the aggregate)
f(X, A) = D^(-1) ⋅ Â ⋅ X
aggregate(A, X)_i = Σ_{j=1..N} (Â_(i,j) / D_(i,i)) ⋅ X_j
D = degree matrix of Â; dividing by D_(i,i) turns the neighbor sum into a mean
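The mean rule can be sketched the same way; the toy graph below is an illustrative assumption:

```python
import numpy as np

# Toy undirected graph, N = 4 nodes
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
X = np.array([[i, -i] for i in range(4)], dtype=float)

# Self-loops: A_hat = A + I
A_hat = A + np.eye(4)

# D = degree matrix of A_hat (row sums on the diagonal)
D = np.diag(A_hat.sum(axis=1))

# Mean rule: f(X, A) = D^-1 @ A_hat @ X
# Each node averages its own features with its neighbors' features
H = np.linalg.inv(D) @ A_hat @ X
```

Each row of `H` is now bounded by the scale of the input features, instead of growing with the node's degree as under the plain sum rule.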
Spectral Rule
f(X, A) = D^(-1/2) ⋅ Â ⋅ D^(-1/2) ⋅ X
aggregate(A, X)_i = Σ_{j=1..N} Â_(i,j) / (D_(i,i)^(1/2) ⋅ D_(j,j)^(1/2)) ⋅ X_j
the contribution of neighbor j to node i is normalized by both the degree of the ith node and the degree of the jth node, so high-degree neighbors are weighted down
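A sketch of the spectral rule's symmetric normalization, reusing the same illustrative toy graph:

```python
import numpy as np

# Toy undirected graph, N = 4 nodes
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
X = np.array([[i, -i] for i in range(4)], dtype=float)

A_hat = A + np.eye(4)                       # self-loops
deg = A_hat.sum(axis=1)                     # degrees of A_hat
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))    # D^(-1/2)

# Spectral rule: D^(-1/2) @ A_hat @ D^(-1/2)
# Entry (i, j) equals A_hat[i, j] / sqrt(D_ii * D_jj)
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt
H = A_norm @ X
```

Because both endpoint degrees appear in the denominator, `A_norm` stays symmetric for an undirected graph, unlike the row-wise `D^-1` normalization of the mean rule.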
Transductive = test nodes are present (unlabeled) in the training graph; only their labels are withheld, as in semi-supervised node classification on one fixed graph
Inductive = the trained model must generalize to nodes or graphs never seen during training