YOU MUST EAT THIS GRAPH NEWS, GRAPH OMAKASE. First week of June
Topological Deep Learning: Going Beyond Graph Data
[https://arxiv.org/pdf/2206.00606.pdf]
This paper gives a clear answer to the question “Why TDA?”
It breaks the stereotype that TDA can only be used for geometry problems.
From the perspective of the combinatorial complex neural network, the existing pair-wise (graph), hypergraph, and combinatorial complex (rank-based) views are described well in topological terms. In particular, I learned for the first time from this paper that graph augmentation is possible by combining hypergraphs and topology.
I think the paper explains creative new insights in a logical way. It should be very helpful for anyone looking for a fresh research topic, or for something beyond hypergraphs.
If you are interested in the practical side, I recommend starting with Section 2.2, Utility of Topology. Its contents are as follows:
1. Building hierarchical representations of data.
2. Improving performance through inductive bias.
3. Construction of efficient graph-based message passing over long ranges.
4. Improved expressive power.
The paper describes these four benefits of TDA in detail, all of which are also useful in machine learning. In addition, pooling is an essential step in graph classification, and replacing this step with a topology-based operation is also very interesting.
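As a toy illustration of the “hierarchical representation” and topology-based pooling ideas, the sketch below lifts a plain graph to higher-order cells (triangles) and pools node features onto them. Everything here is my own minimal example, not the paper's construction.

```python
# Minimal sketch (not the paper's implementation): lift a graph to higher-order
# cells and pool node features onto them, in the spirit of topology-based pooling.
import networkx as nx
import numpy as np

G = nx.karate_club_graph()
X = np.random.rand(G.number_of_nodes(), 8)           # toy node features

# 2-cells: triangles of the graph, treated as higher-order cells of a complex
triangles = [c for c in nx.enumerate_all_cliques(G) if len(c) == 3]

# Incidence matrix B[cell, node] = 1 if the node belongs to the cell
B = np.zeros((len(triangles), G.number_of_nodes()))
for i, cell in enumerate(triangles):
    B[i, cell] = 1.0

# "Pooling": each cell's representation is the mean of its member nodes' features,
# giving a coarser, hierarchical representation of the graph
cell_repr = (B @ X) / B.sum(axis=1, keepdims=True)
print(cell_repr.shape)                                # (num_triangles, feature_dim)
```

The same incidence pattern generalizes to other cell choices (cliques, rings, hyperedges), which is exactly where the hierarchical flavor of the combinatorial complex view comes from.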
GPT-GNN: Generative Pre-Training of Graph Neural Networks
[https://arxiv.org/pdf/2006.15437.pdf]
Introduction
In NLP and CV, labeling tools and pipelines are well established, so labeled data is relatively easy to come by. In contrast, graph datasets are difficult to label, mainly because it is hard to say what kind of relationship an edge represents. This paper approaches the shortage of labeled graph data with pre-training and fine-tuning, so pre-training is the key: how can we learn weights while preserving the intrinsic distribution of the data? The paper discusses those ways.
Preliminary
GNN pre-training and fine-tuning
Graph neural networks (GNNs) are typically trained using a two-step process: pre-training and fine-tuning. In the pre-training phase, the GNN is trained on a large-scale graph dataset to learn meaningful representations of nodes and edges. The learned representations capture structural and relational information present in the graph.
- Data preparation: The first step is to prepare the graph dataset for pre-training. This involves representing the graph as a set of nodes and edges, where each node and edge has associated features or attributes.
- Encoder architecture: The next step is to define the architecture of the graph neural network. GNNs typically consist of multiple layers of graph convolutional operations that aggregate information from neighboring nodes and update the node representations. The exact architecture can vary depending on the specific GNN model being used.
- Pre-training objective: A pre-training objective is defined to guide the learning process. The objective typically involves reconstructing or predicting some aspect of the graph structure. One common objective is node-level representation learning, where the GNN is trained to predict the attributes or labels of nodes in the graph based on their local neighborhood.
- Pre-training algorithm: The pre-training algorithm iterates over the graph dataset and updates the parameters of the GNN to minimize the pre-training objective. The algorithm typically uses gradient-based optimization techniques, such as stochastic gradient descent (SGD), to update the network parameters. A minimal sketch of such a loop is shown below.
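As a concrete illustration of the pipeline above, here is a minimal PyTorch sketch, assuming a toy random graph, a single linear layer standing in for the GNN encoder, and attribute reconstruction as the pre-training objective. It is an assumption-level example, not GPT-GNN's actual code.

```python
# Minimal GNN pre-training sketch: mean-aggregate neighbors, then reconstruct
# node attributes as the self-supervised objective.
import torch
import torch.nn as nn

num_nodes, feat_dim, hidden = 100, 16, 32
X = torch.randn(num_nodes, feat_dim)                      # node attributes
A = (torch.rand(num_nodes, num_nodes) < 0.05).float()     # toy adjacency
A = ((A + A.t()) > 0).float()
A_hat = A / A.sum(dim=1, keepdim=True).clamp(min=1.0)     # row-normalized adjacency

encoder = nn.Linear(feat_dim, hidden)                     # stands in for a GNN layer
decoder = nn.Linear(hidden, feat_dim)                     # attribute reconstruction head
opt = torch.optim.SGD(list(encoder.parameters()) + list(decoder.parameters()), lr=0.01)

for step in range(200):
    H = torch.relu(encoder(A_hat @ X))                    # aggregate neighbors, then transform
    loss = nn.functional.mse_loss(decoder(H), X)          # pre-training objective
    opt.zero_grad()
    loss.backward()
    opt.step()
```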
Summary
The core of this paper is how to create supervision from data that has no labels. So how do we define the labels? Here, the dependency between attributes and structure is used. The approach applies the masking and permutation techniques you have seen a lot in BERT: nodes and edges are put in a random (permuted) order and partially masked, and the model generates the masked node attributes and edges (structure) from what remains, comparing the generated distributions with the true ones. By repeating this process, the model captures the correlation between attributes and structure, and label generation is based on that correlation.
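Here is a rough sketch of that masking-and-generation idea under my own simplifying assumptions (a placeholder adjacency, MSE for attribute generation, and a softmax-style score for one held-out edge). It illustrates the two losses, not the paper's implementation.

```python
# Mask some node attributes, generate them from the remaining graph, and score a
# held-out edge against random negatives.
import torch
import torch.nn as nn

num_nodes, feat_dim, hidden = 100, 16, 32
X = torch.randn(num_nodes, feat_dim)
A_hat = torch.eye(num_nodes)                              # placeholder normalized adjacency
encoder = nn.Linear(feat_dim, hidden)
attr_head = nn.Linear(hidden, feat_dim)

# 1) Attribute generation: mask 15% of the nodes' attributes and reconstruct them
mask = torch.rand(num_nodes) < 0.15
X_masked = X.clone()
X_masked[mask] = 0.0                                      # masked attributes are hidden from the model
H = torch.relu(encoder(A_hat @ X_masked))
attr_loss = nn.functional.mse_loss(attr_head(H[mask]), X[mask])

# 2) Edge generation: a masked edge (u, v) should score higher than random negatives
u, v = 3, 7                                               # a held-out (masked) edge, chosen arbitrarily
neg = torch.randint(0, num_nodes, (5,))                   # random negative endpoints
pos_score = (H[u] * H[v]).sum()
neg_score = (H[u] * H[neg]).sum(dim=1)
edge_loss = -torch.log_softmax(torch.cat([pos_score.view(1), neg_score]), dim=0)[0]

loss = attr_loss + edge_loss                              # joint self-supervised objective
```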
Insight
In addition to the parts mentioned in the Summary, there are some other interesting ideas: 1. the design of the attribute and edge generation losses, and 2. an adaptive embedding queue for large-scale graph subsampling. With the queue, the embeddings produced at each generation step are kept and reused across subsampled subgraphs, which lets the method scale to, and be parallelized on, large graphs.
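As I read it, the queue works roughly like the sketch below: embeddings from earlier subsampled subgraphs are kept in a FIFO buffer and reused as extra negatives for edge generation. The `edge_negatives` and `update_queue` helpers and the queue size are my own illustrative assumptions, not the paper's code.

```python
# Keep a queue of node embeddings across subsampled subgraphs so that edge
# generation can draw extra negative samples from earlier batches on a large graph.
from collections import deque
import torch

queue = deque(maxlen=4096)                        # FIFO queue of past node embeddings

def edge_negatives(H_batch, num_neg=256):
    """Return negatives for the current batch, drawn from recent embeddings."""
    if len(queue) == 0:
        return H_batch[torch.randint(0, H_batch.size(0), (num_neg,))]
    bank = torch.stack(list(queue))
    idx = torch.randint(0, bank.size(0), (num_neg,))
    return bank[idx]

def update_queue(H_batch):
    """Push the current subgraph's embeddings (detached) into the queue."""
    for h in H_batch.detach():
        queue.append(h)

# usage per subsampled subgraph (hypothetical encoder call):
# H_batch = encoder(subgraph)
# negs = edge_negatives(H_batch)
# ... compute the edge generation loss against `negs` ...
# update_queue(H_batch)
```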
Graph Neural Networks can Recover the Hidden Features Solely from the Graph Structure
[https://arxiv.org/pdf/2301.10956.pdf]
Introduction
Message passing, i.e., propagation and aggregation over node and edge features to update the GNN, is the key operation. So what happens if those features are uninformative or do not exist at all? To alleviate this problem, the paper proposes an idea for recovering the hidden features.
Preliminary
The Correct and Smooth technique offers a contrasting perspective to this paper: it deals only with pre- and post-processing and does not touch the embedding space.
The Correct and Smooth (C&S) procedure involves two post-processing steps: (i) an “error correlation” step that spreads residual errors in training data to correct errors in test data, and (ii) a “prediction correlation” step that smooths the predictions on the test data. These steps are implemented via simple modifications to standard label propagation techniques from early graph-based semi-supervised learning methods.
The Correct and Smooth (C&S) procedure uses label propagation techniques to capture informative distributions of the graph. The two post-processing steps in C&S are designed to exploit correlation in the label structure of the graph, which helps to improve the accuracy of node classification. However, it’s important to note that C&S does not rely on learning over the graph and only uses it in pre-processing and post-processing steps.
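To make the two steps concrete, here is a compact numpy sketch of the procedure as described above. It is my paraphrase under simple assumptions (a fixed scaling factor instead of the paper's autoscale, and standard label-spreading iterations), not the authors' implementation; `Z`, `Y`, `S`, and `train_idx` are assumed inputs.

```python
# Correct & Smooth as two label-propagation post-processing steps.
import numpy as np

def correct_and_smooth(Z, Y, train_idx, S, alpha1=0.8, alpha2=0.8, iters=50, scale=1.0):
    """Z: (n, c) base soft predictions, Y: (n, c) one-hot labels (only train rows used),
    S: (n, n) symmetrically normalized adjacency, train_idx: indices of labeled nodes."""
    # (i) Correct: spread the residual errors observed on the training nodes
    E0 = np.zeros_like(Z)
    E0[train_idx] = Y[train_idx] - Z[train_idx]
    E = E0.copy()
    for _ in range(iters):
        E = alpha1 * (S @ E) + (1 - alpha1) * E0
    Z_corrected = Z + scale * E

    # (ii) Smooth: propagate the corrected predictions, anchored on the true train labels
    G0 = Z_corrected.copy()
    G0[train_idx] = Y[train_idx]
    G = G0.copy()
    for _ in range(iters):
        G = alpha2 * (S @ G) + (1 - alpha2) * G0
    return G
```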
Summary
Intrinsic graph characteristics? I think that is the real question. The paper proposes recovering hidden features from the graph's own structure, i.e., finding insights we could not see in the graph structure before. It approaches this from a different perspective than the previously used degree features, spectral (Laplacian) views, or k-NN graph constructions.
It is about extracting inductive bias through new notions of distance and expressiveness, which is a very fresh point of view. The focus is placed a little more on graph-specific properties in order to find features inherent to the graph. For example, are nodes that are 2 hops apart in the real graph really 2 hops apart in the embedding space? The experiments show that this is often not the case, so the authors devise a method that carries the appropriate real-world features over to the embedded features by actively using the node distance and node expressiveness mentioned above.
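Here is a small illustrative sketch of that sanity check, assuming random stand-in embeddings on a toy graph: compare the average embedding distance of 2-hop pairs against far-apart pairs. It is not the paper's experiment, just the shape of the idea.

```python
# Sanity check: are 2-hop neighbors in the graph actually "close" in the embedding space?
import networkx as nx
import numpy as np

G = nx.karate_club_graph()
rng = np.random.default_rng(0)
emb = rng.normal(size=(G.number_of_nodes(), 16))   # stand-in for learned node embeddings

hop = dict(nx.all_pairs_shortest_path_length(G))   # graph (hop) distances

pairs_2hop, pairs_far = [], []
for u in G.nodes:
    for v, d in hop[u].items():
        if u < v and d == 2:
            pairs_2hop.append((u, v))
        elif u < v and d >= 4:
            pairs_far.append((u, v))

def mean_emb_dist(pairs):
    return np.mean([np.linalg.norm(emb[a] - emb[b]) for a, b in pairs])

# If the embedding respects the structure, 2-hop pairs should be closer on average
print("2-hop pairs:", mean_emb_dist(pairs_2hop))
print(">=4-hop pairs:", mean_emb_dist(pairs_far))
```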
Insight
In the experimental section you can observe a three-way battle: ground truth vs. traditional methods vs. the proposed method. For me it was an insight into indirectly measuring embedding quality with performance results alone. I think it is worth taking a look and studying how the embedding results lead to the performance results.
Correct & Smooth, which is written in the preliminary, focuses on the embedded space if it is from the perspective of increasing performance through distribution capture by generating diffusion based on error. These two views may seem mutually exclusive, but they have something in common. It’s about focusing on structure. I wonder what kind of synergy will come from combining one side of the structure in real space and the other side of the structure in embedded space. Why don’t you try it?