GRAPH NEWS YOU MUST EAT, GRAPH OMAKASE. June, Week 2
Spatial Meta-graph Learning for Traffic Prediction
[https://arxiv.org/pdf/2211.14701.pdf]
Introduction
If you look back at the origins of graph theory, you'll remember the Königsberg bridge problem (Königsberger Brückenproblem): how do you cross the seven bridges efficiently to reach your destination? Transplanted into modern times, this classic problem becomes the shortest-route planning feature in the map apps we use every day.
Reflecting shortest routes in real time is already demanding, but when an accident occurs on some road, quickly and precisely re-planning, re-computing, and applying the shortest route on the spot is a core capability of a map service.
To build this core capability, the companies running these services all wrestle with one challenge: how do you optimize the many factors that arise simultaneously, in real time? Among those many factors, two stand out: the time variable and the input signal.
So how can we approach these two elements with graphs and turn that into performance? That is what this paper talks about.
Preliminary
[Spatio-temporal learning]
Spatio-temporal learning in graph neural networks (GNNs) refers to the task of learning the spatial and temporal patterns of graph-structured data.
Graphs: A graph is a mathematical structure in which objects called nodes are connected by edges. In the context of a GNN, a node represents an object (e.g., a user, a molecule, a location) and an edge represents a relationship or interaction between those objects.
Spatial learning: Spatial learning in a GNN means understanding the patterns and relationships between neighboring nodes in the graph. Each node is connected to its neighbors through edges, and spatial learning aims to capture local contextual information for each node by collecting and processing information from those neighbors.
Temporal learning: Temporal learning in a GNN means identifying temporal patterns and changes within the graph data. This is especially important when the graph structure or the properties of nodes and edges change dynamically. A GNN can model temporal dependencies by updating node and edge properties based on historical information, which lets it learn and predict the future state of the graph.
Spatio-temporal learning: Spatio-temporal learning combines the spatial and temporal aspects in a GNN to capture the complex patterns that arise within the graph. It models both the local spatial patterns and the temporal dependencies of the graph structure and features, i.e., the patterns arising from node-edge interactions over time.
By considering spatial and temporal information simultaneously, GNNs can effectively model and predict the behavior of dynamic systems represented as graphs.
In summary, spatio-temporal learning in GNNs means having the ability to understand and predict the dynamic behavior of the underlying system by capturing both the local patterns and the temporal dependencies of graph-structured data.
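To make the spatial half concrete, here is a minimal sketch of one GCN-style message-passing step (PyTorch; the graph, tensor names, and shapes are illustrative assumptions of mine, not from the paper):

```python
# Minimal sketch of one spatial message-passing step on a small graph.
# Node features are aggregated from neighbors via a normalized adjacency matrix.
import torch

num_nodes, feat_dim = 4, 8
x = torch.randn(num_nodes, feat_dim)            # node features
adj = torch.tensor([[0., 1., 1., 0.],
                    [1., 0., 0., 1.],
                    [1., 0., 0., 1.],
                    [0., 1., 1., 0.]])

adj_hat = adj + torch.eye(num_nodes)            # add self-loops
deg_inv_sqrt = adj_hat.sum(dim=1).pow(-0.5)
norm_adj = deg_inv_sqrt[:, None] * adj_hat * deg_inv_sqrt[None, :]

w = torch.nn.Linear(feat_dim, feat_dim, bias=False)
h = torch.relu(norm_adj @ w(x))                 # one GCN-style spatial layer

# Temporal learning would then model how h evolves across time steps,
# e.g. by feeding the sequence of h's into a recurrent unit (see the GCRU sketch below).
```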
Summary
The key is to predict future patterns from past patterns by optimizing the spatial and temporal factors at the same time. The upper part of Figure 2 uses an architecture called GCRU: simply put, a model that adds a GRU to a GCN so that the time variable can be applied. Here GCRU plays a supporting role, as one of the tools used to carry the core idea of this paper, the Meta-Graph Learner.
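As a rough illustration of what "adding a GRU to a GCN" means, here is a sketch of a GCRU-style cell in which the GRU gates use graph convolutions instead of plain linear maps. This is my simplified reading, not the authors' implementation; norm_adj is a normalized adjacency matrix as in the earlier sketch.

```python
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    """One GCN-style layer: aggregate neighbors, then transform."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, norm_adj):
        return self.lin(norm_adj @ x)

class GCRUCell(nn.Module):
    """GRU cell whose gates use graph convolutions instead of plain linear maps."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.gz = GraphConv(in_dim + hid_dim, hid_dim)  # update gate
        self.gr = GraphConv(in_dim + hid_dim, hid_dim)  # reset gate
        self.gh = GraphConv(in_dim + hid_dim, hid_dim)  # candidate state

    def forward(self, x_t, h_prev, norm_adj):
        xh = torch.cat([x_t, h_prev], dim=-1)
        z = torch.sigmoid(self.gz(xh, norm_adj))
        r = torch.sigmoid(self.gr(xh, norm_adj))
        h_tilde = torch.tanh(self.gh(torch.cat([x_t, r * h_prev], dim=-1), norm_adj))
        return (1 - z) * h_prev + z * h_tilde   # new hidden state per node
```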
So what about the Meta-Graph Learner? It consists of three components.
Meta-Node Bank
This is the heart of the paper. Put plainly, a separate space is created for storing meta information, and that information is pulled out whenever it is needed, much like metadata management. There is a limit to re-training on and inferring over every value that arrives in a stream after storing it, so instead we check whether the incoming data resembles the stored meta information, narrow the scope, and then infer. In other words, it functions like a computer's cache memory.
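If I sketch that idea in code (all names and shapes are my own assumptions, not the paper's), the bank is a small learnable memory of meta-node embeddings that each real node queries by similarity, much like a cache lookup:

```python
import torch
import torch.nn as nn

class MetaNodeBank(nn.Module):
    """A small learnable memory of meta-node embeddings.
    Each real node retrieves a meta representation by attention over the bank."""
    def __init__(self, num_meta_nodes, meta_dim, node_dim):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(num_meta_nodes, meta_dim))
        self.query_proj = nn.Linear(node_dim, meta_dim)

    def forward(self, node_states):                 # (num_nodes, node_dim)
        queries = self.query_proj(node_states)      # (num_nodes, meta_dim)
        scores = queries @ self.memory.t()          # similarity to each meta node
        attn = torch.softmax(scores, dim=-1)
        return attn @ self.memory                   # (num_nodes, meta_dim)
```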
Hyper-Network
Here a hyper-network lifts the memory (the meta embeddings) into weights: FC layers generate weights from the meta-graph memory, and those weights are injected into the decoder's initial GCRU layer, right before its aggregation. In other words, the purpose of this process is to reflect (share) the meta information in the weights of the decoder's starting layer (GCRU).
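A hedged sketch of that mechanism, simplified to a single generated weight matrix (class and variable names here are hypothetical):

```python
import torch
import torch.nn as nn

class WeightHyperNetwork(nn.Module):
    """Maps a meta embedding to the weight matrix of one decoder layer,
    so that meta information is shared through generated parameters."""
    def __init__(self, meta_dim, in_dim, out_dim):
        super().__init__()
        self.in_dim, self.out_dim = in_dim, out_dim
        self.fc = nn.Sequential(
            nn.Linear(meta_dim, 64), nn.ReLU(),
            nn.Linear(64, in_dim * out_dim),
        )

    def forward(self, meta_emb):                 # (num_nodes, meta_dim)
        w = self.fc(meta_emb)                    # per-node flat weights
        return w.view(-1, self.in_dim, self.out_dim)

# Usage sketch: the generated per-node weights replace (or initialize) the
# linear map inside the decoder's first GCRU layer.
# weights = WeightHyperNetwork(meta_dim=16, in_dim=64, out_dim=64)(meta_emb)
# h0 = torch.einsum('nd,ndo->no', node_states, weights)
```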
Loss function constraint for robustness.
A loss function is used so that similar things are learned to be similar and dissimilar things are not. Node information is drawn from the meta-node bank, and based on the similarity between those pieces of information, the goal is that similar patterns are mapped to more similar embedding vectors while dissimilar patterns are pushed apart. Through this process, a new layer (GCRU) is eventually created whose weights are learned from the meta, positive, and negative patterns.
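One way to write such a constraint is a triplet-style term that pulls a node's embedding toward a similar ("positive") meta pattern and pushes it away from a dissimilar ("negative") one; the paper's exact formulation may differ:

```python
import torch
import torch.nn.functional as F

def meta_pattern_loss(anchor, positive, negative, margin=1.0):
    """Triplet-style constraint: similar meta patterns end up closer in the
    embedding space than dissimilar ones, by at least `margin`."""
    pos_dist = F.pairwise_distance(anchor, positive)
    neg_dist = F.pairwise_distance(anchor, negative)
    return F.relu(pos_dist - neg_dist + margin).mean()
```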
Broadly speaking, the previous steps produce two layers: one that does not reflect the meta information and one that does. These two layers are concatenated and used for the final prediction.
Insight
Weight-sharing networks: HyperNetworks [https://arxiv.org/pdf/1609.09106.pdf]
A hypernetwork is a technique in which one neural network generates or shares the parameters (weights) of another network. Hyper-networks for weight sharing are widely used and provide a flexible mechanism for dynamically generating network weights based on a specific input condition or context. In this paper the technique was used to inject meta weights, but I think it will also be useful in fields where meta information is essential, such as recommender systems and anomaly detection.
Temporal Knowledge Graph Reasoning with Historical Contrastive Learning
[https://arxiv.org/pdf/2211.10904.pdf]
Introduction
Relation inference on knowledge graphs is a harder task than relation inference on ordinary graphs, because the inference has to be drawn from knowledge-based data. Predicting one element of a triple (s, p, o) is already difficult, so how much harder is the quadruple task (s, p, o, t), where a temporal element is added? For example, if the existing task was to infer whom I apologize to, the quadruple task also has to determine when I apologize, which makes it that much more difficult.
To overcome this difficulty, this paper proposes an idea that exploits both historical and non-historical dependencies.
*s: subject, p: predicate, o: object, t: timestamp
Preliminary
[Contrastive learning]
In graph neural networks, contrastive learning is a method of training models by leveraging similarities and differences in the data. It is used to learn meaningful representations from graph data and can be useful in a variety of applications.
Contrastive learning works roughly as follows:
Generating positive and negative pairs: First, positive (similar) and negative (dissimilar) pairs are generated from the graph data. A positive pair consists of two objects with similar features or attributes; a negative pair consists of objects with differing features or attributes.
Creating embeddings: The generated positive and negative pairs are fed into a graph neural network to produce embeddings (low-dimensional representations) of the objects. The embedding maps similar objects to nearby locations in the representation space.
Reinforcing similarity and difference: Based on the generated embeddings, a method is applied to strengthen similarity and sharpen difference. Typically, Euclidean distance or cosine similarity is used to minimize the distance between positive pairs and maximize the distance between negative pairs.
Optimizing the loss function: A loss function is defined over these similarities and differences and optimized to train the graph neural network. Typical choices include the contrastive loss or the triplet loss; a minimal sketch follows below.
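Here is a minimal sketch of one common contrastive objective (an InfoNCE-style loss with cosine similarity); it illustrates the general recipe above, not any specific paper's loss:

```python
import torch
import torch.nn.functional as F

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss: the anchor should be more similar to its
    positive than to any of its negatives (cosine similarity, softmax over pairs)."""
    anchor = F.normalize(anchor, dim=-1)                      # (B, d)
    positive = F.normalize(positive, dim=-1)                  # (B, d)
    negatives = F.normalize(negatives, dim=-1)                # (B, K, d)

    pos_sim = (anchor * positive).sum(-1, keepdim=True)       # (B, 1)
    neg_sim = torch.einsum('bd,bkd->bk', anchor, negatives)   # (B, K)
    logits = torch.cat([pos_sim, neg_sim], dim=1) / temperature
    labels = torch.zeros(anchor.size(0), dtype=torch.long)    # positive sits at index 0
    return F.cross_entropy(logits, labels)
```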
Summary
To leverage historical and non-historical dependencies, the paper uses the concept of contrastive learning. Given a query (s, p, ?, t), it determines whether the query depends on historical or non-historical factors, builds a context vector accordingly, and makes the inference based on that vector. At this point a mask-based technique is used for inference efficiency.
So you might be wondering how that dependency is actually checked. The answer is a similarity score function and a frequency score function. Simply put, one measures the similarity between the query and each candidate via an inner product, and the other uses an indicator operator to express, as 1s and 0s, how frequently the query has appeared before the given time. Together they measure the temporal dependency and quantify it as a context vector.
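A hedged sketch of how those two signals could be computed (tensor names, shapes, and the exact combination are my assumptions; the paper's scoring functions may differ):

```python
import torch

def historical_scores(query_emb, entity_embs, past_objects, num_entities):
    """Two signals for a query (s, p, ?, t):
    - similarity score: inner product between the query embedding and each candidate entity
    - frequency score: indicator-style 0/1 marks of whether each candidate appeared
      as the object of (s, p) before time t."""
    sim_score = entity_embs @ query_emb                   # (num_entities,)

    freq_score = torch.zeros(num_entities)
    for o in past_objects:                                # objects observed before t
        freq_score[o] += 1.0
    freq_score = (freq_score > 0).float()                 # indicator: 1 if seen before, else 0

    # The two scores are combined into a context vector that encodes
    # how historically dependent the query is.
    return torch.stack([sim_score, freq_score], dim=-1)   # (num_entities, 2)

# Usage sketch (hypothetical sizes):
# ctx = historical_scores(torch.randn(64), torch.randn(1000, 64),
#                         past_objects=[3, 17, 3], num_entities=1000)
```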
Insight
I think the ablation study is the real highlight. By checking the performance differences across the idea's components, such as hard mask, soft mask, positive ratio, and negative ratio, we can infer which factors affect KGE, so I think anyone who will be working on these tasks can use it as a good starting point.
In addition, if you compare the results of the KGE baselines with the idea in this paper and work through the basic question of why it performs better than the baselines, you will find yourself picking up the overall framing of knowledge graph embedding.