YOU MUST EAT THIS GRAPH NEWS, GRAPH OMAKASE. Week 1 of March
Preview
[GNNs meet the SIR model]
[Influence maximization: who actually spreads information best?]
[Link prediction: do you only know contrastive learning?]
[What does the paper retweeted by Professor Jure say about the much-discussed global context?]
[How does LightGCN improve when you add a drop of differential equations?]
Network science and graph embedding
Identifying critical nodes in complex networks by graph representation learning
[https://arxiv.org/pdf/2201.07988.pdf]
- Finding important nodes and edges in network data has long been studied not only in complex-network research but also in physics, statistics, and mathematics. That line of work gave birth to PageRank and degree centrality, which we still use all the time. However, as data volumes grow exponentially, finding the "critical" nodes among countless candidates becomes so difficult that heuristic (human-biased) methods are applied, such as cutting out (sampling) certain parts of the graph and using only those as a reference. Specifically, percolation theory is used to design heuristic rules (interactive selection, or grouping by local attributes) that approximate the optimal solution.
- This paper presents the idea of extracting important nodes by combining graph embedding with the heuristics mentioned above, using a method called influence maximization. Influence maximization takes a network as input and looks for the nodes whose activation spreads the largest amount of information (entropy) through the network; in other words, the goal is to find where the most information exchange originates. The SIR (Susceptible-Infected-Recovered) epidemic model is borrowed for this: information propagation paths are simulated with SIR, and a graph embedding is learned on those paths so that the model captures the optimal propagation behavior of each node. You can then figure out which nodes will ultimately deliver information across your network.
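- To make the SIR step concrete, here is a minimal sketch (not the paper's code) that scores a node's spreading power by Monte-Carlo SIR simulation on a networkx graph; the infection rate beta, recovery rate gamma, and run count are illustrative assumptions.

```python
import random
import networkx as nx

def sir_influence(G, seed, beta=0.1, gamma=1.0, runs=100):
    """Average number of nodes ever infected when `seed` starts an SIR cascade."""
    total = 0
    for _ in range(runs):
        infected, recovered = {seed}, set()
        while infected:
            new_infected, newly_recovered = set(), set()
            for u in infected:
                # each infected node tries to infect its susceptible neighbors
                for v in G.neighbors(u):
                    if v not in infected and v not in recovered and v not in new_infected:
                        if random.random() < beta:
                            new_infected.add(v)
                # recover with probability gamma (gamma=1.0: recover after one step)
                if random.random() < gamma:
                    newly_recovered.add(u)
            recovered |= newly_recovered
            infected = (infected | new_infected) - recovered
        total += len(recovered)
    return total / runs

G = nx.karate_club_graph()
scores = {n: sir_influence(G, n) for n in G}
print(sorted(scores, key=scores.get, reverse=True)[:5])  # top-5 spreaders
```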
- After all, the core of graph embedding is message passing, and I think the SIR model targets that key element well. It was also a very interesting read because it organizes the history of how complex-network researchers have approached critical-node identification; it felt like absorbing knowledge I didn't have before. In particular, I feel I grew a step further after learning about methods such as VoteRank, EnRenew, Improved K-shell, and NCVoteRank. Now, why not try looking for critical nodes from a perspective other than PageRank, using the methods mentioned above?
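- As a starting point, networkx already ships VoteRank, so you can compare it with PageRank in a few lines (EnRenew and NCVoteRank would need their papers' reference implementations):

```python
import networkx as nx

G = nx.karate_club_graph()

# PageRank: importance as the stationary probability of a random walk
pr = nx.pagerank(G)
top_pr = sorted(pr, key=pr.get, reverse=True)[:5]

# VoteRank: iteratively elects spreaders while dampening the voting
# power of neighbors of already-elected nodes
top_vr = nx.voterank(G, number_of_nodes=5)

print("PageRank:", top_pr)
print("VoteRank:", top_vr)
```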
Link Prediction with Non-Contrastive Learning
[https://arxiv.org/abs/2211.14394]
- In contrastive learning, positive and negative samples are combined to learn better embeddings. It is used a lot in link prediction, especially in recommender systems. The author argues this method suffers from 1. the cost of negative sampling and 2. overfitting, and therefore turns to non-contrastive learning: the expensive negative-sampling step is replaced with a simple, cheap corruption function.
- One question naturally comes up here. The stated goal is to remove contrastive learning's negative-sampling cost, and yet negatives appear again? That question forces you to rethink what non-contrastive learning actually is. The difference between the two is simply whether the loss gives explicit leverage to positive and negative pairs. Contrastive learning is often seasoned with augmentation as well, for example by weighting the negative samples.
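- For contrast with the negative-sampling recipe, here is a minimal sketch of a BGRL-style non-contrastive objective (my own simplification, not the paper's code): there is no negative term at all; an online encoder with a predictor head chases an exponential-moving-average target encoder across two augmented views. The tiny mean-aggregation encoder and the feature-masking augmentation are stand-ins for illustration.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEncoder(nn.Module):
    """Stand-in for a GNN: one mean-aggregation layer, then a projection."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        return self.lin(adj @ x)

online = TinyEncoder(16, 32)
target = copy.deepcopy(online)      # target branch: EMA copy, never backpropagated
predictor = nn.Linear(32, 32)

def bgrl_loss(view1, view2):
    """Non-contrastive: the online branch just predicts the target branch's
    embedding of the other view; no negative pairs anywhere."""
    h_online = predictor(online(*view1))
    with torch.no_grad():
        h_target = target(*view2)
    return -F.cosine_similarity(h_online, h_target, dim=-1).mean()

@torch.no_grad()
def ema_update(tau=0.99):
    """Target weights drift toward online weights (exponential moving average)."""
    for p_o, p_t in zip(online.parameters(), target.parameters()):
        p_t.mul_(tau).add_((1 - tau) * p_o)

# toy usage: two augmented views of the same 10-node graph
x, adj = torch.randn(10, 16), torch.eye(10)
x_aug = x * (torch.rand_like(x) > 0.1)   # illustrative feature-masking augmentation
loss = bgrl_loss((x, adj), (x_aug, adj))
loss.backward()
ema_update()
```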
- The transductive experiments go as follows: after training BGRL (the proposed method) and ML-GCN (the conventional method), the positive/negative score distributions are observed, and the difference between leveraging and not leveraging negative sampling turns out to be smaller than expected. The author notes that ML-GCN's performance is dominant, but not overwhelmingly so.
- Inductive learning shows the reverse result: BGRL is far more dominant over ML-GCN. From this, it can be inferred that leveraging negative sampling is not very useful in the inductive setup. In other words, contrastive learning is not an all-rounder in the setting closest to the real world.
- As for the difference between transductive and inductive: it comes down to whether the nodes used during training are kept or removed. In transductive learning, the nodes seen at training time appear as-is in train/test/inference, while in inductive learning those nodes are removed, so the model must handle nodes it has never seen. In the end, inductive learning is the much harder task.
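- A minimal sketch of the distinction on a toy graph (my own illustration; `x` is a node-feature matrix and `edge_index` a COO edge list):

```python
import torch

# toy graph: 10 nodes with 4-dim features, directed edges in COO form
x = torch.randn(10, 4)
edge_index = torch.tensor([[0, 1, 2, 3, 4, 9],
                           [1, 2, 3, 4, 9, 0]])

perm = torch.randperm(x.size(0))
train_nodes, test_nodes = perm[:8], perm[8:]

# transductive: train on the full graph; only the test labels/edges are hidden
# inductive: test nodes are removed from the training graph entirely
train_mask = torch.zeros(x.size(0), dtype=torch.bool)
train_mask[train_nodes] = True
keep = train_mask[edge_index[0]] & train_mask[edge_index[1]]

new_id = torch.full((x.size(0),), -1, dtype=torch.long)
new_id[train_nodes] = torch.arange(train_nodes.numel())
inductive_x = x[train_nodes]                        # test nodes never seen in training
inductive_edge_index = new_id[edge_index[:, keep]]  # surviving edges, re-indexed
```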
- Then what is the corruption function at the core of this paper? I'm sure you're curious. It is very simple: row-wise shuffling of the node features. Keep the structure as it is and just randomly shuffle which node gets which feature row. For example, if Chef Omakase's original features came in the order flower, wind, bird, after the shuffle the model is taught them in the order bird, wind, flower.
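- In code, the row-wise shuffle is essentially a one-liner (a sketch assuming `x` is the [num_nodes, num_features] feature matrix):

```python
import torch

x = torch.randn(100, 16)  # node features: [num_nodes, num_features]
# corruption: keep the graph structure, permute which node gets which feature row
x_corrupted = x[torch.randperm(x.size(0))]
```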
- Cold-start problems, graph SSL, and the like have become hot topics, and interest in contrastive learning is rising fast. There are always side effects right next to a trend, and I think this is a paper that explains those side effects well.
Global Context Enhanced Graph Neural Networks for Session-based Recommendation
[https://arxiv.org/pdf/2106.05081.pdf]
- I brought a paper that Professor Jure Leskovec, of PyG fame, has been highlighting. The core of the paper is: "context arises in each session, and it should be embedded separately at the global level and the session level." The session graph is designed in the traditional way, ordering the user's behavior by time and connecting it into a graph. The global graph, then, is the new perspective and the main idea of the paper.
- Here is how to create the global graph: 1. List the sessions. 2. Around the action occurring at a specific point in time, cut out the actions before and after it. 3. Connect that specific action and its before/after context into a single graph. How much before/after context to include is controlled by a hyperparameter: hold it high to capture the user's long-term behavior, low for short-term, so the global graph can be shaped differently depending on the purpose.
- When training on the formed graph, four parameters are learned: 1. in, 2. out, 3. in-out, 4. self. These four can be seen as edge weights; think of it as exploiting the weighted, directed nature of the graph. By learning these four edge types, overlapping behaviors accumulate weight, and you can see at which point a behavior was an important feature.
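- Putting the two points above together, here is a minimal sketch of global-graph construction (my own illustration; `window` plays the role of the before/after context hyperparameter, and the four edge types follow the paper's naming):

```python
from collections import defaultdict

# sessions: each is a time-ordered list of item ids (toy data)
sessions = [[1, 2, 3, 4], [2, 3, 5], [3, 4, 2]]
window = 2  # hyperparameter: before/after context size (long- vs. short-term)

# collect directed co-occurrence counts within the window
before, after = defaultdict(int), defaultdict(int)
for s in sessions:
    for i, u in enumerate(s):
        for j in range(max(0, i - window), min(len(s), i + window + 1)):
            if j != i:
                (before if j < i else after)[(u, s[j])] += 1

# assign one of the four edge types: in / out / in-out / self
edge_type = {}
for pair in set(before) | set(after):
    if pair in before and pair in after:
        edge_type[pair] = "in-out"
    elif pair in before:
        edge_type[pair] = "in"
    else:
        edge_type[pair] = "out"
for s in sessions:
    for u in set(s):
        edge_type[(u, u)] = "self"

print(edge_type)
```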
- Now you might have a question; you may have noticed it already: how do we reflect the order of actions within a session? In this paper, a positional vector is simply created and concatenated. You then end up with the context of each particular session together with information about where each item sat in the session, all quantified as vectors.
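- The concatenation itself looks like this (a sketch; the paper's exact scheme, e.g. reversed positions, may differ, and the dimensions here are assumptions):

```python
import torch
import torch.nn as nn

num_items, max_len, dim = 1000, 50, 64
item_emb = nn.Embedding(num_items, dim)
pos_emb = nn.Embedding(max_len, dim)       # one vector per in-session position

session = torch.tensor([12, 7, 33, 5])     # item ids, time-ordered
positions = torch.arange(session.size(0))  # 0, 1, 2, 3
# concatenate item and positional vectors -> [session_length, 2 * dim]
h = torch.cat([item_emb(session), pos_emb(positions)], dim=-1)
```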
- Finally, five research questions are discussed, including how much the global feature affects performance and how much positional encoding affects performance. The experimental results are very interesting, so even if you are short on time, I recommend reading the Experiments section with just the points above in mind!
- Why is the user's context important in a recommender system? I think this is a good paper to read with that question in mind. For reference, if you read it alongside the Kumo recommender system mentioned in the previous post, the payoff should almost double: while understanding context, you can trace why graphs are efficient for recommendation, right up to the trend of session-based graphs.
LT-OCF: Learnable-Time ODE-based Collaborative Filtering
[https://arxiv.org/pdf/2108.06208.pdf]
- It's been a long time since I've brought a paper full of math formulas. Rather than avoiding math because it is complicated, I picked this paper because I believe that mixing engineering and mathematical techniques evenly, without being picky, helps a lot in improving one's ability. It discusses why the combination of differential equations and graph smoothness is significant in collaborative filtering (CF).
- Collaborative filtering, recommender systems, and graphs together remind you of one model: LightGCN. This paper adds the co-evolution between users and items over time to the layer combination of the famous LightGCN for prediction. Now two questions arise: 1. Why add the concept of time? 2. Time in the data is discrete, so how can it be used? The answer to the first is simple: the intent between a user and an item changes over time, so the model should capture that intent. For the second, the discrete notion of time is transformed into a differential equation, making it continuous, interpretable, and even trainable.
- A technique worth watching is the non-backpropagation training: the adjoint sensitivity method replaces backpropagation through the solver. This is possible thanks to the model's design, which replaces non-linearity with linearity. In short, the adjoint method optimizes by working with the transposed Jacobian, the matrix of partial derivatives of the scalar objective with respect to the variables, solved backwards in time. The author customizes this adjoint sensitivity method to make use of the learnable time points.
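- Here is a minimal sketch of the idea with torchdiffeq (my own simplification, not the authors' code): LightGCN-style linear propagation becomes the ODE function, embeddings are integrated over continuous time, and `odeint_adjoint` provides the adjoint-method gradients mentioned above. The learnable rate `w` and the fixed time grid are illustrative; LT-OCF itself makes the time points learnable.

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint_adjoint as odeint  # trains via the adjoint method

class LinearGCNFunc(nn.Module):
    """ODE function de/dt = w * (A_hat @ e - e): continuous, linear,
    LightGCN-style smoothing with no non-linearity."""
    def __init__(self, A_hat):
        super().__init__()
        self.register_buffer("A_hat", A_hat)
        self.w = nn.Parameter(torch.tensor(1.0))  # illustrative learnable rate

    def forward(self, t, e):
        return self.w * (self.A_hat @ e - e)

n_nodes, dim = 8, 16
A = torch.rand(n_nodes, n_nodes)
A_hat = A / A.sum(dim=1, keepdim=True)        # toy normalized adjacency
e0 = nn.Parameter(torch.randn(n_nodes, dim))  # initial user/item embeddings

t = torch.tensor([0.0, 0.5, 1.0, 2.0])        # LT-OCF learns such time points
e_t = odeint(LinearGCNFunc(A_hat), e0, t)     # shape: [len(t), n_nodes, dim]
final = e_t.mean(dim=0)                       # layer-combination analogue over time
```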
- In summary, the advantages of this paper are two: 1. Using the differential equation, the number of parameters is reduced a lot. 2. Discrete time values are used by transforming them into continuous ones. Unlike session-based modeling, which tries to reflect the user's intent through the behavior in user-item interactions, I think this paper examines how to capture the concept of time mathematically. Besides the proposed differential equation, there are experiments and results applying various other differential equations, so it is a good paper for anyone curious about mathematical techniques and time-aware recommender systems.
Graph omakase meetup event
- I participated in the Google K8s meetup in Singapore. If you look carefully at the picture above, you will find me; hint: I'm the one smiling brightly... It was a very impressive event that tried to lower the barriers to entry by letting people interact with the technology the team has built. It would be sad if it ended with just being impressed, right? So I'm going to do something.
- I'm going to host a graph-related technology meetup in Korea. The location and content are still undecided, but we will gauge the interests of those who apply to attend or present, plan it as soon as possible, and share the date with Omakase readers. Don't feel pressured, just apply! I promise it will be a valuable and fun meetup...!!
- To apply, reply to Chef Omakase's email or click the LinkedIn icon below and send a message!
If you have any questions or feedback, reach out via my LinkedIn profile: https://www.linkedin.com/in/ii-tae-jeong/
thank you!