YOU MUST EAT THIS GRAPH NEWS, GRAPH OMAKASE. Week 3 of October
A Complete Expressiveness Hierarchy for Subgraph GNNs via Subgraph Weisfeiler-Lehman Tests
[https://arxiv.org/pdf/2302.07090.pdf]
Expressive power is a factor that is always emphasized when discussing the pros and cons of graph models. Even graphs with the same number of nodes and the same number of edges can be different graphs depending on how those nodes are connected. Understanding “how they are connected” and expressing it in numbers is the key to expressive power, which is usually evaluated through isomorphism tests.
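To make that yardstick concrete, here is a minimal sketch of the classic 1-WL color-refinement test, the baseline that subgraph GNNs try to surpass. The function name and the example pair are illustrative, not taken from the paper.

```python
# Minimal 1-WL color refinement: iteratively rehash each node's color
# together with the multiset of its neighbors' colors.
import networkx as nx

def wl_color_histogram(G, rounds=3):
    colors = {v: 0 for v in G.nodes}  # every node starts with the same color
    for _ in range(rounds):
        colors = {
            v: hash((colors[v], tuple(sorted(colors[u] for u in G.neighbors(v)))))
            for v in G.nodes
        }
    return sorted(colors.values())

# A 6-cycle and two disjoint triangles are non-isomorphic but both
# 2-regular, so 1-WL cannot tell them apart.
G1 = nx.cycle_graph(6)
G2 = nx.disjoint_union(nx.cycle_graph(3), nx.cycle_graph(3))
print(wl_color_histogram(G1) == wl_color_histogram(G2))  # True -> 1-WL fails
```

Failures like this one are exactly what motivates the subgraph variants of the WL test that the paper organizes into a hierarchy.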
While it would be ideal to compare all graphs exhaustively, enumerating every possible case and computing eigenvalues for each, in practice this rarely happens, since we need the highest efficiency from limited resources. Several attempts are therefore underway to squeeze out that efficiency, and this paper approaches expressive power by focusing on the ‘layers’ of design choices among them.
A Subgraph GNN works in three steps: 1. subgraph generation policy, 2. message-passing aggregation scheme, 3. final pooling scheme. When extracting subgraphs there are many options, such as taking all subgraphs, excluding certain nodes, or taking the ego-network within a certain number of hops; two of these policies are sketched below.
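As a hedged illustration of step 1, here are two common generation policies, node deletion and k-hop ego-networks, sketched with NetworkX; the helper names are mine, not the paper’s.

```python
import networkx as nx

def node_deleted_subgraphs(G):
    # One subgraph per node, with that node removed.
    return [G.subgraph(set(G.nodes) - {v}).copy() for v in G.nodes]

def ego_subgraphs(G, k=2):
    # One k-hop ego-network per node.
    return [nx.ego_graph(G, v, radius=k) for v in G.nodes]

G = nx.karate_club_graph()
print(len(ego_subgraphs(G, k=1)), "subgraphs in the bag")
```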
There are also many choices for aggregation, such as exchanging information between extracted subgraphs, between connected nodes only, between certain nodes only, or aggregating over all subgraphs at once. It takes a lot of scenarios and experimentation to determine which one is right for your situation. After all this complexity, the data is finally represented as numbers for us to utilize; the toy forward pass below shows where each choice plugs in.
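Putting the three steps together, here is a deliberately toy forward pass: degree-based features, mean aggregation within each subgraph, mean aggregation of each node’s copies across subgraphs, and a final mean pooling. None of this is the paper’s exact scheme; it only shows where each design choice sits in the pipeline.

```python
import numpy as np
import networkx as nx

def toy_subgraph_gnn(G, subgraphs):
    n = G.number_of_nodes()
    reps, counts = np.zeros(n), np.zeros(n)
    for S in subgraphs:
        for v in S.nodes:
            nbrs = list(S.neighbors(v))
            # step 2: message passing inside one subgraph (mean of neighbor degrees)
            h = np.mean([S.degree(u) for u in nbrs]) if nbrs else 0.0
            reps[v] += h
            counts[v] += 1
    node_emb = reps / np.maximum(counts, 1)  # aggregate each node's copies across subgraphs
    return node_emb.mean()                   # step 3: final pooling to a graph embedding

G = nx.karate_club_graph()
bag = [nx.ego_graph(G, v, radius=1) for v in G.nodes]  # step 1: generation policy
print(toy_subgraph_gnn(G, bag))
```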
Determining what is reasonable and efficient among all these options is a complex problem. The authors ran into exactly this and came up with the idea of the paper: introducing a hierarchy into the process of deriving characteristics between subgraphs. For example, if a local aggregation is more powerful than a global one, a logical relation places them on different tiers of the hierarchy. Defining tiers this way makes distinct expressiveness classes emerge, and the authors use these classes to analyze which graphs each scheme can tell apart under isomorphism testing.
The authors use the ‘pebble game’ tool to validate the expressive power of the algorithms. Apart from the performance of the algorithms, the way this tool works is very interesting: it generates different scenarios by adding or removing pebbles (in the paper, through an interaction between the Spoiler and Duplicator players) and examines whether the algorithms can distinguish them.
Subgraph GNNs are mainly used for molecular structure analysis and inference in biology, but as the use of graphs keeps expanding I think they will prove useful in many fields, including finance and manufacturing. I recommend this paper to anyone interested in, and curious about, deriving value from subgraphs.
From Hypergraph Energy Functions to Hypergraph Neural Networks
Energy functions, something I haven’t seen in a long time; not since I was studying Boltzmann machines. This paper exploits the explicitly parameterized nature of the energy function: the idea is to use energy minimization efficiently for node embedding, and the authors use the energy function to improve performance on the hypergraph node classification task. While this sounds easy enough, there is a lot of complexity inside the algorithm; a toy sketch of the core idea follows.
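As a rough sketch of the idea, consider an energy with a feature-fidelity term plus a hyperedge smoothness regularizer, minimized by plain gradient descent. The paper’s actual energy, parameterization, and optimizer are more elaborate; everything below is illustrative.

```python
import numpy as np

def minimize_hypergraph_energy(X0, hyperedges, lam=1.0, lr=0.05, steps=200):
    # E(X) = ||X - X0||^2 + lam * sum_e sum_{u,v in e} ||x_u - x_v||^2
    # The first term keeps embeddings close to the input features; the
    # second pulls together nodes that share a hyperedge.
    X = X0.copy()
    for _ in range(steps):
        grad = 2 * (X - X0)
        for e in hyperedges:
            idx = list(e)
            mean_e = X[idx].mean(axis=0)
            # gradient of the pairwise smoothness term over hyperedge e
            grad[idx] += 2 * lam * len(idx) * (X[idx] - mean_e)
        X -= lr * grad
    return X

X0 = np.random.randn(5, 2)           # 5 nodes, 2-d input features
hyperedges = [{0, 1, 2}, {2, 3, 4}]  # two hyperedges sharing node 2
X = minimize_hypergraph_energy(X0, hyperedges)
```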
You must design an appropriate energy function for the hypergraph, then design the regularization term, which is the core of that energy function, and then decide how to optimize based on the obtained values. Unusually, the paper also mentions star expansion and clique expansion to account for the case where hyperedges have no features; the key is to use the information generated by transforming graph ↔ hypergraph, as sketched after this paragraph. The process is very interesting and I recommend taking a look!
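For the expansions, a minimal sketch, assuming integer node ids: clique expansion replaces each hyperedge with a clique over its members, while star expansion adds one auxiliary node per hyperedge (where hyperedge features could live, when they exist). Helper names are illustrative.

```python
import networkx as nx
from itertools import combinations

def clique_expansion(n_nodes, hyperedges):
    # Each hyperedge becomes a clique over its member nodes.
    G = nx.Graph()
    G.add_nodes_from(range(n_nodes))
    for e in hyperedges:
        G.add_edges_from(combinations(e, 2))
    return G

def star_expansion(n_nodes, hyperedges):
    # Each hyperedge becomes a new "edge node" linked to its members,
    # yielding a bipartite graph.
    B = nx.Graph()
    B.add_nodes_from(range(n_nodes), bipartite=0)
    for k, e in enumerate(hyperedges):
        edge_node = ("e", k)
        B.add_node(edge_node, bipartite=1)
        B.add_edges_from((edge_node, v) for v in e)
    return B

hyperedges = [{0, 1, 2}, {2, 3, 4}]
G = clique_expansion(5, hyperedges)  # 5 nodes, two triangle cliques
B = star_expansion(5, hyperedges)    # 5 + 2 nodes, bipartite
```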
CAN LLMS EFFECTIVELY LEVERAGE GRAPH STRUCTURAL INFORMATION: WHEN AND WHY
[https://arxiv.org/pdf/2309.16595.pdf]
LLMs are such a hot topic in the artificial intelligence scene these days that, as the joke goes, if you don’t know LLMs you must be a spy. Graph technology is a great complement to this trend. I’ve heard people ask, “Why LLMs with graphs?” or say, “LLMs and graphs? It didn’t go so well when I tried incorporating graphs.” If you’re getting the cold shoulder, or having doubts like those above, this paper is not the definitive answer, but I think it will help a bit.
As the title suggests, the paper is centered around the questions “When?” and “Why?”. It does a good job of addressing my curiosity about combining topology and LLMs by considering the prompt styles of zero-shot, few-shot, and CoT together with the structural characteristics of graphs. In particular, I was impressed by the way it utilizes homophily. I came across this paper after seeing, in the graph deep learning open chat room, the results of a RAG experiment that mentioned link prediction in the prompt, and it made me wonder whether incorporating graph tasks into the prompt could be even more powerful; a toy prompt-construction sketch follows.
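For intuition, here is a hypothetical prompt builder that verbalizes a node’s neighborhood, its neighbors’ texts and any known labels, before asking for a classification. The wording and function names are my own, not the paper’s templates.

```python
import networkx as nx

def build_node_prompt(G, texts, labels, target, max_neighbors=5):
    # Serialize the target node's text plus its graph neighborhood
    # into a plain-text prompt for an LLM.
    lines = ["Classify the following node.",
             f"Node text: {texts[target]}",
             "Neighbors:"]
    for u in list(G.neighbors(target))[:max_neighbors]:
        lines.append(f"- (label: {labels.get(u, 'unknown')}) {texts[u]}")
    lines.append("Answer with a single label:")
    return "\n".join(lines)

G = nx.Graph([(0, 1), (0, 2)])
texts = {0: "GNN expressiveness paper", 1: "WL test survey", 2: "LLM prompting study"}
labels = {1: "theory", 2: "nlp"}
print(build_node_prompt(G, texts, labels, target=0))
```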
You may have heard that graphs help LLMs, but what exactly do they contribute? This paper is especially recommended for those studying how to bring this trend into graphs. I also recommend it to those struggling with hallucinations in industry settings, as I think it offers insights that can broaden your horizons toward considering additional structural characteristics!