YOU MUST EAT THIS GRAPH NEWS, Graph Omakase / 1st Week of Jan
Today’s Network Knowledge
Do you know about the measure called Load?
In complex-systems physics, Load is defined as “a topological quantification of how much a node participates in the shortest paths between all pairs of nodes.” For those interested in graph embedding, you can think of it as the amount of information a node relays between other nodes during the message passing process.
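As a quick, hands-on illustration (my own sketch, not from the newsletter's sources), load centrality can be computed directly with networkx; the synthetic power-law graph and its parameters below are just placeholders:

```python
# A minimal sketch: computing Load (load centrality) on a toy graph with networkx.
# The graph here is synthetic; in practice you would load your own edge list.
import networkx as nx

G = nx.barabasi_albert_graph(n=1000, m=3, seed=42)   # power-law-ish toy network

load = nx.load_centrality(G, normalized=True)          # Load, as defined above
btw = nx.betweenness_centrality(G, normalized=True)    # closely related measure

# Nodes with the highest load sit on the most shortest paths,
# i.e., they relay the most "information" between other node pairs.
top5 = sorted(load, key=load.get, reverse=True)[:5]
for v in top5:
    print(f"node {v}: load={load[v]:.4f}, betweenness={btw[v]:.4f}")
```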
The reason I bring this up is that papers on scalability are attracting attention these days. I usually approach scalability from the engineering side: GPU resource allocation, data compression, modeling efficiency, and so on. From the complex-networks side, however, there are also many insights into how to analyze and interpret the amount of information flowing through a network, in terms of power-law structure, load efficiency, and the like, so I wanted to share this view.
Scalable Graph Transformers for Million Nodes
[https://towardsdatascience.com/scalable-graph-transformers-for-million-nodes-2f0014ceb9d4]
The article starts by explaining the difference between a graph Transformer and a graph neural network: whether feature aggregation (message passing) is restricted to local neighborhoods or not.
A GNN aggregates features only locally, from a node's neighbors, while a graph Transformer can aggregate features from all nodes, i.e., combine local and global feature aggregation. This is said to help with over-squashing and to give more flexibility across graph tasks.
But think about it simply: you have to gather information from all the nodes, so you run into a computational-cost problem, O(N²) for the query, key, and value calculation. To address this, the article presents the idea of NodeFormer, which exploits the advantages of random feature maps and Gumbel-Softmax.
Specifically, you can say that Gumbel-Softmax is layered on top of the Performer idea; a detailed description is in the Performer paper. The key architectural idea is to reduce complexity by folding the softmax kernel trick into random feature maps (RFM).
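To make the kernel-trick idea concrete, here is a minimal, self-contained sketch of Performer-style attention with positive random feature maps; it is my own toy illustration of the general technique, not NodeFormer's actual implementation, and the dimensions are made up:

```python
# A rough sketch of Performer-style kernelized attention: softmax(QK^T)V is
# approximated with positive random feature maps, so the cost drops from
# O(N^2) to roughly O(N * m) for N nodes and m random features.
import torch

def positive_random_features(x, w):
    # x: (N, d), w: (m, d) random projection; returns phi(x) of shape (N, m)
    # phi(x) = exp(w.x - |x|^2 / 2) / sqrt(m), the positive feature map
    proj = x @ w.t()                               # (N, m)
    sq_norm = (x ** 2).sum(-1, keepdim=True) / 2   # (N, 1)
    return torch.exp(proj - sq_norm) / w.shape[0] ** 0.5

def kernelized_attention(q, k, v, num_features=64):
    d = q.shape[-1]
    w = torch.randn(num_features, d)
    q_prime = positive_random_features(q / d ** 0.25, w)   # (N, m)
    k_prime = positive_random_features(k / d ** 0.25, w)   # (N, m)
    kv = k_prime.t() @ v                     # (m, d_v): aggregate keys/values once
    normalizer = q_prime @ k_prime.sum(0)    # (N,): row-wise softmax denominator
    return (q_prime @ kv) / normalizer.unsqueeze(-1)

# Toy usage: 10k "nodes" attend to each other without a 10k x 10k attention matrix.
q = torch.randn(10_000, 32); k = torch.randn(10_000, 32); v = torch.randn(10_000, 32)
out = kernelized_attention(q, k, v)
print(out.shape)   # torch.Size([10000, 32])
```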
In addition, a relational bias and an edge regularization loss are used so that, even while extracting information in a globally adaptive manner, the model still selects the elements that matter. In short: make good use of the attention scores. That's the idea.
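For the Gumbel-Softmax ingredient mentioned above, here is a tiny standalone illustration (again my own toy, not the paper's code) of how differentiable, near-discrete selection over candidate neighbors can be done in PyTorch:

```python
# Gumbel-Softmax lets the model "sample" which candidate neighbor each node
# attends to while keeping the operation differentiable for backpropagation.
import torch
import torch.nn.functional as F

logits = torch.randn(5, 8)   # 5 nodes, 8 candidate neighbors each (made-up sizes)

soft_sample = F.gumbel_softmax(logits, tau=0.5)               # soft, differentiable weights
hard_sample = F.gumbel_softmax(logits, tau=0.5, hard=True)    # one-hot, straight-through gradient

print(soft_sample.shape, hard_sample.sum(dim=-1))   # each row of hard_sample sums to 1
```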
To understand the paper in detail you need the Transformer, the Performer, Gumbel-Softmax, and so on; the architecture rests on a lot of mathematical machinery. I recommend reading it lightly, as one of several attempts to reflect the global structure of the graph.
BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing
[https://arxiv.org/pdf/2112.08541.pdf]
This paper describes how to make feature retrieval and subgraph sampling (neighborhood sampling) GPU-efficient. Specifically, the data I/O is very large (a subgraph is loaded for each batch, and the node features of that subgraph must be loaded too), while the model is light (after node sampling, GraphSAGE has few parameters, e.g. a mean aggregator). The paper frames this huge gap, which prevents efficient computation, as the problem and proposes a method to solve it.
A cache engine (policy) is tuned to handle features efficiently. Existing systems (Pinterest, PaGraph) often keep the hottest node features in a cache. The problem is that applying this statically does not give a good cache hit ratio, so BGL tunes and applies it dynamically. The results are compared with an amortized analysis, and the interpretation of those results explains why cache engineering matters.
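As a rough mental model of why the cache policy matters (my own toy sketch, not BGL's engine), consider caching the hottest node features and tracking the hit ratio under a skewed access pattern:

```python
# A toy feature cache: keep the features of the most recently requested nodes
# resident in fast memory, and measure the hit ratio under skewed requests.
from collections import OrderedDict
import numpy as np

class NodeFeatureCache:
    def __init__(self, feature_store, capacity):
        self.store = feature_store       # big, slow store, e.g. CPU-side array
        self.capacity = capacity         # how many node features fit in fast memory
        self.cache = OrderedDict()       # node_id -> feature, in LRU order
        self.hits = self.misses = 0

    def get(self, node_id):
        if node_id in self.cache:
            self.hits += 1
            self.cache.move_to_end(node_id)       # refresh recency
            return self.cache[node_id]
        self.misses += 1
        feat = self.store[node_id]                # slow path: fetch from CPU/disk
        self.cache[node_id] = feat
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)        # evict least recently used
        return feat

    @property
    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

# Synthetic, power-law-like access pattern: a few "hot" nodes are requested often.
features = np.random.randn(10_000, 128)
cache = NodeFeatureCache(features, capacity=1_000)
requests = np.random.zipf(1.5, size=50_000) % 10_000
for nid in requests:
    cache.get(int(nid))
print(f"cache hit ratio: {cache.hit_ratio:.2f}")
```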
A graph partitioning algorithm is also proposed. Simply put, to make the CPU/GPU cross-partition traffic efficient, it performs 1. multi-level coarsening (random sampling, generated in HDFS), 2. assignment (placing the sampled nodes into blocks), and 3. an uncoarsening process that maps the blocks back to nodes. Put simply, the random sampling used for training is carried out in HDFS.
Finally, to speed up sending and receiving data, the paper also discusses resource allocation across the network, PCIe, and CPU, along with ideas for improvement.
The experiments use three frameworks as baselines: Euler, DGL, and PyG. The content is vast, and you may find it somewhat dry and hard to read because it assumes a lot of prior computer-systems knowledge. However, if you are interested in graphs and scalability, I recommend it: it is a good piece that shows real optimization know-how.
Using Graph Learning for Personalization
[https://kumo.ai/ns-newsarticle-using-graph-learning-for-personalization-how-gnns-solve-inherent-structural-issues-with-recommender-systems]
It kindly explains the advantages of applying graphs to recommender systems and how leading overseas companies apply them.
The post argues for GNNs by pointing out three problems with conventional models:
1. Models require fixed data input, which ignores the underlying relational structure in the data
2. Models are extremely sensitive to new data
3. Models are rigid, with limited adaptability to new predictions or use cases
If you are wondering why GNNs, which are getting so much attention, come up so often in recommender systems, I recommend this post. It is well written, keeping mathematical elements such as formulas to a minimum, so it should be a comfortable read.
Chartalist: Labeled Graph Datasets for UTXO and Account-based Blockchains
[https://openreview.net/pdf?id=10iA3OowAV3]
Blockchain and cryptocurrency analysis are attracting more and more interest, and blockchain has become one axis of graph applications. Because the blockchain's transparent database is, after all, a linked list, you can trace the flow of assets along the links. It is therefore natural to exploit the strength of graphs by mapping node-to-node relations onto wallet-to-wallet relations and the like, which is why the field is mainly approached with graphs.
Transaction models mainly fall into two categories: UTXO and account-based. The details are explained in the paper. The account-based model works like the bank transfers we use every day, but UTXO is somewhat unfamiliar, so you may need to study it a little.
In the blockchain space, all information is transparently published on the web. Even so, many people wonder how to approach it: which tasks to tackle, which features to use, how to obtain labels for training and inference, and where to find large datasets and applications. This paper presents good answers to those questions.
Blockchains share the character of financial networks in that they deal with sending and receiving money, so the datasets are a good fit for those who wanted to try financial network analysis but held back because of its many restrictions.
ItemSage: Learning Product Embeddings for Shopping Recommendations at Pinterest
[https://arxiv.org/pdf/2205.11728.pdf]
I'm sure you all know 'PinSage', a representative example of GNNs applied in industry. ItemSage is an idea that borrows a Transformer structure to improve the model further.
Simply put, where only image embeddings were used in the past, this idea additionally embeds text with a structure called SearchSage. The architecture combines the PinSage outputs and the SearchSage outputs so that the [CLS] token reflects the overall context, and recommendations are made through similarity between these context embeddings.
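Here is a minimal sketch of that combination step as I understand it (hypothetical module and parameter names, not Pinterest's code): image-side and text-side embedding tokens plus a learnable [CLS] token go through a small Transformer encoder, and the [CLS] output becomes the product embedding:

```python
# Toy illustration: fuse image-derived and text-derived embedding tokens with a
# Transformer encoder and read the product embedding off the [CLS] position.
import torch
import torch.nn as nn

class ProductEmbedder(nn.Module):
    def __init__(self, dim=256, num_layers=2, num_heads=4):
        super().__init__()
        self.cls = nn.Parameter(torch.randn(1, 1, dim))     # learnable [CLS] token
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, image_tokens, text_tokens):
        # image_tokens: (B, n_img, dim), e.g. PinSage-style image embeddings
        # text_tokens:  (B, n_txt, dim), e.g. SearchSage-style text embeddings
        batch = image_tokens.size(0)
        cls = self.cls.expand(batch, -1, -1)
        x = torch.cat([cls, image_tokens, text_tokens], dim=1)
        out = self.encoder(x)
        return out[:, 0]                     # [CLS] position = product embedding

model = ProductEmbedder()
img = torch.randn(8, 3, 256)   # 8 products, 3 image-derived tokens each (made up)
txt = torch.randn(8, 5, 256)   # 8 products, 5 text-derived tokens each (made up)
emb = model(img, txt)
print(emb.shape)               # torch.Size([8, 256])
# Recommendations can then be made via cosine similarity between product embeddings.
```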
What you should pay attention to: 1. the multi-task learning, 2. the hash embedding applied to deal with high-dimensional text, and 3. how user engagement is captured (in particular, how the heterogeneous context is reflected).
Is the unified model better than the previous results? If not, for what reason? If you look closely at the paper's interpretation, you will be able to draw good insights from it. 🙂
Next-item Recommendation with Sequential Hypergraphs
[https://dl.acm.org/doi/pdf/10.1145/3397271.3401133]
It has been a while since I covered a hypergraph paper. This one describes how hypergraphs are used in recommendation. Among the many possible applications, the paper stresses the importance of what happens before and after a user's action, arguing that the key is to capture the right window and time granularity, and that a hypergraph is a good way to extract that context.
The reasoning goes like this: if a graph is approached as 1:1 pair-wise edges, embedding only the context of each moment sequentially is not enough to reflect the internal context. A hypergraph, on the other hand, has hyperedges that can connect 1:N nodes, so it can reflect all the actions occurring at a given point in time as co-occurrences.
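To make the 1:N co-occurrence idea concrete, here is a toy sketch (mine, not the paper's code) that builds one hyperedge per (user, time window) and the resulting item-by-hyperedge incidence matrix:

```python
# Build hyperedges from an interaction log: all items a user touches within one
# time window land on the same hyperedge, instead of only pair-wise edges.
import numpy as np

# (user, item, timestamp) log; values are made up for illustration
interactions = [
    ("u1", "i1", 10), ("u1", "i2", 12), ("u1", "i3", 14),   # one session for u1
    ("u1", "i4", 95),                                       # a later session for u1
    ("u2", "i2", 11), ("u2", "i5", 13),
]
window = 30   # time granularity: interactions within 30 units share a hyperedge

# Group interactions into hyperedges keyed by (user, time bucket)
hyperedges = {}
for user, item, ts in interactions:
    key = (user, ts // window)
    hyperedges.setdefault(key, set()).add(item)

items = sorted({item for _, item, _ in interactions})
item_idx = {it: i for i, it in enumerate(items)}

# Incidence matrix H: rows = items (nodes), columns = hyperedges
H = np.zeros((len(items), len(hyperedges)), dtype=int)
for col, members in enumerate(hyperedges.values()):
    for it in members:
        H[item_idx[it], col] = 1

print(items)
print(H)   # each column marks the 1:N set of items that co-occur in one window
```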
Sequential recommendation is everywhere these days. Industry demand seems strong enough that it has established itself as a recurring task in the WSDM and WWW challenges. Intuitively, dynamic embeddings should beat static embeddings because a user's context keeps changing, but if you look at the leaderboards the top methodologies all use static embeddings (LINE, graph Transformers, etc.), so there is still a lot of room for improvement. How about pioneering it?
## Graph Omakase Business Trip Service Guide ##
Over the past several weeks I've shared a lot about hypergraphs. "Will this actually be useful?" I'm sure some of you are wondering. If you're curious where hypergraphs can help with analysis or inference in the field,
or if you don't know where to start with code when using PyG for the first time,
then reply via LinkedIn message or e-mail and we will help you prototype on your data. 🙂 Since every dataset is different, sending along the data definition will make the prototyping more detailed and higher quality!
The contact point is here [https://www.linkedin.com/in/ii-tae-jeong/]. Thanks. :)