Semantic search and its supplement, 'graph-based prompting'

Jeong ii tae
6 min read · Mar 24, 2024

--

Graph Neural Prompting with Large Language Models

Paper : https://arxiv.org/abs/2309.15427

The paper presents a well-implemented version of the architecture that Graph-RAG aims for, and convincingly demonstrates its utility through experiments.

Before delving into RAG (Retrieval Augmented Generation), let’s first touch on the topic of search. The quality of search is crucial in RAG because the outcome of the generation depends on what is retrieved. Among various search technologies, let’s discuss semantic search, which RAG utilizes.

What is semantic search? According to Elastic, semantic search is search engine technology that interprets the meaning of words and phrases. Unlike regular-expression search, which fetches exact matches of predefined patterns, or lexical search, which retrieves content containing the query terms, semantic search returns content that matches the intent and meaning of the query.

Why return content that matches intent and meaning rather than exact text? Because real-world queries are often imprecise. If a user searches for "good restaurants," an exact-match search looks for the literal phrase "good restaurants" in the database and may return nothing if that phrase never appears. Semantic search instead interprets the intent as looking for restaurants with good food, and returns relevant, semantically matched results. The proverb "even if you speak incoherently, it's understood perfectly" aptly captures the essence of this technology. Accurately deriving what the user actually intended is thus a critical part of user interaction on platforms across industries.
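To make the contrast concrete, here is a minimal sketch of lexical versus semantic retrieval, assuming the sentence-transformers library and the all-MiniLM-L6-v2 model (any small embedding model would behave similarly); the documents and query are made up for illustration:

```python
from sentence_transformers import SentenceTransformer, util

# Hypothetical documents; none contains the literal phrase "good restaurants".
docs = [
    "A cozy bistro praised for its outstanding pasta and friendly staff.",
    "Hardware store selling tools, paint, and lumber.",
    "Highly rated sushi spot known for fresh fish and short waits.",
]
query = "good restaurants"

# Lexical search: exact substring matching returns nothing here.
lexical_hits = [d for d in docs if query in d.lower()]
print(lexical_hits)  # [] -- the literal phrase never appears

# Semantic search: embed query and documents, rank by cosine similarity.
model = SentenceTransformer("all-MiniLM-L6-v2")
doc_emb = model.encode(docs, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)
scores = util.cos_sim(query_emb, doc_emb)[0]
print(docs[scores.argmax().item()])  # the restaurant entries far outscore the hardware store
```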

  • RAG (Retrieval-Augmented Generation) likewise uses semantic search to understand the intent embedded in the user's query and to decide 'what to retrieve' during the retrieval phase. However, converting user intent and queries into numerical representations for semantic search introduces drawbacks of its own. One is that data can look numerically similar to the user's intent while being quite different when read as text.
  • A concrete example illustrates the issue. Embedding eight Korean sentences that express liking or disliking shows how they land in the embedding space. In the left red box (①), the sentences "I dislike you," "I like you," and "I love you" occupy a similar region, intriguingly split between love and dislike around the concept of liking.
  • In the right yellow box (②), sentence 6, which expresses love, sits right next to sentence 7, which expresses not love but a feeling grounded in friendship.
  • Is there a clear way to analyze the cause of these phenomena? I believe not, and I think there is still much room for research here.
  • So while semantic search has the advantage of retrieving content whose meaning is numerically similar to the query, it has the drawback that meaning can be distorted by noise introduced during the numerical conversion (see the sketch after this list).
  • How can we compensate for this? The answer is knowledge graphs. By designing the semantics between data points and building an ontology that expresses them as a graph, a knowledge graph captures the actual 'relations' between data points. This ability to hold and exploit factual information is the core value of knowledge graphs, often referred to as knowledge graph reasoning.
  • Today's paper shows how conveying that factual information to an LLM through a knowledge graph improves its performance.
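Here is a minimal sketch of the distortion problem described above, again assuming sentence-transformers; the exact similarity numbers depend on the model, but opposite-sentiment pairs with near-identical surface forms routinely score as highly similar:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Opposite sentiments, nearly identical surface form.
a = model.encode("I like you", convert_to_tensor=True)
b = model.encode("I dislike you", convert_to_tensor=True)
c = model.encode("The invoice is due on Friday", convert_to_tensor=True)

print(util.cos_sim(a, b).item())  # typically high: 'like' and 'dislike' sit close
print(util.cos_sim(a, c).item())  # much lower: unrelated topic
```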

How Graph Neural Prompting Works

1. Subgraph Retrieval

To effectively utilize a knowledge graph, one must first retrieve a relevant subgraph. The user's query is transformed into numerical form and related entities are fetched from the graph: through an entity linking task, the system matches the query against knowledge graph nodes, selects the relevant entities, and gathers the two-hop neighbors around them.
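A minimal sketch of this retrieval step, assuming a networkx graph and pretending that entity linking has already mapped the query to the entity "restaurant" (the toy triples are invented for illustration):

```python
import networkx as nx

def retrieve_subgraph(kg: nx.Graph, query_entities: list[str], hops: int = 2) -> nx.Graph:
    """Union of the k-hop neighborhoods around every entity linked to the query."""
    nodes = set()
    for ent in query_entities:
        if ent in kg:
            # ego_graph collects all nodes within `hops` edges of the entity
            nodes |= set(nx.ego_graph(kg, ent, radius=hops).nodes)
    return kg.subgraph(nodes).copy()

# Toy KG; in practice the triples come from a real knowledge graph.
kg = nx.Graph()
kg.add_edges_from([("restaurant", "food"), ("food", "taste"),
                   ("taste", "review"), ("restaurant", "location")])

sub = retrieve_subgraph(kg, ["restaurant"], hops=2)
print(sub.nodes)  # restaurant, food, location, taste (review is three hops away)
```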

2. Graph Neural Prompting

This stage refines the retrieved subgraph and integrates textual and graph information. The key is to judge which information within the data actually matters and to remove noise that would hurt the result.

1) GNN Encoder
Using pre-trained entity embeddings, the system first represents the entities numerically. It then employs a graph attention network to encode the retrieved two-hop subgraph into node embeddings.
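A minimal sketch of such an encoder, assuming PyTorch Geometric; GATConv is one standard way to express a graph attention network, and the dimensions here are placeholders:

```python
import torch
from torch_geometric.nn import GATConv

class GNNEncoder(torch.nn.Module):
    """Two GAT layers over pre-trained entity embeddings (one per hop of the subgraph)."""
    def __init__(self, in_dim: int, hid_dim: int, heads: int = 4):
        super().__init__()
        self.conv1 = GATConv(in_dim, hid_dim, heads=heads)
        self.conv2 = GATConv(hid_dim * heads, hid_dim, heads=1)

    def forward(self, x, edge_index):
        # x: [num_nodes, in_dim] pre-trained entity embeddings
        x = torch.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)  # [num_nodes, hid_dim] node embeddings

# Toy subgraph: 4 nodes with 128-d pre-trained embeddings.
x = torch.randn(4, 128)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])
node_emb = GNNEncoder(128, 64)(x, edge_index)
print(node_emb.shape)  # torch.Size([4, 64])
```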

2) Cross Modality Pooling
This step identifies which nodes in the graph are relevant to the user's query. Attention is computed between the node embeddings produced by the GNN encoder and the text embeddings derived from the user's query.

The process takes an inner product between the attention-weighted graph information and the text information, which struck me as a somewhat unconventional design.

Simplifying: each node embedding is treated as a query, and the text embeddings as keys and values; relevance weights are extracted via the softmax function, and average pooling then compiles the text-aware node values into a single refined graph embedding.
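A minimal sketch of this pooling step in plain PyTorch, under the reading above (nodes as queries, text tokens as keys and values); the shapes are placeholders:

```python
import torch
import torch.nn.functional as F

def cross_modality_pool(node_emb: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
    """node_emb: [N, d] GNN outputs; text_emb: [T, d] query token embeddings."""
    # Scaled dot-product: each node attends over the text tokens.
    scores = node_emb @ text_emb.T / text_emb.shape[-1] ** 0.5   # [N, T]
    attn = F.softmax(scores, dim=-1)                             # text relevance per node
    text_aware_nodes = attn @ text_emb                           # [N, d]
    # Average pooling compiles the text-aware node values into one graph embedding.
    return text_aware_nodes.mean(dim=0)                          # [d]

graph_emb = cross_modality_pool(torch.randn(4, 64), torch.randn(7, 64))
print(graph_emb.shape)  # torch.Size([64])
```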

3) Domain Projector
The final step before feeding the embeddings to the LLM is to project them, through a feed-forward neural network, to the same dimension as the user-query (text) embeddings.
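A minimal sketch, with hypothetical dimensions (4096 standing in for the LLM's hidden size):

```python
import torch
import torch.nn as nn

# Feed-forward projector mapping the 64-d graph embedding into the LLM's
# token-embedding space (4096 is a placeholder hidden size).
domain_projector = nn.Sequential(
    nn.Linear(64, 1024),
    nn.GELU(),
    nn.Linear(1024, 4096),
)

graph_prompt = domain_projector(torch.randn(64))
print(graph_prompt.shape)  # torch.Size([4096]) -- matches the text embedding width
```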

4) Self-supervised Link Prediction
To make the knowledge graph transfer to downstream target data beyond a specific dataset, the system runs a self-supervised link prediction task using the DistMult score function from knowledge graph embedding: existing links are treated as positive labels and randomly sampled unlinked entity pairs as negative labels.
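A minimal sketch of the DistMult objective in PyTorch; the embeddings are random placeholders where, in the paper's setting, they would come from the entity and relation embedding tables:

```python
import torch

def distmult_score(h: torch.Tensor, r: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """DistMult: score(h, r, t) = sum(e_h * w_r * e_t); higher = more plausible link."""
    return (h * r * t).sum(dim=-1)

# Hypothetical (head, relation) pair with a real tail and a corrupted tail.
h, r = torch.randn(64), torch.randn(64)
t_pos = torch.randn(64)            # an edge that exists in the KG  -> label 1
t_neg = torch.randn(64)            # a randomly sampled non-edge    -> label 0

scores = torch.stack([distmult_score(h, r, t_pos), distmult_score(h, r, t_neg)])
labels = torch.tensor([1.0, 0.0])
loss = torch.nn.functional.binary_cross_entropy_with_logits(scores, labels)
print(loss.item())  # self-supervised link-prediction objective
```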

3. Prompting LLMs for Question Answering

Combining the graph prompt with the text embedding, the system performs generation with the soft prompt as a trainable parameter. The difference between soft and hard prompts lies in their form: a soft prompt is a learned continuous embedding that implicitly nudges the LLM, while a hard prompt instructs it explicitly in text.
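A minimal sketch of how a soft prompt enters the model, using GPT-2 from Hugging Face transformers as a small stand-in; the graph prompt here is a random placeholder where GNP would supply the projected graph embedding:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")   # stand-in for the paper's LLM
tok = AutoTokenizer.from_pretrained("gpt2")

ids = tok("Which place serves good food?", return_tensors="pt").input_ids
text_emb = model.get_input_embeddings()(ids)            # [1, T, 768]

# Soft prompt: one pseudo-token of the model's hidden size, prepended to the
# token embeddings instead of being written out as text.
graph_prompt = torch.randn(1, 1, 768, requires_grad=True)
inputs = torch.cat([graph_prompt, text_emb], dim=1)     # [1, 1 + T, 768]

out = model(inputs_embeds=inputs)                       # LLM can stay frozen; only
print(out.logits.shape)                                 # the soft prompt is trained
```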

Experimental Results

As shown in Figure 1, utilizing graph information improves performance regardless of whether the LLM is fine-tuned. Notably, in the LLM-frozen setting, an improvement of nearly 16% over conventional prompting (hard prompts) was observed. Given how quickly NLP foundation models develop, with new models often outperforming fine-tuned predecessors, the combination of RAG and prompt engineering, especially graph-based prompting, promises significant advances in the field.

  • The results presented in Figure 4 are particularly intriguing. With an 11B LLM, performance on commonsense reasoning tasks improves as more of the Cross Modality Layer, which combines text and graph information, is used.
  • Conversely, in the biomedical domain, performance decreases. This suggests that while graph information clearly benefits commonsense domains, it can act as noise in the biomedical domain.
  • My reading is that some of the data's inherent information is lost during the repeated dimension reduction and expansion through the various feed-forward networks (FFNs).
  • Unfortunately, the paper does not examine this aspect in detail, which was a disappointment. Trying to access the code via the provided GitHub link is another letdown: intellectual property issues block it, preventing a deeper look at how the idea was implemented. This indirectly suggests that Amazon may be using the idea in actual products.
  • As GraphRAG gains application across domains, discussions of its uses and expected limitations are becoming increasingly common.
  • The primary topics of these discussions are speed, scalability, and knowledge graph design. How these three challenges are addressed will likely decide whether the technique is applicable in real-world products. I hope everyone had a productive week, and I wish you a week of good fortune and growth ahead.

Contact Info

Gmail : jeongiitae6@gmail.com

Linkedin : https://www.linkedin.com/in/ii-tae-jeong/

--

Jeong ii tae

Linkedin : jeongiitae / I'm a graph and network data enthusiast. I'm always thinking about how graph data can be useful in the real world.