YOU MUST EAT THIS GRAPH NEWS: GRAPH OMAKASE, Mid-April Edition

Jeong Yitae
Apr 16, 2023 · 7 min read


Neural Graph Databases

[https://towardsdatascience.com/neural-graph-databases-cc35c9e1d04f]

Completeness: query engines in classical graph DBs assume that the stored graph is complete.

Preliminary

Why are knowledge graphs useful for question answering?

Structured representation: knowledge graphs represent information in a structured way, using nodes for entities and edges for the relationships between them. This makes it easy to query the graph and find relevant information.

Semantic relationships: because knowledge graphs capture the relationships between entities, they can answer questions that require understanding those semantic relationships. For example, a knowledge graph can answer a question like “Who acted in a movie directed by Steven Spielberg?”

Multilingual support: knowledge graphs can be built in multiple languages, so the same graph can back question answering in different languages. This is particularly useful for multilingual question answering systems.

Scalability: knowledge graphs can grow to include many entities and relationships, which helps answer complex questions that require knowledge from multiple domains.
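The benefits above can be made concrete with a toy triple store. This is a minimal sketch: the triples and the `query` helper are illustrative, not from any particular graph database.

```python
# A toy knowledge graph as (head, relation, tail) triples.
triples = {
    ("Steven Spielberg", "directed", "Jurassic Park"),
    ("Steven Spielberg", "directed", "Jaws"),
    ("Sam Neill", "acted_in", "Jurassic Park"),
    ("Roy Scheider", "acted_in", "Jaws"),
}

def query(rel, tail=None, head=None):
    """Return the heads matching a (head?, rel, tail?) pattern."""
    return {h for (h, r, t) in triples
            if r == rel
            and (tail is None or t == tail)
            and (head is None or h == head)}

# "Who acted in a movie directed by Steven Spielberg?"
# Step 1: follow the 'directed' edges to find the movies.
movies = {t for (h, r, t) in triples
          if h == "Steven Spielberg" and r == "directed"}
# Step 2: follow the 'acted_in' edges backwards to find the actors.
actors = set().union(*(query("acted_in", tail=m) for m in movies))
print(sorted(actors))  # ['Roy Scheider', 'Sam Neill']
```

The two-hop traversal (director → movie → actor) is exactly the kind of semantic-relationship query the list above describes.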

Completeness & Incompleteness

In the context of a knowledge graph, “complete” and “incomplete” refer to the extent to which the knowledge graph represents all information related to a particular domain or topic.

A “complete” knowledge graph includes all relevant information about a topic or domain and all relationships between different entities in the domain. This means that the knowledge graph can answer any questions related to that topic or domain.

On the other hand, an “incomplete” knowledge graph may be missing information or may not capture all relationships between entities. This means the graph may fail to answer certain questions, or may give incomplete or inaccurate answers.

In summary, completeness refers to the degree to which a knowledge graph represents all relevant information and relationships in a particular area or topic, while incompleteness refers to the degree to which the knowledge graph misses important information or relationships.
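Concretely, under a closed-world assumption a fact missing from an incomplete graph is treated as false rather than unknown, which is exactly how wrong answers arise. A minimal sketch with illustrative triples:

```python
# Ground truth: Jaws was directed by Steven Spielberg.
complete = {("Steven Spielberg", "directed", "Jaws")}
incomplete = set()  # the same graph, but the edge was never ingested

def holds(graph, triple):
    # Closed-world assumption: anything not stored is treated as false.
    return triple in graph

q = ("Steven Spielberg", "directed", "Jaws")
print(holds(complete, q))    # True
print(holds(incomplete, q))  # False -- the missing fact is denied, not "unknown"
```

The incomplete graph does not say “I don't know”; it confidently answers incorrectly, which is the failure mode the paragraph above describes.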

Summary

Consider the ingredients: a knowledge base that textualizes abstract knowledge; a knowledge graph that converts it into graph form to manage vast amounts of knowledge systematically; a graph database suited to storing such graphs; and graph deep learning, which quantifies graph data so computers can process it more easily. Have you ever imagined what would happen if all of these elements were combined? That is the Neural Graph Database (NGDB), a combination of exactly those elements.

What comes to mind when you think of a knowledge graph? It contains a lot of knowledge, so surely it must be useful — that is the natural assumption. In practice, however, knowledge graphs are riddled with missing nodes and relationships, so data consistency and validity are lacking. In other words, even with a lot of data, it can turn out to be pie in the sky when you actually try to use it.

Going further, even if you load such a graph into a graph database, you cannot expect high-quality query results. So why not use graph deep learning to compensate for this “emptiness”? The article and paper frame the missing edges as a link prediction task and describe the synergy of combining graph database join indexing with graph machine learning.

Insight

A database system that can answer high-order questions. I expect more research on knowledge graph embeddings, and with it plenty of discussion about how to weight triples in (h, r, t) form and where the closed-world assumption is appropriate. In other words, there is a lot of room for research and improvement, so I think of it as a blue ocean.
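The (h, r, t) framing maps directly onto embedding-based link prediction. Below is a minimal TransE-style sketch; the embeddings are random stand-ins rather than trained vectors, and the entity names are illustrative.

```python
import math
import random

random.seed(0)
DIM = 8
entities = ["Spielberg", "Jaws", "Jurassic Park"]
relations = ["directed"]

# Random stand-ins for learned embeddings.
emb = {name: [random.gauss(0, 1) for _ in range(DIM)] for name in entities}
rel = {name: [random.gauss(0, 1) for _ in range(DIM)] for name in relations}

def transe_score(h, r, t):
    # TransE: a plausible triple should satisfy h + r ≈ t,
    # so a lower distance means a more plausible link.
    return math.sqrt(sum((eh + er - et) ** 2
                         for eh, er, et in zip(emb[h], rel[r], emb[t])))

# Rank candidate tails for the query (Spielberg, directed, ?).
candidates = ["Jaws", "Jurassic Park"]
ranked = sorted(candidates, key=lambda t: transe_score("Spielberg", "directed", t))
```

With trained embeddings, the lowest-distance candidates become the predicted missing edges — the “empty” links an NGDB would fill in at query time.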

ChatGPT versus Traditional Question Answering for Knowledge Graphs: Current Status and Future Directions Towards Knowledge Graph Chatbots

[https://arxiv.org/pdf/2302.06466.pdf]

Summary

How well does ChatGPT perform? And does knowledge-graph-based QA still hold up? This is a paper that answers those questions. Before starting research, I often skim survey papers with figures and trends to get a feel for a field while planning. Since this paper was written as a survey, I think it will be of great help to anyone who wants to combine ChatGPT with knowledge graphs.

Insight

The time when knowledge graphs can shine is coming. I think extracting “valuable” information from a model trained on massive amounts of data (such as ChatGPT) requires a clear purpose and direction, and the knowledge graph is the best tool to supply both.

CPU Affinity for PyG Workloads

[https://pytorch-geometric.readthedocs.io/en/latest/advanced/cpu_affinity.html]

Preliminary

Affinity mask is a term used in computer science that refers to a bitmask that is used to associate a set of resources or processing units with a particular task or process. In the context of operating systems, an affinity mask is used to specify which processors or cores are eligible to execute a particular thread or process.
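As a small illustration, the bitmask 0b0101 marks CPUs 0 and 2 as eligible. Decoding and applying such a mask in Python might look like this; `os.sched_setaffinity` is Linux-specific, hence the guards.

```python
import os

def mask_to_cpus(mask: int) -> set:
    """Decode an affinity bitmask: if bit i is set, CPU i is eligible."""
    return {i for i in range(mask.bit_length()) if mask & (1 << i)}

cpus = mask_to_cpus(0b0101)
print(cpus)  # {0, 2}

# On Linux, pin the current process to those CPUs (only if they are
# actually available to this process).
if hasattr(os, "sched_setaffinity") and cpus <= os.sched_getaffinity(0):
    os.sched_setaffinity(0, cpus)  # pid 0 = the current process
```

The scheduler will then run the process only on the CPUs whose bits are set in the mask.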

Summary

The GPU, long considered a necessity for deep learning, may not actually be one. Let’s take a look at how graph deep learning can be done on the CPU. The key is to improve efficiency by managing cores and memory through execution binding and memory binding.

Efficiency is improved by minimizing core stalls and memory-bound time. A core stall occurs when a processor core cannot continue executing an instruction because it is waiting for data from memory. Since core stalls ultimately make the workload memory bound, the approach is to allocate the workers needed for memory caching well and optimize that burden. Rather than attacking computation itself, such as the backward-pass backpropagation, this is about how the DataLoader fetches data and feeds it into the model.
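Under that reading, the idea is to pin each DataLoader worker to its own slice of cores so that data fetching does not contend with model execution. Here is a pure-Python sketch of the bookkeeping; the core counts and the `worker_init_fn` wiring are assumptions for illustration, not PyG's actual implementation.

```python
import os

def worker_cores(worker_id: int, cores_per_worker: int, first_core: int = 0) -> set:
    """Give each DataLoader worker its own contiguous slice of cores."""
    start = first_core + worker_id * cores_per_worker
    return set(range(start, start + cores_per_worker))

def pin_worker(worker_id: int) -> None:
    """Intended as a DataLoader worker_init_fn: pin this worker process."""
    cores = worker_cores(worker_id, cores_per_worker=2)
    # Linux-only; skip silently if the cores are not available to us.
    if hasattr(os, "sched_setaffinity") and cores <= os.sched_getaffinity(0):
        os.sched_setaffinity(0, cores)  # pid 0 = this worker process

# Worker 0 -> cores {0, 1}, worker 1 -> cores {2, 3}, ...
print(worker_cores(1, 2))  # {2, 3}
```

With PyTorch, you would pass `pin_worker` as the DataLoader’s `worker_init_fn` and leave the remaining cores free for the main compute process.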

Insight

Many databases implement a feature store and a graph store. Of course it is best to have your own database, but that is not always possible. If you design a pipeline that loads data efficiently through the CPU and performs GPU operations with the remaining resources, I think you end up with a more optimized architecture.

Choose Your Weapon: Survival Strategies for Depressed AI Academics

[https://arxiv.org/pdf/2304.06035.pdf]

“No free lunch” is a phrase often heard in the AI field: there is no guarantee that a model optimized for one dataset is also optimal for another. That theorem is at the heart of the story. One attempt to overcome it is now emerging: large-model training, a paradigm that collects and learns from as much data as possible for the sake of generalization. The formula that follows is “the more data, the better the model performance.” Of course, the data must also be high quality — but the more parameters a model has, the more quality data it can absorb. In the end, you need “resources” that can cook a lot of data well, where resources mean GPUs, data stores, CPUs, and so on.

If you compare the “resources” available to individuals or academia with those available to industry, it is obvious that industry’s resources are richer. So how can we use our resources efficiently and still make contributions from an individual or academic standpoint? That is the question, and this is a paper that answers it systematically. I recommend it to researchers who are unsure where to start their research.

In addition, reading this paper alongside [https://www.cs197.seas.harvard.edu/] will double the effect, so I recommend it if you have time.

Pseudo Lab gathering with HP

[https://pseudo-lab.com/]

After work on a weekday, I took part in the Data Science Social Gathering hosted by Pseudo Lab and sponsored by HP, and had time to exchange insights with a variety of people. It was all the more meaningful because I met the mentor who brought me into the graph field three or four years ago. I wonder what I would be doing now if I had not discovered graphs back then. At the time I was studying without any clear purpose and felt my growth was slow, which is why I joined — and even now I do not regret that choice.

Likewise, if your growth has stagnated or you lack a purpose for studying, why not join Pseudo Lab, a space where you can grow alongside a variety of people? The community’s core motto is “non-profit,” so studying there costs nothing. Its community programs are systematic and free, and because it supports many people’s growth, you can clearly feel the before-and-after improvement in your skills. Why not join us, Omakase readers? :)

Connect point

If you have a question or want to connect, click this link: https://www.linkedin.com/in/ii-tae-jeong/

Resume update

https://www.notion.so/tteon/Jeong-IiTae-fb9139af14db4730b2aac1942d6bbada?pvs=4


Written by Jeong Yitae

LinkedIn: jeongyitae. A graph and network data enthusiast, from hardware to software (applications).
