CIKM 2021 | Contrastive Pre-Training of Graph Neural Networks on Heterogeneous Graphs

Graph neural networks (GNNs) are a representative method in the field of graph learning. They need labeled data to guarantee model performance, but real systems usually contain large amounts of unlabeled data, which limits that performance. An intuitive remedy is to design pre-training strategies for GNNs that learn transferable knowledge from the general structural properties of graphs. Most current pre-training strategies are designed for homogeneous graphs, in which all nodes and edges belong to a single type. Graphs in real systems, however, are usually heterogeneous: multiple types of nodes are connected by different types of edges, which carries rich semantic information. Existing models struggle to capture this semantic information effectively.

In this article, we present CPT-HG, a contrastive pre-training strategy for graph neural networks on heterogeneous graphs, which captures semantic and structural properties in a self-supervised manner. Specifically, pre-training tasks are designed at the relation level and the subgraph level, and contrastive learning further enhances their representativeness. At the relation level, the task distinguishes the most basic form of heterogeneity, the type of relation between two nodes, to capture the corresponding semantic information. At the subgraph level, it constructs different metagraph instances to capture the corresponding semantic information.

Paper Name:

Contrastive Pre-Training of GNNs on Heterogeneous Graphs

Conference: CIKM 2021

Paper link:

In recent years, graphs have become a common abstraction for representing many kinds of real-world data. As an emerging tool for machine learning on graph-structured data, graph neural networks (GNNs) learn powerful graph representations by recursively aggregating the content of neighboring nodes (i.e., features or embeddings), thereby preserving both content and structural information. They have been shown to improve the performance of various graph applications, such as node- and graph-level prediction, recommendation, and other graph tasks. In general, GNN models are trained with (semi-)supervised information, and different downstream tasks require large amounts of labeled data. In most realistic scenarios, however, obtaining large amounts of labeled data is costly. To make full use of unlabeled graph-structured data, recent work has drawn inspiration from advances in natural language processing and computer vision and proposed pre-training GNN models on graphs. Although these GNN pre-training methods perform well, they are designed for homogeneous graphs, in which every node and edge belongs to the same type. As a result, existing strategies ignore heterogeneous graphs, in which multiple types of nodes interact through different types of edges.

Networks in real life often form heterogeneous graphs, which carry rich semantics and unique structures arising from their multiple types of nodes and edges. As shown in Fig. 1(a), a simple heterogeneous graph built from bibliographic data consists of author, paper, conference, and term nodes, together with the relations that connect them. Different types of nodes or edges usually exhibit different network properties, such as degree and clustering coefficient. For example, a conference node usually has a higher degree than an author node. This heterogeneity also produces more complex semantic contexts that involve multiple relations among multiple nodes, for example the semantic context of "two authors working on similar topics". Beyond this simple example, heterogeneous graphs are common in many fields, such as e-commerce, where users and items interact in various ways, and biology, where diseases, proteins, and drugs are associated with one another. Given how widespread they are, it is important to design GNN pre-training strategies for heterogeneous graphs.
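To make the bibliographic example concrete, here is a minimal sketch of a heterogeneous graph stored as typed nodes and typed edges, together with the average degree per node type. The node names, relation names, and edges are made-up toy values, not data from the paper.

```python
from collections import defaultdict

# Toy bibliographic heterogeneous graph (all names are hypothetical).
# Every node has a type, and every edge is (source, relation type, target).
node_type = {
    "a1": "author", "a2": "author", "a3": "author",
    "p1": "paper", "p2": "paper", "p3": "paper",
    "c1": "conference",
    "t1": "term", "t2": "term",
}
edges = [
    ("a1", "writes", "p1"), ("a2", "writes", "p1"),
    ("a2", "writes", "p2"), ("a3", "writes", "p3"),
    ("p1", "published_in", "c1"), ("p2", "published_in", "c1"),
    ("p3", "published_in", "c1"),
    ("p1", "contains", "t1"), ("p2", "contains", "t1"),
    ("p3", "contains", "t2"),
]


def average_degree_by_type(node_type, edges):
    """Average undirected degree of the nodes of each type."""
    degree = defaultdict(int)
    for u, _, v in edges:
        degree[u] += 1
        degree[v] += 1
    totals, counts = defaultdict(int), defaultdict(int)
    for node, ntype in node_type.items():
        totals[ntype] += degree[node]
        counts[ntype] += 1
    return {ntype: totals[ntype] / counts[ntype] for ntype in counts}


avg_degree = average_degree_by_type(node_type, edges)
print(avg_degree)
```

In this toy graph the single conference node has a higher degree than the average author node, which matches the observation that node types tend to differ in their network properties.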

In this article, we propose a contrastive pre-training scheme that not only considers the differences between individual nodes but also preserves the high-order semantics among multiple nodes. More specifically, a pre-training task is designed to distinguish the different types of relations between two nodes (for example, the author-paper and paper-conference relations), which encodes the basis for downstream tasks. Inspired by contrastive learning [42], and to enhance the representativeness of the samples, relation-level negative samples are constructed from two sources:

1. Negative samples from inconsistent relations, in which the two nodes are linked by a relation different from that of the positive sample;

2. Negative samples from unrelated nodes, in which the two nodes have no link at all in the graph.

At the same time, this paper proposes a subgraph-level pre-training task on heterogeneous graphs, in which metagraphs are used to generate subgraph instances for contrast, making it possible to encode the high-order semantic context relevant to different downstream tasks.

This section introduces the pre-training model from the perspective of its two pre-training tasks: the relation level and the subgraph (metagraph) level.

2.1 Relation-level pre-training task

A positive sample consists of two nodes on the heterogeneous graph that are connected through a relation, which together form a positive triple. These triples are what the pre-training task works on. Negative samples are built in two ways: from inconsistent relations and from unconnected nodes.

Inconsistent relations. For a given positive triple, the node is also linked to other nodes through relations that are inconsistent with the positive relation. The corresponding negative samples are therefore triples whose relation differs from that of the positive triple, denoted by:

Since this set is relatively large, the method randomly samples from it to build the negatives used for contrastive learning when pre-training the graph neural network. The corresponding loss function is:

Here, each relation has its own learnable weight matrix.
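As a rough illustration of the relation-level task, the sketch below samples inconsistent-relation negatives for one positive triple and scores them with a contrastive loss. It assumes a bilinear score with a relation-specific weight matrix and an InfoNCE-style loss, which is a common choice in contrastive learning; both are assumptions made for illustration and not necessarily the exact formulation used by CPT-HG. All node names, embeddings, and the identity weight matrices are toy values.

```python
import math
import random


def bilinear_score(h_u, W, h_v):
    """Score a triple <u, R, v> as h_u^T W_R h_v (assumed scoring function)."""
    Wh_v = [sum(w * x for w, x in zip(row, h_v)) for row in W]
    return sum(a * b for a, b in zip(h_u, Wh_v))


def sample_inconsistent_negatives(u, relation, triples, num_samples, rng):
    """Randomly sample triples that start at u but use a different relation."""
    candidates = [(s, r, t) for (s, r, t) in triples if s == u and r != relation]
    return rng.sample(candidates, min(num_samples, len(candidates)))


def info_nce_loss(pos_score, neg_scores, temperature=1.0):
    """-log(exp(pos/tau) / (exp(pos/tau) + sum(exp(neg/tau)))), computed stably."""
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


# Toy data: paper p1 is linked to an author, a conference, and a term.
triples = [
    ("p1", "written_by", "a1"),
    ("p1", "published_in", "c1"),
    ("p1", "contains", "t1"),
]
embeddings = {"p1": [1.0, 0.0], "a1": [0.9, 0.1], "c1": [0.0, 1.0], "t1": [0.2, 0.8]}
identity = [[1.0, 0.0], [0.0, 1.0]]
# One weight matrix per relation; these would be learned during pre-training.
W = {"written_by": identity, "published_in": identity, "contains": identity}

rng = random.Random(0)
u, relation, v = triples[0]
negatives = sample_inconsistent_negatives(u, relation, triples, num_samples=2, rng=rng)

# Assumption: negatives are scored with the weight matrix of the positive relation.
pos = bilinear_score(embeddings[u], W[relation], embeddings[v])
negs = [bilinear_score(embeddings[u], W[relation], embeddings[t]) for (_, _, t) in negatives]
loss = info_nce_loss(pos, negs)
print(negatives, loss)
```

The loss shrinks as the positive pair scores higher than the sampled negatives, which is the behavior the contrastive objective is meant to encourage.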

Unconnected nodes. Following previous work, this paper adopts a simple negative sampling scheme that directly samples nodes that are not k-hop neighbors of the given node and uses them as negative samples. To ensure the quality of the negative samples, only the nodes selected in this way are used as negatives. The corresponding loss function is:
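The unconnected-node scheme can be sketched as follows, assuming it means sampling nodes that lie outside the k-hop neighborhood of the source node. That reading, the helper names, and the toy path graph are assumptions made for illustration.

```python
import random
from collections import deque


def nodes_within_k_hops(adjacency, source, k):
    """Breadth-first search: all nodes reachable from source in at most k hops."""
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            continue  # do not expand beyond k hops
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return seen


def sample_unconnected_negatives(adjacency, source, k, num_samples, rng):
    """Sample nodes that lie outside the k-hop neighborhood of source."""
    near = nodes_within_k_hops(adjacency, source, k)
    candidates = sorted(n for n in adjacency if n not in near)
    return rng.sample(candidates, min(num_samples, len(candidates)))


# Toy undirected graph: a path a - b - c - d - e (node names are hypothetical).
adjacency = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
negatives = sample_unconnected_negatives(
    adjacency, "a", k=2, num_samples=2, rng=random.Random(0)
)
print(negatives)
```

With k = 2, nodes a, b, and c are excluded, so only d and e can be drawn as negatives for a.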

Therefore, for the relation-level pre-training task, the overall loss function is:

2.2 Subgraph-level pre-training task

To capture high-order information, a natural idea is to use metapaths to explore high-order relations. However, relying on metapaths to pre-train GNNs on heterogeneous graphs has two weaknesses:

1. Compared with metagraphs, metapaths are less capable of expressing rich semantics and extracting high-order structures;

2. Starting from a source node, the number of nodes reachable along a metapath can be very large, whereas far fewer nodes are reachable from the same source node through a metagraph, because its structure is more complex and more restrictive, which makes metagraphs more efficient.

Therefore, this article uses metagraphs to capture high-order information.
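The second weakness can be illustrated on a toy graph. The sketch below compares the authors reachable from one source author through a metapath (Author-Paper-Conference-Paper-Author) with those reachable through a more restrictive metagraph in which the two papers must share both a conference and a term. The metagraph shape and all data are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
# Toy bibliographic data (all names are hypothetical).
paper_authors = {"p1": {"a1"}, "p2": {"a2"}, "p3": {"a3"}, "p4": {"a4"}}
paper_conf = {"p1": "c1", "p2": "c1", "p3": "c1", "p4": "c2"}
paper_terms = {"p1": {"t1"}, "p2": {"t1"}, "p3": {"t2"}, "p4": {"t1"}}


def papers_of(author):
    """Papers written by the given author."""
    return [p for p, authors in paper_authors.items() if author in authors]


def reachable_by_metapath(author):
    """Authors reached via the metapath Author-Paper-Conference-Paper-Author."""
    reached = set()
    for p in papers_of(author):
        for q in paper_conf:
            if paper_conf[q] == paper_conf[p]:
                reached |= paper_authors[q]
    return reached - {author}


def reachable_by_metagraph(author):
    """Authors reached when the two papers share BOTH a conference and a term."""
    reached = set()
    for p in papers_of(author):
        for q in paper_conf:
            if paper_conf[q] == paper_conf[p] and paper_terms[q] & paper_terms[p]:
                reached |= paper_authors[q]
    return reached - {author}


via_path = reachable_by_metapath("a1")
via_graph = reachable_by_metagraph("a1")
print(sorted(via_path), sorted(via_graph))
```

Because the metagraph adds a constraint on top of the metapath, the set it reaches is always a subset of the metapath set, so fewer candidate instances need to be processed per source node.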

Constructing positive samples. For a given metagraph and source node, the metagraph instances that contain the node are collected, and the positive samples of the node are represented as the set of all such instances of the metagraph. The metagraph-based positive samples are therefore as follows:

Negative sample queue. To build negative samples for metagraphs, a dynamic queue is used to maintain the negative sample set. Rather than sampling negatives in real time, the queue reuses positive samples seen earlier in training: the most recent positive samples are added, and the oldest entries at the end of the queue are removed, to generate the negative samples.
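A minimal sketch of such a queue, using Python's `collections.deque` with a fixed `maxlen`, is shown below. The class name, the queue size, and the string placeholders are assumptions for illustration; in practice the entries would be embeddings of metagraph instances.

```python
from collections import deque


class NegativeSampleQueue:
    """Fixed-size FIFO queue of past positive samples reused as negatives."""

    def __init__(self, max_size):
        # With maxlen set, appending to a full deque drops the oldest entry.
        self._queue = deque(maxlen=max_size)

    def negatives(self):
        """Current negatives: everything enqueued in earlier training steps."""
        return list(self._queue)

    def update(self, positive_samples):
        """Enqueue the newest positives; the oldest ones are evicted if full."""
        self._queue.extend(positive_samples)


queue = NegativeSampleQueue(max_size=4)
history = []
for batch in [["g1", "g2"], ["g3", "g4"], ["g5", "g6"]]:
    history.append(queue.negatives())  # negatives come from earlier batches only
    queue.update(batch)

print(history)
print(queue.negatives())
```

Reading the negatives before calling `update` keeps the current batch's positives out of its own negative set.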

Therefore, to capture high-order semantic information, this part models the source node together with its corresponding positive and negative samples, using the following loss function:

To take both pre-training tasks into account, the paper optimizes the following combined loss:

3. Experiments

3.1 Link prediction

The following table shows the performance of all methods on the link prediction task; the last row gives the improvement of the proposed method over existing methods. The model achieves a relative improvement of about 2% over the best existing baseline on all datasets. These results verify the effectiveness of the proposed model.

3.2 Node classification

The following table shows the performance of all methods on the node classification task; the last row gives the improvement of the proposed method over existing methods. The model achieves a relative improvement of about 1% over the best existing baseline on the DBLP and Aminer datasets. These results verify the effectiveness of the proposed model.

3.3 Ablation experiment

Replacing the base graph neural network model shows that good results can still be achieved relative to the model without pre-training. The pre-trained GAT does not perform satisfactorily, because it is difficult to learn attention that generalizes between the pre-training and fine-tuning graphs, so its performance is worse than that of the model without pre-training.

This article comes from:

Author: Jiang Xiangqiang

Illustration by Dmitry Nikunikov from Icons8


Typora now charges! What can replace it? 10 Markdown editors [worth bookmarking]

Hello everyone, I am Yi. Follow me, and I will share free software from time to time. This article runs to about 2,800 words and 21 images, so it is best read slowly.

I wasn't working overtime this weekend, so I opened my computer to write something. As usual I launched Typora, and a dialog popped up. With my CET-4 level English I spotted NEWER VERSION at the end and immediately understood.

So I clicked update and was taken to Typora's official website, where the $14.99 price tag stabbed me right in the heart. [It is said that Typora's Chinese localization was done by the community, and now it actually charges.]

I have used Typora for more than a year. Its simple interface and instant rendering made it my only editing software. Now that I have formed the habit, it suddenly starts charging, which reminds me of the Notability pricing change I ran into a while ago.

Faced with this situation and a thin wallet, I started hunting for software. After searching and trying things out, I found plenty of alternatives, so here are some free Markdown editors to share with you.

Since I am a Windows user, I can only report on the platform I use; cross-platform support is only briefly introduced.

To avoid another sudden paywall, this time I mainly recommend open source software. The source code can be found on GitHub, so you can use it with confidence.

  • GitHub, official website

  • Cross-platform: Mac, Windows, Linux

VS Code is a code editor from Microsoft. After years of iterative development, its functionality and performance have become increasingly stable. As a programmer, I have used it since VS Code first came out.

The power of VS Code is that it supports many specialized features through extensions, for example C code, Java code, and flowcharts. Markdown is of course supported too.

After installing a Markdown plugin, you can edit the text and view the rendered document on the right.

Comment: Programmers will like VS Code; once plugins are installed, there is little it cannot do.

  • GitHub, official website

  • Cross-platform: Windows, Mac, Linux, iOS, Android

Joplin is a cross-platform, open source, and completely free Markdown note-taking application. After trying it, I found that its features go far beyond Typora's, and it suits me better.

Joplin's interface is just as simple, and I believe you can get started quickly. The one flaw is that some menus are not fully translated into Chinese, but they are simple words that do not affect use.

Besides the usual editing features it shares with Typora, Joplin supports synchronization with multiple cloud drives. Typora does not support cloud sync, so you have to sync manually every time; this switch of editors turned out to be a blessing in disguise.

Joplin also supports end-to-end encryption, to-do items, note history, full-text Chinese search, opening notes in an external editor, a web clipper plugin, and more.

Comment: Better than Typora ... I have now switched to Joplin.

  • GitHub, official website

  • Cross-platform: Windows, Mac, Linux

Mark Text is a simple and elegant open source Markdown editor that supports Linux, macOS, and Windows.

Mark Text's features are basically the same as Typora's. It supports real-time Markdown rendering, and documents originally edited in Typora can be opened in Mark Text, so the transition is smooth.

One feature where it beats Typora is support for SM.MS and GitHub image hosting. For people who blog often, this reduces the hassle of uploading images.

The lack of beauty is not supported, and some people may not be friendly.

Comment: Comparable to Typora and can replace it.

  • GitHub, official website

  • Cross-platform

Notable is an open source Markdown editor that supports Linux, macOS, Windows, and other operating systems.

Notable uses a classic three-column layout. The left column offers "All Notes" as well as tags, the middle column lists the notes under All Notes or under the selected tag, and the right column is the main editing and preview area.

Notable does not support real-time Markdown rendering, but people who like to concentrate on the words will enjoy this "focused" input mode.

In addition, Notable supports full-text search but has no Chinese menus.

Comment: No Chinese menus, which is a bit of a pity.

  • GitHub, official website

  • Cross-platform: Mac, Windows, Debian, Fedora, ARM64

Zettlr is a Markdown editor that is ideal for writing professional texts, whether you are a college student, researcher, journalist, or writer.

Zettlr's distinctive features, including literature citations, focus mode, heatmap search, code highlighting, and organizational structures, turn Markdown from an editor into a productivity tool.

Comment: No Chinese menus, a bit of a pity! Professional.

  • GitHub

  • Cross-platform: Windows, Linux, Mac

Trilium is an open source Markdown note-taking application that organizes notes hierarchically. Notes can be arranged into arbitrarily deep trees. It offers rich WYSIWYG note editing, including tables, images, math, and Markdown auto-formatting; editing notes as source code with syntax highlighting; quick and easy navigation between notes, full-text search, and note hoisting; seamless note versioning; note attributes that can be used for organization, querying, and advanced scripting; and synchronization with a self-hosted sync server.

In addition to these features, Trilium supports relation maps and link maps between notes. Visualizing the relations helps connect pieces of knowledge and makes them easier to remember.

Comment: Can replace Typora; the relation map between notes makes it suitable as a knowledge base.

Much document editing is now done through multi-person online collaboration or across different devices, and in those cases an online editor is more appropriate.

There are many note-taking applications, but Notion is all-round knowledge management software that "redefines digital notes". It gives users a very rich feature set that meets the note-taking needs of more than 90% of users.

The basic unit in Notion is the [Block]: a paragraph, a picture, a video, or a table is each a block. Multiple blocks make up a Page, which can be understood as a note.

This flexibility suits many scenarios and templates: it can serve as personal notes, a calendar, or a knowledge base, and also as a team library, a board, or a project management tool.

Comment: Powerful features; a Chinese interface is under development; there is a certain learning cost. Data is stored in the cloud, and you need to look after the security of your documents yourself.

TIP: If you want a Chinese interface, take a look at "wolai", a phenomenally popular, localized Notion-like application.

These three online editors need little introduction. They also support Markdown, and I believe many people already use them. Some time ago, Shimo Docs ran into legal trouble, after which image sizes were limited and public sharing of articles was restricted, so it is not acting up now.

A quiet tip: if you want to top up Shimo, have a look on a certain "treasure" shopping site; there is a surprise ...

If you roll Typora back to an earlier beta version, you can still use it for free. However, since I have already found replacements, I am too lazy to try.

If you just want to replace Typora, my offline Markdown recommendation is Joplin. The learning cost of moving from Typora to Joplin is low, and Joplin has more features.

If you need online sync and sharing, you can also look at online document editors such as Shimo Docs and Tencent Docs, which are also very powerful.

There is no best editor, only the one that suits you best! Some people like an editor that covers every feature, while others prefer one focused purely on editing. Before choosing the software that suits you, fully consider your needs, such as security (whether you can back up), search, price, sync, and so on.

You can also use several tools at once; I use Joplin, Shimo Docs, Youdao Cloud Notes, and Tencent Docs. I believe everyone can find a suitable editor among these.
