Graph neural network

This article is reproduced from: "Top Applications of Graph Neural Networks 2021"

The translation is done by Google, and the machine translation effect is not good, but it does not affect the overall understanding.

At the beginning of the year, I felt that Graph Neural Network (GNN) became a buzzword.As a researcher in this field, I am proud of my work (at least not ashamed).It’s not always the case: three years ago, when I was busy withGANWhen talking to my colleagues at Transformers, their overall impression of me was that I was working on a particular niche problem.Well, this field is quite mature, here I suggest to take a look at the top applications of GNN that we have recently.

If this in-depth educational content on convolutional neural networks is useful to you, you can Subscribe to our AI research mailing list To be notified when we release new materials. 

Recommended system

Naturally, graphs appear in the context of user interaction with products in an e-commerce platform. Therefore, many companies use GNN for product recommendation.A standard use case is to model the interaction between the user and the project graph, learn node embeddings with some form of negative sampling loss, and usekNThe index retrieves similar items for a given user in real time.The company that first used this pipeline was Uber Eats  , The company passedGraphSage The network recommends food and restaurants.

In the case of food recommendations, the graphs obtained are relatively small due to the geographical limitations of recommendations, but some companies use GNNs on the scale of billions of edges.Chinese retail giant Alibaba has placed products on a network with billions of users and products Figure embeddingAnd GNN.Even building such a diagram may be an engineering nightmare, but for the recent Aligraph For pipelines, it only takes five minutes to build a graph with 400M nodes.Impressive, ha ha. Aligraph supports efficient distributed graph storage, optimized sampling operators and a large number of internal GNNs.It has been deployed for the recommendation and personalized search of multiple products in the company.

AlibabaAmazonAnd many other e-commerce companies use GNN to enhance the recommendation system.

same,  Pinterest came up with PinSage A model that uses personalized PageRank to effectively sample neighborhoods and effectively update node embeddings by aggregating each neighborhood.Their follow-up PinnerSage This framework has been extended to handle multi-embedded content to address different user preferences.These are just a few in the fieldfamousExample (you can alsoOn Amazonan examination About Knowledge Graph and GNN Or Fabula AI uses GNN for research on fake news detection), but it is obvious that if the signal comes from the user’s interaction, it is significant.

Portfolio Optimization

The solution of combinatorial optimization (CO) problems is the main force of many important applications in finance, logistics, energy, life sciences and hardware design.Most of these problems are represented graphically.As a result, in the past century, a large amount of ink has been sprinkled on algorithmic methods to solve the CO problem more effectively.However, the modern computing revolution driven by machine learning provides a convincing new way to learn how to solve such problems.

Google Brain team Use GNN to optimize new hardware (e.g.Google’s TPU)The power, area and performance of the chip block.Computer chips can be divided into graphics of memory and logic components, each of which is represented by its coordinates and type.While complying with the limits of density and wiring congestion, determining the location of each component is a laborious process, which is still the job of an electrical engineer. Their GNN model is combined with strategy and value RL functions, It can generate optimized layouts for circuit chips that match or are better than manually designed hardware.

Compared with Chess and Go, the complexity of the Chip Placement problem. (source

The other method takes a different approach and integrates the ML model into an existing solver.E.g,  Gasse et al. A graph network is proposed to learn the branch and bound variable selection strategy: the key step in the mixed integer linear program (MILP) solver.In this way, the learned representation attempts to minimize the runtime of the solver and has proven to be a good compromise between reasoning time and decision quality.

In DeepMind and GoogleLatest work, The graph net is used for two key sub-tasks involved in the MILP solver: joint variable allocation and limiting target value.Their neural network method is 2-10 times faster than existing solvers for huge data sets including Google's production packaging and planning system.For more results on this topic, you can refer to several Recent survey It is to discuss the deeper combination of GNNS, ML, and CO.

Computer vision

Since there are close connections between objects in the world, images containing these objects can also benefit from GNN.One way to perceive images is through场景 图, which is ScenesThe set of objects that exist in and the relationships between them.Scene graphs have been used in image retrieval, understanding and reasoning, subtitles, visual problem solving, and image generation, showing that it can greatly improve the performance of the model.

In the works of Facebook, Can be popularCVThe objects in the COCO dataset are placed on the canvas, the position and size of the objects are specified, and a scene graph is created from them.Then, use GNN to encode the graph to determine the embedding of each object, and then compare it withCNNUsed together to generate the mask, bounding box and appearance of the object.therefore, The end user only needs For GNN/CNN in the figureAdd new node(Specify the relative position and size of the node) to generate an image with these objects.

Use scene graphs to generate images. The user can set the object Place it anywhere on the canvas (the red "river"; moved from the center to the lower right corner) to reflect the changes in the image (the river generated in the image also moves to the lower right corner).

Another source of graphics in CV is the matching of two related images-a classic problem that used to be done by hand-made descriptors. 3D Graphics Company Magic Leap Released a product calledSuper GlueThe GNN architecture, which can perform graphics matching in real-time video, is used to complete tasks such as 3D reconstruction, location recognition, localization, and mapping (SLAM). SuperGlue consists of an attention-based GNN that learns the representation of image key points, which are further fed to the best transmission layer for matching.The model can be used in modernGPUThe matching is performed in real-time on the Internet, and can be easily integrated into modern SLAM systems.For more details on the intersection of graphics and computer vision, please check These ones survey.

Physical Chemistry

Life sciences benefit from representing the interactions between particles or molecules as graphs and then using GNN to predict the properties of such systems.On Facebook and CMUOpen Catalyst project, The ultimate goal is to find new ways to store renewable energy such as solar or wind energy.One of the potential solutions is to convert this energy into other fuels, such as hydrogen, through chemical reactions.However, this requires the discovery of new catalysts to drive chemical reactions at a high rate, and known methods such as DFT very expensive. Open Catalyst project opens The largest catalyst data set, its DFT relaxation and GNN baseline.It is hoped to find new low-cost molecules that can add to the current expensive simulations that take several days and have effective molecular energy and force ML approximations (which may take several milliseconds).

Examples of the initial state and relaxed state of adsorbates (small linking molecules) and catalyst surfaces.In order to find the relaxed state of a pair of adsorbent catalysts, expensive DFT simulations must be performed, which may take several days. Zitnick et al. 2020 year

Researchers at DeepMind GNN is also used to simulate the dynamics of complex particle systems (such as water or sand).By predicting the relative motion of each particle in each step, the dynamics of the entire system can be reasonably reconstructed, and the basic laws of controlling motion can be further understood.For example, this Used to understand the glass transition, This is one of the most interesting unsolved problems in solid-state theory.Using GNN can not only simulate the dynamics of the transition process, but also better understand how particles affect each other based on distance and time.

In addition, the physics laboratory in the United States fermilabCommitted to shipping GNN to CERN's Large Hadron Collider (LHC).The goal is to process millions of images and select images related to the discovery of new particles.Their task isIn FPGAUpper realization GNN  And combine them withData acquisition processorIntegrated together, this will allow GNN to operate remotely across the globe.For more applications of GNN in particle physics, please check This latest survey.

Drug discovery

With billions of dollars in research and development funds and fierce competition, pharmaceutical companies are fiercely looking for new drug development paradigms.In biology, graphs can represent interactions on various scales.At the molecular level, edges can be bonds between atoms in a molecule or interactions between amino acid residues in a protein.On a larger scale, graphics can represent interactions between more complex structures (such as proteins, mRNA or metabolites).According to a specific level of abstraction, these maps can be used for target recognition, molecular property prediction, high-throughput screening, new drug design, protein engineering, and drug reuse.

The timetable for discovering drugs in different applications of GNN. Gaudelet et al., 2020

Perhaps one of the most promising results of using GNN for drug discovery is that of MIT researchers and their collaborators Published on Cell (2020).In this work, a trainee namedChempropThe deep GNN model to predict whether the molecule has antibiotic properties: the growth inhibitory effect on the bacteria Escherichia coli.After training it with only about 2500 molecules in the FDA-approved drug library, Chemprop was applied to a larger data set, includingHalicinMoleculesDrug Repurpose Hub, the drug is in "A Space Odyssey" in 2001 Is named as HAL 9000.

It is worth noting that previously only the Halicin molecule has been studied because its structure is very different from known antibiotics.However, in vivo and in vitro clinical experiments in the laboratory have shown that Halicin is a broad-spectrum antibiotic.Extensive benchmarking of powerful NN models highlights the importance of using GNN learning features in Halicin’s discovery.In addition to the practicality of this work, the architecture of Chemprop should attract more attention: Unlike many GNN models, Chemprop has 5 layers and 1600 hidden dimensions, far exceeding the typical GNN parameters used for such tasks .Hope this is only a small part of the upcoming new medical discovery of artificial intelligence.For more results on this topic, check out Recent survey 和 Blog post Researched more GNN applications in the field of drug discovery.