Giorgio Ciano (DIISM, University of Siena)
Dec 12, 2018 – 11:00 AM
DIISM, Artificial Intelligence laboratory (room 201), Siena SI
Most techniques used in Machine Learning (e.g. Deep Neural Networks) take vectors as inputs. However, graphs are a more general data structure, better suited to representing real-world problems. Although a graph representation can be transformed into a vector representation (for instance, one obtained via a graph traversal), such a transformation may lose essential information. The Graph Neural Network (GNN) is a connectionist model able to process graphs directly, without the need for a pre-processing step. This work studies two possible extensions of the original GNN model.
The first extension concerns the possibility of varying the neuron model by changing the activation function used by GNNs. The original model adopts the tanh (hyperbolic tangent) activation function, whereas modern deep learning architectures exploit a variety of different activation functions. For example, with rectified linear units (ReLUs), each neuron has a region of the input space where it is always inactive. When the inputs fall in this region, the gradient with respect to the neuron weights is zero. Disabling neurons in this way is a desirable mechanism (called sparsity) that simplifies learning, making it faster. There are also several variants of the ReLU, for example leaky ReLUs, which allow a small, positive gradient when the unit is not active. Other variants are ELUs and SELUs, which try to push the mean activations closer to zero in order to speed up learning. These activation functions will be evaluated in the GNN framework in order to understand the network learning dynamics and to compare the resulting architectures from the computational point of view.
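As a rough illustration of the activation functions mentioned above, the following Python sketch defines each one for a scalar input, together with the ReLU and leaky ReLU gradients. It is not the code used in the thesis: the function names, the leak slope of 0.01 and the ELU alpha of 1.0 are assumptions chosen for illustration, and the SELU constants are the commonly cited values.

```python
import math

# Commonly cited SELU constants (assumed here, not taken from the abstract).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def tanh(x):
    """Hyperbolic tangent, the activation of the original GNN model."""
    return math.tanh(x)


def relu(x):
    """Rectified linear unit: zero for non-positive inputs."""
    return max(0.0, x)


def relu_grad(x):
    """ReLU gradient: exactly zero in the inactive region (x <= 0)."""
    return 1.0 if x > 0 else 0.0


def leaky_relu(x, slope=0.01):
    """Leaky ReLU: a small linear response when the unit is not active."""
    return x if x > 0 else slope * x


def leaky_relu_grad(x, slope=0.01):
    """Leaky ReLU gradient: small but positive for x <= 0."""
    return 1.0 if x > 0 else slope


def elu(x, alpha=1.0):
    """ELU: saturates smoothly towards -alpha for very negative inputs."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def selu(x):
    """SELU: a scaled ELU with fixed constants."""
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * (math.exp(x) - 1.0))


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.5, 2.0):
        print(x, tanh(x), relu(x), leaky_relu(x), elu(x), selu(x))
```

For a negative input, `relu_grad` returns zero while `leaky_relu_grad` returns the small slope, which is the difference between the two units described above.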
The second goal of the thesis is to experiment with transductive learning in GNNs. In pure transductive learning, the training patterns are used directly in the classification procedure. Therefore, the main difference between inductive and transductive learning is that, while in the former GNNs use only the parameters learnt during the training procedure, in the latter the available targets are added to the node labels and are diffused directly through the graph in the classification phase.
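The idea of adding the available targets to the node labels can be sketched in Python as follows. This is only a toy illustration under stated assumptions, not the thesis implementation: `augment_labels` and `aggregate_neighbors` are hypothetical helpers, targets are assumed to be class indices encoded one-hot, nodes with unknown targets receive an all-zero target part, and the neighbour sum is a simple stand-in for the GNN state transition, not the actual GNN diffusion mechanism.

```python
def augment_labels(labels, known_targets, num_classes):
    """Append a one-hot target to each node label (zeros if the target is unknown)."""
    augmented = {}
    for node, features in labels.items():
        one_hot = [0.0] * num_classes
        if node in known_targets:
            one_hot[known_targets[node]] = 1.0
        augmented[node] = list(features) + one_hot
    return augmented


def aggregate_neighbors(augmented, edges):
    """Toy stand-in for one diffusion step: sum the labels of each node's neighbours."""
    dim = len(next(iter(augmented.values())))
    summed = {node: [0.0] * dim for node in augmented}
    for u, v in edges:  # edges are treated as undirected
        for a, b in ((u, v), (v, u)):
            summed[a] = [s + x for s, x in zip(summed[a], augmented[b])]
    return summed


# Hypothetical 3-node path graph 0 - 1 - 2 with one feature per node.
labels = {0: [0.5], 1: [0.25], 2: [1.0]}
known_targets = {0: 1}  # only node 0 has an available target (class 1)
edges = [(0, 1), (1, 2)]

augmented = augment_labels(labels, known_targets, num_classes=2)
neighbor_info = aggregate_neighbors(augmented, edges)
```

In this toy graph, node 1 receives the target of node 0 through its neighbourhood, while node 2, which is two hops away, would only see it after a further step.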
Both extensions of the GNN model are evaluated on synthetic and real-world benchmarks. The benchmarks employed are those originally adopted to assess the GNN model: synthetic datasets (subgraph detection and clique localization in graphs) and Mutagenesis (http://kt.ijs.si/janez_kranjc/ilp_datasets/).