Vincenzo Laveglia (DIISM, University of Siena)
May 31, 2018 – 9:30 AM
DIISM, Artificial Intelligence laboratory (room 201), Siena SI
Training neural networks (shallow or deep) requires identifying the right hyper-parameters for the model. Some of these hyper-parameters relate to the training procedure (epochs, learning rate, momentum, initial weight distributions, …), while others are strictly tied to the model architecture (number of layers, number of neurons per layer, activation functions, …). Although the approach is still under development, we discuss a potential alternative for training neural networks that aims to simplify the learning process by avoiding the tuning/searching of some architectural hyper-parameters. We then show a few early results from an application to a simple problem.