Doctoral thesis

New architectures for very deep learning

01.02.2018

116 p.

Doctoral thesis: Università della Svizzera italiana, 2018

Abstract

Artificial Neural Networks are increasingly being used in complex real-world applications because many-layered (i.e., deep) architectures can now be trained on large quantities of data. However, training even deeper, and therefore more powerful, networks has hit a barrier due to fundamental limitations in the design of existing networks. This thesis develops new architectures that, for the first time, allow very deep networks to be optimized efficiently and reliably. Specifically, it addresses two key issues that hamper credit assignment in neural networks: cross-pattern interference and vanishing gradients. Cross-pattern interference leads to oscillations of the network's weights that make training inefficient. The proposed Local Winner-Take-All networks reduce interference among computation units in the same layer through local competition. An in-depth analysis of locally competitive networks provides generalizable insights and reveals unifying properties that improve credit assignment. As network depth increases, vanishing gradients make a network's outputs increasingly insensitive to the weights close to the inputs, causing gradient-based training to fail. To overcome this limitation, the proposed Highway networks regulate information flow across layers through additional skip connections which are modulated by learned computation units. Their beneficial properties are extended to the sequential domain with Recurrent Highway Networks, which benefit from increased depth and learn complex sequential transitions without requiring more parameters.
Language
  • English
Classification
Computer science and technology
License
License undefined
Identifiers
Persistent URL
https://n2t.net/ark:/12658/srd1318812
Statistics

Document views: 222
File downloads:
  • 2018INFO006.pdf: 115