Refine
Document Type
- Article (reviewed) (2)
- Doctoral Thesis (1)
Language
- English (3)
Is part of the Bibliography
- yes (3) (remove)
Keywords
- Deep learning (3) (remove)
Institute
Open Access
- Open Access (3)
- Diamond (1)
- Gold (1)
Due to its performance, the field of deep learning has gained a lot of attention, with neural networks succeeding in areas like Computer Vision (CV), Neural Language Processing (NLP), and Reinforcement Learning (RL). However, high accuracy comes at a computational cost as larger networks require longer training time and no longer fit onto a single GPU. To reduce training costs, researchers are looking into the dynamics of different optimizers, in order to find ways to make training more efficient. Resource requirements can be limited by reducing model size during training or designing more efficient models that improve accuracy without increasing network size.
This thesis combines eigenvalue computation and high-dimensional loss surface visualization to study different optimizers and deep neural network models. Eigenvectors of different eigenvalues are computed, and the loss landscape and optimizer trajectory are projected onto the plane spanned by those eigenvectors. A new parallelization method for the stochastic Lanczos method is introduced, resulting in faster computation and thus enabling high-resolution videos of the trajectory and secondorder information during neural network training. Additionally, the thesis presents the loss landscape between two minima along with the eigenvalue density spectrum at intermediate points for the first time.
Secondly, this thesis presents a regularization method for Generative Adversarial Networks (GANs) that uses second-order information. The gradient during training is modified by subtracting the eigenvector direction of the biggest eigenvalue, preventing the network from falling into the steepest minima and avoiding mode collapse. The thesis also shows the full eigenvalue density spectra of GANs during training.
Thirdly, this thesis introduces ProxSGD, a proximal algorithm for neural network training that guarantees convergence to a stationary point and unifies multiple popular optimizers. Proximal gradients are used to find a closed-form solution to the problem of training neural networks with smooth and non-smooth regularizations, resulting in better sparsity and more efficient optimization. Experiments show that ProxSGD can find sparser networks while reaching the same accuracy as popular optimizers.
Lastly, this thesis unifies sparsity and neural architecture search (NAS) through the framework of group sparsity. Group sparsity is achieved through ℓ2,1-regularization during training, allowing for filter and operation pruning to reduce model size with minimal sacrifice in accuracy. By grouping multiple operations together, group sparsity can be used for NAS as well. This approach is shown to be more robust while still achieving competitive accuracies compared to state-of-the-art methods
Blockchain-IIoT integration into industrial processes promises greater security, transparency, and traceability. However, this advancement faces significant storage and scalability issues with existing blockchain technologies. Each peer in the blockchain network maintains a full copy of the ledger which is updated through consensus. This full replication approach places a burden on the storage space of the peers and would quickly outstrip the storage capacity of resource-constrained IIoT devices. Various solutions utilizing compression, summarization or different storage schemes have been proposed in literature. The use of cloud resources for blockchain storage has been extensively studied in recent years. Nonetheless, block selection remains a substantial challenge associated with cloud resources and blockchain integration. This paper proposes a deep reinforcement learning (DRL) approach as an alternative to solving the block selection problem, which involves identifying the blocks to be transferred to the cloud. We propose a DRL approach to solve our problem by converting the multi-objective optimization of block selection into a Markov decision process (MDP). We design a simulated blockchain environment for training and testing our proposed DRL approach. We utilize two DRL algorithms, Advantage Actor-Critic (A2C), and Proximal Policy Optimization (PPO) to solve the block selection problem and analyze their performance gains. PPO and A2C achieve 47.8% and 42.9% storage reduction on the blockchain peer compared to the full replication approach of conventional blockchain systems. The slowest DRL algorithm, A2C, achieves a run-time 7.2 times shorter than the benchmark evolutionary algorithms used in earlier works, which validates the gains introduced by the DRL algorithms. The simulation results further show that our DRL algorithms provide an adaptive and dynamic solution to the time-sensitive blockchain-IIoT environment.
In this paper pathophysiological interrelated deactivation/activation phenomena are set out in the example of whiplash injury. These phenomena could have been underestimated in previous positron emission tomography studies as their focus was on hypoperfusion rather than hyperperfusion. In addition, statistical parametric mapping analysis of cerebral studies is normally not fine-tuned to special interesting areas rather than to obvious clusters of difference.