Refine
Document Type
Conference Type
- Konferenzartikel (1)
Language
- English (3)
Has Fulltext
- no (3) (remove)
Is part of the Bibliography
- yes (3)
Keywords
- Optimization (3) (remove)
Institute
Open Access
- Closed (2)
- Diamond (1)
- Open Access (1)
The need of suitable system of records in gaining ground as companies seek to maximize performance by harnessing the knowledge of their businesses, is discussed. Focused systems of record deliver a clear and consistent view even as they address a range of functions. Enterprise resource planning (ERP), as the financial system of record, embodies that view of manufacturing, inventory management, accounting and order processing. Customer relationship management (CRM), as a system of record, taps not only into the marketing and sales and service, but also into product development.
Due to its performance, the field of deep learning has gained a lot of attention, with neural networks succeeding in areas like Computer Vision (CV), Neural Language Processing (NLP), and Reinforcement Learning (RL). However, high accuracy comes at a computational cost as larger networks require longer training time and no longer fit onto a single GPU. To reduce training costs, researchers are looking into the dynamics of different optimizers, in order to find ways to make training more efficient. Resource requirements can be limited by reducing model size during training or designing more efficient models that improve accuracy without increasing network size.
This thesis combines eigenvalue computation and high-dimensional loss surface visualization to study different optimizers and deep neural network models. Eigenvectors of different eigenvalues are computed, and the loss landscape and optimizer trajectory are projected onto the plane spanned by those eigenvectors. A new parallelization method for the stochastic Lanczos method is introduced, resulting in faster computation and thus enabling high-resolution videos of the trajectory and secondorder information during neural network training. Additionally, the thesis presents the loss landscape between two minima along with the eigenvalue density spectrum at intermediate points for the first time.
Secondly, this thesis presents a regularization method for Generative Adversarial Networks (GANs) that uses second-order information. The gradient during training is modified by subtracting the eigenvector direction of the biggest eigenvalue, preventing the network from falling into the steepest minima and avoiding mode collapse. The thesis also shows the full eigenvalue density spectra of GANs during training.
Thirdly, this thesis introduces ProxSGD, a proximal algorithm for neural network training that guarantees convergence to a stationary point and unifies multiple popular optimizers. Proximal gradients are used to find a closed-form solution to the problem of training neural networks with smooth and non-smooth regularizations, resulting in better sparsity and more efficient optimization. Experiments show that ProxSGD can find sparser networks while reaching the same accuracy as popular optimizers.
Lastly, this thesis unifies sparsity and neural architecture search (NAS) through the framework of group sparsity. Group sparsity is achieved through ℓ2,1-regularization during training, allowing for filter and operation pruning to reduce model size with minimal sacrifice in accuracy. By grouping multiple operations together, group sparsity can be used for NAS as well. This approach is shown to be more robust while still achieving competitive accuracies compared to state-of-the-art methods
Modern industrial production is heavily dependent on efficient workflow processes and automation. The steady flow of raw materials as well as the separation of vital parts and semi-finished products are at the core of these automated procedures. Commonly used systems for this work are bowl feeders, which separate the parts and material by a combination of mechanical vibration and friction. The production of these tools, especially the design of the ramping spiral, is delicate and time-consuming work, as the shape, slope, and material must be carefully adjusted for the corresponding parts. In this work, we propose an automated approach, making use of optimization procedures from artificial intelligence, to design the spiral ramps of the bowl feeders. Therefore, the whole system and considered parts are physically simulated and the optimized geometry is subsequently exported into a CAD system for the actual building, respectively printing. The employment of evolutionary optimization gives the need to develop a mathematical model for the whole setup and find an efficient representation of integral features.