For time series forecasting, neural networks and deep learning are nowadays the method of choice in terms of prediction quality. LSTM networks have established themselves as an approach that works well for this task. In 2017, the attention-based Transformer was introduced for language translation. Because it can work with sequential data, it is also of interest for time series problems. This thesis deals with time series forecasting using a Transformer. It analyses how a Transformer for time series forecasting differs from a Transformer for language translation and how well its predictions perform compared with those of an LSTM network. To this end, an LSTM network and a Transformer network are trained on air quality and weather data from Berlin to predict the particulate matter content (PM2.5) of the air. The results are compared with a benchmark model using evaluation metrics. Finally, the thesis evaluates how the Transformer's errors can be reduced and how well the Transformer generalises.
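The evaluation setup described in this abstract can be illustrated with a minimal sketch. The abstract does not name the benchmark model or the metrics, so the persistence baseline (repeat the last observed value), the MAE and RMSE metrics, the window sizes and the sample values below are all assumptions chosen for illustration only.

```python
import math


def make_windows(series, n_in, n_out):
    """Split a series into (input window, target window) pairs."""
    pairs = []
    for start in range(len(series) - n_in - n_out + 1):
        x = series[start:start + n_in]
        y = series[start + n_in:start + n_in + n_out]
        pairs.append((x, y))
    return pairs


def persistence_forecast(x, n_out):
    """Assumed naive baseline: repeat the last observed value."""
    return [x[-1]] * n_out


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Made-up PM2.5 readings, not data from the thesis.
pm25 = [12.0, 14.0, 13.0, 17.0, 21.0, 19.0, 16.0, 15.0]
pairs = make_windows(pm25, n_in=3, n_out=2)

truth, preds = [], []
for x, y in pairs:
    truth.extend(y)
    preds.extend(persistence_forecast(x, len(y)))

print("baseline MAE:", mae(truth, preds))
print("baseline RMSE:", rmse(truth, preds))
```

An LSTM or Transformer forecast would be scored with the same two functions on the same windows, so that all models are compared on identical targets.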
In the production of solar cells from multicrystalline silicon, defects from the crystallisation phase strongly influence the material quality of the wafers and thus the efficiency of the resulting solar cell. Understanding grain growth in multicrystalline silicon during the crystallisation process can help to optimise that process. This thesis investigates methods for computing optical flows between grain boundary images of multicrystalline Si wafers using neural networks. For this purpose, the architecture of a mature convolutional neural network for optical flow computation is used and tailored to wafer structures through adapted training. This comprises synthesising custom training data based on wafer images and training with an adapted loss function that assesses how accurately the optical flow assigns grains between wafers. Together, the two measures reduce the grain assignment error between wafer images by 45% compared with a highly optimised model that is based on the same network and trained on general optical flows. The estimated assignment accuracy of the best model is 92.4% of the pixels of a wafer's grain boundary images. Further potential for improvement remains.
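A pixel-wise assignment accuracy of the kind reported here can be sketched as follows. The abstract does not define the metric precisely, so this is only one plausible reading: it assumes that both wafers come with grain label maps using consistent grain IDs and that the flow stores a `(dx, dy)` displacement per pixel. The function name and the tiny label maps are hypothetical.

```python
def assignment_accuracy(labels_a, labels_b, flow):
    """Fraction of pixels in wafer A that the flow maps onto the same grain in wafer B.

    labels_a, labels_b: 2D lists of grain IDs (assumed consistent across wafers).
    flow: 2D list of (dx, dy) displacements, one per pixel of wafer A.
    """
    height, width = len(labels_a), len(labels_a[0])
    correct = 0
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            tx, ty = x + round(dx), y + round(dy)
            inside = 0 <= tx < width and 0 <= ty < height
            if inside and labels_b[ty][tx] == labels_a[y][x]:
                correct += 1
    return correct / (height * width)


# Toy example: the boundary between grain 1 and grain 2 moves one pixel to the left.
labels_a = [[1, 1, 2],
            [1, 1, 2]]
labels_b = [[1, 2, 2],
            [1, 2, 2]]

zero_flow = [[(0, 0)] * 3 for _ in range(2)]
fitted_flow = [[(0, 0), (-1, 0), (0, 0)] for _ in range(2)]

print(assignment_accuracy(labels_a, labels_b, zero_flow))
print(assignment_accuracy(labels_a, labels_b, fitted_flow))
```

One minus this value would be the assignment error that a task-specific loss could aim to reduce.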
Garbage in, Garbage out: How does ambiguity in data affect state-of-the-art pedestrian detection?
(2024)
This thesis investigates the critical role of data quality in computer vision, particularly in the realm of pedestrian detection. The proliferation of deep learning methods has emphasised the importance of large datasets for model training, while the quality of these datasets is equally crucial. Ambiguity in annotations, arising from factors like mislabelling, inaccurate bounding box geometry and annotator disagreements, poses significant challenges to the reliability and robustness of pedestrian detection models and their evaluation. This work explores the effects of ambiguous data on model performance, with a focus on identifying and separating ambiguous instances using an ambiguity measure based on annotator estimations of object visibility and identity. Through careful experimentation and analysis, trade-offs emerged between data cleanliness and representativeness and between noise removal and retention of valuable data, elucidating their impact on performance metrics like the log average miss-rate, recall and precision. Furthermore, a strong correlation between ambiguity and occlusion was discovered, with higher ambiguity corresponding to greater occlusion prevalence. The EuroCity Persons (ECP) dataset served as the primary dataset, revealing a significant proportion of ambiguous instances, with approximately 8.6% ambiguity in the training dataset and 7.3% in the validation set. Results demonstrated that removing ambiguous data improves the log average miss-rate, particularly by reducing false positive detections. Augmenting the training data with samples from neighbouring classes enhanced recall but diminished precision. Correcting erroneous false positives and false negatives significantly impacts model evaluation results, as evidenced by shifts in the ECP leaderboard rankings.
By systematically addressing ambiguity, this thesis lays the foundation for enhancing the reliability of computer vision systems in real-world applications, motivating the prioritisation of developing robust strategies to identify, quantify and address ambiguity.
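For readers unfamiliar with the log average miss-rate mentioned in this abstract, the sketch below follows the definition commonly used in pedestrian detection benchmarks: the miss rate is sampled at nine false-positives-per-image (FPPI) reference points evenly spaced in log space between 10^-2 and 10^0 and then averaged geometrically. Whether the thesis uses exactly this variant is an assumption, and the curve values are made up.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of the miss rate at log-spaced FPPI reference points.

    fppi must be sorted in ascending order, with miss_rate aligned to it.
    If the curve has no point at or below a reference FPPI, a miss rate
    of 1.0 is assumed there.
    """
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]
    logs = []
    for ref in refs:
        # Miss rate at the largest FPPI that does not exceed the reference point.
        candidates = [mr for f, mr in zip(fppi, miss_rate) if f <= ref]
        mr = candidates[-1] if candidates else 1.0
        logs.append(math.log(max(mr, 1e-10)))
    return math.exp(sum(logs) / len(logs))


# Made-up detector curve: the miss rate falls as more false positives are allowed.
curve_fppi = [0.005, 0.05, 0.5]
curve_miss = [0.6, 0.4, 0.2]
print(log_average_miss_rate(curve_fppi, curve_miss))
```

Because lower values are better, removing false positives shifts the curve to the left and reduces the score, which matches the effect the abstract reports.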
In the past ten years, applications of artificial neural networks have changed dramatically, outperforming earlier predictions in domains like robotics, computer vision, natural language processing, healthcare, and finance. Future research and advancements in CNN architectures, algorithms and applications are expected to further revolutionize various industries and daily life. Our task is to find existing products that resemble a given product image and description. Deep learning-based automatic product identification is a multi-step process that starts with data collection and continues with model training, deployment, and continuous improvement. The quality and variety of the dataset, the chosen model design, and ongoing testing and improvement all affect the model's effectiveness. We achieved 81.47% training accuracy and 72.43% validation accuracy for our combined text and image classification model. Additionally, we discuss the results on the other dataset and several methods for building an appropriate model.
In this paper, pathophysiologically interrelated deactivation/activation phenomena are set out using the example of whiplash injury. These phenomena may have been underestimated in previous positron emission tomography studies, as their focus was on hypoperfusion rather than hyperperfusion. In addition, statistical parametric mapping analysis of cerebral studies is normally tuned to obvious clusters of difference rather than to specific areas of interest.
Light has always been an aid to orientation for humankind. The interplay between bright and shaded surfaces is what makes spatial perception possible in the first place. Beyond that, localising light sources offers great potential for numerous fields of application, such as augmented reality.
The aim of this thesis was to develop a neural network that parameterises a lighting setup with the help of a self-generated synthetic dataset. State-of-the-art networks from digital image processing were used for this purpose.
At the start of the work, the properties of the lighting setup had to be extracted. A further fundamental requirement was to compile the relevant knowledge of deep learning.
To generate the synthetic dataset, a dedicated framework based on the Blender engine was developed.
The generated images and metadata were then used to train, validate and evaluate modified VGG16 and ResNet50 networks.
One finding is that artificially generated data are suitable for training a neural network. It was also shown that lighting parameters can be extracted with the help of deep learning.
A follow-up research task could use the proposed approach to improve the lighting of augmented reality applications.
Although the internet has existed for only a short time, it has grown tremendously. Today, a significant portion of commerce is conducted entirely online, driven by a growing number of internet users and technological advances in web development. At the same time, cyberattacks and threats have expanded significantly, leading to financial losses, privacy breaches, identity theft, reduced customer confidence in online banking and e-commerce, and damage to brand reputation and trust. In a phishing attack, an attacker pretends to be a genuine and trustworthy institution in order to steal private and confidential information from a victim. Phishing has been an ongoing issue for a long time and has cost the global economy billions of dollars. In recent years, there has been significant progress in the development of phishing detection and identification systems to protect against phishing attacks. Phishing detection technologies frequently produce binary results, i.e., whether a phishing attempt was made or not, with no explanation. Phishing identification methods, on the other hand, identify phishing webpages by visually comparing them with predetermined authentic references and report phishing together with its target brand, which makes the findings understandable. However, technical difficulties in the field of visual analysis limit the applicability of currently available solutions, preventing them from being both effective (with high accuracy) and efficient (with little runtime overhead). Here, we evaluate an existing framework called Phishpedia. This hybrid deep learning system can recognize identity logos from webpage screenshots and match logo variants of the same brand with high precision. Phishpedia provides high accuracy with low runtime. Lastly, unlike other methods, Phishpedia does not require training on any phishing samples.
Phishpedia outperforms baseline identification techniques (EMD, PhishZoo, and LogoSENSE) in accurately detecting phishing pages in extensive testing using accurate phishing data. The effectiveness of Phishpedia was tested and compared against other standard machine learning algorithms and some state-of-the-art algorithms. The evaluated solution performed better than the other algorithms on the given dataset.
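The reference-matching idea behind phishing identification can be sketched in a few lines. This is not Phishpedia's code: the embeddings, the cosine similarity threshold of 0.8, the brand names, the `.test` domains and the final domain check are all assumptions made for illustration. In a real system the logo vectors would come from a trained logo detector and embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equally long vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def identify(logo_vec, page_domain, references, threshold=0.8):
    """Return (verdict, brand) for a logo embedding found on a page.

    references maps brand -> (reference embedding, set of legitimate domains).
    """
    best_brand, best_sim = None, -1.0
    for brand, (ref_vec, _domains) in references.items():
        sim = cosine(logo_vec, ref_vec)
        if sim > best_sim:
            best_brand, best_sim = brand, sim
    if best_sim < threshold:
        return ("unknown", None)       # no protected brand recognised
    if page_domain in references[best_brand][1]:
        return ("benign", best_brand)  # brand logo on its own domain
    return ("phishing", best_brand)    # brand logo on a foreign domain


# Hypothetical reference list with toy three-dimensional embeddings.
references = {
    "examplebank": ([1.0, 0.0, 0.0], {"examplebank.test"}),
    "exampleshop": ([0.0, 1.0, 0.0], {"exampleshop.test"}),
}

print(identify([0.9, 0.1, 0.0], "login-examplebank.test", references))
print(identify([0.9, 0.1, 0.0], "examplebank.test", references))
print(identify([0.5, 0.5, 0.7], "other.test", references))
```

Returning the matched brand alongside the verdict is what makes the result explainable, and because only legitimate reference logos are needed, no phishing samples are required for this step.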
Due to its performance, the field of deep learning has gained a lot of attention, with neural networks succeeding in areas like Computer Vision (CV), Natural Language Processing (NLP), and Reinforcement Learning (RL). However, high accuracy comes at a computational cost, as larger networks require longer training time and no longer fit onto a single GPU. To reduce training costs, researchers are looking into the dynamics of different optimizers in order to find ways to make training more efficient. Resource requirements can be limited by reducing model size during training or by designing more efficient models that improve accuracy without increasing network size.
This thesis combines eigenvalue computation and high-dimensional loss surface visualization to study different optimizers and deep neural network models. Eigenvectors of different eigenvalues are computed, and the loss landscape and optimizer trajectory are projected onto the plane spanned by those eigenvectors. A new parallelization method for the stochastic Lanczos method is introduced, resulting in faster computation and thus enabling high-resolution videos of the trajectory and second-order information during neural network training. Additionally, the thesis presents the loss landscape between two minima along with the eigenvalue density spectrum at intermediate points for the first time.
Secondly, this thesis presents a regularization method for Generative Adversarial Networks (GANs) that uses second-order information. The gradient during training is modified by subtracting the eigenvector direction of the biggest eigenvalue, preventing the network from falling into the steepest minima and avoiding mode collapse. The thesis also shows the full eigenvalue density spectra of GANs during training.
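The gradient modification described here can be illustrated on a toy problem. The abstract does not give the exact update rule, so the sketch assumes the simplest reading: project out the gradient component along the eigenvector of the largest Hessian eigenvalue. It uses a small explicit matrix and plain power iteration. For a real network the Hessian is never formed explicitly, and Hessian-vector products with a Lanczos-type method would take its place.

```python
import math


def matvec(matrix, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in matrix]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def top_eigenvector(hessian, iters=200):
    """Power iteration; assumes the dominant eigenvalue is positive."""
    v = normalize([1.0] * len(hessian))
    for _ in range(iters):
        v = normalize(matvec(hessian, v))
    return v


def remove_top_direction(grad, v):
    """Subtract the component of grad that points along the unit vector v."""
    coeff = dot(grad, v)
    return [g - coeff * vi for g, vi in zip(grad, v)]


# Toy symmetric "Hessian" and gradient, chosen only for illustration.
hessian = [[4.0, 1.0],
           [1.0, 3.0]]
grad = [1.0, 2.0]

v = top_eigenvector(hessian)
new_grad = remove_top_direction(grad, v)
print("top eigenvalue estimate:", dot(v, matvec(hessian, v)))
print("modified gradient:", new_grad)
```

After the projection, the modified gradient is orthogonal to the sharpest curvature direction, so a step along it does not move the parameters in that direction.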
Thirdly, this thesis introduces ProxSGD, a proximal algorithm for neural network training that guarantees convergence to a stationary point and unifies multiple popular optimizers. Proximal gradients are used to find a closed-form solution to the problem of training neural networks with smooth and non-smooth regularizations, resulting in better sparsity and more efficient optimization. Experiments show that ProxSGD can find sparser networks while reaching the same accuracy as popular optimizers.
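To show what a closed-form proximal update looks like, here is a minimal sketch of a plain proximal gradient step with ℓ1 regularization, whose proximal operator is soft-thresholding. This is not ProxSGD itself. The abstract does not spell out the algorithm's other components, and the function names and numbers are illustrative assumptions.

```python
def soft_threshold(w, t):
    """Proximal operator of t * |w|: shrink w towards zero by t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def prox_gradient_step(weights, grads, lr, l1):
    """One gradient step on the smooth loss, then the l1 proximal step."""
    return [soft_threshold(w - lr * g, lr * l1) for w, g in zip(weights, grads)]


weights = [1.0, -0.05, 0.5]
grads = [0.0, 0.0, 2.0]
updated = prox_gradient_step(weights, grads, lr=0.1, l1=1.0)
print(updated)
```

Small weights are set exactly to zero rather than merely shrunk, which is how proximal methods obtain sparse networks without a separate pruning pass.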
Lastly, this thesis unifies sparsity and neural architecture search (NAS) through the framework of group sparsity. Group sparsity is achieved through ℓ2,1-regularization during training, allowing for filter and operation pruning to reduce model size with minimal sacrifice in accuracy. By grouping multiple operations together, group sparsity can be used for NAS as well. This approach is shown to be more robust while still achieving competitive accuracies compared to state-of-the-art methods.
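The ℓ2,1 norm sums the Euclidean norms of weight groups, so penalizing it pushes whole groups (for example, all weights of one filter or one candidate operation) to zero together. The sketch below shows the norm and its standard proximal operator, group soft-thresholding. How the thesis forms its groups and applies the penalty is not stated in the abstract, so the grouping and values here are assumptions.

```python
import math


def l21_norm(groups):
    """Sum over groups of the Euclidean norm of each group's weights."""
    return sum(math.sqrt(sum(w * w for w in group)) for group in groups)


def group_soft_threshold(group, t):
    """Shrink a whole group towards zero; zero it out if its norm is at most t."""
    norm = math.sqrt(sum(w * w for w in group))
    if norm <= t:
        return [0.0] * len(group)
    scale = 1.0 - t / norm
    return [scale * w for w in group]


# Each inner list stands for one prunable unit, e.g. one filter (toy values).
filters = [[3.0, 4.0], [0.1, 0.1], [1.0, 0.0]]
print("l2,1 norm:", l21_norm(filters))

pruned = [group_soft_threshold(f, t=1.0) for f in filters]
kept = [i for i, f in enumerate(pruned) if any(w != 0.0 for w in f)]
print("filters kept after thresholding:", kept)
```

A group that ends up entirely zero can be removed from the network, which is what turns a regularizer into a pruning or architecture selection mechanism.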
The identification of vulnerabilities is an important element of the software development process to ensure the security of software. Vulnerability identification based on source code is a well-studied field. Finding vulnerabilities on the basis of a binary executable without the corresponding source code is more challenging. Recent research has shown how, for certain types of vulnerabilities, such detection can be performed statically, and thus efficiently in terms of runtime, by using deep learning methods.
This thesis aims to examine to what extent this identification can be applied sufficiently for a variety of vulnerabilities. Therefore, a supervised deep learning approach using recurrent neural networks for the application of vulnerability detection based on binary executables is used. For this purpose, a dataset with 50,651 samples of 23 different vulnerabilities in the form of a standardised LLVM Intermediate Representation was prepared. The vectorised features of a Word2Vec model were then used to train different variations of three basic architectures of recurrent neural networks (GRU, LSTM, SRNN). For this purpose, a binary classification was trained for the presence of an arbitrary vulnerability, and a multi-class model was trained for the identification of the exact vulnerability, which achieved an out-of-sample accuracy of 88% and 77%, respectively. Differences in the detection of different vulnerabilities were also observed, with non-vulnerable samples being detected with a particularly high precision of over 98%. Thus, the methodology presented allows an accurate detection of vulnerabilities, as well as a strong limitation of the analysis scope for further analysis steps.
Blockchain-IIoT integration into industrial processes promises greater security, transparency, and traceability. However, this advancement faces significant storage and scalability issues with existing blockchain technologies. Each peer in the blockchain network maintains a full copy of the ledger which is updated through consensus. This full replication approach places a burden on the storage space of the peers and would quickly outstrip the storage capacity of resource-constrained IIoT devices. Various solutions utilizing compression, summarization or different storage schemes have been proposed in literature. The use of cloud resources for blockchain storage has been extensively studied in recent years. Nonetheless, block selection remains a substantial challenge associated with cloud resources and blockchain integration. This paper proposes a deep reinforcement learning (DRL) approach as an alternative to solving the block selection problem, which involves identifying the blocks to be transferred to the cloud. We propose a DRL approach to solve our problem by converting the multi-objective optimization of block selection into a Markov decision process (MDP). We design a simulated blockchain environment for training and testing our proposed DRL approach. We utilize two DRL algorithms, Advantage Actor-Critic (A2C), and Proximal Policy Optimization (PPO) to solve the block selection problem and analyze their performance gains. PPO and A2C achieve 47.8% and 42.9% storage reduction on the blockchain peer compared to the full replication approach of conventional blockchain systems. The slowest DRL algorithm, A2C, achieves a run-time 7.2 times shorter than the benchmark evolutionary algorithms used in earlier works, which validates the gains introduced by the DRL algorithms. The simulation results further show that our DRL algorithms provide an adaptive and dynamic solution to the time-sensitive blockchain-IIoT environment.
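Casting block selection as a Markov decision process can be sketched with a toy environment. Everything specific below is hypothetical, since the abstract does not describe the paper's state, action or reward design: the state is the current block's size and an assumed access frequency, the action is keep (0) or offload to the cloud (1), and the reward trades saved storage against offloading frequently accessed blocks.

```python
class BlockSelectionEnv:
    """Toy episodic environment: one keep/offload decision per block."""

    def __init__(self, block_sizes, access_freq):
        self.block_sizes = block_sizes
        self.access_freq = access_freq  # assumed access probability in [0, 1]
        self.reset()

    def reset(self):
        self.index = 0
        self.offloaded = []
        return self._state()

    def _state(self):
        if self.index >= len(self.block_sizes):
            return None  # episode finished
        return (self.block_sizes[self.index], self.access_freq[self.index])

    def step(self, action):
        size = self.block_sizes[self.index]
        freq = self.access_freq[self.index]
        reward = 0.0
        if action == 1:  # offload this block to the cloud
            self.offloaded.append(self.index)
            reward = size * (1.0 - freq)  # hypothetical reward shaping
        self.index += 1
        done = self.index >= len(self.block_sizes)
        return self._state(), reward, done

    def storage_reduction(self):
        """Share of the ledger size that no longer resides on the peer."""
        moved = sum(self.block_sizes[i] for i in self.offloaded)
        return moved / sum(self.block_sizes)


def threshold_policy(state):
    """Stand-in for a learned policy: offload rarely accessed blocks."""
    _size, freq = state
    return 1 if freq < 0.5 else 0


env = BlockSelectionEnv([10, 20, 30, 40], [0.9, 0.1, 0.8, 0.2])
state, done, total_reward = env.reset(), False, 0.0
while not done:
    state, reward, done = env.step(threshold_policy(state))
    total_reward += reward

print("storage reduction:", env.storage_reduction())
print("episode reward:", total_reward)
```

An A2C or PPO agent would replace `threshold_policy` and learn the keep/offload decision from the reward signal instead of using a fixed rule.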