OPUS 4 | 004 Informatik

Garbage in, Garbage out: How does ambiguity in data affect state-of-the-art pedestrian detection? (2024)

Scholz, Jannes

This thesis investigates the critical role of data quality in computer vision, particularly in pedestrian detection. The proliferation of deep learning methods has emphasised the importance of large datasets for model training, but the quality of these datasets is equally crucial. Ambiguity in annotations, arising from factors like mislabelling, inaccurate bounding box geometry and annotator disagreements, poses significant challenges to the reliability and robustness of pedestrian detection models and their evaluation. This work explores the effects of ambiguous data on model performance, with a focus on identifying and separating ambiguous instances by means of an ambiguity measure based on annotator estimations of object visibility and identity. Careful experimentation and analysis revealed trade-offs between data cleanliness and representativeness, and between noise removal and retention of valuable data, and clarified their impact on performance metrics like the log average miss-rate, recall and precision. Furthermore, a strong correlation between ambiguity and occlusion was discovered, with higher ambiguity corresponding to greater occlusion prevalence. The EuroCity Persons (ECP) dataset served as the primary dataset and contained a significant proportion of ambiguous instances: approximately 8.6% in the training set and 7.3% in the validation set. Results demonstrated that removing ambiguous data improves the log average miss-rate, particularly by reducing false positive detections. Augmenting the training data with samples from neighbouring classes enhanced recall but diminished precision. Correcting erroneous false positives and false negatives significantly changes model evaluation results, as evidenced by shifts in the ECP leaderboard rankings.
By systematically addressing ambiguity, this thesis lays the foundation for enhancing the reliability of computer vision systems in real-world applications, motivating the prioritisation of developing robust strategies to identify, quantify and address ambiguity.
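
The log average miss-rate mentioned in this abstract can be illustrated with a short sketch. It assumes the common definition from the pedestrian detection literature: the geometric mean of the miss rate sampled at nine reference points, evenly log-spaced between 0.01 and 1 false positives per image (FPPI). The function name, the input format and the fallback miss rate of 1.0 for unreached reference points are assumptions for illustration, not details taken from the thesis or from the ECP evaluation code.

```python
import math
from bisect import bisect_right


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of the miss rate at log-spaced FPPI reference points.

    fppi must be sorted in ascending order; miss_rate[i] is the miss rate
    of the detector at the operating point with fppi[i].
    """
    # Reference points evenly spaced in log10 between 1e-2 and 1e0.
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]
    log_sum = 0.0
    for ref in refs:
        # Last operating point whose FPPI does not exceed the reference.
        idx = bisect_right(fppi, ref) - 1
        # Assumption: if the curve never reaches this FPPI, count it as a total miss.
        mr = miss_rate[idx] if idx >= 0 else 1.0
        # Clamp to avoid log(0) for a perfect detector.
        log_sum += math.log(max(mr, 1e-10))
    return math.exp(log_sum / len(refs))
```

Because the metric averages in log space, removing false positives shifts the whole curve towards lower FPPI and can lower the value at every reference point at once.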

Cross-Cloud Atomicity Using a Two-Phase Commit Protocol [Cross-Cloud-Atomarität mittels eines 2-Phasen-Commit-Protokolls] (2024)

Rädler, Alexander

A growing number of companies are adopting a cross-cloud strategy, which lets them manage and operate their applications and data efficiently across multiple cloud platforms. Maintaining consistency and atomicity between the cloud platforms is a major challenge. This thesis presents a solution for achieving cross-cloud atomicity that is based on the two-phase commit protocol (2PC). It explains how the 2PC protocol works and briefly covers extensions of and alternatives to the protocol. It also discusses alternative approaches that can be considered for achieving cross-cloud atomicity. The result is a comprehensive overview of the topic and of possible solutions to this challenge.
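
A minimal sketch of the two-phase commit flow may help here. It assumes in-memory participants that stand in for resources on different cloud platforms; `Participant` and `two_phase_commit` are hypothetical names, and the sketch leaves out the write-ahead logging, timeouts and crash recovery that a real 2PC implementation needs. It is not the solution developed in the thesis.

```python
class Participant:
    """Hypothetical stand-in for a transactional resource on one cloud platform."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator: commit everywhere only if every participant votes yes."""
    # Phase 1 (voting): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (completion): a single "no" vote aborts the whole transaction.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

For example, `two_phase_commit([Participant("cloud-a"), Participant("cloud-b", can_commit=False)])` returns `False` and leaves both participants in the `aborted` state, which is the all-or-nothing behaviour that atomicity requires.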

Analysis of Various Technologies for Improving Website Performance with a Focus on Island Architecture [Analyse verschiedener Technologien zur Steigerung der Webseiten-Performanz mit Schwerpunkt Island Architektur] (2024)

Schmidt, Philipp

JavaScript frameworks (JSF) have long been prominent in web development. New JSF are developed every year to solve specific problems. In recent years, a trend has emerged of paying closer attention to the performance of the resulting website when choosing a JSF. The aim is to reduce the amount of JavaScript on the website or to eliminate it entirely. A particularly recent approach is the "Island Architecture", which was first proposed in 2019. This thesis compares the performance of the most widely used JSF and of the best-performing JSF with that of the JSF "Astro", which supports the "Island Architecture" natively. The focus is on comparing website performance, but efficiency and simplicity during development are also examined. The goal of this work is to examine candidate frameworks that can increase efficiency and productivity both for the user and during development.

A Dynamic Framework for Evaluating DTLS using Network Simulator (ns-3) (2024)

Abraham, Joji

This thesis focuses on the development and implementation of a Datagram Transport Layer Security (DTLS) communication framework within the ns-3 network simulator, specifically targeting the LoRaWAN model network. The primary aim is to analyse the behaviour and performance of DTLS protocols across different network conditions within a LoRaWAN context. The key aspects of this work include the following. Utilization of ns-3: This thesis leverages ns-3’s capabilities as a powerful discrete event network simulator. This platform enables the emulation of diverse network environments, characterized by varying levels of latency, packet loss, and bandwidth constraints. Emulation of Network Challenges: The framework specifically addresses unique challenges posed by certain network configurations, such as duty cycle limitations. These constraints, which limit the time allocated for data transmission by each device, are crucial in understanding the real-world performance of DTLS protocols. Testing in Multi-client-server Scenarios: A significant feature of this framework is its ability to test DTLS performance in complex scenarios involving multiple clients and servers. This is vital for assessing the behaviour of a protocol under realistic network conditions. Realistic Environment Simulation: By simulating challenging network conditions, such as congestion, limited bandwidth, and resource constraints, the framework provides a realistic environment for thorough evaluation. This allows for a comprehensive analysis of DTLS in terms of security, performance, and scalability. Overall, this thesis contributes to a deeper understanding of DTLS protocols by providing a robust tool for their evaluation under various and challenging network conditions.
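
The duty cycle limitation mentioned in this abstract can be made concrete with a small calculation. The sketch assumes the usual reading of a duty cycle limit: a device may occupy the channel for at most a fixed fraction of the time, so each transmission is followed by a proportional silent period. The function name and the example values (a 1% limit and 50 ms of airtime) are assumptions for illustration and are not taken from the thesis.

```python
def min_off_time(airtime_s, duty_cycle):
    """Minimum silent period after a transmission of airtime_s seconds.

    duty_cycle is the allowed fraction of time on air, e.g. 0.01 for 1%.
    """
    if not 0 < duty_cycle <= 1:
        raise ValueError("duty_cycle must be in (0, 1]")
    # airtime / (airtime + off_time) <= duty_cycle, solved for off_time.
    return airtime_s * (1 / duty_cycle - 1)


# Assumed example: a 50 ms packet under a 1% limit forces about 4.95 s of silence.
wait_s = min_off_time(0.05, 0.01)
```

A handshake that needs several round trips pays this waiting time after every uplink, which is why duty cycle limits weigh heavily on handshake duration in such networks.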

Conceptualization and implementation of automated optimization methods for private 5G networks (2023)

Hadian, Seyedali

Today’s companies are adjusting to new connectivity realities. New applications require more bandwidth, lower latency, and higher reliability as industries become more distributed and autonomous. Private 5th Generation (5G) networks, known as 5G Non-Public Networks (5G-NPN), are a novel type of 3rd Generation Partnership Project (3GPP)-based 5G network that can deliver seamless and dedicated wireless access for a particular industrial use case by meeting these application requirements. To meet these requirements, several radio-related aspects and network parameters should be considered. In many cases, the behavior of the link may vary based on wireless conditions, available network resources, and User Equipment (UE) requirements. Furthermore, optimizing these networks can be a complex task due to the large number of network parameters and key performance indicators (KPIs) that need to be considered. For these reasons, traditional solutions and static network configuration are either unaffordable or simply impossible. Although the literature contains papers that address several optimization methods for cellular networks in industrial scenarios, more insight into these existing but complex or unknown methods is needed. In this thesis, a series of optimization methods were implemented to deliver an optimal configuration for a 5G private network. To facilitate this, a testing system was implemented. This system enables remote control over the UE and the 5G network, establishment of a test environment, extraction of relevant KPI reports from both the UE and network sides, assessment of test results and KPIs, and effective use of the optimization and sampling techniques. The research highlights the advantages of automated testing using one-factor-at-a-time (OFAT) sampling, Simulated Annealing, and Random Forest Regressor methods.
With OFAT, a common sampling method, a sensitivity analysis revealed the impact of each single-parameter variation on network performance. Simulated Annealing found an optimal solution with a mean squared error (MSE) of roughly 10. The Random Forest Regressor showed a significant advantage over Simulated Annealing in time efficiency, owing to its machine-learning capability. Additionally, the results suggest that a larger dataset or other machine-learning techniques might yield a more accurate solution.
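
A generic sketch of simulated annealing shows the kind of search loop this abstract refers to. It assumes a scalar cost function, such as a squared error between measured and target KPIs, and a geometric cooling schedule. All names, the schedule and the toy objective are assumptions for illustration; the thesis does not state which variant or parameters it used.

```python
import math
import random


def simulated_annealing(cost, start, neighbor, steps=2000,
                        t_start=1.0, t_end=1e-3, seed=0):
    """Minimise cost() by a random local search that sometimes accepts worse moves."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost


# Toy objective standing in for a KPI error: squared distance to a target of 3.
toy_cost = lambda x: (x - 3.0) ** 2
toy_neighbor = lambda x, rng: x + rng.uniform(-0.5, 0.5)
best_x, best_err = simulated_annealing(toy_cost, 10.0, toy_neighbor)
```

In a live network every call to `cost` would mean running a test and collecting KPI reports. That expense is the reason a trained surrogate such as a Random Forest Regressor can save time.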

Processing Image Data Using Autoencoders [Aufbereitung von Bilddaten mittels Autoencodern] (2023)

Zimmermann, Marco Enrique

This thesis optimizes the image-processing pipeline for documents using a deliberately simple neural network and editing operations. The goal is to prepare photographed documents for printing so that the text is legible, straight, and undistorted, and so that disturbances are filtered out. Provided as an API, the system can process images of documents of any size and font size. Whereas Tesseract finds no characters in an image taken at an angle under poor conditions, the processed version of the same image reaches a character error rate of 0.9%.
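
A character error rate is commonly computed as the edit distance between the recognized text and a reference transcription, divided by the length of the reference. The sketch below follows that common definition. Whether the thesis computed its 0.9% in exactly this way is an assumption, and the function names are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalised by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)
```

With this definition, one wrong character in a reference of four characters gives a rate of 0.25, and a rate of 0.9% corresponds to roughly one error per 111 reference characters.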

Development of a Compilation Manager for a Static Type Checker for the Erlang Programming Language [Entwicklung eines Compilation-Managers für einen statischen Typchecker für die Programmiersprache Erlang] (2023)

Rösch, Markus

The complexity of software projects has risen steadily in recent years. To meet the simultaneously rising demands on code quality, even originally dynamically typed programming languages are increasingly adopting static typing. This can take the form of external tools that check the code for type safety in addition to the actual compiler, or alternatively of extensions to the compilers themselves that anchor support for static typing directly in the language. The goal of the etylizer project is first to provide such an external tool for the Erlang programming language and, in the long term, to become part of the compiler toolchain. In this thesis, the type checker is extended with the ability to verify entire Erlang projects. To this end, the internal symbol table that etylizer uses to resolve references to functions and types from other modules is extended first. The implementation of the symbol table is adapted so that, at runtime, it is extended with all symbols required by the module currently being checked. To keep the running time in check, an algorithm is developed that detects the dependencies between the source files of the Erlang project and uses them to decide which files have changed since the last run and therefore need to be checked again.
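
The idea of rechecking only what a change can affect can be sketched as a graph traversal. The sketch assumes that the dependency graph is already known as a mapping from each file to the files it refers to, and that the set of changed files has been determined elsewhere, for example by comparing content hashes. The function name and data layout are hypothetical and do not reproduce the algorithm implemented in etylizer.

```python
from collections import deque


def files_to_recheck(dependencies, changed):
    """Return the changed files plus every file that transitively depends on them.

    dependencies maps each source file to the set of files it refers to.
    """
    # Invert the graph: for each file, who depends on it?
    dependents = {}
    for source, targets in dependencies.items():
        for target in targets:
            dependents.setdefault(target, set()).add(source)

    to_check = set(changed)
    queue = deque(changed)
    # Breadth-first walk along the "is used by" edges.
    while queue:
        current = queue.popleft()
        for user in dependents.get(current, ()):
            if user not in to_check:
                to_check.add(user)
                queue.append(user)
    return to_check
```

If `a.erl` uses `b.erl` and `b.erl` uses `c.erl`, a change to `c.erl` marks all three files for rechecking, while an unrelated file is skipped.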

Evaluation of the Lightweight Machine to Machine Communication Protocol [Evaluation des Lightweight Machine to Machine Kommunikationsprotokolls] (2023)

Kloos, Levi

This thesis deals with the Lightweight Machine to Machine communication protocol, which was developed for the Internet of Things. It examines how the protocol works and how it can be used. The thesis also aims to show whether and how Lightweight Machine to Machine over Long Term Evolution for Machines is suitable for resource-constrained applications. To answer this question, the protocol was examined on the basis of its specification and its software implementations. A test system was then designed and tested for its runtime behavior and its energy consumption. The evaluation of the protocol showed that it offers many useful functions tailored to Internet of Things devices and implements them compactly and comprehensibly. Because the protocol is still relatively young, it poses challenges at various points. The tests of the test system showed that Lightweight Machine to Machine is suitable for resource-constrained applications under certain conditions.

Generalization of Multidimensional Lighting Parameters Using Deep Learning [Generalisierung von mehrdimensionalen Lichtsetzungsparametern mittels Deep Learning] (2023)

Gampe, Stefano

Light has always been an aid to orientation for humankind. The interplay between bright and shaded surfaces is what makes spatial perception possible in the first place. Beyond that, localizing light sources offers great potential for numerous fields of application, such as augmented reality. The goal of this thesis was to develop a neural network that parameterizes a lighting setup with the help of a self-generated synthetic dataset. State-of-the-art networks from digital image processing were used for this purpose. At the start of the work, the properties of the lighting setup had to be extracted. A further fundamental requirement was to work through the relevant deep learning background. To generate the synthetic dataset, a dedicated framework based on the Blender engine was developed. The generated images and metadata were then used to train, validate, and evaluate modified VGG16 and ResNet50 networks. One finding is that artificially generated data are suitable for training a neural network. It was also shown that lighting parameters can be extracted with the help of deep learning. Follow-up research could use the proposed approach to improve the lighting of augmented reality applications.

Classification Of Impersonated Domains Using Rule-Based and Machine Learning Algorithms and Brand Abuse Monitoring (2022)

Miskin, Amit Ravindrasa

Organizations striving for long-term success need a positive brand image, which has direct implications for the business. In the face of rising cyber threats and intense competition, maintaining a threat-free domain is an important part of preserving that image in today's internet world. For many companies, domain names are near-synonyms for brand names. There are likely thousands of domains that try to impersonate big companies in a bid to trap unsuspecting users, who usually fall prey to attacks such as phishing or watering hole attacks. Because domain names are important for organizations running their business online, they are also particularly vulnerable to misuse by malicious actors. So how can an organization ensure that its domain name is protected while also protecting its brand identity? Brand monitoring, for example, may help. The term "Brand Monitoring" refers to keeping tabs on an organization's brand performance, reception, and overall online presence through various online channels and platforms [1]. As the threat environment has expanded, so has the need to keep one's domain clear of any links to malicious activities. Since attackers target the domain names of organizations and lure unsuspecting users to malicious websites, domain monitoring becomes an important aspect. Another important aspect of brand abuse is how attackers leverage brand logos to create fake and phishing web pages. In this Master's thesis, we address the classification of impersonated domains using rule-based and machine learning algorithms, as well as the automation of domain monitoring. We first use a rule-based classifier and machine learning algorithms to classify the gathered domains into two buckets: "Parked" and "Non-Parked".
In the project's second phase, we will deploy object detection models (Scale Invariant Feature Transform - SIFT and Multi-Template Matching – MTM) to detect brand logos from the domains of interest.

Visualization of Python Programs in the Visual Studio Code Development Environment [Visualisierung von Python Programmen in der Entwicklungsumgebung Visual Studio Code] (2023)

Velten, Marco

Learning a new programming language can sometimes be hard for beginners, even for languages such as Python that are known for being beginner-friendly. Even when the syntax of a Python program is quickly understood, it is often not immediately apparent how the code behind the program works. Beginners can also reach their limits when trying to understand the flow of a program from the program code alone, because the text that makes up the code can convey only up to a point how or what exactly is happening. To convey the flow of a program better, the code is often visualized, for example with diagrams. Visual elements can also provide additional support alongside the code. This thesis deals with the visualization of Python programs in the Visual Studio Code development environment in order to support programming beginners and students in learning the Python programming language. The development of the visualization includes creating a Visual Studio Code extension that, among other things, uses the Debug Adapter Protocol to communicate with the Python debugger.

Evaluation and Integration of Composable Commerce Architectures [Evaluation und Integration von Composable Commerce Architekturen] (2022)

Studer, Niklas

Complex e-commerce systems nowadays have to reach the market ever faster and adapt to it. SaaS services make this possible by allowing best-of-breed solutions to be used. The monolithic approach of most e-commerce systems is no longer suitable for these applications. The composable commerce approach is intended to remedy this. The approach requires an integration solution. The goal of this thesis is to evaluate integration solutions and to compare them using integration-layer prototypes. Two integration solutions are selected and implemented as prototypes. The first prototype uses Apache Camel in a Spring Boot server. The second prototype uses AWS's own services for the integration. Finally, both prototypes are checked for their performance with a load test.

SBTMS: Scalable Blockchain Trust Management System for VANET (2021)

Ghovanlooy Ghajar, Fatemeh ; Salimi Sratakhti, Javad ; Sikora, Axel

With many advances in sensor technology and the Internet of Things, the Vehicle Ad Hoc Network (VANET) is entering a new generation. VANET’s current technical challenges are deploying a decentralized architecture and protecting privacy. Because Blockchain is decentralized and distributed and offers mass storage and resistance to manipulation, this paper designs a new decentralized architecture using Blockchain technology, called Blockchain-based VANET. Blockchain-based VANET can effectively resolve centralization problems and mutual distrust between VANET units. To achieve this, the blockchain must be scalable enough to run for VANET. In this system, our focus is on the reliability of incoming messages on the network. Vehicles check the validity of received messages using the proposed Bayesian formula for the trust management system and some information saved in the Blockchain. Then, based on the validation result, the vehicle computes a rate for each message type and message source vehicle. Vehicles upload the computed rates to Roadside Units (RSUs) so that the net reliability value can be calculated. Finally, the RSUs use a sharding consensus mechanism to generate blocks that include the net reliability value as a transaction. In this system, all RSUs collaboratively maintain the latest updated Blockchain. Our experimental results show that the proposed system is effective, scalable and dependable in data gathering, computing, organization, and retrieval of trust values in VANET.

Security Audit of a Blockchain-Based Industrial Application Platform (2021)

Stodt, Jan ; Schönle, Daniel ; Reich, Christoph ; Ghovanlooy Ghajar, Fatemeh ; Welte, Dominik ; Sikora, Axel

In recent years, both the Internet of Things (IoT) and blockchain technologies have been highly influential and revolutionary. IoT enables companies to embrace Industry 4.0, the Fourth Industrial Revolution, which benefits from communication and connectivity to reduce cost and to increase productivity through sensor-based autonomy. These automated systems can be further refined with smart contracts that are executed within a blockchain, thereby increasing transparency through continuous and indisputable logging. Ideally, the level of security for these IoT devices should be very high, as they are specifically designed for this autonomous and networked environment. This paper discusses a use case of a company with legacy devices that wants to benefit from the features and functionality of blockchain technology. In particular, the implications of retrofit solutions are analyzed. The use of the BISS:4.0 platform is proposed as the underlying infrastructure. BISS:4.0 is intended to integrate blockchain technologies into existing enterprise environments. Furthermore, a security analysis of IoT and blockchain is presented, in which attacks and countermeasures are identified and applied to this use case.

Development of an Interactive E-Learning Application for the P2P CAN Algorithm [Entwicklung einer interaktiven E-Learning Anwendung zum P2P-CAN-Algorithmus] (2021)

Thaller, Emanuel

This thesis deals with the development of an e-learning application for the peer-to-peer algorithm Content Addressable Network (CAN). A CAN is a distributed hash table for the decentralized management of data in the form of key-value pairs. The purpose of the application is to prepare and present the basic processes in the CAN in a didactically sound way. The application simulates a CAN and offers a graphical interface for interaction. It is intended to be used as a supporting tool in the Advanced Networking module of the computer science master's program at Hochschule Offenburg.
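
The core idea of a CAN can be sketched in a few lines: keys are hashed to points in a coordinate space, the space is partitioned into zones, and each node stores the keys whose points fall into its zone. The sketch below assumes a two-dimensional unit square and a join that halves an existing zone. All names are hypothetical, and the greedy neighbor-to-neighbor routing of a real CAN is left out. It does not reproduce the e-learning application described here.

```python
import hashlib


def key_to_point(key, dims=2):
    """Hash a key to a point in the unit hypercube [0, 1)^dims."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use 4 bytes of the digest per coordinate.
    return tuple(int.from_bytes(digest[4 * i:4 * i + 4], "big") / 2 ** 32
                 for i in range(dims))


class Zone:
    """Axis-aligned box [lo, hi) of the coordinate space owned by one node."""

    def __init__(self, owner, lo, hi):
        self.owner, self.lo, self.hi = owner, tuple(lo), tuple(hi)

    def contains(self, point):
        return all(l <= p < h for l, p, h in zip(self.lo, point, self.hi))

    def split(self, new_owner, axis=0):
        """A joining node takes over the upper half of this zone along one axis."""
        mid = (self.lo[axis] + self.hi[axis]) / 2
        new_lo = list(self.lo)
        new_lo[axis] = mid
        new_zone = Zone(new_owner, new_lo, self.hi)
        shrunk_hi = list(self.hi)
        shrunk_hi[axis] = mid
        self.hi = tuple(shrunk_hi)
        return new_zone


def owner_of(zones, key):
    """Find the node responsible for a key by locating its point's zone."""
    point = key_to_point(key)
    for zone in zones:
        if zone.contains(point):
            return zone.owner
    raise LookupError("no zone covers the point")
```

Starting from a single zone that covers the whole square, each call to `split` models a node joining, and `owner_of` then tells which node would store a given key-value pair.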

SmartPred: Unsupervised Hard Disk Failure Detection (2020)

Keuper, Janis ; Rombach, Philipp

Due to the rapidly increasing storage consumption worldwide, as well as the expectation of continuous availability of information, the complexity of administration in today’s data centers is growing continuously. Integrated techniques for monitoring hard disks can increase the reliability of storage systems. However, these techniques often lack intelligent data analysis to perform predictive maintenance. To solve this problem, machine learning algorithms can be used to detect potential failures in advance and prevent them. In this paper, an unsupervised model for predicting hard disk failures based on Isolation Forest is proposed. The resulting method can deal with highly imbalanced datasets, as the experiment on the Backblaze benchmark dataset demonstrates.

Python Workflows on HPC Systems (2020)

Keuper, Janis ; Straßel, Dominik ; Reusch, Philipp

The recent successes and widespread application of compute-intensive machine learning and data analytics methods have been boosting the usage of the Python programming language on HPC systems. While Python provides many advantages for users, it has not been designed with a focus on multi-user environments or parallel programming, which makes it quite challenging to maintain stable and secure Python workflows on an HPC system. In this paper, we analyze the key problems induced by the usage of Python on HPC clusters and sketch appropriate workarounds for efficiently maintaining multi-user Python software environments, securing and restricting the resources of Python jobs, and containing Python processes, while focusing on Deep Learning applications running on GPU clusters.

Analysis of the Deep Reinforcement Learning Algorithm PPO2 in the RoboCup Environment [Analyse des Deep Reinforcement Learning Algorithmus PPO2 in der RoboCup Umgebung] (2020)

Spitznagel, Martin

The team "magmaOffenburg" has taken part in the RoboCup 3D Simulation League since 2009. The quality of the learned motion sequences is a central factor in performing well in tournaments. Until now, genetic algorithms have been used to learn and optimize a wide variety of actions. In this thesis, the deep reinforcement learning algorithm Proximal Policy Optimization is used to learn particular movements. To gain an understanding of its influential parameters, factors such as parallel learning, hyperparameters, network topology, size of the observation space, and asynchronous learning are evaluated using the kick from a standing position. Based on the results of the evaluation, the learned kick was significantly improved and replaced its genetically learned counterpart in play. In addition, the findings were evaluated on learning to walk, and relationships and differences between the two learning problems were identified.

Detection of Spamming Users in Crowdsourcing Tasks (2020)

Bystrow, Dennis

Annotated training data is essential for supervised learning methods. Human annotation is costly and laborious, especially if a dataset consists of hundreds of thousands of samples and annotators need to be hired. Crowdsourcing emerged as a solution that makes it easier to get access to large numbers of human annotators. Introducing paid external annotators, however, introduces malevolent annotations, both intentional and unintentional. Both forms of malevolent annotation have negative effects on further usage of the data and can be summarized as spam. This work explores different approaches to post-hoc detection of spamming users and examines which kinds of spam each approach can detect. A manual annotation checking process resulted in the creation of a small user spam dataset, which is used in this thesis. Finally, an outlook on future improvements of these approaches is given.
