Aerosol particles play an important role in the climate system by absorbing and scattering radiation and influencing cloud properties. They are also one of the biggest sources of uncertainty for climate modeling. Many climate models do not include aerosols in sufficient detail. In order to achieve higher accuracy, aerosol microphysical properties and processes have to be accounted for. This is done in the ECHAM-HAM global climate aerosol model using the M7 microphysics model, but increased computational costs make it very expensive to run at higher resolutions or for a longer time. We aim to use machine learning to approximate the microphysics model at sufficient accuracy and reduce the computational cost by being fast at inference time. The original M7 model is used to generate input-output pairs on which a neural network is trained. By using a special logarithmic transform we are able to learn the variables' tendencies achieving an average score of . On a GPU we achieve a speed-up of 120 compared to the original model.
Correlation Clustering, also called the minimum cost Multicut problem, is the process of grouping data by pairwise similarities. It has proven to be effective on clustering problems where the number of classes is unknown. However, not only is the Multicut problem NP-hard, but an undirected graph G with n vertices representing single images also has at most n(n-1)/2 edges, which makes it challenging to apply correlation clustering to large datasets. In this work, we propose Multi-Stage Multicuts (MSM) as a scalable approach for image clustering. Specifically, we solve minimum cost Multicut problems across multiple distributed compute units. Our approach not only makes it possible to solve problem instances that are too large to fit into the shared memory of a single compute node, but it also achieves significant speedups while preserving the clustering accuracy. We evaluate our proposed method on the CIFAR10 …
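To give a feeling for the scaling issue mentioned above: a simple undirected graph on n vertices has at most n(n-1)/2 edges, so the number of pairwise similarities grows quadratically with the number of images. The short sketch below only illustrates this arithmetic; the image counts are illustrative assumptions and are not taken from the paper.

```python
def max_edges(n: int) -> int:
    """Maximum number of edges in a simple undirected graph with n vertices."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    # Hypothetical dataset sizes, chosen only to show the quadratic growth.
    for n in (100, 10_000, 60_000):
        print(f"{n:>6} images -> {max_edges(n):>13,} candidate edges")
```

Even at ten thousand images the complete graph already has close to fifty million edges, which is one way to see why a single shared-memory node can become a bottleneck.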
A fundamental and still largely unsolved question in the context of Generative Adversarial Networks is whether they are truly able to capture the real data distribution and, consequently, to sample from it. In particular, the multidimensional nature of image distributions leads to a complex evaluation of the diversity of GAN distributions. Existing approaches provide only a partial understanding of this issue, leaving the question unanswered. In this work, we introduce a loop-training scheme for the systematic investigation of observable shifts between the distributions of real training data and GAN generated data. Additionally, we introduce several bounded measures for distribution shifts, which are both easy to compute and to interpret. Overall, the combination of these methods allows an explorative investigation of innate limitations of current GAN algorithms. Our experiments on different data-sets and multiple state-of-the-art GAN architectures show large shifts between input and output distributions, showing that existing theoretical guarantees towards the convergence of output distributions appear not to be holding in practice.
An Empirical Investigation of Model-to-Model Distribution Shifts in Trained Convolutional Filters
(2021)
We present first empirical results from our ongoing investigation of distribution shifts in image data used for various computer vision tasks. Instead of analyzing the original training and test data, we propose to study shifts in the learned weights of trained models. In this work, we focus on the properties of the distributions of the predominantly used 3x3 convolution filter kernels. We collected and publicly provide a data set with over half a billion filters from hundreds of trained CNNs, using a wide range of data sets, architectures, and vision tasks. Our analysis shows interesting distribution shifts (or the lack thereof) between trained filters along different axes of meta-parameters, like data type, task, architecture, or layer depth. We argue that the observed properties are a valuable source for further investigation into a better understanding of the impact of shifts in the input data on the generalization abilities of CNN models and into novel methods for more robust transfer learning in this domain.
Engineering, construction and operation of complex machines involve a wide range of complicated, simultaneous tasks, which could potentially be automated. In this work, we focus on perception tasks in such systems, investigating deep learning approaches for multi-task transfer learning with limited training data. We show an approach that takes advantage of a technical system’s focus on selected objects and their properties. We create focused representations and simultaneously solve joint objectives in a system through multi-task learning with convolutional autoencoders. The focused representations are used as a starting point for the data-saving solution of the additional tasks. The efficiency of this approach is demonstrated using images and tasks of an autonomous circular crane with a grapple.
Over the past few years, Low Power Wide Area Networks (LPWAN) have emerged as key technologies for the connectivity of many applications in the Internet of Things (IoT), combining low data rates with strict cost and energy restrictions. LoRa/LoRaWAN in particular enjoys high visibility on today’s markets because of its good performance and its open community. Originally, LoRa was designed for operation within the Sub-GHz ISM bands for Industrial, Scientific and Medical applications. However, at the end of 2018, a LoRa-based solution in the 2.4 GHz ISM band was presented, promising higher bandwidths and higher data rates. Furthermore, it overcomes the limited duty cycle prescribed by the regulations in the ISM bands and therefore also opens doors to many novel application fields. Also, due to higher bandwidths and shorter transmission times, the use of alternative MAC layer protocols becomes very interesting, e.g., TDMA-based approaches. Within this paper, we propose a system architecture with 2.4 GHz LoRa components combining two aspects. On the one hand, we present a design and an implementation of a 2.4 GHz based LoRaWAN solution that can be seamlessly integrated into existing LoRaWAN back-hauls. On the other hand, we describe a deterministic setup using a Time Slotted Channel Hopping (TSCH) approach as defined in the IEEE 802.15.4-2015 standard for industrial applications. Finally, measurements show the performance of the system.
It seems to be a widespread impression that the use of strong cryptography inevitably imposes a prohibitive burden on industrial communication systems, at least inasmuch as real-time requirements in cyclic fieldbus communications are concerned. AES-GCM is a leading cryptographic algorithm for authenticated encryption, which protects data against disclosure and manipulations. We study the use of both hardware and software-based implementations of AES-GCM. By simulations as well as measurements on an FPGA-based prototype setup we gain and substantiate an important insight: for devices with a 100 Mbps full-duplex link, a single low-footprint AES-GCM hardware engine can deterministically cope with the worst-case computational load, i.e., even if the device maintains a maximum number of cyclic communication relations with individual cryptographic keys. Our results show that hardware support for AES-GCM in industrial fieldbus components may actually be very lightweight.
In the last decade, deep learning models for condition monitoring of mechanical systems increasingly gained importance. Most of the previous works use data of the same domain (e.g., bearing type) or of a large amount of (labeled) samples. This approach is not valid for many real-world scenarios from industrial use-cases where only a small amount of data, often unlabeled, is available.
In this paper, we propose, evaluate, and compare a novel technique based on an intermediate domain, which creates a new representation of the features in the data and abstracts the defects of rotating elements such as bearings. The results based on an intermediate domain related to characteristic frequencies show an improved accuracy of up to 32 % on small labeled datasets compared to the current state-of-the-art in the time-frequency domain.
Furthermore, a Convolutional Neural Network (CNN) architecture is proposed for transfer learning. We also propose and evaluate a new approach for transfer learning, which we call Layered Maximum Mean Discrepancy (LMMD). This approach is based on the Maximum Mean Discrepancy (MMD) but extends it by considering the special characteristics of the proposed intermediate domain. The presented approach outperforms the traditional combination of Hilbert–Huang Transform (HHT) and S-Transform with MMD on all datasets for unsupervised as well as for semi-supervised learning. In most of our test cases, it also outperforms other state-of-the-art techniques.
This approach is capable of using different types of bearings in the source and target domain under a wide variation of the rotation speed.
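For readers unfamiliar with the Maximum Mean Discrepancy on which LMMD builds, the following is a minimal sketch of the standard (biased) squared-MMD estimate between two feature sets. It is not the Layered MMD proposed in the paper; the Gaussian kernel, the bandwidth `sigma`, and the toy feature vectors are assumptions made for illustration.

```python
import math
from itertools import product


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(xs, ys, sigma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, sigma) for x, y in product(xs, ys))
    return total / (len(xs) * len(ys))


def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of the squared MMD between two samples."""
    return (mean_kernel(source, source, sigma)
            + mean_kernel(target, target, sigma)
            - 2.0 * mean_kernel(source, target, sigma))


if __name__ == "__main__":
    # Hypothetical 2-D features from a source and a shifted target domain.
    source = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    target = [(2.0, 2.0), (3.0, 2.0), (2.0, 3.0)]
    print("MMD^2 (same domain):   ", mmd_squared(source, source))
    print("MMD^2 (shifted domain):", mmd_squared(source, target))
```

In transfer learning, a quantity of this kind is typically added to the training loss so that source and target features are pushed towards similar distributions.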
The aim of this work is the application and evaluation of a method to visually detect markers at a distance of up to five meters and determine their real-world position. Combinations of cameras and lenses with different parameters were studied to determine the optimal configuration. Based on this configuration, camera images were taken after proper calibration. These images were then transformed into a bird's eye view using a homography matrix. The homography matrix was calculated from four point pairs as well as with coordinate transformations. The resulting images show the ground plane undistorted, making it possible to convert a pixel position into a real-world position with a conversion factor. The proposed approach helps to effectively create data sets for training neural networks for navigation purposes.
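The four-point-pair computation described above can be sketched as follows. This is a simplified illustration rather than the authors' implementation: it maps image pixels directly to ground-plane coordinates in meters (folding the conversion factor into the homography), and all point coordinates are made-up assumptions.

```python
def solve_linear(a, b):
    """Solve a square linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_points(src, dst):
    """Estimate the 3x3 homography (with h33 = 1) from four point pairs."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def apply_homography(h, point):
    """Map a pixel position to ground-plane coordinates."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return u, v


if __name__ == "__main__":
    # Hypothetical pixel corners of a ground rectangle seen in perspective ...
    pixels = [(100, 400), (540, 400), (420, 200), (220, 200)]
    # ... and the same corners on the ground plane, in meters.
    ground = [(0.0, 0.0), (2.0, 0.0), (2.0, 5.0), (0.0, 5.0)]
    h = homography_from_points(pixels, ground)
    print(apply_homography(h, (320, 300)))
```

With exactly four point pairs the system has eight equations for eight unknowns; with more pairs, a least-squares fit would normally be used instead.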
The applicability of characteristics of local magnetic fields for more precise determination of localization of subjects and/or objects in indoor environments, such as railway stations, airports, exhibition halls, showrooms, or shopping centers, is considered. An investigation has been carried out to find out whether and how low-cost magnetic field sensors and mobile robot platforms can be used to create maps that improve the accuracy and robustness of later navigation with smartphones or other devices.
Object Detection and Mapping with Unmanned Aerial Vehicles Using Convolutional Neural Networks
(2021)
Significant progress has been made in the field of deep learning through intensive research over the last decade. So-called convolutional neural networks are an essential component of this research. In this type of neural network, the mathematical convolution operator is used to extract characteristics or anomalies. The purpose of this work is to investigate the extent to which it is possible in certain initial settings to input aerial recordings and flight data of Unmanned Aerial Vehicles (UAVs) in the architecture of a neural network and to detect and map an object. Using the calculated contours or dimensions of the so-called bounding boxes, the position of the objects can be determined relative to the current UAV location.
Cryptographic protection of messages requires frequent updates of the symmetric cipher key used for encryption and decryption, respectively. Protocols of legacy IT security, like TLS, SSH, or MACsec, implement rekeying under the assumption that, first, application data exchange is allowed to stall occasionally and, second, dedicated control messages to orchestrate the process can be exchanged. In real-time automation applications, the first is generally prohibitive, while the second may induce problematic traffic patterns on the network. We present a novel seamless rekeying approach, which can be embedded into cyclic application data exchanges. Although the approach is agnostic to the underlying real-time communication system, we developed a demonstrator emulating the widespread industrial Ethernet system PROFINET IO and successfully used this rekeying mechanism.
It is important to minimize the unscheduled downtime of machines caused by outages of machine components in highly automated production lines. In machine tools such as grinding machines, the bearing inside the spindle is one of the most critical components. In the last decade, research has increasingly focused on fault detection of bearings. In addition, the rise of machine learning concepts has also intensified interest in this area. However, to date, there is no one-size-fits-all solution for predictive maintenance of bearings. Most research so far has only looked at individual bearing types at a time.
This paper gives an overview of the most important approaches for bearing-fault analysis in grinding machines. There are two main parts of the analysis presented in this paper. The first part presents the classification of bearing faults, which includes the detection of unhealthy conditions, the position of the error (e.g. at the inner or at the outer ring of the bearing) and the severity, which detects the size of the fault. The second part presents the prediction of remaining useful life, which is important for estimating the productive use of a component before a potential failure, optimizing the replacement costs and minimizing downtime.
Sustainable chemical processes should be designed to combine technological advantages and progress with lower safety risks and minimized environmental impact, for example through reduced consumption of raw materials, energy and water, and the avoidance of hazardous waste and pollution with toxic chemical agents. A number of novel eco-friendly chemical technologies have been developed in recent decades with the help of eco-innovation approaches and methods such as Life Cycle Analysis, Green Process Engineering, Process Intensification, Process Design for Sustainability, and others. An emerging approach to sustainable process design in process engineering builds on innovative solutions inspired by nature. However, the implementation of eco-friendly technologies often faces secondary ecological problems. The study postulates that the eco-inventive principles identified in natural systems make it possible to avoid secondary eco-problems and proposes to apply these principles for sustainable design in chemical process engineering. The research work critically examines how this approach differs from biomimetics, as it is commonly used for copying natural systems. The application of nature-inspired eco-design principles is illustrated with an example of a sustainable technology for extraction of nickel from pyrophyllite.
The proposed method includes identification and documentation of the elementary TRIZ inventive principles from the TRIZ body of knowledge, extension and enhancement of inventive principles by patents and technologies analysis, avoiding overlapping and redundant principles, classification and adaptation of principles to at least following categories such as working medium, target object, useful action, harmful effect, environment, information, field, substance, time, and space, assignment of the elementary inventive principles to the at least following underlying engineering domains such as universal, design, mechanical, acoustic, thermal, chemical, electromagnetic, intermolecular, biological, and data processing. The method includes classification of abstraction level of the elementary principles, definition of the statistical ranking of principles for different problem types, and specific engineering or non-technical domains, definition of strategies for selection of principles sets with high solution potential for predefined problems, automated semantic transformation of the elementary inventive principles into solution ideas, evaluation of automatically generated ideas and transformation of ideas to innovation or inventive concepts.
In this work, we evaluate two different image clustering objectives, k-means clustering and correlation clustering, in the context of Triplet Loss induced feature space embeddings. Specifically, we train a convolutional neural network to learn discriminative features by optimizing two popular versions of the Triplet Loss in order to study their clustering properties under the assumption of noisy labels. Additionally, we propose a new, simple Triplet Loss formulation, which shows desirable properties with respect to formal clustering objectives and outperforms the existing methods. We evaluate all three Triplet loss formulations for K-means and correlation clustering on the CIFAR-10 image classification dataset.
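As background for the Triplet Loss mentioned above, the sketch below shows the widely used margin-based formulation on squared Euclidean distances. The abstract does not say which two existing variants are compared or how the new formulation is defined, so this is only the textbook form; the margin value and the toy embeddings are assumptions.

```python
def squared_distance(x, y):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Margin-based triplet loss: pull the positive closer than the negative."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


if __name__ == "__main__":
    # Hypothetical 2-D embeddings of three images.
    anchor = (0.0, 0.0)
    same_class = (0.0, 1.0)
    other_class = (3.0, 0.0)
    # The negative is already far enough away, so the loss is zero.
    print(triplet_loss(anchor, same_class, other_class))
    # Swapping the roles violates the margin and yields a positive loss.
    print(triplet_loss(anchor, other_class, same_class))
```

A loss of zero means the triplet already satisfies the margin, which is the property that makes the resulting embedding space attractive for distance-based clustering such as k-means.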
Despite the success of convolutional neural networks (CNNs) in many computer vision and image analysis tasks, they remain vulnerable to so-called adversarial attacks: small, crafted perturbations in the input images can lead to false predictions. A possible defense is to detect adversarial examples. In this work, we show how analysis in the Fourier domain of input images and feature maps can be used to distinguish benign test samples from adversarial images. We propose two novel detection methods. Our first method employs the magnitude spectrum of the input images to detect an adversarial attack. This simple and robust classifier can successfully detect adversarial perturbations of three commonly used attack methods. The second method builds upon the first and additionally extracts the phase of Fourier coefficients of feature maps at different layers of the network. With this extension, we are able to improve adversarial detection rates compared to state-of-the-art detectors on five different attack methods. The code for the methods proposed in the paper is available at github.com/paulaharder/SpectralAdversarialDefense
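To illustrate what a magnitude spectrum is, the following sketch computes a naive 2D discrete Fourier transform of a tiny grayscale image and compares a clean image with one carrying a small high-frequency perturbation. It is not the detector from the paper or from the linked repository; the 4x4 images and the perturbation strength are assumptions chosen so the effect is easy to see.

```python
import cmath
import math


def magnitude_spectrum(image):
    """Magnitudes of the 2D DFT coefficients of a small grayscale image."""
    rows, cols = len(image), len(image[0])
    spectrum = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            total = 0j
            for r in range(rows):
                for c in range(cols):
                    angle = -2.0 * math.pi * (u * r / rows + v * c / cols)
                    total += image[r][c] * cmath.exp(1j * angle)
            out_row.append(abs(total))
        spectrum.append(out_row)
    return spectrum


if __name__ == "__main__":
    size = 4
    # Hypothetical clean image: constant gray.
    clean = [[0.5] * size for _ in range(size)]
    # Hypothetical perturbation: a faint checkerboard (highest frequency).
    perturbed = [[0.5 + 0.01 * (-1) ** (r + c) for c in range(size)]
                 for r in range(size)]
    clean_mag = magnitude_spectrum(clean)
    perturbed_mag = magnitude_spectrum(perturbed)
    # The images look almost identical, but the high-frequency bin differs.
    print("clean     [2][2]:", round(clean_mag[2][2], 6))
    print("perturbed [2][2]:", round(perturbed_mag[2][2], 6))
```

In practice a fast Fourier transform would be used instead of this quadratic-time loop, and the resulting spectra would be fed to a classifier.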
We demonstrate how to exploit group sparsity in order to bridge the areas of network pruning and neural architecture search (NAS). This results in a new one-shot NAS optimizer that casts the problem as a single-level optimization problem and does not suffer any performance degradation from discretizing the architecture.
Interpreting seismic data requires the characterization of a number of key elements such as the position of faults and main reflections, presence of structural bodies, and clustering of areas exhibiting a similar amplitude versus angle response. Manual interpretation of geophysical data is often a difficult and time-consuming task, complicated by lack of resolution and presence of noise. In recent years, approaches based on convolutional neural networks have shown remarkable results in automating certain interpretative tasks. However, these state-of-the-art systems usually need to be trained in a supervised manner, and they suffer from a generalization problem. Hence, it is highly challenging to train a model that can yield accurate results on new real data obtained with different acquisition, processing, and geology than the data used for training. In this work, we introduce a novel method that combines generative neural networks with a segmentation task in order to decrease the gap between annotated training data and uninterpreted target data. We validate our approach on two applications: the detection of diffraction events and the picking of faults. We show that when transitioning from synthetic training data to real validation data, our workflow yields superior results compared to its counterpart without the generative network.
Facial image manipulation is a generation task where the output face is shifted towards an intended target direction in terms of facial attributes and styles. Recent works have achieved great success in various editing techniques such as style transfer and attribute translation. However, current approaches focus either on pure style transfer or on the translation of predefined sets of attributes with restricted interactivity. To address this issue, we propose FacialGAN, a novel framework enabling simultaneous rich style transfers and interactive facial attribute manipulation. While preserving the identity of a source image, we transfer the diverse styles of a target image to the source image. We then incorporate the geometry information of a segmentation mask to provide fine-grained manipulation of facial attributes. Finally, a multi-objective learning strategy is introduced to optimize the loss of each specific task. Experiments on the CelebA-HQ dataset, with CelebAMask-HQ as semantic mask labels, show our model’s capacity to produce visually compelling results in style transfer, attribute manipulation, diversity and face verification. For reproducibility, we provide an interactive open-source tool to perform facial manipulations, and the PyTorch implementation of the model.
In this preliminary report, we present a simple but very effective technique to stabilize the training of CNN-based GANs. Motivated by recently published methods using frequency decomposition of convolutions (e.g., Octave Convolutions), we propose a novel convolution scheme to stabilize the training and reduce the likelihood of a mode collapse. The basic idea of our approach is to split convolutional filters into additive high and low frequency parts, while shifting weight updates from low to high during the training. Intuitively, this method forces GANs to learn low-frequency coarse image structures before descending into fine (high-frequency) details. Our approach is orthogonal and complementary to existing stabilization methods and can simply be plugged into any CNN-based GAN architecture. First experiments on the CelebA dataset show the effectiveness of the proposed method.
Generative adversarial networks (GANs) provide state-of-the-art results in image generation. However, despite being so powerful, they still remain very challenging to train. This is in particular caused by their highly non-convex optimization space leading to a number of instabilities. Among them, mode collapse stands out as one of the most daunting ones. This undesirable event occurs when the model can only fit a few modes of the data distribution, while ignoring the majority of them. In this work, we combat mode collapse using second-order gradient information. To do so, we analyse the loss surface through its Hessian eigenvalues, and show that mode collapse is related to the convergence towards sharp minima. In particular, we observe how the eigenvalues of the are directly correlated with the occurrence of mode collapse. Finally, motivated by these findings, we design a new optimization algorithm called nudged-Adam (NuGAN) that uses spectral information to overcome mode collapse, leading to empirically more stable convergence properties.
Generative adversarial networks are the state-of-the-art approach towards learned synthetic image generation. Although early successes were mostly unsupervised, this trend has gradually been superseded by approaches based on labelled data. These supervised methods allow a much finer-grained control of the output image, offering more flexibility and stability. Nevertheless, the main drawback of such models is the necessity of annotated data. In this work, we introduce a novel framework that benefits from two popular learning techniques, adversarial training and representation learning, and takes a step towards unsupervised conditional GANs. In particular, our approach exploits the structure of a latent space (learned by the representation learning) and employs it to condition the generative model. In this way, we break the traditional dependency between condition and label, substituting the latter with unsupervised features coming from the latent space. Finally, we show that this new technique is able to produce samples on demand while keeping the quality of its supervised counterpart.
Most eCommerce applications, such as web shops, have millions of products. In this context, the identification of similar products is a common sub-task, which can be utilized in the implementation of recommendation systems, product search engines and internal supply logistics. By providing this data set, we aim to boost the evaluation of machine learning methods for predicting the category of retail products from tuples of images and descriptions.
Transformer models have recently attracted much interest from computer vision researchers and have since been successfully employed for several problems traditionally addressed with convolutional neural networks. At the same time, image synthesis using generative adversarial networks (GANs) has drastically improved over the last few years. The recently proposed TransGAN is the first GAN using only transformer-based architectures and achieves competitive results when compared to convolutional GANs. However, since transformers are data-hungry architectures, TransGAN requires data augmentation, an auxiliary super-resolution task during training, and a masking prior to guide the self-attention mechanism. In this paper, we study the combination of a transformer-based generator and a convolutional discriminator and successfully remove the need for the aforementioned design choices. We evaluate our approach by conducting a benchmark of well-known CNN discriminators, ablate the size of the transformer-based generator, and show that combining both architectural elements into a hybrid model leads to better results. Furthermore, we investigate the frequency spectrum properties of generated images and observe that our model retains the benefits of an attention-based generator.
This paper describes a thorough analysis of using PPO to learn kick behaviors with simulated NAO robots in the simspark environment. The analysis includes an investigation of the influence of PPO hyperparameters, network size, training setups and performance in real games. We believe that we improve the state of the art mainly in four points: first, the kicks are learned with a toed version of the NAO robot; second, we improve the reliability with respect to kickable area and avoidance of falls; third, the kick can be parameterized with desired distance and direction as input to the deep network; and fourth, the approach allows the learned behavior to be integrated seamlessly into soccer games. The result is a significant improvement of the general level of play.
Autonomous driving is disrupting the automotive industry as we know it today. For this, fail-operational behavior is essential in the sense, plan, and act stages of the automation chain in order to handle safety-critical situations on its own, which is currently not achieved with state-of-the-art approaches. The European ECSEL research project PRYSTINE realizes Fail-operational Urban Surround perceptION (FUSION) based on robust Radar and LiDAR sensor fusion and control functions in order to enable safe automated driving in urban and rural environments. This paper showcases some of the key exploitable results (e.g., novel Radar sensors, innovative embedded control and E/E architectures, pioneering sensor fusion approaches, AI-controlled vehicle demonstrators) achieved until its final year 3.
A new approach for determining the distance between two or more smartphones is presented. The position of each smartphone in a room or in open terrain is determined relative to a reference point (spatial anchor point). Via a central server, the smartphones exchange their positions relative to the reference point and can compute their mutual distances from them. If the distance between two smartphones falls below a threshold (< 2 m), a corresponding alert is signaled on the smartphones.
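The distance check on shared anchor-relative positions can be sketched in a few lines. This is a minimal illustration, assuming each smartphone reports its position as Cartesian (x, y, z) coordinates in meters relative to the common spatial anchor; the device identifiers, the coordinate values, and the function name `close_pairs` are hypothetical.

```python
import math
from itertools import combinations

THRESHOLD_M = 2.0  # Alert threshold from the abstract: less than 2 m.


def close_pairs(positions, threshold=THRESHOLD_M):
    """Return (id_a, id_b, distance) for every pair closer than the threshold."""
    alerts = []
    for (id_a, pos_a), (id_b, pos_b) in combinations(sorted(positions.items()), 2):
        distance = math.dist(pos_a, pos_b)
        if distance < threshold:
            alerts.append((id_a, id_b, distance))
    return alerts


if __name__ == "__main__":
    # Hypothetical positions relative to the spatial anchor, in meters.
    positions = {
        "phone_a": (0.0, 0.0, 0.0),
        "phone_b": (1.5, 0.0, 0.0),
        "phone_c": (5.0, 5.0, 0.0),
    }
    for id_a, id_b, distance in close_pairs(positions):
        print(f"alert: {id_a} and {id_b} are {distance:.2f} m apart")
```

Because every position is expressed relative to the same anchor, the pairwise distance does not depend on where the anchor itself is located.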
To limit global warming to 1.5 degrees Celsius, as agreed in the Paris Climate Agreement, the energy transition must be pushed forward much more vigorously than before. The showcase project C/sells, located in the largest of the SINTEG model regions, has taken on this challenge. Over four years, 56 partners from the energy industry, science, and politics in Baden-Württemberg, Bavaria, and Hesse worked on establishing a cellular energy system. They developed model solutions for a successful energy transition. In more than 30 demonstration cells and in nine participation cells, the so-called C/sells-Citys, it was demonstrated how an information system enables the intelligent organization of electricity supply grids and the regionalized trading of energy and flexibilities.