Refine
Year of publication
- 2021 (82) (remove)
Document Type
- Conference Proceeding (46)
- Article (reviewed) (24)
- Letter to Editor (4)
- Book (3)
- Part of a Book (2)
- Doctoral Thesis (2)
- Report (1)
Conference Type
- Konferenzartikel (41)
- Konferenz-Abstract (5)
Language
- English (82) (remove)
Is part of the Bibliography
- yes (82) (remove)
Keywords
- Blockchain (3)
- Generative Adversarial Network (3)
- Götz von Berlichingen (3)
- neural networks (3)
- printed electronics (3)
- Machine Learning (2)
- Predictive maintenance (2)
- Stability (2)
- ancient Capua leg (2)
- artificial intelligence (2)
Institute
- Fakultät Elektrotechnik, Medizintechnik und Informatik (EMI) (ab 04/2019) (82) (remove)
Open Access
- Open Access (46)
- Closed Access (35)
- Bronze (1)
- Closed (1)
- Diamond (1)
- Grün (1)
Artificial intelligence (AI), and in particular machine learning algorithms, are of increasing importance in many application areas but interpretability and understandability as well as responsibility, accountability, and fairness of the algorithms' results, all crucial for increasing the humans' trust into the systems, are still largely missing. Big industrial players, including Google, Microsoft, and Apple, have become aware of this gap and recently published their own guidelines for the use of AI in order to promote fairness, trust, interpretability, and other goals. Interactive visualization is one of the technologies that may help to increase trust in AI systems. During the seminar, we discussed the requirements for trustworthy AI systems as well as the technological possibilities provided by interactive visualizations to increase human trust in AI.
Digital transformation strengthens the interconnection of companies in order to develop optimized and better customized, cross-company business models. These models require secure, reliable, and traceable evidence and monitoring of contractually agreed information to gain trust between stakeholders. Blockchain technology using smart contracts allows the industry to establish trust and automate cross-company business processes without the risk of losing data control. A typical cross-company industry use case is equipment maintenance. Machine manufacturers and service providers offer maintenance for their machines and tools in order to achieve high availability at low costs. The aim of this chapter is to demonstrate how maintenance use cases are attempted by utilizing hyperledger fabric for building a chain of trust by hardened evidence logging of the maintenance process to achieve legal certainty. Contracts are digitized into smart contracts automating business that increase the security and mitigate the error-proneness of the business processes.
Significant progress in the development and commercialization of electrically conductive adhesives has been made. This makes shingling a very attractive approach for solar cell interconnection. In this study, we investigate the shading tolerance of two types of solar modules based on shingle interconnection: first, the already commercialized string approach, and second, the matrix technology where solar cells are intrinsically interconnected in parallel and in series. An experimentally validated LTspice model predicts major advantages for the power output of the matrix layout under partial shading. Diagonal as well as random shading of a 1.6-m2 solar module is examined. Power gains of up to 73.8 % for diagonal shading and up to 96.5 % for random shading are found for the matrix technology compared to the standard string approach. The key factor is an increased current extraction due to lateral current flows. Especially under minor shading, the matrix technology benefits from an increased fill factor as well. Under diagonal shading, we find the probability of parts of the matrix module being bypassed to be reduced by 40 % in comparison to the string module. In consequence, the overall risk of hotspot occurrence in matrix modules is decreased significantly.
A versatile liquid metal (LM) printing process enabling the fabrication of various fully printed devices such as intra- and interconnect wires, resistors, diodes, transistors, and basic circuit elements such as inverters which are process compatible with other digital printing and thin film structuring methods for integration is presented. For this, a glass capillary-based direct-write method for printing LMs such as eutectic gallium alloys, exploring the potential for fully printed LM-enabled devices is demonstrated. Examples for successful device fabrication include resistors, p–n diodes, and field effect transistors. The device functionality and easiness of one integrated fabrication flow shows that the potential of LM printing is far exceeding the use of interconnecting conventional electronic devices in printed electronics.
Objective: To quantify the effect of inhaled 5% carbon-dioxide/95% oxygen on EEG recordings from patients in non-convulsive status epilepticus (NCSE).
Methods: Five children of mixed aetiology in NCSE were given high flow of inhaled carbogen (5% carbon dioxide/95% oxygen) using a face mask for maximum 120s. EEG was recorded concurrently in all patients. The effects of inhaled carbogen on patient EEG recordings were investigated using band-power, functional connectivity and graph theory measures. Carbogen effect was quantified by measuring effect size (Cohen's d) between "before", "during" and "after" carbogen delivery states.
Results: Carbogen's apparent effect on EEG band-power and network metrics across all patients for "before-during" and "before-after" inhalation comparisons was inconsistent across the five patients.
Conclusion: The changes in different measures suggest a potentially non-homogeneous effect of carbogen on the patients' EEG. Different aetiology and duration of the inhalation may underlie these non-homogeneous effects. Tuning the carbogen parameters (such as ratio between CO2 and O2, duration of inhalation) on a personalised basis may improve seizure suppression in future.
Physically Unclonable Functions (PUFs) are hardware-based security primitives, which allow for inherent device fingerprinting. Therefore, intrinsic variation of imperfect manufactured systems is exploited to generate device-specific, unique identifiers. With printed electronics (PE) joining the internet of things (IoT), hardware-based security for novel PE-based systems is of increasing importance. Furthermore, PE offers the possibility for split-manufacturing, which mitigates the risk of PUF response readout by third parties, before commissioning. In this paper, we investigate a printed PUF core as intrinsic variation source for the generation of unique identifiers from a crossbar architecture. The printed crossbar PUF is verified by simulation of a 8×8-cells crossbar, which can be utilized to generate 32-bit wide identifiers. Further focus is on limiting factors regarding printed devices, such as increased parasitics, due to novel materials and required control logic specifications. The simulation results highlight, that the printed crossbar PUF is capable to generate close-to-ideal unique identifiers at the investigated feature size. As proof of concept a 2×2-cells printed crossbar PUF core is fabricated and electrically characterized.
Printed electronics (PE) offers flexible, extremely low-cost, and on-demand hardware due to its additive manufacturing process, enabling emerging ultra-low-cost applications, including machine learning applications. However, large feature sizes in PE limit the complexity of a machine learning classifier (e.g., a neural network (NN)) in PE. Stochastic computing Neural Networks (SC-NNs) can reduce area in silicon technologies, but still require complex designs due to unique implementation tradeoffs in PE. In this paper, we propose a printed mixed-signal system, which substitutes complex and power-hungry conventional stochastic computing (SC) components by printed analog designs. The printed mixed-signal SC consumes only 35% of power consumption and requires only 25% of area compared to a conventional 4-bit NN implementation. We also show that the proposed mixed-signal SC-NN provides good accuracy for popular neural network classification problems. We consider this work as an important step towards the realization of printed SC-NN hardware for near-sensor-processing.
Printed electronics, due to its manufacturability using printing technology, allows for fabrication on large areas and the usage of flexible substrates and thus enables novel applications. Non-impact printing technology, such as inkjet-printing, permits for flexible, decentralized manufacturing of electronic devices and systems. This further facilitates split-manufacturing in security-critical electrical components, as well as a maximum in design flexibility in terms of free form factors and non-standardized structures with different geometrical sizes, reaching from a few micrometers up to several millimeters.
Based on the technological benefits printed electronics offers, it provides an interesting counterpart to classical silicon-based electronics, which is usually densely integrated on miniaturized, rigid areas. By utilizing both technologies in a complementary manner, novel systems in the form of hybrid systems can be enabled. Whilst hybrid systems, incorporating passive printed components and electrically conductive wiring concepts, are already commercialized, complex printed systems, which also utilize active components remain rare. To enable more complex (hybrid) systems, various building blocks are required. This includes possibilities for lightweight, printed data storage, the capability to provide sustainable, self-powered printed components and especially circuits for secure, unique identification for holistic printed systems, deployed in the internet of things.
The presented thesis focuses on inkjet-printed electronic devices, circuits and hybrid systems. It investigates solutions for current scientific questions in the area of efficient data storage, sustainable electronics and hardware-based security in printed electronics.
For data storage, an inkjet-printed memristor is developed. The device is fully electrically evaluated with a focus on its data storage capabilities. Furthermore, the printed device is of special interest due to its easy manufacturability and integration capabilities. The experimental analysis reveals that the developed memristor is highly suitable as lightweight non-volatile memory device.
In order to enable sustainable electronic systems, an inkjet-printed full-wave rectifier based on near-zero threshold voltage electrolyte-gated transistors is developed and fully electrically characterized. The circuit is capable for small alternating voltage rectification of low-frequency vibration energy harvesters in the sub-volt region. This provides an important building block in enabling sustainable, self-powered electronic systems. The inkjet-printed full-wave rectifier is evaluated by electrical simulation and experimentally.
To tackle hardware-based security for printed electronics, two implementations for inkjet-printed physically unclonable functions are developed and presented. For unique identification, intrinsic variation in active printed devices are exploited. One implementation is based on a crossbar architecture, incorporating integrable electrolyte-gated transistor cells. The second implementation, the so-called differential circuit physically unclonable function, is based on inverter structures, which provide the basis for unique response generation. Both physically unclonable functions are evaluated using an electrical simulation-based approach and experimentally. The differential circuit approach is furthermore fully integrated within a silicon-based electronic platform environment and serves as intrinsic variation source in a hybrid system. The hybrid system physically unclonable function is fully verified regarding performance metrics and is capable to generate highly unique responses for secure identification.
Emerging applications in soft robotics, wearables, smart consumer products or IoT-devices benefit from soft materials, flexible substrates in conjunction with electronic functionality. Due to high production costs and conformity restrictions, rigid silicon technologies do not meet application requirements in these new domains. However, whenever signal processing becomes too comprehensive, silicon technology must be used for the high-performance computing unit. At the same time, designing everything in flexible or printed electronics using conventional digital logic is not feasible yet due to the limitations of printed technologies in terms of performance, power and integration density. We propose to rather use the strengths of neuromorphic computing architectures consisting in their homogeneous topologies, few building blocks and analog signal processing to be mapped to an inkjet-printed hardware architecture. It has remained a challenge to demonstrate non-linear elements besides weighted aggregation. We demonstrate in this work printed hardware building blocks such as inverter-based comprehensive weight representation and resistive crossbars as well as printed transistor-based activation functions. In addition, we present a learning algorithm developed to train the proposed printed NCS architecture based on specific requirements and constraints of the technology.
The present work ties in with the problem of bicycle road assessment that is currently done using expensive special measuring vehicles. Our alternative approach for road condition assessment is to mount a sensor device on a bicycle which sends accelerometer and gyroscope data via WiFi to a classification server. There, a prediction model determines road type and condition based on the sensor data. For the classification task, we compare different machine learning methods with each other, whereby validation accuracies of 99% can be achieved with deep residual networks such as InceptionTime. The main contribution of this work with respect to comparable work is that we achieve excellent accuracies on a realistic dataset classifying road conditions into nine distinct classes that are highly relevant for practice.
Sustainable chemical processes should be designed to combine the technological advantages and progress with lower safety risks and minimization of environmental impact such as, for example, reduction of raw materials, energy and water consumption, and avoidance of hazardous waste and pollution with toxic chemical agents. A number of novel eco-friendly chemical technologies have been developed in the recent decades with the help of the eco-innovations approaches and methods such as Life Cycle Analysis, Green Process Engineering, Process Intensification, Process Design for Sustainability, and others. An emerging approach to the sustainable process design in process engineering builds on the innovative solutions inspired from nature. However, the implementation of the eco-friendly technologies often faces secondary ecological problems. The study postulates that the eco-inventive principles identified in natural systems allow to avoid secondary eco-problems and proposes to apply these principles for sustainable design in chemical process engineering. The research work critically examines how this approach differs from the biomimetics, as it is commonly used for copying natural systems. The application of nature-inspired eco-design principles is illustrated with an example of a sustainable technology for extraction of nickel from pyrophyllite.
The proposed method includes identification and documentation of the elementary TRIZ inventive principles from the TRIZ body of knowledge, extension and enhancement of inventive principles by patents and technologies analysis, avoiding overlapping and redundant principles, classification and adaptation of principles to at least following categories such as working medium, target object, useful action, harmful effect, environment, information, field, substance, time, and space, assignment of the elementary inventive principles to the at least following underlying engineering domains such as universal, design, mechanical, acoustic, thermal, chemical, electromagnetic, intermolecular, biological, and data processing. The method includes classification of abstraction level of the elementary principles, definition of the statistical ranking of principles for different problem types, and specific engineering or non-technical domains, definition of strategies for selection of principles sets with high solution potential for predefined problems, automated semantic transformation of the elementary inventive principles into solution ideas, evaluation of automatically generated ideas and transformation of ideas to innovation or inventive concepts.
Cryptographic protection of messages requires frequent updates of the symmetric cipher key used for encryption and decryption, respectively. Protocols of legacy IT security, like TLS, SSH, or MACsec implement rekeying under the assumption that, first, application data exchange is allowed to stall occasionally and, second, dedicated control messages to orchestrate the process can be exchanged. In real-time automation applications, the first is generally prohibitive, while the second may induce problematic traffic patterns on the network. We present a novel seamless rekeying approach, which can be embedded into cyclic application data exchanges. Although, being agnostic to the underlying real-time communication system, we developed a demonstrator emulating the widespread industrial Ethernet system PROFINET IO and successfully use this rekeying mechanism.
Fifth-generation (5G) cellular mobile networks are expected to support mission-critical low latency applications in addition to mobile broadband services, where fourth-generation (4G) cellular networks are unable to support Ultra-Reliable Low Latency Communication (URLLC). However, it might be interesting to understand which latency requirements can be met with both 4G and 5G networks. In this paper, we discuss (1) the components contributing to the latency of cellular networks and (2) evaluate control-plane and user-plane latencies for current-generation narrowband cellular networks and point out the potential improvements to reduce the latency of these networks, (3) present, implement and evaluate latency reduction techniques for latency-critical applications. The two elements we detected, namely the short transmission time interval and the semi-persistent scheduling are very promising as they allow to shorten the delay to processing received information both into the control and data planes. We then analyze the potential of latency reduction techniques for URLLC applications. To this end, we develop these techniques into the long term evolution (LTE) module of ns-3 simulator and then evaluate the performance of the proposed techniques into two different application fields: industrial automation and intelligent transportation systems. Our detailed evaluation results from simulations indicate that LTE can satisfy the low-latency requirements for a large choice of use cases in each field.
To demonstrate how deep learning can be applied to industrial applications with limited training data, deep learning methodologies are used in three different applications. In this paper, we perform unsupervised deep learning utilizing variational autoencoders and demonstrate that federated learning is a communication efficient concept for machine learning that protects data privacy. As an example, variational autoencoders are utilized to cluster and visualize data from a microelectromechanical systems foundry. Federated learning is used in a predictive maintenance scenario using the C-MAPSS dataset.
It is important to minimize the unscheduled downtime of machines caused by outages of machine components in highly automated production lines. Considering machine tools such as, grinding machines, the bearing inside of spindles is one of the most critical components. In the last decade, research has increasingly focused on fault detection of bearings. In addition, the rise of machine learning concepts has also intensified interest in this area. However, up to date, there is no single one-fits-all solution for predictive maintenance of bearings. Most research so far has only looked at individual bearing types at a time.
This paper gives an overview of the most important approaches for bearing-fault analysis in grinding machines. There are two main parts of the analysis presented in this paper. The first part presents the classification of bearing faults, which includes the detection of unhealthy conditions, the position of the error (e.g. at the inner or at the outer ring of the bearing) and the severity, which detects the size of the fault. The second part presents the prediction of remaining useful life, which is important for estimating the productive use of a component before a potential failure, optimizing the replacement costs and minimizing downtime.
In the last decade, deep learning models for condition monitoring of mechanical systems increasingly gained importance. Most of the previous works use data of the same domain (e.g., bearing type) or of a large amount of (labeled) samples. This approach is not valid for many real-world scenarios from industrial use-cases where only a small amount of data, often unlabeled, is available.
In this paper, we propose, evaluate, and compare a novel technique based on an intermediate domain, which creates a new representation of the features in the data and abstracts the defects of rotating elements such as bearings. The results based on an intermediate domain related to characteristic frequencies show an improved accuracy of up to 32 % on small labeled datasets compared to the current state-of-the-art in the time-frequency domain.
Furthermore, a Convolutional Neural Network (CNN) architecture is proposed for transfer learning. We also propose and evaluate a new approach for transfer learning, which we call Layered Maximum Mean Discrepancy (LMMD). This approach is based on the Maximum Mean Discrepancy (MMD) but extends it by considering the special characteristics of the proposed intermediate domain. The presented approach outperforms the traditional combination of Hilbert–Huang Transform (HHT) and S-Transform with MMD on all datasets for unsupervised as well as for semi-supervised learning. In most of our test cases, it also outperforms other state-of-the-art techniques.
This approach is capable of using different types of bearings in the source and target domain under a wide variation of the rotation speed.
It seems to be a widespread impression that the use of strong cryptography inevitably imposes a prohibitive burden on industrial communication systems, at least inasmuch as real-time requirements in cyclic fieldbus communications are concerned. AES-GCM is a leading cryptographic algorithm for authenticated encryption, which protects data against disclosure and manipulations. We study the use of both hardware and software-based implementations of AES-GCM. By simulations as well as measurements on an FPGA-based prototype setup we gain and substantiate an important insight: for devices with a 100 Mbps full-duplex link, a single low-footprint AES-GCM hardware engine can deterministically cope with the worst-case computational load, i.e., even if the device maintains a maximum number of cyclic communication relations with individual cryptographic keys. Our results show that hardware support for AES-GCM in industrial fieldbus components may actually be very lightweight.
In recent years, physically unclonable functions (PUFs) have gained significant attraction in IoT security applications, such as cryptographic key generation and entity authentication. PUFs extract the uncontrollable production characteristics of different devices to generate unique fingerprints for security applications. When generating PUF-based secret keys, the reliability and entropy of the keys are vital factors. This study proposes a novel method for generating PUF-based keys from a set of measurements. Firstly, it formulates the group-based key generation problem as an optimization problem and solves it using integer linear programming (ILP), which guarantees finding the optimum solution. Then, a novel scheme for the extraction of keys from groups is proposed, which we call positioning syndrome coding (PSC). The use of ILP as well as the introduction of PSC facilitates the generation of high-entropy keys with low error correction costs. These new methods have been tested by applying them on the output of a capacitor network PUF. The results confirm the application of ILP and PSC in generating high-quality keys.
For the past few years Low Power Wide Area Networks (LPWAN) have emerged as key technologies for the connectivity of many applications in the Internet of Things (IoT) combining low-data rates with strict cost and energy restrictions. Especially LoRa/LoRaWAN enjoys a high visibility on today’s markets, because of its good performance and its open community. Originally LoRa was designed for operation within the Sub-GHz ISM bands for Industrial, Scientific and Medical applications. However, at the end of 2018, a LoRa-based solution in the 2.4GHz ISM-band was presented promising higher bandwidths and higher data rates. Furthermore, it overcomes the limited duty-cycle prescribed by the regulations in the ISM-bands and therefore also opens doors to many novel application fields. Also, due to higher bandwidths and shorter transmission times, the use of alternative MAC layer protocols becomes very interesting, i.e. for TDMA based-approaches. Within this paper, we propose a system architecture with 2.4GHz LoRa components combining two aspects. On the one hand, we present a design and an implementation of a 2.4GHz based LoRaWAN solution that can be seamlessly integrated into existing LoRaWAN back-hauls. On the other hand, we describe deterministic setup using a Time Slotted Channel Hopping (TSCH) approach as defined in the IEEE802.15.4-2015 standard for industrial applications. Finally, measurements show the performance of the system.
Autonomous driving is disrupting the automotive industry as we know it today. For this, fail-operational behavior is essential in the sense, plan, and act stages of the automation chain in order to handle safety-critical situations on its own, which currently is not reached with state-of-the-art approaches.The European ECSEL research project PRYSTINE realizes Fail-operational Urban Surround perceptION (FUSION) based on robust Radar and LiDAR sensor fusion and control functions in order to enable safe automated driving in urban and rural environments. This paper showcases some of the key exploitable results (e.g., novel Radar sensors, innovative embedded control and E/E architectures, pioneering sensor fusion approaches, AI-controlled vehicle demonstrators) achieved until its final year 3.
We describe a prototype for power line communi- cation for grid monitoring. The PLC receiver is used to gain information about the PLC channel and the current state of the power grid. The PLC receiver uses the communication signal to obtain an accurate estimate of the current channel and provides information which can be used as a basis for further processing with the aim to detect partial discharges and other anomalies in the grid. This monitoring of the power grid takes advantage of existing PLC infrastructure and uses the data signals, which are transmitted anyway to obtain a real-time measurement of the channel transfer function and the received noise signal. Since this signal is sampled at a high sampling rate compared to simpler measurement sensors, it contains valuable information about possible degradations in the grid which need to be addressed. While channel measurements are based on a received PLC signal, information about partial discharges or other sources of interference can be gathered by a PLC receiver in the absence of a transmit signal. A prototype based on Software Defined Radio has been developed, which implements the simultaneous communication and sensing for a power grid.
The following describes a new method for estimating the parameters of an interior permanent magnet synchronous machine (IPMSM). For the estimation of the parameters the current slopes caused by the switching of the inverter are used to determine the unknowns of the system equations of the electrical machine. The angle and current dependence of the machine parameters are linearized within a PWM cycle. By considering the different switching states of the inverter, several system equations can be derived and a solution can be found within one PWM cycle. The use of test signals and filter-based approaches is avoided. The derived algorithm is explained and validated with measurements on a test bench.
This paper describes a thorough analysis of using PPO to learn kick behaviors with simulated NAO robots in the simspark environment. The analysis includes an investigation of the influence of PPO hyperparameters, network size, training setups and performance in real games. We believe to improve the state of the art mainly in four points: first, the kicks are learned with a toed version of the NAO robot, second, we improve the reliability with respect to kickable area and avoidance of falls, third, the kick can be parameterized with desired distance and direction as input to the deep network and fourth, the approach allows to integrate the learned behavior seamlessly into soccer games. The result is a significant improvement of the general level of play.
In bimodal cochlear implant (CI) / hearing aid (HA) users a constant interaural time delay in the order of several milliseconds occurs due to differences in signal processing of the devices. For MED-EL CI systems in combination with different HA types, we have quantified the respective device delay mismatch (Zirn et al. 2015). In the current study, we investigate the effect of the device delay mismatch in simulated and actual bimodal listeners on sound localization accuracy.
To deal with the device delay mismatch in actual bimodal listeners we delayed the CI stimulation according to the measured HA processing delay and two other values. With all delay values highly significant improvements of the rms error in the localization task were observed compared to the test without the delay. The results help to narrow down the optimal patient-specific delay value.
Facial image manipulation is a generation task where the output face is shifted towards an intended target direction in terms of facial attribute and styles. Recent works have achieved great success in various editing techniques such as style transfer and attribute translation. However, current approaches are either focusing on pure style transfer, or on the translation of predefined sets of attributes with restricted interactivity. To address this issue, we propose FacialGAN, a novel framework enabling simultaneous rich style transfers and interactive facial attributes manipulation. While preserving the identity of a source image, we transfer the diverse styles of a target image to the source image. We then incorporate the geometry information of a segmentation mask to provide a fine-grained manipulation of facial attributes. Finally, a multi-objective learning strategy is introduced to optimize the loss of each specific tasks. Experiments on the CelebA-HQ dataset, with CelebAMask-HQ as semantic mask labels, show our model’s capacity in producing visually compelling results in style transfer, attribute manipulation, diversity and face verification. For reproducibility, we provide an interactive open-source tool to perform facial manipulations, and the Pytorch implementation of the model.
Evaluation of Deep Learning-Based Neural Network Methods for Cloud Detection and Segmentation
(2021)
This paper presents a systematic approach for accurate short-time cloud coverage prediction based on a machine learning (ML) approach. Based on a newly built omnidirectional ground-based sky camera system, local training and evaluation data sets were created. These were used to train several state-of-the-art deep neural networks for object detection and segmentation. For this purpose, the camera-generated a full hemispherical image every 30 min over two months in daylight conditions with a fish-eye lens. From this data set, a subset of images was selected for training and evaluation according to various criteria. Deep neural networks, based on the two-stage R-CNN architecture, were trained and compared with a U-net segmentation approach implemented by CloudSegNet. All chosen deep networks were then evaluated and compared according to the local situation.
Object Detection and Mapping with Unmanned Aerial Vehicles Using Convolutional Neural Networks
(2021)
Significant progress has been made in the field of deep learning through intensive research over the last decade. So-called convolutional neural networks are an essential component of this research. In this type of neural network, the mathematical convolution operator is used to extract characteristics or anomalies. The purpose of this work is to investigate the extent to which it is possible in certain initial settings to input aerial recordings and flight data of Unmanned Aerial Vehicles (UAVs) in the architecture of a neural network and to detect and map an object. Using the calculated contours or dimensions of the so-called bounding boxes, the position of the objects can be determined relative to the current UAV location.
The applicability of characteristics of local magnetic fields for more precise determination of localization of subjects and/or objects in indoor environments, such as railway stations, airports, exhibition halls, showrooms, or shopping centers, is considered. An investigation has been carried out to find out whether and how low-cost magnetic field sensors and mobile robot platforms can be used to create maps that improve the accuracy and robustness of later navigation with smartphones or other devices.
The aim of this work is the application and evaluation of a method to visually detect markers at a distance of up to five meters and determine their real-world position. Combinations of cameras and lenses with different parameters were studied to determine the optimal configuration. Based on this configuration, camera images were taken after proper calibration. These images are then transformed into a bird's eye view using a homography matrix. The homography matrix is calculated with four-point pairs as well as with coordinate transformations. The obtained images show the ground plane un distorted, making it possible to convert a pixel position into a real-world position with a conversion factor. The proposed approach helps to effectively create data sets for training neural networks for navigation purposes.
Engineering, construction and operation of complex machines involves a wide range of complicated, simultaneous tasks, which potentially could be automated. In this work, we focus on perception tasks in such systems, investigating deep learning approaches for multi-task transfer learning with limited training data. We show an approach that takes advantage of a technical systems’ focus on selected objects and their properties. We create focused representations and simultaneously solve joint objectives in a system through multi-task learning with convolutional autoencoders. The focused representations are used as a starting point for the data-saving solution of the additional tasks. The efficiency of this approach is demonstrated using images and tasks of an autonomous circular crane with a grapple.
An Empirical Investigation of Model-to-Model Distribution Shifts in Trained Convolutional Filters
(2021)
We present first empirical results from our ongoing investigation of distribution shifts in image data used for various computer vision tasks. Instead of analyzing the original training and test data, we propose to study shifts in the learned weights of trained models. In this work, we focus on the properties of the distributions of dominantly used 3x3 convolution filter kernels. We collected and publicly provide a data set with over half a billion filters from hundreds of trained CNNs, using a wide range of data sets, architectures, and vision tasks. Our analysis shows interesting distribution shifts (or the lack thereof) between trained filters along different axes of meta-parameters, like data type, task, architecture, or layer depth. We argue, that the observed properties are a valuable source for further investigation into a better understanding of the impact of shifts in the input data to the generalization abilities of CNN models and novel methods for more robust transfer-learning in this domain.
A fundamental and still largely unsolved question in the context of Generative Adversarial Networks is whether they are truly able to capture the real data distribution and, consequently, to sample from it. In particular, the multidimensional nature of image distributions leads to a complex evaluation of the diversity of GAN distributions. Existing approaches provide only a partial understanding of this issue, leaving the question unanswered. In this work, we introduce a loop-training scheme for the systematic investigation of observable shifts between the distributions of real training data and GAN generated data. Additionally, we introduce several bounded measures for distribution shifts, which are both easy to compute and to interpret. Overall, the combination of these methods allows an explorative investigation of innate limitations of current GAN algorithms. Our experiments on different data-sets and multiple state-of-the-art GAN architectures show large shifts between input and output distributions, showing that existing theoretical guarantees towards the convergence of output distributions appear not to be holding in practice.
Correlation Clustering, also called the minimum cost Multicut problem, is the process of grouping data by pairwise similarities. It has proven to be effective on clustering problems, where the number of classes is unknown. However, not only is the Multicut problem NP-hard, an undirected graph G with n vertices representing single images has at most edges, thus making it challenging to implement correlation clustering for large datasets. In this work, we propose Multi-Stage Multicuts (MSM) as a scalable approach for image clustering. Specifically, we solve minimum cost Multicut problems across multiple distributed compute units. Our approach not only allows to solve problem instances which are too large to fit into the shared memory of a single compute node, but it also achieves significant speedups while preserving the clustering accuracy at the same time. We evaluate our proposed method on the CIFAR10 …
Aerosol particles play an important role in the climate system by absorbing and scattering radiation and influencing cloud properties. They are also one of the biggest sources of uncertainty for climate modeling. Many climate models do not include aerosols in sufficient detail. In order to achieve higher accuracy, aerosol microphysical properties and processes have to be accounted for. This is done in the ECHAM-HAM global climate aerosol model using the M7 microphysics model, but increased computational costs make it very expensive to run at higher resolutions or for a longer time. We aim to use machine learning to approximate the microphysics model at sufficient accuracy and reduce the computational cost by being fast at inference time. The original M7 model is used to generate data of input-output pairs to train a neural network on it. By using a special logarithmic transform we are able to learn the variables tendencies achieving an average score of . On a GPU we achieve a speed-up of 120 compared to the original model.
Recently, adversarial attacks on image classification networks by the AutoAttack (Croce and Hein, 2020b) framework have drawn a lot of attention. While AutoAttack has shown a very high attack success rate, most defense approaches are focusing on network hardening and robustness enhancements, like adversarial training. This way, the currently best-reported method can withstand about 66% of adversarial examples on CIFAR10. In this paper, we investigate the spatial and frequency domain properties of AutoAttack and propose an alternative defense. Instead of hardening a network, we detect adversarial attacks during inference, rejecting manipulated inputs. Based on a rather simple and fast analysis in the frequency domain, we introduce two different detection algorithms. First, a black box detector that only operates on the input images and achieves a detection accuracy of 100% on the AutoAttack CIFAR10 benchmark and 99.3% on ImageNet, for epsilon = 8/255 in both cases. Second, a whitebox detector using an analysis of CNN feature maps, leading to a detection rate of also 100% and 98.7% on the same benchmarks.
Transformer models have recently attracted much interest from computer vision researchers and have since been successfully employed for several problems traditionally addressed with convolutional neural networks. At the same time, image synthesis using generative adversarial networks (GANs) has drastically improved over the last few years. The recently proposed TransGAN is the first GAN using only transformer-based architectures and achieves competitive results when compared to convolutional GANs. However, since transformers are data-hungry architectures, TransGAN requires data augmentation, an auxiliary super-resolution task during training, and a masking prior to guide the self-attention mechanism. In this paper, we study the combination of a transformer-based generator and convolutional discriminator and successfully remove the need of the aforementioned required design choices. We evaluate our approach by conducting a benchmark of well-known CNN discriminators, ablate the size of the transformer-based generator, and show that combining both architectural elements into a hybrid model leads to better results. Furthermore, we investigate the frequency spectrum properties of generated images and observe that our model retains the benefits of an attention based generator.
Most eCommerce applications, like web-shops have millions of products. In this context, the identification of similar products is a common sub-task, which can be utilized in the implementation of recommendation systems, product search engines and internal supply logistics. Providing this data set, our goal is to boost the evaluation of machine learning methods for the prediction of the category of the retail products from tuples of images and descriptions.
Generative adversarial networks are the state of the art approach towards learned synthetic image generation. Although early successes were mostly unsupervised, bit by bit, this trend has been superseded by approaches based on labelled data. These supervised methods allow a much finer-grained control of the output image, offering more flexibility and stability. Nevertheless, the main drawback of such models is the necessity of annotated data. In this work, we introduce an novel framework that benefits from two popular learning techniques, adversarial training and representation learning, and takes a step towards unsupervised conditional GANs. In particular, our approach exploits the structure of a latent space (learned by the representation learning) and employs it to condition the generative model. In this way, we break the traditional dependency between condition and label, substituting the latter by unsupervised features coming from the latent space. Finally, we show that this new technique is able to produce samples on demand keeping the quality of its supervised counterpart.
Generative adversarial networks (GANs) provide state-of-the-art results in image generation. However, despite being so powerful, they still remain very challenging to train. This is in particular caused by their highly non-convex optimization space leading to a number of instabilities. Among them, mode collapse stands out as one of the most daunting ones. This undesirable event occurs when the model can only fit a few modes of the data distribution, while ignoring the majority of them. In this work, we combat mode collapse using second-order gradient information. To do so, we analyse the loss surface through its Hessian eigenvalues, and show that mode collapse is related to the convergence towards sharp minima. In particular, we observe how the eigenvalues of the are directly correlated with the occurrence of mode collapse. Finally, motivated by these findings, we design a new optimization algorithm called nudged-Adam (NuGAN) that uses spectral information to overcome mode collapse, leading to empirically more stable convergence properties.
In this preliminary report, we present a simple but very effective technique to stabilize the training of CNN based GANs. Motivated by recently published methods using frequency decomposition of convolutions (eg Octave Convolutions), we propose a novel convolution scheme to stabilize the training and reduce the likelihood of a mode collapse. The basic idea of our approach is to split convolutional filters into additive high and low frequency parts, while shifting weight updates from low to high during the training. Intuitively, this method forces GANs to learn low frequency coarse image structures before descending into fine (high frequency) details. Our approach is orthogonal and complementary to existing stabilization methods and can simply plugged into any CNN based GAN architecture. First experiments on the CelebA dataset show the effectiveness of the proposed method.
Interpreting seismic data requires the characterization of a number of key elements such as the position of faults and main reflections, presence of structural bodies, and clustering of areas exhibiting a similar amplitude versus angle response. Manual interpretation of geophysical data is often a difficult and time-consuming task, complicated by lack of resolution and presence of noise. In recent years, approaches based on convolutional neural networks have shown remarkable results in automating certain interpretative tasks. However, these state-of-the-art systems usually need to be trained in a supervised manner, and they suffer from a generalization problem. Hence, it is highly challenging to train a model that can yield accurate results on new real data obtained with different acquisition, processing, and geology than the data used for training. In this work, we introduce a novel method that combines generative neural networks with a segmentation task in order to decrease the gap between annotated training data and uninterpreted target data. We validate our approach on two applications: the detection of diffraction events and the picking of faults. We show that when transitioning from synthetic training data to real validation data, our workflow yields superior results compared to its counterpart without the generative network.
We demonstrate how to exploit group sparsity in order to bridge the areas of network pruning and neural architecture search (NAS). This results in a new one-shot NAS optimizer that casts the problem as a single-level optimization problem and does not suffer any performance degradation from discretizating the architecture.
Despite the success of convolutional neural networks (CNNs) in many computer vision and image analysis tasks, they remain vulnerable against so-called adversarial attacks: Small, crafted perturbations in the input images can lead to false predictions. A possible defense is to detect adversarial examples. In this work, we show how analysis in the Fourier domain of input images and feature maps can be used to distinguish benign test samples from adversarial images. We propose two novel detection methods: Our first method employs the magnitude spectrum of the input images to detect an adversarial attack. This simple and robust classifier can successfully detect adversarial perturbations of three commonly used attack methods. The second method builds upon the first and additionally extracts the phase of Fourier coefficients of feature-maps at different layers of the network. With this extension, we are able to improve adversarial detection rates compared to state-of-the-art detectors on five different attack methods. The code for the methods proposed in the paper is available at github.com/paulaharder/SpectralAdversarialDefense
In this work, we evaluate two different image clustering objectives, k-means clustering and correlation clustering, in the context of Triplet Loss induced feature space embeddings. Specifically, we train a convolutional neural network to learn discriminative features by optimizing two popular versions of the Triplet Loss in order to study their clustering properties under the assumption of noisy labels. Additionally, we propose a new, simple Triplet Loss formulation, which shows desirable properties with respect to formal clustering objectives and outperforms the existing methods. We evaluate all three Triplet loss formulations for K-means and correlation clustering on the CIFAR-10 image classification dataset.
The term “attribute transfer” refers to the tasks of altering images in such a way, that the semantic interpretation of a given input image is shifted towards an intended direction, which is quantified by semantic attributes. Prominent example applications are photo realistic changes of facial features and expressions, like changing the hair color, adding a smile, enlarging the nose or altering the entire context of a scene, like transforming a summer landscape into a winter panorama. Recent advances in attribute transfer are mostly based on generative deep neural networks, using various techniques to manipulate images in the latent space of the generator. In this paper, we present a novel method for the common sub-task of local attribute transfers, where only parts of a face have to be altered in order to achieve semantic changes (e.g. removing a mustache). In contrast to previous methods, where such local changes have been implemented by generating new (global) images, we propose to formulate local attribute transfers as an inpainting problem. Removing and regenerating only parts of images, our “Attribute Transfer Inpainting Generative Adversarial Network” (ATI-GAN) is able to utilize local context information to focus on the attributes while keeping the background unmodified resulting in visually sound results.
The Go programming language is an increasingly popular language but some of its features lack a formal investigation. This article explains Go's resolution mechanism for overloaded methods and its support for structural subtyping by means of translation from Featherweight Go to a simple target language. The translation employs a form of dictionary passing known from type classes in Haskell and preserves the dynamic behavior of Featherweight Go programs.
It is considered necessary to implement advanced controllers such as model predictive control (MPC) to utilize the technical flexibility of a building polygeneration system to support the rapidly expanding renewable electricity grid. These can handle multiple inputs and outputs, uncertainties in forecast data, and plant constraints, amongst other features. One of the main issues identified in the literature regarding deploying these controllers is the lack of experimental demonstrations using standard components and communication protocols. In this original work, the economic-MPC-based optimal scheduling of a real-world heat pump-based building energy plant is demonstrated, and its performance is evaluated against two conventional controllers. The demonstration includes the steps to integrate an optimization-based supervisory controller into a typical building automation and control system with off-the-shelf HVAC components and usage of state-of-art algorithms to solve a mixed integer quadratic problem. Technological benefits in terms of fewer constraint violations and a hardware-friendly operation with MPC were identified. Additionally, a strong dependency of the economic benefits on the type of load profile, system design and controller parameters was also identified. Future work for the quantification of these benefits, the application of machine learning algorithms, and the study of forecast deviations is also proposed.