Printed electrolyte-gated oxide electronics is an emerging electronic technology in the low-voltage regime (≤1 V). Whereas in the past mainly dielectrics were used for gating the transistors, many recent approaches exploit the advantages of solution-processable solid polymer electrolytes or ion gels, which provide high gate capacitances produced by a Helmholtz double layer and thus allow low-voltage operation. Herein, recent advances in building electronic circuits based on indium oxide n-type electrolyte-gated field-effect transistors (EGFETs) are reviewed, with special focus on work performed at KIT. When integrated into ring oscillator circuits, a digital performance ranging from 250 Hz at 1 V up to 1 kHz is achieved. Sequential circuits such as memory cells are also demonstrated. More complex circuits are feasible but remain challenging, partly because of the high variability of the printed devices. However, the inherent device variability can even be exploited in security circuits such as physically unclonable functions (PUFs), which output a reliable and unique, device-specific digital response signal. As an overall advantage of the technology, all the presented circuits can operate at very low supply voltages (0.6 V), which is crucial for low-power printed electronics applications.
Due to its performance, the field of deep learning has gained a lot of attention, with neural networks succeeding in areas like Computer Vision (CV), Natural Language Processing (NLP), and Reinforcement Learning (RL). However, high accuracy comes at a computational cost, as larger networks require longer training time and no longer fit onto a single GPU. To reduce training costs, researchers are looking into the dynamics of different optimizers in order to find ways to make training more efficient. Resource requirements can be limited by reducing model size during training or by designing more efficient models that improve accuracy without increasing network size.
This thesis combines eigenvalue computation and high-dimensional loss surface visualization to study different optimizers and deep neural network models. Eigenvectors of different eigenvalues are computed, and the loss landscape and optimizer trajectory are projected onto the plane spanned by those eigenvectors. A new parallelization method for the stochastic Lanczos method is introduced, resulting in faster computation and thus enabling high-resolution videos of the trajectory and second-order information during neural network training. Additionally, the thesis presents, for the first time, the loss landscape between two minima along with the eigenvalue density spectrum at intermediate points.
Secondly, this thesis presents a regularization method for Generative Adversarial Networks (GANs) that uses second-order information. The gradient during training is modified by subtracting the eigenvector direction of the biggest eigenvalue, preventing the network from falling into the steepest minima and avoiding mode collapse. The thesis also shows the full eigenvalue density spectra of GANs during training.
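The gradient modification described above can be illustrated with a minimal sketch. It assumes that "subtracting the eigenvector direction" means removing the gradient component along the top Hessian eigenvector, and it uses a small explicit matrix with plain power iteration instead of the Hessian-vector products and Lanczos machinery a real network would need. The function names `top_eigenvector` and `nudge_gradient` are hypothetical and do not come from the thesis.

```python
import numpy as np

def top_eigenvector(hessian, iters=200, seed=0):
    """Power iteration: approximates the eigenvector belonging to the
    eigenvalue of largest magnitude of a symmetric matrix."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(hessian.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = hessian @ v
        v = w / np.linalg.norm(w)
    return v

def nudge_gradient(grad, hessian):
    """Assumed interpretation: project out the gradient component that
    points along the sharpest curvature direction."""
    v = top_eigenvector(hessian)
    return grad - np.dot(grad, v) * v

# Toy quadratic loss with one very sharp direction (the first axis).
H = np.diag([100.0, 1.0, 0.5])
g = np.array([3.0, 2.0, 1.0])
g_nudged = nudge_gradient(g, H)
```

In this toy case the sharp direction is the first coordinate axis, so the modified gradient keeps only its components along the two flatter directions.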
Thirdly, this thesis introduces ProxSGD, a proximal algorithm for neural network training that guarantees convergence to a stationary point and unifies multiple popular optimizers. Proximal gradients are used to find a closed-form solution to the problem of training neural networks with smooth and non-smooth regularizations, resulting in better sparsity and more efficient optimization. Experiments show that ProxSGD can find sparser networks while reaching the same accuracy as popular optimizers.
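To show why a proximal step produces exact zeros, here is a minimal sketch of the textbook proximal gradient update for an ℓ1 penalty. It is not the ProxSGD algorithm itself, whose details are not given here. The names `soft_threshold` and `proximal_sgd_step` and all numeric values are illustrative assumptions.

```python
import numpy as np

def soft_threshold(x, tau):
    """Closed-form proximal operator of tau * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def proximal_sgd_step(w, grad, lr, lam):
    """Gradient step on the smooth loss, then the l1 proximal operator."""
    return soft_threshold(w - lr * grad, lr * lam)

# Hypothetical weights and gradient: small weights are set exactly to zero.
w = np.array([0.5, -0.02, 0.03, -1.0])
grad = np.array([0.1, 0.0, 0.0, -0.2])
w_new = proximal_sgd_step(w, grad, lr=0.1, lam=0.5)
```

Unlike a plain subgradient step on the ℓ1 term, the thresholding sets small weights to exactly zero, which is what makes the resulting network sparse.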
Lastly, this thesis unifies sparsity and neural architecture search (NAS) through the framework of group sparsity. Group sparsity is achieved through ℓ2,1-regularization during training, allowing for filter and operation pruning to reduce model size with minimal sacrifice in accuracy. By grouping multiple operations together, group sparsity can be used for NAS as well. This approach is shown to be more robust while still achieving competitive accuracies compared to state-of-the-art methods.
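The following sketch shows how an ℓ2,1 penalty removes whole groups rather than single weights. It assumes each row of a weight matrix is one group (for example one filter) and applies the standard group soft-thresholding operator. The function names and the toy matrix are hypothetical.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows (one row = one group)."""
    return np.linalg.norm(W, axis=1).sum()

def group_soft_threshold(W, tau):
    """Proximal operator of tau * ||W||_{2,1}: shrinks every row towards
    zero and zeroes rows whose norm is at most tau."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return W * scale

# Toy example: three "filters" with norms 5, 0.5 and 0.
W = np.array([[3.0, 4.0], [0.3, 0.4], [0.0, 0.0]])
W_pruned = group_soft_threshold(W, tau=1.0)
```

Rows that end up entirely zero correspond to filters or operations that can be removed from the network.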
Current training methods for deep neural networks boil down to very high-dimensional and non-convex optimization problems, which are usually solved by a wide range of stochastic gradient descent methods. While these approaches tend to work in practice, there are still many gaps in the theoretical understanding of key aspects like convergence and generalization guarantees, which are induced by the properties of the optimization surface (loss landscape). In order to gain deeper insights, a number of recent publications have proposed methods to visualize and analyze the optimization surfaces. However, the computational cost of these methods is very high, making it hardly possible to use them on larger networks. In this paper, we present the GradVis Toolbox, an open-source library for efficient and scalable visualization and analysis of deep neural network loss landscapes in TensorFlow and PyTorch. By introducing more efficient mathematical formulations and a novel parallelization scheme, GradVis makes it possible to plot 2D and 3D projections of optimization surfaces and trajectories, as well as high-resolution second-order gradient information for large networks.
In this paper, we propose a unified approach for network pruning and one-shot neural architecture search (NAS) via group sparsity. We first show that group sparsity via the recent Proximal Stochastic Gradient Descent (ProxSGD) algorithm achieves new state-of-the-art results for filter pruning. Then, we extend this approach to operation pruning, directly yielding a gradient-based NAS method based on group sparsity. Compared to existing gradient-based algorithms such as DARTS, the advantages of this new group sparsity approach are threefold. Firstly, instead of a costly bilevel optimization problem, we formulate the NAS problem as a single-level optimization problem, which can be optimally and efficiently solved using ProxSGD with convergence guarantees. Secondly, due to the operation-level sparsity, discretizing the network architecture by pruning less important operations can be safely done without any performance degradation. Thirdly, the proposed approach finds architectures that are both stable and well-performing on a variety of search spaces and datasets.
We demonstrate how to exploit group sparsity in order to bridge the areas of network pruning and neural architecture search (NAS). This results in a new one-shot NAS optimizer that casts the problem as a single-level optimization problem and does not suffer any performance degradation from discretizing the architecture.
The use of artificial intelligence continues to impact a broad variety of domains, application areas, and people. However, interpretability, understandability, responsibility, accountability, and fairness of the algorithms' results, all crucial for increasing humans' trust in the systems, are still largely missing. The purpose of this seminar is to understand how these components factor into the holistic view of trust. Further, this seminar seeks to identify design guidelines and best practices for building interactive visualization systems to calibrate trust.
In this paper we report on further progress in our work to develop a multi-method energy optimization that works with a digital twin concept. The twin concept serves to replicate the production processes of different kinds of production companies, including complex energy systems, and to test market interactions in order to then use them for model predictive optimization. The presented work finally reports on the flexibility assessment performed, leading to a flexibility audit with a list of measures, and on the impact of the energy optimizations on interactions with the local power grid, i.e., the exchange node of the low-voltage distribution grid. The analysis and continuous exploration of flexibilities as well as the exchange with energy markets require a “guide” leading to continuous optimization, with a further tool like the Flexibility Survey and Control Panel supporting decision-making processes on the day-ahead horizon for real production plants or investment planning to improve machinery, staff schedules and production infrastructure.
The twin concept is increasingly used for optimization tasks in the context of Industry 4.0 and digitization. The twin concept can also help small and medium-sized enterprises (SME) to exploit their energy flexibility potential and to achieve added value by appropriate energy marketing. At the same time, this use of flexibility helps to realize a climate-neutral energy supply with high shares of renewable energies. The digital twin reflects real production, power flows and market influences as a computer model, which makes it possible to simulate and optimize on-site interventions and interactions with the energy market without disturbing the real production processes. This paper describes the development of a generic model library that maps flexibility-relevant components and processes of SME, thus simplifying the creation of a digital twin. The paper also includes the development of an experimental twin consisting of SME hardware components and a PLC-based SCADA system. The experimental twin provides a laboratory environment in which the digital twin can be tested, further developed and demonstrated on a laboratory scale. Concrete implementations of such a digital twin and experimental twin are described as examples.
The invention relates to a method for operating a battery-electric vehicle having an electric machine for driving the vehicle and an inverter (1) for controlling the electric machine, wherein the inverter (1) comprises a three-phase bridge circuit with a number of switches (3) designed as semiconductors, wherein losses arising in the inverter (1) are used to heat an interior of the vehicle and/or to control the temperature of a battery and/or to control the temperature of transmission oil, wherein the inverter (1) is controlled by means of space vector modulation, wherein a non-optimal switching behavior of the inverter (1) is brought about by setting non-optimal voltage space vectors (e, eu, ev, ew, e1, e2, -e1, -e2), wherein the voltage space vectors (e, e1, e2) are scaled by switching zero-voltage vectors, which reduce the voltage according to their share of time, or with the aid of a respective opposite voltage space vector (-e1, -e2), so that a switching sequence with a maximum number of switching cycles is realized, wherein no symmetry is produced in the middle of a switching period (Tp).
The invention relates to a method for operating a battery-electric vehicle having an electric machine for driving the vehicle and an inverter (1) for controlling a stator (2) of the electric machine, wherein the inverter (1) comprises a three-phase bridge circuit with a number of switches (3) designed as semiconductors, wherein losses arising in the inverter (1) and/or in the electric machine are used to heat an interior of the vehicle and/or to control the temperature of a battery and/or to control the temperature of transmission oil, wherein, while the vehicle is at a standstill, a permanent-magnet flux caused by a permanent magnet of the electric machine is weakened, by setting a non-torque-producing stator current component (Id) equal to the negative quotient of a stator flux (ψPM) and a d-component of a stator inductance (Ld), to such an extent that the magnetic flux is compensated, wherein a very high-frequency alternating current is set as the torque-producing stator current component (Iq).
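The flux-compensation condition in this abstract, Id = -ψPM / Ld, follows from setting the d-axis flux linkage ψd = Ld·Id + ψPM of a permanent magnet synchronous machine to zero. A minimal numeric sketch, using hypothetical machine values that are not taken from the patent:

```python
import math

def compensating_d_current(psi_pm, l_d):
    """d-axis current that cancels the permanent-magnet flux:
    L_d * I_d + psi_PM = 0  ->  I_d = -psi_PM / L_d."""
    if l_d <= 0:
        raise ValueError("d-axis inductance must be positive")
    return -psi_pm / l_d

# Hypothetical values: psi_PM = 0.1 Vs, L_d = 0.5 mH.
psi_pm = 0.1
l_d = 0.5e-3
i_d = compensating_d_current(psi_pm, l_d)   # about -200 A
residual_flux = l_d * i_d + psi_pm          # about 0 Vs
```

With these assumed values the required current is about -200 A, which shows that full compensation can call for a large d-axis current when Ld is small.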
Method for operating a battery-electric vehicle having an electric machine for driving the vehicle and an inverter (1) for controlling the electric machine, wherein the inverter (1) comprises a three-phase bridge circuit with a number of switches (3) designed as semiconductors, wherein losses arising in the inverter (1) are used to heat an interior of the vehicle and/or to control the temperature of a battery and/or to control the temperature of transmission oil, wherein the inverter (1) is controlled by means of space vector modulation, wherein a non-optimal switching behavior of the inverter (1) is brought about by setting non-optimal voltage space vectors (e, eu, ev, ew, e1, e2, -e1, -e2), wherein the voltage space vectors (e, e1, e2) are scaled by switching zero-voltage vectors, which reduce the voltage according to their share of time, or with the aid of a respective opposite voltage space vector (-e1, -e2), so that a switching sequence with a maximum number of switching cycles is realized, characterized in that no symmetry is produced in the middle of a switching period (Tp).
The invention relates to a method for operating a battery-electric vehicle having an electric machine for driving the vehicle and an inverter (1) for controlling a stator (2) of the electric machine, wherein the inverter (1) comprises a three-phase bridge circuit with a number of switches (3) designed as semiconductors, wherein losses arising in the inverter (1) and/or in the electric machine are used to heat an interior of the vehicle and/or to control the temperature of a battery and/or to control the temperature of transmission oil, wherein a non-torque-producing stator current component (Id) in the form of an alternating current is impressed into the electric machine, wherein at standstill a torque-producing stator current component (Iq) is controlled to zero, wherein during driving a compensation current is impressed as the torque-producing stator current component (Iq), which compensates a torque arising from the variation of the non-torque-producing stator current component (Id).
The present work describes an extension of current slope estimation for parameter estimation of permanent magnet synchronous machines operated at inverters. The area of operation for current slope estimation in the individual switching states of the inverter is limited due to measurement noise, bandwidth limitation of the current sensors and the commutation processes of the inverter's switching operations. Therefore, a minimum duration of each switching state is necessary, limiting the final area of operation of a robust current slope estimation. This paper presents an extension of existing current slope estimation algorithms resulting in a greater area of operation and a more robust estimation result.
In this work, a method for the estimation of current slopes induced by inverters operating interior permanent magnet synchronous machines is presented. After the derivation of the estimation algorithm, the requirements for a suitable sensor setup in terms of accuracy, dynamics and electromagnetic interference are discussed. The boundary conditions for the estimation algorithm are presented with respect to application within high-power traction systems. The estimation algorithm is implemented on a field-programmable gate array. This moving least-squares algorithm offers the advantage that it is not dependent on vectors, and therefore not every measured value has to be stored. The summation of all measured values leads to a significant reduction of the required storage units and thus decreases the hardware requirements. The algorithm is designed to be calculated within the dead time of the inverter. Appropriate countermeasures for disturbances and hardware restrictions are implemented. The results are discussed afterwards.
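The storage argument above can be illustrated with a least-squares slope fit that keeps only running sums instead of a buffer of samples. This is a sketch in Python under that assumption, not the authors' FPGA implementation. The class name `RunningSlope` and the sample values are hypothetical.

```python
class RunningSlope:
    """Least-squares line fit i(t) = a + b*t using only five accumulators."""

    def __init__(self):
        self.n = 0
        self.s_t = 0.0    # sum of t
        self.s_i = 0.0    # sum of i
        self.s_tt = 0.0   # sum of t*t
        self.s_ti = 0.0   # sum of t*i

    def add(self, t, i):
        """Fold one (time, current) sample into the sums; nothing is stored."""
        self.n += 1
        self.s_t += t
        self.s_i += i
        self.s_tt += t * t
        self.s_ti += t * i

    def slope(self):
        """Closed-form least-squares slope b from the accumulated sums."""
        denom = self.n * self.s_tt - self.s_t ** 2
        if self.n < 2 or denom == 0.0:
            raise ValueError("need at least two samples at distinct times")
        return (self.n * self.s_ti - self.s_t * self.s_i) / denom

# Hypothetical samples: 1 µs spacing, true slope 5000 A/s, offset 2 A.
est = RunningSlope()
for k in range(10):
    t = k * 1e-6
    est.add(t, 2.0 + 5000.0 * t)
estimated_slope = est.slope()
```

Each new sample costs a handful of multiply-accumulate operations and memory use stays constant, which is the property that makes this formulation attractive for hardware.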
The following describes a new method for estimating the parameters of an interior permanent magnet synchronous machine (IPMSM). For the estimation of the parameters the current slopes caused by the switching of the inverter are used to determine the unknowns of the system equations of the electrical machine. The angle and current dependence of the machine parameters are linearized within a PWM cycle. By considering the different switching states of the inverter, several system equations can be derived and a solution can be found within one PWM cycle. The use of test signals and filter-based approaches is avoided. The derived algorithm is explained and validated with measurements on a test bench.
A Novel Approach of High Dynamic Current Control of Interior Permanent Magnet Synchronous Machines
(2019)
Traditional PI current controllers can hardly cope with the harmonic effects of permanent magnet synchronous machines with high power density because of their limited controller bandwidth. As a consequence, current ripples and ultimately torque ripples appear. In this paper, a new deadbeat current controller architecture is presented that is capable of countering the effects of these harmonics. This new control algorithm, here named “Hybrid-Deadbeat-Controller”, combines the stability and the low steady-state errors offered by common PI regulators with the high dynamics offered by deadbeat control. The proposed algorithm is capable of either compensating the current harmonics in order to obtain smoother currents or tracking a varying reference value in order to achieve a smoother torque. The information needed to calculate the optimal reference currents is based on an online parameter estimation feeding an optimization algorithm to achieve an optimal torque output and will be investigated in future research. In order to ensure the stability of the controller over the whole area of operation, even under effects that change the system’s parameters, this work also focuses on the robustness of the hybrid deadbeat controller.
The increased use of heat pumps in realizing a climate-neutral heat supply leads to a significant increase and change in the electrical loads in distribution grids. Heat pumps must therefore be controlled so that they place little load on distribution grids or even support them.
The project “PV²WP - PV Vorhersage für die netzdienliche Steuerung von Wärmepumpen” (PV forecasting for the grid-supportive control of heat pumps; project duration 1 July 2018 to 30 June 2021) demonstrated a new approach to controlling heating systems that are based on heat pumps and thermal storage and are operated in combination with a photovoltaic system. The overarching goal was to improve the grid integration and smart-grid readiness of such heating systems with a low-cost technology while at the same time increasing their economic efficiency.
Three forward-looking technologies were used and demonstrated in combination: cloud-camera-based short-term forecasts, predictive control, and machine-learning-based system modeling as the basis for optimization. The Projekthaus Ulm, an occupied single-family house, served as the demonstration environment.
In this study, various imaging algorithms for the localization of objects have been investigated. For this purpose, an Ultra-Wideband (UWB) radar-based experimental setup with a circular antenna array was designed as part of this work. This concept could be particularly useful in microwave medical imaging applications. In order to validate its applicability in microwave imaging, different imaging algorithms have been evaluated and compared by means of our experimental setup. Accurate imaging results have been achieved with our system under multiple test scenarios.
In this study, an approach to a microwave-based radar system for the localization of objects is proposed. This could be particularly useful in microwave imaging applications such as cardiac catheter detection. An experimental system is defined and realized with the selection of an appropriate antenna design. Hardware control functions and different imaging algorithms are implemented as well. The functionality of this measurement setup has been analyzed in multiple test scenarios, and it has proved capable of locating multiple objects as well as extended objects.
Dissertation D. Dongol
This paper presents a model predictive control (MPC) based approach for peak shaving with a battery in a photovoltaic (PV) battery system connected to a rural low-voltage grid. The goals of the MPC are to shave the peaks in the PV feed-in and the grid power consumption and at the same time maximize the use of the battery. The prosumer benefits from the maximum use of self-produced electricity. The grid benefits from the reduced peaks in the PV feed-in and the grid power consumption, which would allow an increase in the PV hosting and load hosting capacity of the grid.
The paper presents the mathematical formulation of the optimal control problem along with the cost-benefit analysis. The MPC implementation scheme in the laboratory and experimental results are also presented. The results show that the MPC is able to track the deviation in the weather forecast and operate the battery by solving the optimal control problem to handle this deviation.
Due to the Covid-19 pandemic, the RoboCup WorldCup 2021 was held completely remotely. For this competition the Webots simulator (https://cyberbotics.com/) was used, so all teams needed to transfer their robots to the simulation. This paper describes our experiences during this process as well as a genetic learning approach to improve our walk engine to allow more stable and faster movement in the simulation. To scale easily, we used a Docker setup. The resulting movement was one of the outstanding features that finally led to the championship title.
Sweaty has already participated several times in RoboCup soccer competitions (Adult Size). Now the work is focused on stabilizing the gait. Moreover, we would like to overcome the constraints of a ZMP-algorithm that has a horizontal footplate as precondition for the simplification of the equations. In addition we would like to switch between impedance and position control with a fuzzy-like algorithm that might help to minimize jerks when Sweaty’s feet touch the ground.
Sweaty has already participated four times in RoboCup soccer competitions (Adult Size) and came second three times. While in 2016 Sweaty needed a lot of luck to reach the final, in 2017 Sweaty was a serious adversary in the preliminary rounds. In 2018 Sweaty reached the final with some lack of experience and room for improvement, but not without a chance. This paper describes the intended improvements of the humanoid adult-size robot Sweaty in order to qualify for the RoboCup 2019 adult size competition.
Autonomous driving is disrupting the automotive industry as we know it today. For this, fail-operational behavior is essential in the sense, plan, and act stages of the automation chain so that the system can handle safety-critical situations on its own, which is currently not achieved with state-of-the-art approaches. The European ECSEL research project PRYSTINE realizes Fail-operational Urban Surround perceptION (FUSION) based on robust Radar and LiDAR sensor fusion and control functions in order to enable safe automated driving in urban and rural environments. This paper showcases some of the key exploitable results (e.g., novel Radar sensors, innovative embedded control and E/E architectures, pioneering sensor fusion approaches, AI-controlled vehicle demonstrators) achieved until its final year 3.
Generative adversarial networks (GANs) provide state-of-the-art results in image generation. However, despite being so powerful, they still remain very challenging to train. This is in particular caused by their highly non-convex optimization space, which leads to a number of instabilities. Among them, mode collapse stands out as one of the most daunting. This undesirable event occurs when the model can only fit a few modes of the data distribution while ignoring the majority of them. In this work, we combat mode collapse using second-order gradient information. To do so, we analyse the loss surface through its Hessian eigenvalues and show that mode collapse is related to convergence towards sharp minima. In particular, we observe how the eigenvalues of the generator G are directly correlated with the occurrence of mode collapse. Finally, motivated by these findings, we design a new optimization algorithm called nudged-Adam (NuGAN) that uses spectral information to overcome mode collapse, leading to empirically more stable convergence properties.
Transformer models have recently attracted much interest from computer vision researchers and have since been successfully employed for several problems traditionally addressed with convolutional neural networks. At the same time, image synthesis using generative adversarial networks (GANs) has drastically improved over the last few years. The recently proposed TransGAN is the first GAN using only transformer-based architectures and achieves competitive results when compared to convolutional GANs. However, since transformers are data-hungry architectures, TransGAN requires data augmentation, an auxiliary super-resolution task during training, and a masking prior to guide the self-attention mechanism. In this paper, we study the combination of a transformer-based generator and a convolutional discriminator and successfully remove the need for the aforementioned design choices. We evaluate our approach by conducting a benchmark of well-known CNN discriminators, ablate the size of the transformer-based generator, and show that combining both architectural elements into a hybrid model leads to better results. Furthermore, we investigate the frequency spectrum properties of generated images and observe that our model retains the benefits of an attention-based generator.
Seismic data processing involves techniques to deal with undesired effects that occur during acquisition and pre-processing. These effects mainly comprise coherent artefacts such as multiples, non-coherent signals such as electrical noise, and loss of signal information at the receivers that leads to incomplete traces. In this work, we employ a generative solution, since it can explicitly model complex data distributions and hence lead to a better decision-making process. In particular, we introduce diffusion models for multiple removal. To that end, we run experiments on synthetic and on real data, and we compare the deep diffusion performance with standard algorithms. We believe that our pioneering study not only demonstrates the capability of diffusion models, but also opens the door to future research on integrating generative models into seismic workflows.
Generative adversarial networks are the state-of-the-art approach towards learned synthetic image generation. Although early successes were mostly unsupervised, bit by bit, this trend has been superseded by approaches based on labelled data. These supervised methods allow a much finer-grained control of the output image, offering more flexibility and stability. Nevertheless, the main drawback of such models is the necessity of annotated data. In this work, we introduce a novel framework that benefits from two popular learning techniques, adversarial training and representation learning, and takes a step towards unsupervised conditional GANs. In particular, our approach exploits the structure of a latent space (learned by the representation learning) and employs it to condition the generative model. In this way, we break the traditional dependency between condition and label, substituting the latter with unsupervised features coming from the latent space. Finally, we show that this new technique is able to produce samples on demand while keeping the quality of its supervised counterpart.
Facial image manipulation is a generation task where the output face is shifted towards an intended target direction in terms of facial attributes and styles. Recent works have achieved great success in various editing techniques such as style transfer and attribute translation. However, current approaches focus either on pure style transfer or on the translation of predefined sets of attributes with restricted interactivity. To address this issue, we propose FacialGAN, a novel framework enabling simultaneous rich style transfers and interactive facial attribute manipulation. While preserving the identity of a source image, we transfer the diverse styles of a target image to the source image. We then incorporate the geometry information of a segmentation mask to provide fine-grained manipulation of facial attributes. Finally, a multi-objective learning strategy is introduced to optimize the loss of each specific task. Experiments on the CelebA-HQ dataset, with CelebAMask-HQ as semantic mask labels, show our model’s capacity to produce visually compelling results in style transfer, attribute manipulation, diversity and face verification. For reproducibility, we provide an interactive open-source tool to perform facial manipulations, and the PyTorch implementation of the model.
A fundamental and still largely unsolved question in the context of Generative Adversarial Networks is whether they are truly able to capture the real data distribution and, consequently, to sample from it. In particular, the high-dimensional nature of image distributions makes it difficult to evaluate the diversity of GAN distributions. Existing approaches provide only a partial understanding of this issue, leaving the question unanswered. In this work, we introduce a loop-training scheme for the systematic investigation of observable shifts between the distributions of real training data and GAN-generated data. Additionally, we introduce several bounded measures for distribution shifts, which are easy both to compute and to interpret. Overall, the combination of these methods allows an explorative investigation of innate limitations of current GAN algorithms. Our experiments on different datasets and multiple state-of-the-art GAN architectures show large shifts between input and output distributions, indicating that existing theoretical guarantees on the convergence of output distributions do not appear to hold in practice.
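The abstract does not name its bounded shift measures. As a minimal sketch of what such a measure can look like, the following computes the Jensen-Shannon divergence (log base 2, so the value lies in [0, 1]) between histograms of a scalar feature. The choice of measure, the `histogram` helper, the bin count and the toy samples are all assumptions made for illustration, not the paper's method.

```python
import math

def histogram(samples, bins, lo, hi):
    """Normalized histogram over [lo, hi]; samples are assumed to lie in that range."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for s in samples:
        idx = min(int((s - lo) / width), bins - 1)  # put s == hi into the last bin
        counts[idx] += 1
    total = len(samples)
    return [c / total for c in counts]

def js_divergence(p, q):
    """Jensen-Shannon divergence with log base 2, bounded in [0, 1]."""
    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical feature values for real and generated samples.
real = [0.1, 0.2, 0.4, 0.5, 0.7, 0.9]
fake = [0.4, 0.45, 0.5, 0.5, 0.55, 0.6]

p = histogram(real, bins=4, lo=0.0, hi=1.0)
q = histogram(fake, bins=4, lo=0.0, hi=1.0)
shift = js_divergence(p, q)
print(f"JS divergence between real and generated histograms: {shift:.3f}")
```

A value of 0 means the two histograms are identical and 1 means they do not overlap at all, which is what makes a bounded measure easy to interpret across datasets.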
Generative convolutional deep neural networks, e.g. popular GAN architectures, rely on convolution-based up-sampling methods to produce non-scalar outputs like images or video sequences. In this paper, we show that common up-sampling methods, known as up-convolution or transposed convolution, cause the inability of such models to reproduce the spectral distributions of natural training data correctly. This effect is independent of the underlying architecture, and we show that it can be used to easily detect generated data like deepfakes with up to 100% accuracy on public benchmarks. To overcome this drawback of current generative models, we propose adding a novel spectral regularization term to the training objective. We show that this approach not only allows training spectrally consistent GANs that avoid high-frequency errors, but also that a correct approximation of the frequency spectrum has positive effects on the training stability and output quality of generative networks.
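One way to see why such up-sampling can distort spectra: a stride-2 transposed convolution can be viewed as inserting zeros between samples and then applying an ordinary convolution. The sketch below is a 1D illustration under that view, not the paper's analysis. It shows that zero insertion alone copies the low-frequency spectrum into the high-frequency band, so the learned kernel has to suppress that copy. The signal values are made up.

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def zero_insert(x):
    """Up-sample by a factor of 2 by placing a zero after every sample."""
    out = []
    for v in x:
        out.extend([v, 0.0])
    return out

# Hypothetical smooth signal whose energy sits mostly at low frequencies.
signal = [1.0, 0.8, 0.5, 0.3, 0.2, 0.3, 0.5, 0.8]

spec = [abs(c) for c in dft(signal)]                  # 8 frequency bins
spec_up = [abs(c) for c in dft(zero_insert(signal))]  # 16 frequency bins

# The up-sampled magnitude spectrum is the original spectrum repeated twice.
for k in range(len(spec_up)):
    print(k, round(spec_up[k], 4), round(spec[k % len(spec)], 4))
```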
Deep generative models have recently achieved impressive results for many real-world applications, successfully generating high-resolution and diverse samples from complex datasets. As a result of this improvement, fake digital content has proliferated, raising concern and spreading distrust in image content, and creating an urgent need for automated ways to detect these AI-generated fake images.
Although many face editing algorithms seem to produce realistic human faces, upon closer examination they do exhibit artifacts in certain domains which are often hidden to the naked eye. In this work, we present a simple way to detect such fake face images, so-called DeepFakes. Our method is based on a classical frequency domain analysis followed by a basic classifier. Compared to previous systems, which need to be fed with large amounts of labeled data, our approach showed very good results using only a few annotated training samples and even achieved good accuracies in fully unsupervised scenarios. For the evaluation on high-resolution face images, we combined several public datasets of real and fake faces into a new benchmark: Faces-HQ. Given such high-resolution images, our approach reaches a perfect classification accuracy of 100% when it is trained on as few as 20 annotated samples. In a second experiment, on the medium-resolution images of the CelebA dataset, our method achieves 100% accuracy in a supervised setting and 96% in an unsupervised setting. Finally, on low-resolution video sequences of the FaceForensics++ dataset, our method achieves 91% accuracy in detecting manipulated videos.
The term attribute transfer refers to the task of altering images in such a way that the semantic interpretation of a given input image is shifted towards an intended direction, which is quantified by semantic attributes. Prominent example applications are photorealistic changes of facial features and expressions, like changing the hair color, adding a smile or enlarging the nose, or altering the entire context of a scene, like transforming a summer landscape into a winter panorama. Recent advances in attribute transfer are mostly based on generative deep neural networks, using various techniques to manipulate images in the latent space of the generator.
In this paper, we present a novel method for the common sub-task of local attribute transfers, where only parts of a face have to be altered in order to achieve semantic changes (e.g. removing a mustache). In contrast to previous methods, where such local changes have been implemented by generating new (global) images, we propose to formulate local attribute transfers as an inpainting problem. By removing and regenerating only parts of images, our Attribute Transfer Inpainting Generative Adversarial Network (ATI-GAN) is able to utilize local context information to focus on the attributes while keeping the background unmodified, resulting in visually sound results.
Recent deep learning based approaches have shown remarkable success on object segmentation tasks. However, there is still room for further improvement. Inspired by generative adversarial networks, we present a generic end-to-end adversarial approach, which can be combined with a wide range of existing semantic segmentation networks to improve their segmentation performance. The key element of our method is to replace the commonly used binary adversarial loss with a high-resolution pixel-wise loss. In addition, we train our generator using stochastic weight averaging, which further enhances the predicted output label maps, leading to state-of-the-art results. We show that this combination of pixel-wise adversarial training and weight averaging leads to significant and consistent gains in segmentation performance compared to the baseline models.
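Stochastic weight averaging keeps a running mean of weight snapshots taken during training and uses the averaged weights for prediction. The sketch below shows only the incremental averaging step on plain lists of floats. The snapshot values and the `swa_update` helper are hypothetical, and the abstract does not state when or how often snapshots are taken.

```python
def swa_update(avg_weights, new_weights, n_averaged):
    """Fold one new weight snapshot into a running mean of n_averaged snapshots."""
    return [a + (w - a) / (n_averaged + 1)
            for a, w in zip(avg_weights, new_weights)]

# Hypothetical generator weights captured at three points in training.
snapshots = [
    [0.9, -0.2],
    [1.1, 0.0],
    [1.0, 0.2],
]

avg = snapshots[0]
for n, weights in enumerate(snapshots[1:], start=1):
    avg = swa_update(avg, weights, n)

print(avg)  # element-wise mean of the three snapshots, up to rounding
```

The incremental form avoids storing every snapshot: only the current average and a counter are needed.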
Recent studies have shown remarkable success in image-to-image translation for attribute transfer applications. However, most existing approaches are based on deep learning and require an abundant amount of labeled data to produce good results, which limits their applicability. In the same vein, recent advances in meta-learning have led to successful implementations with limited available data, enabling so-called few-shot learning.
In this paper, we address this limitation of supervised methods by proposing a novel approach based on GANs. These are trained in a meta-training manner, which allows them to perform image-to-image translations using just a few labeled samples from a new target class. This work empirically demonstrates the potential of training a GAN for few-shot image-to-image translation on hair color attribute synthesis tasks, opening the door to further research on generative transfer learning.
In this preliminary report, we present a simple but very effective technique to stabilize the training of CNN-based GANs. Motivated by recently published methods using frequency decomposition of convolutions (e.g. Octave Convolutions), we propose a novel convolution scheme to stabilize the training and reduce the likelihood of mode collapse. The basic idea of our approach is to split convolutional filters into additive high- and low-frequency parts, while shifting weight updates from low to high during training. Intuitively, this method forces GANs to learn coarse low-frequency image structures before descending into fine (high-frequency) details. Our approach is orthogonal and complementary to existing stabilization methods and can simply be plugged into any CNN-based GAN architecture. First experiments on the CelebA dataset show the effectiveness of the proposed method.
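The abstract gives neither the exact filter decomposition nor the update schedule. As a rough sketch of the idea only, the code below splits a kernel into a constant part (a crude low-frequency component) and a zero-mean residual (the high-frequency component), and ramps per-part update scales linearly from low to high. Both the mean/residual split and the linear ramp are assumptions standing in for the report's actual scheme.

```python
def split_kernel(kernel):
    """Split a 2D kernel into a constant part and a zero-mean residual."""
    flat = [v for row in kernel for v in row]
    mean = sum(flat) / len(flat)
    low = [[mean for _ in row] for row in kernel]
    high = [[v - mean for v in row] for row in kernel]
    return low, high

def update_scales(step, total_steps):
    """Hypothetical linear ramp: returns (low_scale, high_scale) for weight updates."""
    a = min(1.0, step / total_steps)
    return 1.0 - a, a

kernel = [
    [1.0, 2.0, 1.0],
    [2.0, 4.0, 2.0],
    [1.0, 2.0, 1.0],
]
low, high = split_kernel(kernel)

# The parts are additive: low + high reproduces the original kernel.
recombined = [[l + h for l, h in zip(lr, hr)] for lr, hr in zip(low, high)]

for step in (0, 50, 100):
    print(step, update_scales(step, total_steps=100))
```

Because convolution is linear, applying `low` and `high` separately and summing the outputs gives the same result as applying the full kernel, so the split changes only how updates are distributed, not what the layer can represent.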
Interpreting seismic data requires the characterization of a number of key elements such as the position of faults and main reflections, presence of structural bodies, and clustering of areas exhibiting a similar amplitude versus angle response. Manual interpretation of geophysical data is often a difficult and time-consuming task, complicated by lack of resolution and presence of noise. In recent years, approaches based on convolutional neural networks have shown remarkable results in automating certain interpretative tasks. However, these state-of-the-art systems usually need to be trained in a supervised manner, and they suffer from a generalization problem. Hence, it is highly challenging to train a model that can yield accurate results on new real data obtained with different acquisition, processing, and geology than the data used for training. In this work, we introduce a novel method that combines generative neural networks with a segmentation task in order to decrease the gap between annotated training data and uninterpreted target data. We validate our approach on two applications: the detection of diffraction events and the picking of faults. We show that when transitioning from synthetic training data to real validation data, our workflow yields superior results compared to its counterpart without the generative network.
Catheter ablation with radiofrequency (RF) current is the gold standard for the therapy of many cardiac tachyarrhythmias. RF ablation produces temperatures between 50 °C and 70 °C, which allows specific structures in the cardiac tissue to be destroyed in a targeted manner. The aim of the study is to simulate RF ablation and its heat propagation as a function of the delivered power, with different electrode materials and electrode sizes, in supraventricular tachycardias.
Background: Radiofrequency (RF) ablation is an established method for the treatment of tachyarrhythmias. Ablation with RF current leads to the targeted thermal destruction of myocardial tissue at specific sites and thus prevents the pathological propagation of excitation through these structures.
Purpose: The aim of this study was to simulate heat propagation during RF ablation with modeled electrodes in different sizes and materials. The simulation was performed on atrioventricular node re-entry tachycardia (AVNRT), atrioventricular re-entry tachycardia (AVRT) and atrial flutter (AFL).
Methods: Using the modeling and simulation software CST, ablation catheters with 4 mm and 8 mm tip electrodes were modeled, each in gold and in platinum. The designed catheters correspond to the manufacturers' specifications of Medtronic, Biotronik and Osypka. The catheters were integrated into the Offenburg heart rhythm model, which also takes into account the blood flow in the four heart chambers, to simulate and compare the heat propagation during an ablation application. A power of 5 W to 40 W was simulated for the 4 mm electrodes and a power of 50 W to 80 W for the 8 mm electrodes.
Results: During the simulated RF ablation application, the temperature at the ablation electrode was measured at different powers. It was 40.67 °C at 5 W, 44.34 °C at 10 W, 51.76 °C at 20 W, 59.0 °C at 30 W, and 66.33 °C at 40 W. During the 40 W application, the measured temperature was 39.5 °C at 0.5 mm depth in the myocardium and 37.5 °C at 2 mm depth.
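The five power and temperature readings reported for 5 W to 40 W lie close to a straight line. The following sketch fits an ordinary least-squares line to them, assuming a linear relationship purely for illustration. The fit gives a slope of roughly 0.73 °C per watt and an intercept near 37 °C, which would be consistent with a body-temperature baseline, although the abstract does not state the baseline used in the simulation.

```python
# Electrode temperatures reported in the abstract for 5 W to 40 W.
powers = [5, 10, 20, 30, 40]                  # watts
temps = [40.67, 44.34, 51.76, 59.0, 66.33]    # degrees Celsius

def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = linear_fit(powers, temps)
print(f"slope: {slope:.3f} degC per W, intercept: {intercept:.2f} degC")

# Largest deviation of the reported values from the fitted line.
max_residual = max(abs(t - (slope * p + intercept)) for p, t in zip(powers, temps))
print(f"max residual: {max_residual:.2f} degC")
```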
In the simulation, the 8 mm platinum electrode reached an ablation temperature of 72.85 °C at its tip at an applied power of 60 W, with 39.5 °C at a depth of 5 mm and 37.5 °C at a depth of 2 mm. In contrast, the 8 mm gold electrode reached a temperature of 64.66 °C at the same power. This is attributed to the thermal properties of gold, which has a higher thermal conductivity than platinum.
Conclusions: CST makes it possible to carry out static and dynamic simulations of a heart model and of the ablation electrodes integrated into it during RF ablation. By varying electrode sizes and materials, therapy methods for the treatment of AVNRT, AVRT and AFL can be optimized.
Most eCommerce applications, such as web shops, have millions of products. In this context, the identification of similar products is a common sub-task, which can be utilized in the implementation of recommendation systems, product search engines and internal supply logistics. By providing this data set, we aim to boost the evaluation of machine learning methods for predicting the category of retail products from tuples of images and descriptions.