Refine
Document Type
- Conference Proceeding (14) (remove)
Conference Type
- Konferenzartikel (14)
Language
- English (14)
Has Fulltext
- no (14)
Is part of the Bibliography
- yes (14)
Keywords
- Generative Adversarial Network (3)
- Geophysik (2)
- Stability (2)
- Computer Vision (1)
- Deep Leaning (1)
- Eigenvalues (1)
- Machine Learning (1)
- Mode Collapse (1)
- Octave Convolution (1)
- Pattern Recognition (1)
Institute
Open Access
- Open Access (6)
- Closed (4)
- Closed Access (4)
Diffracted waves carry high resolution information that can help interpreting fine structural details at a scale smaller than the seismic wavelength. Because of the low signal-to-noise ratio of diffracted waves, it is challenging to preserve them during processing and to identify them in the final data. It is, therefore, a traditional approach to pick manually the diffractions. However, such task is tedious and often prohibitive, thus, current attention is given to domain adaptation. Those methods aim to transfer knowledge from a labeled domain to train the model, and then infer on the real unlabeled data. In this regard, it is common practice to create a synthetic labeled training dataset, followed by testing on unlabeled real data. Unfortunately, such procedure may fail due to the existing gap between the synthetic and the real distribution since quite often synthetic data oversimplifies the problem, and consequently the transfer learning becomes a hard and non-trivial procedure. Furthermore, deep neural networks are characterized by their high sensitivity towards cross-domain distribution shift. In this work, we present deep learning model that builds a bridge between both distributions creating a semi-synthetic datatset that fills in the gap between synthetic and real domains. More specifically, our proposal is a feed-forward, fully convolutional neural network for imageto-image translation that allows to insert synthetic diffractions while preserving the original reflection signal. A series of experiments validate that our approach produces convincing seismic data containing the desired synthetic diffractions.
Generative convolutional deep neural networks, e.g. popular GAN architectures, are relying on convolution based up-sampling methods to produce non-scalar outputs like images or video sequences. In this paper, we show that common up-sampling methods, i.e. known as up-convolution or transposed convolution, are causing the inability of such models to reproduce spectral distributions of natural training data correctly. This effect is independent of the underlying architecture and we show that it can be used to easily detect generated data like deepfakes with up to 100% accuracy on public benchmarks. To overcome this drawback of current generative models, we propose to add a novel spectral regularization term to the training optimization objective. We show that this approach not only allows to train spectral consistent GANs that are avoiding high frequency errors. Also, we show that a correct approximation of the frequency spectrum has positive effects on the training stability and output quality of generative networks.
Detecting Images Generated by Deep Diffusion Models using their Local Intrinsic Dimensionality
(2023)
Diffusion models recently have been successfully applied for the visual synthesis of strikingly realistic appearing images. This raises strong concerns about their potential for malicious purposes. In this paper, we propose using the lightweight multi Local Intrinsic Dimensionality (multiLID), which has been originally developed in context of the detection of adversarial examples, for the automatic detection of synthetic images and the identification of the according generator networks. In contrast to many existing detection approaches, which often only work for GAN-generated images, the proposed method provides close to perfect detection results in many realistic use cases. Extensive experiments on known and newly created datasets demonstrate that the proposed multiLID approach exhibits superiority in diffusion detection and model identification.Since the empirical evaluations of recent publications on the detection of generated images are often mainly focused on the "LSUN-Bedroom" dataset, we further establish a comprehensive benchmark for the detection of diffusion-generated images, including samples from several diffusion models with different image sizes.The code for our experiments is provided at https://github.com/deepfake-study/deepfake-multiLID.
Seismic data processing involves techniques to deal with undesired effects that occur during acquisition and pre-processing. These effects mainly comprise coherent artefacts such as multiples, non-coherent signals such as electrical noise, and loss of signal information at the receivers that leads to incomplete traces. In this work, we employ a generative solution, since it can explicitly model complex data distributions and hence, yield to a better decision-making process. In particular, we introduce diffusion models for multiple removal. To that end, we run experiments on synthetic and on real data, and we compare the deep diffusion performance with standard algorithms. We believe that our pioneer study not only demonstrates the capability of diffusion models, but also opens the door to future research to integrate generative models in seismic workflows.
Recent deep learning based approaches have shown remarkable success on object segmentation tasks. However, there is still room for further improvement. Inspired by generative adversarial networks, we present a generic end-to-end adversarial approach, which can be combined with a wide range of existing semantic segmentation networks to improve their segmentation performance. The key element of our method is to replace the commonly used binary adversarial loss with a high resolution pixel-wise loss. In addition, we train our generator employing stochastic weight averaging fashion, which further enhances the predicted output label maps leading to state-of-the-art results. We show, that this combination of pixel-wise adversarial training and weight averaging leads to significant and consistent gains in segmentation performance, compared to the baseline models.
The term “attribute transfer” refers to the tasks of altering images in such a way, that the semantic interpretation of a given input image is shifted towards an intended direction, which is quantified by semantic attributes. Prominent example applications are photo realistic changes of facial features and expressions, like changing the hair color, adding a smile, enlarging the nose or altering the entire context of a scene, like transforming a summer landscape into a winter panorama. Recent advances in attribute transfer are mostly based on generative deep neural networks, using various techniques to manipulate images in the latent space of the generator. In this paper, we present a novel method for the common sub-task of local attribute transfers, where only parts of a face have to be altered in order to achieve semantic changes (e.g. removing a mustache). In contrast to previous methods, where such local changes have been implemented by generating new (global) images, we propose to formulate local attribute transfers as an inpainting problem. Removing and regenerating only parts of images, our “Attribute Transfer Inpainting Generative Adversarial Network” (ATI-GAN) is able to utilize local context information to focus on the attributes while keeping the background unmodified resulting in visually sound results.
Seismic data has often missing traces due to technical acquisition or economical constraints. A compete dataset is crucial in several processing and inversion techniques. Deep learning algorithms, based on convolutional neural networks (CNNs), have shown alternative solutions that overcome limitation of traditional interpolation methods e.g. data regularity, linearity assumption, etc. There are two different paradigms of CNN methods for seismic interpolation. The first one, so-called deep prior interpolation (DPI), trains a CNN to map random noise to a complete seismic image using only the decimated image itself. The second one, referred as standard deep learning method, trains a CNN to map a decimated seismic image into a complete one using a dataset of complete and artificially decimated images. Within this research, we systematically compare the performance of both methods for different quantities of regular and irregular missing traces using 4 datasets. We evaluate the results of both methods using 5 well-known metrics. We found that DPI method performs better than the standard method if the percentage of missing traces is low (10%) and otherwise if the level of decimation is high (50%).
In this work, we explore three deep learning algorithms apply to seismic interpolation: deep prior image (DPI), standard, and generative adversarial networks (GAN). The standard and GAN approaches rely on a dataset of complete and decimated seismic images for the training process, while the DPI method learns from a decimated image itself, without training images. We carry out two main experiments, considering 10%, 30%, and 50% of regular and irregular decimation. The first tests the optimal situation for the GAN and the standard approaches, where training and testing images are from the same dataset. The second tests the ability of GAN and standard methods to learn simultaneously from three datasets, and generalize to a fourth dataset not used during training. The standard method provides the best results in the first experiment, when the training distribution is similar to the testing one. In this situation, the DPI approach reports the second best results. In the second experiment, the standard method shows the ability to learn simultaneously and effectively three data distributions for the regular case. In the irregular case, the DPI approach is more effective. The GAN approach is the less effective of the three deep learning methods in both experiments.
A fundamental and still largely unsolved question in the context of Generative Adversarial Networks is whether they are truly able to capture the real data distribution and, consequently, to sample from it. In particular, the multidimensional nature of image distributions leads to a complex evaluation of the diversity of GAN distributions. Existing approaches provide only a partial understanding of this issue, leaving the question unanswered. In this work, we introduce a loop-training scheme for the systematic investigation of observable shifts between the distributions of real training data and GAN generated data. Additionally, we introduce several bounded measures for distribution shifts, which are both easy to compute and to interpret. Overall, the combination of these methods allows an explorative investigation of innate limitations of current GAN algorithms. Our experiments on different data-sets and multiple state-of-the-art GAN architectures show large shifts between input and output distributions, showing that existing theoretical guarantees towards the convergence of output distributions appear not to be holding in practice.
Facial image manipulation is a generation task where the output face is shifted towards an intended target direction in terms of facial attribute and styles. Recent works have achieved great success in various editing techniques such as style transfer and attribute translation. However, current approaches are either focusing on pure style transfer, or on the translation of predefined sets of attributes with restricted interactivity. To address this issue, we propose FacialGAN, a novel framework enabling simultaneous rich style transfers and interactive facial attributes manipulation. While preserving the identity of a source image, we transfer the diverse styles of a target image to the source image. We then incorporate the geometry information of a segmentation mask to provide a fine-grained manipulation of facial attributes. Finally, a multi-objective learning strategy is introduced to optimize the loss of each specific tasks. Experiments on the CelebA-HQ dataset, with CelebAMask-HQ as semantic mask labels, show our model’s capacity in producing visually compelling results in style transfer, attribute manipulation, diversity and face verification. For reproducibility, we provide an interactive open-source tool to perform facial manipulations, and the Pytorch implementation of the model.