Refine
Document Type
- Conference Proceeding (11)
- Article (reviewed) (1)
Conference Type
- Konferenzartikel (11)
Language
- English (12)
Has Fulltext
- no (12)
Is part of the Bibliography
- yes (12)
Keywords
- convolutional neural networks (2)
- image classification (2)
- adversarial attacks (1)
- adversarial detection (1)
- artificial intelligence (1)
- correlation (1)
- face recognition (1)
- generative adversarial networks (1)
- hair (1)
- image color analysis (1)
Institute
Open Access
- Closed Access (12) (remove)
Correlation Clustering, also called the minimum cost Multicut problem, is the process of grouping data by pairwise similarities. It has proven to be effective on clustering problems, where the number of classes is unknown. However, not only is the Multicut problem NP-hard, an undirected graph G with n vertices representing single images has at most edges, thus making it challenging to implement correlation clustering for large datasets. In this work, we propose Multi-Stage Multicuts (MSM) as a scalable approach for image clustering. Specifically, we solve minimum cost Multicut problems across multiple distributed compute units. Our approach not only allows to solve problem instances which are too large to fit into the shared memory of a single compute node, but it also achieves significant speedups while preserving the clustering accuracy at the same time. We evaluate our proposed method on the CIFAR10 …
Transformer models have recently attracted much interest from computer vision researchers and have since been successfully employed for several problems traditionally addressed with convolutional neural networks. At the same time, image synthesis using generative adversarial networks (GANs) has drastically improved over the last few years. The recently proposed TransGAN is the first GAN using only transformer-based architectures and achieves competitive results when compared to convolutional GANs. However, since transformers are data-hungry architectures, TransGAN requires data augmentation, an auxiliary super-resolution task during training, and a masking prior to guide the self-attention mechanism. In this paper, we study the combination of a transformer-based generator and convolutional discriminator and successfully remove the need of the aforementioned required design choices. We evaluate our approach by conducting a benchmark of well-known CNN discriminators, ablate the size of the transformer-based generator, and show that combining both architectural elements into a hybrid model leads to better results. Furthermore, we investigate the frequency spectrum properties of generated images and observe that our model retains the benefits of an attention based generator.
Interpreting seismic data requires the characterization of a number of key elements such as the position of faults and main reflections, presence of structural bodies, and clustering of areas exhibiting a similar amplitude versus angle response. Manual interpretation of geophysical data is often a difficult and time-consuming task, complicated by lack of resolution and presence of noise. In recent years, approaches based on convolutional neural networks have shown remarkable results in automating certain interpretative tasks. However, these state-of-the-art systems usually need to be trained in a supervised manner, and they suffer from a generalization problem. Hence, it is highly challenging to train a model that can yield accurate results on new real data obtained with different acquisition, processing, and geology than the data used for training. In this work, we introduce a novel method that combines generative neural networks with a segmentation task in order to decrease the gap between annotated training data and uninterpreted target data. We validate our approach on two applications: the detection of diffraction events and the picking of faults. We show that when transitioning from synthetic training data to real validation data, our workflow yields superior results compared to its counterpart without the generative network.
Despite the success of convolutional neural networks (CNNs) in many computer vision and image analysis tasks, they remain vulnerable against so-called adversarial attacks: Small, crafted perturbations in the input images can lead to false predictions. A possible defense is to detect adversarial examples. In this work, we show how analysis in the Fourier domain of input images and feature maps can be used to distinguish benign test samples from adversarial images. We propose two novel detection methods: Our first method employs the magnitude spectrum of the input images to detect an adversarial attack. This simple and robust classifier can successfully detect adversarial perturbations of three commonly used attack methods. The second method builds upon the first and additionally extracts the phase of Fourier coefficients of feature-maps at different layers of the network. With this extension, we are able to improve adversarial detection rates compared to state-of-the-art detectors on five different attack methods. The code for the methods proposed in the paper is available at github.com/paulaharder/SpectralAdversarialDefense
In this work, we evaluate two different image clustering objectives, k-means clustering and correlation clustering, in the context of Triplet Loss induced feature space embeddings. Specifically, we train a convolutional neural network to learn discriminative features by optimizing two popular versions of the Triplet Loss in order to study their clustering properties under the assumption of noisy labels. Additionally, we propose a new, simple Triplet Loss formulation, which shows desirable properties with respect to formal clustering objectives and outperforms the existing methods. We evaluate all three Triplet loss formulations for K-means and correlation clustering on the CIFAR-10 image classification dataset.
The term “attribute transfer” refers to the tasks of altering images in such a way, that the semantic interpretation of a given input image is shifted towards an intended direction, which is quantified by semantic attributes. Prominent example applications are photo realistic changes of facial features and expressions, like changing the hair color, adding a smile, enlarging the nose or altering the entire context of a scene, like transforming a summer landscape into a winter panorama. Recent advances in attribute transfer are mostly based on generative deep neural networks, using various techniques to manipulate images in the latent space of the generator. In this paper, we present a novel method for the common sub-task of local attribute transfers, where only parts of a face have to be altered in order to achieve semantic changes (e.g. removing a mustache). In contrast to previous methods, where such local changes have been implemented by generating new (global) images, we propose to formulate local attribute transfers as an inpainting problem. Removing and regenerating only parts of images, our “Attribute Transfer Inpainting Generative Adversarial Network” (ATI-GAN) is able to utilize local context information to focus on the attributes while keeping the background unmodified resulting in visually sound results.
Due to the rapidly increasing storage consumption worldwide, as well as the expectation of continuous availability of information, the complexity of administration in today’s data centers is growing permanently. Integrated techniques for monitoring hard disks can increase the reliability of storage systems. However, these techniques often lack intelligent data analysis to perform predictive maintenance. To solve this problem, machine learning algorithms can be used to detect potential failures in advance and prevent them. In this paper, an unsupervised model for predicting hard disk failures based on Isolation Forest is proposed. Consequently, a method is presented that can deal with the highly imbalanced datasets, as the experiment on the Backblaze benchmark dataset demonstrates.
The recent successes and wide spread application of compute intensive machine learning and data analytics methods have been boosting the usage of the Python programming language on HPC systems. While Python provides many advantages for the users, it has not been designed with a focus on multiuser environments or parallel programming - making it quite challenging to maintain stable and secure Python workflows on a HPC system. In this paper, we analyze the key problems induced by the usage of Python on HPC clusters and sketch appropriate workarounds for efficiently maintaining multi-user Python software environments, securing and restricting resources of Python jobs and containing Python processes, while focusing on Deep Learning applications running on GPU clusters.
Diffracted waves carry high resolution information that can help interpreting fine structural details at a scale smaller than the seismic wavelength. Because of the low signal-to-noise ratio of diffracted waves, it is challenging to preserve them during processing and to identify them in the final data. It is, therefore, a traditional approach to pick manually the diffractions. However, such task is tedious and often prohibitive, thus, current attention is given to domain adaptation. Those methods aim to transfer knowledge from a labeled domain to train the model, and then infer on the real unlabeled data. In this regard, it is common practice to create a synthetic labeled training dataset, followed by testing on unlabeled real data. Unfortunately, such procedure may fail due to the existing gap between the synthetic and the real distribution since quite often synthetic data oversimplifies the problem, and consequently the transfer learning becomes a hard and non-trivial procedure. Furthermore, deep neural networks are characterized by their high sensitivity towards cross-domain distribution shift. In this work, we present deep learning model that builds a bridge between both distributions creating a semi-synthetic datatset that fills in the gap between synthetic and real domains. More specifically, our proposal is a feed-forward, fully convolutional neural network for imageto-image translation that allows to insert synthetic diffractions while preserving the original reflection signal. A series of experiments validate that our approach produces convincing seismic data containing the desired synthetic diffractions.
Recent deep learning based approaches have shown remarkable success on object segmentation tasks. However, there is still room for further improvement. Inspired by generative adversarial networks, we present a generic end-to-end adversarial approach, which can be combined with a wide range of existing semantic segmentation networks to improve their segmentation performance. The key element of our method is to replace the commonly used binary adversarial loss with a high resolution pixel-wise loss. In addition, we train our generator employing stochastic weight averaging fashion, which further enhances the predicted output label maps leading to state-of-the-art results. We show, that this combination of pixel-wise adversarial training and weight averaging leads to significant and consistent gains in segmentation performance, compared to the baseline models.
Current training methods for deep neural networks boil down to very high dimensional and non-convex optimization problems which are usually solved by a wide range of stochastic gradient descent methods. While these approaches tend to work in practice, there are still many gaps in the theoretical understanding of key aspects like convergence and generalization guarantees, which are induced by the properties of the optimization surface (loss landscape). In order to gain deeper insights, a number of recent publications proposed methods to visualize and analyze the otimization surfaces. However, the computational cost of these methods are very high, making it hardly possible to use them on larger networks. In this paper, we present the GradVis Toolbox, an open source library for efficient and scalable visualization and analysis of deep neural network loss landscapes in Tesorflow and PyTorch. Introducing more efficient mathematical formulations and a novel parallelization scheme, GradVis allows to plot 2d and 3d projections of optimization surfaces and trajectories, as well as high resolution second order gradient information for large networks.
Most machine learning methods require careful selection of hyper-parameters in order to train a high performing model with good generalization abilities. Hence, several automatic selection algorithms have been introduced to overcome tedious manual (try and error) tuning of these parameters. Due to its very high sample efficiency, Bayesian Optimization over a Gaussian Processes modeling of the parameter space has become the method of choice. Unfortunately, this approach suffers from a cubic compute complexity due to underlying Cholesky factorization, which makes it very hard to be scaled beyond a small number of sampling steps. In this paper, we present a novel, highly accurate approximation of the underlying Gaussian Process. Reducing its computational complexity from cubic to quadratic allows an efficient strong scaling of Bayesian Optimization while outperforming the previous approach regarding optimization accuracy. First experiments show speedups of a factor of 162 in single node and further speed up by a factor of 5 in a parallel environment.