Refine
Document Type
- Article (unreviewed) (17) (remove)
Is part of the Bibliography
- yes (17) (remove)
Keywords
- Deep Learning (3)
- Advanced Footwear Technology (2)
- Biomechanics (2)
- Exercise Science (2)
- 1,5-Grad-Ziel (1)
- Abhängigkeit des deutschen Elektrizitätssystems (1)
- COVID-19 (1)
- Cryptography (1)
- Deep Leaning (1)
- DenseNet (1)
Institute
- Fakultät Elektrotechnik, Medizintechnik und Informatik (EMI) (ab 04/2019) (7)
- IMLA - Institute for Machine Learning and Analytics (5)
- Fakultät Maschinenbau und Verfahrenstechnik (M+V) (4)
- Fakultät Medien (M) (ab 22.04.2021) (2)
- Fakultät Wirtschaft (W) (2)
- IBMS - Institute for Advanced Biomechanics and Motion Studies (ab 16.11.2022) (2)
- INES - Institut für nachhaltige Energiesysteme (2)
- ACI - Affective and Cognitive Institute (1)
Open Access
- Diamond (17) (remove)
The mathematical representations of data in the Spherical Harmonic (SH) domain has recently regained increasing interest in the machine learning community. This technical report gives an in-depth introduction to the theoretical foundation and practical implementation of SH representations, summarizing works on rotation invariant and equivariant features, as well as convolutions and exact correlations of signals on spheres. In extension, these methods are then generalized from scalar SH representations to Vectorial Harmonics (VH), providing the same capabilities for 3d vector fields on spheres.
With the rising necessity of explainable artificial intelligence (XAI), we see an increase in task-dependent XAI methods on varying abstraction levels. XAI techniques on a global level explain model behavior and on a local level explain sample predictions. We propose a visual analytics workflow to support seamless transitions between global and local explanations, focusing on attributions and counterfactuals on time series classification. In particular, we adapt local XAI techniques (attributions) that are developed for traditional datasets (images, text) to analyze time series classification, a data type that is typically less intelligible to humans. To generate a global overview, we apply local attribution methods to the data, creating explanations for the whole dataset. These explanations are projected onto two dimensions, depicting model behavior trends, strategies, and decision boundaries. To further inspect the model decision-making as well as potential data errors, a what-if analysis facilitates hypothesis generation and verification on both the global and local levels. We constantly collected and incorporated expert user feedback, as well as insights based on their domain knowledge, resulting in a tailored analysis workflow and system that tightly integrates time series transformations into explanations. Lastly, we present three use cases, verifying that our technique enables users to (1)~explore data transformations and feature relevance, (2)~identify model behavior and decision boundaries, as well as, (3)~the reason for misclassifications.
Entity Matching (EM) defines the task of learning to group objects by transferring semantic concepts from example groups (=entities) to unseen data. Despite the general availability of image data in the context of many EM-problems, most currently available EM-algorithms solely rely on (textual) meta data. In this paper, we introduce the first publicly available large-scale dataset for "visual entity matching", based on a production level use case in the retail domain. Using scanned advertisement leaflets, collected over several years from different European retailers, we provide a total of ~786k manually annotated, high resolution product images containing ~18k different individual retail products which are grouped into ~3k entities. The annotation of these product entities is based on a price comparison task, where each entity forms an equivalence class of comparable products. Following on a first baseline evaluation, we show that the proposed "visual entity matching" constitutes a novel learning problem which can not sufficiently be solved using standard image based classification and retrieval algorithms. Instead, novel approaches which allow to transfer example based visual equivalent classes to new data are needed to address the proposed problem. The aim of this paper is to provide a benchmark for such algorithms.
Information about the dataset, evaluation code and download instructions are provided under https://www.retail-786k.org/.
Following the traditional paradigm of convolutional neural networks (CNNs), modern CNNs manage to keep pace with more recent, for example transformer-based, models by not only increasing model depth and width but also the kernel size. This results in large amounts of learnable model parameters that need to be handled during training. While following the convolutional paradigm with the according spatial inductive bias, we question the significance of \emph{learned} convolution filters. In fact, our findings demonstrate that many contemporary CNN architectures can achieve high test accuracies without ever updating randomly initialized (spatial) convolution filters. Instead, simple linear combinations (implemented through efficient 1×1 convolutions) suffice to effectively recombine even random filters into expressive network operators. Furthermore, these combinations of random filters can implicitly regularize the resulting operations, mitigating overfitting and enhancing overall performance and robustness. Conversely, retaining the ability to learn filter updates can impair network performance. Lastly, although we only observe relatively small gains from learning 3×3 convolutions, the learning gains increase proportionally with kernel size, owing to the non-idealities of the independent and identically distributed (\textit{i.i.d.}) nature of default initialization techniques.
Modern CNNs are learning the weights of vast numbers of convolutional operators. In this paper, we raise the fundamental question if this is actually necessary. We show that even in the extreme case of only randomly initializing and never updating spatial filters, certain CNN architectures can be trained to surpass the accuracy of standard training. By reinterpreting the notion of pointwise ($1\times 1$) convolutions as an operator to learn linear combinations (LC) of frozen (random) spatial filters, we are able to analyze these effects and propose a generic LC convolution block that allows tuning of the linear combination rate. Empirically, we show that this approach not only allows us to reach high test accuracies on CIFAR and ImageNet but also has favorable properties regarding model robustness, generalization, sparsity, and the total number of necessary weights. Additionally, we propose a novel weight sharing mechanism, which allows sharing of a single weight tensor between all spatial convolution layers to massively reduce the number of weights.
We have developed a methodology for the systematic generation of a large image dataset of macerated wood references, which we used to generate image data for nine hardwood genera. This is the basis for a substantial approach to automate, for the first time, the identification of hardwood species in microscopic images of fibrous materials by deep learning. Our methodology includes a flexible pipeline for easy annotation of vessel elements. We compare the performance of different neural network architectures and hyperparameters. Our proposed method performs similarly well to human experts. In the future, this will improve controls on global wood fiber product flows to protect forests.
High-tech running shoes and spikes ("super-footwear") are currently being debated in sports. There is direct evidence that distance running super shoes improve running economy; however, it is not well established to which extent world-class performances are affected over the range of track and road running events.
This study examined publicly available performance datasets of annual best track and road performances for evidence of potential systematic performance effects following the introduction of super footwear. The analysis was based on the 100 best performances per year for men and women in outdoor events from 2010 to 2022, provided by the world governing body of athletics (World Athletics).
We found evidence of progressing improvements in track and road running performances after the introduction of super distance running shoes in 2016 and super spike technology in 2019. This evidence is more pronounced for distances longer than 1500 m in women and longer than 5000 m in men. Women seem to benefit more from super footwear in distance running events than men.
While the observational study design limits causal inference, this study provides a database on potential systematic performance effects following the introduction of super shoes/spikes in track and road running events in world-class athletes. Further research is needed to examine the underlying mechanisms and, in particular, potential sex differences in the performance effects of super footwear.
Recent advances in spiked shoe design, characterized by increased longitudinal stiffness, thicker midsole foams, and reconfigured geometry are considered to improve sprint performance. However, so far there is no empirical data on the effects of advanced spikes technology on maximal sprinting speed (MSS) published yet. Consequently, we assessed MSS via ‘flying 30m’ sprints of 44 trained male (PR: 10.32 s - 12.08 s) and female (PR: 11.56 s - 14.18 s) athletes, wearing both traditional and advanced spikes in a randomized, repeated measures design. The results revealed a statistically significant increase in MSS by 1.21% on average when using advanced spikes technology. Notably, 87% of participants showed improved MSS with the use of advanced spikes. A cluster analysis unveiled that athletes with higher MSS may benefit to a greater extent. However, individual responses varied widely, suggesting the influence of multiple factors that need detailed exploration. Therefore, coaches and athletes are advised to interpret the promising performance enhancements cautiously and evaluate the appropriateness of the advanced spike technology for their athletes critically.
State-of-the-art models for pixel-wise prediction tasks such as image restoration, image segmentation, or disparity estimation, involve several stages of data resampling, in which the resolution of feature maps is first reduced to aggregate information and then sequentially increased to generate a high-resolution output. Several previous works have investigated the effect of artifacts that are invoked during downsampling and diverse cures have been proposed that facilitate to improve prediction stability and even robustness for image classification. However, equally relevant, artifacts that arise during upsampling have been less discussed. This is significantly relevant as upsampling and downsampling approaches face fundamentally different challenges. While during downsampling, aliases and artifacts can be reduced by blurring feature maps, the emergence of fine details is crucial during upsampling. Blurring is therefore not an option and dedicated operations need to be considered. In this work, we are the first to explore the relevance of context during upsampling by employing convolutional upsampling operations with increasing kernel size while keeping the encoder unchanged. We find that increased kernel sizes can in general improve the prediction stability in tasks such as image restoration or image segmentation, while a block that allows for a combination of small-size kernels for fine details and large-size kernels for artifact removal and increased context yields the best results.
The COVID19 pandemic, a unique and devastating respiratory disease outbreak, has affected global populations as the disease spreads rapidly. Recent Deep Learning breakthroughs may improve COVID19 prediction and forecasting as a tool of precise and fast detection, however, current methods are still being examined to achieve higher accuracy and precision. This study analyzed the collection contained 8055 CT image samples, 5427 of which were COVID cases and 2628 non COVID. The 9544 Xray samples included 4044 COVID patients and 5500 non COVID cases. The most accurate models are MobileNet V3 (97.872 percent), DenseNet201 (97.567 percent), and GoogleNet Inception V1 (97.643 percent). High accuracy indicates that these models can make many accurate predictions, as well as others, are also high for MobileNetV3 and DenseNet201. An extensive evaluation using accuracy, precision, and recall allows a comprehensive comparison to improve predictive models by combining loss optimization with scalable batch normalization in this study. Our analysis shows that these tactics improve model performance and resilience for advancing COVID19 prediction and detection and shows how Deep Learning can improve disease handling. The methods we suggest would strengthen healthcare systems, policymakers, and researchers to make educated decisions to reduce COVID19 and other contagious diseases.