Refine
Document Type
- Article (unreviewed) (16) (remove)
Language
- English (16)
Has Fulltext
- no (16) (remove)
Is part of the Bibliography
- yes (16)
Keywords
- Deep Learning (3)
- Advanced Footwear Technology (2)
- Biomechanics (2)
- Exercise Science (2)
- COVID-19 (1)
- Cryptography (1)
- Deep Leaning (1)
- DenseNet (1)
- DenseNet201 (1)
- Disease Detection (1)
Institute
- Fakultät Elektrotechnik, Medizintechnik und Informatik (EMI) (ab 04/2019) (7)
- IMLA - Institute for Machine Learning and Analytics (5)
- Fakultät Maschinenbau und Verfahrenstechnik (M+V) (3)
- Fakultät Medien (M) (ab 22.04.2021) (2)
- Fakultät Wirtschaft (W) (2)
- IBMS - Institute for Advanced Biomechanics and Motion Studies (ab 16.11.2022) (2)
- ACI - Affective and Cognitive Institute (1)
- INES - Institut für nachhaltige Energiesysteme (1)
Open Access
- Diamond (16) (remove)
State-of-the-art models for pixel-wise prediction tasks such as image restoration, image segmentation, or disparity estimation, involve several stages of data resampling, in which the resolution of feature maps is first reduced to aggregate information and then sequentially increased to generate a high-resolution output. Several previous works have investigated the effect of artifacts that are invoked during downsampling and diverse cures have been proposed that facilitate to improve prediction stability and even robustness for image classification. However, equally relevant, artifacts that arise during upsampling have been less discussed. This is significantly relevant as upsampling and downsampling approaches face fundamentally different challenges. While during downsampling, aliases and artifacts can be reduced by blurring feature maps, the emergence of fine details is crucial during upsampling. Blurring is therefore not an option and dedicated operations need to be considered. In this work, we are the first to explore the relevance of context during upsampling by employing convolutional upsampling operations with increasing kernel size while keeping the encoder unchanged. We find that increased kernel sizes can in general improve the prediction stability in tasks such as image restoration or image segmentation, while a block that allows for a combination of small-size kernels for fine details and large-size kernels for artifact removal and increased context yields the best results.
Following the traditional paradigm of convolutional neural networks (CNNs), modern CNNs manage to keep pace with more recent, for example transformer-based, models by not only increasing model depth and width but also the kernel size. This results in large amounts of learnable model parameters that need to be handled during training. While following the convolutional paradigm with the according spatial inductive bias, we question the significance of \emph{learned} convolution filters. In fact, our findings demonstrate that many contemporary CNN architectures can achieve high test accuracies without ever updating randomly initialized (spatial) convolution filters. Instead, simple linear combinations (implemented through efficient 1×1 convolutions) suffice to effectively recombine even random filters into expressive network operators. Furthermore, these combinations of random filters can implicitly regularize the resulting operations, mitigating overfitting and enhancing overall performance and robustness. Conversely, retaining the ability to learn filter updates can impair network performance. Lastly, although we only observe relatively small gains from learning 3×3 convolutions, the learning gains increase proportionally with kernel size, owing to the non-idealities of the independent and identically distributed (\textit{i.i.d.}) nature of default initialization techniques.
Modern CNNs are learning the weights of vast numbers of convolutional operators. In this paper, we raise the fundamental question if this is actually necessary. We show that even in the extreme case of only randomly initializing and never updating spatial filters, certain CNN architectures can be trained to surpass the accuracy of standard training. By reinterpreting the notion of pointwise ($1\times 1$) convolutions as an operator to learn linear combinations (LC) of frozen (random) spatial filters, we are able to analyze these effects and propose a generic LC convolution block that allows tuning of the linear combination rate. Empirically, we show that this approach not only allows us to reach high test accuracies on CIFAR and ImageNet but also has favorable properties regarding model robustness, generalization, sparsity, and the total number of necessary weights. Additionally, we propose a novel weight sharing mechanism, which allows sharing of a single weight tensor between all spatial convolution layers to massively reduce the number of weights.
The COVID19 pandemic, a unique and devastating respiratory disease outbreak, has affected global populations as the disease spreads rapidly. Recent Deep Learning breakthroughs may improve COVID19 prediction and forecasting as a tool of precise and fast detection, however, current methods are still being examined to achieve higher accuracy and precision. This study analyzed the collection contained 8055 CT image samples, 5427 of which were COVID cases and 2628 non COVID. The 9544 Xray samples included 4044 COVID patients and 5500 non COVID cases. The most accurate models are MobileNet V3 (97.872 percent), DenseNet201 (97.567 percent), and GoogleNet Inception V1 (97.643 percent). High accuracy indicates that these models can make many accurate predictions, as well as others, are also high for MobileNetV3 and DenseNet201. An extensive evaluation using accuracy, precision, and recall allows a comprehensive comparison to improve predictive models by combining loss optimization with scalable batch normalization in this study. Our analysis shows that these tactics improve model performance and resilience for advancing COVID19 prediction and detection and shows how Deep Learning can improve disease handling. The methods we suggest would strengthen healthcare systems, policymakers, and researchers to make educated decisions to reduce COVID19 and other contagious diseases.
The mathematical representations of data in the Spherical Harmonic (SH) domain has recently regained increasing interest in the machine learning community. This technical report gives an in-depth introduction to the theoretical foundation and practical implementation of SH representations, summarizing works on rotation invariant and equivariant features, as well as convolutions and exact correlations of signals on spheres. In extension, these methods are then generalized from scalar SH representations to Vectorial Harmonics (VH), providing the same capabilities for 3d vector fields on spheres.
Recent advances in spiked shoe design, characterized by increased longitudinal stiffness, thicker midsole foams, and reconfigured geometry are considered to improve sprint performance. However, so far there is no empirical data on the effects of advanced spikes technology on maximal sprinting speed (MSS) published yet. Consequently, we assessed MSS via ‘flying 30m’ sprints of 44 trained male (PR: 10.32 s - 12.08 s) and female (PR: 11.56 s - 14.18 s) athletes, wearing both traditional and advanced spikes in a randomized, repeated measures design. The results revealed a statistically significant increase in MSS by 1.21% on average when using advanced spikes technology. Notably, 87% of participants showed improved MSS with the use of advanced spikes. A cluster analysis unveiled that athletes with higher MSS may benefit to a greater extent. However, individual responses varied widely, suggesting the influence of multiple factors that need detailed exploration. Therefore, coaches and athletes are advised to interpret the promising performance enhancements cautiously and evaluate the appropriateness of the advanced spike technology for their athletes critically.
Entity Matching (EM) defines the task of learning to group objects by transferring semantic concepts from example groups (=entities) to unseen data. Despite the general availability of image data in the context of many EM-problems, most currently available EM-algorithms solely rely on (textual) meta data. In this paper, we introduce the first publicly available large-scale dataset for "visual entity matching", based on a production level use case in the retail domain. Using scanned advertisement leaflets, collected over several years from different European retailers, we provide a total of ~786k manually annotated, high resolution product images containing ~18k different individual retail products which are grouped into ~3k entities. The annotation of these product entities is based on a price comparison task, where each entity forms an equivalence class of comparable products. Following on a first baseline evaluation, we show that the proposed "visual entity matching" constitutes a novel learning problem which can not sufficiently be solved using standard image based classification and retrieval algorithms. Instead, novel approaches which allow to transfer example based visual equivalent classes to new data are needed to address the proposed problem. The aim of this paper is to provide a benchmark for such algorithms.
Information about the dataset, evaluation code and download instructions are provided under https://www.retail-786k.org/.
Socially assistive robots (SARs) are becoming more prevalent in everyday life, emphasizing the need to make them socially acceptable and aligned with users' expectations. Robots' appearance impacts users' behaviors and attitudes towards them. Therefore, product designers choose visual qualities to give the robot a character and to imply its functionality and personality. In this work, we sought to investigate the effect of cultural differences on Israeli and German designers' perceptions and preferences regarding the suitable visual qualities of SARs in four different contexts: a service robot for an assisted living/retirement residence facility, a medical assistant robot for a hospital environment, a COVID-19 officer robot, and a personal assistant robot for domestic use. Our results indicate that Israeli and German designers share similar perceptions of visual qualities and most of the robotics roles. However, we found differences in the perception of the COVID-19 officer robot's role and, by that, its most suitable visual design. This work indicates that context and culture play a role in users' perceptions and expectations; therefore, they should be taken into account when designing new SARs for diverse contexts.
Running shoes were categorized either as motion control, cushioned, or minimal footwear in the past. Today, these categories blur and are not as clearly defined. Moreover, with the advances in manufacturing processes, it is possible to create individualized running shoes that incorporate features that meet individual biomechanical and experiential needs. However, specific ways to individualize footwear to reduce individual injury risk are poorly understood. Therefore, the purpose of this scoping review was to provide an overview of (1) footwear design features that have the potential for individualization; (2) human biomechanical variability as a theoretical foundation for individualization; (3) the literature on the differential responses to footwear design features between selected groups of individuals. These purposes focus exclusively on reducing running-related risk factors for overuse injuries. We included studies in the English language on adults that analyzed: (1) potential interaction effects between footwear design features and subgroups of runners or covariates (e.g., age, gender) for running-related biomechanical risk factors or injury incidences; (2) footwear perception for a systematically modified footwear design feature. Most of the included articles (n = 107) analyzed male runners. Several footwear design features (e.g., midsole characteristics, upper, outsole profile) show potential for individualization. However, the overall body of literature addressing individualized footwear solutions and the potential to reduce biomechanical risk factors is limited. Future studies should leverage more extensive data collections considering relevant covariates and subgroups while systematically modifying isolated footwear design features to inform footwear individualization.
We have developed a methodology for the systematic generation of a large image dataset of macerated wood references, which we used to generate image data for nine hardwood genera. This is the basis for a substantial approach to automate, for the first time, the identification of hardwood species in microscopic images of fibrous materials by deep learning. Our methodology includes a flexible pipeline for easy annotation of vessel elements. We compare the performance of different neural network architectures and hyperparameters. Our proposed method performs similarly well to human experts. In the future, this will improve controls on global wood fiber product flows to protect forests.
The identification of vulnerabilities is an important element in the software development life cycle to ensure the security of software. While vulnerability identification based on the source code is a well studied field, the identification of vulnerabilities on basis of a binary executable without the corresponding source code is more challenging. Recent research has shown, how such detection can be achieved by deep learning methods. However, that particular approach is limited to the identification of only 4 types of vulnerabilities. Subsequently, we analyze to what extent we could cover the identification of a larger variety of vulnerabilities. Therefore, a supervised deep learning approach using recurrent neural networks for the application of vulnerability detection based on binary executables is used. The underlying basis is a dataset with 50,651 samples of vulnerable code in the form of a standardized LLVM Intermediate Representation. The vectorised features of a Word2Vec model are used to train different variations of three basic architectures of recurrent neural networks (GRU, LSTM, SRNN). A binary classification was established for detecting the presence of an arbitrary vulnerability, and a multi-class model was trained for the identification of the exact vulnerability, which achieved an out-of-sample accuracy of 88% and 77%, respectively. Differences in the detection of different vulnerabilities were also observed, with non-vulnerable samples being detected with a particularly high precision of over 98%. Thus, the methodology presented allows an accurate detection of 23 (compared to 4) vulnerabilities.
With the rising necessity of explainable artificial intelligence (XAI), we see an increase in task-dependent XAI methods on varying abstraction levels. XAI techniques on a global level explain model behavior and on a local level explain sample predictions. We propose a visual analytics workflow to support seamless transitions between global and local explanations, focusing on attributions and counterfactuals on time series classification. In particular, we adapt local XAI techniques (attributions) that are developed for traditional datasets (images, text) to analyze time series classification, a data type that is typically less intelligible to humans. To generate a global overview, we apply local attribution methods to the data, creating explanations for the whole dataset. These explanations are projected onto two dimensions, depicting model behavior trends, strategies, and decision boundaries. To further inspect the model decision-making as well as potential data errors, a what-if analysis facilitates hypothesis generation and verification on both the global and local levels. We constantly collected and incorporated expert user feedback, as well as insights based on their domain knowledge, resulting in a tailored analysis workflow and system that tightly integrates time series transformations into explanations. Lastly, we present three use cases, verifying that our technique enables users to (1)~explore data transformations and feature relevance, (2)~identify model behavior and decision boundaries, as well as, (3)~the reason for misclassifications.
CNN-based deep learning models for disease detection have become popular recently. We compared the binary classification performance of eight prominent deep learning models: DenseNet 121, DenseNet 169, DenseNet 201, EffecientNet b0, EffecientNet lite4, GoogleNet, MobileNet, and ResNet18 for their binary classification performance on combined Pulmonary Chest Xrays dataset. Despite the widespread application in different fields in medical images, there remains a knowledge gap in determining their relative performance when applied to the same dataset, a gap this study aimed to address. The dataset combined Shenzhen, China (CH) and Montgomery, USA (MC) data. We trained our model for binary classification, calculated different parameters of the mentioned models, and compared them. The models were trained to keep in mind all following the same training parameters to maintain a controlled comparison environment. End of the study, we found a distinct difference in performance among the other models when applied to the pulmonary chest Xray image dataset, where DenseNet169 performed with 89.38 percent and MobileNet with 92.2 percent precision.
Featherweight Generic Go (FGG) is a minimal core calculus modeling the essential features of the programming language Go. It includes support for overloaded methods, interface types, structural subtyping and generics. The most straightforward semantic description of the dynamic behavior of FGG programs is to resolve method calls based on runtime type information of the receiver.
This article shows a different approach by defining a type-directed translation from FGG to an untyped lambda-calculus. The translation of an FGG program provides evidence for the availability of methods as additional dictionary parameters, similar to the dictionary-passing approach known from Haskell type classes. Then, method calls can be resolved by a simple lookup of the method definition in the dictionary.
Every program in the image of the translation has the same dynamic semantics as its source FGG program. The proof of this result is based on a syntactic, step-indexed logical relation. The step-index ensures a well-founded definition of the relation in the presence of recursive interface types and recursive methods.
High-tech running shoes and spikes ("super-footwear") are currently being debated in sports. There is direct evidence that distance running super shoes improve running economy; however, it is not well established to which extent world-class performances are affected over the range of track and road running events.
This study examined publicly available performance datasets of annual best track and road performances for evidence of potential systematic performance effects following the introduction of super footwear. The analysis was based on the 100 best performances per year for men and women in outdoor events from 2010 to 2022, provided by the world governing body of athletics (World Athletics).
We found evidence of progressing improvements in track and road running performances after the introduction of super distance running shoes in 2016 and super spike technology in 2019. This evidence is more pronounced for distances longer than 1500 m in women and longer than 5000 m in men. Women seem to benefit more from super footwear in distance running events than men.
While the observational study design limits causal inference, this study provides a database on potential systematic performance effects following the introduction of super shoes/spikes in track and road running events in world-class athletes. Further research is needed to examine the underlying mechanisms and, in particular, potential sex differences in the performance effects of super footwear.