Refine
Document Type
Conference Type
- Konferenzartikel (4)
Language
- English (4)
Is part of the Bibliography
- yes (4)
Keywords
- Neural networks (2)
- efficient training (2)
- Blockchain (1)
- Edge AI (1)
- Embedded AI (1)
- Embedded Systems (1)
- Machine learning (1)
- Security (1)
- TinyML (1)
- Traceability (1)
Institute
Open Access
- Diamond (4) (remove)
In recent years, the topic of embedded machine learning has become very popular in AI research. With the help of various compression techniques such as pruning, quantization and others compression techniques, it became possible to run neural networks on embedded devices. These techniques have opened up a whole new application area for machine learning. They range from smart products such as voice assistants to smart sensors that are needed in robotics. Despite the achievements in embedded machine learning, efficient algorithms for training neural networks in constrained domains are still lacking. Training on embedded devices will open up further fields of applications. Efficient training algorithms would enable federated learning on embedded devices, in which the data remains where it was collected, or retraining of neural networks in different domains. In this paper, we summarize techniques that make training on embedded devices possible. We first describe the need and requirements for such algorithms. Then we examine existing techniques that address training in resource-constrained environments as well as techniques that are also suitable for training on embedded devices, such as incremental learning. At the end, we also discuss which problems and open questions still need to be solved in these areas.
Training deep neural networks using backpropagation is very memory and computationally intensive. This makes it difficult to run on-device learning or fine-tune neural networks on tiny, embedded devices such as low-power micro-controller units (MCUs). Sparse backpropagation algorithms try to reduce the computational load of on-device learning by training only a subset of the weights and biases. Existing approaches use a static number of weights to train. A poor choice of this so-called backpropagation ratio limits either the computational gain or can lead to severe accuracy losses. In this paper we present TinyProp, the first sparse backpropagation method that dynamically adapts the back-propagation ratio during on-device training for each training step. TinyProp induces a small calculation overhead to sort the elements of the gradient, which does not significantly impact the computational gains. TinyProp works particularly well on fine-tuning trained networks on MCUs, which is a typical use case for embedded applications. For typical datasets from three datasets MNIST, DCASE2020 and CIFAR10, we are 5 times faster compared to non-sparse training with an accuracy loss of on average 1%. On average, TinyProp is 2.9 times faster than existing, static sparse backpropagation algorithms and the accuracy loss is reduced on average by 6 % compared to a typical static setting of the back-propagation ratio.
The importance of machine learning has been increasing dramatically for years. From assistance systems to production optimisation to support the health sector, almost every area of daily life and industry comes into contact with machine learning. Besides all the benefits that ML brings, the lack of transparency and the difficulty in creating traceability pose major risks. While there are solutions that make the training of machine learning models more transparent, traceability is still a major challenge. Ensuring the identity of a model is another challenge. Unnoticed modification of a model is also a danger when using ML. One solution is to create an ML birth certificate and an ML family tree secured by blockchain technology. Important information about training and changes to the model through retraining can be stored in a blockchain and accessed by any user to create more security and traceability about an ML model.
In this paper, we study the runtime performance of symmetric cryptographic algorithms on an embedded ARM Cortex-M4 platform. Symmetric cryptographic algorithms can serve to protect the integrity and optionally, if supported by the algorithm, the confidentiality of data. A broad range of well-established algorithms exists, where the different algorithms typically have different properties and come with different computational complexity. On deeply embedded systems, the overhead imposed by cryptographic operations may be significant. We execute the algorithms AES-GCM, ChaCha20-Poly1305, HMAC-SHA256, KMAC, and SipHash on an STM32 embedded microcontroller and benchmark the execution times of the algorithms as a function of the input lengths.