Synthesizing voice with the help of machine learning techniques has made rapid progress over the past few years [1]. Given the current increase in the use of conferencing tools for online teaching, we ask just how easy (i.e. in terms of required data, hardware and skill set) it would be to create a convincing voice fake. We analyse how much training data a participant (e.g. a student) would actually need to fake another participant's voice (e.g. a professor's). We provide an analysis of the existing state of the art in creating voice deep fakes and apply the identified optimization techniques, as well as our own, to two different voice data sets. A user study with more than 100 participants shows how difficult it is to tell real and fake voices apart (on average, only 37 percent recognize a professor's fake voice). From a longer-term societal perspective, such voice deep fakes may lead to disbelief by default.
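To illustrate how low the entry barrier for voice cloning has become, the following minimal sketch uses the open-source Coqui TTS toolkit to synthesize speech in the voice of a short reference recording. It is not the pipeline used in the study above; the chosen model, texts and file paths are illustrative assumptions only.

```python
# Illustrative sketch only (not the study's pipeline): zero-shot voice cloning
# with the open-source Coqui TTS toolkit. Model name and file paths are
# placeholders.
from TTS.api import TTS

# Multi-speaker, multilingual model with zero-shot voice cloning support.
tts = TTS(model_name="tts_models/multilingual/multi-dataset/your_tts")

# A few seconds of the target speaker (e.g. taken from a recorded lecture)
# serve as the reference for the cloned voice.
tts.tts_to_file(
    text="Dear students, today's exam has been postponed.",
    speaker_wav="reference_speaker.wav",   # hypothetical reference recording
    language="en",
    file_path="cloned_voice.wav",
)
```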
In the field of network security, the detection of possible intrusions is an important task for preventing and analysing attacks. Machine learning has been adopted as a particular supporting technique in recent years. However, the majority of related published work uses post-mortem log files and fails to address the real-time capabilities required for network data feature extraction and machine-learning-based analysis [1-5]. We introduce the network feature extractor library FEX, which is designed to allow real-time feature extraction from network data. The library computes 83 statistical features based on reassembled data flows. Its Cython implementation processes individual packets within 4.58 microseconds. Based on the features extracted by FEX, existing machine learning models for intrusion detection were examined with respect to their real-time capabilities. A suitable Decision-Tree Classifier model was further optimised by transpiling it into C code, which reduced the prediction time for a single sample to 3.96 microseconds on average. Based on the feature extractor and the improved machine learning model, an IDS was implemented that supports a data throughput between 63.7 Mbit/s and 2.5 Gbit/s, making it a suitable candidate for a real-time, machine-learning-based IDS.
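A rough sketch of the model side of such a pipeline (the feature extraction itself happens in FEX/Cython) is shown below: a scikit-learn decision tree is trained on flow feature vectors and the per-sample prediction time is measured. The data here is random placeholder data, and the Python-level prediction loop is far slower than the transpiled C model reported above.

```python
# Minimal sketch (not the FEX/IDS code itself): train a decision-tree
# classifier on flow feature vectors and time single-sample predictions.
import time
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Placeholder data: 10,000 flows described by 83 statistical features each,
# labelled as benign (0) or malicious (1).
rng = np.random.default_rng(0)
X = rng.random((10_000, 83))
y = rng.integers(0, 2, size=10_000)

clf = DecisionTreeClassifier(max_depth=12)
clf.fit(X, y)

# Time single-sample predictions, as a real-time IDS would issue them.
start = time.perf_counter()
for sample in X[:1_000]:
    clf.predict(sample.reshape(1, -1))
elapsed = time.perf_counter() - start
print(f"avg prediction time: {elapsed / 1_000 * 1e6:.2f} microseconds")
```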
Threat modelling is an accepted technique for identifying general threats as early as possible in the software development lifecycle. Our previous work presented an open-source framework and web-based tool (OVVL) for automating threat analysis of software architectures using STRIDE. However, one open problem is that available threat catalogues are either too general or proprietary to a certain domain (e.g. .NET). Another problem is that a threat analyst should not only be presented (repeatedly) with a list of all possible threats, but should also receive automated support for prioritizing them. This paper presents an approach to dynamically generate individual threat catalogues on the basis of the established CWE and related CVE databases. Roughly 60% of this threat catalogue generation can be done by identifying and matching certain key values. To map the remaining 40% of our data (~50,000 CVE entries), we use the already mapped 60% of our dataset to train a supervised machine-learning-based text classification model. The resulting dataset allows us to identify possible threats for each individual architectural element and automatically provide an initial prioritization. Our dataset as well as a supporting Jupyter notebook are openly available.
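The supervised text-classification step can be sketched roughly as follows. This is a simplified stand-in, not the openly available notebook: already mapped CVE descriptions serve as training data, and the fitted model predicts a CWE identifier for unmapped entries. The example descriptions and labels are made up for illustration.

```python
# Simplified stand-in for the CVE-to-CWE mapping step (not the actual
# notebook): train on CVE descriptions that already carry a CWE label and
# predict labels for the unmapped remainder.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder training data: CVE description text and its mapped CWE id.
mapped_descriptions = [
    "Buffer overflow in the parsing routine allows remote code execution.",
    "Improper neutralization of input leads to reflected cross-site scripting.",
]
mapped_cwes = ["CWE-120", "CWE-79"]

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(mapped_descriptions, mapped_cwes)

# Predict a CWE for a CVE entry that lacks an explicit mapping.
unmapped = ["Out-of-bounds write when copying attacker-controlled data."]
print(model.predict(unmapped))
```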
OVVL (the Open Weakness and Vulnerability Modeller) is a tool and methodology that supports threat modeling in the early stages of the secure software development lifecycle. We provide an overview of OVVL (https://ovvl.org), its data model and its browser-based UI. We also discuss initial experiments on how threats identified in the design phase can be aligned with later activities in the software lifecycle (issue management and security testing).
In this paper we report on the commercial background, as well as the resulting high-level architecture and design, of a cloud-based system for cryptographic software protection and licensing. The work is based on the experiences and insights gained in a real-world commercial R&D project at Wibu-Systems AG, a company that specialises in software encryption and licensing solutions.
The development of secure software systems is of ever-increasing importance. While software companies often invest large amounts of resources into the upkeep and general security of large-scale applications in production, they appear to neglect threat modeling in the earlier stages of the software development lifecycle. When applied during the design phase, and continuously throughout development iterations, threat modeling can help to establish a "Secure by Design" approach. This approach allows IT security issues to be found early during development, reducing the need for later rework and thus saving resources in the long term. In this paper the current state of threat modeling is investigated. This investigation drove the derivation of requirements for a new threat modelling framework and tool, called OVVL. OVVL utilizes concepts from established threat modeling methodologies as well as functionality not available in existing solutions.
Blockchain frameworks enable the immutable storage of data. A still open practical question is the so-called "oracle" problem, i.e. how real-world data is actually transferred into and out of a blockchain while preserving its integrity. We present a case study that demonstrates how an existing industrial-strength secure element for cryptographic software protection (the Wibu CmDongle, or "dongle") can act as such a hardware-based oracle for the Hyperledger blockchain framework. Our scenario is that of a dentist who has leased a 3D printer. The printer is initially supplied with x printing units. With each print action the local unit counter on the attached dongle is decreased, and in parallel a unit counter is maintained in the Hyperledger-based blockchain. Once a threshold is met, the printer stops working (by means of the cryptographically protected invocation of the local print method). The blockchain is configured such that chaincode is executed to increase the units again automatically (and essentially trigger any payment processes). Once this has happened, the new unit counter value is passed from the blockchain to the local dongle, allowing further print jobs to be executed.
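The counter logic of this scenario can be modelled in a few lines. The sketch below is an illustrative Python model only: the real system uses Hyperledger Fabric chaincode (typically written in Go, Node.js or Java) and accesses the CmDongle as a hardware oracle, neither of which is shown here.

```python
# Illustrative model of the unit-counter logic only; the real system uses
# Hyperledger chaincode and the CmDongle as a hardware-based oracle.
class PrinterUnits:
    def __init__(self, units: int, threshold: int = 0):
        self.dongle_units = units      # counter on the local dongle
        self.ledger_units = units      # mirrored counter in the blockchain
        self.threshold = threshold

    def print_job(self) -> bool:
        # Refuse the (cryptographically protected) local print method once
        # the threshold is reached.
        if self.dongle_units <= self.threshold:
            return False
        self.dongle_units -= 1
        self.ledger_units -= 1         # in reality: a blockchain transaction
        return True

    def top_up(self, units: int) -> None:
        # Chaincode increases the ledger counter (triggering payment); the
        # new value is then pushed back to the dongle.
        self.ledger_units += units
        self.dongle_units = self.ledger_units


printer = PrinterUnits(units=3)
while printer.print_job():
    pass                               # prints until the threshold is met
printer.top_up(10)                     # automatic top-up via chaincode
```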
Protecting software from illegal access, intentional modification or reverse engineering is an inherently difficult practical problem involving code obfuscation techniques and real-time cryptographic protection of code. In traditional systems a secure element (the "dongle") is used to protect software. However, this approach suffers from several technical and economic drawbacks, such as the dongle being lost or broken.
We present a system that provides such dongles as a cloud service and, more importantly, provides the required cryptographic material to control access to software functionality in real time.
This system is developed as part of an ongoing nationally funded research project and is now entering a first trial stage with stakeholders from different industrial sectors.
The identification of vulnerabilities is an important element in the software development lifecycle to ensure the security of software. While vulnerability identification based on source code is a well-studied field, identifying vulnerabilities on the basis of a binary executable without the corresponding source code is more challenging. Recent research has shown how such detection can be achieved with deep learning methods. However, that particular approach is limited to the identification of only four types of vulnerabilities. We therefore analyze to what extent the identification of a larger variety of vulnerabilities can be covered. To this end, we use a supervised deep learning approach based on recurrent neural networks for vulnerability detection in binary executables. The underlying basis is a dataset with 50,651 samples of vulnerable code in the form of a standardized LLVM Intermediate Representation. The vectorised features of a Word2Vec model are used to train different variations of three basic recurrent neural network architectures (GRU, LSTM, SRNN). A binary classifier was trained to detect the presence of an arbitrary vulnerability, and a multi-class model was trained to identify the exact vulnerability; these achieved out-of-sample accuracies of 88% and 77%, respectively. Differences in the detection of individual vulnerabilities were also observed, with non-vulnerable samples being detected with a particularly high precision of over 98%. Thus, the presented methodology allows accurate detection of 23 (compared to 4) types of vulnerabilities.
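A hedged sketch of such a multi-class RNN classifier in Keras is given below. Vocabulary size, sequence length and layer sizes are placeholders rather than the hyperparameters of the study, and the embedding layer would normally be initialised with the Word2Vec vectors mentioned above.

```python
# Simplified sketch of the multi-class RNN classifier (hyperparameters are
# placeholders, not those of the study). Inputs are token indices of LLVM IR
# instructions; a Word2Vec matrix would normally initialise the embedding.
import tensorflow as tf

VOCAB_SIZE = 20_000     # number of distinct LLVM IR tokens (assumed)
SEQ_LEN = 300           # tokens per sample (assumed)
NUM_CLASSES = 24        # 23 vulnerability types + "not vulnerable"

model = tf.keras.Sequential([
    tf.keras.Input(shape=(SEQ_LEN,)),
    tf.keras.layers.Embedding(VOCAB_SIZE, 100),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```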