Finding clusters in high-dimensional data is a challenging research problem. Subspace clustering algorithms aim to find clusters in all possible subspaces of a dataset, where a subspace is a subset of the dimensions of the data. However, the number of subspaces grows exponentially with the dimensionality of the data, which renders most of these algorithms inefficient as well as ineffective. Moreover, these algorithms have data dependencies ingrained in the clustering process, which makes parallelization difficult and inefficient. SUBSCALE is a recent subspace clustering algorithm that scales with the number of dimensions and contains independent processing steps that can be exploited through parallelism. In this paper, we first leverage the computational power of widely available multi-core processors to improve the runtime performance of the SUBSCALE algorithm; the experimental evaluation has shown linear speedup. Second, we are developing an approach that uses graphics processing units (GPUs) for fine-grained data parallelism to accelerate the computation further. First tests of the GPU implementation show very promising results.
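The idea of independent processing steps that can be handed to separate cores can be illustrated with a minimal sketch. This is not the SUBSCALE algorithm: the abstract does not describe its steps, so a simplified, hypothetical per-dimension density scan stands in for them here. The names `dense_groups_1d`, `dense_groups_per_dimension`, `eps`, `min_pts`, and `map_fn` are all assumptions made for illustration.

```python
from functools import partial


def dense_groups_1d(values, eps, min_pts):
    """Return maximal groups of point indices whose values span at most eps.

    Simplified stand-in for one independent processing step; it only looks
    at a single dimension and is not SUBSCALE's own density definition.
    """
    order = sorted(range(len(values)), key=values.__getitem__)
    groups = []
    end = 0
    prev_end = 0
    for start in range(len(order)):
        # Grow the window while the value range stays within eps.
        while end < len(order) and values[order[end]] - values[order[start]] <= eps:
            end += 1
        # Record the window only if it is large enough and not a subset
        # of a window that was already recorded.
        if end - start >= min_pts and end > prev_end:
            groups.append(sorted(order[start:end]))
            prev_end = end
    return groups


def dense_groups_per_dimension(rows, eps, min_pts, map_fn=map):
    """Scan every dimension independently; map_fn decides how work is spread."""
    columns = list(zip(*rows))  # one tuple of values per dimension
    worker = partial(dense_groups_1d, eps=eps, min_pts=min_pts)
    return list(map_fn(worker, columns))
```

Because no dimension depends on another, the default `map` can be swapped for the `map` method of a `concurrent.futures.ProcessPoolExecutor` to spread the dimensions over several cores; whether that yields the linear speedup reported in the abstract depends on the actual algorithm and data.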
In online analytical processing (OLAP), filtering the elements of a given dimensional attribute according to the value of a measure attribute is an essential operation, for example, in top-k evaluation. Such filters can involve extremely large amounts of data, in particular when the filter condition includes “quantification” such as ANY or ALL, where large slices of an OLAP cube have to be computed and inspected. Because OLAP cubes are sparse, the slices serving as input to the filter are usually sparse as well, which presents a challenge for GPU approaches that must work with a limited amount of memory for holding intermediate results. Our CUDA solution involves a hashing scheme specifically designed for frequent and parallel updates, including several optimizations that exploit architectural features of Nvidia’s Fermi and Kepler GPUs.
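The semantics of a quantified filter over a sparse slice can be sketched sequentially in a few lines. This is a plain Python illustration, not the CUDA hashing scheme from the paper: Python dictionaries stand in for the hash table, and the cell layout `(element, other_coords, value)` and the function name `filter_dimension` are assumptions. The sketch also assumes that only stored (non-empty) cells take part in the ALL test.

```python
from collections import defaultdict


def filter_dimension(cells, predicate, quantifier="ANY"):
    """Filter elements of one dimension by a condition on the measure.

    cells: iterable of (element, other_coords, value) for stored cells only.
    predicate: function applied to each measure value.
    quantifier: "ANY" keeps an element if at least one of its cells matches,
                "ALL" keeps it only if every stored cell matches.
    """
    seen = defaultdict(int)     # stored cells per element
    matched = defaultdict(int)  # cells per element that satisfy the predicate
    for element, _other_coords, value in cells:
        seen[element] += 1
        if predicate(value):
            matched[element] += 1

    if quantifier == "ANY":
        return sorted(e for e in seen if matched[e] > 0)
    if quantifier == "ALL":
        return sorted(e for e in seen if matched[e] == seen[e])
    raise ValueError("quantifier must be 'ANY' or 'ALL'")
```

For example, with the hypothetical cells `("A", "2020", 5)`, `("A", "2021", 20)` and `("B", "2020", 30)` and the predicate `value > 10`, ANY keeps both `A` and `B`, while ALL keeps only `B`. The per-element counters are the kind of frequently updated intermediate state that a parallel implementation has to keep within limited GPU memory.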
This paper describes the use of the single-linkage hierarchical clustering method for outlier detection in manufactured metal work pieces. The main goal of the study is to group defects that occur 5 mm into a work piece from the edge, i.e., the border of the metal work piece, so that defects outside this area of interest can be removed as outliers. Under the assumptions made for the performance criteria, the single-linkage method achieved better results than the other agglomeration methods.
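How single-linkage clustering can separate grouped defects from isolated ones is sketched below. Cutting a single-linkage hierarchy at a given distance yields the connected components of the graph that links all points closer than that distance, so the sketch uses a small union-find structure instead of building the full dendrogram. The cut distance, the minimum cluster size, and the rule that undersized clusters count as outliers are assumptions for illustration; the abstract does not state the criteria used in the study.

```python
from math import dist


def single_linkage_clusters(points, cut_distance):
    """Clusters obtained by cutting a single-linkage hierarchy at cut_distance."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of points that lies within the cut distance.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= cut_distance:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values(), key=lambda c: c[0])


def split_outliers(points, cut_distance, min_size):
    """Keep clusters with at least min_size points; report the rest as outliers."""
    clusters = single_linkage_clusters(points, cut_distance)
    kept = [c for c in clusters if len(c) >= min_size]
    outliers = sorted(i for c in clusters if len(c) < min_size for i in c)
    return kept, outliers
```

Calling `split_outliers(points, cut_distance=1.5, min_size=2)` on defect coordinates given in millimetres returns the index lists of the retained clusters and the indices flagged as outliers. The pairwise loop is quadratic in the number of defects, which is acceptable for a sketch but may need a spatial index for large inspections.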