Nearest Neighbor Subsequence Search in Time Series Data

Title: Nearest Neighbor Subsequence Search in Time Series Data
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Ahsan, Ramoza; Bashir, Muzammil; Neamtu, Rodica; Rundensteiner, Elke A.; Sarkozy, Gabor
Conference Name: 2019 IEEE International Conference on Big Data (Big Data)
Date Published: December
Keywords: Agriculture, augmented relationship graph model, Bridges, data mining, graph theory, horizontal pruning, Indexes, Measurement, Meteorology, Metrics, nearest neighbor search, nearest neighbour methods, pubcrawl, query processing, query processing strategy, range interval diversity properties, search problems, sensor data, similarity search support, similarity vectors, Subsequence Mining, subsequence similarity match problem, Temperature sensors, temporal sequence data, time series, Time series analysis, Time Series Data, time series datasets, time series nearest neighbor subsequence search, TINN graph, TINN model, TINN nodes
Abstract: Continuous growth in sensor data and other temporal sequence data necessitates efficient retrieval and similarity search support on these big time series datasets. However, finding exact similarity results, especially at the granularity of subsequences, is known to be prohibitively costly for large data sets. In this paper, we thus propose an efficient framework for solving this exact subsequence similarity match problem, called TINN (TIme series Nearest Neighbor search). Exploiting the range interval diversity properties of time series datasets, TINN captures similarity at two levels of abstraction, namely, relationships among subsequences within each long time series and relationships across distinct time series in the data set. These relationships are compactly organized in an augmented relationship graph model, with the former relationships encoded in similarity vectors at TINN nodes and the latter captured by augmented edge types in the TINN Graph. The query processing strategy deploys novel pruning techniques on the TINN Graph, including node skipping and vertical and horizontal pruning, to significantly reduce the number of time series as well as subsequences to be explored. Comprehensive experiments on synthetic and real-world time series data demonstrate that our TINN model consistently outperforms state-of-the-art approaches while still guaranteeing to retrieve exact matches.
Citation Key: ahsan_nearest_2019
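
Illustrative note: the exact subsequence similarity match problem named in the abstract can be pictured with an exhaustive baseline that compares the query against every window of every series. The sketch below is not the TINN method; it only shows the unpruned search whose cost the abstract calls prohibitive. The function name `nearest_subsequence` and the choice of Euclidean distance are assumptions made for illustration, since the abstract does not state the distance measure.

```python
import math


def nearest_subsequence(query, series_collection):
    """Exhaustive exact search (hypothetical baseline, not TINN).

    Scans every window of length len(query) in every series and returns
    (distance, series_index, offset) for the closest window.
    Euclidean distance is an assumption made for this sketch.
    """
    m = len(query)
    best = (math.inf, -1, -1)
    for s_idx, series in enumerate(series_collection):
        # Series shorter than the query yield an empty range and are skipped.
        for start in range(len(series) - m + 1):
            window = series[start:start + m]
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, query)))
            if dist < best[0]:
                best = (dist, s_idx, start)
    return best


if __name__ == "__main__":
    collection = [[0, 1, 2, 3, 4], [10, 11, 12, 9, 8]]
    print(nearest_subsequence([11, 12, 9], collection))
```

Every window is visited, so the work grows with the total number of data points times the query length. Reducing the number of series and subsequences that must be examined, while still returning the exact match, is what the pruning techniques in the abstract aim for.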