Visible to the public Biblio

Found 9043 results

2020-05-22
Platonov, A.V., Poleschuk, E.A., Bessmertny, I. A., Gafurov, N. R..  2018.  Using quantum mechanical framework for language modeling and information retrieval. 2018 IEEE 12th International Conference on Application of Information and Communication Technologies (AICT). :1—4.
This article shows the analogy between natural language texts and quantum-like systems on the example of the Bell test calculating. The applicability of the well-known Bell test for texts in Russian is investigated. The possibility of using this test for the text separation on the topics corresponding to the user query in information retrieval system is shown.
Geetha, R, Rekha, Pasupuleti, Karthika, S.  2018.  Twitter Opinion Mining and Boosting Using Sentiment Analysis. 2018 International Conference on Computer, Communication, and Signal Processing (ICCCSP). :1—4.
Social media has been one of the most efficacious and precise by speakers of public opinion. A strategy which sanctions the utilization and illustration of twitter data to conclude public conviction is discussed in this paper. Sentiments on exclusive entities with diverse strengths and intenseness are stated by public, where these sentiments are strenuously cognate to their personal mood and emotions. To examine the sentiments from natural language texts, addressing various opinions, a lot of methods and lexical resources have been propounded. A path for boosting twitter sentiment classification using various sentiment proportions as meta-level features has been proposed by this article. Analysis of tweets was done on the product iPhone 6.
Khadilkar, Kunal, Kulkarni, Siddhivinayak, Bone, Poojarani.  2018.  Plagiarism Detection Using Semantic Knowledge Graphs. 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA). :1—6.
Every day, huge amounts of unstructured text is getting generated. Most of this data is in the form of essays, research papers, patents, scholastic articles, book chapters etc. Many plagiarism softwares are being developed to be used in order to reduce the stealing and plagiarizing of Intellectual Property (IP). Current plagiarism softwares are mainly using string matching algorithms to detect copying of text from another source. The drawback of some of such plagiarism softwares is their inability to detect plagiarism when the structure of the sentence is changed. Replacement of keywords by their synonyms also fails to be detected by these softwares. This paper proposes a new method to detect such plagiarism using semantic knowledge graphs. The method uses Named Entity Recognition as well as semantic similarity between sentences to detect possible cases of plagiarism. The doubtful cases are visualized using semantic Knowledge Graphs for thorough analysis of authenticity. Rules for active and passive voice have also been considered in the proposed methodology.
Devarakonda, Ranjeet, Giansiracusa, Michael, Kumar, Jitendra.  2018.  Machine Learning and Social Media to Mine and Disseminate Big Scientific Data. 2018 IEEE International Conference on Big Data (Big Data). :5312—5315.
One of the challenges in supplying the communities with wider access to scientific databases is the need for knowledge of database languages like Structured Query Language (SQL). Although the SQL language has been published in many forms, not everybody is able to write SQL queries. Another challenge is that it might not be practical to make the public aware of the structure of databases. There is a need for novice users to query relational databases using their natural language. To solve this problem, many natural language interfaces to structured databases have been developed. The goal is to provide a more intuitive method for generating database queries and delivering responses. Through social media, which makes it possible to interact with a wide section of the population, and with the help of natural language processing, researchers at the Atmospheric Radiation Measurement (ARM) Data Center at Oak Ridge National Laboratory (ORNL) have developed a concept to enable easy search and retrieval of data from several environmental data centers for the scientific community through social media.Using a machine learning framework that maps natural language text to thousands of datasets, instruments, variables, and data streams, the prototype system would allow users to request data through Twitter and receive a link (via tweet) to applicable data results on the project's search catalog tailored to their key words. This automated identification of relevant data from various petascale archives at ORNL could increase convenience, access, and use of the project's data by the broader community. In this paper we discuss how some data-intensive projects at ORNL are using innovative ways to help in data discovery.
Kate, Abhilasha, Kamble, Satish, Bodkhe, Aishwarya, Joshi, Mrunal.  2018.  Conversion of Natural Language Query to SQL Query. 2018 Second International Conference on Electronics, Communication and Aerospace Technology (ICECA). :488—491.
This paper present an approach to automate the conversion of Natural Language Query to SQL Query effectively. Structured Query Language is a powerful tool for managing data held in a relational database management system. To retrieve or manage data user have to enter the correct SQL Query. But the users who don't have any knowledge about SQL are unable to retrieve the required data. To overcome this we proposed a model in Natural Language Processing for converting the Natural Language Query to SQL query. This helps novice user to get required content without knowing any complex details about SQL. This system can also deal with complex queries. This system is designed for Training and Placement cell officers who work on student database but don't have any knowledge about SQL. In this system, user can also enter the query using speech. System will convert speech into the text format. This query will get transformed to SQL query. System will execute the query and gives output to the user.
Sheth, Utsav, Dutta, Sanghamitra, Chaudhari, Malhar, Jeong, Haewon, Yang, Yaoqing, Kohonen, Jukka, Roos, Teemu, Grover, Pulkit.  2018.  An Application of Storage-Optimal MatDot Codes for Coded Matrix Multiplication: Fast k-Nearest Neighbors Estimation. 2018 IEEE International Conference on Big Data (Big Data). :1113—1120.
We propose a novel application of coded computing to the problem of the nearest neighbor estimation using MatDot Codes (Fahim et al., Allerton'17) that are known to be optimal for matrix multiplication in terms of recovery threshold under storage constraints. In approximate nearest neighbor algorithms, it is common to construct efficient in-memory indexes to improve query response time. One such strategy is Multiple Random Projection Trees (MRPT), which reduces the set of candidate points over which Euclidean distance calculations are performed. However, this may result in a high memory footprint and possibly paging penalties for large or high-dimensional data. Here we propose two techniques to parallelize MRPT that exploit data and model parallelism respectively by dividing both the data storage and the computation efforts among different nodes in a distributed computing cluster. This is especially critical when a single compute node cannot hold the complete dataset in memory. We also propose a novel coded computation strategy based on MatDot codes for the model-parallel architecture that, in a straggler-prone environment, achieves the storage-optimal recovery threshold, i.e., the number of nodes that are required to serve a query. We experimentally demonstrate that, in the absence of straggling, our distributed approaches require less query time than execution on a single processing node, providing near-linear speedups with respect to the number of worker nodes. Our experiments on real systems with simulated straggling, we also show that in a straggler-prone environment, our strategy achieves a faster query execution than the uncoded strategy.
Despotovski, Filip, Gusev, Marjan, Zdraveski, Vladimir.  2018.  Parallel Implementation of K-Nearest-Neighbors for Face Recognition. 2018 26th Telecommunications Forum (℡FOR). :1—4.
Face recognition is a fast-expanding field of research. Countless classification algorithms have found use in face recognition, with more still being developed, searching for better performance and accuracy. For high-dimensional data such as images, the K-Nearest-Neighbours classifier is a tempting choice. However, it is very computationally-intensive, as it has to perform calculations on all items in the stored dataset for each classification it makes. Fortunately, there is a way to speed up the process by performing some of the calculations in parallel. We propose a parallel CUDA implementation of the KNN classifier and then compare it to a serial implementation to demonstrate its performance superiority.
Varricchio, Valerio, Frazzoli, Emilio.  2018.  Asymptotically Optimal Pruning for Nonholonomic Nearest-Neighbor Search. 2018 IEEE Conference on Decision and Control (CDC). :4459—4466.
Nearest-Neighbor Search (NNS) arises as a key component of sampling-based motion planning algorithms and it is known as their asymptotic computational bottleneck. Algorithms for exact Nearest-Neighbor Search rely on explicit distance comparisons to different extents. However, in motion planning, evaluating distances is generally a computationally demanding task, since the metric is induced by the minimum cost of steering a dynamical system between states. In the presence of driftless nonholonomic constraints, we propose efficient pruning techniques for the k-d tree algorithm that drastically reduce the number of distance evaluations performed during a query. These techniques exploit computationally convenient lower and upper bounds to the geodesic distance of the corresponding sub-Riemannian geometry. Based on asymptotic properties of the reachable sets, we show that the proposed pruning techniques are optimal, modulo a constant factor, and we provide experimental results with the Reeds-Shepp vehicle model.
[Anonymous].  Submitted.  Approximate Nearest Neighbor Search Based on Hierarchical Multi-Index Hashing.
Nowadays, hashing methods are widely used in large-scale approximate nearest neighbor search due to its efficient storage and fast retrieval speed. By these methods, the original data is usually hashed into binary codes which enables to measure the similarity by Hamming distance. When dealing with large-scale data, their binary codes can be used as direct indices in a hash table. However, codes longer than 32 bits are obviously not efficient. For the given binary codes, this paper proposes a multi-index hashing structure based on binary code substrings partitioning. Since substrings partitioning is essentially a combinatorial optimization problem, we propose a hierarchical and recursive partitioning approach to obtain an approximate solution to it. Furthermore, we adopt a query-adaptive fine-grained ranking approach in the neighbor search stage to alleviate the imbalance between multi-index tables. Finally, extensive experiments are conducted on two datasets MNIST and CIFAR-10, demonstrating that our method achieves state-of-the-art performance in terms of efficiency, precision and recall rate.
Yan, Donghui, Wang, Yingjie, Wang, Jin, Wang, Honggang, Li, Zhenpeng.  2018.  K-nearest Neighbor Search by Random Projection Forests. 2018 IEEE International Conference on Big Data (Big Data). :4775—4781.
K-nearest neighbor (kNN) search has wide applications in many areas, including data mining, machine learning, statistics and many applied domains. Inspired by the success of ensemble methods and the flexibility of tree-based methodology, we propose random projection forests, rpForests, for kNN search. rpForests finds kNNs by aggregating results from an ensemble of random projection trees with each constructed recursively through a series of carefully chosen random projections. rpForests achieves a remarkable accuracy in terms of fast decay in the missing rate of kNNs and that of discrepancy in the kNN distances. rpForests has a very low computational complexity. The ensemble nature of rpForests makes it easily run in parallel on multicore or clustered computers; the running time is expected to be nearly inversely proportional to the number of cores or machines. We give theoretical insights by showing the exponential decay of the probability that neighboring points would be separated by ensemble random projection trees when the ensemble size increases. Our theory can be used to refine the choice of random projections in the growth of trees, and experiments show that the effect is remarkable.
Kang, Hyunjoong, Hong, Sanghyun, Lee, Kookjin, Park, Noseong, Kwon, Soonhyun.  2018.  On Integrating Knowledge Graph Embedding into SPARQL Query Processing. 2018 IEEE International Conference on Web Services (ICWS). :371—374.
SPARQL is a standard query language for knowledge graphs (KGs). However, it is hard to find correct answer if KGs are incomplete or incorrect. Knowledge graph embedding (KGE) enables answering queries on such KGs by inferring unknown knowledge and removing incorrect knowledge. Hence, our long-term goal in this line of research is to propose a new framework that integrates KGE and SPARQL, which opens various research problems to be addressed. In this paper, we solve one of the most critical problems, that is, optimizing the performance of nearest neighbor (NN) search. In our evaluations, we demonstrate that the search time of state-of-the-art NN search algorithms is improved by 40% without sacrificing answer accuracy.
Chen, Yalin, Li, Zhiyang, Shi, Jia, Liu, Zhaobin, Qu, Wenyu.  2018.  Stacked K-Means Hashing Quantization for Nearest Neighbor Search. 2018 IEEE Fourth International Conference on Multimedia Big Data (BigMM). :1—4.
Nowadays, with such a huge amount of information available online, one key challenge is how to retrieve target data efficiently. A recent state-of-art solution, k-means hashing (KMH), codes data via a string of binary code obtained by iterative k-means clustering and binary code optimizing. To deal with high dimensional data, KMH divides the space into low-dimensional subspaces, places a hypercube in each subspace and finds its proper location by the mentioned optimizing process. However, the complexity of the optimization increases rapidly when the dimension of the hypercube increases. To address this issue, we propose an improved hashing method stacked k-means hashing (SKMH). The main idea is to increase the approximation by a coarse-to-fine multi-layer lower-dimensional cubes. With these kinds of lower-dimensional cubes, SKMH can achieve a similar approximation ability via a less optimizing time, compared with KMH method using higher-dimensional cubes. Extensive experiments have been conducted on two public databases, demonstrating the performance of our method by some common metrics in fast nearest neighbor search.
Ito, Toshitaka, Itotani, Yuri, Wakabayashi, Shin'ichi, Nagayama, Shinobu, Inagi, Masato.  2018.  A Nearest Neighbor Search Engine Using Distance-Based Hashing. 2018 International Conference on Field-Programmable Technology (FPT). :150—157.
This paper proposes an FPGA-based nearest neighbor search engine for high-dimensional data, in which nearest neighbor search is performed based on distance-based hashing. The proposed hardware search engine implements a nearest neighbor search algorithm based on an extension of flexible distance-based hashing (FDH, for short), which finds an exact solution with high probability. The proposed engine is a parallel processing and pipelined circuit so that search results can be obtained in a short execution time. Experimental results show the effectiveness and efficiency of the proposed engine.
Wang, Xi, Yao, Jun, Ji, Hongxia, Zhang, Ze, Li, Chen, Ma, Beizhi.  2018.  A Local Integral Hash Nearest Neighbor Algorithm. 2018 3rd International Conference on Mechanical, Control and Computer Engineering (ICMCCE). :544—548.
Nearest neighbor search algorithm plays a very important role in computer image algorithm. When the search data is large, we need to use fast search algorithm. The current fast retrieval algorithms are tree based algorithms. The efficiency of the tree algorithm decreases sharply with the increase of the data dimension. In this paper, a local integral hash nearest neighbor algorithm of the spatial space is proposed to construct the tree structure by changing the way of the node of the access tree. It is able to express data distribution characteristics. After experimental testing, this paper achieves more efficient performance in high dimensional data.
Vijay, Savinu T., Pournami, P. N..  2018.  Feature Based Image Registration using Heuristic Nearest Neighbour Search. 2018 22nd International Computer Science and Engineering Conference (ICSEC). :1—3.
Image registration is the process of aligning images of the same scene taken at different instances, from different viewpoints or by heterogeneous sensors. This can be achieved either by area based or by feature based image matching techniques. Feature based image registration focuses on detecting relevant features from the input images and attaching descriptors to these features. Matching visual descriptions of two images is a major task in image registration. This feature matching is currently done using Exhaustive Search (or Brute-Force) and Nearest Neighbour Search. The traditional method used for nearest neighbour search is by representing the data as k-d trees. This nearest neighbour search can also be performed using combinatorial optimization algorithms such as Simulated Annealing. This work proposes a method to perform image feature matching by nearest neighbour search done based on Threshold Accepting, a faster version of Simulated Annealing.The experiments performed suggest that the proposed algorithm can produce better results within a minimum number of iterations than many existing algorithms.
Song, Fuyuan, Qin, Zheng, Liu, Qin, Liang, Jinwen, Ou, Lu.  2019.  Efficient and Secure k-Nearest Neighbor Search Over Encrypted Data in Public Cloud. ICC 2019 - 2019 IEEE International Conference on Communications (ICC). :1—6.
Cloud computing has become an important and popular infrastructure for data storage and sharing. Typically, data owners outsource their massive data to a public cloud that will provide search services to authorized data users. With privacy concerns, the valuable outsourced data cannot be exposed directly, and should be encrypted before outsourcing to the public cloud. In this paper, we focus on k-Nearest Neighbor (k-NN) search over encrypted data. We propose efficient and secure k-NN search schemes based on matrix similarity to achieve efficient and secure query services in public cloud. In our basic scheme, we construct the traces of two diagonal multiplication matrices to denote the Euclidean distance of two data points, and perform secure k-NN search by comparing traces of corresponding similar matrices. In our enhanced scheme, we strengthen the security property by decomposing matrices based on our basic scheme. Security analysis shows that our schemes protect the data privacy and query privacy under attacking with different levels of background knowledge. Experimental evaluations show that both schemes are efficient in terms of computation complexity as well as computational cost.
Dubey, Abhimanyu, Maaten, Laurens van der, Yalniz, Zeki, Li, Yixuan, Mahajan, Dhruv.  2019.  Defense Against Adversarial Images Using Web-Scale Nearest-Neighbor Search. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :8759—8768.
A plethora of recent work has shown that convolutional networks are not robust to adversarial images: images that are created by perturbing a sample from the data distribution as to maximize the loss on the perturbed example. In this work, we hypothesize that adversarial perturbations move the image away from the image manifold in the sense that there exists no physical process that could have produced the adversarial image. This hypothesis suggests that a successful defense mechanism against adversarial images should aim to project the images back onto the image manifold. We study such defense mechanisms, which approximate the projection onto the unknown image manifold by a nearest-neighbor search against a web-scale image database containing tens of billions of images. Empirical evaluations of this defense strategy on ImageNet suggest that it very effective in attack settings in which the adversary does not have access to the image database. We also propose two novel attack methods to break nearest-neighbor defense settings and show conditions under which nearest-neighbor defense fails. We perform a series of ablation experiments, which suggest that there is a trade-off between robustness and accuracy between as we use features from deeper in the network, that a large index size (hundreds of millions) is crucial to get good performance, and that careful construction of database is crucial for robustness against nearest-neighbor attacks.
Markchit, Sarawut, Chiu, Chih-Yi.  2019.  Hash Code Indexing in Cross-Modal Retrieval. 2019 International Conference on Content-Based Multimedia Indexing (CBMI). :1—4.
Cross-modal hashing, which searches nearest neighbors across different modalities in the Hamming space, has become a popular technique to overcome the storage and computation barrier in multimedia retrieval recently. Although dozens of cross-modal hashing algorithms are proposed to yield compact binary code representation, applying exhaustive search in a large-scale dataset is impractical for the real-time purpose, and the Hamming distance computation suffers inaccurate results. In this paper, we propose a novel index scheme over binary hash codes in cross-modal retrieval. The proposed indexing scheme exploits a few binary bits of the hash code as the index code. Based on the index code representation, we construct an inverted index structure to accelerate the retrieval efficiency and train a neural network to improve the indexing accuracy. Experiments are performed on two benchmark datasets for retrieval across image and text modalities, where hash codes are generated by three cross-modal hashing methods. Results show the proposed method effectively boosts the performance over the benchmark datasets and hash methods.
Yang, Jiacheng, Chen, Bin, Xia, Shu-Tao.  2019.  Mean-Removed Product Quantization for Approximate Nearest Neighbor Search. 2019 International Conference on Data Mining Workshops (ICDMW). :711—718.
Product quantization (PQ) and its variations are popular and attractive in approximate nearest neighbor search (ANN) due to their lower memory usage and faster retrieval speed. PQ decomposes the high-dimensional vector space into several low-dimensional subspaces, and quantizes each sub-vector in their subspaces, separately. Thus, PQ can generate a codebook containing an exponential number of codewords or indices by a Cartesian product of the sub-codebooks from different subspaces. However, when there is large variance in the average amplitude of the components of the data points, directly utilizing the PQ on the data points would result in poor performance. In this paper, we propose a new approach, namely, mean-removed product quantization (MRPQ) to address this issue. In fact, the average amplitude of a data point or the mean of a date point can be regarded as statistically independent of the variation of the vector, that is, of the way the components vary about this average. Then we can learn a separate scalar quantizer of the means of the data points and apply the PQ to their residual vectors. As shown in our comprehensive experiments on four large-scale public datasets, our approach can achieve substantial improvements in terms of Recall and MAP over some known methods. Moreover, our approach is general which can be combined with PQ and its variations.
Horzyk, Adrian, Starzyk, Janusz A..  2019.  Associative Data Model in Search for Nearest Neighbors and Similar Patterns. 2019 IEEE Symposium Series on Computational Intelligence (SSCI). :933—940.
This paper introduces a biologically inspired associative data model and structure for finding nearest neighbors and similar patterns. The method can be used as an alternative to the classical approaches to accelerate the search for such patterns using various priorities for attributes according to the Sebestyen measure. The presented structure, together with algorithms developed in this paper can be useful in various computational intelligence tasks like pattern matching, recognition, clustering, classification, multi-criterion search etc. This approach is particularly useful for the on-line operation of associative neural network graphs. Graphs that dynamically develop their structure during learning on training data. The results of experiments show that the associative approach can substantially accelerate the nearest neighbor search and that associative structures can also be used as a model for KNN tasks. Finally, this paper presents how the associative structures can be used to self-organize data and represent knowledge about them in the associative way, which yields new search approaches described in this paper.
Ahsan, Ramoza, Bashir, Muzammil, Neamtu, Rodica, Rundensteiner, Elke A., Sarkozy, Gabor.  2019.  Nearest Neighbor Subsequence Search in Time Series Data. 2019 IEEE International Conference on Big Data (Big Data). :2057—2066.
Continuous growth in sensor data and other temporal sequence data necessitates efficient retrieval and similarity search support on these big time series datasets. However, finding exact similarity results, especially at the granularity of subsequences, is known to be prohibitively costly for large data sets. In this paper, we thus propose an efficient framework for solving this exact subsequence similarity match problem, called TINN (TIme series Nearest Neighbor search). Exploiting the range interval diversity properties of time series datasets, TINN captures similarity at two levels of abstraction, namely, relationships among subsequences within each long time series and relationships across distinct time series in the data set. These relationships are compactly organized in an augmented relationship graph model, with the former relationships encoded in similarity vectors at TINN nodes and the later captured by augmented edge types in the TINN Graph. Query processing strategy deploy novel pruning techniques on the TINN Graph, including node skipping, vertical and horizontal pruning, to significantly reduce the number of time series as well as subsequences to be explored. Comprehensive experiments on synthetic and real world time series data demonstrate that our TINN model consistently outperforms state-of-the-art approaches while still guaranteeing to retrieve exact matches.
Abdelhadi, Ameer M.S., Bouganis, Christos-Savvas, Constantinides, George A..  2019.  Accelerated Approximate Nearest Neighbors Search Through Hierarchical Product Quantization. 2019 International Conference on Field-Programmable Technology (ICFPT). :90—98.
A fundamental recurring task in many machine learning applications is the search for the Nearest Neighbor in high dimensional metric spaces. Towards answering queries in large scale problems, state-of-the-art methods employ Approximate Nearest Neighbors (ANN) search, a search that returns the nearest neighbor with high probability, as well as techniques that compress the dataset. Product-Quantization (PQ) based ANN search methods have demonstrated state-of-the-art performance in several problems, including classification, regression and information retrieval. The dataset is encoded into a Cartesian product of multiple low-dimensional codebooks, enabling faster search and higher compression. Being intrinsically parallel, PQ-based ANN search approaches are amendable for hardware acceleration. This paper proposes a novel Hierarchical PQ (HPQ) based ANN search method as well as an FPGA-tailored architecture for its implementation that outperforms current state of the art systems. HPQ gradually refines the search space, reducing the number of data compares and enabling a pipelined search. The mapping of the architecture on a Stratix 10 FPGA device demonstrates over ×250 speedups over current state-of-the-art systems, opening the space for addressing larger datasets and/or improving the query times of current systems.
Rattaphun, Munlika, Prayoonwong, Amorntip, Chiu, Chih- Yi.  2019.  Indexing in k-Nearest Neighbor Graph by Hash-Based Hill-Climbing. 2019 16th International Conference on Machine Vision Applications (MVA). :1—4.
A main issue in approximate nearest neighbor search is to achieve an excellent tradeoff between search accuracy and computation cost. In this paper, we address this issue by leveraging k-nearest neighbor graph and hill-climbing to accelerate vector quantization in the query assignment process. A modified hill-climbing algorithm is proposed to traverse k-nearest neighbor graph to find closest centroids for a query, rather than calculating the query distances to all centroids. Instead of using random seeds in the original hill-climbing algorithm, we generate high-quality seeds based on the hashing technique. It can boost the query assignment efficiency due to a better start-up in hill-climbing. We evaluate the experiment on the benchmarks of SIFT1M and GIST1M datasets, and show the proposed hashing-based seed generation effectively improves the search performance.
Ranjan, G S K, Kumar Verma, Amar, Radhika, Sudha.  2019.  K-Nearest Neighbors and Grid Search CV Based Real Time Fault Monitoring System for Industries. 2019 IEEE 5th International Conference for Convergence in Technology (I2CT). :1—5.
Fault detection in a machine at earlier stage can prevent severe damage and loss to the industries. Fault detection techniques are broadly classified into three categories; signature extraction-based, model-based and knowledge-based approach. Model-based techniques are efficient for raising an alarm signal if there is any fault in the machine. This paper focuses on one such model based-technique to identify the internal faults of induction machine. The model developed is deployed in the end to make it feasible to use in real time. K-Nearest Neighbors (KNN) and grid search cross validation (CV) have been used to train and optimize the model to give the best results. The advantage of proposed algorithm is the accuracy in prediction which has been seen to be 80%. Finally, a user friendly interface has been built using Flask, a python web framework.
Li, Xiaodong.  2019.  DURS: A Distributed Method for k-Nearest Neighbor Search on Uncertain Graphs. 2019 20th IEEE International Conference on Mobile Data Management (MDM). :377—378.
Large graphs are increasingly prevalent in mobile networks, social networks, traffic networks and biological networks. These graphs are often uncertain, where edges are augmented with probabilities that indicates the chance to exist. Recently k-nearest neighbor search has been studied within the field of uncertain graphs, but the scalability and efficiency issues are not well solved. Moreover, solutions are implemented on a single machine and thus cannot fit large uncertain graphs. In this paper, we develop a framework, called DURS, to distribute k-nearest neighbor search into several machines and re-partition the uncertain graphs to balance the work loads and reduce the communication costs. Evaluation results show that DURS is essential to make the system scalable when answering k-nearest neighbor queries on uncertain graphs.