Visible to the public Biblio

Filters: Keyword is data structures  [Clear All Filters]
Luo, Xinyi, Xu, Zhuo, Xue, Kaiping, Jiang, Qiantong, Li, Ruidong, Wei, David.  2022.  ScalaCert: Scalability-Oriented PKI with Redactable Consortium Blockchain Enabled "On-Cert" Certificate Revocation. 2022 IEEE 42nd International Conference on Distributed Computing Systems (ICDCS). :1236–1246.
As the voucher for identity, digital certificates and the public key infrastructure (PKI) system have always played a vital role to provide the authentication services. In recent years, with the increase in attacks on traditional centralized PKIs and the extensive deployment of blockchains, researchers have tried to establish blockchain-based secure decentralized PKIs and have made significant progress. Although blockchain enhances security, it brings new problems in scalability due to the inherent limitations of blockchain’s data structure and consensus mechanism, which become much severe for the massive access in the era of 5G and B5G. In this paper, we propose ScalaCert to mitigate the scalability problems of blockchain-based PKIs by utilizing redactable blockchain for "on-cert" revocation. Specifically, we utilize the redactable blockchain to record revocation information directly on the original certificate ("on-cert") and remove additional data structures such as CRL, significantly reducing storage overhead. Moreover, the combination of redactable and consortium blockchains brings a new kind of attack called deception of versions (DoV) attack. To defend against it, we design a random-block-node-check (RBNC) based freshness check mechanism. Security and performance analyses show that ScalaCert has sufficient security and effectively solves the scalability problem of the blockchain-based PKI system.
Belaïd, Sonia, Mercadier, Darius, Rivain, Matthieu, Taleb, Abdul Rahman.  2022.  IronMask: Versatile Verification of Masking Security. 2022 IEEE Symposium on Security and Privacy (SP). :142—160.
This paper introduces lronMask, a new versatile verification tool for masking security. lronMask is the first to offer the verification of standard simulation-based security notions in the probing model as well as recent composition and expandability notions in the random probing model. It supports any masking gadgets with linear randomness (e.g. addition, copy and refresh gadgets) as well as quadratic gadgets (e.g. multiplication gadgets) that might include non-linear randomness (e.g. by refreshing their inputs), while providing complete verification results for both types of gadgets. We achieve this complete verifiability by introducing a new algebraic characterization for such quadratic gadgets and exhibiting a complete method to determine the sets of input shares which are necessary and sufficient to perform a perfect simulation of any set of probes. We report various benchmarks which show that lronMask is competitive with state-of-the-art verification tools in the probing model (maskVerif, scVerif, SILVEH, matverif). lronMask is also several orders of magnitude faster than VHAPS -the only previous tool verifying random probing composability and expandability- as well as SILVEH -the only previous tool providing complete verification for quadratic gadgets with nonlinear randomness. Thanks to this completeness and increased performance, we obtain better bounds for the tolerated leakage probability of state-of-the-art random probing secure compilers.
Yeo, Guo Feng Anders, Hudson, Irene, Akman, David, Chan, Jeffrey.  2022.  A Simple Framework for XAI Comparisons with a Case Study. 2022 5th International Conference on Artificial Intelligence and Big Data (ICAIBD). :501—508.
The number of publications related to Explainable Artificial Intelligence (XAI) has increased rapidly this last decade. However, the subjective nature of explainability has led to a lack of consensus regarding commonly used definitions for explainability and with differing problem statements falling under the XAI label resulting in a lack of comparisons. This paper proposes in broad terms a simple comparison framework for XAI methods based on the output and what we call the practical attributes. The aim of the framework is to ensure that everything that can be held constant for the purpose of comparison, is held constant and to ignore many of the subjective elements present in the area of XAI. An example utilizing such a comparison along the lines of the proposed framework is performed on local, post-hoc, model-agnostic XAI algorithms which are designed to measure the feature importance/contribution for a queried instance. These algorithms are assessed on two criteria using synthetic datasets across a range of classifiers. The first is based on selecting features which contribute to the underlying data structure and the second is how accurately the algorithms select the features used in a decision tree path. The results from the first comparison showed that when the classifier was able to pick up the underlying pattern in the model, the LIME algorithm was the most accurate at selecting the underlying ground truth features. The second test returned mixed results with some instances in which the XAI algorithms were able to accurately return the features used to produce predictions, however this result was not consistent.
Rajan, Mohammad Hasnain, Rebello, Keith, Sood, Yajur, Wankhade, Sunil B..  2021.  Graph-Based Transfer Learning for Conversational Agents. 2021 6th International Conference on Communication and Electronics Systems (ICCES). :1335–1341.
Graphs have proved to be a promising data structure to solve complex problems in various domains. Graphs store data in an associative manner which is analogous to the manner in which humans store memories in the brain. Generathe chatbots lack the ability to recall details revealed by the user in long conversations. To solve this problem, we have used graph-based memory to recall-related conversations from the past. Thus, providing context feature derived from query systems to generative systems such as OpenAI GPT. Using graphs to detect important details from the past reduces the total amount of processing done by the neural network. As there is no need to keep on passingthe entire history of the conversation. Instead, we pass only the last few pairs of utterances and the related details from the graph. This paper deploys this system and also demonstrates the ability to deploy such systems in real-world applications. Through the effective usage of knowledge graphs, the system is able to reduce the time complexity from O(n) to O(1) as compared to similar non-graph based implementations of transfer learning- based conversational agents.
Saquib, Nazmus, Krintz, Chandra, Wolski, Rich.  2021.  PEDaLS: Persisting Versioned Data Structures. 2021 IEEE International Conference on Cloud Engineering (IC2E). :179—190.
In this paper, we investigate how to automatically persist versioned data structures in distributed settings (e.g. cloud + edge) using append-only storage. By doing so, we facilitate resiliency by enabling program state to survive program activations and termination, and program-level data structures and their version information to be accessed programmatically by multiple clients (for replay, provenance tracking, debugging, and coordination avoidance, and more). These features are useful in distributed, failure-prone contexts such as those for heterogeneous and pervasive Internet of Things (IoT) deployments. We prototype our approach within an open-source, distributed operating system for IoT. Our results show that it is possible to achieve algorithmic complexities similar to those of in-memory versioning but in a distributed setting.
Wang, Zhaohong, Guo, Jing.  2021.  Denoising Signals on the Graph for Distributed Systems by Secure Outsourced Computation. 2021 IEEE 7th World Forum on Internet of Things (WF-IoT). :524—529.
The burgeoning networked computing devices create many distributed systems and generate new signals on a large scale. Many Internet of Things (IoT) applications, such as peer-to-peer streaming of multimedia data, crowdsourcing, and measurement by sensor networks, can be modeled as a form of big data. Processing massive data calls for new data structures and algorithms different from traditional ones designed for small-scale problems. For measurement from networked distributed systems, we consider an essential data format: signals on graphs. Due to limited computing resources, the sensor nodes in the distributed systems may outsource the computing tasks to third parties, such as cloud platforms, arising a severe concern on data privacy. A de-facto solution is to have third parties only process encrypted data. We propose a novel and efficient privacy-preserving secure outsourced computation protocol for denoising signals on the graph based on the information-theoretic secure multi-party computation (ITS-MPC). Denoising the data makes paths for further meaningful data processing. From experimenting with our algorithms in a testbed, the results indicate a better efficiency of our approach than a counterpart approach with computational security.
Zhang, Feng, Pan, Zaifeng, Zhou, Yanliang, Zhai, Jidong, Shen, Xipeng, Mutlu, Onur, Du, Xiaoyong.  2021.  G-TADOC: Enabling Efficient GPU-Based Text Analytics without Decompression. 2021 IEEE 37th International Conference on Data Engineering (ICDE). :1679–1690.
Text analytics directly on compression (TADOC) has proven to be a promising technology for big data analytics. GPUs are extremely popular accelerators for data analytics systems. Unfortunately, no work so far shows how to utilize GPUs to accelerate TADOC. We describe G-TADOC, the first framework that provides GPU-based text analytics directly on compression, effectively enabling efficient text analytics on GPUs without decompressing the input data. G-TADOC solves three major challenges. First, TADOC involves a large amount of dependencies, which makes it difficult to exploit massive parallelism on a GPU. We develop a novel fine-grained thread-level workload scheduling strategy for GPU threads, which partitions heavily-dependent loads adaptively in a fine-grained manner. Second, in developing G-TADOC, thousands of GPU threads writing to the same result buffer leads to inconsistency while directly using locks and atomic operations lead to large synchronization overheads. We develop a memory pool with thread-safe data structures on GPUs to handle such difficulties. Third, maintaining the sequence information among words is essential for lossless compression. We design a sequence-support strategy, which maintains high GPU parallelism while ensuring sequence information. Our experimental evaluations show that G-TADOC provides 31.1× average speedup compared to state-of-the-art TADOC.
Zhang, Hongao, Yang, Zhen, Yu, Haiyang.  2021.  Lightweight and Privacy-preserving Search over Encryption Blockchain. 2021 7th IEEE International Conference on Network Intelligence and Digital Content (IC-NIDC). :423—427.
With the development of cloud computing, a growing number of users use the cloud to store their sensitive data. To protect privacy, users often encrypt their data before outsourcing. Searchable Symmetric Encryption (SSE) enables users to retrieve their encrypted data. Most prior SSE schemes did not focus on malicious servers, and users could not confirm the correctness of the search results. Blockchain-based SSE schemes show the potential to solve this problem. However, the expensive nature of storage overhead on the blockchain presents an obstacle to the implementation of these schemes. In this paper, we propose a lightweight blockchain-based searchable symmetric encryption scheme that reduces the space cost in the scheme by improving the data structure of the encrypted index and ensuring efficient data retrieval. Experiment results demonstrate the practicability of our scheme.
Godin, Jonathan, Lamontagne, Philippe.  2021.  Deletion-Compliance in the Absence of Privacy. 2021 18th International Conference on Privacy, Security and Trust (PST). :1–10.
Garg, Goldwasser and Vasudevan (Eurocrypt 2020) invented the notion of deletion-compliance to formally model the “right to be forgotten’, a concept that confers individuals more control over their digital data. A requirement of deletion-compliance is strong privacy for the deletion requesters since no outside observer must be able to tell if deleted data was ever present in the first place. Naturally, many real world systems where information can flow across users are automatically ruled out.The main thesis of this paper is that deletion-compliance is a standalone notion, distinct from privacy. We present an alternative definition that meaningfully captures deletion-compliance without any privacy implications. This allows broader class of data collectors to demonstrate compliance to deletion requests and to be paired with various notions of privacy. Our new definition has several appealing properties:•It is implied by the stronger definition of Garg et al. under natural conditions, and is equivalent when we add a strong privacy requirement.•It is naturally composable with minimal assumptions.•Its requirements are met by data structure implementations that do not reveal the order of operations, a concept known as history-independence.Along the way, we discuss the many challenges that remain in providing a universal definition of compliance to the “right to be forgotten.”
Myasnikov, Evgeny.  2021.  Nearest Neighbor Search In Hyperspectral Data Using Binary Space Partitioning Trees. 2021 11th Workshop on Hyperspectral Imaging and Signal Processing: Evolution in Remote Sensing (WHISPERS). :1—4.
Fast search of hyperspectral data is crucial in many practical applications ranging from classification to finding duplicate fragments in images. In this paper, we evaluate two space partitioning data structures in the task of searching hyperspectral data. In particular, we consider vp-trees and ball-trees, study several tree construction algorithms, and compare these structures with the brute force approach. In addition, we evaluate vp-trees and ball-trees with four similarity measures, namely, Euclidean Distance, Spectral Angle Mapper Bhattacharyya Angle, and Hellinger distance.
Paul, Rosebell, Selvan, Mercy Paul.  2021.  A Study On Naming and Caching in Named Data Networking. 2021 Fifth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC). :1387–1395.
This paper examines the fast approaching highly secure and content centric data sharing architecture Named Data Networking. The content name plays the key role in NDN. Most of the users are interested only in the content or information and thereby the host centric internet architecture is losing its importance. Different naming conventions and caching strategies used in Named Data Networking based applications have been discussed in this study. The convergence of NDN with the vehicular networks and the ongoing studies in it will make the path to Intelligent Transportation system more optimized and efficient. It describes the future internet and this idea has taken root in most of the upcoming IOT applications which are going to conquer every phase of life. Though it is in its infancy stage of development, NDN will soon take over traditional IP Architecture.
Zheng, Shiyuan, Xie, Hong, Lui, John C.S..  2021.  Social Visibility Optimization in OSNs with Anonymity Guarantees: Modeling, Algorithms and Applications. 2021 IEEE 37th International Conference on Data Engineering (ICDE). :2063–2068.
Online social network (OSN) is an ideal venue to enhance one's visibility. This paper considers how a user (called requester) in an OSN selects a small number of available users and invites them as new friends/followers so as to maximize his "social visibility". More importantly, the requester has to do this under the anonymity setting, which means he is not allowed to know the neighborhood information of these available users in the OSN. In this paper, we first develop a mathematical model to quantify the social visibility and formulate the problem of visibility maximization with anonymity guarantee, abbreviated as "VisMAX-A". Then we design an algorithmic framework named as "AdaExp", which adaptively expands the requester's visibility in multiple rounds. In each round of the expansion, AdaExp uses a query oracle with anonymity guarantee to select only one available user. By using probabilistic data structures like the k-minimum values (KMV) sketch, we design an efficient query oracle with anonymity guarantees. We also conduct experiments on real-world social networks and validate the effectiveness of our algorithms.
Everson, Douglas, Cheng, Long.  2021.  Compressing Network Attack Surfaces for Practical Security Analysis. 2021 IEEE Secure Development Conference (SecDev). :23–29.
Testing or defending the security of a large network can be challenging because of the sheer number of potential ingress points that need to be investigated and evaluated for vulnerabilities. In short, manual security testing and analysis do not easily scale to large networks. While it has been shown that clustering can simplify the problem somewhat, the data structures and formats returned by the latest network mapping tools are not conducive to clustering algorithms. In this paper we introduce a hybrid similarity algorithm to compute the distance between two network services and then use those calculations to support a clustering algorithm designed to compress a large network attack surface by orders of magnitude. Doing so allows for new testing strategies that incorporate outlier detection and smart consolidation of test cases to improve accuracy and timeliness of testing. We conclude by presenting two case studies using an organization's network attack surface data to demonstrate the effectiveness of this approach.
Sahu, Abhijeet, Davis, Katherine.  2021.  Structural Learning Techniques for Bayesian Attack Graphs in Cyber Physical Power Systems. 2021 IEEE Texas Power and Energy Conference (TPEC). :1–6.

Updating the structure of attack graph templates based on real-time alerts from Intrusion Detection Systems (IDS), in an Industrial Control System (ICS) network, is currently done manually by security experts. But, a highly-connected smart power systems, that can inadvertently expose numerous vulnerabilities to intruders for targeting grid resilience, needs automatic fast updates on learning attack graph structures, instead of manual intervention, to enable fast isolation of compromised network to secure the grid. Hence, in this work, we develop a technique to first construct a prior Bayesian Attack Graph (BAG) based on a predefined threat model and a synthetic communication network for a cyber-physical power system. Further, we evaluate a few score-based and constraint-based structural learning algorithms to update the BAG structure based on real-time alerts, based on scalability, data dependency, time complexity and accuracy criteria.

Hardin, David S..  2020.  Verified Hardware/Software Co-Assurance: Enhancing Safety and Security for Critical Systems. 2020 IEEE International Systems Conference (SysCon). :1—6.
Experienced developers of safety-critical and security-critical systems have long emphasized the importance of applying the highest degree of scrutiny to a system's I/O boundaries. From a safety perspective, input validation is a traditional “best practice.” For security-critical architecture and design, identification of the attack surface has emerged as a primary analysis technique. One of our current research focus areas concerns the identification of and mitigation against attacks along that surface, using mathematically-based tools. We are motivated in these efforts by emerging application areas, such as assured autonomy, that feature a high degree of network connectivity, require sophisticated algorithms and data structures, are subject to stringent accreditation/certification, and encourage hardware/software co-design approaches. We have conducted several experiments employing a state-of-the-art toolchain, due to Russinoff and O'Leary, and originally designed for use in floating-point hardware verification, to determine its suitability for the creation of safety-critical/security-critical input filters. We focus first on software implementation, but extending to hardware as well as hardware/software co-designs. We have implemented a high-assurance filter for JSON-formatted data used in an Unmanned Aerial Vehicle (UAV) application. Our JSON filter is built using a table-driven lexer/parser, supported by mathematically-proven lexer and parser table generation technology, as well as verified data structures. Filter behavior is expressed in a subset of Algorithmic C, which defines a set of C++ header files providing support for hardware design, including the peculiar bit widths utilized in that discipline, and enables compilation to both hardware and software platforms. The Russinoff-O'Leary Restricted Algorithmic C (RAC) toolchain translates Algorithmic C source to the Common Lisp subset supported by the ACL2 theorem prover; once in ACL2, filter behavior can be mathematically verified. We describe how we utilize RAC to translate our JSON filter to ACL2, present proofs of correctness for its associated data types, and describe validation and performance results obtained through the use of concrete test vectors.
Mell, Peter, Gueye, Assane.  2020.  A Suite of Metrics for Calculating the Most Significant Security Relevant Software Flaw Types. 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC). :511—516.
The Common Weakness Enumeration (CWE) is a prominent list of software weakness types. This list is used by vulnerability databases to describe the underlying security flaws within analyzed vulnerabilities. This linkage opens the possibility of using the analysis of software vulnerabilities to identify the most significant weaknesses that enable those vulnerabilities. We accomplish this through creating mashup views combining CWE weakness taxonomies with vulnerability analysis data. The resulting graphs have CWEs as nodes, edges derived from multiple CWE taxonomies, and nodes adorned with vulnerability analysis information (propagated from children to parents). Using these graphs, we develop a suite of metrics to identify the most significant weakness types (using the perspectives of frequency, impact, exploitability, and overall severity).
Bouzar-Benlabiod, L., Rubin, S. H., Belaidi, K., Haddar, N. E..  2020.  RNN-VED for Reducing False Positive Alerts in Host-based Anomaly Detection Systems. 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI). :17–24.
Host-based Intrusion Detection Systems HIDS are often based on anomaly detection. Several studies deal with anomaly detection by analyzing the system-call traces and get good detection rates but also a high rate off alse positives. In this paper, we propose a new anomaly detection approach applied on the system-call traces. The normal behavior learning is done using a Sequence to sequence model based on a Variational Encoder-Decoder (VED) architecture that integrates Recurrent Neural Networks (RNN) cells. We exploit the semantics behind the invoking order of system-calls that are then seen as sentences. A preprocessing phase is added to structure and optimize the model input-data representation. After the learning step, a one-class classification is run to categorize the sequences as normal or abnormal. The architecture may be used for predicting abnormal behaviors. The tests are achieved on the ADFA-LD dataset.
Cortiñas, C. T., Vassena, M., Russo, A..  2020.  Securing Asynchronous Exceptions. 2020 IEEE 33rd Computer Security Foundations Symposium (CSF). :214–229.

Language-based information-flow control (IFC) techniques often rely on special purpose, ad-hoc primitives to address different covert channels that originate in the runtime system, beyond the scope of language constructs. Since these piecemeal solutions may not compose securely, there is a need for a unified mechanism to control covert channels. As a first step towards this goal, we argue for the design of a general interface that allows programs to safely interact with the runtime system and the available computing resources. To coordinate the communication between programs and the runtime system, we propose the use of asynchronous exceptions (interrupts), which, to the best of our knowledge, have not been considered before in the context of IFC languages. Since asynchronous exceptions can be raised at any point during execution-often due to the occurrence of an external event-threads must temporarily mask them out when manipulating locks and shared data structures to avoid deadlocks and, therefore, breaking program invariants. Crucially, the naive combination of asynchronous exceptions with existing features of IFC languages (e.g., concurrency and synchronization variables) may open up new possibilities of information leakage. In this paper, we present MACasync, a concurrent, statically enforced IFC language that, as a novelty, features asynchronous exceptions. We show how asynchronous exceptions easily enable (out of the box) useful programming patterns like speculative execution and some degree of resource management. We prove that programs in MACasync satisfy progress-sensitive non-interference and mechanize our formal claims in the Agda proof assistant.

Lei, X., Tu, G.-H., Liu, A. X., Xie, T..  2020.  Fast and Secure kNN Query Processing in Cloud Computing. 2020 IEEE Conference on Communications and Network Security (CNS). :1–9.
Advances in sensing and tracking technology lead to the proliferation of location-based services. Location service providers (LSPs) often resort to commercial public clouds to store the tremendous geospatial data and process location-based queries from data users. To protect the privacy of LSP's geospatial data and data user's query location against the untrusted cloud, they are required to be encrypted before sending to the cloud. Nevertheless, it is not easy to design a fast and secure location-based query processing scheme over the encrypted data. In this paper, we propose a Fast and Secure kNN (FSkNN) scheme to support secure k nearest neighbor (k NN) search in cloud computing. We reveal the inherent connection between an Sk NN protocol and a secure range query protocol and further describe how to construct FSkNN based on a secure range query protocol. FSkNN leverages a customized accuracy-assured strategy to ensure the result accuracy and adopts a data structure named random Bloom filter (RBF) to build a secure index for efficiently searching. We formally prove the security of FSkNN under the random oracle model. Our evaluation results show that FSkNN is highly practical.
Sun, J., Ma, J., Quan, J., Zhu, X., I, C..  2019.  A Fuzzy String Matching Scheme Resistant to Statistical Attack. 2019 International Conference on Networking and Network Applications (NaNA). :396–402.
The fuzzy query scheme based on vector index uses Bloom filter to construct vector index for key words. Then the statistical attack based on the deviation of frequency distribution of the vector index brings out the sensitive information disclosure. Using the noise vector, a fuzzy query scheme resistant to the statistical attack serving for encrypted database, i.e. S-BF, is introduced. With the noise vector to clear up the deviation of frequency distribution of vector index, the statistical attacks to the vector index are resolved. Demonstrated by lab experiment, S-BF scheme can achieve the secure fuzzy query with the powerful privation protection capability for encrypted cloud database without the loss of fuzzy query efficiency.
Huang, K., Yang, T..  2020.  Additive and Subtractive Cuckoo Filters. 2020 IEEE/ACM 28th International Symposium on Quality of Service (IWQoS). :1–10.
Bloom filters (BFs) are fast and space-efficient data structures used for set membership queries in many applications. BFs are required to satisfy three key requirements: low space cost, high-speed lookups, and fast updates. Prior works do not satisfy these requirements at the same time. The standard BF does not support deletions of items and the variants that support deletions need additional space or performance overhead. The state-of-the-art cuckoo filters (CF) has high performance with seemingly low space cost. However, the CF suffers a critical issue of varying space cost per item. This is because the exclusive-OR (XOR) operation used by the CF requires the total number of buckets to be a power of two, leading to the space inflation. To address the issue, in this paper we propose a scalable variant of the cuckoo filter called additive and subtractive cuckoo filter (ASCF). We aim to improve the space efficiency while sustaining comparably high performance. The ASCF uses the addition and subtraction (ADD/SUB) operations instead of the XOR operation to compute an item's two candidate bucket indexes based on its fingerprint. Experimental results show that the ASCF achieves both low space cost and high performance. Compared to the CF, the ASCF reduces up to 1.9x space cost per item while maintaining the same lookup and update throughput. In addition, the ASCF outperforms other filters in both space cost and performance.
Zhang, H., Zhang, D., Chen, H., Xu, J..  2020.  Improving Efficiency of Pseudonym Revocation in VANET Using Cuckoo Filter. 2020 IEEE 20th International Conference on Communication Technology (ICCT). :763–769.
In VANETs, pseudonyms are often used to replace the identity of vehicles in communication. When vehicles drive out of the network or misbehave, their pseudonym certificates need to be revoked by the certificate authority (CA). The certificate revocation lists (CRLs) are usually used to store the revoked certificates before their expiration. However, using CRLs would incur additional storage, communication and computation overhead. Some existing schemes have proposed to use Bloom Filter to compress the original CRLs, but they are unable to delete the expired certificates and introduce the false positive problem. In this paper, we propose an improved pseudonym certificates revocation scheme, using Cuckoo Filter for compression to reduce the impact of these problems. In order to optimize deletion efficiency, we propose the concept of Certificate Expiration List (CEL) which can be implemented with priority queue. The experimental results show that our scheme can effectively reduce the storage and communication overhead of pseudonym certificates revocation, while retaining moderately low false positive rates. In addition, our scheme can also greatly improve the lookup performance on CRLs, and reduce the revocation operation costs by allowing deletion.
Awad, M. A., Ashkiani, S., Porumbescu, S. D., Owens, J. D..  2020.  Dynamic Graphs on the GPU. 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS). :739–748.
We present a fast dynamic graph data structure for the GPU. Our dynamic graph structure uses one hash table per vertex to store adjacency lists and achieves 3.4-14.8x faster insertion rates over the state of the art across a diverse set of large datasets, as well as deletion speedups up to 7.8x. The data structure supports queries and dynamic updates through both edge and vertex insertion and deletion. In addition, we define a comprehensive evaluation strategy based on operations, workloads, and applications that we believe better characterize and evaluate dynamic graph data structures.
Riaz, S., Khan, A. H., Haroon, M., Latif, S., Bhatti, S..  2020.  Big Data Security and Privacy: Current Challenges and Future Research perspective in Cloud Environment. 2020 International Conference on Information Management and Technology (ICIMTech). :977—982.

Cloud computing is an Internet-based technology that emerging rapidly in the last few years due to popular and demanded services required by various institutions, organizations, and individuals. structured, unstructured, semistructured data is transfer at a record pace on to the cloud server. These institutions, businesses, and organizations are shifting more and more increasing workloads on cloud server, due to high cost, space and maintenance issues from big data, cloud computing will become a potential choice for the storage of data. In Cloud Environment, It is obvious that data is not secure completely yet from inside and outside attacks and intrusions because cloud servers are under the control of a third party. The Security of data becomes an important aspect due to the storage of sensitive data in a cloud environment. In this paper, we give an overview of characteristics and state of art of big data and data security & privacy top threats, open issues and current challenges and their impact on business are discussed for future research perspective and review & analysis of previous and recent frameworks and architectures for data security that are continuously established against threats to enhance how to keep and store data in the cloud environment.

Zhang, W., Byna, S., Niu, C., Chen, Y..  2019.  Exploring Metadata Search Essentials for Scientific Data Management. 2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC). :83—92.

Scientific experiments and observations store massive amounts of data in various scientific file formats. Metadata, which describes the characteristics of the data, is commonly used to sift through massive datasets in order to locate data of interest to scientists. Several indexing data structures (such as hash tables, trie, self-balancing search trees, sparse array, etc.) have been developed as part of efforts to provide an efficient method for locating target data. However, efficient determination of an indexing data structure remains unclear in the context of scientific data management, due to the lack of investigation on metadata, metadata queries, and corresponding data structures. In this study, we perform a systematic study of the metadata search essentials in the context of scientific data management. We study a real-world astronomy observation dataset and explore the characteristics of the metadata in the dataset. We also study possible metadata queries based on the discovery of the metadata characteristics and evaluate different data structures for various types of metadata attributes. Our evaluation on real-world dataset suggests that trie is a suitable data structure when prefix/suffix query is required, otherwise hash table should be used. We conclude our study with a summary of our findings. These findings provide a guideline and offers insights in developing metadata indexing methodologies for scientific applications.