Biblio

Filters: Keyword is distributed computing
2021-06-02
Bychkov, Igor, Feoktistov, Alexander, Gorsky, Sergey, Edelev, Alexei, Sidorov, Ivan, Kostromin, Roman, Fereferov, Evgeniy, Fedorov, Roman.  2020.  Supercomputer Engineering for Supporting Decision-making on Energy Systems Resilience. 2020 IEEE 14th International Conference on Application of Information and Communication Technologies (AICT). :1–6.
We propose a new approach to creating a subject-oriented distributed computing environment. Such an environment is used to support decision-making in solving relevant problems of ensuring energy systems resilience. The proposed approach is based on the idea of advancing and integrating the following important capabilities in supercomputer engineering: continuous integration, delivery, and deployment of the system and applied software; high-performance computing in heterogeneous environments; multi-agent intelligent computation planning and resource allocation; big data processing and geo-information servicing for subject information, including weakly structured data; and decision-making support. This combination of capabilities and their advancement is unique to the subject domain under consideration, which is related to the combinatorial study of critical objects in energy systems. Evaluation of decision-making alternatives is carried out by applying combinatorial modeling and multi-criteria selection rules. The Orlando Tools framework is used as the basis for an integrated software environment. It implements a flexible modular approach to the development of scientific applications (distributed applied software packages).
2021-06-01
Reijsbergen, Daniël, Anh Dinh, Tien Tuan.  2020.  On Exploiting Transaction Concurrency To Speed Up Blockchains. 2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS). :1044–1054.
Consensus protocols are currently the bottlenecks that prevent blockchain systems from scaling. However, we argue that transaction execution is also important to the performance and security of blockchains. In other words, there are ample opportunities to speed up and further secure blockchains by reducing the cost of transaction execution. Our goal is to understand how much we can speed up blockchains by exploiting transaction concurrency available in blockchain workloads. To this end, we first analyze historical data of seven major public blockchains, namely Bitcoin, Bitcoin Cash, Litecoin, Dogecoin, Ethereum, Ethereum Classic, and Zilliqa. We consider two metrics for concurrency, namely the single-transaction conflict rate per block, and the group conflict rate per block. We find that there is more concurrency in UTXO-based blockchains than in account-based ones, although the amount of concurrency in the former is lower than expected. Another interesting finding is that some blockchains with larger blocks have more concurrency than blockchains with smaller blocks. Next, we propose an analytical model for estimating the transaction execution speed-up given an amount of concurrency. Using results from our empirical analysis, the model estimates that 6× speed-ups in Ethereum can be achieved if all available concurrency is exploited.
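A minimal sketch of the kind of measurement the paper performs, not its exact model: treat each transaction as a read set and a write set, build the conflict graph for one block, and estimate the parallel speed-up from a greedy colouring (transactions with the same colour can execute concurrently). The toy block and all names are illustrative.

```python
from itertools import combinations

def conflicts(tx_a, tx_b):
    """Two transactions conflict if one writes what the other reads or writes."""
    ra, wa = tx_a
    rb, wb = tx_b
    return bool(wa & (rb | wb) or wb & ra)

def block_concurrency(block):
    """Return (conflict_rate, estimated_speedup) for one block of
    (read_set, write_set) pairs."""
    n = len(block)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if conflicts(block[i], block[j]):
            adj[i].add(j)
            adj[j].add(i)
    conflict_rate = sum(1 for i in range(n) if adj[i]) / n if n else 0.0
    colour = {}
    for i in range(n):  # greedy colouring of the conflict graph
        used = {colour[j] for j in adj[i] if j in colour}
        colour[i] = next(c for c in range(n) if c not in used)
    rounds = len(set(colour.values())) if n else 1
    return conflict_rate, n / rounds

# Toy block: accounts read and written by each transaction.
block = [({"a"}, {"b"}), ({"b"}, {"c"}), ({"d"}, {"e"}), ({"f"}, {"g"})]
print(block_concurrency(block))  # low conflict rate -> speed-up approaches n
```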
2021-05-25
Fang, Ying, Gu, Tianlong, Chang, Liang, Li, Long.  2020.  Algebraic Decision Diagram-Based CP-ABE with Constant Secret and Fast Decryption. 2020 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC). :98–106.
Ciphertext-policy attribute-based encryption (CP-ABE) is applied to many data service platforms to provide secure and fine-grained access control. In this paper, a new CP-ABE system based on the algebraic decision diagram (ADD) is presented. The new system makes full use of both the powerful description ability and the high calculating efficiency of ADD to improve the performance and efficiency of the algorithms contained in CP-ABE. First, the new system supports both positive and negative attributes in the description of access policies. Second, the size of the secret key is constant and is not affected by the number of attributes. Third, the time complexities of the key generation and decryption algorithms are O(1). Finally, this scheme allows visitors to have different access permissions to shared data or files. At the same time, PV operation is introduced into the CP-ABE framework for the first time to prevent resource conflicts caused by read and write operations on shared files. Compared with other schemes, the new scheme proposed in this paper performs better in both functionality and efficiency.
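A toy sketch of the decision-diagram idea underlying the scheme, not the paper's actual ADD-based CP-ABE construction: an access policy with both positive and negative attributes is evaluated by walking a diagram in time proportional to its depth. The Node class and the example policy are hypothetical.

```python
class Node:
    """Decision-diagram node: test one attribute, branch on presence/absence."""
    def __init__(self, attr, if_present, if_absent):
        self.attr, self.if_present, self.if_absent = attr, if_present, if_absent

def evaluate(node, attrs):
    """Walk the diagram until a terminal (True = decryption allowed)."""
    while isinstance(node, Node):
        node = node.if_present if node.attr in attrs else node.if_absent
    return node

# Policy: doctor AND NOT intern; negative attributes are expressed directly.
policy = Node("doctor", Node("intern", False, True), False)
print(evaluate(policy, {"doctor"}))            # True
print(evaluate(policy, {"doctor", "intern"}))  # False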
2021-04-09
Peng, X., Hongmei, Z., Lijie, C., Ying, H..  2020.  Analysis of Computer Network Information Security under the Background of Big Data. 2020 5th International Conference on Smart Grid and Electrical Automation (ICSGEA). :409–412.
In today's society, with the comprehensive arrival of the Internet era, the rapid development of technology has facilitated people's production and life, but it is also a "double-edged sword", making people's personal information and other data subject to a greater threat of abuse. The unique features of big data technology, such as massive storage, parallel computing and efficient querying, have created a breakthrough opportunity for the key technologies of large-scale network security situational awareness. On the basis of big data acquisition, preprocessing, distributed computing, and mining and analysis, the big data analysis platform provides information security assurance services to the information system. This paper discusses security situational awareness in large-scale network environments and the promotion of big data technology in security perception.
2020-11-17
Hossain, M. S., Ramli, M. R., Lee, J. M., Kim, D.-S..  2019.  Fog Radio Access Networks in Internet of Battlefield Things (IoBT) and Load Balancing Technology. 2019 International Conference on Information and Communication Technology Convergence (ICTC). :750–754.

The recent trend in the military is to combine Internet of Things (IoT) knowledge with the battlefield to enhance impact. That is why the Internet of Battlefield Things (IoBT) is our concern. This paper discusses how a Fog Radio Access Network (F-RAN) can provide support for local computing in Industrial IoT and IoBT. F-RAN can play a vital role because IoT devices are becoming popular and fifth-generation (5G) communication is an emerging issue with ultra-low latency, energy consumption, bandwidth efficiency and a wide coverage area. To overcome the disadvantages of cloud radio access networks (C-RAN), F-RAN can be introduced, where a large number of F-RAN nodes can take part in a joint distributed computing and content sharing scheme. The F-RAN in IoBT is effective for enhancing computing ability with fog computing and edge computing at the network edge. Since the computing capability of fog equipment is weak, to overcome the difficulties of fog computing in IoBT this paper illustrates some challenging issues and solutions to improve battlefield efficiency. Therefore, the distributed computing load balancing problem of the F-RAN is researched. The simulation result indicates that the load balancing strategy has better performance for the F-RAN architecture in the battlefield.
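A minimal sketch of one common load-balancing strategy (greedy least-loaded assignment across fog nodes); the abstract does not specify the paper's strategy, so this is illustrative only and all values are made up.

```python
import heapq

def balance(task_costs, n_nodes):
    """Assign each task to the currently least-loaded F-RAN node."""
    heap = [(0.0, node) for node in range(n_nodes)]  # (load, node)
    heapq.heapify(heap)
    assignment = {node: [] for node in range(n_nodes)}
    for cost in sorted(task_costs, reverse=True):  # largest first tightens balance
        load, node = heapq.heappop(heap)
        assignment[node].append(cost)
        heapq.heappush(heap, (load + cost, node))
    return assignment

print(balance([5, 3, 8, 2, 7, 4], n_nodes=3))
```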

2020-06-01
Talusan, Jose Paolo, Tiausas, Francis, Yasumoto, Keiichi, Wilbur, Michael, Pettet, Geoffrey, Dubey, Abhishek, Bhattacharjee, Shameek.  2019.  Smart Transportation Delay and Resiliency Testbed Based on Information Flow of Things Middleware. 2019 IEEE International Conference on Smart Computing (SMARTCOMP). :13–18.
Edge and Fog computing paradigms are used to process big data generated by the increasing number of IoT devices. These paradigms have enabled cities to become smarter in various aspects via real-time data-driven applications. While these have addressed some flaws of cloud computing, some challenges remain, particularly in terms of privacy and security. We create a testbed based on a distributed processing platform called the Information flow of Things (IFoT) middleware. We briefly describe a decentralized traffic speed query and routing service implemented on this framework testbed. We configure the testbed to test countermeasure systems that aim to address the security challenges faced by prior paradigms. Using this testbed, we investigate a novel decentralized anomaly detection approach for time-sensitive distributed smart transportation systems.
2020-03-30
Scherzinger, Stefanie, Seifert, Christin, Wiese, Lena.  2019.  The Best of Both Worlds: Challenges in Linking Provenance and Explainability in Distributed Machine Learning. 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS). :1620–1629.
Machine learning experts prefer to think of their input as a single, homogeneous, and consistent data set. However, when analyzing large volumes of data, the entire data set may not be manageable on a single server, but must be stored on a distributed file system instead. Moreover, with the pressing demand to deliver explainable models, the experts may no longer focus on the machine learning algorithms in isolation, but must take into account the distributed nature of the data stored, as well as the impact of any data pre-processing steps upstream in their data analysis pipeline. In this paper, we make the point that even basic transformations during data preparation can impact the model learned, and that this is exacerbated in a distributed setting. We then sketch our vision of end-to-end explainability of the model learned, taking the pre-processing into account. In particular, we point out the potentials of linking the contributions of research on data provenance with the efforts on explainability in machine learning. In doing so, we highlight pitfalls we may experience in a distributed system on the way to generating more holistic explanations for our machine learning models.
2020-03-02
Gupta, Diksha, Saia, Jared, Young, Maxwell.  2019.  Peace Through Superior Puzzling: An Asymmetric Sybil Defense. 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS). :1083–1094.

A common tool to defend against Sybil attacks is proof-of-work, whereby computational puzzles are used to limit the number of Sybil participants. Unfortunately, current Sybil defenses require significant computational effort to offset an attack. In particular, good participants must spend computationally at a rate that is proportional to the spending rate of an attacker. In this paper, we present the first Sybil defense algorithm which is asymmetric in the sense that good participants spend at a rate that is asymptotically less than an attacker. In particular, if T is the rate of the attacker's spending, and J is the rate of joining good participants, then our algorithm spends at a rate of O(√(TJ) + J). We provide empirical evidence that our algorithm can be significantly more efficient than previous defenses under various attack scenarios. Additionally, we prove a lower bound showing that our algorithm's spending rate is asymptotically optimal among a large family of algorithms.
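To see what the asymmetry buys, here is a small numeric comparison of the two spending rates with constants suppressed; T and J follow the abstract's notation.

```python
import math

def symmetric_spend(T, J):
    """Classic proof-of-work defenses: good spending tracks the attacker's rate T."""
    return T

def asymmetric_spend(T, J):
    """Rate achieved by the paper's algorithm: O(sqrt(T*J) + J)."""
    return math.sqrt(T * J) + J

J = 10  # join rate of good participants
for T in (10, 1_000, 100_000):  # attacker spending rate
    print(T, symmetric_spend(T, J), round(asymmetric_spend(T, J), 1))
# As T grows, the good participants' spend grows only like sqrt(T).
```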

2019-12-30
Sharma, Mukesh Kumar, Somwanshi, Devendra.  2018.  Improvement in Homomorphic Encryption Algorithm with Elliptic Curve Cryptography and OTP Technique. 2018 3rd International Conference and Workshops on Recent Advances and Innovations in Engineering (ICRAIE). :1–6.
Cloud computing is a technology where the client need not worry about the cost of hardware installation and maintenance. Distributed computing has now become the most prominent technology on account of its availability, low cost and other features. However, there are several issues in cloud computing, the main one being security, since every client stores valuable data on the system and wants it protected from any unauthorized access or any changes not made on the client's behalf. To take care of the problem of key management and key sharing, various schemes have been proposed. The third-party auditor is one such scheme for key management and key sharing. Its main advantage is that the cloud provider can offer the service that was previously available from the traditional third-party auditor and make it trustworthy. The third-party auditing scheme fails, however, if the third party's security is compromised or if the third party is malicious. To solve this problem, a new model for key sharing and key management in a fully homomorphic encryption scheme is designed. In this paper we use the symmetric key agreement algorithm named Diffie-Hellman to create a session key between the two parties who need to communicate, and elliptic curve cryptography to generate encryption keys in place of RSA, and we use a One Time Password (OTP) for authenticating the users.
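A stdlib-only sketch of the ingredients the paper combines, under loud assumptions: classic finite-field Diffie-Hellman stands in for the paper's elliptic-curve variant, the prime is a toy value chosen for brevity, and the OTP is HOTP-style.

```python
import hashlib, hmac, secrets, struct, time

# Toy finite-field Diffie-Hellman; stands in for the paper's elliptic-curve
# variant. p is a Mersenne prime chosen for brevity, NOT a secure DH group.
p = 2**127 - 1
g = 5

a = secrets.randbelow(p - 2) + 2           # Alice's secret exponent
b = secrets.randbelow(p - 2) + 2           # Bob's secret exponent
shared = pow(pow(g, a, p), b, p)           # Bob's view: (g^a)^b mod p
assert shared == pow(pow(g, b, p), a, p)   # Alice derives the same value

session_key = hashlib.sha256(str(shared).encode()).digest()

def otp(key: bytes, interval: int = 30) -> str:
    """HOTP-style one-time password derived from the shared session key."""
    counter = struct.pack(">Q", int(time.time() // interval))
    digest = hmac.new(key, counter, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return f"{code % 1_000_000:06d}"

print(otp(session_key))  # both parties can compute and compare this code
```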
Bazm, Mohammad-Mahdi, Lacoste, Marc, Südholt, Mario, Menaud, Jean-Marc.  2018.  Secure Distributed Computing on Untrusted Fog Infrastructures Using Trusted Linux Containers. 2018 IEEE International Conference on Cloud Computing Technology and Science (CloudCom). :239–242.
Fog and Edge computing provide a large pool of resources at the edge of the network that may be used for distributed computing. Fog infrastructure heterogeneity also results in complex configuration of distributed applications on computing nodes. Linux containers are a mainstream technique allowing to run packaged applications and micro services. However, running applications on remote hosts owned by third parties is challenging because of untrusted operating systems and hardware maintained by third parties. To meet such challenges, we may leverage trusted execution mechanisms. In this work, we propose a model for distributed computing on Fog infrastructures using Linux containers secured by Intel's Software Guard Extensions (SGX) technology. We implement our model on a Docker and OpenSGX platform. The result is a secure and flexible approach for distributed computing on Fog infrastructures.
2019-12-09
Bangalore, Laasya, Choudhury, Ashish, Patra, Arpita.  2018.  Almost-Surely Terminating Asynchronous Byzantine Agreement Revisited. Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing. :295–304.

The problem of Byzantine Agreement (BA) is of interest to both the distributed computing and cryptography communities. Following well-known results from the distributed computing literature, the BA problem in the asynchronous network setting encounters inevitable non-termination issues. The impasse is overcome via randomization, which allows construction of BA protocols in two flavours of termination guarantee: with overwhelming probability and with probability one. The latter type, termed almost-surely terminating BAs, are the focus of this paper. An eluding problem in the domain of almost-surely terminating BAs is achieving a constant expected running time. Our work makes progress in this direction. In a setting with n parties and an adversary with unbounded computing power controlling at most t parties in Byzantine fashion, we present two asynchronous almost-surely terminating BA protocols: With the optimal resilience of t < n/3, our first protocol runs in expected O(n) time. The existing protocols in the same setting either run in expected O(n²) time (Abraham et al., PODC 2008) or require exponential computing power from the honest parties (Wang, CoRR 2015). In terms of communication complexity, our construction outperforms all the known constructions that offer the almost-surely terminating feature. With a resilience of t < n/(3+ε) for any ε > 0, our second protocol runs in expected O(1/ε) time. The expected running time of our protocol turns constant when ε is a constant fraction. The known constructions with constant expected running time either require ε to be at least 1 (Feldman-Micali, STOC 1988), implying t < n/4, or call for exponential computing power from the honest parties (Wang, CoRR 2015). We follow the traditional route of building BA via a common coin protocol that in turn reduces to asynchronous verifiable secret sharing (AVSS). Our constructions are built on a variant of AVSS termed shunning. A shunning AVSS fails to offer the properties of AVSS when the corrupt parties strike, but allows the honest parties to locally detect and shun a set of corrupt parties for any future communication. Our shunning AVSS with t < n/3 and t < n/(3+ε) guarantees Ω(n) and Ω(εt²) conflicts, respectively, to be revealed when failure occurs. Turning this shunning AVSS into a common coin protocol constitutes another contribution of our paper.

2019-10-28
Withers, Alex, Bockelman, Brian, Weitzel, Derek, Brown, Duncan, Gaynor, Jeff, Basney, Jim, Tannenbaum, Todd, Miller, Zach.  2018.  SciTokens: Capability-Based Secure Access to Remote Scientific Data. Proceedings of the Practice and Experience on Advanced Research Computing. :24:1–24:8.
The management of security credentials (e.g., passwords, secret keys) for computational science workflows is a burden for scientists and information security officers. Problems with credentials (e.g., expiration, privilege mismatch) cause workflows to fail to fetch needed input data or store valuable scientific results, distracting scientists from their research by requiring them to diagnose the problems, re-run their computations, and wait longer for their results. In this paper, we introduce SciTokens, open source software to help scientists manage their security credentials more reliably and securely. We describe the SciTokens system architecture, design, and implementation addressing use cases from the Laser Interferometer Gravitational-Wave Observatory (LIGO) Scientific Collaboration and the Large Synoptic Survey Telescope (LSST) projects. We also present our integration with widely-used software that supports distributed scientific computing, including HTCondor, CVMFS, and XrootD. SciTokens uses IETF-standard OAuth tokens for capability-based secure access to remote scientific data. The access tokens convey the specific authorizations needed by the workflows, rather than general-purpose authentication impersonation credentials, to address the risks of scientific workflows running on distributed infrastructure including NSF resources (e.g., LIGO Data Grid, Open Science Grid, XSEDE) and public clouds (e.g., Amazon Web Services, Google Cloud, Microsoft Azure). By improving the interoperability and security of scientific workflows, SciTokens 1) enables use of distributed computing for scientific domains that require greater data protection and 2) enables use of more widely distributed computing resources by reducing the risk of credential abuse on remote systems.
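A minimal sketch of the capability-token idea, assuming an HMAC-signed JWT-shaped token for brevity; real SciTokens are JWTs signed with asymmetric keys (e.g. RS256/ES256) and validated against the issuer's public key. The scope string and subject are hypothetical.

```python
import base64, hashlib, hmac, json, time

SECRET = b"demo-signing-key"  # real SciTokens use asymmetric issuer keys

def b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_token(scope: str, subject: str, lifetime: int = 600) -> str:
    """Mint a JWT-shaped token whose 'scope' claim names one specific
    authorization (a capability), not a general-purpose identity."""
    header = b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = b64(json.dumps({
        "sub": subject,
        "scope": scope,                      # e.g. "read:/ligo/frames"
        "exp": int(time.time()) + lifetime,  # short-lived by design
    }).encode())
    sig = b64(hmac.new(SECRET, f"{header}.{claims}".encode(), hashlib.sha256).digest())
    return f"{header}.{claims}.{sig}"

def allows(token: str, needed_scope: str) -> bool:
    """Verify the signature, expiry, and that the capability matches."""
    header, claims, sig = token.split(".")
    expected = b64(hmac.new(SECRET, f"{header}.{claims}".encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return False
    body = json.loads(base64.urlsafe_b64decode(claims + "=" * (-len(claims) % 4)))
    return body["exp"] > time.time() and body["scope"] == needed_scope

t = make_token("read:/ligo/frames", "workflow-42")
print(allows(t, "read:/ligo/frames"))   # True
print(allows(t, "write:/ligo/frames"))  # False: token conveys only read access
```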
2019-08-26
Gries, S., Hesenius, M., Gruhn, V..  2018.  Embedding Non-Compliant Nodes into the Information Flow Monitor by Dependency Modeling. 2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS). :1541–1542.

Observing semantic dependencies in large and heterogeneous networks is a critical task, since it is quite difficult to find the actual source of a malfunction in the case of an error. Dependencies might exist between many network nodes and across multiple hops in paths. If those dependency structures are unknown, debugging errors gets quite difficult. Since CPS and other large networks change at runtime and consist of custom software and hardware as well as off-the-shelf components, approaches to detecting dependencies between nodes must be able to include more than just one's own components. In this paper we present an extension to the Information Flow Monitor approach. Our goal is that this approach should be able to handle unalterable blackbox nodes. This is quite challenging, since the IFM originally requires each network node to be compliant with the IFM protocol.

2018-12-10
Oyekanlu, E..  2018.  Distributed Osmotic Computing Approach to Implementation of Explainable Predictive Deep Learning at Industrial IoT Network Edges with Real-Time Adaptive Wavelet Graphs. 2018 IEEE First International Conference on Artificial Intelligence and Knowledge Engineering (AIKE). :179–188.
Challenges associated with developing analytics solutions at the edge of large scale Industrial Internet of Things (IIoT) networks, close to where data is being generated, in most cases involve developing analytics solutions from the ground up. However, this approach increases IoT development costs and system complexities, delays time to market, and ultimately lowers competitive advantages associated with delivering next-generation IoT designs. To overcome these challenges, existing, widely available hardware can be utilized to successfully participate in distributed edge computing for IIoT systems. In this paper, an osmotic computing approach is used to illustrate how distributed osmotic computing and existing low-cost hardware may be utilized to solve a complex, compute-intensive Explainable Artificial Intelligence (XAI) deep learning problem from the edge, through the fog, to the network cloud layer of IIoT systems. At the edge layer, the C28x digital signal processor (DSP), an existing low-cost, embedded, real-time DSP that has very wide deployment and integration in several IoT industries, is used as a case study for constructing real-time graph-based Coiflet wavelets that could be used for several analytic applications, including deep learning pre-processing applications at the edge and fog layers of IIoT networks. Our implementation is the first known application of the fixed-point C28x DSP to construct Coiflet wavelets. Coiflet wavelets are constructed in the form of an osmotic microservice, using embedded low-level machine language to program the C28x at the network edge. With the graph-based approach, it is shown that an entire Coiflet wavelet distribution could be generated from only one wavelet stored in the C28x based edge device, and this could lead to significant savings in memory at the edge of IoT networks. The Pearson correlation coefficient is used to select an edge-generated Coiflet wavelet, and the selected wavelet is used at the fog layer for pre-processing and denoising IIoT data to improve data quality for fog layer based deep learning application. Parameters for implementing deep learning at the fog layer using LSTM networks have been determined in the cloud. For XAI, communication network noise is shown to have a significant impact on results of predictive deep learning at the IIoT network fog layer.
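A sketch of the fog-layer pre-processing step, assuming the PyWavelets package (pywt) as a stand-in for the paper's DSP-generated Coiflet wavelets; the signal, noise level, and threshold rule are illustrative.

```python
import numpy as np
import pywt  # PyWavelets; stands in for the paper's C28x-generated wavelets

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1024)
clean = np.sin(2 * np.pi * 5 * t)
noisy = clean + 0.3 * rng.standard_normal(t.size)   # "communication network noise"

coeffs = pywt.wavedec(noisy, "coif1", level=4)      # Coiflet decomposition
sigma = np.median(np.abs(coeffs[-1])) / 0.6745      # MAD noise estimate
thr = sigma * np.sqrt(2 * np.log(noisy.size))       # universal threshold
denoised = pywt.waverec(
    [coeffs[0]] + [pywt.threshold(c, thr, "soft") for c in coeffs[1:]],
    "coif1")[: t.size]

# Mean squared error before and after denoising:
print(np.mean((noisy - clean) ** 2), np.mean((denoised - clean) ** 2))
```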
2018-11-19
Langfinger, M., Schneider, M., Stricker, D., Schotten, H. D..  2017.  Addressing Security Challenges in Industrial Augmented Reality Systems. 2017 IEEE 15th International Conference on Industrial Informatics (INDIN). :299–304.

In the context of Industry 4.0, Augmented Reality (AR) is frequently mentioned as the upcoming interface technology for human-machine communication and collaboration. Many prototypes have already arisen in both the consumer market and the industrial sector. According to numerous experts it will take only a few years until AR reaches the maturity level to be deployed in productive applications. Especially for industrial usage it is required to assess the security risks and challenges this new technology entails. Thereby we focus on plant operators, Original Equipment Manufacturers (OEMs) and component vendors as stakeholders. Starting from several industrial AR use cases and the structure of contemporary AR applications, in this paper we identify security assets worthy of protection and derive the corresponding security goals. Afterwards we elaborate the threats industrial AR applications are exposed to and develop an edge computing architecture for future AR applications which encompasses various measures to reduce security risks for our stakeholders.

2018-09-28
Li-Xin, L., Yong-Shan, D., Jia-Yan, W..  2017.  Differential Privacy Data Protection Method Based on Clustering. 2017 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC). :11–16.

To enhance privacy protection and improve data availability, a differential privacy data protection method, ICMD-DP, is proposed. Based on an insensitive clustering algorithm, ICMD-DP applies differential privacy to the results of ICMD (insensitive clustering method for mixed data). The combination of clustering and differential privacy realizes the differentiation of query sensitivity from single records to group records. At the same time, it reduces the risk of information loss and information disclosure. In addition, to satisfy the requirement of maintaining differential privacy for mixed data, ICMD-DP uses different methods to calculate the distance and centroid of categorical and numerical attributes. Finally, experiments are given to illustrate the availability of the method.
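A generic sketch of the clustering-plus-differential-privacy pattern, not ICMD-DP's exact mechanism: Laplace noise calibrated to the count sensitivity is added to per-cluster record counts before release. The sensitivity value and cluster sizes are illustrative assumptions.

```python
import numpy as np

def dp_cluster_counts(cluster_sizes, epsilon):
    """Release per-cluster record counts under epsilon-differential privacy.

    Assumption: moving one record between clusters changes two counts by 1
    each, so we take sensitivity 2 (a conservative illustrative choice)."""
    sensitivity = 2.0
    rng = np.random.default_rng()
    noise = rng.laplace(0.0, sensitivity / epsilon, size=len(cluster_sizes))
    return [max(0.0, s + z) for s, z in zip(cluster_sizes, noise)]

print(dp_cluster_counts([120, 87, 45], epsilon=0.5))
```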

Lu, Z., Shen, H..  2017.  A New Lower Bound of Privacy Budget for Distributed Differential Privacy. 2017 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT). :25–32.

Distributed data aggregation via summation (counting) helps us to learn the insights behind the raw data. However, such computing suffers from a high privacy risk of malicious collusion attacks. That is, the colluding adversaries infer a victim's privacy from the gaps between the aggregation outputs and their source data. Among the solutions against such collusion attacks, Distributed Differential Privacy (DDP) shows a significant effect of privacy preservation. Specifically, a DDP scheme guarantees global differential privacy (the presence or absence of any data curator barely impacts the aggregation outputs) by ensuring local differential privacy at the end of each data curator. To guarantee an overall privacy performance of a distributed data aggregation system against malicious collusion attacks, part of the existing work on such DDP schemes aims to provide an estimated lower bound of the privacy budget for global differential privacy. However, there are two main problems: low data utility from using a large global function sensitivity, and an unknown privacy guarantee when the aggregation sensitivity of the whole system is less than the sum of the data curators' aggregation sensitivities. To address these problems while ensuring distributed differential privacy, we provide a new lower bound of the privacy budget, which works with an unconditional aggregation sensitivity of the whole distributed system. Moreover, we study the performance of our privacy bound in different scenarios of data updates. Both theoretical and experimental evaluations show that our privacy bound offers better global privacy performance than the existing work.
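A sketch of the standard DDP building block such schemes rest on, not the paper's new bound: each curator adds a Gamma-difference noise share, and by the infinite divisibility of the Laplace distribution the shares sum to one Laplace noise term, so the aggregate stays differentially private even though no single curator holds the whole noise.

```python
import numpy as np

def distributed_laplace_shares(n_curators, sensitivity, epsilon, rng):
    """One noise share per curator. Differences of Gamma(1/n, b) draws sum
    to a single Laplace(b) variable, where b = sensitivity / epsilon."""
    b = sensitivity / epsilon
    return (rng.gamma(1.0 / n_curators, b, n_curators)
            - rng.gamma(1.0 / n_curators, b, n_curators))

rng = np.random.default_rng(1)
values = rng.integers(0, 100, size=20)        # 20 curators' raw counts
shares = distributed_laplace_shares(20, sensitivity=1.0, epsilon=0.5, rng=rng)
noisy_sum = (values + shares).sum()           # each curator perturbs locally
print(values.sum(), noisy_sum)
```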

2018-09-12
Weintraub, E..  2017.  Estimating Target Distribution in security assessment models. 2017 IEEE 2nd International Verification and Security Workshop (IVSW). :82–87.

Organizations are exposed to various cyber-attacks. When a component is exploited, the overall computed damage is impacted by the number of components the network includes. This work focuses on estimating the Target Distribution characteristic of an attacked network. According to existing security assessment models, Target Distribution is assessed using ordinal values based on users' intuitive knowledge. This work is aimed at defining a formula which enables measuring the attacked components' distribution quantitatively. The proposed formula is based on the real-time configuration of the system. Using the proposed measure, firms can quantify damages, allocate appropriate budgets to actual risks and build their configuration while taking into consideration the risks impacted by components' distribution. The formula is demonstrated as part of a security continuous monitoring system.

Januário, Fábio, Cardoso, Alberto, Gil, Paulo.  2017.  A Multi-Agent Framework for Resilient Enhancement in Networked Control Systems. Proceedings of the 9th International Conference on Computer and Automation Engineering. :291–295.
Recent advances on the integration of control systems with state of the art information technologies have brought into play new uncertainties, not only associated with the physical world, but also from a cyber-space's perspective. In cyber-physical environments, awareness and resilience are invaluable properties. The paper focuses on the development of an architecture relying on a hierarchical multi-agent framework for resilience enhancement. This framework was evaluated on a test-bed comprising several distributed computational devices and heterogeneous communications. Results from tests prove the relevance of the proposed approach.
2018-09-05
Gai, K., Qiu, M..  2017.  An Optimal Fully Homomorphic Encryption Scheme. 2017 IEEE 3rd International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC), and IEEE International Conference on Intelligent Data and Security (IDS). :101–106.

The expeditious expansion of networking technologies has remarkably driven the usage of distributed computing as well as services, such as task offloading to the cloud. However, security and privacy concerns are restricting the implementations of cloud computing because of the threats from both outsiders and insiders. The primary alternative for protecting users' data is developing a Fully Homomorphic Encryption (FHE) scheme, which can cover both data protection and data processing in the cloud. Despite many previous attempts at this approach, none of the proposed work can simultaneously satisfy two requirements: noise-free accuracy and efficient execution. This paper focuses on the issue of FHE design and proposes a novel FHE scheme, called Optimal Fully Homomorphic Encryption (O-FHE). Our approach utilizes the properties of the Kronecker Product (KP) and designs a mechanism for achieving FHE that considers both accuracy and efficiency. We have assessed our scheme in both theoretical proofs and experimental evaluations, with confirmed and exceptional results.
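A small demonstration of the Kronecker Product algebra such a scheme can build on, namely the mixed-product property, rather than the O-FHE scheme itself. This property lets matrix operations commute with the Kronecker encoding.

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C, D = (rng.integers(0, 5, (2, 2)) for _ in range(4))

# Mixed-product property: (A kron B)(C kron D) = (AC) kron (BD).
lhs = np.kron(A, B) @ np.kron(C, D)
rhs = np.kron(A @ C, B @ D)
print(np.array_equal(lhs, rhs))  # True: structure survives multiplication
```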

2018-06-11
DeYoung, Mark E., Salman, Mohammed, Bedi, Himanshu, Raymond, David, Tront, Joseph G..  2017.  Spark on the ARC: Big Data Analytics Frameworks on HPC Clusters. Proceedings of the Practice and Experience in Advanced Research Computing 2017 on Sustainability, Success and Impact. :34:1–34:6.

In this paper we document our approach to overcoming service discovery and configuration of Apache Hadoop and Spark frameworks with dynamic resource allocations in a batch oriented Advanced Research Computing (ARC) High Performance Computing (HPC) environment. ARC efforts have produced a wide variety of HPC architectures. A common HPC architectural pattern is multi-node compute clusters with low-latency, high-performance interconnect fabrics and shared central storage. This pattern enables processing of workloads with high data co-dependency, frequently solved with message passing interface (MPI) programming models, and then executed as batch jobs. Unfortunately, many HPC programming paradigms are not well suited to big data workloads which are often easily separable. Our approach lowers barriers of entry to HPC environments by enabling end users to utilize Apache Hadoop and Spark frameworks that support big data oriented programming paradigms appropriate for separable workloads in batch oriented HPC environments.
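A hedged sketch of the approach's flavour: from inside a SLURM allocation, expand the node list and stand up a Spark standalone cluster on the allocated nodes. The script names (start-master.sh, start-worker.sh) come from recent Spark distributions and are assumptions; the paper's actual tooling may differ.

```python
import os
import subprocess

# Expand SLURM's compressed node list (e.g. "node[01-04]") into hostnames.
nodes = subprocess.run(
    ["scontrol", "show", "hostnames", os.environ["SLURM_JOB_NODELIST"]],
    capture_output=True, text=True, check=True,
).stdout.split()

master, workers = nodes[0], nodes[1:]
subprocess.run(["ssh", master, "start-master.sh"], check=True)
for worker in workers:
    subprocess.run(
        ["ssh", worker, "start-worker.sh", f"spark://{master}:7077"],
        check=True,
    )
print(f"submit jobs with: spark-submit --master spark://{master}:7077 app.py")
```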

2018-05-02
Rjoub, G., Bentahar, J..  2017.  Cloud Task Scheduling Based on Swarm Intelligence and Machine Learning. 2017 IEEE 5th International Conference on Future Internet of Things and Cloud (FiCloud). :272–279.

Cloud computing is the expansion of parallel and distributed computing. The technology of cloud computing is becoming more and more widely used, and one of the fundamental issues in the cloud environment is task scheduling. However, scheduling in cloud environments represents a difficult issue since it is basically NP-complete. Thus, many variants based on approximation techniques, especially those inspired by Swarm Intelligence (SI), have been proposed. This paper proposes a machine learning algorithm to guide the cloud in choosing the scheduling technique, using multi-criteria decision making to optimize performance. The main contribution of our work is to minimize the makespan of a given task set. The new strategy is simulated using the CloudSim toolkit package, where the impact of the algorithm is checked with different numbers of VMs varying from 2 to 50, and different task sizes between 30 bytes and 2700 bytes. Experiment results show that the proposed algorithm reduces the execution time and the makespan by between 7% and 75%, and improves the performance of load balancing scheduling.
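For intuition about the makespan objective, here is a toy comparison of round-robin placement against a greedy longest-processing-time heuristic; the paper's SI- and ML-based scheduler is far more sophisticated, and the task sizes below merely echo the abstract's 30-2700 range.

```python
def makespan(assignment):
    """Completion time of the busiest VM."""
    return max(sum(vm) for vm in assignment)

def round_robin(tasks, n_vm):
    vms = [[] for _ in range(n_vm)]
    for i, t in enumerate(tasks):
        vms[i % n_vm].append(t)
    return vms

def greedy_lpt(tasks, n_vm):
    """Longest processing time first: give each task to the least-loaded VM."""
    vms = [[] for _ in range(n_vm)]
    for t in sorted(tasks, reverse=True):
        min(vms, key=sum).append(t)
    return vms

tasks = [2700, 1800, 900, 600, 450, 300, 150, 90, 60, 30]
print(makespan(round_robin(tasks, 3)), makespan(greedy_lpt(tasks, 3)))
# 3480 vs 2700: the greedy heuristic already shrinks the makespan noticeably.
```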

2018-04-11
Assadi, Sepehr, Khanna, Sanjeev.  2017.  Randomized Composable Coresets for Matching and Vertex Cover. Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures. :3–12.

A common approach for designing scalable algorithms for massive data sets is to distribute the computation across, say k, machines and process the data using limited communication between them. A particularly appealing framework here is the simultaneous communication model whereby each machine constructs a small representative summary of its own data and one obtains an approximate/exact solution from the union of the representative summaries. If the representative summaries needed for a problem are small, then this results in a communication-efficient and round-optimal (requiring essentially no interaction between the machines) protocol. Some well-known examples of techniques for creating summaries include sampling, linear sketching, and composable coresets. These techniques have been successfully used to design communication efficient solutions for many fundamental graph problems. However, two prominent problems are notably absent from the list of successes, namely, the maximum matching problem and the minimum vertex cover problem. Indeed, it was shown recently that for both these problems, even achieving a modest approximation factor of polylog(n) requires using representative summaries of size Ω̃(n²), i.e., essentially no better summary exists than each machine simply sending its entire input graph. The main insight of our work is that the intractability of matching and vertex cover in the simultaneous communication model is inherently connected to an adversarial partitioning of the underlying graph across machines. We show that when the underlying graph is randomly partitioned across machines, both these problems admit randomized composable coresets of size Õ(n) that yield an Õ(1)-approximate solution (here and throughout the paper, the Õ(·) notation suppresses polylog(n) factors, where n is the number of vertices in the graph). In other words, a small subgraph of the input graph at each machine can be identified as its representative summary and the final answer then is obtained by simply running any maximum matching or minimum vertex cover algorithm on these combined subgraphs. This results in an Õ(1)-approximation simultaneous protocol for these problems with Õ(nk) total communication when the input is randomly partitioned across k machines. We also prove our results are optimal in a very strong sense: we not only rule out existence of smaller randomized composable coresets for these problems but in fact show that our Õ(nk) bound for total communication is optimal for any simultaneous communication protocol (i.e. not only for randomized coresets) for these two problems. Finally, by a standard application of composable coresets, our results also imply MapReduce algorithms with the same approximation guarantee in one or two rounds of communication, improving the previous best known round complexity for these problems.
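A runnable miniature of the randomized-partition idea, using a greedy maximal matching as each machine's summary; the paper's coresets and approximation guarantees are considerably more refined than this sketch.

```python
import random

def maximal_matching(edges):
    """Greedy maximal matching; plays the role of each machine's coreset."""
    matched, M = set(), []
    for u, v in edges:
        if u not in matched and v not in matched:
            M.append((u, v))
            matched |= {u, v}
    return M

def simulate(edges, k, seed=0):
    random.seed(seed)
    parts = [[] for _ in range(k)]
    for e in edges:                      # random partition across k machines
        parts[random.randrange(k)].append(e)
    coresets = [maximal_matching(p) for p in parts]   # one summary per machine
    union = [e for cs in coresets for e in cs]
    return maximal_matching(union)       # final answer on combined summaries

edges = [(i, i + 1) for i in range(0, 100, 2)] + [(i, i + 50) for i in range(25)]
print(len(simulate(edges, k=4)))
```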

2018-03-26
Liu, W., Chen, F., Hu, H., Cheng, G., Huo, S., Liang, H..  2017.  A Novel Framework for Zero-Day Attacks Detection and Response with Cyberspace Mimic Defense Architecture. 2017 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC). :50–53.

In cyberspace, unknown zero-day attacks can bring safety hazards. Traditional defense methods based on signatures are ineffective. Based on the Cyberspace Mimic Defense (CMD) architecture, the paper proposes a framework to detect such attacks and respond to them. Inputs are assigned to all online redundant, heterogeneous, functionally equivalent modules. Their independent outputs are compared and the outputs in the majority become the final response. Abnormal outputs can thus be detected, and so can the attack. Damaged executive modules with abnormal outputs are replaced with new ones from the diverse executive module pool. By analyzing the abnormal outputs, the correspondence between inputs and abnormal outputs can be built, and inputs leading to recurrent abnormal outputs are written into the zero-day-attack-related database so that their reuse no longer works, as the suspicious malicious inputs can be detected and processed. Further responses include IP blacklisting and patching, etc. The framework also uses a honeypot-like executive module to confuse the attacker. The proposed method can prevent recurrent attacks based on the same exploit.
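A toy sketch of the voting-and-replacement loop described above; the executor functions and the replacement pool are purely illustrative.

```python
from collections import Counter

def mimic_execute(executors, pool, x):
    """Run input x on all redundant variants, emit the majority output, and
    replace any variant whose output deviates (a possible compromise)."""
    outputs = [(i, f(x)) for i, f in enumerate(executors)]
    majority, _ = Counter(o for _, o in outputs).most_common(1)[0]
    for i, o in outputs:
        if o != majority:
            print(f"variant {i} abnormal on input {x!r}; replacing")
            executors[i] = pool.pop()   # swap in a fresh diverse variant
    return majority

square = lambda x: x * x
compromised = lambda x: x * x + 1       # behaves abnormally under attack
executors = [square, compromised, square]
pool = [square]
print(mimic_execute(executors, pool, 7))  # prints the alert, then 49
```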

2018-02-21
Zhang, G., Qiu, X., Chang, W..  2017.  Scheduling of Security Resources in Software Defined Security Architecture. 2017 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC). :494–503.

With the development of Software Defined Networking, its software programmability and openness bring new ideas for network security. Therefore, many Software Defined Security (SDS) architectures have emerged at the right moment. Software Defined Security decouples the security control plane from the security data plane. In Software Defined Security architectures, underlying security devices are abstracted as security resources in a resource pool, and intelligent, automated security business management and orchestration can be realized through software programming in the security control plane. However, network management has become extremely complicated due to expanding network scale, varying network devices, lack of abstraction and, especially, the heterogeneity of networks. Therefore, new open security devices are needed in the SDS architecture for unified management so that they can be conveniently abstracted as security resources in the resource pool. This paper first analyses why open security devices are needed in the SDS architecture and proposes a method for opening security devices. Considering that this new architecture requires a new security scheduling mechanism, this paper proposes a security resource scheduling algorithm for managing and scheduling security resources in the resource pool according to the user's security demands. The security resource scheduling algorithm aims to allocate a security protection task to a suitable security resource in the resource pool so as to improve security protection efficiency. In the algorithm, we use a BP neural network to predict the execution time of security tasks to improve the performance of the algorithm. The simulation result shows that the algorithm has ideal performance. Finally, a usage scenario is given to illustrate the role of security resource scheduling in a software defined security architecture.
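A minimal sketch of the scheduling idea: predicted execution time per (task, resource) pair drives the assignment. A hypothetical lookup table stands in for the BP neural network's predictions; the task and resource names are illustrative.

```python
# Hypothetical predicted execution times, e.g. produced by the BP network.
predicted_time = {
    ("scan", "fw1"): 4.0, ("scan", "ids1"): 2.5,
    ("filter", "fw1"): 1.0, ("filter", "ids1"): 3.0,
}

def schedule(tasks, resources):
    """Greedily map each security task to the resource that finishes it
    earliest, accounting for load already queued on each resource."""
    load = {r: 0.0 for r in resources}
    plan = {}
    for t in tasks:
        best = min(resources, key=lambda r: load[r] + predicted_time[(t, r)])
        plan[t] = best
        load[best] += predicted_time[(t, best)]
    return plan

print(schedule(["scan", "filter"], ["fw1", "ids1"]))
# {'scan': 'ids1', 'filter': 'fw1'}: each task lands on its fastest free resource
```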