Visible to the public Biblio

Found 12254 results

Khodabakhsh, A., Busch, C..  2020.  A Generalizable Deepfake Detector based on Neural Conditional Distribution Modelling. 2020 International Conference of the Biometrics Special Interest Group (BIOSIG). :1—5.
Photo- and video-realistic generation techniques have become a reality following the advent of deep neural networks. Consequently, there are immense concerns regarding the difficulty in differentiating what content is real from what is synthetic. An example of video-realistic generation techniques is the infamous Deepfakes, which exploit the main modality by which humans identify each other. Deepfakes are a category of synthetic face generation methods and are commonly based on generative adversarial networks. In this article, we propose a novel two-step synthetic face image detection method in which general-purpose features are extracted in a first step, trivializing the task of detecting synthetic images. The anomaly detector predicts the conditional probabilities for observing every individual pixel in the image and is trained on pristine data only. The extracted anomaly features demonstrate true generalization capacity across widely different unknown synthesis methods while showing a minimal loss in performance with regard to the detection of known synthetic samples.
Feng, Qi, Feng, Chendong, Hong, Weijiang.  2020.  Graph Neural Network-based Vulnerability Predication. 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME). :800–801.
Automatic vulnerability detection is challenging. In this paper, we report our in-progress work of vulnerability prediction based on graph neural network (GNN). We propose a general GNN-based framework for predicting the vulnerabilities in program functions. We study the different instantiations of the framework in representative program graph representations, initial node encodings, and GNN learning methods. The preliminary experimental results on a representative benchmark indicate that the GNN-based method can improve the accuracy and recall rates of vulnerability prediction.
Kerschbaumer, C., Ritter, T., Braun, F..  2020.  Hardening Firefox against Injection Attacks. 2020 IEEE European Symposium on Security and Privacy Workshops (EuroS PW). :653—663.
Web browsers display content in the form of HTML, CSS and JavaScript retrieved from the world wide web. The loaded content is subject to the web security model and considered untrusted and potentially malicious. To complicate security matters, Firefox uses the same technologies to render its user interface as it does to render untrusted web content which blurs the distinction between the two privilege levels.Getting interactions between the two correct turns out to be complicated and has led to numerous real-world security vulnerabilities. We study those vulnerabilities to discover common threats and explain how we address them systematically to harden Firefox.
Mashhadi, M. J., Hemmati, H..  2020.  Hybrid Deep Neural Networks to Infer State Models of Black-Box Systems. 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE). :299–311.
Inferring behavior model of a running software system is quite useful for several automated software engineering tasks, such as program comprehension, anomaly detection, and testing. Most existing dynamic model inference techniques are white-box, i.e., they require source code to be instrumented to get run-time traces. However, in many systems, instrumenting the entire source code is not possible (e.g., when using black-box third-party libraries) or might be very costly. Unfortunately, most black-box techniques that detect states over time are either univariate, or make assumptions on the data distribution, or have limited power for learning over a long period of past behavior. To overcome the above issues, in this paper, we propose a hybrid deep neural network that accepts as input a set of time series, one per input/output signal of the system, and applies a set of convolutional and recurrent layers to learn the non-linear correlations between signals and the patterns, over time. We have applied our approach on a real UAV auto-pilot solution from our industry partner with half a million lines of C code. We ran 888 random recent system-level test cases and inferred states, over time. Our comparison with several traditional time series change point detection techniques showed that our approach improves their performance by up to 102%, in terms of finding state change points, measured by F1 score. We also showed that our state classification algorithm provides on average 90.45% F1 score, which improves traditional classification algorithms by up to 17%.
Hou, M..  2020.  IMPACT: A Trust Model for Human-Agent Teaming. 2020 IEEE International Conference on Human-Machine Systems (ICHMS). :1–4.
A trust model IMPACT: Intention, Measurability, Predictability, Agility, Communication, and Transparency has been conceptualized to build human trust in autonomous agents. The six critical characteristics must be exhibited by the agents in order to gain and maintain the trust from their human partners towards an effective and collaborative team in achieving common goals. The IMPACT model guided a design of an intelligent adaptive decision aid for dynamic target engagement processes in a military context. Positive feedback from subject matter experts participated in a large scale joint exercise controlling multiple unmanned vehicles indicated the effectiveness of the decision aid. It also demonstrated the utility of the IMPACT model as design principles for building up a trusted human-agent teaming.
Usman, S., Winarno, I., Sudarsono, A..  2020.  Implementation of SDN-based IDS to protect Virtualization Server against HTTP DoS attacks. 2020 International Electronics Symposium (IES). :195—198.
Virtualization and Software-defined Networking (SDN) are emerging technologies that play a major role in cloud computing. Cloud computing provides efficient utilization, high performance, and resource availability on demand. However, virtualization environments are vulnerable to various types of intrusion attacks that involve installing malicious software and denial of services (DoS) attacks. Utilizing SDN technology, makes the idea of SDN-based security applications attractive in the fight against DoS attacks. Network intrusion detection system (IDS) which is used to perform network traffic analysis as a detection system implemented on SDN networks to protect virtualization servers from HTTP DoS attacks. The experimental results show that SDN-based IDS is able to detect and mitigate HTTP DoS attacks effectively.
Kumar, S. A., Kumar, A., Bajaj, V., Singh, G. K..  2020.  An Improved Fuzzy Min–Max Neural Network for Data Classification. IEEE Transactions on Fuzzy Systems. 28:1910–1924.
Hyperbox classifier is an efficient tool for modern pattern classification problems due to its transparency and rigorous use of Euclidian geometry. Fuzzy min-max (FMM) network efficiently implements the hyperbox classifier, and has been modified several times to yield better classification accuracy. However, the obtained accuracy is not up to the mark. Therefore, in this paper, a new improved FMM (IFMM) network is proposed to increase the accuracy rate. In the proposed IFMM network, a modified constraint is employed to check the expandability of a hyperbox. It also uses semiperimeter of the hyperbox along with k-nearest mechanism to select the expandable hyperbox. In the proposed IFMM, the contraction rules of conventional FMM and enhanced FMM (EFMM) are also modified using semiperimeter of a hyperbox in order to balance the size of both overlapped hyperboxes. Experimental results show that the proposed IFMM network outperforms the FMM, k-nearest FMM, and EFMM by yielding more accuracy rate with less number of hyperboxes. The proposed methods are also applied to histopathological images to know the best magnification factor for classification.
Simon, L., Verma, A..  2020.  Improving Fuzzing through Controlled Compilation. 2020 IEEE European Symposium on Security and Privacy (EuroS P). :34–52.
We observe that operations performed by standard compilers harm fuzzing because the optimizations and the Intermediate Representation (IR) lead to transformations that improve execution speed at the expense of fuzzing. To remedy this problem, we propose `controlled compilation', a set of techniques to automatically re-factor a program's source code and cherry pick beneficial compiler optimizations to improve fuzzing. We design, implement and evaluate controlled compilation by building a new toolchain with Clang/LLVM. We perform an evaluation on 10 open source projects and compare the results of AFL to state-of-the-art grey-box fuzzers and concolic fuzzers. We show that when programs are compiled with this new toolchain, AFL covers 30 % new code on average and finds 21 additional bugs in real world programs. Our study reveals that controlled compilation often covers more code and finds more bugs than state-of-the-art fuzzing techniques, without the need to write a fuzzer from scratch or resort to advanced techniques. We identify two main reasons to explain why. First, it has proven difficult for researchers to appropriately configure existing fuzzers such as AFL. To address this problem, we provide guidelines and new LLVM passes to help automate AFL's configuration. This will enable researchers to perform a fairer comparison with AFL. Second, we find that current coverage-based evaluation measures (e.g. the total number of visited lines, edges or BBs) are inadequate because they lose valuable information such as which parts of a program a fuzzer actually visits and how consistently it does so. Coverage is considered a useful metric to evaluate a fuzzer's performance and devise a fuzzing strategy. However, the lack of a standard methodology for evaluating coverage remains a problem. To address this, we propose a rigorous evaluation methodology based on `qualitative coverage'. Qualitative coverage uniquely identifies each program line to help understand which lines are commonly visited by different fuzzers vs. which lines are visited only by a particular fuzzer. Throughout our study, we show the benefits of this new evaluation methodology. For example we provide valuable insights into the consistency of fuzzers, i.e. their ability to cover the same code or find the same bug across multiple independent runs. Overall, our evaluation methodology based on qualitative coverage helps to understand if a fuzzer performs better, worse, or is complementary to another fuzzer. This helps security practitioners adjust their fuzzing strategies.
Bronzin, T., Prole, B., Stipić, A., Pap, K..  2020.  Individualization of Anonymous Identities Using Artificial Intelligence (AI). 2020 43rd International Convention on Information, Communication and Electronic Technology (MIPRO). :1058–1063.

Individualization of anonymous identities using artificial intelligence - enables innovative human-computer interaction through the personalization of communication which is, at the same time, individual and anonymous. This paper presents possible approach for individualization of anonymous identities in real time. It uses computer vision and artificial intelligence to automatically detect and recognize person's age group, gender, human body measures, proportions and other specific personal characteristics. Collected data constitutes the so-called person's biometric footprint and are linked to a unique (but still anonymous) identity that is recorded in the computer system, along with other information that make up the profile of the person. Identity anonymization can be achieved by appropriate asymmetric encryption of the biometric footprint (with no additional personal information being stored) and integrity can be ensured using blockchain technology. Data collected in this manner is GDPR compliant.

Rehan, S., Singh, R..  2020.  Industrial and Home Automation, Control, Safety and Security System using Bolt IoT Platform. 2020 International Conference on Smart Electronics and Communication (ICOSEC). :787—793.
This paper describes a system that comprises of control, safety and security subsystem for industries and homes. The entire system is based on the Bolt IoT platform. Using this system, the user can control the devices such as LEDs, speed of the fan or DC motor, monitor the temperature of the premises with an alert sub-system for critical temperatures through SMS and call, monitor the presence of anyone inside the premises with an alert sub-system about any intrusion through SMS and call. If the system is used specifically in any industry then instead of monitoring the temperature any other physical quantity, which is critical for that industry, can be monitored using suitable sensors. In addition, the cloud connectivity is provided to the system using the Bolt IoT module and temperature data is sent to the cloud where using machine-learning algorithm the future temperature is predicted to avoid any accidents in the future.
Zhang, Y., Liu, Y., Chung, C.-L., Wei, Y.-C., Chen, C.-H..  2020.  Machine Learning Method Based on Stream Homomorphic Encryption Computing. 2020 IEEE International Conference on Consumer Electronics - Taiwan (ICCE-Taiwan). :1–2.
This study proposes a machine learning method based on stream homomorphic encryption computing for improving security and reducing computational time. A case study of mobile positioning based on k nearest neighbors ( kNN) is selected to evaluate the proposed method. The results showed the proposed method can save computational resources than others.
Sharma, Rajesh Kumar, Pippal, Ravi Singh.  2020.  Malicious Attack and Intrusion Prevention in IoT Network using Blockchain based Security Analysis. 2020 12th International Conference on Computational Intelligence and Communication Networks (CICN). :380–385.
The Internet of Things (IoT) as a demanding technology require the best features of information security for effective development of the IoT based smart city and technological activity. There are huge number of recent security threats searching for some loopholes which are ready to exploit any network. Against the back-drop of recent rapidly growing technological advancement of IoT, security-threats have become a critical challenge which demand responsive and continuous action. As privacy and security exhibit an ever-present flourishing issue, so loopholes detection and analysis are indispensable process in the network. This paper presents Block chain based security analysis of data generated from IoT devices to prevent malicious attacks and intrusion in the IoT network.
Romano, A., Zheng, Y., Wang, W..  2020.  MinerRay: Semantics-Aware Analysis for Ever-Evolving Cryptojacking Detection. 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE). :1129—1140.
Recent advances in web technology have made in-browser crypto-mining a viable funding model. However, these services have been abused to launch large-scale cryptojacking attacks to secretly mine cryptocurrency in browsers. To detect them, various signature-based or runtime feature-based methods have been proposed. However, they can be imprecise or easily circumvented. To this end, we propose MinerRay, a generic scheme to detect malicious in-browser cryptominers. Instead of leveraging unreliable external patterns, MinerRay infers the essence of cryptomining behaviors that differentiate mining from common browser activities in both WebAssembly and JavaScript contexts. Additionally, to detect stealthy mining activities without user consents, MinerRay checks if the miner can only be instantiated from user actions. MinerRay was evaluated on over 1 million websites. It detected cryptominers on 901 websites, where 885 secretly start mining without user consent. Besides, we compared MinerRay with five state-of-the-art signature-based or behavior-based cryptominer detectors (MineSweeper, CMTracker, Outguard, No Coin, and minerBlock). We observed that emerging miners with new signatures or new services were detected by MinerRay but missed by others. The results show that our proposed technique is effective and robust in detecting evolving cryptominers, yielding more true positives, and fewer errors.
Fauzan, A., Sukarno, P., Wardana, A. A..  2020.  Overhead Analysis of the Use of Digital Signature in MQTT Protocol for Constrained Device in the Internet of Things System. 2020 3rd International Conference on Computer and Informatics Engineering (IC2IE). :415–420.
This paper presents an overhead analysis of the use of digital signature mechanisms in the Message Queue Telemetry Transport (MQTT) protocol for three classes of constrained-device. Because the resources provided by constrained-devices are very limited, the purpose of this overhead analysis is to help find out the advantages and disadvantages of each class of constrained-devices after a security mechanism has been applied, namely by applying a digital signature mechanism. The objective of using this digital signature mechanism is for providing integrity, that if the payload sent and received in its destination is still original and not changed during the transmission process. The overhead analysis aspects performed are including analyzing decryption time, signature verification performance, message delivery time, memory and flash usage in the three classes of constrained-device. Based on the overhead analysis result, it can be seen that for decryption time and signature verification performance, the Class-2 device is the fastest one. For message delivery time, the smallest time needed for receiving the payload is Class-l device. For memory usage, the Class-2 device is providing the biggest available memory and flash.
Sheptunov, Sergey A., Sukhanova, Natalia V..  2020.  The Problems of Design and Application of Switching Neural Networks in Creation of Artificial Intelligence. 2020 International Conference Quality Management, Transport and Information Security, Information Technologies (IT QM IS). :428–431.
The new switching architecture of the neural networks was proposed. The switching neural networks consist of the neurons and the switchers. The goal is to reduce expenses on the artificial neural network design and training. For realization of complex models, algorithms and methods of management the neural networks of the big size are required. The number of the interconnection links “everyone with everyone” grows with the number of neurons. The training of big neural networks requires the resources of supercomputers. Time of training of neural networks also depends on the number of neurons in the network. Switching neural networks are divided into fragments connected by the switchers. Training of switcher neuron network is provided by fragments. On the basis of switching neural networks the devices of associative memory were designed with the number of neurons comparable to the human brain.
Li, Yizhi.  2020.  Research on Application of Convolutional Neural Network in Intrusion Detection. 2020 7th International Forum on Electrical Engineering and Automation (IFEEA). :720–723.
At present, our life is almost inseparable from the network, the network provides a lot of convenience for our life. However, a variety of network security incidents occur very frequently. In recent years, with the continuous development of neural network technology, more and more researchers have applied neural network to intrusion detection, which has developed into a new research direction in intrusion detection. As long as the neural network is provided with input data including network data packets, through the process of self-learning, the neural network can separate abnormal data features and effectively detect abnormal data. Therefore, the article innovatively proposes an intrusion detection method based on deep convolutional neural networks (CNN), which is used to test on public data sets. The results show that the model has a higher accuracy rate and a lower false negative rate than traditional intrusion detection methods.
Piessens, F..  2020.  Security across abstraction layers: old and new examples. 2020 IEEE European Symposium on Security and Privacy Workshops (EuroS PW). :271–279.
A common technique for building ICT systems is to build them as successive layers of bstraction: for instance, the Instruction Set Architecture (ISA) is an abstraction of the hardware, and compilers or interpreters build higher level abstractions on top of the ISA.The functionality of an ICT application can often be understood by considering only a single level of abstraction. For instance the source code of the application defines the functionality using the level of abstraction of the source programming language. Functionality can be well understood by just studying this source code.Many important security issues in ICT system however are cross-layer issues: they can not be understood by considering the system at a single level of abstraction, but they require understanding how multiple levels of abstraction are implemented. Attacks may rely on, or exploit, implementation details of one or more layers below the source code level of abstraction.The purpose of this paper is to illustrate this cross-layer nature of security by discussing old and new examples of cross-layer security issues, and by providing a classification of these issues.
Ng, M., Coopamootoo, K. P. L., Toreini, E., Aitken, M., Elliot, K., Moorsel, A. van.  2020.  Simulating the Effects of Social Presence on Trust, Privacy Concerns Usage Intentions in Automated Bots for Finance. 2020 IEEE European Symposium on Security and Privacy Workshops (EuroS PW). :190–199.
FinBots are chatbots built on automated decision technology, aimed to facilitate accessible banking and to support customers in making financial decisions. Chatbots are increasing in prevalence, sometimes even equipped to mimic human social rules, expectations and norms, decreasing the necessity for human-to-human interaction. As banks and financial advisory platforms move towards creating bots that enhance the current state of consumer trust and adoption rates, we investigated the effects of chatbot vignettes with and without socio-emotional features on intention to use the chatbot for financial support purposes. We conducted a between-subject online experiment with N = 410 participants. Participants in the control group were provided with a vignette describing a secure and reliable chatbot called XRO23, whereas participants in the experimental group were presented with a vignette describing a secure and reliable chatbot that is more human-like and named Emma. We found that Vignette Emma did not increase participants' trust levels nor lowered their privacy concerns even though it increased perception of social presence. However, we found that intention to use the presented chatbot for financial support was positively influenced by perceived humanness and trust in the bot. Participants were also more willing to share financially-sensitive information such as account number, sort code and payments information to XRO23 compared to Emma - revealing a preference for a technical and mechanical FinBot in information sharing. Overall, this research contributes to our understanding of the intention to use chatbots with different features as financial technology, in particular that socio-emotional support may not be favoured when designed independently of financial function.
Muñoz, C. M. Blanco, Cruz, F. Gómez, Valero, J. S. Jimenez.  2020.  Software architecture for the application of facial recognition techniques through IoT devices. 2020 Congreso Internacional de Innovación y Tendencias en Ingeniería (CONIITI). :1–5.

The facial recognition time by time takes more importance, due to the extend kind of applications it has, but it is still challenging when faces big variations in the characteristics of the biometric data used in the process and especially referring to the transportation of information through the internet in the internet of things context. Based on the systematic review and rigorous study that supports the extraction of the most relevant information on this topic [1], a software architecture proposal which contains basic security requirements necessary for the treatment of the data involved in the application of facial recognition techniques, oriented to an IoT environment was generated. Concluding that the security and privacy considerations of the information registered in IoT devices represent a challenge and it is a priority to be able to guarantee that the data circulating on the network are only accessible to the user that was designed for this.

Song, M., Lind, M..  2020.  Towards Automated Generation of Function Models from P IDs. 2020 25th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA). 1:1081—1084.
Although function model has been widely applied to develop various operator decision support systems, the modeling process is essentially a manual work, which takes significant efforts on knowledge acquisition. It would greatly improve the efficiency of modeling if relevant information can be automatically retrieved from engineering documents. This paper investigates the possibility of automated transformation from P&IDs to a function model called MFM via AutomationML. Semantics and modeling patterns of MFM are established in AutomationML, which can be utilized to convert plant topology models into MFM models. The proposed approach is demonstrated with a small use case. Further topics for extending the study are also discussed.
Nageswar Rao, A., Rajendra Naik, B., Nirmala Devi, L., Venkata Subbareddy, K..  2020.  Trust and Packet Loss Aware Routing (TPLAR) for Intrusion Detection in WSNs. 2020 12th International Conference on Computational Intelligence and Communication Networks (CICN). :386–391.
In this paper, a new intrusion detection mechanism is proposed based on Trust and Packet Loss Rate at Sensor Node in WSNs. To find the true malicious nodes, the proposed mechanism performs a deep analysis on the packet loss. Two independent metrics such as buffer capacity metric and residual energy metric are considered for packet loss rate evaluation. Further, the trust evaluation also considers the basic communication interactions between sensor nodes. Based on these three metrics, a new composite metric called Packet Forwarding Probability (PFP) is derived through which the malicious nodes are identified. Simulation experiments are conducted over the proposed mechanism and the performance is evaluated through False Positive Rate (FPR) and Malicious Detection Rate (MDR). The results declare that the proposed mechanism achieves a better performance compared to the conventional approaches.
Sharma, Prince, Shukla, Shailendra, Vasudeva, Amol.  2020.  Trust-based Incentive for Mobile Offloaders in Opportunistic Networks. 2020 International Conference on Smart Electronics and Communication (ICOSEC). :872—877.
Mobile data offloading using opportunistic network has recently gained its significance to increase mobile data needs. Such offloaders need to be properly incentivized to encourage more and more users to act as helpers in such networks. The extent of help offered by mobile data offloading alternatives using appropriate incentive mechanisms is significant in such scenarios. The limitation of existing incentive mechanisms is that they are partial in implementation while most of them use third party intervention based derivation. However, none of the papers considers trust as an essential factor for incentive distribution. Although few works contribute to the trust analysis, but the implementation is limited to offloading determination only while the incentive is independent of trust. We try to investigate if trust could be related to the Nash equilibrium based incentive evaluation. Our analysis results show that trust-based incentive distribution encourages more than 50% offloaders to act positively and contribute successfully towards efficient mobile data offloading. We compare the performance of our algorithm with literature based salary-bonus scheme implementation and get optimum incentive beyond 20% dependence over trust-based output.
Leff, D., Maskay, A., Cunha, M. P. da.  2020.  Wireless Interrogation of High Temperature Surface Acoustic Wave Dynamic Strain Sensor. 2020 IEEE International Ultrasonics Symposium (IUS). :1–4.
Dynamic strain sensing is necessary for high-temperature harsh-environment applications, including powerplants, oil wells, aerospace, and metal manufacturing. Monitoring dynamic strain is important for structural health monitoring and condition-based maintenance in order to guarantee safety, increase process efficiency, and reduce operation and maintenance costs. Sensing in high-temperature (HT), harsh-environments (HE) comes with challenges including mounting and packaging, sensor stability, and data acquisition and processing. Wireless sensor operation at HT is desirable because it reduces the complexity of the sensor connection, increases reliability, and reduces costs. Surface acoustic wave resonators (SAWRs) are compact, can operate wirelessly and battery-free, and have been shown to operate above 1000°C, making them a potential option for HT HE dynamic strain sensing. This paper presents wirelessly interrogated SAWR dynamic strain sensors operating around 288.8MHz at room temperature and tested up to 400°C. The SAWRs were calibrated with a high-temperature wired commercial strain gauge. The sensors were mounted onto a tapered-type Inconel constant stress beam and the assembly was tested inside a box furnace. The SAWR sensitivity to dynamic strain excitation at 25°C, 100°C, and 400°C was .439 μV/με, 0.363μV/με, and .136 μV/με, respectively. The experimental outcomes verified that inductive coupled wirelessly interrogated SAWRs can be successfully used for dynamic strain sensing up to 400°C.
Illing, B., Westhoven, M., Gaspers, B., Smets, N., Brüggemann, B., Mathew, T..  2020.  Evaluation of Immersive Teleoperation Systems using Standardized Tasks and Measurements. 2020 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN). :278—285.

Despite advances regarding autonomous functionality for robots, teleoperation remains a means for performing delicate tasks in safety critical contexts like explosive ordnance disposal (EOD) and ambiguous environments. Immersive stereoscopic displays have been proposed and developed in this regard, but bring about their own specific problems, e.g., simulator sickness. This work builds upon standardized test environments to yield reproducible comparisons between different robotic platforms. The focus was placed on testing three optronic systems of differing degrees of immersion: (1) A laptop display showing multiple monoscopic camera views, (2) an off-the-shelf virtual reality headset coupled with a pantilt-based stereoscopic camera, and (3) a so-called Telepresence Unit, providing fast pan, tilt, yaw rotation, stereoscopic view, and spatial audio. Stereoscopic systems yielded significant faster task completion only for the maneuvering task. As expected, they also induced Simulator Sickness among other results. However, the amount of Simulator Sickness varied between both stereoscopic systems. Collected data suggests that a higher degree of immersion combined with careful system design can reduce the to-be-expected increase of Simulator Sickness compared to the monoscopic camera baseline while making the interface subjectively more effective for certain tasks.

Rana, Krishan, Dasagi, Vibhavari, Talbot, Ben, Milford, Michael, Sünderhauf, Niko.  2020.  Multiplicative Controller Fusion: Leveraging Algorithmic Priors for Sample-efficient Reinforcement Learning and Safe Sim-To-Real Transfer. 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). :6069—6076.
Learning-based approaches often outperform hand-coded algorithmic solutions for many problems in robotics. However, learning long-horizon tasks on real robot hardware can be intractable, and transferring a learned policy from simulation to reality is still extremely challenging. We present a novel approach to model-free reinforcement learning that can leverage existing sub-optimal solutions as an algorithmic prior during training and deployment. During training, our gated fusion approach enables the prior to guide the initial stages of exploration, increasing sample-efficiency and enabling learning from sparse long-horizon reward signals. Importantly, the policy can learn to improve beyond the performance of the sub-optimal prior since the prior's influence is annealed gradually. During deployment, the policy's uncertainty provides a reliable strategy for transferring a simulation-trained policy to the real world by falling back to the prior controller in uncertain states. We show the efficacy of our Multiplicative Controller Fusion approach on the task of robot navigation and demonstrate safe transfer from simulation to the real world without any fine-tuning. The code for this project is made publicly available at