Biblio

Filters: Keyword is Computer vision
Maung, Maung, Pyone, April, Kiya, Hitoshi.  2020.  Encryption Inspired Adversarial Defense For Visual Classification. 2020 IEEE International Conference on Image Processing (ICIP). :1681—1685.
Conventional adversarial defenses reduce classification accuracy whether or not a model is under attack. Moreover, most image-processing-based defenses are defeated due to the problem of obfuscated gradients. In this paper, we propose a new adversarial defense which is a defensive transform for both training and test images inspired by perceptual image encryption methods. The proposed method utilizes a block-wise pixel shuffling method with a secret key. The experiments are carried out on both adaptive and non-adaptive maximum-norm bounded white-box attacks while considering obfuscated gradients. The results show that the proposed defense achieves high accuracy on the CIFAR-10 dataset: 91.55% on clean images and 89.66% on adversarial examples with a noise distance of 8/255. Thus, the proposed defense outperforms state-of-the-art adversarial defenses including latent adversarial training, adversarial training and thermometer encoding.
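The block-wise pixel shuffling mentioned in the abstract can be illustrated with a minimal sketch. It assumes a single-channel image stored as nested lists and one key-derived permutation reused for every block; the function name `shuffle_blocks`, the use of Python's `random.Random` as the keyed generator, and the block size are illustrative choices, not details taken from the paper.

```python
import random


def shuffle_blocks(image, block_size, key):
    """Permute pixel positions inside every block with one key-derived permutation."""
    height, width = len(image), len(image[0])
    if height % block_size or width % block_size:
        raise ValueError("image dimensions must be multiples of block_size")
    # Assumption: the secret key seeds a PRNG that fixes the permutation.
    perm = list(range(block_size * block_size))
    random.Random(key).shuffle(perm)
    out = [row[:] for row in image]
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            # Flatten the block, then write pixels back in permuted order.
            pixels = [image[top + i // block_size][left + i % block_size]
                      for i in range(block_size * block_size)]
            for dst, src in enumerate(perm):
                out[top + dst // block_size][left + dst % block_size] = pixels[src]
    return out


image = [[r * 4 + c for c in range(4)] for r in range(4)]
shuffled = shuffle_blocks(image, block_size=2, key=1234)
```

In a defense of this kind the same keyed transform would be applied to training and test images alike, so a model trained on shuffled inputs sees consistently shuffled inputs at inference time.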
Jain, Harsh, Vikram, Aditya, Mohana, Kashyap, Ankit, Jain, Ayush.  2020.  Weapon Detection using Artificial Intelligence and Deep Learning for Security Applications. 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC). :193—198.
Security is always a main concern in every domain, due to a rise in the crime rate at crowded events and in suspicious isolated areas. Abnormality detection and monitoring are major applications of computer vision for tackling various problems. With growing demand for the protection of safety, security and personal property, video surveillance systems that can recognize and interpret scenes and anomalous events play a vital role in intelligent monitoring. This paper implements automatic gun (or weapon) detection using convolutional neural network (CNN) based SSD and Faster RCNN algorithms. The proposed implementation uses two types of datasets: one had pre-labelled images, and the other is a set of images that were labelled manually. Results are tabulated. Both algorithms achieve good accuracy, but their application in real situations can be based on the trade-off between speed and accuracy.
Cui, W., Li, X., Huang, J., Wang, W., Wang, S., Chen, J..  2020.  Substitute Model Generation for Black-Box Adversarial Attack Based on Knowledge Distillation. 2020 IEEE International Conference on Image Processing (ICIP). :648–652.
Although deep convolutional neural networks (CNNs) perform well in many computer vision tasks, their classification mechanism is very vulnerable to the perturbations of adversarial attacks. In this paper, we propose a new algorithm to generate a substitute model for black-box CNN models by using knowledge distillation. The proposed algorithm distills multiple CNN teacher models into a compact student model that substitutes for the black-box CNN models to be attacked. Black-box adversarial samples can then be generated on this substitute model by using various white-box attack methods. According to our experiments on ResNet18 and DenseNet121, our algorithm boosts the attack success rate (ASR) by 20% by training the substitute model based on knowledge distillation.
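As a rough illustration of distilling several teachers into one student, the sketch below computes a soft-target cross-entropy loss. The abstract does not say how the teacher outputs are combined, so averaging their temperature-softened probabilities is an assumption, as are the names `softmax` and `distillation_loss` and the temperature value.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the maximum for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits_list, temperature=4.0):
    """Cross-entropy between the averaged teacher distribution and the student."""
    teacher_probs = [softmax(t, temperature) for t in teacher_logits_list]
    count = len(teacher_probs)
    # Assumption: teachers are combined by a plain average of their soft outputs.
    target = [sum(p[k] for p in teacher_probs) / count
              for k in range(len(student_logits))]
    student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(target, student))
```

Minimizing this loss over the student's parameters pulls its output distribution toward the teachers' consensus; the resulting student could then serve as the white-box surrogate on which adversarial samples are crafted.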
Bronzin, T., Prole, B., Stipić, A., Pap, K..  2020.  Individualization of Anonymous Identities Using Artificial Intelligence (AI). 2020 43rd International Convention on Information, Communication and Electronic Technology (MIPRO). :1058–1063.

Individualization of anonymous identities using artificial intelligence enables innovative human-computer interaction through the personalization of communication which is, at the same time, individual and anonymous. This paper presents a possible approach for the individualization of anonymous identities in real time. It uses computer vision and artificial intelligence to automatically detect and recognize a person's age group, gender, body measurements, proportions and other specific personal characteristics. The collected data constitute the person's so-called biometric footprint and are linked to a unique (but still anonymous) identity that is recorded in the computer system, along with other information that makes up the profile of the person. Identity anonymization can be achieved by appropriate asymmetric encryption of the biometric footprint (with no additional personal information being stored), and integrity can be ensured using blockchain technology. Data collected in this manner is GDPR compliant.

Hynes, E., Flynn, R., Lee, B., Murray, N..  2020.  An Evaluation of Lower Facial Micro Expressions as an Implicit QoE Metric for an Augmented Reality Procedure Assistance Application. 2020 31st Irish Signals and Systems Conference (ISSC). :1–6.
Augmented reality (AR) has been identified as a key technology to enhance worker utility in the context of increasing automation of repeatable procedures. AR can achieve this by assisting the user in performing complex and frequently changing procedures. Crucial to the success of procedure assistance AR applications is user acceptability, which can be measured by user quality of experience (QoE). An active research topic in QoE is the identification of implicit metrics that can be used to continuously infer user QoE during a multimedia experience. A user's QoE is linked to their affective state. Affective state is reflected in facial expressions. Emotions shown in micro facial expressions resemble those expressed in normal expressions but are distinguished from them by their brief duration. The novelty of this work lies in the evaluation of micro facial expressions as a continuous QoE metric by means of correlation analysis to the more traditional and accepted post-experience self-reporting. In this work, an optimal Rubik's Cube solver AR application was used as a proof of concept for complex procedure assistance. This was compared with a paper-based procedure assistance control. QoE expressed by affect in normal and micro facial expressions was evaluated through correlation analysis with post-experience reports. The results show that the AR application yielded higher task success rates and shorter task durations. Micro facial expressions reflecting disgust correlated moderately to the questionnaire responses for instruction disinterest in the AR application.
Wang Xiao, Mi Hong, Wang Wei.  2010.  Inner edge detection of PET bottle opening based on the Balloon Snake. 2010 2nd International Conference on Advanced Computer Control. 4:56—59.

Edge detection of the bottle opening is a primary component of a machine-vision-based bottle opening detection system. This paper uses the Balloon Snake to extract the opening from PET (polyethylene terephthalate) bottle images sampled on the production lines of rotating bottle-blowing machines. It first uses the grayscale weighted average method to calculate the centroid as the initial position of the Snake, and then extracts the opening based on energy minimization. Experiments show that, compared with conventional edge detection and center location methods, the Balloon Snake is robust and can easily step over weak noise points. The edge extracted through the Balloon Snake is more complete and continuous, which provides a guarantee for correctly judging the opening.
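The grayscale-weighted centroid used to seed the Snake can be sketched as follows. This is a minimal illustration that assumes the image is a nested list of non-negative intensities; the function name `weighted_centroid` is hypothetical, and the Snake's energy minimization itself is not shown.

```python
def weighted_centroid(gray):
    """Return the (x, y) centroid of an image, weighting each pixel by its intensity."""
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for y, row in enumerate(gray):
        for x, value in enumerate(row):
            total += value
            sum_x += x * value
            sum_y += y * value
    if total == 0:
        raise ValueError("image has no non-zero pixels")
    return sum_x / total, sum_y / total
```

For a bright, roughly circular bottle opening on a dark background, this point falls near the center of the opening, which makes it a plausible starting position for an outward-expanding contour.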

Rathi, P., Adarsh, P., Kumar, M..  2020.  Deep Learning Approach for Arbitrary Image Style Fusion and Transformation using SANET model. 2020 4th International Conference on Trends in Electronics and Informatics (ICOEI)(48184). :1049–1057.
For real-time applications of arbitrary style transformation, there is a trade-off between the quality of results and the running time of existing algorithms. Hence, it is required to maintain the equilibrium of the quality of generated artwork with the speed of execution. It's complicated for the present arbitrary style-transformation procedures to preserve the structure of content-image while blending with the design and pattern of style-image. This paper presents the implementation of a network using SANET models for generating impressive artworks. It is flexible in the fusion of new style characteristics while sustaining the semantic-structure of the content-image. The identity-loss function helps to minimize the overall loss and conserves the spatial-arrangement of content. The results demonstrate that this method is practically efficient, and therefore it can be employed for real-time fusion and transformation using arbitrary styles.
Matern, F., Riess, C., Stamminger, M..  2019.  Exploiting Visual Artifacts to Expose Deepfakes and Face Manipulations. 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW). :83—92.
High quality face editing in videos is a growing concern and spreads distrust in video content. However, upon closer examination, many face editing algorithms exhibit artifacts that resemble classical computer vision issues that stem from face tracking and editing. As a consequence, we wonder how difficult it is to expose artificial faces from current generators? To this end, we review current facial editing methods and several characteristic artifacts from their processing pipelines. We also show that relatively simple visual artifacts can be already quite effective in exposing such manipulations, including Deepfakes and Face2Face. Since the methods are based on visual features, they are easily explicable also to non-technical experts. The methods are easy to implement and offer capabilities for rapid adjustment to new manipulation types with little data available. Despite their simplicity, the methods are able to achieve AUC values of up to 0.866.
Amerini, I., Galteri, L., Caldelli, R., Bimbo, A. Del.  2019.  Deepfake Video Detection through Optical Flow Based CNN. 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW). :1205—1207.
Recent advances in visual media technology have led to new tools for processing and, above all, generating multimedia content. In particular, modern AI-based technologies have provided easy-to-use tools to create extremely realistic manipulated videos. Such synthetic videos, named Deep Fakes, may constitute a serious threat, attacking the reputation of public subjects or steering general opinion on a certain event. Accordingly, being able to identify this kind of fake information becomes fundamental. In this work, a new forensic technique able to discern between fake and original video sequences is given; unlike other state-of-the-art methods, which resort to single video frames, we propose the adoption of optical flow fields to exploit possible inter-frame dissimilarities. Such a clue is then used as a feature to be learned by CNN classifiers. Preliminary results obtained on the FaceForensics++ dataset show very promising performance.
Liu, X., Gao, W., Feng, D., Gao, X..  2020.  Abnormal Traffic Congestion Recognition Based on Video Analysis. 2020 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR). :39—42.

The incidence of abnormal road traffic events, especially abnormal traffic congestion, is becoming more and more prominent in daily traffic management in China. Detecting and identifying traffic congestion incidents in time has become the main research task of urban traffic management. Efficient and accurate detection of traffic congestion incidents can provide a good strategy for traffic management. At present, the detection and recognition of traffic congestion events mainly rely on integrating road traffic flow data with the passing data collected by electronic police or checkpoint devices, and then estimating and forecasting road conditions through big data analysis. Such methods often have disadvantages such as low timeliness, low precision and a small prediction range. Therefore, this paper uses the video surveillance equipment that public security and traffic police have built in large and medium-sized cities, and analyzes traffic flow from the video with computer vision technology. The motion state and changing trend of vehicle flow are obtained by deep-learning-based vehicle detection and multi-target tracking, so as to realize the perception and recognition of traffic congestion. The method achieves recognition in real time within 60 seconds, a congestion-event detection rate of more than 80% and a detection accuracy of more than 82.5%. At the same time, it breaks through the restrictions of traditional big data prediction based on traffic flow data, truck pass data and GPS floating car data, and enlarges the scene and scope of detection.

Cao, S., Zou, J., Du, X., Zhang, X..  2020.  A Successive Framework: Enabling Accurate Identification and Secure Storage for Data in Smart Grid. ICC 2020 - 2020 IEEE International Conference on Communications (ICC). :1–6.
Due to malicious eavesdropping, forgery and other risks, it is challenging to handle and store power data collected from the smart grid in a secure manner. Blockchain technology has become a novel method to solve the above problems because of its decentralization and tamper-proof characteristics. It is well known that data stored in a blockchain cannot be changed, so it is vital to find mechanisms that ensure the data are of high quality (namely, that the power data are accurate) before being stored in the blockchain. This will help avoid losses due to low-quality data modification or deletion as needed in smart grid. Thus, we apply parallel vision theory to the identification of meter readings to obtain accurate power data. A cloud-blockchain fusion model (CBFM) is proposed for the storage of accurate power data, allowing flexible transactions to be conducted securely. Only power data calculated by the parallel vision system, rather than the image data originally collected by robot, are stored in the blockchain. Hence, we define quality assurance before data are uploaded to the blockchain and security guarantees after data are stored in the blockchain as a successive framework, which is a brand new solution for managing efficiency and security as a whole for power data and for similar data in other scenarios. Security analysis and performance evaluations are performed, which show that CBFM is highly secure and impressively efficient.
Maram, S. S., Vishnoi, T., Pandey, S..  2019.  Neural Network and ROS based Threat Detection and Patrolling Assistance. 2019 Second International Conference on Advanced Computational and Communication Paradigms (ICACCP). :1—5.

ROS has proven to be a game changer in providing a uniform development platform that seamlessly combines the hardware components and software architectures of various developers across the globe and reduces the complexity of producing robots that help people in their daily ergonomics. It is disappointing to see the lack of penetration of technology in different verticals which involve protection, defense and security. By leveraging the power of ROS in the field of robotic automation and computer vision, this research will pave the way for the identification of suspicious activity with autonomously moving bots which run on ROS. The research paper proposes and validates a flow where ROS and computer vision algorithms like YOLO can work in sync with each other to provide smarter and more accurate methods for indoor and limited outdoor patrolling. Identification of age, gender, weapons and other elements which can disturb public harmony will be an integral part of the research and development process. The simulation and testing reflect the efficiency and speed of the designed software architecture.

Prakash, A., Walambe, R..  2018.  Military Surveillance Robot Implementation Using Robot Operating System. 2018 IEEE Punecon. :1—5.

Robots are becoming more and more prevalent in many real world scenarios. Housekeeping, medical aid, human assistance are a few common implementations of robots. Military and Security are also major areas where robotics is being researched and implemented. Robots with the purpose of surveillance in war zones and terrorist scenarios need specific functionalities to perform their tasks with precision and efficiency. In this paper, we present a model of Military Surveillance Robot developed using Robot Operating System. The map generation based on Kinect sensor is presented and some test case scenarios are discussed with results.

Zhou, Z., Yang, Y., Cai, Z., Yang, Y., Lin, L..  2019.  Combined Layer GAN for Image Style Transfer*. 2019 IEEE International Conference on Computational Electromagnetics (ICCEM). :1—3.

Image style transfer is an increasingly interesting topic in computer vision where the goal is to map images from one style to another. In this paper, we propose a new framework called Combined Layer GAN as a solution to the image style transfer problem. Specifically, an edge constraint and a color constraint are proposed and explored in the GAN-based image translation method to improve the performance. The motivation of the work is that color and edge are fundamental vision factors for an image, while the traditional deep-network-based approach lacks fine control of these factors in the process of translation, and the performance is degraded as a consequence. Our experiments and evaluations show that our novel method with the edge and color constraints is more stable, and significantly improves the performance compared with the traditional methods.

Huang, Y., Jing, M., Tang, H., Fan, Y., Xue, X., Zeng, X..  2019.  Real-Time Arbitrary Style Transfer with Convolution Neural Network. 2019 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA). :65—66.

Style transfer is a research hotspot in computer vision. High-quality style transfer remains a challenge even though much research has been conducted on it. In this work, we propose an algorithm named ASTCNN, a real-time Arbitrary Style Transfer Convolution Neural Network. The ASTCNN consists of two independent encoders and a decoder. The encoders extract style and content features from the style and content images respectively, and the decoder generates the style-transferred image. Experimental results show that ASTCNN achieves higher quality output images than the state-of-the-art style transfer algorithms, and the floating-point computation of ASTCNN is 23.3% less than theirs.

Li, Y., Zhang, T., Han, X., Qi, Y..  2018.  Image Style Transfer in Deep Learning Networks. 2018 5th International Conference on Systems and Informatics (ICSAI). :660–664.

Since Gatys et al. showed that a convolutional neural network (CNN) can be used to generate new images with artistic styles by separating and recombining the styles and contents of images, Neural Style Transfer has attracted wide attention from computer vision researchers. This paper aims to provide an overview of the development of deep learning networks for style transfer. It introduces the classical style transfer models, collects and organizes research on style transfer in deep learning networks, puts forward solutions to the problems gathered during the investigation, and finally displays and compares the style transfer results of some classical models.

Wang, C., He, M..  2018.  Image Style Transfer with Multi-target Loss for IoT Applications. 2018 15th International Symposium on Pervasive Systems, Algorithms and Networks (I-SPAN). :296–299.

Transferring the style of an image is a fundamental problem in computer vision: the features of a context image and a style image are extracted and then combined to produce a new image with features of both input images. In this paper, we introduce an artificial system to separate and recombine the content and style of arbitrary images, providing a neural algorithm for the creation of artistic images. We use a pre-trained deep convolutional neural network, VGG19, to extract the feature maps of the input style image and context image. Then we define a loss function that captures the difference between the output image and the two input images. We use the gradient descent algorithm to update the output image to minimize the loss function. Experimental results show the feasibility of the method.
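A loss of the kind described here is often built from a content term on raw feature maps and a style term on Gram matrices, as in the Gatys et al. formulation. The sketch below follows that common formulation as an assumption; the abstract does not spell out the paper's multi-target loss, and the names `gram_matrix`, `mean_squared_error`, `total_loss` and the weights `alpha` and `beta` are illustrative. Feature maps are represented as a list of channels, each a flat list of activations.

```python
def gram_matrix(features):
    """Channel-by-channel correlations of a feature map, normalized by its size."""
    size = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / size for fj in features]
            for fi in features]


def mean_squared_error(a, b):
    """Mean squared difference between two equally shaped nested lists."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)


def total_loss(output, content, style, alpha=1.0, beta=1000.0):
    """Weighted sum of a content loss and a Gram-matrix style loss."""
    content_loss = mean_squared_error(output, content)
    style_loss = mean_squared_error(gram_matrix(output), gram_matrix(style))
    return alpha * content_loss + beta * style_loss
```

In a full pipeline the three arguments would be VGG19 activations of the output, context and style images, and gradient descent would update the output image's pixels to reduce `total_loss`.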

Goel, A., Agarwal, A., Vatsa, M., Singh, R., Ratha, N..  2019.  DeepRing: Protecting Deep Neural Network With Blockchain. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). :2821—2828.

Several computer vision applications such as object detection and face recognition have started to completely rely on deep learning based architectures. These architectures, when paired with appropriate loss functions and optimizers, produce state-of-the-art results in a myriad of problems. On the other hand, with the advent of "blockchain", the cybersecurity industry has developed a new sense of trust which was earlier missing from both the technical and commercial perspectives. The use of cryptographic hashes as well as symmetric/asymmetric encryption and decryption algorithms ensures security without any human intervention (i.e., centralized authority). In this research, we present the synergy between the best of both these worlds. We first propose a model which uses the learned parameters of a typical deep neural network and is secured from external adversaries by cryptography and blockchain technology. As the second contribution of the proposed research, a new parameter tampering attack is proposed to properly justify the role of blockchain in machine learning.
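The general idea of using hash links to make parameter tampering detectable can be shown with a small sketch. It is not the DeepRing design: it simply assumes each layer's parameters are placed in a block whose SHA-256 digest also covers the previous block's digest. The function names and the block layout are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder digest preceding the first block


def block_digest(index, params, prev):
    """SHA-256 over a canonical JSON encoding of one block's contents."""
    payload = json.dumps({"index": index, "params": params, "prev": prev},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def chain_layers(layers):
    """Wrap each layer's parameters in a block linked to the previous block."""
    blocks = []
    prev = GENESIS
    for index, params in enumerate(layers):
        digest = block_digest(index, params, prev)
        blocks.append({"index": index, "params": params,
                       "prev": prev, "hash": digest})
        prev = digest
    return blocks


def verify_chain(blocks):
    """Recompute every digest; any edited parameter breaks the chain."""
    prev = GENESIS
    for block in blocks:
        expected = block_digest(block["index"], block["params"], prev)
        if block["prev"] != prev or block["hash"] != expected:
            return False
        prev = block["hash"]
    return True
```

Changing a single weight in any block changes that block's digest, so verification fails unless every later block is rewritten as well.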

Vi, Bao Ngoc, Noi Nguyen, Huu, Nguyen, Ngoc Tran, Truong Tran, Cao.  2019.  Adversarial Examples Against Image-based Malware Classification Systems. 2019 11th International Conference on Knowledge and Systems Engineering (KSE). :1—5.

Malicious software, known as malware, has become an urgent and serious threat to computer security, so automatic malware classification techniques have received increasing attention. In recent years, deep learning (DL) techniques for computer vision have been successfully applied to malware classification by visualizing malware files and then using DL to classify the visualized images. Although DL-based classification systems have been proven to be much more accurate than conventional ones, these systems have been shown to be vulnerable to adversarial attacks. However, there has been little research considering the danger of adversarial attacks to visualized image-based malware classification systems. This paper proposes a gradient-based adversarial attack method that attacks image-based malware classification systems by introducing perturbations in the resource section of PE files. The experimental results on the Malimg dataset show that by a small interference, the proposed method can achieve success attack rate when challenging convolutional neural network malware classifiers.
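A gradient-based perturbation restricted to part of the input can be sketched with a single sign-gradient step against a toy logistic classifier. Everything here is an assumption made for illustration: the classifier, the `mask` standing in for the bytes of a PE resource section, and the step size `epsilon`. The paper's actual attack targets CNNs and is not reproduced.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability that x belongs to the positive class under a logistic model."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def masked_sign_gradient_step(weights, bias, x, label, mask, epsilon):
    """Move only the masked inputs by epsilon in the direction that raises the loss."""
    p = predict(weights, bias, x)
    # For binary cross-entropy, the gradient with respect to the input is (p - label) * w.
    grad = [(p - label) * w for w in weights]
    adversarial = []
    for value, g, allowed in zip(x, grad, mask):
        step = epsilon * ((g > 0) - (g < 0)) if allowed else 0.0
        # Keep values in the valid normalized pixel range [0, 1].
        adversarial.append(min(1.0, max(0.0, value + step)))
    return adversarial
```

Positions where `mask` is 0 are left untouched, mirroring the constraint that only a designated region of the file may be modified.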

Marrone, Stefano, Sansone, Carlo.  2019.  An Adversarial Perturbation Approach Against CNN-based Soft Biometrics Detection. 2019 International Joint Conference on Neural Networks (IJCNN). :1–8.
The use of biometric-based authentication systems has spread to everyday consumer electronics. Over the years, researchers' interest shifted from hard biometrics (such as fingerprints, voice and keystroke dynamics) to soft biometrics (such as age, ethnicity and gender), mainly by using the latter to improve the effectiveness of authentication systems. While newer approaches are constantly being proposed by domain experts, in recent years Deep Learning has risen in many computer vision tasks, also becoming the current state of the art for several biometric approaches. However, since the automatic processing of data rich in sensitive information (e.g. gender or ethnicity) could expose users to privacy threats associated with its unfair use, researchers have recently started to focus on the development of defensive strategies with a view to a more secure and private AI. The aim of this work is to exploit Adversarial Perturbation, namely approaches able to mislead state-of-the-art CNNs by injecting a suitable small perturbation over the input image, to protect subjects against unwanted soft biometrics-based identification by automatic means. In particular, since ethnicity is one of the most critical soft biometrics, as a case study we focus on the generation of adversarial stickers that, once printed, can hide a subject's ethnicity in a real-world scenario.
Gamba, Matteo, Azizpour, Hossein, Carlsson, Stefan, Björkman, Mårten.  2019.  On the Geometry of Rectifier Convolutional Neural Networks. 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW). :793—797.

While recent studies have shed light on the expressivity, complexity and compositionality of convolutional networks, the real inductive bias of the family of functions reachable by gradient descent on natural data is still unknown. By exploiting symmetries in the preactivation space of convolutional layers, we present preliminary empirical evidence of regularities in the preimage of trained rectifier networks, in terms of arrangements of polytopes, and relate it to the nonlinear transformations applied by the network to its input.

Kakadiya, Rutvik, Lemos, Reuel, Mangalan, Sebin, Pillai, Meghna, Nikam, Sneha.  2019.  AI Based Automatic Robbery/Theft Detection using Smart Surveillance in Banks. 2019 3rd International conference on Electronics, Communication and Aerospace Technology (ICECA). :201—204.

Deep learning is the segment of artificial intelligence concerned with imitating the learning approach that human beings use to acquire different types of knowledge. Video analysis, one application of deep learning, has been one of the most basic problems of computer vision and multimedia content analysis for at least 20 years. The job is very challenging, as video contains a lot of information with large variations and difficulties. Human supervision is still required in all surveillance systems. New advancements in computer vision, which are seen as an important trend in video surveillance, lead to dramatic efficiency gains. We propose CCTV-based theft detection along with tracking of thieves. We use image processing to detect theft and the motion of thieves in CCTV footage, without the use of sensors. This system concentrates on object detection. Security personnel can be notified about a suspicious individual committing burglary through real-time analysis of human movement in CCTV footage, giving them a chance to avert it.

Feng, Ri-Chen, Lin, Daw-Tung, Chen, Ken-Min, Lin, Yi-Yao, Liu, Chin-De.  2019.  Improving Deep Learning by Incorporating Semi-automatic Moving Object Annotation and Filtering for Vision-based Vehicle Detection*. 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC). :2484—2489.

Deep learning has undergone tremendous advancements in computer vision studies. The training of deep learning neural networks depends on a considerable amount of ground truth datasets. However, labeling ground truth data is a labor-intensive task, particularly for large-volume video analytics applications such as video surveillance and vehicles detection for autonomous driving. This paper presents a rapid and accurate method for associative searching in big image data obtained from security monitoring systems. We developed a semi-automatic moving object annotation method for improving deep learning models. The proposed method comprises three stages, namely automatic foreground object extraction, object annotation in subsequent video frames, and dataset construction using human-in-the-loop quick selection. Furthermore, the proposed method expedites dataset collection and ground truth annotation processes. In contrast to data augmentation and data generative models, the proposed method produces a large amount of real data, which may facilitate training results and avoid adverse effects engendered by artifactual data. We applied the constructed annotation dataset to train a deep learning you-only-look-once (YOLO) model to perform vehicle detection on street intersection surveillance videos. Experimental results demonstrated that the accurate detection performance was improved from a mean average precision (mAP) of 83.99 to 88.03.

Bashir, Muzammil, Rundensteiner, Elke A., Ahsan, Ramoza.  2019.  A deep learning approach to trespassing detection using video surveillance data. 2019 IEEE International Conference on Big Data (Big Data). :3535—3544.
Railroad trespassing is a dangerous activity with significant security and safety risks. However, regular patrolling of potential trespassing sites is infeasible due to exceedingly high resource demands and personnel costs. This raises the need to design automated trespass detection and early warning prediction techniques leveraging state-of-the-art machine learning. To meet this need, we propose a novel framework for Automated Railroad Trespassing detection System using video surveillance data called ARTS. As the core of our solution, we adopt a CNN-based deep learning architecture capable of video processing. However, these deep learning-based methods, while effective, are known to be computationally expensive and time consuming, especially when applied to a large volume of surveillance data. Leveraging the sparsity of railroad trespassing activity, ARTS corresponds to a dual-stage deep learning architecture composed of an inexpensive pre-filtering stage for activity detection, followed by a high fidelity trespass classification stage employing deep neural network. The resulting dual-stage ARTS architecture represents a flexible solution capable of trading-off accuracy with computational time. We demonstrate the efficacy of our approach on public domain surveillance data achieving 0.87 f1 score while keeping up with the enormous video volume, achieving a practical time and accuracy trade-off.
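The dual-stage idea, a cheap pre-filter that gates an expensive classifier, can be sketched as follows. Frame differencing as the pre-filter, the threshold value, and the names `mean_abs_diff` and `two_stage_detect` are assumptions for illustration; the abstract does not describe how ARTS implements either stage. Frames are flat lists of pixel intensities, and `classify` is a caller-supplied stand-in for the deep network.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average absolute pixel difference between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def two_stage_detect(frames, classify, motion_threshold=10.0):
    """Run the costly classifier only on frames that pass a cheap activity check."""
    flagged = []
    for index in range(1, len(frames)):
        # Stage 1: inexpensive activity detection by frame differencing.
        if mean_abs_diff(frames[index - 1], frames[index]) < motion_threshold:
            continue
        # Stage 2: high-fidelity classification, reached only for active frames.
        if classify(frames[index]):
            flagged.append(index)
    return flagged
```

Because trespassing is sparse, most frames would be rejected in stage 1, and raising or lowering `motion_threshold` is one way to trade accuracy against computation time.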
Adari, Suman Kalyan, Garcia, Washington, Butler, Kevin.  2019.  Adversarial Video Captioning. 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :24—27.
In recent years, developments in the field of computer vision have allowed deep learning-based techniques to surpass human-level performance. However, these advances have also culminated in the advent of adversarial machine learning techniques, capable of launching targeted image captioning attacks that easily fool deep learning models. Although attacks in the image domain are well studied, little work has been done in the video domain. In this paper, we show it is possible to extend prior attacks in the image domain to the video captioning task, without heavily affecting the video's playback quality. We demonstrate our attack against a state-of-the-art video captioning model, by extending a prior image captioning attack known as Show and Fool. To the best of our knowledge, this is the first successful method for targeted attacks against a video captioning model, which is able to inject 'subliminal' perturbations into the video stream, and force the model to output a chosen caption with up to 0.981 cosine similarity, achieving near-perfect similarity to chosen target captions.
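The cosine similarity used to report how close an output caption is to the target can be computed as in this minimal sketch. It assumes both captions have already been mapped to numeric vectors of equal length; how the paper embeds captions is not stated in the abstract, and the function name is illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)
```

A value of 1 means the two vectors point in the same direction, so a score of 0.981 indicates an output caption whose representation is very nearly aligned with that of the chosen target.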