Abstract
Eye movement tracking and analysis have gained popularity as critical tools for evaluating interfaces and visual displays. Existing tools and methods for analyzing scan paths and eye movements are largely limited in the range of tasks they support; in particular, they have proven ineffective at tracking large volumes of highly variable data. Eye movement analysis has been used in psychology, medical research, advertising, and cognitive science. Today, CCTV cameras can record human gaze, eye movement, and blinking with relative ease and high reliability. Since eye movement is primarily a combination of voluntary and involuntary cognitive processes, analysts must interpret a subject's appearance and movements carefully, as these are what is used to infer the subject's behavior and intentions.
Introduction
Automated analysis of still and moving CCTV images has gained interest among public and private law enforcement agencies as a way to replace limited and fallible human operators and enhance visual surveillance. Visual analytics draws on extensive research from machine learning and from analytical disciplines such as pattern recognition and statistics. The main goal of these methods is to enable human operators to understand eye movements and the cognitive processes that underlie them (Faux & Luthon, 2012). This paper is an extensive examination of visual analysis methods. It draws on the available literature to investigate the approaches taken by technologists and evaluation groups in creating eye movement analysis methods suited to CCTV camera data from criminal environments.
Globally, the CCTV surveillance camera industry is expected to exceed $30 billion by 2016, driven largely by the rapid proliferation of CCTV installations in critical infrastructure and public places such as underground train stations, shopping centers, and airports. This growth has accelerated demand for automated methods to process CCTV camera output. Human operators cannot effectively monitor large numbers of camera feeds, let alone do so for long periods, which has generated significant research attention on computer vision-based systems that compensate for these human limitations. Such systems employ intelligent tools and methods that filter events of interest so that operators can focus on specific individuals moving through shopping areas, streets, and airports. Events of interest are those defined by users as having implications for the security and safety of members of the public. Normally, deployed CCTV cameras send images in real time to a control room that must be supervised by human operators, and the captured images may be archived and retrieved later when they are needed as evidence in a criminal trial.
The data used in eye tracking analysis mainly consist of records that describe the positions and times of gaze fixations. Each record contains a user identifier, fixation duration, time, and a position represented in display space by x and y coordinates; records may also carry attributes such as a stimulus identifier, and an ordered sequence of fixations forms a scan path. In eye movement analysis, one must be careful to distinguish eye movement from the actual geographical movement of the subject. Recent studies on eye tracking fall into three broad categories: detection of the eyes; extraction of eye features such as the visible contours of the eyeball region, the circular areas of the pupil and iris, and the location of the pupil; and a combination of eye detection and feature extraction (Andrienko et al., 2012). Real-time eye analysis demands that an operator be able to detect the eye and, at the same time, acquire the information needed for analysis. Regardless of the category or method used, however, image pre-processing conditions such as noise reduction, natural jitter, and camera resolution contribute to the success of the tracking and analysis steps, and the image background also strongly affects their overall quality.
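As a minimal sketch of such a record (the field names below are illustrative, not taken from any particular tool), a fixation and the scan path built from a series of fixations could be represented as follows:

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Fixation:
    # One gaze fixation record with the attributes described above.
    user_id: str                        # identifier of the observed user
    timestamp_ms: int                   # time at which the fixation began
    duration_ms: int                    # how long the gaze rested here
    x: float                            # horizontal position in display space
    y: float                            # vertical position in display space
    stimulus_id: Optional[str] = None   # optional stimulus identifier

def scan_path(records: List[Fixation]) -> List[Fixation]:
    # A scan path is simply the fixations of one user ordered by time.
    return sorted(records, key=lambda r: r.timestamp_ms)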
Eye tracking and analysis methods in security camera systems are broadly classified as supervised or unsupervised. Supervised methods make use of machine learning algorithms and may be knowledge-based or feature-based, the latter also known as pattern matching. Knowledge-based methods encode human knowledge of the features of a typical face in software rules that capture the relationships between different facial features and ultimately detect the position of the eyes for analysis. Such methods therefore use face localization to find the eyes and their positions relative to other facial features, as in the sketch below. The problem with this approach is that it is difficult to translate human knowledge accurately into well-defined rules for a computer system that detects and analyzes the human eye (Adams & Ferryman, 2015).
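As an illustrative sketch only (the face detector, the placeholder file name cctv_frame.png, and the geometric fractions are assumptions, not rules taken from the cited literature), a knowledge-based localization step might encode the heuristic that the eyes lie in the upper part of a detected frontal face:

import cv2

def estimate_eye_regions(face_box):
    # Encode a simple rule of thumb: the eyes sit roughly a third of the way
    # down a frontal face, to the left and right of the vertical midline.
    x, y, w, h = face_box                           # face bounding box (pixels)
    eye_row = y + int(0.30 * h)                     # top of the eye band
    eye_h = int(0.20 * h)                           # height of the eye band
    left_eye = (x + int(0.15 * w), eye_row, int(0.30 * w), eye_h)
    right_eye = (x + int(0.55 * w), eye_row, int(0.30 * w), eye_h)
    return left_eye, right_eye

# Example usage with OpenCV's stock frontal-face Haar cascade.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
frame = cv2.imread("cctv_frame.png", cv2.IMREAD_GRAYSCALE)
for face in cascade.detectMultiScale(frame, scaleFactor=1.1, minNeighbors=5):
    left_eye, right_eye = estimate_eye_regions(face)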
The pattern matching method is primarily feature-based: it tracks the features of the eyes and compares them with a template already stored in the system. The eye features used in this kind of method are the areas around the pupil, iris, and sclera, and the template is used to detect whether the subject's eyes are open or closed. Analyzing the eye movement pattern is important in determining whether the subject under review is alert. This method has the advantage of being relatively easier to implement than machine learning, but it is limited in that it cannot adequately handle complications caused by the shape, pose, and scale of the subject. Blinking and eye state are determined solely by examining the correlation between the input image and the template stored in the computer system (Murawski & Różanowski, 2013), as sketched below. This approach suits security officers concerned with the alertness of a driver on the highway, since it indicates whether the driver's eyes are open and alert.
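A minimal sketch of such a correlation check, assuming two stored grayscale templates (the file names and the 0.6 acceptance threshold are illustrative assumptions, not values from the cited work):

import cv2

open_tmpl = cv2.imread("open_eye_template.png", cv2.IMREAD_GRAYSCALE)
closed_tmpl = cv2.imread("closed_eye_template.png", cv2.IMREAD_GRAYSCALE)

def classify_eye(eye_region):
    # Correlate the eye region against both templates and keep the best match.
    scores = {}
    for label, tmpl in (("open", open_tmpl), ("closed", closed_tmpl)):
        # Normalized cross-correlation: 1.0 would be a perfect match.
        result = cv2.matchTemplate(eye_region, tmpl, cv2.TM_CCOEFF_NORMED)
        scores[label] = float(result.max())
    best = max(scores, key=scores.get)
    return best if scores[best] > 0.6 else "unknown"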
Recent developments
CCTV cameras that capture facial images for evidential purposes in a court of law face significant challenges around resolution and degradation, which may result from noise or occlusion. These issues reduce the quality of the captured image, making it difficult to detect and analyze the eyes. For this reason, intensive research has recently been carried out to develop CCTV cameras with super resolution so that the eyes can be identified accurately; such cameras capture multiple images of a subject, making it easier to detect and analyze eye features. Further developments include hallucination-based methods that reconstruct facial, and hence eye, positions through inference: a high-resolution facial image is derived from a low-resolution frame with the aid of large libraries of similar high-resolution facial images. Once camera resolution is addressed, the work of human operators will shift to finding better ways of identifying the eyes of criminal suspects from multiple images and scenes (Adams & Ferryman, 2015).
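The following is a highly simplified sketch of the example-based idea behind such hallucination methods, assuming a library of paired low- and high-resolution face patches (the patch size, scale factor, and nearest-neighbour matching are illustrative simplifications; published methods add priors and blending):

import numpy as np

def hallucinate(low_res_face, library, patch=8, scale=4):
    # library: list of (low_patch, high_patch) pairs, where low_patch is
    # patch x patch and high_patch is (patch*scale) x (patch*scale).
    h, w = low_res_face.shape
    out = np.zeros((h * scale, w * scale), dtype=np.uint8)
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            query = low_res_face[y:y + patch, x:x + patch].astype(np.float32)
            # Substitute the high-res patch whose low-res partner best matches.
            best_lo, best_hi = min(
                library,
                key=lambda pair: np.sum((pair[0].astype(np.float32) - query) ** 2))
            out[y * scale:(y + patch) * scale,
                x * scale:(x + patch) * scale] = best_hi
    return out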
Although automated CCTV cameras currently appear capable of operating adequately in security monitoring settings where existing algorithms perform well, security officials, in collaboration with system developers, have embarked on new projects intended to respond effectively to real-world eye tracking and analysis situations. In particular, recent research has tended to focus on integrating cognitive systems with CCTV cameras, thereby building a database of information through learning and association. This approach draws on the premise that perception results from action (Hassaballah & Aly, 2015), the implication being that a perception-action feedback cycle is developed for cognitive systems using artificial intelligence techniques. The main areas in this approach are recognition and categorization, goal specification and achievement, knowledge representation, and reasoning about events and structure. Future cognitive vision systems will be capable of acquiring information and of applying accumulated knowledge to help security officers make decisions about the eye features being analyzed. They will also be able to recognize the eye features of a target subject across multiple scenes and model effective ways of analyzing the most important features of the eyes. Building cognitive vision systems will involve integrating cues from different scenes for one subject and training the system so that the learned cues can be used to identify the eyes of a criminal suspect in another scene by pattern matching.
Typically, professionals working on eye tracking and analysis projects face increasing needs to track the global motion of the eyes, maintain invariance to movements of the pupils, and distinguish between open-eye and closed-eye frames. Accordingly, an effective future eye movement tracking and analysis algorithm needs to handle these challenges. Existing studies point to two main methods for eye blink detection and tracking. The first involves estimating the location of the eyes through a tracking algorithm that identifies differences between open and closed eyes, with template matching normally used to determine eye openness; a sketch of this idea follows below. When these methods are used to determine the position, openness, and alertness of the subject, security officers can make real-time judgments about a criminal suspect or a driver behaving carelessly on the highway (Hassaballah & Aly, 2015). Future research needs to focus more on detecting eyes from reconstructed facial images, because in real-life environments high-quality images cannot always be obtained and hardware development of high-resolution cameras is slow. This limitation can be addressed by interpolating the available images to identify the position and state of the eyes, with the caveat that any such inference remains an intelligent estimate rather than a certainty. Despite these challenges, eye movement detection and analysis for CCTV cameras remains fundamental to maintaining the security of a nation.
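A minimal sketch of frame-to-frame blink detection under stated assumptions (the eye-region crops are pre-aligned and equally sized, and the change threshold of 25.0 is an illustrative value that would be tuned per camera):

import cv2
import numpy as np

def detect_blinks(eye_frames, threshold=25.0):
    # eye_frames: consecutive grayscale eye-region crops of identical size.
    blinks = 0
    closed = False
    for prev, curr in zip(eye_frames, eye_frames[1:]):
        change = np.mean(cv2.absdiff(prev, curr))  # mean per-pixel change
        if change > threshold and not closed:      # large change: eye closes
            closed = True
        elif change > threshold and closed:        # next large change: reopens
            blinks += 1
            closed = False
    return blinks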
References
Adams, A. A., & Ferryman, J. M. (2015). The future of video analytics for surveillance and its ethical implications. Security Journal, 28(3), 272-289.
Andrienko, G., Andrienko, N., Burch, M., & Weiskopf, D. (2012). Visual analytics methodology for eye movement studies. IEEE Transactions on Visualization and Computer Graphics, 18(12), 2889-2898.
Faux, F., & Luthon, F. (2012). Theory of evidence for face detection and tracking. International Journal of Approximate Reasoning, 53(5), 728-746. doi:10.1016/j.ijar.2012.02.002
Hassaballah, M., & Aly, S. (2015). Face recognition: Challenges, achievements and future directions. IET Computer Vision, 9(4), 614-626. doi:10.1049/iet-cvi.2014.0084
Murawski, K., & Różanowski, K. (2013). Pattern recognition algorithm for eye tracker sensor video data analysis. Acta Physica Polonica A, 124(3).