Understanding People Behavior from Video

JUL 11, 2017 Manager, Rajesh Anantharaman

Current State of Enterprise Video

Enterprises install large numbers of closed-circuit television (CCTV) cameras for video surveillance. As of 2016, an estimated 350 million surveillance cameras were installed worldwide. These cameras generate a huge amount of video around the clock, and typically a human security officer watches the feeds for criminal, suspicious or safety-related behavior. There are multiple problems with the current approach:

1. Manual monitoring is expensive and prone to human error. A 1999 study (Green, 1999) found that after 20 minutes, guards watching a video scene will miss up to 95 percent of all activity.
2. The video is stored in its raw format to ensure that no significant information is lost, which requires large investments in storage and video management infrastructure. With the growth of high-definition video, storage requirements keep going up.
3. Surveillance video is primarily used for post-analysis: looking for evidence of suspected activities during investigations or for lawsuits. Using video feeds to take action in real time is not commonplace.

Applying Deep Learning to Understand Behavior from Video

Deep learning is a class of machine learning that is loosely based on how the brain works. With the explosion of compute capacity and the availability of large data sets, deep learning has been shown to be very powerful at finding patterns in unstructured data such as images, text, speech and video. Deep learning is rapidly replacing traditional computer vision and natural language processing techniques because of its accuracy, which surpasses human accuracy in many cases.

Deep learning models are built using deep neural networks, which consist of multiple layers of neurons connected together in a specific architecture. The network is trained by feeding it examples of the data you want it to learn (ex: cats, dogs, faces). Intuitively, each layer of the network learns different types of patterns: the first layer learns to find edges, the second layer learns to find object parts, the next layer learns to find objects, and so on. Each higher layer learns more and more complicated combinations of patterns relevant to the training data set.
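To make the "first layer learns to find edges" intuition concrete, here is a minimal sketch in plain Python. It is an illustration only: the kernel weights are written by hand, whereas a trained network would learn them from data, and the names conv2d_valid, relu, IMAGE and VERTICAL_EDGE are invented for this example.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with no padding.

    Both arguments are lists of lists. As in most deep learning libraries,
    the kernel is not flipped (strictly speaking this is cross-correlation).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Keep positive responses and zero out the rest."""
    return [[max(0, v) for v in row] for row in feature_map]


# Toy 4x6 image: dark (0) on the left half, bright (1) on the right half.
IMAGE = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

# Hand-written kernel that responds to dark-to-bright vertical edges.
VERTICAL_EDGE = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

edge_map = relu(conv2d_valid(IMAGE, VERTICAL_EDGE))
print(edge_map)  # [[0, 3, 3, 0], [0, 3, 3, 0]]
```

The response is non-zero only where the window straddles the dark-to-bright boundary, which is the kind of low-level pattern an early layer picks up. Deeper layers combine many such responses into object parts and objects.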

There are many deep neural network architectures, which differ in how the neurons in the network are connected. Convolutional neural networks (CNNs) are a popular architecture for understanding images. Recurrent neural networks (RNNs), and specifically long short-term memory networks (LSTMs), are popular architectures for finding patterns in sequential data. For video, a combination of CNNs and LSTMs is used to extract the spatial patterns from each frame as well as the temporal patterns across a sequence of frames.
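The CNN-plus-LSTM combination can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not the architecture of any particular product: the "CNN" is a single convolution layer with two hand-written kernels followed by global average pooling, the LSTM weights are random and untrained, and all names (frame_features, LSTMCell, encode_clip, make_frame, KERNELS) are hypothetical. A real system would use a deep learning framework and learn every weight from labeled video.

```python
import math
import random


def conv2d_valid(image, kernel):
    """2D sliding-window filter with no padding (kernel is not flipped)."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(len(image[0]) - kw + 1)
        ]
        for r in range(len(image) - kh + 1)
    ]


def frame_features(frame, kernels):
    """Spatial step: convolution + ReLU + global average pooling per kernel."""
    features = []
    for kernel in kernels:
        fmap = conv2d_valid(frame, kernel)
        activations = [max(0.0, v) for row in fmap for v in row]
        features.append(sum(activations) / len(activations))
    return features


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """Temporal step: a standard LSTM cell with random, untrained weights."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        n = input_size + hidden_size
        self.hidden_size = hidden_size
        # One weight matrix and bias vector per gate:
        # i = input, f = forget, o = output, g = candidate cell values.
        self.weights = {
            gate: [[rng.uniform(-0.5, 0.5) for _ in range(n)]
                   for _ in range(hidden_size)]
            for gate in "ifog"
        }
        self.biases = {gate: [0.0] * hidden_size for gate in "ifog"}

    def _gate(self, name, xh, activation):
        return [
            activation(sum(w * v for w, v in zip(row, xh)) + b)
            for row, b in zip(self.weights[name], self.biases[name])
        ]

    def step(self, x, h, c):
        xh = x + h  # concatenate the frame features with the previous hidden state
        i = self._gate("i", xh, sigmoid)
        f = self._gate("f", xh, sigmoid)
        o = self._gate("o", xh, sigmoid)
        g = self._gate("g", xh, math.tanh)
        c_next = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_next = [ok * math.tanh(ck) for ok, ck in zip(o, c_next)]
        return h_next, c_next


def encode_clip(frames, kernels, cell):
    """Run the spatial step on each frame and feed the results through the LSTM."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    for frame in frames:
        h, c = cell.step(frame_features(frame, kernels), h, c)
    return h  # a fixed-size summary of the clip, ready for a classifier


def make_frame(bar_col, size=5):
    """Toy frame: a bright vertical bar at column `bar_col` on a dark background."""
    return [[1.0 if c == bar_col else 0.0 for c in range(size)]
            for _ in range(size)]


KERNELS = [
    [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],   # responds to vertical edges
    [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],   # responds to horizontal edges
]

clip = [make_frame(col) for col in range(5)]   # the bar moves left to right
cell = LSTMCell(input_size=len(KERNELS), hidden_size=4)
clip_embedding = encode_clip(clip, KERNELS, cell)
```

The division of labor is the point of the sketch: frame_features looks only at a single frame, while the LSTM state carries information from one frame to the next. In practice the final hidden state would be passed to a classification layer that outputs behavior labels.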

What if deep learning could be applied to find behavioral patterns in video surveillance data? The benefits would be manifold:

1. Automatically identifying behavior patterns directly from the video feed can reduce the need for manual monitoring, saving cost and reducing the potential for human error.
2. Selective videos with only meaningful behaviors can be stored and managed, greatly reducing storage infrastructure needs.
3. Identifying behaviors in real-time can enable fast response to those behaviors.
4. Additional patterns can be extracted from the sequence of behaviors to predict outcomes or ensure compliance.
5. Deep learning technology continuously learns from video data to distinguish finer and finer details related to the behavior.
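As a rough illustration of benefits 2 and 3, the sketch below takes per-clip predictions from a behavior recognition model and decides which clips to store and which to raise as real-time alerts. Everything in it is an assumption made for the example: the ClipPrediction fields, the behavior labels, the alert list and the 0.6 threshold are hypothetical, and the model that produces the predictions is not shown.

```python
from dataclasses import dataclass


@dataclass
class ClipPrediction:
    clip_id: str        # hypothetical identifier for a short video segment
    behavior: str       # label produced by the (assumed) recognition model
    confidence: float   # model confidence between 0 and 1


BACKGROUND = "no_activity"               # hypothetical label for empty scenes
ALERT_BEHAVIORS = {"fighting", "theft"}  # hypothetical behaviors needing fast response


def triage(predictions, keep_threshold=0.6):
    """Return (clip ids to store, clip ids to alert on)."""
    to_store, to_alert = [], []
    for p in predictions:
        # Skip background clips and low-confidence detections.
        if p.behavior == BACKGROUND or p.confidence < keep_threshold:
            continue
        to_store.append(p.clip_id)
        if p.behavior in ALERT_BEHAVIORS:
            to_alert.append(p.clip_id)
    return to_store, to_alert


predictions = [
    ClipPrediction("cam1-0001", "no_activity", 0.98),
    ClipPrediction("cam1-0002", "fighting", 0.91),
    ClipPrediction("cam2-0001", "walking", 0.75),
    ClipPrediction("cam2-0002", "theft", 0.40),
]

stored, alerts = triage(predictions)
print(stored)  # ['cam1-0002', 'cam2-0001']
print(alerts)  # ['cam1-0002']
```

Note the trade-off the threshold introduces: the low-confidence "theft" clip is neither stored nor alerted on. A deployment that cannot afford to miss such events would likely keep low-confidence clips for human review instead of discarding them.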

Enterprise Behavior Recognition

Samsung SDS has developed a pioneering platform that leverages deep learning technology to understand behavior from video for enterprise use cases.

Understanding behavior automatically from video data enables a variety of use cases with high business value for multiple industry verticals. Below are some such use case examples:

Law Enforcement:

Automatic monitoring of surveillance cameras to identify violent and criminal behavior (fighting, theft, drug dealing, etc.)
Detect suspicious behavior to predict and prevent adverse events
Identify safety-related behaviors in real-time to allow faster response

Retail:

Determine customer intent based on their behavior in order to improve clienteling and tailor the customer experience (ex: customer needs help, in a hurry, window shopping, strong product interest, etc.)
Improve customer service by recognizing interactions between salespeople and shoppers and predicting customer satisfaction
Automatic monitoring of surveillance cameras to identify security-related anomalous behavior (shoplifting, pickpocketing, mugging, etc.)

Healthcare:

Identify activities performed by hospital staff in order to verify compliance with procedures
Detect changes or emergency situations in patients’ health to respond quickly and speed patients’ recovery

Manufacturing:

Determine activities performed by factory operators in order to ensure compliance with procedures
Improve quality control by identifying patterns that lead to different outcomes (ex: pass or fail)
Improve employee health & safety by monitoring activities and ensuring repetitive activities are done in a safe manner
Uncover ways to improve production throughput by analyzing the sequence of activities performed on the assembly line

Samsung SDS’ enterprise video intelligence platform uses cutting-edge deep learning technology to enable multiple enterprise behavior recognition use cases. We aim to help enterprises unlock the value in their video data, and we are looking to work closely with customers to apply our technology. For more information about this platform, please contact us at bd.sdsa@samsung.com.

Manager, Rajesh Anantharaman IT Technology

Samsung SDS Research in America (SDSRA)

Rajesh Anantharaman is a part of SDSA’s Artificial Intelligence team based in Silicon Valley. He is focused on helping solve customer problems and bringing new AI-based solutions to market, including Samsung SDS’ enterprise video intelligence platform.