Lights, camera, safety?

On April 15, two young men dropped their backpacks on the ground near the Boston Marathon finish line and walked away. Security cameras captured their actions, but no one paid much attention at the time. Each bag contained a pressure-cooker bomb packed with nails, ball bearings and black powder. Metal fragments ripped through the crowd at an estimated 3,300 feet per second, killing three people and injuring 267 others.

Set aside the whys (religion, politics, insanity, whatever) and focus instead on this:

What if a software program could have sifted through all the surveillance videos in real time, then immediately alerted the police to the suspicious act, the abandoned backpacks?

A team of LSU researchers led by Supratik Mukhopadhyay, an assistant professor in LSU’s Computer Science Department, has designed a system to do just that.

“This kind of technology has become very important in the wake of Boston. If you can detect those guys putting their backpacks down and leaving,” Mukhopadhyay said, his voice trailing off.

The researchers call their invention an integrated automated video activity recognition system. The system identifies what Mukhopadhyay calls “anomalous behaviors” or “activities of interest.” That can be anything out of the ordinary — loitering, leaving a car in a bank parking lot at 2 a.m. or dumping a backpack on a crowded sidewalk.

In the system’s eyes, walking is OK. Running or carrying a package may not be, he said. And some things, like running out of a store, almost nobody does unless a crime is involved.

The system looks for unusual activities and tries to infer if additional anomalous behaviors will take place, Mukhopadhyay said.

For example, a person wearing a ski mask and carrying a crowbar runs into a store. The system recognizes the probability of that being anything innocent is low.

Mukhopadhyay said the system can capitalize on the ubiquity of video cameras, which can be found everywhere from intersections and parking garages to businesses and cellphones. All of those cameras capture millions of terabytes of data each year (a terabyte is about a trillion bytes, or a 1 followed by 12 zeros).

But most of the video data is never used to generate usable intelligence in real time, Mukhopadhyay said. The only time the videos are analyzed is after the fact, and that won’t bring back the people who died.

Even if law enforcement had enough people to work around the clock analyzing video, it’s very likely they would miss something crucial, he said. Humans have limited attention spans.

But the researchers’ system doesn’t have that problem, he said. The system is limited only by the computing power available.

To make the video recognition system work, the researchers used an artificial intelligence technique known as “deep learning” and one of their own invention, which they call “agile learning,” Mukhopadhyay said.

Deep learning programs mimic the brain’s neural connections. An article published Nov. 23 in The New York Times said the technique had recently resulted in startling advances in a number of fields, including computer vision and speech recognition. These pattern-recognition breakthroughs offer the promise of “machines that converse like humans,” drive cars or work in factories, the article says.

The LSU researchers came up with agile learning while teaching their computers to recognize different activities.

Machine learning, or a system that can learn from data, has two essential problems, Mukhopadhyay said. One is that the computer learns only what is in the data, although that can involve an enormous amount of information.

Second, machine learning suffers from concept drift, he said. By the time the machine learns a particular concept, the concept itself may have changed.
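To make concept drift concrete, here is a minimal sketch of one common way to notice it: track how often a classifier's recent predictions turn out wrong, and raise a flag when the error rate climbs. This is a generic heuristic offered for illustration, not the LSU team's method; the `DriftMonitor` class, the window size and the threshold are all assumptions.

```python
from collections import deque


class DriftMonitor:
    """Hypothetical helper: flags possible concept drift from recent errors."""

    def __init__(self, window=5, max_error_rate=0.4):
        # Keep only the most recent `window` outcomes (True means a mistake).
        self.recent = deque(maxlen=window)
        self.max_error_rate = max_error_rate

    def record(self, predicted, actual):
        """Store one prediction outcome and report whether drift is suspected."""
        self.recent.append(predicted != actual)
        return self.drift_suspected()

    def drift_suspected(self):
        # Wait until the window is full so a single early error is not a signal.
        if len(self.recent) < self.recent.maxlen:
            return False
        error_rate = sum(self.recent) / len(self.recent)
        return error_rate > self.max_error_rate


monitor = DriftMonitor(window=5, max_error_rate=0.4)

# Five correct predictions: the learned concept still matches the data.
stable = [monitor.record("walking", "walking") for _ in range(5)]

# The scene changes and the old model starts mislabeling what it sees.
drifting = [monitor.record("walking", "running") for _ in range(3)]
```

In this toy run the flag stays down while predictions are correct and goes up only once mistakes make up more than 40 percent of the last five outcomes. A real system would then retrain or switch models.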

The LSU team solved those problems by combining agents that implement different classifiers — basically computer systems that can be trained to perform a task — with machine learning, he said. The combination allows the system to adapt to rapidly changing environments.

Think of a car traveling along a road. The background may change from light to dark and back again.

If the system is trained only to recognize a car against a light background, it can’t recognize the car when it enters an area with a dark background, Mukhopadhyay said. To handle both backgrounds, two agents are needed, and the system must learn to switch agents as the background changes.
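The idea of switching agents as the background changes can be sketched in a few lines. This is a toy illustration under stated assumptions, not the researchers' implementation: frames are small grids of grayscale values from 0 to 255, and `ThresholdAgent`, `mean_brightness` and `pick_agent` are hypothetical names invented for the example. Each agent expects one background level, and the selector hands each frame to whichever agent's expectation is closest to the frame's average brightness.

```python
from statistics import mean


def mean_brightness(frame):
    """Average pixel intensity of a grayscale frame given as rows of ints."""
    return mean(pixel for row in frame for pixel in row)


class ThresholdAgent:
    """Toy agent: counts pixels that stand out from its expected background."""

    def __init__(self, name, background_level, tolerance=40):
        self.name = name
        self.background_level = background_level
        self.tolerance = tolerance

    def detect(self, frame):
        # A pixel counts as part of an object if it differs from the
        # expected background by more than the tolerance.
        return sum(
            1
            for row in frame
            for pixel in row
            if abs(pixel - self.background_level) > self.tolerance
        )


def pick_agent(frame, agents):
    """Choose the agent whose expected background best matches the frame."""
    brightness = mean_brightness(frame)
    return min(agents, key=lambda a: abs(a.background_level - brightness))


light_agent = ThresholdAgent("light", background_level=200)
dark_agent = ThresholdAgent("dark", background_level=30)
agents = [light_agent, dark_agent]

# A 4x4 daylight frame with a two-pixel dark "car".
day_frame = [[200, 200, 200, 200],
             [200, 90, 90, 200],
             [200, 200, 200, 200],
             [200, 200, 200, 200]]

# The same scene at night: dark background, brighter "car".
night_frame = [[30, 30, 30, 30],
               [30, 150, 150, 30],
               [30, 30, 30, 30],
               [30, 30, 30, 30]]

day_choice = pick_agent(day_frame, agents)
night_choice = pick_agent(night_frame, agents)
```

With the right agent selected, each frame yields two object pixels. If the light-background agent is forced onto the night frame, it flags all 16 pixels, which is the failure the single-agent case describes.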

So far, the researchers have taught the system to recognize a few dozen activities, including walking, running, shadowboxing, digging and waving, he said. It takes a few seconds for the system to recognize the difference between waving and shadowboxing.

But the researchers are constantly teaching their system to recognize more activities, Mukhopadhyay said.

Mukhopadhyay and his team, PhD candidates Robert DiBiano, Manohar Karki and Saikat Basu, and undergrad Malcolm Stagg, began developing the system in 2010 to win a grant from the U.S. Department of Defense. The project has since drawn funding from the Louisiana Transportation Research Center, which wants to apply the technology to traffic management.

Mukhopadhyay said the system could guide motorists to alternate routes in case of accidents or traffic jams, or even control traffic lights to move pedestrians and vehicles more efficiently.

Altogether the research has drawn around $880,000 in funding.

Along the way, the team also realized that some of their techniques could be used on satellite imagery, and the researchers began collaborating with NASA, Mukhopadhyay said.

NASA wanted to measure carbon sequestration, but the agency’s system couldn’t tell the difference between trees and grass, he said. The LSU researchers’ system can.

Basu is now interning at the NASA Ames Research Center in Moffett Field, Calif., to further that research.

And Mukhopadhyay said the team is still exploring other applications.

The system could direct drivers to empty spaces in crowded parking lots or garages, he said. The system could tell a golfer how his stroke compares to an expert’s, then provide tips for improvement; the same holds true for swimming.

Mukhopadhyay said the system can also detect the kinds of things a person records on a cellphone, then target ads to the person’s areas of interest or connect the person with others who have similar interests.

Peter Kelleher, who directs LSU’s Office of Intellectual Property, Commercialization and Development, said the school is in talks with a number of companies about licensing.

The potential market for surveillance alone appears nearly limitless, Kelleher said. How many people and companies have video cameras to monitor properties or activities?

Kelleher said so far no firm has signed a licensing agreement, but these kinds of deals always take longer than he would like.

One reason is that all the licensing discussions have involved early-stage companies, so lots of people have to weigh in before a decision is made, Kelleher said. Another reason is that LSU wants to license the system to as many firms as possible.

This means figuring out if there are differences between what each company wants to do with the system, Kelleher said. If one company is doing something different from another, LSU can license the system to both.