Crowd Sourcing Train Detection Data Analytics

3/2/2021

NoCo Train Alert needs your brain cells and creativity. The north camera has been doing a reasonable job detecting trains during the day. Train detection at night has been problematic. ("Night time" meaning "after dark.") We're seeking ideas for identifying patterns in data, generated by image analysis, between non-train suspects and real trains - with the goal of improving overall train detection accuracy. The specific request will be laid out after some basics of how things work are established.

How Train Detection Works
The train detection portion of Train Alert is a Python program that uses OpenCV - an open source library of image analysis software. The north camera (a Raspberry Pi with integrated camera) reads 2-3 images per second. After an image is read, OpenCV identifies what, if anything, is "new" in the image (something is deemed "new" if it wasn't seen in the previous 3-5 images). The size and location of the new item are specified via the coordinates of a rectangle drawn around the object.

OpenCV uses 4 integers to define a rectangle.

x = # of pixels top left corner of rectangle is from the image's left edge
y = # of pixels top left corner of rectangle is from the image's top edge
w = width of rectangle; measured in pixels
h = height of rectangle; measured in pixels
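
As a minimal sketch of this convention, the four integers can be held in a small Python structure. The `Rect` name, its helper properties, and the sample numbers are illustrative assumptions and are not taken from the Train Alert code. OpenCV functions such as `cv2.boundingRect` return a tuple in this same (x, y, w, h) order.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Bounding rectangle in the (x, y, w, h) convention described above."""
    x: int  # pixels from the image's left edge to the top-left corner
    y: int  # pixels from the image's top edge to the top-left corner
    w: int  # width of the rectangle in pixels
    h: int  # height of the rectangle in pixels

    @property
    def right(self) -> int:
        """x coordinate of the rectangle's right edge."""
        return self.x + self.w

    @property
    def bottom(self) -> int:
        """y coordinate of the rectangle's bottom edge."""
        return self.y + self.h


# Hypothetical rectangle anchored at the left edge of the image.
r = Rect(x=0, y=120, w=200, h=45)
print(r.right, r.bottom)  # 200 165
```

Because y is measured from the top of the image, a larger y means the object sits lower in the frame.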

When something new and interesting (i.e. relatively large in size) is reported in a frame, Train Alert reads two additional frames and records the resulting 8 values (4 for each frame). At this point three frames have been processed, and 3 sets of x, y, w and h values have been collected.

Now it is time to decide whether these 12 collected values are indicative of a train or not. A new and interesting object (i.e. a train suspect) appears in the frame very often - frequently numerous times per minute. Obviously trains don't appear that frequently. Leaves blowing in the wind, changes in light due to clouds, a car driving through the frame - these and other situations can all be reported as "new" in the frame. The challenge is to separate the non-train chaff from what is really of interest: a train.

Here is one very clear example of how a train presents itself during a daytime detection scenario. The train is traveling left-to-right (northbound) in this data set:

Key things to notice in the data:

The object width w is growing with each new frame read.
The top-left corner of the object is at the left edge of the image in every frame read.
The distance of the object from the top of the image (y) and height (h) of the object both vary to some degree - but not dramatically.

It is easy to see that the above data represents a scenario where a rectangle appears on the left side of the frame, gets progressively wider with each frame, and stays roughly constant in height, all while remaining anchored at the left edge.

Daytime trains traveling in the opposite direction present similarly clear data:

An increasing rectangle width w from frame to frame.
A decreasing x value with each frame as the object progresses from right-to-left in the frame.
Similar relatively-constant values of y and h.

For contrast the data below represents one (of many) daytime non-train examples:

Note how all values (in this example) show high variance. Not all non-train scenarios, however, are as easy to weed out as the one above; some are more subtle. Through data gathering and trial and error we've determined that if all of the following conditions are true, we can virtually guarantee a train is in the frame:

Rectangle width w gets progressively larger from Frame 1 to Frame 2 to Frame 3
Extreme spread (difference between min and max values) for both y and h is less than 10 pixels
x values either (1) are all 0 or (2) get progressively smaller from Frame 1 to Frame 2 to Frame 3
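
The three conditions above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the actual Train Alert code: the function names are hypothetical, "progressively larger/smaller" is read as strictly increasing/decreasing, and the sample frames are made-up values rather than collected data.

```python
from typing import Sequence, Tuple

Frame = Tuple[int, int, int, int]  # (x, y, w, h) for one frame

MAX_SPREAD = 10  # pixels; the y and h spreads must both be below this


def extreme_spread(values: Sequence[int]) -> int:
    """Difference between the max and min values."""
    return max(values) - min(values)


def strictly_increasing(values: Sequence[int]) -> bool:
    return all(a < b for a, b in zip(values, values[1:]))


def strictly_decreasing(values: Sequence[int]) -> bool:
    return all(a > b for a, b in zip(values, values[1:]))


def looks_like_daytime_train(frames: Sequence[Frame]) -> bool:
    """Apply the three daytime conditions to a set of frames."""
    xs, ys, ws, hs = (list(column) for column in zip(*frames))
    width_grows = strictly_increasing(ws)
    steady = (extreme_spread(ys) < MAX_SPREAD
              and extreme_spread(hs) < MAX_SPREAD)
    x_ok = all(x == 0 for x in xs) or strictly_decreasing(xs)
    return width_grows and steady and x_ok


# Made-up frames for illustration only (not real Train Alert data).
left_to_right = [(0, 100, 50, 40), (0, 102, 120, 42), (0, 101, 200, 41)]
right_to_left = [(500, 100, 60, 40), (400, 101, 150, 41), (300, 99, 250, 43)]
noise = [(210, 40, 30, 90), (15, 300, 80, 20), (400, 150, 25, 60)]

for name, frames in [("left_to_right", left_to_right),
                     ("right_to_left", right_to_left),
                     ("noise", noise)]:
    print(name, looks_like_daytime_train(frames))
```

With these sample values the first two sets pass all three conditions and the third fails on width alone.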

Nighttime Train Detection
Daylight, literally, enables clear vision and, consequently, data that is clear and straightforward to interpret. At night circumstances change substantially. Overall the software has more difficulty seeing changes in scenes primarily due to reduced lighting. Some other aspects of the camera installation make nighttime train detection challenging:

The camera is a significant distance away from the train track.
The track is not perpendicular to the camera. Trains traveling left-to-right (northbound) are moving slightly away from the camera. Track angle is approximately 110-120 degrees relative to the camera's line of sight.
Conversely (and helpfully) trains traveling right-to-left are angled slightly toward the camera. The train's headlamps, in this scenario, make it easier for the camera to identify movement.

Over recent weeks numerous 3-frame sets of nighttime data have been collected. These data sets include trains traveling in both directions, various other "vehicle" scenarios (e.g. trucks driving in camera view in the middle of the night) and a myriad of other non-train/non-vehicle scenarios. Trains traveling right-to-left in the frame are quite predictable. Establishing an algorithm to capture these scenarios does not appear to be particularly challenging.

The Problem
Nighttime trains traveling left-to-right have been particularly difficult to identify. As stated previously, the train is moving away from the camera so it is not possible to see the train's headlamps, which results in the camera not having a clear indication of change/movement in the frame. As a result, data representing trains in this scenario do not look like train data generated for other scenarios.

We are seeking help in identifying patterns/tests that can be used to identify nighttime left-to-right (northbound) trains. When looking at non-train and train data, what are those unique characteristics or combination of characteristics that enable a train to be identified while ignoring non-train anomalies?

This is some of the data we've collected for both train and non-train scenarios.

One metric used in other scenarios (such as daytime detection) is Extreme Spread (ES). An example of the kind of observation being sought: It appears a left-to-right train is present when all 4 measurements have a very low ES. This method, however, proved inadequate when the last Non-Train data set was collected. We're asking people to look at and play with this data to see if you can identify tests we can do in the software to reliably distinguish between train and non-train data.
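
For anyone who wants to experiment with ES on the data, here is a small Python sketch of the "all four measurements have low ES" observation. The function names, the 10-pixel threshold, and the sample frames are assumptions for illustration; as noted above, this test alone has already proved inadequate, so treat it as a starting point for exploration.

```python
from typing import Dict, Sequence, Tuple

Frame = Tuple[int, int, int, int]  # (x, y, w, h) for one frame


def extreme_spread(values: Sequence[int]) -> int:
    """Difference between the largest and smallest value."""
    return max(values) - min(values)


def spreads(frames: Sequence[Frame]) -> Dict[str, int]:
    """Extreme spread of each measurement across the frames."""
    xs, ys, ws, hs = zip(*frames)
    return {
        "x": extreme_spread(xs),
        "y": extreme_spread(ys),
        "w": extreme_spread(ws),
        "h": extreme_spread(hs),
    }


def all_low_spread(frames: Sequence[Frame], threshold: int = 10) -> bool:
    """True when every measurement's ES is below the (assumed) threshold."""
    return all(es < threshold for es in spreads(frames).values())


# Made-up frames for illustration only (not real Train Alert data).
quiet = [(0, 100, 50, 40), (2, 101, 52, 41), (1, 99, 51, 40)]
jumpy = [(210, 40, 30, 90), (15, 300, 80, 20), (400, 150, 25, 60)]

print(spreads(quiet))         # {'x': 2, 'y': 2, 'w': 2, 'h': 1}
print(all_low_spread(quiet))  # True
print(all_low_spread(jumpy))  # False
```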

The above data can be accessed either by downloading this Excel file or by accessing this public Google Sheet. (Please copy the data to another Google Sheet prior to manipulating it so others can get clean copies of the data.) FYI: We've generally found the best detection accuracy occurs when multiple train-positive events occur in a data set - such as growing rectangle width AND changing x measurement, for example - although clearly that particular test will not positively identify a train in the above data. The key is to find those unique ways of looking at the data and its patterns that filter out the noise but recognize a train.
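
One way to play with the "multiple train-positive events" idea is to evaluate several simple indicators on a 3-frame set and count how many fire. The sketch below is only an assumption about how that might look: the indicator names, the 10-pixel threshold, and the sample frames are hypothetical, and which indicators actually matter at night is exactly the open question.

```python
from typing import Dict, Sequence, Tuple

Frame = Tuple[int, int, int, int]  # (x, y, w, h) for one frame


def indicators(frames: Sequence[Frame]) -> Dict[str, bool]:
    """Evaluate a few candidate train-positive indicators on three frames."""
    xs, ys, ws, hs = zip(*frames)
    return {
        "width_grows": ws[0] < ws[1] < ws[2],
        "x_changes": len(set(xs)) > 1,
        "y_steady": max(ys) - min(ys) < 10,  # assumed threshold
        "h_steady": max(hs) - min(hs) < 10,  # assumed threshold
    }


def score(frames: Sequence[Frame]) -> int:
    """Number of indicators that are true for this 3-frame set."""
    return sum(indicators(frames).values())


# Made-up frames for illustration only (not real Train Alert data).
moving = [(500, 100, 60, 40), (400, 101, 150, 41), (300, 99, 250, 43)]
static = [(0, 100, 50, 40), (0, 101, 50, 41), (0, 99, 50, 40)]

print(score(moving), indicators(moving))
print(score(static), indicators(static))
```

Comparing scores between known train and known non-train sets in the spreadsheet may hint at which combinations are worth turning into tests.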

THANK YOU for helping with this! If you have observations or suggestions please either leave a comment below or send email to [email protected].

Jim
