Status quo

Currently, the output of the "2. Match an image to each point" step (assign_images.py) keeps the sampled points as the unit of observation.
This is reflected by a few design choices we've made (see the sketch after this list):
The output geodataframe/GeoPackage file has the points as the rows.
We attempt to match as many sampled points as possible to images.
We don't allow multiple sampled points to map to the same image. If an image has already been "claimed", a point will try to find another close (but slightly farther) image.
The primary geometry of the Point features is still the geolocation of the sampled points, not the geolocation of the images.
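For concreteness, here is a minimal sketch of that current output shape. The file name and column names (point_id, matched_image_id) are assumptions for illustration, not the actual schema produced by assign_images.py.

```python
# Illustrative sketch of the current output shape (column names assumed,
# not taken from assign_images.py): one row per sampled point, with the
# point's own location as the geometry and the matched image as an attribute.
import geopandas as gpd
from shapely.geometry import Point

current_output = gpd.GeoDataFrame(
    {
        "point_id": [0, 1],
        # Each image can be "claimed" by at most one point.
        "matched_image_id": ["img_a", "img_b"],
    },
    geometry=[Point(-122.41, 37.77), Point(-122.42, 37.78)],  # point locations, not image locations
    crs="EPSG:4326",
)
current_output.to_file("matched_points.gpkg", driver="GPKG")
```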
Proposed change
I propose that the output should instead have the images as the unit of observation, with the geolocation of the images as the primary geometry of the geospatial dataset.
The sampled points are, in a sense, imaginary. We provided roads that we want to analyze, and from those roads we sampled points, but there isn't actually any data associated with those points. The real data is associated with the imagery and is physically located at the images' geolocations.
If multiple points have the same closest image, we probably just care about that one image. It doesn't seem worth reaching for a farther image just to keep the number of matched images equal to the (arbitrary) number of sampled points.
We should think of the "matching" step as more like a "spatial query": given a dataset of street-level imagery, we are querying a subset of that imagery based on the intersection with a set of evenly spaced points along roads we care about.
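As a rough illustration of what such a spatial query could look like with geopandas, here is a sketch built on sjoin_nearest. The toy data, column names, CRS, and 25 m threshold are assumptions for illustration, not the current behavior of assign_images.py.

```python
# Minimal sketch of the proposed "spatial query" matching, assuming
# GeoDataFrames of sampled road points and street-level image locations.
import geopandas as gpd
from shapely.geometry import Point

# Toy inputs in a projected CRS so distances below are in meters.
points = gpd.GeoDataFrame(
    {"point_id": [0, 1, 2]},
    geometry=[Point(0, 0), Point(0, 10), Point(0, 200)],
    crs="EPSG:3857",
)
images = gpd.GeoDataFrame(
    {"image_id": ["img_a", "img_b"]},
    geometry=[Point(0, 2), Point(0, 8)],
    crs="EPSG:3857",
)

MAX_DIST_M = 25  # discard matches farther than this from a sampled point

# For each sampled point, find its nearest image within the threshold.
matched = gpd.sjoin_nearest(
    points, images, how="inner", max_distance=MAX_DIST_M, distance_col="dist_m"
)

# The unit of observation becomes the image: keep each matched image once,
# with the image's own location as the primary geometry.
result = images[images["image_id"].isin(matched["image_id"])]
result.to_file("matched_images.gpkg", driver="GPKG")
```

Under this framing, a point whose nearest image is already matched to another point simply contributes nothing new, and a point with no image within the threshold is dropped rather than matched to something far away.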
This change would have the following interactions with, and implications for, these open issues:
There is probably a bug in local matching, based on the case shown in the issue. This change makes the process simpler and potentially less bug-prone: we query for the closest image to each point, then discard duplicates and any image beyond a distance threshold.