I built a browser-based tool that uses Vision-Language Models (VLMs) to detect objects in satellite imagery via natural language prompts. Draw a polygon on the map, type what you want to find (e.g., "swimming pools," "oil tanks," "solar panels"), and the system scans tile-by-tile, projecting bounding boxes back onto the globe as GeoJSON.
The pipeline: pick zoom level + prompt → slice map into mercantile tiles → feed each tile + prompt to VLM → create bounding boxes → project to WGS84 coordinates → render on map.
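The "project to WGS84" step in that pipeline can be sketched in pure Python using the standard slippy-map tile math (no `mercantile` dependency needed for this part). This is a minimal illustration under assumed conventions — 256 px tiles, detector boxes given as `(x0, y0, x1, y1)` in tile-pixel coordinates — the actual implementation details are not in the post:

```python
import math

TILE_SIZE = 256  # pixels per tile, the usual slippy-map convention

def tile_pixel_to_lnglat(tx, ty, zoom, px, py):
    """Convert a pixel position inside a Web Mercator tile (tx, ty, zoom)
    to WGS84 longitude/latitude in degrees."""
    n = 2 ** zoom
    # Normalized [0, 1] position across the whole world map.
    x = (tx * TILE_SIZE + px) / (n * TILE_SIZE)
    y = (ty * TILE_SIZE + py) / (n * TILE_SIZE)
    lon = x * 360.0 - 180.0
    # Inverse Web Mercator: latitude is nonlinear in y.
    lat = math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * y))))
    return lon, lat

def bbox_to_geojson(tx, ty, zoom, box):
    """Project a detection bbox (x0, y0, x1, y1 in tile pixels, y down)
    to a GeoJSON polygon Feature."""
    x0, y0, x1, y1 = box
    west, north = tile_pixel_to_lnglat(tx, ty, zoom, x0, y0)
    east, south = tile_pixel_to_lnglat(tx, ty, zoom, x1, y1)
    return {
        "type": "Feature",
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [west, south], [east, south], [east, north],
                [west, north], [west, south],
            ]],
        },
        "properties": {},
    }
```

For example, a box covering the entire single tile at zoom 0 projects to the full Web Mercator extent: longitudes −180 to 180 and latitudes ±85.0511°.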
No login required for the demo. Works well for distinct structures zero-shot; struggles with dense/occluded objects where narrow YOLO models still win.
Dupe yesterday with ~50 points and ~20 comments:
https://news.ycombinator.com/item?id=47305979
That submission is actually from tomorrow (check the submission IDs and hover over the submission time on this one). An accidental SCP fail.
They've submitted basically the same thing 4 times: https://news.ycombinator.com/submitted?id=eyasu6464
I assume this is a marketing strategy. It feels a bit dishonest.
Neat. I was recently wondering if there was a way to find houses in my area that had roof-top solar, just to get an idea of how common it was.
Since this post is a dupe, here's a video demonstrating a similar but different app I made: https://www.youtube.com/watch?v=EjH0kMEz4YY
Find me large outcroppings of gold, or gold particles in tree canopies please
did you just make Danti?