I downloaded a bounding box around Berlin and calculated some coverage stats.
Categories – 22.5 % of places don’t have any category
Restaurants and cuisines
Distribution of the confidence values
Brands
Supermarkets
Here’s what to expect with regard to category and brand attributes for some popular supermarket brands:
keyword  category_main                       brand_name     count
ALDI     discount_store                      NaN                1
ALDI     food_beverage_service_distribution  NaN                1
ALDI     grocery_store                       ALDI Nord        182
ALDI     grocery_store                       NaN               33
ALDI     shopping                            NaN                1
ALDI     supermarket                         NaN                4
ALDI     wholesale_grocer                    NaN                1
ALDI     NaN                                 NaN              136
EDEKA    discount_store                      NaN                2
EDEKA    grocery_store                       EDEKA              5
EDEKA    grocery_store                       NaN               61
EDEKA    retail                              NaN                2
EDEKA    shopping                            EDEKA              1
EDEKA    shopping                            NaN               13
EDEKA    shopping_center                     NaN                2
EDEKA    supermarket                         EDEKA              8
EDEKA    supermarket                         NaN               92
EDEKA    NaN                                 NaN              188
REWE     business                            NaN                1
REWE     convenience_store                   REWE To Go         1
REWE     flea_market                         NaN                1
REWE     gas_station                         REWE To Go         6
REWE     grocery_store                       REWE               2
REWE     grocery_store                       REWE To Go         2
REWE     grocery_store                       NaN               10
REWE     shopping                            REWE               1
REWE     shopping                            NaN                3
REWE     supermarket                         REWE             212
REWE     supermarket                         toom Baumarkt      1
REWE     supermarket                         NaN               11
REWE     NaN                                 NaN               41
dtype: int64
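For anyone who wants to reproduce this kind of tally without the notebook, here is a minimal sketch using only the Python standard library. It is not the notebook's code: the record layout, the sample rows, and the case-insensitive substring match on the place name are all assumptions made for illustration. Only the column names `category_main` and `brand_name` come from the output above.

```python
from collections import Counter

# Hypothetical sample records. The keys mirror the columns in the table
# above, but the record layout itself is an assumption.
places = [
    {"name": "ALDI Nord", "category_main": "grocery_store", "brand_name": "ALDI Nord"},
    {"name": "Aldi", "category_main": None, "brand_name": None},
    {"name": "REWE To Go", "category_main": "gas_station", "brand_name": "REWE To Go"},
    {"name": "Bäckerei Beispiel", "category_main": "bakery", "brand_name": None},
]

KEYWORDS = ("ALDI", "EDEKA", "REWE")


def tally_by_keyword(places, keywords=KEYWORDS):
    """Count places per (keyword, category_main, brand_name) combination.

    A place matches a keyword when the keyword occurs anywhere in its name,
    ignoring case (an assumed matching rule). Missing values stay as None.
    """
    counts = Counter()
    for place in places:
        name = (place.get("name") or "").upper()
        for keyword in keywords:
            if keyword in name:
                key = (keyword, place.get("category_main"), place.get("brand_name"))
                counts[key] += 1
    return counts


counts = tally_by_keyword(places)

# Sort on the string form of each key so that None values do not break sorting.
for key, n in sorted(counts.items(), key=lambda kv: tuple(str(x) for x in kv[0])):
    print(*key, n)
```

Keeping the missing values as their own group is what makes the gaps visible: the rows with no category or no brand are the interesting ones here.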
Feel free to fork the notebook to run your own analyses.
So, from that one small spot: the real estate agent is mapped out on the footpath and has the wrong agency name; the convenience store appears under two variations of its current name, one in the correct spot and one out in the street, and also a third time as IGA, which it stopped being ~4 years ago; The Facialist closed down ~3 years ago; and some of the business names just don’t exist.
Even “better”, just up a bit they have Sea World Gold Coast marked: Overture Places
Sea World is one of the biggest theme parks in all of Australia, but is about 20km north of here, not in a suburban back yard!
In regard to the question of reusing this info in OSM, another thing I noticed is lots of “work from home” businesses shown, often with names attached to them.
I happen to know that hardly any (if any?) of these businesses have signs posted outside saying that this business is here.
In this case, we’d have to be very careful with what we copy, & as always, it shouldn’t be done without first verifying what is on the ground!
I have seen similar effects in other datasets. Usually it is the result of importing an official business register, full of such businesses without actual offices.
A “short list” consensus (initial distillation of quality w.r.t. the .alpha-0) is that it is a relative shit-show.
It is .alpha-0. I’ve been on teams (Apple application/OS software, Adobe system software, Santa Cruz Operation Xenix…) that make such (very early versions of) things, and in retrospect, they are “not usually particularly great.”
It’s the first “pickle out of the barrel.” We (OSM) might offer patience, we might say “do better,” we might openly mock and ridicule (I’m not), we might encourage patience and better results in future endeavors.
We might also “double down” and continue (as @hfs , @SimonPoole , and others have) to “tear it open and see how it bleeds.” I’m in the latter camp, not necessarily doing the tearing, but certainly munching popcorn as others do. Thanks to everyone who tears this stuff apart.
To be fair, the Overture Maps data also provides the confidence of a POI:
confidence:
description: The confidence of the existence of the place. It's a number between 0 and 1. 0 means that we're sure that the place doesn't exist (anymore). 1 means that we're sure that the place exists. If there's no value for the confidence, it means that we don't have any confidence information.
Many of the POIs in question have a very low confidence value. POIs with low confidence should probably be filtered out, so that only POIs with sufficiently high confidence are used.
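A filter along those lines could look roughly like this. It is only a sketch: the `confidence` field and its 0-to-1 range come from the schema description quoted above, but the record layout, the sample values, the 0.5 threshold, and the choice to drop POIs without a confidence value are all assumptions that would need tuning against real data.

```python
def filter_by_confidence(pois, threshold=0.5, keep_unknown=False):
    """Return the POIs whose confidence is at least `threshold`.

    POIs without a confidence value carry no information either way, so
    they are dropped unless `keep_unknown` is set (an assumed default).
    """
    kept = []
    for poi in pois:
        confidence = poi.get("confidence")
        if confidence is None:
            if keep_unknown:
                kept.append(poi)
            continue
        if confidence >= threshold:
            kept.append(poi)
    return kept


# Hypothetical sample records, not taken from the actual dataset.
pois = [
    {"name": "Example Bakery", "confidence": 0.95},
    {"name": "Example Home Office", "confidence": 0.12},
    {"name": "Example Kiosk"},  # no confidence information
]

likely_real = filter_by_confidence(pois)
print([poi["name"] for poi in likely_real])
```

Whatever threshold is chosen, it would still need checking against places that are known to exist or not exist locally.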
By the way, the color on this map indicates the category of the POI.
In my opinion this stuff should only be used as input for a StreetComplete or EveryDoor quest that asks the mapper “Is shop/gas station/barber/… at this location?”, leaving everything else up to the mapper to add. And if the mapper says no, remove the entry from the quest.
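To make the idea concrete, here is a rough sketch of such a question queue. It does not use the StreetComplete or EveryDoor code or APIs; the class, its methods, and the sample candidate are all hypothetical. The only behaviour taken from the suggestion above is that a candidate is turned into a yes/no question, and a "no" removes it.

```python
from dataclasses import dataclass, field


@dataclass
class QuestQueue:
    """Pending "is this place here?" questions, keyed by a candidate id."""

    pending: dict = field(default_factory=dict)
    confirmed: list = field(default_factory=list)

    def add_candidate(self, candidate_id, name, category):
        # Only a hint is stored; nothing is copied into the map itself.
        self.pending[candidate_id] = {"name": name, "category": category}

    def question(self, candidate_id):
        candidate = self.pending[candidate_id]
        return (
            f"Is there a {candidate['category']} called "
            f"'{candidate['name']}' at this location?"
        )

    def answer(self, candidate_id, exists):
        # Either way the question is closed. A "no" discards the candidate;
        # a "yes" only records it so the mapper can survey and add the details.
        candidate = self.pending.pop(candidate_id)
        if exists:
            self.confirmed.append(candidate)


queue = QuestQueue()
queue.add_candidate("c1", "Example Barber", "barber")
print(queue.question("c1"))
queue.answer("c1", exists=False)
```

The point of the design is that the external data never reaches the map directly: it only decides which questions get asked.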
Yowza, OSM-community. Thank you to all who tear this apart. I am delighted to see this sort of “eat the red meat” we have been thrown. We are up to the task. Good discussion can only help; good for us.
The main issue I see is, we know the POI data is junk (and we actually knew that before it was released because we’ve been on the receiving end of complaints about the low quality Facebook data for years), and anybody that actually inspects the data will realize that it is junk, but that has nothing to do with outside perception.
Not just the Facebook fanbois and shills, but the media have decided that this is a wonderful thing and the last thing they will do is point to OSM as the better data source.
Google is not much better. Too many street names are wrong, and they are left wrong even after photographic evidence has been sent. So you get routed on Waze to the wrong location, or it can’t find a location. How POIs can carry a different street name from the one right outside the doorstep is a riddle. OSM QA will sure as heck let you know.
Having worked with MapWithAI pulling in MS building footprints, 99.99% require correcting, so where one sees buildings, they are a token representation or a figment of the shadow hunters. The data is dated at that, with many newer buildings missing.
Without some sort of filtering, I doubt that it is even worth doing that with it. Locally to me (in England) there are about 3 times as many invalid POIs as valid ones, and (very small sample but a rough guess) maybe 30% of POIs missing, though that might be a process or category issue on Overture Maps’ side.
As Simon said above, OSM has had numerous reports about the problems with Facebook’s data (whether used in FB, Instagram or elsewhere), as reports to the help site, forums like this and to the DWG, where we pretty much created a special reporting category for it.
Lots of unverifiable work from home, or where the business is registered to receive mail. On this subject one of the bowling clubs is shown at a residential address which is probably where the secretary lives rather than on the bowling green and clubhouse where OSM has it correctly placed.
Outdated POIs, seeing both previous businesses and current at the same place.
Some seriously misplaced such as a school that is in a village several miles away.
One that looks like it could be a GDPR breach, the name of a lady on her address with nothing more to indicate a type of business. I know her and thought she works in a local cafe.
I see the same here in Austria. Recently two people (or was it one) mapped hundreds of POIs in the area, the ones you find on shady yellow-pages sites and, from the looks of it, in Overture now.
They are a pain to weed out. Maybe that is the idea?