Overture Maps first dataset release

I downloaded a bounding box around Berlin and calculated some coverage stats.

  • Categories – 22.5 % of places don’t have any category
  • Restaurants and cuisines
  • Distribution of the confidence values
  • Brands
  • Supermarkets

Here’s what to expect with regard to the category and brand attributes for some popular supermarket brands:

keyword  category_main                       brand_name
ALDI     discount_store                      NaN                1
         food_beverage_service_distribution  NaN                1
         grocery_store                       ALDI Nord        182
                                             NaN               33
         shopping                            NaN                1
         supermarket                         NaN                4
         wholesale_grocer                    NaN                1
         NaN                                 NaN              136
EDEKA    discount_store                      NaN                2
         grocery_store                       EDEKA              5
                                             NaN               61
         retail                              NaN                2
         shopping                            EDEKA              1
                                             NaN               13
         shopping_center                     NaN                2
         supermarket                         EDEKA              8
                                             NaN               92
         NaN                                 NaN              188
REWE     business                            NaN                1
         convenience_store                   REWE To Go         1
         flea_market                         NaN                1
         gas_station                         REWE To Go         6
         grocery_store                       REWE               2
                                             REWE To Go         2
                                             NaN               10
         shopping                            REWE               1
                                             NaN                3
         supermarket                         REWE             212
                                             toom Baumarkt      1
                                             NaN               11
         NaN                                 NaN               41
dtype: int64
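
For readers who want to reproduce a tally like the one above without opening the notebook, here is a minimal standard-library sketch. It is not the notebook's code: the sample rows, the `KEYWORDS` tuple, and the case-insensitive substring match on the place name are all assumptions made for illustration. Missing values are kept as `None` so that they show up in the counts, the way `NaN` does in the table.

```python
from collections import Counter

# Hypothetical sample rows: (name, main category, brand name).
# None marks a missing value (shown as NaN in the table above).
places = [
    ("ALDI Nord", "grocery_store", "ALDI Nord"),
    ("Aldi", "grocery_store", None),
    ("REWE", "supermarket", "REWE"),
    ("REWE To Go", "gas_station", "REWE To Go"),
    ("EDEKA Center", None, None),
    ("Example Bakery", "bakery", None),
]

# Assumption: a place belongs to a keyword if its name contains it.
KEYWORDS = ("ALDI", "EDEKA", "REWE")


def keyword_for(name):
    """Return the first keyword contained in the name (case-insensitive), or None."""
    upper = name.upper()
    for keyword in KEYWORDS:
        if keyword in upper:
            return keyword
    return None


def tally(rows):
    """Count rows per (keyword, category, brand), keeping missing values as None."""
    counts = Counter()
    for name, category, brand in rows:
        keyword = keyword_for(name)
        if keyword is not None:
            counts[(keyword, category, brand)] += 1
    return counts


def share_without_category(rows):
    """Fraction of all rows whose category is missing."""
    return sum(1 for _, category, _ in rows if category is None) / len(rows)


counts = tally(places)
# Sort on string versions of the keys so that None can be compared with str.
for key, n in sorted(counts.items(), key=lambda item: tuple(str(v) for v in item[0])):
    print(key, n)
print(f"{share_without_category(places):.1%} of sample rows have no category")
```

In pandas, the equivalent grouping would be a `groupby` over the three columns with `dropna=False`, followed by `.size()`.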

Feel free to fork the notebook to run your own analyses.

Use the Azure links provided on GitHub, such as https://overturemapswestus2.blob.core.windows.net/release/2023-07-26-alpha.0/theme=places

It still needs credentials.

The AWS CLI with --region us-west-2 is working for me (> 200 GB), without any credentials:

aws s3 cp --region us-west-2 --no-sign-request --recursive s3://overturemaps-us-west-2/release/2023-07-26-alpha.0/ <DESTINATION>

with aws-cli 2.13:

$ aws --version
aws-cli/2.13.3 Python/3.11.4 Linux/5.4.0-155-generic exe/x86_64.ubuntu.20 prompt/off
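
Since the full release is over 200 GB, it may be worth fetching a single theme only. The sketch below just builds the same `aws s3 cp` command as above from Python and prints it. It assumes that the S3 bucket uses the same `theme=...` prefix layout as the Azure URL quoted earlier; the helper name `build_download_command` and the destination `./overture` are made up for this example.

```python
import shlex

BUCKET = "overturemaps-us-west-2"
RELEASE = "2023-07-26-alpha.0"


def build_download_command(destination, theme=None):
    """Build the anonymous `aws s3 cp` call, optionally limited to one theme.

    Assumption: themes live under a `theme=<name>/` prefix inside the release.
    """
    source = f"s3://{BUCKET}/release/{RELEASE}/"
    if theme is not None:
        source += f"theme={theme}/"
    return [
        "aws", "s3", "cp",
        "--region", "us-west-2",
        "--no-sign-request",  # anonymous access, no credentials needed
        "--recursive",
        source,
        destination,
    ]


command = build_download_command("./overture", theme="places")
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```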

Reading that, I found it had a link to Overture Places.

I zoomed in to a local shopping centre in Overture Places, which we also have mapped in OpenStreetMap.

So, from that one small spot: the real estate agent is mapped out on the footpath and has the wrong agency name; the convenience store appears under two variations of its current name, one in the correct spot and one out in the street, and also a third time as IGA, a name that changed ~4 years ago; The Facialist closed down ~3 years ago; and some of the business names just don’t exist 🧐 🤔

Even “better”, just up a bit they have Sea World Gold Coast marked: Overture Places
Sea World is one of the biggest theme parks in all of Australia, but is about 20km north of here, not in a suburban back yard!

In regard to the question of reusing this info in OSM, another thing I noticed is lots of “work from home” businesses shown, often with names attached to them.

I happen to know that hardly any (if any?) of these businesses have signs posted outside saying that this business is here.

In this case, we’d have to be very careful with what we copy, & as always, it shouldn’t be done without first verifying what is on the ground!

I have seen similar effects in other datasets - usually it is the result of importing an official business register, full of such businesses without actual offices.

A “short list” consensus (initial distillation of quality w.r.t. the .alpha-0) is that it is a relative shit-show.

It is .alpha-0. I’ve been on teams (Apple application/OS software, Adobe system software, Santa Cruz Operation Xenix…) that make such (very early versions of) things, and in retrospect, they are “not usually particularly great.”

It’s the first “pickle out of the barrel.” We (OSM) might offer patience, we might say “do better,” we might openly mock and ridicule (I’m not), we might encourage patience and better results in future endeavors.

We might also “double down” and continue (as @hfs , @SimonPoole , and others have) to “tear it open and see how it bleeds.” I’m in the latter camp, not necessarily doing the tearing, but certainly munching popcorn as others do. Thanks to everyone who tears this stuff apart.

To be fair, the Overture Maps data also provides the confidence of a POI:

confidence:
        description: The confidence of the existence of the place. It's a number between 0 and 1. 0 means that we're sure that the place doesn't exist (anymore). 1 means that we're sure that the place exists. If there's no value for the confidence, it means that we don't have any confidence information.

schema/schema/places/place.yaml at main · OvertureMaps/schema · GitHub

Many of the POIs in question have a very low confidence value. POIs with low confidence should probably be filtered out, so that only POIs with sufficiently high confidence are used.
By the way, the color on this map indicates the category of the POI.
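
The filtering suggested above could look roughly like this. It is only a sketch: the threshold of 0.8 is an arbitrary assumption, not a value recommended by Overture, and the sample POIs are invented. Following the schema description, a missing confidence is treated as "no information" and kept in a separate group rather than being counted as low.

```python
def split_by_confidence(pois, threshold=0.8):
    """Split POIs into (keep, drop, unknown) by their confidence value.

    `threshold` is a hypothetical cut-off chosen for illustration.
    POIs without a confidence value go into `unknown`.
    """
    keep, drop, unknown = [], [], []
    for poi in pois:
        confidence = poi.get("confidence")
        if confidence is None:
            unknown.append(poi)
        elif confidence >= threshold:
            keep.append(poi)
        else:
            drop.append(poi)
    return keep, drop, unknown


# Invented sample data, not taken from the release.
sample = [
    {"name": "Example Supermarket", "confidence": 0.95},
    {"name": "Example Home Business", "confidence": 0.21},
    {"name": "Example Kiosk"},  # no confidence information at all
]

keep, drop, unknown = split_by_confidence(sample)
print([p["name"] for p in keep])
print([p["name"] for p in drop])
print([p["name"] for p in unknown])
```

Where to put the threshold would need checking against places whose status is known on the ground.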

In my opinion this stuff should only be used as input for a StreetComplete or EveryDoor quest that asks the mapper “Is there a shop/gas station/barber/… at this location?”, leaving everything else up to the mapper to add. And if the mapper says no, remove the entry from the quest.
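
As a rough illustration of that workflow (this is not how StreetComplete or EveryDoor are implemented, and every class and field name below is hypothetical): each Overture place becomes a yes/no question, a "no" simply drops the candidate, and a "yes" only marks the spot for the mapper to survey and tag by hand. Nothing is copied from the source data.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """A place from the external dataset, used only to phrase a question."""
    category: str
    lat: float
    lon: float


@dataclass
class QuestQueue:
    """Hypothetical queue of open yes/no questions."""
    pending: list = field(default_factory=list)
    confirmed: list = field(default_factory=list)

    def question(self, candidate):
        kind = candidate.category.replace("_", " ")
        return f"Is there a {kind} at this location?"

    def answer(self, candidate, exists):
        # Either way the question is closed and removed from the queue.
        self.pending.remove(candidate)
        if exists:
            # Only the location is flagged; the mapper adds all details.
            self.confirmed.append(candidate)


# Invented coordinates, for illustration only.
shop = Candidate("gas_station", 52.5, 13.4)
ghost = Candidate("barber", 52.6, 13.5)
queue = QuestQueue(pending=[shop, ghost])

print(queue.question(shop))
queue.answer(shop, exists=True)
queue.answer(ghost, exists=False)
print(len(queue.pending), len(queue.confirmed))
```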

Thanks!

Can we see that confidence level at all?

& again, do we know the legend?

Overture Maps data contains the confidence. This map (only a small area) also shows the confidence, by clicking on the marker:
hmb: places: 931 rows

Here it is:

Yowza, OSM-community. Thank you to all who tear this apart. I am delighted to see this sort of “eat the red meat” we have been thrown. We are up to the task. Good discussion can only help; good for us.

The main issue I see is, we know the POI data is junk (and we actually knew that before it was released because we’ve been on the receiving end of complaints about the low quality Facebook data for years), and anybody that actually inspects the data will realize that it is junk, but that has nothing to do with outside perception.

Not just the Facebook fanbois and shills, but the media have decided that this is a wonderful thing and the last thing they will do is point to OSM as the better data source.

Google is not much better. Too many street names are wrong, and are left wrong even after photographic evidence has been sent. So you get routed on Waze to the wrong location, or it can’t find a location at all. How a POI can carry a different street name from the one right outside its doorstep is a riddle. OSM QA will sure as heck let you know.
Having worked with MapWithAI pulling in MS building footprints: 99.99% require correcting, so where one sees buildings, they are a token representation or a figment of the shadow hunters. The data is dated at that, with many newer buildings missing.

Without some sort of filtering, I doubt that it is even worth doing that with it. Locally to me (in England) there are about 3 times as many invalid POIs as valid ones, and (very small sample but a rough guess) maybe 30% of POIs are missing, though that might be a process or category issue on Overture Maps’ side.

As Simon said above, OSM has had numerous reports about the problems with Facebook’s data (whether used in FB, Instagram or elsewhere) - as reports to the help site, forums like this and to the DWG, where we pretty much created a special reporting category for it.

For shop data it is much better, at least in my area.

Have they been attempting to restructure the data?

I am seeing similar in my town.

Lots of unverifiable work from home, or where the business is
registered to receive mail. On this subject one of the bowling clubs is
shown at a residential address which is probably where the secretary
lives rather than on the bowling green and clubhouse where OSM has it
correctly placed.

Outdated POIs, seeing both previous businesses and current at the same
place.

Some seriously misplaced such as a school that is in a village several
miles away.

One that looks like it could be a GDPR breach, the name of a lady on
her address with nothing more to indicate a type of business. I know
her and thought she works in a local cafe.

I see the same here in Austria. Recently two people (or was it one?) mapped hundreds of POIs in the area, the kind you find on shady yellow-pages sites and, from the looks of it, in Overture now.

They are a pain to weed out. Maybe that is the idea?
