Is the license of alltheplaces suitable?

Note https://osmfoundation.org/wiki/Licensing_Working_Group/Minutes/2023-08-14#Ticket#2023081110000064_—_First_party_websites_as_sources

4 Likes

Just to be clear, you’ve answered a general “is the licence of X compatible” question with a statement from the LWG that “data of one specific type will be compatible with OSM under some specific circumstances”.

It doesn’t address the general question asked at the top (“Is the license of alltheplaces suitable?”) or make any statement about non-opening hours data on first-party websites (as discussed above).

1 Like

I think @Mateusz_Konieczny’s post is on-topic. The LWG fielded an inquiry from an AllThePlaces developer that recounted several specific objections to the use of AllThePlaces data in OSM, specifically citing the original post in this thread, and this was their response. It wasn’t limited to opening hours (emphasis on the word “like”). Perhaps they wanted to address these questions specifically so they wouldn’t have to deliver an opinion on every similar scraper that comes along. You could demand a yes or no answer from the LWG if you feel it would make a difference.

To be clear, the specific request was quoted here and the questions asked there*** were:

I guess my questions are:
1) Can we use first party websites as sources for independent POIs?
2) Can we use first party websites as sources for chain POIs?
3) Are urls copyrighted?
4) Can we collect opening hours off doors?

The LWG’s answer can be seen below that, starting

LWG official position
Copying the opening hours of a business from its own website is fine…

That, I’m sure, is non-controversial. However some parts of what follows are somewhat “interesting”, not least:

… Even where the legal status of scraping is uncertain, it does not impact whether or not OSM can use the resulting data - it’s just a matter of the personal risk of the person running the scraper

which seems to imply that data can be “license-washed” by being included in a third-party database such as “alltheplaces”. Perhaps it would help for the LWG to expand on what they actually mean there? Maybe I’ve just misunderstood the sense of it.

However, as noted in the request to the LWG, the business owners want this data to be public, so it is extremely unlikely that there will be complaints from them that this data is being used by OSM et al**. The challenge occurs when the data isn’t actually the business owner’s to distribute under a licence of their choosing (see e.g. the Postcode Address File question above****). That’s why it’s important not to assume that “any data that might have got into alltheplaces is therefore freely licenced for use in OSM regardless of what licence it was originally under” and instead to read what the LWG actually said.

Best Regards,
Andy (for the avoidance of doubt, writing in a personal capacity, and most definitely not a lawyer)

** although there are exceptions; with a DWG hat on I can think of a few, in most cases invalid, complaints.

*** Incidentally, the “As far as I know [alltheplaces is] not used as a source to add data to OSM” statement in there is at best “incompletely informed”; a search through changeset tags for “%alltheplaces%” finds quite a few.

**** and when I browse data locally I see almost no opening hours in alltheplaces data but lots of postcodes.

Scraping is not a violation of copyright law per se – that’s the domain of contract law, terms of service, computer abuse laws, etc. What they said is that someone engaging in scraping takes on some personal risk, depending on the jurisdiction, but that the result of that scraping isn’t necessarily tainted by association, nor does it necessarily make the scraped content free.

A lawyer by training (but not my lawyer) once made an analogy to being handed a photocopy of a public domain book that had been shoplifted from a bookstore. There are undoubtedly limits to this analogy, but yes, license-washing is a thing, and there’s a fair chance that your computer exists because of it. To be clear, it’s not something I personally care to devote my time to, because it’s a bit of a Rube Goldberg contraption compared to what I’m most interested in doing.

Edit: This turns out to be a poor analogy altogether, and I misinterpreted the term “license-washing”. The point about scraping being orthogonal to copyright still stands, however.

3 Likes

Probably a stupid question but couldn’t the LWG just make a final decision on this one? I feel like they are the most qualified for that, right?

1 Like

Just had a look in my locale & noticed:

https://www.alltheplaces.xyz/map/#14.65/-28.08091/153.43921

All good, except that the marked Night Owl is actually located in the same group of shops as Brumbies Bakery, Coles & Shell in the bottom left corner!

IANAL, IANAL, IANAL, this is my personal understanding of the situation, not consulted with anyone

My understanding is that license-washing (someone taking copyrighted data, falsely claiming that it is openly licensed, and publishing it under a license that does not apply to it) is distinct from both clean room design and scraping.

In clean room design, something is recreated by people who have not seen the original implementation, so it is provable that the new work is not tainted by the copyrighted original (and only non-copyrightable parts are used as inspiration; for example, as I understand it, it is legal and fine to look at Google Maps public transport routing and decide to add similar functionality to Organic Maps, but using leaked Google Maps code, or even implementing it after looking at leaked Google Maps code, would be problematic).
In terms of map data, clean room design would be going to some location after hearing that, say, “Google Maps has many shops mapped at Foobar Street, OSM has none”, without looking at Google Maps (so the shops are clearly not copied from there).
So clean room design is kind of the opposite of license washing.

Scraping does not affect the legality of the copied data, but doing it may break some rules, and depending on the type of scraping and local laws it may be illegal.
In terms of other mapping activity, it can be similar to mapping a military installation, where it is not a copyright issue but the mapper may break some local laws while doing it (the analogy is not ideal, as a map with military installations may itself be illegal under local law, while here only the scraping is problematic and there is no trouble with using its product).
Maybe “flying a drone to get images for mapping” is a better analogy? Flying a drone may or may not be illegal, depending on how it was done and on local law. But mapping based on those images will not be affected by how they were collected.

(though I guess that some jurisdictions could rule that data obtained by scraping/people with the wrong religion/illegal drone flights is illegal, cannot be used, and taints any products? But as I understand it, in typical jurisdictions this does not apply)


So the LWG commented, as I understand it, that ATP is not license washing because the data was not protected by copyright or database rights in the first place.

IANAL, IANAL, IANAL, this is my personal understanding of the situation, not consulted with anyone

3 Likes

Thanks for the clarification. I wasn’t aware that this was a formal term for that practice specifically.

Ah nice analogy, and more relevant to our project too.

Not sure whether it is a formal term, but that is how it was used by the Wikimedia Commons community.

I found now

1 Like

The limits are that there are not even claimed intellectual property rights involved in the example and it illustrates exactly nothing of relevance to this thread.

The whole point of (conventional) copyright is that it is a near universal, state guaranteed, set of exclusive rights that does not rely on contracts between the parties to take effect. There are some corner cases wrt terms, and the US “fair use” terms tend to be substantially more relaxed than in other countries, but as said, corner cases.

I find the statement from the LWG a bit unfortunate, as it could be taken as saying that material on a website is not protected at all. Naturally that is not the case: images, text, audio, visual and other material that is eligible for copyright protection doesn’t lose that protection just because the rights owner decided to use it on a website, or to licence such use. So if you scrape images from websites and try to reuse them in a way that isn’t covered by an exception to copyright protection, you are asking for trouble.

What the LWG is referring to is the extraction of information, or if you will “data”, from websites, and that has already been discussed at length in this thread.

OK, fair enough. I’ve redacted it from my earlier post.

Doesn’t look like this question was ever appropriately answered… If a citizen of a country where there are no database rights wanted to add data to OSM from a database consisting of non-copyrightable data, from an organization which only operates in that country (let’s say the government of that country)… is it allowed?

EDITS: grammar and punctuation… hope it is more readable now.

The largest problem is that the OSM community is not one entity. Even with the official organization being registered as a charitable organization in the UK, each chapter is bound by the laws of the respective countries.

Then there is the official database that carries data with multiple copyrights that have been traditionally used with software. Licenses that have never been truly tested in most jurisdictions.

That doesn’t even include the different members with pseudo-official websites and tools, each maintained by their own subgroups.

At the end of the day, the best we can do is show that we are working in good faith, doing our best to follow the spirit of the respective laws in whatever jurisdictions the map might be edited or accessed. I doubt any multinational corporation’s legal department would want to deal with our situation. That is assuming they could even figure it all out if they wanted to. We likely get a pass on these, mostly due to being a charity that provides so much value to those in so many jurisdictions.

For that reason alone, most of the government agencies and corporations we partner with have an interest in preventing any real litigation from moving forward. They lose a lot when inoffensive volunteers get served with legal paperwork. In a way, OSM has become the UN of mapping by holding ourselves to a higher standard.

It was an essentially rhetorical question, as nobody had claimed that regional/national regulations outside of conventional copyright have universal application.

The problem (forgetting about ethical issues) lies in that OSM data is used universally. Any data that for whatever reason can’t be used in area A but can be legally obtained in area B will end up being distributed in area A.

The whole basis of this thread is people in country B complaining that they need to comply / show that they comply with the rules in area A for data obtained in area A if they want to include it in OSM.

1 Like

Just for correctness’ sake: the OSMF is not registered as a charitable organisation in the UK.

2 Likes

My question is unfortunately not rhetorical, even if the original one probably was.

The problem with this is that the locals of a jurisdiction are best placed to assess that and a central DWG assessing this is just flawed design.

This still doesn’t answer the question though. Is that a no?

I understand that this thread was about a particular dataset, I wouldn’t mind starting a new thread if you think this is the wrong place to ask the question.

As I pointed out further above it is a “it depends”.

The dataset is produced in B but contains data obtained in A and B (and more). The original question was “can I use the dataset in OSM”, and the answer has always been “sure, if any data obtained from A can be legally included in OSM for use in A, because A is important for OSM” (obviously everything paraphrased).

There just isn’t a blanket statement by the creator of the database that the above holds. This doesn’t make the data unusable for a mapper who verifies for themselves that the store location information is actually extracted from, and published on, the company’s website, because, as has been established, that should work legally for most places. That is in contrast to, for example, data generated by a query on Here’s website.

1 Like

Thanks for the clarification.

This is also my understanding: that mapping info from text on a website operated by a shop chain is as fine as mapping info from text on a door operated by a shop chain.

On the imports mailing list I submitted for review an edit plan that would add some fuel stations, partially using data from the Orlen website (via the ATP dataset).

This is a good moment to post there (or tell me to also make a thread here, I guess) that this plan is bad for one reason or another.

2 Likes