On replacing Basic Auth with OAuth 2.0

Dear OpenStreetMap Foundation,

I would like to express my concerns regarding OSM dropping support for Basic Auth (not to be confused with OAuth 1.0). The current intention is to replace it with just OAuth 2.0, which seems like the wrong tool for the job, as it will introduce unnecessary complexity, especially for new developers and people running basic edit scripts. This will most likely affect how many developers contribute to OSM in the future.

I initially expressed my opinion and suggested an alternative here, but I haven’t received a satisfying response/justification from the people responsible for the deprecation.

The Problem

Today, to start scripting on OpenStreetMap, all a person needs is their username and password. They can then call the API directly to perform any interaction. This system, while not secure (people tend to leave their passwords in plain-text files), makes it very easy to get started with OSM.

The procedure looks as follows:

  1. Save the username + password somewhere safe
  2. Attach the HTTP header: Authorization: basic <base64-encoded username:password>
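
As a rough illustration, the two steps above amount to something like the following Python sketch. It uses only the standard library; the credentials are placeholders and the helper name `basic_auth_header` is made up for this example.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Build the HTTP Basic Auth header from a username and password."""
    # Basic Auth is just "username:password", base64-encoded (not encrypted).
    credentials = f"{username}:{password}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}


# Placeholder credentials, for illustration only.
headers = basic_auth_header("alice", "secret")
print(headers)
```

Anyone who reads the header can decode the password, which is why this only makes sense over HTTPS.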

The current proposition is to replace this Basic Auth method with OAuth 2.0. Meaning that the new scripting procedure will look like:

  1. Register an OAuth 2.0 application (https://www.openstreetmap.org/oauth2/applications/new)
  2. Save the Client ID + Client Secret somewhere safe
  3. Generate an access_token with the use of a library or an external application
  4. Save the access_token somewhere safe
  5. Attach the HTTP header: Authorization: bearer <access_token>
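
To give a feel for what steps 3 to 5 involve without a helper library, here is a minimal sketch of the authorization-code flow using only the Python standard library. It builds the URL and form fields but makes no network calls. The authorize URL, the out-of-band redirect URI, and the `read_prefs` scope are assumptions for illustration, and the function names are made up.

```python
from urllib.parse import urlencode

# Assumed endpoints and values, for illustration only.
AUTHORIZE_URL = "https://www.openstreetmap.org/oauth2/authorize"
TOKEN_URL = "https://www.openstreetmap.org/oauth2/token"
REDIRECT_URI = "urn:ietf:wg:oauth:2.0:oob"  # out-of-band: the code is shown to the user


def build_authorize_url(client_id: str, scope: str) -> str:
    """Step 3a: URL the user opens in a browser to approve the application."""
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
    })
    return f"{AUTHORIZE_URL}?{query}"


def build_token_form(client_id: str, client_secret: str, code: str) -> dict[str, str]:
    """Step 3b: form fields to POST to TOKEN_URL in exchange for an access_token."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": client_id,
        "client_secret": client_secret,
    }


def bearer_header(access_token: str) -> dict[str, str]:
    """Step 5: header attached to every authenticated API request."""
    return {"Authorization": f"Bearer {access_token}"}


print(build_authorize_url("XXX", "read_prefs"))
```

The user still has to open the printed URL in a browser and paste the resulting code back into the script, which is the interactive part a plain username and password never needed.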

I believe this will significantly raise the barrier to entry for new and inexperienced developers, whereas the OpenStreetMap platform should strive to be open and inviting to everyone. It will also make many applications more dependent on external libraries/applications, potentially exposing them to supply chain attacks.

One More Hidden Problem

There is one more hidden problem with using OAuth 2.0 in scripts. Currently, access_tokens produced by the OSM server are never-expiring. This is quite a non-standard configuration, and ideally, we should switch to using a refresh_token + access_token pair in the future. While this is a non-issue today, if we proceed with the current Basic Auth → OAuth 2.0 replacement, tomorrow it will be a massive headache for all script developers. The scripting procedure will then look like:

  1. Register an OAuth 2.0 application (https://www.openstreetmap.org/oauth2/applications/new)
  2. Save the Client ID + Client Secret somewhere safe
  3. Generate a refresh_token with the use of a library or an external application (long validity: months)
  4. Save the refresh_token somewhere safe
  5. Generate an access_token with the use of a library (short validity: minutes)
  6. Save the access_token somewhere safe
  7. Attach the HTTP header: Authorization: bearer <access_token>
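
The extra bookkeeping in steps 5 and 6 could look roughly like the sketch below: a small store that hands out the current access_token and swaps it for a new one shortly before it expires. Everything here is hypothetical (`TokenPair`, `TokenStore`, and `fake_refresh` are made-up names), and the real refresh would be an HTTP call to the token endpoint.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TokenPair:
    access_token: str
    refresh_token: str
    expires_at: float  # Unix timestamp after which access_token is invalid


class TokenStore:
    """Hands out a valid access token, refreshing it shortly before expiry."""

    def __init__(self, tokens: TokenPair, refresh: Callable[[str], TokenPair],
                 leeway: float = 30.0):
        self._tokens = tokens
        self._refresh = refresh
        self._leeway = leeway

    def access_token(self, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        if now >= self._tokens.expires_at - self._leeway:
            # Steps 5 and 6: trade the refresh_token for a new pair and keep it.
            self._tokens = self._refresh(self._tokens.refresh_token)
        return self._tokens.access_token


def fake_refresh(refresh_token: str) -> TokenPair:
    """Stand-in for the HTTP call to the token endpoint (hypothetical)."""
    return TokenPair("new-access", "new-refresh", time.time() + 300)


store = TokenStore(TokenPair("old-access", "old-refresh", time.time() + 300), fake_refresh)
print(store.access_token())
```

A real script would also have to persist the new pair to disk after every refresh, since the old refresh_token may no longer be valid.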

The Simple Solution

Many public APIs have already solved this problem, and I believe it’s best to learn from them rather than coming up with our own clunky solutions.

I propose that we replace Basic Auth with a Personal Access Token (PAT) system. This will provide a secure, flexible, and simple solution to the Basic Auth security problem. It will require very little code/maintenance, as it can run on top of the existing OAuth 2.0 implementation internally.

The PAT scripting procedure will look like:

  1. Retrieve the PAT token from OSM user settings
  2. Save the access_token somewhere safe
  3. Attach the HTTP header: Authorization: bearer <access_token>
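
For comparison, a script using such a token could be as small as the sketch below, again with only the standard library. PATs do not exist on OSM today, so this is hypothetical; the `OSM_PAT` environment variable and the `build_request` helper are made up, and the user-details URL is used purely as an example endpoint. The request is built but not sent.

```python
import os
import urllib.request

# Example endpoint, used for illustration only.
API_URL = "https://api.openstreetmap.org/api/0.6/user/details"


def build_request(token: str, url: str = API_URL) -> urllib.request.Request:
    """Attach the token as a bearer credential to a plain HTTP request."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Hypothetical: the token is copied once from the user settings page
# into an environment variable named OSM_PAT.
request = build_request(os.environ.get("OSM_PAT", "XXX"))
print(request.full_url)
# urllib.request.urlopen(request) would then perform the call.
```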

Basically, all the steps, from registering an OAuth application to generating tokens, are done server-side, hidden from the user. It has all the same benefits of OAuth 2.0 but none of the downsides of using it in a simple script application. The PAT system is very easy to use, secure, and provides no additional barrier to entry for new developers (it is also dependency-free!). This also future-proofs the potential refresh_token support as PATs will remain unaffected.

Regards,
Kamil

9 Likes

Why would we want “new and inexperienced developers” writing to the API?

4 Likes

I strongly believe OpenStreetMap should be an open and inviting platform to everyone. It’s also important to note that API authorizations affect rate-limits for read-only script users.

If we theoretically wanted to close down API access, I believe it’s important to discuss such changes beforehand with the broader community. Nonetheless, using OAuth 2.0 for script authorizations is an unusual approach that unnecessarily complicates writing scripts today (additional dependencies) and managing the OpenStreetMap website tomorrow (more headaches with supporting refresh_tokens). The industry standard is to use PAT-like systems that provide a perfect replacement for Basic Auth. Let’s not try reinventing the wheel :slightly_smiling_face:.

You didn’t answer the question.

I did sir!

You copy pasted something you thought sounds good, but which has no bearing on the issue at hand.

Your “argument” would work just as well to support a RMSish no passwords world, or to allow open access to vandals, or making everybody an admin on the website etc. Naturally nobody is seriously suggesting that, because OpenStreetMap is about collecting and maintaining open geodata and we want that to be accessible for everybody; the tools enabling that are just tools, not the raison d’être for the project. Even with basic auth gone, API access will remain open and anybody that puts the effort in can still write a script (regardless of how misguided it might be).

PS: what would be the reason that we want “new and inexperienced developers” to exceed the API rate limits?

PS: @ZeLonewolf where’s the :popcorn:

2 Likes

“You copy pasted something you thought sounds good”

I don’t know where you got this from but this is just incorrect.

I believe the burden of proof/justification lies on the people making the change and not on me. Over the past many years, OpenStreetMap has been an open platform, and if we want to change that, it should be clearly communicated.

However, as this point is missing in the original diary, I believe it’s nothing more than a mere oversight. I wouldn’t put too much weight on it until we get an official response. The change is justified as “in order to improve security and reduce code maintenance”, and my PAT recommendation achieves just that without sacrificing developer experience, while also improving security (fewer dependencies = less risk) and future-proofing.

PATs are usually used as a replacement for username+password when working with systems over a communication medium that does not support OAuth or similar. The most common application I’m aware of is when using Git (it’s quite telling that all references from the Wikipedia article you linked are to Git server providers).

When working via protocols that do allow for OAuth (such as REST in the case of OSM) there’s no need for that sort of “bandage”. Your step examples are somewhat exaggerated; a fairer comparison:

OAuth 2:

  1. Register OAuth2 application
  2. Copy client ID and secret to somewhere the app can read it
  3. Use the API

PAT:

  1. Create a new PAT
  2. Copy the PAT to somewhere the app can read it
  3. Use the API

Think I’m missing something between step 2 and 3 for OAuth 2 (steps 3-6 in your steps)? That’s because even as a developer you rarely need to care about that, provided you use a decent library (which you should do anyway).

Let’s compare the code using requests in Python (code not tested, likely incomplete but serves the purpose of comparing):

OAuth 2:

from oauthlib.oauth2 import BackendApplicationClient
from requests_oauthlib import OAuth2Session
session = OAuth2Session(client=BackendApplicationClient(client_id='XXX'), auto_refresh_url='https://www.openstreetmap.org/oauth2/token')
session.fetch_token(token_url='https://www.openstreetmap.org/oauth2/token', client_id='XXX', client_secret='XXX')
session.get(...)

PAT:

from requests import Session
session = Session()
session.auth = ('', 'XXX')  # add PAT here
session.get(...)

The difference is a single extra line (and a few more arguments) for OAuth 2. But even this isn’t the whole truth, because ideally all that should be abstracted away by a library (sadly osmapi, the most up-to-date Python lib for the OSM API I’m aware of, does not support this, though for example osm-auth (JS/TS) does) so that even for OAuth you’d just have:

from some_osm_lib import OSM

client = OSM(client_id='XXX', client_secret='XXX')
client.nodes.get(45124552)

While I won’t go as far as saying that we want some artificial barriers like Simon is implying (but also not saying that it’s bad), I think implying that supporting OAuth 2 but not PATs would be the opposite of an open and inviting platform is just plain wrong.

“Unusual approach”?

OAuth 2 is literally the industry standard for third-party API access (though it’s moving more towards OIDC, but that’s just an extension/formalization of OAuth 2).

The OSM community is quite good at re-inventing the wheel (just look at our data model compared to what everyone else is doing in mapping/geospatial), but using OAuth 2 is one of the cases where we’re using an exact copy of the wheel blueprints everyone else is using…

Now this part is interesting, and I suspect part of the reason for this post is that you’re working on implementing that part in your Python port.

I’ve previously talked about the fact that implementing a security mechanism (be it OAuth or PAT or something else) is something the OSM community should avoid; doing that correctly and securely is just too much work. Personally, I would have solved it using some pre-existing component such as Keycloak or the Ory suite; the Ruby port solved it using the Devise library, and you could use oauthlib.

Regarding refresh tokens, much of the industry is moving towards (semi-) stateless services and JWTs, and while there’s a lot to be said about JWTs (and I encourage you to read about them regardless of if you end up using them or not) it does make refresh token handling very easy (somewhat pseudo-codey):

@app.get("/refresh_token")
async def refresh_token(refresh_token: str):
    parsed = jwt.decode(refresh_token)
    # Reject tokens for unknown users or tokens past their expiry.
    if not db.user_exists(parsed["uid"]) or parsed["valid_until"] < datetime.now():
        raise HTTPError(401)
    # Issue a fresh token pair, valid for five minutes.
    return jwt.encode(dict(
        uid=parsed["uid"], access_token=generate_token(), refresh_token=generate_token(),
        valid_until=datetime.now() + timedelta(minutes=5),
    ))
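
The pseudo-code above leans on a JWT library. As a self-contained illustration of the underlying idea (signed, stateless tokens that carry their own expiry), here is a minimal sketch using only the Python standard library. The names `SECRET`, `encode_token`, and `decode_token` are made up for this example, it only handles HS256, and a maintained JWT library would be the better choice in a real service.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"change-me"  # hypothetical signing key; keep the real one out of source control


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def encode_token(claims: dict, secret: bytes = SECRET) -> str:
    """Serialize the claims and sign them with HMAC-SHA256."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(signature)}"


def decode_token(token: str, secret: bytes = SECRET, now: Optional[float] = None) -> dict:
    """Verify the signature and the "exp" claim, then return the claims."""
    header, payload, signature = token.split(".")
    signing_input = f"{header}.{payload}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload))
    if claims["exp"] < (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims


token = encode_token({"uid": 1, "exp": time.time() + 300})
print(decode_token(token)["uid"])
```

Because the server can verify the token from its signature alone, no per-token database row is needed, which is what makes the refresh handler so short.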
6 Likes

First of all thank you for such an extensive response! Let me answer your feedback:

I don’t fully agree. Let me bring GitHub as a counterexample. GitHub supports both OAuth and PATs, each used for different purposes. OAuth is used for authorizing external services to use your account in a limited way, while PATs are used for authorizing yourself. This provides flexibility and does not complicate local scripts with the OAuth workflow (PATs can run on OAuth under the hood, which is transparent to the developer).

I believe your table misses some crucial steps. Let me create a working side-by-side code comparison (OAuth at the top, PAT at the bottom):
https://gist.github.com/Zaczero/74bdfc57318aef8794f2bbfd2b43a484

The code provided by you, as already pointed out, does not run, and even if it did, it would probably require a desktop interface and user interaction (to approve login on OSM website), which is missing on many servers where the scripts usually run.

(I have zero idea why the error message is in a different language.)

I don’t know any public API that would require OAuth for script authorization. That’s why I call it unusual.

I believe there is some misunderstanding between using OAuth 2.0 for scripts and for user-facing applications. Using OAuth 2.0 for 3rd party authorization grants is perfectly fine but that’s not what this post is about.

I don’t, but I believe sooner or later it will come down to this, and I would love everyone to avoid unnecessary stress. Frankly speaking, for the Python port, it’s easier to support just OAuth 2.0 than OAuth 2.0 + PATs, but I highlight this issue for the good of the community. :stuck_out_tongue:

1 Like

I think for scripts people would just use JOSM to upload the changes via the API, at least in cases where the author cannot or doesn’t want to implement OAuth 2.

I’m just pointing out that there was and is no reason given why we should go out of our way (that is implement a personal access token system) to support @NorthCrab’s “new and inexperienced developers” writing to the API.

If they are so “new and inexperienced” that they can’t scale that minuscule barrier, then would we want them writing to the data? The GitHub analogy is quite telling, as any damage you can do there is limited to the repositories you either own or have been made a collaborator of; you can’t write to all of GitHub.

But in any case there are literally no barriers to running a script, app or whatever against the API outside of actually writing the code, and you can even circumvent that given a powerful enough editor. You don’t need a special account to generate the client keys, nor does any other administrative barrier exist.

2 Likes

Is this a big problem? Using the OAuth2.0? How many are affected?

I have written a lot of Perl scripts, and my best friend was the Perl Cookbook published by O’Reilly. Any time I had to write complex code that was unusual for me, the cookbook usually showed the solution, and I didn’t need to know all the details of why the code worked. I assume there are similar cookbook examples for implementing OAuth 2.0.

Big problem - no. But an easily avoidable one for sure!

The core issue with using OAuth 2 for scripts is unnecessary complexity that is placed on the developers. So far, there is no sensible reason why we need to use OAuth 2.0 and can’t just use PATs. It feels like, instead of developing a solution once on the OpenStreetMap website, we will require each developer to write the same boilerplate code in each application, which seems like a waste. The idea of PATs is to handle this part on the server-side, transparently to the developer, achieving the same benefits of OAuth while centralizing the solution: easier for developers, more secure, and future-proofed.

The question is:
Why do we need to complicate script development? Is there a reason behind it? Or is it just an oversight?

1 Like

Actually, it’s not**. There may well be places where PATs would be a good fit and the standard approach (usually SSO via an external IDP***) might be a bit heavy-handed, but OAuth 2.0 is the industry standard here.

** Based on the experience of my day job, which regularly involves helping customers negotiate this particular minefield.

*** and obviously a reliance on an external IDP wouldn’t work for OSM, though you can use one if you want.

@NorthCrab Perhaps this is an opportunity to take a step back and look at where we are:

You commented extensively on the GitHub issue. Broadly speaking, no-one there shared your concerns**. You’ve now raised it in the Foundation area here, and many of the same people have also commented saying that they still don’t share your concerns.

At what point do you say to yourself “if I think A and everyone else thinks B, what information do I have that they do not (or vice versa)?”. In this case a “collective delusion” or some sort of “conspiracy” among all the people who think that this OAuth2 change is a good idea seems unlikely, so the more likely explanation is that you are perhaps missing some of the knowledge or experience that led them to come to that conclusion?

Edit: There are at least two - TrickyFoxy below, and Mateusz in a diary comment … and 4 people have now liked the GitHub post.

Hey! Can you please provide an example of a public API that requires scripts to authenticate via OAuth? Also, yes, I did forward it to the Foundation section to raise awareness of this change, as it will directly affect the community. It’s fair to say that the Operations GitHub is not the most active place for such a discussion. I never said it’s a conspiracy or anything like it; I don’t know where you got that “quote” from. As previously stated, I believe it’s just an oversight.

“Broadly speaking, no-one there shared your concerns”

Please note that this is untrue, and it appears to me that your response is highly biased for some reason.

Oh, do we want only experienced developers to exceed the limits? Well, this summer and autumn we saw it.


Let me remind you that the OSM API is not limited to editing the map.


I absolutely agree with NorthCrab.

Let’s say you want to create a bot account. Or a client for forwarding incoming messages. Or parse user information (as far as I remember, not all information is provided by the API without authorization). Or any other scripts…

Why is there an abstraction in the form of an OAuth application in these scenarios? These can be scripts in several lines of code for which [Login:Password]/token is more than enough.

But no, for OAuth we write wrappers for each language: osmlab/osm-auth on GitHub (“Easy authentication for OpenStreetMap over OAuth2.0”) and Zverik/cli-oauth2 on GitHub (“Helper library for OAuth2 in command-line tools”, which starts a web server to authorize the script :man_facepalming:).
Will anyone who wants to write a simple script in their favorite language need to write a wrapper that does the same thing?

And can I, as a developer, just copy and paste the token from the browser into the config of my script, leave it on the server and forget? Thank you, at least we have OAuth tokens with no expiration date…
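
The “paste it into the config and forget” workflow described above could be sketched as follows. The INI layout, the `[osm]` section, and the `access_token` key are made up for this example; the config text is inlined here so the snippet runs on its own, where a real script would read it from a file.

```python
import configparser

# Hypothetical config contents; the token value is a placeholder.
EXAMPLE_CONFIG = """
[osm]
access_token = XXX
"""


def load_token(config_text: str) -> str:
    """Read the token that was pasted into the script's config."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser["osm"]["access_token"]


def auth_headers(config_text: str) -> dict[str, str]:
    """Build the header that every request made by the script would carry."""
    return {"Authorization": f"Bearer {load_token(config_text)}"}


print(auth_headers(EXAMPLE_CONFIG))
```

This only stays a one-time setup for as long as the token never expires.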

pnorman's Diary | Future deprecation of HTTP Basic Auth and OAuth 1.0a | OpenStreetMap

4 Likes

And a little bit about the friendliness of OAuth to developers.

When I stumble over this and see the option to use a username/password, I would rather use that.

To be clear, the first line of your first post in this thread was “I would like to express my concerns regarding OSM dropping support for Basic Auth”. To answer the question “who has dropped or is dropping support for basic authentication”, a web search can easily provide a list. I’m in the UK and the results I see are heavily geared towards Microsoft and Exchange, but if I scroll down I also see Atlassian, Google, Jamf, GitHub and more.

You now seem to have moved away from wanting basic auth to stick around and now just don’t want to have to implement OAuth2. In OSM, as elsewhere in the world, sometimes decisions will get made that make no sense to you. It might mean the decision was flawed, or it might just mean that you don’t understand why the decision was made. When this happens, generally speaking, we just need to “deal with it”.

3 Likes