Hello everybody. I am planning to deploy a bot that will update the address data for Oklahoma City. As of right now, this will only affect existing ways. Details can be seen on Automated edits/Big Friendly - OpenStreetMap Wiki. Please let me know if you have any feedback or concerns.
I am a bit concerned that an account with only 6 edits immediately wants to start doing automated edits, and with a custom bot no less. How do we know you understand how OSM works and know how to perform these types of activities?
I also do not believe we can use the data source you’re thinking of using. It explicitly states:
Furthermore, the receiving party agrees not to redistribute or resale the information provided by the City Of Oklahoma City.
I’m planning to have my code open source by the end of the day. Hopefully this will address your concerns about this.
I can get written consent from the city clerk to use this data.
I understand that this is a large area to be changing, so I want to make sure that all concerns get addressed before moving forward. Please feel free to ask more questions.
Is there any address data present there? If yes, how will you handle it?
Can you state which OSM keys you will use?
Hopefully not the bare housenumber key listed on the page
All the existing ways in the city will be updated.
surely not all ways?
also, what will be done if building is not mapped yet?
The data is sourced directly from the City of Oklahoma City, and can be accessed at data.okc.gov.
have you checked quality of this data?
I do not understand this question; would you be able to rephrase it for me?
housenumber
street
postcode
city
state
If the way’s data is the same as OKC’s data, then it will not be changed. The majority of the existing data is either missing or inaccurate.
If a building is not mapped it will be added to a list and be dealt with in another project. I will not be adding new ways in this project.
The data there is the most accurate data available.
Hello! I applaud your enthusiasm for this ambitious project but share concerns about needing to take things slowly. There’s no rush.
I encourage you, once you have data source approval, to do some address additions “by hand” so you have a really good sense of the complexities involved in merging two incomplete/conflicting/incorrect data sets. You’re going to want to present a much more thorough explanation of the various quality assurance checks, plans for conflict resolution, etc. As I hope you can see, “All the existing ways in the city will be updated.” lacks the specificity required to get things approved.
It’s worth reading through and digesting the plans for other address imports (either manually reviewed imports or automated) and make reference to the things you have learned from those processes. There’s a huge amount of prior art for this kind of thing so you’ll have plenty to chew on.
I have done some address additions by hand. Enough to know there is a lack of consistency/accuracy in the OKC OSM data. It’s why I think this project is important. That being said, I can provide a more detailed QA and conflict resolution plan when I get the chance. Additionally, this project can, and should, be done in parts to allow for additional review.
A lot of address data is provided as parcel centroids. For OSM, at least in the US, we standardize on addresses being inside the building footprint. In practice, this means it’s much easier to get address data where it needs to go in areas that have relatively high density of accurate building footprints.
Zooming around OKC a bit, I get the impression there’s a lot of work to do on that front.
I’ve written a lot about adding addresses and OSM. You may find some of it informative. (OSM is a bit underwater right now so these aren’t reachable currently)
https://www.openstreetmap.org/user/watmildon/diary/400812
https://www.openstreetmap.org/user/watmildon/diary/401407
I believe it would be a good idea for you to first get some mapping experience before attempting to do such an import (And I will stress, this is an import, not an automated edit). I also do not understand why you need a custom bot for this. Why does using a standard tool like JOSM not suffice?
In the past 2 weeks, we’ve already had 3 people accidentally dragging nodes halfway across the planet because they decided to create their own custom tools which inevitably had some very basic bugs in them ( Changeset: 178428572 | OpenStreetMap, Changeset: 178367561 | OpenStreetMap, Changeset: 177811159 | OpenStreetMap).
In fact, you already ran into such a bug where you mixed up node and way ids Changeset: 178321329 | OpenStreetMap
Creating your own tool and having it be bug free is hard and difficult work, and while I appreciate you intending to make it open source, your tool behaving as intended is still your responsibility.
Right now your methodology basically just says “These changes will be done with a bot”, but you don’t explain any of the other details. How are you going to find the correct nodes and ways? How are you going to conflate the data? How will you deal with duplicate addresses? These are all details that are important to consider.
The reason this is a bot is that I would eventually like to automatically update address data for new ways and buildings. Ideally, this is all the bot would do; however, due to the current state of OSM’s OKC data, the scope of the bot had to grow.
The OKC Address dataset contains addresses and their corresponding coordinates. The coordinates are sent in an Overpass API request, which returns either a way ID or nothing. If it returns nothing, the address is logged for later review. If it returns a way ID, the script updates the OSM address data as needed.
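For readers following along, the lookup step described above could be sketched roughly like this. It is only an illustration under assumptions, not the bot's actual code: the function names, the 5 m search radius, the `building` filter, and the rule that more than one match is sent to manual review are all hypothetical choices. No network request is made here; the sketch only builds the Overpass QL text and interprets an Overpass JSON response.

```python
from typing import Optional


def build_overpass_query(lat: float, lon: float, radius_m: float = 5.0) -> str:
    """Build an Overpass QL query for building ways near a coordinate.

    The radius and the "building" filter are illustrative assumptions.
    """
    return (
        "[out:json][timeout:25];"
        f'way(around:{radius_m},{lat},{lon})["building"];'
        "out ids;"
    )


def pick_way_id(response: dict) -> Optional[int]:
    """Return the way ID if exactly one way matched, otherwise None.

    Zero matches and multiple matches both return None so the address
    can be logged for manual review instead of being guessed at.
    """
    way_ids = [
        element["id"]
        for element in response.get("elements", [])
        if element.get("type") == "way"
    ]
    return way_ids[0] if len(way_ids) == 1 else None
```

Returning nothing for ambiguous matches is a deliberately conservative choice: a point that falls near two buildings is exactly the case where an automated edit is most likely to tag the wrong one.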
Below is an example of how the data is conflated. It should be noted that the code overwrites casing and abbreviations to align with the OKC data. The code is subject to change to remain in line with OSM naming conventions.
for key, new_value in city_data.items():
    old_value = osm_data.get(key)
    if old_value is None or old_value.strip() != new_value.strip():
        osm_data[key] = new_value
        changed = True
Checks are included in the code to flag any ways within close proximity to prevent address duplication. These addresses are logged and will be manually added later.
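A proximity check of the kind mentioned above might look something like the following. This is a hypothetical sketch, not the project's code: the record layout (`lat`, `lon`, `housenumber`, `street`), the function names, and the 50 m threshold are assumptions made for illustration. Distances use the standard haversine formula.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS84 coordinates."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def find_nearby_duplicates(candidate: dict, existing: list, threshold_m: float = 50.0) -> list:
    """Return existing records with the same house number and street
    that lie within threshold_m of the candidate (hypothetical layout)."""

    def key(record: dict) -> tuple:
        # Compare case-insensitively and ignore surrounding whitespace.
        return (
            record["housenumber"].strip().lower(),
            record["street"].strip().lower(),
        )

    return [
        record
        for record in existing
        if key(record) == key(candidate)
        and haversine_m(
            candidate["lat"], candidate["lon"], record["lat"], record["lon"]
        ) <= threshold_m
    ]
```

Any candidate for which this returns a non-empty list would be logged rather than written, matching the manual follow-up described above.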
I agree with others that enthusiasm for adding addresses is very welcome, but large imports are usually not for beginners, so heed people’s advice carefully if you want to proceed. Before you start anything else, you really need to get that explicit permission from the city that OSM can use the data; I believe there are template emails on the wiki.
These are not the correct OSM tags. Please see Key:addr:* - OpenStreetMap Wiki
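To make the point concrete, the bare field names listed earlier in the thread correspond to the `addr:*` keys as sketched below. The key names on the right are the standard OSM address keys; the mapping table name and the helper function are hypothetical and only illustrate the renaming.

```python
# Source field name -> OSM address key.
ADDR_KEY_MAP = {
    "housenumber": "addr:housenumber",
    "street": "addr:street",
    "postcode": "addr:postcode",
    "city": "addr:city",
    "state": "addr:state",
}


def to_osm_tags(record: dict) -> dict:
    """Rename known source fields to addr:* keys, dropping empty values
    and any field that has no entry in the mapping."""
    return {
        ADDR_KEY_MAP[field]: value
        for field, value in record.items()
        if field in ADDR_KEY_MAP and value
    }
```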
Also, going off your example on the wiki page:
18101
N Western Ave
73012
Edmond
Oklahoma
I have a couple of comments: first, please note that typical OSM protocol is to expand abbreviations in street addresses, so “North Western Avenue” not “N Western Ave”. This should match the nearby street names verbatim hopefully. However, the prevailing usage for addr:state is to use the postal abbreviation, so “OK” not “Oklahoma”.
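As a rough illustration of the two conventions above, a normalisation step could look like this. It is a minimal sketch under assumptions: the lookup tables are deliberately tiny, the function names are made up, and only a leading directional and a trailing street type are expanded. Real data needs more care (for example "St" can also mean "Saint"), so the result should still be checked against the names of nearby streets.

```python
# Small illustrative tables; a real import would need fuller ones.
DIRECTIONS = {
    "N": "North", "S": "South", "E": "East", "W": "West",
    "NE": "Northeast", "NW": "Northwest", "SE": "Southeast", "SW": "Southwest",
}
STREET_TYPES = {
    "Ave": "Avenue", "St": "Street", "Blvd": "Boulevard",
    "Rd": "Road", "Dr": "Drive",
}
STATE_ABBREVIATIONS = {"Oklahoma": "OK"}


def expand_street(name: str) -> str:
    """Expand a leading directional and a trailing street type."""
    tokens = name.split()
    if len(tokens) < 2:
        return name.strip()
    first = tokens[0].rstrip(".").upper()
    if first in DIRECTIONS:
        tokens[0] = DIRECTIONS[first]
    last = tokens[-1].rstrip(".").capitalize()
    if last in STREET_TYPES:
        tokens[-1] = STREET_TYPES[last]
    return " ".join(tokens)


def abbreviate_state(state: str) -> str:
    """Map a full state name to its postal abbreviation when known."""
    return STATE_ABBREVIATIONS.get(state.strip(), state.strip())
```

For instance, `expand_street("N Western Ave")` gives `"North Western Avenue"` and `abbreviate_state("Oklahoma")` gives `"OK"`.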
I agree with the others above that you might want to try mapping a small area first as a test run before doing the whole import, as well as doing some reading about previous address imports.
These are not the correct OSM tags. Please see Key:addr:* - OpenStreetMap Wiki
My apologies. I’ll update the wiki page right away.
I have a couple of comments: first, please note that typical OSM protocol is to expand abbreviations in street addresses, so “North Western Avenue” not “N Western Ave”. This should match the nearby street names verbatim hopefully. However, the prevailing usage for addr:state is to use the postal abbreviation, so “OK” not “Oklahoma”.
Street names in OKC are written abbreviated. All official city documents, and signage reflect this. No one in OKC would refer to SW 44th in writing as Southwest 44th. I would prefer to follow the actual city data when it comes to this. I am indifferent to expanded street types, since there is no consensus on this within the city.
I would prefer to follow the actual city data when it comes to this.
It’s solidly established worldwide practice in OSM that all street abbreviations are fully expanded in street names and address records. Deviating from that standard will need a massive justification and “this government datasource uses the abbreviated version” is not enough.
See also the Utah addressing system.
Is the code you plan to use generated by an LLM?
Is it tested, and if yes what kind of testing was done?
The data there is the most accurate data available
This does not mean it is importable and good enough.
Can you share a dry run of your bot?
In other words, which edits would it perform if run today?
For example, actions like “add tag xyz to way _here_link_to_osm_object”
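A dry run in this sense could be as simple as a function that compares tags and returns a list of planned actions without touching anything. The sketch below is hypothetical (function name, message wording, and the example way ID are all made up); it reuses the whitespace-insensitive comparison from the snippet posted earlier in the thread and links each action to the object's page on openstreetmap.org.

```python
def plan_changes(way_id: int, osm_tags: dict, new_tags: dict) -> list:
    """Describe the edits that would be made to one way, without making them.

    Neither input dictionary is modified.
    """
    url = f"https://www.openstreetmap.org/way/{way_id}"
    actions = []
    for key, new_value in sorted(new_tags.items()):
        old_value = osm_tags.get(key)
        if old_value is None:
            actions.append(f"add {key}={new_value} to {url}")
        elif old_value.strip() != new_value.strip():
            actions.append(
                f"change {key} from {old_value!r} to {new_value!r} on {url}"
            )
    return actions


# Example with a made-up way ID and tags:
for line in plan_changes(
    123456,
    {"addr:city": "Edmond", "addr:state": "Oklahoma"},
    {"addr:city": "Edmond", "addr:state": "OK", "addr:housenumber": "18101"},
):
    print(line)
```

Posting output like this for a single neighborhood would let reviewers spot wrong matches before any changeset is uploaded.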
Street names in OKC are written abbreviated. All official city documents, and signage reflect this. No one in OKC would refer to SW 44th in writing as Southwest 44th. I would prefer to follow the actual city data when it comes to this.
I get where you’re coming from, as someone who also lives in a place with directions in the road names that are nearly always abbreviated in common parlance. But as someone else said above, fully expanding abbreviations in road names is a very established worldwide practice in OSM. And I’ll add that it’s not just for precedent’s sake: in general shortening full words to abbreviations is easy for a consumer (say a map renderer or router), but going the opposite direction and expanding them (Is N “north”, “new”, or just “N”?) is non-trivial. This is discussed a bit at Abbreviations - OpenStreetMap Wiki . For a more local precedent, just see the actual roadways in OKC, which have had fully expanded names for well over a decade in line with OSM consensus Way: ‪North Western Avenue‬ (‪27574172‬) | OpenStreetMap
Good tooling also already exists to help sort out mismatches between street names and addr:street values (specifically the MapWithAI plugin for JOSM has robust warnings for this). It’s been more or less a must have for imports for a while now. It’s the kind of thing you’ll be able to navigate once you read through a few approved proposals and incorporate their techniques into your plan.
It’s solidly established worldwide practice in OSM that all street abbreviations are fully expanded in street names and address records. Deviating from that standard will need a massive justification and “this government datasource uses the abbreviated version” is not enough.
All official sources, and most unofficial ones, use abbreviations for the directions (N, S, E, W…). This is reflected in all the signage. I would argue official data sources should hold higher priority than the OSM naming convention. In addition, residents of OKC and the greater metro area use abbreviations for the directions. It sounds like this is a question of the “on the ground” principle versus the OSM naming convention.
Is the code you plan to use generated by an LLM?
It is not. I did use the AI overview search thing that seems to be turned on by default with every search engine. This was only used for reference.
Is it tested, and if yes what kind of testing was done?
I wanted to do testing on the development server, but none of the ways were there. I’d like to test it out on a small area like a neighborhood or a block once we reach a consensus on the naming convention.
And I’ll add that it’s not just for precedent’s sake: in general shortening full words to abbreviations is easy for a consumer (say a map renderer or router), but going the opposite direction and expanding them (Is N “north”, “new”, or just “N”?) is non-trivial.
The problem is it’s not accurate. I started this project to make OSM the most accurate source, but it seems like y’all are more concerned with OSM precedent. The only justification you have is “because that’s how we always do it”. That decision ultimately makes future projects harder.
For a more local precedent, just see the actual roadways in OKC
You want to talk about local precedent? I live on Western, I’ll take a picture of the sign when I get the chance.
Good tooling also already exists to help sort out mismatches between street names and addr:street values (specifically the MapWithAI plugin for JOSM has robust warnings for this). It’s been more or less a must have for imports for a while now. It’s the kind of thing you’ll be able to navigate once you read through a few approved proposals and incorporate their techniques into your plan.
Thanks for the info, I’ll take a look.
@Taya_S slightly OT, but are we seeing more of these, being nice here, “editor glitches” than before? They used to be quite rare and the subject of OSM lore.
If there is a clear uptick that would warrant its own thread for discussion given they are completely unnecessary and shouldn’t be conflated with honest editing mistakes. Not to mention that it should be, when an error happens, part of the dev ethos to clean up after oneself and not burden volunteers with that.
I have noticed a pretty sudden uptick in people using their own custom editors lately. I’m still holding my breath it’s just a random coincidence, but I doubt it. These custom editors tend to be only used by one or two people at most.
The way I’m discovering these editors is usually when they start making pretty basic editing mistakes like confusing node, way, relation and changeset ids, so there are likely more out there. Another thing I’ve noticed is a recent surge in test notes, presumably caused by people experimenting with their own editor and not using the dev server.
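One cheap guard against this class of mistake is to never pass bare integers around and instead carry the element type together with the ID. The following is only a sketch of that idea; the class and method names are hypothetical, and the path format follows the OSM API v0.6 convention of `/api/0.6/<type>/<id>`.

```python
from dataclasses import dataclass

VALID_KINDS = {"node", "way", "relation"}


@dataclass(frozen=True)
class ElementRef:
    """An OSM element reference that keeps the type next to the ID."""

    kind: str
    osm_id: int

    def __post_init__(self) -> None:
        if self.kind not in VALID_KINDS:
            raise ValueError(f"unknown element type: {self.kind!r}")
        if self.osm_id <= 0:
            raise ValueError(f"element ID must be positive: {self.osm_id}")

    def api_path(self) -> str:
        """Path of this element under the OSM API v0.6."""
        return f"/api/0.6/{self.kind}/{self.osm_id}"
```

With this, a node and a way that happen to share a number are different values, so code that expects a way cannot silently be handed a node ID.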
If I had to take a guess I’d say this is all happening either due to people vibe-coding something they know nothing about or people following the advice an LLM gave them without fully understanding how OSM works and what tools are already available.
If this keeps happening I will definitely create a thread. The people making these editors probably won’t see it. Not before starting their edits at least. But having community consensus/awareness, and somewhere to point them towards when they start making mistakes is a good idea.
Also, it’s actually 4 people making mistakes with their own custom editors in the last 2 weeks. Not 3. I had forgotten about this one Changeset: 178331692 | OpenStreetMap
This seems to have been missed, so I will ask again:
Can you share a dry run of your bot?
In other words, which edits would it perform if run today?
For example, actions like “add tag xyz to way _here_link_to_osm_object”
This is necessary to avoid mistakes of this variety