Duplicated addresses in Trinidad & Tobago

Hello everybody,

I was looking at duplicated notes and came across some in Trinidad. That’s how I noticed that there is an unbelievable number of duplicated address nodes across the whole country. Zoom in anywhere and you will see the house numbers in multiples.

For example, the same housenumber 109 has been added at least seven times across two changesets:
Changeset: 8456180 | OpenStreetMap
Changeset: 8457691 | OpenStreetMap
These imports all seem to have been done by @pdunn, who last edited 4 years ago.

All of the import changesets that I could find, in chronological order (it’s very possible that I have missed some):
Changeset: 8390578 | OpenStreetMap
Changeset: 8393223 | OpenStreetMap
Changeset: 8393233 | OpenStreetMap
Changeset: 8393357 | OpenStreetMap
Changeset: 8393538 | OpenStreetMap
Changeset: 8393554 | OpenStreetMap
Changeset: 8393556 | OpenStreetMap
Changeset: 8393560 | OpenStreetMap
Changeset: 8393562 | OpenStreetMap
Changeset: 8393673 | OpenStreetMap
Changeset: 8393676 | OpenStreetMap
Changeset: 8393678 | OpenStreetMap
Changeset: 8393685 | OpenStreetMap
Changeset: 8393690 | OpenStreetMap

Changeset: 8456163 | OpenStreetMap
Changeset: 8456170 | OpenStreetMap
Changeset: 8456180 | OpenStreetMap
Changeset: 8456184 | OpenStreetMap
Changeset: 8456193 | OpenStreetMap
Changeset: 8456199 | OpenStreetMap
Changeset: 8456204 | OpenStreetMap
Changeset: 8456210 | OpenStreetMap
Changeset: 8456220 | OpenStreetMap
Changeset: 8456238 | OpenStreetMap
Changeset: 8456259 | OpenStreetMap
Changeset: 8456295 | OpenStreetMap
Changeset: 8456330 | OpenStreetMap
Changeset: 8457514 | OpenStreetMap
Changeset: 8457521 | OpenStreetMap
Changeset: 8457524 | OpenStreetMap
Changeset: 8457536 | OpenStreetMap
Changeset: 8457547 | OpenStreetMap
Changeset: 8457602 | OpenStreetMap
Changeset: 8457633 | OpenStreetMap
Changeset: 8457691 | OpenStreetMap

It looks like the changes were made on only 3 days in June 2011, with a break of a few days in between. The first changeset alone has 24947 nodes :dizzy_face:

There are no sources given for the address data.

What should we do with all of that?

Is there a real mapping community in Trinidad and Tobago?
It does not look like there are many active mappers there: OsmStats for Trinidad and Tobago

In my opinion, the multiplicates are very annoying at the least. I don’t know if the addresses are even accurate enough to be of any use for navigation.
Would removing all of them be worse than the current situation, given that nobody seems to be on the ground to add new address data? Could we merge duplicates in an automated way? (probably not a good idea!)

Looking at this area (Node History: 1327467563 | OpenStreetMap), I really think removing all of that data would make the map better. There are at least nine nodes with addr:housenumber=14 within that 20m radius, but they have different street names and even different city names. In addition, there are about 30 nodes in the same view with different addr:city and addr:street names that don’t have an addr:housenumber :scream:

Of course, it’s also very possible that the data is not compatible with our licence – but how could one find out after so many years? There does not seem to be anything about the imports on the wiki.

Reverting the changes would be a challenge. Most of the nodes have been changed at least once by xybot or by some other mechanical edit. Some seem to have been fixed by locals. Is there a tool that lets you revert all nodes that have only been changed by certain users or changesets?
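To make the question concrete, the selection rule could be sketched roughly as below. This is only an illustration and not an existing tool: the history layout (one list of versions per node, each with `user` and `changeset`), the function names, the user `local_mapper` and the changeset numbers 9000001 and 9000002 are all made up for the example. Only changeset 8456180 and the users pdunn and xybot come from the thread.

```python
def only_touched_by(history, allowed_users, allowed_changesets):
    """True if every version of a node was made by an allowed user or changeset."""
    return all(
        v["user"] in allowed_users or v["changeset"] in allowed_changesets
        for v in history
    )


def revert_candidates(histories, allowed_users, allowed_changesets):
    """histories: dict mapping node id -> list of version dicts (hypothetical layout)."""
    return sorted(
        node_id
        for node_id, history in histories.items()
        if history and only_touched_by(history, allowed_users, allowed_changesets)
    )


# Made-up sample histories, for illustration only.
histories = {
    1: [
        {"version": 1, "user": "pdunn", "changeset": 8456180},
        {"version": 2, "user": "xybot", "changeset": 9000001},
    ],
    2: [
        {"version": 1, "user": "pdunn", "changeset": 8456180},
        {"version": 2, "user": "local_mapper", "changeset": 9000002},
    ],
}

# Node 2 was later edited by a regular mapper, so only node 1 qualifies.
print(revert_candidates(histories, {"xybot"}, {8456180}))
```

Any node with even one version outside the allowed users and changesets is left alone, which is the conservative choice when some nodes have been fixed by locals.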

(PS) I did send a message to pdunn asking them to join, hoping they can still read their OSM emails.


Hey @daganzdaanda

I used the delete duplicate attributes tool in QGIS to create a reference layer, then loaded that layer into JOSM and used the conflation tool to try to find and delete the duplicates. I ended up with 71,000 duplicate addresses removed.
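For anyone who wants to reproduce the idea without QGIS, a minimal sketch of attribute-based de-duplication could look like this. It is not the QGIS or JOSM implementation; the node layout, the function name `split_duplicates` and the sample street and city names are assumptions made for illustration. It also ignores coordinates, so a real cleanup would probably need a distance check as well.

```python
def split_duplicates(nodes, keys=("addr:housenumber", "addr:street", "addr:city")):
    """Split node ids into (keep, duplicates) by comparing the given address tags.

    The node with the lowest id in each group of identical tag values is kept.
    """
    seen = set()
    keep, duplicates = [], []
    for node in sorted(nodes, key=lambda n: n["id"]):
        signature = tuple(node["tags"].get(k) for k in keys)
        if signature in seen:
            duplicates.append(node["id"])
        else:
            seen.add(signature)
            keep.append(node["id"])
    return keep, duplicates


# Made-up sample nodes, for illustration only.
nodes = [
    {"id": 2, "tags": {"addr:housenumber": "109", "addr:street": "Main Road", "addr:city": "Sample Town"}},
    {"id": 1, "tags": {"addr:housenumber": "109", "addr:street": "Main Road", "addr:city": "Sample Town"}},
    {"id": 3, "tags": {"addr:housenumber": "109", "addr:street": "Other Street", "addr:city": "Sample Town"}},
]

keep, duplicates = split_duplicates(nodes)
print(keep, duplicates)
```

Nodes that share a housenumber but differ in street or city are treated as distinct here, which matches the observation earlier in the thread that many of the stacked nodes carry different street and city names.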

However, even after deleting the duplicates, the remaining addresses look very inaccurate and are likely pretty useless.

I can remove all the addresses added by this user, as long as it is agreed that this would be the correct thing to do.

–SherbetS


Thanks for doing the calculations. It’s crazy how many duplicates there are…

I agree that the data seems to be not very useful at all.
It may be that local mappers could even be deterred from cleaning up address data, since it may seem overwhelming.

Still, I hope we can get some more opinions, ideally from someone local. I will write to users joshuarshah and SRASC who look like the most active local mappers according to OsmStats.


I have had responses from the two local mappers, @joshuarshah and @SRASC.
Joshua said:

I think it’s best we leave it as is. Whoever imported that data created a mess. If you could find an automated way as mentioned above to fix the issue that would be great but I haven’t seen any tools like that in OSM before.

I am wondering if it is realistically possible to leave all the nodes that have been changed by a “regular” mapper, but to delete all the rest?

The majority of nodes seem to have only been touched by @xybot shortly after the initial imports. Could we delete all of the nodes which were added in one of the imports and were edited last by either @pdunn or @xybot? That should be safe for the beginning, I guess.

…of course, after a clean-up like this, there will be huge parts of the country without any address nodes. So it’s fixing the issue of bad data by having no data :frowning:
Hopefully this will motivate more locals to add addresses from their areas.

On the other hand, the original dataset must have come from some place. @SRASC suggested it could have been from the Local Government Ministry, the Housing Ministry or TTPost. I’m kind of tempted to email these and ask if they have this address data and whether OSM could get permission to use it. And then we could make a more careful import.

I see that user @wkc had already asked about the same issue in the old forum in July 2012:

Hi,

I am wondering if it is realistically possible to leave all the nodes that have been changed by a “regular” mapper, but to delete all the rest?

The majority of nodes seem to have only been touched by @xybot shortly after the initial imports. Could we delete all of the nodes which were added in one of the imports and were edited last by either @pdunn or @xybot? That should be safe for the beginning, I guess.

This is possible.

…of course, after a clean-up like this, there will be huge parts of the country without any address nodes. So it’s fixing the issue of bad data by having no data :frowning:

While true, my impression of the data is that it’s inaccurate to such a degree that it is completely unusable in many cases.

I can move forward with the plan to remove all the untouched address points, given the local community agrees.

–SherbetS


The default action of the Perl revert scripts is to “ignore changes made subsequently to the changes being reverted”. If you threw the xybot changesets into the mix as well you might get the desired result, but so much time has passed that I suspect that there’s been lots of other editing, and therefore this might not be practical.

Does this count as approval? If so I’m happy to start work.

Actually, I haven’t found much editing of the nodes at all. The nodes that still exist seem to be mostly at version 2, after import and xybot. An action like “delete all the nodes that are v2 and have xybot as last editor” would probably clean up the vast majority already. Then we could have a look at what’s left to see if a second round is necessary.
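As a rough sketch of that rule, one could read an OSM XML extract and list the address nodes that are at version 2 with xybot as last editor. The `version`, `user` and `changeset` attributes and the `tag` children are part of the standard OSM XML format; everything else here (the sample data, the user `local_mapper`, the changeset numbers and the function name `deletion_candidates`) is made up for illustration. Note that this does not check whether version 1 came from one of the import changesets, so it would only be a first filter.

```python
import xml.etree.ElementTree as ET

# Made-up sample extract, for illustration only.
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" version="2" user="xybot" changeset="111" lat="10.65" lon="-61.50">
    <tag k="addr:housenumber" v="14"/>
  </node>
  <node id="2" version="3" user="local_mapper" changeset="222" lat="10.65" lon="-61.50">
    <tag k="addr:housenumber" v="14"/>
  </node>
  <node id="3" version="2" user="xybot" changeset="333" lat="10.66" lon="-61.51">
    <tag k="highway" v="crossing"/>
  </node>
</osm>"""


def deletion_candidates(osm_xml, last_editor="xybot", version=2):
    """Return ids of address nodes at the given version whose last editor matches."""
    root = ET.fromstring(osm_xml)
    ids = []
    for node in root.iter("node"):
        # Only consider nodes carrying at least one addr:* tag.
        has_addr = any(
            tag.get("k", "").startswith("addr:") for tag in node.findall("tag")
        )
        if (
            has_addr
            and node.get("user") == last_editor
            and int(node.get("version")) == version
        ):
            ids.append(int(node.get("id")))
    return ids


# Node 2 was edited by a regular mapper and node 3 has no address tags.
print(deletion_candidates(SAMPLE_OSM))
```

Whatever is left after such a first pass could then be reviewed by hand to decide whether a second round is needed.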

I don’t know… it’s not a very strong approval and only one person…
The other mapper who responded by mail wasn’t happy about the address data quality, but did not say anything about removing them.

Of course, it looks like the local community really is not very big at the moment. I’ve also contacted @wkc who did some cleanup of addresses years ago, but haven’t had a reply. Also none from @pdunn, but they haven’t been active for a long time.