Improvement proposal

The POI classification system is bad and needs to be redesigned.
Use case: I want to buy a specific chemical and find nearby companies selling it. I don’t want to find all the companies selling all chemicals, only the ones selling the chemical I need.

Problem: Today we can’t target POIs by the exact good or service they provide.

The solution is to throw away the old classification framework and write a new one with the following design.

1 We should not use key-value pairs. Instead we should use JSON documents. We map old field names to new JSON paths in the obvious way:

	addr:street=Purple
	addr:housenumber=13

will be converted to

	"addr" : {
		"street" : "Purple",
		"housenumber" : 13
	}

2 All metadata is stored in a DB. My proposal assumes MongoDB (since it works with JSON; in fact BSON, but nobody cares), but it can be re-engineered for any other DB.
3 Types: the standard JSON data types plus an enum data type. An enum value is a unique integer ID referring to the record with that ID.
4 Classifications should have the enum data type.
5 We introduce a database of goods and services. It is hierarchical: a tree of objects. Every node of this tree must have an “_id” property, which is used as the enum ID.
The example is

	"good" : {
		"chemical" : {
			"isopropyl alcohol" : {
				"_id" : 1234,
				"CAS" : "67-63-0",
				"IUPAC" : "Propan-2-ol",
				"pure substance" : true
			}
		},
		"food" : {...},
		"device" : {...}
	},
	"service" : {
		"accommodation" : {...},
		"healthcare" : {...},
		"technology" : {
			"manufacturing" : {
				"cutting" : {
					"laser" : {...},
					"hydro" : {...},
					"plasma" : {...},
					"grinder" : {...}
				},
				"positive manufacturing" : {
					"3d printing" : {...}
				}
			},
			"repairing and installation" : {
				"devices" : {
					"electronic" : {
						"TV" : {...},
						"PC" : {...}
					},
					"mechanic" : {
						"transport" : {
							"car" : {...}
						},
						"refrigerators" : {...}
					}
				}
			}
		},
		"recreational" : {
			"show" : {
				"sport" : {...},
				"theatre" : {...},
				"exhibition" : {
					"gallery" : {
						"picture" : {...}
					}
				}
			},
			"activity" : {
				"physical" : {...}
			}
		},
		"legal advice" : {...},
		"insurance" : {
			"movable" : {
				"transport" : {
					"car" : {...}
				}
			},
			"immovable" : {...},
			"personal" : {
				"life" : {...}
			}
		},
		"criminal&government" : {
			"force" : {
				"torture" : {
					"beating" : {...}
				},
				"killing" : {...},
				"kidnapping" : {...},
				"extortion" : {...},
				"robbery" : {...},
				"espionage" : {...},
				"war" : {...}
			},
			"bribery" : {...}
		}
	}

This was a JSON representation of the database structure; the actual mapping to DB structures depends on the chosen technology stack. For example, it can be a table in an SQL DB (such as PostgreSQL) where each row represents one object and the fields are the ID (primary key), the list of parents, and a JSON document with the rest of the fields.
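
The row-per-node mapping described above could look roughly like the sketch below. It is only an illustration under assumptions: the `flatten` function and the `name` column are invented here, a child node is taken to be any object-valued property that carries an `_id`, and the ids 1 and 10 in the sample tree are made up (only 1234 comes from the example above).

```javascript
// Sketch: turn the classifier tree into flat rows (id, parents, fields).
// Assumption: a child node is any object-valued property that has "_id";
// everything else is treated as a plain field of the node.
function flatten(tree) {
	const rows = [];
	function visit(name, node, parents) {
		const fields = {};
		const children = [];
		for (const [key, value] of Object.entries(node)) {
			if (key === "_id") continue;
			if (value !== null && typeof value === "object" && "_id" in value) {
				children.push([key, value]);
			} else {
				fields[key] = value;
			}
		}
		// One row per node: primary key, chain of ancestor ids, remaining fields.
		rows.push({ id: node._id, name: name, parents: parents, fields: fields });
		for (const [key, child] of children) {
			visit(key, child, parents.concat(node._id));
		}
	}
	for (const [key, node] of Object.entries(tree)) visit(key, node, []);
	return rows;
}

// Hypothetical sample: ids 1 and 10 are invented for the example.
const tree = {
	"good": {
		"_id": 1,
		"chemical": {
			"_id": 10,
			"isopropyl alcohol": { "_id": 1234, "CAS": "67-63-0", "pure substance": true }
		}
	}
};
const rows = flatten(tree);
console.log(JSON.stringify(rows, null, "\t"));
```

Each row could then be inserted into a table with an integer primary key, an array column for the parents and a JSON column for the fields.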

6 We throw away “amenity”, “shop”, “office” and other similar fields (I call them “amenity-like”) and introduce a new field, “product”. It contains a mapping from the IDs of the goods and services provided by an org to additional info.
The example is

	"product" : {
		"1234" : {
			"assay" : 0.9
		}
	}

It means that the org sells a product with id 1234 (isopropyl alcohol) and its assay is 0.9.
7 Software should map hierarchical paths typed in the GUI to IDs and show tooltips. IDs should be used by software internally.
8 Disadvantages: it is not compatible with old OSM software. But JSON parsers are widespread, so this should not be much of a problem.
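
Points 6 and 7 together could work roughly as in the following sketch: a path typed by the user is resolved to an ID, and the ID is then used to filter POIs by their “product” field. The function names, the "/" path separator, the POI names and the ids other than 1234 are assumptions made for the illustration.

```javascript
// Sketch: resolve a path typed in the GUI to a classifier id, then use
// the id to filter POIs by their "product" field.
// Assumptions: "/" separates path parts and every node has "_id".
const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

function resolvePath(root, path) {
	let node = root;
	for (const part of path.split("/")) {
		if (node === null || typeof node !== "object" || !has(node, part)) {
			return undefined; // unknown path
		}
		node = node[part];
	}
	return node !== null && typeof node === "object" ? node._id : undefined;
}

function findSellers(pois, id) {
	// Keys of "product" are ids stored as strings, as in the example above.
	return pois.filter(poi => poi.product && has(poi.product, String(id)));
}

// Hypothetical data: only id 1234 is taken from the proposal.
const root = {
	"good": {
		"_id": 1,
		"chemical": { "_id": 10, "isopropyl alcohol": { "_id": 1234 } }
	}
};
const pois = [
	{ name: "Lab supply", product: { "1234": { assay: 0.9 } } },
	{ name: "Bakery", product: { "42": {} } }
];

const id = resolvePath(root, "good/chemical/isopropyl alcohol");
console.log(id, findSellers(pois, id).map(poi => poi.name));
```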

OSM is not a trade directory.

I’ll agree with hadw, this information does not belong in OSM.

Since you are not the first to ask how more detailed product information can be added, I think it would be better to create a separate open data project with a data structure that is better suited to such information. This information can contain lat/long so it can be linked to OSM data. The final end-user application can then use both open datasets.

What you suggest is that I create my own POI directory project competing with OSM. Nobody will use it, because OSM has its own POIs and OSM itself is used by only a very small part of the population. So: small probability of using OSM * small probability of using anything non-official and non-default while using OSM, including a third-party POI directory (other maps have their own POI directories and don’t support third-party ones) = negligible probability of using a non-official POI directory. Such a project just isn’t worth making.

OSM is a GIS. A POI directory is part of a GIS. If you wanted purity, why do you store anything except name and coordinates in your POI directory, for example type (amenity, shop, craft, etc.), phone, website and opening_hours, which already makes it a trade directory?

So I see no reason why not to implement the proposal. BTW, it can be OSM’s killer feature (I’m not aware that any website with maps has such a feature).

Agree. Why not implement it.

Just two suggestions.

  1. Provide a proof-of-concept implementation

  2. Change your OSM account name to something more meaningful. Imagine how people would refer to you in a Skype call.

If you are going to implement a new database, consider expanding this to offices and services as well, e.g.:

office=lawyer --> specialist solicitor
office=insurance --> car insurance

What do you mean by PoC? Do you expect me to implement this feature or what?

Sorry, this won’t be done.

root.service["legal advice"]

Do you mean this (I use JavaScript syntax since we are dealing with JSON, root is the root object of the classifiers tree)?

See previous discussion on the “sells” or “product” or … tag on the tagging mailing list.

IMHO this requires that you upload the product catalogue with all Product SKUs and stock, list and sales on a daily basis for every POI.

There are lots of kinds of businesses that do not change their set of goods and services for years. The goal of this directory is not to provide an extremely accurate realtime list of goods in stock (though that would be possible if an org implemented an API for this and registered its URI as a field) but to make it possible to search by a good that a company sells over a long timespan. Think of it as a more informative replacement for amenity-like fields, with a bit of duck-typing philosophy.

PoC is proof of concept and, yes, it does mean that you are expected to provide code.

Your data structure doesn’t seem to fit at all with the free tagging structure used by OSM.

Coverage of businesses is very patchy even now, with most not yet mapped and many that have gone out of business or moved. Even someone with the resources of Google has very patchy information on where goods can be obtained, especially when they are not popular consumer goods, and most businesses are only really interested in registering a hyped up sales pitch.

I think it is going to be impossible to maintain this data in a useful form using only volunteer effort.

People should not be using the OSM servers for doing business searches; there is no way of funding the resources needed for a full consumer service; the servers are there for mappers. It is up to people providing servers for such use to integrate multiple data sources.

It is better than nothing.

It doesn’t: your structure is flat (key-value pairs) and only emulates a tree with prefixes like “addr:”, while I propose a true tree-like structure such as JSON objects. What they have in common is that you will still be able to create any properties you like (except reserved ones like “_id”); with a document-oriented DB there will be no rigid schema. So I propose to upgrade the clients, upgrade the structure and migrate the data.

In fact OSM has more reliable data than Google sometimes. I recently went to an org shown on Google Maps only to find there is no such an org in that place.

In fact, it should be possible. As you see, the structure of goods is hierarchical. This means that you are free to lower the level of detail when you don’t have enough data. If you don’t know which kind of bread is sold in a supermarket, you can mark it just as “bread” and this should be OK. I have fixed the spec so that every node, not only the leaves, must now have an ID.

People should not be using the OSM servers for doing business searches; there is no way of funding the resources needed for a full consumer service; the servers are there for mappers. It is up to people providing servers for such use to integrate multiple data sources.

It is possible to implement this in a way that won’t impact performance much. For example, you can store the full path to the leaf in the POI object as an optimization, so you won’t need to traverse the tree each time a user makes a search.
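
A minimal sketch of that optimization, combined with the “mark it just as bread” idea: each product entry carries the chain of IDs from the root down to the classified node, so a search for a broader category is a simple membership test. The `path` field name, the function name and all ids except 1234 are assumptions for the example.

```javascript
// Sketch: each product entry stores the full chain of ids from the root
// to the classified node (hypothetical "path" field), so a search for a
// broader category needs no tree traversal.
function sellsCategory(poi, categoryId) {
	return Object.values(poi.product || {}).some(
		entry => Array.isArray(entry.path) && entry.path.includes(categoryId)
	);
}

// Invented ids: 200 stands for "bread", 2001 for one specific kind of bread.
const pois = [
	{ name: "Supermarket", product: { "200": { path: [2, 20, 200] } } },
	{ name: "Bakery", product: { "2001": { path: [2, 20, 200, 2001] } } },
	{ name: "Lab supply", product: { "1234": { path: [1, 10, 1234], assay: 0.9 } } }
];

const breadSellers = pois.filter(poi => sellsCategory(poi, 200)).map(poi => poi.name);
console.log(breadSellers); // [ 'Supermarket', 'Bakery' ]
```

The supermarket tagged only as “bread” and the bakery tagged with a specific kind of bread are both found by one query for the broader category.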

Don’t expect it soon, I have no experience with OSM deployment and I’m a bit busy now.

So you want someone else to convert the current database structure and all APIs to JSON as well as all data consumers to this new API & data structure ? This is not going to happen anytime soon I think.

OSM is a do-ocracy, which means that when you want to change or define something you have to do it yourself. So if you do not have the time, no one will do it.

I still do not understand what type of queries you want to be able to perform on products and which queries you will not be able to solve. Can I search for stores that sell apples (the fruit) ? Granny Smith Apples ?
Can I search for stores selling smartphone with a screen of 6 inch and 64 Gb memory for under 400$ ?
Can I search for shops selling white sand in sacks of 25 kg ?
Can I search for a printing shop where I can order 250 printed business cards with my own logo in colour with a gold spot colour ?

In general can I search on categories of products or specific properties or brand or model ?

What is the average number of products / categories you think you have to add to

  • a grocery
  • a Smartphone shop
  • a supermarket
  • a do-it-yourself shop

to make your application useful ?

Initial migration of data structure should be done automatically by a script. It’s easy, something like:

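function migrate(objects){
	return objects.map(function(o){
		let newObj=Object.create(null);
		for(let field in o){
			let path=field.split(":"); // "addr:street" -> ["addr", "street"]
			let curObj=newObj;
			for(let i=0;i<path.length-1;++i){
				let p = path[i];
				if(!(p in curObj)) curObj[p]=Object.create(null); // create missing levels
				curObj=curObj[p];
			}
			curObj[path[path.length-1]]=o[field]; // the last path part holds the value
		}
		return newObj;
	});
}
var objs=[
	{"a:b:c:D":"d", "a:b:c:f":"F", "a:b:g":"G", "a:h":"H"},
	{"addr:housenumber":"15", "addr:street":"Liteynaya"}
];
console.log(JSON.stringify(migrate(objs), null, "\t"));

Output:

[
	{
		"a": {
			"b": {
				"c": {
					"D": "d",
					"f": "F"
				},
				"g": "G"
			},
			"h": "H"
		}
	},
	{
		"addr": {
			"housenumber": "15",
			"street": "Liteynaya"
		}
	}
]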

Data consumers are the hardest part. But you can give them some time by announcing the plans in advance. It is also possible to write a tool that converts back to the old data structure, with some information loss, for those who haven’t migrated yet. But migration should be easy: all they need to do is use some library, modify the UI a bit and adopt the new good-based classification in their app.
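
The backward converter mentioned above could be as small as the following sketch. It assumes the old keys are simply the nested keys joined with ":", and `flattenTags` is a name invented here. Non-string values are stringified, which is one place where information is lost; the new “product” field has no old-style equivalent and would need separate handling.

```javascript
// Sketch of the backward converter: nested objects become flat
// "prefix:key" tags again. Non-string values (numbers, booleans, arrays)
// are stringified, which is where information can be lost.
function flattenTags(obj, prefix = "", out = {}) {
	for (const [key, value] of Object.entries(obj)) {
		const name = prefix ? prefix + ":" + key : key;
		if (value !== null && typeof value === "object" && !Array.isArray(value)) {
			flattenTags(value, name, out); // descend into nested objects
		} else {
			out[name] = String(value);
		}
	}
	return out;
}

const oldStyle = flattenTags({ addr: { street: "Liteynaya", housenumber: 15 }, name: "Example" });
console.log(oldStyle);
```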

The depth of the tree of product types is limited by common sense; we surely don’t want a tree thousands of levels deep.

No. This info (exact makes, exact models, etc.) changes too often and will surely become obsolete. But if a shop specializes in selling goods of a certain make, for example Apple’s reStore, the make can be put in an additional field, but not into the “good” field.

I guess, yes. This info should be put into additional data about the good.

You will be able to search for shops printing just business cards, then you should phone them all and ask if they can do what you need for you.

People will add them when they add an org selling a good for which they cannot find a classifier. There should also be a tool for merging two different classifications of the same good into one (history must be preserved in order to be able to revert vandalism). Merging works as follows: a new category is created, and the old ones get a special field indicating that they should be redirected to the category whose _id is stored in that field.
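
The redirect scheme described above could be resolved as in this sketch. The field name `redirect`, the function name and all ids and names in the sample data are assumptions; the loop check is an extra safety measure, not part of the proposal.

```javascript
// Sketch: follow the redirect fields left behind by merges until the
// current category is reached. "redirect" is a hypothetical field name.
function resolveId(categories, id) {
	const seen = new Set();
	while (categories.has(id) && categories.get(id).redirect !== undefined) {
		if (seen.has(id)) throw new Error("redirect loop at id " + id);
		seen.add(id);
		id = categories.get(id).redirect;
	}
	return id;
}

// Two duplicate entries (invented ids 501 and 502) merged into a new one, 600.
const categories = new Map([
	[501, { name: "isopropanol", redirect: 600 }],
	[502, { name: "2-propanol", redirect: 600 }],
	[600, { name: "isopropyl alcohol" }]
]);

console.log(resolveId(categories, 501), resolveId(categories, 502), resolveId(categories, 600));
```

POIs that still reference an old ID keep working, because lookups go through the redirect, and the old records stay in place so a bad merge can be reverted.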

No, editors. One should make sure that old editors do not break the new data model, from day one.

Anyway, I suggest you first learn how the OSM community works before starting to work on such a project of converting data. This forum is also not the best way to discuss this. There are a lot of developer centric communication channels, a forum here, a mailing list (which is the best way to get in touch with other devs afaik), a josm-dev mailing list, irc, etc.

JOSM has a validator tool. It is also possible to add a validator on the server for some time (say a couple of months) that returns an error when anyone tries to upload something that looks like the old model. No automatic fixes here; everyone (including apps) must become familiar with the new model in order to edit.
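
A server-side check of that kind might look like the sketch below. It does not use any real OSM or JOSM API; the function names, the list of amenity-like keys and the rule that any key containing ":" counts as old-style are assumptions for the illustration.

```javascript
// Sketch of a server-side check for uploads that still use the old model.
// Assumptions: the list of amenity-like keys is illustrative, and any key
// containing ":" is treated as an old-style prefixed tag.
const AMENITY_LIKE = ["amenity", "shop", "office", "craft"];

function findOldModelKeys(poi) {
	return Object.keys(poi).filter(
		key => key.includes(":") || AMENITY_LIKE.includes(key)
	);
}

function validateUpload(poi) {
	const bad = findOldModelKeys(poi);
	if (bad.length > 0) {
		// No automatic fix: the upload is rejected with an explanation.
		return { ok: false, error: "old data model keys: " + bad.join(", ") };
	}
	return { ok: true };
}

const oldUpload = validateUpload({ "shop": "chemist", "addr:street": "Purple" });
const newUpload = validateUpload({ addr: { street: "Purple" }, product: { "1234": { assay: 0.9 } } });
console.log(oldUpload, newUpload);
```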

not quite alive

inconvenient: you get every aspect even the ones you are not interested in into your mailbox.

not very convenient - sometimes you have to wait online for hours for somebody to see your message

I don’t think they will be happy with the fact they will have to fix everything touching the data model.

That pretty much says it all. Also fully agree with Escada.
And can’t help suspecting that the big ‘silent majority’ who didn’t say a word to this yet sees it the same way - it’s so completely outlandish, not even worth wasting time discussing.

This would be a GIANT project. And OpenstreetMAP does have as main target MAPS, which is already a GIANT project in itself. Dropping both together into the same pot doesn’t make sense to me, it’s different things, and most of all, data size getting way too big. Even now, the current size of offline maps is already staggering for many people and their mobile phones. And I wouldn’t be surprised if such a trade directory would even be ten or a hundred times bigger than some street nodes and building outlines.

Please, keep maps as maps, and trade directories as trade directories!

Besides, I don’t believe either such a directory would be possible with any reasonable accuracy by volunteers. On one hand due to the incredible mass of data alone, half the stuff would always be wrong and if a shop database is so unreliable it’s useless anyway. And on the other, when I see how much nonsense gets mapped even today it makes my hair stand on edge. And third, if shop competition kicks in and MONEY can be earned, they’d manipulate like crazy, as was already mentioned too.

Sure to a certain extent it’s great if the maps contain some additional details which need not always be 100% strictly map related, as long as they add useful information, but please without blowing up the database so extremely.
And what’s next? Once crossed that line, you can then toss the whole world into it, there will be no limit anymore. Which offices offer which services in all details, health stuff and doctor and equipment details, public transport up to the tiniest details and all prices and seat class etc., and just about every bit of information on the planet.
That may not be your vision, but if OSM would not be “map centered” anymore, why should it stay “map+trade centered” ever after?? Taking away its definition is like opening a flood gate.

As much as I’d really love to have access to such a detailed trade directory, and do hope that OSM will contain more shop details too (to a certain extent), but such a GIANT super detailed trade directory is just plain not the job of OSM. And it couldn’t work in this organisation anyway.

Instead I’m sure there’s some much better way of crosslinking separate databases and apps in a flexible way, with OSM just providing what’s its job - showing the streets and positions of the shops.

I completely agree.

See no reason not to make it even bigger.

What is the problem with it? Do you lack data storage? If you do, consider using IPFS for storing “checkpoints” (large immutable blobs generated periodically; recent changes are stored as the difference between the latest checkpoint and the actual state). Anyway, data storage is cheap today. How large is the OSM POI database? I don’t think it will grow more than 10 times (if we store the info in an optimized way, not as text) even in the best case (by which I mean every shop adding a full description of its goods), and surely that won’t happen within 2 years, since we don’t even have a full set of ordinary POIs, only a small subset. I suppose that in future storage will be much cheaper. And again, consider using IPFS.

People usually need only the cities they live in and travel to. They don’t need the whole world. The map of Moscow (full, with all POIs) on my phone takes 100 MiB, the map of Berlin is 50 MiB, the map of London is 50 MiB, the map of New York is 165 MiB, and the maps of the states of the USA and Canada with all their cities range from hundreds of MiB to several GiB. It’s tolerable (though for states it may be better to have maps on a per-city basis). In future there will be more storage on phones, and the database won’t grow to that size instantly, so IMHO it’s not a problem.

Any accuracy above that we have now will be better.

It should be fixed then. I mean, OSM is used by OSM-aware folks; these folks understand how OSM works, so I expect them to spend a second and fix a POI with incorrect info.

Why don’t they do it (remove competitors from maps for example) now?

It would be completely awesome if this happened.

Anything that has not been tried never works. So I think the map software should be modified to open up this possibility; whether it really works, we’ll see. The proposed additions are useful even for a map.

Crosslinking is good, but again, there are troubles.
1 Nobody will use a DB of POIs which is not shipped and enabled by default.
2 Making a separate POI project means additional expenses on DB synchronisation.
I insist on implementing the proposal in the existing official OSM POI project.

So you won’t discuss this topic outside this thread, which is read by only a handful of people, none of them really involved in the development of the server and editors (though they have been aware of what is going on in OSM for several years), but you insist that your dream is implemented ? Thereby changing the whole infrastructure of OSM, which was built up over more than 10 years ?

Please read more about OSM, on data migration and planning such projects in general, try to find other people to back up your proposal and that are capable to implement it, and if you come to a different conclusion than the other people here, you could start your project. But I sincerely doubt you will be successful.

Really, you have some wild dream, have no time to do anything about it (you only gave a script which I doubt will work), you barely know OSM (I think) but others have to immediately drop everything and implement it for you ? That is not the OSM spirit.

IMHO you only have some vague idea of what you want to achieve, you came up with a simple example of 1 type of product, rejected all examples that do not fit your idea and complain that people do not want to follow you. Isn’t that weird ?

Back when this thread was created I wondered aloud on IRC whether it was “serious, trolling or a parody”. Recent posts would suggest that it’s actually the second of those three :)