I think OSM's search system needs to be improved

(First of all, please understand that English is not my native language.)
Hi.
I wish the OSM search system could be improved overall.
In particular, I think search should treat names that differ only in dots or spaces as the same. (e.g. USA / U.S.A, CIA / C.I.A, … and in Korean, 한국 박물관 (with a space) / 한국박물관 (without a space))
It would also be nice if search could tolerate typos and still show results, but I’m hesitant to ask for this because I think it would require more advanced techniques.
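To make the dots-and-gaps idea concrete, here is a minimal sketch of one way to compare names while ignoring punctuation, spaces, and case. This is only an illustration: `search_key` is a hypothetical helper and not how Nominatim or any OSM search actually works.

```python
import unicodedata


def search_key(text: str) -> str:
    """Build a comparison key that ignores case, whitespace, and punctuation.

    Hypothetical helper for illustration; not part of any OSM search code.
    """
    # NFKC folds compatibility forms (e.g. full-width letters) into one form.
    normalized = unicodedata.normalize("NFKC", text).casefold()
    # Drop whitespace and every character in a Unicode punctuation category.
    return "".join(
        ch
        for ch in normalized
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    )


# Two names are treated as the same search term when their keys are equal.
print(search_key("U.S.A") == search_key("USA"))
print(search_key("한국 박물관") == search_key("한국박물관"))
```

A key like this merges names aggressively, so a real search would probably use it only as a fallback or ranking signal rather than as the sole match rule.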
I’m curious what the developers and other language speakers think about this.
Thank you.

Search results for 'K.F.C' in OSM

If you need a translation, please click :globe_with_meridians: button below the text. ↓

@lonvia might be able to comment on that (if the username here matches the username on GitHub).

Probably related issue:

It would definitely be welcome and very useful, but sadly it is very complex to improve. Help is welcome, but most tasks require an experienced programmer.

@Mateusz_Konieczny, Thanks for your comment.

It’s really sad news that it’s complex to improve.
This small difference is quite important for improving user experience, and if users don’t get the search results they want, they will use OSM less.

Personally, I would consider Google Maps’ ability to recognize mangled addresses, names, and locations to be the most amazing part of that platform.

And it is quite hard to match it.

Mappers can also help: see the Nominatim QA tool (documented on the OpenStreetMap Wiki), which lists problems that make geocoding harder. (Nominatim is one of the search systems that use OSM data; fixing the problems listed there will help in general.)

Note that, as usual, you need to use your brain when processing QA tool reports.

In general, decomposition is a difficult problem for languages that use few spaces (like German) and those that don’t only use spaces as word boundaries (like Vietnamese). However, Nominatim could implement some heuristics to improve geocoding of Korean queries when spaces are omitted. In particular, it could assume that certain characters, like 도, appear at the end of a place name.
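As a rough sketch of that kind of heuristic, the snippet below splits a Korean query after characters that commonly end an administrative name. Everything here is an assumption for illustration: only 도 comes from the comment above, the other suffixes (시, 군, 구) are added as examples, and `split_korean_place` is a hypothetical function, not Nominatim code.

```python
# Characters assumed to mark the end of an administrative place name.
# Only 도 is mentioned in the thread; 시, 군 and 구 are illustrative additions.
ADMIN_SUFFIXES = frozenset("도시군구")


def split_korean_place(query: str) -> list[str]:
    """Split a query into candidate place-name tokens.

    Tokens end at whitespace or right after a suffix character, provided the
    token is at least two characters long (so a leading 도 or 구 is not cut
    off on its own).
    """
    tokens: list[str] = []
    current = ""
    for ch in query:
        if ch.isspace():
            if current:
                tokens.append(current)
            current = ""
            continue
        current += ch
        if ch in ADMIN_SUFFIXES and len(current) >= 2:
            tokens.append(current)
            current = ""
    if current:
        tokens.append(current)
    return tokens


# The spaced and unspaced forms yield the same candidate tokens.
print(split_korean_place("경기도수원시"))
print(split_korean_place("경기도 수원시"))
```

A rule this simple will split wrongly whenever a suffix character appears inside a name, so its output would be better treated as one candidate reading to try alongside the unsplit query.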

11 posts were split to a new topic: KFC vs. Kentucky Fried Chicken

Thanks for your comments.
Although the comments below raise a few new issues, I think you have pointed out something worth looking into.
My idea was that, instead of entering every name variant in the OSM attributes, the variants could be linked together when the attributes are processed.
But since that is not easy, it may be good to look for another approach.
In practice, too, there are quite a few cases where a place is known by a name different from its actual name (especially for lesser-known brands, or when a brand name is changed to an abbreviation).

Computers are hard.

OSM is made by people applying their talents to make things better.

Everyone is saying it is complex. Sure, but not impossible. And the technology is old. The Wikipedia article on approximate string matching mostly cites papers written in the last century.
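For readers unfamiliar with the term, here is a minimal sketch of the textbook form of approximate string matching: Levenshtein edit distance with a small threshold. `edit_distance` and `fuzzy_matches` are hypothetical names, and scanning a plain list of names is an assumption that would not scale to a worldwide geocoder.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ca
                    current[j - 1] + 1,      # insert cb
                    previous[j - 1] + cost,  # substitute, or keep if equal
                )
            )
        previous = current
    return previous[-1]


def fuzzy_matches(query: str, names: list[str], max_distance: int = 1) -> list[str]:
    """Return the names within max_distance edits of the query, ignoring case."""
    folded = query.casefold()
    return [n for n in names if edit_distance(folded, n.casefold()) <= max_distance]


# A query with one missing letter still finds the intended name.
print(fuzzy_matches("Seol", ["Seoul", "Busan"]))
```

The distance computation itself is the easy part. Doing this efficiently over a huge multilingual index, and ranking the extra candidates sensibly, is where the complexity mentioned in this thread comes in.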
With a decent proposal to the OSMF (and/or another donor), we could surely raise funds to get this feature coded to professional standards.
Thoughts?

Throwing money at the problem sounds like the easy solution. However, as always, it’s a good idea to discuss such topics with the project maintainers first. They may have a completely different view on the topic, one that includes boring stuff like long-term maintainability and stability of the code. The proposed changes may not fit their planning at all, because they would require large-scale changes that are too expensive and/or risky. As an outsider, you probably don’t have the visibility for a halfway decent assessment.

Thanks for your comment mmd.

First: I’m a little worried about your tone. We have a code of conduct at OSM which requires being nice to each other on group discussions. Perhaps you are feeling a bit frustrated with me, because I can see a hint of aggression :rage:, maybe a dismissive attitude and perhaps even assumption of idiocy :crazy_face: on my part. I hope I’m mistaken.

Second: I would always encourage all the members, both experienced and inexperienced to discuss even the most difficult technical aspects of delivering OSM to mappers. Sure, somebody will inevitably get stuff wrong, and maybe even annoy the elite techies, but most importantly everyone is participating and feels included, feels respected and feels part of the organisation. And the comments that come back should avoid being dismissive, but should be constructive and corrective to help participants understand more of the often opaque workings of the OSMF. For example, clearly you know stuff, so your response could have been constructive and set out a suggested path for the OSMF to achieve some fuzzy searching capability.

Third: Any proposals to do with code will always go to the Engineering Working Group. You can be reassured that the very talented EWG will quickly sort the silly from the serious suggestions. And serious suggestions will be developed into proposals, be scheduled into the development programme, will ultimately be funded by the Board, put out to contract (managed by the EWG) and be implemented.

I hope that helps!

I’m an outsider to this discussion, but I detected absolutely no aggression, dismissive attitude or an “assumption of idiocy” here at all - merely a genuine attempt to explain.

Admittedly it’s outside my purview, but I assume the developers of Nominatim will do whatever they think is best for their project regardless of what the EWG says or does. Sure, I’m aware that you’re talking about a paid proposal, but I still don’t think the EWG just approves something without consulting the developer, and they aren’t going to force a developer to implement something that isn’t in the best interest of their project either. Obviously.

(BTW, I’m as bullish as anyone else when it comes to crying foul about other people’s behavior at the slightest sign of them being defensive, but I think mmd was perfectly fine here. That’s just my opinion though.)

OK, noted. I wasn’t sure. May mmd forgive my pitbull defence of the peace of the channel.

Yup. I’m absolutely not telling Nominatim or other developers what to do. It’s not my role. The EWG is the only entity that can take a formal position for OSMF on the value of a proposal from any source, including this channel. And EWG can use that opinion to negotiate with a developer and secure funding etc.

The Engineering Working Group (EWG) is charged with

  • Handling software development paid for by the OSMF
  • Putting out calls for proposals on tasks of interest, and accepting proposals on other tasks
  • Offering a platform for coordination of software development efforts across the OSM ecosystem
  • Managing OSM’s participation in software mentorship programs

Further, I believe FOSS developers should be paid a fair fee for commissioned work - I personally wouldn’t refer to paying a fair fee as “throwing money at the problem”. Right?

It really depends on the time-to-cost-to-benefit ratio, which my guess is would not be great given the complexity of the problem :man_shrugging:

Even superficial research on the topic of “fuzzy” search on OSM data would have immediately turned up Photon (https://photon.komoot.io/; GitHub: komoot/photon, an open source geocoder for OpenStreetMap data), which btw is maintained by the Nominatim maintainer and uses Nominatim data as input. That would have taken a lot less time than it took to craft underhanded slurs (implying that the current code is not written to “professional” standards, for example).

There are, however, a number of issues with deploying it on openstreetmap.org, some technical (language support), some strategic (osm.org is not intended as an “enduser” map site). Naturally, expecting the same results as a product that can use your complete search history and has received hundreds of millions of dollars of investment over the years is misguided.

I think I should have added a “trigger” warning for the “throwing money at the problem” bit. It was mostly a direct response to this section:

“old” and “papers written in the last century” somehow implies that the issue has already been solved, and all it takes is to secure some funding and have it implemented by someone.

My response was an attempt to offer a different perspective to this idea, and think more about the big picture, and in particular the long term implications. And most importantly, have those ideas reviewed by the subject matter experts early on.

By the way, the EWG made the same mistake initially and involved project maintainers only towards the end of its proposal process. I guess everyone is still learning here.

:shield: There have been some flagged posts on this thread. I’m happy to see that several people have made an effort to calm things down by clarifying their intention in later posts, so there seems to be no immediate need for moderator action. There has been some unnecessarily triggering language on all sides, though, so please keep things civil going forward.

Also, I’ve split off the KFC naming debate.