I’ve commented on the topic of unspecified template parameters via mail, but I’d like to describe another idea that may be relevant here.
There are several different projects parsing OSM wiki templates: TTTBot, UserGroupsBot, Taginfo, and an upcoming project by my regional OSM user group, to name only those I have had direct contact with. Therefore, I have considered extracting the basic template parsing functionality into a common service that serves a JSON representation of wiki templates.
This service is supposed to offer the following features:
make templates more machine-readable by hiding the details of the MediaWiki query API and by filtering out comments and other formatting details from the wiki pages themselves
provide a frequently updated view of the wiki by only parsing those pages that have changed since the last run
join the translations of a page with templates into a single JSON object containing all the localised strings
support both the case with just one template per page and multiple templates per page
provide common functionality such as constructing image thumbnail links, which is necessary for embedding wiki images outside the wiki
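To make the parsing features above more concrete, here is a minimal Python sketch of the core step: stripping HTML comments and extracting the parameters of every top-level template on a page. It is only an illustration under simplifying assumptions, not the actual service code. The function names (`find_templates`, `parse_template`, `split_top_level`) are made up for this example, and positional arguments that contain `=` inside nested markup are not handled.

```python
import re

# HTML comments in wikitext, possibly spanning several lines.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def split_top_level(text, sep="|"):
    """Split on sep, ignoring separators nested inside {{...}} or [[...]]."""
    parts, current, depth, i = [], [], 0, 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in ("{{", "[["):
            depth += 1
            current.append(pair)
            i += 2
        elif pair in ("}}", "]]"):
            depth -= 1
            current.append(pair)
            i += 2
        elif text[i] == sep and depth == 0:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(text[i])
            i += 1
    parts.append("".join(current))
    return parts


def parse_template(body):
    """Turn the text between {{ and }} into (template name, parameter dict)."""
    name, *args = split_top_level(body)
    params, positional = {}, 0
    for arg in args:
        key, eq, value = arg.partition("=")
        if eq:
            params[key.strip()] = value.strip()
        else:
            # Unnamed arguments are numbered 1, 2, ... as MediaWiki does.
            positional += 1
            params[str(positional)] = arg.strip()
    return name.strip(), params


def find_templates(wikitext):
    """Return (name, params) for each top-level template, in page order."""
    text = COMMENT_RE.sub("", wikitext)
    templates, depth, start, i = [], 0, None, 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "{{":
            if depth == 0:
                start = i + 2
            depth += 1
            i += 2
        elif pair == "}}" and depth > 0:
            depth -= 1
            if depth == 0:
                templates.append(parse_template(text[start:i]))
            i += 2
        else:
            i += 1
    return templates
```

Because `find_templates` returns a list, the same call covers both the one-template-per-page and the several-templates-per-page case; nested templates stay as raw text inside the parameter value.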
TTTBot is still lacking many of those features (and afaik none of the other examples I provided has all of them either), so the software catalogue would also profit from the implementation of this service. But the larger vision is of course to make it easier for all projects to access the OSM wiki, and to avoid reinventing the wheel each time.
To achieve this goal, the service is not supposed to contain any intelligence about how the template content is used; that would be the consumers’ task. It is also supposed to require as little configuration as possible: it should figure out the type of a value (text, link, image, url, list, …) from the value itself instead of requiring a full schema for each template.
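A rough sketch of what "figuring out the type from the value itself" could look like in Python is given here. The regular expressions, the `;` list separator, and the shape of the returned dictionaries are all assumptions chosen for illustration; a real wiki will contain values that such heuristics classify wrongly (for example a URL that itself contains `;`).

```python
import re

# Heuristic patterns; all of them are assumptions for this sketch.
URL_RE = re.compile(r"^https?://\S+$")
IMAGE_RE = re.compile(
    r"^\[\[(?:File|Image):([^|\]]+)(?:\|[^\]]*)?\]\]$", re.IGNORECASE
)
LINK_RE = re.compile(r"^\[\[([^|\]]+)(?:\|([^\]]*))?\]\]$")


def infer_value(raw, list_separator=";"):
    """Guess the type of a template value without any per-template schema."""
    value = raw.strip()
    image = IMAGE_RE.match(value)
    if image:
        return {"type": "image", "file": image.group(1).strip()}
    link = LINK_RE.match(value)
    if link:
        page = link.group(1).strip()
        label = (link.group(2) or page).strip()
        return {"type": "link", "page": page, "label": label}
    if URL_RE.match(value):
        # Checked before lists, so a URL containing ';' stays a single URL.
        return {"type": "url", "value": value}
    if list_separator in value:
        items = [part.strip() for part in value.split(list_separator)]
        return {
            "type": "list",
            "items": [infer_value(item, list_separator) for item in items if item],
        }
    return {"type": "text", "value": value}
```

For instance, `infer_value("Android; iOS")` yields a list with two text items, while `infer_value("[[JOSM]]")` yields a link whose label falls back to the page name.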
I’ve even started writing it, but didn’t get very far, and because I’m very busy this month I cannot work on wiki template parsing at all during that time. I still wanted to get your feedback on the idea and to ask you to consider it in your future plans.
I like your idea and I’m ready to use your JSON service instead of my JSON generator. We just have to discuss the structure of the file.
About my future plans: I’m implementing the Catalogue with a colleague. He is busy this week, so it will not go as quickly as I would like. I tried to explain my vision of this project in my first message of this thread. We can discuss other details by mail or Skype.
Please find my comments below.
If the JSON file contains all the templates, it will be large. In that case we would also have to implement a filtering engine that works on the server side.
Right now we have ~350 apps and the size of the Software2 template file is 220 KB. Not big, but not small either. The gzipped file is 39 KB.
What if we provide a link to a JSON validator or generator, so that users can validate their text right away and do not have to wait for the bot?
That would be great. I did not know how to get a link to a screenshot.
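For reference, a thumbnail link can usually be derived from the file name alone. The sketch below assumes MediaWiki's default hashed upload layout (`thumb/<h>/<hh>/<name>/<width>px-<name>`, with `h` taken from the MD5 of the normalised file name). The default `upload_base` is my assumption about the OSM wiki and should be checked; the function name is made up. It does not cover SVG files, images hosted on Wikimedia Commons, or widths larger than the original image.

```python
import hashlib
from urllib.parse import quote


def thumbnail_url(file_name, width,
                  upload_base="https://wiki.openstreetmap.org/w/images"):
    """Build a thumbnail URL, assuming MediaWiki's default hashed layout."""
    # MediaWiki stores files with underscores and an upper-case first letter.
    name = file_name.strip().replace(" ", "_")
    name = name[:1].upper() + name[1:]
    # The directory levels are the first one and two hex digits of the MD5.
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    encoded = quote(name)
    return (f"{upload_base}/thumb/{digest[0]}/{digest[:2]}/"
            f"{encoded}/{width}px-{encoded}")
```

A call such as `thumbnail_url("Example shot.png", 200)` then gives a link that can be embedded outside the wiki, provided the wiki really uses this layout.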
It is true, but originally that wasn’t even my intent. What I want to do is make wiki parsing easier. So as a start, one would only replace any code that accesses the MediaWiki API and searches the results for templates with a request to my JSON representation of wiki templates.
Of course it would be nice if dynamic pages could be generated directly from that JSON, but as you say this might create additional requirements on the server, such as filtering. Do you think this is possible without semantic knowledge specific to each application?
I’m not sure if I fully understand this. Would you expect the user to paste the template code into that validator, or would that functionality become available somehow after saving the page?
There are a lot of successful examples: Google Play, AppStore.
Anyway, it is just a database. We have to think about how to design the structure, and for that we need a list of all use cases and requirements. I will prepare the requirements for the Catalogue.
We can provide three options:
Parse the templates the way it works right now. We just have to add some samples and documentation. Maybe implement a template parser in Java?
Download the complete new JSON file. The difference from the first option is that this file is easy to parse.
Download part of the new JSON file using server-side filtering.
All options will be implemented one by one, in the order listed, and users will choose the one that suits them. It is like the .NET/Java approach: you can start at any level of abstraction (File->Stream->Serializer).
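To illustrate how the second and third options relate, here is a small Python sketch in which the filtered download is just the complete JSON passed through a generic field filter. The entry fields (`name`, `platform`, `license`) and the function names are hypothetical; the filter only knows about equality and list membership, so it needs no knowledge specific to any one application.

```python
import json


def matches(entry, criteria):
    """True if every criterion equals the field or is contained in a list field."""
    for field, wanted in criteria.items():
        actual = entry.get(field)
        if isinstance(actual, list):
            if wanted not in actual:
                return False
        elif actual != wanted:
            return False
    return True


def filter_catalogue(entries, **criteria):
    """Server-side filtering: keep only the entries matching all criteria."""
    return [entry for entry in entries if matches(entry, criteria)]


# Hypothetical catalogue entries, used only for this example.
catalogue = [
    {"name": "AppA", "platform": ["Android", "iOS"], "license": "GPL"},
    {"name": "AppB", "platform": ["Windows"], "license": "proprietary"},
]

full_json = json.dumps(catalogue)  # complete file
partial_json = json.dumps(filter_catalogue(catalogue, platform="Android"))  # filtered part
```

A consumer that starts with the complete file can later switch to the filtered variant without changing how it reads the individual entries.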
I prefer “after saving page”.
If the wiki does not support triggers, we can add a button to regenerate the JSON file.
I believe it is better to have strongly typed fields. If a field is of text type, the consumer should use it as text. If a field is of list type, the consumer should treat it as a list. Otherwise… it is hard to control the software. What we need is a template validator (to fix errors at edit time) and a JSON generator with an option to generate JSON files in deprecated formats, so that old consumers are supported for at least some time when the format changes. For example, right now the price field is a list, so OSM-JSON v1 will have price as a list. However, I’d like to have separate fields per currency per app store. If we change the JSON structure, we will still be able to generate OSM-JSON v1.
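The price example could be sketched in Python as follows. The newer structure (prices nested per store and per currency) and the text format of the v1 list items are both invented here for illustration, since neither format has been specified yet; the point is only that the older list form can be generated from the more structured one.

```python
def to_v1(entry_v2):
    """Down-convert a hypothetical newer entry to the list-style v1 price field."""
    # Copy every field except the structured price unchanged.
    entry_v1 = {key: value for key, value in entry_v2.items() if key != "price"}
    prices = []
    for store, by_currency in entry_v2.get("price", {}).items():
        for currency, amount in by_currency.items():
            # Assumed v1 item format: "<amount> <currency> (<store>)".
            prices.append(f"{amount} {currency} ({store})")
    entry_v1["price"] = prices
    return entry_v1


# Hypothetical entry in the newer, more structured format.
entry_v2 = {
    "name": "AppA",
    "price": {
        "Google Play": {"USD": "0"},
        "App Store": {"EUR": "0.99"},
    },
}
entry_v1 = to_v1(entry_v2)
```

Old consumers would keep reading `price` as a list, while new consumers can use the per-store, per-currency fields directly.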
I would be happy to make OSM Software Catalog one of the official applications. But I think TTTBot should be fixed too. At least Linux users need it and those who do not want to run any software locally.