Let's take this one step at a time. I'll break it into more than one reply.
Timeline
- 2023-05-24: My first contact was, as suggested on the EWG page, by e-mail, which I sent on 2023-05-24. My e-mail was long. I was invited to attend the next meeting.
- 2023-06-28: This was the first meeting I attended (as a guest). My admission had not been decided yet. See Working Group Minutes/EWG 2023-06-28 - OpenStreetMap Foundation. One of the topics of that meeting was checking whether some inactive members were still interested in continuing.
- 2023-07-12: This is when I was approved. See Working Group Minutes/EWG 2023-07-12 - OpenStreetMap Foundation:
The existing members voted unanimously to admit Emerson Rocha and Salim Baidoun to the Engineering Working Group.
That was the exact text.
- I'll say more about this later, since in my case there wasn't much public description. And even if there were, it might not be easy to find.
Text of my e-mail requesting membership in the EWG
Hi, I'm Emerson Rocha, and I would like to join the OSMF Engineering Working Group! I'm relatively new to OpenStreetMap compared to the average contributor, but in addition to my previous expertise, I've been heavily active since I joined, focused in particular on semantics and data interoperability. Here are some of my accounts:
* https://github.com/fititnt
* https://www.openstreetmap.org/user/fititnt
* https://wiki.openstreetmap.org/wiki/User:EmericusPetro
Beyond obviously being willing to help with code and infrastructure (even if only as proofs of concept for ideas you all know are possible but cannot bootstrap), I'm especially interested in helping write drafts of technical specifications (in the style of the IETF, though I would recommend the https://github.com/w3c/respec formatting, which is used by the W3C et al., is fantastic for citing other works, and is also friendly to having more hands editing) based on what already exists today. I actually have some work in progress on mining the OSM Wiki as structured data (or at least exporting tables and code snippets), which could encourage uses beyond the infoboxes.
I read the past minutes of the EWG, and on the coding side, I think I will seek the intersection between specifications and proofs of concept or usable tools, rather than what would typically be outsourced to a dedicated project requiring some PR. Some things I have worked on (in part influenced by the discussions about the potential deprecation of Wikibase and moving the vocabulary "inside" the database):
1. I created a rudimentary Python script to convert part (not all) of OpenStreetMap XML into RDF, which can be used both as a command line tool with a local cache and as an online proxy. It is not tested for performance (the idea was to stay flexible while thinking about the schema, and strict RDF can take time), but unless it overcomplicates things, it is mostly a file conversion of output already created by the backend.
2. I created a command line tool (which can also be used as a proxy with a local cache) to use any MediaWiki as file storage. It doesn't scrape page by page (it uses the MediaWiki API), so up to 50 pages (500 for super admins) can be fetched at once, and then the tool either generates a JSON-LD document with a friendly result of all pages or (for people who just want files) simply a zip file with the "extracted files". Things like tables are converted to CSV; inspired by the complaints that "infoboxes could be parsed" as a replacement for Wikibase, the tool also exposes them. Since OpenStreetMap heavily recommends using the Wiki to document things, maybe with some extra editing (such as hints for filenames of code snippets, or fixing some syntaxhighlight blocks), this could allow, for example, saving an entire page with code examples as files in a zip. The Python package is wiki-as-base (see https://github.com/fititnt/wiki_as_base-py) (the name might change if it gets serious; but Wikibase actually stores RDF like normal wiki pages, so this is just a slightly more generic use, without needing to install it as an extension).
3. Despite being quite new, I do have several code- and schema-related repositories related to OpenStreetMap. One of them is Functions as a Service with OpenFaaS (open code at https://github.com/fititnt/openstreetmap-serverless-functions, which also includes the Ansible to set up the host). The home page is at [https://osm-faas.etica.ai](https://osm-faas.etica.ai/) (not really open to the public yet). For example, beyond the command line, the wiki-as-base package can extract this wiki page https://wiki.openstreetmap.org/wiki/User:EmericusPetro/sandbox/Wiki-as-base as a parsed zip file at https://osm-faas.etica.ai/wiki-as-base/User:EmericusPetro/sandbox/Wiki-as-base.zip, and it also allows getting all files from pages with a specific category (no pagination yet, so it just fetches the first 50 pages and caches them before serving to the user).
4. (Using 2 to convert the OSM Wiki into file exports.) I believe it is viable to check syntax and validation of documentation on the OSM Wiki in a more generic way. For example, taginfo parses infoboxes, but my initial use case was XML Schemas of OSM files and other RDF artifacts which could strictly validate (or explain semantically) typical output of the API. In theory, this could allow users to write rules on the wiki (such as more complex checks, like allowed speed in a country) and compare them with output in production. But I think if this starts to be used, for a considerable time it would just be used to *validate rules that would validate rules*. Anyway, just assume that after the wiki is converted to files on disk, anything that could be the result of an external command line tool (maybe even compiling) could be used as a pipeline, and we could even only update/publish a parsed version of data originally on the wiki if it strictly validates.
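A minimal sketch of the XML-to-RDF conversion described in item 1 could look like the following. The osmnode:/osmtag: prefixes here are hypothetical placeholders, not an established vocabulary, and real tag keys containing ":" (like addr:street) would need extra escaping:

```python
import xml.etree.ElementTree as ET

# A single OSM node with two tags, as the API would return it.
OSM_XML = """<osm version="0.6">
  <node id="1" lat="-30.03" lon="-51.23">
    <tag k="amenity" v="cafe"/>
    <tag k="name" v="Example Cafe"/>
  </node>
</osm>"""

def osm_xml_to_turtle(xml_text):
    """Convert <node> elements and their tags into Turtle triples."""
    root = ET.fromstring(xml_text)
    lines = [
        "@prefix osmnode: <https://www.openstreetmap.org/node/> .",
        "@prefix osmtag: <https://example.org/osm/tag/> .",
    ]
    for node in root.iter("node"):
        subject = f"osmnode:{node.attrib['id']}"
        for tag in node.iter("tag"):
            # Escape quotes so the string literal stays valid Turtle
            value = tag.attrib["v"].replace('"', '\\"')
            lines.append(f'{subject} osmtag:{tag.attrib["k"]} "{value}" .')
    return "\n".join(lines)

print(osm_xml_to_turtle(OSM_XML))
```

A strict converter would also need to map ways, relations, and metadata attributes, which is where "thinking the schema" takes most of the time.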
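The extract-then-validate pipeline of items 2 and 4 can be sketched like this: pull syntaxhighlight blocks out of wikitext and run each one through a per-language validator. The wikitext sample and the use of stdlib JSON parsing as the "external validator" are illustrative assumptions only:

```python
import json
import re

# Wikitext with two embedded code blocks: one valid JSON, one broken.
WIKITEXT = """Some documentation text.
<syntaxhighlight lang="json">{"maxspeed": 120}</syntaxhighlight>
<syntaxhighlight lang="json">{"broken": </syntaxhighlight>
"""

BLOCK_RE = re.compile(
    r'<syntaxhighlight lang="(\w+)">(.*?)</syntaxhighlight>', re.S
)

def validate_snippets(wikitext):
    """Return (language, is_valid) for each recognized snippet."""
    results = []
    for lang, body in BLOCK_RE.findall(wikitext):
        if lang == "json":
            # stdlib JSON parsing stands in for any external validator
            try:
                json.loads(body)
                results.append((lang, True))
            except ValueError:
                results.append((lang, False))
    return results

print(validate_snippets(WIKITEXT))
```

In the "only publish if valid" idea, the second snippet would block the parsed version from being updated until someone fixes the wiki page.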
I'm likely missing some things I have already done, but in general, either my GitHub activity in the last 6 months or my OSM Wiki might give a hint. In addition to containers/virtualization, continuous integration, etc., I'm mostly accustomed to Ansible, but could use Chef if that is friendlier to the others already running the OSM infrastructure, and then plan the health checks ahead.
I don't have any kind of conflict of interest, as my day job is not related at all to the OSM ecosystem (it is mostly keeping servers online for a few clients with a significant number of users, plus PHP/Joomla/Moodle coding for them when needed, so I'm mostly free to use the rest of my time on projects that help others). And a bit of trivia: my initial attention to OpenStreetMap came because I discovered it was used heavily in the humanitarian sector (often without explicit acknowledgment), and also because its query language is more flexible than SPARQL and all the tooling is very memory efficient. I was already perceiving that, from an ontological point of view, both the spatial and the temporal are close to universal reference points compared to other forms of organized knowledge. My last diary post comparing OSM to Wikidata/RDF didn't consider, for example, that the "primary key" used on OpenStreetMap (location) and its very realist approach to visualizing knowledge (visually, on a canvas) make it easier for non-experts to cooperate, while Wikidata still requires high levels of knowledge, and even then it has clusters of opinions on how to taxonomize. I mean, several of Mateusz's criticisms of Wikidata's ontology organization are true, and it is only "easy" to import data into Wikidata because there is poor support for validation! Also, the density of knowledge inside OpenStreetMap dumps is far higher than Wikidata's, so a hypothetical RDF version of OSM data, expanded with the full meaning explained by a structured version of a subset of what's on the Wiki today, would not fit in any RDF triplestore, or at least would take weeks or months on the initial import.
It's not a random act for me to start with how to generically extract information from the wiki in a way that others can see whether it is working or not. Even without OWL (SHACL is likely the more useful side of the RDF world here, since it avoids the open-world assumption), either by improving existing infoboxes or by documenting top-level constraints (like the maximum allowed maxspeed at world level, then overridden by country, then explaining the concept of vehicle types, external variables like weather, etc.), the wiki itself could allow validating concepts against the output of OSM data from the API (or extracts of data dumps), such as giving better context that 200 (km/h) is not an allowed value for maxspeed. But when using maxspeed, sometimes it might appear as zone:maxspeed=*, which expands the value of maxspeed=*, which might be just a literal value or vary depending on context.
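The layered-constraint idea above (a world-level ceiling, overridden per country) can be sketched in a few lines. All the limits here are invented for illustration, and real maxspeed=* values like "none" or zone references would need their own contextual rules:

```python
# Hypothetical constraints: a world-level ceiling, overridden per country.
WORLD_MAX_KMH = 150
COUNTRY_MAX_KMH = {"DE": 130, "BR": 120}

def validate_maxspeed(value, country=None):
    """Check an OSM maxspeed=* value against the layered constraints."""
    if not value.isdigit():
        # Values such as "none" or zone:maxspeed references are out of
        # scope for this toy check and flagged for contextual handling.
        return False, f"non-numeric value {value!r} needs contextual rules"
    speed = int(value)
    ceiling = COUNTRY_MAX_KMH.get(country, WORLD_MAX_KMH)
    if speed > ceiling:
        return False, f"{speed} km/h exceeds the {ceiling} km/h ceiling"
    return True, "ok"

print(validate_maxspeed("200"))
print(validate_maxspeed("120", "BR"))
print(validate_maxspeed("none"))
```

If such rules were maintained on the wiki and exported as files, this check could run as part of a pipeline against API output or data-dump extracts.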
If you need more context about me, I'm totally open to seeing how I could fit in the EWG. To make it clear, I'm interested in doing all of this pro bono publico, and I'm excited that the EWG is attempting standardization between developers; some things really are complicated to get paid for, and I could do a lot of these side tasks. For example, there are things like documenting/fixing the XML DTD (which today exists only on the wiki), so existing tools could actually use it to validate output formats, and also bringing to the group suggestions to formalize (as IANA media types, like Paul Norman did with vnd.openstreetmap.data+xml) initiatives like this one: https://wiki.openstreetmap.org/wiki/Talk:Overpass_API/Overpass_QL#Recommended_file_extension? . That's it! Thanks!
Waiting,
Rocha