Feelings around OpenStreetCam?

Just wondering, what is the current state of community relations with Telenav vis-a-vis OpenStreetCam? My contributions to OSM have been fairly meagre since I signed up 5 years ago so I am not really very “in the know”.

I am very keen on the idea of crowd-sourcing topographic data from dashcam imagery but I am also very much of the Steve Coast/Richard Stallman/Aaron Swartz school of thought when it comes to the accessibility of information. I only just found out about the OpenStreetCam project and it concerns me somewhat that the public repo for the website appears to be out of date (https://github.com/openstreetcam/openstreetview.org/issues/116). Similarly, the idea that Telenav are running a private code repository seems to be concerning a couple of people in the GitHub issues.

–edit: accidentally hit enter on the above and it entered the moderation queue–

A particular concern of mine is also around the availability of the data en masse. At the moment, if I understand correctly, uploaded data goes to Telenav’s servers and can only be downloaded on an individual basis. Has the OSM community investigated or expressed any interest in setting up a mirror or some sort of decentralised storage system for uploaded imagery? If not, is there enough interest in the project at this stage to look at something like that?

This is not a direct answer to your questions, but it explains the company’s license. First, I would like to point out that a similar project called Mapillary also exists. It is more mature, and it is older.

The image license for both projects is CC-BY-SA, which means one can, at least from the legal aspect, download all the images and then mirror them or use them freely (under the CC-BY-SA terms).

The source code license is free for OpenStreetCam (I don’t know whether all of it is, including the server side; I haven’t checked in depth) but proprietary for Mapillary. Also, OpenStreetCam has some nonfree dependencies, preventing its inclusion in F-Droid.

Both companies’ terms grant them an exception to be able to use the images in a way unrestricted by CC-BY-SA’s terms. But the original images are always available with CC-BY-SA.

For further info, please see these:

*Note that some GitHub users got the facts wrong. See this comment by a Mapillary dev for facts: https://github.com/openstreetcam/openstreetview.org/issues/60#issuecomment-259912310

Thanks for your reply. I became aware of Mapillary around the same time and my instinct was to go to OSV because I thought after watching this video their approach toward the community was more compelling.

When it comes to mass collection of the images, has that been looked at by a legal professional? I ask not to be difficult but because if Telenav do choose to start organising the photos in some proprietary fashion they could perhaps claim database right on the collection of tracks. I.e. a track in itself may be covered by CC-BY-SA, but the collection of tracks could be subject to Telenav’s database right, and mass copying of the tracks could violate that.

Also, I am not sure if, at this stage, the original photos can be downloaded by someone other than the original uploader? In fairness, maybe this is impossible due to privacy regs.

As for the source code, the licence is free, but it appears that they are pushing their own patches to a private code repository, which seems to somewhat defeat the point – I was going to write a pull request to fix a bug on the openstreetview website but when I went to the source code it was out of date.

The AI training data is also worth thinking about, as is the code for the image analysis/neural net. I just had a quick look and I can’t see it [maybe if someone from Telenav is reading they could point me in the right direction?].

Thanks for the links. I am still keen to know whether or not the community has considered self-hosting images or linking up with the Wikimedia Foundation, who would perhaps consider hosting as part of Wikimedia Commons. I am thinking that, if not, it may be smart to think about these things in the early adoption stage to reduce the risk of the project becoming de facto closed/unfree. I appreciate that the Telenav developers working on the project have expressed a desire for it to be community-led, but my concern is that their business-minded superiors may have other ideas and that it would be a shame for the OSM contributors to inadvertently gift away something of commercial value to a for-profit.

What would be the value of such an image collection to OSM ?

Mapillary allows downloading any photo you want under CC-BY-SA. There’s a button on the website, accessible without login. I think there are API calls too. I think OpenStreetCam is supposed to allow image downloads as well, but I cannot find a download button on the website. Disclaimer: not a lawyer.

Street-view Armchair Mapping.

In the case of Mapillary, three quarters of the images come from sources completely unrelated to OSM. This means the OSM community gets access to a massive amount of imagery for mapping.

Edit: I probably misunderstood your post. Sorry.

So we would only have to set up a service for those people that do not like the license, or any other aspect of Mapillary/OpenStreetCam ? That’s why I host the pictures I take on SmugMug. I do not need any other service.

In practical terms, the same reason it is valuable now and the same reason Google use their StreetView imagery for reCAPTCHA: automation of feature tagging using computer vision (CV), e.g. topographic detail such as road widths and house numbers. I looked for the CV part of the OSC website source code (the step after you upload, when it goes “processing” to blur things out) but I could not see it.

If you’re asking why OSM should bother trying to retain some level of control over the dataset, I think that unless there is some legal agreement to suggest otherwise, there is a risk Telenav could start to close & profit from the uploaded data using the same models as Mapillary. Or the development team could die in a plane crash on the way to a conference or some other such scenario. If the project died for any reason and the data died with it, it could p**s off a lot of contributors who decided to engage with the project because of some brand association with non-profit OpenStreetMap. Especially as they may not have backed up their data, assuming it was in safe hands.

Regardless of methodology (e.g. surveying vs arm chair mapping) I think, philosophically, the reason people like to contribute to projects like OSM and Wikipedia is because of a desire to make information more widely and freely available, would you not agree? If I am correct in my assumption then it seems like a no brainer that such a project should seek to maximise those goals because it would improve adoption and potentially open up avenues unknown for people with similar open data aspirations outside of the OSM project, e.g. Wikimedia Commons.

This is a fairly weird question. What is the value of recent street and path photography to OSM? We create maps by using them. It is usually not possible for a given individual to cover whole countries, therefore we habitually use remote sensing, including datasets acquired by others (who, in turn, do not intend to do the mapping). We may use it for wholesale armchair mapping, or more often to fix up already surveyed data, like missing surfaces, missing house numbers, identifying objects on satellite imagery, etc.

The whole point of such services (like the discontinued panoramio, mapillary, or osc in question) is to have a central, georeferenced, searchable and retrievable collection of imagery.

You may upload your imagery wherever you want, but it’s useless for us, since we cannot possibly download a planet’s worth of images and start looking at them one by one. Even using tools to geosearch may be challenging (I am using digikam to filter a 100000+ image collection and it requires a nice chunk of RAM to get it done, and I don’t know what would happen if I tried to filter the few tens of millions of images uploaded to mapillary just in the surrounding area).
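To make the geosearch point concrete, here is a minimal sketch of the simplest possible approach: a linear bounding-box filter over coordinates that have already been pulled out of the image metadata. The `Photo` record and the `in_bbox` function are hypothetical names for illustration, not part of digikam or of either service. It assumes every image has a position and it ignores boxes that cross the antimeridian.

```python
from typing import Iterable, List, NamedTuple


class Photo(NamedTuple):
    """Hypothetical minimal record for one geotagged image."""
    filename: str
    lat: float  # degrees, WGS84
    lon: float  # degrees, WGS84


def in_bbox(photos: Iterable[Photo], south: float, west: float,
            north: float, east: float) -> List[Photo]:
    """Return the photos whose position lies inside the bounding box.

    This is a plain linear scan: every record is inspected once, so the
    cost grows with the size of the whole collection, not with the size
    of the area being searched.
    """
    return [
        p for p in photos
        if south <= p.lat <= north and west <= p.lon <= east
    ]


if __name__ == "__main__":
    collection = [
        Photo("a.jpg", 47.50, 19.04),
        Photo("b.jpg", 48.21, 16.37),
    ]
    for photo in in_bbox(collection, south=47.0, west=18.5, north=48.0, east=19.5):
        print(photo.filename)
```

A scan like this is fine for one person's photos but has to touch every record on every query, which is why a central service with a proper spatial index is so much more useful than scattered uploads.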

Mapillary offers both legal and technical means to retrieve images inside a bounding box, and I would say it works pretty well. Lots of their code is open. Their issue handling is pretty good, bugs usually get fixed fast. It has a useful JOSM plugin.

OSC, well, I do not yet have much experience with it; the code seems to be quite simple so far and there is no API. The Android client seems to be mostly open source, but nothing more, really, just some small tools. It does have a JOSM plugin as well. But OSC does not seem to provide any technical means to retrieve imagery en masse.

While the idea of having a collection of street imagery available not only for mapping, but also to write your own Mapillary-“killer” is appealing, I wonder whether that should be done under the OpenStreetMap brand or umbrella. Wikipedia also splits up all data in different projects like Commons, Wikivoyage, Wikidata, etc.

When I wrote “What would be the value of such an image collection to OSM ?”, I meant why should OSM or OSMF release such a dataset? I think it is better to have a separate organisation doing this. Let OSM/OSMF focus on map data, not on street imagery.
OSM/OSMF does not deal with aerial imagery, navigation apps, etc.

p.s. Any of those image collections should protect the privacy of the submitters better, by making it possible to hide your name from the pictures you contributed.

I may be wrong, but I believe Escada meant to ask, “What is the value of replicating the database of Mapillary/OSC”.

Ah, I see what you mean now. I think there is a sort of existential difficulty in defining what is ‘map data’… in some senses an image could be seen as embryonic map data in the same way that a GPS trace is. Though I can understand if OSM would not want to concern itself with non-geometric sources of map data.

It’s a fair point about the scale of the data and the ability to process it. Considering the power of the new JavaScript APIs (Video, WebGL, Canvas) I wonder if processing could be shifted from servers to web clients (view this example in Firefox), metadata could be isolated from the video source and videos could be uploaded anywhere.

E.g. infrastructure for a decentralised project could be something like this:

• OSM could host geocoded & timecoded metadata for dashcam recordings using a standard like https://w3c.github.io/webvtt/
• Dashcam recordings could be uploaded to some service like YouTube or WikiMedia Commons.
• Coders could create web tools to process the above for specific feature data (e.g. “traffic island”); these could be hosted on the OSM site, github.io or wherever.
• Users could process areas of their interest.
• Could potentially use man-in-the-middle or customised app clients to complement rather than compete with OSC and Mapillary (e.g. upload goes to two places instead of one).
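As a rough illustration of the first bullet, this is how a dashcam GPS log might be written out as a WebVTT metadata track, with one cue per position fix and a small JSON payload in each cue. The function names and the `lat`/`lon` payload keys are my own assumptions for the sketch; WebVTT itself only defines the `WEBVTT` header and the timed cue structure, not what goes inside a metadata cue.

```python
import json
from typing import List, Sequence, Tuple

# One GPS fix: (seconds since the start of the video, latitude, longitude).
Fix = Tuple[float, float, float]


def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    millis = round(seconds * 1000)
    hours, rem = divmod(millis, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_vtt(fixes: Sequence[Fix], last_cue_duration: float = 1.0) -> str:
    """Build a WebVTT metadata track from time-ordered GPS fixes.

    Each cue lasts until the next fix; the final cue gets an assumed
    duration because there is no following fix to end it.
    """
    lines: List[str] = ["WEBVTT", ""]
    for i, (start, lat, lon) in enumerate(fixes):
        end = fixes[i + 1][0] if i + 1 < len(fixes) else start + last_cue_duration
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        # Payload format is an assumption: a single-line JSON object.
        lines.append(json.dumps({"lat": lat, "lon": lon}))
        lines.append("")  # a blank line terminates the cue
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_vtt([(0.0, 51.5074, -0.1278), (1.0, 51.5075, -0.1280)]))
```

A file like this is tiny compared to the video it describes, so it could be hosted separately from the recording and fetched by any client that wants to know where each frame was taken.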

As a general trend in the information age, the ability to process and store large amounts of data is becoming less problematic over time but commercial ‘gatekeeping’ is becoming more problematic. Things like API limits and IP laws can be inhibitive to the development of FOSS projects. Also, sometimes large companies will buy out competitors for the sake of eliminating competition; it worries me that a lot of people could upload a lot of data and then one day everyone may have to start again because it was only stored in one place and that place is taken offline or paywalled.

Oh! I sense some serious terminology problem here. You meant OSMF instead of OSM then.

So the original question was whether the OSM community wants to mirror either or both of the services. To that I’d say that, from the longevity standpoint, external mirrors are definitely a bonus, provided both the licenses and the technical possibilities allow it. That may be the case for Mapillary; so far OSC seems not to allow such activity even on a theoretical level. The storage requirements are right now around 590 TB for Mapillary by my guesses, so it’s not a trivial task to accomplish [approximately $100,000 - $150,000 to do it at home {geeky details: I’d use Supermicro NR X10 4U72 Ceph Data N 432TB times 3}, plus electricity bills, which is definitely doable for a larger community]. Basically I’d say partial local mirrors could be created, but I wouldn’t hold my breath.
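For anyone who wants to play with those numbers, a small back-of-the-envelope sketch follows. The 590 TB and 3 × 432 TB figures are the ones from the post; the replication factors are purely my assumption, since the post does not say how the cluster would be configured, and real Ceph deployments have further overheads that are ignored here.

```python
DATASET_TB = 590    # guessed size of the Mapillary imagery (from the post)
NODE_RAW_TB = 432   # raw capacity of one storage node (from the post)
NODE_COUNT = 3      # number of nodes (from the post)


def usable_tb(raw_tb: float, replicas: int) -> float:
    """Usable capacity when every object is stored `replicas` times."""
    return raw_tb / replicas


raw_total_tb = NODE_RAW_TB * NODE_COUNT

if __name__ == "__main__":
    print(f"raw capacity: {raw_total_tb} TB")
    # Replica counts are hypothetical settings, not something the post states.
    for replicas in (1, 2, 3):
        usable = usable_tb(raw_total_tb, replicas)
        verdict = "fits" if usable >= DATASET_TB else "does not fit"
        print(f"{replicas} replica(s): {usable:.0f} TB usable, dataset {verdict}")
```

The point is only that the amount of redundancy chosen changes how much headroom such a setup would have, so the hardware estimate depends on how safe the mirror is meant to be.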

As for doing it under OSMF, I’d say no. This is a fairly different project, with different goals and requirements. I would be glad if either Mapillary or Telenav earned huge amounts of money by using the imagery and in exchange made it as simple and open as possible to retrieve the imagery. I am gladly helping them to get rich if they provide the service for that.

So, generally, I would say we should convince these companies to keep it open and cooperative rather than replicate them: they have their stakes in making the service better to get more images.

As a safety net I would really prefer some assurance that in case of closing, bankruptcy or boredom they would guarantee the possibility for the community to retrieve the imagery before it gets annihilated. That’s not a simple legal task, though.

For images there’s not much processing required, and video indexing can be done at client side right before uploading.

The biggest problem I see is in, as always, the safety and longevity of storage. Storing on YouTube means absolutely no assurance of anything: they may delete it without even thinking about it. Commons isn’t fit for massive amounts of WP-unrelated content. So, basically, the first step is to get a reasonably safe storage to point to.

You are right that separate metadata may be manageable… to some extent. For 150 million images (Mapillary is around that) it’d need around 32 bytes for basic indexing per image, which is 4.8GB by itself. That is not something for users to download, though any server can handle lookups in no time. Image retrieval may not be trivial either, considering the average image size (for my uploads at least) is around 4MB per image; thumbnails either require massive server processing power or a definite amount of additional storage. And for videos it’s even larger.
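The arithmetic can be checked with a short sketch. The 32-byte record layout below (image id, timestamp, latitude, longitude, 8 bytes each) is only one hypothetical way to spend those 32 bytes; the image count and the 4 MB average are the figures given above, and sizes use decimal units.

```python
import struct

# Hypothetical index record: unsigned 64-bit image id, signed 64-bit Unix
# timestamp, then latitude and longitude as 64-bit floats. The "<" prefix
# means little-endian with no padding, so the record is exactly 32 bytes.
RECORD = struct.Struct("<Qqdd")

IMAGE_COUNT = 150_000_000      # rough Mapillary image count (from the post)
AVG_IMAGE_BYTES = 4_000_000    # about 4 MB per image (from the post)

index_bytes = IMAGE_COUNT * RECORD.size
image_bytes = IMAGE_COUNT * AVG_IMAGE_BYTES

if __name__ == "__main__":
    packed = RECORD.pack(12345, 1_500_000_000, 47.4979, 19.0402)
    print(f"record size: {len(packed)} bytes")
    print(f"index size:  {index_bytes / 1e9:.1f} GB")
    print(f"image size:  {image_bytes / 1e12:.0f} TB")
```

Under these assumptions the index comes to 4.8 GB and the images themselves to about 600 TB, which is in the same range as the 590 TB storage guess earlier in the thread.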

MITM upload and download is really simple provided all the legal and technical obstacles are removed. We’re not quite there yet…

Definitely an acute problem, especially since both companies acquire rights well beyond CC-BY-SA. We would need assurances that they’re not allowed to take away the content without means to freely access and retrieve it.

That’s interesting to know – not as bad as one might think but still non-trivial. I also thought about some sort of BitTorrent-based network of seedboxes but that would bring complications of its own and probably raise barriers to participation. I think there may be potential to reduce storage requirements by using the quite beautiful BPG compression algorithm, which doesn’t take the space of PNG and doesn’t create the same sort of potentially troublesome compression artifacts that JPEG does.

Yeah, it does seem any project would be a non-starter without storage. If the goal of the project was only CV for attribute data rather than street photography ephemeral uploads may not be so bad. At least if metadata was centralised any interested party would be able to rip from YT – albeit probably not legally – videos as and when they are uploaded.

Yeah, exactly. I would think a gentleman’s agreement would be fine if there exists a relationship of trust… that’s partly why I was wondering what community relations were like. There’s certainly not been a tsunami of bitterness from the forum members but then there haven’t been overwhelming indications of support either.

If no one from Telenav chimes in I might e-mail them to ask if we can have up-to-date code repos. If the repos were up-to-date it would make it easier to hack in things like multiple upload destinations. Alternatively, I could look at that which Mapillary has opened up and see if there is something to work with there.

You do realize that Mapbox is also collaborating with the OSM foundation quite closely? For example, the default iD editor was developed (mainly?) by Mapbox, and I believe a large chunk of donations to the OSMF is from Mapbox.


On one side, I think that this is so far outside the prime mission of OSMF that it should probably not be investing large sums in it. On the other hand, unless there is a rock solid contract giving some sort of escrow access, if the business gets into difficulties, its financial advisers will tell it to try and make money out of all of its assets. This has been evident with software patents, where companies in difficulty suddenly start enforcing their patents, including some on quite widely used algorithms.

In terms of the discussion of automated mapping from the images, I think anything that makes the on the ground mapper redundant is likely to lose OSM a lot of its core supporters, and just leave those who want a map without paying for it. A purely commercial business might well do that, but a community, crowd sourced, project really should be loyal to the people that made it possible.

Sorry, I don’t understand the comparison; their repos are up to date and the ‘supply chain’ of data does not create a gatekeeper out of them.

IP law strikes again.

I don’t see it making the ground mapper redundant, I just see it as a way of automating the tedium. I could go out in my town and measure the width of every road or write down every house number but it would take me hours. Give me a tool to automate it though and I would gladly start taking suburban routes around town just to gather more data. Similarly I would be happy to code a game or something which allows people to do reCAPTCHA-like training. Richer data would allow for more beautiful cartography.

Strangely enough, walking around recording house numbers and verifying street names is often my daily exercise routine. :)

My gripe with OpenStreetCam is that their app and upload mechanism just hasn’t worked on either my old or my new phone. After lots of tries I have basically given up trying to make it work.