Why does OSMF Budget €25,000 on Amazon

May I ask active volunteers why they think that essential functions of OpenStreetMap should run on a voluntary basis if, at the same time, the functionality of OpenStreetMap has to be significantly restricted to avoid overloading by end users? These limitations only drive a divide between contributors and users, and users are then left to rely on commercial users of OpenStreetMap data. As a result, end users of OpenStreetMap data often do not recognize the origin of the data, and a break occurs because corrections then do not flow back into OpenStreetMap.
In practice, I see that many entrepreneurs, for example, update their business data very quickly in Google Maps, but do not even think of maintaining it in OpenStreetMap.
I also have a question: how should the influence of commercial OpenStreetMap data users on the management of OpenStreetMap be assessed, and how willing are they to grant OpenStreetMap more core functionality?

1 Like

I think the core idea is to treat everyone equally, meaning, almost nobody gets paid for doing anything. This of course, has its ups and downs. I think this topic would require a broader discussion, outside the scope of this thread. Let’s focus here just on the AWS and transparency issues.

I recognize here the diligent attempt to get the appearance of pure voluntary work, probably to keep actual total costs apolitical. Proper financial management makes political influence visible. The trumpeting that everything is voluntary work is supposed to keep the political influence of commercial data users opaque.

One could argue that offering work for free is a way to evade certain responsibilities. When people receive something at no cost, they often have lower/no expectations. However, when they know it’s a premium or paid product, their expectations naturally rise. But let’s not deviate from the original thread too much. You can always start a new one to discuss other topics.

So the title of this topic should really be 'Why does OSMF reserve 25,000 on Amazon services?'

(Just in case an emergency occurs that needs their services.) When not used, it is almost a gain that can be carried over to the next year, so the donation/new funding needs for the next year are reduced by 25,000.

There is some confusion being thrown around, but as I read it this 25,000 has not made it to the P&L statement. It is merely a reserve item on the balance sheet, an item on the ‘Business Forecast’ (if one can call it that): only a projection of the possible cost to operate.

To my understanding, the free AWS services are a relatively new thing, so as recently as last year this was real, actual spending. Originally I found no information about the free AWS services, as all publicly available information indicates that this is still in ‘planned’ status. It’s hard to reason about the budget when the exact terms are unknown. We can all just speculate at the moment.

I just wish OWG were more transparent about their S3 operations. Take https://hardware.openstreetmap.org/ as an example. Right now the public does not know exactly what is being stored on S3, nor what the terms of the free AWS sponsorship are.

Yes - there’s not a great way to handle the accounting. Using the two examples you gave, we can see the problems.

If we didn’t have Fastly sponsorship, we’d have to pay a CDN. At our volumes, we’d never pay the listed commercial rates, but we don’t know what price we’d get actually doing the negotiations, which we wouldn’t do unless we had to switch. We’d also probably change our usage policies, so we’d have different traffic levels. At this point, we put the risk in the budget request with a maximum cost.

The AWS render server is different. It’s important for capacity right now, but if we lost the credits, we’d go with a cheaper option than EC2. We also have a very low risk of losing credits compared to donated servers, since once the credits are in our account they’re good until they expire at a known date. If we put a cash value on the risk, it would be the capex cost of replacing it, not the dollar value of AWS credits it uses. We didn’t call this risk out in the budget, because it is minor compared to other risks, and substantially less than other donated servers.

14 Likes

It is quite sad when the serious lack of communication and social skills (aggressive approach, blatantly ignoring “assume good faith” policy and other etiquette guidelines, replying when angry / ignoring “WP:NAM”, poor anger control, incessant posting of retorts, repeatedly ignoring other people’s suggestions to moderate as well as dislikes as an indication of a problem with their writing style, continuing to validate their own bad behaviour instead of noting that there must be a problem with it as so many people complain, and general argumentative and trollish behaviour, arrogant self-righteousness even in the face of facts proving them completely wrong, inability to accept and admit to others that they have been wrong and acted inappropriately etc.) completely overwhelms the actually quite reasonable request for information and clarification and more transparency, and makes people want to just blacklist them as a troll vulgaris domesticus.

Sad, but quite understandable. It is human social behaviour 101 (what you say doesn’t matter in the least if you don’t know how to say it and do it in a way that puts people in a “here comes aggressor, defend now!” mode). Hopefully OP will learn from this and acquire better communication skills to discuss issues in a more civil way, and even complain if need be in a more amiable and sociable way, before they get blacklisted by the majority of the community as a troll.

And yet a simple sincere apology (instead of trying to reframe the issue so they remain blameless) would suffice upon noticing the community response and backlash, e.g.

“I apologize, I overreacted. I misunderstood the situation and was angry, so I forgot to assume good faith and to allow for my own lack of knowledge, and I inappropriately came across as aggressive. However, I still have questions on the subject pertaining to XXX (like: what is the difference between XXX1 and XXX2), and would like to suggest that OSMF be more transparent about YYYY in the future, by including more information in financial reports about ZZZ. I find that important because of QQQQ”.

Simple, admitting one’s own mistakes (instead of desperately trying to find any smaller mistake of others and “try to make them more wrong than me”), non-aggressive and constructive, and yet still asking for the same information in a non-confrontational way. It would make people see him as a valid peer, and support the idea.


P.S. I actually connected their identity on GitHub with this identity on the Discourse forum, and in my previous experience (e.g. on the StreetComplete issue tracker), they actually seem to be a valuable member who does want to help the OSM project. However, their lack of anger management might turn away most of the community unless they learn to manage it (much) better. And that would be a loss (for both sides).

11 Likes

@NorthCrab You have made many implicit and explicit assumptions throughout this whole conversation. You are engaging in bad faith. I will be changing the title of this thread to “Why does OSMF budget 25,000 euros on Amazon” because the implication of the current title is not accurate.

4 Likes

I will also lock this thread for a couple hours.

1 Like

OpenStreetMap Ops team AWS usage.

AWS EC2 - virtual machines:

  • OSM tile render server (USA) - $3500/month (including data transfer which is ~50% of cost)
    palulukon.openstreetmap.org Fully sponsored, no cost to OSM. Would find alternative if not sponsored. EC2 On-Demand, potential to optimise using Spot Instances, but significant ops investment required.

AWS S3 - object storage:

Summary: 272TB used. Growth of 3% per month.

Quoted amounts include data transfer, API usage and storage costs.

  • openstreetmap-storage-backups - 112.4 TB - $120/month
    Backups including some historical. Backups are not de-duped by design (heavy admin / risk burden). Some opportunity to manually cleanup, but very low priority. No automatic cleanup.

  • openstreetmap-planet - 71.1 TB - $100/month
    Historical and current copies of published planet files. Deep-Archive, for future restore to AWS hosted planet service with full back catalog. No automatic cleanups.

  • openstreetmap-tile-aggregated-logs - 32.1 TB - $125/month
    Archival of processed tile CDN usage logs. Historical reference for Ops to work out trends and usage patterns. More data here than provided by public logs: Index of /tile_logs @pnorman can clarify.

  • openstreetmap-wal - 28.7 TB - $400/month
    Live streaming “Write Ahead Log” copies of the OpenStreetMap core Postgres database. The WAL files are used for syncing follower instances of the core Postgres database server. Vital asset to our data recovery plans. Can be used for recovery between full weekly database backup or corruption. For clarity this database is private and not published via planet data (eg: messages, users etc). Automatic cleanup after 1 year.

  • openstreetmap-imagery-backups - 18.2 TB - $35/month
    Backups of imagery provided to OpenStreetMap. Deep archival. Primarily backups of imagery hosted on kessie. No automatic cleanups.

  • openstreetmap-fastly-logs - 5.3 TB - $125/month
    Inbound Fastly CDN logs for processing. Key to us finding and managing abuse, source for the published tile log analysis: Index of /tile_logs. Automatic cleanup after 31 days.

  • openstreetmap-gps-traces - 2.8 TB - $80 to $225/month (varies due to access by website users)
    The GPS traces that are uploaded to OpenStreetMap.org, the storage backend for website: Public GPS Traces | OpenStreetMap. Formerly provided by NFS service, moved to S3 to simplify admin burden and to seamlessly work across our hosting data centres. No automatic cleanup, but opportunity to improve costs with S3 “tier” lifecycle rules.

  • openstreetmap-fastly-processed-logs - 1.9 TB - $50/month
    Archival of processed tile CDN view logs. Historical reference for Ops to work out trends and usage patterns. More data than provided by public logs: Index of /tile_logs @pnorman can clarify.

  • openstreetmap-user-avatars - 113.1 GB - $5/month
    The user “avatar” images as uploaded by users. No automatic cleanup, but opportunity to improve costs with S3 “tier” lifecycle rules. Formerly provided by NFS service, moved to S3 to simplify admin burden and to seamlessly work across our hosting data centres.

  • openstreetmap-aws-cloudtrail - 76.0 GB - $2/month
    Storage backend for AWS Cloudtrail API access logging service. Security monitoring. No automatic cleanup.

  • openstreetmap-gps-images - 62.7 GB - $10/month
    The processed display images used by OpenStreetMap.org on Public GPS Traces | OpenStreetMap.
    Formerly provided by NFS service, moved to S3 to simplify admin burden and to seamlessly work across our hosting data centres.

  • openstreetmap-backups - 21.1 GB - $0.03/month
    Historical database backups from OSM in first few years. No automatic cleanup.

Please remember all storage solutions also carry an ongoing human administrative / management overhead cost which is not accounted for in the above numbers.
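For anyone who wants to check the arithmetic, here is a rough tally of the S3 figures quoted above. It is only a sketch: the numbers are copied from the list, GB values are converted assuming 1 TB = 1000 GB, the gps-traces range is kept as a low/high pair, and the function names are made up for illustration. The EC2 render server is left out on purpose, since the post says an alternative would be found if it were not sponsored.

```python
# Figures copied from the bucket list above:
# (bucket name, size in TB, low USD/month, high USD/month).
# GB sizes are converted assuming 1 TB = 1000 GB (an assumption, not stated in the post).
BUCKETS = [
    ("openstreetmap-storage-backups", 112.4, 120.0, 120.0),
    ("openstreetmap-planet", 71.1, 100.0, 100.0),
    ("openstreetmap-tile-aggregated-logs", 32.1, 125.0, 125.0),
    ("openstreetmap-wal", 28.7, 400.0, 400.0),
    ("openstreetmap-imagery-backups", 18.2, 35.0, 35.0),
    ("openstreetmap-fastly-logs", 5.3, 125.0, 125.0),
    ("openstreetmap-gps-traces", 2.8, 80.0, 225.0),  # varies with website access
    ("openstreetmap-fastly-processed-logs", 1.9, 50.0, 50.0),
    ("openstreetmap-user-avatars", 0.1131, 5.0, 5.0),
    ("openstreetmap-aws-cloudtrail", 0.0760, 2.0, 2.0),
    ("openstreetmap-gps-images", 0.0627, 10.0, 10.0),
    ("openstreetmap-backups", 0.0211, 0.03, 0.03),
]

MONTHLY_GROWTH = 0.03  # "Growth of 3% per month" from the summary


def s3_totals(buckets=BUCKETS):
    """Return (total size in TB, low USD/month, high USD/month)."""
    size_tb = sum(b[1] for b in buckets)
    low_usd = sum(b[2] for b in buckets)
    high_usd = sum(b[3] for b in buckets)
    return size_tb, low_usd, high_usd


def projected_size_tb(size_tb, months, growth=MONTHLY_GROWTH):
    """Compound the current size forward, assuming the growth rate stays constant."""
    return size_tb * (1 + growth) ** months


if __name__ == "__main__":
    size_tb, low_usd, high_usd = s3_totals()
    print(f"S3 total: {size_tb:.1f} TB, ${low_usd:.2f} to ${high_usd:.2f} per month")
    print(f"Size in 12 months at 3%/month: {projected_size_tb(size_tb, 12):.0f} TB")
```

The listed buckets add up to a little over the 272 TB in the summary and to roughly $1,052 to $1,197 per month. Note that cost would not necessarily grow at the same 3% as size, because the buckets use different storage classes.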

Other AWS services

We also use AWS services like CloudTrail, Athena, Glue, etc. These are all minimal expenses (<$20/month) or covered by the free tier.

Usage of Credits

All costs above have been covered by AWS credits since Nov 2022. The credits cannot be used for purchasing Savings Plans or Reserved Instances etc which can be used for offsetting / reducing future costs. See AWS credits FAQ. Credits are valid for 1 year and any unused credits expire. Credits cannot be exchanged for cash. Crypto Mining is not a permitted use of credits. :wink:

Credits need to be requested annually from AWS, and it is not guaranteed that we will get them. This is why we have a budget line item for contingency.

40 Likes

Thread is unlocked. Please continue discussing the additional information provided by Firefishy.

I relay here a question that this thread has raised within the French crowd: is this a purely financial issue, or is there some kind of technical commitment to consider? Are there specific APIs or architectures that would make the switch to another storage infrastructure costly?

3 Likes

Dear all

Thank you @Firefishy for extensive details about what components are currently hosted.

Was there any discussion in the past about the possible rivalry this hosting strategy could raise with regard to Amazon's involvement in the Overture foundation?

This kind of problem should be taken seriously because of the impact it could have years after the decision is made.

Best regards

2 Likes

Amazon is involved in Overture, and also a big user and supporter of OpenStreetMap. I see no rivalry there. These buckets were created long before Overture was a thing. OWG is independent in the products and technologies it chooses to use.

5 Likes

The website uses Rails Active Storage, which is responsible for openstreetmap-user-avatars, openstreetmap-gps-traces and openstreetmap-gps-images. Active Storage uses the S3 API. Prior to using S3 we used NFS, but NFS did not scale well beyond a single data centre. We extensively evaluated Ceph, but the administrative burden of managing it is too high for our small team with our expected limited usage. AWS S3 was a feature match and a price match, and has first-class compatibility with Active Storage. While other providers do provide an S3-compatible API, other providers were not deeply considered.

The other S3 buckets are effectively purely storage. We push data via the AWS cli. A significant portion of the storage used is “deep archive” storage which is expensive to retrieve (very cheap to store). We are quite happy with AWS S3 and at the moment don’t have a reason to re-evaluate.
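As an illustration of the automatic cleanups and the S3 “tier” lifecycle rules mentioned in the bucket list earlier in the thread, here is a minimal sketch of what such rules look like as data. It is not the Ops team's actual configuration. The 31-day and 1-year expirations come from the list; the 90-day transition to `STANDARD_IA` for gps-traces is purely hypothetical, as are the rule IDs and helper names. The dictionaries follow the shape accepted by boto3's `put_bucket_lifecycle_configuration`, and the client is passed in so the rule-building part runs without AWS access.

```python
def expire_rule(rule_id, days, prefix=""):
    """Lifecycle rule that deletes objects `days` after creation."""
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},  # empty prefix = whole bucket
        "Status": "Enabled",
        "Expiration": {"Days": days},
    }


def tier_rule(rule_id, days, storage_class, prefix=""):
    """Lifecycle rule that moves objects to a cheaper storage class after `days`."""
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": days, "StorageClass": storage_class}],
    }


LIFECYCLE = {
    # Retention periods taken from the bucket list.
    "openstreetmap-fastly-logs": {"Rules": [expire_rule("expire-31d", 31)]},
    "openstreetmap-wal": {"Rules": [expire_rule("expire-1y", 365)]},
    # Hypothetical tiering rule: the post only says there is an "opportunity" here.
    "openstreetmap-gps-traces": {
        "Rules": [tier_rule("to-ia-90d", 90, "STANDARD_IA")]
    },
}


def apply_lifecycle(s3_client, configs=LIFECYCLE):
    """Apply each configuration; s3_client would be boto3.client("s3")."""
    for bucket, config in configs.items():
        s3_client.put_bucket_lifecycle_configuration(
            Bucket=bucket, LifecycleConfiguration=config
        )
```

The trade-off named above applies to any tiering rule: colder classes are cheaper to store but more expensive to retrieve, so they suit buckets that are rarely read back.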

7 Likes

Large companies are generally complex. While everything you say is true, the large user of and contributor to OSM (historically; it may or may not have died with the OMF) was Amazon Logistics. The geo-services provided by AWS, on the other hand, have to my knowledge never been directly OSM-based and have used Here and ESRI (the bits from ESRI may have some OSM in them).

That said, having at least a cordial relationship with the OMF members that matter to us, for example Amazon, ESRI and TomTom, is probably a good idea, even if only to have some channel to rein in some of the more egregious behaviour of the OMF.

2 Likes

Thank you all. At this juncture, I am choosing to refrain from any future involvement in OSM global operations. I’ve provided additional context in my diary entry here: NorthCrab's Diary | 🌂 The Past, The Present, The Future | OpenStreetMap. I no longer view this as “my” issue and will thus cease all related commentary. OSM moderation has taken many of my texts entirely out of context, seemingly to cast me in a negative light. I’m not willing to continue the discussion in such an environment, where there’s an overemphasis on subjective views and a lack of grounded argumentation. Moreover, I find it unsettling when a thread is closed “to cool off” even when tensions have largely subsided. To make matters worse, one party continued to comment even after the thread was locked, which, to me, is a textbook form of censorship. Times have changed, and I recognize that. It’s time for me to shift my attention to other matters.