Pssst: We have working IPv4 routing in Amsterdam with new ISP.
Basic config applied to router and single host so far.
IPv6 is broken; already reported to the new ISP. Looks like they forgot to allocate a subnet: the IPv6 router subnet works, but the “public” IPv6 subnet is missing.
I’m off to bed, and will add missing VRRP bits and improve config tomorrow.
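To illustrate the IPv6 symptom described above, here is a minimal sketch that tells an address on the router subnet apart from one in a separately routed “public” prefix. It assumes the router subnet is a /64 link network and the missing allocation would be a larger routed prefix. Both prefixes are hypothetical and taken from the 2001:db8::/32 documentation range; they are not the real allocations.

```python
import ipaddress

# Hypothetical prefixes from the IPv6 documentation range (not the real ones).
ROUTER_SUBNET = ipaddress.ip_network("2001:db8:0:1::/64")   # link between ISP and router
PUBLIC_SUBNET = ipaddress.ip_network("2001:db8:100::/48")   # routed prefix for hosts


def classify(addr: str) -> str:
    """Say which of the two prefixes an IPv6 address belongs to."""
    ip = ipaddress.ip_address(addr)
    if ip in ROUTER_SUBNET:
        return "router subnet"
    if ip in PUBLIC_SUBNET:
        return "public subnet"
    return "outside both"
```

Under this assumption, the router's own address would work while hosts numbered from the second prefix stay unreachable until the ISP allocates and routes it.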
Just for the record, BGP wouldn’t have helped in this particular failure case anyway: it would provide nothing that you can’t do with a simple manual DNS change. You’d still need to find a working OOB line to sync data and then do the R/W failover to the database that has a working link, so no improvement on the current situation at all.
BGP does provide automatic self-balancing at layer 3 (again, nothing you can’t approximate or even surpass with good GeoIP DNS balancing) and automatic self-healing/rerouting around broken routers (but that requires that your BGP uplinks do not go over the same SPOF, as was the case here).
Qualification: I’ve managed several BGP multihoming sites and one small LIR at the time (handling BGP and routing on Debian on standard enterprise servers using Quagga/FRR, on 1Gbps uplinks so far, but that should move to 10Gbps in a month or two), so perhaps I can provide some feedback should you become interested in BGP after this crisis is over (although some policy requirements for new customers have changed since I set those up, particularly those related to IPv4 shortages).
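The DNS-based alternative mentioned above can be sketched roughly as follows: given sites in priority order and some health check, pick the first one that is up and point DNS at it. The function name, the site list and the health check are all hypothetical, and the addresses come from the RFC 5737 documentation ranges. The data sync and R/W database failover are deliberately left out.

```python
from typing import Callable, Optional, Sequence

# Hypothetical site addresses (RFC 5737 documentation ranges), in priority order.
SITES = ["192.0.2.10", "198.51.100.10"]  # primary, then standby


def pick_dns_target(
    sites: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy site in priority order, or None if all are down."""
    for site in sites:
        if is_healthy(site):
            return site
    return None
```

The returned address would then be published as the A record with whatever DNS tooling is in use. This only covers the "where should traffic go" decision, which is the part BGP would otherwise automate.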
But not really. The OpenStreetMap Foundation pays 1320$ per month to
that ISP for service, and this is a lot different from the unpaid
volunteer in the original XKCD.
For me, a large part of the (dark) humour in @Strubbl’s mashup is the inversion: the expensive commercial service provider turns out to be the weak part, despite being paid to provide this service and to avoid being a weak link.
The situation that OpenStreetMap faces right now reminds me of how the Internet Archive went down a few months ago. A service a lot of people depend on, but running on hopes and dreams and only a little funding… Thank you OSMF for your work!
Our current ISP has confirmed that replacement equipment has shipped, is en route and is expected to arrive Wednesday. It then still needs to be configured.
In parallel we (Ops) are making good progress in switching to the new ISP.
I don’t know whether the foundation has put out a call for tenders to major European suppliers such as Hetzner, OVH and Ikoula? They would certainly do a mix of invoicing and sponsoring that would fit into the foundation’s budget.
In the event of a breakdown, they don’t need to order equipment from the other side of the planet, they manage tens of thousands of servers and they like to support common projects.
Our new ISP is excellent. We also have OpenStreetMap mappers who work in their engineering team.
Their cost is unbeatable; more on this once the public announcement is ready. I have strong confidence in them and their technical setup.
We have sponsored equipment at OVH and previously at Hetzner. We tend to run our own equipment for a variety of reasons. OVH and Hetzner are great, but not ideally suited to how we run things. See https://hardware.openstreetmap.org/
New ISP configured and we have started moving servers and DNS. If all runs smoothly we will have all services back by this evening (GMT/UTC) running on the new ISP.
Minor: IPv6 is not yet working with the new ISP; we will add it once the issues are resolved.
Confirming that spike-06, spike-07 and spike-08 are currently reachable via IPv4:
$ ping -4 spike-08.openstreetmap.org
PING spike-08.openstreetmap.org (82.199.86.104) 56(84) bytes of data.
64 bytes from 82.199.86.104: icmp_seq=1 ttl=58 time=32.5 ms
64 bytes from 82.199.86.104: icmp_seq=2 ttl=58 time=33.0 ms
64 bytes from 82.199.86.104: icmp_seq=3 ttl=58 time=33.3 ms
64 bytes from 82.199.86.104: icmp_seq=4 ttl=58 time=32.7 ms
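A rough scripted version of the same check, run separately for IPv4 and IPv6, might look like the sketch below. Note that it tries a TCP connection instead of ICMP ping (which usually needs raw-socket privileges), and that the default port 443 is an assumption about what the hosts serve. `check_families` and `_tcp_connect` are hypothetical helper names.

```python
import socket


def _tcp_connect(info, timeout):
    """Try one TCP connection to an address returned by getaddrinfo."""
    family, socktype, proto, _canonname, sockaddr = info
    try:
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
        return True
    except OSError:
        return False


def check_families(host, port=443, timeout=3.0,
                   resolve=socket.getaddrinfo, connect=_tcp_connect):
    """Return {"IPv4": bool, "IPv6": bool} for TCP reachability per family.

    `resolve` and `connect` are injectable so the logic can be exercised
    without network access.
    """
    results = {}
    for name, family in (("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6)):
        try:
            infos = resolve(host, port, family, socket.SOCK_STREAM)
        except socket.gaierror:
            # No address record for this family (or resolution failed).
            results[name] = False
            continue
        results[name] = any(connect(info, timeout) for info in infos)
    return results
```

For example, `check_families("spike-08.openstreetmap.org")` would report the two address families separately, which matches the current state where IPv4 works and IPv6 is still pending.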
Just as a side note: I think he meant “BGP through multiple uplinks”; one usually doesn’t have multiple uplinks without BGP (unless you’re the drinking buddy of both). But I know you know, I just mentioned it for the sake of clarity.
But yes, two uplinks are better than one, especially when the backup link is “on standby” and the cabling is not expensive, like inside a DC.
I wondered whether there’s an OSMF network topology somewhere (e.g. it’s not clear who provides hosting and who provides connectivity; something similar to what exists for hw), but first, I haven’t found anything, and second, it’s probably none of my business anyway.
If they’re bringing up the site, I would say that unless you’re working with them in a coordinated way, trying to get the admins’ attention here is more detrimental than helpful. Let them finish their job and announce when it’s ready, and then report any issues you might find.