Most timezone boundaries correspond to administrative boundaries, so importing them as separate shape files would be wrong. They would need to share nodes with the existing boundaries, or be created as relations based on existing areas.
I commented on one of their changesets, but I’m afraid it will eventually take a revert by the DWG to restore the previous state. Pinging @woodpeck and @SomeoneElse from DWG.
While I do not doubt their good faith, it is a good idea to search the wiki and forums before engaging in a huge, globe-spanning yet futile effort.
Yes, I have always wanted to kill the remaining time zone relations (after
the discussion we had years ago which resulted in the documented
“adequately captured by timezone tag on admin boundary”), but never got
around to doing it. Maybe now is the time, before more people think “hey,
there’s a timezone relation missing here” and create another 20k member
relation…
My understanding is that OSM doesn’t strictly need a giant boundary relation for every time zone, because Timezone Boundary Builder can generate them from a combination of time zone boundaries and administrative boundaries.
In any case, now that OpenHistoricalMap has imported a comprehensive dataset of historical U.S. county boundaries, I guess we can look into mapping the U.S. timezone boundaries there using the county boundaries as a foundation, and also investigate doing something similar for the rest of the world, so that tz database users will continue to have access to useful boundary data. This could be a positive outcome, since the tz database also keeps track of historical changes to time zones that would be out of scope for OSM but well within scope for OHM.
The plan was to replace explicitly mapped timezone boundaries by
timezone tags on administrative entities on the highest applicable admin
level, and only resort to explicitly mapped timezone boundaries where
this was not possible.
For the use case “I want to know the (multi)polygon where this time zone
applies” you would therefore have to union all objects with
boundary=administrative OR boundary=timezone that have a certain
timezone tag.
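A minimal sketch of how such a selection could be expressed as an Overpass QL query, built here as a Python string. The function name `timezone_union_query` and the timeout value are illustrative assumptions, and the sketch only looks at relations; the geometric union itself would still have to be done client-side with a geometry library.

```python
def timezone_union_query(tzid: str, timeout: int = 180) -> str:
    """Build an Overpass QL query for every boundary relation tagged with `tzid`.

    Hypothetical helper for illustration; it only builds the query text and
    does not send it anywhere.
    """
    # Escape characters that would break out of the quoted Overpass string.
    safe = tzid.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"[out:json][timeout:{timeout}];\n"
        "(\n"
        f'  relation["boundary"="administrative"]["timezone"="{safe}"];\n'
        f'  relation["boundary"="timezone"]["timezone"="{safe}"];\n'
        ");\n"
        "out geom;"
    )


query = timezone_union_query("America/Chicago")
print(query)
```

Every relation returned by the query would then be turned into a polygon and merged with the others to obtain the area where the time zone applies.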
For the use case “I want to know what timezone this location is in” you
would have to look for any object with boundary=administrative or
boundary=timezone and a timezone tag set that contains the location in
question.
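The lookup described above could be sketched roughly as follows. This is a toy illustration, not how any existing tool does it: the feature list is made-up rectangles rather than real boundary geometry, and the names `point_in_ring`, `timezones_at`, and `FEATURES` are assumptions introduced for the example.

```python
def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the closed ring of (lon, lat) pairs?"""
    inside = False
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        # Count the edges that a ray cast toward +lon from the point crosses.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def timezones_at(lon, lat, features):
    """Return the timezone=* values of all tagged boundaries containing the point."""
    hits = []
    for feature in features:
        tags = feature["tags"]
        if tags.get("boundary") not in ("administrative", "timezone"):
            continue
        if "timezone" not in tags:
            continue
        if point_in_ring(lon, lat, feature["ring"]):
            hits.append(tags["timezone"])
    return hits


# Hypothetical, made-up rectangles; not real boundary geometry.
FEATURES = [
    {"tags": {"boundary": "administrative", "admin_level": "6",
              "timezone": "America/Chicago"},
     "ring": [(0, 0), (10, 0), (10, 10), (0, 10)]},
    {"tags": {"boundary": "timezone", "timezone": "America/Denver"},
     "ring": [(-10, 0), (0, 0), (0, 10), (-10, 10)]},
    # An enclosing admin boundary without a timezone tag is ignored.
    {"tags": {"boundary": "administrative", "admin_level": "4"},
     "ring": [(-10, 0), (10, 0), (10, 10), (-10, 10)]},
]

print(timezones_at(5, 5, FEATURES))
print(timezones_at(-5, 5, FEATURES))
```

If the tagging plan is followed consistently, the returned list should contain a single value for any location.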
Creating relations for time zones would be generally unnecessary since
no place is in two time zones at once.
I’d like to add my thoughts here on several items.
Quality and Comprehensive Coverage
Since this topic first started years ago, the quality of timezone relations in OpenStreetMap has improved dramatically as a result of my and Arctic Gnome’s edits. As of today, 94.5% of the 419 timezone IDs from the timezone database’s zone.tab file are mapped in OpenStreetMap in such a way that the results of a single Overpass query for all relations with a timezone=* tag can be downloaded and unioned into a GeoJSON multipolygon. Most of the remaining ones are zones in Antarctica that haven’t yet been mapped.
This data has been relied upon since the timezone-boundary-builder project started in 2016. However, over time, and especially in the last year, I made an effort to add and correct as many boundaries as possible in OpenStreetMap. It is no longer a “futile effort” to map timezones, as the coverage is mostly complete across the globe. It still does take some effort to keep things current, but that effort takes me a few hours a few times a year.
Usage of timezone data
The timezone-boundary-builder project has had over 300,000 downloads of release data since its inception. There are currently 21 known software packages in various programming languages that use this data. Some of these packages have several million downloads, such as the Python package timezonefinder, which has over 1 million monthly downloads.
Given the widespread usage of this data, I recommend caution before considering any deletion of this data.
Preference for continued inclusion of timezone boundaries in OSM, not OHM
While the timezone-boundary-builder project is capable of making numerous queries for various relations and ways-as-areas, I don’t believe it is appropriate for inclusion in OHM. New timezones are only created through the timezone database project when there is an area of the world that has a timekeeping method that uniquely diverges from any other timekeeping methods. Therefore any query to get the historical time at a given GPS coordinate needs the current boundaries of timezones. Additionally, it would be a significant amount of effort to go back to manually compiling boundaries from all of the relations that make up some timezone boundaries without simply querying for the timezone=* tag.
Another item to consider is that the timezone database only keeps track of time changes occurring after the year 1970 as noted in the theory page. Therefore, OHM may need to use a different and not as widely-used data source to account for time changes before 1970.
Large UTC* timezone relations
I am aware of the large timezones that Arctic Gnome is creating that are of the UTC* variation. These large relations are not used in timezone-boundary-builder. Instead of downloading large relations from OSM, timezone-boundary-builder simply diffs against all other timezones to create zones in the oceans.
While I do not know Arctic Gnome’s intent in creating these large relations, I don’t think they present a problem to the data being used in timezone-boundary-builder either. It does seem that they don’t necessarily fit into the scope of what is described in the boundary=timezone or Key:timezone wiki pages.
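As a rough illustration of the diffing idea only (this is not timezone-boundary-builder’s actual code, which works on polygon geometry), the sketch below represents areas as sets of grid cells and subtracts every mapped zone from a nominal band. The function name `ocean_zone` and all cell data are assumptions made up for the example.

```python
def ocean_zone(nominal_band, mapped_zones):
    """Return the cells of `nominal_band` not claimed by any mapped zone."""
    # Union of every cell that already belongs to some mapped time zone.
    claimed = set().union(*mapped_zones.values())
    return nominal_band - claimed


# Hypothetical 4x3 grid standing in for one nominal longitude band.
band = {(x, y) for x in range(4) for y in range(3)}

# Hypothetical cells covered by zones that are mapped explicitly.
mapped = {
    "America/Chicago": {(0, 0), (1, 0)},
    "America/Denver": {(3, 2)},
}

leftover = ocean_zone(band, mapped)
print(len(band), len(leftover))
```

Whatever is left after the subtraction stands in for the ocean zone, so no giant relation has to be downloaded to obtain it.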
Desired clarifications/discussion
Instead of simply calling for a mass deletion of data, I urge this conversation to steer more towards what policies we should be implementing when mapping timezones. Since the data exists and has existed for the better part of a decade now and is heavily used, I would be surprised if this data were to suddenly be deleted.
However, I do think it would be useful to discuss how to map timezone boundaries, in particular the following topics:
Whether or not it’s appropriate to include the UTC* timezones that Arctic Gnome has been adding.
Whether or not it’s appropriate to include Exclusive Economic Zones within the boundaries of timezones.
How to handle discrepancies between various “deprecated” timezones. For this one, the maintainer of the timezone database keeps some timezones that have shared the same timekeeping method since 1970, but differed prior to that cutoff year. In the past there was no such cutoff, and a significant amount of commotion occurred after the maintainer instituted the 1970 cutoff.
Etcetera
Creating relations for time zones would be generally unnecessary since
no place is in two time zones at once.
There are plenty of places where a relation is needed to model a timezone. There are lots of places where a timezone boundary does not follow any administrative boundary. The boundary between America/Chicago and America/Denver in particular has plenty of places where the boundary goes along a river, a road and sometimes arbitrary surveying points.
There are numerous places where a single place can be in multiple timezones at once. The largest example is Asia/Urumqi, but any place with a border dispute will also have multiple possible timezones. Also, people who live near timezone boundaries will often informally use a different timekeeping method from the one that officially applies to their area, sometimes resulting in interesting situations like in Fort Pierre, South Dakota.
I am not advocating for the deletion of timezone=* tags on objects.
The earlier discussion I referred to was about people creating relations
that would encompass all boundary lines of a time zone, including
where they are coincidental with administrative boundaries, leading to
very large relations (=relations with very many members). OSM is not
well suited to handle such relations; anyone making a change to any of
the boundaries included would have to upload a new version of the
tens-of-thousands-of-members timezone relation, whether that particular
mapper was even interested in time zones or not.
Such time zone relations, if they still exist, must be deleted, and the
addition of such time zone relations is discouraged.
Where a time zone boundary does not coincide with an administrative
boundary, the creation of a boundary relation for that area is ok; it
should, however, be limited to the absolutely necessary parts. If, for
example, a state in the US is not covered by a single time zone, then
instead of mapping two large time zone relations “timezone A in state X”
and “timezone B in state X”, one should attempt to assign timezone=*
tags to individual counties of that state, or even smaller admin
boundaries, and only if that is not possible should time zone boundaries
be mapped as separate entities.
Regarding your statement about regions being in two time zones at once,
my opinion is as follows: Most time zones are not verifiable on the
ground. We tolerate them in OpenStreetMap but they are not a primary
mapping interest of OSM as a project; they are third-party data that is
only “carried” through OSM. OSM mappers cannot improve time zones by
survey. OSM will always be limited to “echoing” what is defined elsewhere.
I appreciate your interest in time zones; I always respect it when
people get a bit geeky about their subject. But the fact that timezones
are a fascinating subject does not mean that OSM has to be fashioned
into a vehicle to cover every nut and bolt of timezone geekery. As I
said, timezones are tolerated in OSM but if they start getting in the
way of mapping the stuff we actually want to map, then that tolerance
might go out the window - and it’s not like it would be a huge loss for
downstream data users since timezones and boundaries don’t change so
often that you need to have a minutely updated extract mechanism.
This is fairly hyperbolic. I did some Overpass querying for boundary=timezone relations and found some that had several thousand members, but none of the ones I looked at approached ten thousand, let alone multiple tens of thousands. I share the frustration about editing small geographic features that are part of massive relations, but this is not unique to time zones. Large national boundaries are just as frustrating. Complex lakes and seas can be far worse. That said, I don’t see how removing them solves the problem. Someone will just recreate them.
Strong language that I can’t say I fully agree with. Even though I have no particular interest in time zones in OSM, they are clearly useful and providing value through the timezone-boundary-builder project.
Technically, the tz database tracks a time zone’s various additions and removals past a certain date. It has a separate entry for America/Indiana/Vincennes because those four counties of southwestern Indiana have shared a common time zone history since 1970 (observing Central Time from 1977 to November 2007). This differs from how either OSM or OHM conceptualizes boundaries. For example, when a city annexes surrounding territory, we represent the resulting boundary as a single feature rather than mapping the annexation per se.
For example, OHM imported San José’s territorial evolution from a dataset that tracked individual annexations by date. Most of the features in the dataset are individual parcels. But we processed the data so that each boundary relation represents what was considered to be the city limits on a given day. You’d need to diff two of the boundaries to get the set of annexations that occurred on the same day. OHM could do something similar with time zone boundaries to arrive at geometries that correspond one-for-one with more of the tz database entries than OSM would ever be able to represent.
But you’re absolutely right that OHM would need to consult more sources beyond the tz database. One of the primary differences between OSM and OHM is that OHM welcomes verifiable geodata whether someone geeks out about century-old etchings in the slab of concrete they’re standing on, traces an out-of-copyright map, imports boundaries that were valid for all of three days in 1960, or consults an archaeological survey. In my view, OHM will inevitably incorporate time zone boundaries, regardless of what OSM does with them. I suppose that’s a win-win from a certain perspective.
From this explanation, it sounds like the concern is the maintenance overhead of keeping the boundary intact. I can only speak for the U.S. where I’ve mapped time zone boundaries, but this goes somewhat beyond the status quo and seems counterproductive to me. The Central Time Zone (or “America/Chicago”, as the tz database calls it) incorporates 15 ways that don’t correspond to any administrative boundary. All of them are pretty straightforward. If anyone breaks the time zone boundary elsewhere, they’ll most likely also break an administrative boundary at the same time, so there’s no additional maintenance overhead.
The status quo is that there’s a single boundary relation representing the Central Time Zone. You’re proposing to dissolve this boundary relation into timezone=* tags on ten state boundary relations, 84 Indian reservation boundaries, and 1,028 county boundary relations (65 FL, 92 IN, 105 KS, 120 KY, 83 MI, 47 ND, 93 NE, 95 TN, 64 SD, 254 TX), then create 12 new timezone boundary relations to represent “X Time Zone in Y County” and “X Time Zone on Y Reservation” along non-administrative portions of the boundary. None of these relations would correspond to real-world concepts, being artifacts of OSM’s chosen data model.
Thus, what is universally regarded as a single geographic feature in reality would instead be represented by 1,134 distinct elements in OSM. That’s 358 more elements than are members of the current boundary relation – about 46% more. Instead of using a conventional tool such as Overpass to determine whether the boundary is closed, we’ll need to do something custom to union the 1,134 geometries together and determine whether there are any holes.
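The tally behind these figures can be checked with a few lines of arithmetic. The counts are the ones stated in this thread (776 is the current way-member count of the Central Time Zone relation mentioned in a later reply); the variable names are just labels chosen for the sketch.

```python
# Counts as stated in the thread.
state_relations = 10
reservation_relations = 84
county_relations = 1028
new_timezone_relations = 12
current_way_members = 776  # members of today's Central Time Zone relation

proposed_elements = (state_relations + reservation_relations
                     + county_relations + new_timezone_relations)
extra = proposed_elements - current_way_members

# Two different ratios that are easy to confuse.
increase_vs_current = extra / current_way_members
share_of_proposed = extra / proposed_elements

print(proposed_elements, extra)  # 1134 358
print(f"{increase_vs_current:.1%} {share_of_proposed:.1%}")  # 46.1% 31.6%
```

Relative to the current 776 members the increase is about 46%, while the 358 extra elements make up about 32% of the proposed total of 1,134.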
For what it’s worth, I’m no time zone geek. I patch up what I need to, so that OSM has the basics, and retreat as fast as I can. Maintaining comprehensive time zone data is a full-time professional job that I don’t need. But I care about either OSM or OHM having present-day time zone boundaries, because they’re a standard part of a general-interest map. You’d be hard-pressed to find a political map, road map, or rail map of the U.S. in print that lacks time zone boundaries, and I hope that Americana will also get there someday.
@EvanSiroky was referring to a situation where a locality keeps a different time unofficially than it does officially. I’m familiar with this situation, having grown up just across the border from a county in Indiana that kept “official time” (Central) and “commercial time” (Eastern). The very classic OSM approach would be to map what’s on the ground: that is, include the county in the Eastern Time Zone per the signs but simultaneously include it in the Central Time Zone per the clock on every wall. The locals would love our approach to local knowledge.
I accept that this differs markedly from the experience in some other parts of the world, where time zones conveniently fall under the “Don’t map local legislation” principle. But I look at the slippery slope argument with some skepticism, because the reality is that timekeeping practices do differ from region to region.
Can I ask a dumb question?
If a time zone is generally an area composed of administrative areas, and administrative areas are defined by boundary relations, wouldn’t it be easy to group the administrative areas as relation members into a timezone relation?
Wouldn’t that prevent having to put a timezone tag on every single member, and prevent also having to create monster timezone boundary relations with thousands of way members?
In that case, the Central Time Zone relation would be a monster relation with 1,134 relation members instead of the current 776 way members. It’s bad enough that some mappers see the need to stuff a boundary relation with the relations of all its subdivisions as subarea members, but at least all those subareas are actually related to each other. From a technical standpoint, this would be more like mapping a relation of all the countries and subnational regions that drive on the left, or all the U.S. cities, counties, and states that have banned plastic shopping bags.
It would be especially ironic because the U.S. has never conceptualized a time zone as a property of another administrative subdivision, but rather a zone with its own boundaries. The boundaries happen to coincide with administrative subdivisions most of the time, just as state boundaries generally align with county boundaries – not pure happenstance, but not quite the same feature.
Really? I’m surprised. I thought most countries would count as one member each, except that where a country has multiple time zones you go by state. Only when a state has multiple timezones do you go by county for that state. When time divisions within states occur often, the number of members rises very quickly, I can see that, but 1134?
You’re right, I copy-pasted the 1,134 figure from my response above, but I forgot that that figure represents the number of elements that need to be tagged with a timezone due to dissolving the Central Time Zone, including plenty of counties in the neighboring Mountain and Eastern time zones. So while the figure is still a stark rebuttal to the original relation-less proposal, it isn’t relevant to the relation you’re describing. Sorry for the confusion.
If we were to model the Central Time Zone relation as you describe, then it would have these 631 boundary relations as subarea members, plus the 12 artificial timezone boundaries along non-administrative features. So there would be a 17% reduction in the number of members in this relation.
This sort of relation would clearly run afoul of the “Relations are not categories” principle. What do the Town of West Wendover in Nevada, Tooele County in Utah, and the State of Colorado have to do with Dunn County in North Dakota, other than the fact that all happen to observe Mountain Time and lie within the real-world Mountain Time Zone boundary?
This modeling makes it more difficult to determine whether there are any gaps in the relation. If today’s relation is fragile, imagine the same fragility but more difficult to detect.
What’s more, in order to load the actual real-world time zone boundary’s geometry, you have to load an inordinate number of county and state boundary ways that don’t run along the actual boundary. This slimmed-down Mountain Time Zone relation may have only 151 members, but you’d have to recursively load around 5,695 ways representing a superset of the geometry. I’m not talking about what it takes to turn the geometry into something useful; this is just what it takes to load the whole relation, so you don’t end up breaking it while editing something unrelated.
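To make the loading cost concrete, here is a small sketch of resolving a relation whose members are themselves relations. The data is invented (two hypothetical counties sharing one way) and `collect_ways` is an assumed helper, not an editor or API function; the point is only that the ways to load far outnumber the direct members.

```python
def collect_ways(relation_id, relations, seen=None):
    """Recursively gather every way reachable from a relation's members."""
    if seen is None:
        seen = set()
    if relation_id in seen:  # guard against membership cycles
        return set()
    seen.add(relation_id)
    ways = set()
    for member_type, member_ref in relations[relation_id]:
        if member_type == "way":
            ways.add(member_ref)
        elif member_type == "relation":
            ways |= collect_ways(member_ref, relations, seen)
    return ways


# Hypothetical relations: id -> list of (member type, member ref).
RELATIONS = {
    "tz": [("relation", "county_a"), ("relation", "county_b")],
    "county_a": [("way", "w1"), ("way", "w2"), ("way", "w_shared")],
    "county_b": [("way", "w_shared"), ("way", "w3"), ("way", "w4")],
}

loaded = collect_ways("tz", RELATIONS)
print(len(RELATIONS["tz"]), len(loaded))
```

In this toy case two direct members expand to five ways, including `w_shared`, which lies between the two counties and is not part of the outer time zone boundary at all.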
Thanks for making me understand the matter. I can see why it has a high PITA-potential. And I understand that tagging the zones themselves as areas or boundaries is far more direct and manageable than breaking them down into pieces that have to be pieced together every time someone wants to do anything with time zones.
Should OSM hold this information in the database?
Well, people can do that if they like. The question is whether there is enough ground truth and verifiability. I think all time zones have enough public time devices to say there is ground truth. It’s more about precision: where exactly is the boundary? Either we accept imprecise borders, or we use external sources for the precision and the verification. I think it is safe to say that OSM already does both in many cases.
Time and again, similar attempts to populate huge geographical features (timezones, bays, mountain chains, geographical regions, seas, …) raise the question of the boundaries of OpenStreetMap’s scope, which has never been definitively answered. Obviously, such huge features do not play well with local features, due to the limitations of the tools and the data model (and generally, it’s hard to design a model of pretty much anything that would scale well across such a huge range of dimensions).
For a start, OpenStreetMap lacks a proper layer model which could help separate global from local features. If we could go back and design the model from scratch, boundaries would be a separate layer/namespace implemented in different database tables, which could be queried through a joint API but would otherwise be completely separate. For most use cases (including routing, our original purpose), one does not need boundaries along with local geographic features such as highways and waterways.
But that’s water under the bridge… Unless we reach a clear-cut specification of our scope, implemented by majority vote if it must be, there will always be friction like this. I’m not even saying that we should do that, but then we will have to accept that it’s a byproduct of the anarchic-by-design model that we have (and that has been a success story for the most part).
I believe layers would probably create more problems than they solve. If there are things that can be kept completely separate because they do not interact with any of the other data, they could just as well be kept in a different project. They could also be filtered in the current model, as e.g. JOSM allows you to do, which would seem to be exactly the same as a different “layer”.