From a server perspective, the size alone wouldn’t necessarily be a problem, as it wouldn’t be any more unbounded than an OSM PBF extract of the same bbox. In OpenHistoricalMap, the San José bbox is currently a much bigger headache for iD than for the OSM API, now that this API has switched to CGImap.
From a client perspective, especially on a mobile device, there is a strong desire for highly optimized tiles. Even the notion of downloading individual tiles is somewhat problematic for some use cases. Mapbox’s initial approach to offline maps would essentially hammer the server API, scraping every tile at every zoom level in the desired region. Not so great on a cell connection. (MapLibre GL Native still has this mechanism.) Mapbox later built an alternative mechanism based on a single packed download, though I believe purpose-built applications like OsmAnd and Organic Maps still work with a more efficient format that can be used for rendering, routing, and search simultaneously.
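To give a rough sense of why tile-by-tile scraping hammers a server, here is a minimal sketch that counts the requests needed to cover a bbox at every zoom level, using the standard slippy-map (XYZ, Web Mercator) tile numbering. The bbox is only an approximate box around San José, California, and the zoom range of 0 to 16 is an arbitrary assumption; neither comes from any particular client, and a real offline downloader might skip or deduplicate tiles.

```python
import math


def tile_xy(lat, lon, zoom):
    """Slippy-map tile containing a WGS84 point at the given zoom."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern or southern edge stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_count(west, south, east, north, zoom):
    """Number of tiles intersecting the bbox at a single zoom level."""
    x_min, y_min = tile_xy(north, west, zoom)  # top-left tile
    x_max, y_max = tile_xy(south, east, zoom)  # bottom-right tile
    return (x_max - x_min + 1) * (y_max - y_min + 1)


def total_requests(west, south, east, north, min_zoom, max_zoom):
    """Requests needed to fetch every tile at every zoom in the range."""
    return sum(
        tile_count(west, south, east, north, z)
        for z in range(min_zoom, max_zoom + 1)
    )


# Approximate bbox around San José, California (an illustrative assumption).
SAN_JOSE = (-122.05, 37.12, -121.59, 37.47)

if __name__ == "__main__":
    for z in range(0, 17):
        print(z, tile_count(*SAN_JOSE, z))
    print("total:", total_requests(*SAN_JOSE, 0, 16))
```

Each additional zoom level roughly quadruples the tile count, so the deepest few zoom levels dominate the total. That is the part a single packed download avoids turning into individual requests.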
If this topic is still about the EWG’s interest in potentially serving tiles, stating the intended use cases would give a better idea of the optimal formats and software stack.