Basically, you need to look at the detailed output from Nominatim (which is not the OSM API; it's the API of a geocoder that uses OSM data).
This is 775 Harvard St NW in Nominatim: https://nominatim.openstreetmap.org/details.php?osmtype=W&osmid=69204393&class=building
And this is the underlying OSM object: https://www.openstreetmap.org/way/69204393#map=19/38.92693/-77.02531
So some things to note:
- The building does not have a zip code tagged on it, so the zip code has to be computed.
- You can see the entry for the computed zip code in the Nominatim address breakdown.
- Pleasant Plains is mapped as an area tagged place=neighbourhood.
- Chinatown is mapped as a node tagged place=suburb.
This means that not only are the bounds of the zip code computed, but so are the bounds of Chinatown: it is mapped as a node, so it has no boundary of its own. Because it is marked as a suburb, it is assumed to consist of several neighbourhoods. It looks as though no other suburbs are mapped in DC north of Chinatown, so all the neighbourhoods there end up being allocated to Chinatown. You can view the neighbourhoods and streets which belong to Chinatown here: https://nominatim.openstreetmap.org/details.php?osmtype=N&osmid=158431204&class=place
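The two details links above follow the same pattern, so if you want to inspect more objects it may be easier to build the URLs in code. This is only a small sketch: the function name `details_url` is made up for illustration, and the parameter names are simply the ones visible in the links above.

```python
from urllib.parse import urlencode

# Base address taken from the details links quoted above.
DETAILS_ENDPOINT = "https://nominatim.openstreetmap.org/details.php"

def details_url(osm_type, osm_id, place_class):
    """Build a Nominatim details URL (hypothetical helper, not part of any library).

    osm_type is "N" (node), "W" (way) or "R" (relation);
    place_class is the main tag key of the object, e.g. "building" or "place".
    """
    query = urlencode({"osmtype": osm_type, "osmid": osm_id, "class": place_class})
    return f"{DETAILS_ENDPOINT}?{query}"

# The building and the Chinatown node discussed above.
print(details_url("W", 69204393, "building"))
print(details_url("N", 158431204, "place"))
```

If you query the public server from a script, keep the request rate low and follow its usage policy.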
There are a few more complex things going on: Columbia Heights is marked both as a neighbourhood with fixed boundaries and as a suburb, but appears to be treated only as a neighbourhood.
Zip codes are probably derived from TIGER data, which is retained on the street geometries (https://www.openstreetmap.org/way/130764273#map=17/38.92697/-77.02411). You can see that Harvard St NW has 20001 on one side and 20009 on the other, which likely makes the computation harder.
To summarise:
- Zip codes have not been added explicitly to buildings and are therefore computed. I’m not sure exactly how this works, but accurate postcodes on individual addresses are best practice.
- Neighbourhoods and suburbs have been added inconsistently around Washington DC. I’m not sure of the reasons for this, but I suspect someone wanted certain places to show up earlier than others on the map. (This is a bad thing, termed “Mapping for the Renderer”: bad because it usually produces unwanted side effects for other applications such as geocoding.)
What can you do about it?
It really depends on how you want to use this data. I’m assuming you expect users to search for a restaurant and you want to show the result in a street address, locality, zip format, so a reasonable first approximation would be to use only the first place element after the street address. I think the API provides JSON and GeoJSON outputs, which make parsing a bit easier, i.e., only work with 775 Harvard St NW, Pleasant Plains, Washington DC, followed by the zip code.
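As a rough sketch of that first approximation, assuming you already have the `address` part of a Nominatim JSON result as a Python dict: the list of locality keys below is my assumption about which keys are likely to turn up, so check it against real responses for your area. The sample dict is hand-written from the values discussed above, not a captured response, and `short_address` is a made-up name.

```python
# Keys that may hold a sub-city place, most specific first.
# This list is an assumption; adjust it after looking at real responses.
LOCALITY_KEYS = ("neighbourhood", "quarter", "suburb", "city_district")

def short_address(address):
    """Format 'street address, locality, city, zip' from an address dict."""
    street = " ".join(
        part for part in (address.get("house_number"), address.get("road")) if part
    )
    # Keep only the first (most specific) locality that is present,
    # so a computed suburb such as Chinatown is dropped.
    locality = next((address[k] for k in LOCALITY_KEYS if address.get(k)), None)
    city = address.get("city") or address.get("town") or address.get("village")
    parts = [street, locality, city, address.get("postcode")]
    return ", ".join(p for p in parts if p)

# Hand-written sample, not real API output; the postcode is left out on purpose.
sample = {
    "house_number": "775",
    "road": "Harvard St NW",
    "neighbourhood": "Pleasant Plains",
    "suburb": "Chinatown",
    "city": "Washington",
}
print(short_address(sample))  # 775 Harvard St NW, Pleasant Plains, Washington
```

Missing elements are simply skipped, so the same function still returns something sensible for addresses without a house number or a mapped neighbourhood.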
You can, of course, work to improve the data. However, please make sure that you do not copy from sources such as Google, whose terms do not allow their data to be used in OSM. If the zip codes encoded in the street geometries are accurate, they can be added directly to the addresses along the street (e.g., using addr:postcode).
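Before copying a zip code from a street to its addresses, it is worth checking that the street carries only one. Here is a minimal sketch, assuming the street's tags are available as a dict and that the imported zip codes sit in the `tiger:zip_left` and `tiger:zip_right` tags; the function name is hypothetical, and which side carries which value in the example is made up for illustration.

```python
import re

def unambiguous_zip(tags):
    """Return the street's zip code if every TIGER zip tag agrees, else None."""
    zips = set()
    for key in ("tiger:zip_left", "tiger:zip_right"):
        # Some imported values hold several zips separated by ':' or ';'.
        for value in re.split(r"[;:]", tags.get(key, "")):
            if value.strip():
                zips.add(value.strip())
    return zips.pop() if len(zips) == 1 else None

# A street like Harvard St NW, with a different zip on each side:
print(unambiguous_zip({"tiger:zip_left": "20001", "tiger:zip_right": "20009"}))  # None
# A street with the same zip on both sides:
print(unambiguous_zip({"tiger:zip_left": "20001", "tiger:zip_right": "20001"}))  # 20001
```

When this returns None, the zip code has to be decided address by address, for example from local knowledge or a survey.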
For more detailed help, there is a Nominatim mailing list and an IRC channel, and OSM-US hosts a Slack channel.