But transcribed using which set of phonetics? I’d prefer the phonetic/alphabet mappings used in English, but others might prefer Italian, Spanish, French, German, etc. That in itself is a strong reason to simply have the name field use the local language and leave the name:xx tag blank.

Then a renderer could use an appropriate library to create a phonetic equivalent in the language the renderer is targeting. The Germans have a PostgreSQL plug-in that purports to do that. See https://github.com/giggls/mapnik-german-l10n
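
To make the idea concrete, here is a rough Python sketch of the fallback a renderer might apply: use name:xx if a mapper has supplied one, otherwise transcribe the local name using the conventions of the target language. The function names (transcribe, label_for) and the tiny Cyrillic tables are made up for illustration and cover only a handful of letters; this is not the API of the plug-in above, and a real renderer would presumably call a proper transliteration library instead.

```python
# Toy transcription tables, hand-written for this example only.
# They cover just enough Cyrillic letters for the two names below.
SHARED = {"П": "P", "у": "u", "к": "k", "и": "i", "н": "n", "е": "e", "о": "o"}

# The same letters are transcribed differently depending on the target language.
PER_LANGUAGE = {
    "en": {"ш": "sh", "Ч": "Ch", "х": "kh", "в": "v"},
    "de": {"ш": "sch", "Ч": "Tsch", "х": "ch", "в": "w"},
}


def transcribe(text, lang):
    """Transcribe text letter by letter using the (hypothetical) table for lang.

    Letters without a mapping, and languages without a table, pass through unchanged.
    """
    if lang not in PER_LANGUAGE:
        return text
    table = {**SHARED, **PER_LANGUAGE[lang]}
    return "".join(table.get(ch, ch) for ch in text)


def label_for(tags, lang):
    """Pick the label a renderer targeting lang would draw for an OSM object."""
    explicit = tags.get(f"name:{lang}")
    if explicit:
        # A mapper-supplied name:xx always wins over an automatic transcription.
        return explicit
    return transcribe(tags.get("name", ""), lang)


if __name__ == "__main__":
    for name in ("Пушкин", "Чехов"):
        tags = {"name": name}
        print(name, "->", label_for(tags, "en"), "/", label_for(tags, "de"))
```

With these tables the same name field comes out as Pushkin for an English map and Puschkin for a German one, without anybody having to enter name:en or name:de by hand.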