On Tue, 7 Dec 2010, Ville Skyttä wrote:
Don't stop reading there. I don't know about forbidden, but they do write that the value "is" something else, a bit below the above quoted part:
I didn't! :)
- In this PO file field, but not in locale names, ‘ll_CC’ combinations
denoting a language's main dialect are abbreviated as ‘ll’. For example, ‘de’ is equivalent to ‘de_DE’ (German as spoken in Germany), and ‘pt’ to ‘pt_PT’ (Portuguese as spoken in Portugal) in this context.
- In this PO file field, suffixes like ‘.encoding’ are not used.
- In this PO file field, variant designators that are not relevant to message
translation, such as ‘@euro’, are not used.
So, if your locale name is ‘de_DE.UTF-8’, the language specification in PO files is just ‘de’."
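
The three rules quoted above can be sketched as a small function; this is only an illustration, not anything gettext ships. The name po_language and both tables are hypothetical: the main-dialect table holds only the two pairs named in the quoted text (de_DE, pt_PT), and '@euro' is the only variant assumed irrelevant to translation.

```python
# Hypothetical table: only the two main dialects named in the quoted docs.
MAIN_DIALECT = {"de": "DE", "pt": "PT"}

# Assumption: '@euro' is the only variant treated as irrelevant to translation.
IRRELEVANT_VARIANTS = {"euro"}


def po_language(locale_name):
    """Derive a PO 'Language' field value from a locale name (sketch)."""
    # Split off a variant designator such as '@euro'.
    base, _, variant = locale_name.partition("@")
    # Drop an encoding suffix such as '.UTF-8'.
    base = base.split(".", 1)[0]
    # Abbreviate 'll_CC' to 'll' only when CC is the language's main dialect.
    lang, _, country = base.partition("_")
    if country and MAIN_DIALECT.get(lang) == country:
        base = lang
    # Keep any variant that is not known to be irrelevant.
    if variant and variant not in IRRELEVANT_VARIANTS:
        base += "@" + variant
    return base
```

With these assumptions, 'de_DE.UTF-8' maps to 'de', while 'zh_CN.UTF-8' stays 'zh_CN' because no main dialect for zh is listed in the table.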
So, "de" is a synonym for "de_DE" and both definitions are as correct as they can be according to my interpretation.
But leaving out the country only applies to _the_ (there can be only one I gather) primary dialect of a language. So both zh_CN and zh_TW cannot be the primary dialect; dunno if there's such a thing for Chinese in the first place. If you look at my patch carefully, you'll see that for zh_CN.po the value of the Language field is zh_CN.
And here you could use plain "zh", as it defaults to "zh_CN" according to my quick Google searches.
I did not invent any of these values myself - I just first fixed the Language-Team fields so that gettext itself understands them, and then ran the files through gettext, and copied/included in my patch what gettext itself had added. I think following what gettext's docs say/recommend and what it actually does itself is the best approach.
In that case those plain language codes are enough. I was just thinking about new translations that might come in the future using subdialects. Translators usually use the existing ones as a starting point and might not remember to update these fields correctly.
BR, -- rofa