Non-English source locales

FAQ: Can I translate from a language other than English?

We plan to provide a source language switcher to help non-English-speaking translators, but we cannot say when this will be implemented. The main Loco platform has this feature already, but it is currently a separate product and beyond the scope of this article.

As it stands, the Loco Translate file editor is only able to show the original source strings for translating, and these are assumed to be in English because they usually are. If a developer created their source strings in another language, you will simply have to put up with them being wrongly labelled. This makes no difference to the functioning of the editor unless you submit the text to one of the automatic translation providers. These APIs will be told to translate from English, so they are unlikely to return useful results if they receive something else.

The rest of this page explains more about why source strings are usually in English.

Source strings

The source locale of WordPress is "en_US".
We strongly recommend against any attempt to use an alternative source language in your code.

Although your site language is configurable, its source language is not. This concept simply does not exist in WordPress. In the world of WordPress all source strings are implicitly English. That means that they will be assumed by others to be English, even if they are not.

This might sound confusing, but the reality is that source strings are just unique identifiers for text which appears on your site. For example, if no translation is found for "Leave a comment" then your site will show the English text. Arguably English is the best choice for fallback text due to its global reach, especially online.

If a Polish theme developer wrote all their source strings in Polish, and you then set your site language to English, your visitors would see Polish wherever an English translation was missing. Likewise, if the developer used an abstract identifier like "comment-section-header", this is what your visitors would see as a fallback.
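
The fallback behaviour described above can be illustrated with a minimal Python sketch. It is not WordPress code: the translate function and the catalogs are hypothetical, and a plain dictionary stands in for a translation file. It only shows that a missing translation leaves the source string itself on screen, whatever that string happens to be.

```python
def translate(source, catalog):
    """Return the translation of `source`, or `source` itself if none exists."""
    return catalog.get(source, source)

# Hypothetical, partially translated French catalog with English source strings.
french = {"Leave a comment": "Laisser un commentaire"}

# Hypothetical empty English catalogs for a theme whose source strings
# are Polish, and for one that uses abstract identifiers.
english_for_polish_theme = {}
english_for_identifier_theme = {}

print(translate("Leave a comment", french))    # Laisser un commentaire
print(translate("Read more", french))          # Read more (English fallback)
print(translate("Dodaj komentarz", english_for_polish_theme))             # Dodaj komentarz
print(translate("comment-section-header", english_for_identifier_theme))  # comment-section-header
```

In the last two cases a visitor to an English site would be shown Polish text or a raw identifier, which is the situation the paragraph above warns about.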

Of course, if all your translation files are 100% complete then language fallback becomes a non-issue. So if this sounds fine to you, then go ahead and write source strings any way you like. But if you are releasing your work to the wider community, then consider their expectations too.

Gettext limitations

There are also technical barriers to using some languages for source strings.

The Gettext file formats are fundamentally biased towards English. In the old days source strings were commonly US-ASCII, because files had to be encoded for the character set of the target language. (This would make it impossible to translate from Greek to Russian, for example.) Thankfully UTF-8 has fixed that problem, although there's no guarantee that all the software you use will be UTF-8 capable. Loco Translate operates in UTF-8 by default and recommends against doing otherwise.
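
To see why a single legacy character set rules out a Greek-to-Russian file, here is a small Python sketch. The two strings are illustrative only, and KOI8-R and ISO-8859-7 are chosen here as typical legacy encodings for Russian and Greek; neither choice comes from Loco Translate itself. The sketch checks whether both the source and the target text can be stored in one file encoding.

```python
greek_source = "Αφήστε ένα σχόλιο"       # illustrative Greek source string
russian_target = "Оставить комментарий"  # illustrative Russian translation

def fits(text, charset):
    """Return True if every character of `text` can be encoded in `charset`."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

def file_can_hold_both(charset):
    """A translation file has one encoding, so it must fit source and target."""
    return fits(greek_source, charset) and fits(russian_target, charset)

print(file_can_hold_both("koi8_r"))     # False: no Greek letters in KOI8-R
print(file_can_hold_both("iso8859_7"))  # False: no Cyrillic in ISO-8859-7
print(file_can_hold_both("utf_8"))      # True
```

An ASCII-only source string avoids the problem because ASCII is a subset of nearly every legacy character set, which is one reason English source strings became the norm.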

A critical legacy of this bias, however, is that your source language can only have two plural forms: one singular and one plural. This makes Polish, which needs three, a non-starter, along with many other languages. This limitation is baked into the Gettext file formats and the tools that read them.
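
The two-form limit shows up directly in Gettext-style APIs. The sketch below uses Python's standard gettext module purely as an illustration (WordPress has its own PHP functions). The ngettext call accepts exactly two source forms, while a Polish translation needs three target forms selected by a rule. The polish_plural_index helper and the word list are written for this example and follow the plural rule commonly used for Polish.

```python
import gettext

# With no translation loaded, ngettext falls back to the source strings.
# Its signature has room for exactly two source forms: singular and plural.
untranslated = gettext.NullTranslations()
for n in (1, 2, 5):
    print(untranslated.ngettext("%d comment", "%d comments", n) % n)
# 1 comment
# 2 comments
# 5 comments

def polish_plural_index(n):
    """Pick one of three Polish plural forms (rule commonly used for Polish)."""
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and not 10 <= n % 100 < 20:
        return 1
    return 2

# As a target language, Polish can supply all three forms.
polish_forms = ["%d komentarz", "%d komentarze", "%d komentarzy"]
for n in (1, 2, 5, 12, 22):
    print(polish_forms[polish_plural_index(n)] % n)
# 1 komentarz
# 2 komentarze
# 5 komentarzy
# 12 komentarzy
# 22 komentarze
```

With Polish as the source language there would be nowhere to put the third form, so an untranslated site could not display every number correctly.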