Working with formatting syntax across platforms

You may know string formatting by various names. It refers to the use of placeholders such as "%s" or "{name}" that your application replaces with parameters at runtime. On many platforms these strings are used by applying functions like printf.

Loco's support for string formatting serves several purposes in your workflow:

  1. Validating translations for correct syntax matching the source text.
  2. Converting formats between platforms that have different syntax.
  3. Highlighting syntax so that mistakes are easier to see as you type.

Supported platforms

Loco understands platform-specific "printf" styles for Java (including Android), Objective C (used in iOS) and PHP (e.g. WordPress). Additionally it understands ICU MessageFormat, which is a more powerful, cross-platform syntax explained further down.

Although Loco is platform agnostic, many formatting implementations are not interoperable. For this reason every asset in your project has a setting which tells Loco what syntax you're using. Some common tokens like "%d" may be universal, but there are plenty more examples where the intended platform makes a difference.

When you import a file from one of the supported platforms, formatted strings are detected automatically and your assets will be assigned the appropriate property where possible. If you add any assets manually, you'll have to set the formatting property yourself.

Changing the asset property

The formatting syntax of each asset is held in a property you control. You can change or disable an asset's formatting via the asset properties dialogue, as shown below:

img

Asset properties are accessible from the Management view of your project, by clicking on an asset's cog icon. The tab shown above requires Developer permissions on the project. Translators don't generally have permission to change this setting.

Note that setting a specific platform from the dropdown list does not modify any texts. You are only telling Loco what the format is intended to be, not what you might be converting it to. Knowing the intended syntax is necessary for accurate validation and conversion of strings.

Validating translations

You want to be sure that when your translators copy the formatting of your source text, they do it correctly and don't cause errors in your application. For example, the source text "Hello %s" could be translated to "Hola %s" or "Hola %1$s", but "Hola $s" would be a mistake.

As long as Loco knows the intended syntax of a source text, it will validate the translations and alert you to errors. You can also enable automatic flagging of invalid translations in your project settings.
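
To illustrate the kind of check described above, here is a minimal Python sketch that compares the arguments found in a source text with those found in a translation. The regular expression and the function names are assumptions made for this example. It is not Loco's actual validator, and it only covers a simplified subset of printf syntax.

```python
import re

# Simplified printf token: "%%" literal, or optional position ("1$"),
# flags, width, precision and a conversion character.
# This pattern is illustrative only, not Loco's parser.
TOKEN = re.compile(r"%(?:%|(?:(\d+)\$)?[-+ 0#]*\d*(?:\.\d+)?([a-zA-Z@]))")

def argument_types(text):
    """Map each argument position (1-based) to its conversion character."""
    types = {}
    next_index = 1
    for match in TOKEN.finditer(text):
        position, conversion = match.groups()
        if conversion is None:
            continue  # "%%" is a literal percent sign, not an argument
        if position is None:
            index = next_index
            next_index += 1
        else:
            index = int(position)
        types[index] = conversion
    return types

def is_valid_translation(source, translation):
    """True when both texts expect the same arguments of the same types."""
    return argument_types(source) == argument_types(translation)

print(is_valid_translation("Hello %s", "Hola %s"))
print(is_valid_translation("Hello %s", "Hola %1$s"))
print(is_valid_translation("Hello %s", "Hola $s"))
```

The first two calls report a match because "%s" and "%1$s" both refer to the first argument. The third fails because "$s" is not a formatting token at all.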

Code highlighting

Validation is performed once a translation is saved, but you can check validity as you type by enabling the code view. Clicking the editor's code icon will enable syntax highlighting for string formatting tokens.

img

False positives

If you see validation errors for strings that are not supposed to be formatted, you can open the asset properties and disable formatting syntax. This will suppress validation for all translations of the string, and also prevent highlighting of the phantom tokens.

Example: The text "20% off" is probably not intended to be formatted, but it is actually a valid printf template on all supported platforms because "% o" is a space-padded octal. There is more about false positives in the custom syntax section.
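
You can see the same ambiguity outside of Loco. Python is not one of the platforms listed above, but its "%" operator follows the same C-derived conventions, so this short sketch shows what a printf-style formatter does with the text when it is handed an argument.

```python
# Python's "%" operator reads "% o" as an octal conversion with a space flag,
# just as the printf dialects discussed in this article would.
text = "20% off"

# Pass a single integer argument; 8 is "10" in octal.
formatted = text % (8,)

print(repr(formatted))  # '20 10ff'
```

The "% o" has been consumed as a token, which is exactly why a validator can't assume the percent sign was meant literally.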

Plural forms

When validating a plural form like "%d items", translations are permitted to have one missing argument. This tolerates the common practice of passing the same arguments to a plural string containing a number as to a singular string that doesn't require a number. Most platforms forgive the passing of redundant arguments, but complain if the string expects more arguments than it is given.

Note that entering a numeric pattern like this doesn't automatically tell Loco the asset is a plural form. See Managing pluralized translations.
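
The tolerance described above could be sketched as follows. This is an assumption about how such a rule might be expressed, with hypothetical function names. It simply counts tokens, so it would count a repeated positional argument such as "%1$s %1$s" twice.

```python
import re

# Simplified printf token, including the "%%" literal. Illustrative only.
TOKEN = re.compile(r"%(?:%|(?:\d+\$)?[-+ 0#]*\d*(?:\.\d+)?[a-zA-Z@])")

def count_arguments(text):
    """Count formatting tokens, ignoring literal "%%"."""
    return sum(1 for m in TOKEN.finditer(text) if m.group() != "%%")

def plural_arguments_ok(source, translation):
    """Allow the translation to omit one argument, but never to add any."""
    expected = count_arguments(source)
    found = count_arguments(translation)
    return expected - 1 <= found <= expected

print(plural_arguments_ok("%d items", "One item"))
print(plural_arguments_ok("%d items", "%d of %d items"))
```

The first call passes because a singular form may drop the number. The second fails because the translation asks for more arguments than the source provides.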

Exporting and converting between platforms

Loco's various export formats will treat your formatted strings in different ways, depending on the format setting of the asset and the syntax of the target platform.

  • iOS strings
    iOS exports (including .strings, .stringsdict and .xliff for Xcode) will convert formatting tokens to Apple's Objective C format. See Android to iOS conversion.

  • Android strings
    Android XML and other Java exports will automatically convert formatting tokens to the Java printf format as supported by Android. See iOS to Android conversion.

  • PHP exports
    The various PHP export formats will automatically convert formatting tokens to PHP's sprintf format.

  • Gettext
    PO and POT exports don't have a native syntax, but messages will be exported with the appropriate "x-format" or "no-x-format" flags, where "x" is one of the supported platforms.

  • ICU MessageFormat
    ICU MessageFormat is cross-platform, so Loco never converts this syntax automatically. (Any platform could be using it). However, "printf" formats can be coerced to ICU MessageFormat by passing printf=icu to the export API. See the dedicated section below.

  • Other
    All other export formats will render your formatting tokens as is, but if you're using the export API you can force conversion by specifying the printf parameter.

Note that in order for any formatting conversion to work for an individual asset, its formatting property must be set:

  • Assets with no format defined will NOT have their formatting converted.
  • Assets with formatting disabled will NOT have their formatting converted.
  • Assets with the same formatting as the target platform will NOT have their formatting converted.
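
The three rules above amount to a simple decision, sketched here in Python. The function name, the format identifiers and the DISABLED marker are hypothetical and chosen for this example. The sketch covers only these three rules, not the special handling of ICU MessageFormat described elsewhere in this article.

```python
from typing import Optional

# Hypothetical marker for an asset whose formatting has been disabled.
DISABLED = "disabled"

def should_convert(asset_format: Optional[str], target_format: str) -> bool:
    """Decide whether an asset's tokens would be converted on export."""
    if asset_format is None:
        return False  # no format defined
    if asset_format == DISABLED:
        return False  # formatting explicitly disabled
    if asset_format == target_format:
        return False  # already in the target platform's syntax
    return True

print(should_convert("java", "objc"))
print(should_convert(None, "objc"))
```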

Interoperable syntax

Loco will try to convert string formatting as far as possible, but try to stick to simple patterns if you need good interoperability.

Complex patterns using proprietary features might be totally incompatible with other platforms you want to export to. In some cases the target platform simply has no equivalent to what you take for granted in your source platform, for example:

  • Java date formatting like "%tD" has no equivalent in other printf dialects and will default to "%s".
  • A floating point like "%.*2$f" is meaningful in C, but no other format supports variable precision.
  • The string "%Td" is valid in Java and C, but means totally different things in each.

Incompatible flags and modifiers are dropped during conversion, and any incompatible conversion type defaults to the target platform's most basic string type.

Custom template syntax

It's not uncommon for developers to use custom syntax for proprietary template engines. Care must be taken to avoid unwanted behaviour, because Loco may not understand your particular engine. Take the following string imported from a PHP file:

[ "Result" => "You scored %score% out of 100" ]

A human can see that "%score%" is supposed to be a custom template variable, but the string actually contains two valid "printf" arguments. The first "%s" is quite obvious but "% o" is also valid. (It's a space-padded octal). For this reason you may find that your strings have been automatically assigned a format, causing your translators to get strange errors when their texts are validated.
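
To see how a parser arrives at those two tokens, here is a small Python sketch using a simplified printf pattern. The regular expression is an assumption made for illustration and is not the one Loco uses.

```python
import re

# Simplified printf token: optional position, flags (including space),
# width, precision and a conversion letter. Illustrative only.
TOKEN = re.compile(r"%(?:\d+\$)?[-+ 0#]*\d*(?:\.\d+)?[a-zA-Z]")

text = "You scored %score% out of 100"
tokens = TOKEN.findall(text)

print(tokens)  # ['%s', '% o']
```

The parser has no way of knowing that the trailing "core" and "ut" belong to a custom variable name and an ordinary word.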

The solution is to disable string formatting for the asset, or add appropriate metadata to your source files before you import them.

Some file formats allow you to declare that a string is not formatted:

  • Gettext supports a "no-x-format" flag in PO and POT files (where x is a format such as "php").
  • Loco's PHP extractor supports Gettext flags in comments, e.g. /* xgettext: no-php-format */
  • Android XML supports a formatted="false" attribute on <string> elements.

Android ↔ iOS formatted strings

Converting formatted strings between iOS and Android deserves a special mention because it's a common pair of platforms to convert between and their respective "printf" styles have a lot of differences.

Android to iOS

If an asset is configured as a Java string, it will be converted to iOS as follows:

  • All instances of "%s" are converted to "%@".
  • Positional arguments are maintained even if they would be valid without, e.g. "%1$s %2$s" will become "%1$@ %2$@" NOT "%@ %@".
  • Unsupported conversion types such as "%h" and "%tY" will simply default to "%@".

Note that "%s" is valid in Objective C, but "%@" is conventional and more likely desired when converting from Java. If you must keep "%s" in your iOS export, then set the asset property to "Objective C".
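
The rules above can be approximated in a few lines of Python. This is a rough sketch under stated assumptions: the regular expression, the set of conversion types treated as compatible and the function name are all invented for this example, and Loco's real converter handles more cases.

```python
import re

# Illustrative Java token: position, flags, width, precision, then a conversion.
# Date/time conversions are "t" or "T" followed by one suffix letter.
JAVA_TOKEN = re.compile(
    r"%(\d+\$)?([-#+ 0,(]*)(\d+)?(\.\d+)?([tT][a-zA-Z]|[a-zA-Z%])"
)

# Conversion types this sketch assumes both platforms share.
SHARED = {"d", "o", "x", "X", "e", "E", "f", "g", "G", "c"}

def java_to_objc(text):
    def convert(match):
        position, flags, width, precision, conversion = match.groups()
        if conversion == "%":
            return "%%"  # literal percent sign
        if conversion not in SHARED:
            # "%s" and unsupported types such as "%h" or "%tY" become "%@";
            # the position is kept, other modifiers are dropped.
            return "%" + (position or "") + "@"
        # Drop Java-only flags such as "," and "(".
        kept = "".join(f for f in flags if f in "-+ 0#")
        return "%" + (position or "") + kept + (width or "") + (precision or "") + conversion
    return JAVA_TOKEN.sub(convert, text)

print(java_to_objc("%1$s %2$s"))   # %1$@ %2$@
print(java_to_objc("Year: %tY"))   # Year: %@
```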

iOS to Android

If an asset is configured as an Objective C string, it will be converted to Android as follows:

  • All instances of "%@" are converted to "%s".
  • Positional arguments are enforced for multiple substitutions, e.g. "%d %d" becomes "%1$d %2$d".
  • Objective C integer types like "%i" and "%u" are converted to "%d".
  • Other unsupported conversion types default to "%s".
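
The reverse direction can be sketched the same way. Again, the pattern, the set of compatible types and the function name are assumptions for illustration only. The sketch discards C length modifiers such as "l", and it numbers arguments in order of appearance when the source has no explicit positions.

```python
import re

# Illustrative Objective C token: position, flags, width, precision,
# optional length modifier, then a conversion character.
OBJC_TOKEN = re.compile(
    r"%(\d+\$)?([-#+ 0]*)(\d+)?(\.\d+)?(?:hh|h|ll|l|q|L|z|t|j)?([a-zA-Z@%])"
)

# Conversion types this sketch assumes both platforms share.
SHARED = {"d", "o", "x", "X", "e", "E", "f", "g", "G", "c"}

def objc_to_java(text):
    arguments = [m for m in OBJC_TOKEN.finditer(text) if m.group(5) != "%"]
    enforce_positions = len(arguments) > 1
    counter = 0

    def convert(match):
        nonlocal counter
        position, flags, width, precision, conversion = match.groups()
        if conversion == "%":
            return "%%"  # literal percent sign
        counter += 1
        if position is None and enforce_positions:
            position = f"{counter}$"
        if conversion in ("i", "u"):
            conversion = "d"
        elif conversion not in SHARED:
            # "%@" and any unsupported type fall back to a plain "%s".
            return "%" + (position or "") + "s"
        return "%" + (position or "") + flags + (width or "") + (precision or "") + conversion

    return OBJC_TOKEN.sub(convert, text)

print(objc_to_java("%d %d"))                       # %1$d %2$d
print(objc_to_java("Hello %@, %ld new messages"))  # Hello %1$s, %2$d new messages
```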

For backward compatibility, strings that have no defined syntax but contain "%@" symbols will be converted to Android automatically. If this legacy behaviour is unwanted then disable formatting for the asset.

ICU MessageFormat

MessageFormat is much more expressive than "printf" syntax and is available on multiple platforms. As part of Unicode's ICU project there are official libraries for C and Java and it is well supported in PHP's Intl extension. Other libraries such as Format.js have native implementations.

As MessageFormat is inherently cross-platform, you're unlikely to be converting it to other (simpler) printf styles. However, this is possible.

printf → MessageFormat

At its simplest, MessageFormat provides a linear sequence of scalar formatters like "printf" does. This allows Loco to convert basic placeholders, like the following:

  • %s → {0}
  • %d → {0,number,#}
  • %tR → {0,time,HH:mm} (Java only)
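
As a rough illustration of this mapping, the following Python sketch rewrites the three placeholder types listed above and leaves anything else untouched. The function name and the zero-based numbering of positional arguments ("%2$s" becoming "{1}") are assumptions for this example. It also ignores MessageFormat's escaping rules for apostrophes and braces.

```python
import re

# Matches an optional position and either "tR" or a single conversion letter.
TOKEN = re.compile(r"%(?:(\d+)\$)?(tR|[a-zA-Z])")

# MessageFormat suffix for each printf type handled by this sketch.
SUFFIXES = {"s": "", "d": ",number,#", "tR": ",time,HH:mm"}

def printf_to_icu(text):
    next_index = 0

    def convert(match):
        nonlocal next_index
        position, conversion = match.groups()
        if position is None:
            index = next_index
            next_index += 1
        else:
            index = int(position) - 1  # assumed: printf counts from 1, ICU from 0
        if conversion not in SUFFIXES:
            return match.group()  # leave unsupported tokens as they are
        return "{" + str(index) + SUFFIXES[conversion] + "}"

    return TOKEN.sub(convert, text)

print(printf_to_icu("%d items updated at %tR"))
# {0,number,#} items updated at {1,time,HH:mm}
```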

MessageFormat is not native to any particular file format in the same way that printf styles are. To perform conversion like above, you must pass printf=icu to the export API. Doing so will also convert Loco's linked plural forms into ICU PluralFormat. See Managing pluralized translations.

MessageFormat → printf

Conversion in reverse is supported, but severely limited by the simplicity of printf formatting. Complex types, nested placeholders and named parameters will all be lost. However, conversion of simple types like those above will work in reverse.

Loco will never convert from MessageFormat automatically, because it isn't a platform specific style. For example: exporting to Android will keep ICU-formatted strings as they are, unless you force conversion by passing printf=java to the export API. The same goes for all supported printf styles.
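
If you are scripting your exports, forcing a conversion is just a matter of adding the printf parameter to the request. The Python sketch below only builds a URL and sends nothing. The base address is a placeholder and the function name is hypothetical, so substitute the real export endpoint from the API documentation. The parameter name and the values "java" and "icu" are the ones mentioned in this article.

```python
from urllib.parse import urlencode

def export_url(base_url, printf=None):
    """Append the optional printf parameter to an export URL."""
    params = {}
    if printf is not None:
        params["printf"] = printf
    query = urlencode(params)
    return base_url + ("?" + query if query else "")

# Placeholder address, not the real API endpoint.
BASE = "https://example.com/export/locale/es.xml"

print(export_url(BASE))                 # no forced conversion
print(export_url(BASE, printf="java"))  # force conversion to Java printf
```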
