Importing XLIFF files into Loco

The XLIFF XML schema is used by many translation applications, with some variation between implementations. When importing XLIFF files you may need to set specific options for your platform. Here are a few examples of how Loco deals with variations in the format.

XLIFF 1.2

This is a pretty standard bilingual file. It describes English as the "source", French as the "target", and gives one translation in each language for a single unit called "greeting".

<?xml version="1.0"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
    <file source-language="en" target-language="fr" datatype="plaintext" original="words.txt">
        <body>
            <trans-unit id="1" resname="greeting">
                <source>Hello</source>
                <target>Bonjour</target>
            </trans-unit>
        </body>
    </file>
</xliff>

Importing this into Loco will map the <trans-unit> element to a single translatable asset called "greeting". This asset ID adopts the value of the resname attribute. The id attribute in this example may have meaning to the software that generated the file, but we can ignore it.
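
The mapping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Loco's importer: the function name read_units and the shape of the returned dictionary are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

XLIFF_12 = """<?xml version="1.0"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
    <file source-language="en" target-language="fr" datatype="plaintext" original="words.txt">
        <body>
            <trans-unit id="1" resname="greeting">
                <source>Hello</source>
                <target>Bonjour</target>
            </trans-unit>
        </body>
    </file>
</xliff>"""

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def read_units(xml_text):
    """Return {asset_id: {language: text}} for every <trans-unit> (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    assets = {}
    for file_el in root.findall("x:file", NS):
        src_lang = file_el.get("source-language")
        tgt_lang = file_el.get("target-language")
        for unit in file_el.iterfind(".//x:trans-unit", NS):
            # The asset ID comes from resname; the id attribute is ignored here.
            texts = assets.setdefault(unit.get("resname"), {})
            source = unit.find("x:source", NS)
            target = unit.find("x:target", NS)
            if source is not None and src_lang:
                texts[src_lang] = source.text or ""
            if target is not None and tgt_lang:
                texts[tgt_lang] = target.text or ""
    return assets


assets = read_units(XLIFF_12)
```

Each <trans-unit> becomes one entry keyed by its resname, with one text per language declared on the enclosing <file> element.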

It's possible to import both the <source> and <target> translations at the same time as long as the source-language and target-language attributes exist in your Loco project.

Example using cURL to import via the API:

curl -u <your_key>: --data-binary @words.xlf 'https://localise.biz/api/import/xliff?locale=auto'

Here we specify auto for the locale parameter. This tells Loco to import all languages in the file that match a language in your project. A full explanation of these API parameters is below.
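
For scripts that don't shell out to cURL, roughly the same request can be built with Python's standard library. This is a sketch under assumptions: build_import_request is a hypothetical helper, and "your_key" is a placeholder. It mirrors the cURL flags by sending the key as the Basic auth username with an empty password and the file as the raw request body.

```python
import base64
import urllib.parse
import urllib.request

API_URL = "https://localise.biz/api/import/xliff"


def build_import_request(api_key, xliff_bytes, locale="auto"):
    """Build (but do not send) a POST request equivalent to the cURL example."""
    query = urllib.parse.urlencode({"locale": locale})
    # "curl -u <key>:" means username=<key>, password empty.
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{API_URL}?{query}",
        data=xliff_bytes,
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )


request = build_import_request("your_key", b"<xliff/>")
# To send it for real: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL and headers before anything touches the network.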

XLIFF 2.0

Loco will also parse XLIFF 2.0. The following is semantically equivalent to the example above and can be imported in exactly the same way:

<?xml version="1.0"?>
<xliff version="2.0" xmlns="urn:oasis:names:tc:xliff:document:2.0" srcLang="en" trgLang="fr">
    <file original="words.txt">
        <unit id="1" name="greeting">
            <segment>
                <source>Hello</source>
                <target>Bonjour</target>
            </segment>
        </unit>
    </file>
</xliff>

XLIFF 2.0 has been an ISO standard since November 2017, but many web application frameworks still use the 1.2 specification. For the purpose of simple message catalogues, there is little to choose between the two specifications.
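
To illustrate how small the difference is for a simple catalogue, here is a minimal Python sketch that reads the 2.0 example into an ID-to-languages mapping. It is not Loco's parser. The helper name read_units_v2 is an assumption, and so is the choice to join multiple segments in document order.

```python
import xml.etree.ElementTree as ET

XLIFF_20 = """<?xml version="1.0"?>
<xliff version="2.0" xmlns="urn:oasis:names:tc:xliff:document:2.0" srcLang="en" trgLang="fr">
    <file original="words.txt">
        <unit id="1" name="greeting">
            <segment>
                <source>Hello</source>
                <target>Bonjour</target>
            </segment>
        </unit>
    </file>
</xliff>"""

NS = {"x": "urn:oasis:names:tc:xliff:document:2.0"}


def read_units_v2(xml_text):
    """Return {asset_id: {language: text}} from an XLIFF 2.0 document (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # In 2.0 the languages are declared once on the root element.
    src_lang, tgt_lang = root.get("srcLang"), root.get("trgLang")
    assets = {}
    for unit in root.iterfind(".//x:unit", NS):
        # The "name" attribute plays the role of resname in XLIFF 1.2.
        texts = assets.setdefault(unit.get("name"), {})
        sources = [s.text or "" for s in unit.iterfind("x:segment/x:source", NS)]
        targets = [t.text or "" for t in unit.iterfind("x:segment/x:target", NS)]
        if sources and src_lang:
            texts[src_lang] = "".join(sources)  # assumption: segments are concatenated
        if targets and tgt_lang:
            texts[tgt_lang] = "".join(targets)
    return assets


assets_v2 = read_units_v2(XLIFF_20)
```

The only real changes from a 1.2 reader are the namespace, the attribute names, and the extra <segment> level.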

Additional translations

In addition to the single <target> element shown above, XLIFF 1.2 supports additional translations inside the <alt-trans> element:

<?xml version="1.0"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
    <file source-language="en" target-language="fr" datatype="plaintext" original="alt-trans.txt">
        <body>
            <trans-unit id="1" resname="greeting">
                <source>Hello</source>
                <target>Bonjour</target>
                <alt-trans>
                    <target xml:lang="it-IT">Ciao</target> 
                    <target xml:lang="es-ES">Hola</target> 
                </alt-trans>
            </trans-unit>
        </body>
    </file>
</xliff>

As before, specify locale=auto to import as much as possible. To extract only a single target language, name it explicitly: for example, locale=it will match xml:lang="it-IT". See more about language targeting below.
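
The following Python sketch shows one way the <alt-trans> targets could be picked out by language. It is an illustration only: alt_targets is a hypothetical helper, and the matching rule (exact tag, or the requested tag followed by a hyphen) is a simplification of whatever rules Loco actually applies.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}
# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

UNIT = """<trans-unit xmlns="urn:oasis:names:tc:xliff:document:1.2" id="1" resname="greeting">
    <source>Hello</source>
    <target>Bonjour</target>
    <alt-trans>
        <target xml:lang="it-IT">Ciao</target>
        <target xml:lang="es-ES">Hola</target>
    </alt-trans>
</trans-unit>"""


def alt_targets(unit, wanted):
    """Return {xml:lang: text} for alt-trans targets matching `wanted` (simplified rule)."""
    wanted = wanted.lower()
    found = {}
    for target in unit.iterfind("x:alt-trans/x:target", NS):
        tag = target.get(XML_LANG, "")
        if tag.lower() == wanted or tag.lower().startswith(wanted + "-"):
            found[tag] = target.text or ""
    return found


unit = ET.fromstring(UNIT)
italian = alt_targets(unit, "it")
```

Asking for "it" selects the element labelled "it-IT" and nothing else; the main <target> is untouched because only children of <alt-trans> are searched.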

Xcode

The following example was exported from Xcode 9.2.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="en.lproj/Localizable.strings" source-language="en" datatype="plaintext">
    <header>
      <tool tool-id="com.apple.dt.xcode" tool-name="Xcode" tool-version="9.2" build-num="9C40b"/>
    </header>
    <body>
      <trans-unit id="greeting">
        <source>Hello</source>
      </trans-unit>
    </body>
  </file>
</xliff>

The first thing to notice is that the translation unit doesn't have the resname attribute, but instead identifies itself by the id attribute. Although this is technically against the XLIFF 1.2 specification, the Loco importer will simply fall back to the id when a resname is not found.
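
The fallback can be pictured with a short Python sketch. The asset_id function is a hypothetical name used for illustration, and the two sample units are trimmed versions of the examples on this page with the namespace omitted for brevity.

```python
import xml.etree.ElementTree as ET


def asset_id(unit):
    """Prefer resname; fall back to id when resname is missing or empty."""
    return unit.get("resname") or unit.get("id")


standard = ET.fromstring(
    '<trans-unit id="1" resname="greeting"><source>Hello</source></trans-unit>'
)
xcode = ET.fromstring(
    '<trans-unit id="greeting"><source>Hello</source></trans-unit>'
)

ids = (asset_id(standard), asset_id(xcode))
```

Both units resolve to the same asset ID, "greeting", even though they carry it in different attributes.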

Another difference is that this file has no target language. The recommended way to import this via the API is as follows:

curl -u <your_key>: --data-binary @Localizable.xliff 'https://localise.biz/api/import/xliff?locale=source'

Here we specify source for the locale parameter. This tells Loco to extract the <source> element irrespective of the source-language attribute. In this example, however, exactly the same result would be achieved with locale=en.

Symfony

Versions of Symfony have exported XLIFF files in slightly different ways over the years. The various Symfony documentation pages show slightly different examples, but the following is how v4.2 of the Translation component would dump an XLIFF 1.2 message catalogue:

<?xml version="1.0" encoding="utf-8"?>
<xliff xmlns="urn:oasis:names:tc:xliff:document:1.2" version="1.2">
  <file source-language="en" target-language="en-GB" datatype="plaintext" original="file.ext">
    <header>
      <tool tool-id="symfony" tool-name="Symfony"/>
    </header>
    <body>
      <trans-unit id="6sz9l7G" resname="title.post_list">
        <source>title.post_list</source>
        <target>Post List</target>
      </trans-unit>
    </body>
  </file>
</xliff>

The biggest difference from all the previous examples is that the <source> element holds a duplicate of the "translation key" and not really "source text" by the usual definition. These keys will be extracted as asset IDs regardless of any setting, but we must prevent them from also being identified as source language translations.

The recommended way to import a Symfony XLIFF file via the API is as follows:

curl -u <your_key>: --data-binary @messages.en.xliff 'https://localise.biz/api/import/xliff?locale=target'

Here we specify target for the locale parameter. This is a short-cut for extracting the <target> into whatever locale the file says it is, but more importantly it's a safety feature that ensures we don't select the <source> element by mistake.

Specifying locale=en-GB in this example would work equally well, but specifying locale=en would extract the keys instead of the translations. That would be a mistake, but it would only be doing as we asked, because "en" is an exact match for the file's source-language attribute. See language targeting below and general Symfony support in Loco.
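
The pitfall is easier to see in code. The sketch below is a simplified model, not Loco's importer: extract is a hypothetical helper, and it compares language tags by exact equality only, which is enough to reproduce the three cases discussed here.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

SYMFONY = """<xliff xmlns="urn:oasis:names:tc:xliff:document:1.2" version="1.2">
  <file source-language="en" target-language="en-GB" datatype="plaintext" original="file.ext">
    <body>
      <trans-unit id="6sz9l7G" resname="title.post_list">
        <source>title.post_list</source>
        <target>Post List</target>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def extract(xml_text, locale):
    """Return {asset_id: text} for the element that `locale` selects (simplified)."""
    file_el = ET.fromstring(xml_text).find("x:file", NS)
    if locale == "target" or locale == file_el.get("target-language"):
        element = "x:target"
    elif locale == file_el.get("source-language"):
        element = "x:source"  # for Symfony files this holds the keys, not real text
    else:
        raise LookupError(f"locale={locale} does not match this file")
    return {
        unit.get("resname"): unit.findtext(element, default="", namespaces=NS)
        for unit in file_el.iterfind(".//x:trans-unit", NS)
    }


safe = extract(SYMFONY, "target")
mistake = extract(SYMFONY, "en")
```

With locale=target (or en-GB) the value is the real translation, whereas locale=en returns the key as if it were English text.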

Import API parameters for XLIFF

At its core Loco's import tool is essentially a key/value mapper designed to work across a wide range of file formats. To fulfil this purpose it accepts two simple parameters, index and locale. This is best illustrated by example:

  • "greeting" → "Hello World" could be described as: index=id&locale=en
  • "Hello World" → "Hola Mundo" could be described as: index=text&locale=es

This is easy when you're dealing with simple key/value pairs, but as the examples on this page demonstrate, XLIFF is far more expressive and it's not immediately obvious how this simple model can apply. Our XLIFF model actually looks more like this:

"greeting" → { "en" → "Hello World", "es" → "Hola Mundo" }

However, this gets simpler once you know that the "key" (or asset ID) is always extracted from XLIFF files, regardless of any other settings. So "greeting" here is automatically taken care of. That leaves us with the job of extracting the right language, which can be done purely with the right locale parameter.

Locale parameter

Specifying a language tag can extract one language at a time. This applies whether the language is "source" or "target", as follows:

  • "en" → "Hello World" can be imported with: https://localise.biz/api/import/xliff?locale=en
  • "es" → "Hola Mundo" can be imported with: https://localise.biz/api/import/xliff?locale=es

This works because XLIFF files are able to tell us what language each translation belongs to. To help this succeed, make sure your specified locale matches the file's XML attributes AND the project locales you're importing them into. More on language matching below.

Additionally there are three special values for the locale parameter which work with XLIFF files:

  • auto: Imports all languages that can be matched to existing project locales. Recommended for standard XLIFF files.

  • target: Extracts only the <target> elements and imports them into a project locale matching the target-language attribute. Recommended for Symfony XLIFF files.

  • source: Extracts only the <source> elements and imports them into your project's source locale (regardless of the source-language).
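
As a rough mental model, the three special values can be thought of as choosing which elements to read and which project locale receives them. The Python sketch below is hypothetical throughout: resolve and match_locale are invented names, and the matcher uses case-insensitive equality only, which is stricter than the matching rules described further down this page.

```python
def match_locale(tag, project_locales):
    """Hypothetical matcher: case-insensitive equality only."""
    if tag is None:
        return None
    for locale in project_locales:
        if locale.lower() == tag.lower():
            return locale
    return None


def resolve(locale, source_language, target_language, project_source, project_locales):
    """Return [(element, project_locale), ...] for locale=auto, target or source."""
    if locale == "source":
        # The file's source-language is deliberately ignored.
        return [("source", project_source)]
    if locale not in ("auto", "target"):
        raise ValueError("this sketch only handles the three special values")
    pairs = []
    if locale == "auto":
        pairs.append(("source", match_locale(source_language, project_locales)))
    pairs.append(("target", match_locale(target_language, project_locales)))
    # Languages with no matching project locale are skipped.
    return [(element, loc) for element, loc in pairs if loc is not None]


bilingual = resolve("auto", "en", "fr", "en", ["en", "fr"])
symfony = resolve("target", "en", "en-GB", "en", ["en", "en-GB"])
xcode = resolve("source", "en", None, "en-US", ["en-US", "fr"])
```

The last call shows the point of locale=source: the file says "en", but the text lands in the project's own source locale.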

Index parameter

The default index for XLIFF imports is "id". You can leave the parameter out, or specify index=id; it makes no difference.

It's important to note that this ID-indexing mode does NOT disable the extraction of source texts from XLIFF files. IDs are extracted separately from the name or resname attributes and are quite distinct from the contents of <source> elements. See Symfony.

If you're familiar with the Loco import API in general, you may be wondering why no examples on this page specify index=text even when <source> elements are involved. The answer is that specifying index=text also declares that source texts are unique. Unique sources are not normally required in XLIFF files and this behaviour will usually be unwanted.

Take our first example of a standard bilingual file. We used locale=auto to extract the source (en) and target (fr), but specifying index=text&locale=fr would have achieved exactly the same result. However, the two commands are not equivalent. The latter causes the importer to match existing assets in your project by their source text when nothing is found by their ID. This not only prevents new keys being added with the same source text, but can also result in colliding assets being updated with new IDs.
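
The difference between the two indexing modes shows up when an incoming ID is new but its source text already exists. The following Python sketch is a deliberately simplified model of that behaviour: a project is represented as a plain dictionary of asset ID to source text, and import_asset is a hypothetical function, not part of any Loco API.

```python
def import_asset(project, asset_id, source_text, index="id"):
    """Apply one incoming unit to `project` and report what happened."""
    if asset_id in project:
        project[asset_id] = source_text
        return "updated"
    if index == "text":
        # Nothing found by ID, so look for an asset with the same source text.
        existing = next((k for k, v in project.items() if v == source_text), None)
        if existing is not None:
            # The colliding asset is reused and takes on the incoming ID.
            del project[existing]
            project[asset_id] = source_text
            return "renamed"
    project[asset_id] = source_text
    return "created"


by_id = {"greeting": "Hello"}
result_id = import_asset(by_id, "salutation", "Hello")

by_text = {"greeting": "Hello"}
result_text = import_asset(by_text, "salutation", "Hello", index="text")
```

With the default index the project ends up with two assets sharing the text "Hello". With index=text the existing asset is matched by its text instead, so no second asset is created and the original ID is replaced.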

Selecting language elements

Many of the examples on this page show the locale parameter being used to target specific elements in XLIFF files based on their XML attributes. The same parameter is also used to find which locale in your Loco project to import the extracted text. For the most part this is as simple as keeping all three language tags the same, but reality doesn't always oblige and the behaviour of this language matching might require some explanation.

Language tags do not have to be exact. locale=fr would extract XML elements labelled "fr" or "fr-FR", and likewise would import into matching project locales. This is done according to Loco's standard matching rules.

Ambiguous matches abort the import process. locale=en would match source-language="en-US" and target-language="en-GB". Rather than choose between them, the API will respond with status 422 (Unprocessable). The same would occur if "en" were ambiguous amongst your project locales.

Unmatched languages in files will be tolerated if there's only one target. locale=fr will not match target-language="it" but if this is the only target in the file, it will be used. Hence you will extract Italian translations into your French project locale. This is in place for backward compatibility and may become stricter in future.

The project locale is matched first. Your given locale parameter is matched to an existing project locale before the XLIFF file is parsed. This means an ambiguous project locale would still fail, even if the XLIFF file could potentially resolve the ambiguity. This can be avoided by specifying locale=target.

The file is always searched with your original parameter. If your locale parameter matches a more specific project locale (say locale=en matches "en-GB") your original parameter is still used to match translations in the XLIFF file.
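
The order of operations described in this section can be summarised in a short Python sketch. It is an approximation built only from the rules stated above: match_tag and plan_import are hypothetical names, a ValueError stands in for the API's 422 response, and "looser" matching is assumed to mean the requested tag followed by a hyphen and a subtag.

```python
def match_tag(wanted, candidates):
    """Exact match first, then a unique looser match. None if nothing matches."""
    wanted_l = wanted.lower()
    exact = [c for c in candidates if c.lower() == wanted_l]
    if exact:
        return exact[0]
    loose = [c for c in candidates if c.lower().startswith(wanted_l + "-")]
    if len(loose) > 1:
        raise ValueError(f"locale={wanted} is ambiguous: {loose}")
    return loose[0] if loose else None


def plan_import(locale, project_locales, file_targets):
    """Return (project_locale, file_language) for a given locale parameter."""
    # Step 1: the project locale is matched before the file is looked at.
    project_locale = match_tag(locale, project_locales)
    if project_locale is None:
        raise LookupError(f"no project locale matches {locale}")
    # Step 2: the file is searched with the ORIGINAL parameter.
    file_language = match_tag(locale, file_targets)
    # Step 3: a sole unmatched target is tolerated for backward compatibility.
    if file_language is None and len(file_targets) == 1:
        file_language = file_targets[0]
    return project_locale, file_language


specific = plan_import("en", ["en-GB", "fr"], ["en-US"])
tolerated = plan_import("fr", ["fr"], ["it"])
```

In the first call the parameter "en" resolves to the project locale "en-GB" yet still matches "en-US" in the file, because the file is searched with "en" rather than "en-GB". The second call shows Italian text being accepted into a French locale because it is the file's only target.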