summaryrefslogtreecommitdiff
path: root/pages/develop/legacy/program_guide/multilingual_pg.txt
blob: 9126a7a89e9f015eef52094e68623b23ba3c7bfd (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
{{page>index}}
------
===== Multilingual Programming Guide =====

=== Table of Contents ===

  * [[#Concepts|Concepts]]
  * [[#Localization_in_EFL|Localization in EFL]]
    * [[#Marking_text_parts_as_translatable|Marking text parts as translatable]]
    * [[#Translating_texts_directly|Translating texts directly]]
        * [[#Plurals|Plurals]]
        * [[#Handling_language_changes_at_runtime|Handling language changes at runtime]]
        * [[#compiling_and_running_a_Localized_Application|compiling and running a Localized Application]]
    * [[#Extracting_messages_for_translation|Extracting messages for translation]]
  * [[#Localization_tips|Localization tips]]
    * [[#Don't_make_assumptions_about_languages|Don't make assumptions about languages]]
    * [[#Translations_will_be_of_different_lengths|Translations will be of different lengths]]
    * [[#For_source_control,_don't_commit_.po_if_only_line_indicators_have_changed|For source control, don't commit .po if only line indicators have changed]]
    * [[#Using__()_as_a_shorthand_to_the_gettext()_function|Using _() as a shorthand to the gettext() function]]
      * [[#Proper_sorting:_strcoll()|Proper sorting: strcoll()]]
      * [[#Working_with_translators|Working with translators]]

==== Concepts ====

Localization (also called l10n) is done by using strings in a specific
language in the code (typically English) and then translating them to the
target language.

Not using resource identifiers but actual strings makes it much more convenient
and readable. A typical code to create a button that will be translated is:

<code c>
Evas_Object *button = elm_button_add(parent);
elm_object_translatable_text_set(button, "Click Here");
</code>

The messages that require translations are typically automatically extracted
from the sources and put into .po files, one per language. For the example
above, the "fr.po" file could contain:

<code c>
#: some_file.c:43 another_file.c:41
msgid "Click Here"
msgstr "Cliquez ici"
</code>

In the example above, the program that extracts strings has found two
occurrences of the same string, one in some_file.c at line 43 and another one
in another_file.c at line 41. It gives the original string after "msgid" and
the translation goes after "msgstr".

Strings without translation are stored as the empty string "" in the .po file
and the program will use the original strings, providing a sane fallback.

It is possible that the "fuzzy" keyword is added by the extractor program on
the line before "msgid"; it means the original string has changed and needs
review.

<note>
Don't be surprised if the translation is correct even though you didn't change
it: the extractor program is sometimes able to "guess" the updated
translation!
</note>

==== Localization in EFL ====

=== Marking text parts as translatable ===

The most common way to use a translation involves the following APIs:

<code c>
elm_object_translatable_text_set(Evas_Object *obj, const char *text)
elm_object_item_translatable_text_set(Elm_Object_Item *it, const char *text)
</code>

They set the untranslated string for the "default" part of the given
''Evas_Object'' or ''Elm_Object_Item'' and mark the string as translatable.

Similar functions are available if you wish to set the text for a part that is
not "default":

<code c>
elm_object_translatable_part_text_set(Evas_Object *obj, const char *part, const char *text)
elm_object_item_translatable_part_text_set(Elm_Object_Item *it, const char *part, const char *text)
</code>

It is important to provide the untranslated string to these functions because
the EFLs will trigger the translation themselves and re-translate the strings
automatically should the system language change.

It is also possible to set the text and the translatable property separately.
Setting the text is done as usual while the translatable property is set
through the ''elm_object_part_text_translatable_set()'':

There are also ''get()'' counterparts to the ''set()'' functions above.

=== Translating texts directly ===

The approach described in the previous section is not always applicable.
For instance, it won't work if you are populating a genlist, if you need
plurals in the translation or if you want to do something else with the
translation than putting it in elementary widgets.

It is however possible to retrieve the translation for a given text using
gettext from ''<libintl.h>'' :

<code c>
char * gettext(const char * msgid);
</code>

The input of this function is a string (that will be copied to an msgid
field in the .po files) and returns the translation (the
corresponding msgstr field).

In order to use gettext, you have to set the local before:

<code c>
setlocale(LC_ALL,"");
</code>

''LC_ALL'' is a catch-all Locale Category (LC).  Setting it will alter all LC
categories such as ''LC_MESSSAGES'' and ''LC_TYPES'' which are other categories
for translation: ''LC_MESSSAGES'' is for message translations and ''LC_TYPES''
indicates the character set supported.

By setting the locale to ''""'', you are implicitly assigning the locale to
the user's defined locale (grabbed from the user's LC or LANG environment
variables). If there is no user-defined locale, the default locale "C" is
used.

<code c>
bindtextdomain("hello","/usr/share/locale/");
</code>

This command binds the name ''"hello"'' to the directory root of the message
files. In fact, the program will be looking for your ''hello.mo'' in
''/usr/share/locale/<your_language>/LC_MESSAGES/'' directory where
''<your_language>'' can be ''fr_FR'' for example defining in the user's
defined locale. This is used to specify where you want your locale files
stored. You will use ''"hello"'' when setting the gettext domain through
''textdomain()'', and it corresponds to the name of the file to be looked up
in the appropriate locale directory.

The ''bindtextdomain()'' call is not mandatory; if you choose to install your
file in the system's default locale directory it can be omitted. Since the
default can change from system to system, however, it is recommended.

<code c>
textdomain("hello");
</code>

This sets the application name as ''"hello"'', as cited above. This makes gettext
calls look for the file ''hello.mo'' in the appropriate directory. By binding
various domains and setting the textdomain (or using ''dcgettext()'',
explained elsewhere) at runtime, you can switch between different domains as
desired.

When giving the text for a genlist item, you could use it in a similar manner
as the one below:

<code c>
#include<libintl.h>
#include<locale.h>

#define _(str) gettext(str)

static char *
_genlist_text_get(void *data, Evas_Object *obj, const char *part)
{
   return strdup(gettext("Some Text"));
   /* or usual way
    * return strdup(_("Some Text"));
    */
}

EAPI_MAIN int
elm_main(int argc, char **argv)
{
   setlocale(LC_ALL,"");
   bindtextdomain("hello","/usr/share/locale");
   textdomain("hello");

   /* ... */

   elm_run();
   elm_shutdown();
   return 0;
}
ELM_MAIN()
</code>

== Plurals ==

Plurals are handled in a similar way but through the ''ngettext()'' function.
Its prototype is shown below:

<code c>
char * ngettext (const char * msgid, const char * msgid_plural, unsigned long int n);
</code>

  * ''msgid'' is the same as before, i.e. the untranslated string
  * ''msgid_plural'' is the plural form of msgid
  * the quantity (with English, 1 would be singular and anything else would be plural)

A matching fr.po file would contain the following lines:

<code c>
msgid "%d Comment"
msgid_plural "%d Comments"
msgstr[0] "%d commentaire"
msgstr[1] "%d commentaires"
</code>

==Several plurals==

It is even possible to have several plural forms. For instance, the .po file
for Polish could contain:

The index values after msgstr are defined in system-wide settings. The ones
for Polish are given below:

<code c>
"Plural-Forms: nplurals=3; plural=n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;\n"
</code>

There are 3 forms (including singular). The index is 0 (singular) if the given
integer n is 1. Then, if ''(n % 10 >= 2 && % 10 <= 4 &&
(n % 100 < 10 || n % 100 >= 20)'', the index is 1 and otherwise it is 2.

== Handling language changes at runtime ==

The user can change the system language settings at any time. When that is
done, Ecore Events notifies the application which can then change the language used
in elementary. The widgets then receive a "language,changed" signal and can
set their text again.

The first step is to handle the ecore event:

<code c>
static Eina_Bool
_app_language_changed(void *data, int type, void *event)
{
   // Set the language in elementary
   elm_language_set(setlocale(LC_ALL,NULL));
}

int
main(int argc, char *argv[])
{
    ...

    // Retrieve the current system language
    ecore_event_handler_add(ECORE_EVENT_LOCALE_CHANGED , _app_language_changed, NULL);

    ...
}
</code>

The call to ''elm_language_set()'' above will trigger the emission of the
"language,changed" signal which can then be handled like usual smart events
signals.

=== Extracting messages for translation ===

The xgettext tool can extract strings to translate to a .pot file (po
template) while msgmerge can maintain existing .po files. The typical workflow
is as follows:

  * run xgettext once; it will generate a .pot file
  * when adding a new translation, copy the .pot file to <locale>.po and translate that file
  * new runs of xgettext will update the existing .pot file and msgmerge will update .po files

A typical call to xgettext looks like:

<code bash>
 xgettext --directory=src --output-dir=res/po --keyword=_ --keyword=N_ --keyword=elm_object_translatable_text_set:2 --keyword=elm_object_item_translatable_text_set:2 --add-comments= --from-code=utf-8 --foreign-user
</code>

This will extract all strings that are used inside the "_()" function (usual
optional short-hand for gettext()), use UTF-8 as the encoding and add the
comments right before the strings to the output files.

A typical call to msgmerge looks like:

<code bash>
msgmerge --width=120 --update res/po/fr.po res/po/ref.pot
</code>

POT file (.pot) stands for Portable Object Template file.
It contains a series of lines in pair starting with the keywords msgid and msgstr
respectively. In the above example there is only one such pair & msgid is
shown first followed by a string in the source language, followed by a msgstr
in the next line which is immediately followed by a blank string.

Now in order to translate the application, these POT files are copied as PO
(.po) files in respective language folders and then translated. What I mean by
translation here is that, corresponding to every string adjacent to msgid
there is a translated string (in local script), adjacent to msgstr. For Hindi
it will look something like this:

<code bash>
msgid "Click Here\n"
msgstr "Cliquez ici\n"
</code>

===  compiling and running a Localized Application ===

Create an MO (.mo) file using the following command:

<code c>
msgfmt helloworld.po -o helloworld.mo
</code>

In root mode copy the MO file to /usr/share/locale/<LANGUAGE>/LC_MESSAGES. For
French, do something like this:

<code bash>
cp helloworld.mo /usr/share/locale/fr_FR/LC_MESSAGES/
</code>

Don't forget to export your language here:

<code bash>
export LANG=fr_FR.utf8
</code>

Then compile and execute your program.

==== Localization tips ====

=== Don't make assumptions about languages ===

Languages vary wildly and even though you might know several of them, you
shouldn't assume there is any common logic to them.

For instance, with English typography no character must appear before colons
and semicolons (':' and ';'). However, with French typography, there should be
"espace fine ins├ęcable", i.e. a non-breakable space (HTML's &nbsp;) that
is narrower that regular spaces.

This prevents proper translation in the following construct:

<code c>
snprintf(buf, some_size, "%s: %s", gettext(error), gettext(reason));
</code>

The proper way to do it is to use a single string and let the translators
manage the punctuation. This means, translating the format string instead:

<code c>
snprintf(buf, some_size, gettext("%s: %s"), gettext(error), gettext(reason));
</code>

Of course, it might not always be doable but you should strive for this unless
a specific issue arises.

=== Translations will be of different lengths ===

Depending on the language, the translation will have a different length on
screen. Some languages have shorter constructs than other in some cases while
it is reversed for others; some languages can also have a word for a concept
while others won't and will require a circumlocution (designating something by
using several words).

=== For source control, don't commit .po if only line indicators have changed ===

From the example above, a translation block looks like:

<code c>
#: some_file.c:43 another_file.c:41
msgid "Click Here"
msgstr "Cliquez ici"
</code>

In case you insert a new line at the top of "some_file.c", the line indicator
will change to look like

<code c>
#: some_file.c:44 another_file.c:41
</code>

Obviously, on non-trivial projects, such changes will happen often. If you use
source control (you should) and commit such changes even though no actual
translation change has happened, each and every commit will probably contain a
change to .po files. This will hamper readability of the change history and in
case several people are working in parallel and need to merge their changes,
this will create huge merge conflicts each time.

Only commit changes to .po files when actual translation changes have
happened, not merely because line comments have changed.

=== Using _() as a shorthand to the gettext() function ===

Since calling ''gettext()'' might happen very often, it is often abbreviated
to ''_()'':

<code c>
#define _(str) gettext(str)
</code>

== Proper sorting: strcoll() ==

Quite often you will want to sort data for display. There is a string
comparison tailored for that: ''strcoll()''. It works the same as ''strcmp()''
but sorts according to the current locale settings.

<code c>
int strcmp(const char *s1, const char *s2);
int strcoll(const char *s1, const char *s2);
</code>

The function prototype is a standard one and indicates how to order strings. A
detailed explanation would be out of scope for this guide but chances are you
will be able to provide the ''strcoll()'' function as the comparison function
for sorting the data set you are using.

== Working with translators ==

The system described above is a common one and will likely be known to
translators, meaning that giving its name ("gettext") might be enough to
explain how to work. In addition to this documentation, there is extensive
additional documentation and questions and answers on the topic on the
Internet.

Don't hesitate to put comments in your code right above the strings to
translate since they can be extracted along with the strings and put in the
.po files for the translator to see them.