Wiki pages multilanguage_pg created: 1 main PG

Signed-off-by: Clément Bénier <clement.benier@openwide.fr>
2015-09-02 15:43:12 +02:00 · 2015-09-02 15:43:12 +02:00 · 89c96566b5
parent 9631ab209b
commit 89c96566b5
3 changed files with 429 additions and 0 deletions
--- a/pages/docs.txt
+++ b/pages/docs.txt
@@ -62,6 +62,7 @@ Go check the current available version of EFL on each distro/platform:
  * [[program_guide/event_effect_pg|Event and Effect PG]]
  * [[program_guide/evas_pg|Evas PG]]
  * [[program_guide/edje_pg|Edje PG]]
+  * [[program_guide/multilingual_pg|Multilingual PG]]

 === Samples ===

--- a/pages/program_guide/index.txt
+++ b/pages/program_guide/index.txt
@@ -5,4 +5,5 @@
  * [[program_guide/event_effect_pg|Event and Effect PG]]
  * [[program_guide/evas_pg|Evas PG]]
  * [[program_guide/edje_pg|Edje PG]]
+  * [[program_guide/multilingual_pg|Multilingual PG]]
 ++++
--- a/pages/program_guide/multilingual_pg.txt
+++ b/pages/program_guide/multilingual_pg.txt
@@ -0,0 +1,427 @@
+{{page>index}}
+------
+===== Multilingual Programming Guide =====
+
+=== Table of Contents ===
+
+  * [[#Concepts|Concepts]]
+  * [[#Internationalization_in_EFL|Internationalization in EFL]]
+    * [[#Marking_text_parts_as_translatable|Marking text parts as translatable]]
+    * [[#Translating_texts_directly|Translating texts directly]]
+      * [[#Plurals|Plurals]]
+      * [[#Handling_language_changes_at_runtime|Handling language changes at runtime]]
+    * [[#Extracting_messages_for_translation|Extracting messages for translation]]
+    * [[#compiling_and_running_a_Localized_Application|Compiling and running a localized application]]
+  * [[#Internationalization_tips|Internationalization tips]]
+    * [[#Don't_make_assumptions_about_languages|Don't make assumptions about languages]]
+    * [[#Translations_will_be_of_different_lengths|Translations will be of different lengths]]
+    * [[#For_source_control,_don't_commit_.po_if_only_line_indicators_have_changed|For source control, don't commit .po if only line indicators have changed]]
+    * [[#Using__()_as_a_shorthand_to_the_gettext()_function|Using _() as a shorthand to the gettext() function]]
+    * [[#Proper_sorting:_strcoll()|Proper sorting: strcoll()]]
+    * [[#Working_with_translators|Working with translators]]
+
+==== Concepts ====
+
+Internationalization (also called i18n) is done by using strings in a specific
+language in the code (typically English) and then translating them to the
+target language.
+
+Using actual strings rather than resource identifiers makes the code more
+convenient to write and easier to read. Typical code to create a button whose
+label will be translated is:
+
+<code c>
+Evas_Object *button = elm_button_add(parent);
+elm_object_translatable_text_set(button, "Click Here");
+</code>
+
+The messages that require translations are typically automatically extracted
+from the sources and put into .po files, one per language. For the example
+above, the "fr.po" file could contain:
+
+<code c>
+#: some_file.c:43 another_file.c:41
+msgid "Click Here"
+msgstr "Cliquez ici"
+</code>
+
+In the example above, the program that extracts strings has found two
+occurrences of the same string, one in some_file.c at line 43 and another one
+in another_file.c at line 41. It gives the original string after "msgid" and
+the translation goes after "msgstr".
+
+Strings without translation are stored as the empty string "" in the .po file
+and the program will use the original strings, providing a sane fallback.
+
+The tools may add a "#, fuzzy" flag on the line before "msgid"; it means the
+original string has changed and the translation needs review.
+
+<note>
+Don't be surprised if the translation is correct even though you didn't change
+it: the extractor program is sometimes able to "guess" the updated
+translation!
+</note>
+
+==== Internationalization in EFL ====
+
+=== Marking text parts as translatable ===
+
+The most common way to use a translation involves the following APIs:
+
+<code c>
+elm_object_translatable_text_set(Evas_Object *obj, const char *text)
+elm_object_item_translatable_text_set(Elm_Object_Item *it, const char *text)
+</code>
+
+They set the untranslated string for the "default" part of the given
+''Evas_Object'' or ''Elm_Object_Item'' and mark the string as translatable.
+
+Similar functions are available if you wish to set the text for a part that is
+not "default":
+
+<code c>
+elm_object_translatable_part_text_set(Evas_Object *obj, const char *part, const char *text)
+elm_object_item_translatable_part_text_set(Elm_Object_Item *it, const char *part, const char *text)
+</code>
+
+It is important to provide the untranslated string to these functions because
+EFL triggers the translation itself and re-translates the strings
+automatically should the system language change.
+
+It is also possible to set the text and the translatable property separately.
+The text is set as usual, while the translatable property is set through
+''elm_object_part_text_translatable_set()''.
+
+There are also ''get()'' counterparts to the ''set()'' functions above.
+
+=== Translating texts directly ===
+
+The approach described in the previous section is not applicable all of the
+time. For instance, it won't work if you are populating a genlist, if you need
+plurals in the translation or if you want to do something else with the
+translation than putting it in elementary widgets.
+
+It is however possible to retrieve the translation for a given text using
+''gettext()'' from ''<libintl.h>'':
+
+<code c>
+char * gettext(const char * msgid);
+</code>
+
+This function takes as input a string (that will be copied to an msgid field
+in the .po files) and returns the translation (the
+corresponding msgstr field).
+
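+When no translation is available, ''gettext()'' hands back the original
+string, which is the fallback described in the Concepts section. The sketch
+below is only an illustration: ''is_translated()'' is a hypothetical helper,
+not part of gettext or EFL, and it assumes the GNU behaviour of returning the
+argument pointer itself when the lookup fails.
+

```c
#include <libintl.h>

/* Hypothetical helper: returns 1 when the active catalog provides a
 * translation for msgid and 0 when gettext() fell back to the original. */
int
is_translated(const char *msgid)
{
   /* On lookup failure gettext() returns its argument itself, so comparing
    * the pointers (not the contents) is enough to detect the fallback. */
   return gettext(msgid) != msgid;
}
```

+
+Such a check is mostly useful for debugging missing translations; regular
+code simply displays whatever ''gettext()'' returns.
+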
+In order to use gettext, you have to set the locale first:
+
+<code c>
+setlocale(LC_ALL,"");
+</code>
+
+''LC_ALL'' is a catch-all locale category (LC). Setting it alters all LC
+categories, including ''LC_MESSAGES'' and ''LC_CTYPE'', the two that matter
+most for translation: ''LC_MESSAGES'' selects the language of translated
+messages and ''LC_CTYPE'' determines the character set in use.
+
+By setting the locale to ''""'', you implicitly select the locale defined by
+the user (taken from the ''LC_*'' or ''LANG'' environment variables). If there
+is no user-defined locale, the default locale "C" is used.
+
+<code c>
+bindtextdomain("hello","/usr/share/locale/");
+</code>
+
+This call binds the domain ''"hello"'' to the root directory of the message
+files. The program will look for ''hello.mo'' in the
+''/usr/share/locale/<your_language>/LC_MESSAGES/'' directory, where
+''<your_language>'' comes from the user's locale and can be ''fr_FR'', for
+example. In other words, this call specifies where your translation files
+are installed. You will use ''"hello"'' again when setting the gettext domain
+through ''textdomain()'', and it corresponds to the name of the file that is
+looked up in the appropriate locale directory.
+
+The ''bindtextdomain()'' call is not mandatory; it can be omitted if you
+install your file in the system's default locale directory. However, since
+the default can change from system to system, making the call is recommended.
+
+<code c>
+textdomain("hello");
+</code>
+
+This sets the current text domain to ''"hello"'', as mentioned above, so that
+gettext calls look for the file ''hello.mo'' in the appropriate directory. By
+binding various domains and setting the text domain (or using
+''dcgettext()'', explained elsewhere) at runtime, you can switch between
+different domains as desired.
+
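+The three calls described above are usually made together at startup. As a
+rough sketch, they can be grouped in one function; ''i18n_init()'' is a
+hypothetical name chosen for this example and the error handling is an
+assumption, not something EFL or gettext requires.
+

```c
#include <libintl.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: runs the setup calls in order and returns 0 on
 * success, -1 if one of the domain calls fails. */
int
i18n_init(const char *domain, const char *locale_dir)
{
   /* Adopt the user's locale. A NULL return means the requested locale is
    * not installed; the program then keeps running in the "C" locale. */
   setlocale(LC_ALL, "");

   /* Tell gettext where the .mo files of this domain are installed. */
   if (bindtextdomain(domain, locale_dir) == NULL) return -1;

   /* Make this domain the default for later gettext() calls. */
   if (textdomain(domain) == NULL) return -1;

   return 0;
}
```

+
+It would be called once, early in the program, for instance as
+''i18n_init("hello", "/usr/share/locale")''.
+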
+When giving the text for a genlist item, you could use it in a similar manner
+as the one below:
+
+<code c>
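+#include <string.h>    /* strdup() */
+#include <libintl.h>   /* gettext(), bindtextdomain(), textdomain() */
+#include <locale.h>    /* setlocale() */
+#include <Elementary.h>
+
+#define _(str) gettext(str)
+
+static char *
+_genlist_text_get(void *data, Evas_Object *obj, const char *part)
+{
+   /* The genlist frees the returned string, hence the strdup() */
+   return strdup(gettext("Some Text"));
+   /* or, with the usual shorthand:
+    * return strdup(_("Some Text"));
+    */
+}
+
+EAPI_MAIN int
+elm_main(int argc, char **argv)
+{
+   /* Select the user's locale, then the location and name of the catalog */
+   setlocale(LC_ALL, "");
+   bindtextdomain("hello", "/usr/share/locale");
+   textdomain("hello");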
+
+   /* ... */
+
+   elm_run();
+   elm_shutdown();
+   return 0;
+}
+ELM_MAIN()
+</code>
+
+== Plurals ==
+
+Plurals are handled in a similar way but through the ''ngettext()'' function.
+Its prototype is shown below:
+
+<code c>
+char * ngettext (const char * msgid, const char * msgid_plural, unsigned long int n);
+</code>
+
+  * ''msgid'' is the same as before, i.e. the untranslated string
+  * ''msgid_plural'' is the plural form of ''msgid''
+  * ''n'' is the quantity (in English, 1 is singular and anything else is plural)
+
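+A call could look like the sketch below. ''format_comment_count()'' is a
+hypothetical helper written for this guide; it assumes the messages
+"%d Comment" and "%d Comments" and relies on ''ngettext()'' falling back to
+the English rule when no catalog is loaded.
+

```c
#include <libintl.h>
#include <stdio.h>

/* Hypothetical helper: writes e.g. "1 Comment" or "3 Comments" into buf and
 * returns the length reported by snprintf(). */
int
format_comment_count(char *buf, size_t size, unsigned long n)
{
   /* ngettext() picks the form matching n in the current language; without
    * a translation it returns msgid for n == 1 and msgid_plural otherwise. */
   const char *fmt = ngettext("%d Comment", "%d Comments", n);

   /* The messages use %d, so the count is converted to int. */
   return snprintf(buf, size, fmt, (int)n);
}
```

+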
+A matching fr.po file would contain the following lines:
+
+<code c>
+msgid "%d Comment"
+msgid_plural "%d Comments"
+msgstr[0] "%d commentaire"
+msgstr[1] "%d commentaires"
+</code>
+
+__Several plurals__
+
+It is even possible to have several plural forms. Polish, for instance, has
+three, so each plural entry in its .po file carries ''msgstr[0]'',
+''msgstr[1]'' and ''msgstr[2]''.
+
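+For illustration, the entry for the comment counter used in the French
+example could look as follows in a Polish .po file. The Polish strings are an
+assumption made for this sketch and are not taken from an existing catalog.
+

```po
msgid "%d Comment"
msgid_plural "%d Comments"
msgstr[0] "%d komentarz"
msgstr[1] "%d komentarze"
msgstr[2] "%d komentarzy"
```

+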
+The index values after msgstr are computed by the ''Plural-Forms'' expression
+in the header of the .po file. The one for Polish is given below:
+
+<code c>
+"Plural-Forms: nplurals=3; plural=n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;\n"
+</code>
+
+There are 3 forms (including singular). The index is 0 (singular) if the given
+integer n is 1. Then, if ''(n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 ||
+n % 100 >= 20))'', the index is 1 and otherwise it is 2.
+
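+To make the rule easier to follow, the same expression can be written as a
+small C function. This is only a reading aid: ''polish_plural_index()'' is a
+hypothetical name, and gettext evaluates the ''Plural-Forms'' expression by
+itself without any such function in the application.
+

```c
/* Hypothetical helper mirroring the Polish Plural-Forms expression:
 * returns the msgstr index (0, 1 or 2) selected for the quantity n. */
unsigned int
polish_plural_index(unsigned long n)
{
   /* Exactly one item: singular form. */
   if (n == 1) return 0;

   /* Quantities ending in 2, 3 or 4, except those ending in 12, 13 or 14. */
   if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20))
     return 1;

   /* Everything else, including 0 and 5 to 21. */
   return 2;
}
```

+
+For example, 1 selects index 0, 2 and 22 select index 1, while 5, 12 and 0
+select index 2.
+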
+== Handling language changes at runtime ==
+
+The user can change the system language settings at any time. When that
+happens, Ecore notifies the application through an event, and the application
+can then change the language used in Elementary. The widgets then receive a
+"language,changed" signal and can set their text again.
+
+The first step is to handle the ecore event:
+
+<code c>
+static Eina_Bool
+_app_language_changed(void *data, int type, void *event)
+{
+   // Set the language in Elementary
+   elm_language_set(setlocale(LC_ALL, NULL));
+   // Let other handlers of this event run as well
+   return ECORE_CALLBACK_PASS_ON;
+}
+
+int
+main(int argc, char *argv[])
+{
+    ...
+
+    // Get notified when the system locale changes
+    ecore_event_handler_add(ECORE_EVENT_LOCALE_CHANGED, _app_language_changed, NULL);
+
+    ...
+}
+</code>
+
+The call to ''elm_language_set()'' above triggers the emission of the
+"language,changed" signal, which can then be handled like any other smart
+event.
+
+=== Extracting messages for translation ===
+
+The xgettext tool can extract strings to translate to a .pot file (po
+template) while msgmerge can maintain existing .po files. The typical workflow
+is as follows:
+
+  * run xgettext once; it will generate a .pot file
+  * when adding a new translation, copy the .pot file to <locale>.po and translate that file
+  * new runs of xgettext will update the existing .pot file and msgmerge will update .po files
+
+A typical call to xgettext looks like:
+
+<code bash>
+ xgettext --directory=src --output-dir=res/po --keyword=_ --keyword=N_ --keyword=elm_object_translatable_text_set:2 --keyword=elm_object_item_translatable_text_set:2 --add-comments= --from-code=utf-8 --foreign-user
+</code>
+
+This extracts all strings passed to the "_()" function (the usual optional
+shorthand for gettext()), to "N_()" and as second argument of the two
+''translatable_text_set()'' functions, uses UTF-8 as the encoding and adds
+the comments placed right before the strings to the output files.
+
+A typical call to msgmerge looks like:
+
+<code bash>
+msgmerge --width=120 --update res/po/fr.po res/po/ref.pot
+</code>
+
+A POT (.pot) file is a Portable Object Template file. It contains pairs of
+lines starting with the keywords msgid and msgstr respectively. For the button
+example above there is only one such pair: the msgid line comes first and
+gives the string in the source language, and the msgstr line that follows
+holds an empty string, to be filled in by the translator.
+
+To translate the application, the POT file is copied as a PO (.po) file in
+the respective language folders and then translated: for every string next
+to a msgid, the translator writes the translated string (in the local
+script) next to the matching msgstr. For French it will look something like
+this:
+
+<code bash>
+msgid "Click Here"
+msgstr "Cliquez ici"
+</code>
+
+=== Compiling and running a localized application ===
+
+Create an MO (.mo) file using the following command. The name of the MO file
+must match the text domain, so with ''textdomain("hello")'' it has to be
+called ''hello.mo'':
+
+<code bash>
+msgfmt fr.po -o hello.mo
+</code>
+
+As root, copy the MO file to /usr/share/locale/<LANGUAGE>/LC_MESSAGES. For
+French, do something like this:
+
+<code bash>
+cp hello.mo /usr/share/locale/fr_FR/LC_MESSAGES/
+</code>
+
+Don't forget to export your language here:
+
+<code bash>
+export LANG=fr_FR.utf8
+</code>
+
+Then compile and execute your program.
+
+==== Internationalization tips ====
+
+=== Don't make assumptions about languages ===
+
+Languages vary wildly and even though you might know several of them, you
+shouldn't assume there is any common logic to them.
+
+For instance, in English typography no space appears before colons and
+semicolons (':' and ';'). However, French typography calls for an
+"espace fine insécable", i.e. a non-breakable space (similar to HTML's
+&nbsp;) that is narrower than a regular space.
+
+This prevents proper translation in the following construct:
+
+<code c>
+snprintf(buf, some_size, "%s: %s", gettext(error), gettext(reason));
+</code>
+
+The proper way to do it is to use a single string and let the translators
+manage the punctuation. This means translating the format string instead:
+
+<code c>
+snprintf(buf, some_size, gettext("%s: %s"), gettext(error), gettext(reason));
+</code>
+
+Of course, it might not always be doable but you should strive for this unless
+a specific issue arises.
+
+=== Translations will be of different lengths ===
+
+Depending on the language, the translation will have a different length on
+screen. Some languages have shorter constructs than others in some cases while
+it is reversed in other cases; some languages also have a word for a concept
+while others don't and require a circumlocution (designating something by
+using several words).
+
+=== For source control, don't commit .po if only line indicators have changed ===
+
+From the example above, a translation block looks like:
+
+<code c>
+#: some_file.c:43 another_file.c:41
+msgid "Click Here"
+msgstr "Cliquez ici"
+</code>
+
+If you insert a new line at the top of "some_file.c", the line indicator
+changes to:
+
+<code c>
+#: some_file.c:44 another_file.c:41
+</code>
+
+Obviously, on non-trivial projects, such changes will happen often. If you use
+source control (you should) and commit such changes even though no actual
+translation change has happened, each and every commit will probably contain a
+change to .po files. This will hamper readability of the change history and in
+case several people are working in parallel and need to merge their changes,
+this will create huge merge conflicts each time.
+
+Only commit changes to .po files when actual translation changes have
+happened, not merely because line comments have changed.
+
+=== Using _() as a shorthand to the gettext() function ===
+
+Since calling ''gettext()'' might happen very often, it is often abbreviated
+to ''_()'':
+
+<code c>
+#define _(str) gettext(str)
+</code>
+
+=== Proper sorting: strcoll() ===
+
+Quite often you will want to sort data for display. There is a string
+comparison function tailored for that: ''strcoll()''. It is called like
+''strcmp()'' but compares according to the current locale settings.
+
+<code c>
+int strcmp(const char *s1, const char *s2);
+int strcoll(const char *s1, const char *s2);
+</code>
+
+The function prototype is a standard one and indicates how to order strings. A
+detailed explanation would be out of scope for this guide but chances are you
+will be able to provide the ''strcoll()'' function as the comparison function
+for sorting the data set you are using.
+
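+As a sketch of how ''strcoll()'' can serve as a comparison function, the
+example below plugs it into the standard ''qsort()''. The names
+''compare_locale()'' and ''sort_for_display()'' are hypothetical, and a plain
+array of strings is assumed as the data set.
+

```c
#include <stdlib.h>
#include <string.h>

/* qsort() passes pointers to the array elements, which are themselves
 * string pointers, hence the extra level of indirection. */
int
compare_locale(const void *a, const void *b)
{
   const char *const *sa = a;
   const char *const *sb = b;

   /* Order the two strings according to the current locale. */
   return strcoll(*sa, *sb);
}

/* Hypothetical helper: sorts an array of strings in place. */
void
sort_for_display(const char **items, size_t count)
{
   qsort(items, count, sizeof(items[0]), compare_locale);
}
```

+
+The order only follows the user's language if ''setlocale()'' has been called
+beforehand; in the default "C" locale the result is the same as with
+''strcmp()''.
+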
+=== Working with translators ===
+
+The system described above is a common one and will likely be known to
+translators, meaning that giving its name ("gettext") might be enough to
+explain how to work with it. In addition to this guide, there is extensive
+documentation, as well as questions and answers, on the topic on the
+Internet.
+
+Don't hesitate to put comments in your code right above the strings to
+translate since they can be extracted along with the strings and put in the
+.po files for the translator to see them.