Many, if not most, natural languages have different grammatical structures ("plural forms"/"plurals") to indicate different quantities. Though not every language agrees on how quantities map to their plural forms! E.g. is 0 plural, singular, or something else?

One of those things you might assume is trivial...

Gettext's facilities for this (which I'll discuss today) assume English as a source language, though I suspect those assumptions can easily be overcome for programming in other languages.


`dcigettext` & its many wrappers will resolve plurals (count defaults to 0) once it has successfully looked up a translation in the configured/selected catalogue. If the lookup is unsuccessful it may optionally apply an English-like "Germanic" `n == 1` plural form to choose between the 2 given strings.
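To make that fallback concrete, here's a minimal sketch in Python rather than Gettext's own C. `germanic_fallback` is a name I made up for illustration; `gettext.NullTranslations` is Python's stdlib stand-in for "no catalogue found" and applies the same rule.

```python
import gettext


def germanic_fallback(msgid1: str, msgid2: str, n: int) -> str:
    """Pick between the two source strings with the English-like rule."""
    # Hypothetical helper: singular only when the count is exactly 1.
    return msgid1 if n == 1 else msgid2


# With no catalogue loaded, Python's stdlib makes the same choice.
untranslated = gettext.NullTranslations()
for count in (0, 1, 2):
    ours = germanic_fallback("%d file", "%d files", count)
    stdlib = untranslated.ngettext("%d file", "%d files", count)
    print(count, ours % count, ours == stdlib)
```

Note that 0 lands on the plural string here, which is exactly the sort of assumption other languages disagree with.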

This involves interpreting the plural formula from the catalogue for the given count, then iterating over the multistring (the NUL-separated plural forms) to find the one at the computed index. Compiling the .mo file validates that this index stays in range.
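Walking the multistring can be sketched like this. It's Python over `bytes` rather than C pointers, `select_plural` is a hypothetical name, and returning the first form for an out-of-range index is how this sketch handles that case rather than a claim about every Gettext version.

```python
def select_plural(multistring: bytes, index: int) -> bytes:
    """Return plural form number `index` from a NUL-separated run of forms."""
    start = 0
    for _ in range(index):
        nul = multistring.find(b"\0", start)
        if nul == -1:
            # Index points past the last form: this sketch falls back
            # to the first form instead of reading out of bounds.
            start = 0
            break
        start = nul + 1  # skip over the NUL separator
    nul = multistring.find(b"\0", start)
    return multistring[start:] if nul == -1 else multistring[start:nul]


forms = b"%d plik\0%d pliki\0%d plikow"  # made-up three-form entry
print(select_plural(forms, 1))
```

The cost is linear in the index, which hardly matters when languages have only a handful of plural forms.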


Interpreting the plural formula is done recursively over its abstract syntax tree, branching first on the number of operands (0-3 inclusive) before branching on & applying the mathematical operation.
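Roughly, that interpreter has the following shape. This is a sketch in Python, not a port: the `Node` class, its field names & the operator strings are all my own invention, & Python's unbounded integers won't wrap the way C's `unsigned long` does.

```python
import operator
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Node:
    op: str                        # "var", "num", "!", "?:" or a binary operator
    args: Tuple["Node", ...] = ()  # 0 to 3 operands
    value: int = 0                 # only meaningful when op == "num"


BINARY = {
    "*": operator.mul, "/": operator.floordiv, "%": operator.mod,
    "+": operator.add, "-": operator.sub,
    "<": operator.lt, ">": operator.gt, "<=": operator.le,
    ">=": operator.ge, "==": operator.eq, "!=": operator.ne,
}


def evaluate(node: Node, n: int) -> int:
    nargs = len(node.args)
    if nargs == 0:                 # leaf: the variable n or a literal
        return n if node.op == "var" else node.value
    if nargs == 1:                 # the only unary operator is logical not
        return int(not evaluate(node.args[0], n))
    if nargs == 2:
        left = evaluate(node.args[0], n)
        if node.op == "||":        # short-circuit: skip the right side if possible
            return int(bool(left) or bool(evaluate(node.args[1], n)))
        if node.op == "&&":
            return int(bool(left) and bool(evaluate(node.args[1], n)))
        return int(BINARY[node.op](left, evaluate(node.args[1], n)))
    condition, if_true, if_false = node.args  # three operands: the ?: ternary
    return evaluate(if_true if evaluate(condition, n) else if_false, n)


# AST for the English-like formula `n != 1`.
germanic = Node("!=", (Node("var"), Node("num", value=1)))
print([evaluate(germanic, n) for n in (0, 1, 2)])
```

Branching on operand count first means the recursion into children is written once per arity rather than once per operator.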

Said expression is parsed when loading the .mo file by locating the "plural=" & "nplurals=" header fields of the metadata entry (the translation for ""), parsing nplurals via `strtoul` after scanning digits, & parsing plural using Bison & a manual lexer. Relatively trivial usage.
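The header-scanning half can be sketched as below, again in Python instead of C. `find_plural_headers` is a hypothetical name, cutting the expression at the first `;` or newline is a simplification of mine, & the extracted expression would still need a real parser (Bison, in Gettext's case).

```python
def find_plural_headers(header: str):
    """Return (nplurals, plural_expression), or None if a field is missing."""
    plural_at = header.find("plural=")
    nplurals_at = header.find("nplurals=")
    if plural_at == -1 or nplurals_at == -1:
        return None

    # Skip whitespace, then scan the run of ASCII digits after "nplurals=".
    pos = nplurals_at + len("nplurals=")
    while pos < len(header) and header[pos] in " \t":
        pos += 1
    end = pos
    while end < len(header) and header[end] in "0123456789":
        end += 1
    if end == pos:
        return None  # no digits at all
    nplurals = int(header[pos:end])

    # Assumption: the expression runs up to the first ';' or newline.
    expr_start = plural_at + len("plural=")
    expr_end = len(header)
    for stop in (";", "\n"):
        hit = header.find(stop, expr_start)
        if hit != -1:
            expr_end = min(expr_end, hit)
    return nplurals, header[expr_start:expr_end].strip()


metadata = (
    "Content-Type: text/plain; charset=UTF-8\n"
    "Plural-Forms: nplurals=2; plural=(n != 1);\n"
)
print(find_plural_headers(metadata))
```

A plain substring search works here because "nplurals=" doesn't itself contain "plural=".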

3/3 Fin!

P.S. I forgot to say: If Gettext fails to find plural headers in the .mo file it'll default to returning a handwritten AST representing `n != 1`. `strstr` is used to locate these headers.

3.1/3.1 Truly fin!

Upload this & yesterday's thread tonight, skim over the rest of Gettext's code sometime without commenting, & tomorrow I'll read some more of Pango to internationalize richtext rendering!

Gettext has ensured it works with XML formats like Pango's...

@alcinnz in Haiku locale kit we use ICU format instead. The way it works is there is a domain-specific language to describe the rules, something like this:

'{ one: "%d file", other: "%d files"}'

Each language can have a different set of rules (typically picking from "zero", "one", "few", "many" and "other"). The translation replaces the whole string, then it is run through a parser to pick the right output depending on the value.
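For comparison with the index-based Gettext approach, here's a rough Python sketch of category-based selection. It is not ICU's or the Haiku Locale Kit's API: the function names & dictionaries are hypothetical, & the two rule functions are my integer-only reading of the English & Polish plural rules.

```python
def english_category(n: int) -> str:
    return "one" if n == 1 else "other"


def polish_category(n: int) -> str:
    # Integer-only rule; fractions would fall into "other".
    if n == 1:
        return "one"
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return "few"
    return "many"


def format_count(templates: dict, category_of, n: int) -> str:
    """Pick the template for n's category, falling back to "other"."""
    template = templates.get(category_of(n), templates["other"])
    return template % n


ENGLISH = {"one": "%d file", "other": "%d files"}
POLISH = {"one": "%d plik", "few": "%d pliki",
          "many": "%d plików", "other": "%d pliku"}

for count in (1, 2, 5, 22):
    print(format_count(ENGLISH, english_category, count),
          "/", format_count(POLISH, polish_category, count))
```

The translator only has to supply strings keyed by category name, while the per-language rule mapping numbers to categories lives with the library.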
