ISO 639 macrolanguage

A macrolanguage is a book-keeping mechanism for the ISO 639 international standard of language codes. Macrolanguages are established to assist mapping between different sets of ISO language codes. Specifically, there may be a many-to-one correspondence between ISO 639-3, intended to identify all the thousands of languages of the world, and either of two other sets, ISO 639-1, established to identify languages in computer systems, and ISO 639-2, which encodes a few hundred languages for library cataloguing and bibliographic purposes. When such many-to-one ISO 639-2 codes are included in an ISO 639-3 context, they are called "macrolanguages" to distinguish them from the corresponding individual languages of ISO 639-3.[1] According to the ISO,

Some existing code elements in ISO 639-2, and the corresponding code elements in ISO 639-1, are designated in those parts of ISO 639 as individual language code elements, yet are in a one-to-many relationship with individual language code elements in [ISO 639-3]. For purposes of [ISO 639-3], they are considered to be macrolanguage code elements.

— ISO 639-3: Relationship between ISO 639-3 and the other parts of ISO 639[2]

ISO 639-3 is curated by SIL International; ISO 639-2 is curated by the Library of Congress (USA).

The mapping often covers borderline cases in which two language varieties may be considered either strongly divergent dialects of the same language or very closely related languages (dialect continua). It may also encompass situations where varieties are considered to belong to the same language on ethnic, cultural, and political grounds rather than for linguistic reasons. However, this is not its primary function, and the classification is not evenly applied.

For example, Chinese is a macrolanguage encompassing many languages that are not mutually intelligible, but "Standard German", "Bavarian German", and other closely related languages do not form a macrolanguage, despite being more mutually intelligible. Similarly, Tajiki is not part of the Persian macrolanguage despite sharing much of its lexicon, and Urdu and Hindi do not form a macrolanguage despite forming a mutually intelligible dialect continuum. All dialects of Hindi are considered separate languages. In essence, ISO 639-2 and ISO 639-3 use different criteria for dividing language varieties into languages: 639-2 relies more on shared writing systems and literature, whereas 639-3 focuses on mutual intelligibility and shared lexicon. The macrolanguages exist within the ISO 639-3 code set to make mapping between the two sets easier.

Ethnologue has used macrolanguages since its 16th edition.[3] As of 21 December 2023, there are fifty-nine language codes in ISO 639-2 that are counted as macrolanguages in ISO 639-3.[4] The most recently registered macrolanguage is Sanskrit, with code san, adopted on 15 December 2023, though it had already existed as an individual language for several years.[5]

Some of the macrolanguages had no individual language (as defined by 639-3) in ISO 639-2, e.g. "ara" (Arabic), but ISO 639-3 recognizes different varieties of Arabic as separate languages under some circumstances. Others, like "nor" (Norwegian), already had their two individual parts (nno Nynorsk, nob Bokmål) in 639-2. That means some languages (e.g. "arb" Standard Arabic) that ISO 639-2 considered to be dialects of one language ("ara") are, in certain contexts, considered by ISO 639-3 to be individual languages themselves. This is an attempt to deal with varieties that may be linguistically distinct from each other but are treated by their speakers as forms of the same language, e.g. in cases of diglossia. For example,

  • Generic Arabic, 639-2[6]
  • Standard Arabic, 639-3[7]
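The one-to-many relationship described above can be sketched as a small lookup table. This is only an illustrative sketch: the table name `MACROLANGUAGE_MEMBERS` and both functions are hypothetical, and the table is deliberately partial, holding only the codes named in this article (the registered mapping for "ara" contains many more members than "arb").

```python
# Illustrative, deliberately partial table: only the codes named in the text.
# The names below are hypothetical and not part of any ISO 639 tooling.
MACROLANGUAGE_MEMBERS = {
    "ara": {"arb"},         # Arabic -> Standard Arabic (other members omitted)
    "nor": {"nno", "nob"},  # Norwegian -> Nynorsk, Bokmål
}


def macrolanguage_of(code):
    """Return the macrolanguage code that contains `code`, or None."""
    for macro, members in MACROLANGUAGE_MEMBERS.items():
        if code in members:
            return macro
    return None


def collapse_to_macrolanguage(code):
    """Map an individual-language code to its macrolanguage, if it has one.

    Codes that belong to no macrolanguage in the table are returned unchanged.
    """
    return macrolanguage_of(code) or code
```

With this table, `collapse_to_macrolanguage("arb")` yields "ara", while a code that appears in no macrolanguage entry is passed through as-is.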

ISO 639-2 also includes codes for collections of languages; these are not the same as macrolanguages. These collections of languages are excluded from ISO 639-3, because they never refer to individual languages. Most such codes are included in ISO 639-5.
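The distinction between individual languages, macrolanguages, and collections can be sketched as a scope tag attached to each code. This is a minimal sketch under stated assumptions: the `Scope` enum, the `SAMPLE_SCOPES` table, and `belongs_in_iso_639_3` are hypothetical names, and "gem" (the ISO 639-2 collective code for Germanic languages) is an example chosen here rather than one taken from the text above.

```python
from enum import Enum


class Scope(Enum):
    """Hypothetical scope labels for a language code."""
    INDIVIDUAL = "individual language"
    MACROLANGUAGE = "macrolanguage"
    COLLECTION = "language collection"


# Illustrative sample only; not an excerpt of any official code table.
SAMPLE_SCOPES = {
    "arb": Scope.INDIVIDUAL,     # Standard Arabic
    "ara": Scope.MACROLANGUAGE,  # Arabic
    "gem": Scope.COLLECTION,     # Germanic languages (assumed example)
}


def belongs_in_iso_639_3(code):
    """Collections are excluded from ISO 639-3; the other two scopes are kept."""
    return SAMPLE_SCOPES[code] is not Scope.COLLECTION
```

Under these assumptions, "ara" and "arb" both pass the check, while "gem" does not, which mirrors the rule that collection codes never refer to individual languages.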

  1. ^ ISO 639-3: Scope of denotation for language identifiers: Macrolanguages
  2. ^ "Relationships to other parts of ISO 639 | ISO 639-3".
  3. ^ Lewis, M. Paul, ed. (2009). Ethnologue. Dallas: SIL International.
  4. ^ "Scope of denotation for language identifiers". SIL International.
  5. ^ "Comments received for ISO 639-3 Change Request 2011-041" (PDF). SIL International. October 31, 2023. Retrieved 21 December 2023.
  6. ^ "Documentation for ISO 639 identifier: ara". SIL International.
  7. ^ "Documentation for ISO 639 identifier: arb". SIL International.