SC2/WG2 N2075R
Proposal to add Lithuanian Accented Letters to ISO/IEC 10646-1
Table of Contents
1. Official:
1.1. Proposal Request Form
1.2. Graphic Symbols and Names
2. Rationale:
2.1. Lithuanian Letters
2.1.1. Main alphabet
2.1.2. Extended alphabet (with accented letters)
2.2. 8-bit single-byte coding (National standard code tables)
2.3. Multiple-Octet coding in ISO/IEC 10646-1 (UCS codes)
2.4. Samples
2.5. References
Lithuanian Standards Board: Kosciuskos 30, LT-2600 Vilnius, Lithuania
Phone: +370-5-270 93 60, fax: +370-5-212 62 52
Author and contact person: Vladas Tumasonis (Vilnius University and Lithuanian Standards Board)
E-mail: vladas.tumasonis@maf.vu.lt
Phone: +370-5-236 60 35, fax: +370-5-215 15 85
1.1. Proposal Request Form
Please fill Sections A, B and C below. Section D will be filled by SC 2/WG 2.
For instructions and guidance for filling in the form please see the document "Principles and Procedures for Allocation of New Characters and Scripts" (http://www.dkuug.dk/JTC1/SC2/WG2/prot)
1. Title: Addition of Lithuanian Accented Letters
2. Requester's name: Lithuanian Standards Board (LST)
3. Requester type (Member body/Liaison/Individual contribution): Correspondent Member
4. Submission date: 1999-08-15
5. Requester's reference (if applicable):
6. This is a complete proposal.
1. The proposal is for addition of character(s) to an existing block.
Name of the existing block: LATIN EXTENDED-B
2. Number of characters in proposal: 35
3. Proposed category (see section II, Character Categories): A
4. Proposed Level of Implementation (see clause 15, ISO/IEC 10646-1): 1
Is a rationale provided for the choice?
If Yes, reference:
5. Is a repertoire including character names provided?: Yes
a. If YES, are the names in accordance with the 'character naming guidelines' in Annex K of ISO/IEC 10646-1? Yes
b. Are the character shapes attached in a reviewable form? Yes
6. Who will provide the appropriate computerized font (ordered preference: True Type, PostScript or 96x96 bit-mapped format) for publishing the standard? True Type; Fotonija UAB, Vilnius, Lithuania
If available now, identify source(s) for the font (include address, e-mail, ftp-site, etc.) and indicate the tools used: Mr. Virginijus Dadurkevicius; dadurka@fotonija.com
7. References:
a. Are references (to other character sets, dictionaries, descriptive texts etc.) provided? Yes
b. Are published examples (such as samples from newspapers, magazines, or other sources) of use of proposed characters attached? Yes
8. Special encoding issues:
Does the proposal address other aspects of character data processing (if applicable) such as input, presentation, sorting, searching, indexing, transliteration etc. (if yes please enclose information): No
1. Has this proposal for addition of character(s) been submitted before? No
If YES explain
2. Has contact been made to members of the user community (for example: National Body, user groups of the script or characters, other experts, etc.)?
If YES, with whom?
If YES, available relevant documents?
3. Information on the user community for the proposed characters (for example: size, demographics, information technology use, or publishing use) is included? Yes
Reference:
4. The context of use for the proposed characters (type of use; common or rare) Common
Reference:
5. Are the proposed characters in current use by the user community? Yes
If YES, where? Reference: In Lithuania
6. After giving due considerations to the principles in N 2002 must the proposed characters be entirely in the BMP? Yes
If YES, is a rationale provided?
If YES, reference:
7. Should the proposed characters be kept together in a contiguous range (rather than being scattered)? Can be scattered
8. Can any of the proposed characters be considered a presentation form of an existing character or character sequence? Not existing characters, but they are fully composed forms of glyphs that can be represented as a composite sequence
If YES, is a rationale for its inclusion provided? Yes
If YES, reference: Is enclosed
9. Can any of the proposed character(s) be considered to be similar (in appearance or function) to an existing character? No
If YES, is a rationale for its inclusion provided?
If YES, reference:
10. Does the proposal include use of combining characters and/or use of composite sequences (see clause 4.11 and 4.13 in ISO/IEC 10646-1)? No
If YES, is a rationale for such use provided?
If YES, reference:
Is a list of composite sequences and their corresponding glyph images (graphic symbols) provided? No
If YES, reference:
11. Does the proposal contain characters with any special properties such as control function or similar semantics? No
If YES, describe in detail (include attachment if necessary)
1. Relevant SC 2/WG 2 document numbers:
2. Status (list of meeting number and corresponding action or disposition):
3. Additional contact to user communities, liaison organizations etc:
4. Assigned category and assigned priority/time frame:
1.2. Graphic Symbols and Names
Number | Graphic symbol | Name | Remarks |
1 | | LATIN CAPITAL LETTER A WITH OGONEK AND ACUTE | |
2 | | LATIN SMALL LETTER A WITH OGONEK AND ACUTE | |
3 | | LATIN CAPITAL LETTER A WITH OGONEK AND TILDE | |
4 | | LATIN SMALL LETTER A WITH OGONEK AND TILDE | |
5 | | LATIN CAPITAL LETTER E WITH OGONEK AND ACUTE | |
6 | | LATIN SMALL LETTER E WITH OGONEK AND ACUTE | |
7 | | LATIN CAPITAL LETTER E WITH OGONEK AND TILDE | |
8 | | LATIN SMALL LETTER E WITH OGONEK AND TILDE | |
9 | | LATIN CAPITAL LETTER E WITH DOT ABOVE AND ACUTE | |
10 | | LATIN SMALL LETTER E WITH DOT ABOVE AND ACUTE | |
11 | | LATIN CAPITAL LETTER E WITH DOT ABOVE AND TILDE | |
12 | | LATIN SMALL LETTER E WITH DOT ABOVE AND TILDE | |
13 | | LATIN SMALL LETTER I WITH DOT ABOVE AND GRAVE | Name ? |
14 | | LATIN SMALL LETTER I WITH DOT ABOVE AND ACUTE | Name ? |
15 | | LATIN SMALL LETTER I WITH DOT ABOVE AND TILDE | Name ? |
16 | | LATIN CAPITAL LETTER I WITH OGONEK AND ACUTE | |
17 | | LATIN SMALL LETTER I WITH OGONEK AND DOT ABOVE AND ACUTE | Name ? |
18 | | LATIN CAPITAL LETTER I WITH OGONEK AND TILDE | |
19 | | LATIN SMALL LETTER I WITH OGONEK AND DOT ABOVE AND TILDE | Name ? |
20 | | LATIN CAPITAL LETTER J WITH TILDE | |
21 | | LATIN SMALL LETTER J WITH TILDE | |
22 | | LATIN CAPITAL LETTER L WITH TILDE | |
23 | | LATIN SMALL LETTER L WITH TILDE | |
24 | | LATIN CAPITAL LETTER M WITH TILDE | |
25 | | LATIN SMALL LETTER M WITH TILDE | |
26 | | LATIN CAPITAL LETTER R WITH TILDE | |
27 | | LATIN SMALL LETTER R WITH TILDE | |
28 | | LATIN CAPITAL LETTER U WITH OGONEK AND ACUTE | |
29 | | LATIN SMALL LETTER U WITH OGONEK AND ACUTE | |
30 | | LATIN CAPITAL LETTER U WITH OGONEK AND TILDE | |
31 | | LATIN SMALL LETTER U WITH OGONEK AND TILDE | |
32 | | LATIN CAPITAL LETTER U WITH MACRON AND ACUTE | |
33 | | LATIN SMALL LETTER U WITH MACRON AND ACUTE | |
34 | | LATIN CAPITAL LETTER U WITH MACRON AND TILDE | |
35 | | LATIN SMALL LETTER U WITH MACRON AND TILDE | |
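The form notes that the proposed letters can also be represented as composite sequences. As an illustration only, the following Python sketch derives such a sequence from a proposed name by splitting it into a base letter and its diacritics. The function name `composite_sequence` and the `MARKS` table are hypothetical helpers, not part of the proposal, and the mapping to combining characters is an assumption based on the names in Python's `unicodedata` module.

```python
import unicodedata

# Assumed mapping from the diacritic words in the proposed names
# to the names of Unicode combining characters.
MARKS = {
    "OGONEK": "COMBINING OGONEK",
    "DOT ABOVE": "COMBINING DOT ABOVE",
    "MACRON": "COMBINING MACRON",
    "GRAVE": "COMBINING GRAVE ACCENT",
    "ACUTE": "COMBINING ACUTE ACCENT",
    "TILDE": "COMBINING TILDE",
}

def composite_sequence(name):
    """Build base letter + combining marks from a proposed character name."""
    # "LATIN SMALL LETTER A WITH OGONEK AND ACUTE"
    #   -> base name "LATIN SMALL LETTER A", marks "OGONEK AND ACUTE"
    base_name, _, marks = name.partition(" WITH ")
    base = unicodedata.lookup(base_name)
    combining = [unicodedata.lookup(MARKS[m]) for m in marks.split(" AND ")]
    return base + "".join(combining)

# Hypothetical usage with two names from the table above.
seq_a = composite_sequence("LATIN SMALL LETTER A WITH OGONEK AND ACUTE")
seq_i = composite_sequence("LATIN SMALL LETTER I WITH OGONEK AND DOT ABOVE AND ACUTE")
```

The marks are emitted in the order in which they appear in the name, so the ogonek precedes the marks placed above the letter.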
2. Rationale:
2.1. Lithuanian Letters
2.1.1. Main alphabet
In its grammatical structure, Lithuanian is one of the most ancient of the living Indo-European languages. It is spoken by approximately 5 million people and is taught at many universities all over the world as part of linguistic studies.
The main Lithuanian alphabet consists of the Latin alphabet (excluding Q, q, W, w, X, x) with 18 extra letters with diacritics (9 capital and 9 small): Ą ą, Č č, Ę ę, Ė ė, Į į, Š š, Ų ų, Ū ū, Ž ž.
These letters are included in 8-bit single-byte coded character sets (ISO/IEC 8859-13, MS CP 1257, IBM CP 775, etc.), so using them poses no problems.
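As a rough check of this statement, the sketch below encodes the letters with diacritics in the three coded character sets named above, using Python's built-in codecs `iso8859_13`, `cp1257` and `cp775`. The letter list and the helper name `single_byte_codes` are assumptions made for the illustration.

```python
# Assumed list of the 18 main-alphabet letters with diacritics.
MAIN_DIACRITIC_LETTERS = "ĄČĘĖĮŠŲŪŽąčęėįšųūž"
CODECS = ["iso8859_13", "cp1257", "cp775"]

def single_byte_codes(letters=MAIN_DIACRITIC_LETTERS, codecs=CODECS):
    """Return {letter: {codec: encoded bytes}} for each letter and codec."""
    table = {}
    for ch in letters:
        # str.encode raises UnicodeEncodeError if a codec lacks the letter.
        table[ch] = {codec: ch.encode(codec) for codec in codecs}
    return table

codes = single_byte_codes()
# Every letter should occupy exactly one byte in every codec.
all_single_byte = all(
    len(encoded) == 1 for row in codes.values() for encoded in row.values()
)
```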
2.1.2. Extended alphabet (with accented letters)
Lithuanian has free word stress: the stress may fall on any syllable of the word. It performs at least two functions. Its constitutive function manifests itself in distinguishing a word from a combination of words, cf.:
The second function of word stress is the distinctive function, which distinguishes otherwise identical words by the place where the stress falls, e.g.:
Three accent marks (diacritical marks in ISO terms) are used to mark word stress: grave accent, acute accent and tilde. The position of the stress depends on the stress pattern (or accentual paradigm) of the word and its morphological structure (see examples above).
Word stress is expressed by means of accented letters.
There are 68 accented letters in the Lithuanian language:
The accented letters together with the main letters comprise the extended alphabet.
The use of accented letters goes back to the first Lithuanian writings. The first Lithuanian books were accented, e.g. "Kathechismas" (1595) and "Postilla catholicka" (1599). In present publishing practice, all dictionaries, special vocabularies and encyclopaedias are accented. Accented letters are used in school textbooks, reference books, linguistic texts, and in the publication of laws.
In the general press (newspapers, fiction, etc.) only the letters of the main Lithuanian alphabet are used. Accented letters appear only in those words where the stress has a distinctive function.
2.2. 8-bit single-byte coding (National standard code tables)
There are three national code tables in Lithuania for encoding the extended alphabet (usually we say "for encoding accented letters"). The basic Lithuanian code table is for the UNIX environment (the second half of this table is shown in fig. 1). It defines the basic character repertoire, including the accented letters. This code table conforms to ISO/IEC 8859-13, i.e. the codes of all main Lithuanian letters are the same in both tables. Commonly used and very important graphic characters are retained. The repertoire of this table is optimal for linguistic text processing.
The code table for Windows OS contains the basic repertoire and extra phonetic symbols in columns 8 and 9. This code table also conforms to ISO/IEC 8859-13.
The code table for DOS contains the basic repertoire and box drawing symbols and conforms to IBM CP 775 for the Baltic States. The DOS environment is still popular in publishing houses.
Fig. 1. UNIX code table for Lithuanian accented letters (second half)
2.3. Multiple-Octet coding in ISO/IEC 10646-1 (UCS codes)
All letters of the main Lithuanian alphabet have UCS codes (codes in ISO/IEC 10646-1), or Unicode codes. The situation with the Lithuanian accented letters is more complicated. As mentioned above, Lithuanian accented letters are Latin script letters with a grave accent, acute accent or tilde, so some of them are also ordinary letters in other languages. For example, LATIN LETTER A WITH ACUTE also occurs in Irish, Icelandic, Portuguese, Slovak, etc., and LATIN LETTER N WITH TILDE also occurs in Basque, Breton and Spanish. Such letters therefore have separate UCS codes.
Altogether, 33 Lithuanian accented letters have separate UCS codes and 35 accented letters do not.
Unshaded letters have UCS codes; shaded letters do not.
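The distinction between accented letters with and without a separate code can be approximated in software. The following Python sketch, offered only as an illustration, applies normalization form NFC to a base letter followed by a combining accent and checks whether the result collapses to a single code point. It relies on the Unicode data shipped with the Python interpreter, which is newer than the edition of ISO/IEC 10646-1 discussed here, so the results for an individual letter may differ from the situation described in this proposal. The name `has_precomposed` is hypothetical.

```python
import unicodedata

# The three Lithuanian accent marks as combining characters.
ACCENTS = {"grave": "\u0300", "acute": "\u0301", "tilde": "\u0303"}

def has_precomposed(base, accent):
    """True if base letter + accent composes to one code point under NFC."""
    composed = unicodedata.normalize("NFC", base + ACCENTS[accent])
    return len(composed) == 1

# Letters shared with other languages have a separate code ...
a_acute = has_precomposed("a", "acute")         # a with acute
n_tilde = has_precomposed("n", "tilde")         # n with tilde
# ... while typical proposed letters do not.
a_ogonek_acute = has_precomposed("\u0105", "acute")  # a with ogonek and acute
j_tilde = has_precomposed("j", "tilde")              # j with tilde
```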
There is another problem with the small letter "i" (and "i with ogonek"). The Lithuanian letter "i" has a dot above, and all accented forms of "i" should also keep the dot (see samples in 2.4). In ISO/IEC 10646-1 all such forms are dotless. For example, LATIN SMALL LETTER I WITH ACUTE in fact specifies "Latin small letter dotless i with acute". Since the dot above ought to be retained, these letters should be defined as LATIN SMALL LETTER I WITH DOT ABOVE AND ACUTE (or perhaps LATIN SMALL LETTER DOTLESS I WITH DOT ABOVE AND ACUTE).
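The difference between the existing dotless form and a dotted form can be illustrated with a composite sequence. The Python sketch below is an assumption-laden illustration, not the encoding proposed here: it writes the dotted letter as "i" followed by COMBINING DOT ABOVE (U+0307) and COMBINING ACUTE ACCENT (U+0301) and compares its NFC form with that of "i" plus acute alone. The helper name `nfc_code_points` is hypothetical.

```python
import unicodedata

def nfc_code_points(text):
    """Return the code points of the NFC form as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]

# i + acute composes to the existing LATIN SMALL LETTER I WITH ACUTE.
plain = nfc_code_points("i\u0301")
# i + dot above + acute has no single code and remains a sequence.
dotted = nfc_code_points("i\u0307\u0301")
```

Under these assumptions the two spellings remain distinct after normalization, which matches the point that the dotted form is not covered by the existing character.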
2.4. Samples
In [3, p.350]:
In [12, p.75]:
In [4, p.38]. Note the accented "i":
2.5. References
1. M. Dauksa, Kathechismas (1595) and Postilla catholicka (1599).
2. Lietuvių kalbos žodynas, I–XVIII t. [Dictionary of Lithuanian Language, I–XVIII volumes], Vilnius, 1956–1997.
3. Dabartinės lietuvių kalbos žodynas, vyr. red. St. Keinys [Dictionary of Modern Lithuanian Language, ed. by St. Keinys], Vilnius, Mokslo ir enciklopedijų leidykla, 1993.
4. Adelė Laigonaitė, Zigmas Zinkevičius, Lietuvių kalba. Mokomoji knyga X klasei [Lithuanian Language. Textbook for X form], Kaunas, Sviesa, 1997.
5. S. Matulaitienė, Skaitiniai. Vadovėlis VI klasei [Lithuanian Texts. Textbook for VI form], Kaunas, Sviesa, 1990.
6. Lithuanian Grammar, ed. by V. Ambrazas, Vilnius, Baltos lankos, 1997.
7. T. Mathiassen, A Short Grammar of Lithuanian, Slavica Publishers, Columbus, Ohio, 1996.
8. M. Ramonienė, I. Press, Colloquial Lithuanian, London and New York, Routledge, 1996.
9. B. Svecevičius, B. Piesarskas, Lietuvių - anglų kalbų žodynas [Lithuanian - English Dictionary], Vilnius, Mokslas, 1979.
10. Vokiečių - lietuvių kalbų žodynas [German - Lithuanian Dictionary], Vilnius, Mokslas, 1989.
11. A. Parenti, Italiano - Lituano, Lituano - Italiano, Garzanti Editore, 1994.
12. Romos Misiolas. Gedulinis Misiolas [Missalis Romani. Missale Parvum], Kaunas - Vilnius, 1982.
13. Tarptautinių žodžių žodynas, ats. red. V. Kvietkauskas [Dictionary of International Words, ed. by V. Kvietkauskas], Vilnius, Vyriausioji enciklopedijų redakcija, 1985.