| author | 2025-04-30 20:32:23 -0400 | |
|---|---|---|
| committer | 2025-04-30 20:32:23 -0400 | |
| commit | a7164d9e7b3c3ec6813e06a42d82180d766e15ca (patch) | |
| tree | b9c55a45ddac98e51653cb64d39b6b26cfb50362 /data/unicode/NamesList.html | |
| parent | Allocation Failure Tests (diff) | |
Unicode 16.0
Went smoothly, needed to add some scripts and adjust the magic numbers,
but other than that, all set.
Diffstat (limited to 'data/unicode/NamesList.html')
| -rw-r--r-- | data/unicode/NamesList.html | 69 |
1 files changed, 46 insertions, 23 deletions
diff --git a/data/unicode/NamesList.html b/data/unicode/NamesList.html index d6809e1..a67236e 100644 --- a/data/unicode/NamesList.html +++ b/data/unicode/NamesList.html | |||
| @@ -100,7 +100,7 @@ a.headernav:hover { | |||
| 100 | <tbody> | 100 | <tbody> |
| 101 | <tr> | 101 | <tr> |
| 102 | <td>Revision</td> | 102 | <td>Revision</td> |
| 103 | <td>15.1.0</td> | 103 | <td>16.0.0</td> |
| 104 | </tr> | 104 | </tr> |
| 105 | <tr> | 105 | <tr> |
| 106 | <td>Authors</td> | 106 | <td>Authors</td> |
| @@ -108,19 +108,19 @@ a.headernav:hover { | |||
| 108 | </tr> | 108 | </tr> |
| 109 | <tr> | 109 | <tr> |
| 110 | <td>Date</td> | 110 | <td>Date</td> |
| 111 | <td>2023-08-23</td> | 111 | <td>2024-08-21</td> |
| 112 | </tr> | 112 | </tr> |
| 113 | <tr> | 113 | <tr> |
| 114 | <td>This Version</td> | 114 | <td>This Version</td> |
| 115 | <td > | 115 | <td > |
| 116 | <a href="https://www.unicode.org/Public/15.1.0/ucd/NamesList.html"> | 116 | <a href="https://www.unicode.org/Public/16.0.0/ucd/NamesList.html"> |
| 117 | https://www.unicode.org/Public/15.1.0/ucd/NamesList.html</a></td> | 117 | https://www.unicode.org/Public/16.0.0/ucd/NamesList.html</a></td> |
| 118 | </tr> | 118 | </tr> |
| 119 | <tr> | 119 | <tr> |
| 120 | <td>Previous Version</td> | 120 | <td>Previous Version</td> |
| 121 | <td> | 121 | <td> |
| 122 | <a href="https://www.unicode.org/Public/15.0.0/ucd/NamesList.html"> | 122 | <a href="https://www.unicode.org/Public/15.1.0/ucd/NamesList.html"> |
| 123 | https://www.unicode.org/Public/15.0.0/ucd/NamesList.html</a></td> | 123 | https://www.unicode.org/Public/15.1.0/ucd/NamesList.html</a></td> |
| 124 | </tr> | 124 | </tr> |
| 125 | <tr> | 125 | <tr> |
| 126 | <td>Latest Version</td> | 126 | <td>Latest Version</td> |
| @@ -159,8 +159,8 @@ released form of the NamesList.txt file.</p> | |||
| 159 | draft versions of the NamesList.txt file. The support for UTF-8 encoded files and the syntax for the UTF-8 charset | 159 | draft versions of the NamesList.txt file. The support for UTF-8 encoded files and the syntax for the UTF-8 charset |
| 160 | declaration in a comment at the head of the file were introduced after Unicode | 160 | declaration in a comment at the head of the file were introduced after Unicode |
| 161 | 6.1.0 was published, as was the syntax for the specification of variation sequences and alternate glyphs and their respective summaries. The repertoire restriction | 161 | 6.1.0 was published, as was the syntax for the specification of variation sequences and alternate glyphs and their respective summaries. The repertoire restriction |
| 162 | in comments and aliases in the names list format was loosened from the prior | 162 | in comments and aliases in the names list format was loosened from the earlier |
| 163 | limitation to U+0020..U+00FF, to include the wider range U+0020..U+02FF, as of Unicode 11.0.</p> | 163 | limitation to U+0020..U+00FF, to include the wider range U+0020..U+02FF, as of Unicode 11.0, and dropped entirely as of Unicode 16.0.0.</p> |
| 164 | 164 | ||
| 165 | <p>The same input file can be used for the preparation of drafts and final editions for ISO/IEC | 165 | <p>The same input file can be used for the preparation of drafts and final editions for ISO/IEC |
| 166 | 10646. Earlier versions of that standard used a different style, referred to below as ISO-style. That style necessitated the presence of some | 166 | 10646. Earlier versions of that standard used a different style, referred to below as ISO-style. That style necessitated the presence of some |
| @@ -281,10 +281,18 @@ CHAR_ENTRY: NAME_LINE | RESERVED_LINE | |||
| 281 | charset declaration (see below). Alternatively, or in addition, a BOM may be | 281 | charset declaration (see below). Alternatively, or in addition, a BOM may be |
| 282 | present at the very beginning of the file, forcing the encoding to be | 282 | present at the very beginning of the file, forcing the encoding to be |
| 283 | interpreted as UTF-16 (little-endian only) or UTF-8. When | 283 | interpreted as UTF-16 (little-endian only) or UTF-8. When |
| 284 | declared as UTF-8, the names list format will support use of characters in | 284 | declared as UTF-8, the names list format will support use of any Unicode characters in |
| 285 | the range U+0020..U+02FF in LINE and LABEL elements. Otherwise, | 285 | STRING and LABEL elements. Otherwise, |
| 286 | the supported repertoire is limited to Latin-1, and attempted use of characters outside | 286 | the supported repertoire is limited to Latin-1, and attempted use of characters outside |
| 287 | the Latin-1 range will result in data corruption.</p> | 287 | the Latin-1 range will result in data corruption.</p> |
| 288 | <p>The NamesList file format does not support styled text; each line or other element | ||
| 289 | will usually be displayed in a specific font selected for it. To allow CHAR elements | ||
| 290 | that normally use chart glyphs to better coexist with running text in LABEL and STRING | ||
| 291 | elements, a user defined limit can be set, below which the normal selection of (chart) glyphs | ||
| 292 | for the CHAR element is overridden in favor of equivalent glyphs from a font selected for better | ||
| 293 | readability in running text. Any running text outside that range will use standard chart | ||
| 294 | glyphs, which may result in a ransom note effect. For production of the Unicode Standard | ||
| 295 | Version 16.0.0 and later the limit is set to U+1EFF.</p> | ||
| 288 | <p>Several of these elements, while part of the formal definition of the | 296 | <p>Several of these elements, while part of the formal definition of the |
| 289 | file format, do not occur in final published versions of | 297 | file format, do not occur in final published versions of |
| 290 | NamesList.txt in the <a href="https://www.unicode.org/Public/UCD/latest/">UCD</a>.</p> | 298 | NamesList.txt in the <a href="https://www.unicode.org/Public/UCD/latest/">UCD</a>.</p> |
| @@ -514,14 +522,14 @@ is machine generated and will always explicitly provide any summary subheaders.< | |||
| 514 | <li>Because a LINE or an EXPAND_LINE can itself start with a special character followed | 522 | <li>Because a LINE or an EXPAND_LINE can itself start with a special character followed |
| 515 | by a SP or LF, an "unmarked" COMMENT_LINE should match the input in lower priority than line | 523 | by a SP or LF, an "unmarked" COMMENT_LINE should match the input in lower priority than line |
| 516 | types that require a special character or have a more restrictive set of characters than EXPAND_LINE. | 524 | types that require a special character or have a more restrictive set of characters than EXPAND_LINE. |
| 517 | Similarly, a SUBHEADER containing TAB "!" LF should match with a higher priority than those | 525 | Similarly, a SUBHEADER containing TAB "!" LF should match with a higher priority than one |
| 518 | where the TAB is followed by a LINE.</li> | 526 | where the TAB is followed by a LINE.</li> |
| 519 | </ul> | 527 | </ul> |
| 520 | 528 | ||
| 521 | 529 | ||
| 522 | <h3 id="FilePrimitives">2.2 <a href="#FilePrimitives">NamesList File Primitives</a></h3> | 530 | <h3 id="FilePrimitives">2.2 <a href="#FilePrimitives">NamesList File Primitives</a></h3> |
| 523 | 531 | ||
| 524 | <p>The following are the primitives and terminals for the NamesList syntax.</p> | 532 | <p>The following are the primitives and terminals for the NamesList syntax. "Limit" is a user-defined value; see discussion of the implications of Limit in the notes below.</p> |
| 525 | 533 | ||
| 526 | <pre><strong>LINE</strong>: <strong>STRING LF | 534 | <pre><strong>LINE</strong>: <strong>STRING LF |
| 527 | COMMENT: "(" LABEL ")" | 535 | COMMENT: "(" LABEL ")" |
| @@ -533,8 +541,8 @@ COMMENT: "(" LABEL ")" | |||
| 533 | 541 | ||
| 534 | <strong>TAG</strong>: <sequence of ASCII letters> | 542 | <strong>TAG</strong>: <sequence of ASCII letters> |
| 535 | <strong>LCTAG</strong>: <sequence of lowercase ASCII letters> | 543 | <strong>LCTAG</strong>: <sequence of lowercase ASCII letters> |
| 536 | <strong>STRING</strong>: <sequence of characters in the range U+0020..U+02FF, except controls> | 544 | <strong>STRING</strong>: <sequence of characters, except controls> |
| 537 | <strong>LABEL</strong>: <sequence of characters in the range U+0020..U+02FF, except controls, "(" or ")"> | 545 | <strong>LABEL</strong>: <sequence of characters, except controls, "(" or ")"> |
| 538 | <strong>VARSEL</strong>: <strong>CHAR | 546 | <strong>VARSEL</strong>: <strong>CHAR |
| 539 | | "ALT" ( "1"|"2"|"3"|"4"|"5"|"6"|"7"|"8"|"9" )</strong> | 547 | | "ALT" ( "1"|"2"|"3"|"4"|"5"|"6"|"7"|"8"|"9" )</strong> |
| 540 | <strong>VARSEL_LIST</strong>: <strong>"{" CHAR_LIST "}"</strong> | 548 | <strong>VARSEL_LIST</strong>: <strong>"{" CHAR_LIST "}"</strong> |
| @@ -580,19 +588,27 @@ COMMENT: "(" LABEL ")" | |||
| 580 | of following characters.</li> | 588 | of following characters.</li> |
| 581 | <li>The hyphen in a character range CHAR-CHAR is replaced by an EN DASH on | 589 | <li>The hyphen in a character range CHAR-CHAR is replaced by an EN DASH on |
| 582 | output.</li> | 590 | output.</li> |
| 583 | <li>In a STRING or LABEL, a Unicode character outside the range | ||
| 584 | U+0000..U+02FF is displayed as is, with a glyph matching | ||
| 585 | the chart font, and not with the font that is otherwise defined for that element.</li> | ||
| 586 | <li>The NamesList.txt file is encoded in UTF-8 if the <i>first line</i> is a | 591 | <li>The NamesList.txt file is encoded in UTF-8 if the <i>first line</i> is a |
| 587 | FILE_COMMENT containing the declaration "UTF-8" or any casemap variation | 592 | FILE_COMMENT containing the declaration "UTF-8" or any casemap variation |
| 588 | thereof. Otherwise the file is encoded in Latin-1 (older versions). Beyond | 593 | thereof. Otherwise the file is encoded in Latin-1 (older versions). Beyond |
| 589 | detecting the charset declaration (typically: "; charset=utf-8") the | 594 | detecting the charset declaration (typically: "; charset=utf-8") the |
| 590 | remainder of that comment is ignored. | 595 | remainder of that comment is ignored. |
| 591 | If the file is not encoded as | 596 | When declared as UTF-8, the NamesList format will support any Unicode character |
| 592 | UTF-8, the character repertoire for running text (anything | 597 | in STRING or LABEL elements, but see further implications below.</li> |
| 593 | other than CHAR) is effectively restricted to the repertoire of Latin-1. | 598 | <li>In a STRING or LABEL element, a Unicode character outside the range |
| 594 | Otherwise, characters in the range U+0020..U+02FF | 599 | U+0020..Limit is displayed with a glyph matching |
| 595 | are allowed in STRING or LABEL elements, and elements derived from them.</li> | 600 | the chart font, and not with the font that is otherwise defined for that element. |
| 601 | The Limit value is user defined. | ||
| 602 | For production of the Unicode Standard from Version 16.0.0 and later the Limit | ||
| 603 | value is set to U+1EFF. | ||
| 604 | All code points less than the Limit value can be mapped onto a font selected for best | ||
| 605 | results in running text. However, any CHAR elements contained in an EXPAND_LINE | ||
| 606 | are exempt from this and are always displayed with a glyph matching the chart font. | ||
| 607 | The net effect is a workaround for the fact that the NamesList format does | ||
| 608 | not support style runs within any element that encompasses a single unit of flowed text.</li> | ||
| 609 | <li>When drafting STRING or LABEL elements, one should note that text containing | ||
| 610 | characters outside the range U+0020..Limit may result in a ransom note effect, | ||
| 611 | as the regular text font and charts fonts would be alternated. This is best avoided.</li> | ||
| 596 | <li>The code chart layout program | 612 | <li>The code chart layout program |
| 597 | (<a href="https://www.unicode.org/unibook/">Unibook</a>) | 613 | (<a href="https://www.unicode.org/unibook/">Unibook</a>) |
| 598 | can accept files in several other formats. These include little-endian UTF-16, | 614 | can accept files in several other formats. These include little-endian UTF-16, |
| @@ -610,9 +626,16 @@ COMMENT: "(" LABEL ")" | |||
| 610 | </ul> | 626 | </ul> |
| 611 | <h2 id="Modifications"><a href="#Modifications">Modifications</a></h2> | 627 | <h2 id="Modifications"><a href="#Modifications">Modifications</a></h2> |
| 612 | 628 | ||
| 629 | <p><b>Version 16.0.0</b></p> | ||
| 630 | <ul> | ||
| 631 | <li>Reissued for Unicode 16.0.0</li> | ||
| 632 | <li>Reflect the wider range of possible values for the user defined Limit.</li> | ||
| 633 | <li>Added an explanation of the effect of the Limit value.</li> | ||
| 634 | </ul> | ||
| 635 | |||
| 613 | <p><b>Version 15.1.0</b></p> | 636 | <p><b>Version 15.1.0</b></p> |
| 614 | <ul> | 637 | <ul> |
| 615 | <li>Reissued for Unicode 15.0.0.</li> | 638 | <li>Reissued for Unicode 15.1.0.</li> |
| 616 | <li>Adjusted NAMELIST definition to account for positions of FILE_COMMENT.</li> | 639 | <li>Adjusted NAMELIST definition to account for positions of FILE_COMMENT.</li> |
| 617 | <li>Added a note to the bullets in Section 2.1 to clarify priority of matching for | 640 | <li>Added a note to the bullets in Section 2.1 to clarify priority of matching for |
| 618 | some line types.</li> | 641 | some line types.</li> |