| Commit message | Author | Age | Files | Lines |
| |
|
|
|
| |
Also replaces the obsolete HTML/CSS version of the Unicode License
with the plain text version found on unicode.org.
|
| |
|
| |
Lets me slip these in:
Closes #12
Closes #14
|
| |
|
|
|
| |
Went smoothly, needed to add some scripts and adjust the magic numbers,
but other than that, all set.
|
| |
|
| |
These turned up an excessive number of allocations in CanonData and
CompatData, which have been reduced to two through the somewhat
squirrely use of 'magic numbers'.
There are now allocation tests for every allocated structure in the
library, and they run to completion in a reasonable amount of time.
So, that's nice.
|
| |
|
| |
This harmonizes the allocating modules in a couple of ways. All can
now be constructed by pointer, and all treat various miscellaneous
read failures as `unreachable`, which indeed they should be.
The README has been updated to inform users of this option.
|
| | |
|
| |
|
|
| |
These get different names, but don't otherwise change.
|
| |
|
|
|
| |
CaseFolding now has the FoldData, and can be initialized with a copy
of Normalize if wanted.
|
| | |
|
| | |
|
| |
|
|
|
| |
In the process of refactoring the whole library, so that it doesn't
expose anything called "Data" separately from user functionality.
|
| |
|
|
|
|
| |
After a considerable slog, all tests are reachable from the test step,
and pass. Almost every failure was related to the change away from the
inclusion of an allocator on this or that.
|
| |
|
|
|
|
| |
Closes #29
The README is also updated to reflect this change.
|
| | |
|
| |
|
|
| |
methods were removed, mem.Allocators were added to deinit as arguments.
|
| | |
|
| |
|
| |
This allows a build of DisplayWidth to give characters in those classes
a width, for cases where they'll be printed with a substitute in the
final display. It also widens the type that holds a character's width from an
i3 to an i4, to accommodate printing C1s as e.g. <80> or \u{80}.
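
The arithmetic behind the wider type can be checked with a short sketch. It is written in Python rather than Zig, and `signed_range` is a hypothetical helper, not part of zg; it only assumes that i3 and i4 are two's-complement signed integers of 3 and 4 bits.

```python
def signed_range(bits: int) -> tuple[int, int]:
    """Return (min, max) for a two's-complement signed integer of `bits` bits."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


# The two substitute renderings of the C1 control U+0080 named in the message.
substitutes = ["<80>", "\\u{80}"]
# Each substitute is plain ASCII, so its width in columns equals its length.
widths = [len(s) for s in substitutes]

i3_min, i3_max = signed_range(3)
i4_min, i4_max = signed_range(4)

# Does every substitute width fit in the given type?
fits_i3 = all(w <= i3_max for w in widths)
fits_i4 = all(w <= i4_max for w in widths)
```

Under these assumptions neither substitute fits in an i3, while both fit in an i4.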
|
| | |
|
| | |
|
| |\
| |
| |
| |
| |
| | |
into master
Reviewed-on: https://codeberg.org/atman/zg/pulls/27
|
| |/ |
|
| |\
| |
| |
| |
| |
| | |
from e0328eric/zg:master into master
Reviewed-on: https://codeberg.org/atman/zg/pulls/25
|
| |/ |
|
| | |
|
| |\
| |
| |
| |
| |
| | |
squeek502/zg:folddata-leak into master
Reviewed-on: https://codeberg.org/atman/zg/pulls/21
|
| |/
|
|
| |
Closes #20
|
| |
|
|
| |
Also documents the `cjk` option, and how to enable it.
|
| | |
|
| | |
|
| | |
|
| |\
| |
| |
| | |
Reviewed-on: https://codeberg.org/dude_the_builder/zg/pulls/18
|
| | |
| |
| |
| |
| | |
This does the expected thing: returns the next ?Grapheme without
mutation of the iteration state.
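
As a rough illustration of that contract, here is a minimal Python sketch. `PeekableIterator` and its fields are hypothetical stand-ins, not the zg iterator: it walks a prepared list of items, and `peek` saves and restores the cursor around a call to `next`.

```python
from typing import Optional, Sequence


class PeekableIterator:
    """Hypothetical iterator over pre-split items; not the zg implementation."""

    def __init__(self, items: Sequence[str]) -> None:
        self.items = items
        self.index = 0  # the only iteration state

    def next(self) -> Optional[str]:
        """Return the next item and advance, or None when exhausted."""
        if self.index >= len(self.items):
            return None
        item = self.items[self.index]
        self.index += 1
        return item

    def peek(self) -> Optional[str]:
        """Return what next() would return, leaving the cursor where it was."""
        saved = self.index
        try:
            return self.next()
        finally:
            self.index = saved
```

Calling `peek` any number of times returns the same value until `next` is called.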
|
| |/ |
|
| |\
| |
| |
| |
| |
| | |
unreachable and define return errorset' (#16) from rockorager/zg:master into master
Reviewed-on: https://codeberg.org/dude_the_builder/zg/pulls/16
|
| | |
| |
| |
| |
| |
| |
| | |
The reader is a static embedded file. All of the reads are readInt. This
function should not ever fail at runtime with a read error. Make all
read errors unreachable, leaving only allocation errors as the error
set.
|
| |/
|
| |
The reader is a static embedded file. All of the reads are either a
readInt or a readAll into a previously allocated buffer. This function
should not ever fail at runtime with a read error. Make all read errors
unreachable, leaving only allocation errors as the error set.
|
| | |
|
| |\
| |
| |
| |
| |
| | |
from lygaret/zg:master into master
Reviewed-on: https://codeberg.org/dude_the_builder/zg/pulls/11
|
| |/
|
|
|
|
| |
Without changing the algorithm at all, move the responsibility of
decoding a u8 slice out of the iterator and into a reusable function,
so that it can be used by consumers of the library.
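
The shape of that refactor might look like the following Python sketch. The names `decode_code_point` and `code_points` are illustrative assumptions, not the zg API, and the decoder assumes well-formed UTF-8 apart from a basic lead-byte check.

```python
from typing import Iterator, Tuple


def decode_code_point(data: bytes, start: int) -> Tuple[int, int]:
    """Decode one code point at data[start]; return (code_point, byte_length).

    Assumes well-formed UTF-8; continuation bytes are not validated.
    """
    first = data[start]
    if first < 0x80:
        return first, 1
    if first >> 5 == 0b110:
        length, cp = 2, first & 0x1F
    elif first >> 4 == 0b1110:
        length, cp = 3, first & 0x0F
    elif first >> 3 == 0b11110:
        length, cp = 4, first & 0x07
    else:
        raise ValueError("invalid UTF-8 lead byte")
    # Fold in the low six bits of each continuation byte.
    for b in data[start + 1 : start + length]:
        cp = (cp << 6) | (b & 0x3F)
    return cp, length


def code_points(data: bytes) -> Iterator[int]:
    """The iterator only advances a cursor; decoding lives in the function above."""
    i = 0
    while i < len(data):
        cp, length = decode_code_point(data, i)
        yield cp
        i += length
```

Because the decoding step is a free function, a caller can decode a single code point at a known offset without constructing the iterator.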
|
| | |
|
| |\
| |
| |
| |
| |
| | |
squeek502/zg:bench-windows-and-fmt into master
Reviewed-on: https://codeberg.org/dude_the_builder/zg/pulls/9
|
| | | |
|
| |\ \
| |/
|/|
| |
| |
| | |
from squeek502/zg:folddata-min into master
Reviewed-on: https://codeberg.org/dude_the_builder/zg/pulls/10
|
| |/
|
| |
Only a few codepoints have a mapping in CaseFolding.txt but do not have the Changes_When_Casefolded property set. So, FoldData can just store a list of those particular codepoints and then re-use the encoded CaseFolding.txt data alongside it in order to implement changesWhenCaseFolded.
This reduces the size of fold.bin.z from 4,387 bytes (4.28KiB) to 1,165 bytes (1.13KiB).
This also seemingly introduced a very slight performance regression in zg_caseless.
Before:
zg CaseFold.compatCaselessMatch: result: 626, took: 258ns
zg CaseFold.canonCaselessMatch: result: 626, took: 129ns
After:
zg CaseFold.compatCaselessMatch: result: 626, took: 263ns
zg CaseFold.canonCaselessMatch: result: 626, took: 131ns
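
The lookup described above can be sketched in a few lines of Python. Everything here is a toy assumption: `FOLD_MAP` and `CWCF_EXCEPTIONS` are hypothetical stand-ins for the encoded data, the private-use entry is invented for illustration, and the real FoldData layout may differ.

```python
# Toy stand-in for the encoded CaseFolding.txt mappings (code point -> folded).
FOLD_MAP = {
    0x41: 0x61,      # 'A' -> 'a'
    0xE000: 0xE001,  # invented private-use entry, for illustration only
}

# Code points that have a mapping but do not have Changes_When_Casefolded set.
# The single entry is made up; it is not taken from the Unicode data files.
CWCF_EXCEPTIONS = frozenset({0xE000})


def changes_when_case_folded(cp: int) -> bool:
    """True when cp has a fold mapping and is not one of the listed exceptions."""
    return cp in FOLD_MAP and cp not in CWCF_EXCEPTIONS
```

Only the short exception list has to be stored in addition to the mappings, which is consistent with the size reduction reported above.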
|
| | |
|
| | |
|
| | |
|
| | |
|
| |\
| |
| |
| |
| |
| | |
explicit error sets' (#7) from squeek502/zg:normalize-utf8encode-error into master
Reviewed-on: https://codeberg.org/dude_the_builder/zg/pulls/7
|
| |/
|
|
| |
These utf8Encode calls are converting normalized codepoints back into UTF-8, so the codepoints can be assumed to be valid.
|