| Commit message (Collapse) | Author | Age | Files | Lines |
| | |
|
| | |
|
| |
|
|
| |
Also adds setupWithGraphemes variant.
|
| |
|
|
|
| |
It was one `try` block away from only returning Allocator.Error, so now
there's no need to filter errors in an outer `catch`.
|
| |
|
|
|
|
|
| |
Lets me slip these in:
Closes #12
Closes #14
|
| |
|
|
|
| |
Went smoothly, needed to add some scripts and adjust the magic numbers,
but other than that, all set.
|
| |
|
|
|
|
|
|
|
|
| |
These turned up an excessive number of allocations in CanonData and
CompatData, which have been reduced to two through the somewhat
squirrely use of 'magic numbers'.
There are now allocation tests for every allocated structure in the
library, and they run to completion in a reasonable amount of time.
So, that's nice.
|
| |
|
|
|
|
|
|
| |
This harmonizes the allocating modules in a couple of ways. All can
now be constructed by pointer, and all treat various miscellaneous
read failures as `unreachable`, which indeed they should be.
The README has been updated to inform users of this option.
|
| | |
|
| |
|
|
| |
These get different names, but don't otherwise change.
|
| |
|
|
|
| |
CaseFolding now has the FoldData, and can be initialized with a copy
of Normalize if wanted.
|
| | |
|
| | |
|
| |
|
|
|
| |
In the process of refactoring the whole library, so that it doesn't
expose anything called "Data" separately from user functionality.
|
| |
|
|
|
|
| |
After a considerable slog, all tests are reachable from the test step,
and pass. Almost every failure was related to the change away from the
inclusion of an allocator on this or that.
|
| |
|
|
|
|
| |
Closes #29
The README is also updated to reflect this change.
|
| |
|
|
| |
methods were removed, mem.Allocators were added to deinit as arguments.
|
| | |
|
| |
|
|
|
|
|
| |
This allows a build of DisplayWidth to give characters in those classes
a width, for cases where they'll be printed with a substitute in the
final display. It also widens the type holding a character's width from
an i3 to an i4, to accommodate printing C1s as e.g. <80> or \u{80}.
|
| |
|
|
| |
Closes #20
|
| |
|
|
|
| |
This does the expected thing: returns the next ?Grapheme without
mutation of the iteration state.
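A minimal sketch of that peek behaviour, assuming a cursor-based iterator. `ByteIter` is a hypothetical stand-in that yields bytes rather than graphemes; it is not the library's iterator.

```zig
const std = @import("std");

/// Hypothetical stand-in for the grapheme iterator: yields one byte at a time.
const ByteIter = struct {
    bytes: []const u8,
    i: usize = 0,

    /// Returns the next item and advances the cursor.
    pub fn next(self: *ByteIter) ?u8 {
        if (self.i >= self.bytes.len) return null;
        defer self.i += 1;
        return self.bytes[self.i];
    }

    /// Returns what `next` would return, then restores the cursor.
    pub fn peek(self: *ByteIter) ?u8 {
        const saved = self.i;
        defer self.i = saved;
        return self.next();
    }
};
```

Saving and restoring the cursor keeps `peek` a thin wrapper over `next`, so the two cannot drift apart.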
|
| | |
|
| |
|
|
|
|
|
| |
The reader is a static embedded file. All of the reads are readInt. This
function should not ever fail at runtime with a read error. Make all
read errors unreachable, leaving only allocation errors as the error
set.
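A rough sketch of the pattern, under stated assumptions: `embedded`, `readU16`, and `load` are hypothetical names, and the hand-rolled reader stands in for the real readInt calls on the embedded file.

```zig
const std = @import("std");

// Hypothetical stand-in for the embedded data file: two little-endian u16 values.
const embedded = [_]u8{ 0x01, 0x00, 0x02, 0x00 };

/// Hypothetical read step that can fail with a read error.
fn readU16(bytes: []const u8, pos: *usize) error{EndOfStream}!u16 {
    if (bytes.len - pos.* < 2) return error.EndOfStream;
    const lo: u16 = bytes[pos.*];
    const hi: u16 = bytes[pos.* + 1];
    pos.* += 2;
    return lo | (hi << 8);
}

/// Only allocation can fail here. The data is fixed at compile time, so a
/// read error would be a bug and is marked `unreachable`.
fn load(allocator: std.mem.Allocator) std.mem.Allocator.Error![]u16 {
    const out = try allocator.alloc(u16, embedded.len / 2);
    var pos: usize = 0;
    for (out) |*slot| {
        slot.* = readU16(&embedded, &pos) catch unreachable;
    }
    return out;
}
```

With the read errors gone from the signature, callers only need to handle `error.OutOfMemory`.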
|
| |
|
|
|
|
|
| |
The reader is a static embedded file. All of the reads are either a
readInt or a readAll into a previously allocated buffer. This function
should not ever fail at runtime with a read error. Make all read errors
unreachable, leaving only allocation errors as the error set.
|
| |
|
|
|
|
| |
Without changing the algorithm at all, move the responsibility of
decoding a u8 slice out of the iterator and into a reusable function,
so that it can be used by consumers of the library.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Only a few codepoints have a mapping in CaseFolding.txt but do not have the Changes_When_Casefolded property set. So, FoldData can just store a list of those particular codepoints and then re-use the encoded CaseFolding.txt data alongside it in order to implement changesWhenCaseFolded.
This reduces the size of fold.bin.z from 4,387 bytes (4.28KiB) to 1,165 bytes (1.13KiB).
This also seemingly introduced a very slight performance regression in zg_caseless.
Before:
zg CaseFold.compatCaselessMatch: result: 626, took: 258ns
zg CaseFold.canonCaselessMatch: result: 626, took: 129ns
After:
zg CaseFold.compatCaselessMatch: result: 626, took: 263ns
zg CaseFold.canonCaselessMatch: result: 626, took: 131ns
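The exception-list idea can be sketched as follows. The mapping and the exception entry are made up for illustration; they are not real CaseFolding.txt data, and `foldMapping` and `cwcf_exceptions` are hypothetical names.

```zig
const std = @import("std");

/// Toy fold table standing in for the encoded CaseFolding.txt data.
fn foldMapping(cp: u21) ?u21 {
    return switch (cp) {
        'A' => 'a',
        'B' => 'b',
        else => null,
    };
}

// Illustrative only: code points that have a fold mapping but are treated
// as NOT having Changes_When_Casefolded set.
const cwcf_exceptions = [_]u21{'B'};

/// True when the code point has a fold mapping and is not listed as an exception.
fn changesWhenCaseFolded(cp: u21) bool {
    if (foldMapping(cp) == null) return false;
    return std.mem.indexOfScalar(u21, &cwcf_exceptions, cp) == null;
}
```

The property costs one extra lookup in a short list instead of a separate table, which is consistent with the small slowdown measured above.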
|
| | |
|
| | |
|
| | |
|
| |
|
|
| |
These utf8Encode calls are converting normalized codepoints back into UTF-8, so the codepoints can be assumed to be valid.
|
| |
|
|
|
|
|
|
|
|
|
| |
If the last codepoint in a byte slice is incomplete (i.e., its lead byte
declares a length of 3 but only 2 bytes remain), the iterator panics.
Instead of panicking, return a replacement character. This strategy
mirrors the block just above, which returns a replacement character if
the first byte is not valid. In the new block, we also consume only one
byte and allow the iterator to continue. This tolerates text that has a
single incorrect byte near the end of the slice.
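A minimal sketch of that bounds check, assuming a free function over a byte slice. `nextCodepoint` is a hypothetical name, and `std.unicode` is used here as a stand-in for the library's own decoder.

```zig
const std = @import("std");

/// Hypothetical decoder step: returns the next code point and advances `i`.
/// Invalid lead bytes and truncated sequences yield U+FFFD and consume one byte.
fn nextCodepoint(bytes: []const u8, i: *usize) ?u21 {
    if (i.* >= bytes.len) return null;
    const len = std.unicode.utf8ByteSequenceLength(bytes[i.*]) catch {
        i.* += 1;
        return 0xFFFD;
    };
    if (i.* + len > bytes.len) {
        // Truncated sequence at the end of the slice: do not read past it.
        i.* += 1;
        return 0xFFFD;
    }
    const cp = std.unicode.utf8Decode(bytes[i.*..][0..len]) catch {
        i.* += 1;
        return 0xFFFD;
    };
    i.* += len;
    return cp;
}
```

Consuming a single byte on failure lets iteration resume at the very next byte, so one bad byte costs at most a few replacement characters.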
|
| |\
| |
| |
| |
| |
| | |
found' (#3) from rockorager/zg:vs-16 into master
Reviewed-on: https://codeberg.org/dude_the_builder/zg/pulls/3
|
| | |
| |
| |
| |
| |
| |
| |
| | |
Explicitly set the width of an emoji to two when the next codepoint is a
VS16 selector. Add unit test for this case.
This is essentially the same PR as
https://codeberg.org/dude_the_builder/ziglyph/pulls/11
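A rough sketch of the VS16 rule, not the actual DisplayWidth code. `baseWidth` is a hypothetical toy table, and the sketch forces a width of two for any code point followed by VS16 rather than checking that it is an emoji.

```zig
const std = @import("std");

const vs16: u21 = 0xFE0F; // VARIATION SELECTOR-16, requests emoji presentation

/// Hypothetical per-codepoint width; the real table is far larger.
fn baseWidth(cp: u21) usize {
    return switch (cp) {
        vs16 => 0,
        0x1F600...0x1F64F => 2,
        else => 1,
    };
}

/// Sums widths, forcing a width of two for a code point followed by VS16.
fn strWidth(cps: []const u21) usize {
    var total: usize = 0;
    for (cps, 0..) |cp, idx| {
        const followed_by_vs16 = idx + 1 < cps.len and cps[idx + 1] == vs16;
        total += if (followed_by_vs16) 2 else baseWidth(cp);
    }
    return total;
}
```

In this sketch, U+2764 alone measures one column, while U+2764 followed by U+FE0F measures two.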
|
| |/
|
|
|
|
| |
The public function `graphemeBreak` requires a reference to a State
struct, but this type is not exported. Export the type so that users of
zg can construct a State and call graphemeBreak.
|
| |
|
|
| |
issues.
|