path: root/src
Date | Commit message | Author | Files | Lines
2025-05-13 | Various small iterator improvements [work-branch] | Sam Atman | 1 | -9/+46
2025-05-09 | Add reverse CodePoint iterator | Sam Atman | 1 | -6/+75
2025-05-04 | Make DisplayWidth.setup public [v0.14.0-rc2] | Sam Atman | 1 | -1/+7
Also adds setupWithGraphemes variant.
2025-05-01 | Remove inner setup from GeneralCategories | Sam Atman | 1 | -10/+1
It was one `try` block away from only returning Allocator.Error, so now there's no need to filter errors in an outer `catch`.
2025-04-30 | Update Unicode version in README.md | Sam Atman | 1 | -0/+1
Lets me slip these in: Closes #12 Closes #14
2025-04-30 | Unicode 16.0 | Sam Atman | 1 | -1/+7
Went smoothly, needed to add some scripts and adjust the magic numbers, but other than that, all set.
2025-04-30 | Allocation Failure Tests | Sam Atman | 11 | -91/+178
These turned up an excessive amount of allocations in CanonData and CompatData, which have been reduced to two through the somewhat squirrely use of 'magic numbers'. There are now allocation tests for every allocated structure in the library, and they run to completion in a reasonable amount of time. So, that's nice.
2025-04-30 | Setup variants for all allocating modules | Sam Atman | 7 | -146/+228
This harmonizes the allocating modules in a couple of ways. All can now be constructed by pointer, and all treat various miscellaneous read failures as `unreachable`, which indeed they should be. The README has been updated to inform users of this option.
2025-04-30 | Update README.md to new API | Sam Atman | 1 | -10/+10
2025-04-30 | Rest of the Renamings | Sam Atman | 5 | -0/+0
These get different names, but don't otherwise change.
2025-04-30 | Remove FoldData, make CaseFolding | Sam Atman | 4 | -167/+218
CaseFolding now has the FoldData, and can be initialized with a copy of Normalize if wanted.
2025-04-30 | Merge NormData with Normalize | Sam Atman | 10 | -278/+269
2025-04-30 | grapheme now Graphemes, Data files gone | Sam Atman | 4 | -193/+4
2025-04-30 | Factor out 'Data' for grapheme and DisplayWidth | Sam Atman | 6 | -119/+313
In the process of refactoring the whole library, so that it doesn't expose anything called "Data" separately from user functionality.
2025-04-29 | Add general tests step | Sam Atman | 7 | -44/+49
After a considerable slog, all tests are reachable from the test step, and pass. Almost every failure was related to the change away from the inclusion of an allocator on this or that.
2025-04-29 | Add result.toOwned() to Normalize.zig | Sam Atman | 1 | -0/+9
Closes #29. The README is also updated to reflect this change.
2025-04-29 | All the std.mem.Allocators that were stored just for init and deinit | lch361 | 15 | -102/+86
methods were removed, mem.Allocators were added to deinit as arguments.
2025-04-29 | Bump copyright year, isolate iterator tests | Sam Atman | 1 | -13/+18
2025-03-20 | Add c0 and c1 control width options | Sam Atman | 2 | -32/+36
This allows a build of DisplayWidth to give characters in those classes a width, for cases where they'll be printed with a substitute in the final display. It also raises the size of possible characters from an i3 to an i4, to accommodate printing C1s as e.g. <80> or \u{80}.
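The move from an i3 to an i4 follows from the range of a signed integer: n bits cover -2^(n-1) through 2^(n-1)-1. A minimal Python sketch of that arithmetic, with a hypothetical helper `signed_range` that is not part of zg:

```python
def signed_range(bits):
    """Return (min, max) for a two's-complement signed integer of `bits` bits."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1

# The two substitute renderings named in the commit message.
angle_form = "<80>"       # 4 columns wide
escape_form = "\\u{80}"   # 6 columns wide (backslash, u, {, 8, 0, })

i3_min, i3_max = signed_range(3)   # (-4, 3)
i4_min, i4_max = signed_range(4)   # (-8, 7)
```

An i3 tops out at 3, one short of the 4 columns `<80>` needs, while an i4 reaches 7 and so covers both forms.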
2024-12-04 | Fix leak of cwcf_exceptions in FoldData | Ryan Liptak | 1 | -0/+2
Closes #20
2024-11-02 | Add peek() to Grapheme.Iterator | Sam Atman | 2 | -0/+95
This does the expected thing: returns the next ?Grapheme without mutation of the iteration state.
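One common way to get a non-mutating peek is to save the cursor, call next(), and restore the cursor afterwards. The log does not say whether Grapheme.Iterator works this way, so the Python sketch below is only an illustration; `CursorIterator` and its fields are hypothetical names.

```python
class CursorIterator:
    """Iterate over a sequence by index, with a non-mutating peek()."""

    def __init__(self, items):
        self.items = items
        self.pos = 0

    def next(self):
        """Return the next item and advance, or None when exhausted."""
        if self.pos >= len(self.items):
            return None
        item = self.items[self.pos]
        self.pos += 1
        return item

    def peek(self):
        """Return what next() would return, leaving the state untouched."""
        saved = self.pos
        try:
            return self.next()
        finally:
            self.pos = saved  # restore the cursor no matter what next() did
```

Calling peek() any number of times returns the same item, and the following next() returns that item as well.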
2024-11-02 | Replace deprecated uses of std.mem.split | Sam Atman | 1 | -8/+8
2024-10-14 | WidthData: define error set as mem.Allocator.Error | Tim Culverhouse | 1 | -5/+5
The reader is a static embedded file. All of the reads are readInt. This function should not ever fail at runtime with a read error. Make all read errors unreachable, leaving only allocation errors as the error set.
2024-10-14 | GraphemeData: define error set as mem.Allocator.Error | Tim Culverhouse | 1 | -7/+7
The reader is a static embedded file. All of the reads are either a readInt or a readAll into a previously allocated buffer. This function should not ever fail at runtime with a read error. Make all read errors unreachable, leaving only allocation errors as the error set.
2024-07-05 | refactor CodePoint.Iterator into a reusable fn | Jonathan Raphaelson | 1 | -57/+79
Without changing the algorithm at all, move the responsibility of decoding a u8 slice out of the iterator and into a reusable function, so that it can be used by consumers of the library.
2024-06-27 | FoldData: Minimize Changes_When_Casefolded data | Ryan Liptak | 1 | -5/+16
Only a few codepoints have a mapping in CaseFolding.txt but do not have the Changes_When_Casefolded property set. So, FoldData can just store a list of those particular codepoints and then re-use the encoded CaseFolding.txt data alongside it in order to implement changesWhenCaseFolded. This reduces the size of fold.bin.z from 4,387 bytes (4.28KiB) to 1,165 bytes (1.13KiB).

This also seemingly introduced a very slight performance regression in zg_caseless.

Before:
zg CaseFold.compatCaselessMatch: result: 626, took: 258ns
zg CaseFold.canonCaselessMatch: result: 626, took: 129ns

After:
zg CaseFold.compatCaselessMatch: result: 626, took: 263ns
zg CaseFold.canonCaselessMatch: result: 626, took: 131ns
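The exception-list idea in this commit can be sketched in a few lines of Python. This is an illustration under stated assumptions, not zg's code: the function name, its parameters, and the toy tables are hypothetical, and the integers are placeholders, not real Unicode data.

```python
def changes_when_casefolded(cp, fold_map, exceptions):
    """True if `cp` has a case-folding mapping and is not a listed exception."""
    return cp in fold_map and cp not in exceptions

# Toy tables: placeholder integers, not real Unicode code points.
toy_fold_map = {1: 101, 2: 102, 3: 103}  # stands in for CaseFolding.txt data
toy_exceptions = {3}                     # has a mapping, but the property is not set
```

Only the short exception list needs storing in addition to the folding table that is already present, which is where the size reduction comes from.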
2024-06-26 | Removed all inlines | Jose Colon Rodriguez | 11 | -33/+35
2024-06-26 | Added changes when casefolded back | Jose Colon Rodriguez | 1 | -2/+6
2024-06-26 | Implemented squeek502's case fold | Jose Colon Rodriguez | 2 | -36/+53
2024-06-25 | Normalize: Mark utf8Encode errors as unreachable, use explicit error sets | Ryan Liptak | 1 | -11/+11
These utf8Encode calls are converting normalized codepoints back into UTF-8, so the codepoints can be assumed to be valid.
2024-06-10 | codepoint: prevent panic when last cp too short | Tim Culverhouse | 1 | -0/+11
If the last codepoint in a byte slice is incomplete (i.e., it has a length of 3 but there are only 2 bytes remaining), the iterator will panic. Instead of panicking, prefer to return a replacement character. This strategy is similar to that in the block just above, which returns a replacement character if the first byte is not valid. In this latter block, we also consume only one byte and allow the iterator to continue. This allows for sections of text which may have a single byte incorrect near the end of the slice.
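The strategy described here can be sketched in Python as follows. It is a simplified illustration, not zg's decoder: the names `sequence_length`, `decode_at`, and `code_points` are hypothetical, and the sketch does not validate continuation bytes or reject overlong encodings.

```python
REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER

def sequence_length(first_byte):
    """Return the expected UTF-8 sequence length, or None for an invalid lead byte."""
    if first_byte < 0x80:
        return 1
    if first_byte >> 5 == 0b110:
        return 2
    if first_byte >> 4 == 0b1110:
        return 3
    if first_byte >> 3 == 0b11110:
        return 4
    return None  # continuation byte or 0xF8..0xFF

def decode_at(data, i):
    """Decode one code point starting at data[i]; return (code_point, bytes_consumed)."""
    n = sequence_length(data[i])
    if n is None:
        return REPLACEMENT, 1   # invalid first byte: consume one byte
    if i + n > len(data):
        return REPLACEMENT, 1   # truncated sequence: consume one byte, keep going
    if n == 1:
        return data[i], 1
    cp = data[i] & (0xFF >> (n + 1))   # payload bits of the lead byte
    for b in data[i + 1:i + n]:
        cp = (cp << 6) | (b & 0x3F)    # six payload bits per continuation byte
    return cp, n

def code_points(data):
    """Decode a whole byte string into a list of code points."""
    out, i = [], 0
    while i < len(data):
        cp, n = decode_at(data, i)
        out.append(cp)
        i += n
    return out
```

Because only one byte is consumed on failure, a truncated three-byte sequence with two bytes present yields two replacement characters in this sketch: one for the lead byte and one for the stray continuation byte.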
2024-04-11 | DisplayWidth: explicitly set width to 2 when VS16 is found | Tim Culverhouse | 1 | -0/+4
Explicitly set the width of an emoji to two when the next codepoint is a VS16 selector. Add unit test for this case. This is essentially the same PR as https://codeberg.org/dude_the_builder/ziglyph/pulls/11
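A rough Python sketch of the VS16 rule, under loud assumptions: `toy_width` is a placeholder standing in for a real width table, and the sketch widens any code point followed by VS16, whereas the commit describes applying it to emoji. None of these names come from zg.

```python
VS16 = 0xFE0F  # VARIATION SELECTOR-16

def toy_width(cp):
    """Placeholder width table: VS16 is zero-width, everything else is 1."""
    return 0 if cp == VS16 else 1

def sequence_width(code_points, base_width=toy_width):
    """Sum per-code-point widths, forcing width 2 when VS16 follows."""
    total = 0
    for i, cp in enumerate(code_points):
        followed_by_vs16 = i + 1 < len(code_points) and code_points[i + 1] == VS16
        total += 2 if followed_by_vs16 else base_width(cp)
    return total
```

With these toy widths, a code point on its own counts as 1 column, and the same code point followed by VS16 counts as 2.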
2024-04-11 | grapheme: export grapheme.State struct | Tim Culverhouse | 1 | -1/+1
The public function `graphemeBreak` requires a reference to a State struct, however this type is not exported. Export the type to allow users of zg to use this type and call graphemeBreak.
2024-04-02 | NormData init now takes pointer to uninitialized Self to avoid stack copy issues. | Jose Colon Rodriguez | 3 | -14/+20
2024-03-31 | Updated README | Jose Colon Rodriguez | 14 | -87/+36
2024-03-28 | Split out Unicode tests to separate file | Jose Colon Rodriguez | 3 | -185/+195
2024-03-28 | Merged NumericData into PropsData | Jose Colon Rodriguez | 2 | -69/+44
2024-03-28 | PropsData and errdefers for init fns | Jose Colon Rodriguez | 13 | -22/+179
2024-03-27 | ScriptsData and made all Datas const | Jose Colon Rodriguez | 17 | -57/+283
2024-03-27 | Friendly general category methods | Jose Colon Rodriguez | 1 | -30/+116
2024-03-27 | Rename DisplayWidthData | Jose Colon Rodriguez | 1 | -7/+7
2024-03-26 | rm src/main.zig | Jose Colon Rodriguez | 1 | -93/+0
2024-03-26 | GraphemeData and Normalize non-pub fns | Jose Colon Rodriguez | 2 | -13/+13
2024-03-26 | Using diff for lowercase mapping | Jose Colon Rodriguez | 1 | -2/+3
2024-03-26 | Using diff for uppercase mapping | Jose Colon Rodriguez | 1 | -2/+3
2024-03-26 | Removed title case processing | Jose Colon Rodriguez | 1 | -35/+15
2024-03-25 | CaseData | Jose Colon Rodriguez | 1 | -0/+223
2024-03-24 | NumericData | Jose Colon Rodriguez | 2 | -12/+95
2024-03-23 | Rename CaseFold and Normalize | Jose Colon Rodriguez | 3 | -15/+15
2024-03-23 | Renamed Caser to Folder | Jose Colon Rodriguez | 1 | -0/+0