path: root/src/unicode_tests.zig
Date | Commit message | Author | Files | Lines
2025-11-08 | Use takeDelimiterInclusive to support Zig 0.15.2 | Jay | 1 | -1/+2
2025-09-14 | Embed data files in scripts rather than relying on filesystem access for easier packaging | Michael Chaten | 1 | -17/+6
2025-09-14 | Update codebase to Zig 0.15.1. | Michael Chaten | 1 | -41/+47
  Removes compression support
2025-06-01 | Add graphemeAtIndex + iterate before and after | Sam Atman | 1 | -10/+59
  That completes the set. I do think it's possible to bum a few more cycles from the implementation, but, I'm not going to. It passes the acceptance suite and that's what it needs to do.
2025-05-23 | Make offset size configurable | Sam Atman | 1 | -4/+6
  Hopefully I can talk users out of taking advantage of this configuration but I'll have better luck with that if it's available.
2025-05-23 | Add iterateBefore and iterateAfter | Sam Atman | 1 | -0/+38
  These create reverse or forward iterators before or after a Word. So this way, the user can get the word at an index, then iterate forward or back from that word. Also: Fixes #59 Which was fixed awhile back, but I don't feel like doing repo surgery to tag the fix where it happened. We have blame for that kind of thing.
2025-05-16 | Words module | Sam Atman | 1 | -3/+3
  In keeping with the new nomenclature, we're calling the module "Words", not "WordBreak". The latter is Unicode jargon, the module provides word iterators. Words are the figure, word breaks are the ground.
2025-05-15 | Merge Grapheme Segmentation Iterator Tests | Sam Atman | 1 | -79/+34
2025-05-15 | wordAtIndex passes conformance | Sam Atman | 1 | -13/+63
  I removed the initAtIndex functions from the public vocabulary, because the last couple of days of sweat and blood prove that it's hard to use correctly. That's probably it for WordBreak, now to fix the overlong bug on v0.14 and get this integrated with the new reverse grapheme iterator.
2025-05-15 | Peek tests for word iterators | Sam Atman | 1 | -0/+19
2025-05-15 | Hooked up break test, some bugs squashed | Sam Atman | 1 | -15/+34
  The handling of ignorables is really different, because they 'adhere' to the future of the iteration, not the past.
2025-05-15 | Rewrite, passes WordBreakTest | Sam Atman | 1 | -2/+1
  After fixing a bug in Runicode which was fenceposting codepoints off the end of ranges. As one does.
2025-05-15 | Begin conformance test | Sam Atman | 1 | -32/+70
  I'm not sure the details of this strategy can actually be made to work. But, something can.
2025-05-15 | Refactor in unicode_tests | Sam Atman | 1 | -28/+49
  The comments in WordBreak and SentenceBreak tests get really long, the provided buffer would be inadequate. So this just provides a sub-iterator which will strip comments and comment lines, while keeping an eye on line numbers for any debugging.
2025-05-15 | feat: add reverse grapheme iterator | Matteo Romano | 1 | -0/+74
  Closes #53
2025-04-30 | Update Unicode version in README.md | Sam Atman | 1 | -0/+1
  Lets me slip these in: Closes #12 Closes #14
2025-04-30 | Allocation Failure Tests | Sam Atman | 1 | -1/+1
  These turned up an excessive amount of allocations in CanonData and CompatData, which have been reduced to two through the somewhat squirrely use of 'magic numbers'. There are now allocation tests for every allocated structure in the library, and they run to completion in a reasonable amount of time. So, that's nice.
2025-04-30 | Merge NormData with Normalize | Sam Atman | 1 | -3/+2
2025-04-30 | grapheme now Graphemes, Data files gone | Sam Atman | 1 | -4/+4
2025-04-30 | Factor out 'Data' for grapheme and DisplayWidth | Sam Atman | 1 | -5/+5
  In the process of refactoring the whole library, so that it doesn't expose anything called "Data" separately from user functionality.
2025-04-29 | All the std.mem.Allocators that were stored just for init and deinit | lch361 | 1 | -6/+6
  methods were removed, mem.Allocators were added to deinit as arguments.
2024-11-02 | Add peek() to Grapheme.Iterator | Sam Atman | 1 | -0/+26
  This does the expected thing: returns the next ?Grapheme without mutation of the iteration state.
2024-11-02 | Replace deprecated uses of std.mem.split | Sam Atman | 1 | -8/+8
2024-04-02 | NormData init now takes pointer to uninitialized Self to avoid stack copy issues. | Jose Colon Rodriguez | 1 | -2/+3
2024-03-28 | Split out Unicode tests to separate file | Jose Colon Rodriguez | 1 | -0/+194