path: root/src/unicode_tests.zig
Commit message [Author, Age, Files, Lines]
* Use takeDelimiterInclusive to support Zig 0.15.2 [Jay, 2025-11-08, 1 file, -1/+2]
|
* Embed data files in scripts rather than relying on filesystem access for easier packaging [Michael Chaten, 2025-09-14, 1 file, -17/+6]
* Update codebase to Zig 0.15.1. [Michael Chaten, 2025-09-14, 1 file, -41/+47]
|   Removes compression support
* Add graphemeAtIndex + iterate before and after [Sam Atman, 2025-06-01, 1 file, -10/+59]
|   That completes the set. I do think it's possible to bum a few more cycles from the implementation, but, I'm not going to. It passes the acceptance suite and that's what it needs to do.
* Make offset size configurable [Sam Atman, 2025-05-23, 1 file, -4/+6]
|   Hopefully I can talk users out of taking advantage of this configuration but I'll have better luck with that if it's available.
* Add iterateBefore and iterateAfter [Sam Atman, 2025-05-23, 1 file, -0/+38]
|   These create reverse or forward iterators before or after a Word. So this way, the user can get the word at an index, then iterate forward or back from that word.
|
|   Also: Fixes #59
|
|   Which was fixed awhile back, but I don't feel like doing repo surgery to tag the fix where it happened. We have blame for that kind of thing.
* Words module [Sam Atman, 2025-05-16, 1 file, -3/+3]
|   In keeping with the new nomenclature, we're calling the module "Words", not "WordBreak". The latter is Unicode jargon, the module provides word iterators. Words are the figure, word breaks are the ground.
* Merge Grapheme Segmentation Iterator Tests [Sam Atman, 2025-05-15, 1 file, -79/+34]
|
* Merge commit 'b5d955f' into develop-next [Sam Atman, 2025-05-15, 1 file, -0/+74]
|\
| * feat: add reverse grapheme iterator [Matteo Romano, 2025-05-15, 1 file, -0/+74]
| |   Closes #53
* | wordAtIndex passes conformance [Sam Atman, 2025-05-15, 1 file, -13/+63]
| |   I removed the initAtIndex functions from the public vocabulary, because the last couple of days of sweat and blood prove that it's hard to use correctly.
| |
| |   That's probably it for WordBreak, now to fix the overlong bug on v0.14 and get this integrated with the new reverse grapheme iterator.
* | Peek tests for word iterators [Sam Atman, 2025-05-15, 1 file, -0/+19]
| |
* | Hooked up break test, some bugs squashed [Sam Atman, 2025-05-15, 1 file, -15/+34]
| |   The handling of ignorables is really different, because they 'adhere' to the future of the iteration, not the past.
* | Rewrite, passes WordBreakTest [Sam Atman, 2025-05-15, 1 file, -2/+1]
| |   After fixing a bug in Runicode which was fenceposting codepoints off the end of ranges. As one does.
* | Begin conformance test [Sam Atman, 2025-05-15, 1 file, -32/+70]
| |   I'm not sure the details of this strategy can actually be made to work. But, something can.
* | Refactor in unicode_tests [Sam Atman, 2025-05-15, 1 file, -28/+49]
|/    The comments in WordBreak and SentenceBreak tests get really long, the provided buffer would be inadequate. So this just provides a sub-iterator which will strip comments and comment lines, while keeping an eye on line numbers for any debugging.
* Update Unicode version in README.md [Sam Atman, 2025-04-30, 1 file, -0/+1]
|   Lets me slip these in:
|
|   Closes #12
|   Closes #14
* Allocation Failure Tests [Sam Atman, 2025-04-30, 1 file, -1/+1]
|   These turned up an excessive amount of allocations in CanonData and CompatData, which have been reduced to two through the somewhat squirrely use of 'magic numbers'.
|
|   There are now allocation tests for every allocated structure in the library, and they run to completion in a reasonable amount of time. So, that's nice.
* Merge NormData with Normalize [Sam Atman, 2025-04-30, 1 file, -3/+2]
|
* grapheme now Graphemes, Data files gone [Sam Atman, 2025-04-30, 1 file, -4/+4]
|
* Factor out 'Data' for grapheme and DisplayWidth [Sam Atman, 2025-04-30, 1 file, -5/+5]
|   In the process of refactoring the whole library, so that it doesn't expose anything called "Data" separately from user functionality.
* All the std.mem.Allocators that were stored just for init and deinit [lch361, 2025-04-29, 1 file, -6/+6]
|   methods were removed, mem.Allocators were added to deinit as arguments.
* Add peek() to Grapheme.Iterator [Sam Atman, 2024-11-02, 1 file, -0/+26]
|   This does the expected thing: returns the next ?Grapheme without mutation of the iteration state.
* Replace deprecated uses of std.mem.split [Sam Atman, 2024-11-02, 1 file, -8/+8]
|
* NormData init now takes pointer to uninitialized Self to avoid stack copy issues. [Jose Colon Rodriguez, 2024-04-02, 1 file, -2/+3]
* Split out Unicode tests to separate file [Jose Colon Rodriguez, 2024-03-28, 1 file, -0/+194]