path: root/src/WordBreak.zig
Commit message | Author | Age | Files | Lines
* Add wordAtCursor | Sam Atman | 2025-05-15 | 1 | -48/+100
  This is not actually the way to do it, and can break on some crafted strings. The way to actually do it: implement a reverse word search iterator, then do next() to find a word break, prev() to find a _valid_ word start, then next() again to find the valid end of said word. Maybe 2+, 2-, 1+ actually. I can probably write a test to see if the cursor spot is ambiguous, and apply an extra round if so. Need to mull the rules over before making any rash moves.
* Rewrite, passes WordBreakTest | Sam Atman | 2025-05-15 | 1 | -74/+37
  After fixing a bug in Runicode which was fenceposting codepoints off the end of ranges. As one does.
* Begin conformance test | Sam Atman | 2025-05-15 | 1 | -26/+57
  I'm not sure the details of this strategy can actually be made to work. But, something can.
* Implement Word iterator | Sam Atman | 2025-05-15 | 1 | -0/+228
  A by-the-book implementation of the word break rules from tr29. This is superficially inefficient, but compilers are more than able to handle the common subexpression folding ignored by this approach. Now to port the WordBreakPropertyTests, and clean up the inevitable bugs in the implementation.
* Add WordBreakPropertyData | Sam Atman | 2025-05-15 | 1 | -0/+102
  Passes some simple lookup tests.