path: root/src
Commit message (Author, Age, Files, Lines)
* Various small iterator improvements [work-branch] (Sam Atman, 2025-05-13, 1 file, -9/+46)
|
* Add reverse CodePoint iterator (Sam Atman, 2025-05-09, 1 file, -6/+75)
|
* Make DisplayWidth.setup public [v0.14.0-rc2] (Sam Atman, 2025-05-04, 1 file, -1/+7)
|     Also adds setupWithGraphemes variant.
* Remove inner setup from GeneralCategories (Sam Atman, 2025-05-01, 1 file, -10/+1)
|     It was one `try` block away from only returning Allocator.Error, so now there's no need to filter errors in an outer `catch`.
* Update Unicode version in README.md (Sam Atman, 2025-04-30, 1 file, -0/+1)
|     Lets me slip these in:
|     Closes #12
|     Closes #14
* Unicode 16.0 (Sam Atman, 2025-04-30, 1 file, -1/+7)
|     Went smoothly, needed to add some scripts and adjust the magic numbers, but other than that, all set.
* Allocation Failure Tests (Sam Atman, 2025-04-30, 11 files, -91/+178)
|     These turned up an excessive amount of allocations in CanonData and CompatData, which have been reduced to two through the somewhat squirrely use of 'magic numbers'. There are now allocation tests for every allocated structure in the library, and they run to completion in a reasonable amount of time. So, that's nice.
* Setup variants for all allocating modules (Sam Atman, 2025-04-30, 7 files, -146/+228)
|     This harmonizes the allocating modules in a couple of ways. All can now be constructed by pointer, and all treat various miscellaneous read failures as `unreachable`, which indeed they should be. The README has been updated to inform users of this option.
* Update README.md to new API (Sam Atman, 2025-04-30, 1 file, -10/+10)
|
* Rest of the Renamings (Sam Atman, 2025-04-30, 5 files, -0/+0)
|     These get different names, but don't otherwise change.
* Remove FoldData, make CaseFolding (Sam Atman, 2025-04-30, 4 files, -167/+218)
|     CaseFolding now has the FoldData, and can be initialized with a copy of Normalize if wanted.
* Merge NormData with Normalize (Sam Atman, 2025-04-30, 10 files, -278/+269)
|
* grapheme now Graphemes, Data files gone (Sam Atman, 2025-04-30, 4 files, -193/+4)
|
* Factor out 'Data' for grapheme and DisplayWidth (Sam Atman, 2025-04-30, 6 files, -119/+313)
|     In the process of refactoring the whole library, so that it doesn't expose anything called "Data" separately from user functionality.
* Add general tests step (Sam Atman, 2025-04-29, 7 files, -44/+49)
|     After a considerable slog, all tests are reachable from the test step, and pass. Almost every failure was related to the change away from the inclusion of an allocator on this or that.
* Add result.toOwned() to Normalize.zig (Sam Atman, 2025-04-29, 1 file, -0/+9)
|     Closes #29
|
|     The README is also updated to reflect this change.
* All the std.mem.Allocators that were stored just for init and deinit (lch361, 2025-04-29, 15 files, -102/+86)
|     methods were removed, mem.Allocators were added to deinit as arguments.
* Bump copyright year, isolate iterator tests (Sam Atman, 2025-04-29, 1 file, -13/+18)
|
* Add c0 and c1 control width options (Sam Atman, 2025-03-20, 2 files, -32/+36)
|     This allows a build of DisplayWidth to give characters in those classes a width, for cases where they'll be printed with a substitute in the final display. It also raises the size of possible characters from an i3 to an i4, to accommodate printing C1s as e.g. <80> or \u{80}.
* Fix leak of cwcf_exceptions in FoldData (Ryan Liptak, 2024-12-04, 1 file, -0/+2)
|     Closes #20
* Add peek() to Grapheme.Iterator (Sam Atman, 2024-11-02, 2 files, -0/+95)
|     This does the expected thing: returns the next ?Grapheme without mutation of the iteration state.
* Replace deprecated uses of std.mem.split (Sam Atman, 2024-11-02, 1 file, -8/+8)
|
* WidthData: define error set as mem.Allocator.Error (Tim Culverhouse, 2024-10-14, 1 file, -5/+5)
|     The reader is a static embedded file. All of the reads are readInt. This function should not ever fail at runtime with a read error. Make all read errors unreachable, leaving only allocation errors as the error set.
* GraphemeData: define error set as mem.Allocator.Error (Tim Culverhouse, 2024-10-14, 1 file, -7/+7)
|     The reader is a static embedded file. All of the reads are either a readInt or a readAll into a previously allocated buffer. This function should not ever fail at runtime with a read error. Make all read errors unreachable, leaving only allocation errors as the error set.
* refactor CodePoint.Iterator into a reusable fn (Jonathan Raphaelson, 2024-07-05, 1 file, -57/+79)
|     without changing the algorithm at all, move the responsibility of decoding a u8 slice out of the iterator, and into a reusable function so that it can be used by consumers of the library
* FoldData: Minimize Changes_When_Casefolded data (Ryan Liptak, 2024-06-27, 1 file, -5/+16)
|     Only a few codepoints have a mapping in CaseFolding.txt but do not have the Changes_When_Casefolded property set. So, FoldData can just store a list of those particular codepoints and then re-use the encoded CaseFolding.txt data alongside it in order to implement changesWhenCaseFolded.
|
|     This reduces the size of fold.bin.z from 4,387 bytes (4.28KiB) to 1,165 bytes (1.13KiB).
|
|     This also seemingly introduced a very slight performance regression in zg_caseless.
|
|     Before:
|       zg CaseFold.compatCaselessMatch: result: 626, took: 258ns
|       zg CaseFold.canonCaselessMatch: result: 626, took: 129ns
|
|     After:
|       zg CaseFold.compatCaselessMatch: result: 626, took: 263ns
|       zg CaseFold.canonCaselessMatch: result: 626, took: 131ns
* Removed all inlines (Jose Colon Rodriguez, 2024-06-26, 11 files, -33/+35)
|
* Added changes when casefolded back (Jose Colon Rodriguez, 2024-06-26, 1 file, -2/+6)
|
* Implemented squeek502's case fold (Jose Colon Rodriguez, 2024-06-26, 2 files, -36/+53)
|
* Normalize: Mark utf8Encode errors as unreachable, use explicit error sets (Ryan Liptak, 2024-06-25, 1 file, -11/+11)
|     These utf8Encode calls are converting normalized codepoints back into UTF-8, so the codepoints can be assumed to be valid.
* codepoint: prevent panic when last cp too short (Tim Culverhouse, 2024-06-10, 1 file, -0/+11)
|     If the last codepoint in a byte slice is incomplete (i.e., has a length of 3 but there are only 2 bytes remaining), the iterator will panic. Instead of panicking, prefer to return a replacement character. This strategy is similar to that in the block just above which returns a replacement character if the first byte is not valid. In this latter block, we also consume only one byte and allow the iterator to continue. This allows for sections of text which may have a single byte incorrect near the end of the slice.
* Merge pull request 'DisplayWidth: explicitly set width to 2 when VS16 is found' (#3) from rockorager/zg:vs-16 into master (Jose Colon, 2024-04-11, 1 file, -0/+4)
|\
| |     Reviewed-on: https://codeberg.org/dude_the_builder/zg/pulls/3
| * DisplayWidth: explicitly set width to 2 when VS16 is found (Tim Culverhouse, 2024-04-11, 1 file, -0/+4)
| |     Explicitly set the width of an emoji to two when the next codepoint is a VS16 selector. Add unit test for this case. This is essentially the same PR as https://codeberg.org/dude_the_builder/ziglyph/pulls/11
* | grapheme: export grapheme.State struct (Tim Culverhouse, 2024-04-11, 1 file, -1/+1)
|/
|     The public function `graphemeBreak` requires a reference to a State struct, however this type is not exported. Export the type to allow users of zg to use this type and call graphemeBreak.
* NormData init now takes pointer to uninitialized Self to avoid stack copy issues. (Jose Colon Rodriguez, 2024-04-02, 3 files, -14/+20)
* Updated README (Jose Colon Rodriguez, 2024-03-31, 14 files, -87/+36)
|
* Split out Unicode tests to separate file (Jose Colon Rodriguez, 2024-03-28, 3 files, -185/+195)
|
* Merged NumericData into PropsData (Jose Colon Rodriguez, 2024-03-28, 2 files, -69/+44)
|
* PropsData and errdefers for init fns (Jose Colon Rodriguez, 2024-03-28, 13 files, -22/+179)
|
* ScriptsData and made all Datas const (Jose Colon Rodriguez, 2024-03-27, 17 files, -57/+283)
|
* Friendly general category methods (Jose Colon Rodriguez, 2024-03-27, 1 file, -30/+116)
|
* Rename DisplayWidthData (Jose Colon Rodriguez, 2024-03-27, 1 file, -7/+7)
|
* rm src/main.zig (Jose Colon Rodriguez, 2024-03-26, 1 file, -93/+0)
|
* GraphemeData and Normalize non-pub fns (Jose Colon Rodriguez, 2024-03-26, 2 files, -13/+13)
|
* Using diff for lowercase mapping (Jose Colon Rodriguez, 2024-03-26, 1 file, -2/+3)
|
* Using diff for uppercase mapping (Jose Colon Rodriguez, 2024-03-26, 1 file, -2/+3)
|
* Removed title case processing (Jose Colon Rodriguez, 2024-03-26, 1 file, -35/+15)
|
* CaseData (Jose Colon Rodriguez, 2024-03-25, 1 file, -0/+223)
|
* NumericData (Jose Colon Rodriguez, 2024-03-24, 2 files, -12/+95)
|
* Rename CaseFold and Normalize (Jose Colon Rodriguez, 2024-03-23, 3 files, -15/+15)
|