path: root/src
Commit message  [Author, Age, Files, Lines]
...
* Factor out 'Data' for grapheme and DisplayWidth  [Sam Atman, 2025-04-30, 6 files, -119/+313]
|     In the process of refactoring the whole library, so that it doesn't expose anything called "Data" separately from user functionality.
* Add general tests step  [Sam Atman, 2025-04-29, 7 files, -44/+49]
|     After a considerable slog, all tests are reachable from the test step, and pass. Almost every failure was related to the change away from the inclusion of an allocator on this or that.
* Add result.toOwned() to Normalize.zig  [Sam Atman, 2025-04-29, 1 file, -0/+9]
|     Closes #29
|     The README is also updated to reflect this change.
* All the std.mem.Allocators that were stored just for init and deinit  [lch361, 2025-04-29, 15 files, -102/+86]
|     methods were removed, mem.Allocators were added to deinit as arguments.
* Bump copyright year, isolate iterator tests  [Sam Atman, 2025-04-29, 1 file, -13/+18]
|
* Add c0 and c1 control width options  [Sam Atman, 2025-03-20, 2 files, -32/+36]
|     This allows a build of DisplayWidth to give characters in those classes a width, for cases where they'll be printed with a substitute in the final display. It also raises the size of possible characters from an i3 to an i4, to accommodate printing C1s as e.g. <80> or \u{80}.
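The move from i3 to i4 in the message above can be checked with a little arithmetic. The sketch below is in Python rather than Zig and is not taken from zg; `fits_signed` is a hypothetical helper that only tests whether a width fits in a signed integer of a given bit count.

```python
def fits_signed(value: int, bits: int) -> bool:
    """True if `value` fits in a two's-complement signed integer of `bits` bits."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return lowest <= value <= highest


# Column counts of the two substitute renderings named in the commit message.
angle_form = len("<80>")      # 4 columns
escape_form = len("\\u{80}")  # 6 columns

# An i3 holds -4..3, so neither rendering fits; an i4 holds -8..7, so both do.
i3_ok = fits_signed(angle_form, 3) or fits_signed(escape_form, 3)
i4_ok = fits_signed(angle_form, 4) and fits_signed(escape_form, 4)
```

Here `i3_ok` is false and `i4_ok` is true, which matches the reason the commit gives for widening the type.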
* Fix leak of cwcf_exceptions in FoldData  [Ryan Liptak, 2024-12-04, 1 file, -0/+2]
|     Closes #20
* Add peek() to Grapheme.Iterator  [Sam Atman, 2024-11-02, 2 files, -0/+95]
|     This does the expected thing: returns the next ?Grapheme without mutation of the iteration state.
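The contract described above (a peek returns what the next call to next would return, and leaves the iterator where it was) might look like the following. This is a Python sketch, not zg's Zig code: `ClusterIterator` is a hypothetical stand-in that walks a list of pre-split clusters instead of segmenting text, and saving and restoring the position is just one way to get the behaviour.

```python
from typing import Optional, Sequence


class ClusterIterator:
    """Toy iterator over pre-split clusters (hypothetical, not Grapheme.Iterator)."""

    def __init__(self, clusters: Sequence[str]) -> None:
        self._clusters = clusters
        self._pos = 0

    def next(self) -> Optional[str]:
        """Return the next cluster and advance, or None at the end."""
        if self._pos >= len(self._clusters):
            return None
        item = self._clusters[self._pos]
        self._pos += 1
        return item

    def peek(self) -> Optional[str]:
        """Return what next() would return, leaving the position unchanged."""
        saved = self._pos
        item = self.next()
        self._pos = saved  # restore the iteration state
        return item
```

Calling `peek()` any number of times in a row yields the same value, and the following `next()` returns that value too.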
* Replace deprecated uses of std.mem.split  [Sam Atman, 2024-11-02, 1 file, -8/+8]
|
* WidthData: define error set as mem.Allocator.Error  [Tim Culverhouse, 2024-10-14, 1 file, -5/+5]
|     The reader is a static embedded file. All of the reads are readInt. This function should not ever fail at runtime with a read error. Make all read errors unreachable, leaving only allocation errors as the error set.
* GraphemeData: define error set as mem.Allocator.Error  [Tim Culverhouse, 2024-10-14, 1 file, -7/+7]
|     The reader is a static embedded file. All of the reads are either a readInt or a readAll into a previously allocated buffer. This function should not ever fail at runtime with a read error. Make all read errors unreachable, leaving only allocation errors as the error set.
* refactor CodePoint.Iterator into a reusable fn  [Jonathan Raphaelson, 2024-07-05, 1 file, -57/+79]
|     without changing the algorithm at all, move the responsibility of decoding a u8 slice out of the iterator, and into a reusable function so that it can be used by consumers of the library
* FoldData: Minimize Changes_When_Casefolded data  [Ryan Liptak, 2024-06-27, 1 file, -5/+16]
|     Only a few codepoints have a mapping in CaseFolding.txt but do not have the Changes_When_Casefolded property set. So, FoldData can just store a list of those particular codepoints and then re-use the encoded CaseFolding.txt data alongside it in order to implement changesWhenCaseFolded.
|
|     This reduces the size of fold.bin.z from 4,387 bytes (4.28KiB) to 1,165 bytes (1.13KiB).
|
|     This also seemingly introduced a very slight performance regression in zg_caseless.
|
|     Before:
|       zg CaseFold.compatCaselessMatch: result: 626, took: 258ns
|       zg CaseFold.canonCaselessMatch: result: 626, took: 129ns
|
|     After:
|       zg CaseFold.compatCaselessMatch: result: 626, took: 263ns
|       zg CaseFold.canonCaselessMatch: result: 626, took: 131ns
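The space saving described above comes from deriving the property from data that is already stored: a code point counts as changing when case folded if it has a fold mapping and is not on the short exception list. A minimal Python sketch of that lookup follows. The table contents are toy values, not the real Unicode data: the entry for U+0041 is the ordinary A to a folding, while the private-use pair U+E000/U+E001 is a synthetic stand-in for "has a mapping but the property is not set". All names are hypothetical and do not mirror FoldData's fields.

```python
from typing import Dict, FrozenSet, Tuple

# Toy fold table: code point -> folded sequence.
FOLD_MAP: Dict[int, Tuple[int, ...]] = {
    0x0041: (0x0061,),  # LATIN CAPITAL LETTER A folds to a
    0xE000: (0xE001,),  # synthetic entry, for illustration only
}

# Code points that have a mapping but do NOT have Changes_When_Casefolded set.
CWCF_EXCEPTIONS: FrozenSet[int] = frozenset({0xE000})


def case_fold(cp: int) -> Tuple[int, ...]:
    """Return the folded sequence, or the code point itself if unmapped."""
    return FOLD_MAP.get(cp, (cp,))


def changes_when_case_folded(cp: int) -> bool:
    """Derive the property from the fold table plus the exception list."""
    return cp in FOLD_MAP and cp not in CWCF_EXCEPTIONS
```

Only the exception list has to be stored in addition to the fold table, which is consistent with the size reduction the commit reports. The extra membership check on each query is one plausible source of the small slowdown, though the message does not say so.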
* Removed all inlines  [Jose Colon Rodriguez, 2024-06-26, 11 files, -33/+35]
|
* Added changes when casefolded back  [Jose Colon Rodriguez, 2024-06-26, 1 file, -2/+6]
|
* Implemented sqeek502s case fold  [Jose Colon Rodriguez, 2024-06-26, 2 files, -36/+53]
|
* Normalize: Mark utf8Encode errors as unreachable, use explicit error sets  [Ryan Liptak, 2024-06-25, 1 file, -11/+11]
|     These utf8Encode calls are converting normalized codepoints back into UTF-8, so the codepoints can be assumed to be valid.
* codepoint: prevent panic when last cp too short  [Tim Culverhouse, 2024-06-10, 1 file, -0/+11]
|     If the last codepoint in a byte slice is incomplete (IE has a length of 3 but there are only 2 bytes remaining), the iterator will panic. Instead of panicking, prefer to return a replacement character. This strategy is similar to that in the block just above which returns a replacement character if the first byte is not valid. In this latter block, we also consume only one byte and allow the iterator to continue. This allows for sections of text which may have a single byte incorrect near the end of the slice.
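The recovery strategy described above can be sketched as a small lossy decoder. This is Python, not the Zig code in zg, and the names `sequence_length` and `decode_lossy` are hypothetical. It assumes only what the message states: an invalid lead byte or a sequence that runs past the end of the slice yields U+FFFD, one byte is consumed, and iteration continues. Continuation bytes are deliberately not validated, to keep the sketch short.

```python
from typing import Iterator

REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER


def sequence_length(first: int) -> int:
    """Length implied by a UTF-8 lead byte, or 0 if it cannot start a sequence."""
    if first < 0x80:
        return 1
    if first & 0xE0 == 0xC0:
        return 2
    if first & 0xF0 == 0xE0:
        return 3
    if first & 0xF8 == 0xF0:
        return 4
    return 0


def decode_lossy(data: bytes) -> Iterator[int]:
    """Yield code points, substituting U+FFFD instead of failing."""
    i = 0
    while i < len(data):
        n = sequence_length(data[i])
        if n == 0 or i + n > len(data):
            # Bad lead byte, or the slice ends mid-sequence:
            # emit a replacement, consume a single byte, keep going.
            yield REPLACEMENT
            i += 1
            continue
        # Keep the payload bits of the lead byte, then append 6 bits per byte.
        cp = data[i] if n == 1 else data[i] & (0xFF >> (n + 1))
        for byte in data[i + 1 : i + n]:
            cp = (cp << 6) | (byte & 0x3F)
        yield cp
        i += n
```

With the input `b"a\xe2\x82"` (a three-byte sequence cut off after two bytes) this sketch yields `a` followed by two replacement characters: one for the truncated lead byte and one for the stray continuation byte that is then read as a lead byte.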
*   Merge pull request 'DisplayWidth: explicitly set width to 2 when VS16 is found' (#3) from rockorager/zg:vs-16 into master  [Jose Colon, 2024-04-11, 1 file, -0/+4]
|\      Reviewed-on: https://codeberg.org/dude_the_builder/zg/pulls/3
| * DisplayWidth: explicitly set width to 2 when VS16 is found  [Tim Culverhouse, 2024-04-11, 1 file, -0/+4]
| |     Explicitly set the width of an emoji to two when the next codepoint is a VS16 selector. Add unit test for this case.
| |     This is essentially the same PR as https://codeberg.org/dude_the_builder/ziglyph/pulls/11
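The rule in the message above (a code point followed by VS16, U+FE0F, is given width 2) can be sketched as below. This is Python, not zg's DisplayWidth code. `cluster_width` and `toy_width` are hypothetical names, and `toy_width` is a placeholder that returns 1 for everything rather than consulting real width data.

```python
from typing import Callable, Sequence

VS16 = 0xFE0F  # VARIATION SELECTOR-16, requests emoji presentation


def toy_width(cp: int) -> int:
    """Placeholder per-code-point width; real data would come from Unicode tables."""
    return 1


def cluster_width(cps: Sequence[int], base_width: Callable[[int], int]) -> int:
    """Sum widths, forcing width 2 for a code point directly followed by VS16."""
    total = 0
    for i, cp in enumerate(cps):
        if cp == VS16:
            continue  # the selector itself occupies no columns
        width = base_width(cp)
        if i + 1 < len(cps) and cps[i + 1] == VS16:
            width = 2  # emoji presentation is drawn two columns wide
        total += width
    return total
```

Under these assumptions, `cluster_width([0x2764], toy_width)` is 1 and `cluster_width([0x2764, VS16], toy_width)` is 2.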
* | grapheme: export grapheme.State struct  [Tim Culverhouse, 2024-04-11, 1 file, -1/+1]
|/      The public function `graphemeBreak` requires a reference to a State struct, however this type is not exported. Export the type to allow users of zg to use this type and call graphemeBreak.
* NormData init now takes pointer to uninitialized Self to avoid stack copy issues.  [Jose Colon Rodriguez, 2024-04-02, 3 files, -14/+20]
|
* Updated README  [Jose Colon Rodriguez, 2024-03-31, 14 files, -87/+36]
|
* Split out Unicode tests to separate file  [Jose Colon Rodriguez, 2024-03-28, 3 files, -185/+195]
|
* Merged NumericData into PropsData  [Jose Colon Rodriguez, 2024-03-28, 2 files, -69/+44]
|
* PropsData and errdefers for init fns  [Jose Colon Rodriguez, 2024-03-28, 13 files, -22/+179]
|
* ScriptsData and made all Datas const  [Jose Colon Rodriguez, 2024-03-27, 17 files, -57/+283]
|
* Friendly general category methods  [Jose Colon Rodriguez, 2024-03-27, 1 file, -30/+116]
|
* Rename DisplayWidthData  [Jose Colon Rodriguez, 2024-03-27, 1 file, -7/+7]
|
* rm src/main.zig  [Jose Colon Rodriguez, 2024-03-26, 1 file, -93/+0]
|
* GraphemeData and Normalize non-pub fns  [Jose Colon Rodriguez, 2024-03-26, 2 files, -13/+13]
|
* Using diff for lowercase mapping  [Jose Colon Rodriguez, 2024-03-26, 1 file, -2/+3]
|
* Using diff for uppercase mapping  [Jose Colon Rodriguez, 2024-03-26, 1 file, -2/+3]
|
* Removed title case processing  [Jose Colon Rodriguez, 2024-03-26, 1 file, -35/+15]
|
* CaseData  [Jose Colon Rodriguez, 2024-03-25, 1 file, -0/+223]
|
* NumericData  [Jose Colon Rodriguez, 2024-03-24, 2 files, -12/+95]
|
* Rename CaseFold and Normalize  [Jose Colon Rodriguez, 2024-03-23, 3 files, -15/+15]
|
* Renamed Caser to Folder  [Jose Colon Rodriguez, 2024-03-23, 1 file, -0/+0]
|
* Small format change in main  [Jose Colon Rodriguez, 2024-03-16, 1 file, -1/+5]
|
* Normalizer back to 300k in Safe  [Jose Colon Rodriguez, 2024-03-01, 1 file, -16/+16]
|
* Added canonical caseless match to Caser  [Jose Colon Rodriguez, 2024-03-01, 3 files, -7/+105]
|
* Moved case fold stuff to src/Caser.zig  [Jose Colon Rodriguez, 2024-03-01, 4 files, -106/+125]
|
* Changes when case folded check; 20ms faster  [Jose Colon Rodriguez, 2024-03-01, 2 files, -6/+38]
|
* Normalizer.eqlIgnoreCase compatibility caseless matching  [Jose Colon Rodriguez, 2024-03-01, 4 files, -9/+163]
|
* Removed dupe tombstone check in Normalizer  [Jose Colon Rodriguez, 2024-02-29, 1 file, -14/+0]
|
* Major Normalizer optimizations  [Jose Colon Rodriguez, 2024-02-29, 1 file, -60/+75]
|
* Added nfc latin1 check back  [Jose Colon Rodriguez, 2024-02-28, 2 files, -71/+122]
|
* Using slices for decompositions in Normalizer  [Jose Colon Rodriguez, 2024-02-28, 4 files, -122/+118]
|
* General Category with GenCatData  [Jose Colon Rodriguez, 2024-02-27, 3 files, -16/+108]
|
* Normalizer 2x faster than Ziglyph; Uses 2x memory  [Jose Colon Rodriguez, 2024-02-27, 1 file, -1/+1]
|