| author | 2025-05-23 19:01:57 -0400 | |
|---|---|---|
| committer | 2025-05-23 19:01:57 -0400 | |
| commit | f4a174e27052e38aec09840e9195981cc2f24c88 (patch) | |
| tree | 3214263c7df7a890265406f6cc80b178d52aa698 /README.md | |
| parent | Make offset size configurable (diff) | |
Document "fat_offset" in README
Diffstat (limited to 'README.md')
| -rw-r--r-- | README.md | 21 |
1 files changed, 21 insertions, 0 deletions
| @@ -519,3 +519,24 @@ test "Scripts" { | |||
| 519 | try expect(scripts.script('צ') == .Hebrew); | 519 | try expect(scripts.script('צ') == .Hebrew); |
| 520 | } | 520 | } |
| 521 | ``` | 521 | ``` |
| 522 | |||
| 523 | ## Limits | ||
| 524 | |||
| 525 | Iterators, and fragment types such as `CodePoint`, `Grapheme` and `Word`, use a | ||
| 526 | `u32` to store the offset into a string, and the length of the fragment | ||
| 527 | (`CodePoint` uses a `u3` for length, actually). | ||
| 528 | |||
| 529 | 4GiB is a lot of string. There are a few reasons to work with that much | ||
| 530 | string, log files primarily, but fewer to bring it all into memory at once, and | ||
| 531 | practically no reason at all to do anything to such a string without breaking | ||
| 532 | it into smaller pieces to work with. | ||
| 533 | |||
| 534 | Also, Zig compiles on 32-bit systems, where `usize` is 32 bits wide. Code running on | ||
| 535 | such systems has no choice but to handle slices in smaller pieces. In general, | ||
| 536 | if you want code to perform correctly when encountering multi-gigabyte | ||
| 537 | strings, you'll need to code for that, at a level one or two steps above that | ||
| 538 | in which you'll want to, for example, iterate some graphemes of that string. | ||
| 539 | |||
| 540 | That all said, `zg` modules can be passed the Boolean config option | ||
| 541 | `fat_offset`, which will make all of those data structures use a `u64` instead. | ||
| 542 | You don't actually want to do this. But you can. | ||
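
For readers who do want the wider offsets, here is a minimal sketch of how a Boolean option is typically handed to a dependency in a consumer's `build.zig`. It assumes `fat_offset` is exposed as an ordinary build option of the `zg` package, that the dependency is named `zg` in `build.zig.zon`, and that a module called `Graphemes` exists. The executable name and source path are hypothetical. Check all of these against the actual README before relying on them.

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    // Assumption: `fat_offset` is a bool build option declared by zg.
    // Options in this struct are forwarded to the dependency's build script.
    const zg = b.dependency("zg", .{
        .fat_offset = true,
    });

    // Hypothetical application; the name and path are placeholders.
    const exe = b.addExecutable(.{
        .name = "app",
        .root_source_file = b.path("src/main.zig"),
        .target = target,
        .optimize = optimize,
    });

    // Assumption: "Graphemes" is one of the modules zg exports.
    exe.root_module.addImport("Graphemes", zg.module("Graphemes"));

    b.installArtifact(exe);
}
```

Leaving the option out of the struct (`b.dependency("zg", .{})`) would keep the default `u32` offsets described above, which the text recommends.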