-rw-r--r-- README.md 21
-rw-r--r-- build.zig 4
2 files changed, 23 insertions, 2 deletions
diff --git a/README.md b/README.md
index 1d3899c..1da50f3 100644
--- a/README.md
+++ b/README.md
@@ -519,3 +519,24 @@ test "Scripts" {
     try expect(scripts.script('צ') == .Hebrew);
 }
 ```
+
+## Limits
+
+Iterators, and fragment types such as `CodePoint`, `Grapheme` and `Word`, use a
+`u32` to store the offset into a string, and the length of the fragment
+(`CodePoint` uses a `u3` for length, actually).
+
+4GiB is a lot of string. There are a few reasons to work with that much
+string, log files primarily, but fewer to bring it all into memory at once, and
+practically no reason at all to do anything to such a string without breaking
+it into smaller pieces to work with.
+
+Also, Zig compiles on 32-bit systems, where `usize` is 32 bits. Code running on
+such systems has no choice but to handle slices in smaller pieces. In general,
+if you want code to perform correctly when encountering multi-gigabyte
+strings, you'll need to code for that, at a level one or two steps above that
+in which you'll want to, for example, iterate some graphemes of that string.
+
+That all said, `zg` modules can be passed the Boolean config option
+`fat_offset`, which will make all of those data structures use a `u64` instead.
+You don't actually want to do this. But you can.
diff --git a/build.zig b/build.zig
index 648571b..ca0eeef 100644
--- a/build.zig
+++ b/build.zig
@@ -14,7 +14,7 @@ pub fn build(b: *std.Build) void {
     //| Options
 
     // Display width
-    const cjk = b.option(bool, "cjk", "Ambiguous code points are wide (display width: 2).") orelse false;
+    const cjk = b.option(bool, "cjk", "Ambiguous code points are wide (display width: 2)") orelse false;
     const dwp_options = b.addOptions();
     dwp_options.addOption(bool, "cjk", cjk);
 
@@ -33,7 +33,7 @@ pub fn build(b: *std.Build) void {
     dwp_options.addOption(?i4, "c1_width", c1_width);
 
     //| Offset size
-    const fat_offset = b.option(bool, "fat_offset", "Offsets in Iterators and data structures will be u64") orelse false;
+    const fat_offset = b.option(bool, "fat_offset", "Offsets in iterators and data structures will be u64") orelse false;
     const size_config = b.addOptions();
     size_config.addOption(bool, "fat_offset", fat_offset);
 
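
For readers of this change, here is a minimal sketch of what a Boolean option like `fat_offset` could translate to inside a module. It is not taken from `zg`. The local `fat_offset` constant, the `Offset` alias and the `Fragment` struct are hypothetical names used only for illustration; in the real library the value would arrive through the `size_config` options step shown in the diff.

```zig
const std = @import("std");

// Hypothetical stand-in for the value supplied by the build option.
// In zg this would come from the build configuration, not a local constant.
const fat_offset = false;

// Choose the offset type at compile time: u64 when fat_offset is set, else u32.
const Offset = if (fat_offset) u64 else u32;

// Hypothetical fragment type: an offset into the string plus a length,
// loosely following the README's description of `Grapheme` and `Word`.
const Fragment = struct {
    offset: Offset,
    len: Offset,
};

// Largest byte offset representable with the chosen type.
// With the default u32 this is one byte short of 4 GiB.
const max_offset = std.math.maxInt(Offset);

// A u3 holds values 0 through 7, which covers the 1 to 4 bytes
// of a UTF-8 encoded code point.
const max_code_point_len = std.math.maxInt(u3);
```

Given the `b.option(bool, "fat_offset", ...)` declaration in `build.zig`, the option should be settable from the command line as `zig build -Dfat_offset=true`. A project that depends on the package would typically pass it as a dependency argument, for example `b.dependency("zg", .{ .fat_offset = true })`, assuming the dependency is registered under the name `zg`.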