This accomplishes two things:
* Works around #8442 by putting stage1-specific logic in to disable all
the std.json tests.
* Slightly reduces installation size of zig since std lib files ending
in "test.zig" are excluded from being installed.
We already have a LICENSE file that covers the Zig Standard Library. We
no longer need to remind everyone that the license is MIT in every single
file.
Previously this was introduced to clarify the situation for a fork of
Zig that made Zig's LICENSE file harder to find, and replaced it with
their own license that required annual payments to their company.
However that fork now appears to be dead. So there is no need to
reinforce the copyright notice in every single file.
* Switch json testing 'roundTrip()' to use FixedBufferStream, improve error handling, remove comptime from param
* Add 'try' to calls to roundTrip() that can now return an error
* Remove comptime from params in json testing, replace expect(false) with letting error propagate
* Add 'try' to calls to ok() that can now return an error
Co-authored-by: Lewis Gaul <legaul@cisco.com>
The main goal here is to make the function pointers comptime, so that we
don't have to do the crazy stuff with async function frames.
Since InStream, OutStream, and SeekableStream are already generic
across error sets, it's not really worse to make them generic across the
vtable as well.
See #764 for the open issue acknowledging that using generics for these
abstractions is a design flaw.
See #130 for the efforts to make these abstractions non-generic.
This commit also changes the OutStream API so that `write` returns
number of bytes written, and `writeAll` is the one that loops until the
whole buffer is written.
Fixes#2379
= Overlong (non-shortest) sequences
UTF-8's unique encoding scheme allows for some Unicode codepoints
to be represented in multiple ways. For any of these characters,
the spec forbids all but the shortest form. These disallowed longer
sequences are called "overlong". As an interesting side effect of
this rule, the bytes C0 and C1 never appear in valid UTF-8.
= Codepoint range
UTF-8 disallows representation of codepoints beyond U+10FFFF,
which is the highest character which can be encoded in UTF-16.
Because a 4-byte sequence is capable of resulting in such characters,
they must be explicitly rejected. This rule also has an interesting
side effect, which is that bytes F5 to FF never appear.
= References
Detecting an overlong version of a codepoint could get gnarly, but
luckily The Unicode Consortium did the hard work by creating this
handy table of valid byte sequences:
https://unicode.org/versions/corrigendum1.html
I thought this mapped nicely to the parser's state machine, so I
rearranged the relevant states to make use of it.
This change was mostly made with `zig fmt` and this also modified some whitespace. Note that in some files, `zig fmt` produced incorrect code, so the change was made manually.