const std = @import("std");
const builtin = @import("builtin");
re-enable test-cases and get them all passing
Instead of using `zig test` to build a special version of the compiler
that runs all the test-cases, the zig build system is now used as much
as possible - all with the basic steps found in the standard library.
For incremental compilation tests (the ones that look like foo.0.zig,
foo.1.zig, foo.2.zig, etc.), a special version of the compiler is
compiled into a utility executable called "check-case" which checks
exactly one sequence of incremental updates in an independent
subprocess. Previously, all incremental and non-incremental test cases
were done in the same test runner process.
The compile error checking code is now simpler, but also a bit
rudimentary, and so it additionally makes sure that the actual compile
errors do not include *extra* messages, and it makes sure that the
actual compile errors output in the same order as expected. It is also
based on the "ends-with" property of each line rather than the previous
logic, which frankly I didn't want to touch with a ten-meter pole. The
compile error test cases have been updated to pass in light of these
differences.
Previously, 'error' mode with 0 compile errors was used to shoehorn in a
different kind of test-case - one that only checks if a piece of code
compiles without errors. Now there is a 'compile' mode of test-cases,
and 'error' must be only used when there are greater than 0 errors.
link test cases are updated to omit the target object format argument
when calling checkObject since that is no longer needed.
The test/stage2 directory is removed; the 2 files within are moved to be
directly in the test/ directory.
2023-03-10 01:22:51 +00:00
const Cases = @import("src/Cases.zig");
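// Illustrative sketch only, not the actual src/Cases.zig logic: one way to read
// the "ends-with" comparison described in the commit note above. The helper
// name `errorLinesMatchSketch` and its exact rules are assumptions made for
// this example. It requires the same number of messages (no extras), in the
// same order, and accepts an actual line when it ends with the expected text.
fn errorLinesMatchSketch(expected: []const []const u8, actual: []const []const u8) bool {
    if (expected.len != actual.len) return false;
    for (expected, actual) |want, got| {
        if (!std.mem.endsWith(u8, got, want)) return false;
    }
    return true;
}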
zig build system: change target, compilation, and module APIs
Introduce the concept of "target query" and "resolved target". A target
query is what the user specifies, with some things left to default. A
resolved target has the default things discovered and populated.
In the future, std.zig.CrossTarget will be renamed to std.Target.Query.
Introduces `std.Build.resolveTargetQuery` to get from one to the other.
The concept of `main_mod_path` is gone, no longer supported. You have to
put the root source file at the module root now.
* remove deprecated API
* update build.zig for the breaking API changes in this branch
* move std.Build.Step.Compile.BuildId to std.zig.BuildId
* add more options to std.Build.ExecutableOptions, std.Build.ObjectOptions,
std.Build.SharedLibraryOptions, std.Build.StaticLibraryOptions, and
std.Build.TestOptions.
* remove `std.Build.constructCMacro`. There is no use for this API.
* deprecate `std.Build.Step.Compile.defineCMacro`. Instead,
`std.Build.Module.addCMacro` is provided.
- remove `std.Build.Step.Compile.defineCMacroRaw`.
* deprecate `std.Build.Step.Compile.linkFrameworkNeeded`
- use `std.Build.Module.linkFramework`
* deprecate `std.Build.Step.Compile.linkFrameworkWeak`
- use `std.Build.Module.linkFramework`
* move more logic into `std.Build.Module`
* allow `target` and `optimize` to be `null` when creating a Module.
Along with other fields, those unspecified options will be inherited
from parent `Module` when inserted into an import table.
* the `target` field of `addExecutable` is now required. pass `b.host`
to get the host target.
2023-12-03 04:51:34 +00:00
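// Illustrative sketch only: the "target query" versus "resolved target" split
// described in the commit note above. The helper name `resolveHostSketch` is
// hypothetical. An empty query leaves everything to defaults, so resolving it
// is assumed to yield the host target, comparable to the `b.graph.host` value
// passed to `ctx.obj` in the cases below.
fn resolveHostSketch(b: *std.Build) std.Build.ResolvedTarget {
    const query: std.Target.Query = .{};
    return b.resolveTargetQuery(query);
}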
pub fn addCases(ctx: *Cases, b: *std.Build) !void {
    {
        const case = ctx.obj("multiline error messages", b.graph.host);

        case.addError(
            \\comptime {
            \\    @compileError("hello\nworld");
            \\}
        , &[_][]const u8{
            \\:2:5: error: hello
            \\ world
        });

        case.addError(
            \\comptime {
            \\    @compileError(
            \\        \\
            \\        \\hello!
            \\        \\I'm a multiline error message.
            \\        \\I hope to be very useful!
            \\        \\
            \\        \\also I will leave this trailing newline here if you don't mind
            \\        \\
            \\    );
            \\}
        , &[_][]const u8{
            \\:2:5: error:
            \\ hello!
            \\ I'm a multiline error message.
            \\ I hope to be very useful!
            \\
            \\ also I will leave this trailing newline here if you don't mind
            \\
        });
    }

    {
        const case = ctx.obj("missing semicolon at EOF", b.graph.host);
        case.addError(
            \\const foo = 1
        , &[_][]const u8{
            \\:1:14: error: expected ';' after declaration
        });
    }

    {
        const case = ctx.obj("argument causes error", b.graph.host);

        case.addError(
            \\pub export fn entry() void {
            \\    var lib: @import("b.zig").ElfDynLib = undefined;
            \\    _ = lib.lookup(fn () void);
            \\}
        , &[_][]const u8{
            ":3:12: error: unable to resolve comptime value",
            ":3:12: note: argument to function being called at comptime must be comptime-known",
            ":2:55: note: expression is evaluated at comptime because the generic function was instantiated with a comptime-only return type",
        });
        case.addSourceFile("b.zig",
            \\pub const ElfDynLib = struct {
            \\    pub fn lookup(self: *ElfDynLib, comptime T: type) ?T {
            \\        _ = self;
            \\        return undefined;
            \\    }
            \\};
        );
    }

    {
        const case = ctx.obj("astgen failure in file struct", b.graph.host);

        case.addError(
            \\pub export fn entry() void {
            \\    _ = (@sizeOf(@import("b.zig")));
            \\}
        , &[_][]const u8{
            ":1:1: error: expected type expression, found '+'",
        });
        case.addSourceFile("b.zig",
            \\+
        );
    }

    {
        const case = ctx.obj("invalid store to comptime field", b.graph.host);

        case.addError(
            \\const a = @import("a.zig");
            \\
            \\export fn entry() void {
            \\    _ = a.S.qux(a.S{ .foo = 2, .bar = 2 });
            \\}
        , &[_][]const u8{
            ":4:23: error: value stored in comptime field does not match the default value of the field",
            ":2:25: note: default value set here",
        });
        case.addSourceFile("a.zig",
            \\pub const S = struct {
            \\    comptime foo: u32 = 1,
            \\    bar: u32,
            \\    pub fn qux(x: @This()) void {
            \\        _ = x;
            \\    }
            \\};
        );
    }

    {
        const case = ctx.obj("file in multiple modules", b.graph.host);
        case.addDepModule("foo", "foo.zig");

        case.addError(
            \\comptime {
            \\    _ = @import("foo");
            \\    _ = @import("foo.zig");
            \\}
        , &[_][]const u8{
            ":1:1: error: file exists in multiple modules",
            ":1:1: note: root of module foo",
            ":3:17: note: imported from module root",
        });
        case.addSourceFile("foo.zig",
            \\const dummy = 0;
        );
    }

    {
        const case = ctx.obj("wrong same named struct", b.graph.host);

        case.addError(
            \\const a = @import("a.zig");
            \\const b = @import("b.zig");
            \\
            \\export fn entry() void {
            \\    var a1: a.Foo = undefined;
            \\    bar(&a1);
            \\}
            \\
            \\fn bar(_: *b.Foo) void {}
        , &[_][]const u8{
            ":6:9: error: expected type '*b.Foo', found '*a.Foo'",
            ":6:9: note: pointer type child 'a.Foo' cannot cast into pointer type child 'b.Foo'",
            ":1:17: note: struct declared here",
            ":1:17: note: struct declared here",
            ":9:11: note: parameter type declared here",
        });

        case.addSourceFile("a.zig",
            \\pub const Foo = struct {
            \\    x: i32,
            \\};
        );

        case.addSourceFile("b.zig",
            \\pub const Foo = struct {
            \\    z: f64,
            \\};
        );
    }

    {
        const case = ctx.obj("non-printable invalid character", b.graph.host);

        case.addError("\xff\xfe" ++
            \\export fn foo() bool {
            \\    return true;
            \\}
        , &[_][]const u8{
std.zig.tokenizer: simplify
I pointed a fuzzer at the tokenizer and it crashed immediately. Upon
inspection, I was dissatisfied with the implementation. This commit
removes several mechanisms:
* Removes the "invalid byte" compile error note.
* Dramatically simplifies tokenizer recovery by making recovery always
occur at newlines, and never otherwise.
* Removes UTF-8 validation.
* Moves some character validation logic to `std.zig.parseCharLiteral`.
Removing UTF-8 validation is a regression of #663, however, the existing
implementation was already buggy. When adding this functionality back,
it must be fuzz-tested while checking the property that it matches an
independent Unicode validation implementation on the same file. While
we're at it, fuzzing should check the other properties of that proposal,
such as no ASCII control characters existing inside the source code.
Other changes included in this commit:
* Deprecate `std.unicode.utf8Decode` and its WTF-8 counterpart. This
function has an awkward API that is too easy to misuse.
* Make `utf8Decode2` and friends use arrays as parameters, eliminating a
runtime assertion in favor of using the type system.
After this commit, the crash found by fuzzing, which was
"\x07\xd5\x80\xc3=o\xda|a\xfc{\x9a\xec\x91\xdf\x0f\\\x1a^\xbe;\x8c\xbf\xee\xea"
no longer causes a crash. However, I did not feel the need to add this
test case because the simplified logic eradicates most crashes of this
nature.
2024-07-31 19:51:19 +01:00
            ":1:1: error: expected type expression, found 'invalid token'",
        });
    }

    {
        const case = ctx.obj("imported generic method call with invalid param", b.graph.host);

        case.addError(
            \\pub const import = @import("import.zig");
            \\
            \\export fn callComptimeBoolFunctionWithRuntimeBool(x: bool) void {
            \\    import.comptimeBoolFunction(x);
            \\}
            \\
            \\export fn callComptimeAnytypeFunctionWithRuntimeBool(x: bool) void {
            \\    import.comptimeAnytypeFunction(x);
            \\}
            \\
            \\export fn callAnytypeFunctionWithRuntimeComptimeOnlyType(x: u32) void {
            \\    const S = struct { x: u32, y: type };
            \\    import.anytypeFunction(S{ .x = x, .y = u32 });
            \\}
        , &[_][]const u8{
            ":4:33: error: runtime-known argument passed to comptime parameter",
            ":1:38: note: declared comptime here",
            ":8:36: error: runtime-known argument passed to comptime parameter",
            ":2:41: note: declared comptime here",
compiler: preserve result type information through address-of operator
This commit introduces the new `ref_coerced_ty` result type into AstGen.
This represents an expression which we want to treat as an lvalue, and
the pointer will be coerced to a given type.
This change gives known result types to many expressions, in particular
struct and array initializations. This allows certain casts to work
which previously required explicitly specifying types via `@as`. It also
eliminates our dependence on anonymous struct types for expressions of
the form `&.{ ... }` - this paves the way for #16865, and also results
in less Sema magic happening for such initializations, also leading to
potentially better runtime code.
As part of these changes, this commit also implements #17194 by
disallowing RLS on explicitly-typed struct and array initializations.
Apologies for linking these changes - it seemed rather pointless to try
and separate them, since they both make big changes to struct and array
initializations in AstGen. The rationale for this change can be found in
the proposal - in essence, performing RLS whilst maintaining the
semantics of the intermediary type is a very difficult problem to solve.
This allowed the problematic `coerce_result_ptr` ZIR instruction to be
completely eliminated, which in turn also simplified the logic for
inferred allocations in Sema - thanks to this, we almost break even on
line count!
In doing this, the ZIR instructions surrounding these initializations
have been restructured - some have been added and removed, and others
renamed for clarity (and their semantics changed slightly). In order to
optimize ZIR tag count, the `struct_init_anon_ref` and
`array_init_anon_ref` instructions have been removed in favour of using
`ref` on a standard anonymous value initialization, since these
instructions are now virtually never used.
Lastly, it's worth noting that this commit introduces a slightly strange
source of generic poison types: in the expression `@as(*anyopaque, &x)`,
the sub-expression `x` has a generic poison result type, despite no
generic code being involved. This turns out to be a logical choice,
because we don't know the result type for `x`, and the generic poison
type represents precisely this case, providing the semantics we need.
Resolves: #16512
Resolves: #17194
2023-09-18 14:49:18 +01:00
            ":13:32: error: unable to resolve comptime value",
            ":13:32: note: initializer of comptime only struct must be comptime-known",
        });

        case.addSourceFile("import.zig",
            \\pub fn comptimeBoolFunction(comptime _: bool) void {}
            \\pub fn comptimeAnytypeFunction(comptime _: anytype) void {}
            \\pub fn anytypeFunction(_: anytype) void {}
        );
    }

    {
        const case = ctx.obj("invalid byte in string", b.graph.host);

        case.addError("_ = \"\x01Q\";", &[_][]const u8{
            ":1:5: error: expected expression, found 'invalid token'",
        });
    }

    {
        const case = ctx.obj("invalid byte in comment", b.graph.host);

        case.addError("//\x01Q", &[_][]const u8{
            ":1:1: error: expected type expression, found 'invalid token'",
        });
    }

    {
        const case = ctx.obj("control character in character literal", b.graph.host);

        case.addError("const c = '\x01';", &[_][]const u8{
            ":1:11: error: expected expression, found 'invalid token'",
        });
    }

    {
        const case = ctx.obj("invalid byte at start of token", b.graph.host);

        case.addError("x = \x00Q", &[_][]const u8{
            ":1:5: error: expected expression, found 'invalid token'",
        });
    }
}