This introduces the concept of a "weak global name" into translate-c.
translate-c consists of two passes. The first is important, because it
discovers all global names, which are used to prevent naming conflicts:
whenever we see an identifier in the second pass, we can mangle it if it
conflicts with any global or any other in-scope identifier.
Unfortunately, this is a bit tricky for structs, unions, and enums. In
C, these types are not represented by normal identifers, but by separate
tags - `struct foo` does not prevent an unrelated identifier `foo`
existing. In general, we want to translate type names to user-friendly
ones such as `struct_foo` and `foo` where possible, but we can't
guarantee such names will not conflict with real variable names.
This is where weak global names come in. In the initial pass, when a
global type declaration is seen, `struct_foo` and `foo` are both added
as weak global names. This essentially means that we will use these
names for the type *if possible*, but if there is another global with
the same name, we will mangle the type name instead. Then, when actually
translating the declaration, we check whether there's a "true" global
with a conflicting name, in which case we mangle our name. If the
user-friendly alias `foo` conflicts, we do not attempt to mangle it: we
just don't emit it, because a mangled alias isn't particularly helpful.
The _environ variable that is populated when linking libc on Windows does not support Unicode keys/values (or, at least, the encoding is not necessarily UTF-8). So, for Unicode support, _wenviron would need to be used instead. However, this means that the keys/values would be encoded as UTF-16, so they would need to be converted to UTF-8 before being returned by `os.getenv`. This would require allocation which is not part of the `os.getenv` API, so `os.getenv` is not implementable on Windows even when linking libc.
Closes https://github.com/ziglang/zig/issues/8456
this will make s3 re-enable compression for the stdlib's autodoc
and improve loading times (and data usage) for users
alongside this commit the deploy script for the official website is also
being updated
Consider this C macro:
```c
#define FOO(x) struct x
```
Previously, translate-c did not detect that the `x` in the body referred
to the argument, so wrongly translated this code as using the
nonexistent `struct_x`. Since undefined identifiers are noticed in
AstGen, this prevents the translated file from being usable at all.
translate-c now instead detects this case and emits an appropriate
compile error in the macro's place.
AstGen emits an error when a closure over a known-runtime value crosses
a namespace boundary. This usually makes sense: however, this usage is
actually valid if the capture is within a `@TypeOf` operand. Sema
already has a special case to allow such closure within `@TypeOf` when
AstGen could not determine a value to be runtime-known. This commit
simply introduces analagous logic to AstGen to allow `var`s to cross
namespace boundaries within `@TypeOf`.
I have updated Felix's ZigAndroidTemplate to work with the latest
version of Zig and we are exploring adding Android support to Mach engine.
`std.c.getcontext` is _referenced_ but not _used_, and Android's bionic libc
does not implement `getcontext`. `std.os.linux.getcontext` also cannot be
used with bionic libc, so it seems prudent to just disable this extern for now.
This may not be the perfect long-term fix, but I have a golden rebuttal to that:
before I was unable to compile Zig applications for Android, and now I can.
<img width="828" alt="image" src="https://github.com/hexops/mach/assets/3173176/1e29142b-0419-4459-9c8b-75d92f87f822">
Signed-off-by: Stephen Gutekanst <stephen@hexops.com>
This is supposed to be the case, similar to how pointers to generic
functions are comptime-only (several pieces of logic already assumed
this). These types being considered runtime was causing `dbg_var_val`
AIR instructions to be wrongly emitted for such values, causing codegen
backends to create a runtime reference to the inline function, which (at
least on the LLVM backend) triggers an error.
Resolves: #38
The new `@depedencies` module contains generated code like the
following (where strings like "abc123" represent hashes):
```zig
pub const root_deps = [_]struct { []const u8, []const u8 }{
.{ "foo", "abc123" },
};
pub const packages = struct {
pub const abc123 = struct {
pub const build_root = "/home/mlugg/.cache/zig/blah/abc123";
pub const build_zig = @import("abc123");
pub const deps = [_]struct { []const u8, []const u8 }{
.{ "bar", "abc123" },
.{ "name", "ghi789" },
};
};
};
```
Each package contains a build root string, the build.zig import, and a
mapping from dependency names to package hashes. There is also such a
mapping for the root package dependencies.
In theory, we could now remove the `dep_prefix` field from `std.Build`,
since its main purpose is now handled differently. I believe this is a
desirable goal, as it doesn't really make sense to assign a single FQN
to any package (because it may appear in many different places in the
package hierarchy). This commit does not remove that field, as it's used
non-trivially in a few places in the build runner and compiler tests:
this will be a future enhancement.
Resolves: #16354Resolves: #17135
This applies the changes to the grammar introduced by the new
destructuring syntax, as well as some existing changes which were not
copied into the langref.
This change implements the following syntax into the compiler:
```zig
const x: u32, var y, foo.bar = .{ 1, 2, 3 };
```
A destructure expression may only appear within a block (i.e. not at
comtainer scope). The LHS consists of a sequence of comma-separated var
decls and/or lvalue expressions. The RHS is a normal expression.
A new result location type, `destructure`, is used, which contains
result pointers for each component of the destructure. This means that
when the RHS is a more complicated expression, peer type resolution is
not used: each result value is individually destructured and written to
the result pointers. RLS is always used for destructure expressions,
meaning every `const` on the LHS of such an expression creates a true
stack allocation.
Aside from anonymous array literals, Sema is capable of destructuring
the following types:
* Tuples
* Arrays
* Vectors
A destructure may be prefixed with the `comptime` keyword, in which case
the entire destructure is evaluated at comptime: this means all `var`s
in the LHS are `comptime var`s, every lvalue expression is evaluated at
comptime, and the RHS is evaluated at comptime. If every LHS is a
`const`, this is not allowed: as with single declarations, the user
should instead mark the RHS as `comptime`.
There are a few subtleties in the grammar changes here. For one thing,
if every LHS is an lvalue expression (rather than a var decl), a
destructure is considered an expression. This makes, for instance,
`if (cond) x, y = .{ 1, 2 };` valid Zig code. A destructure is allowed
in almost every context where a standard assignment expression is
permitted. The exception is `switch` prongs, which cannot be
destructures as the comma is ambiguous with the end of the prong.
A follow-up commit will begin utilizing this syntax in the Zig compiler.
Resolves: #498
When RLS is used to initialize a value with a comptime-only type, the
usual "value with comptime-only type depends on runtime control flow"
error message isn't hit, because we don't use results from a block. When
we reach `make_ptr_const`, we must validate that the value is
comptime-known.
This option exists for fast iteration during compiler development, since
avoiding codegen (but still performing semantic analysis) causes compile
errors to appear around twice as fast. This option was broken when the
`emit_bin` property on `std.Build.Step.Compile` was removed.
Now, when this option is passed, we add a direct install dependency on
the compile step itself, so that it is always run.
Insn.st() can be used with dynamic size just like Insn.stx(), which is
relevant in a code generation context.
using ImmOrReg caused an error as its fields were ordered differently than
Source.
The `coerce_result_ptr` instruction is highly problematic and leads to
unintentional memory reinterpretation in some cases. It is more correct
to simply not forward result pointers through this builtin.
`coerce_result_ptr` is still used for struct and array initializations,
where it can still cause issues. Eliminating this usage will be a future
change.
Resolves: #16991
* Use 32-bit integers instead of pointers for compactness and
serialization friendliness.
* Use a separate hash map for runtime and comptime capture scopes,
avoiding the 1-bit union tag.
* Use a compact array representation instead of a tree of hash maps.
* Eliminate the only instance of ref-counting in the compiler, instead
relying on garbage collection (not implemented yet but is the plan for
almost all long-lived objects related to incremental compilation).
Because a code modification may need to access capture scope data, this
makes capture scope data long-lived state. My goal is to get incremental
compilation state serialization down to a single pwritev syscall, by
unifying the on-disk representation with the in-memory representation.
This commit eliminates the last remaining pointer field of
`Module.Decl`.
This will allow users to construct e.g. a ComptimeStringMap that uses case-insensitive ASCII comparison.
Note: the previous ComptimeStringMap API is unchanged (i.e. this does not break any existing code).