There are two optimizations here, which work together to avoid a
pathological case.
The first optimization is that AstGen now records the result type of an
array multiplication expression where possible. This type is not used
according to the language specification, but instead as an optimization.
In the expression '.{x} ** 1000', if we know that the result must be an
array, then it is much more efficient to coerce the LHS to an array with
length 1 before doing the multiplication. Otherwise, we end up with a
1000-element tuple which we must coerce to an array by individually
extracting each field.
Secondly, the previous logic would repeatedly extract element/field
values from the LHS when initializing the result. This is unnecessary:
each element must only be extracted once, and the result reused.
These changes together give huge improvements to compiler performance on
a pathological case: AIR instructions go from 65551 to 15, and total AIR
bytes go from 1.86MiB to 264.57KiB. Codegen time spent on this function
(in a debug compiler build) goes from minutes to essentially zero.
Resolves: #17586
Previously the symbol tag field would remain `undefined` until it
was set during `flush`. However, the symbol's tag would be observed
earlier than where it was being set. We now set it to the explicit
tag `undefined` so this can be caught during debug. The symbol tag of
a decl will now also be set right after `updateDecl` and `updateFunc`.
Likewise, we now also set the `name` field during atom creation for
decls, as well as set the other fields to the max(u32) to ensure we
get a compiler crash during debug to ensure any misses will be caught.
There is no grantee that `copy_cqes` will return exactly wait_nr number of cqes.
If there are ready cqes it can return > 0 but < wait_nr number of cqes.
The low-level `Curve25519.fromEdwards25519()` function assumed
that the X/Y coordinates were not scaled (Z=1).
But this is not guaranteed to be the case.
In most real-world applications, the coordinates are freshly decoded,
either directly or via the `X25519.fromEd25519()` function, so this
is not an issue.
However, since we offer the ability to do that conversion after
arbitrary computations, the assertion was not correct.
This change allows struct field inits to use layout information
of their own struct without causing a circular dependency.
`semaStructFields` caches the ranges of the init bodies in the `StructType`
trailing data. The init bodies are then resolved by `resolveStructFieldInits`,
which is called before the inits are actually required.
Within the init bodies, the struct decl's instruction is repurposed to refer
to the field type itself. This is to allow us to easily rebuild the inst_map
mapping required for the init body instructions to refer to the field type.
Thanks to @mlugg for the guidance on this one!