6427 Commits

Author SHA1 Message Date
Andrew Kelley
0b28133ec2 zig.h: avoid binary literals
gcc -pedantic complains about these
2022-11-22 22:33:16 -07:00
Frank Denis
ea05223b63
std.crypto.auth: add AEGIS MAC (#13607)
* Update the AEGIS specification URL to the current draft

* std.crypto.auth: add AEGIS MAC

The Pelican-based authentication function of the AEGIS construction
can be used independently from authenticated encryption, as a faster
and more secure alternative to GHASH/POLYVAL/Poly1305.

We already expose GHASH, POLYVAL and Poly1305 for use outside AES-GCM
and ChaChaPoly, so there are no reasons not to expose the MAC from AEGIS
as well.

Like other 128-bit hash functions, finding a collision only requires
~2^64 attempts or inputs, which may still be acceptable for many
practical applications.

Benchmark (Apple M1):

    siphash128-1-3:       3222 MiB/s
             ghash:       8682 MiB/s
    aegis-128l mac:      12544 MiB/s

Benchmark (Zen 2):

    siphash128-1-3:       4732 MiB/s
             ghash:       5563 MiB/s
    aegis-128l mac:      19270 MiB/s
2022-11-22 18:16:04 +01:00
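
The "~2^64 attempts" figure in the AEGIS MAC message above is the usual birthday bound for a 128-bit output: a collision is expected after roughly 2^(n/2) inputs for an n-bit tag. The sketch below is only an illustration of that bound on tiny tags. It does not use AEGIS at all; it truncates SHA-256 from Python's standard library to a few bits, and the helper name `attempts_until_collision` is hypothetical.

```python
import hashlib


def attempts_until_collision(tag_bits: int) -> int:
    """Hash successive counters and return how many inputs were needed
    before two of them produced the same truncated tag.

    Assumption: SHA-256 truncated to `tag_bits` bits stands in for an
    ideal `tag_bits`-bit hash. This is not the AEGIS MAC.
    """
    seen = set()
    counter = 0
    while True:
        digest = hashlib.sha256(counter.to_bytes(8, "little")).digest()
        # Keep only the top `tag_bits` bits of the 256-bit digest.
        tag = int.from_bytes(digest, "big") >> (256 - tag_bits)
        counter += 1
        if tag in seen:
            return counter
        seen.add(tag)


if __name__ == "__main__":
    for bits in (8, 16, 24):
        # The count should land in the neighbourhood of 2**(bits / 2),
        # far below the 2**bits inputs that pigeonhole would require.
        print(bits, attempts_until_collision(bits))
```

Scaling the same reasoning up to 128 bits gives the ~2^64 figure quoted in the commit message.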
Andrew Kelley
670b4c5c02
Merge pull request #13292 from mitchellh/valgrind-arm64
std: valgrind client request support for aarch64
2022-11-21 19:50:26 -05:00
Andrew Kelley
545c3117ff rename lse_atomics.zig to aarch64_outline_atomics.zig 2022-11-21 17:17:02 -07:00
Andrew Kelley
58430ae6d1 outline atomics: ret instead of jump to ret
After this, the machine code generated by zig is identical to
gcc's, apart from the differences in loading the have_lse flag.
2022-11-21 17:17:02 -07:00
Andrew Kelley
95ee8ab77d simplify outline atomics
 * Rely on libSystem when targeting macOS.
 * Make tools/gen_outline_atomics.zig more idiomatic.
 * Remove the CPU detection / auxval checking from compiler_rt. This
   functionality belongs in a different component. Zig's compiler_rt
   must not rely on constructors. Instead it will export a symbol for
   setting the value, and start code can detect and activate it.
 * Remove the separate logic for inline assembly when the target does or
   does not have lse support. `.inst` works in both cases.
2022-11-21 17:17:02 -07:00
Devin Singh
a8f2d00ec4 compiler_rt: add outlined lse atomics for aarch64 2022-11-21 17:17:02 -07:00
Frank Denis
c45c6cd492 Add the POLYVAL universal hash function
POLYVAL is GHASH's little brother, required by the AES-GCM-SIV
construction. It's defined in RFC8452.

The irreducible polynomial is a mirror of GHASH's (which doesn't
change anything in our implementation that didn't reverse the raw
bits to start with).

But most importantly, POLYVAL encodes byte strings as little-endian
instead of big-endian, which makes it a little bit faster on the
vast majority of modern CPUs.

So, both share the same code, just with comptime magic to use the
correct endianness and only double the key for GHASH.
2022-11-20 18:13:19 -05:00
Andrew Kelley
8c7712d8fa fix CPU model detection for neoverse_n1 on aarch64-linux
see #10086
2022-11-20 15:34:39 -07:00
Andrew Kelley
78389af552 LLVM: add valgrind integration for x86 and aarch64
This also modifies the inline assembly to be more optimizable - instead of
doing explicit movs, we instead communicate to LLVM which registers we
would like to, somehow, have the correct values. This is how the x86_64
code already worked and thus allows the code to be unified across the
two architectures.

As a bonus, I threw in x86 support.
2022-11-19 19:32:45 -07:00
Mitchell Hashimoto
95e135a8cb std: valgrind client request support for aarch64 2022-11-19 18:54:31 -07:00
Ali Chraghi
fca776f8f5 os: windows: fix unhandled error 2022-11-19 22:48:32 +02:00
Stevie Hryciw
04f3067a79 run zig fmt on everything checked by CI 2022-11-18 19:22:42 +00:00
Stevie Hryciw
e999f9f472 std: replace parseAppend with parseWrite in std.zig.string_literal 2022-11-18 19:22:42 +00:00
Stevie Hryciw
ca9e1760e8 fmt: canonicalize identifiers 2022-11-18 19:22:42 +00:00
remeh
e7424d5d2a std.array_list: add a comment on every method invalidating pointers.
While this is already mentioned on the `items` field of the structs, it is
worth stating in every method that may invalidate pointers to items
that it may do so.
2022-11-18 14:49:31 +02:00
Veikka Tuominen
3c0c0f899b
Merge pull request #13417 from InKryption/rand-deterministic-indexing
std.Random: add functions with explicit index type
2022-11-18 14:48:51 +02:00
Veikka Tuominen
8082323dfd
Merge pull request #13411 from dweiller/custom-test-runner
Custom test runner
2022-11-18 14:47:21 +02:00
Stevie Hryciw
5f6f38ff31 std.math.big.int: implement popCount() for Const 2022-11-18 14:31:30 +02:00
Frank Denis
4dd061a7ac ghash: handle the .hi_lo case when no CLMUL acceleration is present, too 2022-11-17 23:54:21 +01:00
Frank Denis
3051e279a5 Reapply "std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs (#13566)"
This reapplies commit 72d3f4b5dc.
2022-11-17 23:52:58 +01:00
Andrew Kelley
72d3f4b5dc Revert "std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs (#13566)"
This reverts commit 7cfeae1ce7 which
is causing std lib tests to fail on wasm32-wasi.
2022-11-17 15:37:37 -07:00
Frank Denis
7cfeae1ce7
std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs (#13566)
* std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs

Carryless multiplication was slow on older Intel CPUs, justifying
the need for using Karatsuba multiplication.

This is not the case any more; using 4 multiplications to multiply
two 128-bit numbers is actually faster than 3 multiplications +
shifts and additions.

This is also true on aarch64.

Keep using Karatsuba only when targeting x86 (granted, this is a bit
of a brutal shortcut, we should really list all the CPU models that
had a slow clmul instruction).

Also remove useless agg_2 threshold and restore the ability to
precompute only H and H^2 in ReleaseSmall.

Finally, avoid using u256. Using 128-bit registers is actually faster.

* Use a switch, add some comments
2022-11-17 13:07:07 +01:00
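
The trade-off described in the GHASH commit above (four carry-less multiplications versus three plus extra XORs and shifts) can be sketched as follows. This is an illustration in Python rather than the Zig code from the commit: `clmul64` emulates a 64x64 carry-less multiply in software, and all function names here are hypothetical. It only shows that the two decompositions give the same 256-bit product; it says nothing about speed on real hardware.

```python
import random

MASK64 = (1 << 64) - 1


def clmul64(a: int, b: int) -> int:
    """Carry-less product of two 64-bit values (polynomials over GF(2))."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def clmul128_schoolbook(x: int, y: int) -> int:
    """128x128 carry-less product using four 64x64 multiplications."""
    x_lo, x_hi = x & MASK64, x >> 64
    y_lo, y_hi = y & MASK64, y >> 64
    lo = clmul64(x_lo, y_lo)
    hi = clmul64(x_hi, y_hi)
    mid = clmul64(x_lo, y_hi) ^ clmul64(x_hi, y_lo)
    return (hi << 128) ^ (mid << 64) ^ lo


def clmul128_karatsuba(x: int, y: int) -> int:
    """Same product using three multiplications plus extra XORs."""
    x_lo, x_hi = x & MASK64, x >> 64
    y_lo, y_hi = y & MASK64, y >> 64
    lo = clmul64(x_lo, y_lo)
    hi = clmul64(x_hi, y_hi)
    # (x_lo + x_hi)(y_lo + y_hi) contains lo and hi; XOR them back out
    # to leave only the two cross terms.
    mid = clmul64(x_lo ^ x_hi, y_lo ^ y_hi) ^ lo ^ hi
    return (hi << 128) ^ (mid << 64) ^ lo


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(1000):
        x, y = rng.getrandbits(128), rng.getrandbits(128)
        assert clmul128_schoolbook(x, y) == clmul128_karatsuba(x, y)
    print("both decompositions agree")
```

Karatsuba saves one multiplication at the cost of the additional XORs needed to recover the middle term, which is why it only pays off when the multiply instruction itself is slow.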
Björn Linse
a09a5ad574 stdlib: make linux.PERF.TYPE non-exhaustive
perf_event_attr.type needs to take a runtime defined value to enable
dynamic PMUs, such as kprobe and uprobe. This value can exceed
predefined values defined in the linux headers.

reference: perf_event_open(2) man page
2022-11-16 19:02:24 -05:00
Eric Joldasov
3c3def6ac2 process.zig: remove unused function getSelfExeSharedLibPaths 2022-11-16 18:51:11 -05:00
Guillaume Wenzek
699e7f721b fix Nvptx backend outputting files at the top level of zig-cache 2022-11-16 18:49:04 -05:00
Andrew Kelley
07671838b0
Merge pull request #13561 from jacobly0/gcc-warnings 2022-11-16 10:38:44 -05:00
Eric Joldasov
684264908e compiler_rt: fix TODOs in udivmod.zig 2022-11-16 13:08:41 +02:00
Jacob Young
b5b507a742 zig.h: match float comparison signatures from compiler rt 2022-11-15 23:33:48 -05:00
Veikka Tuominen
b6b3462796 std.mem.Allocator: do not return undefined pointers from create
Closes #13517
2022-11-16 01:12:27 +02:00
GethDW
024bac7f53 std.build: fix typo
This would only fail to compile when building *on* WASI.
2022-11-15 23:23:27 +02:00
Hayden Pope
ceb9fedb47 std.os.linux: Add setitimer and getitimer syscalls 2022-11-15 02:38:28 -05:00
Frank Denis
7eed028f9a
crypto.bcrypt: fix massive speed regression when using stage2 (#13518)
state: State -> state: *const State
Suggested by @nektro

Fixes #13510
2022-11-14 16:37:19 +01:00
Naoki MATSUMOTO
b29057b6ab
std.crypto.ghash: fix uninitialized polynomial use (#13527)
When processing the 'remaining blocks',
the length of the processed message can be from 1 to 79.
The value of 'n-1' ranges from 0 to 3.
So at least st.hx[0] through st.hx[3] must be initialized.
2022-11-14 16:35:08 +01:00
Andrew Kelley
20e8c2df4e zig.h: remove redundant definition of u16/i16 2022-11-13 16:50:20 -07:00
Andrew Kelley
77e7d97725 C backend: improve ergonomics of zig.h a little bit
Partially implements #13528. Enough to unblock the wasi-bootstrap
branch.
2022-11-13 16:50:16 -07:00
Halil
b2ffe113d3
x/os/Reactor: implement remove function (#13330)
* x/os/Reactor: implement remove function

* x/os/Reactor: update tests
2022-11-13 17:43:29 +02:00
Jonathan
81dadbcd77 pthread_sigmask 2022-11-13 17:36:56 +02:00
dweiller
a1b123bccb std.build: add setter for LibObjExeStep test runner path 2022-11-13 13:52:53 +11:00
Nick Cernis
8a5818535b
Make invalidFmtError public and use in place of compileErrors for bad format strings (#13526)
* Export invalidFmtErr

To allow consistent use of "invalid format string" compile error
response for badly formatted format strings.

See https://github.com/ziglang/zig/pull/13489#issuecomment-1311759340.

* Replace format compile errors with invalidFmtErr

- Provides more consistent compile errors.
- Gives user info about the type of the badly formatted value.

* Rename invalidFmtErr as invalidFmtError

For consistency. Zig seems to use “Error” more often than “Err”.

* std: add invalid format string checks to remaining custom formatters

* pass reference-trace to comp when building build file; fix checkobjectstep
2022-11-12 21:03:24 +02:00
IntegratedQuantum
fbc4331f18 Implements std.math.sign for float vectors. 2022-11-12 15:41:55 +02:00
Jakub Konka
7733246d6e pdb: make SuperBlock def public 2022-11-12 09:40:40 +01:00
Frank Denis
df7223c7f2 crypto.AesGcm: provision ghash for the final block 2022-11-11 18:04:22 +01:00
Frank Denis
59af6417bb
crypto.ghash: define aggregate thresholds as blocks, not bytes (#13507)
These constants were read as a block count in initForBlockCount()
but at the same time, as a size in update().

The unit could be blocks or bytes, but we should use the same one
everywhere.

So, use blocks as intended.

Fixes #13506
2022-11-10 19:00:00 +01:00
Veikka Tuominen
41b7e40d75
Merge pull request #13418 from ryanschneider/signal-alignment-13216
std.os: fix alignment of Sigaction.handler_fn
2022-11-09 17:36:40 +02:00
IntegratedQuantum
d1e7be0bd1
Handle sentinel slices in std.mem.zeroes
Fixes #13256
2022-11-09 17:33:48 +02:00
bfredl
95f989a05b Fixes to linux/bpf/btf.zig
- the meaning of packed structs changed in zig 0.10. adjust accordingly.
  Use "extern struct" for the cases that directly map to C structs.

- Add new type info kinds, like enum64 and DeclTag

- change the Type enum to use the canonical names from libbpf.
  This is more predictable when comparing with external BPF
  documentation (than invented synonyms that need to be guessed)
2022-11-09 17:14:22 +02:00
Frank Denis
36e618aef1 crypto.ghash: compatibility with stage1
Defining the selector enum outside the function definition is
required for stage1.
2022-11-08 16:59:53 +01:00
Frank Denis
7d48cb1138
std.crypto: make ghash faster, esp. for small messages (#13464)
* std.crypto: make ghash faster, esp. for small messages

Aggregated reduction requires 5 additional multiplications (to
precompute the powers of H), in order to save 2 multiplications
per batch.

So, only use large batches when it's actually interesting to do so.

For the last blocks, reuse the precomputations in order to perform
a single reduction.

Also, even in .ReleaseSmall, allow 2-block aggregation.
The speedup is worth it, and the code increase is reasonable.

And in .ReleaseFast, bump the upper batch size up to 16.

Leverage comptime by the way instead of duplicating code.

std/crypto/benchmark.zig on Apple M1:

    Zig 0.10.0: 2769 MiB/s
        Before: 6014 MiB/s
         After: 7334 MiB/s

Normalize function names by the way.

* Change clmul() to accept the half to be processed

This avoids a bunch of truncate() calls.

* Add more ghash tests to check all code paths
2022-11-07 21:45:29 +01:00
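
The aggregated reduction mentioned in the ghash commit above rests on one identity: instead of reducing after every block (Horner's rule), a batch of blocks can each be multiplied by a precomputed power of H, XORed together unreduced, and reduced once. The sketch below checks that identity in Python. It is a simplified model, not the Zig implementation: it works in GF(2^128) with the polynomial x^128 + x^7 + x^2 + x + 1 in plain (non-reflected) bit order, so its outputs are not real GHASH tags, and every name in it is hypothetical.

```python
# Reduction polynomial x^128 + x^7 + x^2 + x + 1, plain bit order.
POLY = (1 << 128) | 0x87


def clmul(a: int, b: int) -> int:
    """Carry-less product of two non-negative integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def reduce(v: int) -> int:
    """Reduce a polynomial modulo POLY down to at most 128 bits."""
    while v.bit_length() > 128:
        v ^= POLY << (v.bit_length() - 129)
    return v


def gf_mul(a: int, b: int) -> int:
    return reduce(clmul(a, b))


def hash_horner(h: int, blocks: list) -> int:
    """One multiplication and one reduction per block."""
    y = 0
    for x in blocks:
        y = gf_mul(y ^ x, h)
    return y


def hash_aggregated(h: int, blocks: list, batch: int) -> int:
    """Same result, but with a single reduction per batch of blocks."""
    # powers[k] holds H^(k+1); this is the precomputation cost.
    powers = [h]
    for _ in range(batch - 1):
        powers.append(gf_mul(powers[-1], h))
    y = 0
    for start in range(0, len(blocks), batch):
        chunk = blocks[start:start + batch]
        n = len(chunk)
        # The first block absorbs the running state; products stay unreduced.
        acc = clmul(y ^ chunk[0], powers[n - 1])
        for i in range(1, n):
            acc ^= clmul(chunk[i], powers[n - 1 - i])
        y = reduce(acc)  # reduction is linear, so once per batch suffices
    return y


if __name__ == "__main__":
    h = 0x66E94BD4EF8A2C3B884CFA59CA342B2E
    blocks = [(i * 0x9E3779B97F4A7C15F39CC0605CEDC835) % (1 << 128)
              for i in range(1, 12)]
    for batch in (1, 2, 4, 8, 16):
        assert hash_aggregated(h, blocks, batch) == hash_horner(h, blocks)
    print("aggregated and Horner evaluation agree")
```

Because the powers of H must be computed up front, batching only helps when the message is long enough to amortize that cost, which matches the commit's note about using large batches only when it is worthwhile.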
Frank Denis
32563e6829
crypto.core.aes: process 6 blocks in parallel instead of 8 on aarch64 (#13473)
* crypto.core.aes: process 6 blocks in parallel instead of 8 on aarch64

At least on Apple Silicon, this is slightly faster than 8 blocks.

* AES: add parallel blocks for tigerlake, rocketlake, alderlake, zen3
2022-11-07 12:28:37 +01:00