Commit Graph

9283 Commits

Author SHA1 Message Date
Igor Anić
fd9db4962c reorganize compress package root folder 2024-02-14 23:34:13 +01:00
Igor Anić
2457b68b2f remove v1 deflate implementation 2024-02-14 22:34:13 +01:00
Igor Anić
e20080be13 preserve valuable tests from v1 implementation
Before removal of v1.
2024-02-14 22:12:54 +01:00
Igor Anić
0afe808928 remove testing struct sizes
It was useful during development.

From andrewrk code review comment:
In fact, Zig does not guarantee the @sizeOf structs, and so these tests are not valid.
2024-02-14 21:06:45 +01:00
Igor Anić
d49cdf5b2d skip calculating struct sizes on 32 bit platforms 2024-02-14 19:58:45 +01:00
Igor Anić
c2361bf548 fix top level docs comments
I didn't understand the difference.

ref: https://ziglang.org/documentation/0.11.0/#Comments
2024-02-14 18:28:20 +01:00
Igor Anić
5fbc371b41 fix wording in comment 2024-02-14 18:28:20 +01:00
Igor Anić
f81b3a2095 fix reading input stream during decompression
Because it used read instead of readAll, the decompression reader could
receive only the bytes available in the stream at that moment and then
later wrongly fail with end of stream.
2024-02-14 18:28:20 +01:00
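The difference between a single read and a read-until-full loop can be sketched in isolation. This is only an illustration, not the code from the fix: `ChunkedSource` and its `chunk` limit are hypothetical stand-ins for a stream that hands out a few bytes at a time, and the sketch assumes a Zig version with the `@min` and two-argument `@memcpy` builtins.

```Zig
const std = @import("std");

/// Hypothetical source that returns at most `chunk` bytes per `read` call,
/// the way a socket or pipe may.
const ChunkedSource = struct {
    data: []const u8,
    pos: usize = 0,
    chunk: usize,

    /// A single read: may return fewer bytes than `dest.len`
    /// even though more data will follow.
    fn read(self: *ChunkedSource, dest: []u8) usize {
        const remaining = self.data.len - self.pos;
        const n = @min(dest.len, @min(remaining, self.chunk));
        @memcpy(dest[0..n], self.data[self.pos..][0..n]);
        self.pos += n;
        return n;
    }

    /// Keep calling `read` until `dest` is full or the source is exhausted.
    fn readAll(self: *ChunkedSource, dest: []u8) usize {
        var filled: usize = 0;
        while (filled < dest.len) {
            const n = self.read(dest[filled..]);
            if (n == 0) break; // genuine end of stream
            filled += n;
        }
        return filled;
    }
};
```

A caller that treats a short `read` result as "the stream is finished" reports end of stream too early; looping as in `readAll` avoids that.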
Igor Anić
d645114f7e add deflate implemented from first principles
Zig deflate compression/decompression implementation. It supports compression and decompression of the gzip, zlib and raw deflate formats.

Fixes #18062.

This PR replaces the current compress/gzip and compress/zlib packages. The deflate package is renamed to flate. Flate is a common name for deflate/inflate, where deflate is compression and inflate is decompression.

There are breaking changes. Method signatures have changed because the allocator was removed, and I also unified the API for all three namespaces (flate, gzip, zlib).

Currently I put the old packages under the v1 namespace; they are still available as compress/v1/gzip, compress/v1/zlib, compress/v1/deflate. The idea is to give users of the current API a little time to postpone analyzing what they have to change. That raises the question of when it is safe to remove the v1 namespace.

Here is current API in the compress package:

```Zig
// deflate
    fn compressor(allocator, writer, options) !Compressor(@TypeOf(writer))
    fn Compressor(comptime WriterType: type) type

    fn decompressor(allocator, reader, null) !Decompressor(@TypeOf(reader))
    fn Decompressor(comptime ReaderType: type) type

// gzip
    fn compress(allocator, writer, options) !Compress(@TypeOf(writer))
    fn Compress(comptime WriterType: type) type

    fn decompress(allocator, reader) !Decompress(@TypeOf(reader))
    fn Decompress(comptime ReaderType: type) type

// zlib
    fn compressStream(allocator, writer, options) !CompressStream(@TypeOf(writer))
    fn CompressStream(comptime WriterType: type) type

    fn decompressStream(allocator, reader) !DecompressStream(@TypeOf(reader))
    fn DecompressStream(comptime ReaderType: type) type

// xz
    fn decompress(allocator: Allocator, reader: anytype) !Decompress(@TypeOf(reader))
    fn Decompress(comptime ReaderType: type) type

// lzma
    fn decompress(allocator, reader) !Decompress(@TypeOf(reader))
    fn Decompress(comptime ReaderType: type) type

// lzma2
    fn decompress(allocator, reader, writer) !void

// zstandard:
    fn DecompressStream(ReaderType, options) type
    fn decompressStream(allocator, reader) DecompressStream(@TypeOf(reader), .{})
    struct decompress
```

The proposed naming convention:
 - Compressor/Decompressor for functions which return a type, like Reader/Writer/GeneralPurposeAllocator
 - compressor/decompressor for functions which are initializers for that type, like reader/writer/allocator
 - compress/decompress for one-shot operations which accept a reader/writer pair, like read/write/alloc

```Zig
/// Compress from reader and write compressed data to the writer.
fn compress(reader: anytype, writer: anytype, options: Options) !void

/// Create Compressor which outputs to the writer.
fn compressor(writer: anytype, options: Options) !Compressor(@TypeOf(writer))

/// Compressor type
fn Compressor(comptime WriterType: type) type

/// Decompress from reader and write plain data to the writer.
fn decompress(reader: anytype, writer: anytype) !void

/// Create Decompressor which reads from reader.
fn decompressor(reader: anytype) Decompressor(@TypeOf(reader))

/// Decompressor type
fn Decompressor(comptime ReaderType: type) type

```

Comparing this implementation with the one currently in Zig's standard library (std):
std is roughly 1.2-1.4 times slower in decompression and 1.1-1.2 times slower in compression. Compressed sizes are pretty much the same in both cases.
More results in [this](https://github.com/ianic/flate) repo.

This library uses static allocations for all structures and doesn't require an allocator. That makes sense especially for deflate, where all structures and internal buffers are allocated to their full size. It makes a little less sense for inflate, where the std version uses less memory by not preallocating arrays to their theoretical maximum size, since they are usually not fully used.

For deflate this library allocates 395K while std 779K.
For inflate this library allocates 74.5K while std around 36K.

Inflate difference is because we here use 64K history instead of 32K in std.

If this is merged, existing usages of compress gzip/zlib/deflate need some changes. Here is an example with the necessary changes in comments:

```Zig

const std = @import("std");

// To get this file:
// wget -nc -O war_and_peace.txt https://www.gutenberg.org/ebooks/2600.txt.utf-8
const data = @embedFile("war_and_peace.txt");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer std.debug.assert(gpa.deinit() == .ok);
    const allocator = gpa.allocator();

    try oldDeflate(allocator);
    try new(std.compress.flate, allocator);

    try oldZlib(allocator);
    try new(std.compress.zlib, allocator);

    try oldGzip(allocator);
    try new(std.compress.gzip, allocator);
}

pub fn new(comptime pkg: type, allocator: std.mem.Allocator) !void {
    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();

    // Compressor
    var cmp = try pkg.compressor(buf.writer(), .{});
    _ = try cmp.write(data);
    try cmp.finish();

    var fbs = std.io.fixedBufferStream(buf.items);
    // Decompressor
    var dcp = pkg.decompressor(fbs.reader());

    const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(plain);
    try std.testing.expectEqualSlices(u8, data, plain);
}

pub fn oldDeflate(allocator: std.mem.Allocator) !void {
    const deflate = std.compress.v1.deflate;

    // Compressor
    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();
    // Remove allocator
    // Rename deflate -> flate
    var cmp = try deflate.compressor(allocator, buf.writer(), .{});
    _ = try cmp.write(data);
    try cmp.close(); // Rename to finish
    cmp.deinit(); // Remove

    // Decompressor
    var fbs = std.io.fixedBufferStream(buf.items);
    // Remove allocator and last param
    // Rename deflate -> flate
    // Remove try
    var dcp = try deflate.decompressor(allocator, fbs.reader(), null);
    defer dcp.deinit(); // Remove

    const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(plain);
    try std.testing.expectEqualSlices(u8, data, plain);
}

pub fn oldZlib(allocator: std.mem.Allocator) !void {
    const zlib = std.compress.v1.zlib;

    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();

    // Compressor
    // Rename compressStream => compressor
    // Remove allocator
    var cmp = try zlib.compressStream(allocator, buf.writer(), .{});
    _ = try cmp.write(data);
    try cmp.finish();
    cmp.deinit(); // Remove

    var fbs = std.io.fixedBufferStream(buf.items);
    // Decompressor
    // decompressStream => decompressor
    // Remove allocator
    // Remove try
    var dcp = try zlib.decompressStream(allocator, fbs.reader());
    defer dcp.deinit(); // Remove

    const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(plain);
    try std.testing.expectEqualSlices(u8, data, plain);
}

pub fn oldGzip(allocator: std.mem.Allocator) !void {
    const gzip = std.compress.v1.gzip;

    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();

    // Compressor
    // Rename compress => compressor
    // Remove allocator
    var cmp = try gzip.compress(allocator, buf.writer(), .{});
    _ = try cmp.write(data);
    try cmp.close(); // Rename to finish
    cmp.deinit(); // Remove

    var fbs = std.io.fixedBufferStream(buf.items);
    // Decompressor
    // Rename decompress => decompressor
    // Remove allocator
    // Remove try
    var dcp = try gzip.decompress(allocator, fbs.reader());
    defer dcp.deinit(); // Remove

    const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(plain);
    try std.testing.expectEqualSlices(u8, data, plain);
}

```
2024-02-14 18:28:20 +01:00
Andrew Kelley
5f92558290 std.posix.termios: bring V back
In d7563a7753, I misunderstood what `cc_t`
was supposed to do. Those V enum values are indices into the array.
2024-02-13 20:10:32 -08:00
Felix Kollmann
8addf53fb5
Add timedWait to std.Thread.Semaphore (#18805)
* Add `timedWait` to `std.Thread.Semaphore`

Add example to documentation of `std.Thread.Semaphore`

* Add unit test for thread semaphore timed wait

Fix missing try

* Change unit test to be simpler

* Change `timedWait()` to keep a deadline

* Change `timedWait()` to return earlier in some scenarios

* Change `timedWait()` to keep a deadline (based on std.Timer)

(similar to std.Thread.Futex)

---------

Co-authored-by: protty <45520026+kprotty@users.noreply.github.com>
2024-02-13 11:51:42 -06:00
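A minimal usage sketch for the new method follows. It assumes a std version that includes this commit and that `timedWait` takes a timeout in nanoseconds and fails with `error.Timeout` when no permit becomes available; `acquireWithin` is a hypothetical helper name, not part of std.

```Zig
const std = @import("std");

/// Hypothetical helper: try to take one permit, giving up after `timeout_ns`.
/// Returns true if a permit was acquired.
fn acquireWithin(sem: *std.Thread.Semaphore, timeout_ns: u64) bool {
    sem.timedWait(timeout_ns) catch return false;
    return true;
}

pub fn main() void {
    var sem = std.Thread.Semaphore{ .permits = 1 };
    const timeout = 10 * std.time.ns_per_ms;

    // One permit is available, so the first attempt succeeds at once.
    std.debug.print("first:  {}\n", .{acquireWithin(&sem, timeout)});
    // No permit is left and nobody posts, so this attempt times out.
    std.debug.print("second: {}\n", .{acquireWithin(&sem, timeout)});
    // After a post, a permit is available again.
    sem.post();
    std.debug.print("third:  {}\n", .{acquireWithin(&sem, timeout)});
}
```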
Andrew Kelley
ce3bd51597 std.os.termios: move it to be with the group 2024-02-12 21:58:37 -07:00
Andrew Kelley
e1ab57337f std.c.speed_t: consolidate common across os 2024-02-12 21:53:54 -07:00
Andrew Kelley
ae107cf71b std.os.speed_t: add type safety
and collect the missing flag bits from all the operating systems.
2024-02-12 21:49:09 -07:00
Andrew Kelley
a280ff2767 std.os.termios: add type safety to lflag field
This creates `tc_lflag_t` even though such a type is not defined by
libc.

I also collected the missing flag bits from all the operating systems.
2024-02-12 21:21:45 -07:00
Andrew Kelley
e97fa8b038 std.os.termios: add type safety to cflag field
This creates `tc_cflag_t` even though such a type is not defined by
libc.

I also collected the missing flag bits from all the operating systems.
2024-02-12 18:24:07 -07:00
Andrew Kelley
20abc0caee std.os.termios: add type safety to oflag field
This creates `tc_oflag_t` even though such a type is not defined by
libc.

I also collected the missing flag bits from all the operating systems.
2024-02-12 17:28:09 -07:00
Andrew Kelley
47643cc5cc std.os.termios: add type safety to iflag field
This creates `tc_iflag_t` even though such a type is not defined by
libc.

I also collected the missing flag bits from all the operating systems.
2024-02-12 16:43:51 -07:00
Andrew Kelley
0c88f927f1 std.os.termios: consolidate and correct 2024-02-12 16:21:21 -07:00
Andrew Kelley
9a64318554 std.c.NCSS: consolidate and correct 2024-02-12 15:52:13 -07:00
Andrew Kelley
9bdf1ebe36 std.c.cc_t: consolidate same OS values 2024-02-12 15:44:28 -07:00
Andrew Kelley
5258c3caad std: add type safety to cc_t 2024-02-12 15:41:38 -07:00
CPestka
0c725a354a Replaced loop with memcpys 2024-02-12 12:58:33 -08:00
Andrew Kelley
fad5e7a997
Merge pull request #18898 from psnszsn/iouring_waitid
io_uring: add waitid operation
2024-02-12 12:20:12 -08:00
Andrew Kelley
0c1b9992fd
Merge pull request #18821 from jacobly0/x86_64-tests
x86_64: pass more tests
2024-02-12 12:18:03 -08:00
Andrew Kelley
f995c1b08a std.c.O: fix illumos regression
introduced in c3eb592a34
2024-02-12 01:06:27 -07:00
Jacob Young
e27db373ec x86_64: implement @clz and @ctz of big integers 2024-02-12 05:25:07 +01:00
Jacob Young
d894727873 x86_64: implement @byteSwap of big integers 2024-02-12 05:25:07 +01:00
Jacob Young
271505cfc8 x86_64: fix compiler_rt tests 2024-02-12 05:25:07 +01:00
Jacob Young
bcbd49b2a6 x86_64: implement shifts of big integers 2024-02-12 05:25:07 +01:00
Jacob Young
9023ff04d0 x86_64: fix register clobber 2024-02-12 05:25:07 +01:00
Jacob Young
a9f738e56b x86_64: implement c abi for bool vectors 2024-02-12 05:25:07 +01:00
Jacob Young
7c9a96111c x86_64: fix assert location 2024-02-12 05:25:07 +01:00
Jacob Young
6235762c09 x86_64: implement mul, div, and mod of large integers
This enables the last compiler-rt test disabled for the x86_64 backend.
2024-02-12 05:25:07 +01:00
Andrew Kelley
7680c5330c some API work on std.c, std.os, std.os.wasi
* std.c: consolidate some definitions, making them share code. For
  example, freebsd, dragonfly, and openbsd can all share the same
  `pthread_mutex_t` definition.
* add type safety to std.c.O
  - this caught a bug where mode flags were incorrectly passed as the
    open flags.
* 3 fewer uses of usingnamespace keyword
* as per convention, remove purposeless field prefixes from struct field
  names even if they have those prefixes in the corresponding C code.
* fix incorrect wasi libc Stat definition
* remove C definitions from incorrectly being in std.os.wasi
* make std.os.wasi definitions type safe
* go through wasi native APIs even when linking libc because the libc
  APIs are problematic and wasteful
* don't expose WASI definitions in std.posix
* remove std.os.wasi.rights_t.ALL: this is a footgun. should it be all
  future rights too? or only all current rights known? both are
  the wrong answer.
2024-02-11 13:38:55 -07:00
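The kind of type safety described for the open flags can be illustrated with a packed struct. This is a sketch only: `OpenFlags`, its field names, its bit layout and `rawFlags` are hypothetical and are not the actual `std.c.O` definition for any operating system.

```Zig
const std = @import("std");

/// Hypothetical open-flags type laid out as a 32-bit packed struct.
/// The first field occupies the least significant bit.
const OpenFlags = packed struct(u32) {
    write_only: bool = false,
    read_write: bool = false,
    _pad0: u4 = 0,
    create: bool = false,
    exclusive: bool = false,
    _pad1: u24 = 0,
};

/// Convert the typed flags to the raw integer a syscall would expect.
fn rawFlags(flags: OpenFlags) u32 {
    return @bitCast(flags);
}

pub fn main() void {
    const flags: OpenFlags = .{ .read_write = true, .create = true };
    std.debug.print("raw flags: 0o{o}\n", .{rawFlags(flags)});
    // Passing a plain integer such as a file mode where OpenFlags is
    // expected does not compile, which is how this class of bug is caught:
    // _ = rawFlags(0o644); // compile error
}
```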
Vlad Pănăzan
d2789908ed io_uring: add waitid operation
This is the equivalent of a waitid(2) syscall and can be used to
be notified about child process state changes.

Available since kernel 6.7
2024-02-11 15:47:03 +01:00
Vlad Pănăzan
20ea0012f0 linux: add missing io_uring opcodes 2024-02-11 15:43:12 +01:00
Jakub Konka
d12c8db642
Merge pull request #18875 from ziglang/macho-zo-dwarf
macho: emit DWARF for ZigObject relocatable
2024-02-09 23:12:04 +01:00
Prokop Randacek
6fb23542fe Buffer the logging function
The default logging function used to have no buffer. So a single log
statement could result in many individual write syscalls each writing
only a couple of bytes.

After this change the logging function now has a 4kb buffer. Only log
statements longer than 4kb now do multiple write syscalls.

4kb is the default bufferedWriter size and was chosen arbitrarily.

The downside of this is that the log function now allocates 4kb more
stack space but I think that is an acceptable trade-off.
2024-02-09 14:02:57 -08:00
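The effect of the buffer on the number of writes can be sketched without any I/O. This is an illustration under stated assumptions, not the std logging code: `CountingSink` is a hypothetical stand-in for the file descriptor that counts calls instead of making write syscalls, and `Buffered` is a hand-written 4096-byte buffer rather than std's buffered writer.

```Zig
const std = @import("std");

/// Hypothetical sink: each call stands for one write syscall.
const CountingSink = struct {
    writes: usize = 0,
    bytes: usize = 0,

    fn write(self: *CountingSink, data: []const u8) void {
        self.writes += 1;
        self.bytes += data.len;
    }
};

/// Hypothetical fixed-size buffer in front of the sink.
const Buffered = struct {
    sink: *CountingSink,
    buf: [4096]u8 = undefined,
    end: usize = 0,

    /// Copy data into the buffer, flushing only when it is full.
    fn write(self: *Buffered, data: []const u8) void {
        var rest = data;
        while (rest.len > 0) {
            if (self.end == self.buf.len) self.flush();
            const n = @min(rest.len, self.buf.len - self.end);
            @memcpy(self.buf[self.end..][0..n], rest[0..n]);
            self.end += n;
            rest = rest[n..];
        }
    }

    /// Hand everything buffered so far to the sink in one call.
    fn flush(self: *Buffered) void {
        if (self.end == 0) return;
        self.sink.write(self.buf[0..self.end]);
        self.end = 0;
    }
};
```

A log statement assembled from several small pieces reaches the sink as one write after `flush`, while a statement longer than the buffer needs more than one.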
Andrew Kelley
32f30399e5
Merge pull request #18867 from e4m2/random
std.rand: Move to std.Random
2024-02-09 13:42:04 -08:00
Andrew Kelley
54bbc73f85
Merge pull request #18712 from Vexu/std.options
std: make options a struct instance instead of a namespace
2024-02-09 13:38:42 -08:00
Jakub Konka
32386a06ca builtin: enable panic handler on self-hosted macho
comp: toggle compiler-rt and zig-libc caps for macho
2024-02-08 23:51:21 +01:00
e4m2
60639ec83d Fixup filename casing 2024-02-08 15:39:28 +01:00
e4m2
8d56e472c9 Replace std.rand references with std.Random 2024-02-08 15:21:35 +01:00
e4m2
9af077d71e std.rand: Move to std.Random 2024-02-08 14:43:20 +01:00
Jacob Young
919a3bae1c http: protect against zero-length chunks
A zero-length chunk marks the end of the body, so prevent any from
possibly occurring in the middle of the body.
2024-02-08 01:29:49 -08:00
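In HTTP chunked transfer coding each chunk is written as a hexadecimal length, CRLF, the payload and another CRLF, and a chunk of length zero terminates the body. The guard described above can be sketched as follows. `writeChunk` is a hypothetical helper that formats into a caller-provided buffer; it is not the std.http implementation.

```Zig
const std = @import("std");

/// Hypothetical helper: encode `payload` as one HTTP chunk into `out`.
/// An empty payload produces no output at all, because a zero-length chunk
/// in the middle of the body would be read as the end of the body.
fn writeChunk(out: []u8, payload: []const u8) ![]u8 {
    if (payload.len == 0) return out[0..0];

    // Chunk header: payload length in hexadecimal followed by CRLF.
    const head = try std.fmt.bufPrint(out, "{x}\r\n", .{payload.len});
    const total = head.len + payload.len + 2;
    if (total > out.len) return error.NoSpaceLeft;

    // Payload followed by the trailing CRLF.
    @memcpy(out[head.len..][0..payload.len], payload);
    @memcpy(out[head.len + payload.len ..][0..2], "\r\n");
    return out[0..total];
}
```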
Andrew Kelley
3122fd0ba0
Merge pull request #17634 from ianprime0509/type-erased-writer
Add type-erased writer and GenericWriter
2024-02-07 23:52:53 -08:00
Andrew Kelley
ba8375328c
Merge pull request #18718 from schmee/bounds
Add upperBound, lowerBound, and equalRange
2024-02-07 18:48:41 -08:00
Andrew Kelley
42fcca49c5
Merge pull request #18846 from ziglang/std.os.linux.MAP
std.os.MAP: use a packed struct
2024-02-07 13:55:03 -08:00
John Schmidt
e487b576fa Changes to lowerBound/upperBound/equalRange
The old definitions had some problems:

- In `lowerBound`, the `lhs` (left hand side) argument was passed on the
  right hand side.
- In `upperBound`, the `greaterThan` function needed to return
  `greaterThanOrEqual` for the function to work, so either the name or the
  implementation is incorrect.

To fix both problems, define the functions in terms of a `lessThan` function that returns `lhs < rhs`.
This is more consistent with the rest of `sort.zig` and it's also how C++ implements lower/upperBound (1)(2).

(1) https://en.cppreference.com/w/cpp/algorithm/lower_bound
(2) https://en.cppreference.com/w/cpp/algorithm/upper_bound

- Rewrite doc comments.
- Add a couple more test cases.
- Add docstring for std.sort.binarySearch
2024-02-07 21:00:24 +01:00
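How both bounds follow from a single `lessThan` that returns `lhs < rhs` can be sketched with a plain binary search. The functions below are hypothetical stand-alone versions with their own names and parameter order; they are not the `std.sort` signatures.

```Zig
const std = @import("std");

/// Index of the first element that is not less than `key`.
fn lowerBoundSketch(
    comptime T: type,
    key: T,
    items: []const T,
    comptime lessThan: fn (T, T) bool,
) usize {
    var left: usize = 0;
    var right: usize = items.len;
    while (left < right) {
        const mid = left + (right - left) / 2;
        // items[mid] < key: the answer lies strictly to the right of mid.
        if (lessThan(items[mid], key)) left = mid + 1 else right = mid;
    }
    return left;
}

/// Index of the first element that is greater than `key`.
fn upperBoundSketch(
    comptime T: type,
    key: T,
    items: []const T,
    comptime lessThan: fn (T, T) bool,
) usize {
    var left: usize = 0;
    var right: usize = items.len;
    while (left < right) {
        const mid = left + (right - left) / 2;
        // !(key < items[mid]) means items[mid] <= key, so keep going right.
        if (!lessThan(key, items[mid])) left = mid + 1 else right = mid;
    }
    return left;
}

fn lessThanU32(lhs: u32, rhs: u32) bool {
    return lhs < rhs;
}
```

The only difference between the two functions is which side of `lessThan` the key is passed on, so no separate `greaterThan` or `greaterThanOrEqual` predicate is needed.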