libfetch: don't include fragments in HTTP requests

Summary:
Fragments are reserved for client-side processing, see
https://www.rfc-editor.org/rfc/rfc9110.html#section-7.1

Also, some servers don't like to receive HTTP requests with fragments.

```
$ fetch 'https://dropbox.com/a/b'
fetch: https://dropbox.com/a/b: Not Found

$ fetch 'https://dropbox.com/a/b#'
fetch: https://dropbox.com/a/b#: Bad Request
```

This happens in practice: a Dropbox download link eventually redirects
to a URL with a fragment:

```
$ fetch -v 'https://www.dropbox.com/sh/<some>/<thing>?dl=1' 2>&1 | grep requesting
requesting https://www.dropbox.com/sh/<some>/<thing>?dl=1
requesting https://www.dropbox.com/scl/fo/<foo>/<bar>?rlkey=<baz>&dl=1
requesting https://<boo>.dl.dropboxusercontent.com/zip_download_get/<some-long-string>#
```

Note that the last redirect ends with a `#`.

Currently, libfetch includes the trailing fragment in the request, which
makes it impossible to download the file.

Differential Revision:	https://reviews.freebsd.org/D46318
MFC after:		2 weeks
Pietro Cerutti 2024-08-21 12:35:27 +00:00
parent e7f9171b67
commit 1af7d5f389

@@ -447,7 +447,10 @@ nohost:
 			goto ouch;
 		}
 		u->doc = doc;
-		while (*p != '\0') {
+		/* fragments are reserved for client-side processing, see
+		 * https://www.rfc-editor.org/rfc/rfc9110.html#section-7.1
+		 */
+		while (*p != '\0' && *p != '#') {
 			if (!isspace((unsigned char)*p)) {
 				*doc++ = *p++;
 			} else {
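
The effect of the new loop condition can be illustrated in isolation. The following is a minimal sketch, not the libfetch code: `copy_doc_without_fragment` is a hypothetical helper invented for this example, and it leaves out the whitespace handling that the real loop performs in its `else` branch. It only shows that copying stops at the first `#`, so the fragment never reaches the request.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of libfetch): copy the document part of a
 * URL from src into dst, stopping at the end of the string or at the first
 * '#', whichever comes first. At most dstlen - 1 bytes are copied and dst
 * is always NUL-terminated when dstlen > 0. Returns the number of bytes
 * copied, not counting the terminator.
 */
static size_t
copy_doc_without_fragment(const char *src, char *dst, size_t dstlen)
{
	size_t n = 0;

	if (dstlen == 0)
		return (0);
	/* Same stop condition as the patched loop: end of string or '#'. */
	while (*src != '\0' && *src != '#' && n + 1 < dstlen)
		dst[n++] = *src++;
	dst[n] = '\0';
	return (n);
}
```

With this sketch, an input of `/a/b#` leaves `/a/b` in the buffer, and an input without a `#` is copied unchanged.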