Commit Graph

31 Commits

Author SHA1 Message Date
Daniel Stenberg 04488851e2
urlapi: make sure no +/- signs are accepted in IPv4 numericals
Follow-up to 56a037cc0a. Extends test 1560 to verify.

Reported-by: Tuomas Siipola
Fixes #6916
Closes #6917
2021-04-21 09:17:55 +02:00
Daniel Stenberg 56a037cc0a
urlapi: "normalize" numerical IPv4 host names
When the host name in a URL is given as an IPv4 numerical address, the
address can be specified with dotted numericals in four different ways:
a32, a.b24, a.b.c16 or a.b.c.d and each part can be specified in
decimal, octal (0-prefixed) or hexadecimal (0x-prefixed).

Instead of passing on the name as-is and leaving the handling to the
underlying name functions, which made them not work with c-ares but work
with getaddrinfo, this change now makes the curl URL API itself detect
and "normalize" host names specified as IPv4 numericals.

The WHATWG URL Spec says this is an okay way to specify a host name in a
URL. RFC 3896 does not allow them, but curl didn't prevent them before
and it seems other RFC 3896-using tools have not either. Host names used
like this are widely supported by other tools as well due to the
handling being done by getaddrinfo and friends.

I decided to add the functionality into the URL API itself so that all
users of these functions get the benefits, when for example wanting to
compare two URLs. Also, it makes curl built to use c-ares now support
them as well and make curl builds more consistent.

The normalization makes HTTPS and virtual hosted HTTP work fine even
when curl gets the address specified using one of the "obscure" formats.

Test 1560 is extended to verify.

Fixes #6863
Closes #6871
2021-04-19 08:34:55 +02:00
Daniel Stenberg 4d2f800677
curl.se: new home
Closes #6172
2020-11-04 23:59:47 +01:00
Daniel Stenberg b7ea3d2c22
urlapi: URL encode a '+' in the query part
... when asked to with CURLU_URLENCODE.

Extended test 1560 to verify.
Reported-by: Dietmar Hauser
Fixes #6086
Closes #6087
2020-10-15 23:21:53 +02:00
Daniel Stenberg 17fcdf6a31
lib: fix -Wassign-enum warnings
configure --enable-debug now enables -Wassign-enum with clang,
identifying several enum "abuses" also fixed.

Reported-by: Gisle Vanem
Bug: 879007f811 (commitcomment-42087553)

Closes #5929
2020-09-08 13:53:02 +02:00
Daniel Stenberg 259a81555d
lib1560: verify "redirect" to double-slash leading URL
Closes #5849
2020-08-25 13:06:34 +02:00
Martin V b71628b633
test1560: avoid possibly negative association in wording
Closes #5549
2020-06-12 10:01:57 +02:00
Daniel Stenberg 7f1c098728
urlapi: accept :: as a valid IPv6 address
Text 1560 is extended to verify.

Reported-by: Pavel Volgarev
Fixes #5344
Closes #5351
2020-05-08 08:47:29 +02:00
Patrick Monnerat a75f12768d
test 1560: avoid valgrind false positives
When using maximum code optimization level (-O3), valgrind wrongly
detects uses of uninitialized values in strcmp().

Preset buffers with all zeroes to avoid that.
2020-03-08 17:30:55 +01:00
Daniel Stenberg d3dc0a07e9
urlapi: guess scheme correct even with credentials given
In the "scheme-less" parsing case, we need to strip off credentials
first before we guess scheme based on the host name!

Assisted-by: Jay Satiro
Fixes #4856
Closes #4857
2020-01-28 08:40:16 +01:00
Daniel Stenberg 6e7733f788
urlapi: question mark within fragment is still fragment
The parser would check for a query part before fragment, which caused it
to do wrong when the fragment contains a question mark.

Extended test 1560 to verify.

Reported-by: Alex Konev
Fixes #4412
Closes #4413
2019-09-24 23:30:43 +02:00
Jens Finkhaeuser 0a4ecbdf1c
urlapi: CURLU_NO_AUTHORITY allows empty authority/host part
CURLU_NO_AUTHORITY is intended for use with unknown schemes (i.e. not
"file:///") to override cURL's default demand that an authority exists.

Closes #4349
2019-09-19 15:57:28 +02:00
Daniel Stenberg eab3c580f9
urlapi: verify the IPv6 numerical address
It needs to parse correctly. Otherwise it could be tricked into letting
through a-f using host names that libcurl would then resolve. Like
'[ab.be]'.

Reported-by: Thomas Vegas
Closes #4315
2019-09-10 11:32:12 +02:00
Marcel Raad e23c52b329
build: fix Codacy warnings
Reduce variable scopes and remove redundant variable stores.

Closes https://github.com/curl/curl/pull/3975
2019-06-05 20:38:06 +02:00
Daniel Stenberg 8b038bcc95
lib1560: add tests for parsing URL with too long scheme
Ref: #3905
2019-05-20 15:27:07 +02:00
Daniel Stenberg 9f9ec7da57
urlapi: require a non-zero host name length when parsing URL
Updated test 1560 to verify.

Closes #3880
2019-05-14 13:39:10 +02:00
Daniel Stenberg 2d0e9b40d3
urlapi: add CURLUPART_ZONEID to set and get
The zoneid can be used with IPv6 numerical addresses.

Updated test 1560 to verify.

Closes #3834
2019-05-05 15:52:46 +02:00
Daniel Stenberg bdb2dbc103
urlapi: strip off scope id from numerical IPv6 addresses
... to make the host name "usable". Store the scope id and put it back
when extracting a URL out of it.

Also makes curl_url_set() syntax check CURLUPART_HOST.

Fixes #3817
Closes #3822
2019-05-03 12:17:22 +02:00
Daniel Stenberg d715d2ac89
urlapi: stricter CURLUPART_PORT parsing
Only allow well formed decimal numbers in the input.

Document that the number MUST be between 1 and 65535.

Add tests to test 1560 to verify the above.

Ref: https://github.com/curl/curl/issues/3753
Closes #3762
2019-04-13 11:17:30 +02:00
Jakub Zakrzewski 89bb5a836c
test: urlapi: urlencode characters above 0x7f correctly 2019-04-07 22:57:42 +02:00
Daniel Stenberg dcd6f81025
snprintf: renamed and we now only use msnprintf()
The function does not return the same value as snprintf() normally does,
so readers may be mislead into thinking the code works differently than
it actually does. A different function name makes this easier to detect.

Reported-by: Tomas Hoger
Assisted-by: Daniel Gustafsson
Fixes #3296
Closes #3297
2018-11-23 08:26:51 +01:00
Daniel Stenberg 9aa8ff2895
urlapi: only skip encoding the first '=' with APPENDQUERY set
APPENDQUERY + URLENCODE would skip all equals signs but now it only skip
encoding the first to better allow "name=content" for any content.

Reported-by: Alexey Melnichuk
Fixes #3231
Closes #3231
2018-11-07 08:28:48 +01:00
Daniel Stenberg 9df8dc101b
url: a short host name + port is not a scheme
The function identifying a leading "scheme" part of the URL considered a
few letters ending with a colon to be a scheme, making something like
"short:80" to become an unknown scheme instead of a short host name and
a port number.

Extended test 1560 to verify.

Also fixed test203 to use file_pwd to make it get the correct path on
windows. Removed test 2070 since it was a duplicate of 203.

Assisted-by: Marcel Raad
Reported-by: Hagai Auro
Fixes #3220
Fixes #3233
Closes #3223
Closes #3235
2018-11-06 19:11:58 +01:00
Daniel Stenberg d9abebc7ee
Revert "url: a short host name + port is not a scheme"
This reverts commit 226cfa8264.

This commit caused test failures on appveyor/windows. Work on fixing them is
in #3235.
2018-11-05 09:24:59 +01:00
Daniel Stenberg 226cfa8264
url: a short host name + port is not a scheme
The function identifying a leading "scheme" part of the URL considered a few
letters ending with a colon to be a scheme, making something like "short:80"
to become an unknown scheme instead of a short host name and a port number.

Extended test 1560 to verify.

Reported-by: Hagai Auro
Fixes #3220
Closes #3223
2018-11-03 15:01:27 +01:00
Daniel Stenberg b28094833a
URL: fix IPv6 numeral address parser
Regression from 46e164069d. Extended test 1560 to verify.

Reported-by: tpaukrt on github
Fixes #3218
Closes #3219
2018-11-03 00:14:04 +01:00
Even Rouault 55b51b8c49
Curl_dedotdotify(): always nul terminate returned string.
This fixes potential out-of-buffer access on "file:./" URL

$ valgrind curl "file:./"
==24516== Memcheck, a memory error detector
==24516== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==24516== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==24516== Command: /home/even/install-curl-git/bin/curl file:./
==24516==
==24516== Conditional jump or move depends on uninitialised value(s)
==24516==    at 0x4C31F9C: strcmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24516==    by 0x4EBB315: seturl (urlapi.c:801)
==24516==    by 0x4EBB568: parseurl (urlapi.c:861)
==24516==    by 0x4EBC509: curl_url_set (urlapi.c:1199)
==24516==    by 0x4E644C6: parseurlandfillconn (url.c:2044)
==24516==    by 0x4E67AEF: create_conn (url.c:3613)
==24516==    by 0x4E68A4F: Curl_connect (url.c:4119)
==24516==    by 0x4E7F0A4: multi_runsingle (multi.c:1440)
==24516==    by 0x4E808E5: curl_multi_perform (multi.c:2173)
==24516==    by 0x4E7558C: easy_transfer (easy.c:686)
==24516==    by 0x4E75801: easy_perform (easy.c:779)
==24516==    by 0x4E75868: curl_easy_perform (easy.c:798)

Was originally spotted by
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=10637
Credit to OSS-Fuzz

Closes #3039
2018-09-24 07:48:41 +02:00
Daniel Stenberg 2097cd5152
urlapi: fix support for address scope in IPv6 numerical addresses
Closes #3024
2018-09-21 11:19:14 +02:00
Daniel Stenberg 5c73093edb
urlapi: document the error codes, remove two unused ones
Assisted-by: Daniel Gustafsson
Closes #3019
2018-09-19 23:25:11 +02:00
Daniel Stenberg 9307c219ad
urlapi: add CURLU_GUESS_SCHEME and fix hostname acceptance
In order for this API to fully work for libcurl itself, it now offers a
CURLU_GUESS_SCHEME flag that makes it "guess" scheme based on the host
name prefix just like libcurl always did. If there's no known prefix, it
will guess "http://".

Separately, it relaxes the check of the host name so that IDN host names
can be passed in as well.

Both these changes are necessary for libcurl itself to use this API.

Assisted-by: Daniel Gustafsson
Closes #3018
2018-09-19 23:21:52 +02:00
Daniel Stenberg fb30ac5a2d
URL-API
See header file and man pages for API. All documented API details work
and are tested in the 1560 test case.

Closes #2842
2018-09-08 15:36:11 +02:00