Apparently there are sites out there that do redirects to URLs they
provide in plain UTF-8 or similar. Browsers and wget %-encode such
headers when doing a subsequent request. Now libcurl does too.
Added test 1138 to verify.
Closes#473
... and assign it from the set.fread_func_set pointer in the
Curl_init_CONNECT function. This A) avoids that we have code that
assigns fields in the 'set' struct (which we always knew was bad) and
more importantly B) it makes it impossibly to accidentally leave the
wrong value for when the handle is re-used etc.
Introducing a state-init functionality in multi.c, so that we can set a
specific function to get called when we enter a state. The
Curl_init_CONNECT is thus called when switching to the CONNECT state.
Bug: https://github.com/bagder/curl/issues/346Closes#346
Currently, libcurl rejects responses with "Content-Encoding: compress"
when CURLOPT_ACCEPT_ENCODING is set to "". I think that libcurl should
treat the Content-Encoding "compress" the same as other
Content-Encodings that it does not support, e.g. "bzip2". That means
just ignoring it.
With many easy handles using the same connection for multiplexing, it is
important we store and keep the transfer-oriented stuff in the
SessionHandle so that callbacks and callback data work fine even when
many easy handles share the same physical connection.
.. also make __func__ replacement in multi.
Prior to this change debug builds would fail to build if the compiler
was building pre-c99 and didn't support __func__.
The factor of 8 is a bytes-to-bits conversion factor, but pkt_size and
rate_bps are both in bytes. When using the rate limiting option, curl
waits 8 times too long, and then transfers very quickly until the
average rate reaches the limit. The average rate follows the limit over
time, but the actual traffic is bursty.
Thanks-to: Benjamin Gilbert
This header file must be included after all header files except
memdebug.h, as it does similar memory function redefinitions and can be
similarly affected by conflicting definitions in system or dependent
library headers.
If the scratch buffer already existed when the CRLF conversion was
performed then the buffer pointer would be checked twice for NULL. This
second check is only necessary if the call to malloc() was performed by
the first check.
Whilst I had moved the dot stuffing code from being performed before
CRLF conversion takes place to after it, in commit 4bd860a001, I had
moved it outside the 'when something read' block of code when meant
it could perform the dot stuffing twice on partial send if nread
happened to contain the right values. It also meant the function could
potentially read past the end of buffer. This was highlighted by the
following warning:
warning: `nread' might be used uninitialized in this function
The previous condition that checked if the socket was marked as readable
when also adding a writable one, was incorrect and didn't take the pause
bits properly into account.
Basically since servers often then don't respond well to this and
instead send the full contents and then libcurl would instead error out
with the assumption that the server doesn't support resume. As the data
is then already transfered, this is now considered fine.
Test case 1434 added to verify this. Test case 1042 slightly modified.
Reported-by: hugo
Bug: http://curl.haxx.se/bug/view.cgi?id=1443
... for the local variable name in functions holding the return
code. Using the same name universally makes code easier to read and
follow.
Also, unify code for checking for CURLcode errors with:
if(result) or if(!result)
instead of
if(result == CURLE_OK), if(CURLE_OK == result) or if(result != CURLE_OK)
The method change is forbidden by the obsolete RFC2616, but libcurl did
it anyway for compatibility reasons. The new RFC7231 allows this
behaviour so there's no need for the scary "Violate RFC 2616/10.3.x"
notice. Also update the comments accordingly.
Make all code use connclose() and connkeep() when changing the "close
state" for a connection. These two macros take a string argument with an
explanation, and debug builds of curl will include that in the debug
output. Helps tracking connection re-use/close issues.
set.infilesize in this case was modified in several places, which could
lead to repeated requests using the same handle to get unintendent/wrong
consequences based on what the previous request did!
This makes the findprotocol() function work as intended so that libcurl
can properly be restricted to not support HTTP while still supporting
HTTPS - since the HTTPS handler previously set both the HTTP and HTTPS
bits in the protocol field.
This fixes --proto and --proto-redir for most SSL protocols.
This is done by adding a few new convenience defines that groups HTTP
and HTTPS, FTP and FTPS etc that should then be used when the code wants
to check for both protocols at once. PROTO_FAMILY_[protocol] style.
Bug: https://github.com/bagder/curl/pull/97
Reported-by: drizzt
For HTTP/2, we may read up everything including responde body with
header fields in Curl_http_readwrite_headers. If no content-length is
provided, curl waits for the connection close, which we emulate it
using conn->proto.httpc.closed = TRUE. The thing is if we read
everything, then http2_recv won't be called and we cannot signal the
HTTP/2 stream has closed. As a workaround, we return nonzero from
data_pending to call http2_recv.
When using the multi socket interface, libcurl calls the
curl_multi_timer_callback asking to be woken up after
CURL_TIMEOUT_EXPECT_100 milliseconds.
After the timeout has expired, calling curl_multi_socket_action with
CURL_SOCKET_TIMEOUT as sockfd leads libcurl to check expired
timeouts. When handling the 100-continue one, the following check in
Curl_readwrite() fails if exactly CURL_TIMEOUT_EXPECT_100 milliseconds
passed since the timeout has been set!
It seems logical to consider that having waited for exactly
CURL_TIMEOUT_EXPECT_100 ms is enough.
Bug: http://curl.haxx.se/bug/view.cgi?id=1334
With the recently added timeout "reminder" functionality, there's no
reason left for us to execute timeout code before the time is
ripe. Simplifies the handling too.
This will make the *TIMEOUT and *CONNECTTIMEOUT options more accurate
again, which probably is most important when the *_MS versions are used.
In multi_socket, make sure to update 'now' after having handled activity
on a socket.
Following commit 0aafd77fa4, replaced the internal usage of
FORMAT_OFF_T and FORMAT_OFF_TU with the external versions that we
expect API programmers to use.
This negates the need for separate definitions which were subtly
different under different platforms/compilers.
Make sure that we detect such attempts and return a proper error code
instead of silently handling this in problematic ways.
Updated the documentation to mention this limitation.
Bug: http://curl.haxx.se/bug/view.cgi?id=1286
Since all systems have inaccuracy in the timeout handling it is
imperative that we add an inaccuracy margin to the general timeout and
connecttimeout handling with the multi interface. This way, when the
timeout fires we should be fairly sure that it has passed the timeout
value and will be suitably detected.
For cases where the timeout fire before the actual timeout, we would
otherwise consume the timeout action and still not run the timeout code
since the condition wasn't met.
Reported-by: He Qin
Bug: http://curl.haxx.se/bug/view.cgi?id=1298
When waiting for a 100-continue response from the server, the
Curl_readwrite() will refuse to run if called until the timeout has been
reached.
We timeout code in multi_socket() allows code to run slightly before the
actual timeout time, so for test 154 it could lead to the function being
executed but refused in Curl_readwrite() and then the application would
just sit idling forever.
This was detected with runtests.pl -e on test 154.
All protocol handler structs are now opaque (void *) in the
SessionHandle struct and moved in the request-specific sub-struct
'SingleRequest'. The intension is to keep the protocol specific
knowledge in their own dedicated source files [protocol].c etc.
There's some "leakage" where this policy is violated, to be addressed at
a later point in time.
Introducing a number of options to the multi interface that
allows for multiple pipelines to the same host, in order to
optimize the balance between the penalty for opening new
connections and the potential pipelining latency.
Two new options for limiting the number of connections:
CURLMOPT_MAX_HOST_CONNECTIONS - Limits the number of running connections
to the same host. When adding a handle that exceeds this limit,
that handle will be put in a pending state until another handle is
finished, so we can reuse the connection.
CURLMOPT_MAX_TOTAL_CONNECTIONS - Limits the number of connections in total.
When adding a handle that exceeds this limit,
that handle will be put in a pending state until another handle is
finished. The free connection will then be reused, if possible, or
closed if the pending handle can't reuse it.
Several new options for pipelining:
CURLMOPT_MAX_PIPELINE_LENGTH - Limits the pipeling length. If a
pipeline is "full" when a connection is to be reused, a new connection
will be opened if the CURLMOPT_MAX_xxx_CONNECTIONS limits allow it.
If not, the handle will be put in a pending state until a connection is
ready (either free or a pipe got shorter).
CURLMOPT_CONTENT_LENGTH_PENALTY_SIZE - A pipelined connection will not
be reused if it is currently processing a transfer with a content
length that is larger than this.
CURLMOPT_CHUNK_LENGTH_PENALTY_SIZE - A pipelined connection will not
be reused if it is currently processing a chunk larger than this.
CURLMOPT_PIPELINING_SITE_BL - A blacklist of hosts that don't allow
pipelining.
CURLMOPT_PIPELINING_SERVER_BL - A blacklist of server types that don't allow
pipelining.
See the curl_multi_setopt() man page for details.
Remove internal separated behavior of the easy vs multi intercace.
curl_easy_perform() is now using the multi interface itself.
Several minor multi interface quirks and bugs have been fixed in the
process.
Much help with debugging this has been provided by: Yang Tse