Howard Chu brought the bulk work of this patch that properly
moves out the sending and recving of data to the parts of the
code that are properly responsible for the various ways of doing
so.
Daniel Stenberg assisted with polishing a few bits and fixed some
minor flaws in the original patch.
Another upside of this patch is that we now abuse CURLcodes less
with the "magic" -1 return codes and instead use CURLE_AGAIN more
consistently.
Bob Richmond: There's an annoying situation where libcurl will
read new HTTP response data from a socket, then check if it's a
timeout if one is set. If the last packet received constitutes
the end of the response body, libcurl still treats it as a
timeout condition and reports a message like:
"Operation timed out after 3000 milliseconds with 876 out of 876
bytes received"
It should only a timeout if the timer lapsed and we DIDN'T
receive the end of the response body yet.
makes sure that when using sub-second timeouts, there's no final bad 1000ms
wait. Previously, a sub-second timeout would often make the elapsed time end
up the time rounded up to the nearest second (e.g. 1s for 200ms timeout)
sequences in uploaded data. The test server doesn't "decode" escaped dot-lines
but instead test cases must be written to take them into account. Added test
case 803 to verify dot-escaping.
the define CURL_MAX_HTTP_HEADER which is even exposed in the public header
file to allow for users to fairly easy rebuild libcurl with a modified
limit. The rationale for a fixed limit is that libcurl is realloc()ing a
buffer to be able to put a full header into it, so that it can call the
header callback with the entire header, but that also risk getting it into
trouble if a server by mistake or willingly sends a header that is more or
less without an end. The limit is set to 100K.
transfer.c for blocking. It is currently used only by SCP and SFTP protocols.
This enhancement resolves an issue with 100% CPU usage during SFTP upload,
reported by Vourhey.
strdup() that could lead to segfault if it returned NULL. I extended his
suggest patch to now have Curl_retry_request() return a regular return code
and better check that.
Fix SIGSEGV on free'd easy_conn when pipe unexpectedly breaks
Fix data corruption issue with re-connected transfers
Fix use after free if we're completed but easy_conn not NULL
With the curl memory tracking feature decoupled from the debug build feature,
CURLDEBUG and DEBUGBUILD preprocessor symbol definitions are used as follows:
CURLDEBUG used for curl debug memory tracking specific code (--enable-curldebug)
DEBUGBUILD used for debug enabled specific code (--enable-debug)
KEEP_RECV to better match the general terminology: receive and send is what we
do from the (remote) servers. We read and write from and to the local fs.
libcurl did a superfluous 1000ms wait when doing SFTP downloads!
We read data with libssh2 while doing the "DO" operation for SFTP and then
when we were about to start getting data for the actual file part, the
"TRANSFER" part, we waited for socket action (in 1000ms) before doing a
libssh2-read. But in this case libssh2 had already read and buffered the
data so we ended up always just waiting 1000ms before we get working on the
data!
the condition in the previous request was unmet. This is typically a time
condition set with CURLOPT_TIMECONDITION and was previously not possible to
reliably figure out. From bug report #2565128
(http://curl.haxx.se/bug/view.cgi?id=2565128)
now has an improved ability to do right when the multi interface (both
"regular" and multi_socket) is used for SCP and SFTP transfers. This should
result in (much) less busy-loop situations and thus less CPU usage with no
speed loss.
(http://curl.haxx.se/bug/view.cgi?id=2155496) pointing out an error case
without a proper human-readable error message. When a read callback returns
a too large value (like when trying to return a negative number) it would
trigger and the generic error message then makes the proplem slightly
different to track down. I've added an error message for this now.
fixed a CURLINFO_REDIRECT_URL memory leak and an additional wrong-doing:
Any subsequent transfer with a redirect leaks memory, eventually crashing
the process potentially.
Any subsequent transfer WITHOUT a redirect causes the most recent redirect
that DID occur on some previous transfer to still be reported.
a fresh connection to be made in such cases and the request retransmitted.
This should fix test case 160. Added test case 1079 in an attempt to
test a similar connection dropping scenario, but as a race condition, it's
hard to test reliably.
CURLOPT_POST301 (but adds a define for backwards compatibility for you who
don't define CURL_NO_OLDIES). This option allows you to now also change the
libcurl behavior for a HTTP response 302 after a POST to not use GET in the
subsequent request (when CURLOPT_FOLLOWLOCATION is enabled). I edited the
patch somewhat before commit. The curl tool got a matching --post302
option. Test case 1076 was added to verify this.
"Connection: close" and actually close the connection after the
response-body, libcurl could still have outstanding data to send and it
would not properly notice this and stop sending. This caused weirdness and
sad faces. http://curl.haxx.se/bug/view.cgi?id=2080222
Note that there are still reasons to consider libcurl's behavior when
getting a >= 400 response code while sending data, as Craig Perras' note
"http upload: how to stop on error" specifies:
http://curl.haxx.se/mail/archive-2008-08/0138.html
remain in use as internal curl_off_t print formatting strings for the internal
*printf functions which still cannot handle print formatting string directives
such as "I64d", "I64u", and others available on MSVC, MinGW, Intel's ICC, and
other DOS/Windows compilers.
This reverts previous commit part which did:
FORMAT_OFF_T -> CURL_FORMAT_CURL_OFF_T
FORMAT_OFF_TU -> CURL_FORMAT_CURL_OFF_TU
the names of the curl_off_t formatting string directives now become
CURL_FORMAT_CURL_OFF_T and CURL_FORMAT_CURL_OFF_TU.
CURL_FMT_OFF_T -> CURL_FORMAT_CURL_OFF_T
CURL_FMT_OFF_TU -> CURL_FORMAT_CURL_OFF_TU
Remove the use of an internal name for the curl_off_t formatting string directives
and use the common one available from the inside and outside of the library.
FORMAT_OFF_T -> CURL_FORMAT_CURL_OFF_T
FORMAT_OFF_TU -> CURL_FORMAT_CURL_OFF_TU
proved how PUT and POST with a redirect could lead to a "hang" due to the
data stream not being rewound properly when it had to in order to get sent
properly (again) to the subsequent URL. This is now fixed and these test
cases are no longer disabled.
overrun" (http://curl.haxx.se/bug/view.cgi?id=2026240) identifying two
problems, and providing the fix for them:
- CURL_READFUNC_PAUSE did in fact not pause the _sending_ of data that it is
designed for but paused _receiving_ of data!
- libcurl didn't internally set the read counter to zero when this return
code was detected, which would potentially lead to junk getting sent to
the server.
how the HTTP redirect following code didn't properly follow to a new URL if
the new url was but a query string such as "Location: ?moo=foo". Test case
1031 was added to verify this fix.
when using CURL_AUTH_ANY" (http://curl.haxx.se/bug/view.cgi?id=1945240).
The problem was that when libcurl rewound a stream meant for upload when it
would prepare for a second request, it could accidentally continue the
sending of the rewound data on the first request instead of on the second.
Ben also provided test case 1030 that verifies this fix.
redirections and thus cannot use CURLOPT_FOLLOWLOCATION easily, we now
introduce the new CURLINFO_REDIRECT_URL option that lets applications
extract the URL libcurl would've redirected to if it had been told to. This
then enables the application to continue to that URL as it thinks is
suitable, without having to re-implement the magic of creating the new URL
from the Location: header etc. Test 1029 verifies it.
such as the CURLOPT_SSL_CTX_FUNCTION one treat that as if it was a Location:
following. The patch that introduced this feature was done for 7.11.0, but
this code and functionality has been broken since about 7.15.4 (March 2006)
with the introduction of non-blocking OpenSSL "connects".
It was a hack to begin with and since it doesn't work and hasn't worked
correctly for a long time and nobody has even noticed, I consider it a very
suitable subject for plain removal. And so it was done.
the SingleRequest one to make pipelining better. It is a bit tricky to keep
them in the right place, to keep things related to the actual request or to
the actual connection in the right place.
previously had a number of flaws, perhaps most notably when an application
fired up N transfers at once as then they wouldn't pipeline at all that
nicely as anyone would think... Test case 530 was also updated to take the
improved functionality into account.
libcurl to seek in a given input stream. This is particularly important when
doing upload resumes when there's already a huge part of the file present
remotely. Before, and still if this callback isn't used, libcurl will read
and through away the entire file up to the point to where the resuming
begins (which of course can be a slow opereration depending on file size,
I/O bandwidth and more). This new function will also be preferred to get
used instead of the CURLOPT_IOCTLFUNCTION for seeking back in a stream when
doing multi-stage HTTP auth with POST/PUT.
is inited at the start of the DO action. I removed the Curl_transfer_keeper
struct completely, and I had to move out a few struct members (that had to
be set before DO or used after DONE) to the UrlState struct. The SingleRequest
struct is accessed with SessionHandle->req.
One of the biggest reasons for doing this was the bunch of duplicate struct
members in HandleData and Curl_transfer_keeper since it was really messy to
keep track of two variables with the same name and basically the same purpose!
do_init() and do_complete() which now are called first and last in the DO
function. It simplified the flow in multi.c and the functions got more
sensible names!
for the cases where there's nothing to do in here, like for SFTP directory
listings that already is complete when this function gets called. The init
stuff clears byte counters which isn't really desired.
curl_easy_setopt() that alters how libcurl functions when following
redirects. It makes libcurl obey the RFC2616 when a 301 response is received
after a non-GET request is made. Default libcurl behaviour is to change
method to GET in the subsequent request (like it does for response code 302
- because that's what many/most browsers do), but with this CURLOPT_POST301
option enabled it will do what the spec says and do the next request using
the same method again. I.e keep POST after 301.
The curl tool got this option as --post301
Test case 1011 and 1012 were added to verify.
and allow reuse by multiple protocols. Several unused error codes were
removed. In all cases, macros were added to preserve source (and binary)
compatibility with the old names. These macros are subject to removal at
a future date, but probably not before 2009. An application can be
tested to see if it is using any obsolete code by compiling it with the
CURL_NO_OLDIES macro defined.
Documented some newer error codes in libcurl-error(3)
passed to it with curl_easy_setopt()! Previously it has always just refered
to the data, forcing the user to keep the data around until libcurl is done
with it. That is now history and libcurl will instead clone the given
strings and keep private copies.
chunked encoding (that also lacks "Connection: close"). It now simply
assumes that the connection WILL be closed to signal the end, as that is how
RFC2616 section 4.4 point #5 says we should behave.
bug report #1715394 (http://curl.haxx.se/bug/view.cgi?id=1715394), and the
transfer-related info "variables" were indeed overwritten with zeroes wrongly
and have now been adjusted. The upload size still isn't accurate.
when CURLOPT_HTTP200ALIASES is used to avoid the problem mentioned below is
not very nice if the client wants to be able to use _either_ a HTTP 1.1
server or one within the aliases list... so starting now, libcurl will
simply consider 200-alias matches the to be HTTP 1.0 compliant.
libcurls, which turned out to be the 25-nov-2006 change which treats HTTP
responses without Content-Length or chunked encoding as without bodies. We
now added the conditional that the above mentioned response is only without
body if the response is HTTP 1.1.
was 16385 bytes (16K+1) and it turned out we didn't properly always "suck
out" all data from libssh2. The effect being that libcurl would hang on the
socket waiting for data when libssh2 had in fact already read it all...
function that deprecates the curl_multi_socket() function. Using the new
function the application tell libcurl what action that was found in the
socket that it passes in. This gives a significant performance boost as it
allows libcurl to avoid a call to poll()/select() for every call to
curl_multi_socket*().
fixing some bugs:
o Don't mix GET and POST requests in a pipeline
o Fix the order in which requests are dispatched from the pipeline
o Fixed several curl bugs with pipelining when the server is returning
chunked encoding:
* Added states to chunked parsing for final CRLF
* Rewind buffer after parsing chunk with data remaining
* Moved chunked header initializing to a spot just before receiving
headers
to the debug callback.
- Shmulik Regev added CURLOPT_HTTP_CONTENT_DECODING and
CURLOPT_HTTP_TRANSFER_DECODING that if set to zero will disable libcurl's
internal decoding of content or transfer encoded content. This may be
preferable in cases where you use libcurl for proxy purposes or similar. The
command line tool got a --raw option to disable both at once.
and CURLOPT_CONNECTTIMEOUT_MS that, as their names should hint, do the
timeouts with millisecond resolution instead. The only restriction to that
is the alarm() (sometimes) used to abort name resolves as that uses full
seconds. I fixed the FTP response timeout part of the patch.
Internally we now count and keep the timeouts in milliseconds but it also
means we multiply set timeouts with 1000. The effect of this is that no
timeout can be set to more than 2^31 milliseconds (on 32 bit systems), which
equals 24.86 days. We probably couldn't before either since the code did
*1000 on the timeout values on several places already.
doing an FTP transfer is removed from a multi handle before completion. The
fix also fixed the "alive counter" to be correct on "premature removal" for
all protocols.
non-ASCII platforms. It does add some complexity, most notably with more
#ifdefs, but I want to see this supported added and I can't see how we can
add it without the extra stuff added.
(http://curl.haxx.se/bug/view.cgi?id=1618359) and subsequently provided a
patch for it: when downloading 2 zero byte files in a row, curl 7.16.0
enters an infinite loop, while curl 7.16.1-20061218 does one additional
unnecessary request.
Fix: During the "Major overhaul introducing http pipelining support and
shared connection cache within the multi handle." change, headerbytecount
was moved to live in the Curl_transfer_keeper structure. But that structure
is reset in the Transfer method, losing the information that we had about
the header size. This patch moves it back to the connectdata struct.
(http://curl.haxx.se/bug/view.cgi?id=1603712) which is about connections
getting cut off prematurely when --limit-rate is used. While I found no such
problems in my tests nor in my reading of the code, I found that the
--limit-rate code was severly flawed (since it was moved into the lib, since
7.15.5) when used with the easy interface and it didn't work as documented so
I reworked it somewhat and now it works for my tests.
passing a curl_off_t argument to the Curl_read_rewind() function which takes
an size_t argument. Curl_read_rewind() also had debug code left in it and it
was put in a different source file with no good reason when only used from
one single spot.
header in a third, not suppported by libcurl, format and we agreed that we
could make the parser more forgiving to accept all the three found
variations.
responded with a single status line and no headers nor body. Starting now, a
HTTP response on a persistent connection (i.e not set to be closed after the
response has been taken care of) must have Content-Length or chunked
encoding set, or libcurl will simply assume that there is no body.
To my horror I learned that we had no less than 57(!) test cases that did bad
HTTP responses like this, and even the test http server (sws) responded badly
when queried by the test system if it is the test system. So although the
actual fix for the problem was tiny, going through all the newly failing test
cases got really painful and boring.
case when 401 or 407 are returned, *IF* no auth credentials have been given.
The CURLOPT_FAILONERROR option is not possible to make fool-proof for 401
and 407 cases when auth credentials is given, but we've now covered this
somewhat more.
You might get some amounts of headers transferred before this situation is
detected, like for when a "100-continue" is received as a response to a
POST/PUT and a 401 or 407 is received immediately afterwards.
Added test 281 to verify this change.
could very well cause a negate number get passed in and thus cause reading
outside of the array usually used for this purpose.
We avoid this by using the uppercase macro versions introduced just now that
does some extra crazy typecasts to avoid byte codes > 127 to cause negative
int values.
CURLOPT_MAX_RECV_SPEED_LARGE that limit tha maximum rate libcurl is allowed
to send or receive data. This kind of adds the the command line tool's
option --limit-rate to the library.
The rate limiting logic in the curl app is now removed and is instead
provided by libcurl itself. Transfer rate limiting will now also work for -d
and -F, which it didn't before.
an app can use to let libcurl only connect to a remote host and then extract
the socket from libcurl. libcurl will then not attempt to do any transfer at
all after the connect is done.
with re-used FTP connections. If the second request on the same connection was
set not to fetch a "body", libcurl could get confused and consider it an
attempt to use a dead connection and would go acting mighty strange.
connection setup as a follow-redirect. It turns out 1) this fails when a FTP
connection is re-setup and 2) it does make the max-redirs counter behave
wrong. This fix was not verified since the reporter vanished, but I believe
this is the right fix nonetheless.
(http://curl.haxx.se/bug/view.cgi?id=1338648) which really is more of a
feature request, but anyway. It pointed out that --max-redirs did not allow
it to be set to 0, which then would return an error code on the first
Location: found. Based on Nis' patch, now libcurl supports CURLOPT_MAXREDIRS
set to 0, or -1 for infinity. Added test case 274 to verify.
(http://curl.haxx.se/bug/view.cgi?id=1299181) that identified a silly problem
with Content-Range: headers with the 'bytes' keyword written in a different
case than all lowercase! It would cause a segfault!
from the command line tool with --ignore-content-length. This will make it
easier to download files from Apache 1.x (and similar) servers that are
still having problems serving files larger than 2 or 4 GB. When this option
is enabled, curl will simply have to wait for the server to close the
connection to signal end of transfer. I wrote test case 269 that runs a
simple test that this works.
CURLOPT_COOKIEFILE), add a cookie (with CURLOPT_COOKIELIST), tell it to
write the result to a given cookie jar and then never actually call
curl_easy_perform() - the given file(s) to read was never read but the
output file was written and thus it caused a "funny" result.
- While doing some tests for the bug above, I noticed that Firefox generates
large numbers (for the expire time) in the cookies.txt file and libcurl
didn't treat them properly. Now it does.
site responds with bad HTTP response that doesn't contain any header at all,
only a response body, and the write callback returns 0 to abort the
transfer, it didn't have any real effect but the write callback would be
called once more anyway.
binary zeroes within the headers. They confused libcurl to do wrong so the
downloaded headers become incomplete. The fix is now verified with test case
262.
internally, with code provided by sslgen.c. All SSL-layer-specific code is
then written in ssluse.c (for OpenSSL) and gtls.c (for GnuTLS).
As far as possible, internals should not need to know what SSL layer that is
in use. Building with GnuTLS currently makes two test cases fail.
TODO.gnutls contains a few known outstanding issues for the GnuTLS support.
GnuTLS support is enabled with configure --with-gnutls
that picks NTLM. Thanks to David Byron letting me test NTLM against his
servers, I could quickly repeat and fix the problem. It turned out to be:
When libcurl POSTs without knowing/using an authentication and it gets back a
list of types from which it picks NTLM, it needs to either continue sending
its data if it keeps the connection alive, or not send the data but close the
connection. Then do the first step in the NTLM auth. libcurl didn't send the
data nor close the connection but simply read the response-body and then sent
the first negotiation step. Which then failed miserably of course. The fixed
version forces a connection if there is more than 2000 bytes left to send.