mirror of
https://github.com/moparisthebest/curl
synced 2024-12-21 23:58:49 -05:00
INTERNALS: cat lib/README* >> INTERNALS
and a conversion to markdown. Removed the lib/README.* files. The idea being to move toward having INTERNALS as the one and only "book" of internals documentation. Added a TOC to top of the document.
parent
cbf2920d02
commit
55f3eb588d
675
docs/INTERNALS
@@ -1,18 +1,56 @@
|
||||
                                  _   _ ____  _
                              ___| | | |  _ \| |
                             / __| | | | |_) | |
                            | (__| |_| |  _ <| |___
                             \___|\___/|_| \_\_____|
|
||||
Table of Contents
|
||||
=================
|
||||
|
||||
INTERNALS
|
||||
- [Intro](#intro)
|
||||
- [git](#git)
|
||||
- [Portability](#Portability)
|
||||
- [Windows vs Unix](#winvsunix)
|
||||
- [Library](#Library)
|
||||
- [`Curl_connect`](#Curl_connect)
|
||||
- [`Curl_do`](#Curl_do)
|
||||
- [`Curl_readwrite`](#Curl_readwrite)
|
||||
- [`Curl_done`](#Curl_done)
|
||||
- [`Curl_disconnect`](#Curl_disconnect)
|
||||
- [HTTP(S)](#http)
|
||||
- [FTP](#ftp)
|
||||
- [Kerberos](#kerberos)
|
||||
- [TELNET](#telnet)
|
||||
- [FILE](#file)
|
||||
- [SMB](#smb)
|
||||
- [LDAP](#ldap)
|
||||
- [E-mail](#email)
|
||||
- [General](#general)
|
||||
- [Persistent Connections](#persistent)
|
||||
- [multi interface/non-blocking](#multi)
|
||||
- [SSL libraries](#ssl)
|
||||
- [Library Symbols](#symbols)
|
||||
- [Return Codes and Informationals](#returncodes)
|
||||
- [API/ABI](#abi)
|
||||
- [Client](#client)
|
||||
- [Memory Debugging](#memorydebug)
|
||||
- [Test Suite](#test)
|
||||
- [Asynchronous name resolves](#asyncdns)
|
||||
- [c-ares](#cares)
|
||||
- [`curl_off_t`](#curl_off_t)
|
||||
- [curlx](#curlx)
|
||||
- [Content Encoding](#contentencoding)
|
||||
- [hostip.c explained](#hostip)
|
||||
- [Track Down Memory Leaks](#memoryleak)
|
||||
- [`multi_socket`](#multi_socket)
|
||||
|
||||
The project is split in two. The library and the client. The client part uses
|
||||
the library, but the library is designed to allow other applications to use
|
||||
it.
|
||||
<a name="intro"></a>
|
||||
curl internals
|
||||
==============
|
||||
|
||||
This project is split in two: the library and the client. The client part
|
||||
uses the library, but the library is designed to allow other applications to
|
||||
use it.
|
||||
|
||||
The largest amount of code and complexity is in the library part.
|
||||
|
||||
GIT
|
||||
|
||||
<a name="git"></a>
|
||||
git
|
||||
===
|
||||
|
||||
All changes to the sources are committed to the git repository as soon as
|
||||
@@ -23,6 +61,7 @@ GIT
|
||||
Tagging shall be used extensively, and by the time we release new archives we
|
||||
should tag the sources with a name similar to the released version number.
|
||||
|
||||
<a name="Portability"></a>
|
||||
Portability
|
||||
===========
|
||||
|
||||
@@ -34,45 +73,55 @@ Portability
|
||||
want it to remain functional and buildable with these and later versions
|
||||
(older versions may still work, but that is not what we work hard to maintain):
|
||||
|
||||
OpenSSL 0.9.7
|
||||
GnuTLS 1.2
|
||||
zlib 1.1.4
|
||||
libssh2 0.16
|
||||
c-ares 1.6.0
|
||||
libidn 0.4.1
|
||||
cyassl 2.0.0
|
||||
openldap 2.0
|
||||
MIT Kerberos 1.2.4
|
||||
GSKit V5R3M0
|
||||
NSS 3.14.x
|
||||
axTLS 1.2.7
|
||||
PolarSSL 1.3.0
|
||||
Heimdal ?
|
||||
nghttp2 1.0.0
|
||||
Dependencies
|
||||
------------
|
||||
|
||||
- OpenSSL 0.9.7
|
||||
- GnuTLS 1.2
|
||||
- zlib 1.1.4
|
||||
- libssh2 0.16
|
||||
- c-ares 1.6.0
|
||||
- libidn 0.4.1
|
||||
- cyassl 2.0.0
|
||||
- openldap 2.0
|
||||
- MIT Kerberos 1.2.4
|
||||
- GSKit V5R3M0
|
||||
- NSS 3.14.x
|
||||
- axTLS 1.2.7
|
||||
- PolarSSL 1.3.0
|
||||
- Heimdal ?
|
||||
- nghttp2 1.0.0
|
||||
|
||||
Operating Systems
|
||||
-----------------
|
||||
|
||||
On systems where configure runs, we aim at working on them all - if they have
|
||||
a suitable C compiler. On systems that don't run configure, we strive to keep
|
||||
curl running fine on:
|
||||
|
||||
Windows 98
|
||||
AS/400 V5R3M0
|
||||
Symbian 9.1
|
||||
Windows CE ?
|
||||
TPF ?
|
||||
- Windows 98
|
||||
- AS/400 V5R3M0
|
||||
- Symbian 9.1
|
||||
- Windows CE ?
|
||||
- TPF ?
|
||||
|
||||
Build tools
|
||||
-----------
|
||||
|
||||
When writing code (mostly for generating stuff included in release tarballs)
|
||||
we use a few "build tools" and we make sure that we remain functional with
|
||||
these versions:
|
||||
|
||||
GNU Libtool 1.4.2
|
||||
GNU Autoconf 2.57
|
||||
GNU Automake 1.7 (we currently avoid 1.10 due to Solaris-related bugs)
|
||||
GNU M4 1.4
|
||||
perl 5.004
|
||||
roffit 0.5
|
||||
groff ? (any version that supports "groff -Tps -man [in] [out]")
|
||||
ps2pdf (gs) ?
|
||||
- GNU Libtool 1.4.2
|
||||
- GNU Autoconf 2.57
|
||||
- GNU Automake 1.7
|
||||
- GNU M4 1.4
|
||||
- perl 5.004
|
||||
- roffit 0.5
|
||||
- groff ? (any version that supports "groff -Tps -man [in] [out]")
|
||||
- ps2pdf (gs) ?
|
||||
|
||||
<a name="winvsunix"></a>
|
||||
Windows vs Unix
|
||||
===============
|
||||
|
||||
@@ -87,8 +136,9 @@ Windows vs Unix
|
||||
|
||||
2. Windows requires a couple of init calls for the socket stuff.
|
||||
|
||||
That's taken care of by the curl_global_init() call, but if other libs also
|
||||
do it etc there might be reasons for applications to alter that behaviour.
|
||||
That's taken care of by the `curl_global_init()` call, but if other libs
|
||||
also do it etc there might be reasons for applications to alter that
|
||||
behaviour.
|
||||
|
||||
3. The file descriptors for network communication and file operations are
|
||||
not easily interchangeable as in unix.
|
||||
@@ -101,28 +151,29 @@ Windows vs Unix
|
||||
|
||||
We set stdout to binary under windows
|
||||
|
||||
Inside the source code, We make an effort to avoid '#ifdef [Your OS]'. All
|
||||
Inside the source code, we make an effort to avoid `#ifdef [Your OS]`. All
|
||||
conditionals that deal with features *should* instead be in the format
|
||||
'#ifdef HAVE_THAT_WEIRD_FUNCTION'. Since Windows can't run configure scripts,
|
||||
we maintain a curl_config-win32.h file in lib directory that is supposed to
|
||||
look exactly as a curl_config.h file would have looked like on a Windows
|
||||
`#ifdef HAVE_THAT_WEIRD_FUNCTION`. Since Windows can't run configure scripts,
|
||||
we maintain a `curl_config-win32.h` file in the lib directory that is supposed to
|
||||
look exactly like a `curl_config.h` file would have looked on a Windows
|
||||
machine!
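
A minimal sketch of the feature-based conditional style described above. `HAVE_STRCASECMP` stands in for whatever macro configure (or the hand-maintained Windows config header) would define, and `demo_casecompare()` is a hypothetical helper for illustration, not a libcurl function.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#ifdef HAVE_STRCASECMP
#include <strings.h>  /* strcasecmp() on POSIX systems */
#endif

/* Hypothetical helper: case-insensitive comparison, returns 1 when equal.
   The code asks "does this function exist?", never "which OS is this?". */
int demo_casecompare(const char *a, const char *b)
{
  assert(a != NULL && b != NULL);
#ifdef HAVE_STRCASECMP
  return strcasecmp(a, b) == 0;
#else
  /* Portable fallback used when the feature macro is not defined. */
  while(*a && *b) {
    if(tolower((unsigned char)*a) != tolower((unsigned char)*b))
      return 0;
    a++;
    b++;
  }
  return *a == *b; /* equal only if both strings ended together */
#endif
}
```

Because the conditional names a feature rather than a platform, a new operating system only needs the right set of `HAVE_*` defines; the source itself stays untouched.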
|
||||
|
||||
Generally speaking: always remember that this will be compiled on dozens of
|
||||
operating systems. Don't walk on the edge.
|
||||
|
||||
<a name="Library"></a>
|
||||
Library
|
||||
=======
|
||||
|
||||
(See LIBCURL-STRUCTS for a separate document describing all major internal
|
||||
(See `LIBCURL-STRUCTS` for a separate document describing all major internal
|
||||
structs and their purposes.)
|
||||
|
||||
There are plenty of entry points to the library, namely each publicly defined
|
||||
function that libcurl offers to applications. All of those functions are
|
||||
rather small and easy-to-follow. All the ones prefixed with 'curl_easy' are
|
||||
rather small and easy-to-follow. All the ones prefixed with `curl_easy` are
|
||||
put in the lib/easy.c file.
|
||||
|
||||
curl_global_init_() and curl_global_cleanup() should be called by the
|
||||
`curl_global_init()` and `curl_global_cleanup()` should be called by the
|
||||
application to initialize and clean up global stuff in the library. As of
|
||||
today, it can handle the global SSL initing if SSL is enabled and it can init
|
||||
the socket layer on windows machines. libcurl itself has no "global" scope.
|
||||
@@ -130,51 +181,56 @@ Library
|
||||
All printf()-style functions use the supplied clones in lib/mprintf.c. This
|
||||
makes sure we stay absolutely platform independent.
|
||||
|
||||
curl_easy_init() allocates an internal struct and makes some initializations.
|
||||
The returned handle does not reveal internals. This is the 'SessionHandle'
|
||||
struct which works as an "anchor" struct for all curl_easy functions. All
|
||||
connections performed will get connect-specific data allocated that should be
|
||||
used for things related to particular connections/requests.
|
||||
[`curl_easy_init()`][2] allocates an internal struct and makes some
|
||||
initializations. The returned handle does not reveal internals. This is the
|
||||
'SessionHandle' struct which works as an "anchor" struct for all `curl_easy`
|
||||
functions. All connections performed will get connect-specific data allocated
|
||||
that should be used for things related to particular connections/requests.
|
||||
|
||||
curl_easy_setopt() takes three arguments, where the option stuff must be
|
||||
passed in pairs: the parameter-ID and the parameter-value. The list of
|
||||
[`curl_easy_setopt()`][1] takes three arguments, where the option stuff must
|
||||
be passed in pairs: the parameter-ID and the parameter-value. The list of
|
||||
options is documented in the man page. This function mainly sets things in
|
||||
the 'SessionHandle' struct.
|
||||
|
||||
curl_easy_perform() is just a wrapper function that makes use of the multi
|
||||
API. It basically curl_multi_init(), curl_multi_add_handle(),
|
||||
curl_multi_wait(), and curl_multi_perform() until the transfer is done and
|
||||
then returns.
|
||||
`curl_easy_perform()` is just a wrapper function that makes use of the multi
|
||||
API. It basically calls `curl_multi_init()`, `curl_multi_add_handle()`,
|
||||
`curl_multi_wait()`, and `curl_multi_perform()` until the transfer is done
|
||||
and then returns.
|
||||
|
||||
Some of the most important key functions in url.c are called from multi.c
|
||||
when certain key steps are to be made in the transfer operation.
|
||||
|
||||
o Curl_connect()
|
||||
<a name="Curl_connect"></a>
|
||||
Curl_connect()
|
||||
--------------
|
||||
|
||||
Analyzes the URL, separates the different components and connects to the
|
||||
remote host. This may involve using a proxy and/or using SSL. The
|
||||
Curl_resolv() function in lib/hostip.c is used for looking up host names
|
||||
`Curl_resolv()` function in lib/hostip.c is used for looking up host names
|
||||
(it does then use the proper underlying method, which may vary between
|
||||
platforms and builds).
|
||||
|
||||
When Curl_connect is done, we are connected to the remote site. Then it is
|
||||
time to tell the server to get a document/file. Curl_do() arranges this.
|
||||
When `Curl_connect` is done, we are connected to the remote site. Then it
|
||||
is time to tell the server to get a document/file. `Curl_do()` arranges
|
||||
this.
|
||||
|
||||
This function makes sure there's an allocated and initiated 'connectdata'
|
||||
struct that is used for this particular connection only (although there may
|
||||
be several requests performed on the same connect). A bunch of things are
|
||||
inited/inherited from the SessionHandle struct.
|
||||
|
||||
o Curl_do()
|
||||
<a name="Curl_do"></a>
|
||||
Curl_do()
|
||||
---------
|
||||
|
||||
Curl_do() makes sure the proper protocol-specific function is called. The
|
||||
`Curl_do()` makes sure the proper protocol-specific function is called. The
|
||||
functions are named after the protocols they handle.
|
||||
|
||||
The protocol-specific functions of course deal with protocol-specific
|
||||
negotiations and setup. They have access to the Curl_sendf() (from
|
||||
negotiations and setup. They have access to the `Curl_sendf()` (from
|
||||
lib/sendf.c) function to send printf-style formatted data to the remote
|
||||
host and when they're ready to make the actual file transfer they call the
|
||||
Curl_Transfer() function (in lib/transfer.c) to setup the transfer and
|
||||
`Curl_Transfer()` function (in lib/transfer.c) to set up the transfer and
|
||||
return.
|
||||
|
||||
If this DO function fails and the connection is being re-used, libcurl will
|
||||
@@ -183,11 +239,13 @@ Library
|
||||
we have discovered a dead connection before the DO function and thus we
|
||||
might wrongly be re-using a connection that was closed by the remote peer.
|
||||
|
||||
Some time during the DO function, the Curl_setup_transfer() function must
|
||||
Some time during the DO function, the `Curl_setup_transfer()` function must
|
||||
be called with some basic info about the upcoming transfer: what socket(s)
|
||||
to read/write and the expected file transfer sizes (if known).
|
||||
|
||||
o Curl_readwrite()
|
||||
<a name="Curl_readwrite"></a>
|
||||
Curl_readwrite()
|
||||
----------------
|
||||
|
||||
Called during the transfer of the actual protocol payload.
|
||||
|
||||
@@ -196,18 +254,22 @@ Library
|
||||
called). The speedcheck functions in lib/speedcheck.c are also used to
|
||||
verify that the transfer is as fast as required.
|
||||
|
||||
o Curl_done()
|
||||
<a name="Curl_done"></a>
|
||||
Curl_done()
|
||||
-----------
|
||||
|
||||
Called after a transfer is done. This function takes care of everything
|
||||
that has to be done after a transfer. This function attempts to leave
|
||||
matters in a state so that Curl_do() should be possible to call again on
|
||||
matters in a state so that `Curl_do()` should be possible to call again on
|
||||
the same connection (in a persistent connection case). It might also soon
|
||||
be closed with Curl_disconnect().
|
||||
be closed with `Curl_disconnect()`.
|
||||
|
||||
o Curl_disconnect()
|
||||
<a name="Curl_disconnect"></a>
|
||||
Curl_disconnect()
|
||||
-----------------
|
||||
|
||||
When doing normal connections and transfers, no one ever tries to close any
|
||||
connections so this is not normally called when curl_easy_perform() is
|
||||
connections so this is not normally called when `curl_easy_perform()` is
|
||||
used. This function is only used when we are certain that no more transfers
|
||||
are going to be made on the connection. It can also be closed by force, or
|
||||
it can be called to make sure that libcurl doesn't keep too many
|
||||
@@ -216,8 +278,9 @@ Library
|
||||
This function cleans up all resources that are associated with a single
|
||||
connection.
|
||||
|
||||
|
||||
HTTP(S)
|
||||
<a name="http"></a>
|
||||
HTTP(S)
|
||||
=======
|
||||
|
||||
HTTP offers a lot and is the protocol in curl that uses the most lines of
|
||||
code. There is a special file (lib/formdata.c) that offers all the multipart
|
||||
@@ -229,100 +292,123 @@ Library
|
||||
HTTPS uses in almost every means the same procedure as HTTP, with only two
|
||||
exceptions: the connect procedure is different and the function used to read
|
||||
or write from the socket is different, although the latter fact is hidden in
|
||||
the source by the use of Curl_read() for reading and Curl_write() for writing
|
||||
data to the remote server.
|
||||
the source by the use of `Curl_read()` for reading and `Curl_write()` for
|
||||
writing data to the remote server.
|
||||
|
||||
http_chunks.c contains functions that understands HTTP 1.1 chunked transfer
|
||||
`http_chunks.c` contains functions that understand HTTP 1.1 chunked transfer
|
||||
encoding.
|
||||
|
||||
An interesting detail with the HTTP(S) request, is the Curl_add_buffer()
|
||||
An interesting detail with the HTTP(S) request is the `Curl_add_buffer()`
|
||||
series of functions we use. They append data to one single buffer, and when
|
||||
the building is done the entire request is sent off in one single write. This
|
||||
is done this way to overcome problems with flawed firewalls and lame servers.
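
The idea of assembling the whole request in one buffer before writing it can be sketched as follows. This is not the `Curl_add_buffer()` implementation; `struct demo_buf`, `demo_buf_add()` and `demo_buf_free()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; not libcurl's internal type. */
struct demo_buf {
  char *data;    /* NUL-terminated contents, or NULL when empty */
  size_t used;   /* bytes stored, excluding the terminator */
  size_t alloc;  /* bytes allocated */
};

/* Append a string. Returns 0 on success, -1 on out-of-memory. */
int demo_buf_add(struct demo_buf *b, const char *text)
{
  size_t len;
  assert(b != NULL && text != NULL);
  len = strlen(text);
  if(b->used + len + 1 > b->alloc) {
    /* grow to twice what is needed so repeated appends stay cheap */
    size_t newsize = (b->used + len + 1) * 2;
    char *p = realloc(b->data, newsize);
    if(!p)
      return -1;
    b->data = p;
    b->alloc = newsize;
  }
  memcpy(b->data + b->used, text, len + 1); /* copies the terminator too */
  b->used += len;
  return 0;
}

/* Release the storage and return the buffer to its empty state. */
void demo_buf_free(struct demo_buf *b)
{
  assert(b != NULL);
  free(b->data);
  b->data = NULL;
  b->used = 0;
  b->alloc = 0;
}
```

A caller would start from an all-zero struct, add the request line and each header, and then hand `data` and `used` to a single write call so the request leaves in one piece.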
|
||||
|
||||
FTP
|
||||
<a name="ftp"></a>
|
||||
FTP
|
||||
===
|
||||
|
||||
The Curl_if2ip() function can be used for getting the IP number of a
|
||||
The `Curl_if2ip()` function can be used for getting the IP number of a
|
||||
specified network interface, and it resides in lib/if2ip.c.
|
||||
|
||||
Curl_ftpsendf() is used for sending FTP commands to the remote server. It was
|
||||
made a separate function to prevent us programmers from forgetting that they
|
||||
must be CRLF terminated. They must also be sent in one single write() to make
|
||||
firewalls and similar happy.
|
||||
`Curl_ftpsendf()` is used for sending FTP commands to the remote server. It
|
||||
was made a separate function to prevent us programmers from forgetting that
|
||||
they must be CRLF terminated. They must also be sent in one single write() to
|
||||
make firewalls and similar happy.
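
A rough sketch of why a dedicated function helps here: the CRLF is appended in exactly one place and the result is ready for a single write. `demo_ftp_format()` is a hypothetical helper, and it uses the standard `vsnprintf()` for brevity, whereas libcurl uses its own printf clones from lib/mprintf.c.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: format an FTP command into 'out' and append CRLF.
   Returns the number of bytes to send, or -1 if the buffer is too small. */
int demo_ftp_format(char *out, size_t outlen, const char *fmt, ...)
{
  va_list ap;
  int n;
  assert(out != NULL && fmt != NULL);
  if(outlen < 3)
    return -1; /* no room for CRLF plus the terminator */
  va_start(ap, fmt);
  /* keep two bytes in reserve for the CRLF */
  n = vsnprintf(out, outlen - 2, fmt, ap);
  va_end(ap);
  if(n < 0 || (size_t)n >= outlen - 2)
    return -1; /* formatting error, or the command would be truncated */
  out[n] = '\r';
  out[n + 1] = '\n';
  out[n + 2] = '\0';
  return n + 2;
}
```

The returned length covers the command and its terminator, so the caller can pass the buffer to one `write()` or `send()` call.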
|
||||
|
||||
Kerberos
|
||||
<a name="kerberos"></a>
|
||||
Kerberos
|
||||
--------
|
||||
|
||||
Kerberos support is mainly in lib/krb5.c and lib/security.c but also
|
||||
curl_sasl_sspi.c and curl_sasl_gssapi.c for the email protocols and
|
||||
socks_gssapi.c & socks_sspi.c for SOCKS5 proxy specifics.
|
||||
`curl_sasl_sspi.c` and `curl_sasl_gssapi.c` for the email protocols and
|
||||
`socks_gssapi.c` and `socks_sspi.c` for SOCKS5 proxy specifics.
|
||||
|
||||
TELNET
|
||||
<a name="telnet"></a>
|
||||
TELNET
|
||||
======
|
||||
|
||||
Telnet is implemented in lib/telnet.c.
|
||||
|
||||
FILE
|
||||
<a name="file"></a>
|
||||
FILE
|
||||
====
|
||||
|
||||
The file:// protocol is dealt with in lib/file.c.
|
||||
|
||||
SMB
|
||||
<a name="smb"></a>
|
||||
SMB
|
||||
===
|
||||
|
||||
The smb:// protocol is dealt with in lib/smb.c.
|
||||
|
||||
LDAP
|
||||
<a name="ldap"></a>
|
||||
LDAP
|
||||
====
|
||||
|
||||
Everything LDAP is in lib/ldap.c and lib/openldap.c.
|
||||
|
||||
E-mail
|
||||
<a name="email"></a>
|
||||
E-mail
|
||||
======
|
||||
|
||||
The e-mail related source code is in lib/imap.c, lib/pop3.c and lib/smtp.c.
|
||||
|
||||
GENERAL
|
||||
<a name="general"></a>
|
||||
General
|
||||
=======
|
||||
|
||||
URL encoding and decoding, called escaping and unescaping in the source code,
|
||||
is found in lib/escape.c.
|
||||
|
||||
While transferring data in Transfer() a few functions might get used.
|
||||
curl_getdate() in lib/parsedate.c is for HTTP date comparisons (and more).
|
||||
`curl_getdate()` in lib/parsedate.c is for HTTP date comparisons (and more).
|
||||
|
||||
lib/getenv.c offers curl_getenv() which is for reading environment variables
|
||||
in a neat platform independent way. That's used in the client, but also in
|
||||
lib/url.c when checking the proxy environment variables. Note that contrary
|
||||
to the normal unix getenv(), this returns an allocated buffer that must be
|
||||
free()ed after use.
|
||||
lib/getenv.c offers `curl_getenv()` which is for reading environment
|
||||
variables in a neat platform independent way. That's used in the client, but
|
||||
also in lib/url.c when checking the proxy environment variables. Note that
|
||||
contrary to the normal unix getenv(), this returns an allocated buffer that
|
||||
must be free()ed after use.
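
The ownership rule above (the caller gets an allocated copy and must free it) can be illustrated like this. `demo_getenv_dup()` is a hypothetical stand-in, not the `curl_getenv()` source, and it only relies on the standard `getenv()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in: return a heap-allocated copy of the variable's
   value, or NULL if it is unset or memory runs out. The caller owns the
   result and must free() it. */
char *demo_getenv_dup(const char *name)
{
  const char *value;
  char *copy;
  size_t len;
  assert(name != NULL);
  value = getenv(name);
  if(!value)
    return NULL;
  len = strlen(value) + 1; /* include the terminator */
  copy = malloc(len);
  if(copy)
    memcpy(copy, value, len);
  return copy;
}
```

Returning a private copy means the result stays valid even if the environment is modified later, at the price of an explicit `free()` at every call site.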
|
||||
|
||||
lib/netrc.c holds the .netrc parser.
|
||||
|
||||
lib/timeval.c features replacement functions for systems that don't have
|
||||
gettimeofday() and a few support functions for timeval conversions.
|
||||
|
||||
A function named curl_version() that returns the full curl version string is
|
||||
found in lib/version.c.
|
||||
A function named `curl_version()` that returns the full curl version string
|
||||
is found in lib/version.c.
|
||||
|
||||
<a name="persistent"></a>
|
||||
Persistent Connections
|
||||
======================
|
||||
|
||||
The persistent connection support in libcurl requires some considerations on
|
||||
how to do things inside of the library.
|
||||
|
||||
o The 'SessionHandle' struct returned in the curl_easy_init() call must never
|
||||
hold connection-oriented data. It is meant to hold the root data as well as
|
||||
all the options etc that the library-user may choose.
|
||||
o The 'SessionHandle' struct holds the "connection cache" (an array of
|
||||
- The 'SessionHandle' struct returned in the [`curl_easy_init()`][2] call
|
||||
must never hold connection-oriented data. It is meant to hold the root data
|
||||
as well as all the options etc that the library-user may choose.
|
||||
|
||||
- The 'SessionHandle' struct holds the "connection cache" (an array of
|
||||
pointers to 'connectdata' structs).
|
||||
o This enables the 'curl handle' to be reused on subsequent transfers.
|
||||
o When libcurl is told to perform a transfer, it first checks for an already
|
||||
|
||||
- This enables the 'curl handle' to be reused on subsequent transfers.
|
||||
|
||||
- When libcurl is told to perform a transfer, it first checks for an already
|
||||
existing connection in the cache that we can use. Otherwise it creates a
|
||||
new one and adds that to the cache. If the cache is full already when a new
|
||||
connection is added, it will first close the oldest unused one.
|
||||
o When the transfer operation is complete, the connection is left
|
||||
|
||||
- When the transfer operation is complete, the connection is left
|
||||
open. Particular options may tell libcurl not to, and protocols may signal
|
||||
closure on connections and then they won't be kept open of course.
|
||||
o When curl_easy_cleanup() is called, we close all still opened connections,
|
||||
|
||||
- When `curl_easy_cleanup()` is called, we close all still opened connections,
|
||||
unless of course the multi interface "owns" the connections.
|
||||
|
||||
The curl handle must be re-used in order for the persistent connections to
|
||||
work.
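
The cache behaviour listed above (re-use an idle match, otherwise add a connection, closing the oldest unused one when full) can be sketched roughly as follows. Everything here is a simplified assumption for illustration: the `demo_` names, the integer `id` that stands in for a real host/port/protocol comparison, and the tiny fixed cache size are not taken from libcurl.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CACHE_SIZE 2  /* deliberately tiny, for illustration */

/* Hypothetical, heavily simplified connection entry. */
struct demo_conn {
  int id;            /* identity of the remote end; 0 marks an empty slot */
  int in_use;        /* non-zero while a transfer is using it */
  unsigned long age; /* cache clock value when last handed out */
};

struct demo_cache {
  struct demo_conn slot[DEMO_CACHE_SIZE];
  unsigned long clock; /* increases every time a connection is handed out */
};

void demo_cache_init(struct demo_cache *c)
{
  size_t i;
  assert(c != NULL);
  for(i = 0; i < DEMO_CACHE_SIZE; i++) {
    c->slot[i].id = 0;
    c->slot[i].in_use = 0;
    c->slot[i].age = 0;
  }
  c->clock = 0;
}

/* Hand out a connection for 'id'. Sets *reused to 1 when an idle cached
   connection matched. Returns NULL if every slot is busy. */
struct demo_conn *demo_cache_get(struct demo_cache *c, int id, int *reused)
{
  struct demo_conn *victim = NULL;
  size_t i;
  assert(c != NULL && reused != NULL && id != 0);
  *reused = 0;
  /* first pass: look for an idle connection to the same remote end */
  for(i = 0; i < DEMO_CACHE_SIZE; i++) {
    struct demo_conn *e = &c->slot[i];
    if(e->id == id && !e->in_use) {
      *reused = 1;
      e->in_use = 1;
      e->age = ++c->clock;
      return e;
    }
  }
  /* second pass: pick the oldest idle slot; empty slots have age 0 */
  for(i = 0; i < DEMO_CACHE_SIZE; i++) {
    struct demo_conn *e = &c->slot[i];
    if(e->in_use)
      continue;
    if(!victim || e->age < victim->age)
      victim = e;
  }
  if(!victim)
    return NULL;
  /* a real implementation would close the old connection here */
  victim->id = id;
  victim->in_use = 1;
  victim->age = ++c->clock;
  return victim;
}

/* Mark a connection idle again so later transfers can re-use it. */
void demo_cache_release(struct demo_conn *e)
{
  assert(e != NULL);
  e->in_use = 0;
}
```

Note that the cache lives with the handle in this sketch too, which is why discarding the handle after every transfer would also discard every chance of re-use.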
|
||||
|
||||
<a name="multi"></a>
|
||||
multi interface/non-blocking
|
||||
============================
|
||||
|
||||
@@ -341,6 +427,7 @@ multi interface/non-blocking
|
||||
protocols are crappy examples and they are subject for rewrite in the future
|
||||
to better fit the libcurl protocol family.
|
||||
|
||||
<a name="ssl"></a>
|
||||
SSL libraries
|
||||
=============
|
||||
|
||||
@@ -350,36 +437,39 @@ SSL libraries
|
||||
in future libcurl versions.
|
||||
|
||||
To deal with this internally in the best way possible, we have a generic SSL
|
||||
function API as provided by the vtls.[ch] system, and they are the only SSL
|
||||
functions we must use from within libcurl. vtls is then crafted to use the
|
||||
appropriate lower-level function calls to whatever SSL library that is in
|
||||
function API as provided by the vtls/vtls.[ch] system, and they are the only
|
||||
SSL functions we must use from within libcurl. vtls is then crafted to use
|
||||
the appropriate lower-level function calls to whatever SSL library is in
|
||||
use. For example vtls/openssl.[ch] for the OpenSSL library.
|
||||
|
||||
<a name="symbols"></a>
|
||||
Library Symbols
|
||||
===============
|
||||
|
||||
All symbols used internally in libcurl must use a 'Curl_' prefix if they're
|
||||
All symbols used internally in libcurl must use a `Curl_` prefix if they're
|
||||
used in more than a single file. Single-file symbols must be made static.
|
||||
Public ("exported") symbols must use a 'curl_' prefix. (There are exceptions,
|
||||
Public ("exported") symbols must use a `curl_` prefix. (There are exceptions,
|
||||
but they are to be changed to follow this pattern in future versions.) Public
|
||||
API functions are marked with CURL_EXTERN in the public header files so that
|
||||
all others can be hidden on platforms where this is possible.
|
||||
API functions are marked with `CURL_EXTERN` in the public header files so
|
||||
that all others can be hidden on platforms where this is possible.
|
||||
|
||||
<a name="returncodes"></a>
|
||||
Return Codes and Informationals
|
||||
===============================
|
||||
|
||||
I've made things simple. Almost every function in libcurl returns a CURLcode,
|
||||
that must be CURLE_OK if everything is OK or otherwise a suitable error code
|
||||
as the curl/curl.h include file defines. The very spot that detects an error
|
||||
must use the Curl_failf() function to set the human-readable error
|
||||
that must be `CURLE_OK` if everything is OK or otherwise a suitable error
|
||||
code as the curl/curl.h include file defines. The very spot that detects an
|
||||
error must use the `Curl_failf()` function to set the human-readable error
|
||||
description.
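
A miniature of the convention described above: every function returns a code, and only the function that detects the problem records the human-readable text. The `demo_` names, the two-value enum and the port check are hypothetical; the real codes are defined in curl/curl.h and the real helper is `Curl_failf()`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical result codes; zero means success, as with CURLE_OK. */
typedef enum {
  DEMO_OK = 0,
  DEMO_BAD_PORT
} demo_code;

struct demo_handle {
  char errorbuf[128]; /* human-readable description of the last error */
};

/* Record the error text at the very spot where the problem is detected. */
void demo_failf(struct demo_handle *h, const char *fmt, ...)
{
  va_list ap;
  assert(h != NULL && fmt != NULL);
  va_start(ap, fmt);
  vsnprintf(h->errorbuf, sizeof(h->errorbuf), fmt, ap);
  va_end(ap);
}

/* The detecting function sets the message and returns the code. */
demo_code demo_check_port(struct demo_handle *h, long port)
{
  if(port < 1 || port > 65535) {
    demo_failf(h, "Port number out of range: %ld", port);
    return DEMO_BAD_PORT;
  }
  return DEMO_OK;
}

/* Callers pass the code upwards without touching the message. */
demo_code demo_setup(struct demo_handle *h, long port)
{
  demo_code result = demo_check_port(h, port);
  if(result)
    return result;
  /* ... further setup steps would follow here ... */
  return DEMO_OK;
}
```

Keeping the message at the detection point means intermediate callers need no error-specific knowledge; they only test the code against zero.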
|
||||
|
||||
In aiding the user to understand what's happening and to debug curl usage, we
|
||||
must supply a fair amount of informational messages by using the Curl_infof()
|
||||
function. Those messages are only displayed when the user explicitly asks for
|
||||
them. They are best used when revealing information that isn't otherwise
|
||||
obvious.
|
||||
must supply a fair amount of informational messages by using the
|
||||
`Curl_infof()` function. Those messages are only displayed when the user
|
||||
explicitly asks for them. They are best used when revealing information that
|
||||
isn't otherwise obvious.
|
||||
|
||||
<a name="abi"></a>
|
||||
API/ABI
|
||||
=======
|
||||
|
||||
@@ -387,29 +477,31 @@ API/ABI
|
||||
that makes it easier to keep a solid API/ABI over time. See docs/libcurl/ABI
|
||||
for our promise to users.
|
||||
|
||||
<a name="client"></a>
|
||||
Client
|
||||
======
|
||||
|
||||
main() resides in src/tool_main.c.
|
||||
main() resides in `src/tool_main.c`.
|
||||
|
||||
src/tool_hugehelp.c is automatically generated by the mkhelp.pl perl script
|
||||
`src/tool_hugehelp.c` is automatically generated by the mkhelp.pl perl script
|
||||
to display the complete "manual" and the src/tool_urlglob.c file holds the
|
||||
functions used for the URL-"globbing" support. Globbing in the sense that the
|
||||
{} and [] expansion stuff is there.
|
||||
|
||||
The client mostly messes around to setup its 'config' struct properly, then
|
||||
it calls the curl_easy_*() functions of the library and when it gets back
|
||||
control after the curl_easy_perform() it cleans up the library, checks status
|
||||
and exits.
|
||||
it calls the `curl_easy_*()` functions of the library and when it gets back
|
||||
control after the `curl_easy_perform()` it cleans up the library, checks
|
||||
status and exits.
|
||||
|
||||
When the operation is done, the ourWriteOut() function in src/writeout.c may
|
||||
be called to report about the operation. That function is using the
|
||||
curl_easy_getinfo() function to extract useful information from the curl
|
||||
`curl_easy_getinfo()` function to extract useful information from the curl
|
||||
session.
|
||||
|
||||
It may loop and do all this several times if many URLs were specified on the
|
||||
command line or config file.
|
||||
|
||||
<a name="memorydebug"></a>
|
||||
Memory Debugging
|
||||
================
|
||||
|
||||
@@ -439,6 +531,7 @@ Memory Debugging
|
||||
the configure script. When --enable-debug is given both features will be
|
||||
enabled, unless some restriction prevents memory tracking from being used.
|
||||
|
||||
<a name="test"></a>
|
||||
Test Suite
|
||||
==========
|
||||
|
||||
@@ -456,29 +549,315 @@ Test Suite
|
||||
The test suite automatically detects if curl was built with the memory
|
||||
debugging enabled, and if it was it will detect memory leaks, too.
|
||||
|
||||
Building Releases
|
||||
=================
|
||||
<a name="asyncdns"></a>
|
||||
Asynchronous name resolves
|
||||
==========================
|
||||
|
||||
There's no magic to this. When you consider everything stable enough to be
|
||||
released, do this:
|
||||
libcurl can be built to do name resolves asynchronously, using either the
|
||||
normal resolver in a threaded manner or by using c-ares.
|
||||
|
||||
1. Tag the source code accordingly.
|
||||
<a name="cares"></a>
|
||||
[c-ares][3]
|
||||
------
|
||||
|
||||
2. run the 'maketgz' script (using 'make distcheck' will give you a pretty
|
||||
good view on the status of the current sources). maketgz requires a
|
||||
version number and creates the release archive. maketgz uses 'make dist'
|
||||
for the actual archive building, why you need to fill in the Makefile.am
|
||||
files properly for which files that should be included in the release
|
||||
archives.
|
||||
### Build libcurl to use c-ares
|
||||
|
||||
3. When that's complete, sign the output files.
|
||||
1. ./configure --enable-ares=/path/to/ares/install
|
||||
2. make
|
||||
|
||||
4. Upload
|
||||
### c-ares on win32
|
||||
|
||||
5. Update web site and changelog on site
|
||||
First I compiled c-ares. I changed the default C runtime library to be the
|
||||
single-threaded rather than the multi-threaded (this seems to be required to
|
||||
prevent linking errors later on). Then I simply built the areslib project
|
||||
(the other projects adig/ahost seem to fail under MSVC).
|
||||
|
||||
6. Send announcement to the mailing lists
|
||||
Next was libcurl. I opened lib/config-win32.h and I added a:
|
||||
`#define USE_ARES 1`
|
||||
|
||||
NOTE: you must have curl checked out from git to be able to do a proper
|
||||
release build. The release tarballs do not have everything setup in order to
|
||||
do releases properly.
|
||||
Next thing I did was I added the path for the ares includes to the include
|
||||
path, and the libares.lib to the libraries.
|
||||
|
||||
Lastly, I also changed libcurl to be single-threaded rather than
|
||||
multi-threaded, again this was to prevent some duplicate symbol errors. I'm
|
||||
not sure why I needed to change everything to single-threaded, but when I
|
||||
didn't I got redefinition errors for several CRT functions (malloc, stricmp,
|
||||
etc.)
|
||||
|
||||
<a name="curl_off_t"></a>
|
||||
`curl_off_t`
|
||||
==========
|
||||
|
||||
curl_off_t is a data type provided by the external libcurl include
|
||||
headers. It is the type meant to be used for the [`curl_easy_setopt()`][1]
|
||||
options that end with LARGE. The type is 64 bits wide on most modern
|
||||
platforms.
|
||||
|
||||
curlx
|
||||
=====
|
||||
|
||||
The libcurl source code offers a few functions by source only. They are not
|
||||
part of the official libcurl API, but the source files might be useful for
|
||||
others so apps can optionally compile/build with these sources to gain
|
||||
additional functions.
|
||||
|
||||
We provide them through a single header file for easy access for apps:
|
||||
"curlx.h"
|
||||
|
||||
`curlx_strtoofft()`
|
||||
-------------------
|
||||
A macro that converts a string containing a number to a curl_off_t number.
|
||||
This might use the curlx_strtoll() function which is provided as source
|
||||
code in strtoofft.c. Note that the function is only provided if no
|
||||
strtoll() (or equivalent) function exists on your platform. If curl_off_t
|
||||
is only a 32 bit number on your platform, this macro uses strtol().
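
For platforms that do have `strtoll()`, the conversion amounts to something like the sketch below. `demo_off_t` is a hypothetical stand-in for a 64-bit `curl_off_t`, and `demo_str_to_off()` is an illustrative wrapper, not the `curlx_strtoofft()` macro itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a 64-bit curl_off_t. */
typedef long long demo_off_t;

/* Parse a decimal number into *out.
   Returns 0 on success, -1 on malformed input or overflow. */
int demo_str_to_off(const char *str, demo_off_t *out)
{
  char *end;
  long long value;
  assert(str != NULL && out != NULL);
  errno = 0;
  value = strtoll(str, &end, 10);
  /* reject empty input, trailing garbage and out-of-range values */
  if(end == str || *end != '\0' || errno == ERANGE)
    return -1;
  *out = value;
  return 0;
}
```

Values above 4 GB are the reason a plain `strtol()` is not enough wherever `long` is 32 bits.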
|
||||
|
||||
`curlx_tvnow()`
|
||||
---------------
|
||||
Returns a struct timeval for the current time.
|
||||
|
||||
`curlx_tvdiff()`
|
||||
--------------
|
||||
Returns the difference between two timeval structs, in number of
|
||||
milliseconds.
|
||||
|
||||
`curlx_tvdiff_secs()`
|
||||
---------------------
|
||||
Returns the same difference as curlx_tvdiff, but in seconds with full usec resolution (as a
|
||||
double)
|
||||
|
||||
Future
|
||||
------
|
||||
|
||||
Several functions will be removed from the public curl_ name space in a
|
||||
future libcurl release. They will then only become available as curlx_
|
||||
functions instead. To make the transition easier, we already today provide
|
||||
these functions with the curlx_ prefix to allow sources to get built properly
|
||||
with the new function names. The functions this concerns are:
|
||||
|
||||
- `curlx_getenv`
|
||||
- `curlx_strequal`
|
||||
- `curlx_strnequal`
|
||||
- `curlx_mvsnprintf`
|
||||
- `curlx_msnprintf`
|
||||
- `curlx_maprintf`
|
||||
- `curlx_mvaprintf`
|
||||
- `curlx_msprintf`
|
||||
- `curlx_mprintf`
|
||||
- `curlx_mfprintf`
|
||||
- `curlx_mvsprintf`
|
||||
- `curlx_mvprintf`
|
||||
- `curlx_mvfprintf`
|
||||
|
||||
<a name="contentencoding"></a>
|
||||
Content Encoding
|
||||
================
|
||||
|
||||
## About content encodings
|
||||
|
||||
[HTTP/1.1][4] specifies that a client may request that a server encode its
|
||||
response. This is usually used to compress a response using one of a set of
|
||||
commonly available compression techniques. These schemes are 'deflate' (the
|
||||
zlib algorithm), 'gzip' and 'compress'. A client requests that the server
|
||||
perform an encoding by including an Accept-Encoding header in the request
|
||||
document. The value of the header should be one of the recognized tokens
|
||||
'deflate', ... (there's a way to register new schemes/tokens, see sec 3.5 of
|
||||
the spec). A server MAY honor the client's encoding request. When a response
|
||||
is encoded, the server includes a Content-Encoding header in the
|
||||
response. The value of the Content-Encoding header indicates which scheme was
|
||||
used to encode the data.
|
||||
|
||||
A client may tell a server that it can understand several different encoding
|
||||
schemes. In this case the server may choose any one of those and use it to
|
||||
encode the response (indicating which one using the Content-Encoding header).
|
||||
It's also possible for a client to attach priorities to different schemes so
|
||||
that the server knows which it prefers. See sec 14.3 of RFC 2616 for more
|
||||
information on the Accept-Encoding header.
|
||||
|
||||
## Supported content encodings
|
||||
|
||||
The 'deflate' and 'gzip' content encodings are supported by libcurl. Both
|
||||
regular and chunked transfers work fine. The zlib library is required for
|
||||
this feature.
|
||||
|
||||
## The libcurl interface
|
||||
|
||||
To cause libcurl to request a content encoding use:
|
||||
|
||||
[`curl_easy_setopt`][1](curl, [`CURLOPT_ACCEPT_ENCODING`][5], string)
|
||||
|
||||
where string is the intended value of the Accept-Encoding header.
|
||||
|
||||
Currently, libcurl only understands how to process responses that use the
|
||||
"deflate" or "gzip" Content-Encoding, so the only values for
|
||||
[`CURLOPT_ACCEPT_ENCODING`][5] that will work (besides "identity," which does
|
||||
nothing) are "deflate" and "gzip". If a response is encoded using the
"compress" method, libcurl will return an error indicating that the
|
||||
response could not be decoded. If <string> is NULL no Accept-Encoding header
|
||||
is generated. If <string> is a zero-length string, then an Accept-Encoding
|
||||
header containing all supported encodings will be generated.
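
The three cases for the option value (NULL, empty string, explicit string) can be modeled with a small C sketch. This is an illustration of the documented behavior only, not libcurl's implementation: the function name `accept_encoding_header` is hypothetical, and the list "deflate, gzip" used for the empty-string case is an assumption based on the two encodings named in this section.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the documented behavior. Writes the header line
   into buf and returns 1, or returns 0 when no header should be sent. */
static int accept_encoding_header(const char *value, char *buf, size_t len)
{
  if(!value)
    return 0;                      /* NULL: no Accept-Encoding header */
  if(strlen(value) == 0)
    value = "deflate, gzip";       /* assumed list of supported encodings */
  snprintf(buf, len, "Accept-Encoding: %s", value);
  return 1;
}
```

Passing "gzip" would produce `Accept-Encoding: gzip`, while passing an empty string asks for everything the build supports.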
|
||||
|
||||
The [`CURLOPT_ACCEPT_ENCODING`][5] option must be set to a non-NULL value for
|
||||
content to be automatically decoded. If it is not set and the server still
|
||||
sends encoded content (despite not having been asked), the data is returned
|
||||
in its raw form and the Content-Encoding type is not checked.
|
||||
|
||||
## The curl interface
|
||||
|
||||
Use the [--compressed][6] option with curl to cause it to ask servers to
|
||||
compress responses using any format supported by curl.
|
||||
|
||||
<a name="hostip"></a>
|
||||
hostip.c explained
|
||||
==================
|
||||
|
||||
The main compile-time defines to keep in mind when reading the host*.c source
|
||||
files are these:
|
||||
|
||||
## `CURLRES_IPV6`
|
||||
|
||||
This host has getaddrinfo() and family, and thus we use that. The host may
|
||||
not be able to resolve IPv6, but we don't really have to take that into
|
||||
account. Hosts that aren't IPv6-enabled have CURLRES_IPV4 defined.
|
||||
|
||||
## `CURLRES_ARES`
|
||||
|
||||
Defined if libcurl is built to use c-ares for asynchronous name
|
||||
resolves. This can be Windows or *nix.
|
||||
|
||||
## `CURLRES_THREADED`
|
||||
|
||||
Defined if libcurl is built to use threading for asynchronous name
|
||||
resolves. The name resolve will be done in a new thread, and the supported
|
||||
asynch API will be the same as for ares-builds. This is the default under
|
||||
(native) Windows.
|
||||
|
||||
If either of the two previous is defined, `CURLRES_ASYNCH` is defined too. If
|
||||
libcurl is not built to use an asynchronous resolver, `CURLRES_SYNCH` is
|
||||
defined.
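
The relationship between these defines can be sketched with the preprocessor. This is a simplified illustration of the rule stated above, not the actual contents of hostip.h; the sketch pretends the build uses the threaded resolver, and `resolver_is_async` is a hypothetical helper added only to make the result observable.

```c
/* Assumption for the sketch: this build resolves names in a thread. */
#define CURLRES_THREADED 1

/* Either asynchronous backend implies CURLRES_ASYNCH; otherwise the
   build is synchronous. */
#if defined(CURLRES_ARES) || defined(CURLRES_THREADED)
#define CURLRES_ASYNCH 1
#else
#define CURLRES_SYNCH 1
#endif

/* Hypothetical helper: returns 1 for an asynchronous build, else 0. */
static int resolver_is_async(void)
{
#ifdef CURLRES_ASYNCH
  return 1;
#else
  return 0;
#endif
}
```

Removing the `CURLRES_THREADED` define at the top, without defining `CURLRES_ARES`, would flip the sketch to the `CURLRES_SYNCH` branch.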
|
||||
|
||||
## host*.c sources
|
||||
|
||||
The host*.c source files are split up like this:
|
||||
|
||||
- hostip.c - method-independent resolver functions and utility functions
|
||||
- hostasyn.c - functions for asynchronous name resolves
|
||||
- hostsyn.c - functions for synchronous name resolves
|
||||
- asyn-ares.c - functions for asynchronous name resolves using c-ares
|
||||
- asyn-thread.c - functions for asynchronous name resolves using threads
|
||||
- hostip4.c - IPv4 specific functions
|
||||
- hostip6.c - IPv6 specific functions
|
||||
|
||||
hostip.h is the single unified header file for all this. It defines the
|
||||
`CURLRES_*` defines based on the config*.h and curl_setup.h defines.
|
||||
|
||||
<a name="memoryleak"></a>
|
||||
Track Down Memory Leaks
|
||||
=======================
|
||||
|
||||
## Single-threaded
|
||||
|
||||
Please note that this memory leak system is not adjusted to work in more
|
||||
than one thread. If you want or need to use it in a multi-threaded app, please
adjust accordingly.
|
||||
|
||||
|
||||
## Build
|
||||
|
||||
Rebuild libcurl with -DCURLDEBUG (usually, rerunning configure with
|
||||
--enable-debug fixes this). 'make clean' first, then 'make' so that all
|
||||
files actually are rebuilt properly. It will also make sense to build
|
||||
libcurl with the debug option (usually -g to the compiler) so that debugging
|
||||
it will be easier if you actually do find a leak in the library.
|
||||
|
||||
This will create a library that has memory debugging enabled.
|
||||
|
||||
## Modify Your Application
|
||||
|
||||
Add a line in your application code:
|
||||
|
||||
`curl_memdebug("dump");`
|
||||
|
||||
This will make the malloc debug system output a full trace of all resource
|
||||
using functions to the given file name. Make sure you rebuild your program
|
||||
and that you link with the same libcurl you built for this purpose as
|
||||
described above.
|
||||
|
||||
## Run Your Application
|
||||
|
||||
Run your program as usual. Watch the specified memory trace file grow.
|
||||
|
||||
Make your program exit and use the proper libcurl cleanup functions etc., so
that all non-leaks are returned/freed properly.
|
||||
|
||||
## Analyze the Flow
|
||||
|
||||
Use the tests/memanalyze.pl perl script to analyze the dump file:
|
||||
|
||||
tests/memanalyze.pl dump
|
||||
|
||||
This now outputs a report on which resources were allocated but never
freed etc. This report is well suited for posting to the mailing list!
|
||||
|
||||
If this doesn't produce any output, no leak was detected in libcurl. Then
|
||||
the leak is most likely to be in your code.
|
||||
|
||||
<a name="multi_socket"></a>
|
||||
`multi_socket`
|
||||
==============
|
||||
|
||||
Implementation of the `curl_multi_socket` API
|
||||
|
||||
The main ideas of this API are simply:
|
||||
|
||||
1 - The application can use whatever event system it likes as it gets info
|
||||
from libcurl about what file descriptors libcurl waits for what action
|
||||
on. (The previous API returns `fd_sets` which is very select()-centric).
|
||||
|
||||
2 - When the application discovers action on a single socket, it calls
|
||||
libcurl and informs that there was action on this particular socket and
|
||||
libcurl can then act on that socket/transfer only and not care about
|
||||
any other transfers. (The previous API always had to scan through all
|
||||
the existing transfers.)
|
||||
|
||||
The idea is that [`curl_multi_socket_action()`][7] calls a given callback
|
||||
with information about what socket to wait for what action on, and the
|
||||
callback only gets called if the status of that socket has changed.
|
||||
|
||||
We also added a timer callback that makes libcurl call the application when
|
||||
the timeout value changes, and you set that with [`curl_multi_setopt()`][9]
|
||||
and the [`CURLMOPT_TIMERFUNCTION`][10] option. To get this to work,
internally there is a struct added to each easy handle in which we store
|
||||
an "expire time" (if any). The structs are then "splay sorted" so that we
|
||||
can add and remove times from the linked list and yet somewhat swiftly
|
||||
figure out both how much time there is until the next nearest timer expires
|
||||
and which timer (handle) we should take care of now. Of course, the upside
|
||||
of all this is that we get a [`curl_multi_timeout()`][8] that should also
|
||||
work with old-style applications that use [`curl_multi_perform()`][11].
|
||||
|
||||
We created an internal "socket to easy handles" hash table that, given
a socket (file descriptor), returns the easy handle that waits for action on
|
||||
that socket. This hash is made using the already existing hash code
|
||||
(previously only used for the DNS cache).
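
To make the "socket to easy handle" lookup concrete, here is a minimal sketch in plain C. It is not libcurl's hash code: it swaps in a small fixed-size table with linear probing, uses a `void *` as a stand-in for the easy handle, and all names (`sock_table`, `sock_add`, `sock_lookup`, `SLOTS`) are hypothetical. Removal is left out to keep the sketch short.

```c
#include <stddef.h>

#define SLOTS 64   /* assumed table size for the sketch */

struct sock_entry {
  int sockfd;      /* key: the socket (file descriptor) */
  void *easy;      /* value: stand-in for the easy handle */
  int used;        /* non-zero when this slot holds an entry */
};

struct sock_table {
  struct sock_entry slot[SLOTS];   /* must start out zeroed */
};

/* Return the slot holding sockfd, or the free slot where it would be
   stored. Returns NULL when the table is full. */
static struct sock_entry *sock_find(struct sock_table *t, int sockfd)
{
  size_t i;
  size_t start = (size_t)(unsigned int)sockfd % SLOTS;

  for(i = 0; i < SLOTS; i++) {
    struct sock_entry *e = &t->slot[(start + i) % SLOTS];
    if(!e->used || e->sockfd == sockfd)
      return e;
  }
  return NULL;
}

/* Remember which handle waits for action on sockfd. 0 on success. */
static int sock_add(struct sock_table *t, int sockfd, void *easy)
{
  struct sock_entry *e = sock_find(t, sockfd);
  if(!e)
    return -1;
  e->sockfd = sockfd;
  e->easy = easy;
  e->used = 1;
  return 0;
}

/* Given a socket, return the handle waiting on it, or NULL. */
static void *sock_lookup(struct sock_table *t, int sockfd)
{
  struct sock_entry *e = sock_find(t, sockfd);
  return (e && e->used) ? e->easy : NULL;
}
```

The point of the table is that when the application reports action on one socket, the matching transfer is found directly instead of by scanning every transfer.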
|
||||
|
||||
To make libcurl able to report plain sockets in the socket callback, we had
|
||||
to re-organize the internals of the [`curl_multi_fdset()`][12] etc so that
|
||||
the conversion from sockets to `fd_sets` for that function is only done in
|
||||
the last step before the data is returned. I also had to extend c-ares to
|
||||
get a function that can return plain sockets, as that library too returned
|
||||
only `fd_sets` and that is no longer good enough. The changes done to c-ares
|
||||
are available in c-ares 1.3.1 and later.
|
||||
|
||||
|
||||
[1]: http://curl.haxx.se/libcurl/c/curl_easy_setopt.html
|
||||
[2]: http://curl.haxx.se/libcurl/c/curl_easy_init.html
|
||||
[3]: http://c-ares.haxx.se/
|
||||
[4]: https://tools.ietf.org/html/rfc7230 "RFC 7230"
|
||||
[5]: http://curl.haxx.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html
|
||||
[6]: http://curl.haxx.se/docs/manpage.html#--compressed
|
||||
[7]: http://curl.haxx.se/libcurl/c/curl_multi_socket_action.html
|
||||
[8]: http://curl.haxx.se/libcurl/c/curl_multi_timeout.html
|
||||
[9]: http://curl.haxx.se/libcurl/c/curl_multi_setopt.html
|
||||
[10]: http://curl.haxx.se/libcurl/c/CURLMOPT_TIMERFUNCTION.html
|
||||
[11]: http://curl.haxx.se/libcurl/c/curl_multi_perform.html
|
||||
[12]: http://curl.haxx.se/libcurl/c/curl_multi_fdset.html
|
@ -21,9 +21,6 @@
|
||||
###########################################################################
|
||||
AUTOMAKE_OPTIONS = foreign nostdinc
|
||||
|
||||
DOCS = README.encoding README.memoryleak README.ares README.curlx \
|
||||
README.hostip README.multi_socket README.httpauth README.curl_off_t
|
||||
|
||||
CMAKE_DIST = CMakeLists.txt curl_config.h.cmake
|
||||
|
||||
EXTRA_DIST = Makefile.b32 Makefile.m32 Makefile.vc6 config-win32.h \
|
||||
@ -31,7 +28,7 @@ EXTRA_DIST = Makefile.b32 Makefile.m32 Makefile.vc6 config-win32.h \
|
||||
makefile.dj config-dos.h libcurl.plist libcurl.rc config-amigaos.h \
|
||||
makefile.amiga Makefile.netware nwlib.c nwos.c config-win32ce.h \
|
||||
config-os400.h setup-os400.h config-symbian.h Makefile.Watcom \
|
||||
config-tpf.h $(DOCS) mk-ca-bundle.pl mk-ca-bundle.vbs $(CMAKE_DIST) \
|
||||
config-tpf.h mk-ca-bundle.pl mk-ca-bundle.vbs $(CMAKE_DIST) \
|
||||
firefox-db2pem.sh config-vxworks.h Makefile.vxworks checksrc.pl \
|
||||
objnames-test08.sh objnames-test10.sh objnames.inc checksrc.whitelist
|
||||
|
||||
|
@ -1,69 +0,0 @@
|
||||
_ _ ____ _
|
||||
___| | | | _ \| |
|
||||
/ __| | | | |_) | |
|
||||
| (__| |_| | _ <| |___
|
||||
\___|\___/|_| \_\_____|
|
||||
|
||||
How To Build libcurl to Use c-ares For Asynch Name Resolves
|
||||
===========================================================
|
||||
|
||||
c-ares:
|
||||
http://c-ares.haxx.se/
|
||||
|
||||
NOTE
|
||||
The latest libcurl version requires c-ares 1.6.0 or later.
|
||||
|
||||
Once upon the time libcurl built fine with the "original" ares. That is no
|
||||
longer true. You need to use c-ares.
|
||||
|
||||
Build c-ares
|
||||
============
|
||||
|
||||
1. unpack the c-ares archive
|
||||
2. cd c-ares-dir
|
||||
3. ./configure
|
||||
4. make
|
||||
5. make install
|
||||
|
||||
Build libcurl to use c-ares in the curl source tree
|
||||
===================================================
|
||||
|
||||
1. name or symlink the c-ares source directory 'ares' in the curl source
|
||||
directory
|
||||
2. ./configure --enable-ares
|
||||
|
||||
Optionally, you can point out the c-ares install tree root with the the
|
||||
--enable-ares option.
|
||||
|
||||
3. make
|
||||
|
||||
Build libcurl to use an installed c-ares
|
||||
========================================
|
||||
|
||||
1. ./configure --enable-ares=/path/to/ares/install
|
||||
2. make
|
||||
|
||||
c-ares on win32
|
||||
===============
|
||||
(description brought by Dominick Meglio)
|
||||
|
||||
First I compiled c-ares. I changed the default C runtime library to be the
|
||||
single-threaded rather than the multi-threaded (this seems to be required to
|
||||
prevent linking errors later on). Then I simply build the areslib project (the
|
||||
other projects adig/ahost seem to fail under MSVC).
|
||||
|
||||
Next was libcurl. I opened lib/config-win32.h and I added a:
|
||||
#define USE_ARES 1
|
||||
|
||||
Next thing I did was I added the path for the ares includes to the include
|
||||
path, and the libares.lib to the libraries.
|
||||
|
||||
Lastly, I also changed libcurl to be single-threaded rather than
|
||||
multi-threaded, again this was to prevent some duplicate symbol errors. I'm
|
||||
not sure why I needed to change everything to single-threaded, but when I
|
||||
didn't I got redefinition errors for several CRT functions (malloc, stricmp,
|
||||
etc.)
|
||||
|
||||
I would have modified the MSVC++ project files, but I only have VC.NET and it
|
||||
uses a different format than VC6.0 so I didn't want to go and change
|
||||
everything and remove VC6.0 support from libcurl.
|
@ -1,68 +0,0 @@
|
||||
|
||||
curl_off_t explained
|
||||
====================
|
||||
|
||||
curl_off_t is a data type provided by the external libcurl include headers. It
|
||||
is the type meant to be used for the curl_easy_setopt() options that end with
|
||||
LARGE. The type is 64bit large on most modern platforms.
|
||||
|
||||
Transition from < 7.19.0 to >= 7.19.0
|
||||
-------------------------------------
|
||||
|
||||
Applications that used libcurl before 7.19.0 that are rebuilt with a libcurl
|
||||
that is 7.19.0 or later may or may not have to worry about anything of
|
||||
this. We have made a significant effort to make the transition really seamless
|
||||
and transparent.
|
||||
|
||||
You have have to take notice if you are in one of the following situations:
|
||||
|
||||
o Your app is using or will after the transition use a libcurl that is built
|
||||
with LFS (large file support) disabled even though your system otherwise
|
||||
supports it.
|
||||
|
||||
o Your app is using or will after the transition use a libcurl that doesn't
|
||||
support LFS at all, but your system and compiler support 64bit data types.
|
||||
|
||||
In both these cases, the curl_off_t type will now (after the transition) be
|
||||
64bit where it previously was 32bit. This will cause a binary incompatibility
|
||||
that you MAY need to deal with.
|
||||
|
||||
Benefits
|
||||
--------
|
||||
|
||||
This new way has several benefits:
|
||||
|
||||
o Platforms without LFS support can still use libcurl to do >32 bit file
|
||||
transfers and range operations etc as long as they have >32 bit data-types
|
||||
supported.
|
||||
|
||||
o Applications will no longer easily build with the curl_off_t size
|
||||
mismatched, which has been a very frequent (and annoying) problem with
|
||||
libcurl <= 7.18.2
|
||||
|
||||
Historically
|
||||
------------
|
||||
|
||||
Previously, before 7.19.0, the curl_off_t type would be rather strongly
|
||||
connected to the size of the system off_t type, where currently curl_off_t is
|
||||
independent of that.
|
||||
|
||||
The strong connection to off_t made it troublesome for application authors
|
||||
since when they did mistakes, they could get curl_off_t type of different
|
||||
sizes in the app vs libcurl, and that caused strange effects that were hard to
|
||||
track and detect by users of libcurl.
|
||||
|
||||
SONAME
|
||||
------
|
||||
|
||||
We opted to not bump the soname for the library unconditionally, simply
|
||||
because soname bumping is causing a lot of grief and moaning all over the
|
||||
community so we try to keep that at minimum. Also, our selected design path
|
||||
should be 100% backwards compatible for the vast majority of all libcurl
|
||||
users.
|
||||
|
||||
Enforce SONAME bump
|
||||
-------------------
|
||||
|
||||
If configure doesn't detect your case where a bump is necessary, re-run it
|
||||
with the --enable-soname-bump command line option!
|
@ -1,61 +0,0 @@
|
||||
_ _ ____ _
|
||||
___| | | | _ \| |
|
||||
/ __| | | | |_) | |
|
||||
| (__| |_| | _ <| |___
|
||||
\___|\___/|_| \_\_____|
|
||||
|
||||
Source Code Functions Apps Might Use
|
||||
====================================
|
||||
|
||||
The libcurl source code offers a few functions by source only. They are not
|
||||
part of the official libcurl API, but the source files might be useful for
|
||||
others so apps can optionally compile/build with these sources to gain
|
||||
additional functions.
|
||||
|
||||
We provide them through a single header file for easy access for apps:
|
||||
"curlx.h"
|
||||
|
||||
curlx_strtoofft()
|
||||
|
||||
A macro that converts a string containing a number to a curl_off_t number.
|
||||
This might use the curlx_strtoll() function which is provided as source
|
||||
code in strtoofft.c. Note that the function is only provided if no
|
||||
strtoll() (or equivalent) function exist on your platform. If curl_off_t
|
||||
is only a 32 bit number on your platform, this macro uses strtol().
|
||||
|
||||
curlx_tvnow()
|
||||
|
||||
returns a struct timeval for the current time.
|
||||
|
||||
curlx_tvdiff()
|
||||
|
||||
returns the difference between two timeval structs, in number of
|
||||
milliseconds.
|
||||
|
||||
curlx_tvdiff_secs()
|
||||
|
||||
returns the same as curlx_tvdiff but with full usec resolution (as a
|
||||
double)
|
||||
|
||||
FUTURE
|
||||
======
|
||||
|
||||
Several functions will be removed from the public curl_ name space in a
|
||||
future libcurl release. They will then only become available as curlx_
|
||||
functions instead. To make the transition easier, we already today provide
|
||||
these functions with the curlx_ prefix to allow sources to get built properly
|
||||
with the new function names. The functions this concerns are:
|
||||
|
||||
curlx_getenv
|
||||
curlx_strequal
|
||||
curlx_strnequal
|
||||
curlx_mvsnprintf
|
||||
curlx_msnprintf
|
||||
curlx_maprintf
|
||||
curlx_mvaprintf
|
||||
curlx_msprintf
|
||||
curlx_mprintf
|
||||
curlx_mfprintf
|
||||
curlx_mvsprintf
|
||||
curlx_mvprintf
|
||||
curlx_mvfprintf
|
@ -1,60 +0,0 @@
|
||||
|
||||
Content Encoding Support for libcurl
|
||||
|
||||
* About content encodings:
|
||||
|
||||
HTTP/1.1 [RFC 2616] specifies that a client may request that a server encode
|
||||
its response. This is usually used to compress a response using one of a set
|
||||
of commonly available compression techniques. These schemes are `deflate' (the
|
||||
zlib algorithm), `gzip' and `compress' [sec 3.5, RFC 2616]. A client requests
|
||||
that the sever perform an encoding by including an Accept-Encoding header in
|
||||
the request document. The value of the header should be one of the recognized
|
||||
tokens `deflate', ... (there's a way to register new schemes/tokens, see sec
|
||||
3.5 of the spec). A server MAY honor the client's encoding request. When a
|
||||
response is encoded, the server includes a Content-Encoding header in the
|
||||
response. The value of the Content-Encoding header indicates which scheme was
|
||||
used to encode the data.
|
||||
|
||||
A client may tell a server that it can understand several different encoding
|
||||
schemes. In this case the server may choose any one of those and use it to
|
||||
encode the response (indicating which one using the Content-Encoding header).
|
||||
It's also possible for a client to attach priorities to different schemes so
|
||||
that the server knows which it prefers. See sec 14.3 of RFC 2616 for more
|
||||
information on the Accept-Encoding header.
|
||||
|
||||
* Current support for content encoding:
|
||||
|
||||
Support for the 'deflate' and 'gzip' content encoding are supported by
|
||||
libcurl. Both regular and chunked transfers should work fine. The library
|
||||
zlib is required for this feature. 'deflate' support was added by James
|
||||
Gallagher, and support for the 'gzip' encoding was added by Dan Fandrich.
|
||||
|
||||
* The libcurl interface:
|
||||
|
||||
To cause libcurl to request a content encoding use:
|
||||
|
||||
curl_easy_setopt(curl, CURLOPT_ACCEPT_ENCODING, <string>)
|
||||
|
||||
where <string> is the intended value of the Accept-Encoding header.
|
||||
|
||||
Currently, libcurl only understands how to process responses that use the
|
||||
"deflate" or "gzip" Content-Encoding, so the only values for
|
||||
CURLOPT_ACCEPT_ENCODING that will work (besides "identity," which does
|
||||
nothing) are "deflate" and "gzip" If a response is encoded using the
|
||||
"compress" or methods, libcurl will return an error indicating that the
|
||||
response could not be decoded. If <string> is NULL no Accept-Encoding header
|
||||
is generated. If <string> is a zero-length string, then an Accept-Encoding
|
||||
header containing all supported encodings will be generated.
|
||||
|
||||
The CURLOPT_ACCEPT_ENCODING must be set to any non-NULL value for content to
|
||||
be automatically decoded. If it is not set and the server still sends encoded
|
||||
content (despite not having been asked), the data is returned in its raw form
|
||||
and the Content-Encoding type is not checked.
|
||||
|
||||
* The curl interface:
|
||||
|
||||
Use the --compressed option with curl to cause it to ask servers to compress
|
||||
responses using any format supported by curl.
|
||||
|
||||
James Gallagher <jgallagher@gso.uri.edu>
|
||||
Dan Fandrich <dan@coneharvesters.com>
|
@ -1,35 +0,0 @@
|
||||
hostip.c explained
|
||||
==================
|
||||
|
||||
The main COMPILE-TIME DEFINES to keep in mind when reading the host*.c
|
||||
source file are these:
|
||||
|
||||
CURLRES_IPV6 - this host has getaddrinfo() and family, and thus we use
|
||||
that. The host may not be able to resolve IPv6, but we don't really have to
|
||||
take that into account. Hosts that aren't IPv6-enabled have CURLRES_IPV4
|
||||
defined.
|
||||
|
||||
CURLRES_ARES - is defined if libcurl is built to use c-ares for asynchronous
|
||||
name resolves. This can be Windows or *nix.
|
||||
|
||||
CURLRES_THREADED - is defined if libcurl is built to use threading for
|
||||
asynchronous name resolves. The name resolve will be done in a new thread,
|
||||
and the supported asynch API will be the same as for ares-builds. This is
|
||||
the default under (native) Windows.
|
||||
|
||||
If any of the two previous are defined, CURLRES_ASYNCH is defined too. If
|
||||
libcurl is not built to use an asynchronous resolver, CURLRES_SYNCH is
|
||||
defined.
|
||||
|
||||
The host*.c sources files are split up like this:
|
||||
|
||||
hostip.c - method-independent resolver functions and utility functions
|
||||
hostasyn.c - functions for asynchronous name resolves
|
||||
hostsyn.c - functions for synchronous name resolves
|
||||
asyn-ares.c - functions for asynchronous name resolves using c-ares
|
||||
asyn-thread.c - functions for asynchronous name resolves using threads
|
||||
hostip4.c - IPv4 specific functions
|
||||
hostip6.c - IPv6 specific functions
|
||||
|
||||
The hostip.h is the single united header file for all this. It defines the
|
||||
CURLRES_* defines based on the config*.h and curl_setup.h defines.
|
@ -1,74 +0,0 @@
|
||||
|
||||
1. PUT/POST without a known auth to use (possibly no auth required):
|
||||
|
||||
(When explicitly set to use a multi-pass auth when doing a POST/PUT,
|
||||
libcurl should immediately go the Content-Length: 0 bytes route to avoid
|
||||
the first send all data phase, step 2. If told to use a single-pass auth,
|
||||
goto step 3.)
|
||||
|
||||
Issue the proper PUT/POST request immediately, with the correct
|
||||
Content-Length and Expect: headers.
|
||||
|
||||
If a 100 response is received or the wait for one times out, start sending
|
||||
the request-body.
|
||||
|
||||
If a 401 (or 407 when talking through a proxy) is received, then:
|
||||
|
||||
If we have "more than just a little" data left to send, close the
|
||||
connection. Exactly what "more than just a little" means will have to be
|
||||
determined. Possibly the current transfer speed should be taken into
|
||||
account as well.
|
||||
|
||||
NOTE: if the size of the POST data is less than MAX_INITIAL_POST_SIZE (when
|
||||
CURLOPT_POSTFIELDS is used), libcurl will send everything in one single
|
||||
write() (all request-headers and request-body) and thus it will
|
||||
unconditionally send the full post data here.
|
||||
|
||||
2. PUT/POST with multi-pass auth but not yet completely negotiated:
|
||||
|
||||
Send a PUT/POST request, we know that it will be rejected and thus we claim
|
||||
Content-Length zero to avoid having to send the request-body. (This seems
|
||||
to be what IE does.)
|
||||
|
||||
3. PUT/POST as the last step in the auth negotiation, that is when we have
|
||||
what we believe is a completed negotiation:
|
||||
|
||||
Send a full and proper PUT/POST request (again) with the proper
|
||||
Content-Length and a following request-body.
|
||||
|
||||
NOTE: this may very well be the second (or even third) time the whole or at
|
||||
least parts of the request body is sent to the server. Since the data may
|
||||
be provided to libcurl with a callback, we need a way to tell the app that
|
||||
the upload is to be restarted so that the callback will provide data from
|
||||
the start again. This requires an API method/mechanism that libcurl
|
||||
doesn't have today. See below.
|
||||
|
||||
Data Rewind
|
||||
|
||||
It will be troublesome for some apps to deal with a rewind like this in all
|
||||
circumstances. I'm thinking for example when using 'curl' to upload data
|
||||
from stdin. If libcurl ends up having to rewind the reading for a request
|
||||
to succeed, of course a lack of this callback or if it returns failure, will
|
||||
cause the request to fail completely.
|
||||
|
||||
The new callback is set with CURLOPT_IOCTLFUNCTION (in an attempt to add a
|
||||
more generic function that might be used for other IO-related controls in
|
||||
the future):
|
||||
|
||||
curlioerr curl_ioctl(CURL *handle, curliocmd cmd, void *clientp);
|
||||
|
||||
And in the case where the read is to be rewinded, it would be called with a
|
||||
cmd named CURLIOCMD_RESTARTREAD. The callback would then return CURLIOE_OK,
|
||||
if things are fine, or CURLIOE_FAILRESTART if not.
|
||||
|
||||
Backwards Compatibility
|
||||
|
||||
The approach used until now, that issues a HEAD on the given URL to trigger
|
||||
the auth negotiation could still be supported and encouraged, but it would
|
||||
be up to the app to first fetch a URL with GET/HEAD to negotiate on, since
|
||||
then a following PUT/POST wouldn't need to negotiate authentication and
|
||||
thus avoid double-sending data.
|
||||
|
||||
Optionally, we keep the current approach if some option is set
|
||||
(CURLOPT_HEADBEFOREAUTH or similar), since it seems to work fairly well for
|
||||
POST on most servers.
|
@ -1,55 +0,0 @@
|
||||
_ _ ____ _
|
||||
___| | | | _ \| |
|
||||
/ __| | | | |_) | |
|
||||
| (__| |_| | _ <| |___
|
||||
\___|\___/|_| \_\_____|
|
||||
|
||||
How To Track Down Suspected Memory Leaks in libcurl
|
||||
===================================================
|
||||
|
||||
Single-threaded
|
||||
|
||||
Please note that this memory leak system is not adjusted to work in more
|
||||
than one thread. If you want/need to use it in a multi-threaded app. Please
|
||||
adjust accordingly.
|
||||
|
||||
|
||||
Build
|
||||
|
||||
Rebuild libcurl with -DCURLDEBUG (usually, rerunning configure with
|
||||
--enable-debug fixes this). 'make clean' first, then 'make' so that all
|
||||
files actually are rebuilt properly. It will also make sense to build
|
||||
libcurl with the debug option (usually -g to the compiler) so that debugging
|
||||
it will be easier if you actually do find a leak in the library.
|
||||
|
||||
This will create a library that has memory debugging enabled.
|
||||
|
||||
Modify Your Application
|
||||
|
||||
Add a line in your application code:
|
||||
|
||||
curl_memdebug("dump");
|
||||
|
||||
This will make the malloc debug system output a full trace of all resource
|
||||
using functions to the given file name. Make sure you rebuild your program
|
||||
and that you link with the same libcurl you built for this purpose as
|
||||
described above.
|
||||
|
||||
Run Your Application
|
||||
|
||||
Run your program as usual. Watch the specified memory trace file grow.
|
||||
|
||||
Make your program exit and use the proper libcurl cleanup functions etc. So
|
||||
that all non-leaks are returned/freed properly.
|
||||
|
||||
Analyze the Flow
|
||||
|
||||
Use the tests/memanalyze.pl perl script to analyze the dump file:
|
||||
|
||||
tests/memanalyze.pl dump
|
||||
|
||||
This now outputs a report on what resources that were allocated but never
|
||||
freed etc. This report is very fine for posting to the list!
|
||||
|
||||
If this doesn't produce any output, no leak was detected in libcurl. Then
|
||||
the leak is mostly likely to be in your code.
|
@ -1,53 +0,0 @@
|
||||
Implementation of the curl_multi_socket API
|
||||
|
||||
The main ideas of the new API are simply:
|
||||
|
||||
1 - The application can use whatever event system it likes as it gets info
|
||||
from libcurl about what file descriptors libcurl waits for what action
|
||||
on. (The previous API returns fd_sets which is very select()-centric).
|
||||
|
||||
2 - When the application discovers action on a single socket, it calls
|
||||
libcurl and informs that there was action on this particular socket and
|
||||
libcurl can then act on that socket/transfer only and not care about
|
||||
any other transfers. (The previous API always had to scan through all
|
||||
the existing transfers.)
|
||||
|
||||
The idea is that curl_multi_socket_action() calls a given callback with
|
||||
information about what socket to wait for what action on, and the callback
|
||||
only gets called if the status of that socket has changed.
|
||||
|
||||
We also added a timer callback that makes libcurl call the application when
|
||||
the timeout value changes, and you set that with curl_multi_setopt() and the
|
||||
CURLMOPT_TIMERFUNCTION option. To get this to work, Internally, there's an
|
||||
added a struct to each easy handle in which we store an "expire time" (if
|
||||
any). The structs are then "splay sorted" so that we can add and remove
|
||||
times from the linked list and yet somewhat swiftly figure out both how long
|
||||
time there is until the next nearest timer expires and which timer (handle)
|
||||
we should take care of now. Of course, the upside of all this is that we get
|
||||
a curl_multi_timeout() that should also work with old-style applications
|
||||
that use curl_multi_perform().
|
||||
|
||||
We created an internal "socket to easy handles" hash table that given
|
||||
a socket (file descriptor) return the easy handle that waits for action on
|
||||
that socket. This hash is made using the already existing hash code
|
||||
(previously only used for the DNS cache).
|
||||
|
||||
To make libcurl able to report plain sockets in the socket callback, we had
|
||||
to re-organize the internals of the curl_multi_fdset() etc so that the
|
||||
conversion from sockets to fd_sets for that function is only done in the
|
||||
last step before the data is returned. I also had to extend c-ares to get a
|
||||
function that can return plain sockets, as that library too returned only
|
||||
fd_sets and that is no longer good enough. The changes done to c-ares are
|
||||
available in c-ares 1.3.1 and later.
|
||||
|
||||
We have done a test runs with up to 9000 connections (with a single active
|
||||
one). The curl_multi_socket_action() invoke then takes less than 10
|
||||
microseconds in average (using the read-only-1-byte-at-a-time hack). We are
|
||||
now below the 60 microseconds "per socket action" goal (the extra 50 is the
|
||||
time libevent needs).
|
||||
|
||||
Documentation
|
||||
|
||||
http://curl.haxx.se/libcurl/c/curl_multi_socket_action.html
|
||||
http://curl.haxx.se/libcurl/c/curl_multi_timeout.html
|
||||
http://curl.haxx.se/libcurl/c/curl_multi_setopt.html
|