mirror of
https://github.com/moparisthebest/curl
synced 2025-03-07 20:59:41 -05:00
extended the proxy chapter mucho
parent
5b58e61f28
commit
14e9420d2c
@ -137,9 +137,22 @@ Handle the Easy libcurl
|
|||||||
|
|
||||||
It returns an easy handle. Using that you proceed to the next step: setting
|
It returns an easy handle. Using that you proceed to the next step: setting
|
||||||
up your preferred actions. A handle is just a logic entity for the upcoming
|
up your preferred actions. A handle is just a logic entity for the upcoming
|
||||||
transfer or series of transfers. One of the most basic properties to set in
|
transfer or series of transfers.
|
||||||
the handle is the URL. You set your preferred URL to transfer with
|
|
||||||
CURLOPT_URL in a manner similar to:
|
You set properties and options for this handle using curl_easy_setopt(). They
|
||||||
|
control how the subsequent transfer or transfers will be made. Options remain
|
||||||
|
set in the handle until set again to something different. Hence, multiple
|
||||||
|
requests using the same handle will use the same options.
|
||||||
|
|
||||||
|
Many of the options you set in libcurl are "strings", pointers to data
|
||||||
|
terminated with a zero byte. Keep in mind that when you set strings with
|
||||||
|
curl_easy_setopt(), libcurl will not copy the data. It will merely point to
|
||||||
|
the data. You MUST make sure that the data remains available for libcurl to
|
||||||
|
use until finished or until you use the same option again to point to
|
||||||
|
something else.
|
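The pointer-versus-copy point above can be illustrated without libcurl at all. The sketch below uses a hypothetical toy_handle struct and toy_set_url() helper (neither is part of libcurl) that store only the pointer they are given, which is the behaviour this text describes for string options:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a handle; NOT a libcurl type. */
struct toy_handle {
  const char *url; /* points at the caller's buffer, no copy is made */
};

/* Hypothetical setter that stores the pointer only. */
static void toy_set_url(struct toy_handle *h, const char *url)
{
  assert(h != NULL);
  h->url = url;
}

/* Returns 1 if the handle currently sees exactly the string 'expected'. */
static int toy_url_is(const struct toy_handle *h, const char *expected)
{
  return h->url != NULL && strcmp(h->url, expected) == 0;
}
```

If the caller's buffer is overwritten or goes out of scope after toy_set_url(), the handle sees the changed (or invalid) data. That is why the text asks you to keep the data available until the transfer is done.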
||||||
|
|
||||||
|
One of the most basic properties to set in the handle is the URL. You set
|
||||||
|
your preferred URL to transfer with CURLOPT_URL in a manner similar to:
|
||||||
|
|
||||||
curl_easy_setopt(easyhandle, CURLOPT_URL, "http://curl.haxx.se/");
|
curl_easy_setopt(easyhandle, CURLOPT_URL, "http://curl.haxx.se/");
|
||||||
|
|
||||||
@ -358,12 +371,16 @@ HTTP POSTing
|
|||||||
|
|
||||||
curl_easy_perform(easyhandle); /* post away! */
|
curl_easy_perform(easyhandle); /* post away! */
|
||||||
|
|
||||||
Simple enough, huh? Ok, so what if you want to post binary data that also
|
Simple enough, huh? Since you set the POST options with the
|
||||||
requires you to set the Content-Type: header of the post? Well, binary posts
|
CURLOPT_POSTFIELDS, this automatically switches the handle to use POST in the
|
||||||
prevents libcurl from being able to do strlen() on the data to figure out the
|
upcoming request.
|
||||||
size, so therefore we must tell libcurl the size of the post data. Setting
|
|
||||||
headers in libcurl requests are done in a generic way, by building a list of
|
Ok, so what if you want to post binary data that also requires you to set the
|
||||||
our own headers and then passing that list to libcurl.
|
Content-Type: header of the post? Well, binary posts prevent libcurl from
|
||||||
|
being able to do strlen() on the data to figure out the size, so therefore we
|
||||||
|
must tell libcurl the size of the post data. Setting headers in libcurl
|
||||||
|
requests is done in a generic way, by building a list of our own headers and
|
||||||
|
then passing that list to libcurl.
|
||||||
|
|
||||||
struct curl_slist *headers=NULL;
|
struct curl_slist *headers=NULL;
|
||||||
headers = curl_slist_append(headers, "Content-Type: text/xml");
|
headers = curl_slist_append(headers, "Content-Type: text/xml");
|
||||||
@ -416,14 +433,14 @@ HTTP POSTing
|
|||||||
/* free the post data again */
|
/* free the post data again */
|
||||||
curl_formfree(post);
|
curl_formfree(post);
|
||||||
|
|
||||||
The multipart formposts are a chain of parts using MIME-style separators and
|
Multipart formposts are chains of parts using MIME-style separators and
|
||||||
headers. That means that each of these separate parts get a few headers set
|
headers. It means that each one of these separate parts gets a few headers set
|
||||||
that describes its individual content-type, size etc. Now, to enable your
|
that describe the individual content-type, size etc. To enable your
|
||||||
application to handicraft this formpost even more, libcurl allows you to
|
application to handicraft this formpost even more, libcurl allows you to
|
||||||
supply your own custom headers to an individual form part. You can of course
|
supply your own set of custom headers to such an individual form part. You
|
||||||
supply headers to as many parts you like, but this little example will show
|
can of course supply headers to as many parts as you like, but this little
|
||||||
how you have set headers to one specific part when you add that to post
|
example will show how you set headers to one specific part when you add that
|
||||||
handle:
|
to the post handle:
|
||||||
|
|
||||||
struct curl_slist *headers=NULL;
|
struct curl_slist *headers=NULL;
|
||||||
headers = curl_slist_append(headers, "Content-Type: text/xml");
|
headers = curl_slist_append(headers, "Content-Type: text/xml");
|
||||||
@ -439,9 +456,22 @@ HTTP POSTing
|
|||||||
curl_formfree(post); /* free post */
|
curl_formfree(post); /* free post */
|
||||||
curl_slist_free_all(headers); /* free custom header list */
|
curl_slist_free_all(headers); /* free custom header list */
|
||||||
|
|
||||||
|
Since all options on an easyhandle are "sticky", they remain the same until
|
||||||
|
changed even if you do call curl_easy_perform(), you may need to tell curl to
|
||||||
|
go back to a plain GET request if you intend to do one as your next
|
||||||
|
request. You force an easyhandle to go back to GET by using the CURLOPT_HTTPGET
|
||||||
|
option:
|
||||||
|
|
||||||
|
curl_easy_setopt(easyhandle, CURLOPT_HTTPGET, TRUE);
|
||||||
|
|
||||||
|
Just setting CURLOPT_POSTFIELDS to "" or NULL will *not* stop libcurl from
|
||||||
|
doing a POST. It will just make it POST without any data to send!
|
||||||
|
|
||||||
|
|
||||||
Showing Progress
|
Showing Progress
|
||||||
|
|
||||||
|
[ built-in progress meter, progress callback ]
|
||||||
|
|
||||||
|
|
||||||
libcurl with C++
|
libcurl with C++
|
||||||
|
|
||||||
@ -488,16 +518,107 @@ Proxies
|
|||||||
proxy is using the HTTP protocol. For example, you can't invoke your own
|
proxy is using the HTTP protocol. For example, you can't invoke your own
|
||||||
custom FTP commands or even proper FTP directory listings.
|
custom FTP commands or even proper FTP directory listings.
|
||||||
|
|
||||||
|
Proxy Options
|
||||||
|
|
||||||
To tell libcurl to use a proxy at a given port number:
|
To tell libcurl to use a proxy at a given port number:
|
||||||
|
|
||||||
curl_easy_setopt(easyhandle, CURLOPT_PROXY, "proxy-host.com:8080");
|
curl_easy_setopt(easyhandle, CURLOPT_PROXY, "proxy-host.com:8080");
|
||||||
|
|
||||||
Some proxies require user authentication before allowing a request, and you
|
Some proxies require user authentication before allowing a request, and
|
||||||
pass that information similar to this:
|
you pass that information similar to this:
|
||||||
|
|
||||||
curl_easy_setopt(easyhandle, CURLOPT_PROXYUSERPWD, "user:password");
|
curl_easy_setopt(easyhandle, CURLOPT_PROXYUSERPWD, "user:password");
|
||||||
|
|
||||||
[ environment variables, SSL, tunneling, automatic proxy config (.pac) ]
|
If you want to, you can specify the host name only in the CURLOPT_PROXY
|
||||||
|
option, and set the port number separately with CURLOPT_PROXYPORT.
|
||||||
|
|
||||||
|
Environment Variables
|
||||||
|
|
||||||
|
libcurl automatically checks and uses a set of environment variables to know
|
||||||
|
what proxies to use for certain protocols. The names of the variables are
|
||||||
|
following an ancient de facto standard and are built up as
|
||||||
|
"[protocol]_proxy" (note the lower casing). Which makes the variable
|
||||||
|
'http_proxy' checked for a name of a proxy to use when the input URL is
|
||||||
|
HTTP. Following the same rule, the variable named 'ftp_proxy' is checked
|
||||||
|
for FTP URLs. Again, the proxies are always HTTP proxies, the different
|
||||||
|
names of the variables simply allow different HTTP proxies to be used.
|
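The naming rule above is mechanical enough to sketch in a few lines. The helper name proxy_env_name() is made up for this illustration and is not a libcurl function:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Write "<protocol>_proxy" in lower case into 'out'.
   Returns 0 on success, -1 if 'out' is too small.
   Hypothetical helper, not part of libcurl. */
static int proxy_env_name(const char *protocol, char *out, size_t outlen)
{
  static const char suffix[] = "_proxy";
  size_t plen, i;

  assert(protocol != NULL && out != NULL);
  plen = strlen(protocol);
  if(plen + sizeof(suffix) > outlen) /* sizeof counts the zero byte too */
    return -1;
  for(i = 0; i < plen; i++)
    out[i] = (char)tolower((unsigned char)protocol[i]);
  memcpy(out + plen, suffix, sizeof(suffix)); /* copies the zero byte */
  return 0;
}
```

The resulting name can be handed to getenv() to read the proxy setting for that protocol.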
||||||
|
|
||||||
|
The proxy environment variable contents should be in the format
|
||||||
|
"[protocol://]machine[:port]". Where the protocol:// part is simply
|
||||||
|
ignored if present (so http://proxy and bluerk://proxy will do the same)
|
||||||
|
and the optional port number specifies which port the proxy operates on
|
||||||
|
the host. If not specified, the internal default port number will be used
|
||||||
|
and that is most likely *not* the one you would like it to be.
|
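To make the format above concrete, here is a minimal sketch of how such a value could be split. It is not libcurl's own parser. The function name parse_proxy_value() and the choice of -1 for "no port given" are assumptions of this example, and IPv6 literals are not handled:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Split "[protocol://]machine[:port]" into host and port.
   The protocol part is skipped, as the text describes.
   *port is set to -1 when no port is given, so the caller can
   apply whatever default it prefers.
   Returns 0 on success, -1 on a malformed value or too small buffer.
   Hypothetical helper, not part of libcurl. */
static int parse_proxy_value(const char *value, char *host,
                             size_t hostlen, long *port)
{
  const char *start;
  size_t len;

  assert(value != NULL && host != NULL && port != NULL);

  start = strstr(value, "://");
  start = start ? start + 3 : value;

  len = strcspn(start, ":/"); /* host ends at a port or a path */
  if(len == 0 || len >= hostlen)
    return -1;
  memcpy(host, start, len);
  host[len] = '\0';

  *port = -1;
  if(start[len] == ':') {
    const char *digits = start + len + 1;
    char *end;
    *port = strtol(digits, &end, 10);
    if(end == digits || *port < 1 || *port > 65535)
      return -1;
  }
  return 0;
}
```

A caller that gets -1 back in *port knows the variable named no port and can pick its own default instead of relying on the internal one.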
||||||
|
|
||||||
|
There are two special environment variables. 'all_proxy' sets the
|
||||||
|
proxy for any URL in case the protocol specific variable wasn't set, and
|
||||||
|
'no_proxy' defines a list of hosts that should not use a proxy even though
|
||||||
|
a variable may say so. If 'no_proxy' is a plain asterisk ("*") it matches
|
||||||
|
all hosts.
|
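The precedence described above can be sketched as two small functions. Both names are hypothetical and this is not how libcurl implements it. The sketch assumes 'no_proxy' is a comma-separated list (the text does not say which separator is used) and only does exact, case-sensitive host matches:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if 'host' is listed in 'no_proxy', or if 'no_proxy' is "*".
   ASSUMPTION: comma-separated list, exact case-sensitive match only. */
static int no_proxy_matches(const char *no_proxy, const char *host)
{
  size_t hostlen;
  const char *p;

  assert(host != NULL);
  if(!no_proxy || !*no_proxy)
    return 0;
  if(strcmp(no_proxy, "*") == 0)
    return 1;

  hostlen = strlen(host);
  p = no_proxy;
  while(*p) {
    size_t len;
    while(*p == ',' || *p == ' ') /* skip separators */
      p++;
    len = strcspn(p, ", ");
    if(len > 0 && len == hostlen && memcmp(p, host, len) == 0)
      return 1;
    p += len;
  }
  return 0;
}

/* Pick the proxy for 'host'. NULL means "connect directly".
   The protocol specific value wins over 'all_proxy'. */
static const char *choose_proxy(const char *protocol_proxy,
                                const char *all_proxy,
                                const char *no_proxy,
                                const char *host)
{
  if(no_proxy_matches(no_proxy, host))
    return NULL;
  if(protocol_proxy && *protocol_proxy)
    return protocol_proxy;
  if(all_proxy && *all_proxy)
    return all_proxy;
  return NULL;
}
```

In a real program the three string arguments would typically come from getenv("http_proxy"), getenv("all_proxy") and getenv("no_proxy").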
||||||
|
|
||||||
|
SSL and Proxies
|
||||||
|
|
||||||
|
SSL is for secure point-to-point connections. This involves strong
|
||||||
|
encryption and similar things, which effectively makes it impossible for a
|
||||||
|
proxy to operate as a "man in between", which is the proxy's task as
|
||||||
|
previously discussed. Instead, the only way to have SSL work over a HTTP
|
||||||
|
proxy is to ask the proxy to tunnel through everything without being able
|
||||||
|
to check the traffic.
|
||||||
|
|
||||||
|
Opening an SSL connection over a HTTP proxy is therefore a matter of asking
|
||||||
|
the proxy for a straight connection to the target host on a specified
|
||||||
|
port. This is made with the HTTP request CONNECT.
|
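For illustration only, this is roughly what such a CONNECT request looks like on the wire. The helper build_connect_request() is a hypothetical name used for this sketch and is not libcurl code. Proxy authentication headers are left out:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a minimal CONNECT request for host:port into 'out'.
   Returns the number of bytes written (not counting the zero byte),
   or -1 if the buffer is too small.
   Hypothetical helper, not part of libcurl. */
static int build_connect_request(const char *host, int port,
                                 char *out, size_t outlen)
{
  int n;

  assert(host != NULL && out != NULL);
  n = snprintf(out, outlen,
               "CONNECT %s:%d HTTP/1.1\r\n"
               "Host: %s:%d\r\n"
               "\r\n",
               host, port, host, port);
  if(n < 0 || (size_t)n >= outlen)
    return -1;
  return n;
}
```

Once the proxy answers with a success status, the connection carries the raw bytes between the client and the target host, which is where the SSL handshake then takes place.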
||||||
|
|
||||||
|
Because of the nature of this operation, where the proxy has no idea what
|
||||||
|
kind of data that is passed in and out through this tunnel, this
|
||||||
|
effectively breaks some of the pros a proxy might offer, such as caching.
|
||||||
|
Many organizations prevent this kind of tunneling to other destination
|
||||||
|
port numbers than 443 (which is the default HTTPS port number).
|
||||||
|
|
||||||
|
Tunneling Through Proxy
|
||||||
|
|
||||||
|
As explained above, tunneling is required for SSL to work and often even
|
||||||
|
restricted to the operation intended for SSL; HTTPS.
|
||||||
|
|
||||||
|
This is however not the only time proxy-tunneling might offer benefits to
|
||||||
|
you or your application.
|
||||||
|
|
||||||
|
As tunneling opens a direct connection from your application to the remote
|
||||||
|
machine, it suddenly also re-introduces the ability to do non-HTTP
|
||||||
|
operations over a HTTP proxy. You can in fact use things such as FTP
|
||||||
|
upload or FTP custom commands this way.
|
||||||
|
|
||||||
|
Again, this is often prevented by the administrators of proxies and is
|
||||||
|
rarely allowed.
|
||||||
|
|
||||||
|
Tell libcurl to use proxy tunneling like this:
|
||||||
|
|
||||||
|
curl_easy_setopt(easyhandle, CURLOPT_HTTPPROXYTUNNEL, TRUE);
|
||||||
|
|
||||||
|
Proxy Auto-Config
|
||||||
|
|
||||||
|
Netscape first came up with this. It is basically a web page (usually using
|
||||||
|
a .pac extension) with a javascript that when executed by the browser with
|
||||||
|
the requested URL as input, returns information to the browser on how to
|
||||||
|
connect to the URL. The returned information might be "DIRECT" (which
|
||||||
|
means no proxy should be used), "PROXY host:port" (to tell the browser
|
||||||
|
where the proxy for this particular URL is) or "SOCKS host:port" (to
|
||||||
|
direct the browser to a SOCKS proxy).
|
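If the script is evaluated by some other means, the three kinds of answers listed above are easy to tell apart. A minimal sketch follows. The enum and function names are made up for this example, and it only looks at a single answer, not at a semicolon-separated list of alternatives:

```c
#include <assert.h>
#include <string.h>

enum pac_kind { PAC_DIRECT, PAC_PROXY, PAC_SOCKS, PAC_UNKNOWN };

/* Classify one PAC answer. For PROXY and SOCKS, *hostport is set to
   point at the "host:port" part inside 'result'; otherwise it is NULL.
   Hypothetical helper, not part of libcurl. */
static enum pac_kind parse_pac_result(const char *result,
                                      const char **hostport)
{
  assert(result != NULL && hostport != NULL);
  *hostport = NULL;

  if(strcmp(result, "DIRECT") == 0)
    return PAC_DIRECT;
  if(strncmp(result, "PROXY ", 6) == 0 && result[6] != '\0') {
    *hostport = result + 6;
    return PAC_PROXY;
  }
  if(strncmp(result, "SOCKS ", 6) == 0 && result[6] != '\0') {
    *hostport = result + 6;
    return PAC_SOCKS;
  }
  return PAC_UNKNOWN;
}
```

For a PAC_PROXY answer, the host:port string is in the same form that CURLOPT_PROXY is given earlier in this chapter.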
||||||
|
|
||||||
|
libcurl has no means to interpret or evaluate javascript and thus it
|
||||||
|
doesn't support this. If you get yourself in a position where you face
|
||||||
|
this nasty invention, the following advice has been mentioned and used in
|
||||||
|
the past:
|
||||||
|
|
||||||
|
- Depending on the javascript complexity, write up a script that
|
||||||
|
translates it to another language and execute that.
|
||||||
|
|
||||||
|
- Read the javascript code and rewrite the same logic in another language.
|
||||||
|
|
||||||
|
- Implement a javascript interpreter; people have successfully used the
|
||||||
|
Mozilla javascript engine in the past.
|
||||||
|
|
||||||
|
- Ask your admins to stop this, for a static proxy setup or similar.
|
||||||
|
|
||||||
|
|
||||||
Security Considerations
|
Security Considerations
|
||||||
@ -505,7 +626,7 @@ Security Considerations
|
|||||||
[ ps output, netrc plain text, plain text protocols / base64 ]
|
[ ps output, netrc plain text, plain text protocols / base64 ]
|
||||||
|
|
||||||
|
|
||||||
Certificates and Other SSL Tricks
|
SSL, Certificates and Other Tricks
|
||||||
|
|
||||||
|
|
||||||
Future
|
Future
|
||||||
|