amount_read. This is not entirely logical, but that's what the
callers expect, and it's not easy to change.
* ftp.c (ftp_loop_internal): Ditto.
* http.c (http_loop): Be smarter about assigning restval; if we're
in the nth pass of a download, simply use the information we have
about how much data has been retrieved as restval.
* ftp.c (getftp): Ditto for FTP "REST" command.
* http.c (gethttp): When the server doesn't respect range, skip
the first RESTVAL bytes of the read body. Never truncate the
output file.
* retr.c (fd_read_body): Support skipping initial STARTPOS octets.
- allow checking of server cert
- allow defining client cert type
- allow limit of ssl protocol
- check more return values
- added debug message on break
Published by Thomas Lussnig in <3CC09969.5000607@bewegungsmelder.de>.
* main.c (print_help): Document `--no-http-keep-alive'.
* utils.c (numdigit): Handle negative numbers *correctly*.
* hash.c (make_nocase_string_hash_table): Use term "nocase" rather
than the confusing "unsigned".
* utils.c (string_set_contains): Renamed from string_set_exists.
* hash.c (hash_table_contains): Renamed from hash_table_exists.
* cookies.c: Move case-insensitive hash tables to hash.c.
Published in <sxsheyq9vvl.fsf@florida.arsdigita.de>.
Published in <sxsy9slhu7g.fsf@florida.arsdigita.de>.
* http.c (gethttp): Return RETRUNNEEDED when the retrieval is
unneeded because the file is already there and fully downloaded,
and -c is specified.
(http_loop): Handle RETRUNNEEDED.
* wget.h (uerr_t): New value RETRUNNEEDED.
* http.c (http_loop): Set no_truncate for files that both exist
and are non-empty.
(gethttp): Consider the download finished when restval >= contlen,
not only when restval==contlen.
(gethttp): Handle redirection before giving up due to -c.
(gethttp): Clarify error message which explains that -c will not
truncate the file.
(gethttp): When returning CONTNOTSUPPORTED, don't forget to free
the stuff that needs freeing and release the socket.
* main.c (print_help): Wget booleans accept "off", not "no".
* wget.texi: Moved -nr from "Recursive Retrieval Options" to "FTP Options" and
gave it a @cindex entry. Alphabetized FTP options by long option name.
* main.c (print_help): -nr belongs in "FTP options" section of --help output,
not "Recursive retrieval" section. Alphabetized FTP options by long option
name.
- use mmap() to read whole files in core instead of allocating memory
and read'ing it.
- use a new, more general, HTML parser (html-parse.c) and interface to
it from Wget (html-url.c).
- respect <meta name=robots content=nofollow> (easy with the new HTML
parser).
- use hash tables instead of linked lists in places where the lists
were used to facilitate mappings.
- rewrite the code in host.c to be more readable and faster (hash
tables instead of home-grown lists.)
- make convert_links properly convert partial URLs to complete ones
for those URLs that have *not* been downloaded.
- use HTTP persistent connections where available. very
simple-minded, caches the last connection to the server.
Published in <sxshf533d5r.fsf@florida.arsdigita.de>.
files specified on the commandline. Made --convert-links be ignored when
--delete-after is specified. Added note about this fact to --delete-after docs
and made general improvements to them, including the clarification that
--delete-after only deletes local files.
* ftp.c (ftp_retrieve_list): Use new INFINITE_RECURSION #define.
* html.c: htmlfindurl() now takes final `dash_p_leaf_HTML' parameter.
Wrapped some > 80-column lines. When -p is specified and we're at a
leaf node, do not traverse <A>, <AREA>, or <LINK> tags other than
<LINK REL="stylesheet">.
* html.h (htmlfindurl): Now takes final `dash_p_leaf_HTML' parameter.
* init.c: Added new -p / --page-requisites / page_requisites option.
* main.c (print_help): Clarified that -l inf and -l 0 both allow
infinite recursion. Changed the unhelpful --mirrior description
to simply give the options it's equivalent to. Added new -p option.
(main): Added some comments; handle new -p / --page-requisites.
* options.h (struct options): Added new page_requisites field.
* recur.c: Changed "URL-s" to "URLs" and "HTML-s" to "HTMLs".
Calculate and pass down new `dash_p_leaf_HTML' parameter to
get_urls_html(). Use new INFINITE_RECURSION #define.
* retr.c: Changed "URL-s" to "URLs". get_urls_html() now takes
final `dash_p_leaf_HTML' parameter.
* url.c: get_urls_html() and htmlfindurl() now take final
`dash_p_leaf_HTML' parameter.
* url.h (get_urls_html): Now takes final `dash_p_leaf_HTML' parameter.
* wget.h: Added some comments and new INFINITE_RECURSION #define.
* wget.texi (Recursive Retrieval Options): Documented new -p option.
said that 0 seconds are waited after the first retry, which I believe is
incorrect and does not match what's written elsewhere (e.g. wget.texi). Changed
to 1.