amount_read. This is not entirely logical, but that's what the
callers expect, and it's not easy to change.
* ftp.c (ftp_loop_internal): Ditto.
* http.c (http_loop): Be smarter about assigning restval; if we're
in the nth pass of a download, simply use the information we have
about how much data has been retrieved as restval.
* ftp.c (getftp): Ditto for FTP "REST" command.
* http.c (gethttp): When the server doesn't respect range, skip
the first RESTVAL bytes of the read body. Never truncate the
output file.
* retr.c (fd_read_body): Support skipping initial STARTPOS octets.
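
A minimal sketch of the skipping behaviour described above, for the case
where the server ignores the range request and resends the whole body.
This is not Wget's fd_read_body: the name copy_body_skipping and the use
of stdio streams instead of file descriptors are assumptions made only
for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy IN to OUT, discarding the first STARTPOS
   octets.  Returns the number of octets written, or -1 on a write
   error.  */
long
copy_body_skipping (FILE *in, FILE *out, long startpos)
{
  char buf[4096];
  long written = 0;
  size_t n;

  assert (in != NULL && out != NULL && startpos >= 0);

  while ((n = fread (buf, 1, sizeof buf, in)) > 0)
    {
      size_t skip = 0;
      if (startpos > 0)
        {
          /* Still inside the part the caller already has on disk.  */
          skip = (size_t) startpos < n ? (size_t) startpos : n;
          startpos -= (long) skip;
        }
      if (n > skip)
        {
          if (fwrite (buf + skip, 1, n - skip, out) != n - skip)
            return -1;
          written += (long) (n - skip);
        }
    }
  return written;
}
```

Because the already-retrieved prefix is dropped on the read side, the
output file only ever needs to be appended to, never truncated.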
- allow checking of server cert
- allow defining client cert type
- allow limiting the SSL protocol
- check more return values
- added debug message on break
Published by Thomas Lussnig in <3CC09969.5000607@bewegungsmelder.de>.
* main.c (print_help): Document `--no-http-keep-alive'.
* utils.c (numdigit): Handle negative numbers *correctly*.
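
A sketch of what handling negative numbers could look like, assuming
that "correctly" means reserving room for the minus sign.  The names
count_print_width and number_to_string are hypothetical; the real
numdigit in utils.c may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for numdigit: number of characters needed to
   print NUMBER in decimal, counting a leading '-'.  */
int
count_print_width (long number)
{
  int cnt = 1;
  if (number < 0)
    ++cnt;                      /* room for the minus sign */
  /* C99 division truncates toward zero, so the loop also terminates
     for negative values.  */
  while ((number /= 10) != 0)
    ++cnt;
  return cnt;
}

/* Hypothetical caller: allocate exactly enough room to print NUMBER.
   Returns NULL if allocation fails; the caller frees the result.  */
char *
number_to_string (long number)
{
  int width = count_print_width (number);
  char *buf = malloc ((size_t) width + 1);
  if (buf != NULL)
    {
      int written = snprintf (buf, (size_t) width + 1, "%ld", number);
      assert (written == width);  /* count must match what printf emits */
    }
  return buf;
}
```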
* hash.c (make_nocase_string_hash_table): Use term "nocase" rather
than the confusing "unsigned".
* utils.c (string_set_contains): Renamed from string_set_exists.
* hash.c (hash_table_contains): Renamed from hash_table_exists.
* cookies.c: Move case-insensitive hash tables to hash.c.
Published in <sxsheyq9vvl.fsf@florida.arsdigita.de>.
Published in <sxsy9slhu7g.fsf@florida.arsdigita.de>.
* http.c (gethttp): Return RETRUNNEEDED when the retrieval is
unneeded because the file is already there and fully downloaded,
and -c is specified.
(http_loop): Handle RETRUNNEEDED.
* wget.h (uerr_t): New value RETRUNNEEDED.
* http.c (http_loop): Set no_truncate for files that both exist
and are non-empty.
(gethttp): Consider the download finished when restval >= contlen,
not only when restval==contlen.
(gethttp): Handle redirection before giving up due to -c.
(gethttp): Clarify error message which explains that -c will not
truncate the file.
(gethttp): When returning CONTNOTSUPPORTED, don't forget to free
the stuff that needs freeing and release the socket.
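
The restval >= contlen test above can be sketched as follows.  The enum,
the function name decide_continue, and the convention that -1 means
"length unknown" are assumptions for this example, not Wget's code.

```c
#include <assert.h>

/* Hypothetical outcome of the -c check.  */
enum continue_action { CONT_FETCH_REST, CONT_UNNEEDED };

/* Decide whether a continued download still has anything to fetch.
   RESTVAL is the size of the local file; CONTLEN is the length the
   server reported, or -1 if it reported none (assumed convention).  */
enum continue_action
decide_continue (long restval, long contlen)
{
  assert (restval >= 0);
  /* An unknown length tells us nothing, so keep fetching.  */
  if (contlen >= 0 && restval >= contlen)
    return CONT_UNNEEDED;
  return CONT_FETCH_REST;
}
```

Using >= rather than == means a local file that is larger than the
remote one is also treated as finished instead of being requested again.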
* main.c (print_help): Wget booleans accept "off", not "no".
* wget.texi: Moved -nr from "Recursive Retrieval Options" to "FTP Options" and
gave it a @cindex entry. Alphabetized FTP options by long option name.
* main.c (print_help): -nr belongs in "FTP options" section of --help output,
not "Recursive retrieval" section. Alphabetized FTP options by long option
name.
- use mmap() to read whole files in core instead of allocating memory
and read'ing it.
- use a new, more general, HTML parser (html-parse.c) and interface to
it from Wget (html-url.c).
- respect <meta name=robots content=nofollow> (easy with the new HTML
parser).
- use hash tables instead of linked lists in places where the lists
were used to facilitate mappings.
- rewrite the code in host.c to be more readable and faster (hash
tables instead of home-grown lists).
- make convert_links properly convert partial URLs to complete ones
for those URLs that have *not* been downloaded.
- use HTTP persistent connections where available. The implementation
is very simple-minded: it caches the last connection to the server.
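
A one-slot connection cache of the kind described in the last item might
look like the sketch below.  All names here (struct cached_conn,
remember_connection, reuse_connection) are hypothetical, and the sketch
ignores details such as detecting a connection the server has closed.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical single-entry cache holding the last connection.  */
struct cached_conn
{
  char host[256];
  int port;
  int fd;
  int valid;
};

static struct cached_conn last_conn;

/* Remember FD as the connection to HOST:PORT, replacing any previous
   entry.  */
void
remember_connection (const char *host, int port, int fd)
{
  assert (host != NULL && strlen (host) < sizeof last_conn.host);
  strcpy (last_conn.host, host);
  last_conn.port = port;
  last_conn.fd = fd;
  last_conn.valid = 1;
}

/* Return the cached descriptor if it was opened to HOST:PORT, else -1.  */
int
reuse_connection (const char *host, int port)
{
  assert (host != NULL);
  if (last_conn.valid
      && last_conn.port == port
      && strcmp (last_conn.host, host) == 0)
    return last_conn.fd;
  return -1;
}
```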
Published in <sxshf533d5r.fsf@florida.arsdigita.de>.
files specified on the commandline. Made --convert-links be ignored when
--delete-after is specified. Added note about this fact to --delete-after docs
and made general improvements to them, including the clarification that
--delete-after only deletes local files.
* ftp.c (ftp_retrieve_list): Use new INFINITE_RECURSION #define.
* html.c: htmlfindurl() now takes final `dash_p_leaf_HTML' parameter.
Wrapped some > 80-column lines. When -p is specified and we're at a
leaf node, do not traverse <A>, <AREA>, or <LINK> tags other than
<LINK REL="stylesheet">.
* html.h (htmlfindurl): Now takes final `dash_p_leaf_HTML' parameter.
* init.c: Added new -p / --page-requisites / page_requisites option.
* main.c (print_help): Clarified that -l inf and -l 0 both allow
infinite recursion. Changed the unhelpful --mirror description
to simply give the options it's equivalent to. Added new -p option.
(main): Added some comments; handle new -p / --page-requisites.
* options.h (struct options): Added new page_requisites field.
* recur.c: Changed "URL-s" to "URLs" and "HTML-s" to "HTMLs".
Calculate and pass down new `dash_p_leaf_HTML' parameter to
get_urls_html(). Use new INFINITE_RECURSION #define.
* retr.c: Changed "URL-s" to "URLs". get_urls_html() now takes
final `dash_p_leaf_HTML' parameter.
* url.c: get_urls_html() and htmlfindurl() now take final
`dash_p_leaf_HTML' parameter.
* url.h (get_urls_html): Now takes final `dash_p_leaf_HTML' parameter.
* wget.h: Added some comments and new INFINITE_RECURSION #define.
* wget.texi (Recursive Retrieval Options): Documented new -p option.
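
The leaf-node rule from the html.c item above can be sketched as a small
predicate.  This is an illustration under assumptions: should_follow_tag
and equal_nocase are hypothetical names, and the real htmlfindurl is
structured differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Nonzero if A and B are equal ignoring case.  */
static int
equal_nocase (const char *a, const char *b)
{
  while (*a != '\0'
         && tolower ((unsigned char) *a) == tolower ((unsigned char) *b))
    {
      a++;
      b++;
    }
  return tolower ((unsigned char) *a) == tolower ((unsigned char) *b);
}

/* Hypothetical predicate: should links in TAG be traversed?  REL is the
   value of the REL attribute, or NULL if absent.  DASH_P_LEAF_HTML is
   nonzero when -p is in effect and the page is a leaf node.  */
int
should_follow_tag (const char *tag, const char *rel, int dash_p_leaf_html)
{
  assert (tag != NULL);
  if (!dash_p_leaf_html)
    return 1;
  if (equal_nocase (tag, "a") || equal_nocase (tag, "area"))
    return 0;
  if (equal_nocase (tag, "link"))
    return rel != NULL && equal_nocase (rel, "stylesheet");
  return 1;                     /* e.g. <IMG>: still a page requisite */
}
```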
said that 0 seconds are waited after the first retry, which I believe is
incorrect and does not match what's written elsewhere (e.g. wget.texi). Changed
to 1.
download a single HTML document and all its constituents.
* po/*.{gmo,po,pot}: Regenerated after adding new options.
* po/hr.po: Hrvoje forgot '\n's on his translations of my altered messages,
causing msgfmt to balk and `make install' to fail.
* wget.texi (Recursive Retrieval Options): In -K description, added a link to
the discussion of interaction with -N.
(Recursive Accept/Reject Options): Did some alphabetizing and added descriptions
of new --follow-tags and -G / --ignore-tags options.
(Following Links): Changed "the loads of" to "loads of".
(Wgetrc Commands): Added descriptions of new follow_tags and ignore_tags
commands.
* html.c (idmatch): Implemented checking of my new --follow-tags and
--ignore-tags options.
* init.c (commands): Added comment reminding people adding new entries doing
allocation to add corresponding freeing in cleanup().
(commands): Added new followtags and ignoretags commands.
(cleanup): Free storage for new followtags and ignoretags.
* main.c: Use of "comma-separated list" was random -- normalized it. Did some
alphabetization. Added comments pointing out "Options without arguments" and
"Options accepting an argument" sections of long_options[]. Added new options
--follow-tags and -G / --ignore-tags. Added comment that Damir's --referer is
currently undocumented. Added comment that Heiko's --waitretry is partially
undocumented (mentioned in --help but not in wget.texi). Moved improperly
sorted 24, 129, and 'G' cases.
* options.h (struct options): Added new fields follow_tags and ignore_tags.
* wget.h: Added "#define EQ 0" so we can say "strcmp(a, b) == EQ".
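
A short sketch of the idiom the EQ define enables.  The function
is_stylesheet_rel is a hypothetical example, not something taken from
the Wget sources.

```c
#include <assert.h>
#include <string.h>

#define EQ 0

/* Hypothetical example: nonzero if REL is exactly "stylesheet".  */
int
is_stylesheet_rel (const char *rel)
{
  assert (rel != NULL);
  /* Reads as "rel equals stylesheet" rather than a bare "== 0".  */
  return strcmp (rel, "stylesheet") == EQ;
}
```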
Got rid of newly-introduced nested-if warnings in ftp.c and http.c. Fixed
apparently completely untested code in main.c that was trying to provide --wait
/ --waitretry backwards compatibility, but had multiple fundamental bugs.