amount_read. This is not entirely logical, but that's what the
callers expect, and it's not easy to change.
* ftp.c (ftp_loop_internal): Ditto.
* http.c (http_loop): Be smarter about assigning restval; if we're
in the nth pass of a download, simply use the information we have
about how much data has been retrieved as restval.
* ftp.c (getftp): Ditto for FTP "REST" command.
* http.c (gethttp): When the server doesn't respect range, skip
the first RESTVAL bytes of the read body. Never truncate the
output file.
* retr.c (fd_read_body): Support skipping initial STARTPOS octets.
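
The skipping behaviour described for gethttp and fd_read_body can be
illustrated with a minimal sketch. This is not Wget's actual code:
copy_skipping is a hypothetical helper, and it uses stdio streams
rather than Wget's file descriptors. It discards the first STARTPOS
bytes of the input and copies the rest, so an output file that already
holds those bytes is only ever appended to, never truncated.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not Wget's fd_read_body: copy IN to OUT,
   discarding the first STARTPOS bytes.  Returns the number of bytes
   written, or -1 on a read or write error.  */
static long
copy_skipping (FILE *in, FILE *out, long startpos)
{
  char buf[4096];
  long skipped = 0, written = 0;
  size_t n;

  while ((n = fread (buf, 1, sizeof buf, in)) > 0)
    {
      char *p = buf;

      if (skipped < startpos)
        {
          /* Part or all of this chunk lies in the region to skip.  */
          long left = startpos - skipped;
          size_t drop = (long) n < left ? n : (size_t) left;

          skipped += (long) drop;
          p += drop;
          n -= drop;
        }

      if (n > 0)
        {
          if (fwrite (p, 1, n, out) != n)
            return -1;
          written += (long) n;
        }
    }

  return ferror (in) ? -1 : written;
}
```

A caller that already has RESTVAL bytes on disk would open the output
in append mode and pass RESTVAL as STARTPOS when the server ignores
the range request and sends the whole body.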
- use mmap() to map whole files into core instead of allocating memory
and read()ing into it.
- use a new, more general, HTML parser (html-parse.c) and interface to
it from Wget (html-url.c).
- respect <meta name=robots content=nofollow> (easy with the new HTML
parser).
- use hash tables instead of linked lists in places where the lists
were used to facilitate mappings.
- rewrite the code in host.c to be more readable and faster (hash
tables instead of home-grown lists).
- make convert_links properly convert partial URLs to complete ones
for those URLs that have *not* been downloaded.
- use HTTP persistent connections where available. Very
simple-minded: only the last connection to the server is cached.
Published in <sxshf533d5r.fsf@florida.arsdigita.de>.
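
As an illustration of the mmap() item above, here is a minimal sketch
of mapping a whole file read-only on a POSIX system. It is not Wget's
implementation: map_whole_file is a hypothetical name, and a real
caller would presumably fall back to malloc() plus read() when mmap()
is unavailable or fails.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map the file at PATH read-only and store its
   size in *LEN.  Returns NULL if the file cannot be opened or mapped,
   or if it is empty (zero-length mappings are not allowed).  Release
   the result with munmap ((void *) ptr, *len).  */
static const char *
map_whole_file (const char *path, size_t *len)
{
  struct stat st;
  void *p;
  int fd = open (path, O_RDONLY);

  if (fd < 0)
    return NULL;
  if (fstat (fd, &st) < 0 || st.st_size == 0)
    {
      close (fd);
      return NULL;
    }

  p = mmap (NULL, (size_t) st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
  close (fd);                   /* The mapping stays valid after close.  */
  if (p == MAP_FAILED)
    return NULL;

  *len = (size_t) st.st_size;
  return p;
}
```

The caller then treats the returned pointer as an ordinary read-only
buffer of *LEN bytes, with no separate allocation or copy.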