This commit makes lots of whitespace only changes. It has been ensured that this
commit does not make any changes to the functioning of the program. The only
changes that have been made are:
* Remove trailing whitespaces
* Convert tabs to spaces
* Fix indentation issues in the code
* Other aesthetic changes to the formatting of comments
(uri_merge): Get rid of uri_merge_1.
(uri_merge): Merge "foo//", "bar" as "foo//bar", not "foo///bar",
i.e. don't add an extra slash merely because BASE ends with two
slashes.
(parse_credentials): Renamed from parse_uname. Rewrittern in
standard [beg, end) calling style.
(url_skip_credentials): Renamed from url_skip_uname. Made static.
(url_skip_credentials): Include # and ; as terminators. Old code
would mistakenly consider "http://foo.com#hniksic@iskon.hr" to
contain a username.
(url_skip_scheme): Removed because it was unused.
(url_has_scheme): Require "scheme" to be at least one char long.
similarity (SCHEME_HTTP and SCHEME_HTTPS are similar). Use it in recur.c
(download_child_p). Fixes a bug that caused -H option to be ignored when
child scheme different to parent scheme.
Published in <agn4eu8apduek7magfu9bfe63gto8i7cdh@farscape.privy.mev.co.uk>.
Published in <sxsvgo8nmub.fsf@florida.arsdigita.de>.
* retr.c (retrieve_url): Call uri_merge, not url_concat.
* html-url.c (collect_tags_mapper): Call uri_merge, not
url_concat.
* url.c (mkstruct): Use encode_string instead of xstrdup followed
by URL_CLEANSE.
(path_simplify_with_kludge): Deleted.
(contains_unsafe): Deleted.
(construct): Renamed to uri_merge_1.
(url_concat): Renamed to uri_merge.
* url.c (str_url): Use encode_string instead of the unnecessary
CLEANDUP.
(encode_string_maybe): New function, returns input string if no
encoding is needed.
(encode_string): Call encode_string_maybe to do the dirty work,
xstrdup if no work needed.
* wget.h (XDIGIT_TO_xchar): Define here.
* url.c (decode_string): Use new name.
(encode_string): Ditto.
* http.c (XDIGIT_TO_xchar): Rename HEXD2asc to XDIGIT_TO_xchar.
(dump_hash): Use new name.
* wget.h: Rename ASC2HEXD and HEXD2ASC to XCHAR_TO_XDIGIT and
XDIGIT_TO_XCHAR respectively.
- use mmap() to read whole files in core instead of allocating memory
and read'ing it.
- use a new, more general, HTML parser (html-parse.c) and interface to
it from Wget (html-url.c).
- respect <meta name=robots content=nofollow> (easy with the new HTML
parser).
- use hash tables instead of linked lists in places where the lists
were used to facilitate mappings.
- rewrite the code in host.c to be more readable and faster (hash
tables instead of home-grown lists.)
- make convert_links properly convert partial URLs to complete ones
for those URLs that have *not* been downloaded.
- use HTTP persistent connections where available. very
simple-minded, caches the last connection to the server.
Published in <sxshf533d5r.fsf@florida.arsdigita.de>.