wget.texi (Recursive Retrieval Options): Explained that you need
to use -r -l1 -p to get the two levels of requisites for a
<FRAMESET> page. Also made a few other wording improvements.
* wget.texi: Moved -nr from "Recursive Retrieval Options" to "FTP Options" and
gave it a @cindex entry. Alphabetized FTP options by long option name.
* main.c (print_help): -nr belongs in "FTP options" section of --help output,
not "Recursive retrieval" section. Alphabetized FTP options by long option
name.
included in the distribution or it'll get regenerated due to the wget.info
dependency, and then that file will get regenerated, forcing people to have
makeinfo installed unnecessarily. We could use a kludge of a 0-length file in
the distro, but the file isn't that big and should compress very well.
wget.texi: Changed "VERSION 1.5.3+dev" to "VERSION 1.7-dev" and "UPDATED Feb
2000" to "UPDATED Dec 2000". Like the comment in the file says, it'd be nice if
these were handled automatically...
- use mmap() to map whole files into memory instead of allocating a
buffer and read()ing into it.
- use a new, more general, HTML parser (html-parse.c) and interface to
it from Wget (html-url.c).
- respect <meta name=robots content=nofollow> (easy with the new HTML
parser).
- use hash tables instead of linked lists in places where the lists
were used to facilitate mappings.
- rewrite the code in host.c to be more readable and faster (hash
tables instead of home-grown lists).
- make convert_links properly convert partial URLs to complete ones
for those URLs that have *not* been downloaded.
- use HTTP persistent connections where available. The approach is
very simple-minded: it caches the last connection to the server.
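
For readers unfamiliar with the mmap() item above, here is a minimal sketch of
mapping a whole file read-only on a POSIX system. It is not Wget's actual code:
`struct mapped_file', map_whole_file() and unmap_whole_file() are hypothetical
names invented for this illustration.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical holder for a mapped file; not a Wget type.  */
struct mapped_file
{
  const char *data;   /* start of the contents, NULL for an empty file */
  size_t length;      /* number of bytes mapped */
};

/* Map PATH read-only into memory.  Returns 0 on success, -1 on failure.  */
static int
map_whole_file (const char *path, struct mapped_file *out)
{
  struct stat st;
  void *p;
  int fd = open (path, O_RDONLY);

  if (fd < 0)
    return -1;
  if (fstat (fd, &st) < 0)
    {
      close (fd);
      return -1;
    }

  out->data = NULL;
  out->length = (size_t) st.st_size;
  if (out->length == 0)
    {
      /* mmap() rejects zero-length mappings; report an empty file.  */
      close (fd);
      return 0;
    }

  p = mmap (NULL, out->length, PROT_READ, MAP_PRIVATE, fd, 0);
  close (fd);   /* the mapping remains valid after the descriptor is closed */
  if (p == MAP_FAILED)
    {
      out->length = 0;
      return -1;
    }

  out->data = p;
  return 0;
}

/* Release a mapping obtained from map_whole_file().  */
static void
unmap_whole_file (struct mapped_file *mf)
{
  if (mf->data != NULL)
    munmap ((void *) mf->data, mf->length);
  mf->data = NULL;
  mf->length = 0;
}
```

A caller would parse mf.data[0 .. mf.length-1] in place and then call
unmap_whole_file(); no separate buffer is allocated or copied into.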
Published in <sxshf533d5r.fsf@florida.arsdigita.de>.
files specified on the commandline. Made --convert-links be ignored when
--delete-after is specified. Added note about this fact to --delete-after docs
and made general improvements to them, including the clarification that
--delete-after only deletes local files.
I renamed to "lockable_boolean") in the .wgetrc (currently just passive_ftp).
Wrote documentation for his changes and added the missing "referer" to the
.wgetrc section (making mention of the issue of "referrer" being the correct
spelling).
* ftp.c (ftp_retrieve_list): Use new INFINITE_RECURSION #define.
* html.c: htmlfindurl() now takes final `dash_p_leaf_HTML' parameter.
Wrapped some > 80-column lines. When -p is specified and we're at a
leaf node, do not traverse <A>, <AREA>, or <LINK> tags other than
<LINK REL="stylesheet">.
* html.h (htmlfindurl): Now takes final `dash_p_leaf_HTML' parameter.
* init.c: Added new -p / --page-requisites / page_requisites option.
* main.c (print_help): Clarified that -l inf and -l 0 both allow
infinite recursion. Changed the unhelpful --mirror description
to simply give the options it's equivalent to. Added new -p option.
(main): Added some comments; handle new -p / --page-requisites.
* options.h (struct options): Added new page_requisites field.
* recur.c: Changed "URL-s" to "URLs" and "HTML-s" to "HTMLs".
Calculate and pass down new `dash_p_leaf_HTML' parameter to
get_urls_html(). Use new INFINITE_RECURSION #define.
* retr.c: Changed "URL-s" to "URLs". get_urls_html() now takes
final `dash_p_leaf_HTML' parameter.
* url.c: get_urls_html() and htmlfindurl() now take final
`dash_p_leaf_HTML' parameter.
* url.h (get_urls_html): Now takes final `dash_p_leaf_HTML' parameter.
* wget.h: Added some comments and new INFINITE_RECURSION #define.
* wget.texi (Recursive Retrieval Options): Documented new -p option.
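
The leaf-node rule described in the html.c item above can be sketched roughly
as follows. This is an illustration only, not the code in html.c:
should_follow_tag() and its `tag' and `rel' parameters are hypothetical, and
only the `dash_p_leaf_HTML' name comes from this entry.

```c
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/* Decide whether a URL found in TAG should be followed.  REL is the
   value of the tag's REL attribute, or NULL if it has none.
   Hypothetical helper: with -p at a leaf node, <A> and <AREA> are
   skipped, and <LINK> is followed only for REL="stylesheet".  */
static bool
should_follow_tag (const char *tag, const char *rel, bool dash_p_leaf_HTML)
{
  if (!dash_p_leaf_HTML)
    return true;                /* not a -p leaf: no extra filtering */

  if (strcasecmp (tag, "a") == 0 || strcasecmp (tag, "area") == 0)
    return false;               /* these would lead to further pages */

  if (strcasecmp (tag, "link") == 0)
    return rel != NULL && strcasecmp (rel, "stylesheet") == 0;

  return true;                  /* e.g. <IMG>: a page requisite */
}
```

The point of the rule is that a leaf page still gets its inline images and
stylesheets, but its hyperlinks are not traversed past the requested depth.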
a separate item, and the .wgetrc version was misleading.
* wget.texi (Wgetrc Commands): Changed all instances of ", the same as" to the
more grammatical " -- the same as".
severely lacking -- ameliorated the situation. Some of the
previously-undocumented stuff (like the multiple-file-version numeric-suffixing)
that's now mentioned for the first (and only) time in the -nc documentation
should probably be mentioned elsewhere, but due to the way that wget.texi's
hierarchy is laid out, I had a hard time finding anywhere else appropriate.
dependencies, and distclean cleanup of this new file.
* sample.wgetrc: Uncommented waitretry and set it to 10, clarified some wording,
and re-wrapped some text to 71 columns due to @sample indentation in
wget.texi.
* wget.texi: Herold further expounded on the behavior of waitretry -- reworded
docs again. Changed note saying _all_ lines in sample.wgetrc are commented
out. Don't have an entire hand-cut-and-pasted copy of sample.wgetrc in this
file -- use @include.