mirror of https://github.com/moparisthebest/wget synced 2024-07-03 16:38:41 -04:00

[svn] TODO: -p should probably go "_two_ more hops" on <FRAMESET> pages.

wget.texi (Recursive Retrieval Options): Explained that you need
to use -r -l1 -p to get the two levels of requisites for a
<FRAMESET> page.  Also made a few other wording improvements.
dan 2001-03-26 19:22:17 -08:00
parent a6a681d846
commit c33a1f97fe
4 changed files with 26 additions and 3 deletions

@@ -1,3 +1,7 @@
+2001-03-26 Dan Harkless <wget@harkless.org>
+* TODO: -p should probably go "_two_ more hops" on <FRAMESET> pages.
2001-03-22 Dan Harkless <wget@harkless.org>
* MACHINES: Added rs6000-ibm-aix4.3.3.0.

TODO
@@ -7,6 +7,8 @@ items are not listed in any particular order (except that recently-added items
may tend towards the top). Not all of these represent user-visible
changes.
+* -p should probably go "_two_ more hops" on <FRAMESET> pages.
* Only normal link-following recursion should respect -np. Page-requisite
recursion should not. When -np -p is specified, Wget should still retrieve
requisite images and such on the server, even if they aren't in that directory

@@ -1,3 +1,9 @@
+2001-03-26 Dan Harkless <wget@harkless.org>
+* wget.texi (Recursive Retrieval Options): Explained that you need
+to use -r -l1 -p to get the two levels of requisites for a
+<FRAMESET> page. Also made a few other wording improvements.
2001-03-17 Dan Harkless <wget@harkless.org>
* Makefile.in: Using '^' in the sed call caused a weird failure on

@@ -1065,7 +1065,7 @@ requisites.
For instance, say document @file{1.html} contains an @code{<IMG>} tag
referencing @file{1.gif} and an @code{<A>} tag pointing to external
-document @file{2.html}. Say that @file{2.html} is the same but that its
+document @file{2.html}. Say that @file{2.html} is similar but that its
image is @file{2.gif} and it links to @file{3.html}. Say this
continues up to some arbitrarily high number.
@@ -1103,8 +1103,8 @@ would download just @file{1.html} and @file{1.gif}, but unfortunately
this is not the case, because @samp{-l 0} is equivalent to
@samp{-l inf}---that is, infinite recursion. To download a single HTML
page (or a handful of them, all specified on the commandline or in a
-@samp{-i} @sc{url} input file) and its requisites, simply leave off
-@samp{-p} and @samp{-l}:
+@samp{-i} @sc{url} input file) and its (or their) requisites, simply leave off
+@samp{-r} and @samp{-l}:
@example
wget -p http://@var{site}/1.html
@@ -1121,6 +1121,17 @@ likes to use a few options in addition to @samp{-p}:
wget -E -H -k -K -nh -p http://@var{site}/@var{document}
@end example
+In one case you'll need to add a couple more options. If @var{document}
+is a @code{<FRAMESET>} page, the "one more hop" that @samp{-p} gives you
+won't be enough---you'll get the @code{<FRAME>} pages that are
+referenced, but you won't get @emph{their} requisites. Therefore, in
+this case you'll need to add @samp{-r -l1} to the commandline. The
+@samp{-r -l1} will recurse from the @code{<FRAMESET>} page to the
+@code{<FRAME>} pages, and the @samp{-p} will get their requisites. If
+you're already using a recursion level of 1 or more, you'll need to up
+it by one. In the future, @samp{-p} may be made smarter so that it'll
+do "two more hops" in the case of a @code{<FRAMESET>} page.
To finish off this topic, it's worth knowing that Wget's idea of an
external document link is any URL specified in an @code{<A>} tag, an
@code{<AREA>} tag, or a @code{<LINK>} tag other than @code{<LINK