[svn] * wget.texi (Download Options): --no-clobber's documentation was

severely lacking -- ameliorated the situation.  Some of the
previously-undocumented stuff (like the multiple-file-version numeric-suffixing)
that's now mentioned for the first (and only) time in the -nc documentation
should probably be mentioned elsewhere, but due to the way that wget.texi's
hierarchy is laid out, I had a hard time finding anywhere else appropriate.
dan 2000-08-22 20:04:20 -07:00
parent 51642074f4
commit 28668d2875
6 changed files with 157 additions and 93 deletions

@@ -1,3 +1,13 @@
2000-08-22 Dan Harkless <dan-wget@dilvish.speed.net>
* wget.texi (Download Options): --no-clobber's documentation was
severely lacking -- ameliorated the situation. Some of the
previously-undocumented stuff (like the multiple-file-version
numeric-suffixing) that's now mentioned for the first (and only)
time in the -nc documentation should probably be mentioned
elsewhere, but due to the way that wget.texi's hierarchy is laid
out, I had a hard time finding anywhere else appropriate.
2000-07-17 Dan Harkless <dan-wget@dilvish.speed.net>
* wget.texi (HTTP Options): Minor clarification in "download a

@@ -26,8 +26,8 @@ notice identical to this one.

Indirect:
wget.info-1: 961
wget.info-2: 50079
wget.info-3: 92081
wget.info-2: 49932
wget.info-3: 93404

Tag Table:
(Indirect)
@@ -39,50 +39,50 @@ Node: Option Syntax8163
Node: Basic Startup Options9587
Node: Logging and Input File Options10287
Node: Download Options12681
Node: Directory Options19043
Node: HTTP Options21521
Node: FTP Options25426
Node: Recursive Retrieval Options26619
Node: Recursive Accept/Reject Options28583
Node: Recursive Retrieval31481
Node: Following Links33779
Node: Relative Links34807
Node: Host Checking35321
Node: Domain Acceptance37346
Node: All Hosts39016
Node: Types of Files39443
Node: Directory-Based Limits41893
Node: FTP Links44533
Node: Time-Stamping45403
Node: Time-Stamping Usage47040
Node: HTTP Time-Stamping Internals48609
Node: FTP Time-Stamping Internals50079
Node: Startup File51287
Node: Wgetrc Location52160
Node: Wgetrc Syntax52975
Node: Wgetrc Commands53690
Node: Sample Wgetrc60972
Node: Examples65991
Node: Simple Usage66598
Node: Advanced Usage68992
Node: Guru Usage71743
Node: Various73405
Node: Proxies73929
Node: Distribution76694
Node: Mailing List77045
Node: Reporting Bugs77744
Node: Portability79529
Node: Signals80904
Node: Appendices81558
Node: Robots81973
Node: Introduction to RES83120
Node: RES Format85013
Node: User-Agent Field86117
Node: Disallow Field86881
Node: Norobots Examples87492
Node: Security Considerations88446
Node: Contributors89442
Node: Copying92081
Node: Concept Index111244
Node: Directory Options20366
Node: HTTP Options22844
Node: FTP Options26749
Node: Recursive Retrieval Options27942
Node: Recursive Accept/Reject Options29906
Node: Recursive Retrieval32804
Node: Following Links35102
Node: Relative Links36130
Node: Host Checking36644
Node: Domain Acceptance38669
Node: All Hosts40339
Node: Types of Files40766
Node: Directory-Based Limits43216
Node: FTP Links45856
Node: Time-Stamping46726
Node: Time-Stamping Usage48363
Node: HTTP Time-Stamping Internals49932
Node: FTP Time-Stamping Internals51402
Node: Startup File52610
Node: Wgetrc Location53483
Node: Wgetrc Syntax54298
Node: Wgetrc Commands55013
Node: Sample Wgetrc62295
Node: Examples67314
Node: Simple Usage67921
Node: Advanced Usage70315
Node: Guru Usage73066
Node: Various74728
Node: Proxies75252
Node: Distribution78017
Node: Mailing List78368
Node: Reporting Bugs79067
Node: Portability80852
Node: Signals82227
Node: Appendices82881
Node: Robots83296
Node: Introduction to RES84443
Node: RES Format86336
Node: User-Agent Field87440
Node: Disallow Field88204
Node: Norobots Examples88815
Node: Security Considerations89769
Node: Contributors90765
Node: Copying93404
Node: Concept Index112567

End Tag Table

@@ -357,12 +357,37 @@ Download Options
`-nc'
`--no-clobber'
Do not clobber existing files when saving to directory hierarchy
within recursive retrieval of several files. This option is
*extremely* useful when you wish to continue where you left off
with retrieval of many files. If the files have the `.html' or
(yuck) `.htm' suffix, they will be loaded from the local disk, and
parsed as if they have been retrieved from the Web.
If a file is downloaded more than once in the same directory,
wget's behavior depends on a few options, including `-nc'. In
certain cases, the local file will be "clobbered", or overwritten,
upon repeated download. In other cases it will be preserved.
When running wget without `-N', `-nc', or `-r', downloading the
same file in the same directory will result in the original copy
of `FILE' being preserved and the second copy being named
`FILE.1'. If that file is downloaded yet again, the third copy
will be named `FILE.2', and so on. When `-nc' is specified, this
behavior is suppressed, and wget will refuse to download newer
copies of `FILE'. Therefore, "no-clobber" is actually a misnomer
in this mode - it's not clobbering that's prevented (as the
numeric suffixes were already preventing clobbering), but rather
the multiple version saving that's prevented.
When running wget with `-r', but without `-N' or `-nc',
re-downloading a file will result in the new copy simply
overwriting the old. Adding `-nc' will prevent this behavior,
instead causing the original version to be preserved and any newer
copies on the server to be ignored.
When running wget with `-N', with or without `-r', the decision as
to whether or not to download a newer copy of a file depends on
the local and remote timestamp and size of the file (*Note
Time-Stamping::). `-nc' may not be specified at the same time as
`-N'.
Note that when `-nc' is specified, files with the suffixes `.html'
or (yuck) `.htm' will be loaded from the local disk and parsed as
if they had been retrieved from the Web.
`-c'
`--continue'
@@ -1220,37 +1245,3 @@ following command every week:
wget --timestamping -r ftp://prep.ai.mit.edu/pub/gnu/

File: wget.info, Node: HTTP Time-Stamping Internals, Next: FTP Time-Stamping Internals, Prev: Time-Stamping Usage, Up: Time-Stamping
HTTP Time-Stamping Internals
============================
Time-stamping in HTTP is implemented by checking of the
`Last-Modified' header. If you wish to retrieve the file `foo.html'
through HTTP, Wget will check whether `foo.html' exists locally. If it
doesn't, `foo.html' will be retrieved unconditionally.
If the file does exist locally, Wget will first check its local
time-stamp (similar to the way `ls -l' checks it), and then send a
`HEAD' request to the remote server, demanding the information on the
remote file.
The `Last-Modified' header is examined to find which file was
modified more recently (which makes it "newer"). If the remote file is
newer, it will be downloaded; if it is older, Wget will give up.(1)
When `--backup-converted' (`-K') is specified in conjunction with
`-N', server file `X' is compared to local file `X.orig', if extant,
rather than being compared to local file `X', which will always differ
if it's been converted by `--convert-links' (`-k').
Arguably, HTTP time-stamping should be implemented using the
`If-Modified-Since' request.
---------- Footnotes ----------
(1) As an additional check, Wget will look at the `Content-Length'
header, and compare the sizes; if they are not the same, the remote
file will be downloaded no matter what the time-stamp says.

@@ -23,6 +23,40 @@ are included exactly as in the original, and provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.

File: wget.info, Node: HTTP Time-Stamping Internals, Next: FTP Time-Stamping Internals, Prev: Time-Stamping Usage, Up: Time-Stamping
HTTP Time-Stamping Internals
============================
Time-stamping in HTTP is implemented by checking of the
`Last-Modified' header. If you wish to retrieve the file `foo.html'
through HTTP, Wget will check whether `foo.html' exists locally. If it
doesn't, `foo.html' will be retrieved unconditionally.
If the file does exist locally, Wget will first check its local
time-stamp (similar to the way `ls -l' checks it), and then send a
`HEAD' request to the remote server, demanding the information on the
remote file.
The `Last-Modified' header is examined to find which file was
modified more recently (which makes it "newer"). If the remote file is
newer, it will be downloaded; if it is older, Wget will give up.(1)
When `--backup-converted' (`-K') is specified in conjunction with
`-N', server file `X' is compared to local file `X.orig', if extant,
rather than being compared to local file `X', which will always differ
if it's been converted by `--convert-links' (`-k').
Arguably, HTTP time-stamping should be implemented using the
`If-Modified-Since' request.
---------- Footnotes ----------
(1) As an additional check, Wget will look at the `Content-Length'
header, and compare the sizes; if they are not the same, the remote
file will be downloaded no matter what the time-stamp says.

File: wget.info, Node: FTP Time-Stamping Internals, Prev: HTTP Time-Stamping Internals, Up: Time-Stamping

@@ -408,6 +408,7 @@ Concept Index
* bug reports: Reporting Bugs.
* bugs: Reporting Bugs.
* cache: HTTP Options.
* clobbering, file: Download Options.
* command line: Invoking.
* Content-Length, ignore: HTTP Options.
* continue retrieval: Download Options.
@@ -424,6 +425,7 @@ Concept Index
* directory prefix: Directory Options.
* DNS lookup: Host Checking.
* dot style: Download Options.
* downloading multiple times: Download Options.
* examples: Examples.
* exclude directories: Directory-Based Limits.
* execute wgetrc command: Basic Startup Options.

@@ -453,15 +453,42 @@ already exists, it will be overwritten. If the @var{file} is @samp{-},
the documents will be written to standard output. Including this option
automatically sets the number of tries to 1.
@cindex clobbering, file
@cindex downloading multiple times
@cindex no-clobber
@item -nc
@itemx --no-clobber
Do not clobber existing files when saving to directory hierarchy within
recursive retrieval of several files. This option is @emph{extremely}
useful when you wish to continue where you left off with retrieval of
many files. If the files have the @samp{.html} or (yuck) @samp{.htm}
suffix, they will be loaded from the local disk, and parsed as if they
have been retrieved from the Web.
If a file is downloaded more than once in the same directory, wget's
behavior depends on a few options, including @samp{-nc}. In certain
cases, the local file will be "clobbered", or overwritten, upon repeated
download. In other cases it will be preserved.
When running wget without @samp{-N}, @samp{-nc}, or @samp{-r},
downloading the same file in the same directory will result in the
original copy of @samp{@var{file}} being preserved and the second copy
being named @samp{@var{file}.1}. If that file is downloaded yet again,
the third copy will be named @samp{@var{file}.2}, and so on. When
@samp{-nc} is specified, this behavior is suppressed, and wget will
refuse to download newer copies of @samp{@var{file}}. Therefore,
"no-clobber" is actually a misnomer in this mode -- it's not clobbering
that's prevented (as the numeric suffixes were already preventing
clobbering), but rather the multiple version saving that's prevented.
When running wget with @samp{-r}, but without @samp{-N} or @samp{-nc},
re-downloading a file will result in the new copy simply overwriting the
old. Adding @samp{-nc} will prevent this behavior, instead causing the
original version to be preserved and any newer copies on the server to
be ignored.
When running wget with @samp{-N}, with or without @samp{-r}, the
decision as to whether or not to download a newer copy of a file depends
on the local and remote timestamp and size of the file
(@xref{Time-Stamping}). @samp{-nc} may not be specified at the same
time as @samp{-N}.
Note that when @samp{-nc} is specified, files with the suffixes
@samp{.html} or (yuck) @samp{.htm} will be loaded from the local disk
and parsed as if they had been retrieved from the Web.
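
The rules described above can be summarized as a small decision table. The following is only an illustrative sketch in Python, not wget's actual code: the function name plan_download, its parameters, and the returned strings are all hypothetical, and the case where no local copy exists yet is assumed to be a plain save.

```python
from typing import Set


def plan_download(name: str, existing: Set[str], *, recursive: bool = False,
                  no_clobber: bool = False, timestamping: bool = False) -> str:
    """Model of the documented behavior; hypothetical helper, not wget's API.

    `existing` is the set of file names already present in the directory.
    """
    # The text states that -nc may not be specified together with -N.
    if no_clobber and timestamping:
        raise ValueError("-nc may not be specified at the same time as -N")
    # Assumption: with no local copy there is nothing to clobber.
    if name not in existing:
        return f"save as {name}"
    # With -N (with or without -r), timestamps and sizes decide.
    if timestamping:
        return "compare timestamp and size"
    # With -nc the original is preserved and newer copies are ignored.
    if no_clobber:
        return "skip"
    # With -r alone, the new copy overwrites the old one.
    if recursive:
        return f"overwrite {name}"
    # With none of -N, -nc, -r: keep the original, use FILE.1, FILE.2, ...
    n = 1
    while f"{name}.{n}" in existing:
        n += 1
    return f"save as {name}.{n}"


if __name__ == "__main__":
    print(plan_download("index.html", {"index.html"}))
    print(plan_download("index.html", {"index.html"}, no_clobber=True))
    print(plan_download("index.html", {"index.html"}, recursive=True))
```

Keeping the mutual-exclusion check first mirrors the note that -nc and -N cannot be combined; the numeric-suffix loop only runs in the mode where none of the three options is given.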
@cindex continue retrieval
@item -c