mirror of
https://github.com/moparisthebest/wget
synced 2024-07-03 16:38:41 -04:00
[svn] * wget.texi (Download Options): --no-clobber's documentation was
severely lacking -- ameliorated the situation. Some of the previously-undocumented stuff (like the multiple-file-version numeric-suffixing) that's now mentioned for the first (and only) time in the -nc documentation should probably be mentioned elsewhere, but due to the way that wget.texi's hierarchy is laid out, I had a hard time finding anywhere else appropriate.
parent 51642074f4
commit 28668d2875
@@ -1,3 +1,13 @@
+2000-08-22 Dan Harkless <dan-wget@dilvish.speed.net>
+
+* wget.texi (Download Options): --no-clobber's documentation was
+severely lacking -- ameliorated the situation. Some of the
+previously-undocumented stuff (like the multiple-file-version
+numeric-suffixing) that's now mentioned for the first (and only)
+time in the -nc documentation should probably be mentioned
+elsewhere, but due to the way that wget.texi's hierarchy is laid
+out, I had a hard time finding anywhere else appropriate.
+
 2000-07-17 Dan Harkless <dan-wget@dilvish.speed.net>
 
 * wget.texi (HTTP Options): Minor clarification in "download a
@@ -26,8 +26,8 @@ notice identical to this one.
 
 Indirect:
 wget.info-1: 961
-wget.info-2: 50079
-wget.info-3: 92081
+wget.info-2: 49932
+wget.info-3: 93404
 
 Tag Table:
 (Indirect)
@@ -39,50 +39,50 @@ Node: Option Syntax8163
 Node: Basic Startup Options9587
 Node: Logging and Input File Options10287
 Node: Download Options12681
-Node: Directory Options19043
-Node: HTTP Options21521
-Node: FTP Options25426
-Node: Recursive Retrieval Options26619
-Node: Recursive Accept/Reject Options28583
-Node: Recursive Retrieval31481
-Node: Following Links33779
-Node: Relative Links34807
-Node: Host Checking35321
-Node: Domain Acceptance37346
-Node: All Hosts39016
-Node: Types of Files39443
-Node: Directory-Based Limits41893
-Node: FTP Links44533
-Node: Time-Stamping45403
-Node: Time-Stamping Usage47040
-Node: HTTP Time-Stamping Internals48609
-Node: FTP Time-Stamping Internals50079
-Node: Startup File51287
-Node: Wgetrc Location52160
-Node: Wgetrc Syntax52975
-Node: Wgetrc Commands53690
-Node: Sample Wgetrc60972
-Node: Examples65991
-Node: Simple Usage66598
-Node: Advanced Usage68992
-Node: Guru Usage71743
-Node: Various73405
-Node: Proxies73929
-Node: Distribution76694
-Node: Mailing List77045
-Node: Reporting Bugs77744
-Node: Portability79529
-Node: Signals80904
-Node: Appendices81558
-Node: Robots81973
-Node: Introduction to RES83120
-Node: RES Format85013
-Node: User-Agent Field86117
-Node: Disallow Field86881
-Node: Norobots Examples87492
-Node: Security Considerations88446
-Node: Contributors89442
-Node: Copying92081
-Node: Concept Index111244
+Node: Directory Options20366
+Node: HTTP Options22844
+Node: FTP Options26749
+Node: Recursive Retrieval Options27942
+Node: Recursive Accept/Reject Options29906
+Node: Recursive Retrieval32804
+Node: Following Links35102
+Node: Relative Links36130
+Node: Host Checking36644
+Node: Domain Acceptance38669
+Node: All Hosts40339
+Node: Types of Files40766
+Node: Directory-Based Limits43216
+Node: FTP Links45856
+Node: Time-Stamping46726
+Node: Time-Stamping Usage48363
+Node: HTTP Time-Stamping Internals49932
+Node: FTP Time-Stamping Internals51402
+Node: Startup File52610
+Node: Wgetrc Location53483
+Node: Wgetrc Syntax54298
+Node: Wgetrc Commands55013
+Node: Sample Wgetrc62295
+Node: Examples67314
+Node: Simple Usage67921
+Node: Advanced Usage70315
+Node: Guru Usage73066
+Node: Various74728
+Node: Proxies75252
+Node: Distribution78017
+Node: Mailing List78368
+Node: Reporting Bugs79067
+Node: Portability80852
+Node: Signals82227
+Node: Appendices82881
+Node: Robots83296
+Node: Introduction to RES84443
+Node: RES Format86336
+Node: User-Agent Field87440
+Node: Disallow Field88204
+Node: Norobots Examples88815
+Node: Security Considerations89769
+Node: Contributors90765
+Node: Copying93404
+Node: Concept Index112567
 
 End Tag Table
@@ -357,12 +357,37 @@ Download Options
 
 `-nc'
 `--no-clobber'
-Do not clobber existing files when saving to directory hierarchy
-within recursive retrieval of several files. This option is
-*extremely* useful when you wish to continue where you left off
-with retrieval of many files. If the files have the `.html' or
-(yuck) `.htm' suffix, they will be loaded from the local disk, and
-parsed as if they have been retrieved from the Web.
+If a file is downloaded more than once in the same directory,
+wget's behavior depends on a few options, including `-nc'. In
+certain cases, the local file will be "clobbered", or overwritten,
+upon repeated download. In other cases it will be preserved.
+
+When running wget without `-N', `-nc', or `-r', downloading the
+same file in the same directory will result in the original copy
+of `FILE' being preserved and the second copy being named
+`FILE.1'. If that file is downloaded yet again, the third copy
+will be named `FILE.2', and so on. When `-nc' is specified, this
+behavior is suppressed, and wget will refuse to download newer
+copies of `FILE'. Therefore, "no-clobber" is actually a misnomer
+in this mode - it's not clobbering that's prevented (as the
+numeric suffixes were already preventing clobbering), but rather
+the multiple version saving that's prevented.
+
+When running wget with `-r', but without `-N' or `-nc',
+re-downloading a file will result in the new copy simply
+overwriting the old. Adding `-nc' will prevent this behavior,
+instead causing the original version to be preserved and any newer
+copies on the server to be ignored.
+
+When running wget with `-N', with or without `-r', the decision as
+to whether or not to download a newer copy of a file depends on
+the local and remote timestamp and size of the file (*Note
+Time-Stamping::). `-nc' may not be specified at the same time as
+`-N'.
+
+Note that when `-nc' is specified, files with the suffixes `.html'
+or (yuck) `.htm' will be loaded from the local disk and parsed as
+if they had been retrieved from the Web.
 
 `-c'
 `--continue'
@@ -1220,37 +1245,3 @@ following command every week:
 
 wget --timestamping -r ftp://prep.ai.mit.edu/pub/gnu/
 
-
-File: wget.info, Node: HTTP Time-Stamping Internals, Next: FTP Time-Stamping Internals, Prev: Time-Stamping Usage, Up: Time-Stamping
-
-HTTP Time-Stamping Internals
-============================
-
-Time-stamping in HTTP is implemented by checking of the
-`Last-Modified' header. If you wish to retrieve the file `foo.html'
-through HTTP, Wget will check whether `foo.html' exists locally. If it
-doesn't, `foo.html' will be retrieved unconditionally.
-
-If the file does exist locally, Wget will first check its local
-time-stamp (similar to the way `ls -l' checks it), and then send a
-`HEAD' request to the remote server, demanding the information on the
-remote file.
-
-The `Last-Modified' header is examined to find which file was
-modified more recently (which makes it "newer"). If the remote file is
-newer, it will be downloaded; if it is older, Wget will give up.(1)
-
-When `--backup-converted' (`-K') is specified in conjunction with
-`-N', server file `X' is compared to local file `X.orig', if extant,
-rather than being compared to local file `X', which will always differ
-if it's been converted by `--convert-links' (`-k').
-
-Arguably, HTTP time-stamping should be implemented using the
-`If-Modified-Since' request.
-
----------- Footnotes ----------
-
-(1) As an additional check, Wget will look at the `Content-Length'
-header, and compare the sizes; if they are not the same, the remote
-file will be downloaded no matter what the time-stamp says.
-
@@ -23,6 +23,40 @@ are included exactly as in the original, and provided that the entire
 resulting derived work is distributed under the terms of a permission
 notice identical to this one.
 
+
+File: wget.info, Node: HTTP Time-Stamping Internals, Next: FTP Time-Stamping Internals, Prev: Time-Stamping Usage, Up: Time-Stamping
+
+HTTP Time-Stamping Internals
+============================
+
+Time-stamping in HTTP is implemented by checking of the
+`Last-Modified' header. If you wish to retrieve the file `foo.html'
+through HTTP, Wget will check whether `foo.html' exists locally. If it
+doesn't, `foo.html' will be retrieved unconditionally.
+
+If the file does exist locally, Wget will first check its local
+time-stamp (similar to the way `ls -l' checks it), and then send a
+`HEAD' request to the remote server, demanding the information on the
+remote file.
+
+The `Last-Modified' header is examined to find which file was
+modified more recently (which makes it "newer"). If the remote file is
+newer, it will be downloaded; if it is older, Wget will give up.(1)
+
+When `--backup-converted' (`-K') is specified in conjunction with
+`-N', server file `X' is compared to local file `X.orig', if extant,
+rather than being compared to local file `X', which will always differ
+if it's been converted by `--convert-links' (`-k').
+
+Arguably, HTTP time-stamping should be implemented using the
+`If-Modified-Since' request.
+
+---------- Footnotes ----------
+
+(1) As an additional check, Wget will look at the `Content-Length'
+header, and compare the sizes; if they are not the same, the remote
+file will be downloaded no matter what the time-stamp says.
+
 
 File: wget.info, Node: FTP Time-Stamping Internals, Prev: HTTP Time-Stamping Internals, Up: Time-Stamping
 
@@ -408,6 +408,7 @@ Concept Index
 * bug reports: Reporting Bugs.
 * bugs: Reporting Bugs.
 * cache: HTTP Options.
+* clobbering, file: Download Options.
 * command line: Invoking.
 * Content-Length, ignore: HTTP Options.
 * continue retrieval: Download Options.
@@ -424,6 +425,7 @@ Concept Index
 * directory prefix: Directory Options.
 * DNS lookup: Host Checking.
 * dot style: Download Options.
+* downloading multiple times: Download Options.
 * examples: Examples.
 * exclude directories: Directory-Based Limits.
 * execute wgetrc command: Basic Startup Options.
@@ -453,15 +453,42 @@ already exists, it will be overwritten. If the @var{file} is @samp{-},
 the documents will be written to standard output. Including this option
 automatically sets the number of tries to 1.
 
+@cindex clobbering, file
+@cindex downloading multiple times
 @cindex no-clobber
 @item -nc
 @itemx --no-clobber
-Do not clobber existing files when saving to directory hierarchy within
-recursive retrieval of several files. This option is @emph{extremely}
-useful when you wish to continue where you left off with retrieval of
-many files. If the files have the @samp{.html} or (yuck) @samp{.htm}
-suffix, they will be loaded from the local disk, and parsed as if they
-have been retrieved from the Web.
+If a file is downloaded more than once in the same directory, wget's
+behavior depends on a few options, including @samp{-nc}. In certain
+cases, the local file will be "clobbered", or overwritten, upon repeated
+download. In other cases it will be preserved.
+
+When running wget without @samp{-N}, @samp{-nc}, or @samp{-r},
+downloading the same file in the same directory will result in the
+original copy of @samp{@var{file}} being preserved and the second copy
+being named @samp{@var{file}.1}. If that file is downloaded yet again,
+the third copy will be named @samp{@var{file}.2}, and so on. When
+@samp{-nc} is specified, this behavior is suppressed, and wget will
+refuse to download newer copies of @samp{@var{file}}. Therefore,
+"no-clobber" is actually a misnomer in this mode -- it's not clobbering
+that's prevented (as the numeric suffixes were already preventing
+clobbering), but rather the multiple version saving that's prevented.
+
+When running wget with @samp{-r}, but without @samp{-N} or @samp{-nc},
+re-downloading a file will result in the new copy simply overwriting the
+old. Adding @samp{-nc} will prevent this behavior, instead causing the
+original version to be preserved and any newer copies on the server to
+be ignored.
+
+When running wget with @samp{-N}, with or without @samp{-r}, the
+decision as to whether or not to download a newer copy of a file depends
+on the local and remote timestamp and size of the file
+(@xref{Time-Stamping}). @samp{-nc} may not be specified at the same
+time as @samp{-N}.
+
+Note that when @samp{-nc} is specified, files with the suffixes
+@samp{.html} or (yuck) @samp{.htm} will be loaded from the local disk
+and parsed as if they had been retrieved from the Web.
 
 @cindex continue retrieval
 @item -c
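Reviewer note: the new `-nc` text describes four cases that depend on which of `-N`, `-nc`, and `-r` are given. As a reading aid, the rules can be sketched as a small pure function. This is only a model of the documentation in this commit, not wget's implementation: the function name `plan_download`, its parameters, and the action labels are all hypothetical, and the assumption that the numeric suffix is the first unused one is inferred from the `FILE.1`, `FILE.2` wording.

```python
def plan_download(name, existing, *, no_clobber=False,
                  timestamping=False, recursive=False):
    """Model the documented repeat-download behavior (hypothetical helper).

    name:     local file name the download would normally be saved under
    existing: set of file names already present in the target directory
    Returns an (action, local_name) tuple.
    """
    # The documentation says -nc may not be combined with -N.
    if no_clobber and timestamping:
        raise ValueError("-nc may not be specified at the same time as -N")

    # No local copy yet: nothing to clobber, so just download.
    if name not in existing:
        return ("download", name)

    # With -N the decision depends on timestamps and sizes, which this
    # sketch does not model; it only reports that a comparison is needed.
    if timestamping:
        return ("compare-timestamps", name)

    # With -nc the existing copy is kept and no newer copy is fetched.
    if no_clobber:
        return ("skip", name)

    # With -r (and neither -N nor -nc) the new copy overwrites the old.
    if recursive:
        return ("overwrite", name)

    # Otherwise keep the original and save under a numeric suffix.
    # Assumption: the first unused suffix is chosen.
    n = 1
    while f"{name}.{n}" in existing:
        n += 1
    return ("download", f"{name}.{n}")
```

For example, `plan_download("index.html", {"index.html"})` yields `("download", "index.html.1")`, while the same call with `no_clobber=True` yields `("skip", "index.html")`.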