From 1a5c5a006a36d1fcf259944d96a72943312b208e Mon Sep 17 00:00:00 2001
From: hniksic
Date: Wed, 15 Nov 2000 02:44:18 -0800
Subject: [PATCH] [svn] Robots doc changes. Published at .

---
 doc/ChangeLog   |   4 +
 doc/wget.info   | 127 ++++++------
 doc/wget.info-1 | 136 +++++++------
 doc/wget.info-2 | 520 +++++++++++++++++++-----------------------------
 doc/wget.info-3 | 486 ++++++++++++++++++++++++++++++++++++--------
 doc/wget.texi   |  68 +++++--
 6 files changed, 797 insertions(+), 544 deletions(-)

diff --git a/doc/ChangeLog b/doc/ChangeLog
index 98f5421a..3ea5c873 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,7 @@
+2000-11-15 Hrvoje Niksic
+
+ * wget.texi (Robots): Rearrange text. Mention the meta tag.
+
 2000-11-14 Hrvoje Niksic

  * wget.texi: Add GFDL; remove norobots specification.
diff --git a/doc/wget.info b/doc/wget.info
index 27989582..827fb7fb 100644
--- a/doc/wget.info
+++ b/doc/wget.info
@@ -1,5 +1,4 @@
-This is Info file wget.info, produced by Makeinfo version 1.68 from the
-input file ./wget.texi.
+This is wget.info, produced by makeinfo version 4.0 from wget.texi.

 INFO-DIR-SECTION Net Utilities
 INFO-DIR-SECTION World Wide Web
@@ -16,73 +15,73 @@ data.
 manual provided the copyright notice and this permission notice are
 preserved on all copies.

- Permission is granted to copy and distribute modified versions of
-this manual under the conditions for verbatim copying, provided also
-that the sections entitled "Copying" and "GNU General Public License"
-are included exactly as in the original, and provided that the entire
-resulting derived work is distributed under the terms of a permission
-notice identical to this one.
+ Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.1 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being "GNU General Public License" and "GNU Free
+Documentation License", with no Front-Cover Texts, and with no
+Back-Cover Texts. A copy of the license is included in the section
+entitled "GNU Free Documentation License".

 Indirect:
-wget.info-1: 961
-wget.info-2: 48745
-wget.info-3: 97411
+wget.info-1: 1010
+wget.info-2: 48842
+wget.info-3: 94301

 Tag Table:
 (Indirect)
-Node: Top961
-Node: Overview1850
-Node: Invoking5024
-Node: URL Format5833
-Node: Option Syntax8163
-Node: Basic Startup Options9587
-Node: Logging and Input File Options10287
-Node: Download Options12812
-Node: Directory Options20910
-Node: HTTP Options23388
-Node: FTP Options28104
-Node: Recursive Retrieval Options30086
-Node: Recursive Accept/Reject Options35107
-Node: Recursive Retrieval38333
-Node: Following Links40631
-Node: Relative Links41659
-Node: Host Checking42173
-Node: Domain Acceptance44198
-Node: All Hosts45868
-Node: Types of Files46295
-Node: Directory-Based Limits48745
-Node: FTP Links51385
-Node: Time-Stamping52255
-Node: Time-Stamping Usage53892
-Node: HTTP Time-Stamping Internals55461
-Node: FTP Time-Stamping Internals56931
-Node: Startup File58139
-Node: Wgetrc Location59012
-Node: Wgetrc Syntax59827
-Node: Wgetrc Commands60542
-Node: Sample Wgetrc68941
-Node: Examples73960
-Node: Simple Usage74567
-Node: Advanced Usage76961
-Node: Guru Usage79712
-Node: Various81374
-Node: Proxies81898
-Node: Distribution84663
-Node: Mailing List85014
-Node: Reporting Bugs85713
-Node: Portability87498
-Node: Signals88873
-Node: Appendices89527
-Node: Robots89942
-Node: Introduction to RES91089
-Node: RES Format92982
-Node: User-Agent Field94086
-Node: Disallow Field94850
-Node: Norobots Examples95461
-Node: Security Considerations96415
-Node: Contributors97411
-Node: Copying100054
-Node: Concept Index119217
+Node: Top1010
+Node: Overview1924
+Node: Invoking5106
+Node: URL Format5915
+Ref: URL Format-Footnote-18143
+Node: Option Syntax8245
+Node: Basic Startup Options9670
+Node: Logging and Input File Options10370
+Node: Download Options12896
+Node: Directory Options20995
+Node: HTTP Options23477
+Node: FTP Options28194
+Node: Recursive Retrieval Options30177
+Node: Recursive Accept/Reject Options35199
+Node: Recursive Retrieval38426
+Node: Following Links40724
+Node: Relative Links41753
+Node: Host Checking42267
+Node: Domain Acceptance44293
+Node: All Hosts45965
+Node: Types of Files46392
+Node: Directory-Based Limits48842
+Node: FTP Links51482
+Node: Time-Stamping52352
+Node: Time-Stamping Usage53989
+Node: HTTP Time-Stamping Internals55558
+Ref: HTTP Time-Stamping Internals-Footnote-156829
+Node: FTP Time-Stamping Internals57028
+Node: Startup File58236
+Node: Wgetrc Location59109
+Node: Wgetrc Syntax59924
+Node: Wgetrc Commands60639
+Node: Sample Wgetrc69038
+Node: Examples69562
+Node: Simple Usage70169
+Node: Advanced Usage72571
+Node: Guru Usage75323
+Node: Various76985
+Node: Proxies77509
+Node: Distribution80274
+Node: Mailing List80625
+Node: Reporting Bugs81325
+Node: Portability83110
+Node: Signals84485
+Node: Appendices85139
+Node: Robots85457
+Node: Security Considerations88309
+Node: Contributors89305
+Node: Copying92189
+Node: GNU General Public License94301
+Node: GNU Free Documentation License113501
+Node: Concept Index133231

 End Tag Table
diff --git a/doc/wget.info-1 b/doc/wget.info-1
index 75f49368..067e5109 100644
--- a/doc/wget.info-1
+++ b/doc/wget.info-1
@@ -1,5 +1,4 @@
-This is Info file wget.info, produced by Makeinfo version 1.68 from the
-input file ./wget.texi.
+This is wget.info, produced by makeinfo version 4.0 from wget.texi.

 INFO-DIR-SECTION Net Utilities
 INFO-DIR-SECTION World Wide Web
@@ -16,12 +15,13 @@ data.
manual provided the copyright notice and this permission notice are preserved on all copies. - Permission is granted to copy and distribute modified versions of -this manual under the conditions for verbatim copying, provided also -that the sections entitled "Copying" and "GNU General Public License" -are included exactly as in the original, and provided that the entire -resulting derived work is distributed under the terms of a permission -notice identical to this one. + Permission is granted to copy, distribute and/or modify this document +under the terms of the GNU Free Documentation License, Version 1.1 or +any later version published by the Free Software Foundation; with the +Invariant Sections being "GNU General Public License" and "GNU Free +Documentation License", with no Front-Cover Texts, and with no +Back-Cover Texts. A copy of the license is included in the section +entitled "GNU Free Documentation License".  File: wget.info, Node: Top, Next: Overview, Prev: (dir), Up: (dir) @@ -32,7 +32,7 @@ Wget 1.5.3+dev This manual documents version 1.5.3+dev of GNU Wget, the freely available utility for network download. - Copyright (C) 1996, 1997, 1998 Free Software Foundation, Inc. + Copyright (C) 1996, 1997, 1998, 2000 Free Software Foundation, Inc. * Menu: @@ -45,7 +45,7 @@ available utility for network download. * Examples:: Examples of usage. * Various:: The stuff that doesn't fit anywhere else. * Appendices:: Some useful references. -* Copying:: You may give out copies of Wget. +* Copying:: You may give out copies of Wget and of this manual. * Concept Index:: Topics covered by this manual.  @@ -67,13 +67,15 @@ being: constant user's presence, which can be a great hindrance when transferring a lot of data. + * Wget is capable of descending recursively through the structure of HTML documents and FTP directory trees, making a local copy of the directory hierarchy similar to the one on the remote server. 
This feature can be used to mirror archives and home pages, or traverse - the web in search of data, like a WWW robot (*Note Robots::). In + the web in search of data, like a WWW robot (*note Robots::). In that spirit, Wget understands the `norobots' convention. + * File name wildcard matching and recursive mirroring of directories are available when retrieving via FTP. Wget can read the time-stamp information given by both HTTP and FTP servers, and @@ -82,12 +84,14 @@ being: version if it has. This makes Wget suitable for mirroring of FTP sites, as well as home pages. + * Wget works exceedingly well on slow or unstable connections, retrying the document until it is fully retrieved, or until a user-specified retry count is surpassed. It will try to resume the download from the point of interruption, using `REST' with FTP and `Range' with HTTP servers that support them. + * By default, Wget supports proxy servers, which can lighten the network load, speed up retrieval and provide access behind firewalls. However, if you are behind a firewall that requires @@ -95,23 +99,27 @@ being: and build wget with support for socks. Wget also supports the passive FTP downloading as an option. + * Builtin features offer mechanisms to tune which links you wish to - follow (*Note Following Links::). + follow (*note Following Links::). + * The retrieval is conveniently traced with printing dots, each dot representing a fixed amount of data received (1KB by default). These representations can be customized to your preferences. + * Most of the features are fully configurable, either through command line options, or via the initialization file `.wgetrc' - (*Note Startup File::). Wget allows you to define "global" + (*note Startup File::). Wget allows you to define "global" startup files (`/usr/local/etc/wgetrc' by default) for site settings. + * Finally, GNU Wget is free software. 
This means that everyone may use it, redistribute it and/or modify it under the terms of the GNU General Public License, as published by the Free Software - Foundation (*Note Copying::). + Foundation (*note Copying::).  File: wget.info, Node: Invoking, Next: Recursive Retrieval, Prev: Overview, Up: Top @@ -128,7 +136,7 @@ line. URL is a "Uniform Resource Locator", as defined below. However, you may wish to change some of the default parameters of Wget. You can do it two ways: permanently, adding the appropriate -command to `.wgetrc' (*Note Startup File::), or specifying it on the +command to `.wgetrc' (*note Startup File::), or specifying it on the command line. * Menu: @@ -218,7 +226,7 @@ remember, but take time to type. You may freely mix different option styles, or specify options after the command-line arguments. Thus you may write: - wget -r --tries=10 http://fly.cc.fer.hr/ -o log + wget -r --tries=10 http://fly.srk.fer.hr/ -o log The space between the option accepting an argument and the argument may be omitted. Instead `-o log' you can write `-olog'. @@ -243,7 +251,7 @@ convention that specifying an empty list clears its value. This can be useful to clear the `.wgetrc' settings. For instance, if your `.wgetrc' sets `exclude_directories' to `/cgi-bin', the following example will first reset it, and then set it to exclude `/~nobody' and `/~somebody'. -You can also clear the lists in `.wgetrc' (*Note Wgetrc Syntax::). +You can also clear the lists in `.wgetrc' (*note Wgetrc Syntax::). wget -X '' -X /~nobody,/~somebody @@ -268,8 +276,8 @@ Basic Startup Options `-e COMMAND' `--execute COMMAND' - Execute COMMAND as if it were a part of `.wgetrc' (*Note Startup - File::). A command thus invoked will be executed *after* the + Execute COMMAND as if it were a part of `.wgetrc' (*note Startup + File::). A command thus invoked will be executed _after_ the commands in `.wgetrc', thus taking precedence over them.  
@@ -296,8 +304,8 @@ Logging and Input File Options administrator may have chosen to compile Wget without debug support, in which case `-d' will not work. Please note that compiling with debug support is always safe--Wget compiled with - the debug support will *not* print any debug info unless requested - with `-d'. *Note Reporting Bugs:: for more information on how to + the debug support will _not_ print any debug info unless requested + with `-d'. *Note Reporting Bugs::, for more information on how to use `-d' for sending bug reports. `-q' @@ -392,7 +400,7 @@ Download Options When running wget with `-N', with or without `-r', the decision as to whether or not to download a newer copy of a file depends on - the local and remote timestamp and size of the file (*Note + the local and remote timestamp and size of the file (*note Time-Stamping::). `-nc' may not be specified at the same time as `-N'. @@ -449,7 +457,7 @@ Download Options `-N' `--timestamping' - Turn on time-stamping. *Note Time-Stamping:: for details. + Turn on time-stamping. *Note Time-Stamping::, for details. `-S' `--server-response' @@ -491,7 +499,7 @@ Download Options retry. `--waitretry=SECONDS' - If you don't want Wget to wait between *every* retrieval, but only + If you don't want Wget to wait between _every_ retrieval, but only between retries of failed downloads, you can use this option. Wget will use "linear backoff", waiting 1 second after the first failure on a given file, then waiting 2 seconds after the second @@ -540,14 +548,14 @@ Directory Options `--force-directories' The opposite of `-nd'--create a hierarchy of directories, even if one would not have been created otherwise. E.g. `wget -x - http://fly.cc.fer.hr/robots.txt' will save the downloaded file to - `fly.cc.fer.hr/robots.txt'. + http://fly.srk.fer.hr/robots.txt' will save the downloaded file to + `fly.srk.fer.hr/robots.txt'. `-nH' `--no-host-directories' Disable generation of host-prefixed directories. 
By default, - invoking Wget with `-r http://fly.cc.fer.hr/' will create a - structure of directories beginning with `fly.cc.fer.hr/'. This + invoking Wget with `-r http://fly.srk.fer.hr/' will create a + structure of directories beginning with `fly.srk.fer.hr/'. This option disables such behavior. `--cut-dirs=NUMBER' @@ -609,7 +617,7 @@ HTTP Options doesn't yet know that the URL produces output of type `text/html'. To prevent this re-downloading, you must use `-k' and `-K' so that the original version of the file will be saved as `X.orig' - (*Note Recursive Retrieval Options::). + (*note Recursive Retrieval Options::). `--http-user=USER' `--http-passwd=PASSWORD' @@ -619,7 +627,7 @@ HTTP Options scheme. Another way to specify username and password is in the URL itself - (*Note URL Format::). For more information about security issues + (*note URL Format::). For more information about security issues with Wget, *Note Security Considerations::. `-C on/off' @@ -653,7 +661,7 @@ HTTP Options wget --header='Accept-Charset: iso-8859-2' \ --header='Accept-Language: hr' \ - http://fly.cc.fer.hr/ + http://fly.srk.fer.hr/ Specification of an empty string as the header value will clear all previous user-defined headers. @@ -727,7 +735,7 @@ FTP Options and `]' to retrieve more than one file from the same directory at once, like: - wget ftp://gnjilux.cc.fer.hr/*.msg + wget ftp://gnjilux.srk.fer.hr/*.msg By default, globbing will be turned on if the URL contains a globbing character. This option may be used to turn globbing on @@ -751,17 +759,17 @@ Recursive Retrieval Options `-r' `--recursive' - Turn on recursive retrieving. *Note Recursive Retrieval:: for more - details. + Turn on recursive retrieving. *Note Recursive Retrieval::, for + more details. `-l DEPTH' `--level=DEPTH' - Specify recursion maximum depth level DEPTH (*Note Recursive + Specify recursion maximum depth level DEPTH (*note Recursive Retrieval::). The default maximum depth is 5. 
`--delete-after' This option tells Wget to delete every single file it downloads, - *after* having done so. It is useful for pre-fetching popular + _after_ having done so. It is useful for pre-fetching popular pages through a proxy, e.g.: wget -r -nd --delete-after http://whatever.com/~popular/page/ @@ -788,7 +796,7 @@ Recursive Retrieval Options `-K' `--backup-converted' When converting a file, back up the original version with a `.orig' - suffix. Affects the behavior of `-N' (*Note HTTP Time-Stamping + suffix. Affects the behavior of `-N' (*note HTTP Time-Stamping Internals::). `-m' @@ -837,7 +845,7 @@ Recursive Retrieval Options wget -r -l 2 -p http://SITE/1.html - all the above files *and* `3.html''s requisite `3.gif' will be + all the above files _and_ `3.html''s requisite `3.gif' will be downloaded. Similarly, wget -r -l 1 -p http://SITE/1.html @@ -879,18 +887,18 @@ Recursive Accept/Reject Options `-A ACCLIST --accept ACCLIST' `-R REJLIST --reject REJLIST' Specify comma-separated lists of file name suffixes or patterns to - accept or reject (*Note Types of Files:: for more details). + accept or reject (*note Types of Files:: for more details). `-D DOMAIN-LIST' `--domains=DOMAIN-LIST' Set domains to be accepted and DNS looked-up, where DOMAIN-LIST is - a comma-separated list. Note that it does *not* turn on `-H'. + a comma-separated list. Note that it does _not_ turn on `-H'. This option speeds things up, even if only one host is spanned - (*Note Domain Acceptance::). + (*note Domain Acceptance::). `--exclude-domains DOMAIN-LIST' Exclude the domains given in a comma-separated DOMAIN-LIST from - DNS-lookup (*Note Domain Acceptance::). + DNS-lookup (*note Domain Acceptance::). `--follow-ftp' Follow FTP links from HTML documents. Without this option, Wget @@ -924,29 +932,29 @@ Recursive Accept/Reject Options `-H' `--span-hosts' Enable spanning across hosts when doing recursive retrieving - (*Note All Hosts::). + (*note All Hosts::). 
`-L' `--relative' Follow relative links only. Useful for retrieving a specific home page without any distractions, not even those from the same hosts - (*Note Relative Links::). + (*note Relative Links::). `-I LIST' `--include-directories=LIST' Specify a comma-separated list of directories you wish to follow - when downloading (*Note Directory-Based Limits:: for more + when downloading (*note Directory-Based Limits:: for more details.) Elements of LIST may contain wildcards. `-X LIST' `--exclude-directories=LIST' Specify a comma-separated list of directories you wish to exclude - from download (*Note Directory-Based Limits:: for more details.) + from download (*note Directory-Based Limits:: for more details.) Elements of LIST may contain wildcards. `-nh' `--no-host-lookup' - Disable the time-consuming DNS lookup of almost all hosts (*Note + Disable the time-consuming DNS lookup of almost all hosts (*note Host Checking::). `-np' @@ -954,8 +962,8 @@ Recursive Accept/Reject Options `--no-parent' Do not ever ascend to the parent directory when retrieving recursively. This is a useful option, since it guarantees that - only the files *below* a certain hierarchy will be downloaded. - *Note Directory-Based Limits:: for more details. + only the files _below_ a certain hierarchy will be downloaded. + *Note Directory-Based Limits::, for more details.  File: wget.info, Node: Recursive Retrieval, Next: Following Links, Prev: Invoking, Up: Top @@ -1003,7 +1011,7 @@ which can grind the machine to a halt. (`-l') and/or by lowering the number of retries (`-t'). You may also consider using the `-w' option to slow down your requests to the remote servers, as well as the numerous options to narrow the number of -followed links (*Note Following Links::). +followed links (*note Following Links::). Recursive retrieval is a good thing when used properly. Please take all precautions not to wreak havoc through carelessness. @@ -1019,7 +1027,7 @@ unnecessary data. 
Most of the time the users bear in mind exactly what they want to download, and want Wget to follow only specific links. For example, if you wish to download the music archive from -`fly.cc.fer.hr', you will not want to download all the home pages that +`fly.srk.fer.hr', you will not want to download all the home pages that happen to be referenced by an obscure part of the archive. Wget possesses several mechanisms that allows you to fine-tune which @@ -1061,13 +1069,13 @@ following links) all URLs that refer to the same host will be retrieved. The problem with this option are the aliases of the hosts and domains. Thus there is no way for Wget to know that `regoc.srce.hr' and -`www.srce.hr' are the same host, or that `fly.cc.fer.hr' is the same as -`fly.cc.etf.hr'. Whenever an absolute link is encountered, the host is -DNS-looked-up with `gethostbyname' to check whether we are maybe +`www.srce.hr' are the same host, or that `fly.srk.fer.hr' is the same +as `fly.cc.fer.hr'. Whenever an absolute link is encountered, the host +is DNS-looked-up with `gethostbyname' to check whether we are maybe dealing with the same hosts. Although the results of `gethostbyname' are cached, it is still a great slowdown, e.g. when dealing with large indices of home pages on different hosts (because each of the hosts -must be DNS-resolved to see whether it just *might* be an alias of the +must be DNS-resolved to see whether it just _might_ be an alias of the starting host). To avoid the overhead you may use `-nh', which will turn off @@ -1079,7 +1087,7 @@ and `regoc.srce.hr' will be flagged as different hosts). "virtual servers", each having its own directory hierarchy. Such "servers" are distinguished by their hostnames (all of which point to the same IP address); for this to work, a client must send a `Host' -header, which is what Wget does. However, in that case Wget *must not* +header, which is what Wget does. 
However, in that case Wget _must not_ try to divine a host's "real" address, nor try to use the same hostname for each access, i.e. `-nh' must be turned on. @@ -1098,17 +1106,17 @@ Domain Acceptance followed. The hosts the domain of which is not in this list will not be DNS-resolved. Thus you can specify `-Dmit.edu' just to make sure that *nothing outside of MIT gets looked up*. This is very important and -useful. It also means that `-D' does *not* imply `-H' (span all +useful. It also means that `-D' does _not_ imply `-H' (span all hosts), which must be specified explicitly. Feel free to use this options since it will speed things up, with almost all the reliability of checking for all hosts. Thus you could invoke - wget -r -D.hr http://fly.cc.fer.hr/ + wget -r -D.hr http://fly.srk.fer.hr/ to make sure that only the hosts in `.hr' domain get DNS-looked-up -for being equal to `fly.cc.fer.hr'. So `fly.cc.etf.hr' will be checked -(only once!) and found equal, but `www.gnu.ai.mit.edu' will not even be -checked. +for being equal to `fly.srk.fer.hr'. So `fly.cc.fer.hr' will be +checked (only once!) and found equal, but `www.gnu.ai.mit.edu' will not +even be checked. Of course, domain acceptance can be used to limit the retrieval to particular domains with spanning of hosts in them, but then you must @@ -1121,7 +1129,7 @@ and Stanford. If there are domains you want to exclude specifically, you can do it with `--exclude-domains', which accepts the same type of arguments of -`-D', but will *exclude* all the listed domains. For example, if you +`-D', but will _exclude_ all the listed domains. For example, if you want to download all the hosts from `foo.edu' domain, with the exception of `sunsite.foo.edu', you can do it like this: @@ -1177,7 +1185,7 @@ in `.wgetrc'. 
`--reject REJLIST' `reject = REJLIST' The `--reject' option works the same way as `--accept', only its - logic is the reverse; Wget will download all files *except* the + logic is the reverse; Wget will download all files _except_ the ones matching the suffixes (or patterns) in the list. So, if you want to download a whole page except for the cumbersome @@ -1189,7 +1197,7 @@ in `.wgetrc'. The `-A' and `-R' options may be combined to achieve even better fine-tuning of which files to retrieve. E.g. `wget -A "*zelazny*" -R .ps' will download all the files having `zelazny' as a part of their -name, but *not* the PostScript files. +name, but _not_ the PostScript files. Note that these two options do not affect the downloading of HTML files; Wget must load all the HTMLs to know where to go at diff --git a/doc/wget.info-2 b/doc/wget.info-2 index 90b3c863..8958f49b 100644 --- a/doc/wget.info-2 +++ b/doc/wget.info-2 @@ -1,5 +1,4 @@ -This is Info file wget.info, produced by Makeinfo version 1.68 from the -input file ./wget.texi. +This is wget.info, produced by makeinfo version 4.0 from wget.texi. INFO-DIR-SECTION Net Utilities INFO-DIR-SECTION World Wide Web @@ -16,12 +15,13 @@ data. manual provided the copyright notice and this permission notice are preserved on all copies. - Permission is granted to copy and distribute modified versions of -this manual under the conditions for verbatim copying, provided also -that the sections entitled "Copying" and "GNU General Public License" -are included exactly as in the original, and provided that the entire -resulting derived work is distributed under the terms of a permission -notice identical to this one. 
+ Permission is granted to copy, distribute and/or modify this document +under the terms of the GNU Free Documentation License, Version 1.1 or +any later version published by the Free Software Foundation; with the +Invariant Sections being "GNU General Public License" and "GNU Free +Documentation License", with no Front-Cover Texts, and with no +Back-Cover Texts. A copy of the license is included in the section +entitled "GNU Free Documentation License".  File: wget.info, Node: Directory-Based Limits, Next: FTP Links, Prev: Types of Files, Up: Following Links @@ -57,7 +57,7 @@ equivalent command in `.wgetrc'. `--exclude LIST' `exclude_directories = LIST' `-X' option is exactly the reverse of `-I'--this is a list of - directories *excluded* from the download. E.g. if you do not want + directories _excluded_ from the download. E.g. if you do not want Wget to download things from `/cgi-bin' directory, specify `-X /cgi-bin' on the command line. @@ -184,7 +184,7 @@ remote file is more recent, Wget will proceed fetching it normally. `ls' will show that the timestamps are set according to the state on the remote server. Reissuing the command with `-N' will make Wget -re-fetch *only* the files that have been modified. +re-fetch _only_ the files that have been modified. In both HTTP and FTP retrieval Wget will time-stamp the local file correctly (with or without `-N') if it gets the stamps, i.e. gets the @@ -300,7 +300,7 @@ further attempts will be made. If `WGETRC' is not set, Wget will try to load `$HOME/.wgetrc'. The fact that user's settings are loaded after the system-wide ones -means that in case of collision user's wgetrc *overrides* the +means that in case of collision user's wgetrc _overrides_ the system-wide wgetrc (in `/usr/local/etc/wgetrc' by default). Fascist admins, away! @@ -346,11 +346,11 @@ hostnames or dotted-quad IP addresses. N can be any positive integer, or `inf' for infinity, where appropriate. STRING values can be any non-empty string. 
- Most of these commands have commandline equivalents (*Note + Most of these commands have commandline equivalents (*note Invoking::), though some of the more obscure or rarely used ones do not. accept/reject = STRING - Same as `-A'/`-R' (*Note Types of Files::). + Same as `-A'/`-R' (*note Types of Files::). add_hostdir = on/off Enable/disable host-prefixed file names. `-nH' disables it. @@ -397,14 +397,14 @@ dirstruct = on/off respectively. domains = STRING - Same as `-D' (*Note Domain Acceptance::). + Same as `-D' (*note Domain Acceptance::). dot_bytes = N Specify the number of bytes "contained" in a dot, as seen throughout the retrieval (1024 by default). You can postfix the value with `k' or `m', representing kilobytes and megabytes, respectively. With dot settings you can tailor the dot retrieval - to suit your needs, or you can use the predefined "styles" (*Note + to suit your needs, or you can use the predefined "styles" (*note Download Options::). dots_in_line = N @@ -419,10 +419,10 @@ dot_style = STRING exclude_directories = STRING Specify a comma-separated list of directories you wish to exclude - from download - the same as `-X' (*Note Directory-Based Limits::). + from download - the same as `-X' (*note Directory-Based Limits::). exclude_domains = STRING - Same as `--exclude-domains' (*Note Domain Acceptance::). + Same as `--exclude-domains' (*note Domain Acceptance::). follow_ftp = on/off Follow FTP links from HTML documents - the same as `-f'. @@ -497,7 +497,7 @@ noclobber = on/off no_parent = on/off Disallow retrieving outside the directory hierarchy, like - `--no-parent' (*Note Directory-Based Limits::). + `--no-parent' (*note Directory-Based Limits::). no_proxy = STRING Use STRING as the comma-separated list of domains to avoid in @@ -550,7 +550,7 @@ recursive = on/off Recursive on/off - the same as `-r'. 
relative_only = on/off - Follow only relative links - the same as `-L' (*Note Relative + Follow only relative links - the same as `-L' (*note Relative Links::). remove_listing = on/off @@ -562,7 +562,7 @@ retr_symlinks = on/off files; the same as `--retr-symlinks'. robots = on/off - Use (or not) `/robots.txt' file (*Note Robots::). Be sure to know + Use (or not) `/robots.txt' file (*note Robots::). Be sure to know what you are doing before changing the default (which is `on'). server_response = on/off @@ -570,7 +570,7 @@ server_response = on/off the same as `-S'. simple_host_check = on/off - Same as `-nh' (*Note Host Checking::). + Same as `-nh' (*note Host Checking::). span_hosts = on/off Same as `-H'. @@ -579,7 +579,7 @@ timeout = N Set timeout value - the same as `-T'. timestamping = on/off - Turn timestamping on/off. The same as `-N' (*Note Time-Stamping::). + Turn timestamping on/off. The same as `-N' (*note Time-Stamping::). tries = N Set number of retries per URL - the same as `-t'. @@ -613,114 +613,6 @@ Be careful about the things you change. have any effect, you must remove the `#' character at the beginning of its line. - ### - ### Sample Wget initialization file .wgetrc - ### - - ## You can use this file to change the default behaviour of wget or to - ## avoid having to type many many command-line options. This file does - ## not contain a comprehensive list of commands -- look at the manual - ## to find out what you can put into this file. - ## - ## Wget initialization file can reside in /usr/local/etc/wgetrc - ## (global, for all users) or $HOME/.wgetrc (for a single user). - ## - ## To use the settings in this file, you will have to uncomment them, - ## as well as change them, in most cases, as the values on the - ## commented-out lines are the default values (e.g. "off"). - - - ## - ## Global settings (useful for setting up in /usr/local/etc/wgetrc). 
- ## Think well before you change them, since they may reduce wget's - ## functionality, and make it behave contrary to the documentation: - ## - - # You can set retrieve quota for beginners by specifying a value - # optionally followed by 'K' (kilobytes) or 'M' (megabytes). The - # default quota is unlimited. - #quota = inf - - # You can lower (or raise) the default number of retries when - # downloading a file (default is 20). - #tries = 20 - - # Lowering the maximum depth of the recursive retrieval is handy to - # prevent newbies from going too "deep" when they unwittingly start - # the recursive retrieval. The default is 5. - #reclevel = 5 - - # Many sites are behind firewalls that do not allow initiation of - # connections from the outside. On these sites you have to use the - # `passive' feature of FTP. If you are behind such a firewall, you - # can turn this on to make Wget use passive FTP by default. - #passive_ftp = off - - # The "wait" command below makes Wget wait between every connection. - # If, instead, you want Wget to wait only between retries of failed - # downloads, set waitretry to maximum number of seconds to wait (Wget - # will use "linear backoff", waiting 1 second after the first failure - # on a file, 2 seconds after the second failure, etc. up to this max). - waitretry = 10 - - - ## - ## Local settings (for a user to set in his $HOME/.wgetrc). It is - ## *highly* undesirable to put these settings in the global file, since - ## they are potentially dangerous to "normal" users. - ## - ## Even when setting up your own ~/.wgetrc, you should know what you - ## are doing before doing so. - ## - - # Set this to on to use timestamping by default: - #timestamping = off - - # It is a good idea to make Wget send your email address in a `From:' - # header with your request (so that server administrators can contact - # you in case of errors). Wget does *not* send `From:' by default. 
- #header = From: Your Name - - # You can set up other headers, like Accept-Language. Accept-Language - # is *not* sent by default. - #header = Accept-Language: en - - # You can set the default proxy for Wget to use. It will override the - # value in the environment. - #http_proxy = http://proxy.yoyodyne.com:18023/ - - # If you do not want to use proxy at all, set this to off. - #use_proxy = on - - # You can customize the retrieval outlook. Valid options are default, - # binary, mega and micro. - #dot_style = default - - # Setting this to off makes Wget not download /robots.txt. Be sure to - # know *exactly* what /robots.txt is and how it is used before changing - # the default! - #robots = on - - # It can be useful to make Wget wait between connections. Set this to - # the number of seconds you want Wget to wait. - #wait = 0 - - # You can force creating directory structure, even if a single is being - # retrieved, by setting this to on. - #dirstruct = off - - # You can turn on recursive retrieving by default (don't do this if - # you are not sure you know what it means) by setting this to on. - #recursive = off - - # To always back up file X as X.orig before converting its links (due - # to -k / --convert-links / convert_links = on having been specified), - # set this variable to on: - #backup_converted = off - - # To have Wget follow FTP links from HTML files by default, set this - # to on: - #follow_ftp = off  File: wget.info, Node: Examples, Next: Various, Prev: Startup File, Up: Top @@ -748,13 +640,13 @@ Simple Usage * Say you want to download a URL. Just type: - wget http://fly.cc.fer.hr/ + wget http://fly.srk.fer.hr/ The response will be something like: - --13:30:45-- http://fly.cc.fer.hr:80/en/ + --13:30:45-- http://fly.srk.fer.hr:80/en/ => `index.html' - Connecting to fly.cc.fer.hr:80... connected! + Connecting to fly.srk.fer.hr:80... connected! HTTP request sent, awaiting response... 
200 OK Length: 4,694 [text/html] @@ -770,13 +662,13 @@ Simple Usage the number of tries to 45, to insure that the whole file will arrive safely: - wget --tries=45 http://fly.cc.fer.hr/jpg/flyweb.jpg + wget --tries=45 http://fly.srk.fer.hr/jpg/flyweb.jpg * Now let's leave Wget to work in the background, and write its progress to log file `log'. It is tiring to type `--tries', so we shall use `-t'. - wget -t 45 -o log http://fly.cc.fer.hr/jpg/flyweb.jpg & + wget -t 45 -o log http://fly.srk.fer.hr/jpg/flyweb.jpg & The ampersand at the end of the line makes sure that Wget works in the background. To unlimit the number of retries, use `-t inf'. @@ -784,10 +676,10 @@ Simple Usage * The usage of FTP is as simple. Wget will take care of login and password. - $ wget ftp://gnjilux.cc.fer.hr/welcome.msg - --10:08:47-- ftp://gnjilux.cc.fer.hr:21/welcome.msg + $ wget ftp://gnjilux.srk.fer.hr/welcome.msg + --10:08:47-- ftp://gnjilux.srk.fer.hr:21/welcome.msg => `welcome.msg' - Connecting to gnjilux.cc.fer.hr:21... connected! + Connecting to gnjilux.srk.fer.hr:21... connected! Logging in as anonymous ... Logged in! ==> TYPE I ... done. ==> CWD not needed. ==> PORT ... done. ==> RETR welcome.msg ... done. @@ -848,9 +740,9 @@ Advanced Usage wget -r -l1 --no-parent -A.gif http://host/dir/ It is a bit of a kludge, but it works. `-r -l1' means to retrieve - recursively (*Note Recursive Retrieval::), with maximum depth of 1. + recursively (*note Recursive Retrieval::), with maximum depth of 1. `--no-parent' means that references to the parent directory are - ignored (*Note Directory-Based Limits::), and `-A.gif' means to + ignored (*note Directory-Based Limits::), and `-A.gif' means to download only the GIF files. `-A "*.gif"' would have worked too. 
* Suppose you were in the middle of downloading, when Wget was @@ -860,13 +752,13 @@ Advanced Usage wget -nc -r http://www.gnu.ai.mit.edu/ * If you want to encode your own username and password to HTTP or - FTP, use the appropriate URL syntax (*Note URL Format::). + FTP, use the appropriate URL syntax (*note URL Format::). wget ftp://hniksic:mypassword@jagor.srce.hr/.emacs * If you do not like the default retrieval visualization (1K dots with 10 dots per cluster and 50 dots per line), you can customize - it through dot settings (*Note Wgetrc Commands::). For example, + it through dot settings (*note Wgetrc Commands::). For example, many people like the "binary" style of retrieval, with 8K dots and 512K lines: @@ -875,10 +767,10 @@ Advanced Usage You can experiment with other styles, like: wget --dot-style=mega ftp://ftp.xemacs.org/pub/xemacs/xemacs-20.4/xemacs-20.4.tar.gz - wget --dot-style=micro http://fly.cc.fer.hr/ + wget --dot-style=micro http://fly.srk.fer.hr/ To make these settings permanent, put them in your `.wgetrc', as - described before (*Note Sample Wgetrc::). + described before (*note Sample Wgetrc::).  File: wget.info, Node: Guru Usage, Prev: Advanced Usage, Up: Examples @@ -902,7 +794,7 @@ Guru Usage * But what about mirroring the hosts networkologically close to you? It seems so awfully slow because of all that DNS resolving. Just - use `-D' (*Note Domain Acceptance::). + use `-D' (*note Domain Acceptance::). wget -rN -Dsrce.hr http://www.srce.hr/ @@ -976,7 +868,7 @@ the following environment variables: `no_proxy' This variable should contain a comma-separated list of domain - extensions proxy should *not* be used for. For instance, if the + extensions proxy should _not_ be used for. For instance, if the value of `no_proxy' is `.mit.edu', proxy will not be used to retrieve documents from MIT. 
@@ -1022,7 +914,7 @@ Distribution Like all GNU utilities, the latest version of Wget can be found at the master GNU archive site prep.ai.mit.edu, and its mirrors. For example, Wget 1.5.3+dev can be found at -`ftp://prep.ai.mit.edu/gnu/wget/wget-1.5.3+dev.tar.gz' +  File: wget.info, Node: Mailing List, Next: Reporting Bugs, Prev: Distribution, Up: Various @@ -1040,7 +932,7 @@ subscribe. The more people on the list, the better! magic word `subscribe' in the subject line. Unsubscribe by mailing to . - The mailing list is archived at `http://fly.cc.fer.hr/archive/wget'. + The mailing list is archived at .  File: wget.info, Node: Reporting Bugs, Next: Portability, Prev: Mailing List, Up: Various @@ -1076,7 +968,7 @@ simple guidelines. 3. Please start Wget with `-d' option and send the log (or the relevant parts of it). If Wget was compiled without debug support, - recompile it. It is *much* easier to trace bugs with debug support + recompile it. It is _much_ easier to trace bugs with debug support on. 4. If Wget has crashed, try to run it in a debugger, e.g. `gdb `which @@ -1138,9 +1030,7 @@ File: wget.info, Node: Appendices, Next: Copying, Prev: Various, Up: Top Appendices ********** - This chapter contains some references I consider useful, like the -Robots Exclusion Standard specification, as well as a list of -contributors to GNU Wget. + This chapter contains some references I consider useful. * Menu: @@ -1154,176 +1044,61 @@ File: wget.info, Node: Robots, Next: Security Considerations, Prev: Appendice Robots ====== - Since Wget is able to traverse the web, it counts as one of the Web -"robots". Thus Wget understands "Robots Exclusion Standard" -(RES)--contents of `/robots.txt', used by server administrators to -shield parts of their systems from wanderings of Wget. + It is extremely easy to make Wget wander aimlessly around a web site, +sucking all the available data in progress. `wget -r SITE', and you're +set. Great? Not for the server admin. 
+ + While Wget is retrieving static pages, there's not much of a problem. +But for Wget, there is no real difference between the smallest static +page and the hardest, most demanding CGI or dynamic page. For instance, +a site I know has a section handled by an, uh, bitchin' CGI script that +converts all the Info files to HTML. The script can and does bring the +machine to its knees without providing anything useful to the +downloader. + + For such and similar cases various robot exclusion schemes have been +devised as a means for the server administrators and document authors to +protect chosen portions of their sites from the wandering of robots. + + The more popular mechanism is the "Robots Exclusion Standard" +written by Martijn Koster et al. in 1994. It is specified by placing a +file named `/robots.txt' in the server root, which the robots are +supposed to download and parse. Wget supports this specification. Norobots support is turned on only when retrieving recursively, and -*never* for the first page. Thus, you may issue: +_never_ for the first page. Thus, you may issue: - wget -r http://fly.cc.fer.hr/ + wget -r http://fly.srk.fer.hr/ - First the index of fly.cc.fer.hr will be downloaded. If Wget finds -anything worth downloading on the same host, only *then* will it load + First the index of fly.srk.fer.hr will be downloaded. If Wget finds +anything worth downloading on the same host, only _then_ will it load the robots, and decide whether or not to load the links after all. -`/robots.txt' is loaded only once per host. Wget does not support the -robots `META' tag. +`/robots.txt' is loaded only once per host. - The description of the norobots standard was written, and is -maintained by Martijn Koster . With his -permission, I contribute a (slightly modified) TeXified version of the -RES. + Note that the exclusion standard discussed here has undergone some +revisions.
However, Wget supports only the first version of RES, +the one written by Martijn Koster in 1994, available at +. A +later version exists in the form of an internet draft + titled "A Method for Web Robots Control", +which expired on June 4, 1997. I am not aware if it ever made it to an +RFC. The text of the draft is available at +. +Wget does not yet support the new directives specified by this draft, +but we plan to add them. -* Menu: + This manual no longer includes the text of the old standard. -* Introduction to RES:: -* RES Format:: -* User-Agent Field:: -* Disallow Field:: -* Norobots Examples:: + The second, less well-known mechanism enables the author of an individual +document to specify whether they want the links from the file to be +followed by a robot. This is achieved using the `META' tag, like this: - -File: wget.info, Node: Introduction to RES, Next: RES Format, Prev: Robots, Up: Robots + -Introduction to RES -------------------- - - "WWW Robots" (also called "wanderers" or "spiders") are programs -that traverse many pages in the World Wide Web by recursively -retrieving linked pages. For more information see the robots page. - - In 1993 and 1994 there have been occasions where robots have visited -WWW servers where they weren't welcome for various reasons. Sometimes -these reasons were robot specific, e.g. certain robots swamped servers -with rapid-fire requests, or retrieved the same files repeatedly. In -other situations robots traversed parts of WWW servers that weren't -suitable, e.g. very deep virtual trees, duplicated information, -temporary information, or cgi-scripts with side-effects (such as -voting). - - These incidents indicated the need for established mechanisms for -WWW servers to indicate to robots which parts of their server should -not be accessed. This standard addresses this need with an operational -solution.
- - This document represents a consensus on 30 June 1994 on the robots -mailing list (`robots@webcrawler.com'), between the majority of robot -authors and other people with an interest in robots. It has also been -open for discussion on the Technical World Wide Web mailing list -(`www-talk@info.cern.ch'). This document is based on a previous working -draft under the same title. - - It is not an official standard backed by a standards body, or owned -by any commercial organization. It is not enforced by anybody, and there -no guarantee that all current and future robots will use it. Consider -it a common facility the majority of robot authors offer the WWW -community to protect WWW server against unwanted accesses by their -robots. - - The latest version of this document can be found at -`http://info.webcrawler.com/mak/projects/robots/norobots.html'. - - -File: wget.info, Node: RES Format, Next: User-Agent Field, Prev: Introduction to RES, Up: Robots - -RES Format ----------- - - The format and semantics of the `/robots.txt' file are as follows: - - The file consists of one or more records separated by one or more -blank lines (terminated by `CR', `CR/NL', or `NL'). Each record -contains lines of the form: - - : - - The field name is case insensitive. - - Comments can be included in file using UNIX Bourne shell conventions: -the `#' character is used to indicate that preceding space (if any) and -the remainder of the line up to the line termination is discarded. -Lines containing only a comment are discarded completely, and therefore -do not indicate a record boundary. - - The record starts with one or more User-agent lines, followed by one -or more Disallow lines, as detailed below. Unrecognized headers are -ignored. - - The presence of an empty `/robots.txt' file has no explicit -associated semantics, it will be treated as if it was not present, i.e. -all robots will consider themselves welcome. 
- - -File: wget.info, Node: User-Agent Field, Next: Disallow Field, Prev: RES Format, Up: Robots - -User-Agent Field ----------------- - - The value of this field is the name of the robot the record is -describing access policy for. - - If more than one User-agent field is present the record describes an -identical access policy for more than one robot. At least one field -needs to be present per record. - - The robot should be liberal in interpreting this field. A case -insensitive substring match of the name without version information is -recommended. - - If the value is `*', the record describes the default access policy -for any robot that has not matched any of the other records. It is not -allowed to have multiple such records in the `/robots.txt' file. - - -File: wget.info, Node: Disallow Field, Next: Norobots Examples, Prev: User-Agent Field, Up: Robots - -Disallow Field --------------- - - The value of this field specifies a partial URL that is not to be -visited. This can be a full path, or a partial path; any URL that -starts with this value will not be retrieved. For example, -`Disallow: /help' disallows both `/help.html' and `/help/index.html', -whereas `Disallow: /help/' would disallow `/help/index.html' but allow -`/help.html'. - - Any empty value, indicates that all URLs can be retrieved. At least -one Disallow field needs to be present in a record. 
- - -File: wget.info, Node: Norobots Examples, Prev: Disallow Field, Up: Robots - -Norobots Examples ------------------ - - The following example `/robots.txt' file specifies that no robots -should visit any URL starting with `/cyberworld/map/' or `/tmp/': - - # robots.txt for http://www.site.com/ - - User-agent: * - Disallow: /cyberworld/map/ # This is an infinite virtual URL space - Disallow: /tmp/ # these will soon disappear - - This example `/robots.txt' file specifies that no robots should -visit any URL starting with `/cyberworld/map/', except the robot called -`cybermapper': - - # robots.txt for http://www.site.com/ - - User-agent: * - Disallow: /cyberworld/map/ # This is an infinite virtual URL space - - # Cybermapper knows where to go. - User-agent: cybermapper - Disallow: - - This example indicates that no robots should visit this site further: - - # go away - User-agent: * - Disallow: / + This is explained in some detail at +. +Unfortunately, Wget does not support this method of robot exclusion yet, +but it will be implemented in the next release.  File: wget.info, Node: Security Considerations, Next: Contributors, Prev: Robots, Up: Appendices @@ -1350,3 +1125,124 @@ Here are the main issues, and some solutions. being careful when you send debug logs (yes, even when you send them to me). + +File: wget.info, Node: Contributors, Prev: Security Considerations, Up: Appendices + +Contributors +============ + + GNU Wget was written by Hrvoje Niksic . +However, its development could never have gone as far as it has, were it +not for the help of many people, either with bug reports, feature +proposals, patches, or letters saying "Thanks!". + + Special thanks goes to the following people (no particular order): + + * Karsten Thygesen--donated system resources such as the mailing + list, web space, and FTP space, along with a lot of time to make + these actually work. + + * Shawn McHorse--bug reports and patches. + + * Kaveh R. Ghazi--on-the-fly `ansi2knr'-ization. 
Lots of + portability fixes. + + * Gordon Matzigkeit--`.netrc' support. + + * Zlatko Calusic, Tomislav Vujec and Drazen Kacar--feature + suggestions and "philosophical" discussions. + + * Darko Budor--initial port to Windows. + + * Antonio Rosella--help and suggestions, plus the Italian + translation. + + * Tomislav Petrovic, Mario Mikocevic--many bug reports and + suggestions. + + * Francois Pinard--many thorough bug reports and discussions. + + * Karl Eichwalder--lots of help with internationalization and other + things. + + * Junio Hamano--donated support for Opie and HTTP `Digest' + authentication. + + * Brian Gough--a generous donation. + + The following people have provided patches, bug/build reports, useful +suggestions, beta testing services, fan mail and all the other things +that make maintenance so much fun: + + Tim Adam, Adrian Aichner, Martin Baehr, Dieter Baron, Roger Beeman +and the Gurus at Cisco, Dan Berger, Mark Boyns, John Burden, Wanderlei +Cavassin, Gilles Cedoc, Tim Charron, Noel Cragg, Kristijan Conkas, John +Daily, Andrew Davison, Andrew Deryabin, Ulrich Drepper, Marc Duponcheel, +Damir Dzeko, Aleksandar Erkalovic, Andy Eskilsson, Masashi Fujita, +Howard Gayle, Marcel Gerrits, Hans Grobler, Mathieu Guillaume, Dan +Harkless, Heiko Herold, Karl Heuer, HIROSE Masaaki, Gregor Hoffleit, +Erik Magnus Hulthen, Richard Huveneers, Simon Josefsson, Mario Juric, +Const Kaplinsky, Goran Kezunovic, Robert Kleine, Fila Kolodny, +Alexander Kourakos, Martin Kraemer, Simos KSenitellis, Hrvoje Lacko, +Daniel S. Lewart, Dave Love, Alexander V. Lukyanov, Jordan Mendelson, +Lin Zhe Min, Simon Munton, Charlie Negyesi, R. K. Owen, Andrew Pollock, +Steve Pothier, Jan Prikryl, Marin Purgar, Keith Refson, Tyler Riddle, +Tobias Ringstrom, Juan Jose Rodrigues, Edward J. 
Sabol, Heinz Salzmann, +Robert Schmidt, Andreas Schwab, Toomas Soome, Tage Stabell-Kulo, Sven +Sternberger, Markus Strasser, Szakacsits Szabolcs, Mike Thomas, Russell +Vincent, Charles G Waldman, Douglas E. Wegscheid, Jasmin Zainul, Bojan +Zdrnja, Kristijan Zimmer. + + Apologies to all who I accidentally left out, and many thanks to all +the subscribers of the Wget mailing list. + + +File: wget.info, Node: Copying, Next: Concept Index, Prev: Appendices, Up: Top + +Copying +******* + + Wget is "free software", where "free" refers to liberty, not price. +The exact legal distribution terms follow below, but in short, it means +that you have the right (freedom) to run and change and copy Wget, and +even--if you want--charge money for any of those things. The sole +restriction is that you have to grant your recipients the same rights. + + This method of licensing software is also known as "open-source", +because it requires that the recipients always receive a program's +source code along with the program. + + More specifically: + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as + published by the Free Software Foundation; either version 2 of the + License, or (at your option) any later version. + + This program is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ + In addition to this, this manual is free in the same sense: + + Permission is granted to copy, distribute and/or modify this + document under the terms of the GNU Free Documentation License, + Version 1.1 or any later version published by the Free Software + Foundation; with the Invariant Sections being "GNU General Public + License" and "GNU Free Documentation License", with no Front-Cover + Texts, and with no Back-Cover Texts. A copy of the license is + included in the section entitled "GNU Free Documentation License". + + The full texts of the GNU General Public License and of the GNU Free +Documentation License are available below. + +* Menu: + +* GNU General Public License:: +* GNU Free Documentation License:: + diff --git a/doc/wget.info-3 b/doc/wget.info-3 index b4c9e2ce..e9e28048 100644 --- a/doc/wget.info-3 +++ b/doc/wget.info-3 @@ -1,5 +1,4 @@ -This is Info file wget.info, produced by Makeinfo version 1.68 from the -input file ./wget.texi. +This is wget.info, produced by makeinfo version 4.0 from wget.texi. INFO-DIR-SECTION Net Utilities INFO-DIR-SECTION World Wide Web @@ -16,85 +15,19 @@ data. manual provided the copyright notice and this permission notice are preserved on all copies. - Permission is granted to copy and distribute modified versions of -this manual under the conditions for verbatim copying, provided also -that the sections entitled "Copying" and "GNU General Public License" -are included exactly as in the original, and provided that the entire -resulting derived work is distributed under the terms of a permission -notice identical to this one. + Permission is granted to copy, distribute and/or modify this document +under the terms of the GNU Free Documentation License, Version 1.1 or +any later version published by the Free Software Foundation; with the +Invariant Sections being "GNU General Public License" and "GNU Free +Documentation License", with no Front-Cover Texts, and with no +Back-Cover Texts. 
A copy of the license is included in the section +entitled "GNU Free Documentation License".  -File: wget.info, Node: Contributors, Prev: Security Considerations, Up: Appendices +File: wget.info, Node: GNU General Public License, Next: GNU Free Documentation License, Prev: Copying, Up: Copying -Contributors -============ - - GNU Wget was written by Hrvoje Niksic . -However, its development could never have gone as far as it has, were it -not for the help of many people, either with bug reports, feature -proposals, patches, or letters saying "Thanks!". - - Special thanks goes to the following people (no particular order): - - * Karsten Thygesen--donated the mailing list and the initial FTP - space. - - * Shawn McHorse--bug reports and patches. - - * Kaveh R. Ghazi--on-the-fly `ansi2knr'-ization. - - * Gordon Matzigkeit--`.netrc' support. - - * Zlatko Calusic, Tomislav Vujec and Drazen Kacar--feature - suggestions and "philosophical" discussions. - - * Darko Budor--initial port to Windows. - - * Antonio Rosella--help and suggestions, plus the Italian - translation. - - * Tomislav Petrovic, Mario Mikocevic--many bug reports and - suggestions. - - * Francois Pinard--many thorough bug reports and discussions. - - * Karl Eichwalder--lots of help with internationalization and other - things. - - * Junio Hamano--donated support for Opie and HTTP `Digest' - authentication. - - * Brian Gough--a generous donation. 
- - The following people have provided patches, bug/build reports, useful -suggestions, beta testing services, fan mail and all the other things -that make maintenance so much fun: - - Tim Adam, Martin Baehr, Dieter Baron, Roger Beeman and the Gurus at -Cisco, Dan Berger, Mark Boyns, John Burden, Wanderlei Cavassin, Gilles -Cedoc, Tim Charron, Noel Cragg, Kristijan Conkas, Andrew Deryabin, -Damir Dzeko, Andrew Davison, Ulrich Drepper, Marc Duponcheel, -Aleksandar Erkalovic, Andy Eskilsson, Masashi Fujita, Howard Gayle, -Marcel Gerrits, Hans Grobler, Mathieu Guillaume, Dan Harkless, Heiko -Herold, Karl Heuer, HIROSE Masaaki, Gregor Hoffleit, Erik Magnus -Hulthen, Richard Huveneers, Simon Josefsson, Mario Juric, Goran -Kezunovic, Robert Kleine, Fila Kolodny, Alexander Kourakos, Martin -Kraemer, Simos KSenitellis, Hrvoje Lacko, Daniel S. Lewart, Dave Love, -Jordan Mendelson, Lin Zhe Min, Charlie Negyesi, Andrew Pollock, Steve -Pothier, Jan Prikryl, Marin Purgar, Keith Refson, Tobias Ringstrom, -Juan Jose Rodrigues, Edward J. Sabol, Heinz Salzmann, Robert Schmidt, -Toomas Soome, Tage Stabell-Kulo, Sven Sternberger, Markus Strasser, -Szakacsits Szabolcs, Mike Thomas, Russell Vincent, Charles G Waldman, -Douglas E. Wegscheid, Jasmin Zainul, Bojan Zdrnja, Kristijan Zimmer. - - Apologies to all who I accidentally left out, and many thanks to all -the subscribers of the Wget mailing list. - - -File: wget.info, Node: Copying, Next: Concept Index, Prev: Appendices, Up: Top - -GNU GENERAL PUBLIC LICENSE -************************** +GNU General Public License +========================== Version 2, June 1991 @@ -454,6 +387,391 @@ library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Library General Public License instead of this License. 
+ +File: wget.info, Node: GNU Free Documentation License, Prev: GNU General Public License, Up: Copying + +GNU Free Documentation License +============================== + + Version 1.1, March 2000 + + Copyright (C) 2000 Free Software Foundation, Inc. + 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + + + 0. PREAMBLE + + The purpose of this License is to make a manual, textbook, or other + written document "free" in the sense of freedom: to assure everyone + the effective freedom to copy and redistribute it, with or without + modifying it, either commercially or noncommercially. Secondarily, + this License preserves for the author and publisher a way to get + credit for their work, while not being considered responsible for + modifications made by others. + + This License is a kind of "copyleft", which means that derivative + works of the document must themselves be free in the same sense. + It complements the GNU General Public License, which is a copyleft + license designed for free software. + + We have designed this License in order to use it for manuals for + free software, because free software needs free documentation: a + free program should come with manuals providing the same freedoms + that the software does. But this License is not limited to + software manuals; it can be used for any textual work, regardless + of subject matter or whether it is published as a printed book. + We recommend this License principally for works whose purpose is + instruction or reference. + + + 1. APPLICABILITY AND DEFINITIONS + + This License applies to any manual or other work that contains a + notice placed by the copyright holder saying it can be distributed + under the terms of this License. The "Document", below, refers to + any such manual or work. Any member of the public is a licensee, + and is addressed as "you". 
+ + A "Modified Version" of the Document means any work containing the + Document or a portion of it, either copied verbatim, or with + modifications and/or translated into another language. + + A "Secondary Section" is a named appendix or a front-matter + section of the Document that deals exclusively with the + relationship of the publishers or authors of the Document to the + Document's overall subject (or to related matters) and contains + nothing that could fall directly within that overall subject. + (For example, if the Document is in part a textbook of + mathematics, a Secondary Section may not explain any mathematics.) + The relationship could be a matter of historical connection with + the subject or with related matters, or of legal, commercial, + philosophical, ethical or political position regarding them. + + The "Invariant Sections" are certain Secondary Sections whose + titles are designated, as being those of Invariant Sections, in + the notice that says that the Document is released under this + License. + + The "Cover Texts" are certain short passages of text that are + listed, as Front-Cover Texts or Back-Cover Texts, in the notice + that says that the Document is released under this License. + + A "Transparent" copy of the Document means a machine-readable copy, + represented in a format whose specification is available to the + general public, whose contents can be viewed and edited directly + and straightforwardly with generic text editors or (for images + composed of pixels) generic paint programs or (for drawings) some + widely available drawing editor, and that is suitable for input to + text formatters or for automatic translation to a variety of + formats suitable for input to text formatters. A copy made in an + otherwise Transparent file format whose markup has been designed + to thwart or discourage subsequent modification by readers is not + Transparent. A copy that is not "Transparent" is called "Opaque". 
+ + Examples of suitable formats for Transparent copies include plain + ASCII without markup, Texinfo input format, LaTeX input format, + SGML or XML using a publicly available DTD, and + standard-conforming simple HTML designed for human modification. + Opaque formats include PostScript, PDF, proprietary formats that + can be read and edited only by proprietary word processors, SGML + or XML for which the DTD and/or processing tools are not generally + available, and the machine-generated HTML produced by some word + processors for output purposes only. + + The "Title Page" means, for a printed book, the title page itself, + plus such following pages as are needed to hold, legibly, the + material this License requires to appear in the title page. For + works in formats which do not have any title page as such, "Title + Page" means the text near the most prominent appearance of the + work's title, preceding the beginning of the body of the text. + + + 2. VERBATIM COPYING + + You may copy and distribute the Document in any medium, either + commercially or noncommercially, provided that this License, the + copyright notices, and the license notice saying this License + applies to the Document are reproduced in all copies, and that you + add no other conditions whatsoever to those of this License. You + may not use technical measures to obstruct or control the reading + or further copying of the copies you make or distribute. However, + you may accept compensation in exchange for copies. If you + distribute a large enough number of copies you must also follow + the conditions in section 3. + + You may also lend copies, under the same conditions stated above, + and you may publicly display copies. + + + 3. 
COPYING IN QUANTITY + + If you publish printed copies of the Document numbering more than + 100, and the Document's license notice requires Cover Texts, you + must enclose the copies in covers that carry, clearly and legibly, + all these Cover Texts: Front-Cover Texts on the front cover, and + Back-Cover Texts on the back cover. Both covers must also clearly + and legibly identify you as the publisher of these copies. The + front cover must present the full title with all words of the + title equally prominent and visible. You may add other material + on the covers in addition. Copying with changes limited to the + covers, as long as they preserve the title of the Document and + satisfy these conditions, can be treated as verbatim copying in + other respects. + + If the required texts for either cover are too voluminous to fit + legibly, you should put the first ones listed (as many as fit + reasonably) on the actual cover, and continue the rest onto + adjacent pages. + + If you publish or distribute Opaque copies of the Document + numbering more than 100, you must either include a + machine-readable Transparent copy along with each Opaque copy, or + state in or with each Opaque copy a publicly-accessible + computer-network location containing a complete Transparent copy + of the Document, free of added material, which the general + network-using public has access to download anonymously at no + charge using public-standard network protocols. If you use the + latter option, you must take reasonably prudent steps, when you + begin distribution of Opaque copies in quantity, to ensure that + this Transparent copy will remain thus accessible at the stated + location until at least one year after the last time you + distribute an Opaque copy (directly or through your agents or + retailers) of that edition to the public. 
+ + It is requested, but not required, that you contact the authors of + the Document well before redistributing any large number of + copies, to give them a chance to provide you with an updated + version of the Document. + + + 4. MODIFICATIONS + + You may copy and distribute a Modified Version of the Document + under the conditions of sections 2 and 3 above, provided that you + release the Modified Version under precisely this License, with + the Modified Version filling the role of the Document, thus + licensing distribution and modification of the Modified Version to + whoever possesses a copy of it. In addition, you must do these + things in the Modified Version: + + A. Use in the Title Page (and on the covers, if any) a title + distinct from that of the Document, and from those of previous + versions (which should, if there were any, be listed in the + History section of the Document). You may use the same title + as a previous version if the original publisher of that version + gives permission. + B. List on the Title Page, as authors, one or more persons or + entities responsible for authorship of the modifications in the + Modified Version, together with at least five of the principal + authors of the Document (all of its principal authors, if it + has less than five). + C. State on the Title page the name of the publisher of the + Modified Version, as the publisher. + D. Preserve all the copyright notices of the Document. + E. Add an appropriate copyright notice for your modifications + adjacent to the other copyright notices. + F. Include, immediately after the copyright notices, a license + notice giving the public permission to use the Modified Version + under the terms of this License, in the form shown in the + Addendum below. + G. Preserve in that license notice the full lists of Invariant + Sections and required Cover Texts given in the Document's + license notice. + H. Include an unaltered copy of this License. + I. 
Preserve the section entitled "History", and its title, and add + to it an item stating at least the title, year, new authors, and + publisher of the Modified Version as given on the Title Page. + If there is no section entitled "History" in the Document, + create one stating the title, year, authors, and publisher of + the Document as given on its Title Page, then add an item + describing the Modified Version as stated in the previous + sentence. + J. Preserve the network location, if any, given in the Document for + public access to a Transparent copy of the Document, and + likewise the network locations given in the Document for + previous versions it was based on. These may be placed in the + "History" section. You may omit a network location for a work + that was published at least four years before the Document + itself, or if the original publisher of the version it refers + to gives permission. + K. In any section entitled "Acknowledgements" or "Dedications", + preserve the section's title, and preserve in the section all the + substance and tone of each of the contributor acknowledgements + and/or dedications given therein. + L. Preserve all the Invariant Sections of the Document, + unaltered in their text and in their titles. Section numbers + or the equivalent are not considered part of the section titles. + M. Delete any section entitled "Endorsements". Such a section + may not be included in the Modified Version. + N. Do not retitle any existing section as "Endorsements" or to + conflict in title with any Invariant Section. + + If the Modified Version includes new front-matter sections or + appendices that qualify as Secondary Sections and contain no + material copied from the Document, you may at your option + designate some or all of these sections as invariant. To do this, + add their titles to the list of Invariant Sections in the Modified + Version's license notice. These titles must be distinct from any + other section titles. 
+ + You may add a section entitled "Endorsements", provided it contains + nothing but endorsements of your Modified Version by various + parties-for example, statements of peer review or that the text has + been approved by an organization as the authoritative definition + of a standard. + + You may add a passage of up to five words as a Front-Cover Text, + and a passage of up to 25 words as a Back-Cover Text, to the end + of the list of Cover Texts in the Modified Version. Only one + passage of Front-Cover Text and one of Back-Cover Text may be + added by (or through arrangements made by) any one entity. If the + Document already includes a cover text for the same cover, + previously added by you or by arrangement made by the same entity + you are acting on behalf of, you may not add another; but you may + replace the old one, on explicit permission from the previous + publisher that added the old one. + + The author(s) and publisher(s) of the Document do not by this + License give permission to use their names for publicity for or to + assert or imply endorsement of any Modified Version. + + + 5. COMBINING DOCUMENTS + + You may combine the Document with other documents released under + this License, under the terms defined in section 4 above for + modified versions, provided that you include in the combination + all of the Invariant Sections of all of the original documents, + unmodified, and list them all as Invariant Sections of your + combined work in its license notice. + + The combined work need only contain one copy of this License, and + multiple identical Invariant Sections may be replaced with a single + copy. If there are multiple Invariant Sections with the same name + but different contents, make the title of each such section unique + by adding at the end of it, in parentheses, the name of the + original author or publisher of that section if known, or else a + unique number. 
Make the same adjustment to the section titles in + the list of Invariant Sections in the license notice of the + combined work. + + In the combination, you must combine any sections entitled + "History" in the various original documents, forming one section + entitled "History"; likewise combine any sections entitled + "Acknowledgements", and any sections entitled "Dedications". You + must delete all sections entitled "Endorsements." + + + 6. COLLECTIONS OF DOCUMENTS + + You may make a collection consisting of the Document and other + documents released under this License, and replace the individual + copies of this License in the various documents with a single copy + that is included in the collection, provided that you follow the + rules of this License for verbatim copying of each of the + documents in all other respects. + + You may extract a single document from such a collection, and + distribute it individually under this License, provided you insert + a copy of this License into the extracted document, and follow + this License in all other respects regarding verbatim copying of + that document. + + + 7. AGGREGATION WITH INDEPENDENT WORKS + + A compilation of the Document or its derivatives with other + separate and independent documents or works, in or on a volume of + a storage or distribution medium, does not as a whole count as a + Modified Version of the Document, provided no compilation + copyright is claimed for the compilation. Such a compilation is + called an "aggregate", and this License does not apply to the + other self-contained works thus compiled with the Document, on + account of their being thus compiled, if they are not themselves + derivative works of the Document. 
+ + If the Cover Text requirement of section 3 is applicable to these + copies of the Document, then if the Document is less than one + quarter of the entire aggregate, the Document's Cover Texts may be + placed on covers that surround only the Document within the + aggregate. Otherwise they must appear on covers around the whole + aggregate. + + + 8. TRANSLATION + + Translation is considered a kind of modification, so you may + distribute translations of the Document under the terms of section + 4. Replacing Invariant Sections with translations requires special + permission from their copyright holders, but you may include + translations of some or all Invariant Sections in addition to the + original versions of these Invariant Sections. You may include a + translation of this License provided that you also include the + original English version of this License. In case of a + disagreement between the translation and the original English + version of this License, the original English version will prevail. + + + 9. TERMINATION + + You may not copy, modify, sublicense, or distribute the Document + except as expressly provided for under this License. Any other + attempt to copy, modify, sublicense or distribute the Document is + void, and will automatically terminate your rights under this + License. However, parties who have received copies, or rights, + from you under this License will not have their licenses + terminated so long as such parties remain in full compliance. + + + 10. FUTURE REVISIONS OF THIS LICENSE + + The Free Software Foundation may publish new, revised versions of + the GNU Free Documentation License from time to time. Such new + versions will be similar in spirit to the present version, but may + differ in detail to address new problems or concerns. See + http://www.gnu.org/copyleft/. + + Each version of the License is given a distinguishing version + number. 
If the Document specifies that a particular numbered + version of this License "or any later version" applies to it, you + have the option of following the terms and conditions either of + that specified version or of any later version that has been + published (not as a draft) by the Free Software Foundation. If + the Document does not specify a version number of this License, + you may choose any version ever published (not as a draft) by the + Free Software Foundation. + + +ADDENDUM: How to use this License for your documents +==================================================== + + To use this License in a document you have written, include a copy of +the License in the document and put the following copyright and license +notices just after the title page: + + + Copyright (C) YEAR YOUR NAME. + Permission is granted to copy, distribute and/or modify this document + under the terms of the GNU Free Documentation License, Version 1.1 + or any later version published by the Free Software Foundation; + with the Invariant Sections being LIST THEIR TITLES, with the + Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST. + A copy of the license is included in the section entitled ``GNU + Free Documentation License''. +If you have no Invariant Sections, write "with no Invariant +Sections" instead of saying which ones are invariant. If you have no +Front-Cover Texts, write "no Front-Cover Texts" instead of "Front-Cover +Texts being LIST"; likewise for Back-Cover Texts. + + If your document contains nontrivial examples of program code, we +recommend releasing these examples in parallel under your choice of +free software license, such as the GNU General Public License, to +permit their use in free software. +  File: wget.info, Node: Concept Index, Prev: Copying, Up: Top @@ -507,6 +825,7 @@ Concept Index * following links: Following Links. * force html: Logging and Input File Options. * ftp time-stamping: FTP Time-Stamping Internals. +* GFDL: Copying. 
* globbing, toggle: FTP Options. * GPL: Copying. * hangup: Signals. @@ -532,14 +851,9 @@ Concept Index * mailing list: Mailing List. * mirroring: Guru Usage. * no parent: Directory-Based Limits. -* no warranty: Copying. +* no warranty: GNU General Public License. * no-clobber: Download Options. * nohup: Invoking. -* norobots disallow: Disallow Field. -* norobots examples: Norobots Examples. -* norobots format: RES Format. -* norobots introduction: Introduction to RES. -* norobots user-agent: User-Agent Field. * number of retries: Download Options. * operating systems: Portability. * option syntax: Option Syntax. @@ -550,8 +864,8 @@ Concept Index * pause: Download Options. * portability: Portability. * proxies: Proxies. -* proxy <1>: Download Options. -* proxy: HTTP Options. +* proxy <1>: HTTP Options. +* proxy: Download Options. * proxy authentication: HTTP Options. * proxy filling: Recursive Retrieval Options. * proxy password: HTTP Options. diff --git a/doc/wget.texi b/doc/wget.texi index 68a74cbf..d066ee69 100644 --- a/doc/wget.texi +++ b/doc/wget.texi @@ -42,10 +42,11 @@ notice identical to this one except for the removal of this paragraph @end ignore Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or -any later version published by the Free Software Foundation; with no -Invariant Sections, with no Front-Cover Texts, and with no Back-Cover -Texts. A copy of the license is included in the section entitled ``GNU -Free Documentation License''. +any later version published by the Free Software Foundation; with the +Invariant Sections being ``GNU General Public License'' and ``GNU Free +Documentation License'', with no Front-Cover Texts, and with no +Back-Cover Texts. A copy of the license is included in the section +entitled ``GNU Free Documentation License''. @end ifinfo @titlepage @@ -60,10 +61,11 @@ Copyright @copyright{} 1996, 1997, 1998, 2000 Free Software Foundation, Inc. 
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or -any later version published by the Free Software Foundation; with no -Invariant Sections, with no Front-Cover Texts, and with no Back-Cover -Texts. A copy of the license is included in the section entitled ``GNU -Free Documentation License''. +any later version published by the Free Software Foundation; with the +Invariant Sections being ``GNU General Public License'' and ``GNU Free +Documentation License'', with no Front-Cover Texts, and with no +Back-Cover Texts. A copy of the license is included in the section +entitled ``GNU Free Documentation License''. @end titlepage @ifinfo @@ -2485,10 +2487,26 @@ This chapter contains some references I consider useful. @cindex robots.txt @cindex server maintenance -Since Wget is able to traverse the web, it counts as one of the Web -@dfn{robots}. Thus Wget understands @dfn{Robots Exclusion Standard} -(@sc{res})---contents of @file{/robots.txt}, used by server -administrators to shield parts of their systems from wanderings of Wget. +It is extremely easy to make Wget wander aimlessly around a web site, +sucking all the available data in progress. @samp{wget -r @var{site}}, +and you're set. Great? Not for the server admin. + +While Wget is retrieving static pages, there's not much of a problem. +But for Wget, there is no real difference between the smallest static +page and the hardest, most demanding CGI or dynamic page. For instance, +a site I know has a section handled by an, uh, bitchin' CGI script that +converts all the Info files to HTML. The script can and does bring the +machine to its knees without providing anything useful to the +downloader. + +For such and similar cases various robot exclusion schemes have been +devised as a means for the server administrators and document authors to +protect chosen portions of their sites from the wandering of robots. 
+ +The more popular mechanism is the @dfn{Robots Exclusion Standard} +written by Martijn Koster et al. in 1994. It is specified by placing a +file named @file{/robots.txt} in the server root, which the robots are +supposed to download and parse. Wget supports this specification. Norobots support is turned on only when retrieving recursively, and @emph{never} for the first page. Thus, you may issue: @@ -2500,8 +2518,7 @@ wget -r http://fly.srk.fer.hr/ First the index of fly.srk.fer.hr will be downloaded. If Wget finds anything worth downloading on the same host, only @emph{then} will it load the robots, and decide whether or not to load the links after all. -@file{/robots.txt} is loaded only once per host. Wget does not support -the robots @code{META} tag. +@file{/robots.txt} is loaded only once per host. Note that the exlusion standard discussed here has undergone some revisions. However, but Wget supports only the first version of @@ -2517,6 +2534,20 @@ but we plan to add them. This manual no longer includes the text of the old standard. +The second, less known mechanism, enables the author of an individual +document to specify whether they want the links from the file to be +followed by a robot. This is achieved using the @code{META} tag, like +this: + +@example +<meta name="robots" content="nofollow"> +@end example + +This is explained in some detail at +@url{http://info.webcrawler.com/mak/projects/robots/meta-user.html}. +Unfortunately, Wget does not support this method of robot exclusion yet, +but it will be implemented in the next release.
+ @node Security Considerations, Contributors, Robots, Appendices @section Security Considerations @cindex security @@ -2789,10 +2820,11 @@ In addition to this, this manual is free in the same sense: @quotation Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or -any later version published by the Free Software Foundation; with no -Invariant Sections, with no Front-Cover Texts, and with no Back-Cover -Texts. A copy of the license is included in the section entitled ``GNU -Free Documentation License''. +any later version published by the Free Software Foundation; with the +Invariant Sections being ``GNU General Public License'' and ``GNU Free +Documentation License'', with no Front-Cover Texts, and with no +Back-Cover Texts. A copy of the license is included in the section +entitled ``GNU Free Documentation License''. @end quotation @c #### Maybe we should wrap these licenses in ifinfo? Stallman says