mirror of https://github.com/moparisthebest/wget synced 2024-07-03 16:38:41 -04:00
148 Commits

Author SHA1 Message Date
hniksic
c6dcc539f8 [svn] Updated comment. 2003-10-14 15:54:57 -07:00
hniksic
cc5d6f0ab8 [svn] Doc typo fix. 2003-10-07 14:45:26 -07:00
hniksic
65cec8deee [svn] Fix memory leak in a rare case in url.c.
Translate error messages from url_parse().
2003-10-01 12:59:48 -07:00
hniksic
831f376303 [svn] Fix oversight in escape handling. 2003-09-25 15:31:35 -07:00
hniksic
bebad75ff9 [svn] Another doc update. 2003-09-22 05:07:20 -07:00
hniksic
f78016fb95 [svn] path_simplify doc update. 2003-09-22 05:03:34 -07:00
hniksic
3e9dc5b994 [svn] Modified path_simplify not to rely on extensive use of memmove. 2003-09-21 17:23:44 -07:00
hniksic
dadde97838 [svn] Removed spurious includes. 2003-09-21 16:09:06 -07:00
hniksic
95c647eb44 [svn] Split off non-URL related stuff from url.c to convert.c. 2003-09-21 15:47:14 -07:00
hniksic
7211c51139 [svn] path_simplify would read two bytes past the end of the string in the "./" case. 2003-09-21 06:36:50 -07:00
hniksic
101f896e47 [svn] Minor fixes and cosmetic changes.
(uri_merge): Get rid of uri_merge_1.
(uri_merge): Merge "foo//", "bar" as "foo//bar", not "foo///bar",
i.e. don't add an extra slash merely because BASE ends with two
slashes.
(parse_credentials): Renamed from parse_uname.  Rewritten in
standard [beg, end) calling style.
(url_skip_credentials): Renamed from url_skip_uname.  Made static.
(url_skip_credentials): Include # and ; as terminators.  Old code
would mistakenly consider "http://foo.com#hniksic@iskon.hr" to
contain a username.
(url_skip_scheme): Removed because it was unused.
(url_has_scheme): Require "scheme" to be at least one char long.
2003-09-19 17:05:36 -07:00
hniksic
a504d10ed5 [svn] Default dir_prefix to NULL rather than ".". 2003-09-19 08:28:36 -07:00
hniksic
7b5fb50cb1 [svn] Renamed wget.h XDIGIT-related macros to (hopefully) clearer names. 2003-09-19 07:08:37 -07:00
hniksic
4dcee39c88 [svn] Undef U, W, C after use. 2003-09-16 18:59:46 -07:00
hniksic
aa24b822ca [svn] Improved --restrict-file-names to accept ",nocontrol". 2003-09-16 18:32:05 -07:00
hniksic
d4281f04b2 [svn] Made sync_path more resilient to pathological values of u->file and u->dir. 2003-09-16 17:18:52 -07:00
hniksic
4b1afddab3 [svn] Allow unique_name to return the FILE argument unmodified.
Streamline and optimize unique_name_1.
2003-09-16 14:47:49 -07:00
hniksic
79157a03fd [svn] Made strpbrk_or_eos a macro under Gcc. 2003-09-15 03:47:46 -07:00
hniksic
0a3697ad65 [svn] New mechanism for quoting file names.
Published in <m3smmzt4px.fsf@hniksic.iskon.hr>.
2003-09-14 15:04:13 -07:00
hniksic
cd8b1259f1 [svn] IPv6 configure auto-detection. 2003-09-09 12:30:45 -07:00
hniksic
0e9b4de751 [svn] Return an error from url_parse if the IP address is IPv6 and we don't
handle IPv6.
2003-09-09 06:06:58 -07:00
hniksic
05715ed4d6 [svn] Add proper detection of numeric IPv6 addresses.
By Mauro Tortonesi.
2003-09-05 13:36:17 -07:00
hniksic
b2be7522c7 [svn] Update the license to include the OpenSSL exception. 2002-05-17 19:16:36 -07:00
abbotti
e863a6323b [svn] New function schemes_are_similar_p to test enumerated scheme codes for
similarity (SCHEME_HTTP and SCHEME_HTTPS are similar).  Use it in recur.c
(download_child_p).  Fixes a bug that caused -H option to be ignored when
child scheme different to parent scheme.
Published in <agn4eu8apduek7magfu9bfe63gto8i7cdh@farscape.privy.mev.co.uk>.
2002-05-16 10:22:24 -07:00
hniksic
5390ada318 [svn] Support FWTK-style proxies.
Published in <sxslmbsxptu.fsf@florida.arsdigita.de>.
2002-04-12 20:04:47 -07:00
hniksic
485673c3a8 [svn] Make sure directory and file names are encoded the same way.
Published in <sxsofgqmf6e.fsf@florida.arsdigita.de>.
2002-04-11 08:25:51 -07:00
hniksic
b9e90c34b4 [svn] Don't treat '?' as a query string separator when parsing FTP URLs.
Published in <sxslmdqxdmq.fsf@florida.arsdigita.de>.
2002-02-18 21:09:14 -08:00
hniksic
1bea726393 [svn] Allow all hex digits in IPv6 IP addresses.
Published in <sxsofjgvo72.fsf@florida.arsdigita.de>.
2002-01-26 12:43:17 -08:00
hniksic
ef1eda86c4 [svn] Allow IPv6 numeric addresses in URLs.
Submitted in <sxsu1t9uedf.fsf@florida.arsdigita.de>.
2002-01-26 11:00:38 -08:00
hniksic
46617228fa [svn] URL-decode user and password in URL.
Published in <sxsita52hg6.fsf@florida.arsdigita.de>.
2002-01-14 05:26:16 -08:00
hniksic
524a1f54dc [svn] Handle links to relative "net locations," e.g. <a href="//www.server.com/">. 2002-01-13 17:56:40 -08:00
hniksic
a3500d32d7 [svn] Move path_simplify to url.c. 2001-12-14 07:46:00 -08:00
hniksic
b9f370004d [svn] Cosmetic changes to get_urls_html. 2001-12-12 11:06:10 -08:00
hniksic
943f657aa7 [svn] Rename long_to_string to number_to_string, and make it return a useful
value.
2001-12-09 18:29:12 -08:00
hniksic
dd84231c6a [svn] Minor fixes prompted by `lint'.
Published in <sxsadwt2nkg.fsf@florida.arsdigita.de>.
2001-12-08 17:24:41 -08:00
hniksic
0620ada923 [svn] Fix OpenSSL PRNG seeding.
Published in <sxs7ks1noc4.fsf@florida.arsdigita.de>.
2001-12-05 17:13:31 -08:00
hniksic
0fdc1bd8c0 [svn] Fix downloading of duplicate URLs.
Published in <sxsvgfmu2bj.fsf@florida.arsdigita.de>.
2001-12-04 13:03:35 -08:00
hniksic
e986f7dad3 [svn] Quote '?' as '%3F' in local files when `--html-extension' is turned on.
Published in <sxszo4ztiwr.fsf@florida.arsdigita.de>.
2001-12-04 01:49:37 -08:00
hniksic
8b2a216c77 [svn] Make --base -i work.
Published in <sxsoflisqcf.fsf@florida.arsdigita.de>.
2001-12-01 11:17:19 -08:00
hniksic
569fd61c95 [svn] Use the full path when building the authorization line.
Published in <sxsitbqu9iw.fsf@florida.arsdigita.de>.
2001-12-01 09:39:07 -08:00
hniksic
f4d019a423 [svn] Correctly convert links in <meta http-equiv=Refresh content="...">.
Published in <sxsadx3wp49.fsf@florida.arsdigita.de>.
2001-11-30 20:18:51 -08:00
hniksic
cca7541b10 [svn] Don't translate %d-%d. 2001-11-27 04:58:09 -08:00
hniksic
df05e7ff10 [svn] Handle <base href=...> when converting links.
Published in <sxsadxaae3t.fsf@florida.arsdigita.de>.
2001-11-25 10:40:55 -08:00
hniksic
2e6e3f21f8 [svn] Attempt to quote '?' as "%3F" when linking to local files.
Given up on the attempt, as it breaks local browsing.
2001-11-25 09:44:28 -08:00
hniksic
222e9465b7 [svn] Implemented breadth-first retrieval.
Published in <sxsherjczw2.fsf@florida.arsdigita.de>.
2001-11-24 19:10:34 -08:00
hniksic
d5be8ecca4 [svn] Rewrite parsing and handling of URLs.
Published in <sxs4rnnlklo.fsf@florida.arsdigita.de>.
2001-11-21 16:24:28 -08:00
hniksic
a24b3d50f0 [svn] Don't use the now-obsolete TYPE variable.
Published in <sxswv0ledyx.fsf@florida.arsdigita.de>.
2001-11-20 08:03:41 -08:00
hniksic
94c5b23136 [svn] Handle shorthands in proxy URLs.
Published in <sxs6686py1q.fsf@florida.arsdigita.de>.
2001-11-19 08:15:42 -08:00
hniksic
e8e8797873 [svn] Rewrite shorthand URLs in a step separate from parsing.
Published in <sxspu6f7ecz.fsf@florida.arsdigita.de>.
2001-11-18 17:14:14 -08:00
hniksic
f178e6c613 [svn] Clean up handling of schemes.
Published in <sxswv0n7h7s.fsf@florida.arsdigita.de>.
2001-11-18 16:12:05 -08:00
hniksic
303f406997 [svn] Don't list all the "known" (but unsupported) protocols. Instead, just
skip the characters until the first ':'.
Published in <sxsitc8a848.fsf@florida.arsdigita.de>.
2001-11-17 22:49:09 -08:00
hniksic
0c42479322 [svn] Applied Edward Sabol's patch from
<200106131813.f5DIDss1294858@alderaan.gsfc.nasa.gov>.
It fixes a memory leak in url_equal, and comments it out,
as it's unused.
2001-11-16 08:49:19 -08:00
hniksic
e1f4cff68c [svn] Make sure that slashes don't sneak in as part of file name via
query string.
Published in <sxsu21eb3te.fsf@florida.arsdigita.de>.
2001-06-18 02:08:04 -07:00
hniksic
0b056d1720 [svn] Update copyright notices. 2001-05-27 12:35:15 -07:00
hniksic
ae621c6770 [svn] Treat empty proxy environment vars as unset.
Published in <sxssniwq8d6.fsf@florida.arsdigita.de>.
2001-04-26 03:11:49 -07:00
hniksic
d80f6cbe8c [svn] Reimplemented UNSAFE_CHAR and RESERVED_CHAR.
Fixed snprintf.c to avoid ISDIGIT.
2001-04-24 17:20:30 -07:00
hniksic
ac7c8c1390 [svn] Improve performance of grow_hash_table.
Published in <sxs66g8nd4c.fsf@florida.arsdigita.de>.
2001-04-14 00:41:29 -07:00
hniksic
61bb00adc0 [svn] Various url.c-related changes.
Published in <sxsvgo8nmub.fsf@florida.arsdigita.de>.

* retr.c (retrieve_url): Call uri_merge, not url_concat.
* html-url.c (collect_tags_mapper): Call uri_merge, not
url_concat.
* url.c (mkstruct): Use encode_string instead of xstrdup followed
by URL_CLEANSE.
(path_simplify_with_kludge): Deleted.
(contains_unsafe): Deleted.
(construct): Renamed to uri_merge_1.
(url_concat): Renamed to uri_merge.
* url.c (str_url): Use encode_string instead of the unnecessary
CLEANDUP.
(encode_string_maybe): New function, returns input string if no
encoding is needed.
(encode_string): Call encode_string_maybe to do the dirty work,
xstrdup if no work needed.
* wget.h (XDIGIT_TO_xchar): Define here.
* url.c (decode_string): Use new name.
(encode_string): Ditto.
* http.c (XDIGIT_TO_xchar): Rename HEXD2asc to XDIGIT_TO_xchar.
(dump_hash): Use new name.
* wget.h: Rename ASC2HEXD and HEXD2ASC to XCHAR_TO_XDIGIT and
XDIGIT_TO_XCHAR respectively.
2001-04-13 21:11:35 -07:00
hniksic
8a0e9e765e [svn] Minor -Wall-induced fixes. Also, skip_url is removed.
Published in <sxs8zl5v5cw.fsf@florida.arsdigita.de>.
2001-04-12 20:39:23 -07:00
hniksic
963863113f [svn] Fix retrieval of directories when initial CWD is not `/'.
Published in <sxsitkc709p.fsf@florida.arsdigita.de>.

* url.c (parseurl): Don't strip trailing slash when u->dir is "/"
because that strips the *leading* slash, thus forcing relative
FTP retrieval.
* ftp.c (getftp): Convert initial FTP directory from VMS to UNIX
notation for VMS servers.
(ftp_retrieve_dirs): Do not prepend '/' to f->name when
odir is an empty string.
2001-04-10 17:24:59 -07:00
hniksic
c51015565a [svn] parse_uname() would run past the end of the string if the
username was present, but the URL did not contain a slash, e.g.
http://foo:bar@myhost.
Reported by Christian Fraenkel.
2001-04-04 07:00:34 -07:00
hniksic
1a6058b1ec [svn] Applied Philipp Thomas's safe-ctype patch. Published in
<20010330025159.U21662@jeffreys.suse.de>.
2001-03-30 14:36:59 -08:00
janp
5014d32c3a [svn] Skip `:port' in the host header if it is the DEFAULT_HTTPS_PORT when
using SSL. Patch submitted by Hack Kampbjorn <hack@hackdata.com>.
2001-03-08 15:11:03 -08:00
hniksic
54811e2832 [svn] Applied Jan's patch to allow non-quoted @ character in
passwords.  Published in <20010106173455.A9455@erwin.telekabel.at>.
2001-02-10 16:28:22 -08:00
hniksic
b370dd1914 [svn] Applied Hack Kampbjorn's patch to print FTP type in debug output.
Published in <3A7D94B5.D9B932FB@hackdata.com>.
2001-02-10 16:06:59 -08:00
dan
fa636eb71d [svn] url.c (str_url): Clarified this function's comment header after Hrvoje answered
my question on the list as to when hide != 1.  Also Hrvoje pointed out I need to
use xstrdup() on the string literal.
2001-01-10 22:16:46 -08:00
dan
48cf02169d [svn] Just clarified a comment in the fix I just committed. 2001-01-09 20:32:29 -08:00
dan
1993e140f2 [svn] url.c (str_url): Henrik van Ginhoven pointed out on the list that we shouldn't
give away the number of characters in the password by replacing each character
with a 'x'.  Use "<password>" instead.
2001-01-09 20:30:43 -08:00
dan
a77dc45c4d [svn] Hrvoje's response to my "wondering" comment in write_backup_file() read
extremely strangely without adding tags to show who was saying what.  Also, one
of his phrases was very misleading.
2001-01-09 18:10:16 -08:00
hniksic
35325bd092 [svn] Include fragment identifiers in converted URLs. Published in
<sxs8zorl90l.fsf@florida.arsdigita.de>.
2001-01-04 05:53:53 -08:00
hniksic
5099ec0306 [svn] Apply lint-inspired fixes from <sxsn1du7ufa.fsf@florida.arsdigita.de>. 2000-12-17 10:52:52 -08:00
hniksic
7828e81c79 [svn] Committed C. Frankel's SSL patch. 2000-12-05 15:09:41 -08:00
hniksic
7b5ad90acf [svn] Commit my url.c fix (space as unsafe character) and Jan's
winnt directory listing parsing.
2000-12-05 14:29:47 -08:00
hniksic
1cddc05edb [svn] Committed memory debugging stuff.
Published in <sxs1yw34pt4.fsf@florida.arsdigita.de>.
2000-11-22 14:15:45 -08:00
hniksic
2ffb47eabf [svn] Committed <sxsbsv854j9.fsf@florida.arsdigita.de>. 2000-11-22 08:58:28 -08:00
hniksic
6e598c81e3 [svn] Committed a bunch of different tweaks of mine.
Published in <sxsr9463wrx.fsf@florida.arsdigita.de>.
2000-11-20 18:06:36 -08:00
hniksic
b0b1c815c1 [svn] A bunch of new features:
- use mmap() to read whole files in core instead of allocating memory
  and read'ing it.

- use a new, more general, HTML parser (html-parse.c) and interface to
  it from Wget (html-url.c).

- respect <meta name=robots content=nofollow> (easy with the new HTML
  parser).

- use hash tables instead of linked lists in places where the lists
  were used to facilitate mappings.

- rewrite the code in host.c to be more readable and faster (hash
  tables instead of home-grown lists.)

- make convert_links properly convert partial URLs to complete ones
  for those URLs that have *not* been downloaded.

- use HTTP persistent connections where available.  very
  simple-minded, caches the last connection to the server.

Published in <sxshf533d5r.fsf@florida.arsdigita.de>.
2000-11-19 12:50:10 -08:00
hniksic
f306ae9626 [svn] Changed last_slash[-1] to *(last_slash - 1). 2000-11-08 07:51:28 -08:00
hniksic
b72b6cf387 [svn] Correctly handle URLs where / does not follow the host name.
Published in <sxsn1fag6zu.fsf@florida.arsdigita.de>.
2000-11-08 01:15:40 -08:00
hniksic
0e2b74ce3b [svn] Commit "minor fixes". 2000-11-06 13:24:57 -08:00
hniksic
366ad1d6d9 [svn] Rewrote the logging code.
Published at <sxs1ywrf300.fsf@florida.arsdigita.de>.
2000-11-04 20:38:31 -08:00
hniksic
eef4a668b7 [svn] Update copyright blurbs with the year 2000. 2000-11-01 17:50:03 -08:00
hniksic
b3758323ed [svn] Applied contributed fix. 2000-11-01 15:57:19 -08:00
hniksic
b9eeb0c54c [svn] Fix "optimization" of query-strings in URLs.
Published in <sxs3dhbwnmw.fsf@florida.arsdigita.de>.
2000-11-01 10:31:53 -08:00
hniksic
515d82fb95 [svn] Committed my patch from <sxsy9z4xz5m.fsf@florida.arsdigita.de>
(recognize HTML entities.)
2000-10-31 17:25:12 -08:00
hniksic
f6715dd08d [svn] Committed my patch from <sxs7l6ozghz.fsf@florida.arsdigita.de>. 2000-10-31 16:26:33 -08:00
hniksic
0dd418242a [svn] Committed my patches from <sxsbsw16sbu.fsf@florida.arsdigita.de>
and <sxsvgu824xk.fsf@florida.arsdigita.de>.
2000-10-31 11:25:32 -08:00
dan
b3e2c0ff97 [svn] Implemented and documented new -E / --html-extension / html_extension option. 2000-10-19 22:55:46 -07:00
dan
7931200609 [svn] * *.{gmo,po,pot}: Regenerated after modifying wget --help output.
* ftp.c (ftp_retrieve_list): Use new INFINITE_RECURSION #define.

* html.c: htmlfindurl() now takes final `dash_p_leaf_HTML' parameter.
Wrapped some > 80-column lines.  When -p is specified and we're at a
leaf node, do not traverse <A>, <AREA>, or <LINK> tags other than
<LINK REL="stylesheet">.

* html.h (htmlfindurl): Now takes final `dash_p_leaf_HTML' parameter.

* init.c: Added new -p / --page-requisites / page_requisites option.

* main.c (print_help): Clarified that -l inf and -l 0 both allow
infinite recursion.  Changed the unhelpful --mirror description
to simply give the options it's equivalent to.  Added new -p option.
(main): Added some comments; handle new -p / --page-requisites.

* options.h (struct options): Added new page_requisites field.

* recur.c: Changed "URL-s" to "URLs" and "HTML-s" to "HTMLs".
Calculate and pass down new `dash_p_leaf_HTML' parameter to
get_urls_html().  Use new INFINITE_RECURSION #define.

* retr.c: Changed "URL-s" to "URLs".  get_urls_html() now takes
final `dash_p_leaf_HTML' parameter.

* url.c: get_urls_html() and htmlfindurl() now take final
`dash_p_leaf_HTML' parameter.

* url.h (get_urls_html): Now takes final `dash_p_leaf_HTML' parameter.

* wget.h: Added some comments and new INFINITE_RECURSION #define.

* wget.texi (Recursive Retrieval Options): Documented new -p option.
2000-08-30 04:26:21 -07:00
hniksic
1765080b2e [svn] Comment fix. 2000-06-09 01:03:19 -07:00
hniksic
0eec6b9f30 [svn] Committed my patch <dpem6hln1k.fsf@mraz.iskon.hr>. 2000-06-01 03:47:03 -07:00
dan
1ecfed1e10 [svn] * host.c (store_hostaddress): R. K. Owen's patch introduces a "left shift count
>= width of type" warning on 32-bit architectures.  Got rid of it by tricking
  the compiler w/ a variable.

* url.c (UNSAFE_CHAR): The macro didn't include all the illegal characters per
  RFC1738, namely everything above '~'.  It also generated a warning on OSes
  where char =~ unsigned char.  Fixed.
2000-04-04 20:08:10 -07:00
hniksic
0d42b49e30 [svn] Commit really old change. 2000-03-31 06:04:54 -08:00
dan
3a8c75cac4 [svn] Dan Berger's query string patch is totally bogus. If you have two different
URLs, get_page.cgi?page1 and get_page.cgi?page2, they'll both be saved as
get_page.cgi and the second will overwrite the first.  Also, parameters to
implicit CGIs, like "http://www.host.com/db/?2000-03-02" cause the URLs to be
printed with trailing garbage characters, and could seg fault.  I'm not sure
what Dan had in mind with this patch (no explanatory comments), but I'm removing
it for now.  If he can rewrite it so it doesn't break stuff, okay.
2000-03-02 14:48:07 -08:00
hniksic
2b2fd2924a [svn] Added user-contributed patches. 2000-03-02 06:16:12 -08:00
dan
4331c39c9a [svn] Implemented the item I formerly had in the TODO: When -K and -N are used
together, we compare local file X.orig (if extant) against server file X.
Previously -k and -N were worthless in combination because the local converted
files always differed from the server versions.
2000-03-01 22:33:48 -08:00
dan
e5408e7db8 [svn] Implemented new -K / --backup-converted / backup_converted = on option. 2000-02-29 16:17:23 -08:00
kwget
31d6616c48 [svn] Initial revision 1999-12-01 23:42:23 -08:00