mirror of https://github.com/moparisthebest/wget synced 2024-07-03 16:38:41 -04:00
Commit Graph

91 Commits

Author SHA1 Message Date
hniksic
61bb00adc0 [svn] Various url.c-related changes.
Published in <sxsvgo8nmub.fsf@florida.arsdigita.de>.

* retr.c (retrieve_url): Call uri_merge, not url_concat.
* html-url.c (collect_tags_mapper): Call uri_merge, not
url_concat.
* url.c (mkstruct): Use encode_string instead of xstrdup followed
by URL_CLEANSE.
(path_simplify_with_kludge): Deleted.
(contains_unsafe): Deleted.
(construct): Renamed to uri_merge_1.
(url_concat): Renamed to uri_merge.
* url.c (str_url): Use encode_string instead of the unnecessary
CLEANDUP.
(encode_string_maybe): New function, returns input string if no
encoding is needed.
(encode_string): Call encode_string_maybe to do the dirty work,
xstrdup if no work needed.
* wget.h (XDIGIT_TO_xchar): Define here.
* url.c (decode_string): Use new name.
(encode_string): Ditto.
* http.c (XDIGIT_TO_xchar): Rename HEXD2asc to XDIGIT_TO_xchar.
(dump_hash): Use new name.
* wget.h: Rename ASC2HEXD and HEXD2ASC to XCHAR_TO_XDIGIT and
XDIGIT_TO_XCHAR respectively.
2001-04-13 21:11:35 -07:00
hniksic
8a0e9e765e [svn] Minor -Wall-induced fixes. Also, skip_url is removed.
Published in <sxs8zl5v5cw.fsf@florida.arsdigita.de>.
2001-04-12 20:39:23 -07:00
hniksic
963863113f [svn] Fix retrieval of directories when initial CWD is not `/'.
Published in <sxsitkc709p.fsf@florida.arsdigita.de>.

* url.c (parseurl): Don't strip trailing slash when u->dir is "/"
because that strips the *leading* slash, thus forcing relative
FTP retrieval.
* ftp.c (getftp): Convert initial FTP directory from VMS to UNIX
notation for VMS servers.
(ftp_retrieve_dirs): Do not prepend '/' to f->name when
odir is an empty string.
2001-04-10 17:24:59 -07:00
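
The url.c part of the fix above comes down to a guard on the root directory. A minimal sketch of that idea follows, in Python rather than wget's C; the helper name `strip_trailing_slash` is hypothetical and is not the code in parseurl.

```python
def strip_trailing_slash(directory: str) -> str:
    """Drop one trailing slash, but leave the root directory "/" alone.

    Stripping the slash from "/" would yield an empty string, which loses
    the leading slash and turns an absolute path into a relative one.
    """
    if directory != "/" and directory.endswith("/"):
        return directory[:-1]
    return directory
```

With this guard, "/pub/" becomes "/pub" while "/" is returned unchanged.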
hniksic
c51015565a [svn] parse_uname() would run past the end of the string if the
username was present, but the URL did not contain a slash, e.g.
http://foo:bar@myhost.
Reported by Christian Fraenkel.
2001-04-04 07:00:34 -07:00
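
The overrun described above is a bounds problem: the scan for user info must stop at the first slash or, when there is none, at the end of the string. A rough Python sketch of a bounded scan, assuming a hypothetical helper `split_userinfo` (this is not wget's parse_uname):

```python
def split_userinfo(url: str):
    """Return (user, password, hostport) taken from the URL's authority part."""
    scheme_end = url.find("://")
    start = scheme_end + 3 if scheme_end != -1 else 0
    slash = url.find("/", start)
    # No slash after the host: the authority simply runs to the end of the string.
    end = slash if slash != -1 else len(url)
    authority = url[start:end]
    userinfo, sep, hostport = authority.rpartition("@")
    if not sep:
        return None, None, authority
    user, colon, password = userinfo.partition(":")
    return user, (password if colon else None), hostport
```

Because the search is limited to the authority, an "@" that appears later in the path is never mistaken for a user-info separator.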
hniksic
1a6058b1ec [svn] Applied Philipp Thomas's safe-ctype patch. Published in
<20010330025159.U21662@jeffreys.suse.de>.
2001-03-30 14:36:59 -08:00
janp
5014d32c3a [svn] Skip `:port' in the host header if it is the DEFAULT_HTTPS_PORT when
using SSL. Patch submitted by Hack Kampbjorn <hack@hackdata.com>.
2001-03-08 15:11:03 -08:00
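
The rule in the commit above can be sketched as a lookup of the scheme's default port. The names `DEFAULT_PORTS` and `host_header` below are illustrative assumptions, not identifiers from wget's http.c.

```python
# Default ports per scheme; 443 plays the role of DEFAULT_HTTPS_PORT.
DEFAULT_PORTS = {"http": 80, "https": 443}


def host_header(scheme: str, host: str, port: int) -> str:
    """Build a Host header, omitting ":port" when it is the scheme's default."""
    if port == DEFAULT_PORTS.get(scheme):
        return f"Host: {host}"
    return f"Host: {host}:{port}"
```

The comparison is made against the default for the scheme in use, so port 443 is still printed for a plain http request.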
hniksic
54811e2832 [svn] Applied Jan's patch to allow non-quoted @ character in
passwords.  Published in <20010106173455.A9455@erwin.telekabel.at>.
2001-02-10 16:28:22 -08:00
hniksic
b370dd1914 [svn] Applied Hack Kampbjorn's patch to print FTP type in debug output.
Published in <3A7D94B5.D9B932FB@hackdata.com>.
2001-02-10 16:06:59 -08:00
dan
fa636eb71d [svn] url.c (str_url): Clarified this function's comment header after Hrvoje answered
my question on the list as to when hide != 1.  Also Hrvoje pointed out I need to
use xstrdup() on the string literal.
2001-01-10 22:16:46 -08:00
dan
48cf02169d [svn] Just clarified a comment in the fix I just committed. 2001-01-09 20:32:29 -08:00
dan
1993e140f2 [svn] url.c (str_url): Henrik van Ginhoven pointed out on the list that we shouldn't
give away the number of characters in the password by replacing each character
with a 'x'.  Use "<password>" instead.
2001-01-09 20:30:43 -08:00
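
The masking change above can be illustrated with a fixed placeholder that is the same for every password length. This is a sketch in Python using the standard urllib.parse module; `display_url` and its `hide_password` flag are hypothetical stand-ins, not wget's str_url.

```python
from urllib.parse import urlsplit, urlunsplit


def display_url(url: str, hide_password: bool = True) -> str:
    """Return the URL for display, replacing any password with "<password>"."""
    parts = urlsplit(url)
    if not hide_password or parts.password is None:
        return url
    userinfo, _, hostport = parts.netloc.rpartition("@")
    user, _, _password = userinfo.partition(":")
    # A fixed placeholder reveals nothing about the password's length.
    netloc = f"{user}:<password>@{hostport}"
    return urlunsplit(parts._replace(netloc=netloc))
```

Two URLs that differ only in password length produce identical output, which is the property the commit message asks for.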
dan
a77dc45c4d [svn] Hrvoje's response to my "wondering" comment in write_backup_file() read
extremely strangely without adding tags to show who was saying what.  Also, one
of his phrases was very misleading.
2001-01-09 18:10:16 -08:00
hniksic
35325bd092 [svn] Include fragment identifiers in converted URLs. Published in
<sxs8zorl90l.fsf@florida.arsdigita.de>.
2001-01-04 05:53:53 -08:00
hniksic
5099ec0306 [svn] Apply lint-expired fixes from <sxsn1du7ufa.fsf@florida.arsdigita.de>. 2000-12-17 10:52:52 -08:00
hniksic
7828e81c79 [svn] Committed C. Frankel's SSL patch. 2000-12-05 15:09:41 -08:00
hniksic
7b5ad90acf [svn] Commit my url.c fix (space as unsafe character) and Jan's
winnt directory listing parsing.
2000-12-05 14:29:47 -08:00
hniksic
1cddc05edb [svn] Committed memory debugging stuff.
Published in <sxs1yw34pt4.fsf@florida.arsdigita.de>.
2000-11-22 14:15:45 -08:00
hniksic
2ffb47eabf [svn] Committed <sxsbsv854j9.fsf@florida.arsdigita.de>. 2000-11-22 08:58:28 -08:00
hniksic
6e598c81e3 [svn] Committed a bunch of different tweaks of mine.
Published in <sxsr9463wrx.fsf@florida.arsdigita.de>.
2000-11-20 18:06:36 -08:00
hniksic
b0b1c815c1 [svn] A bunch of new features:
- use mmap() to read whole files in core instead of allocating memory
  and read'ing it.

- use a new, more general, HTML parser (html-parse.c) and interface to
  it from Wget (html-url.c).

- respect <meta name=robots content=nofollow> (easy with the new HTML
  parser).

- use hash tables instead of linked lists in places where the lists
  were used to facilitate mappings.

- rewrite the code in host.c to be more readable and faster (hash
  tables instead of home-grown lists.)

- make convert_links properly convert partial URLs to complete ones
  for those URLs that have *not* been downloaded.

- use HTTP persistent connections where available.  very
  simple-minded, caches the last connection to the server.

Published in <sxshf533d5r.fsf@florida.arsdigita.de>.
2000-11-19 12:50:10 -08:00
hniksic
f306ae9626 [svn] Changed last_slash[-1] to *(last_slash - 1). 2000-11-08 07:51:28 -08:00
hniksic
b72b6cf387 [svn] Correctly handle URLs where / does not follow the host name.
Published in <sxsn1fag6zu.fsf@florida.arsdigita.de>.
2000-11-08 01:15:40 -08:00
hniksic
0e2b74ce3b [svn] Commit "minor fixes". 2000-11-06 13:24:57 -08:00
hniksic
366ad1d6d9 [svn] Rewrote the logging code.
Published at <sxs1ywrf300.fsf@florida.arsdigita.de>.
2000-11-04 20:38:31 -08:00
hniksic
eef4a668b7 [svn] Update copyright blurbs with the year 2000. 2000-11-01 17:50:03 -08:00
hniksic
b3758323ed [svn] Applied contributed fix. 2000-11-01 15:57:19 -08:00
hniksic
b9eeb0c54c [svn] Fix "optimization" of query-strings in URLs.
Published in <sxs3dhbwnmw.fsf@florida.arsdigita.de>.
2000-11-01 10:31:53 -08:00
hniksic
515d82fb95 [svn] Committed my patch from <sxsy9z4xz5m.fsf@florida.arsdigita.de>
(recognize HTML entities.)
2000-10-31 17:25:12 -08:00
hniksic
f6715dd08d [svn] Committed my patch from <sxs7l6ozghz.fsf@florida.arsdigita.de>. 2000-10-31 16:26:33 -08:00
hniksic
0dd418242a [svn] Committed my patches from <sxsbsw16sbu.fsf@florida.arsdigita.de>
and <sxsvgu824xk.fsf@florida.arsdigita.de>.
2000-10-31 11:25:32 -08:00
dan
b3e2c0ff97 [svn] Implemented and documented new -E / --html-extension / html_extension option. 2000-10-19 22:55:46 -07:00
dan
7931200609 [svn] * *.{gmo,po,pot}: Regenerated after modifying wget --help output.
* ftp.c (ftp_retrieve_list): Use new INFINITE_RECURSION #define.

* html.c: htmlfindurl() now takes final `dash_p_leaf_HTML' parameter.
Wrapped some > 80-column lines.  When -p is specified and we're at a
leaf node, do not traverse <A>, <AREA>, or <LINK> tags other than
<LINK REL="stylesheet">.

* html.h (htmlfindurl): Now takes final `dash_p_leaf_HTML' parameter.

* init.c: Added new -p / --page-requisites / page_requisites option.

* main.c (print_help): Clarified that -l inf and -l 0 both allow
infinite recursion.  Changed the unhelpful --mirror description
to simply give the options it's equivalent to.  Added new -p option.
(main): Added some comments; handle new -p / --page-requisites.

* options.h (struct options): Added new page_requisites field.

* recur.c: Changed "URL-s" to "URLs" and "HTML-s" to "HTMLs".
Calculate and pass down new `dash_p_leaf_HTML' parameter to
get_urls_html().  Use new INFINITE_RECURSION #define.

* retr.c: Changed "URL-s" to "URLs".  get_urls_html() now takes
final `dash_p_leaf_HTML' parameter.

* url.c: get_urls_html() and htmlfindurl() now take final
`dash_p_leaf_HTML' parameter.

* url.h (get_urls_html): Now takes final `dash_p_leaf_HTML' parameter.

* wget.h: Added some comments and new INFINITE_RECURSION #define.

* wget.texi (Recursive Retrieval Options): Documented new -p option.
2000-08-30 04:26:21 -07:00
hniksic
1765080b2e [svn] Comment fix. 2000-06-09 01:03:19 -07:00
hniksic
0eec6b9f30 [svn] Committed my patch <dpem6hln1k.fsf@mraz.iskon.hr>. 2000-06-01 03:47:03 -07:00
dan
1ecfed1e10 [svn] * host.c (store_hostaddress): R. K. Owen's patch introduces a "left shift count
>= width of type" warning on 32-bit architectures.  Got rid of it by tricking
  the compiler w/ a variable.

* url.c (UNSAFE_CHAR): The macro didn't include all the illegal characters per
  RFC1738, namely everything above '~'.  It also generated a warning on OSes
  where char is equivalent to unsigned char.  Fixed.
2000-04-04 20:08:10 -07:00
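
The UNSAFE_CHAR fix above amounts to a range check on byte values. A simplified Python sketch follows; it covers only control characters, space, and everything above '~', and deliberately leaves out the unsafe punctuation listed in RFC 1738. The names `is_unsafe` and `encode_string` are used for illustration and do not reproduce wget's macro or function.

```python
def is_unsafe(byte: int) -> bool:
    """True for control characters and space (<= 0x20) and for values above '~'."""
    return byte <= 0x20 or byte > ord("~")


def encode_string(data: bytes) -> str:
    """Percent-encode every unsafe byte; leave all other bytes untouched."""
    return "".join(f"%{b:02X}" if is_unsafe(b) else chr(b) for b in data)
```

Working on byte values in the range 0 to 255 sidesteps the signed versus unsigned char question that caused the warning in C.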
hniksic
0d42b49e30 [svn] Commit really old change. 2000-03-31 06:04:54 -08:00
dan
3a8c75cac4 [svn] Dan Berger's query string patch is totally bogus. If you have two different
URLs, get_page.cgi?page1 and get_page.cgi?page2, they'll both be saved as
get_page.cgi and the second will overwrite the first.  Also, parameters to
implicit CGIs, like "http://www.host.com/db/?2000-03-02" cause the URLs to be
printed with trailing garbage characters, and could seg fault.  I'm not sure
what Dan had in mind with this patch (no explanatory comments), but I'm removing
it for now.  If he can rewrite it so it doesn't break stuff, okay.
2000-03-02 14:48:07 -08:00
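
The overwrite problem described above is easy to reproduce: once the query string is dropped, distinct URLs map to the same local file name. The Python sketch below only demonstrates the collision; both helper names are hypothetical and neither mirrors wget's actual file-naming code.

```python
import posixpath
from urllib.parse import urlsplit


def name_without_query(url: str) -> str:
    """Local file name with the query string dropped (the behavior criticized above)."""
    return posixpath.basename(urlsplit(url).path)


def name_with_query(url: str) -> str:
    """Local file name that keeps the query string, so distinct URLs stay distinct."""
    parts = urlsplit(url)
    base = posixpath.basename(parts.path)
    return f"{base}?{parts.query}" if parts.query else base
```

For get_page.cgi?page1 and get_page.cgi?page2, the first helper returns get_page.cgi both times, so the second download would replace the first.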
hniksic
2b2fd2924a [svn] Added user-contributed patches. 2000-03-02 06:16:12 -08:00
dan
4331c39c9a [svn] Implemented the item I formerly had in the TODO: When -K and -N are used
together, we compare local file X.orig (if extant) against server file X.
Previously -k and -N were worthless in combination because the local converted
files always differed from the server versions.
2000-03-01 22:33:48 -08:00
dan
e5408e7db8 [svn] Implemented new -K / --backup-converted / backup_converted = on option. 2000-02-29 16:17:23 -08:00
kwget
31d6616c48 [svn] Initial revision 1999-12-01 23:42:23 -08:00