72 Commits

Author SHA1 Message Date
micah 4d7c5e087b [svn] Merge of fix for bugs 20341 and 20410. 2007-07-09 22:53:22 -07:00
mtortonesi 763229b67f [svn] #include'd spider.h to get rid of compiler warnings. 2006-08-28 07:41:40 -07:00
mtortonesi 0dbef4ccb4 [svn] Several fixes for recursive spider mode. 2006-08-24 08:27:57 -07:00
mtortonesi 1c7493b83e [svn] Added sanity checks for -k, -p, -r and -N when -O is given. Added fixes for 64-bit platforms. Updated copyright and maintainer information. 2006-07-14 06:25:50 -07:00
mtortonesi 01093c0c33 [svn] Fixed recursive spider mode. 2006-05-25 09:11:29 -07:00
mtortonesi ea4ffded27 [svn] Restricted operational semantics of frontcmp and proclist from generic strings to directory names, and fixed dirname matching algorithm. Renamed above mentioned functions to subdir_p and dir_matches_p respectively. Added testcases for subdir_p and dir_matches_p. 2006-03-15 06:55:29 -08:00
mtortonesi 0bd1751372 [svn] recur.c: changed type of html_allowed member of struct queue_element to bool. 2006-03-14 05:52:17 -08:00
hniksic 097695b723 [svn] New option --ignore-case for case-insensitive matching. 2005-07-06 12:44:00 -07:00
hniksic db9de5b075 [svn] Update FSF's address and copyright years. 2005-07-01 19:26:52 -07:00
hniksic 2447fb9a9b [svn] Move extern declarations to .h files. 2005-06-27 11:19:22 -07:00
hniksic 002def87d2 [svn] Rename LARGE_INT to SUM_SIZE_INT, and simplify its handling. 2005-06-25 07:39:51 -07:00
hniksic 74fbb03b10 [svn] Use bool type for boolean variables and values. 2005-06-22 12:38:10 -07:00
hniksic 277e840a0f [svn] Remove K&R support. 2005-06-19 15:34:58 -07:00
hniksic b49b6db4f1 [svn] Correct logic of check #6 in download_child_p.
By Larry Jones and Hrvoje Niksic.
2005-04-09 15:18:36 -07:00
hniksic e2e9b753e4 [svn] Retired the `boolean' type. Renamed FREE_MAYBE to xfree_null and moved the
definition from wget.h to xmalloc.h.  Moved the DEFAULT_LOGFILE
define to log.h.  Moved the INFINITE_RECURSION define to recur.h.
2003-11-02 11:56:37 -08:00
hniksic 5f0a2b3f08 [svn] Use new macros xnew, xnew0, xnew_array, and xnew0_array in various places. 2003-10-31 06:55:50 -08:00
hniksic 711bf72609 [svn] Remove VERY_LONG_TYPE; use LARGE_INT instead. Remove special code
for handling VERY_LONG_TYPE overflows.
Make opt.quota a LARGE_INT.
2003-10-11 06:57:11 -07:00
hniksic 1b3cdef574 [svn] Don't descend into HTML that was downloaded by following <img src=...>
and such.
2003-10-10 07:25:10 -07:00
hniksic 097923f7b1 [svn] Move fnmatch() to cmpt.c and don't use it under GNU libc. 2003-10-07 16:53:31 -07:00
hniksic 95c647eb44 [svn] Split off non-URL related stuff from url.c to convert.c. 2003-09-21 15:47:14 -07:00
hniksic d7673d398b [svn] Check whether downloaded_html_set is non-NULL before using it.
Posted in <sxsr8hsvnhh.fsf@florida.munich.redhat.com>.
2002-07-24 14:16:30 -07:00
hniksic b2be7522c7 [svn] Update the license to include the OpenSSL exception. 2002-05-17 19:16:36 -07:00
abbotti 83dc077b17 [svn] (download_child_p): Minor optimization to avoid unnecessary call to
schemes_are_similar_p function.
Published in <kvq7eu4okekh2ohb0rdvavt16nbgb02v00@farscape.privy.mev.co.uk>.
2002-05-16 10:38:30 -07:00
abbotti e863a6323b [svn] New function schemes_are_similar_p to test enumerated scheme codes for
similarity (SCHEME_HTTP and SCHEME_HTTPS are similar).  Use it in recur.c
(download_child_p).  Fixes a bug that caused -H option to be ignored when
child scheme different to parent scheme.
Published in <agn4eu8apduek7magfu9bfe63gto8i7cdh@farscape.privy.mev.co.uk>.
2002-05-16 10:22:24 -07:00
hniksic 6fe9ec9f16 [svn] Indentation change. 2002-04-20 21:25:07 -07:00
hniksic bf018d5721 [svn] Revert order of check number 6 in download_child_p for clarity. 2002-04-20 19:15:11 -07:00
hniksic d4b0486cc4 [svn] Remove needless level of indentation. 2002-04-20 17:54:13 -07:00
hniksic f8b4b8bd12 [svn] When downloading recursively, don't ignore rejection of HTML
documents that are themselves leaves of recursion.
2002-04-15 14:57:10 -07:00
abbotti cfd7b9a951 [svn] Use new function to test filename for common html suffixes.
Submitted by Ian Abbott in <3CB72D29.4898.1F34872@localhost> with minor
changes to formatting and comments.
2002-04-12 11:53:39 -07:00
hniksic 1fa3b90235 [svn] Handle starting URL of recursing download being non-parsable.
Published in <sxszo26t33k.fsf@florida.arsdigita.de>.
2002-02-18 22:09:57 -08:00
hniksic 75a080ad0d [svn] Follow https links from http.
Submitted by Christian Lackas in <20020211202444.GA20371@lackas.desy.de>.
2002-02-18 21:23:35 -08:00
hniksic 8db1264218 [svn] Enqueue start_url in the canonical form.
Published in <sxsofkvi8zx.fsf@florida.arsdigita.de>.
2001-12-19 06:27:29 -08:00
hniksic 2cf87bea8b [svn] Fix crash introduced by previous patch. 2001-12-18 14:20:14 -08:00
hniksic 40fd876c57 [svn] Descend into HTML files we've already downloaded. 2001-12-18 14:14:31 -08:00
hniksic 416671063a [svn] Propagate referrer information from retrieve_tree to retrieve_url.
Submitted by Ian Abbott in <3C1F4BFE.17436.D2D7B2@localhost>.
2001-12-18 07:22:03 -08:00
hniksic f031900662 [svn] Don't abort when one URL references more than one file.
Published in <sxs1yhz0w1m.fsf@florida.arsdigita.de>.
2001-12-13 11:18:31 -08:00
hniksic 8a2ab60263 [svn] Fix overzealous URL-removal in register_download.
Published in <sxszo4yqq91.fsf@florida.arsdigita.de>.
2001-12-04 19:51:23 -08:00
hniksic 0fdc1bd8c0 [svn] Fix downloading of duplicate URLs.
Published in <sxsvgfmu2bj.fsf@florida.arsdigita.de>.
2001-12-04 13:03:35 -08:00
hniksic 7ab7f93f8d [svn] Make -p work with framed pages.
Published in <sxsu1vby71t.fsf@florida.arsdigita.de>.
2001-11-30 19:06:41 -08:00
hniksic a4db28e20f [svn] Ignore -np when in -p mode.
Published in <sxsg06w2c52.fsf@florida.arsdigita.de>.
2001-11-30 13:17:53 -08:00
hniksic 39482df431 [svn] descend_url_p: When resolving no_parent, compare with the start url,
not the parent url.
Published in <sxspu614ikm.fsf@florida.arsdigita.de>.
2001-11-29 09:04:28 -08:00
hniksic 024cb5ed3a [svn] A lot of host name changes.
Published in <sxs3d32856s.fsf@florida.arsdigita.de>.
2001-11-25 21:36:33 -08:00
hniksic f6921edc73 [svn] Be careful whether we want to descend into results of redirection.
Published in <sxs7kse8hmq.fsf@florida.arsdigita.de>.
2001-11-25 17:11:48 -08:00
hniksic 3afb9c659a [svn] Recursion and progress bar tweaks.
Published in <sxsd727cvc0.fsf@florida.arsdigita.de>.
2001-11-25 13:03:30 -08:00
hniksic df05e7ff10 [svn] Handle <base href=...> when converting links.
Published in <sxsadxaae3t.fsf@florida.arsdigita.de>.
2001-11-25 10:40:55 -08:00
hniksic 222e9465b7 [svn] Implemented breadth-first retrieval.
Published in <sxsherjczw2.fsf@florida.arsdigita.de>.
2001-11-24 19:10:34 -08:00
hniksic 1da2947d50 [svn] Fix typo that made us never use robots.txt. 2001-11-23 17:48:28 -08:00
hniksic d5be8ecca4 [svn] Rewrite parsing and handling of URLs.
Published in <sxs4rnnlklo.fsf@florida.arsdigita.de>.
2001-11-21 16:24:28 -08:00
hniksic f178e6c613 [svn] Clean up handling of schemes.
Published in <sxswv0n7h7s.fsf@florida.arsdigita.de>.
2001-11-18 16:12:05 -08:00
hniksic 05f90bb302 [svn] Plug in new implementation of RES.
Published in <sxselmwddt0.fsf@florida.arsdigita.de>.
2001-11-17 18:17:30 -08:00