mirror of https://github.com/moparisthebest/wget synced 2024-07-03 16:38:41 -04:00
Commit Graph

69 Commits

Author SHA1 Message Date
hniksic
f2f77d87fd [svn] New option --no-http-keep-alive.
Published in <sxsd7fr1pdf.fsf@florida.arsdigita.de>.
2000-11-19 16:04:06 -08:00
hniksic
b27144fcce [svn] My patch "persistent connection tweaks".
Published in <sxshf531qhj.fsf@florida.arsdigita.de>.

(Applied with the addition of correct calculation for the
length of the request.)
2000-11-19 15:42:13 -08:00
hniksic
b0b1c815c1 [svn] A bunch of new features:
- use mmap() to read whole files in core instead of allocating memory
  and read'ing it.

- use a new, more general, HTML parser (html-parse.c) and interface to
  it from Wget (html-url.c).

- respect <meta name=robots content=nofollow> (easy with the new HTML
  parser).

- use hash tables instead of linked lists in places where the lists
  were used to facilitate mappings.

- rewrite the code in host.c to be more readable and faster (hash
  tables instead of home-grown lists.)

- make convert_links properly convert partial URLs to complete ones
  for those URLs that have *not* been downloaded.

- use HTTP persistent connections where available.  The implementation
  is very simple-minded: it caches only the last connection to the server.

Published in <sxshf533d5r.fsf@florida.arsdigita.de>.
2000-11-19 12:50:10 -08:00
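
The "caches the last connection to the server" approach described in the commit above can be pictured as a single-slot cache. The following is a minimal Python sketch of that idea, not wget's C code; the class name `LastConnectionCache` and the `connect` factory are hypothetical names introduced only for illustration.

```python
class LastConnectionCache:
    """Single-slot cache: remembers only the most recent (host, port) connection."""

    def __init__(self, connect):
        # `connect` is a hypothetical factory: (host, port) -> connection object.
        self._connect = connect
        self._key = None
        self._conn = None

    def get(self, host, port):
        """Return the cached connection if it matches, else open a new one."""
        key = (host.lower(), port)  # host names compare case-insensitively
        if self._conn is not None and self._key == key:
            return self._conn
        self.invalidate()
        self._conn = self._connect(host, port)
        self._key = key
        return self._conn

    def invalidate(self):
        """Drop the cached connection, closing it if it has a close() method."""
        if self._conn is not None:
            close = getattr(self._conn, "close", None)
            if callable(close):
                close()
        self._key = None
        self._conn = None
```

A single slot is cheap and helps when consecutive requests go to the same server, which is the common case in a recursive download; any request to a different host simply evicts the previous connection.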
hniksic
ccf31643ab [svn] vsnprintf() fixup. 2000-11-16 08:37:49 -08:00
hniksic
cc3b6eb3e4 [svn] Do the _XOPEN_SOURCE/_SVID_SOURCE things only on Linux. 2000-11-15 10:10:01 -08:00
hniksic
6a70f04a5c [svn] Don't clutter the host list with duplicate hosts.
Published in <sxsitpt56eh.fsf@florida.arsdigita.de>.
2000-11-12 16:46:13 -08:00
hniksic
e1f1c1ff40 [svn] Better version of read_whole_line().
Published in <sxsr94jd7z4.fsf@florida.arsdigita.de>.
2000-11-10 10:01:35 -08:00
hniksic
e18ca280fb [svn] Fix off-by-one error in comind().
Published in <sxsvgtvdcki.fsf@florida.arsdigita.de>.
2000-11-10 08:20:55 -08:00
hniksic
f306ae9626 [svn] Changed last_slash[-1] to *(last_slash - 1). 2000-11-08 07:51:28 -08:00
hniksic
b72b6cf387 [svn] Correctly handle URLs where / does not follow the host name.
Published in <sxsn1fag6zu.fsf@florida.arsdigita.de>.
2000-11-08 01:15:40 -08:00
hniksic
34ea31bb01 [svn] Sort commands[]. 2000-11-07 03:43:36 -08:00
hniksic
0e2b74ce3b [svn] Commit "minor fixes". 2000-11-06 13:24:57 -08:00
hniksic
366ad1d6d9 [svn] Rewrote the logging code.
Published at <sxs1ywrf300.fsf@florida.arsdigita.de>.
2000-11-04 20:38:31 -08:00
hniksic
c2c821b3c9 [svn] snprintf.c addition. 2000-11-04 14:49:46 -08:00
hniksic
6e23da9254 [svn] Hide password from URL when non-verbose, too. 2000-11-01 17:41:20 -08:00
hniksic
6eb0870af0 [svn] Contributed fix. 2000-11-01 17:02:56 -08:00
hniksic
986c445029 [svn] Fixed minor memory leaks. 2000-11-01 16:18:27 -08:00
hniksic
b3758323ed [svn] Applied contributed fix. 2000-11-01 15:57:19 -08:00
hniksic
b7a8c6d3f5 [svn] Gracefully handle opt.downloaded overflowing.
Published in <sxsd7gfnv17.fsf@florida.arsdigita.de>.
2000-11-01 15:17:31 -08:00
hniksic
29cdc8da20 [svn] Updated long_to_string(); enhanced opt.downloaded to use
64-bit types where available.
Published in <sxswvenqsmn.fsf@florida.arsdigita.de> and
<sxssnpbqshp.fsf@florida.arsdigita.de>.
2000-11-01 13:51:25 -08:00
hniksic
b9eeb0c54c [svn] Fix "optimization" of query-strings in URLs.
Published in <sxs3dhbwnmw.fsf@florida.arsdigita.de>.
2000-11-01 10:31:53 -08:00
hniksic
6d13e17142 [svn] Detect redirection cycles.
Published in <sxsd7ggtjac.fsf@florida.arsdigita.de>.
2000-10-31 20:21:50 -08:00
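
One common way to detect redirection cycles is to remember every URL already visited in the current chain. The sketch below illustrates that technique in Python under stated assumptions: `redirects` is a hypothetical dict standing in for the server's Location responses, and `max_hops` is an illustrative limit. It does not claim to be the method used in the commit above.

```python
def follow_redirects(start_url, redirects, max_hops=20):
    """Follow a redirect chain and return the final URL.

    `redirects` maps a URL to the URL it redirects to (hypothetical stand-in
    for real HTTP responses). Raises ValueError on a cycle or when the chain
    is longer than `max_hops`.
    """
    seen = {start_url}
    url = start_url
    for _ in range(max_hops):
        target = redirects.get(url)
        if target is None:
            return url  # no further redirect: this is the final URL
        if target in seen:
            raise ValueError(f"redirection cycle detected at {target}")
        seen.add(target)
        url = target
    raise ValueError("too many redirections")
```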
hniksic
515d82fb95 [svn] Committed my patch from <sxsy9z4xz5m.fsf@florida.arsdigita.de>
(recognize HTML entities.)
2000-10-31 17:25:12 -08:00
hniksic
846b045a69 [svn] Applied my patch from <sxs3dhczfv5.fsf@florida.arsdigita.de>. 2000-10-31 16:38:57 -08:00
hniksic
f6715dd08d [svn] Committed my patch from <sxs7l6ozghz.fsf@florida.arsdigita.de>. 2000-10-31 16:26:33 -08:00
hniksic
0dd418242a [svn] Committed my patches from <sxsbsw16sbu.fsf@florida.arsdigita.de>
and <sxsvgu824xk.fsf@florida.arsdigita.de>.
2000-10-31 11:25:32 -08:00
hniksic
b095202cad [svn] Applied Adrian Aichner's patch from
<20001029223711.28688.qmail@web10601.mail.yahoo.com>.
2000-10-30 13:07:04 -08:00
dan
24c465b5ad [svn] retr.c (retrieve_url): Manually applied T. Bharath
<TBharath@responsenetworks.com>'s patch to get wget to grok illegal relative URL
redirects.  Reformatted and re-commented it.
2000-10-27 20:18:20 -07:00
dan
1396b30055 [svn] Manually applied Rob Mayoff <mayoff@dqd.com>'s patch (vs. 1.5.3, not 1.5.3+dev)
to add --bind-address, making many necessary alphabetization, coding style,
comment, documentation, and naming fixes and additions.
2000-10-23 23:19:17 -07:00
dan
2fbb4936a0 [svn] main.c (print_help): Clarified that --delete-after deletes local files. 2000-10-23 20:52:34 -07:00
dan
f4673bcdaf [svn] --delete-after wasn't implemented for files retrieved by FTP or corresponding to
files specified on the commandline.  Made --convert-links be ignored when
--delete-after is specified.  Added note about this fact to --delete-after docs
and made general improvements to them, including the clarification that
--delete-after only deletes local files.
2000-10-23 20:43:47 -07:00
dan
8a9be7627d [svn] ftp.c (getftp): Applied Piotr Sulecki <Piotr.Sulecki@ios.krakow.pl>'s
patch to work around FTP servers that incorrectly respond to the
"REST" command with the remaining size rather than the total
file size.
2000-10-20 00:28:57 -07:00
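
A workaround of this kind can be sketched as a simple consistency check. The Python below is a hedged illustration, not the patch itself: it assumes the total file size is known independently (for example from a directory listing), and the function and parameter names are hypothetical.

```python
def expected_total_size(restart_offset, reported_size, known_total=None):
    """Return the size to expect for the whole file after a restarted transfer.

    If the server-reported size equals the known total minus the restart
    offset, the server evidently reported the *remaining* bytes, so the
    offset is added back. Otherwise the reported size is taken as is.
    """
    if (restart_offset > 0
            and known_total is not None
            and reported_size == known_total - restart_offset):
        return reported_size + restart_offset
    return reported_size
```

Without an independent total there is no way to tell the two server behaviours apart, so the sketch leaves the reported value untouched in that case.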
dan
8cf52e0dd3 [svn] Applied John Daily <jdaily@cyberdude.com>'s patch for his "quad" commands (which
I renamed to "lockable_boolean") in the .wgetrc (currently just passive_ftp).
Wrote documentation for his changes and added the missing "referer" to the
.wgetrc section (making mention of the issue of "referrer" being the correct
spelling).
2000-10-19 23:59:30 -07:00
dan
b3e2c0ff97 [svn] Implemented and documented new -E / --html-extension / html_extension option. 2000-10-19 22:55:46 -07:00
dan
cbf018d0c0 [svn] --retr-symlinks was not previously documented properly. Based on my newfound
understanding of what its limitations are, added a TODO item.  Also made a minor
tweak in html.c to silence a warning.
2000-10-09 15:43:11 -07:00
dan
7931200609 [svn] * *.{gmo,po,pot}: Regenerated after modifying wget --help output.
* ftp.c (ftp_retrieve_list): Use new INFINITE_RECURSION #define.

* html.c: htmlfindurl() now takes final `dash_p_leaf_HTML' parameter.
Wrapped some > 80-column lines.  When -p is specified and we're at a
leaf node, do not traverse <A>, <AREA>, or <LINK> tags other than
<LINK REL="stylesheet">.

* html.h (htmlfindurl): Now takes final `dash_p_leaf_HTML' parameter.

* init.c: Added new -p / --page-requisites / page_requisites option.

* main.c (print_help): Clarified that -l inf and -l 0 both allow
infinite recursion.  Changed the unhelpful --mirror description
to simply give the options it's equivalent to.  Added new -p option.
(main): Added some comments; handle new -p / --page-requisites.

* options.h (struct options): Added new page_requisites field.

* recur.c: Changed "URL-s" to "URLs" and "HTML-s" to "HTMLs".
Calculate and pass down new `dash_p_leaf_HTML' parameter to
get_urls_html().  Use new INFINITE_RECURSION #define.

* retr.c: Changed "URL-s" to "URLs".  get_urls_html() now takes
final `dash_p_leaf_HTML' parameter.

* url.c: get_urls_html() and htmlfindurl() now take final
`dash_p_leaf_HTML' parameter.

* url.h (get_urls_html): Now takes final `dash_p_leaf_HTML' parameter.

* wget.h: Added some comments and new INFINITE_RECURSION #define.

* wget.texi (Recursive Retrieval Options): Documented new -p option.
2000-08-30 04:26:21 -07:00
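
The leaf-node rule for -p described in the html.c entry above can be written as a small predicate. This is a Python sketch of the rule as stated in the commit message, not the C implementation; `should_follow` is a hypothetical name, and `attrs` is assumed to be a dict with lower-case attribute names.

```python
def should_follow(tag, attrs, dash_p_leaf_html):
    """Decide whether a link found in `tag` should be traversed.

    At a -p leaf node, <A>, <AREA> and <LINK> are skipped, except
    <LINK REL="stylesheet">. Everywhere else, all tags are followed.
    """
    if not dash_p_leaf_html:
        return True
    tag = tag.lower()
    if tag in ("a", "area"):
        return False
    if tag == "link":
        return attrs.get("rel", "").strip().lower() == "stylesheet"
    return True  # e.g. <IMG>: still needed to display the page
```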
dan
001392bf2b [svn] * main.c (print_help): -B / --base was not mentioned. 2000-08-23 15:40:20 -07:00
dan
1f0acebeb0 [svn] * main.c (print_help): Modified -nc description to mention that it also prevents
the creation of multiple versions of the same file with ".<number>" suffixes.
2000-08-22 20:11:55 -07:00
hniksic
7794db052c [svn] Committed Jan Prikryl's patch from
<20000709171425.A16267@launzatte.cg.tuwien.ac.at>.
2000-07-14 07:15:23 -07:00
dan
ae77e4f08e [svn] Oops. Meant to check this change in with my last one, but the commit wouldn't
go through without doing an update first, and I forgot to make the change the
second time.  Just changed an erroneous main.c (main) to main.c (print_help).
2000-06-09 14:40:26 -07:00
dan
eea2d24220 [svn] Heiko's --help output for --waitretry was over 80 columns. Shortened. It also
said that 0 seconds are waited after the first retry, which I believe is
incorrect and does not match what's written elsewhere (e.g. wget.texi).  Changed
to 1.
2000-06-09 13:59:56 -07:00
hniksic
1765080b2e [svn] Comment fix. 2000-06-09 01:03:19 -07:00
hniksic
2e806fb2f3 [svn] Don't try to chmod() symlinks. 2000-06-01 04:20:05 -07:00
hniksic
0eec6b9f30 [svn] Committed my patch <dpem6hln1k.fsf@mraz.iskon.hr>. 2000-06-01 03:47:03 -07:00
dan
b05feb3ae2 [svn] Damir Dzeko <ddzeko@zesoi.fer.hr> did not document his new --referer option.
Did so (--help output and wget.texi).  Also tweaked --help output for --execute.
2000-05-22 19:29:38 -07:00
hniksic
ee6065f581 [svn] Committed my patch from <dpd7mj3sap.fsf@mraz.iskon.hr>. 2000-05-19 00:37:22 -07:00
hniksic
094481c386 [svn] Committed host.c patch from <dpk8i3za97.fsf_-_@mraz.iskon.hr>. 2000-04-14 02:31:21 -07:00
hniksic
6b4a85888e [svn] Commit several fixes. 2000-04-12 06:23:35 -07:00
dan
1ecfed1e10 [svn] * host.c (store_hostaddress): R. K. Owen's patch introduces a "left shift count
>= width of type" warning on 32-bit architectures.  Got rid of it by tricking
  the compiler w/ a variable.

* url.c (UNSAFE_CHAR): The macro didn't include all the illegal characters per
  RFC1738, namely everything above '~'.  It also generated a warning on OSes
  where char =~ unsigned char.  Fixed.
2000-04-04 20:08:10 -07:00
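
The character classification that the url.c entry above refers to can be illustrated as follows. This Python sketch follows the unsafe set listed in RFC 1738 plus control characters and everything above '~'; it is an illustration only, the name `is_unsafe_byte` is hypothetical, and wget's actual UNSAFE_CHAR macro may differ in detail.

```python
# Printable characters that RFC 1738 lists as unsafe in URLs.
RFC1738_UNSAFE = frozenset(b' <>"#%{}|\\^~[]`')


def is_unsafe_byte(b):
    """Return True if byte value `b` (0..255) should be escaped in a URL.

    Working on integer byte values avoids the signed-versus-unsigned char
    ambiguity that the commit message mentions for C.
    """
    if not 0 <= b <= 0xFF:
        raise ValueError("byte value out of range")
    # Controls and space (<= 0x20), DEL and all 8-bit values (>= 0x7F),
    # plus the explicitly listed printable characters.
    return b <= 0x20 or b >= 0x7F or b in RFC1738_UNSAFE
```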
hniksic
bc7060a81d [svn] More old fixes. 2000-03-31 06:14:58 -08:00