Commit Graph

18 Commits

Author SHA1 Message Date
hniksic 233ebb78de [svn] Use hash table for tag lookup in html-url.c and html-parse.c. 2003-10-09 08:01:58 -07:00
hniksic ae1d264fcc [svn] Add FLAGS argument to map_html_tags. 2003-10-08 09:17:33 -07:00
hniksic 87275db136 [svn] Fix compilation problem on non-Gcc compilers. 2003-10-03 09:11:09 -07:00
hniksic eec3ea392d [svn] Better document html-parse macros. 2003-10-02 15:20:44 -07:00
hniksic 2e8899bc10 [svn] Added support for hexadecimal numeric entities. 2003-10-02 10:23:25 -07:00
hniksic 7c802e58d3 [svn] Introduce non-strict comment parsing. 2003-09-18 17:33:22 -07:00
hniksic 89b37c7eff [svn] Allow almost any character in attribute/tag names. 2002-05-27 08:03:35 -07:00
hniksic b2be7522c7 [svn] Update the license to include the OpenSSL exception. 2002-05-17 19:16:36 -07:00
hniksic a45e8255cc [svn] Allow standalone compilation of html-parse.c. 2001-12-18 11:33:36 -08:00
hniksic 90cdb82942 [svn] Use 0x22 instead of '"' or '\"'. 2001-11-16 09:26:42 -08:00
hniksic 0ce7b6bffc [svn] Support XML-style empty tags. 2001-11-16 08:44:34 -08:00
hniksic 0b056d1720 [svn] Update copyright notices. 2001-05-27 12:35:15 -07:00
hniksic e559249a48 [svn] Minor doc fix. 2001-04-24 17:59:39 -07:00
hniksic 1a6058b1ec [svn] Applied Philipp Thomas's safe-ctype patch. Published in <20010330025159.U21662@jeffreys.suse.de>. 2001-03-30 14:36:59 -08:00
hniksic b84f96df34 [svn] Use '"' rather than '\"' in assert. 2000-12-13 05:37:37 -08:00
hniksic d8c9ce30aa [svn] Make sure xfree is #define'd in standalone mode in files that support one. 2000-11-22 09:00:31 -08:00
hniksic 2ffb47eabf [svn] Committed <sxsbsv854j9.fsf@florida.arsdigita.de>. 2000-11-22 08:58:28 -08:00
hniksic b0b1c815c1 [svn] A bunch of new features:
- use mmap() to read whole files in core instead of allocating memory
  and read'ing it.

- use a new, more general, HTML parser (html-parse.c) and interface to
  it from Wget (html-url.c).

- respect <meta name=robots content=nofollow> (easy with the new HTML
  parser).

- use hash tables instead of linked lists in places where the lists
  were used to facilitate mappings.

- rewrite the code in host.c to be more readable and faster (hash
  tables instead of home-grown lists.)

- make convert_links properly convert partial URLs to complete ones
  for those URLs that have *not* been downloaded.

- use HTTP persistent connections where available.  very
  simple-minded, caches the last connection to the server.

Published in <sxshf533d5r.fsf@florida.arsdigita.de>.
2000-11-19 12:50:10 -08:00