1
0
mirror of https://github.com/moparisthebest/wget synced 2024-07-03 16:38:41 -04:00
Commit Graph

3480 Commits

Author SHA1 Message Date
Jookia
e4db00d74d Add option to write URL rejections to a tab-delimited CSV log.
* main.c: Add "--rejected-log" option.
 * init.c: Add "rejectedlog" command.
 * options.h: Add "rejected_log" parameter string.
 * wget.texi: Add brief documentation on new --rejected-log option.
 * recur.c: Optionally log details of URLs not traversed.
   Add reject_reason enum.
   (download_child_p -> download_child): Return a reject_reason.
   (descend_redirect_p -> descend_redirect): Return a reject_reason.
   (retrieve_tree): Support logging reasons for rejection.
   Add write_reject_log_header that writes a CSV format header to a file.
   Add write_reject_log_url that writes a url struct to a file in CSV format.
   Add write_reject_log_reason that writes the URL and parent URL as well as the
   rejection reason to a CSV file.
 * Test--rejected-log.px: Add a basic test for the --rejected-log command.
 * tests/Makefile.am: Run Test--rejected-log.px.

This allows you to figure out why URLs are being rejected and some context
around it. CSV is used as the output format since it can be used easily parsed,
it's delimited by tabs instead of commas to allow using all (quoted) URL
characters and includes column names which may be used for compatibility.
2015-08-06 08:10:55 +02:00
Tim Rühsen
670eb924e7 Fix memory leak in HSTS code
* src/main.c (get_hsts_database): Free 'home' variable
2015-08-04 17:41:54 +02:00
Tim Rühsen
5d55018ce6 void uninitialized variable in metalink code
* src/metalink.c: Init retr_err with METALINK_MISSING_RESOURCE
* src/wget.h: Add enum METALINK_MISSING_RESOURCE
2015-08-04 17:24:59 +02:00
Darshit Shah
4e56a91001 Fix function name collision with OpenSSL library
* src/utils.[ch], src/http.c, src/metalink.c: Rename function
    hex_to_string() to wg_hex_to_string sine it collides with a
    similarly named function in OpenSSL Library.
2015-07-24 23:52:43 +05:30
Darshit Shah
595f219a17 Fix configure options for metalink
* configure.ac: Ensure metalink support can be properly disabled
2015-07-24 23:42:20 +05:30
Alex Henrie
b6e242cd6f Make the filename marquee a proper marquee
* src/progress.c: Start the marquee in the middle of the available space
  and do not restart it until all of the text has scrolled out of view.
2015-07-22 16:52:20 +05:30
Giuseppe Scrivano
207006ef25 NEWS: cite HSTS 2015-07-20 16:31:17 +02:00
Giuseppe Scrivano
843634db59 Fix metalink tests
testenv/Test-metalink-http.py: initialize HTTP test server
testenv/Test-metalink-xml.py: initialize HTTP test server
2015-07-20 16:29:05 +02:00
Ander Juaristi
54058d2b18 Enhancements in testsuite engine + new HSTS test.
* testenv/Makefile.am: added new test 'Test-hsts.py'.
 * testenv/Test-hsts.py: new test for HSTS.
 * testenv/conf/domains.py: new hook to override domain list.
 * testenv/test/base_test.py: (__init__): new optional parameter
   for tests 'req_protocols'.
   (get_domain_addr): set the instance variables 'addr' and 'port'.
   Return address as an array (domain, port) instead of string.
   (gen_cmd_line): take into account domain and port.
 * testenv/test/http_test.py (__init__): new optional parameter
   'req_protocols'.
   (setup): new function. Call to server_setup() decoupled from
   begin() and moved here.
   (begin): call to superclass to maintain backward compatibility.
   Removed call to server_setup().

This patch adds a new parameter to the test suite called 'req_protocols',
and a new function called 'setup'. The ability for tests to be able to set some
extra parameters such as the actual requested protocols (with 'req_protocols')
became obvious when support for HSTS was added to Wget, where the requested URI
and the actual executed URI do not have to be the same. This new parameter is optional
and if not specified, the test suite behaves as before. Also, the new function 'setup'
is provided as a means to start the test HTTP server, but not launch the test yet
(this is done when calling 'begin', as usual), in case we want to query the address
and port in which the test server listens. If 'setup' is not called, it is automatically
invoked when calling 'begin'. With these measures, we preserve backward-compatibility with
existing tests.
2015-07-20 16:06:40 +02:00
Ander Juaristi
b60131a399 Added support for HSTS.
* Makefile.am: Added new source files hsts.c and hsts.h.
 * http.c (parse_strict_transport_security): new function for STS header
   parsing.
   (gethttp): update the HSTS store.
 * http.h: new include "hsts.h".
 * init.c: new options --hsts and --hsts-file.
 * main.c (get_hsts_database, load_hsts, save_hsts): new functions.
   New options --no-hsts and --hsts-file added to help.
   (main): load and save HSTS store.
 * options.h: new variables for supporting --hsts and --hsts-file.
 * retr.c (retrieve_url): rewrite the URI according to the HSTS policy before
   entering http_loop.
 * test.c, test.h: new unit tests for HSTS.
 * utils.c, utils.h (countchars): new function.
 * wget.h: new preprocessor check.
 * hsts.c, hsts.h: new files with the HSTS engine implementation.

Added support for HTTP Strict Transport Security (HSTS), as defined by RFC
6797.
2015-07-20 15:55:57 +02:00
Giuseppe Scrivano
fc8a545bfd NEWS: cite metalink support 2015-07-20 15:50:29 +02:00
Giuseppe Scrivano
9e12b8ca39 fix compiler warnings
* src/utils.h: Include <stdlib.h>
* src/recur.c: Include "exits.h"
2015-07-20 15:37:52 +02:00
Hubert Tarasiuk
4c3043d19d Test preferred location in Metalink-over-HTTP test case.
* testenv/Test-metalink-http.py: Ensure preferred location is handled
properly.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
6064f21c66 Geolocation support for Metalink resources.
* doc/wget.text: Add information about --preferred-location.
* src/init.c: Add --preferred-location option.
* src/main.c (option_data): Handle --preferred-location argument.
(main): Sort resources based on location if requested.
* src/metalink.c (metalink_res_cmp): Compare based on location if
priority and preference are equal.
* src/options.h (options): Add preferred_location option.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
97389a7497 Support at most one file signature. Adapt comments to libmetalink 0.13.
* src/metalink.c (retrieve_from_metalink): Add comment about new
libmetalink version. Do not iterate over signatures - support just one.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
225a87d4a2 Move some Metalink-related code from http.c to metalink.c.
* src/http.c: Move find_key_value, has_key, find_key_values.
* src/metalink.c: To here.
* src/metalink.h: Make them non-static and add prototypes here.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
92a889b278 Unit test for find_key_values.
* src/http.c: Add test_find_key_values.
* src/test.c (main): Run new test.
* src/test.h: Add test_find_key_values.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
1113e78534 Unit test for has_key.
* src/http.c: Add test_has_key.
* src/test.c (main): Run new test.
* src/test.h: Add test_has_key.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
70cbd59ed6 Unit test for find_key_value.
* src/http.c: Add test_find_key_value.
* src/test.c (main): Run new test.
* src/test.h: Add test_find_key_value.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
0e7aff7623 Test case for Metalink over HTTP.
* testenv/Test-metalink-http.py: New test.
* testenv/Makefile.am: Add to test list.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
792dd09a87 Support multiple headers with same name in Python test suite.
* testenv/README: Describe how to use repeated header name.
* testenv/server/http/http_server.py (finish_headers): Send all
values from list if the header value is a Python list.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
a4f5ced797 Test case for Metalink in XML.
* testenv/Test-metalink-xml.py: New test.
* testenv/Makefile.am: Add file for automake.
2015-07-20 15:31:06 +02:00
Hubert Tarasiuk
05c30c3b1b Start HTTP test only when calling begin().
* testenv/test/http_test.py: Move self.do_test() from __init__ to
begin().
2015-07-20 15:30:39 +02:00
Hubert Tarasiuk
37b58e3976 Metalink support.
* bootstrap.conf: Add crypto/sha256
* configure.ac: Look for libmetalink and GPGME
* doc/wget.texi: Add --input-metalink and --metalink-over-http
options description.
* po/POTFILES.in: Add metalink.c
* src/Makefile.am: Add new translation unit (metalink.c)
* src/http.c (http_stat): Add metalink field.
(free_stat): Free metalink field.
(find_key_value): Find value of given key in header string.
(has_key): Check if token exists in header string.
(find_key_values): Find all key=value pairs in header string.
(metalink_from_http): Obtain Metalink metadata from HTTP response.
(gethttp): Call metalink_from_http if requested.
(http_loop): Request Metalink metadata from HTTP response if should be.
Fall back to regular download if no Metalink metadata found.
* src/init.c: Add --input-metalink and --metalink-over-http options
* src/main.c (option_data): Handle --input-metalink and
--metalink-over-http cmd arguments.
(print_help): Print --input-metalink option description.
(main): Retrieve files from Metalink file
* src/metalink.c (retrieve_from_metalink): Download files described by
metalink.
(metalink_res_cmp): Comparator for resources priority-sorting.
* src/metalink.h: Create header for metalink.c
(RES_TYPE_SUPPORTED): Define supported resources media.
(DEFAULT_PRI): Default mirror priority for Metalink over HTTP.
(VALID_PRI_RANGE): Valid priority range.
* src/options.h (options): Add input_metalink option and metalink_over_http
options.
* src/utils.c (hex_to_string): Convert binary data to ASCII-hex.
* src/utils.h (hex_to_string): Add prototype.
* src/wget.h: Add metalink-related error enums
Add METALINK_METADATA flag for document type.
2015-07-20 15:30:39 +02:00
Romain Bentz
80303366ae Add NULL value check to fix #45289
* src/recur.c (retrieve_tree): Check return value of url_parse()
2015-07-15 18:10:08 +02:00
Tim Rühsen
bd0ffcf8bc Let HTTPS tests XFAIL when no TLS support configured
* configure.ac: Export WITH_SSL for use in Makefile.am
* testenv/Makefile.am: Add HTTPS tests to XFAIL_TESTS when !WITH_SSL

Reported-by: Ander Juaristi <ajuaristi@gmx.es>
2015-07-14 07:54:03 +02:00
Tim Rühsen
25c9b462bf Change function params to const in src/iri.[ch]
* iri.h, iri.c: Added const attribute for params of parse_charsset(),
	check_encoding_name(), idn_encode(), idn_decode(),
	remote_to_utf8(), set_uri_encoding(), set_content_encoding().
2015-07-01 17:15:10 +02:00
Tim Rühsen
77f5a27e65 Work around a libidn <= 1.30 vulnerability
* src/iri.c: Add _utf8_is_valid() to check UTF-8 sequences before
  passing them to idna_to_ascii_8z().
2015-07-01 17:15:05 +02:00
Ángel González
ae58d8a78b Fix wgetrc filename creation for Windows
* init.c/wgetrc_file_name: Remove obsolete code in WINDOWS code path

Reported-by: Gisle Vanem <gvanem@yahoo.no>
2015-06-27 21:32:48 +02:00
Darshit Shah
58702ffd4f Add valgrind suppression files for HTTPS tests
* testenv/test/base_test.py: Use Valgrind SSL suppressions file for
    tests
    * testenv/valgrind-suppression-ssl, tests/valgrind-suppression-ssl:
    Add new suppression files to suppress OpenSSL errors in valgrind
    * tests/test-proxied-https-auth.px: Use the valgrind SSL
    suppressions file for the test
    * tests/test-proxied-https-auth-keepalive.px: Same
2015-06-16 20:31:00 +05:30
Darshit Shah
103f940950 contrib/check-hard: Indentation and spacing cleanup
* contrib/check-hard: Reduce the amount of text output to the
    screen. Also implement some indentation and whitespace cleanups.
2015-06-15 01:35:53 +05:30
Tim Rühsen
5f0818d9f1 Fix usage of CFLAGS in contrib/check-hard
* contrib/check-hard: Set CFLAGS per command line instead of using export.

'make distcheck' changes CFLAGS. So using ./configure -C together with
exported CFLAGS fails. Setting CFLAGS per command line works smoothly.
2015-06-14 20:30:07 +02:00
Tim Rühsen
c6ac51d5bc Move test_* function protoypes from test.c to test.h
* src/test.c: Remove test_* function prototypes, make tests_run static
* src/test.h: Add test_* function protoypes
2015-06-13 22:34:36 +02:00
Giuseppe Scrivano
fd3a3245eb NEWS: cite --if-modified-since 2015-05-23 14:54:17 +02:00
Giuseppe Scrivano
48acb6693d gnulib: update gnulib 2015-05-23 14:51:52 +02:00
Hubert Tarasiuk
885eaaa214 Include --if-modified-since option in user manual.
* doc/wget.texi: Add --if-modified-since section.
2015-05-22 11:08:30 +02:00
Hubert Tarasiuk
8a8d138dcc Support If-Modified-Since header in timestamping mode.
* src/wget.h: Add IF_MODIFIED_SINCE enum for dt. Add TIMECONV_ERR
enum to uerr_t.
* src/http.c (time_to_rfc1123): Convert time_t do http time.
* src/http.c (initialize_request): Include If-Modified-Since header
if appropriate.
* src/http.c (set_file_timestamp): Separate this code from check_file_output.
* src/http.c (check_file_output): Use set_file_timestamp.
* src/http.c (gethttp): Handle properly 304 return code and 200 if server
ignores If-Modified-Since headers.
* src/http.c (http_loop): Load filename to hstat if condget was requested,
use IF_MODIFIED_SINCE if requested and current timestamp can be obtained.
2015-05-22 11:08:30 +02:00
Hubert Tarasiuk
0e8d2d4251 Add --if-modified-since option
* src/init.c: Add to commands array.
* src/main.c: Add to cmdline_option. Add to help message.
* src/options.h: Add to options struct.
2015-05-22 11:08:30 +02:00
Hubert Tarasiuk
be4f91737a Add test for condget requests.
* testenv/Test-condget.py: the test
* testenv/Makefile.am: add to tests list
2015-05-22 11:08:30 +02:00
Hubert Tarasiuk
901bc98edf Support conditional GET in testenv server.
* src/exc/server_error.py: Add exception for GET to HEAD fallback.
* src/server/http/http_server.py: Do not send body if 304 return
code requested for a file.
2015-05-22 11:08:30 +02:00
Hubert Tarasiuk
e397a48f6a Implement timestamp support for local files in testenv
* testenv/README: Change timestamp format definition
* testenv/conf/local_files.py: Set proper timestamps
2015-05-22 11:08:30 +02:00
Pär Karlsson
83537f2415 Fix undeclared loop variable in Perl test suite
Reported-by: Hubert Tarasiuk <hubert.tarasiuk@gmail.com>
2015-05-20 10:04:07 +02:00
Ander Juaristi
8682c2612f Make sure Wget does not unescape reserved chars.
* testenv/Test-reserved-chars.py: New file.

* testenv/Makefile.am: Added new test Test-reserved-chars.py.

When following redirections, Wget should not unescape the reserved
characters that might appear in target URLs.
2015-05-12 21:24:11 +02:00
Ander Juaristi
b0820d553b Fixed incorrect handling of reserved chars.
* src/iri.c (do_conversion): Call url_unescape_except_reserved,
instead of url_unescape.

* src/url.c (url_unescape_1): New static function.
(url_unescape): Calls url_unescape_1 with mask zero. Preserves
same behavior as before. Only code changes.
(url_unescape_except_reserved): New function.

* src/url.h: Added prototype for url_unescape_except_reserved().

When the locale is US-ASCII, URIs that contain special characters
in them are converted to IRIs according to RFC 3987, section 3.2
"Converting URIs to IRIs".
2015-05-12 21:24:06 +02:00
Darshit Shah
b6b1388fb7 Fix documentation for update_speed_ring()
* progress.c (update_speed_ring): The comment for the function
    incorrectly stated that the function uses thirty samples from the
    past instead of twenty.

    Reported-By: Yi Li <lovelylich@gmail.com>
2015-05-07 11:29:07 +05:30
Darshit Shah
9b1dd6dab8 Remove shadowed variable in http.c
* http.c (gethttp): Rename err to conn_err to prevent shadowed
    variable
2015-05-04 21:45:26 +05:30
Steven Schubiger
1cc835dc5b paramcheck: use explicit quoting for here-docs
* util/paramcheck.pl: Adjust here-docs
2015-05-04 10:13:24 +05:30
Tim Ruehsen
6b8dfe1d6e Fix format specifier warning
* src/utils.c (aprintf): Use %d for int argument
2015-05-03 21:18:47 +02:00
Nikolay Merinov
0e6d6ca963 Fix timestamping and continue behaviour with ftp protocol.
* src/ftp.c (ftp_loop_internal): Add option `force_full_retrieve' that force to
retrieve full file.
(ftp_retrieve_list): Pass `true' as `force_full_retrieve' option to
`ftp_loop_internal' if we want to download file with newer timestamp than local
copy.
2015-05-01 00:28:08 +02:00
Rohit Mathulla
3765a1b266 openssl: Read cert from private key file when needed
* src/openssl.c (ssl_init): Assign opt.cert_{file, type}
  from opt.private_key(_type)
2015-04-27 19:52:18 +02:00