* src/recur.c: Declare variables before code
(write_reject_log_url):
Use const keyword where appropriate
Use the 'default' switch statement
Use xfree() instead of free()
Renamed variable f -> fp
(write_reject_log_reason):
Use const keyword where appropriate
Use the 'default' switch statement
Renamed variable f -> fp
Renamed variable r -> reason
* main.c: Add "--rejected-log" option.
* init.c: Add "rejectedlog" command.
* options.h: Add "rejected_log" parameter string.
* wget.texi: Add brief documentation on new --rejected-log option.
* recur.c: Optionally log details of URLs not traversed.
Add reject_reason enum.
(download_child_p -> download_child): Return a reject_reason.
(descend_redirect_p -> descend_redirect): Return a reject_reason.
(retrieve_tree): Support logging reasons for rejection.
Add write_reject_log_header that writes a CSV format header to a file.
Add write_reject_log_url that writes a url struct to a file in CSV format.
Add write_reject_log_reason that writes the URL and parent URL as well as the
rejection reason to a CSV file.
* Test--rejected-log.px: Add a basic test for the --rejected-log option.
* tests/Makefile.am: Run Test--rejected-log.px.
This allows you to figure out why URLs are being rejected, along with some
context around each rejection. CSV is used as the output format since it can be
easily parsed. It is delimited by tabs instead of commas to allow all (quoted)
URL characters, and it includes column names, which may be used for compatibility.
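As a rough illustration of the tab-delimited layout described above, here is a
minimal sketch in C. It is not the code from recur.c: the enum values, the
function names (sketch_write_header, sketch_write_row) and the three column
names are all hypothetical, and the sketch assumes the URLs passed in are
already quoted so that they cannot contain a literal tab or newline.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of rejection reasons, for illustration only. */
enum sketch_reason
{
  SKETCH_SUCCESS,
  SKETCH_BLACKLIST,
  SKETCH_ROBOTS,
  SKETCH_SPANNEDHOST
};

/* Map a reason to the text written in the first column. */
const char *
sketch_reason_name (enum sketch_reason reason)
{
  switch (reason)
    {
    case SKETCH_SUCCESS:     return "SUCCESS";
    case SKETCH_BLACKLIST:   return "BLACKLIST";
    case SKETCH_ROBOTS:      return "ROBOTS";
    case SKETCH_SPANNEDHOST: return "SPANNEDHOST";
    default:                 return "UNKNOWN";
    }
}

/* Write the column names once, separated by tabs.
   The column names here are assumptions, not the real log header. */
void
sketch_write_header (FILE *fp)
{
  assert (fp != NULL);
  fputs ("REASON\tURL\tPARENT_URL\n", fp);
}

/* Write one record: the reason, the rejected URL and its parent URL.
   Both URL strings are assumed to be quoted already. */
void
sketch_write_row (FILE *fp, enum sketch_reason reason,
                  const char *url, const char *parent)
{
  assert (fp != NULL);
  fprintf (fp, "%s\t%s\t%s\n", sketch_reason_name (reason), url, parent);
}
```

Calling sketch_write_header once and then sketch_write_row for each rejected
URL yields a file that tools able to split on tabs can read, with the first
line naming the columns.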
Fixes a reported crash and prevents multiple downloads of the
same file in case the URL is escaped in different ways.
Reported-by: Frédéric <vfrederix@gmail.com>
This commit makes many whitespace-only changes. It does not change the
behavior of the program. The only changes made are:
* Remove trailing whitespace
* Convert tabs to spaces
* Fix indentation issues in the code
* Other aesthetic changes to the formatting of comments