This is a feature complete re-implementation of the fetch based internal
downloader, with a few improvements:
* support for SSL
* gzip and deflate compression on HTTP connections
* reuses a single connection over the entire session for lower resource
usage.
Signed-off-by: Dave Reisner <d@falconindy.com>
no actual code changes here. change preprocessor logic to include
get_tempfile, get_destfile, signal handler enum, and the interrupt
handler logic when either HAVE_LIBCURL or HAVE_LIBFETCH are defined.
Signed-off-by: Dave Reisner <d@falconindy.com>
Do this in preparation for implementing similar curl based
functionality. We want the ability to test these side by side.
Signed-off-by: Dave Reisner <d@falconindy.com>
We located files in a few places but didn't check if they were files or
directories. Ensure they are actually files using stat() and S_ISREG(); this
showed itself when trying to download to the directory name itself in
FS#22645.
Signed-off-by: Dan McGee <dan@archlinux.org>
None of these warn at the normal "-Wall -Werror" level, but casts do occur
that we are fine with. Make them explicit to silence some warnings when
using "-Wconversion".
Signed-off-by: Dan McGee <dan@archlinux.org>
We use PATH_MAX everywhere by including limits.h so there is no
point in doing a check for it in a different header when dealing
with FreeBSD's libfetch.
Also, remove autoconf check for strings.h header as it is not used
anywhere.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
POSIX does not require PATH_MAX be defined when there is not actual
limit to its value. This affects HURD based systems. Work around
this by defining PATH_MAX to 4096 (as on Linux) when this is not
defined.
Also, clean up inclusions of limits.h and remove autoconf check for
this header as we do not use macro shields for its inclusion anyway.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This macro is deemed unnecessary by even the autoconf guys, so we really
don't need to use it.
Signed-off-by: Dan McGee <dan@archlinux.org>
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
I don't know what I tested in commit 3e7b90ff69, but it definitely wasn't
working as advertised. Fix the checks in the source code itself to match the
right define (HAVE_LIBFETCH), as well as make sure the configure check
defaults to looking for the library but not bailing if it could not be
found.
Signed-off-by: Dan McGee <dan@archlinux.org>
Model it after the new OpenSSL check, and have it be a bit more useful. If
you do not explicitly pass a command line option, it will be linked if
available but will not error out if it is missing. Also bump the version to
that where connection caching was introduced as we use these new features in
the codebase.
Signed-off-by: Dan McGee <dan@archlinux.org>
The casting of nread is safe as it is tested to be >0 when it is
initally assigned. It is also being implicitly cast in the fwrite
call in the line above.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
download_internal is supposed to always set pm_errno but did not in many
cases.
The most important (and tested) change is the one concerning fetchStat. This
is typically where the code will fail when the network is down for example.
Before commit d2dbb04a9a, this fetchStat call did not exist and the
same kind of errors would be encountered in the fetchXGet call that follows.
I just copied the error printing to restore the old behavior.
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
Sorry for this being such a huge patch, but I believe it is necessary for
quite a few reasons which I will attempt to explain herein. I've been
mulling this over for a while, but wasn't super happy with making the
download interface more complex. Instead, if we carefully order things in
the internal download code, we can actually make the interface simpler.
1. FS#15657 - This involves `name.db.tar.gz.part` files being left around the
filesystem, and then causing all sorts of issues when someone attempts to
rerun the operation they canceled. We need to ensure that if we resume a
download, we are resuming it on exactly the same file; if we cannot be
almost postive of that then we need to start over.
2. http://www.mail-archive.com/pacman-dev@archlinux.org/msg03536.html - Here
we have a lighttpd bug to ruin the day. If we send both a Range: header and
If-Modified-Since: header across the wire in a GET request, lighttpd doesn't
do what we want in several cases. If the file hadn't been modified, it
returns a '304 Not Modified' instead of a '206 Partial Content'. We need to
do a stat (e.g. HEAD in HTTP terms) operation here, and the proceed
accordingly based off the values we get back from it.
3. The mtime stuff was rather ugly, and relied on the called function to
write back to a passed in reference, which isn't the greatest. Instead, use
the power of the filesystem to contain this info. Every file downloaded
internally is now carefully timestamped with the remote file time. This
should allow the resume logic to work. In order to guarantee this, we need
to implement a signal handler that catches interrupts, notifies the running
code, and causes it to set the mtimes on the file. It then rethrows the
signal so the pacman signal handler (or any frontend) works as expected.
4. We did a lot of funky stuff in trying to track the DB last modified time.
It is a lot easier to just keep the downloaded DB file around and track the
time on that rather than in a funky dot file. It also kills a lot of code.
5. For GPG verification of the databases down the road, we are going to need
the DB file around for at least a short bit of time anyway, so this gets us
closer to that.
Signed-off-by: Dan McGee <dan@archlinux.org>
[Xav: fixed printf with off_t]
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
This fixes the following valgrind warning :
==26831== Syscall param rt_sigaction(act->sa_flags) points to uninitialised
byte(s)
==26831== at 0x4282547: __libc_sigaction (in /lib/libc-2.10.1.so)
==26831== by 0x403C693: download_internal (dload.c:152)
==26831== by 0x403D0E4: _alpm_download_single_file (dload.c:311)
==26831== by 0x4033B72: alpm_db_update (be_files.c:319)
==26831== by 0x805205E: pacman_sync (sync.c:257)
==26831== by 0x804EE54: main (pacman.c:1120)
==26831== Address 0xbec6cc04 is on thread 1's stack
==26831==
==26831== Syscall param rt_sigaction(act->sa_restorer) points to
uninitialised byte(s)
==26831== at 0x4282547: __libc_sigaction (in /lib/libc-2.10.1.so)
==26831== by 0x403C693: download_internal (dload.c:152)
==26831== by 0x403D0E4: _alpm_download_single_file (dload.c:311)
==26831== by 0x4033B72: alpm_db_update (be_files.c:319)
==26831== by 0x805205E: pacman_sync (sync.c:257)
==26831== by 0x804EE54: main (pacman.c:1120)
==26831== Address 0xbec6cc08 is on thread 1's stack
==26831==
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
fetchIO_read returns -1 in case of error, and the return type is
ssize_t, not size_t ! So we converted -1 to an unsigned, which led to
huge file write.
The rest is just changing the error return a bit.
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
- fix one memleak if get_filename failed
- cleanup according to Joerg's feedback:
"url_for_string: If fetchParseURL returned successful, you should always
have a scheme set. The logic for anonftp should only be needed for very
broken server -- do you know of any such?
download_internal:
Specifying 'p' is now a nop -- it is tried by default first with
fall-back to active FTP."
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
[Dan: remove from pacman.conf and pacman.conf.5]
Signed-off-by: Dan McGee <dan@archlinux.org>
libfetch supports checking mtime so we do not need to do it manually.
when the databases are already up-to-date, initiating a connection with
fetchXGet and closing it right after with fetchIO_close took a very long
time (up to 10min!) on some network.
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
(cherry picked from commit d7675e393f)
We had 10000 as our timeout value, assuming it was expressed in ms. This is
false after looking at the current code, so reset it back to 10 seconds.
Addresses FS#15369.
Signed-off-by: Dan McGee <dan@archlinux.org>
I assume the loop was never iterated more than once, because the write
location was not updated at each loop iteration (buffer instead of buffer +
nwritten), yet we never had reports of corrupted download.
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
libfetch supports checking mtime so we do not need to do it manually.
when the databases are already up-to-date, initiating a connection with
fetchXGet and closing it right after with fetchIO_close took a very long
time (up to 10min!) on some network.
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
After commit 30c4d53ce5, get_destfile and
get_tempfile are only used for internal download, so move these two
functions inside the ifdef
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
This allows a frontend to define its own download algorithm so that the
libfetch dependency can be omitted without using an external process.
The callback will be used when if it is defined, otherwise the old
behavior applies.
Signed-off-by: Sebastian Nowicki <sebnow@gmail.com>
[Dan: minor cleanups]
Signed-off-by: Dan McGee <dan@archlinux.org>
This fixes FS#14899. When running an -Sp operation without servers
configured for a repository, we would segfault, so add an assert to the
backend method returning the first server preventing a null pointer
dereference.
In addition, add a new error code to libalpm that indicates we have no
servers configured for a repository. This makes -Sy and -S <package>
operations fail gracefully and helpfully when a repo is set up with no
servers, as the default mirrorlist in Arch is provided this way.
Signed-off-by: Dan McGee <dan@archlinux.org>
Aaron said to consider libdownload a dead project so libdownload support was
removed to more easily fix libfetch one (otherwise many ifdef needed).
There was no direct replacement for ferror to detect an error while
downloading. So instead, I added a check at the end to see if the file was
fully downloaded, which is just a small chunk of code taken from here:
http://cvsweb.netbsd.org/bsdweb.cgi/pkgsrc/net/libfetch/files/fetch.c?only_with_tag=MAIN
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
May help debug issues we come across with proxy behavior (e.g. those pesky
segfaults) as well as be informative to the user when things aren't working
quite right. Addresses FS#12396.
Signed-off-by: Dan McGee <dan@archlinux.org>
We don't want a failed write to kill our whole program when we are
downloading things, so set the SIGPIPE handler to ignore when downloading
and restore any previous signal handler when we complete the download.
Signed-off-by: Dan McGee <dan@archlinux.org>
Use libfetch naming in the code in place of libdownload names. This is in
preparation for dropping support for libdownload at some point as libfetch
can run on Linux.
Signed-off-by: Dan McGee <dan@archlinux.org>
If a Server specified in pacman.conf had a trailing slash, libalpm ended up
building URLs with double slashes, and this broke libdownload with errors
like the following one :
error: failed retrieving file 'redland-1.0.8-1-i686.pkg.tar.gz'
from 192.168.0.90 : Command okay
So the public function alpm_db_set_server will make sure to remove the
trailing slash of servers. For the private function
_alpm_download_single_file, I only added a comment.
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
Before commit fc48dc31, file:/// urls forced the use of the internal
downloader (libdownload), because the default XferCommand, wget, does not
handle them. We tried to move away from forcing usage of libdownload, so
this commit implemented the handling of file:/// urls manually. However,
this implementation is way too basic. It does not handle the progress bar,
thus nothing at all appears in pacman's output when a file: repo is
synchronized, or when a file is downloaded from a sync repo. Also, it is not
able to detect when the repo is already up-to-date. When libdownload was
used, both were handled.
It seems better to just drop this implementation for now. All users who use
libdownload will get the much better file:// handling back. For the users of
XferCommand, it will be more problematic, but they have several options:
1) Switch to a downloader handling file:// (wget doesn't, but curl does for
example).
2) Drop the file:// repo, and set up light http or ftp servers instead.
Consider that going that way would make this repo available for the whole
local network, which can be useful.
3) Switch back to libdownload, which works perfectly for many users.
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
We have been using unsigned long as a file size type for a while, which
works but isn't quite correct and could easily break. Worse was probably our
use of int in the download callback functions, which could be restrictive
for packages > 2GB in size.
Switch all file size variables to use off_t, which is the preferred type for
file sizes. Note that at least on Linux, all applications compiled against
libalpm must now be sure to use large file support, where _FILE_OFFSET_BITS
is defined to be 64 or there will be some weird issues that crop up.
Signed-off-by: Dan McGee <dan@archlinux.org>
This should remove the need for any additional patching to run on platforms
that have libfetch available but not libdownload. It isn't the prettiest,
but we have kept our libdownload impact down to just a few files, so it can
be easily done.
Signed-off-by: Dan McGee <dan@archlinux.org>
Add a new --disable-internal-download flag to configure allowing the
internal download code to be skipped. This will be helpful on platforms that
currently don't support either libdownload or libfetch (such as Cygwin) and
for just compiling a lighter weight pacman binary.
This was made really easy by our recent refactoring of the download code
into separate internal and external functions, as well as some error code
cleanup.
Signed-off-by: Dan McGee <dan@archlinux.org>
In the file:// download case, we didn't free the return from get_destfile()
after we were done with it. Fix it. (Found with xfercommand001.py)
Signed-off-by: Dan McGee <dan@archlinux.org>
This should be the main step in the download refactoring initiated by commit
81a2a06818.
The stub functions introduced by that commit were implemented.
The big download code was mostly composed of two steps, and so it has been
naturally splitted in two functions : download_external and download_internal
file:/// urls are now handled manually, instead of forcing the use of the
internal downloader.
Thanks to Dan for fixing the remaining issues and cleaning up the patch :)
Signed-off-by: Chantry Xavier <shiningxc@gmail.com>
I screwed up originally when I accepted the TotalDownload patch,
8ec27835f4. I didn't realize how deeply it
modified libalpm and I probably shouldn't have let it do what it did. This
commit reverts much of what that patch added in order to clean up our
internal function calls. We can find another way to do it right down the
road here but for now it has to go.
Signed-off-by: Dan McGee <dan@archlinux.org>