stat()'s behaviour is undefined if the first argument is NULL and might
be prone to segfault. Add an additional check to skip the stat()
invocation if no destfile is used.
Signed-off-by: Lukas Fleischer <archlinux@cryptocrack.de>
Signed-off-by: Dan McGee <dan@archlinux.org>
Avoid a potential segfault that may occur if we use a temporary file and
fail to build the destination file name from the effective URL.
Signed-off-by: Lukas Fleischer <archlinux@cryptocrack.de>
Signed-off-by: Dan McGee <dan@archlinux.org>
This reverts some hacky behavior from 5fc3ec and resets the handle's
pm_errno where it should be reset -- prior to each download. This
prevents a transaction with a download from being aborted when a package
is successfully grabbed from a secondary server.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
Take this opportunity to refactor the if/then/else logic into a
switch/case which is likely going to be needed to fine tune more
exceptions in the future.
Fixes FS#25531
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This prevents possible null dereferences in FTP transfers when the
progress callback is touched during connection teardown.
http://curl.haxx.se/mail/lib-2011-08/0128.html
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
Noticed in my PowerPC Linux VM:
cc1: warnings being treated as errors
dload.c:45: error: 'get_filename' defined but not used
make[3]: *** [dload.lo] Error 1
Signed-off-by: Dan McGee <dan@archlinux.org>
We did a good job checking this in add.c, but not necessarily anywhere
else. Fix this up by adding checks into dload.c, remove.c, and conf.c in
the frontend. Also add loggers where appropriate and make the message
syntax more consistent.
Signed-off-by: Dan McGee <dan@archlinux.org>
Restore some sanity to the number of arguments passed to _alpm_download
and curl_download_internal.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
This means creating a new struct which can pass more descriptive data
from the back end sync functions to the downloader. In particular, we're
interested in the download size read from the sync DB. When the remote
server reports a size larger than this (via a content-length header),
abort the transfer.
In cases where the size is unknown, we set a hard upper limit of:
* 25MiB for a sync DB
* 16KiB for a signature
For reference, 25MiB is more than twice the size of all of the current
binary repos (with files) combined, and 16KiB is a truly gargantuan
signature.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
URLs might end with a slash and follow redirects, or could be a
generated by a script such as /getpkg.php?id=12345. In both cases, we
may have a better filename that we can write to, taken from either
content-disposition header, or the effective URL.
Specific to the first case, we write to a temporary file of the format
'alpmtmp.XXXXXX', where XXXXXX is randomized by mkstemp(3). Since this
is a randomly generated file, we cannot support resuming and the file is
unlinked in the event of an interrupt.
We also run into the possibility of changing out the filename from under
alpm on a -U operation, so callers of _alpm_download can optionally pass
a pointer to a *char to be filled in by curl_download_internal with the
actual filename we wrote to. Any sync operation will pass a NULL pointer
here, as we rely on specific names for packages from a mirror.
Fixes FS#22645.
Signed-off-by: Dave Reisner <d@falconindy.com>
This gives us more granularity than the former Never/Optional/Always
trifecta. The frontend still uses these values temporarily but that will
be changed in a future patch.
* Use 'siglevel' consistenly in method names, 'level' as variable name
* The level becomes an enum bitmask value for flexibility
* Signature check methods now return a array of status codes rather than
a simple integer success/failure value. This allows callers to
determine whether things such as an unknown signature are valid.
* Specific signature error codes mostly disappear in favor of the above
returned status code; pm_errno is now set only to PKG_INVALID_SIG or
DB_INVALID_SIG as appropriate.
Signed-off-by: Dan McGee <dan@archlinux.org>
Some of these are legit (the backup hash NULL checks), while others are
either extemely unlikely or just impossible for the static code
analysis to prove, but are worth adding anyway because they have little
overhead.
Signed-off-by: Dan McGee <dan@archlinux.org>
Only one of these looked like a real red flag, in find_requiredby(), but
it doesn't hurt to fix several of them up anyway.
Unfortunately, we can't turn this on universally due to things like the
sync(), remove(), etc. builtins which we often use as variable names.
Signed-off-by: Dan McGee <dan@archlinux.org>
Documented the _alpm_download() function in dload.c
Signed-off-by: Kerrick Staley <mail@kerrickstaley.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
We didn't do due diligence before and ensure prior pm_errno values
weren't influencing what happened in further ALPM calls. I observed one
case of early setup code setting pm_errno to PM_ERR_WRONG_ARGS and that
flag persisting the entire time we were calling library code.
Add a new CHECK_HANDLE() macro that does two things: 1) ensures the
handle variable passed to it is non-NULL and 2) clears any existing
pm_errno flag set on the handle. This macro can replace many places we
used the ASSERT(handle != NULL, ...) pattern before.
Several other other places only need a simple 'set to zero' of the
pm_errno field.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is the last user of our global handle object. Once again the diff
is large but the functional changes are not.
Signed-off-by: Dan McGee <dan@archlinux.org>
This requires a lot of line changes, but not many functional changes as
more often than not our handle variable is already available in some
fashion.
Signed-off-by: Dan McGee <dan@archlinux.org>
This will make the patching process less invasive as we start to remove
this variable from all source files.
Signed-off-by: Dan McGee <dan@archlinux.org>
The usefulness of this is rather limited due to it not being compiled
into production builds. When you do choose to see the output, it is
often overwhelming and not helpful. The best bet is to use a debugger
and/or well-placed fprintf() statements.
Signed-off-by: Dan McGee <dan@archlinux.org>
Callers to curl_download_internal now tell us if its okay to continue a
transfer, so obey this instead of using a heuristic.
Signed-off-by: Dave Reisner <d@falconindy.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
If a connection drops below 1kb/s for 10s, curl will kill the transfer
and we'll report failure. This is the average transfer speed over the
delta defined by CURLOPT_LOW_SPEED_TIME, so setting a low value here
shouldn't bother folks using 14.4k dial-up.
Signed-off-by: Dave Reisner <d@falconindy.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
This started off removing the "(void)foo" hacks to work around
unused function parameters and ended up fixing every warning
generated by -Wunused-parameter.
Dan: rename to UNUSED.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
There's a lot of related moving parts here:
* Iteration through mirrors is moved back to the calling functions. This
allows removal of _alpm_download_single_file and _alpm_download_files.
* The download function gets a few more arguments to influence behavior.
This allows several different scenarios to customize behavior:
- database
- database signature (req'd and optional)
- package
- package via direct URL
- package signature via direct URL (req'd and optional)
* For databases, we need signatures from the same mirror, so structure
the code accordingly.
Some-inspiration-from: Dave Reisner <d@falconindy.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
The allow_resume is the start of the fix to the "don't ever resume
database downloads" problem, as well as being useful for '.sig'
downloads as well. For now, we say "always allow resume", but this will
eventually get pushed down as necessary.
Error checks are reworked in order to correctly error out when a file is
not found on the remote end and reports 0 bytes downloaded. In addition,
the two error messages printed are now different as one reports a more
specific error message provided via the cURL error buffer.
Some example output from an -Sy run with [testing], [community],
[community2], [eee], and [nonexistant] defined as repos. [community2]
and [nonexistant] are both invalid, one using FTP and one using HTTP.
:: Synchronizing package databases...
testing is up to date
community is up to date
error: failed retrieving file 'community2.db' from ftp.archlinux.org : Given file does not exist
error: failed to update community2 (FTP: couldn't retrieve (RETR failed) the specified file)
eee is up to date
error: failed retrieving file 'nonexistant.db' from code.toofishes.net : The requested URL returned error: 404
error: failed to update nonexistant (HTTP response code said error)
Signed-off-by: Dan McGee <dan@archlinux.org>
This is the standard, and we have had a few of these introduced lately
that should not be here.
Done with:
find -name '*.c' | xargs sed -i -e 's#if (#if(#g'
find -name '*.c' | xargs sed -i -e 's#while (#while(#g'
Signed-off-by: Dan McGee <dan@archlinux.org>
That's a funny one, building with optimization levels (with both gcc and
clang) caused open_mode to always be set to "ab", which worked.
This was spotted both with clang-analyzer, and by Jakob who reported a
segfault as he was using an un-optimized build.
Signed-off-by: Xavier Chantry <chantry.xavier@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
This greatly simplifies the cleanup fallthrough in our download function
and we'll be able to reuse this for signatures.
Signed-off-by: Dave Reisner <d@falconindy.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
Based on the fact that localf always points to the same file, there's no
need to code in multiple fopen calls with varying results. Instead,
track the desired file open mode and make a single call to fopen.
Signed-off-by: Dave Reisner <d@falconindy.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
Create a more general function that allows appending a suffix to a
filepath.
Signed-off-by: Dave Reisner <d@falconindy.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
This lets us determine the real size of the file on disk so that we can
properly bump the progress bar when we're resuming a download.
Signed-off-by: Dave Reisner <d@falconindy.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
Our curl callback does a whole lot of work for nothing if the front end
never defined a callback to receive the data we'd calculate for it.
Signed-off-by: Dave Reisner <d@falconindy.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
CURLINFO_HTTP_CODE is deprecated in favor of CURLINFO_RESPONSE_CODE.
Both yield the same values.
Signed-off-by: Dave Reisner <d@falconindy.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
The files we transfer are generally compressed already, so this just
adds unnecessary overhead.
Signed-off-by: Dave Reisner <d@falconindy.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
Use a static variable to effectively track the initialization state of
the progress callback via the last byte amount reported as downloaded by
libcurl.
Signed-off-by: Dave Reisner <d@falconindy.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
* introduces new macro in util.h (DOUBLE_EQ) for properly comparing
floating point values
Signed-off-by: Dave Reisner <d@falconindy.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
Drawing progress bars before calling curl_easy_perform() is needless as
the curl progress callback is called with zero progress before actually
downloading the file anyways. Fixes display of "0%" progress bars when
sync'ing package databases that are already up to date.
Signed-off-by: Lukas Fleischer <archlinux@cryptocrack.de>
Signed-off-by: Dan McGee <dan@archlinux.org>
This was discussed and more or less agreed upon on the mailing list. A
huge checkin, but if we just do it and let people adjust the pain will
end soon enough. Rebasing should be relatively straighforward for anyone
that sees conflicts; just be sure you use the new return style if
possible.
The following semantic patch was used to do the change, along with some
hand-massaging in order to preserve parenthesis where appropriate:
The semantic match that finds this problem is as follows, although some
hand-massaging was done in order to keep parenthesis where appropriate:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression a;
@@
- return(a);
+ return a;
// </smpl>
A macros_file was also provided with the following content:
Additional steps taken, mainly for ASSERT() macros:
$ sed -i -e 's#return(NULL)#return NULL#' lib/libalpm/*.c
$ sed -i -e 's#return(-1)#return -1#' lib/libalpm/*.c
Signed-off-by: Dan McGee <dan@archlinux.org>
this is just some debuggery to allow pacman to operate with both fetch
and curl at the same time. use the PACMANDL variable to control which
library is used.
Signed-off-by: Dave Reisner <d@falconindy.com>
This is a feature complete re-implementation of the fetch based internal
downloader, with a few improvements:
* support for SSL
* gzip and deflate compression on HTTP connections
* reuses a single connection over the entire session for lower resource
usage.
Signed-off-by: Dave Reisner <d@falconindy.com>
no actual code changes here. change preprocessor logic to include
get_tempfile, get_destfile, signal handler enum, and the interrupt
handler logic when either HAVE_LIBCURL or HAVE_LIBFETCH are defined.
Signed-off-by: Dave Reisner <d@falconindy.com>
Do this in preparation for implementing similar curl based
functionality. We want the ability to test these side by side.
Signed-off-by: Dave Reisner <d@falconindy.com>
We located files in a few places but didn't check if they were files or
directories. Ensure they are actually files using stat() and S_ISREG(); this
showed itself when trying to download to the directory name itself in
FS#22645.
Signed-off-by: Dan McGee <dan@archlinux.org>
None of these warn at the normal "-Wall -Werror" level, but casts do occur
that we are fine with. Make them explicit to silence some warnings when
using "-Wconversion".
Signed-off-by: Dan McGee <dan@archlinux.org>
We use PATH_MAX everywhere by including limits.h so there is no
point in doing a check for it in a different header when dealing
with FreeBSD's libfetch.
Also, remove autoconf check for strings.h header as it is not used
anywhere.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
POSIX does not require PATH_MAX be defined when there is not actual
limit to its value. This affects HURD based systems. Work around
this by defining PATH_MAX to 4096 (as on Linux) when this is not
defined.
Also, clean up inclusions of limits.h and remove autoconf check for
this header as we do not use macro shields for its inclusion anyway.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This macro is deemed unnecessary by even the autoconf guys, so we really
don't need to use it.
Signed-off-by: Dan McGee <dan@archlinux.org>
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
I don't know what I tested in commit 3e7b90ff69, but it definitely wasn't
working as advertised. Fix the checks in the source code itself to match the
right define (HAVE_LIBFETCH), as well as make sure the configure check
defaults to looking for the library but not bailing if it could not be
found.
Signed-off-by: Dan McGee <dan@archlinux.org>
Model it after the new OpenSSL check, and have it be a bit more useful. If
you do not explicitly pass a command line option, it will be linked if
available but will not error out if it is missing. Also bump the version to
that where connection caching was introduced as we use these new features in
the codebase.
Signed-off-by: Dan McGee <dan@archlinux.org>
The casting of nread is safe as it is tested to be >0 when it is
initally assigned. It is also being implicitly cast in the fwrite
call in the line above.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
download_internal is supposed to always set pm_errno but did not in many
cases.
The most important (and tested) change is the one concerning fetchStat. This
is typically where the code will fail when the network is down for example.
Before commit d2dbb04a9a, this fetchStat call did not exist and the
same kind of errors would be encountered in the fetchXGet call that follows.
I just copied the error printing to restore the old behavior.
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
Sorry for this being such a huge patch, but I believe it is necessary for
quite a few reasons which I will attempt to explain herein. I've been
mulling this over for a while, but wasn't super happy with making the
download interface more complex. Instead, if we carefully order things in
the internal download code, we can actually make the interface simpler.
1. FS#15657 - This involves `name.db.tar.gz.part` files being left around the
filesystem, and then causing all sorts of issues when someone attempts to
rerun the operation they canceled. We need to ensure that if we resume a
download, we are resuming it on exactly the same file; if we cannot be
almost postive of that then we need to start over.
2. http://www.mail-archive.com/pacman-dev@archlinux.org/msg03536.html - Here
we have a lighttpd bug to ruin the day. If we send both a Range: header and
If-Modified-Since: header across the wire in a GET request, lighttpd doesn't
do what we want in several cases. If the file hadn't been modified, it
returns a '304 Not Modified' instead of a '206 Partial Content'. We need to
do a stat (e.g. HEAD in HTTP terms) operation here, and the proceed
accordingly based off the values we get back from it.
3. The mtime stuff was rather ugly, and relied on the called function to
write back to a passed in reference, which isn't the greatest. Instead, use
the power of the filesystem to contain this info. Every file downloaded
internally is now carefully timestamped with the remote file time. This
should allow the resume logic to work. In order to guarantee this, we need
to implement a signal handler that catches interrupts, notifies the running
code, and causes it to set the mtimes on the file. It then rethrows the
signal so the pacman signal handler (or any frontend) works as expected.
4. We did a lot of funky stuff in trying to track the DB last modified time.
It is a lot easier to just keep the downloaded DB file around and track the
time on that rather than in a funky dot file. It also kills a lot of code.
5. For GPG verification of the databases down the road, we are going to need
the DB file around for at least a short bit of time anyway, so this gets us
closer to that.
Signed-off-by: Dan McGee <dan@archlinux.org>
[Xav: fixed printf with off_t]
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
This fixes the following valgrind warning :
==26831== Syscall param rt_sigaction(act->sa_flags) points to uninitialised
byte(s)
==26831== at 0x4282547: __libc_sigaction (in /lib/libc-2.10.1.so)
==26831== by 0x403C693: download_internal (dload.c:152)
==26831== by 0x403D0E4: _alpm_download_single_file (dload.c:311)
==26831== by 0x4033B72: alpm_db_update (be_files.c:319)
==26831== by 0x805205E: pacman_sync (sync.c:257)
==26831== by 0x804EE54: main (pacman.c:1120)
==26831== Address 0xbec6cc04 is on thread 1's stack
==26831==
==26831== Syscall param rt_sigaction(act->sa_restorer) points to
uninitialised byte(s)
==26831== at 0x4282547: __libc_sigaction (in /lib/libc-2.10.1.so)
==26831== by 0x403C693: download_internal (dload.c:152)
==26831== by 0x403D0E4: _alpm_download_single_file (dload.c:311)
==26831== by 0x4033B72: alpm_db_update (be_files.c:319)
==26831== by 0x805205E: pacman_sync (sync.c:257)
==26831== by 0x804EE54: main (pacman.c:1120)
==26831== Address 0xbec6cc08 is on thread 1's stack
==26831==
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
fetchIO_read returns -1 in case of error, and the return type is
ssize_t, not size_t ! So we converted -1 to an unsigned, which led to
huge file write.
The rest is just changing the error return a bit.
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
- fix one memleak if get_filename failed
- cleanup according to Joerg's feedback:
"url_for_string: If fetchParseURL returned successful, you should always
have a scheme set. The logic for anonftp should only be needed for very
broken server -- do you know of any such?
download_internal:
Specifying 'p' is now a nop -- it is tried by default first with
fall-back to active FTP."
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
[Dan: remove from pacman.conf and pacman.conf.5]
Signed-off-by: Dan McGee <dan@archlinux.org>
libfetch supports checking mtime so we do not need to do it manually.
when the databases are already up-to-date, initiating a connection with
fetchXGet and closing it right after with fetchIO_close took a very long
time (up to 10min!) on some network.
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
(cherry picked from commit d7675e393f)
We had 10000 as our timeout value, assuming it was expressed in ms. This is
false after looking at the current code, so reset it back to 10 seconds.
Addresses FS#15369.
Signed-off-by: Dan McGee <dan@archlinux.org>
I assume the loop was never iterated more than once, because the write
location was not updated at each loop iteration (buffer instead of buffer +
nwritten), yet we never had reports of corrupted download.
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
libfetch supports checking mtime so we do not need to do it manually.
when the databases are already up-to-date, initiating a connection with
fetchXGet and closing it right after with fetchIO_close took a very long
time (up to 10min!) on some network.
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
After commit 30c4d53ce5, get_destfile and
get_tempfile are only used for internal download, so move these two
functions inside the ifdef
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
This allows a frontend to define its own download algorithm so that the
libfetch dependency can be omitted without using an external process.
The callback will be used when if it is defined, otherwise the old
behavior applies.
Signed-off-by: Sebastian Nowicki <sebnow@gmail.com>
[Dan: minor cleanups]
Signed-off-by: Dan McGee <dan@archlinux.org>
This fixes FS#14899. When running an -Sp operation without servers
configured for a repository, we would segfault, so add an assert to the
backend method returning the first server preventing a null pointer
dereference.
In addition, add a new error code to libalpm that indicates we have no
servers configured for a repository. This makes -Sy and -S <package>
operations fail gracefully and helpfully when a repo is set up with no
servers, as the default mirrorlist in Arch is provided this way.
Signed-off-by: Dan McGee <dan@archlinux.org>
Aaron said to consider libdownload a dead project so libdownload support was
removed to more easily fix libfetch one (otherwise many ifdef needed).
There was no direct replacement for ferror to detect an error while
downloading. So instead, I added a check at the end to see if the file was
fully downloaded, which is just a small chunk of code taken from here:
http://cvsweb.netbsd.org/bsdweb.cgi/pkgsrc/net/libfetch/files/fetch.c?only_with_tag=MAIN
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
May help debug issues we come across with proxy behavior (e.g. those pesky
segfaults) as well as be informative to the user when things aren't working
quite right. Addresses FS#12396.
Signed-off-by: Dan McGee <dan@archlinux.org>
We don't want a failed write to kill our whole program when we are
downloading things, so set the SIGPIPE handler to ignore when downloading
and restore any previous signal handler when we complete the download.
Signed-off-by: Dan McGee <dan@archlinux.org>
Use libfetch naming in the code in place of libdownload names. This is in
preparation for dropping support for libdownload at some point as libfetch
can run on Linux.
Signed-off-by: Dan McGee <dan@archlinux.org>
If a Server specified in pacman.conf had a trailing slash, libalpm ended up
building URLs with double slashes, and this broke libdownload with errors
like the following one :
error: failed retrieving file 'redland-1.0.8-1-i686.pkg.tar.gz'
from 192.168.0.90 : Command okay
So the public function alpm_db_set_server will make sure to remove the
trailing slash of servers. For the private function
_alpm_download_single_file, I only added a comment.
Signed-off-by: Xavier Chantry <shiningxc@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>