This accomplishes quite a few things with one rather invasive change.
1. Iteration is much more performant, due to a reduction in pointer
chasing and linear item access.
2. Data structures are smaller- we no longer have the overhead of the
linked list as the file struts are now laid out consecutively in
memory.
3. Memory allocation has been massively reworked. Before, we would
allocate three different pieces of memory per file item- the list
struct, the file struct, and the copied filename. What this resulted
in was massive fragmentation of memory when loading filelists since
the memory allocator had to leave holes all over the place. The new
situation here now removes the need for any list item allocation;
allocates the file structs in contiguous memory (and reallocs as
necessary), leaving only the strings as individually allocated. Tests
using valgrind (massif) show some pretty significant memory
reductions on the worst case `pacman -Ql > /dev/null` (366387 files
on my machine):
Before:
Peak heap: 54,416,024 B
Useful heap: 36,840,692 B
Extra heap: 17,575,332 B
After:
Peak heap: 38,004,352 B
Useful heap: 28,101,347 B
Extra heap: 9,903,005 B
Several small helper methods have been introduced, including a list to
array conversion helper as well as a filelist merge sort that works
directly on arrays.
Signed-off-by: Dan McGee <dan@archlinux.org>
As noted by Allan, we failed pretty hard if gpgme was compiled out. With
these changes, only sign001.py fails. This can/will be fixed later once
we beef up the test suite with more signing tests anyway.
Signed-off-by: Dan McGee <dan@archlinux.org>
If we can't read the keyring, gpgme will output confusing debug
information and fail to verify the signature, so we should log some
debug information.
Signed-off-by: Florian Pritz <bluewind@xinu.at>
Signed-off-by: Dan McGee <dan@archlinux.org>
This is a wrapper function for access() which logs some debug
information and eases handling in case of split directory and filename.
Signed-off-by: Florian Pritz <bluewind@xinu.at>
Signed-off-by: Dan McGee <dan@archlinux.org>
This addresses FS#25141. We shouldn't remove every empty directory we
come across during the removal process unless it is truly not known to
any other package. This will prevent removal of essential directories
such as '/var/lock/'.
This is accomplished by first checking the empty/non-empty status of a
directory, which was previously done implicitly by calling rmdir() and
ignoring errors. We do this to avoid the next (new) check in most cases,
which is to look at all local packages to see if the to-be-removed
directory is present in another packages' filelist. If we do not find it
anywhere, then we remove it, else we keep the file around.
The pactest has been updated to test more cases, as well as finding a
flaw in the original expected to fail case- we need separate DIR and
FILE based EXIST rules.
Signed-off-by: Dan McGee <dan@archlinux.org>
This can only ever operate on the local database, and a local package at
that. Change the function signature to take a handle and package object,
add the relevant asserts, and ensure the frontend can detect the package
not found condition when finding packages to pass to this method.
Signed-off-by: Dan McGee <dan@archlinux.org>
The bulk of this commit is adding new tests to ensure the new behavior
works without disrupting old behavior. This is a relatively sane maneuver
when a package adds a conf file (e.g. '/etc/mercurial/hgrc') that was
not previously in the package, but it is placed in the backup array. In
essence, we can treat the existing file as having always been a part of
the package and do our normal compare/install as pacnew logic checks.
Signed-off-by: Dan McGee <dan@archlinux.org>
This code duplication has always been a rather clumsy casuality of
fixing some past upgrade issues. Unify the removal code across upgrade
and remove operations into a new _alpm_remove_single_package() method
wihch makes it very clear how we handle upgrade and remove differently,
via several conditionals on newpkg.
This commit highlights interesting behavior such as the fact that the
implicit removal in every package upgrade never gets transaction events
or progress callbacks.
Signed-off-by: Dan McGee <dan@archlinux.org>
Fixes "error: no previous prototype for '_alpm_raw_cmp'
[-Werror=missing-prototypes]" warnings, and also prevents someone from
getting the prototypes and functions out of sync.
Signed-off-by: Dan McGee <dan@archlinux.org>
Restore some sanity to the number of arguments passed to _alpm_download
and curl_download_internal.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
This means creating a new struct which can pass more descriptive data
from the back end sync functions to the downloader. In particular, we're
interested in the download size read from the sync DB. When the remote
server reports a size larger than this (via a content-length header),
abort the transfer.
In cases where the size is unknown, we set a hard upper limit of:
* 25MiB for a sync DB
* 16KiB for a signature
For reference, 25MiB is more than twice the size of all of the current
binary repos (with files) combined, and 16KiB is a truly gargantuan
signature.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
URLs might end with a slash and follow redirects, or could be a
generated by a script such as /getpkg.php?id=12345. In both cases, we
may have a better filename that we can write to, taken from either
content-disposition header, or the effective URL.
Specific to the first case, we write to a temporary file of the format
'alpmtmp.XXXXXX', where XXXXXX is randomized by mkstemp(3). Since this
is a randomly generated file, we cannot support resuming and the file is
unlinked in the event of an interrupt.
We also run into the possibility of changing out the filename from under
alpm on a -U operation, so callers of _alpm_download can optionally pass
a pointer to a *char to be filled in by curl_download_internal with the
actual filename we wrote to. Any sync operation will pass a NULL pointer
here, as we rely on specific names for packages from a mirror.
Fixes FS#22645.
Signed-off-by: Dave Reisner <d@falconindy.com>
The supposed safety blanket of this function is better handled by
explicit length checking and usages of strlen() on known NULL-terminated
strings rather than hoping things fit in a buffer. We also have no need
to fully fill a PATH_MAX length variable with NULLs every time as long
as a single terminating byte is there. Remove usages of it by using
strcpy() or memcpy() as appropriate, after doing length checks via
strlen().
Signed-off-by: Dan McGee <dan@archlinux.org>
We can readily detect the first node in a list by checking if
node->prev->next is NULL. So there is no need to pass the head
of the list to this function and its prototype now looks like
all the other item accessors.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
The only thing this accessor did was remove the const qualifier
given our entire list implementation requires passing around the
head anyway.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
They are placeholders, but important for things like trying to re-sync a
database missing a signature. By using the alpm_db_validity() method at
the right time, a client can take the appropriate action with these
invalid databases as necessary.
In pacman's case, we disallow just about anything that involves looking
at a sync database outside of an '-Sy' operation (although we do check
the validity immediately after). A few operations are still permitted-
'-Q' ops that don't touch sync databases as well as '-R'.
Signed-off-by: Dan McGee <dan@archlinux.org>
Show output in -Qip for each package signature, which includes the UID
string from the key ("Joe User <joe@example.com>") and the validity of
said key. Example output:
Signatures : Valid signature from "Dan McGee <dpmcgee@gmail.com>"
Unknown signature from "<Key Unknown>"
Invalid signature from "Dan McGee <dpmcgee@gmail.com>"
Also add a backend alpm_sigresult_cleanup() function since memory
allocation took place on this object, and we need some way of freeing
it.
Signed-off-by: Dan McGee <dan@archlinux.org>
The error code is in fact a bitmask value of an error code and an error
source, so use the proper function to get only the relevant bits. For
the no error case, this shouldn't ever matter, but it bit me when I was
trying to compare the error code to other values and wondered why it
wasn't working, so set a good example.
Signed-off-by: Dan McGee <dan@archlinux.org>
This gives us more granularity than the former Never/Optional/Always
trifecta. The frontend still uses these values temporarily but that will
be changed in a future patch.
* Use 'siglevel' consistenly in method names, 'level' as variable name
* The level becomes an enum bitmask value for flexibility
* Signature check methods now return a array of status codes rather than
a simple integer success/failure value. This allows callers to
determine whether things such as an unknown signature are valid.
* Specific signature error codes mostly disappear in favor of the above
returned status code; pm_errno is now set only to PKG_INVALID_SIG or
DB_INVALID_SIG as appropriate.
Signed-off-by: Dan McGee <dan@archlinux.org>
Now that the filelists capture mode and size information, we can read
the data from there and prevent having to loop through and uncompress
every archive to check required diskspace usage.
Signed-off-by: Dan McGee <dan@archlinux.org>
This allows us to capture size and mode data when building filelists
from package files. Future patches will take advantage of this newly
available information, and frontends can use it as well.
Signed-off-by: Dan McGee <dan@archlinux.org>
This saves replicating the potentially large list of files in a package
that is being removed.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
We passed in 'line', but not 'buf.line'. In addition, the macros
building off of READ_NEXT() assume variable names anyway. Since we only
use these macros in one function, might as well simplify them.
Signed-off-by: Dan McGee <dan@archlinux.org>
Change the check into a loop over all signatures present and returned by
GPGME. Also modify the return values and checks slightly now that I know
a little bit more about what type of values are returned.
Signed-off-by: Dan McGee <dan@archlinux.org>
There is little need to expose the guts of this function even within the
library. Make it static in be_local.c, and clean up a few other things
since we know exactly where it is being called from:
* Remove unnecessary origin checks in _cache_get_*() methods- if you are
calling a cache method your package type will be correct.
* Remove sanity checks within local_db_read() itself- packages will
always have a name and version if they get this far, and the package
object will never be NULL either.
The one case calling this from outside the backend was in add.c, where
we forced a full load of a package before we duplicated it. Move this
concern elsewhere and have pkg_dup() always force a full package load
via a new force_load() function on the operations callback struct.
Signed-off-by: Dan McGee <dan@archlinux.org>
Some of these are legit (the backup hash NULL checks), while others are
either extemely unlikely or just impossible for the static code
analysis to prove, but are worth adding anyway because they have little
overhead.
Signed-off-by: Dan McGee <dan@archlinux.org>
Modifying prefix caused tmp directories to be left behind after
running scriptlets, and the path '/' to be passed to _alpm_rmrf. Broken
in f01c6f.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This avoids, probably among other things, leaving the lock file in place
after a SIGINT'd sync DB update.
Fixes regression introduced in 4f8ae2b.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
The following function renames take place for the same reasoning as
the previous commit:
_alpm_grp_new -> _alpm_group_new
_alpm_grp_free -> _alpm_group_free
_alpm_db_free_grpcache -> _alpm_db_free_groupcache
_alpm_db_get_grpfromcache -> _alpm_db_get_groupfromcache
Signed-off-by: Allan McRae <allan@archlinux.org>
Using grp instead of group is a small saving at the cost of clarity.
Rename the following functions:
alpm_option_get_ignoregrps -> alpm_option_get_ignoregroups
alpm_option_add_ignoregrp -> alpm_option_add_ignoregroup
alpm_option_set_ignoregrps -> alpm_option_set_ignoregroups
alpm_option_remove_ignoregrp -> alpm_option_remove_ignoregroup
alpm_db_readgrp -> alpm_db_readgroup
alpm_db_get_grpcache -> alpm_db_get_groupcache
alpm_find_grp_pkgs -> alpm_find_group_pkgs
Signed-off-by: Allan McRae <allan@archlinux.org>
Only one of these looked like a real red flag, in find_requiredby(), but
it doesn't hurt to fix several of them up anyway.
Unfortunately, we can't turn this on universally due to things like the
sync(), remove(), etc. builtins which we often use as variable names.
Signed-off-by: Dan McGee <dan@archlinux.org>
We have just looped through the list of files, so might as well get
the count as we go.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This addresses FS#24904. In a normal upgrade case, this replacement
seems to work just fine. However, when doing a sync "replace" type
upgrade, we weren't properly handling this edge case due to path
comparison not ignoring trailing slashes. Fix this by pruning any
trailing slashes past a certain point of file conflict resolution where
we no longer need them, which allows us to safely detect cases such as
now tested in the new pactest.
Signed-off-by: Dan McGee <dan@archlinux.org>
While researching the root cause of FS#24904, I couldn't help but clean
up some of the cruft in here. A few whitespace/line-wrapping issues, but
also fix shadowed variables and add some const where applicable.
Signed-off-by: Dan McGee <dan@archlinux.org>
We can reorganize things a bit to not require reading a directory-only
entry first (or at all). This was noticed while working on some pactest
improvements, but should be a good step forward anyway.
Also make _alpm_splitname() a bit more generic in where it stores the
data it parses.
Signed-off-by: Dan McGee <dan@archlinux.org>
Discovered this when doing some pactest rewrite work to generate
archives in memory only. If a sync database file or PKGINFO file is
missing a newline on the final line, the text from that line gets tossed
aside and never read into the package struct. This is pretty critical
when that last line is a depend or something.
Signed-off-by: Dan McGee <dan@archlinux.org>
These operate on the handle, and the state is stored on the handle, so
move them where they belong. Up until now only the transaction stuff
calls them, but this will soon change and alpm_db_update() will handle
locking all on its own.
Signed-off-by: Dan McGee <dan@archlinux.org>
Start by converting all of our flags to a 'status' bitmask (pkgcache
status, grpcache status). Add a new 'valid' flag as well. This will let
us keep track if the database itself has been marked valid in whatever
fashion.
For local databases at the moment we ensure there are no depends files;
for sync databases we ensure the PGP signature is valid if
required/requested. The loading of the pkgcache is prohibited if the
database is invalid.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is another step toward doing both local database validation
(ensuring we don't have depends files) and sync database validation (via
signatures if present) when the database is registered.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is the ideal place to do it as all clients should be checking the
return value and ensuring there are no errors. This is similar to
pkg_load().
We also add an additional step of validation after we download a new
database; a subsequent '-y' operation can potentially invalidate the
original check at registration time.
Note that this implementation is still a bit naive; if a signature is
invalid it is currently impossible to refresh and re-download the file
without manually deleting it first. Similarly, if one downloads a
database and the check fails, the database object is still there and can
be used. These shortcomings will be addressed in a future commit.
Signed-off-by: Dan McGee <dan@archlinux.org>
For the files count when loading from a package, we can keep a counter.
The two in the frontend were completely useless due to the fact that if
sync_dbs is non-NULL, alpm_list_count() will always be greater than 0.
Signed-off-by: Dan McGee <dan@archlinux.org>
This doesn't fix the real (bigger) problem of failing to parse sync
databases without directory entries, but it does prevent the parser from
segfaulting when the first desc file encountered did not have a
directory entry, among other conditions.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is the first step at separating the pacman message catalog and the
scripts message catalog. Makefiles, configure.ac, and other such files
are adjusted accordingly, as well as renaming files. The TEXTDOMAIN of
scripts is also adjusted.
Note that no actual pot or po files get changed here; these will get
pruned in a future commit so each catalog contains only the necessary
messages.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is for the eventual 4.0.0 release, but more importantly to
logically separate new translations and strings from the PO split about
to happen between pacman and scripts.
Signed-off-by: Dan McGee <dan@archlinux.org>
This allows us to separate the name and hash elements in one place and
not scatter different parsing code all over the place, including both
the frontend and backend.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is an unfortunate chain of events. RET_ERR and RET_ERR_VOID will
eventually call CHECK_HANDLE, which resets the handle's pm_errno member.
Dan probably had a reason for doing this, so we merely switch the order
of operations in the RET_ERR macros to avoid stomping on our pm_errno.
Signed-off-by: Dave Reisner <d@falconindy.com>
* Check the return value of canonicalize_path() for non-NULL
* Use ASSERT and RET_ERR as appropriate
* Make remove_cachedir() use same path munge logic as add_cachedir()
Signed-off-by: Dan McGee <dan@archlinux.org>
Documented the _alpm_download() function in dload.c
Signed-off-by: Kerrick Staley <mail@kerrickstaley.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
Added a line to the top of each of be_local.c, be_package.c, and
be_sync.c indicating their purposes.
Signed-off-by: Kerrick Staley <mail@kerrickstaley.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
We were using copy_data before; this works for the struct itself but not
the strings contained within. Fix it up by duplicating all the data as
we do with our other structures.
Signed-off-by: Dan McGee <dan@archlinux.org>
Calling get_logcb() here would reset any previous setting of
handle->pm_errno due to the CHECK_HANDLE() macro contained within. This
would make error setting a bit funny if one set pm_errno before calling
_alpm_log(), such as in the RET_ERR() macro.
Signed-off-by: Dan McGee <dan@archlinux.org>
This removes the need to write accessor methods for every type we have,
and simplifies the API. Any type that doesn't need magic* can be
converted in this fashion to make it easier for frontend applications to
use, as well as make it less of a pain to introduce new such structs in
the future.
* "magic" meaning something like pmpkg_t where values can be lazy loaded.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is more in line with reality and what we have our makepkg, etc.
options named anyway.
Original-patch-by: Kerrick Staley <mail@kerrickstaley.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
* Don't name static methods with a gpgme_ prefix to avoid confusion with
methods provided by the library. These are static and local to our
file so just give them sane non-prefixed names.
* Rework sigsum_test_bit() to not require assignment.
* Don't balk if there is more than one signature available (for now,
only check the first).
* Fix error codes in publicly visible methods to return -1, not 0, if pkg
or db are not provided.
Signed-off-by: Dan McGee <dan@archlinux.org>
We didn't do due diligence before and ensure prior pm_errno values
weren't influencing what happened in further ALPM calls. I observed one
case of early setup code setting pm_errno to PM_ERR_WRONG_ARGS and that
flag persisting the entire time we were calling library code.
Add a new CHECK_HANDLE() macro that does two things: 1) ensures the
handle variable passed to it is non-NULL and 2) clears any existing
pm_errno flag set on the handle. This macro can replace many places we
used the ASSERT(handle != NULL, ...) pattern before.
Several other other places only need a simple 'set to zero' of the
pm_errno field.
Signed-off-by: Dan McGee <dan@archlinux.org>
* Move several variables into better scope
* const-ify a few variables
* Avoid duplicating filelists if it is unnecessary
* Better handling out out of memory condition when adding file conflicts
to our list
Signed-off-by: Dan McGee <dan@archlinux.org>
This allows callers to retrieve it from wherever is convenient, which
may or may not be on the package object itself.
Signed-off-by: Dan McGee <dan@archlinux.org>
This method is old, it doesn't adequately check for a NULL server list,
and can easily be done using better API method we provide these days.
All former users of this method can get similar results by calling
alpm_db_get_servers() and using the data from the returned server list.
Signed-off-by: Dan McGee <dan@archlinux.org>
Note that is a bit different than the normal _alpm_db_path() method; the
caller is expected to free the result.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is the last user of our global handle object. Once again the diff
is large but the functional changes are not.
Signed-off-by: Dan McGee <dan@archlinux.org>
This makes these functions consistent with the rest of the transaction
related API calls. We do an additional assert to ensure the handle
attached to the package is the same as the handle passed in.
Signed-off-by: Dan McGee <dan@archlinux.org>
The siglevel field of a newly created pmdb_t struct is now
initialized when it is created in _alpm_db_new().
Signed-off-by: Kerrick Staley <mail@kerrickstaley.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
A few of these snuck in as of late, some from the table display patches
that were using the previous format before we changed it after the 3.5.X
major release.
Noticed-by: Kerrick Staley <mail@kerrickstaley.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
Commit e68f5d9a30 did something a bit silly and changed the
scriptlet calls to use 'newpkg->handle' rather than the 'handle'
argument passed in. Use the handle directly.
Signed-off-by: Dan McGee <dan@archlinux.org>
Begin enforcing the need to pass a handle. This allows us to remove one
more extern handle declaration from the backend.
Signed-off-by: Dan McGee <dan@archlinux.org>
This requires a lot of line changes, but not many functional changes as
more often than not our handle variable is already available in some
fashion.
Signed-off-by: Dan McGee <dan@archlinux.org>
The few remaining instances were utilized for buffers in calls to
snprintf() and realpath(). Both of these functions will always ensure
the returned value is padded with '\0', so there is no need for the
extra byte.
Signed-off-by: Dan McGee <dan@archlinux.org>
The vast majority of the time we will just be passing the same string
value on to the lstat() call. The only time we need to duplicate it is
if the path ends in '/'. In one run using a profiler, only 400 of the
200,000 calls (0.2%) required the string to be copied first.
Signed-off-by: Dan McGee <dan@archlinux.org>
Due to the way we set up the graph structure, we don't always have good
parent information. The changes made in dd8cf0c12d assumed this, so
back them out and just live with the dead pointers being there in the
memory while we are cleaning up after ourselves.
Signed-off-by: Dan McGee <dan@archlinux.org>
These new method signatures return and take handle objects to operate on
so we can move away from the idea of one global handle in the API. There
is also another important change and that deals with the setting of root
and dbpaths. These are now done at initialization time instead of using
setter methods. This allows the library to operate more safely knowing
that paths won't change underneath it.
Signed-off-by: Dan McGee <dan@archlinux.org>
This keeps duplicate code to a minimum. This will come in more handy as
we refactor some of these option setters away.
Signed-off-by: Dan McGee <dan@archlinux.org>