* Move is_local standalone field to status enum
* Create VALID/INVALID flag pair
* Create EXISTS/MISSING flag pair
With these additional fields, we can be more intelligent with database
loading and messages to the user. We now only warn once if a sync
database does not exist and do not continue to try to load it once we
have marked it as missing.
The reason for the flags existing in pairs is so the unknown case can be
represented. There should never be a time when both flags in the same
group are true, but if they are both false, it represents the unknown
case. Care is taken to always manipulate both flags at the same time.
Signed-off-by: Dan McGee <dan@archlinux.org>
We did this with depends way back in commit c244cfecf6 in 2007. We
can do it with these fields as well.
Of note is the inclusion of provides even though only '=' is supported-
we'll parse other things, but no guarantees are given as to behavior,
which is more or less similar to before since we only looked for the
equals sign.
Also of note is the non-inclusion of optdepends; this will likely be
resolved down the road.
The biggest benefactors of this change will be the resolving code that
formerly had to parse and reparse several of these fields; it only
happens once now at load time. This does lead to the disadvantage that
we will now always be parsing this information up front even if we never
need it in the split form, but as these are uncommon fields and our
parser is quite efficient it shouldn't be a big concern.
Signed-off-by: Dan McGee <dan@archlinux.org>
These items are never present in anything but sync databases, nor do we
even try to load them from the local database. Remvoe the indirection
meant to allow the caching layer to work since it will never do anything
anyway.
Signed-off-by: Dan McGee <dan@archlinux.org>
This adds a field in the package struct for this checksum type as well
as allowing access via the API to it. The frontend is now able to
display any read value. Note that this does not implement any use or
verification of the value internally.
Signed-off-by: Dan McGee <dan@archlinux.org>
If we are missing a local database file, we get repeated messages over
and over telling us the same thing, rather than being sane and erroring
only once. This package adds an INFRQ_ERROR level that is added to the
mask if we encounter any errors on a local_db_read() operation, and
short circuits future calls if found in the value. This fixes FS#25313.
Note that this does not make any behavior changes other than suppressing
error messages and repeated code calls to failure cases; we still have
more to do in the "local database is hosed" department.
Also make a small update to the wrong but unused flags set in
be_package; using INFRQ_ALL there was not totally correct.
Signed-off-by: Dan McGee <dan@archlinux.org>
We don't write with extra or unknown whitespace, so there is little
reason for us to trim it when reading either. This also fixes the
hopefully never encountered "paths that start or end with spaces" issue,
for which two pactests have been added. The tests also contain other
evil characters that we have encountered before and handle just fine,
but it doesn't hurt to ensure we don't break such support in the future.
Signed-off-by: Dan McGee <dan@archlinux.org>
This accomplishes quite a few things with one rather invasive change.
1. Iteration is much more performant, due to a reduction in pointer
chasing and linear item access.
2. Data structures are smaller- we no longer have the overhead of the
linked list as the file struts are now laid out consecutively in
memory.
3. Memory allocation has been massively reworked. Before, we would
allocate three different pieces of memory per file item- the list
struct, the file struct, and the copied filename. What this resulted
in was massive fragmentation of memory when loading filelists since
the memory allocator had to leave holes all over the place. The new
situation here now removes the need for any list item allocation;
allocates the file structs in contiguous memory (and reallocs as
necessary), leaving only the strings as individually allocated. Tests
using valgrind (massif) show some pretty significant memory
reductions on the worst case `pacman -Ql > /dev/null` (366387 files
on my machine):
Before:
Peak heap: 54,416,024 B
Useful heap: 36,840,692 B
Extra heap: 17,575,332 B
After:
Peak heap: 38,004,352 B
Useful heap: 28,101,347 B
Extra heap: 9,903,005 B
Several small helper methods have been introduced, including a list to
array conversion helper as well as a filelist merge sort that works
directly on arrays.
Signed-off-by: Dan McGee <dan@archlinux.org>
This allows us to capture size and mode data when building filelists
from package files. Future patches will take advantage of this newly
available information, and frontends can use it as well.
Signed-off-by: Dan McGee <dan@archlinux.org>
There is little need to expose the guts of this function even within the
library. Make it static in be_local.c, and clean up a few other things
since we know exactly where it is being called from:
* Remove unnecessary origin checks in _cache_get_*() methods- if you are
calling a cache method your package type will be correct.
* Remove sanity checks within local_db_read() itself- packages will
always have a name and version if they get this far, and the package
object will never be NULL either.
The one case calling this from outside the backend was in add.c, where
we forced a full load of a package before we duplicated it. Move this
concern elsewhere and have pkg_dup() always force a full package load
via a new force_load() function on the operations callback struct.
Signed-off-by: Dan McGee <dan@archlinux.org>
We can reorganize things a bit to not require reading a directory-only
entry first (or at all). This was noticed while working on some pactest
improvements, but should be a good step forward anyway.
Also make _alpm_splitname() a bit more generic in where it stores the
data it parses.
Signed-off-by: Dan McGee <dan@archlinux.org>
Start by converting all of our flags to a 'status' bitmask (pkgcache
status, grpcache status). Add a new 'valid' flag as well. This will let
us keep track if the database itself has been marked valid in whatever
fashion.
For local databases at the moment we ensure there are no depends files;
for sync databases we ensure the PGP signature is valid if
required/requested. The loading of the pkgcache is prohibited if the
database is invalid.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is another step toward doing both local database validation
(ensuring we don't have depends files) and sync database validation (via
signatures if present) when the database is registered.
Signed-off-by: Dan McGee <dan@archlinux.org>
This allows us to separate the name and hash elements in one place and
not scatter different parsing code all over the place, including both
the frontend and backend.
Signed-off-by: Dan McGee <dan@archlinux.org>
Added a line to the top of each of be_local.c, be_package.c, and
be_sync.c indicating their purposes.
Signed-off-by: Kerrick Staley <mail@kerrickstaley.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
This is the last user of our global handle object. Once again the diff
is large but the functional changes are not.
Signed-off-by: Dan McGee <dan@archlinux.org>
This requires a lot of line changes, but not many functional changes as
more often than not our handle variable is already available in some
fashion.
Signed-off-by: Dan McGee <dan@archlinux.org>
These new method signatures return and take handle objects to operate on
so we can move away from the idea of one global handle in the API. There
is also another important change and that deals with the setting of root
and dbpaths. These are now done at initialization time instead of using
setter methods. This allows the library to operate more safely knowing
that paths won't change underneath it.
Signed-off-by: Dan McGee <dan@archlinux.org>
This will make the patching process less invasive as we start to remove
this variable from all source files.
Signed-off-by: Dan McGee <dan@archlinux.org>
Similar to what we just did for the database; this will make it easy to
always know what handle a given package originated from.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is the first step in a long process to remove our dependence on the
global handle variable we currently share in libalpm, with the goal to
make things a bit more thread-safe and re-entrant.
Signed-off-by: Dan McGee <dan@archlinux.org>
The usefulness of this is rather limited due to it not being compiled
into production builds. When you do choose to see the output, it is
often overwhelming and not helpful. The best bet is to use a debugger
and/or well-placed fprintf() statements.
Signed-off-by: Dan McGee <dan@archlinux.org>
The switch from FUNCTION to DEBUG was ill-advised inside the local
database load. Instead, add a DEBUG level logger to both local and sync
database loads that shows the number of packages processed.
Signed-off-by: Dan McGee <dan@archlinux.org>
This started off removing the "(void)foo" hacks to work around
unused function parameters and ended up fixing every warning
generated by -Wunused-parameter.
Dan: rename to UNUSED.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This does touch a lot of things, and hopefully doesn't break things on
other platforms, but allows us to also clean up a bunch of crud that no
longer needs to be there.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is the standard, and we have had a few of these introduced lately
that should not be here.
Done with:
find -name '*.c' | xargs sed -i -e 's#if (#if(#g'
find -name '*.c' | xargs sed -i -e 's#while (#while(#g'
Signed-off-by: Dan McGee <dan@archlinux.org>
For a package to be loaded from any of our backends, these two fields
are always required upfront. Due to this fact, we don't need them to be
backend-specific operations and can just refer to the field directly.
Additionally, our static (and thus private) cache package accessors had
a NULL check on pkg before returning the relevant field. Eliminate this
since they only way they are ever called is via the packages attached
callback struct, which would have caused the NULL pointer dereference in
the first place.
Signed-off-by: Dan McGee <dan@archlinux.org>
This was discussed and more or less agreed upon on the mailing list. A
huge checkin, but if we just do it and let people adjust the pain will
end soon enough. Rebasing should be relatively straighforward for anyone
that sees conflicts; just be sure you use the new return style if
possible.
The following semantic patch was used to do the change, along with some
hand-massaging in order to preserve parenthesis where appropriate:
The semantic match that finds this problem is as follows, although some
hand-massaging was done in order to keep parenthesis where appropriate:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression a;
@@
- return(a);
+ return a;
// </smpl>
A macros_file was also provided with the following content:
Additional steps taken, mainly for ASSERT() macros:
$ sed -i -e 's#return(NULL)#return NULL#' lib/libalpm/*.c
$ sed -i -e 's#return(-1)#return -1#' lib/libalpm/*.c
Signed-off-by: Dan McGee <dan@archlinux.org>
Fixes FS#23090, a rather serious problem where the user was completely
unable to read the local database. Even if entry->d_type is available,
the given filesystem providing it may not fill the contents, in which
case we should fall back to a stat() as we did before. In this case, the
filesystem was XFS but there may be others.
Signed-off-by: Dan McGee <dan@archlinux.org>
Ensure we have a local DB version that is up to par with what we expect
before we go down any road that might modify it. This should prevent
stupid mistakes with the 3.5.X upgrade and people not running
pacman-db-upgrade after the transaction as they will need to.
Signed-off-by: Dan McGee <dan@archlinux.org>
* Use stat() and not lstat(); we don't care for the size of the symlink if
it is one, we want the size of the reference file.
* FS#22896, fix local database estimation on platforms that don't abide by
the nlink assumption for number of children.
* Fix a missing newline on an error message.
Signed-off-by: Dan McGee <dan@archlinux.org>
In sync_db_populate() and local_db_populate(), a NULL db->pkgcache is not
caught, allowing the functions to continue instead of exiting.
A later alpm_list_msort() call which uses alpm_list_nth() will thus traverse
invalid pointers in a non-existent db->pkgcache->list.
pm_errno is set to PM_ERR_MEMORY as _alpm_pkghash_create() will only return
NULL when we run out of memory / exceed max hash table size. The local/sync
db_populate() functions are also exited.
Signed-off-by: Pang Yan Han <pangyanhan@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
When reading the "desc" file in _alpm_local_db_read(), some
strings are trimmed and checked for length > 0 before their
use/duplication subsequently. They are then trimmed again
when there is no need to.
The following code snippet should illustrate it clearly:
while(fgets(line, sizeof(line), fp) &&
strlen(_alpm_strtrim(line))) {
char *linedup;
STRDUP(linedup, _alpm_strtrim(line), goto error);
info->groups = alpm_list_add(info->groups, linedup);
}
This patch removes the redundant _alpm_strtrim() calls in
_alpm_local_db_read() such as the one inside the STRDUP shown
above.
Signed-off-by: Pang Yan Han <pangyanhan@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
Read the package information for sync/local databases into a pmpkghash_t
structure.
Provide a alpm_db_get_pkgcache_list() method that returns the list from
the hash object. Most usages of alpm_db_get_pkgcache are converted to
this at this stage for ease of implementation. Review whether these are
better accessing the hash table directly at a later stage.
Signed-off-by: Allan McRae <allan@archlinux.org>
This works for both local and sync databases in slightly different ways. For
the local database, we can use the directory hard link count on the local/
folder. For sync databases, we use the archive size coupled with some
computed average per-package sizes to determine an estimate.
This is currently a dead assignment once calculated, but could be used to
set the initial size of a hash table.
Signed-off-by: Dan McGee <dan@archlinux.org>
Signed-off-by: Allan McRae <allan@archlinux.org>
Noted in FS#22697. When I factored out _alpm_parsedate() into a common
function, I didn't move the <locale.h> include properly, causing a build
failure when NLS is disabled and this header isn't automatically included
everywhere.
Signed-off-by: Dan McGee <dan@archlinux.org>
Perform the cheap struct and string setup of the local DB at handle
initialization time to match the teardown we do when releasing the handle.
If the local DB is not needed, all real initialization is done lazily after
DB paths and other things have been configured anyway.
Signed-off-by: Dan McGee <dan@archlinux.org>
Instead, go the same route we have always taken with version-release in
libalpm and treat it all as one piece of information. Makepkg is the only
script that knows about epoch as a distinct value; from there on out we will
parse out the components as necessary.
This makes the code a lot simpler as far as epoch handling goes. The
downside here is that we are tossing some compatibility to the wind;
packages using force will have to be rebuilt with an incremented epoch to
keep their special status.
Signed-off-by: Dan McGee <dan@archlinux.org>
Due to the way we funk around with package data loading, we had a condition
where the filelist got doubled up because it was loaded twice.
Packages are originally loaded with INFRQ_BASE. In an upgrade/sync, the
package is checked for file conflicts next, leaving us in an "INFRQ_BASE |
INFRQ_FILES" state. Later, when committing a single package, we have an
explicit call to _alpm_local_db_read() with INFRQ_ALL as the level. Because
the package's level did not match this, we skipped over our previous "does
the incoming level match where I'm at" shortcut, and continued to load
things again, because of a lack of fine-grained checking for each of DESC,
FILES, and INSTALL.
The end result is we loaded the filelist twice, causing our remove logic to
iterate twice over the installed files, spewing a bunch of "cannot find file
X" messages.
Fix the problem by doing a bit more bitmasking logic throughout the load
method, and also fix the sanity check at the beginning of the function- this
should *only* be used for local packages as opposed to the "not a package"
check that was there before.
A debug log message was added to upgraderemove as well to match the one
already in the normal remove codepath.
Signed-off-by: Dan McGee <dan@archlinux.org>
There is a lot of swtiching between size_t and int for alpm_list sizes
in the codebase. Start converting these to all be size_t by adjusting
the return type of alpm_list_count and fixing all additional warnings
given by -Wconversion that are generated by this change.
Dan: a few more small changes to ensure things compile, adjusting some
printf format string characters to accommodate the larger size on x86_64.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
All functions that are limited to the local translation unit are
declared static. This exposed that the _pkg_get_deltas declaration
in be_local.c was being satified by the function in packages.c which
when declared static caused linker failures.
Fixes all warnings with -Wmissing-{declarations,prototypes}.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
Whenever depends is needed from the local db, so is desc. The only
disadvantage to merging them is the additional time taken to read the
depends entries when they are not needed. As depends is in general
relatively small, the additional time taken to read it in will be
negligable. Also, merging these files will speed up local database
access due to less file seeks.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
We were including the header in a lot of places it is no longer used.
Additionally, use the correct autoconf macro for determining whether
d_type is available as a member: HAVE_STRUCT_DIRENT_D_TYPE.
Signed-off-by: Dan McGee <dan@archlinux.org>
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This will allow for better control of what was previously the 'force' option
in a PKGBUILD and transferred into the built package.
Signed-off-by: Dan McGee <dan@archlinux.org>
Remove unnecessary parsing of fields not found in local desc files.
Leave %FORCE% parsing as this likely will make an appearance in desc
files in the future.
Signed-off-by: Allan McRae <allan@archlinux.org>
The splitname function is a general utility function and so is better
suited to util.h. Rename it to _alpm_splitname to indicate it is an
internal libalpm function as was the case prior to splitting local and
sync db handling.
Signed-off-by: Allan McRae <allan@archlinux.org>
These functions are only needed by be_local and were only promoted
to db.{h,c} as part of the splitting of handling the local and sync
dbs. Move them into be_local.c and make them static again.
Signed-off-by: Allan McRae <allan@archlinux.org>
Read in list of packages for sync db from tar archive.
Breaks reading in _alpm_sync_db_read and a lot of pactests (which
is expected as they do not handle sync db in archives...).
Signed-off-by: Allan McRae <allan@archlinux.org>
Put the db_operations struct to use and completely split the handling
of the sync and local databases.
Signed-off-by: Allan McRae <allan@archlinux.org>
The file be_files.c is "split" to be_local.c and be_sync.c in order
to achieve separate handling of sync and local databases.
Some basic clean-up of functions that are only of use for local or
sync databases has been performed and some rough function renaming
in duplicated code has been performed to prevent compilation errors.
However, most of the clean-up and final separation of sync and local
db handling occurs in following patches.
Signed-off-by: Allan McRae <allan@archlinux.org>