Just like we did for package name comparsions, if we add a depend name_hash
field on depend struct initialization, we can use it instead of doing a
string name comparison, saving us a lot of checks in the depcmp code.
Signed-off-by: Dan McGee <dan@archlinux.org>
Noticed when tweaking testdb, when we run _alpm_depcmp in loops and call it
seven million times, the strdup()/free() combo can add up. Remove the need
for any string duplication by some pointer manipulation and use of strncmp
instead of strcmp. Also kill the function logger and add an escape so we
don't needlessly retrieve the list of provides.
Signed-off-by: Dan McGee <dan@archlinux.org>
The old function was written in a time before we relied on it for nearly
every operation. Since then, we have switched to the archive backend and now
fast parsing is a big deal.
The former function made a per-character call to the libarchive
archive_read_data() function, which resulted in some 21 million calls in a
typical "load all sync dbs" operation. If we instead do some buffering of
our own and read the blocks directly, and then find our newlines from there,
we can cut out the multiple layers of overhead and go from archive to parsed
data much quicker.
Both users of the former function are switched over to the new signature,
made easier by the macros now in place in the sync backend parsing code.
Performance: for a `pacman -Su` (no upgrades available),
_alpm_archive_fgets() goes from being 29% of the total time to 12% The time
spent on the libarchive function being called dropped from 24% to 6%.
This pushes _alpm_pkg_find back to the title of slowest low-level function.
Signed-off-by: Dan McGee <dan@archlinux.org>
The amount of diskspace needed for a transaction can be less than
zero. Only test this against the available disk space if it is
positive, which avoids a comparison being made between signed and
unsigned types (-Wsign-compare).
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
Always declare a function with (void) rather than () when we expect
no arguements. Fixes all warnings with -Wstrict-prototypes.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This simplifies a lot of the repetative code and makes it obvious where the
tricky or different ones are (e.g. depends, dates). It also makes it
significantly easier to change the way this code works in the future.
There should be no functional change with this patch.
Signed-off-by: Dan McGee <dan@archlinux.org>
Saves a few bytes due to padding (256 -> 248 bytes), especially on x86_64,
so we get the overhead of our new hash field right back.
Signed-off-by: Dan McGee <dan@archlinux.org>
This results in huge gains to a lot of our codepaths since this is the most
frequent method of random access to packages in a list. The gains are seen
in both profiling and real life.
$ pacman -Sii zvbi
real: 0.41 sec -> 0.32 sec
strcmp: 16,669,760 calls -> 473,942 calls
_alpm_pkg_find: 52.73% -> 26.31% of time
$ pacman -Su (no upgrades found)
real: 0.40 sec -> 0.50 sec
strcmp: 19,497,226 calls -> 524,097 calls
_alpm_pkg_find: 52.36% -> 26.15% of time
There is some minor risk with this patch, but most of it should be avoided
by falling back to strcmp() if we encounter a package with a '0' hash value
(which we should not via any existing code path). We also do a strcmp once
hash values match to ensure against hash collisions. The risk left is that a
package name is modified once it was originally set, but the hash value is
left alone. That would probably result in a lot of other problems anyway.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is prepping for the addition of a hash field to each package to greatly
speed up the string comparisons we frequently do on package name in
_alpm_pkg_find.
Signed-off-by: Dan McGee <dan@archlinux.org>
Rather than error out, this is easy enough. Looks quite similar to the code
in be_local for creating the local directory.
Signed-off-by: Dan McGee <dan@archlinux.org>
Whenever depends is needed from the local db, so is desc. The only
disadvantage to merging them is the additional time taken to read the
depends entries when they are not needed. As depends is in general
relatively small, the additional time taken to read it in will be
negligable. Also, merging these files will speed up local database
access due to less file seeks.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This should hopefully address some of the concerns raised in FS#11639 with
regards to continuing after filling the disk up.
Add some more checks and passing of error conditions between our functions.
When a libarchive warning is encountered, check if it is due to lack of disk
space and if so upgrade it to an error condition. A review of other
libarchive warnings suggests that these are less critical and can remain as
informative warning messages at this stage.
Note the presence of errors after extraction of an entire package is complete.
If so, we abort the transaction to be on the safe side and keep damage to a
minimum.
Signed-off-by: Dan McGee <dan@archlinux.org>
[Allan: make ENOSPC warning into an error]
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
Rather than hiding these warnings, show them to the user as they happen.
This will prevent things such as hiding full filesystem errors (ENOSPC) from
the user as seen in FS#11639.
Signed-off-by: Dan McGee <dan@archlinux.org>
[Allan: adjust warning wording and add gettext calls]
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
Whether it be "desc", "depends", or "deltas", it really doesn't matter-
treat them all the same and have the ability to read any data from any file
in that list. This continues the work in a44c7b8956.
Signed-off-by: Dan McGee <dan@archlinux.org>
* Use our normal return() function syntax
* Rework a few things to reduce number of casts
* Fix void function argument declaration
* Add missing gettext _() call
* Remove need for seperate malloc() of statvfs/statfs structure
* Unify argument order of static functions- mountpoints now always
passed first
* Count all files that start with '.' in a package against the DB
* Rename db to db_local in check_diskspace to clarify some code
* Fix some line wrapping to respect 80 characters
Signed-off-by: Dan McGee <dan@archlinux.org>
Signed-off-by: Allan McRae <allan@archlinux.org>
Turn it into a configure-type typedef, which allows us to reduce the
amount of duplicated code and clean up some #ifdef magic in the code
itself. Adjust some of the other defined checks to look at the headers
available rather than trying to pull in the right ones based on
configure checks.
Signed-off-by: Dan McGee <dan@archlinux.org>
Signed-off-by: Allan McRae <allan@archlinux.org>
Checking disk space needed for a transaction can take a while so add
an informative progress bar.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
Disk space checking is likely to be an unnecessary bottleneck to
people with reasonable partition sizes so add a configuration option
to allow it to be disabled/enabled as wanted.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
Pull together the work of the previous commits to implement a check
for enough free space before performing an install transaction. Abort
if there is not enough free space with an appropriate pm_errno..
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
Two helper function are added to calculate the disk usage from packages
that are either currently installed on the system or from a package
archive.
Some minor approximations have been made:
1. Size for directories is not considered when removing a package from the
filesystem to avoid multiple counting across packages. Also, these are
reported to take zero size while installing.
2. Symlinks are reported to contribute zero size towards removal as
libarchive reports them to have zero size for install.
3. Package data files (.PKGINFO, .INSTALL, .CHANGELOG) are counted towards
usage on dbpath on install, but their size is not counted on package
removal.
4. No handling of extra size needed for .pacsave/.pacnew files.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
For a given file, determine which mount point it is on or will be
installed to. Take into account that we might be using an alternative
installation root.
Add additional helper function added to sort mount point list for easier
matching.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
Add a mount_point_list() function that attempts to portably obtain
a list of system mount points and a struct to hold needed mount point
information.
Abort the transaction if we are unable to determine the mount points.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
Very basic prototyping for adding functionality to check free disk
space before performing package installs.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This macro is deemed unnecessary by even the autoconf guys, so we really
don't need to use it.
Signed-off-by: Dan McGee <dan@archlinux.org>
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
We were including the header in a lot of places it is no longer used.
Additionally, use the correct autoconf macro for determining whether
d_type is available as a member: HAVE_STRUCT_DIRENT_D_TYPE.
Signed-off-by: Dan McGee <dan@archlinux.org>
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This is a delta specific definition so it makes more sense to put
it in the delta specific header file.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This will allow us to eventually combine the depends and desc entries
within the sync database.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
When a -Sk or -Uk operation induced a removal of an existing local
package, --dbonly was not in effect and the files were all removed.
Fixing this behavior was already marked as TODO in database012 pactest
------------
TODO: I honestly think the above should NOT delete the original les, it
hould upgrade the DB entry without touching anything on the file stem.
E.g. this test should be the same as:
pacman -R --dbonly dummy && pacman -U --dbonly dummy.pkg.tar.gz
------------
Signed-off-by: Xavier Chantry <chantry.xavier@gmail.com>
[Dan: small coding style touchup]
Signed-off-by: Dan McGee <dan@archlinux.org>
Either we expose all low levels function dealing with pmdepend_t
(splitdep and depfree come to mind), or we don't.
Since none of the tools use depcmp, I chose to remove it. In the future,
we might want to expose higher level functions such as
alpm_find_satisfier, or just lower level functions like splitdep and
depfree together with depcmp.
Signed-off-by: Xavier Chantry <chantry.xavier@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
This has been replaced by the more flexible alpm_find_satisfier
function, and alpm_deptest was completely unsused now.
Signed-off-by: Xavier Chantry <chantry.xavier@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
whatprovides and splitdep were removed, so depcmp alone is quite useless
now without splitdep, and deptest is not flexible enough.
Introduce a new alpm_find_satisfier which is hopefully more flexible,
this should make implementation of deptest very easy, and also help alpm
tools such as pactree.
Signed-off-by: Xavier Chantry <chantry.xavier@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
This patch is only meant for 3.4.x. It prepares the place for the future
epoch-aware release.
All force packages that get reinstalled or upgraded will get an EPOCH
entry in the local database, and thus the new pacman with epoch won't
reinstall them by mistake on the first -Su.
Signed-off-by: Xavier Chantry <chantry.xavier@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
If we have a corrupted database, a package can come through without an arch,
causing the code to blow up when making strcmp() calls. It might even be
possible with perfectly valid database entries lacking an 'arch =' line.
This behavior was seen as at least one of the problems in FS#21668.
Ensure pkgarch is not null before doing anything further.
Signed-off-by: Dan McGee <dan@archlinux.org>
$ pacman -Rd kde-meta
Remove (15): kde-meta-kdewebdev-4.5-1 [0.00 MB] kde-meta-kdeutils-4.5-1 [0.00 MB]
kde-meta-kdetoys-4.5-1 [0.00 MB] kde-meta-kdesdk-4.5-1 [0.00 MB]
kde-meta-kdeplasma-addons-4.5-1 [0.00 MB] kde-meta-kdepim-4.5-1 [0.00 MB]
kde-meta-kdenetwork-4.5-1 [0.00 MB] kde-meta-kdemultimedia-4.5-1 [0.00 MB]
kde-meta-kdegraphics-4.5-1 [0.00 MB] kde-meta-kdegames-4.5-1 [0.00 MB]
kde-meta-kdeedu-4.5-1 [0.00 MB] kde-meta-kdebase-4.5-1 [0.00 MB]
kde-meta-kdeartwork-4.5-1 [0.00 MB] kde-meta-kdeadmin-4.5-1 [0.00 MB]
kde-meta-kdeaccessibility-4.5-1 [0.00 MB]
Total Removed Size: 0.06 MB
Do you want to remove these packages? [Y/n]
( 1/15) removing kde-meta-kdewebdev [------------------------] 100%
$ it stopped here..
On one side, libalpm did not initialize the progress bar at 0 percent.
So with meta-packages that have 0 files, there was only one progress bar
call with percent == 100.
On the other side, pacman callback kept track of the last percent that
it received. When there are only meta-packages, we always received only
100, so pacman believed the progress bar needed not update. Thus only
the first package was actually displayed.
A proper fix for the callback would be to keep track of last package
name to make sure the recorded prev percent applies.
But since we now specify that both Add and Remove should at least send
percent=0 at beginning and percent=100 at the end, there is no need
for that.
Signed-off-by: Xavier Chantry <chantry.xavier@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
We still need to read force entry in epoch-aware pacman, so that when we
install an old force package, EPOCH gets written to the local db.
Signed-off-by: Xavier Chantry <chantry.xavier@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
This will allow for better control of what was previously the 'force' option
in a PKGBUILD and transferred into the built package.
Signed-off-by: Dan McGee <dan@archlinux.org>
Remove unnecessary parsing of fields not found in local desc files.
Leave %FORCE% parsing as this likely will make an appearance in desc
files in the future.
Signed-off-by: Allan McRae <allan@archlinux.org>
The splitname function is a general utility function and so is better
suited to util.h. Rename it to _alpm_splitname to indicate it is an
internal libalpm function as was the case prior to splitting local and
sync db handling.
Signed-off-by: Allan McRae <allan@archlinux.org>
These functions are only needed by be_local and were only promoted
to db.{h,c} as part of the splitting of handling the local and sync
dbs. Move them into be_local.c and make them static again.
Signed-off-by: Allan McRae <allan@archlinux.org>
Read in list of packages for sync db from tar archive.
Breaks reading in _alpm_sync_db_read and a lot of pactests (which
is expected as they do not handle sync db in archives...).
Signed-off-by: Allan McRae <allan@archlinux.org>
Put the db_operations struct to use and completely split the handling
of the sync and local databases.
Signed-off-by: Allan McRae <allan@archlinux.org>
The file be_files.c is "split" to be_local.c and be_sync.c in order
to achieve separate handling of sync and local databases.
Some basic clean-up of functions that are only of use for local or
sync databases has been performed and some rough function renaming
in duplicated code has been performed to prevent compilation errors.
However, most of the clean-up and final separation of sync and local
db handling occurs in following patches.
Signed-off-by: Allan McRae <allan@archlinux.org>
These will be needed for the handling of both local and sync database
caches, so put them in a common location.
Signed-off-by: Allan McRae <allan@archlinux.org>
Move splitname, checkdbdir, get_pkgpath into db.{h,c} as these will be
needed to parse both the local and sync databases during the initial
splitting. They will be moved out of db.{h,c} at to more appropriate
locations at a later stage.
Signed-off-by: Allan McRae <allan@archlinux.org>
It doesn't do a whole lot yet, but these type of operations will
potentially be different for the DBs we load.
Signed-off-by: Dan McGee <dan@archlinux.org>
Cache bullshit only has relevance to be_files, so move it there.
Signed-off-by: Dan McGee <dan@archlinux.org>
[Allan: BIG rebase]
Signed-off-by: Allan McRae <allan@archlinux.org>
Hopefully we've finally arrived at package handling nirvana, or at least
this commit will get us a heck of a lot closer. The former method of getting
the depends list for a package was the following:
1. call alpm_pkg_get_depends()
2. this method would check if the package came from the cache
3. if so, ensure our cache level is correct, otherwise call db_load
4. finally return the depends list
Why did this suck? Because getting the depends list from the package
shouldn't care about whether the package was loaded from a file, from the
'package cache', or some other system which we can't even use because the
damn thing is so complicated. It should just return the depends list.
So what does this commit change? It adds a pointer to a struct of function
pointers to every package for all of these 'package operations' as I've
decided to call them (I know, sounds completely straightforward, right?). So
now when we call an alpm_pkg_get-* function, we don't do any of the cache
logic or anything else there- we let the actual backend handle it by
delegating all work to the method at pkg->ops->get_depends.
Now that be_package has achieved equal status with be_files, we can treat
packages from these completely different load points differently. We know a
package loaded from a zip file will have all of its fields populated, so
we can set up all its accessor functions to be direct accessors. On the
other hand, the packages loaded from the local and sync DBs are not always
fully-loaded, so their accessor functions are routed through the same logic
as before.
Net result? More code. However, this code now make it roughly 52 times
easier to open the door to something like a read-only tar.gz database
backend.
Are you still reading? I'm impressed. Looking at the patch will probably be
clearer than this long-winded explanation.
Signed-off-by: Dan McGee <dan@archlinux.org>
[Allan: rebase and adjust]
Signed-off-by: Allan McRae <allan@archlinux.org>
Implement this seemingly simple change in package.h:
typedef enum _pmpkgfrom_t {
- PKG_FROM_CACHE = 1,
- PKG_FROM_FILE
+ PKG_FROM_FILE = 1,
+ PKG_FROM_LOCALDB,
+ PKG_FROM_SYNCDB
} pmpkgfrom_t;
which requires flushing out several assumptions from around the codebase
with regards to usage of the PKG_FROM_CACHE value. Make some changes where
required to allow the switch, and now the correct value should be set (via a
crude hack) depending on whether a package was loaded as an entry in a local
db or a sync db.
This patch underwent some big rebasing from Allan and Dan.
Signed-off-by: Dan McGee <dan@archlinux.org>
Signed-off-by: Allan McRae <allan@archlinux.org>
Move almost all of the caching related stuff into a single #define
(which should maybe even just be a static function) so we don't
duplicate logic all over the place. This also makes the code a heck of a
lot shorter and means further changes to this stuff don't have to touch
each and every getter function.
Signed-off-by: Dan McGee <dan@archlinux.org>
We weren't reading this in from our packages, thus causing us not to write
it out to our local database. Adding this now will help ease the upgrade
path for epoch later and not require reinstallation of all force packages.
Signed-off-by: Dan McGee <dan@archlinux.org>
On Linux and OS X, we can determine if an entry obtained through a readdir()
call is a directory without also having to stat it. This can save a
significant number of syscalls. The performance increase isn't dramatic, but
it could be on some platforms (e.g. Cygwin) so it shouldn't hurt to use this
unconditionally where supported.
Signed-off-by: Dan McGee <dan@archlinux.org>
I don't know what I tested in commit 3e7b90ff69, but it definitely wasn't
working as advertised. Fix the checks in the source code itself to match the
right define (HAVE_LIBFETCH), as well as make sure the configure check
defaults to looking for the library but not bailing if it could not be
found.
Signed-off-by: Dan McGee <dan@archlinux.org>
It touched up these a bit after it ran, so might as well check the changes
in so we don't have to deal with them again later.
Signed-off-by: Dan McGee <dan@archlinux.org>
Model it after the new OpenSSL check, and have it be a bit more useful. If
you do not explicitly pass a command line option, it will be linked if
available but will not error out if it is missing. Also bump the version to
that where connection caching was introduced as we use these new features in
the codebase.
Signed-off-by: Dan McGee <dan@archlinux.org>
I've noticed my Atom-powered laptop is dog-slow when doing integrity checks
on packages, and it turns out our MD5 implementation isn't near as good as
that provided by OpenSSL. Using their routines instead provided anywhere
from a 1.4x up to a 1.8x performance benefit over our built-in MD5 function.
This does not remove the MD5 code from our codebase, but it does enable
linking against OpenSSL to get their much faster implementation if it is
available on whatever platform you are using. At configure-time, we will
default to using it if it is available, but this can be easily changed by
using the `--with-openssl` or `--without-openssl` arguments to configure.
Signed-off-by: Dan McGee <dan@archlinux.org>
This gave at least a 10% improvement on a few tested platforms due to the
reduced number of read calls from files when computing the md5sum. It really
is just a precursor to another patch to come which is to use MD5 functions
that do the job a lot better than anything we can do.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is being checked in as 'pt' rather than 'pt_PT' as that is what
Transifex seems to want, and it is also the dominant choice of packages
already installed on my system when doing a count of the files located in
the /usr/share/locale translation directories.
Thanks for the new translation!
Signed-off-by: Dan McGee <dan@archlinux.org>
Fixes FS#18770, and hopefully an occasional deadlock in my frontend as well.
For simplicity it redirects all scriptlet output through SCRIPTLET_INFO, and
all callbacks in the child process have been replaced for thread-safety.
Signed-off-by: Jonathan Conder <j@skurvy.no-ip.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
The combination of tabs and spaces is annoying in any editor that
does not use a tab width of 2 spaces.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
From the fgets manpage:
fgets() reads in at most one less than size characters from stream and
stores them into the buffer pointed to by s. Reading stops after an EOF
or a newline. If a newline is read, it is stored into the buffer. A
'\0' is stored after the last character in the buffer.
This means there is no need at all to do 'size - 1' math. Remove all of that
and just use sizeof() for simplicity on the buffer we plan on reading into.
Signed-off-by: Dan McGee <dan@archlinux.org>
This will allow downloads to reuse connections if possible, which could make
big differences on perceived FTP speed as the connection won't have to be
reestablished each time. For the most part, HTTP requests wouldn't be using
keep alive anyway so this won't have an effect there.
I'm not enthused about having to do this with the library initialization,
but there isn't a much better place due to the fact that the loop over
databases occurs on the frontend and not the backend.
Signed-off-by: Dan McGee <dan@archlinux.org>
We no longer use these anywhere outside of sync.c, so do the rename and add
static to their definition to meet our coding standards.
Signed-off-by: Dan McGee <dan@archlinux.org>
As reported in FS#20221, we don't always do the right thing when installing
a group and using the --needed option. This was due to the code pulling
packages based on what was already in the transaction's add list, but
completely ignoring the fact that we may have already seen and skipped this
same package in an earlier repository.
Add a list to the private _alpm_sync_pkg() function that allows us to have
this extra information so we don't mistakenly downgrade a package when using
--needed.
Signed-off-by: Dan McGee <dan@archlinux.org>
The sync db should be stored in the sync/ folder. This cleans up
DBPath to only have local/ and sync/ directories in it.
A nice side effect is that the db are now in the right place so we
can implement directly reading from them.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
With commit 5dffef78, the repo database always has a symlink
of the form reponame.db. Use that filename and let libarchive
determine the compression type.
Signed-off-by: Allan McRae <allan@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>