This reduces the number of functions we call by log(n) in this function,
and the inlined version is trivial and barely increases the size of the
function.
Signed-off-by: Dan McGee <dan@archlinux.org>
This reduces the number of regcomp() calls when parsing delta entries in
the database from once per entry to once for the entire context handle
by storing the compiled regex data on the handle itself. Just as we do
with the cURL handle, we initialize it the first time it is needed and
free it when releasing the handle.
A few other small tweaks to the parsing function also take place,
including using the stack to store the transient and short file size
string while parsing it.
When parsing a sync database with 1378 delta entries, this reduces the
time of a `pacman -Sl deltas` operation by 50% from 0.22s to 0.12s.
Signed-off-by: Dan McGee <dan@archlinux.org>
In commit 4c5e7af32f, we changed this code to use the regex gathered
substrings. However, we failed to correctly store the delta file name
(leaking memory), as well as freeing the temporary string used to hold
the file size string.
Signed-off-by: Dan McGee <dan@archlinux.org>
More than likely the compiler will do the three operation breakdown we
had here before (2 shifts + subtraction), but let the compiler do the
optimizations and make the actual operation more obvious. This actually
slightly shrinks the function binary size, likely due to instruction
reordering or something.
Signed-off-by: Dan McGee <dan@archlinux.org>
This eliminates the need for strtrim() usage completely, instead relying
on the fact that the only allowed delimiter between key and value is the
" = " string.
Signed-off-by: Dan McGee <dan@archlinux.org>
Extract the actual unlinking of files into a new method, which
eliminates a goto used for flow control. Also fix up a few small issues
in the code:
* Unnecessary (unsigned long) cast, use '%zd' instead
* Total up errors returned from unlink_file calls and return to caller
* Be consistent with scriptlets- we run pre_remove on dbonly, so we
should also run post_remove. Both can be disabled by way of the
--noscriptlet argument.
* Don't pass an invalid pointer to oldpkg to the event callbacks;
instead call the callback before we free the object.
Signed-off-by: Dan McGee <dan@archlinux.org>
Used in alpm_compute_md5sum() and alpm_compute_sha256sum().
Signed-off-by: Diogo Sousa <diogogsousa@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
Ensures that config.h is always ordered correctly (first) in the
includes. Also means that new source files get this for free without
having to remember to add it.
We opt for -imacros over -include as its more portable, and the
added constraint by -imacros doesn't bother us for config.h.
This also touches the HACKING file to remove the explicit mention of
config.h as part of the includes.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
Mostly a waste of time. Sure, we no longer make sure your pacman
database partition has enough space, but if you are using this option
you better know what you are doing anyway.
Signed-off-by: Dan McGee <dan@archlinux.org>
It is quite easy to hoist this potentially repeated computation out of
the loop; even if we don't end up using it, it is super cheap to do it
only once.
Signed-off-by: Dan McGee <dan@archlinux.org>
The return should probably be checked to ensure its not longer than
PATH_MAX, but I have no idea what the correct behavior is when that
happens.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
As per HACKING file, we use 'CTRL(' rather than 'CTRL ('
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
Since we know the length of the line, we can use this all the way
through and do a cheaper operation than strdup() by just invoking malloc
and memcpy directly.
Signed-off-by: Dan McGee <dan@archlinux.org>
We were using the size of a pointer, not the size of the whole
archive_read_buffer struct. Thanks to Clang/LLVM 3.0 and Allan/Dave in
IRC for finding this one.
Signed-off-by: Dan McGee <dan@archlinux.org>
We had a 16 KiB limit on database signatures, we should do the same here
too to have a slight sanity check, even if we can't do so for the
package itself yet.
Signed-off-by: Dan McGee <dan@archlinux.org>
We do this in several of the package duplication steps; add a helper
function for doing so to reduce some of the repetitive code.
Also add a free_deplist function for our repeated depend list free calls
of both the data and the list.
Signed-off-by: Dan McGee <dan@archlinux.org>
Made existing documentation more consistent and added
documentation where there was none. One function still
needs documentation and is marked with 'TODO'.
Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
This moves the common setup code of about 5 different callers into one
method. Error messages will now be common and shared in all places;
several paths did not have any messages at all before.
In addition, we now pick an ideal block size for the archive read based
off the larger value of our default buffer size or the st.st_blksize
field. For a filesystem such as NFS, this is often much larger than the
default 8192- values such as 32768 and 131072 are common.
Signed-off-by: Dan McGee <dan@archlinux.org>
When doing a bare -U operation on a local package that doesn't pull in
any dependencies from the sync databases, we can get away with missing
database files. This makes the check conditional on no sync targets
found in the target list. This is not the prettiest code here so we have
a bit of hackish behavior required to straighten both the behavior and
the nonsensical error message out.
Addresses FS#26899.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is consistent with the other enums and structs, and should be
slightly more readable.
Signed-off-by: Jonathan Conder <jonno.conder@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
Bump the version, update the translation template files, and fill in
NEWS with relevant commits and changes since 4.0.0.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is work originally provided by Sascha Kruse on FS#20360 with only
minor adjustments to the implementation. It's been expanded to cover:
NoUpgrade, NoExtract, IgnorePkg, IgnoreGroup.
Adds tests ignore008, sync139, sync502, and sync503.
Also satisfies FS#18988.
Original-work-by: Sascha Kruse <knopwob@googlemail.com>
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
This is a simple change that allows comparions to be more in line with
how other checks are done. It will be necessary for ensuing patchwork
that implements fnmatch for comparing and assumes a specific argument
ordering.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
The point of this early compare to NULL byte check was so we could bail
early and skip the strcmp() call. Given we weren't doing the check
right, this never exited early. Fix it to work as intended.
Noticed-by: Pepe Juárez <trulustapa@gmail.com>
Signed-off-by: Dan McGee <dan@archlinux.org>
This is a trivial operation that doesn't require calling a function over
and over- just do some math and indexing into a character array.
Signed-off-by: Dan McGee <dan@archlinux.org>
This gives us a bit more control and over the archive reading process,
and a bit less is done behind the scenes. It also allows us to use
fstat() in preference to stat(), which should avoid some potential race
conditions.
Some reorganization is necessary to move the stat calls after the open()
calls. Error handling and cleanup in general is also improved, as we had
several potential memory and file handle leaks before in some error
paths.
Signed-off-by: Dan McGee <dan@archlinux.org>
This removes an unnecessary level of buffering. We are not doing
line-based I/O here, so we can read in blocks of 8K at a time directly
from the file.
Signed-off-by: Dan McGee <dan@archlinux.org>
Replacing the strdup when after the first NULL check assures that we get
continue with payload->remote_name defined.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Dan: fix mask calculation, add it to the success/fail block instead.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This takes the place of three previously used constants:
ARCHIVE_DEFAULT_BYTES_PER_BLOCK, BUFFER_SIZE, and CPBUFSIZE.
In libarchive 3.0, the first constant will be no more, so we can ensure
we are forward-compatible by removing our usage of it now. The rest are
unified for consistency.
By default, we will use the value of BUFSIZ provided by <stdio.h>, which
is 8192 on Linux. If that is undefined, a default value is provided.
Signed-off-by: Dan McGee <dan@archlinux.org>
There aretwo seperate issues in the same block of file conflict
checking code here:
1) If realpath errored, such as when a symlink was broken, we would call
'continue' rather than simply exit this particular method of
resolution. This was likely just a copy-paste mistake as the previous
resolving steps all use loops where continue makes sense. Refactor
the check so we only proceed if realpath is successful, and continue
with the rest of the checks either way.
2) The real problem this code was trying to solve was canonicalizing
path component (e.g., directory) symlinks. The final component, if
not a directory, should not be handled at all in this loop. Add a
!S_ISLNK() condition to the loop so we only call this for real files.
There are few other small cleanups to the debug messages that I made
while debugging this problem- we don't need to keep printing the file
name, and ensure every block that sets resolved_conflict to true prints
a debug message so we know how it was resolved.
This fixes the expected failures from symlink010.py and symlink011.py,
while still ensuring the fix for fileconflict007.py works.
Signed-off-by: Dan McGee <dan@archlinux.org>
There is some pecular behavior going on here when a package is loaded
that has no files, as is very common in our test suite. When we enter
the realloc/sort code, a package without files will call the following:
files = realloc(NULL, 0);
One would assume this is a no-op, returning a NULL pointer, but that is
not the case and valgrind later reports we are leaking memory. Fix the
whole thing by skipping the reallocation and sort steps if the pointer
is NULL, as we have nothing to do.
Note that the package still gets marked as 'files loaded', becuase
although there were none, we tried and were successful.
Signed-off-by: Dan McGee <dan@archlinux.org>
First, use fstat() in preference to stat() since we already have an open
file handle. This also removes the need to check for a symlink as that
is not possible when a file is opened.
Next, use archive_entry_mode() rather than archive_entry_stat() as we
only use the mode portion of the stat struct and the call is much
cheaper. Also delay it until it is necessary.
Signed-off-by: Dan McGee <dan@archlinux.org>
Extend the return values of compute_download_size to allow callers to
know that a .part file exists for the package.
This extra value isn't currently used, but it'll be needed later on.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This adds a logger to the CURLE_OK case so we can always know the return
code if it was >= 400, and debug log it regardless. Also adjust another
logger to use the cURL error message directly, as well as use fstat()
when we have an open file handle rather than stat().
Signed-off-by: Dan McGee <dan@archlinux.org>
Create a new static function called 'download_single_file' which
iterates over the servers for each payload.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This is done by both the delta and regular file code, so we can extract
a little helper method. Done mostly to satisfy my "why are we repeating
code here" itch.
Signed-off-by: Dan McGee <dan@archlinux.org>
Break out the logic of finding payloads into a separate static function
to avoid nesting mayhem. After gathering all the records, download them
all at once.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
The absolutely terrible part about this is the failure on GPGME's part
to distinguish between "key not found" and "keyserver timeout". Instead,
it returns the same silly GPG_ERR_EOF in both cases (why isn't
GPG_ERR_TIMEOUT being used?), leaving us helpless to tell them apart.
Spit out a generic enough error message that covers both cases;
unfortunately we can't provide much guidance to the user because we
aren't sure what actually happened.
Signed-off-by: Dan McGee <dan@archlinux.org>
This logic is reused in both diskspace and downloadspace check
functions, so pull it out into its own static method.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This function determines if the given cachedir has at least the given
amount of free space on it. This will be later used in the sync code to
preemptively halt downloads.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
We can take a large shortcut here that saves us a lot of time,
especially when upgrading packages with lots of directories. Obviously
iterating the full file list of every single package to determine if
this directory was present in any other package can take quite some time
on a system with many packages installed. We don't need to remove a
directory at all if we are upgrading a package and the version we are
moving to still had the directory.
Also make a small optimization on the package comparsion- we really only
care about equality here, not the result of the compare, so we can
shortcut using our name_hash.
What kind of benefit does this give us? Oh, only a reduction from 295.7
million to 1.4 million strcmp() calls (99.5% fewer) during a
`pacman -S linux libreoffice-common` operation.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is in the realm of "probably not going to happen", but if someone
were to translate "disk" to a string longer than 256 characters, we
would have a smashed/corrupted stack due to our unchecked strcpy() call.
Rework the function to always length-check the value we copy into the
hostname buffer, and do it with memcpy rather than the more cumbersome
and unnecessary snprintf.
Finally, move the magic 256 value into a constant and pass it into the
function which is going to get inlined anyway.
Signed-off-by: Dan McGee <dan@archlinux.org>
This one is pretty darn useless. Just derefence the ->data attribute
since the type is public anyway and save yourself the function call.
Signed-off-by: Dan McGee <dan@archlinux.org>
This will be useful when extending disk space checks to free space
checking before we download package files.
Signed-off-by: Dan McGee <dan@archlinux.org>
Instead of iterating character by character, use memchr() calls to
hopefully speed up the search. A newline is the most likely culprit, so
search for that first followed by a NULL byte if there was no newline in
the buffer.
Signed-off-by: Dan McGee <dan@archlinux.org>
In the default configuration, we can enter the signing code but still
have nothing to do with GPGME- for example, if database signatures are
optional but none are present. Delay initialization of GPGME until we
know there is a signature file present or we were passed base64-encoded
data.
This also makes debugging with valgrind a lot easier as you don't have
to deal with all the GPGME error noise because their code leaks like a
sieve.
Signed-off-by: Dan McGee <dan@archlinux.org>
If you need zero-filled allocations, call CALLOC() instead.
This was from the original definition of these macros in commit
cc754bc6e3be0f3; hopefully our code is in the shape it needs to be to
switch this behavior.
Signed-off-by: Dan McGee <dan@archlinux.org>
These backup-related paths in package extraction are used on relatively
few files during the install process, so bump them off the stack and
into the heap. This removes the artificial PATH_MAX limitation on their
length as well.
Signed-off-by: Dan McGee <dan@archlinux.org>
This will always be a 64-bit signed integer rather than the variable length
time_t type. Dates beyond 2038 should be fully supported in the library; the
frontend still lags behind because 32-bit platforms provide no localtime64()
or equivalent function to convert from an epoch value to a broken down time
structure.
Signed-off-by: Dan McGee <dan@archlinux.org>
This prepares the function to handle values past year 2038. The return type
is still limited to 32-bits on 32-bit systems; this will be adjusted in a
future patch.
Signed-off-by: Dan McGee <dan@archlinux.org>
We have a few incomplete translations, but these should be addressable
before the 4.0.1 maint release that is surely not that far in the
future.
Signed-off-by: Dan McGee <dan@archlinux.org>
Similar to what we did in edd9ed6a, disconnect the relationship with our
stack allocated error buffer from the curl handle. Just as an FTP
connection might have some network chatter on teardown causing the
progress callback to be triggered, we might also hit an error condition
that causes curl to write to our (now out of scope) error buffer.
I'm unable to reproduce FS#26327, but I have a suspicion that this
should fix it.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
Expose the current static get_pkgpath() function internally to the rest
of the library as _alpm_local_db_pkgpath(). This allows use of this
convenience function in add.c and remove.c when forming the path to the
scriptlet location.
Signed-off-by: Dan McGee <dan@archlinux.org>
Add an is_archive parameter to reduce the amount of black magic going
on. Rework to use fewer PATH_MAX sized local variables, and simplify
some of the logic where appropriate in both this function and in the
callers where duplicate calls can be replaced by some conditional
parameter code.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is a poor place for it, and it will likely move again in the
future, but it's better to have it here than as a static variable.
Initialization of this variable is now no longer necessary as its
zeroed on creation of the payload struct.
Signed-off-by: Dave Reisner <dreisner@archlinux.org>
Signed-off-by: Dan McGee <dan@archlinux.org>
This was done to squash a memory leak in the sync database download
code. When we downloaded a database and then reused the payload struct,
we could find ourselves calling get_fullpath() for the signatures and
overwriting non-freed values we had left over from the database
download.
Refactor the payload_free function into a payload_reset function that we
can call that does NOT free the payload itself, so we can reuse payload
structs. This also allows us to move the payload to the stack in some
call paths, relieving us of the need to alloc space.
Signed-off-by: Dan McGee <dan@archlinux.org>
Rather than always initializing it on any handle creation. There are
several frontend operations (search, info, etc.) that never need the
download code, so spending time initializing this every single time is a
bit silly. This makes it a bit more like the GPGME code init path.
Signed-off-by: Dan McGee <dan@archlinux.org>
Rather than free them right away, keep the list on the transaction as
we already do with add and remove lists. This is necessary because we
may be manipulating pointers the frontend needs to refer to packages,
and we are breaking our contract as stated in the alpm_add_pkg()
documentation of only freeing packages at the end of a transaction.
This fixes an issue found when refactoring the package list display
code.
Signed-off-by: Dan McGee <dan@archlinux.org>
This is definitely not in the normal hot path, so we can afford to do
some temporary heap allocation here.
Signed-off-by: Dan McGee <dan@archlinux.org>
No wonder these were slower than expected. We were only reading 4
(32-bit) or 8 (64-bit) bytes at a time and feeding it to the hash
functions. Define a buffer size constant and use it correctly so we feed
8K at a time into the hashing algorithm.
This cut one larger `-Sw --noconfirm` operation, with nothing to
actually download so only timing integrity, from 3.3s to 1.7s.
This has been broken since the original commit eba521913d introducing
OpenSSL usage for crypto hash functions. Boy do I feel stupid.
Signed-off-by: Dan McGee <dan@archlinux.org>
In the sync code, we explicitly allocated a string for this field, while
in the dload code itself it was filled in with a pointer to another
string. This led to a memory leak in the sync download case.
Make remote_name non-const and always explicitly allocate it. This patch
ensures this as well as uses malloc + snprintf (rather than calloc) in
several codepaths, and eliminates the only use of PATH_MAX in the
download code.
Signed-off-by: Dan McGee <dan@archlinux.org>
This commit was made with the intent of displaying "correctly" sorted
package lists to users. Here are some reasons I think this is incorrect:
* It is done in the wrong place. If a frontend application wants to show
a different order of packages dependent on locale, it should do that
on its own.
* Even if one wants a locale-specific order, almost all package names
are all ASCII and language agnostic, so this different comparison
makes little sense and may serve only to confuse people.
* _alpm_pkg_cmp was unlike any other comparator function. None of the
rest had any dependency on anything but the content of the structs
being compared (e.g., they only used strcmp() or other basic
comparison operators).
This reverts commit 3e4d2c3aa6.
Signed-off-by: Dan McGee <dan@archlinux.org>
There was only one simple to handle case where we left a field
uninitialized; set it to NULL and use malloc() instead.
Signed-off-by: Dan McGee <dan@archlinux.org>