Wednesday, June 27, 2007

Overcoming a power failure in the middle of an upgrade

Upgrading a distribution invariably introduces new challenges. In this case, the challenge came from taking the absurd risk of upgrading in summer, the season of power failures. Our area in Chandigarh does not get many cuts, but still...

About half the packages had been upgraded when the power failed, and the UPS batteries did not last long enough. I restarted the upgrade and, to my relief, found that the machine still booted and let me continue upgrading the distribution. It upgraded only the packages that had not already been done.

The unpleasant discovery came after the upgrade had completed successfully and I decided to apply the available updates. There were file conflicts because many of the FC6 packages were still installed.

The net helped me understand that this was a consequence of yum crashing in the middle of a transaction. However, manually fixing about 600 packages would have been a pain. So I enjoyed myself and wrote a Python script to clean up the mess.
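
Separately from the cleanup script, here is a minimal sketch of what an interrupted transaction leaves behind: the same package name and architecture listed more than once. It assumes input lines in the form printed by `rpm -qa --queryformat '%{NAME} %{VERSION} %{RELEASE} %{ARCH}\n'`. The function name `find_duplicates` and the sample lines are hypothetical, made up for illustration.

```python
from collections import defaultdict

def find_duplicates(lines):
    """Group 'name version release arch' lines by (name, arch).

    Returns a dict holding only the (name, arch) pairs that occur
    more than once, mapped to their (version, release) entries.
    """
    groups = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        name, version, release, arch = parts
        groups[(name, arch)].append((version, release))
    return {key: vals for key, vals in groups.items() if len(vals) > 1}

# Hypothetical sample data, not taken from a real system
sample = [
    "examplepkg 1.0 1.fc6 i386",
    "examplepkg 1.2 3.fc7 i386",
    "otherpkg 2.0 5.fc7 i386",
]

for key, entries in find_duplicates(sample).items():
    print(key, entries)
```

This only reads text, so it is safe to run as an ordinary user before deciding what to erase.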

The script follows, in case anyone else ever needs it.

# Power failure during FC7 upgrade results in duplicate entries in rpmdb
# This program will create a file 'deleteList.txt'
# containing the duplicate rpms, which may be deleted using
# rpm -e `cat deleteList.txt`
# Depending upon the number of packages to be deleted, it can take time.
# Anil Seth, Jun 2007.

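import rpm

# NEW_DISTRIBUTION, REL_SUFFIX and ARCHS are used below, but their
# definitions do not appear in this listing. Define them to match
# your system before running the script.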

def chk_dups(pkgs, arch):
    """ Find the duplicates for a given architecture
    by looking at the distribution (4th element in list - index 3)
    or by checking the suffix of release (2nd element in list).
    In case the above strategy does not find a new package,
    select the one with the highest version (1st element in list).
    Returns 2 lists - new packages and remaining packages.
    We expect, but do not require, one item in each list.
    Returns None if there are no duplicates.
    """
    def is_new(x):
        return x[3] == NEW_DISTRIBUTION or REL_SUFFIX in x[1]

    # List comprehensions instead of filter(), so that len() also works on Python 3
    dup_pkgs = [x for x in pkgs if x[2] == arch]
    if len(dup_pkgs) > 1:
        newPkg = [x for x in dup_pkgs if is_new(x)]
        restPkg = [x for x in dup_pkgs if not is_new(x)]
        if len(newPkg) == 0:
            # Note: versions are compared as plain strings here
            max_version = max([x[0] for x in dup_pkgs])
            newPkg = [x for x in dup_pkgs if x[0] == max_version]
            restPkg = [x for x in dup_pkgs if x[0] != max_version]
        return newPkg, restPkg
    return None

def delete_duplicates(ts, dups):
    """ Convert the items in dups into package names suitable for erasing
    and write them to the file 'deleteList.txt'.
    We could use ts.addErase(rpmname), ts.check(), ts.order() and so on to
    delete the packages through the program. Hence, ts is being passed as a parameter.
    dups maps (name, arch) to a pair of lists, of which the second is the one for deletion.
    """
    # The line opening the file is missing from the listing; the file name
    # comes from the header comment.
    with open('deleteList.txt', 'w') as f:
        for name in dups:
            # 'pkg' rather than 'rpm', to avoid shadowing the rpm module
            for pkg in dups[name][1]:
                rpmname = name[0] + '-' + pkg[0] + '-' + pkg[1] + '.' + pkg[2]
                f.write(rpmname + '\n')
    print('Now as root, run\nrpm -e `cat deleteList.txt`')

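def main():
    """ Iterate over the rpm database, creating a dictionary
    with name as the key and, as the value, a list of attr entries,
    each of which is a list of package attributes.
    Duplicates need to be checked for each architecture separately.
    Hence, we create a dictionary with (name, arch) pair as the key.
    The values are the two lists returned by chk_dups.
    The list of new packages is returned as well, in case
    we want to verify the installation of these packages. Not being done.
    """
    ts = rpm.TransactionSet()
    # Several lines are missing from the listing here; the iterator, attr and
    # the else branch below are reconstructed from the docstrings.
    mi = ts.dbMatch()  # iterate over all installed package headers
    packages = {}
    for hdr in mi:
        name = hdr['name']
        # Order must match the indexes used in chk_dups
        attr = [hdr['version'], hdr['release'], hdr['arch'], hdr['distribution']]
        if name in packages:
            packages[name].append(attr)
        else:
            packages[name] = [attr]

    duplicates = {}
    for name in packages:
        for arch in ARCHS:
            dups = chk_dups(packages[name], arch)
            if dups:
                duplicates[(name, arch)] = dups

    # Reconstructed: the listing ends before the results are written out
    delete_duplicates(ts, duplicates)

if __name__ == '__main__':
    main()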
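
The docstring of delete_duplicates mentions that the packages could be erased from within the program instead of through a file. Here is a minimal sketch of that alternative, assuming the rpm-python transaction set methods addErase, check and order named in that docstring. The helper name `queue_erasures` is hypothetical. The sketch deliberately stops before running the transaction and has not been tried against a real rpm database.

```python
def queue_erasures(ts, rpmnames):
    """Queue the given package names for erasure on transaction set ts.

    Returns the unresolved dependencies reported by ts.check()
    (an empty list if there are none). Nothing is removed here,
    because the transaction is never run.
    """
    for rpmname in rpmnames:
        ts.addErase(rpmname)
    problems = ts.check() or []
    if not problems:
        # Only order the transaction when the dependency check is clean
        ts.order()
    return list(problems)

# Intended use (as root, with rpm-python installed); an assumption, not tested:
#   import rpm
#   ts = rpm.TransactionSet()
#   with open('deleteList.txt') as f:
#       unresolved = queue_erasures(ts, [line.strip() for line in f if line.strip()])
```

Writing the names to a file first, as the script does, has the advantage that the list can be reviewed by hand before anything is erased.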
