A Few ceph-deploy Gotchas
The ceph-deploy tool makes setting up a Ceph cluster very easy. One extremely useful feature is its ability to deploy a development branch that is automatically built for a number of platforms by gitbuilder daemons. In this post I'll describe two use cases I recently encountered that didn't behave the way I expected, and how I resolved the issues.
Disappearing Versions
Typically a development branch will be installed with ceph-deploy using a command such as:
This will install the latest version of the development branch in the Ceph Git repository. For instance, the build corresponding to the latest version of the Giant development branch is located here:
The packages stored at this URL are named using the SHA1 prefix corresponding to the latest commit:
When a new commit is made to the branch, a new build of the branch replaces the one located at the URL above. Imagine now that Ceph was installed with the version 0.85-986-g031ef05, and that after the installation a new version of the branch is built and deployed to the above URL by the gitbuilder daemons. When a separate package needs to be installed (e.g. libcephfs-jni), the following ceph-deploy command can handle it:
However, in this case the version of libcephfs-jni that is fetched corresponds to the version installed on the node. A conflict arises because that version has been replaced by the latest version. The error that one sees is an HTTP 404, because the URL computed from the old version doesn't exist anymore.
The solution I used was simply to run apt-get upgrade on the nodes and then use ceph-deploy to install the additional packages. After the upgrade, ceph-deploy computes the name of a package using the latest version of the branch.
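To check whether a node has fallen behind the repository before installing extra packages, one option is to compare the installed and candidate versions reported by apt-cache policy. The sketch below is an illustration under stated assumptions, not something ceph-deploy provides: needs_upgrade is a hypothetical helper name, and it assumes a Debian or Ubuntu node whose package lists have just been refreshed with apt-get update.

```shell
# Hypothetical helper (not part of ceph-deploy).
# Reads the output of `apt-cache policy <package>` on stdin and succeeds
# (exit status 0) when the installed version differs from the candidate,
# i.e. when the repository now carries a different build.
needs_upgrade() {
    awk '
        $1 == "Installed:" { installed = $2 }
        $1 == "Candidate:" { candidate = $2 }
        END {
            # "(none)" means the package is not installed at all.
            stale = (installed != "" && installed != "(none)" && installed != candidate)
            exit(stale ? 0 : 1)
        }
    '
}
```

On a node this could be used as apt-cache policy ceph | needs_upgrade && sudo apt-get upgrade, after which the ceph-deploy command for the additional package is retried.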
Mismatched Library Versions
I was faced with the following challenge. I had the latest Giant development branch installed, and wanted to test a bug fix that existed in a separate work-in-progress branch. This can be done with ceph-deploy by removing the current version and installing the target version. There are separate purge commands for removing configuration files and data, and we want to leave all that in place. So the following does the trick:
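A minimal sketch of that sequence, wrapped in a shell function for convenience. The function name switch_branch is hypothetical, and the sketch assumes the --dev option of ceph-deploy install is used to select the branch:

```shell
# Hypothetical wrapper (not part of ceph-deploy).
# Replaces the Ceph packages on a node with the build of another development
# branch. It deliberately avoids the purge commands, so configuration files
# and data stay in place.
switch_branch() {
    branch="$1"
    node="$2"
    # Only install the new build if the uninstall step succeeded.
    ceph-deploy uninstall "$node" &&
        ceph-deploy install --dev "$branch" "$node"
}
```

For the situation described here it would be called as switch_branch wip-9663 node1.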
After these commands node1 will be running Ceph using the new development branch. However, in this particular instance the wip-9663 branch contained a fix to the package libcephfs1, which had been installed using the separate ceph-deploy pkg command (not ceph-deploy install). The result was that tools like ceph --version reported the version of Ceph from wip-9663, but the bug wasn't fixed. I knew there was a problem with the upgrade when the stack traces produced by the bug I was trying to fix reported a SHA1 that corresponded to the old version of libcephfs1, the one that still contained the bug.
The issue is that ceph-deploy uninstall doesn't remove those other packages, nor does it (at least not loudly) report the version inconsistencies. The solution was to just remove the packages by hand using apt-get remove and re-install using ceph-deploy pkg.
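One way to catch this kind of mismatch early is to list the installed versions of the related packages and flag any that differ from the expected one. The following is a minimal sketch, not something ceph-deploy provides: mismatched_packages is a hypothetical helper, and it assumes that all packages built from the same branch carry the same version string.

```shell
# Hypothetical helper (not part of ceph-deploy).
# Reads "package version" lines on stdin, for example from
#   dpkg-query -W -f='${Package} ${Version}\n' ceph libcephfs1 libcephfs-jni
# and prints the name of every package whose version is not the expected one.
mismatched_packages() {
    expected="$1"
    # Appending "" forces a string comparison, so versions that happen to
    # look numeric are never compared as numbers.
    awk -v want="$expected" '($2 "") != (want "") { print $1 }'
}
```

Any package it prints can then be removed with apt-get remove and re-installed with ceph-deploy pkg, as described above.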