The ceph-deploy tool makes setting up a Ceph cluster very easy. One especially useful feature is its ability to deploy a development branch, which gitbuilder daemons build automatically for a number of platforms. In this post I'll describe two cases I recently ran into that didn't behave the way I expected, and how I resolved them.

Disappearing Versions

Typically a development branch will be installed with ceph-deploy using a command such as:

ceph-deploy install --dev giant node1

This will install the latest version of the development branch in the Ceph Git repository. For instance, the build corresponding to the latest version of the Giant development branch is located here:

http://gitbuilder.ceph.com/ceph-deb-trusty-x86_64-basic/ref/giant/pool/main/c/ceph/

The packages stored at this URL are named using the SHA1 prefix corresponding to the latest commit:

ceph-common-dbg_0.85-986-g031ef05-1trusty_amd64.deb      06-Oct-2014 10:53   52M  
ceph-common_0.85-986-g031ef05-1trusty_amd64.deb          06-Oct-2014 10:53  4.6M  
ceph-dbg_0.85-986-g031ef05-1trusty_amd64.deb             06-Oct-2014 10:53   81M  
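To make the naming scheme concrete, here is a minimal shell sketch that pulls the version string and the commit prefix out of one of these file names. The helper names deb_version and version_sha1 are my own inventions for illustration and are not part of ceph-deploy. The sketch assumes the names follow the usual Debian pattern of package, version, and architecture separated by underscores, and that the "g" in front of the SHA1 prefix comes from git describe style output.

```shell
# Hypothetical helper: print the version part of a .deb file name.
# Assumes the name looks like <package>_<version>_<arch>.deb.
deb_version() {
    local file="${1##*/}"        # strip any leading directory
    local rest="${file#*_}"      # drop "<package>_"
    printf '%s\n' "${rest%%_*}"  # drop "_<arch>.deb"
}

# Hypothetical helper: print the abbreviated commit SHA1 embedded in a
# version such as 0.85-986-g031ef05-1trusty (assumes a "-g<hex>" part).
version_sha1() {
    printf '%s\n' "$1" | sed -n 's/.*-g\([0-9a-f]\{7,\}\).*/\1/p'
}
```

For example, version_sha1 "$(deb_version ceph-common_0.85-986-g031ef05-1trusty_amd64.deb)" prints 031ef05, which can then be compared against the commit at the head of the branch.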

When a new commit is made to the branch, a new build replaces the one at the URL above. Now imagine that Ceph was installed at version 0.85-986-g031ef05, and that afterwards the gitbuilder daemons build a newer version of the branch and deploy it to that URL. If a separate package (e.g. libcephfs-jni) then needs to be installed, the following ceph-deploy command would normally handle it:

ceph-deploy pkg --install libcephfs-jni node1

However, in this case ceph-deploy tries to fetch the version of libcephfs-jni that matches the version installed on the node, and that version has since been replaced by the latest build. The result is an HTTP 404 error, because the URL computed from the old version no longer exists.

My fix was simply to run apt-get upgrade on the nodes and then use ceph-deploy to install the additional packages. After the upgrade, ceph-deploy computes the package name from the latest version of the branch.
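As a rough sketch, the workaround can be wrapped in a small shell function. The function name upgrade_then_install and the RUN variable are hypothetical, introduced only for this example. It also assumes SSH access with sudo rights on the node, and that an apt-get update is needed first so that apt sees the new build; adjust it to however you normally reach your nodes.

```shell
# Sketch only: refresh a node's packages, then install an extra package.
# Set RUN=echo to print the commands instead of executing them.
upgrade_then_install() {
    local pkg="$1" node="$2"
    # Assumption: refresh the package index first so the upgrade
    # picks up the latest build of the branch.
    ${RUN:-} ssh "$node" "sudo apt-get update && sudo apt-get -y upgrade"
    # Same command as above, now resolved against the upgraded version.
    ${RUN:-} ceph-deploy pkg --install "$pkg" "$node"
}
```

Running RUN=echo upgrade_then_install libcephfs-jni node1 shows the two commands without touching the node, which is a cheap way to check the sequence before running it for real.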

Mismatched Library Versions

I was faced with the following challenge. I had the latest Giant development branch installed, and wanted to test a bug fix that existed in a separate work-in-progress branch. This can be done with ceph-deploy by removing the current version and installing the target version. There are separate purge commands for removing configuration files and data, and we want to leave all that in place. So the following does the trick:

ceph-deploy uninstall node1
ceph-deploy install --dev wip-9663 node1

After these commands, node1 will be running Ceph from the new development branch. However, in this particular instance the wip-9663 branch contained a fix to the package libcephfs1, which had been installed with the separate ceph-deploy pkg command (not ceph-deploy install). The result was that tools like ceph --version reported the version of Ceph from wip-9663, but the bug wasn't fixed. I knew there was a problem with the upgrade when the stack traces produced by the supposedly fixed bug reported a SHA1 corresponding to the old version of libcephfs1, the one that still contained the bug.

The issue is that ceph-deploy uninstall doesn't remove those other packages, nor does it report the version inconsistencies (at least not loudly). The solution was simply to remove the packages by hand with apt-get remove and reinstall them with ceph-deploy pkg.
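Since nothing reports the inconsistency for you, a quick check along the following lines can catch it before you go hunting through stack traces. This is only a sketch: find_version_mismatches is a hypothetical helper of my own, and it assumes that every Ceph package installed from the same build carries an identical version string.

```shell
# Hypothetical helper: read "package version" lines on stdin and print
# every package whose version differs from the one on the first line.
# Exits non-zero if any mismatch is found.
find_version_mismatches() {
    awk 'BEGIN { bad = 0 }
         NR == 1 { want = $2; next }
         $2 != want { print $1 " " $2 " (expected " want ")"; bad = 1 }
         END { exit bad }'
}

# Usage on a node (the package list is an assumption; adjust it to
# whatever is actually installed):
# dpkg-query -W -f='${Package} ${Version}\n' \
#     ceph ceph-common libcephfs1 libcephfs-jni | find_version_mismatches
```

Any package the check prints is a candidate for apt-get remove followed by a reinstall with ceph-deploy pkg.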