Pete Zaitcev (zaitcev) wrote,

Ceph needs a build revolution

I've been poking at this Ceph thing since June, or 9 months now, and I feel like I'm overdue for a good rant. Today's topic: Ceph's build system is absolutely insufferable. It's unbelievably fragile, and people break it at least once a week. Then it takes a week to fix, or even worse: it breaks only for me and not for others, and then it stays broken, because I cannot possibly wade deep enough into this swamp to fix it.

Here's today's post-10.0.3 trunk:

[zaitcev@lembas ceph-tip]$ sh autogen.sh 
..................
[zaitcev@lembas ceph-tip]$ ./configure --prefix=$(pwd)/build --with-radosgw
..................
[zaitcev@lembas ceph-tip]$ make -j3
..................
  CXX      common/PluginRegistry.lo
  CXXLD    libcommon_crc.la
ar: `u' modifier ignored since `D' is the default (see `U')
ar: common/.libs/libcommon_crc_la-crc32c_intel_fast_asm.o: No such file or directory
Makefile:13137: recipe for target 'libcommon_crc.la' failed
[zaitcev@lembas ceph-tip]$ find . -name '*crc32c_intel*'
./src/common/.deps/libcommon_crc_la-crc32c_intel_fast_asm.Plo
./src/common/.deps/libcommon_crc_la-crc32c_intel_fast_zero_asm.Plo
./src/common/.deps/libcommon_crc_la-crc32c_intel_baseline.Plo
./src/common/.deps/libcommon_crc_la-crc32c_intel_fast.Plo
./src/common/.libs/libcommon_crc_la-crc32c_intel_baseline.o
./src/common/.libs/libcommon_crc_la-crc32c_intel_fast.o
./src/common/crc32c_intel_baseline.c
./src/common/crc32c_intel_baseline.h
./src/common/crc32c_intel_fast.c
./src/common/crc32c_intel_fast.h
./src/common/crc32c_intel_fast_asm.S
./src/common/crc32c_intel_fast_zero_asm.S
./src/common/libcommon_crc_la-crc32c_intel_fast.lo
./src/common/libcommon_crc_la-crc32c_intel_fast_asm.lo
./src/common/libcommon_crc_la-crc32c_intel_baseline.o
./src/common/libcommon_crc_la-crc32c_intel_fast.o
./src/common/libcommon_crc_la-crc32c_intel_fast_zero_asm.lo
./src/common/libcommon_crc_la-crc32c_intel_baseline.lo
[zaitcev@lembas ceph-tip]$ 
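
The find output above shows .lo files for the two assembly sources with no matching objects under .libs, which is exactly what ar trips over. A minimal sketch of a check for that mismatch follows. It assumes the usual libtool layout seen in the listing, where dir/foo.lo goes with dir/.libs/foo.o. The function name find_orphan_lo is made up for illustration and is not part of Ceph or libtool, and a static-only build would probably need a different check.

```shell
# Hypothetical helper (not part of Ceph or libtool): list libtool .lo files
# that have no matching object under the sibling .libs/ directory.
# Assumption: the layout seen in the listing, dir/foo.lo <-> dir/.libs/foo.o
find_orphan_lo() {
    root="${1:-.}"
    find "$root" -type f -name '*.lo' | sort | while IFS= read -r lo; do
        dir=$(dirname "$lo")
        base=$(basename "$lo" .lo)
        # Print the .lo path only when the expected object is absent.
        if [ ! -f "$dir/.libs/$base.o" ]; then
            printf '%s\n' "$lo"
        fi
    done
}
```

Running find_orphan_lo src/common on a tree in the state shown above should list the .lo files whose objects are missing. It only narrows down where to look. It does not explain why the objects were never produced.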

Sometimes these things fix themselves after a fresh clone/autogen.sh/configure/make. But doing that every time is ruled out by how long Ceph takes to build. It literally takes many hours (depending on whether you use autotools or CMake, and how parallel your build is). I bought a 4-core laptop with 16 GB and an SSD just for that. $1,200 later, I only have to wait 4 hours. Yay, I can build Ceph 2 times in 1 day.

The situation is completely insane, and it has remained so for all the months I've spent working on this. The worst part is that I don't understand how people even deal with this without killing themselves. If you look at the pull requests, obviously a large number of developers manage to build this thing somehow... unless all of them post untested patches all the time.

UPDATE: Waiting a bit and making a fresh clone let the build complete, but then:

..................
make[1]: Nothing to be done for 'all'.
make[1]: Leaving directory '/q/zaitcev/ceph/ceph-tip/selinux'
[zaitcev@lembas ceph-tip]$ echo $?
0
[zaitcev@lembas ceph-tip]$ ./src/vstart.sh -n -d -r -i 192.168.132.2
ls: cannot access compressor/*/: No such file or directory
** going verbose **
./src/vstart.sh: line 374: ./init-ceph: No such file or directory
[zaitcev@lembas ceph-tip]$

We are about to freeze Jewel with this codebase.
