Sunday, August 30, 2009

Debian squeeze in a Solaris BrandZ zone!

I wanted to run some Debian squeeze programs on my OpenSolaris machine in a branded zone. After successfully installing it by following a guide I found online, I noticed that lots of commands fail because the functions they rely on aren't implemented!

unconfigured:~# touch foo
unconfigured:~# rm foo
rm: cannot remove `foo': Function not implemented

Oh noes! How come rm isn't implemented? That's pretty basic, right?

unconfigured:~# strace rm foo 2>&1 | grep -i "not implemented"
unlinkat(AT_FDCWD, "foo", 0) = -1 ENOSYS (Function not implemented)
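
If you'd rather poke at the syscall directly than go through rm and strace, something like the following should do. It's only a minimal sketch: probe_unlinkat is a name I made up for illustration, and it assumes your libc exposes unlinkat() and AT_FDCWD through the usual POSIX.1-2008 headers.

```c
#define _POSIX_C_SOURCE 200809L /* ask libc to expose unlinkat() and AT_FDCWD */
#include <errno.h>
#include <fcntl.h>   /* AT_FDCWD */
#include <stdio.h>
#include <string.h>
#include <unistd.h>  /* unlinkat() */

/*
 * probe_unlinkat is a hypothetical helper, not part of any library.
 * It removes `path` with the same call rm made in the strace output
 * and returns 0 on success or the errno value on failure.
 */
int probe_unlinkat(const char *path)
{
    if (unlinkat(AT_FDCWD, path, 0) == 0)
        return 0;

    int err = errno; /* save errno before calling anything else */
    if (err == ENOSYS)
        fprintf(stderr, "unlinkat(%s): syscall not implemented\n", path);
    else
        fprintf(stderr, "unlinkat(%s): %s\n", path, strerror(err));
    return err;
}
```

If the diagnosis is right, calling probe_unlinkat("foo") inside an unpatched zone should come back with ENOSYS, while on a real Linux box it just removes the file.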

Ah, what's this? unlinkat? I bet lx-brand doesn't implement this syscall since it's so new, or maybe there's some switch to enable it? But how does this BrandZ voodoo magic work anyway? It turns out the emulation itself is done in userland by a single library (/usr/lib/lx_brand.so.1). After some digging I came across {"unlinkat", NULL, NOSYS_NULL, 0} in the code, which pretty much confirms that unlinkat and the other *at functions aren't implemented at all!
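
To show why an entry like {"unlinkat", NULL, NOSYS_NULL, 0} ends up as "Function not implemented", here is a rough sketch of a table-driven dispatcher. It is not the real lx_brand code: every sketch_/SKETCH_ name is invented, the field meanings are my guess from the entry's shape, and the lookup goes by name only to keep the example short.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Everything prefixed sketch_/SKETCH_ is hypothetical, for illustration only. */
#define SKETCH_NOSYS_NULL 1 /* stand-in for the NOSYS_NULL marker */

typedef long (*sketch_handler_t)(long a0, long a1, long a2);

struct sketch_sysent {
    const char       *name;    /* syscall name */
    sketch_handler_t  handler; /* NULL means nobody has written it yet */
    int               flags;
    int               nargs;
};

/* A trivially "implemented" syscall so the table has one working entry. */
static long sketch_noop(long a0, long a1, long a2)
{
    (void)a0; (void)a1; (void)a2;
    return 0;
}

static const struct sketch_sysent sketch_table[] = {
    { "noop",     sketch_noop, 0,                 0 },
    { "unlinkat", NULL,        SKETCH_NOSYS_NULL, 0 },
};

/* Look up a syscall and run it, or fail with -ENOSYS if it has no handler. */
long sketch_dispatch(const char *name, long a0, long a1, long a2)
{
    size_t n = sizeof(sketch_table) / sizeof(sketch_table[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(sketch_table[i].name, name) != 0)
            continue;
        if (sketch_table[i].handler == NULL)
            return -ENOSYS; /* what the Linux program sees as ENOSYS */
        return sketch_table[i].handler(a0, a1, a2);
    }
    return -ENOSYS; /* unknown syscalls fail the same way */
}
```

In this sketch, "implementing" a syscall just means writing a handler and swapping it in for the NULL.
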
Well damn, this sucks. Maybe I could recompile the squeeze libraries and programs so that they don't use these syscalls and thus work with BrandZ, but that sounds tedious and not very fun... so clearly it's time to hack some new syscalls into lx-brand!
But how the hell do you compile onnv-gate code anyway? It seems to use a very odd build system designed for build robots rather than humans! Luckily there are other crazy people on the internet who have already figured it out: http://www.pagetable.com/?p=44 is a very nice tutorial on how to do it, and it works great. Just go to usr/src/lib/brand/lx/lx_brand instead of usr/src/uts!

So, a few hours of hacking and reading POSIX standards later, I now have a /usr/lib/lx_brand.so.1 which implements openat(), mkdirat(), mknodat(), fchownat(), futimesat(), fstatat64(), unlinkat(), renameat(), linkat(), symlinkat(), readlinkat(), fchmodat() and faccessat(), resulting in a working Debian squeeze userland. There are still a few issues, like the debconf flock issue mentioned in the etch-zone tutorial, and some syscalls still fail but seem to be innocuous...
Also, I'm sure the syscalls I implemented aren't fully compatible with the proper Linux versions, and I'm sure there are lots of error conditions that will freak out user-mode programs... but for now it works well enough to run Debian squeeze, which was the goal!
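
For a feel of what emulating one of these calls involves, here is a minimal sketch of unlinkat(). This is not the code from my patch, just the general idea: translate the Linux-side constants into whatever the host's <fcntl.h> uses, call the host's own unlinkat(), and hand back a negative errno. The LX_ and sketch_ names are made up for this example. The Linux values (-100 and 0x200) are the ones from the Linux headers. It also assumes host and Linux errno numbers line up, which a real implementation shouldn't take for granted.

```c
#define _POSIX_C_SOURCE 200809L /* expose unlinkat(), AT_FDCWD, AT_REMOVEDIR */
#include <errno.h>
#include <fcntl.h>   /* host AT_FDCWD, AT_REMOVEDIR */
#include <unistd.h>  /* host unlinkat() */

/* Linux-side values; the LX_ names are just labels for this sketch. */
#define LX_AT_FDCWD     (-100)
#define LX_AT_REMOVEDIR 0x200

/* Map Linux's "current directory" marker to the host's; pass real fds through. */
int sketch_lx_dirfd(int lx_fd)
{
    return (lx_fd == LX_AT_FDCWD) ? AT_FDCWD : lx_fd;
}

/* Translate Linux unlinkat flags to host flags; reject anything unknown. */
int sketch_lx_unlinkat_flags(int lx_flags, int *host_flags)
{
    if (lx_flags & ~LX_AT_REMOVEDIR)
        return -EINVAL;
    *host_flags = (lx_flags & LX_AT_REMOVEDIR) ? AT_REMOVEDIR : 0;
    return 0;
}

/* Emulated unlinkat: returns 0 on success or a negative errno value. */
int sketch_lx_unlinkat(int lx_fd, const char *path, int lx_flags)
{
    int host_flags;
    int err = sketch_lx_unlinkat_flags(lx_flags, &host_flags);

    if (err != 0)
        return err;
    if (unlinkat(sketch_lx_dirfd(lx_fd), path, host_flags) != 0)
        return -errno; /* assumes errno numbers match on both sides */
    return 0;
}
```

On a Linux host the translation is the identity, so the sketch is easy to try out. It only earns its keep where the host's constants differ.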

If you're interested in trying out what I've done, you can download my patch and a pre-built library (built against the b118 onnv-gate source, since that's what I happened to be running at the time) from my website. If you do try it out, please let me know how it goes! Be warned that it's super experimental and not recommended for production systems, so I highly recommend taking snapshots before proceeding!

[2009-12-17 Update] The patch has been integrated: e8c975bf2038 should be in snv131 and above