From: Stephen S. <rad...@gm...> - 2014-02-05 11:19:53
Wait a second.. I've just reviewed the source and the build log in a
bit more detail.

In server.c, indeed the result of setting SO_REUSEPORT is actually
ignored -- and is only tried in a multicast context in any case.

Secondly, the build log doesn't seem to indicate anything to me about
SO_REUSEPORT. The errors are actually quite varied. I think the
SO_REUSEPORT thing may be a red herring, or at least I don't see any
evidence for it.

Let's see:

armel: seems successful.

armhf: everything seems to go as expected without errors, but then the
process exits with Error 2. Any way of getting more information?

ia64: the error is a C++ compile error, which version of g++ is in
use? I think the C++11 features may fail on g++4.6, where C++11 was
only partially implemented yet the lambda test passes in configure.
That's the only case I know of that could cause this problem.

kfreebsd: yes, testlo failed for me when I tested on freebsd in qemu

i386, s390x, powerpc

> liblo server error 9912 in /: Invalid message received (expected)
> E: Caught signal ‘Terminated’: terminating immediately

What's that all about? Does the debian build server send a terminate
signal to the test process for some reason?

Steve

On Tue, Feb 4, 2014 at 10:45 PM, Stephen Sinclair <rad...@gm...> wrote:
> On Tue, Feb 4, 2014 at 1:59 PM, Felipe Sateler <fsa...@gm...> wrote:
>> On Tue, Feb 4, 2014 at 4:42 AM, Stephen Sinclair <rad...@gm...> wrote:
>>> Hi Felipe,
>>>
>>> Hm, that is unexpected. Since it is already #ifdef'd out, why is it
>>> causing a problem? I'm not sure I understand the mechanism behind the
>>> build network. Is it being built on an old kernel, but against
>>> headers for a newer kernel? That's the only reason I could think this
>>> would cause a problem..
>>
>> Indeed, that is the problem, as already mentioned by IOhannes.
>> The buildd network consists of machines
>> running debian stable (that is, a 3.2 kernel) in which a chroot with
>> debian unstable is created. The package is built in the chroot, so
>> that means we have new libc and kernel headers, but old runtime
>> kernel. As mentioned in the original mail, the problem is during the
>> testsuite run, not at build time.
>
> Sure -- I was proposing running a program at configure time to detect
> if it is available, but you're right it may be better to do it every
> time the code runs. This implies (as I state below) simply trying to
> set it but ignoring any error, but this just feels a bit weird to me.
>
>>> That said, perhaps it would actually be better to detect the existence
>>> of SO_REUSEPORT at configure time instead of compile time, not sure if
>>> that would fix the problem or not.
>>
>> It wouldn't really fix it, because the fundamental problem is that
>> kernel features can only be detected at runtime, because there is no
>> guarantee that the running kernel is the same as the one with which
>> the package was built. It would make it easier to disable it, though.
>>
>> Maybe on lo_server_new a test could be made to check if SO_REUSEPORT
>> is supported by the kernel (AFAICT SO_REUSEPORT is only used on the
>> server?).
>
> That's not a bad idea...
>
>> I'm a bit confused as to how lo_throw works, but that seems to be the
>> proximate cause of the problem. Alternative solution: perhaps liblo
>> can just ignore the setsockopts(SO_REUSEPORT) errors?
>
> I think ignoring the error is the correct behaviour in conditions
> where it is not needed anyway, but then perhaps it shouldn't be
> attempted in the first place. The goal is to not have unpredictable
> behaviour, unicast and multicast should "just work" under all
> supported operating systems. I have to go back through old emails,
> but if I remember it turned out that SO_REUSEPORT was needed on OS X,
> and available but not actually needed on Linux.
>
>>> Yes, technically you could just take out those sections of the code,
>>> but I was hoping you wouldn't need to repack the archive. I don't
>>> understand why SO_REUSEPORT is defined if it is not supported.
>>
>> Ok, I'm commenting out the code section to prevent build failures for
>> now. If SO_REUSEPORT can be detected at runtime or the errors can be
>> safely ignored, I can replace the patch with one that does the right
>> thing.
>
> I think patching the code for Linux is an ok solution for now, but I'd
> like to figure out the right thing to do. I would have been happy
> taking the flag out entirely as IOhannes has suggested in the past,
> but I ran into some situation which needed it, and made the assumption
> that if it is defined, it should be required.
>
>
> Steve
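
For reference, the runtime check discussed in the quoted thread (try to set SO_REUSEPORT and treat "the running kernel does not know this option" as non-fatal) could look roughly like the sketch below. This is not liblo code: try_reuseport() is a hypothetical helper name, and the sketch assumes that a kernel without the option rejects the call with ENOPROTOOPT (EINVAL is accepted too, to be lenient). The #ifdef only says what the headers knew at build time, so the setsockopt() result is what decides at runtime.

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical helper, not part of liblo.
 * Tries to enable SO_REUSEPORT on an already-created socket.
 * Returns  1 if the option was set,
 *          0 if the option is unavailable (not in the headers at build
 *            time, or rejected by the running kernel),
 *         -1 on any other error (errno is left as set by setsockopt). */
static int try_reuseport(int sock)
{
#ifdef SO_REUSEPORT
    int yes = 1;

    if (setsockopt(sock, SOL_SOCKET, SO_REUSEPORT, &yes, sizeof(yes)) == 0)
        return 1;

    /* Assumption: an older runtime kernel reports the unknown option
     * with ENOPROTOOPT; EINVAL is tolerated as well. */
    if (errno == ENOPROTOOPT || errno == EINVAL)
        return 0;

    return -1;
#else
    (void) sock;   /* option unknown at build time, nothing to try */
    return 0;
#endif
}
```

A caller would only treat -1 as a real failure; a return of 0 means "carry on without the option", which matches the idea of ignoring the error where the option is not needed anyway.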