Bug#441975: nvidia-glx should only provide the TLS version


by Chris Reeves

On Fri, Feb 15, 2008 at 18:59:30 +0100, Aurelien Jarno wrote:

>
> severity 441975 serious
>
> On Sat, Jan 19, 2008 at 12:57:42AM +0100, Aurelien Jarno wrote:
> >
> > FYI the problem is that /etc/ld.so.nohwcap disables all optimized
> > libraries and uses the ones from /usr/lib. NVidia had the idea to provide
> > a TLS version (in /usr/lib/tls) and a non-TLS version (in /usr/lib) of
> > their library. Disabling optimized libraries means that the non-TLS
> > version of the library is used. However, their code chooses between TLS
> > and non-TLS code in a different way (a test code), which always succeeds
> > on recent systems with the NPTL library. This leads to a mix of TLS and
> > non-TLS code, leading to a crash.
> >
> > I will add a workaround to glibc to also use the tls/ directory even when
> > optimized libraries are disabled, as TLS is always available in lenny.
>
> This workaround causes problems when upgrading from etch to lenny, so it
> will be removed in the next upload. As a consequence, this bug really
> has to be fixed, so I am upgrading it to serious.
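
The search-order effect described in the quoted text can be illustrated
with a toy model. This is only a sketch under stated assumptions:
resolve_library, the single "tls" subdirectory, and the file name
libnvidia-tls.so.1 are placeholders chosen for the example, and the real
ld.so lookup (cache, other hwcap directories) is more involved.

```python
import os
import tempfile


def resolve_library(name, base_dirs, hwcap_subdirs, nohwcap_present):
    """Return the first path where `name` is found, or None.

    Toy model (an assumption, not ld.so's real algorithm): for each base
    directory, try its hwcap subdirectories (e.g. "tls") first unless they
    are disabled, then the base directory itself.
    """
    for base in base_dirs:
        candidates = []
        if not nohwcap_present:
            candidates += [os.path.join(base, sub) for sub in hwcap_subdirs]
        candidates.append(base)
        for directory in candidates:
            path = os.path.join(directory, name)
            if os.path.exists(path):
                return path
    return None


def demo():
    """Build a throwaway tree with both copies and resolve it both ways."""
    with tempfile.TemporaryDirectory() as root:
        lib = os.path.join(root, "usr", "lib")
        os.makedirs(os.path.join(lib, "tls"))
        # Placeholder files standing in for the TLS and non-TLS libraries.
        for directory in (lib, os.path.join(lib, "tls")):
            with open(os.path.join(directory, "libnvidia-tls.so.1"), "w") as f:
                f.write("placeholder")
        normal = resolve_library("libnvidia-tls.so.1", [lib], ["tls"], False)
        disabled = resolve_library("libnvidia-tls.so.1", [lib], ["tls"], True)
        return os.path.relpath(normal, root), os.path.relpath(disabled, root)


if __name__ == "__main__":
    print(demo())
```

In this model the tls/ copy is picked normally, while the plain /usr/lib
copy is picked once hwcap directories are disabled, which is the switch
that ends up mixing TLS and non-TLS code.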

I have been able to reproduce this on a lenny machine with a 2.6.25-2 kernel.
In order to do so one must use an nVidia graphics card with the nVidia binary
driver and /etc/ld.so.nohwcap must exist. The test (as described in a
previous message) is that "perl -e 'use Qt'" will segfault.
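
For anyone wanting to script this check, a minimal sketch follows. The
helper names are made up for illustration; only the file path and the perl
command come from the report above. It assumes a POSIX system, where
Python reports a child killed by a signal as a negative return code.

```python
import os
import signal
import subprocess

NOHWCAP = "/etc/ld.so.nohwcap"


def crashed_with_segfault(returncode):
    """True if a subprocess return code means the child died from SIGSEGV."""
    return returncode == -signal.SIGSEGV


def run_qt_test(command=("perl", "-e", "use Qt")):
    """Run the test command from the report and classify the outcome."""
    if not os.path.exists(NOHWCAP):
        return "skipped: %s is not present" % NOHWCAP
    try:
        result = subprocess.run(
            command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except FileNotFoundError:
        return "skipped: %s is not installed" % command[0]
    if crashed_with_segfault(result.returncode):
        return "segfault (bug reproduced)"
    return "exit status %d (no segfault)" % result.returncode


if __name__ == "__main__":
    print(run_qt_test())
```

A non-segfault exit status does not prove the bug is absent: the command
also fails cleanly if the Perl Qt bindings are simply not installed.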

This bug will affect any user of the nvidia-glx package who has their debconf
frontend set to kde (or similar) and tries to upgrade a package which makes
use of /etc/ld.so.nohwcap (e.g. libc6). These users will be affected
irrespective of whether nvidia-graphics-drivers makes it into lenny or not.


Aurelien is largely correct about this. The nVidia installer comes with two
different copies of libnvidia-tls.so.<version> inside the installer package.
 - According to the nvidia-installer docs, the version in
   <package-dir>/usr/lib is for glibc <= 2.2, while the version in
   <package-dir>/usr/lib/tls is for glibc >= 2.3.
 - According to the README.Debian for nvidia-glx, however, the differing
   versions are for 2.4 and 2.6 kernels (presumably on the assumption that
   NPTL is implemented in the latter and not in the former).
Whichever of these interpretations is actually correct, the same version of
the library should be installed into both /usr/lib and /usr/lib/tls so that
the presence of /etc/ld.so.nohwcap does not affect which version of the
library is used (which it shouldn't).

On the basis of the nVidia docs it might seem reasonable to only ship the
second version, since lenny is guaranteed to come with glibc >= 2.3. In this
case we only require a one-line change to debian/rules to get things to work
(although the USE_TLS flag would become redundant and so we could also remove
related code and documentation).

On the other hand, if Randall's README.Debian is the more accurate, we might
break things for some users with older kernels. In this case it would take a
few more changes to get things to work (keep both versions of libnvidia-tls in
/usr/lib/nvidia and modify the init scripts to symlink both /usr/lib and
/usr/lib/tls to the same version).
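
A rough sketch of that second layout, run against a scratch directory
rather than a real system. Everything here is illustrative: the function
name, the ".tls"/".notls" file suffixes and the libnvidia-tls.so.1 name
are assumptions, not what the package's init script actually does.

```python
import os
import tempfile


def link_both_dirs(root, libname, use_tls):
    """Point <root>/usr/lib and <root>/usr/lib/tls at one copy of `libname`."""
    store = os.path.join(root, "usr", "lib", "nvidia")
    # Hypothetical naming for the two variants kept side by side.
    variant = libname + (".tls" if use_tls else ".notls")
    target = os.path.join(store, variant)
    links = []
    for directory in (os.path.join(root, "usr", "lib"),
                      os.path.join(root, "usr", "lib", "tls")):
        os.makedirs(directory, exist_ok=True)
        link = os.path.join(directory, libname)
        if os.path.lexists(link):
            os.remove(link)  # replace a stale link or file
        os.symlink(target, link)
        links.append(link)
    return links


def demo():
    """Create both variants in a temporary tree and link the TLS one twice."""
    with tempfile.TemporaryDirectory() as root:
        store = os.path.join(root, "usr", "lib", "nvidia")
        os.makedirs(store)
        for suffix in (".tls", ".notls"):
            with open(os.path.join(store, "libnvidia-tls.so.1" + suffix), "w") as f:
                f.write(suffix)
        links = link_both_dirs(root, "libnvidia-tls.so.1", use_tls=True)
        return [os.path.basename(os.path.realpath(link)) for link in links]


if __name__ == "__main__":
    print(demo())
```

Because both links resolve to the same file, the library that gets loaded
no longer depends on whether the tls/ directory is searched.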

My vote would be for the second option. It would be useful if people could
express their preferences so that I can produce a patch for the preferred
option. This would fix the nvidia-glx package, but does *not* fix the bug
completely.


As I said earlier, this bug will affect any of the users that I have
previously described, irrespective of whether an updated package makes it into
lenny - the presence/use of an old version of nvidia-glx will trigger this
bug. In order to actually fix the bug, nvidia-glx must be upgraded *before*
libc6 (or any other /etc/ld.so.nohwcap-using package).

My thoughts on this would be to make affected packages (e.g. libc6) Conflict
with nvidia-glx (<< fixed-version). I'm no expert on how Debian/apt resolves
dependencies, so I'm not 100% sure whether this will result in:
 - removal of nvidia-glx;
 - no upgrade of affected packages;
 - or upgrade of nvidia-glx before affected packages (the desired result).
I'm also unsure of the politics of getting the affected packages to make the
required change, especially considering that they are probably frozen (e.g.
libc6).
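
To make the versioned relation concrete, here is a simplified sketch of
how "strictly earlier than fixed-version" could be evaluated. It is a
hand-written approximation of Debian version ordering for ASCII version
strings, not dpkg's code; "dpkg --compare-versions" remains the
authoritative tool. The version numbers in the usage note are invented,
since the fixed version is not known here.

```python
def _order(c):
    """Sort weight of one non-digit character ('' means end of string)."""
    if c == "" or c.isdigit():
        return 0
    if c == "~":
        return -1  # tilde sorts before everything, even end of string
    if c.isalpha():
        return ord(c)
    return ord(c) + 256  # other characters sort after letters


def _compare_part(a, b):
    """Compare two upstream or revision strings; return <0, 0 or >0."""
    while a or b:
        # Non-digit prefix: compare character by character.
        while (a and not a[0].isdigit()) or (b and not b[0].isdigit()):
            diff = _order(a[:1]) - _order(b[:1])
            if diff:
                return diff
            a, b = a[1:], b[1:]
        # Digit run: compare numerically (leading zeros are ignored).
        i = 0
        while i < len(a) and a[i].isdigit():
            i += 1
        j = 0
        while j < len(b) and b[j].isdigit():
            j += 1
        num_a, num_b = int(a[:i] or "0"), int(b[:j] or "0")
        if num_a != num_b:
            return num_a - num_b
        a, b = a[i:], b[j:]
    return 0


def _split(version):
    """Split '[epoch:]upstream[-revision]' into (epoch, upstream, revision)."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a, b):
    """Return <0, 0 or >0 as version a sorts before, equal to or after b."""
    epoch_a, upstream_a, revision_a = _split(a)
    epoch_b, upstream_b, revision_b = _split(b)
    if epoch_a != epoch_b:
        return epoch_a - epoch_b
    return _compare_part(upstream_a, upstream_b) or _compare_part(revision_a, revision_b)


def conflict_applies(installed, fixed):
    """True if `installed` satisfies the relation (<< fixed)."""
    return compare_versions(installed, fixed) < 0
```

With a hypothetical fixed version of 173.14.09-2, an installed 169.12-1
would trigger the conflict, while 173.14.09-2 itself or anything later
would not. Which of the three outcomes listed above apt then chooses is a
separate question that this sketch does not model.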

Your thoughts and input would be much appreciated.

Cheers,
    Chris



--
To UNSUBSCRIBE, email to debian-bugs-rc-REQUEST@...
with a subject of "unsubscribe". Trouble? Contact listmaster@...


Bug#441975: [pkg-nvidia-devel] Bug#441975: nvidia-glx should only provide the TLS version

by Randall Donald

On Thu, 2008-07-24 at 01:16 +0100, Chris Reeves wrote:

> Your thoughts and input would be much appreciated.
>

I was working on this yesterday based on your email. Just to confirm,
will providing TLS-enabled libnvidia-tls in both /usr/lib
and /usr/lib/tls work?


--
--------------------------------------------
Randall Donald             randy@...
http://www.khensu.org    rdonald@...
Programmer/Debian Developer GnuPG: 6C27DEAB                    
--------------------------------------------





Bug#441975: [pkg-nvidia-devel] Bug#441975: nvidia-glx should only provide the TLS version

by Chris Reeves

On Thu, Jul 24, 2008 at 08:48:07AM -0700, Randall Donald wrote:
>
> I was working on this yesterday based on your email. Just to confirm,
> will providing TLS-enabled libnvidia-tls in both /usr/lib
> and /usr/lib/tls work?

It depends upon what exactly your definition of "work" is. It works on my
lenny system (glibc 2.7, kernel 2.6.25) in as much as having the /usr/lib/tls
version of the library in both /usr/lib/tls and /usr/lib has had no adverse
effect on my running system (whether I boot with /etc/ld.so.nohwcap present or
not) and means that "perl -e 'use Qt'" does not segfault when
/etc/ld.so.nohwcap is present (which should prevent debconf from crashing when
the kde frontend is used).

As I said in my previous mail, IMHO there is no reason for both versions of
this library to be accessible by a running system at any one time (the
libraries are for use with different glibc and/or kernel versions). Therefore
/usr/lib and /usr/lib/tls should both contain the same version.

The question remains as to whether we wish to continue supporting nvidia-glx
users who use an unpatched kernel 2.4 (which your README.Debian implies are
the target users of the 'non-tls' version). If that is the case then we should
install both versions into /usr/lib/nvidia and continue to use USE_TLS to
switch between the versions. If not, then the one-line change that you made to
debian/rules in r432 should be sufficient to fix things on the nvidia-glx side
of things. Of course, libc6 would still need a Conflicts: or some other
modification.

I hope this is confirmation enough. My more reserved language is simply
because I have a limited ability to test this (the system that this applies to
is a production system).

Cheers,
    Chris





Bug#441975: [pkg-nvidia-devel] Bug#441975: nvidia-glx should only provide the TLS version

by Sven Joachim

On 2008-07-24 18:32 +0200, Chris Reeves wrote:

> The question remains as to whether we wish to continue supporting nvidia-glx
> users who use an unpatched kernel 2.4 (which your README.Debian implies are

This is not really a question, since nvidia-glx requires libc6 (>= 2.7-1)
and that will not run on a 2.4 kernel.

Sven





Bug#441975: nvidia-glx should only provide the TLS version

by Aurelien Jarno

Chris Reeves wrote:

> On Fri, Feb 15, 2008 at 18:59:30 +0100, Aurelien Jarno wrote:
>> severity 441975 serious
>>
>> On Sat, Jan 19, 2008 at 12:57:42AM +0100, Aurelien Jarno wrote:
>>> FYI the problem is that /etc/ld.so.nohwcap disables all optimized
>>> libraries and uses the ones from /usr/lib. NVidia had the idea to provide
>>> a TLS version (in /usr/lib/tls) and a non-TLS version (in /usr/lib) of
>>> their library. Disabling optimized libraries means that the non-TLS
>>> version of the library is used. However, their code chooses between TLS
>>> and non-TLS code in a different way (a test code), which always succeeds
>>> on recent systems with the NPTL library. This leads to a mix of TLS and
>>> non-TLS code, leading to a crash.
>>>
>>> I will add a workaround to glibc to also use the tls/ directory even when
>>> optimized libraries are disabled, as TLS is always available in lenny.
>> This workaround causes problems when upgrading from etch to lenny, so it
>> will be removed in the next upload. As a consequence, this bug really
>> has to be fixed, so I am upgrading it to serious.
>
> I have been able to reproduce this on a lenny machine with a 2.6.25-2 kernel.
> In order to do so one must use an nVidia graphics card with the nVidia binary
> driver and /etc/ld.so.nohwcap must exist. The test (as described in a
> previous message) is that "perl -e 'use Qt'" will segfault.
>
> This bug will affect any user of the nvidia-glx package who has their debconf
> frontend set to kde (or similar) and tries to upgrade a package which makes
> use of /etc/ld.so.nohwcap (e.g. libc6). These users will be affected
> irrespective of whether nvidia-graphics-drivers makes it into lenny or not.
>
>
> Aurelien is largely correct about this. The nVidia installer comes with two
> different copies of libnvidia-tls.so.<version> inside the installer package.
>  - According to the nvidia-installer docs, the version in
>    <package-dir>/usr/lib is for glibc <= 2.2, while the version in
>    <package-dir>/usr/lib/tls is for glibc >= 2.3.
>  - According to the README.Debian for nvidia-glx, however, the differing
>    versions are for 2.4 and 2.6 kernels (presumably on the assumption that
>    NPTL is implemented in the latter and not in the former).
> Whichever of these interpretations is actually correct, the same version of

I guess the most accurate interpretation is the one in README.Debian,
or more precisely that one with 2.4 replaced by non-NPTL and 2.6 by NPTL.

> the library should be installed into both /usr/lib and /usr/lib/tls so that
> the presence of /etc/ld.so.nohwcap does not affect which version of the
> library is used (which it shouldn't).

Or only the NPTL version in /usr/lib, as the non-NPTL version does not
exist anymore in Lenny.

> On the basis of the nVidia docs it might seem reasonable to only ship the
> second version, since lenny is guaranteed to come with glibc >= 2.3. In this
> case we only require a one-line change to debian/rules to get things to work
> (although the USE_TLS flag would become redundant and so we could also remove
> related code and documentation).
>
> On the other hand, if Randall's README.Debian is the more accurate, we might
> break things for some users with older kernels. In this case it would take a
> few more changes to get things to work (keep both versions of libnvidia-tls in
> /usr/lib/nvidia and modify the init scripts to symlink both /usr/lib and
> /usr/lib/tls to the same version).

OTOH, as we switched to NPTL-only in Lenny, older kernels (I mean 2.4
kernels) are not supported anymore. IIRC the minimum kernel is even
2.6.18 for i386. In that case it may be even easier to remove the
initscript, and provide only one symlink in /usr/lib directly in the
package.

> My vote would be for the second option. It would be useful if people could
> express their preferences so that I can produce a patch for the preferred
> option. This would fix the nvidia-glx package, but does *not* fix the bug
> completely.
>
>
> As I said earlier, this bug will affect any of the users that I have
> previously described, irrespective of whether an updated package makes it into
> lenny - the presence/use of an old version of nvidia-glx will trigger this
> bug. In order to actually fix the bug, nvidia-glx must be upgraded *before*
> libc6 (or any other /etc/ld.so.nohwcap-using package).
>
> My thoughts on this would be to make affected packages (e.g. libc6) Conflict
> with nvidia-glx (<< fixed-version). I'm no expert on how Debian/apt resolves
> dependencies, so I'm not 100% sure whether this will result in:
>  - removal of nvidia-glx;
>  - no upgrade of affected packages;
>  - or upgrade of nvidia-glx before affected packages (the desired result).
> I'm also unsure of the politics of getting the affected packages to make the
> required change, especially considering that they are probably frozen (e.g.
> libc6).

I am trying to think of another alternative, but I currently can't see
one. If this is the best option, I don't think the freeze will block us
(that is, we can convince the release team).

--
  .''`.  Aurelien Jarno            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@...         | aurelien@...
   `-    people.debian.org/~aurel32 | www.aurel32.net





Bug#441975: [pkg-nvidia-devel] Bug#441975: nvidia-glx should only provide the TLS version

by Aurelien Jarno

Chris Reeves wrote:

> On Thu, Jul 24, 2008 at 08:48:07AM -0700, Randall Donald wrote:
>> I was working on this yesterday based on your email. Just to confirm,
>> will providing TLS-enabled libnvidia-tls in both /usr/lib
>> and /usr/lib/tls work?
>
> It depends upon what exactly your definition of "work" is. It works on my
> lenny system (glibc 2.7, kernel 2.6.25) in as much as having the /usr/lib/tls
> version of the library in both /usr/lib/tls and /usr/lib has had no adverse
> effect on my running system (whether I boot with /etc/ld.so.nohwcap present or
> not) and means that "perl -e 'use Qt'" does not segfault when
> /etc/ld.so.nohwcap is present (which should prevent debconf from crashing when
> the kde frontend is used).
>
> As I said in my previous mail, IMHO there is no reason for both versions of
> this library to be accessible by a running system at any one time (the
> libraries are for use with different glibc and/or kernel versions). Therefore
> /usr/lib and /usr/lib/tls should both contain the same version.
>
> The question remains as to whether we wish to continue supporting nvidia-glx
> users who use an unpatched kernel 2.4 (which your README.Debian implies are
> the target users of the 'non-tls' version). If that is the case then we should
> install both versions into /usr/lib/nvidia and continue to use USE_TLS to
> switch between the versions. If not, then the one-line change that you made to
> debian/rules in r432 should be sufficient to fix things on the nvidia-glx side
> of things. Of course, libc6 would still need a Conflicts: or some other
> modification.

As explained in my previous email, there is no point in supporting a 2.4
kernel, as Lenny won't run on it.

--
  .''`.  Aurelien Jarno            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@...         | aurelien@...
   `-    people.debian.org/~aurel32 | www.aurel32.net


