"clear all" problem for classes defined in oct-files (Was: : bug) (Concerns: SWIG)


by Michael Goffioul-2

On Mon, May 26, 2008 at 9:58 AM, Michael Goffioul
<michael.goffioul@...> wrote:
>> I'm still hoping one of Octave's windows based developers will chime
>> in. In the meantime, does the same result occur when you type "clear
>> all" as the initial command, or does it only occur when in a script?
>
> I can reproduce this bug. Simply typing "clear all" at octave prompt
> makes octave crash. This does not occur with development branch.
> I'll try to find the reason, but this can take time as I have to recompile
> octave with debug enable.

OK, I figured out what caused the crash, but unfortunately the fix
is not as easy as I thought. That's why I would like to hear comments
from John (for the symbol table stuff) and Xavier (for the SWIG stuff).

Note: this problem occurs in 3.0.1, but I could not check if it was present
in development code (it happens with SWIG-based code, but SWIG
support for octave is only available for 3.0.x branch, AFAIK).

Basically, the problem is that "clear all" at octave prompt makes octave
crash when a SWIG-based package like "database" is loaded in memory,
more specifically the SQLite3 support. When loaded by octave, this
package installs various global variables, whose class is contained in the
oct-file sqlite3.oct. When calling "clear all", octave does the following:
1) clear current symbol table (curr_sym_tab)
2) clear functions of fbi_sym_tab
3) clear global symbol table (global_sym_tab)

Now the problem is that step 2) unmaps sqlite3.oct from process address
space. This invalidates the virtual table of sqlite3 objects contained in the
global symbol table. In step 3), clearing some of those objects results in
calling functions in the invalid virtual table and produces a segmentation
fault.

A simple workaround for the user is not to load those SWIG-based
packages at startup: ftp, ann and database. This can be done either by
deselecting them at installation time, or by typing

pkg rebuild -noauto ftp database ann

at the octave prompt and then restarting octave. You can check that the
packages are not loaded with "pkg list".

Michael.
_______________________________________________
Bug-octave mailing list
Bug-octave@...
https://www.cae.wisc.edu/mailman/listinfo/bug-octave

Re: "clear all" problem for classes defined in oct-files (Was: : bug) (Concerns: SWIG)

by Xavier Delacour

On Mon, May 26, 2008 at 11:16 AM, Michael Goffioul
<michael.goffioul@...> wrote:
> Note: this problem occurs in 3.0.1, but I could not check if it was present
> in development code (it happens with SWIG-based code, but SWIG
> support for octave is only available for 3.0.x branch, AFAIK).

I haven't been working on head, but I imagine it should work without
any problems. (or at least I'm not currently aware of any). I'll give
it a try when I get a chance.

>
> Basically, the problem is that "clear all" at octave prompt makes octave
> crash when a SWIG-based package like "database" is loaded in memory,
> more specifically the SQLite3 support. When loaded by octave, this
> package installs various global variables, whose class is contained in the
> oct-file sqlite3.oct. When calling "clear all", octave does the following:
> 1) clear current symbol table (curr_sym_tab)
> 2) clear functions of fbi_sym_tab
> 3) clear global symbol table (global_sym_tab)
>
> Now the problem is that step 2) unmaps sqlite3.oct from process address
> space. This invalidates the virtual table of sqlite3 objects contained in the
> global symbol table. In step 3), clearing some of those objects results in
> calling functions in the invalid virtual table and produces a segmentation
> fault.

Won't any oct-file that installs a custom type have this problem if
variables of that type are installed in the global symbol table? Off
hand it seems like the correct solution is to swap steps 2 and 3. Or
unload symbols etc in step 2, but only unload the shared libs in a new
step 4.
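
To make the ordering argument concrete, here is a toy model of the three steps. It is only a sketch and not Octave code: Module, Value and Interpreter are hypothetical stand-ins for the oct-file, an object of a class defined in it, and the symbol tables. Instead of actually crashing, the model counts how many values would have needed code from a module that is already unmapped.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an oct-file mapped into the process.
struct Module {
  bool mapped = true;
};

// Hypothetical stand-in for a value whose class code lives in a module.
struct Value {
  explicit Value(Module* m) : home(m) {}
  // True if the code needed to destroy this value is still mapped.
  bool can_destroy() const { return home->mapped; }
  Module* home;
};

struct Interpreter {
  Module sqlite3;              // plays the role of sqlite3.oct
  std::vector<Value> globals;  // plays the role of global_sym_tab

  Interpreter() { globals.emplace_back(&sqlite3); }

  // Step 2: clearing the functions unmaps the module.
  void clear_functions() { sqlite3.mapped = false; }

  // Step 3: returns how many values needed code that was already unmapped.
  int clear_globals() {
    int bad = 0;
    for (const Value& v : globals)
      if (!v.can_destroy()) ++bad;
    globals.clear();
    return bad;
  }
};

// Order described in the report: functions first, then globals.
int clear_all_reported_order() {
  Interpreter in;
  in.clear_functions();
  return in.clear_globals();
}

// Swapped order: globals first, then functions.
int clear_all_swapped_order() {
  Interpreter in;
  int bad = in.clear_globals();
  in.clear_functions();
  return bad;
}
```

In this model the reported order leaves one value that needs unmapped code, while the swapped order leaves none.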

I think this is only SWIG specific in that it installs a global
variable at load time. Other packages that use custom types probably
have the same problem, but you have to assign a global variable of the
custom type to notice it.

You should also see a similar problem if you touch an oct-file that
has installed a custom type (and an instance of it exists somewhere).
Octave will try to reload the oct-file and make the type invalid, and
subsequently using the existing variable instance will cause a crash.

Xavier

Re: "clear all" problem for classes defined in oct-files (Was: : bug) (Concerns: SWIG)

by David Bateman-3

Xavier Delacour wrote:

> On Mon, May 26, 2008 at 11:16 AM, Michael Goffioul
> <michael.goffioul@...> wrote:
>> Note: this problem occurs in 3.0.1, but I could not check if it was present
>> in development code (it happens with SWIG-based code, but SWIG
>> support for octave is only available for 3.0.x branch, AFAIK).
>
> I haven't been working on head, but I imagine it should work without
> any problems. (or at least I'm not currently aware of any). I'll give
> it a try when I get a chance.
>
>> Basically, the problem is that "clear all" at octave prompt makes octave
>> crash when a SWIG-based package like "database" is loaded in memory,
>> more specifically the SQLite3 support. When loaded by octave, this
>> package installs various global variables, whose class is contained in the
>> oct-file sqlite3.oct. When calling "clear all", octave does the following:
>> 1) clear current symbol table (curr_sym_tab)
>> 2) clear functions of fbi_sym_tab
>> 3) clear global symbol table (global_sym_tab)
>>
>> Now the problem is that step 2) unmaps sqlite3.oct from process address
>> space. This invalidates the virtual table of sqlite3 objects contained in the
>> global symbol table. In step 3), clearing some of those objects results in
>> calling functions in the invalid virtual table and produces a segmentation
>> fault.
>
> Won't any oct-file that installs a custom type have this problem if
> variables of that type are installed in the global symbol table? Off
> hand it seems like the correct solution is to swap steps 2 and 3. Or
> unload symbols etc in step 2, but only unload the shared libs in a new
> step 4.
>
> I think this is only SWIG specific in that it installs a global
> variable at load time. Other packages that use custom types probably
> have the same problem, but you have to assign a global variable of the
> custom type to notice it.
>
> You should also see a similar problem if you touch an oct-file that
> has installed a custom type (and an instance of it exists somewhere).
> Octave will try to reload the oct-file and make the type invalid.. and
> subsequently using the existing variable instance will cause a crash.
>
> Xavier
>

Yes, the fixed type has the same issue. The solution used in the fixed
type is to "mlock" the constructor function in place when the package is
loaded so that "clear all" won't remove it. fixed.oct currently uses

  // Lock constructor function in place, otherwise
  // "a=fixed(3,1); clear functions; a" generates a seg-fault!!
  // The below is the function "mlock", but in a way useable
  // for older versions of octave as well.
  fbi_sym_tab->lookup("fixed")->mark_as_static ();

for this purpose, as mlock wasn't available in some older 2.1.x versions of
Octave. However, this is incompatible with 3.1.x; then again, mlock is
incompatible between the 3.0 and 3.1 versions as well, or at least

void mlock (const std::string&)

doesn't exist in 3.1.x yet. I'd suggest using the above in any case,
and I believe we should re-add the same function in 3.1.
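
The effect of locking the constructor can be pictured with a toy function table. This is only a sketch under stated assumptions: FunctionTable and its methods are hypothetical and are not the fbi_sym_tab API. The point is simply that "clear functions" skips entries that have been locked.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy function table; not Octave's fbi_sym_tab.
class FunctionTable {
public:
  // Register a function as loaded and unlocked.
  void install(const std::string& name) { locked_[name] = false; }

  // Mark an installed function as locked (throws if it is not installed).
  void lock(const std::string& name) { locked_.at(name) = true; }

  bool contains(const std::string& name) const {
    return locked_.count(name) != 0;
  }

  // "clear functions": drop every entry that is not locked.
  void clear_functions() {
    for (auto it = locked_.begin(); it != locked_.end();) {
      if (it->second)
        ++it;                     // locked: keep it loaded
      else
        it = locked_.erase(it);   // unlocked: remove it
    }
  }

private:
  std::map<std::string, bool> locked_;  // function name -> locked?
};

// Returns true if a locked entry survives while an unlocked one is removed.
bool locked_function_survives_clear() {
  FunctionTable table;
  table.install("fixed");
  table.install("other");
  table.lock("fixed");
  table.clear_functions();
  return table.contains("fixed") && !table.contains("other");
}
```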

Regards
David




Re: "clear all" problem for classes defined in oct-files (Was: : bug) (Concerns: SWIG)

by Michael Goffioul-2

On Wed, May 28, 2008 at 7:43 AM, Xavier Delacour
<xavier.delacour@...> wrote:
> On Mon, May 26, 2008 at 11:16 AM, Michael Goffioul
> <michael.goffioul@...> wrote:
>> Note: this problem occurs in 3.0.1, but I could not check if it was present
>> in development code (it happens with SWIG-based code, but SWIG
>> support for octave is only available for 3.0.x branch, AFAIK).
>
> I haven't been working on head, but I imagine it should work without
> any problems. (or at least I'm not currently aware of any). I'll give
> it a try when I get a chance.

No, it does not work out-of-the-box (I already tried a few weeks ago).
The symbol table code has been completely revamped.

> Won't any oct-file that installs a custom type have this problem if
> variables of that type are installed in the global symbol table? Off
> hand it seems like the correct solution is to swap steps 2 and 3. Or
> unload symbols etc in step 2, but only unload the shared libs in a new
> step 4.
>
> I think this is only SWIG specific in that it installs a global
> variable at load time. Other packages that use custom types probably
> have the same problem, but you have to assign a global variable of the
> custom type to notice it.
>
> You should also see a similar problem if you touch an oct-file that
> has installed a custom type (and an instance of it exists somewhere).
> Octave will try to reload the oct-file and make the type invalid.. and
> subsequently using the existing variable instance will cause a crash.

The Win32 platform is slightly different in this respect, in that you can't
modify (or delete) a shared module that is mapped by some process.
So as long as an oct-file is loaded by octave, you can't touch it.

Another solution for this kind of problem would be to prevent octave from
unloading an oct-file (this does not mean that the symbol cannot be removed
from the symbol table, but simply that the shared module is not unmapped
from the process address space) while there are still variables of classes
contained in the oct-file. One way to achieve this is to make all such variables
hold a reference to their containing oct-file, in the same way the
octave_dld_function class does, by using octave_shlib. With automatic
referencing, the oct-file would only be unloaded when all functions and all
variables of a contained class are cleared. I think this would make
things cleaner, and it could even be provided in octave by some standard
mechanism (like an octave_dld_base_value class, from which classes in
oct-files would inherit).
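
A minimal sketch of that idea, assuming a reference-counted library handle: SharedLib and DldBaseValue are hypothetical names used for illustration and are not the real octave_shlib or any existing Octave class. std::shared_ptr plays the role of the automatic referencing, and the SharedLib destructor stands in for unmapping the module.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a loaded oct-file. Its destructor plays the
// role of unmapping the shared module from the process.
class SharedLib {
public:
  SharedLib(std::string name, bool* unloaded_flag)
    : name_(std::move(name)), unloaded_(unloaded_flag) {}
  ~SharedLib() { *unloaded_ = true; }
  const std::string& name() const { return name_; }

private:
  std::string name_;
  bool* unloaded_;
};

// Hypothetical base class for values whose class is defined in an oct-file.
// Every instance keeps the library alive by holding a reference to it.
class DldBaseValue {
public:
  explicit DldBaseValue(std::shared_ptr<SharedLib> lib)
    : lib_(std::move(lib)) {}
  virtual ~DldBaseValue() = default;
  const std::string& library_name() const { return lib_->name(); }

private:
  std::shared_ptr<SharedLib> lib_;
};

// A class "defined in the oct-file" just inherits the bookkeeping.
class Sqlite3Value : public DldBaseValue {
public:
  using DldBaseValue::DldBaseValue;
};

// Returns true if the library stays mapped until the last value is cleared.
bool library_outlives_values() {
  bool unloaded = false;
  auto lib = std::make_shared<SharedLib>("sqlite3.oct", &unloaded);
  auto value = std::make_unique<Sqlite3Value>(lib);

  lib.reset();  // the function table drops its reference
  bool mapped_while_value_alive = !unloaded;

  value.reset();  // clearing the last variable releases the library
  return mapped_while_value_alive && unloaded;
}
```

With this arrangement the symbol can still be removed from the function table; only the unmapping is deferred until no value refers to the library any more.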

Michael.

Re: "clear all" problem for classes defined in oct-files (Was: : bug) (Concerns: SWIG)

by John W. Eaton

On 28-May-2008, David Bateman wrote:

| Yes the fixed type has the same issue. The solution used in the fixed
| type is to "mlock" the constructor function in place when the package is
| loaded so that "clear all" won't remove it.. fixed.oct currently uses
|
|   // Lock constructor function in place, otherwise
|   // "a=fixed(3,1); clear functions; a" generates a seg-fault!!
|   // The below is the function "mlock", but in a way useable
|   // for older versions of octave as well.
|   fbi_sym_tab->lookup("fixed")->mark_as_static ();
|
| for this purpose as mlock wasn't in some older 2.1.x versions of Octave.
| However this is incompatible with 3.1.x, but mlock is incompatible
| between the 3.0 and 3.1 versions as well, or at least
|
| void mlock (const std::string&)
|
| doesn't exist in 3.1.x yet.. I'd suggest using the above in any case,
| and I believe we should readd the same function in 3.1.

I don't think we should need to have "mlock (NAME)".  Instead, you can
use the method posted by Jaroslav recently:  in the "fixed" function,
you can do

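  // Lock the currently executing function (here, "fixed") so that
  // "clear" cannot remove it while objects of its type still exist.
  octave_function *f = octave_call_stack::current ();
  if (f)
    f->lock ();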

jwe

Re: "clear all" problem for classes defined in oct-files (Was: : bug) (Concerns: SWIG)

by John W. Eaton

On 28-May-2008, Michael Goffioul wrote:

| Another solution for this kind of problem would be to prevent octave from
| unloading an oct-file (this does not mean that the symbol cannot be removed
| from the symbol table, but simply that the shared module is not unmapped
| from the process address space) while there are still variables of classes
| contained in the oct-file. One way to achieve this is to make all such variables
| to hold a reference to their containing oct-file, in the same way
| octave_dld_function class does, by using octave_shlib. With automatic
| referencing, the oct-file would only be unloaded when all functions and all
| variables of a contained class are cleared. I think this would makes
| things cleaner, and it could even be provided in octave by some standard
| mechanism (like an octave_dld_base_value class, from which classes in
| oct-files would inherit).

Isn't this how things currently work?  I think the octave_base_shlib
class manages this properly now, at least in the default branch (I
have no plans to fix a problem like this in the release-3-0-x branch).
But if not, then it would be helpful if someone could submit a simple
test case that demonstrates the problem.

Thanks,

jwe

Re: "clear all" problem for classes defined in oct-files (Was: : bug) (Concerns: SWIG)

by Michael Goffioul-2

On Wed, Jun 11, 2008 at 10:29 PM, John W. Eaton <jwe@...> wrote:

> On 28-May-2008, Michael Goffioul wrote:
>
> | Another solution for this kind of problem would be to prevent octave from
> | unloading an oct-file (this does not mean that the symbol cannot be removed
> | from the symbol table, but simply that the shared module is not unmapped
> | from the process address space) while there are still variables of classes
> | contained in the oct-file. One way to achieve this is to make all such variables
> | to hold a reference to their containing oct-file, in the same way
> | octave_dld_function class does, by using octave_shlib. With automatic
> | referencing, the oct-file would only be unloaded when all functions and all
> | variables of a contained class are cleared. I think this would makes
> | things cleaner, and it could even be provided in octave by some standard
> | mechanism (like an octave_dld_base_value class, from which classes in
> | oct-files would inherit).
>
> Isn't this how things currently work?  I think the octave_base_shlib
> class manages this properly now, at least in the default branch (I
> have no plans to fix a problem like this in the release-3-0-x branch).
> But if not, then it would be helpful if someone could submit a simple
> test case that demonstrates the problem.

This works for DLD functions, but not for DLD classes. What I propose is to
extend this behavior to DLD classes, either on a per-class basis (every
DLD class must itself take care of the octave_shlib referencing) or through
an octave-provided mechanism (for instance in the form of an octave_dld_value
class).

Michael.

Re: "clear all" problem for classes defined in oct-files (Was: : bug) (Concerns: SWIG)

by John W. Eaton

On 11-Jun-2008, Michael Goffioul wrote:

| On Wed, Jun 11, 2008 at 10:29 PM, John W. Eaton <jwe@...> wrote:
| > On 28-May-2008, Michael Goffioul wrote:
| >
| > | Another solution for this kind of problem would be to prevent octave from
| > | unloading an oct-file (this does not mean that the symbol cannot be removed
| > | from the symbol table, but simply that the shared module is not unmapped
| > | from the process address space) while there are still variables of classes
| > | contained in the oct-file. One way to achieve this is to make all such variables
| > | to hold a reference to their containing oct-file, in the same way
| > | octave_dld_function class does, by using octave_shlib. With automatic
| > | referencing, the oct-file would only be unloaded when all functions and all
| > | variables of a contained class are cleared. I think this would makes
| > | things cleaner, and it could even be provided in octave by some standard
| > | mechanism (like an octave_dld_base_value class, from which classes in
| > | oct-files would inherit).
| >
| > Isn't this how things currently work?  I think the octave_base_shlib
| > class manages this properly now, at least in the default branch (I
| > have no plans to fix a problem like this in the release-3-0-x branch).
| > But if not, then it would be helpful if someone could submit a simple
| > test case that demonstrates the problem.
|
| This works for DLD functions, but not for DLD classes. What I propose is to
| extend this behavior to DLD classes, either on a per-class basis (every
| DLD class must take care itself of the octave_shlib referencing) or through
| an octave-provided mechanism (for instance in the form of an octave_dld_value
| class).

I guess I'm not following what you mean by a "DLD class".

jwe

Re: "clear all" problem for classes defined in oct-files (Was: : bug) (Concerns: SWIG)

by Michael Goffioul-2

On Wed, Jun 11, 2008 at 11:17 PM, John W. Eaton <jwe@...> wrote:
> | This works for DLD functions, but not for DLD classes. What I propose is to
> | extend this behavior to DLD classes, either on a per-class basis (every
> | DLD class must take care itself of the octave_shlib referencing) or through
> | an octave-provided mechanism (for instance in the form of an octave_dld_value
> | class).
>
> I guess I'm not following what you mean by a "DLD class".

I mean an octave_base_value-derived class that is defined in an
oct-file and registered when the oct-file is loaded by octave. Several
packages use this technique: fixed, communications, java, SWIG...

The problem is that the virtual table of objects of such a class
points to code that is located in the oct-file. If the oct-file is
unloaded while such objects still exist, their virtual table then
points to an invalid code segment, eventually leading to a segfault.

Michael.