"Fix" for PowerPC build (and probably others as well)

Hello,

After a lot of building/debugging I discovered that the problem I described earlier appeared after the addition of the function:

    check_variable_value_replacement

The change was made on 23 June. The function itself is called from 3 places (eval.d):
  (eval1): use it
  (interpret_bytecode_): use it for cod_getvalue & cod_getvalue_push

After I restored the original code from before 23 June in these 3 places, all the tests passed fine.

I do not know the internals of clisp in detail (I just traced the segfaults to this place). However, it seems that more things are pushed on the stack than in the older version (otherwise the code is the same), which later on makes a mess.

The strange thing is that on linux x86 there are no problems with the check-tests.

Can somebody more knowledgeable about the internals check this?

Thanks,
Vladimir

On 7/4/08, Sam Steingold <sds@...> wrote:
> I compiled your test file and loaded it and run the function without any
> problems.
> please specify exactly what I need to do to get a segfault.
> please try to debug the problem a little bit: use xout and zout to print
> objects, examine STACK &c
> (see http://clisp.cons.org/impnotes/faq.html#faq-debug)
> please try compiling with g++ to catch gc-safety bugs:
> ./configure --cbc CFLAGS='' --with-debug CC='g++' build-g-gxx
> and try your test case there.
>
> thanks.
>
> --
> Sam Steingold (http://sds.podval.org/) on Fedora release 9 (Sulphur)
> http://iris.org.il http://pmw.org.il http://truepeace.org http://jihadwatch.org
> http://honestreporting.com http://palestinefacts.org
> If you want it done right, you have to do it yourself

-------------------------------------------------------------------------
Sponsored by: SourceForge.net
Community Choice Awards: VOTE NOW!
Studies have shown that voting for your favorite open source project,
along with a healthy diet, reduces your potential for chronic lameness
and boredom. Vote Now at http://www.sourceforge.net/community/cca08
_______________________________________________
clisp-devel mailing list
clisp-devel@...
https://lists.sourceforge.net/lists/listinfo/clisp-devel
Re: "Fix" for PowerPC build (and probably others as well)

Vladimir Tzankov wrote:
>
> After a lot of building/debugging I discovered that the problem I
> described earlier appeared after the addition of function:
>
> check_variable_value_replacement
>
> The change was done on 23 June. The function itself is called from 3
> places (eval.d):
> (eval1): use it
> (interpret_bytecode_): use it for cod_getvalue & cod_getvalue_push
>
> After I restored the original code of before 23 June for these 3
> places - all the tests passed fine.

nice work! thanks!

> I do not know in details the internals of clisp (just traced the
> segfaults to this place) - however seems there are more things pushed
> on the stack than in the older version (otherwise the code is the
> same) - which later on make mess.
>
> The strange thing is that on linux x86 there are not problems with the
> check-tests.

indeed I see no problems on either x86 or amd64.
maybe this is a gc-safety bug?
could you please compile with

    ./configure --with-debug CC=g++ build-g-gxx

and try your code?
if you get an abort in ngci_pointable (or similar), then the backtrace would be eminently helpful.

thanks
Re: "Fix" for PowerPC build (and probably others as well)

Hi,

> maybe this is a gc-safety bug?
> could you please compile with
> ./configure --with-debug CC=g++ build-g-gxx
> and try your code? if you get an abort in ngci_pointable (or similar), then
> the backtrace would be eminently helpful.
>
> thanks

I've built it with g++ - the segfault is in the same place - no difference from the gcc build. These days I'll have more free time and will try to examine the problem.

During the tests with g++ another problem arose, related to the tests - alltest.tst - form:

    (or
     #+win32 (string= "g++" (software-type) :end2 3)
     (let* ((n (min lambda-parameters-limit 1024))
            (vars (loop repeat n collect (gensym))))
       (eval `(= ,n (flet ((%f ,vars (+ ,@vars)))
                      (%f ,@(loop for e in vars collect 1)))))))

It causes segmentation faults on PPC OSX when built with g++ (with gcc - no problem). So I put in the same conditional as for win32 when the build is for OSX:

    #+(or win32 macos) (string= "g++" .....)

This seems like a known problem (judging by the way it is handled on win32). On PPC OSX with the g++ build the above form is fine if the parameters are up to 675, if this is of interest.

Another (minor) thing with the above form is:

    (min lambda-parameters-limit 1024)

it should be (1- lambda-parameters-limit), since according to the CLHS lambda-parameters-limit is an exclusive upper bound.

I hope to find the reason for the check_variable_value_replacement problem on the ppc osx platform.

BR
Vladimir
Re: "Fix" for PowerPC build (and probably others as well)

Dan, Vladimir,

please try the latest cvs head.
if it does not work, please try the appended patch.
I have no idea what is wrong with the code, but Vladimir says that this change caused the crash.
Thanks.
Sam.

ps. clisp-list mail did not go through because of the attachment...

--- eval.d.~1.257.~ 2008-07-16 11:04:44.000000000 -0400
+++ eval.d 2008-07-16 11:15:24.000279000 -0400
@@ -6386,7 +6386,9 @@ local /*maygc*/ Values interpret_bytecod
           var object symbol = TheCclosure(closure)->clos_consts[n];
           /* The Compiler has already checked, that it's a Symbol. */
           if (!boundp(Symbol_value(symbol))) {
-            pushSTACK(symbol); check_variable_value_replacement(&STACK_0,false);
+            pushSTACK(symbol); /* CELL-ERROR slot NAME */
+            pushSTACK(symbol); pushSTACK(Closure_name(closure));
+            error(unbound_variable,GETTEXT("~S: symbol ~S has no value"));
           }
           VALUES1(Symbol_value(symbol));
         } goto next_byte;
@@ -6396,7 +6398,9 @@ local /*maygc*/ Values interpret_bytecod
           var object symbol = TheCclosure(closure)->clos_consts[n];
           /* The Compiler has already checked, that it's a Symbol. */
           if (!boundp(Symbol_value(symbol))) {
-            pushSTACK(symbol); check_variable_value_replacement(&STACK_0,false);
+            pushSTACK(symbol); /* CELL-ERROR slot NAME */
+            pushSTACK(symbol); pushSTACK(Closure_name(closure));
+            error(unbound_variable,GETTEXT("~S: symbol ~S has no value"));
           }
           pushSTACK(Symbol_value(symbol));
         } goto next_byte;
Re: "Fix" for PowerPC build (and probably others as well)

Hi Sam,

The latest cvs and the patch you supplied did not fix the problem. I have tried similar things before with no success.

Fortunately, by chance, I think I have managed to localize and fix the problem. The problem is with handling the type of the object to which back_trace->bt_function points (subr_self). In function check_variable_value_replacement:

    pushSTACK(TheSubr(subr_self)->name);
    if (restart_p)
      check_value(unbound_variable,GETTEXT("~S: variable ~S has no value"));
    else
      error(unbound_variable,GETTEXT("~S: variable ~S has no value"));

If instead of trying to print the subr name we just skip it (not push it on stack and removing one ~S from the format string) all tests pass with no segfault. Also, if instead of pushSTACK(TheSubr(subr_self)->name); I put another pushSTACK(*symbol_); - keeping both ~S in the format string - again everything works fine. So it seems that TheSubr(subr_self)->name causes the problem. Actually the fault address is exactly this address. (The above behavior is in both version 2.46 and current cvs.)

In the prior versions (before the introduction of the check_variable_value_replacement calls in eval.d) the subr name was not printed - so this explains why everything worked fine when I copied the old code (and this led me to make the above tests).

The lisp stack trace in gdb when the error happens looks like this (just before the call to pushSTACK(TheSubr(subr_self)->name)):

    [0/0xbfff0fe0]> #<SPECIAL-OPERATOR CL::LET> delta: STACK=20; SP=1078
    [1/0xbfff20b8]> #<FUNCTION :LAMBDA> 0 args delta: STACK=6; SP=960
    [2/0xbfff2fb8]> #<COMPILED-FUNCTION #:COMPILED-FORM-126> delta: STACK=0; SP=142
    [3/0xbfff31f0]> #<SPECIAL-OPERATOR CL::LOCALLY> delta: STACK=15; SP=340

When check_variable_value_replacement works fine (with some calls) the lisp stack trace (at the same point as above) looks like:

    [0/0xbfff3b60]> #<SYSTEM-FUNCTION CL::EVAL> delta: STACK=9; SP=933
    [1/0xbfff49f4]> #<SYSTEM-FUNCTION CL::EVAL> delta: STACK=3; SP=317
    [2/0xbfff4ee8]> #<COMPILED-FUNCTION USER::MY-EVAL> delta: STACK=1; SP=960

I am not sure where to look exactly, but it seems that at back_trace->bt_function we may have a subr, an fsubr and maybe others (closure, macro, symbol_macro)??? So I think that the problem is caused by a wrong cast by TheSubr in some cases.

Instead of pushSTACK(TheSubr(subr_self)->name), I wrote:

    if (fsubrp(subr_self))
      pushSTACK(TheFsubr(subr_self)->name);
    else if (subrp(subr_self))
      pushSTACK(TheSubr(subr_self)->name);
    else
      pushSTACK(NIL);

And everything seems fine - no other changes to the cvs version (from yesterday, however) or to clisp-2.46 are needed. With this patch all the tests pass and no segfaults happen. (Probably the same applies to x86, but there it just happens to be a valid pointer.)

BR
Vladimir

On 7/16/08, Sam Steingold <sds@...> wrote:
> Dan, Vladimir,
> please try the latest cvs head.
> if it does not work, please try the appended patch.
> I have no idea what is wrong with the code, but Vladimir says that this
> change caused the crash.
> Thanks.
> Sam.
>
> ps. clisp-list mail did not go through because of the attachment...
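[Editor's sketch] The checked dispatch proposed in the message above can be illustrated in isolation. This is a minimal, self-contained C sketch and not clisp code: toy_object_t, kind_t, caller_name and the make_* helpers are hypothetical names, and an explicit tag field stands in for whatever clisp's subrp/fsubrp predicates really inspect. It only shows the idea of checking an object's kind before reading a kind-specific field, with a null pointer playing the role of NIL for every other kind.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kinds of object that may sit in
   back_trace->bt_function; these are NOT clisp's real definitions. */
typedef enum { KIND_SUBR, KIND_FSUBR, KIND_CLOSURE } kind_t;

typedef struct { const char *name; int req_args; } toy_subr_t;
typedef struct { const char *name; } toy_fsubr_t;

typedef struct {
  kind_t kind;                 /* explicit tag, assumed for this sketch */
  union {
    toy_subr_t subr;
    toy_fsubr_t fsubr;
    long closure_data;         /* a closure has no `name` field here */
  } u;
} toy_object_t;

toy_object_t make_subr(const char *name, int req_args) {
  toy_object_t o;
  assert(name != NULL);
  memset(&o, 0, sizeof o);
  o.kind = KIND_SUBR;
  o.u.subr.name = name;
  o.u.subr.req_args = req_args;
  return o;
}

toy_object_t make_fsubr(const char *name) {
  toy_object_t o;
  assert(name != NULL);
  memset(&o, 0, sizeof o);
  o.kind = KIND_FSUBR;
  o.u.fsubr.name = name;
  return o;
}

toy_object_t make_closure(long data) {
  toy_object_t o;
  memset(&o, 0, sizeof o);
  o.kind = KIND_CLOSURE;
  o.u.closure_data = data;
  return o;
}

/* Pick the name to report in an error message. The field is read only
   after the kind has been checked; any other kind yields NULL instead
   of being reinterpreted as a subr. */
const char *caller_name(const toy_object_t *self) {
  if (self == NULL)
    return NULL;
  switch (self->kind) {
    case KIND_FSUBR: return self->u.fsubr.name;
    case KIND_SUBR:  return self->u.subr.name;
    default:         return NULL;
  }
}
```

In this toy model a special operator such as LET would be built with make_fsubr and a system function such as EVAL with make_subr; caller_name returns the stored name for both and NULL for the closure case, which mirrors the three branches of the patch without claiming anything about clisp's actual object representation.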
Re: "Fix" for PowerPC build (and probably others as well)

Hi Vladimir,

Vladimir Tzankov wrote:
>
> If instead of trying to print the subr name we just skip it (not push
> it on stack and removing one ~S from the format string) all tests pass
> with no segfault.

fascinating.
yes, you nailed the problem, thank you very much!
the fix is in the CVS, please try it.

incidentally, the reason it worked for me but not for you is that fsubr and subr structs start the same:

    typedef struct {
      XRECORD_HEADER
      gcv_object_t name _attribute_aligned_object_; /* name */

the only difference is that subr ends with

    #if defined(HEAPCODES) && (alignment_long < 4) && defined(GNU)
      /* Force all Subrs to be allocated with a 4-byte alignment. GC needs this. */
      __attribute__ ((aligned (4)))
    #endif

you can do "make lispbibl.h" (if you are using gcc) and see the values of HEAPCODES, alignment_long, and the actual definition of subr_t and fsubr_t.
I bet you have the __attribute__ ((aligned (4))).

Thanks again for debugging this!

Sam.
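[Editor's sketch] The layout argument in the message above can be checked on any given compiler with a few lines of C. This is only a sketch under stated assumptions: mini_fsubr_t and mini_subr_t are hypothetical miniature records, not clisp's subr_t and fsubr_t, the 4-byte header array merely stands in for XRECORD_HEADER, and the attribute is applied whenever __GNUC__ is defined rather than under clisp's real HEAPCODES/alignment_long condition. It shows that two records with the same leading members keep `name` at the same offset, while an aligned attribute on the type affects only the alignment of the record as a whole.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature records; the real clisp structs differ. */
typedef struct {
  unsigned char header[4];   /* stands in for XRECORD_HEADER */
  void *name;                /* stands in for gcv_object_t name */
  void *function;
} mini_fsubr_t;

typedef struct {
  unsigned char header[4];
  void *name;
  void *function;
  short req_count;
}
#if defined(__GNUC__)
  /* Same spelling as in the quoted snippet; on a type this attribute
     can only raise the alignment, never lower it. */
  __attribute__ ((aligned (4)))
#endif
mini_subr_t;

/* Helper used to measure the alignment of mini_subr_t portably:
   the offset of `s` after a single char equals its alignment. */
struct subr_probe { char pad; mini_subr_t s; };

/* 1 if `name` sits at the same offset in both records. */
int name_offsets_match(void) {
  return offsetof(mini_subr_t, name) == offsetof(mini_fsubr_t, name);
}

/* Alignment the compiler actually chose for mini_subr_t. */
size_t subr_alignment(void) {
  return offsetof(struct subr_probe, s);
}

/* Aborts if the leading layouts ever diverge on this compiler. */
void check_layout(void) {
  assert(name_offsets_match());
}
```

Because the leading members are identical, a `name` read through the wrong record type can happen to land on valid data, which is consistent with the bug staying hidden on some platforms; whether that is exactly what happened on x86 versus PPC is not established by this sketch.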
Re: "Fix" for PowerPC build (and probably others as well)

Hi Sam,

Yes, the current CVS is working fine.
It really is an alignment issue; however, I have not traced it down in lispbibl.h.

BR
Vladimir

On Thu, Jul 17, 2008 at 6:23 PM, Sam Steingold <sds@...> wrote:
> Hi Vladimir,
>
> Vladimir Tzankov wrote:
>>
>> If instead of trying to print the subr name we just skip it (not push
>> it on stack and removing one ~S from the format string) all tests pass
>> with no segfault.
>
> fascinating.
> yes, you nailed the problem, thank you very much!
> the fix is in the CVS, please try it.
>
> incidentally, the reason it worked for me but not for you is that
> fsubr and subr structs start the same:
>
> typedef struct {
> XRECORD_HEADER
> gcv_object_t name _attribute_aligned_object_; /* name */
>
> the the only difference is that subr ends with
>
> #if defined(HEAPCODES) && (alignment_long < 4) && defined(GNU)
> /* Force all Subrs to be allocated with a 4-byte alignment. GC needs this.
> */
> __attribute__ ((aligned (4)))
> #endif
>
> you can do "make lispbibl.h" (if you are using gcc) and see the values of
> HEAPCODES, alignment_long, and the actual definition of subr_t and fsubr_t.
> I bet you have the __attribute__ ((aligned (4))).
>
> Thanks again for debugging this!
>
> Sam.