Troubles with OSX

Troubles with OSX

by Phil Malin-3

Howdy all.

Recently I've been noticing an issue when running SmartEiffel
applications on OSX.  Namely, I've been getting EXC_BAD_ACCESS (SIGSEGV)
exceptions thrown.  This occurs for apps that I've written which tend to
use a reasonable amount of memory and I'm wondering if it's related to a
GC issue.  It hasn't occurred on any other OS I've used
(FreeBSD, Linux and Windows) when running the same applications.

This problem occurs under both Tiger and Leopard and for gcc 4.0.1 and 4.2.

I've tried to debug one of my apps using gdb but to be honest I don't
have much experience with this debugger and am not getting back much
info (the routine it reports as the crash site is just an empty string).
I've been able to reproduce the issue easily enough in one of my apps
and the Eiffel code where it's occurring seems harmless enough - it's
simply a reference equality check (I determined this using logging
statements).

Does anyone have any suggestions on how to go about tracking the root of
this problem down?  I've started to have a look at the GC code but I'd
like to know if I really need to look at this as I can see it might take
a bit of work to get up to speed with it.  :-)

Thanks in advance for any help.

Cheers.

Phil.


Re: Troubles with OSX

by Cyril ADRIAN

Hi Phil,

On Mon, May 5, 2008 at 2:52 AM, Phil Malin <philmalin@...> wrote:
>  I've tried to debug one of my apps using gdb but to be honest I don't have
> much experience with this debugger and am not getting back much info
> (the routine it reports as the crash site is just an empty string).

Tip: try using the -g -no_strip flags, you'll have a real backtrace.

If it's GC-related, occurs on OSX but not on BSD, I'm a bit at a loss.
A problem with mark_stack_and_registers() maybe??

> I've
> been able to reproduce the issue easily enough in one of my apps and the
> Eiffel code where it's occurring seems harmless enough - it's simply a
> reference equality check (I determined this using logging statements).

Maybe it's silly, but did you try activating assertions? Something
like -all_check -sedb?

Best regards,
--
Cyril ADRIAN - http://www.cadrian.net/~cyril

Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by
definition, not smart enough to debug it.
-- Brian W. Kernighan

Re: Troubles with OSX

by Frederic Merizen

> Howdy all.

Hi. Is it an Intel Mac or a PowerPC Mac?


Re: Troubles with OSX

by Phil Malin-3

Hi Cyril and Frederic.

Ahh - of course, I forgot about SmartEiffel stripping the executable.  :-)

I also forgot to mention that I'm running an Intel mac, not a PowerPC mac (it's a MacBook Pro).

Well, I compiled my test application with '-g -no_strip' and found out where it's dying.  The application I'm running is basically a large unit test app for a data structure library that I've built up over the years using SmartEiffel.  The particular code it's dying on relates to when I'm inserting values into an AVL set that I wrote.  The exact C code is in a gc_mark function.  In particular:

    typedef struct S368 T368;
    struct S368{T2 _element;T0* _parent;T0* _left;T0* _right;T0* _prev;T0* _next;T2 _level;};

    void gc_mark368(T368*o){
    begin:
    if(((gc368*)o)->header.flag==FSOH_UNMARKED){
    ((gc368*)o)->header.flag=FSOH_MARKED;
#   /*7p*/if(NULL!=o->_next)gc_mark368((T368*)(o->_next));
    /*7p*/if(NULL!=o->_prev)gc_mark368((T368*)(o->_prev));
    /*7p*/if(NULL!=o->_right)gc_mark368((T368*)(o->_right));
    /*7p*/if(NULL!=o->_left)gc_mark368((T368*)(o->_left));
    o=(void*)o->_parent;
    if((o!=NULL))goto begin;
    }
    }

It's dying on the line denoted by the '#'.  This structure refers to a node in my tree, which contains pointers to the left and right children as well as pointers to the next and previous nodes (ordered over the elements of the set).  The exact error message is:

    Program received signal EXC_BAD_ACCESS, Could not access memory.
    Reason: KERN_PROTECTION_FAILURE at address: 0xbf7ffffc
    0x00060963 in gc_mark368 (o=0x1bdd058) at tst11.c:8784
    8784    /*7p*/if(NULL!=o->_next)gc_mark368((T368*)(o->_next));


The odd thing is that the address 0xbf7ffffc is always the offending address, even when I encounter this error in other applications.  Is there anything else I can do to shed some more light?

Phil.
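
A hypothesis worth testing, not something established in this thread: the generated gc_mark368 calls itself once per node along `_next`, and `_next` links every element of the set in order, so a single mark pass may nest about as deep as the set is large. The C sketch below uses hypothetical names (`Node`, `build_chain`, `mark_recursive`, `mark_iterative`) and is not SmartEiffel's runtime; it only compares the nesting depth of recursive marking with an explicit worklist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the generated S368 node; only the ordered
   links are kept, since they drive the recursion in this sketch. */
typedef struct Node {
    struct Node *next;   /* successor in element order */
    struct Node *prev;   /* predecessor in element order */
    int marked;
} Node;

/* Build a doubly linked chain of n nodes (n > 0) in one allocation.
   Returns the first node, or NULL on failure; release with free(). */
Node *build_chain(size_t n) {
    Node *nodes = calloc(n, sizeof *nodes);
    if (nodes == NULL) return NULL;
    for (size_t i = 0; i < n; i++) {
        nodes[i].next = (i + 1 < n) ? &nodes[i + 1] : NULL;
        nodes[i].prev = (i > 0) ? &nodes[i - 1] : NULL;
    }
    return nodes;
}

/* Recursive marking in the same order as the generated code (next, then
   prev). Call with depth 0; returns the number of nested marking frames
   on the deepest path. */
size_t mark_recursive(Node *o, size_t depth) {
    if (o == NULL || o->marked) return depth;
    o->marked = 1;
    size_t via_next = mark_recursive(o->next, depth + 1);
    size_t via_prev = mark_recursive(o->prev, depth + 1);
    return via_next > via_prev ? via_next : via_prev;
}

/* Same reachability, but pending nodes sit in a heap-allocated worklist,
   so the C call stack stays flat. max_nodes must be at least the number
   of reachable nodes. Returns the number of nodes marked, or (size_t)-1
   if the worklist cannot be allocated. */
size_t mark_iterative(Node *root, size_t max_nodes) {
    if (root == NULL || root->marked) return 0;
    if (max_nodes == 0) return (size_t)-1;
    Node **work = malloc(max_nodes * sizeof *work);
    if (work == NULL) return (size_t)-1;
    size_t top = 0, marked = 0;
    root->marked = 1;              /* mark on push: no node is queued twice */
    work[top++] = root;
    while (top > 0) {
        Node *o = work[--top];
        marked++;
        Node *links[2] = { o->next, o->prev };
        for (int i = 0; i < 2; i++) {
            if (links[i] != NULL && !links[i]->marked) {
                links[i]->marked = 1;
                assert(top < max_nodes);
                work[top++] = links[i];
            }
        }
    }
    free(work);
    return marked;
}
```

On a chain of n nodes, mark_recursive(head, 0) returns n, so the call stack grows linearly with the set, while mark_iterative keeps the same marking result with a constant number of frames. If the crash only shows up once the set passes a certain size, that would fit this picture.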


Re: Troubles with OSX

by Philippe Ribet

Phil Malin wrote:

> The odd thing is that the address 0xbf7ffffc is always the offending
> address, even when I encounter this error in other applications.  Is
> there anything else I can do to shed some more light?
>
Could you please try again with C compiler optimisations turned off?
(-O0 for gcc)

I didn't say there is a bug in the C optimiser code! The effect will
be to use memory instead of registers. Does it work better?

Hope this will help,

--

             Philippe Ribet


SmartEiffel:
one methodology, one language,
highest quality kept secret.
Visit http://smarteiffel.loria.fr



Re: Troubles with OSX

by Phil Malin-3

Philippe Ribet wrote:

> Phil Malin wrote:
> [snip]
>>
> Could you please try again with C compiler optimisations turned off?
> (-O0 for gcc)
>
> I didn't say there is a bug in the C optimiser code! The effect will
> be to use memory instead of registers. Does it work better?
>
> Hope this will help,
>

Actually, I didn't have any optimisations on - I had turned these off.  But
I did have loop unrolling and function inlining enabled, with the
architecture set to Pentium 4.  I just ran it again with no compiler options
except for '-g' and I get exactly the same problem (same place, same memory
location).

Phil.
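
One further check, offered as an assumption rather than a diagnosis: a fault address that never changes and sits just under a round boundary may be the end of the main thread's stack. The sketch below reads the stack limit with POSIX getrlimit and measures how far an address lies below an assumed stack top. The stack top of 0xc0000000 for 32-bit Intel OS X and the helper names (`stack_limit_bytes`, `bytes_below_top`, `beyond_stack`) are assumptions, not something confirmed in this thread.

```c
#include <stdint.h>
#include <sys/resource.h>

/* Soft limit on the main thread's stack in bytes, or 0 if it is
   unlimited or cannot be read. */
uint64_t stack_limit_bytes(void) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0) return 0;
    if (rl.rlim_cur == RLIM_INFINITY) return 0;
    return (uint64_t)rl.rlim_cur;
}

/* Distance in bytes from an assumed stack top down to a fault address.
   The caller supplies the stack top; it is not detected here. */
uint64_t bytes_below_top(uint64_t stack_top, uint64_t fault_addr) {
    return stack_top - fault_addr;
}

/* 1 if the fault address lies below the lowest byte of a stack of
   `limit` bytes ending at stack_top, 0 otherwise (or if limit is 0). */
int beyond_stack(uint64_t stack_top, uint64_t limit, uint64_t fault_addr) {
    return limit != 0 && bytes_below_top(stack_top, fault_addr) > limit;
}
```

With the address from the gdb output, bytes_below_top(0xc0000000, 0xbf7ffffc) is 0x800004, which is four bytes more than 8 MiB. If stack_limit_bytes() reports 8 MiB on the affected machine, that would be consistent with the mark recursion running off the end of the stack, and raising the limit with `ulimit -s` in the shell before rerunning would be a cheap way to test the idea.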