erts design information/experts around?

by Paul Fisher-3

I sent the following note to erlang-bugs two days ago and have not seen
anyone comment.
 

> We have a system where we run lots of linked-in driver ports that get
> created/used/closed frequently and sometimes very quickly.  Today when
> several open_port/2, port_command/2 and port_close/1 cycles happened in
> rapid succession, a SIGSEGV occurred in erl_bif_ddll.c:
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 1125235040 (LWP 12087)]
> 0x0000000000449712 in erl_ddll_try_unload_2 (p=0x2aaaab11fc90,
>     name_term=659339, options=46912503328425) at beam/erl_bif_ddll.c:592
>

This is not the first email about runtime internals that I have sent
which has gone without comment, and so I am wondering if the people with
detailed knowledge of the runtime do not follow these lists.

Is there a better place/way to get in contact?  (What I am really
looking for is a discussion of the overall internals design of the smp
runtime structures, so that I can get a jump start on fixing these types
of things myself.)

thanks in advance,


--
paul

_______________________________________________
erlang-questions mailing list
erlang-questions@...
http://www.erlang.org/mailman/listinfo/erlang-questions

Re: erts design information/experts around?

by Raimo Niskanen-2

On Thu, Jul 03, 2008 at 10:12:27AM -0500, Paul Fisher wrote:

> I sent the following note to erlang-bugs two days ago and have not seen
> anyone comment.
>  
> > We have a system where we run lots of linked-in driver ports that get
> > created/used/closed frequently and sometimes very quickly.  Today when
> > several open_port/2, port_command/2 and port_close/1 cycles happened in
> > rapid succession, a SIGSEGV occurred in erl_bif_ddll.c:
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 1125235040 (LWP 12087)]
> > 0x0000000000449712 in erl_ddll_try_unload_2 (p=0x2aaaab11fc90,
> >     name_term=659339, options=46912503328425) at beam/erl_bif_ddll.c:592
> >
>
> This is not the first email about runtime internals that I have sent
> which has gone without comment, and so I am wondering if the people with
> detailed knowledge of the runtime do not follow these lists.
>

They are on vacation. We would need both the experts on
SMP and the experts on Windows debugging.

Were it on Unix we would have requested a core dump,
but on Windows I do not know if there is such a thing,
or if there has to be a Visual Studio 8 installation
plus source code on the failing machine. The experts
on Windows debugging would know.

> Is there a better place/way to get in contact?  (What I am really
> looking for is a discussion of the overall internals design of the smp
> runtime structures, so that I can get a jump start on fixing these types
> of things myself.)

Erlang-questions is a good place. There are lots of smart
people here, not all of whom read erlang-bugs, and some of
them may have a clue.

>
> thanks in advance,
>
>
> --
> paul

--

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

Re: erts design information/experts around?

by Paul Fisher-3

On Thu, 2008-07-03 at 17:53 +0200, Raimo Niskanen wrote:

> On Thu, Jul 03, 2008 at 10:12:27AM -0500, Paul Fisher wrote:
> > I sent the following note to erlang-bugs two days ago and have not seen
> > anyone comment.
> >  
> > > We have a system where we run lots of linked-in driver ports that get
> > > created/used/closed frequently and sometimes very quickly.  Today when
> > > several open_port/2, port_command/2 and port_close/1 cycles happened in
> > > rapid succession, a SIGSEGV occurred in erl_bif_ddll.c:
> > >
> > > Program received signal SIGSEGV, Segmentation fault.
> > > [Switching to Thread 1125235040 (LWP 12087)]
> > > 0x0000000000449712 in erl_ddll_try_unload_2 (p=0x2aaaab11fc90,
> > >     name_term=659339, options=46912503328425) at beam/erl_bif_ddll.c:592
> > >
> >
> > This is not the first email about runtime internals that I have sent
> > which has gone without comment, and so I am wondering if the people with
> > detailed knowledge of the runtime do not follow these lists.
> >
>
> They are on vacation. We would need both the experts on
> SMP and the experts on Windows debugging.

Vacation explains it, thx!

My fault for not saying so: this is R12B-3 on 64-bit Intel Linux (Debian
etch).  It also happens on R12B-2 in the same way.


> > Is there a better place/way to get in contact?  (What I am really
> > looking for is a discussion of the overall internals design of the smp
> > runtime structures, so that I can get a jump start on fixing these types
> > of things myself.)
>
> Erlang-questions is a good place. There are lots of smart
> people here, not all of whom read erlang-bugs, and some of
> them may have a clue.

In the hopes that someone on the list can give some insight into the
invariants maintained by the runtime while managing port instances, I
had the following question(s) about the erts_port[] array maintained by
the runtime:

The code at the point of the SIGSEGV @ erl_bif_ddll.c:592 says:

        for (j = 0; j < erts_max_ports; j++) {
=>          if (!(erts_port[j].status &  FREE_PORT_FLAGS)
                && erts_port[j].drv_ptr->handle == dh) {

The code appears to assume that any erts_port array entry examined
during the search has a valid (non-zero) drv_ptr value whenever the
entry is not marked as free.

So two questions:
1) Is the assumption built into this code correct?
2) If so, is there missing synchronization that allows the assumption to
be violated?

I'd appreciate some insight into what could be going on here, and where
I should start looking.
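
To make the assumption in question concrete, here is a minimal sketch of
the same search with an explicit NULL check on drv_ptr. Everything in it
is a simplified, hypothetical stand-in (struct sketch_port,
struct sketch_driver, SKETCH_FREE_PORT_FLAGS and
sketch_count_ports_using are invented for illustration); it is not the
real ERTS code, and the real structures and flag values differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the runtime's structures; the real ERTS
 * definitions have many more fields and different flag values. */
#define SKETCH_FREE_PORT_FLAGS 0x1u

struct sketch_driver {
    void *handle;                  /* library handle the driver came from */
};

struct sketch_port {
    unsigned status;               /* free flag set => slot is unused */
    struct sketch_driver *drv_ptr; /* assumed here to be possibly NULL */
};

/* Count non-free ports whose driver was loaded from handle dh.
 * Unlike the loop quoted above, a non-free slot with a NULL drv_ptr
 * is skipped instead of dereferenced. */
static size_t sketch_count_ports_using(const struct sketch_port *ports,
                                       size_t nports, const void *dh)
{
    size_t j, count = 0;

    assert(ports != NULL || nports == 0);
    for (j = 0; j < nports; j++) {
        if (ports[j].status & SKETCH_FREE_PORT_FLAGS)
            continue;              /* free slot: nothing to inspect */
        if (ports[j].drv_ptr == NULL)
            continue;              /* not free, but no driver attached */
        if (ports[j].drv_ptr->handle == dh)
            count++;
    }
    return count;
}
```

A guard like this would only hide the symptom. If a slot really can be
seen as "not free" before drv_ptr is set (or after it is cleared), the
more useful question is which lock is supposed to cover status and
drv_ptr together while this loop runs.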


--
paul
