« Return to Thread: Problems Writing a Scalable Shared Data Structure
Thanks Robert and Hynek.
Hynek; playing around with the number of scheduler threads (+S) or asynchronous threads (surprisingly) does not significantly alter the results. As for the tree size, on average it should be around 500 nodes, assuming it's balanced this gives us a bst that's roughly 8 nodes deep. A tree this big should scale well with 8 processors. Moreover, I'd expect a tree this big to scale almost perfectly going from 1 to 2 processors; especially if the ratio of lookups (contains()), that don't modify the data structure, is high. However, all I'm seeing is minor improvements. Moreover, increasing the size of the tree from 1000 to 10000 does improve scalability a bit, but not by all that much either.
Robert; I think you're referring to an earlier version of my code, where I don't wait for the results of an insert or a delete. The lastest one (http://www.cs.auckland.ac.nz/~fuad/cbst.erl), the final node inserted/deleted does inform the client whether the operation was successful or not. I can't confirm, but on a big enough tree stacking shouldn't be an issue since each node will probably only relay its request. But then again you're right, deeper analyses seems to be in order before I could make such statements with any kind of certainty.
That said, I feel that this solution doesn't seem to be the best one. I'm sure other Erlang programmers must have had to solve a similar problem, maintaining shared data between different threads in a scalable way.
Thanks again,
/FuadOn Mon, Jul 7, 2008 at 9:58 PM, Robert Raschke <rtrlists@...> wrote:
Hi Fuad,
OK, I see.
On 7/7/08, Fuad Tabba <fuad@...> wrote:
> Thanks Robert,
>
> Yes I am familiar with Amdahl's law. That said, it's not true that all
> writes need to be serialized; it's a binary tree, and unless the
> modifications happen to be to the same node, there's no reason why they need
> to be serialized.
>
> A database would solve the problem kinda; but that's like using a
> steamroller to crack a nut. What I'm looking for a lightweight option that
> could temporarily store data shared between threads, and would scale well as
> the number of threads accessing/modifying this structure increases. I know
> how to solve this problem in C (writing a concurrent binary search tree that
> scales well), but since I'm new to Erlang I was unsuccessful trying to solve
> this problem in Erlang.
So, if I read your code correctly, your rpcInserts() and rpcDeletes()
return quick, because your request is trickling down the tree on its
own. Its the rpcContains() that need to wait for a proper result.
It might be interesting to test the individual operations on their
own, and then combinations thereof. That might give a quicker insight
into what is causing the lack of speedup. My (uneducated) hunch is
that the rpcContains() calls are where your "threads" are getting
stuck (due in part to all the inserts and deletes clogging up the
tree).
Very unscientific, sorry.
Robby
_______________________________________________
erlang-questions mailing list
erlang-questions@...
http://www.erlang.org/mailman/listinfo/erlang-questions
« Return to Thread: Problems Writing a Scalable Shared Data Structure
| Free Forum Powered by Nabble | Forum Help |