Project FishFarm; making a ForkJoinPool distributable

Hello,

I am Michael Bien, the owner of https://fishfarm.dev.java.net/, a project to distribute tasks written for the fork-join framework over the network. The project is based on Shoal for the p2p communication and on jsr166y for local distribution. We maintain a temporary fork of jsr166y because we made minor modifications to ForkJoinPool to make it distributable over grids.

Our main goal is to make distribution as simple and as efficient as possible. We introduced DistributedForkJoinPool, which extends ForkJoinPool. Under the hood it is a member of a Shoal peer group and allows its tasks to be stolen by other group members (the work-stealing concurrency pattern). The distributed version behaves identically to the original fj-pool and also works while offline. That enables a dynamic cluster: you can add and remove nodes at any time. If you run your grid within one network you do not even need additional configuration.

As I said, I made minor modifications to the framework. I would like to discuss the changes here, because if we can't contribute them back to the JSR our project will stay a proof of concept; fragmenting the framework makes no sense in my opinion and was never intended.

Changes:
- made ForkJoinTask Serializable
- added popQuedTask() to ForkJoinPool, which returns a not-yet-executed task and removes it from the pool (I am pretty sure this will not work under all conditions)
- added getTask() to Submission

All changes are tagged with '//NEW' + comment in the sources.

Any chance to get this functionality integrated? (I planned to attach a diff patch, but I haven't found a tool which is able to diff svn vs. cvs. I guess it was a mistake to commit the jsr166y fork into our own repository.)

FishFarm is not yet ready for production. We recently ran a successful 24h test with 40k submitted tasks on 5 nodes, but there is still some work left (e.g. cancelling tasks is not yet implemented). Feel free to check it out from svn and try it. There is a ready-to-run test.DistributedFibunacciTask, and the worker nodes can be started via webstart: https://fishfarm.dev.java.net/demo (because of system tray bugs we currently require the early-access JRE update 10 version for the webstart app, but this will change).

Feedback is very much appreciated ;)

best regards,
Michael

_______________________________________________
Concurrency-interest mailing list
Concurrency-interest@...
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
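For readers who have not used the fork-join framework, the following is a minimal local sketch of the kind of task the Fibonacci test above refers to. It is not FishFarm code: it uses the java.util.concurrent classes that later shipped in the JDK rather than the jsr166y package, and the names FibonacciSketch and FibTask are made up for illustration. According to the message above, the distributed variant would differ only in which pool class is constructed.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class FibonacciSketch {
    /** Naive fork-join Fibonacci: every call splits into two subtasks. */
    static final class FibTask extends RecursiveTask<Long> {
        private static final long serialVersionUID = 1L;
        private final int n;

        FibTask(int n) {
            this.n = n;
        }

        @Override
        protected Long compute() {
            if (n <= 1) {
                return (long) n;
            }
            FibTask left = new FibTask(n - 1);
            left.fork();                                // queued; an idle worker may steal it
            long right = new FibTask(n - 2).compute();  // computed in the current worker
            return left.join() + right;
        }
    }

    /** Runs the task in a plain local pool (assumption: no distribution involved). */
    static long fib(int n) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new FibTask(n));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(fib(20));
    }
}
```

The fork() call is the point where work becomes available for stealing, which is presumably the hook a distributed pool would build on.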

Re: Project FishFarm; making a ForkJoinPool distributable

For sure we can discuss this off-line if you like. Feel free to contact me directly.

But keep in mind that the whole motivation for building FishFarm was to do distribution without introducing a new framework ;) The idea is to make grid computing optional in case you are using the fj-framework anyway.

FishFarm is around 6 months old and is a spare-time project. It was never in production, since I would never introduce a project which is based on a fork of a JSR ;-)

-michael

Re: Project FishFarm; making a ForkJoinPool distributable

Michael Bien wrote:
> Ben Manes wrote:
>> How far along is this project?
>>
>> I wrote a distributed master/slave framework and (for fun) have a
>> map-reduce abstraction prototyped. I was hoping to add fork-join,
>> similar to your message, when we moved to Java-6 (and could use the
>> jsr lib). So far my framework has been in production for 2 years,
>> gone through performance reviews, etc. I've always wanted to open
>> source it, but never got the powers that be to approve.
>>
>> If you're interested, we can discuss such things off-line.
>
> For sure we can discuss this off-line if you like. Feel free to contact
> me directly.
> But keep in mind all the motivation to build FishFarm was actually doing
> distribution without introducing a new Framework ;) The idea is to make
> grid computing optional in case you are using the fj-framework anyway.
>
> FishFarm is around 6 month old and is a freetime project, it was never
> in production since I would never introduce a project which is based on
> a fork of a jsr ;-)

I hope that everyone working in this direction with Java remembers to pay attention to all the work that has gone into Jini to make this possible. Security and other tools such as leases and transactions are already implemented and working to help track distributed resources effectively, and the security features of Java are extended in the JERI stack to help eliminate the impact of DoS attacks.

The current version of Jini is now the River podling in the Apache Incubator.

Gregg Wonderly

Re: Project FishFarm; making a ForkJoinPool distributable

Gregg Wonderly wrote:
> I hope that everyone working in this direction with Java, remembers to
> pay attention to all the work that has gone into Jini to make this
> possible with security and all the other tools such as leases and
> transactions, already implemented and working to help track
> distributed resources effectively, including the security features of
> Java, extended in the JERI stack to help eliminate the impact of DOS
> attacks.
>
> The current version of Jini is now the River Podling in the Apache
> Incubator.

There are many distribution systems around and each of them has its use case: JavaSpaces (and its commercial forks), GridGain, Terracotta, Apache River... and even more specialized and therefore not mainstream projects like Helios for Sunflow ray-tracing tasks. Almost every one of them introduced its own framework or, to put it another way, IS a framework. (Terracotta is the exception; it is a layer on top of the JVM.)

FishFarm is not a full-featured distribution system. It currently concentrates only on the distribution of ForkJoinPools and has no framework of its own at all (jsr166y is its framework). Again, it simply works if you replace "new ForkJoinPool()" with "new DistributedForkJoinPool()" and put the jars on the classpath.

It currently has no security at all, since we have not yet had the requirement to install it outside closed networks, which are pretty common for clusters. We even decided to try it without transactions (there is no JEE in it, just simple messaging based on Shoal, which is based on JXTA), because the master simply does not care whether a worker succeeded with the computation or received its requested work. The master simply puts all of its "stolen" work into a backup and copies it back into the pool when the pool is empty, so it still behaves like a ForkJoinPool.

But my initial question was pretty general, even independent of FishFarm, and the answer could be used in any framework you like: Is there interest in making ForkJoinTasks Serializable and providing a mechanism for querying not-yet-executed tasks from a ForkJoinPool?

best regards,
Michael
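The backup idea described above (stolen work is remembered and copied back once the pool runs empty) can be sketched roughly as follows. This is only an illustration of the bookkeeping, not FishFarm's implementation: the class StolenWorkBackup, its method names, and the assumption that a peer eventually reports a finished task through completed() are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical bookkeeping for tasks handed out to remote peers. */
class StolenWorkBackup<T> {
    private final Queue<T> pending = new ArrayDeque<>(); // not yet executed locally
    private final List<T> lentOut = new ArrayList<>();   // stolen by peers, result not yet seen

    synchronized void submit(T task) {
        pending.add(task);
    }

    /** A remote peer steals a task: hand it over, but keep a backup copy. */
    synchronized T steal() {
        T task = pending.poll();
        if (task != null) {
            lentOut.add(task);
        }
        return task;
    }

    /** Assumed callback: a peer delivered the result, so the backup can go. */
    synchronized void completed(T task) {
        lentOut.remove(task);
    }

    /** Next task to run locally; refills from the backup when the queue is empty. */
    synchronized T nextLocal() {
        if (pending.isEmpty() && !lentOut.isEmpty()) {
            pending.addAll(lentOut); // the master never waits on a silent peer
            lentOut.clear();
        }
        return pending.poll();
    }

    public static void main(String[] args) {
        StolenWorkBackup<String> backup = new StolenWorkBackup<>();
        backup.submit("a");
        backup.submit("b");
        System.out.println("stolen: " + backup.steal());
        System.out.println("local:  " + backup.nextLocal());
        System.out.println("local:  " + backup.nextLocal()); // the stolen task comes back
    }
}
```

A consequence of this design is that a task may run twice (once remotely, once locally after the refill), which is harmless only for tasks without side effects.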

Re: Project FishFarm; making a ForkJoinPool distributable

Michael Bien wrote:
> Hello Gregg,
>
> There are many distribution systems around and each of them has its use
> case. Java Spaces (and its commercial forks), Grid Gain, Terracotta,
> Apache River... and even more specialized and therefore not mainstream
> projects like Helios for Sunflow ray tracing tasks. Almost each of them
> introduced its own framework or lets say it in other words - IS a
> framework. (Terracotta is the exception which is a layer on top of the JVM)
>
> FishFarm is no full featured distribution system, it concentrates
> currently only on the distribution of ForkJoinPools and has no framework
> at all (jsr166y is its framework). Again, it simple works if you
> exchange "new ForkJoinPool()" with "new DistributedForkJoinPool()" and
> put the jars in the classpath.

One of the more interesting issues is that remote computational tasks have a unique remote communications requirement that has to be accounted for in some way. Partial failure and security usually become issues faster than most think, and as you start to deal with those issues, you end up reinventing the wheel, because so much work has already been done to deal with them.

In the end, it comes down to APIs and some simple algorithmic choices. All of the issues that tool sets such as Jini deal with are real and important to manage at some level. The JERI stack finally provides a unique opportunity to optimize and tune the RPC stack dynamically, at deployment time, instead of in your code. The Jini Configuration interface and the ConfigurationFile implementation allow you to change, for example, the Exporter instance you deploy with, so that you can put TCP vs SSL, No-Auth vs X.500 vs Kerberos authentication, and security constraints all into a configuration step instead of a programming/reprogramming task.

Many people still look at Jini as "tied" to "RMI". It is tied to the RMI programming model, but through the use of JERI you can separate most language issues for remote interaction (an HTTP endpoint with an XML invocation layer factory, for example).

> It has currently no security at all since
> we haven't got the requirement yet to install it outside closed networks
> which are pretty common for clusters. We even decided to try it without
> transactions (there is no JEE in it just simple messaging based on Shoal
> which is based on JXTA) because the master simple does not care if
> worker succeed with the computation or got its requested work or not.
> The master simple puts all of its "stolen" work into a backup and copies
> it back into the pool when the pool is empty - it still behaves like a
> ForkJoinPool.

Simple is easy to start with. There was a study done on creating a JXTA endpoint for the JERI stack. I've not used it. The transactional services in Jini are based on a simple 2PC model that is implemented by the Mahalo service in the River podling (and the previously released Jini Technology Starter Kit v2.1).

The ability to get dependable and predictable results and to see progress seems like an important thing to me, so I'm not sure that I understand how this framework can support that.

The JavaSpaces master/worker pattern is pretty well proven as a distributed system for massive scaling. Creating a worker that uses the ForkJoin stuff to further divide work locally would be an added mechanism for further exploiting the power available.

> But my initial question was pretty general even FishFarm independent and
> could be used in any framework you like:
> Is there interest in making ForkJoinTasks Serializeable and providing a
> mechanism for querying not yet executed tasks from a ForkJoinPool?

I think these two things have some merit. Serializable implies a number of issues about the underlying references as well, though.

Gregg Wonderly

Re: Project FishFarm; making a ForkJoinPool distributable

Gregg Wonderly wrote:
> One the more interesting issues is that Remote computational tasks
> have a unique remote communications requirement that has to be
> accounted for in some way. Partial failure and security usually become
> issues faster than most think, and as you start to deal with those
> issues, you end up reinventing the wheel because of all the work
> already done to deal with those issues.
>
> [...]
>
> The ability to get dependable and predictable results and see
> progress, seems like an important thing to me. So I'm not sure that I
> understand how this framework can support that.
>
> The Javaspaces master/worker pattern is pretty well proven as a
> distributed system for massive scaling. Creating a worker that uses
> the ForkJoin stuff to further divide work locally would be an added
> mechanism for further exploiting the power available.

Hello Gregg,

I agree with almost everything you said. The point is: FishFarm is not a framework and does not try to reinvent the wheel ;-) It really just distributes the ForkJoinPool behind the scenes, nothing more. It does not have to order task execution, because ForkJoinPools have no ordering either; the same goes for seeing progress, etc.

It would be insane to try to reinvent JavaSpaces & co. and claim to be better within 6 months of a spare-time project. That's not possible and I will never try it (despite the fact that students have a lot of time ;-) ).

If you start on a green field and design your app to be distributed, you will design it with a distribution framework in mind. But maybe you already have code which uses ForkJoinPools and was never designed to be distributed. With FishFarm you can distribute it immediately, without any configuration. That's the use case.

>> But my initial question was pretty general even FishFarm independent
>> and could be used in any framework you like:
>> Is there interest in making ForkJoinTasks Serializeable and providing
>> a mechanism for querying not yet executed tasks from a ForkJoinPool?
>
> I think these two things have some merit. Serializable implies a
> number of issues about the underlying references as well though.

ForkJoinTask:

    ForkJoinTask<V> implements Serializable

is the only diff regarding serialization. You don't want to serialize the whole pool including its state. ForkJoinTasks are plain objects; if you want something special, provide writeObject(..) and readObject(..) as always.

Regarding the not-yet-executed tasks (ForkJoinPool):

    public <T> ForkJoinTask<T> popQueuedTask() {
        // take the next pending submission, if any, out of the pool
        Submission<?> submission = submissionQueue.poll();
        if (submission == null)
            return null;
        return (ForkJoinTask<T>) submission.getTask();
    }

Submission:

    public ForkJoinTask<T> getTask() {
        return task;
    }

(Is there an issue tracker for jsr166y where I can submit it?)

> Gregg Wonderly

Thank you for your feedback,

-michael
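To make the serialization point concrete, here is a small sketch of shipping a not-yet-executed task through Java serialization and running the copy. It assumes the java.util.concurrent version of the framework, where ForkJoinTask implements Serializable, rather than the jsr166y fork discussed here; the class names TaskSerializationSketch and SumTask and the helper methods are invented for the example. The byte array stands in for whatever transport would carry the task to another node.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class TaskSerializationSketch {
    /** Sums values[from, to) by splitting the range in half. */
    static final class SumTask extends RecursiveTask<Long> {
        private static final long serialVersionUID = 1L;
        private final long[] values; // task arguments are serialized with the task
        private final int from;
        private final int to;

        SumTask(long[] values, int from, int to) {
            this.values = values;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= 2) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += values[i];
                }
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(values, from, mid);
            left.fork();
            long right = new SumTask(values, mid, to).compute();
            return left.join() + right;
        }
    }

    /** Serializes an object to bytes and reads a fresh copy back. */
    static <T extends Serializable> T roundTrip(T obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    /** Runs the deserialized copy, as a receiving node might. */
    static long sumViaCopy(long[] values) {
        try {
            SumTask copy = roundTrip(new SumTask(values, 0, values.length));
            ForkJoinPool pool = new ForkJoinPool();
            try {
                return pool.invoke(copy);
            } finally {
                pool.shutdown();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumViaCopy(new long[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}));
    }
}
```

Note that the whole input array travels with the task, which illustrates the concern raised above about what the underlying references drag along.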

Re: Project FishFarm; making a ForkJoinPool distributable

Michael Bien wrote:

I haven't got an answer yet. Is there no interest in providing this kind of feature in jsr166y? yes/no/perhaps?

best regards,
michael
https://fishfarm.dev.java.net/

Re: Project FishFarm; making a ForkJoinPool distributable

Sorry for the long delay replying to this!

Michael Bien wrote:
> -made ForkJoinTask Serializeable

We can't mandate that all ForkJoinTasks are serializable. In a non-JDK release maybe we could, because then we could just tell people creating subclasses (which is the main way you use FJ) to ignore the serializability if they don't need it, because nothing actually relies on it.

An alternative would be to introduce SerializableForkJoinTask, but then we'd need the basic exec internals fleshed out for the various main flavors (SerializableRecursiveAction etc.), because we don't/can't allow people to implement these themselves.

Further, all you really want here is to ensure that arguments and results are serializable. Sadly you can't write this as a conjunctive type like:

    void foo([ForkJoinTask & Serializable] task)

So my best suggestion, sadly enough, is to rely on dynamic typing, as in:

    void foo(ForkJoinTask task) {
        if (!(task instanceof Serializable)) throw...

This would then entail javadoc @param specs etc. that spell out the otherwise unstated type requirements. (Perhaps someone has already created an annotation tag along these lines?)

How bad would that be?

> -added popQuedTask() to ForkJoinPool which returns a not-yet executed
> task and removes it from the pool (i am pretty sure this will not work
> under all conditions)

You mean a submission to a pool, right? That is now possible via ForkJoinWorkerThread.pollSubmission. It seems somewhat dangerous to expose the ForkJoinPool version so that non-FJ code can remove tasks, but maybe there is a good argument for it?

> -added getTask() to Submission

The Submission class is not even public, so this by itself wouldn't do much good. For plumbing-level manipulation, though, perhaps you could use privileged reflection to directly access the task field?

-Doug
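The dynamic check suggested above can be sketched as follows. Because ForkJoinTask in the JDK's java.util.concurrent implements Serializable, an instanceof test against it would always pass, so the sketch uses a hypothetical non-serializable base class named Task instead; every class and method name here is invented. The second method shows an alternative spelling that Java does accept, an intersection bound on a generic method's type parameter. Whether that would have suited the jsr166y API is not something this thread settles.

```java
import java.io.Serializable;

class SerializableCheckSketch {
    /** Hypothetical stand-in for a task base class that is NOT Serializable. */
    abstract static class Task {
        abstract void run();
    }

    static final class LocalOnlyTask extends Task {
        @Override
        void run() { }
    }

    static final class ShippableTask extends Task implements Serializable {
        private static final long serialVersionUID = 1L;

        @Override
        void run() { }
    }

    /**
     * Dynamic check: accepts any Task at compile time, rejects
     * non-serializable ones at run time.
     *
     * @param task must also implement Serializable (not expressed in the type)
     */
    static void submitChecked(Task task) {
        if (!(task instanceof Serializable)) {
            throw new IllegalArgumentException(
                "task must be Serializable: " + task.getClass().getName());
        }
        task.run();
    }

    /** Compile-time variant using a generic intersection bound. */
    static <T extends Task & Serializable> void submitTyped(T task) {
        task.run();
    }

    /** Small helper so the outcome of the dynamic check can be observed. */
    static boolean accepts(Task task) {
        try {
            submitChecked(task);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts(new ShippableTask()));
        System.out.println(accepts(new LocalOnlyTask()));
        submitTyped(new ShippableTask());
        // submitTyped(new LocalOnlyTask()); // would not compile
    }
}
```

The generic form moves the requirement into the signature, at the cost of making every call site generic; the dynamic form keeps signatures simple but relies on documentation.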