[activecluster-user] peer:// - nodes do not rejoin cluster... (test enclosed)

by Jules Gosnell-4

Tested against activecluster-1.1-SNAPSHOT, activemq-4.0-SNAPSHOT and Sun
JDK 1.4.2_08 and 1.5.0 - two-node cluster, both nodes on the same machine.

start your first node (red)

[jules@zeuglodon core]$ ./cluster.sh activecluster red
2005/10/29 00:04:18:317 BST [INFO] ACCluster - starting...
2005/10/29 00:04:18:321 BST [INFO] ACCluster - ...started

start your second node (green)

[jules@zeuglodon core]$ ./cluster.sh activecluster green
2005/10/29 00:04:31:874 BST [INFO] ACCluster - starting...
2005/10/29 00:04:31:885 BST [INFO] ACCluster - ...started
2005/10/29 00:04:33:323 BST [INFO] ACCluster - onNodeAdd: red

red says:

2005/10/29 00:04:33:536 BST [INFO] ACCluster - onNodeAdd: green
2005/10/29 00:04:33:537 BST [INFO] ACCluster - onCoordinatorChanged: green

Ctrl-C green; after a few seconds red says:

2005/10/29 00:04:46:624 BST [INFO] ACCluster - onNodeFailed: green
2005/10/29 00:04:46:626 BST [INFO] ACCluster - onCoordinatorChanged: red

all fine so far - now restart green:

[jules@zeuglodon core]$ ./cluster.sh activecluster green
2005/10/29 00:04:51:962 BST [INFO] ACCluster - starting...
2005/10/29 00:04:51:967 BST [INFO] ACCluster - ...started

red says......nothing.


if you then start a third node (blue):

[jules@zeuglodon core]$ ./cluster.sh activecluster blue
2005/10/29 00:08:47:990 BST [INFO] ACCluster - starting...
2005/10/29 00:08:47:994 BST [INFO] ACCluster - ...started
2005/10/29 00:08:50:198 BST [INFO] ACCluster - onNodeAdd: green

It sees green, but not red, and green says:

2005/10/29 00:08:49:661 BST [INFO] ACCluster - onNodeAdd: blue
2005/10/29 00:08:49:662 BST [INFO] ACCluster - onCoordinatorChanged: blue

So green sees blue too.

But red says... nothing.
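
To make the divergence explicit, the events logged above can be replayed into one membership set per node. This is only an illustrative sketch: MembershipView and its methods are hypothetical names, not part of ActiveCluster, and the only events fed in are the ones shown in the logs (green here is the restarted instance).

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper (not part of ActiveCluster): one node's view of its peers. */
class MembershipView {
    private final String owner;
    private final Set<String> peers = new TreeSet<String>();

    MembershipView(String owner) {
        this.owner = owner;
    }

    /** Mirrors an onNodeAdd callback. */
    void nodeAdded(String name) {
        peers.add(name);
    }

    /** Mirrors an onNodeFailed callback. */
    void nodeFailed(String name) {
        peers.remove(name);
    }

    Set<String> peers() {
        return peers;
    }

    @Override
    public String toString() {
        return owner + " sees " + peers;
    }
}

class ReplayObservedEvents {
    public static void main(String[] args) {
        MembershipView red = new MembershipView("red");
        MembershipView green = new MembershipView("green"); // the restarted green
        MembershipView blue = new MembershipView("blue");

        // Events as logged by red.
        red.nodeAdded("green");   // 00:04:33 onNodeAdd: green
        red.nodeFailed("green");  // 00:04:46 onNodeFailed: green
        // red logs nothing further when green restarts or blue starts.

        // Events as logged by the restarted green and by blue.
        green.nodeAdded("blue");  // 00:08:49 onNodeAdd: blue
        blue.nodeAdded("green");  // 00:08:50 onNodeAdd: green

        System.out.println(red);    // prints "red sees []"
        System.out.println(green);  // prints "green sees [blue]"
        System.out.println(blue);   // prints "blue sees [green]"
    }
}
```

The replay shows two disjoint groups: red on its own, and green plus blue together.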


Using tcp:// and a broker, red will happily see blue and green after
they are restarted.
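
For switching between the two transports without recompiling, the URI could be read from a system property and then passed to the connection factory. A minimal sketch, assuming a made-up property name cluster.uri; the tcp:// host and the peer:// group are the ones from the enclosed test class, and ClusterUriChooser is a hypothetical name.

```java
class ClusterUriChooser {
    /** Hypothetical property name; not defined by ActiveCluster or ActiveMQ. */
    static final String PROPERTY = "cluster.uri";

    /** Falls back to the peer:// URI used in the test when the property is unset. */
    static String clusterUri() {
        return System.getProperty(PROPERTY, "peer://org.codehaus.wadi");
    }

    public static void main(String[] args) {
        // e.g. java -Dcluster.uri=tcp://smilodon:61616 ClusterUriChooser
        System.out.println("cluster URI: " + clusterUri());
    }
}
```

Running the same test twice, once per URI, keeps everything else identical, so any difference in rejoin behaviour can be pinned on the transport.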

This has been the case with activecluster/activemq for a long time. I
was hoping that it would be fixed by improvements to peer:// in
activemq-4.0, but no :-(


Jules

--
"Open Source is a self-assembling organism. You dangle a piece of
string into a super-saturated solution and a whole operating-system
crystallises out around it."

/**********************************
 * Jules Gosnell
 * Partner
 * Core Developers Network (Europe)
 *
 *    www.coredevelopers.net
 *
 * Open Source Training & Support.
 **********************************/


package org.codehaus.wadi.sandbox.partition.impl;

import java.util.HashMap;
import java.util.Map;

import javax.jms.JMSException;

import org.activecluster.ClusterEvent;
import org.activecluster.ClusterFactory;
import org.activecluster.ClusterListener;
import org.activecluster.impl.DefaultClusterFactory;
import org.activemq.ActiveMQConnectionFactory;
import org.activemq.store.vm.VMPersistenceAdapterFactory;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
 * Minimal membership test: each JVM joins the cluster under the node name
 * given as args[0] and logs every membership event it receives.
 */
public class ACCluster {

    protected final Log _log = LogFactory.getLog(getClass());

    // Swap the two _clusterUri lines to compare a tcp:// broker with peer://.
    //protected final String _clusterUri="tcp://smilodon:61616";
    protected final String _clusterUri="peer://org.codehaus.wadi";
    protected final String _clusterName="ORG.CODEHAUS.WADI.TEST";
    protected final ActiveMQConnectionFactory _connectionFactory=new ActiveMQConnectionFactory(_clusterUri);
    protected final ClusterFactory _clusterFactory=new DefaultClusterFactory(_connectionFactory);
    protected final org.activecluster.Cluster _cluster;
    protected final long _timeout=30*1000L; // currently unused

    public ACCluster(String nodeName) throws Exception {
        // Select the VM (in-memory) persistence adapter for the embedded broker.
        System.setProperty("activemq.persistenceAdapterFactory", VMPersistenceAdapterFactory.class.getName());
        _cluster=_clusterFactory.createCluster(_clusterName);

        // Publish this node's name in its cluster state so that peers can log it.
        Map state=new HashMap();
        state.put("nodeName", nodeName);
        _cluster.getLocalNode().setState(state);

        // Log every membership event together with the name of the node concerned.
        _cluster.addClusterListener(new ClusterListener() {

            public void onNodeAdd(ClusterEvent arg0) {
                _log.info("onNodeAdd: "+arg0.getNode().getState().get("nodeName"));
            }

            public void onNodeUpdate(ClusterEvent arg0) {
                _log.info("onNodeUpdate: "+arg0.getNode().getState().get("nodeName"));
            }

            public void onNodeRemoved(ClusterEvent arg0) {
                _log.info("onNodeRemoved: "+arg0.getNode().getState().get("nodeName"));
            }

            public void onNodeFailed(ClusterEvent arg0) {
                _log.info("onNodeFailed: "+arg0.getNode().getState().get("nodeName"));
            }

            public void onCoordinatorChanged(ClusterEvent arg0) {
                _log.info("onCoordinatorChanged: "+arg0.getNode().getState().get("nodeName"));
            }
        });
    }

    public void start() throws JMSException {
        _log.info("starting...");
        _cluster.start();
        _log.info("...started");
    }

    public void stop() {
        // Left empty in this test.
    }

    public static void main(String[] args) throws Exception {
        ACCluster cluster=new ACCluster(args[0]);
        cluster.start();
        // Stay in the cluster for 100 seconds, then exit.
        Thread.sleep(100*1000);
        cluster.stop();
    }
}
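
The test above is checked by reading the logs. One way to turn it into a pass/fail check would be to have the listener callbacks feed a small waiter that blocks until a named peer has been reported or a timeout (such as the otherwise unused _timeout field) elapses. The sketch below uses only the JDK; PeerWaiter, peerSeen and awaitPeer are hypothetical names, and calling peerSeen from onNodeAdd is an assumption about how it would be wired in, not something the enclosed test does.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: blocks until a given peer name has been reported. */
class PeerWaiter {
    private final Set<String> seen = new HashSet<String>();

    /** Intended to be called from a listener callback such as onNodeAdd. */
    synchronized void peerSeen(String name) {
        seen.add(name);
        notifyAll();
    }

    /** Returns true if the peer was reported within timeoutMillis, false otherwise. */
    synchronized boolean awaitPeer(String name, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!seen.contains(name)) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }
}

class PeerWaiterDemo {
    public static void main(String[] args) throws Exception {
        final PeerWaiter waiter = new PeerWaiter();

        // Simulate a callback arriving from another thread after a short delay.
        Thread callback = new Thread(new Runnable() {
            public void run() {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    return;
                }
                waiter.peerSeen("green");
            }
        });
        callback.start();

        System.out.println("green joined: " + waiter.awaitPeer("green", 2000));
        System.out.println("blue joined:  " + waiter.awaitPeer("blue", 100));
        callback.join();
    }
}
```

With something like this in place, the rejoin scenario would show up as awaitPeer("green", _timeout) returning false on red after green is restarted.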


Attachment: cluster.sh (1K)

Re: [activecluster-user] peer:// - nodes do not rejoin cluster... (test enclosed)

by Jules Gosnell-4

Jules Gosnell wrote:

>
> tested against activecluster-1.1-SNAPSHOT, activemq-4.0-SNAPSHOT and
> Sun JDK 1.4.2_08 and 1.5.0 - two node cluster, both on same machine.

Oops - AMQ-4.0 was not on the machine that I ran this on, so this issue
is actually with AMQ-3.2-M1.

I am trying to run the test with AMQ-4.0, but the second node won't
start because of "java.io.IOException: Journal is allready opened by
another application", so I will have to figure out how to tell it not to
journal - I'll be back ;-)
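
One possible direction for the "not to journal" part, sketched with the JDK only: later ActiveMQ documentation describes peer URIs of the form peer://group/brokerName?persistent=false, where each embedded broker gets its own name and persistence is switched off. Whether the 4.0-SNAPSHOT discussed here accepts that form is an assumption, not something this thread confirms. PeerUriSketch and peerUri are hypothetical names.

```java
import java.net.URI;

class PeerUriSketch {
    /**
     * Builds peer://<group>/<nodeName>?persistent=false.
     * Assumption: the peer transport reads the path as a broker name and
     * honours persistent=false; verify against the ActiveMQ version in use.
     */
    static URI peerUri(String group, String nodeName) {
        return URI.create("peer://" + group + "/" + nodeName + "?persistent=false");
    }

    public static void main(String[] args) {
        String node = args.length > 0 ? args[0] : "red";
        System.out.println(peerUri("org.codehaus.wadi", node));
    }
}
```

Giving each node its own broker name is meant to stop two JVMs on one machine from sharing the same store, which is what the "Journal is allready opened" error suggests is happening.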

Jules


Re: [activecluster-user] peer:// - nodes do not rejoin cluster... (test enclosed)

by rajdavies

Hi Jules,

I'm in the process of porting activecluster to amq4. There's an issue  
with discovery in amq4 at the moment that needs to be fixed.

cheers,

Rob
On 29 Oct 2005, at 02:10, Jules Gosnell wrote:

> Jules Gosnell wrote:
>
>
>>
>> tested against activecluster-1.1-SNAPSHOT, activemq-4.0-SNAPSHOT  
>> and Sun JDK 1.4.2_08 and 1.5.0 - two node cluster, both on same  
>> machine.
>>
>
> oops - AMQ-4.0 was not on the machine that I ran this on - so this  
> issue is actually with AMQ-3.2-M1.
>
> I am trying to run the test with AMQ-4.0, but the second node won't  
> start because "java.io.IOException: Journal is allready opened by  
> another application", So i will have to figure out how to tell it  
> not to journal - I'll be back ;-)
>
> Jules


Re: [activecluster-user] peer:// - nodes do not rejoin cluster... (test enclosed)

by Jules Gosnell-4

Jules Gosnell wrote:

> Jules Gosnell wrote:
>
>>
>> tested against activecluster-1.1-SNAPSHOT, activemq-4.0-SNAPSHOT and
>> Sun JDK 1.4.2_08 and 1.5.0 - two node cluster, both on same machine.
>
>
> oops - AMQ-4.0 was not on the machine that I ran this on - so this
> issue is actually with AMQ-3.2-M1.

I just ran this on AMQ-4.0.

The red and green nodes do not find each other in the first place. :-(

So, in conclusion, this seems to be an AMQ issue (there does not appear
to be any special-case code for peer:// in AC):

using AMQ-3.2, nodes fail to rejoin a cluster
using AMQ-4.0, nodes fail to join a cluster in the first place

Thanks for your time,


Jules



Re: [activecluster-user] peer:// - nodes do not rejoin cluster... (test enclosed)

by Jules Gosnell-4

Rob Davies wrote:

> Hi Jules,
>
> I'm in the process of porting activecluster to amq4. There's an issue  
> with discovery in amq4 at the moment that needs to be fixed.

OK, no problem - if you let me know when the fix goes in, I will be
happy to retest and report.

BTW - do you know if AMQ4 is the release that is intended to be bundled
with Geronimo-1.0? I am targeting a WADI release at Geronimo-1.0 and I
need to make sure that we run on the same version of AMQ as is packaged
with G.

Cheers, Rob,

Jules



Re: [activecluster-user] peer:// - nodes do not rejoin cluster... (test enclosed)

by rajdavies

hi Jules,

The latest ActiveCluster in the repo works with ActiveMQ 3.2; AMQ4
will hopefully be ready soon. I'd appreciate it if you could build
ActiveCluster from source and see whether your problems have disappeared.
There was a nasty deadlock that I ended up fixing.

cheers,

Rob

On 6 Nov 2005, at 18:41, Jules Gosnell wrote:

> Rob Davies wrote:
>
>> Hi Jules,
>>
>> I'm in the process of porting activecluster to amq4. There's an
>> issue with discovery in amq4 at the moment that needs to be fixed.
>
> Ok - no problem - if you let me know when the fix goes in, I will
> be happy to retest and report.
>
> BTW - do you know if AMQ4 is the release that is intended to be
> bundled with Geronimo-1.0? I am targeting a WADI release at
> Geronimo-1.0 and I need to make sure that we run on the same
> version of AMQ as is packaged with G.
>
> Cheers, Rob,
>
> Jules
>


Re: [activecluster-user] peer:// - nodes do not rejoin cluster... (test enclosed)

by Jules Gosnell-4

Rob,

Thanks for putting some time into this...

I've just rebuilt AC and rerun the test that I included. I'm afraid
that it doesn't fix the problem. Once the second node leaves the cluster,
the first node becomes unable to see any subsequent node that wishes to
join. It looks like a logic problem associated with the cluster
dropping to one node but, curiously, it works fine with the tcp://
protocol... Maybe it has something to do with autodiscovery? Do tcp://
and peer:// use common or separate autodiscovery mechanisms?
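
To make the expected behaviour concrete, here is a minimal sketch of a
heartbeat-driven cluster view in which a node that was reported as failed
is simply added again when its next heartbeat arrives. This is only a
model of what the test expects to see on red: `MembershipSketch`, its
`heartbeat` and `sweep` methods, and the timeout value are all
hypothetical and are not ActiveCluster or ActiveMQ code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical model of one node's view of its peers; NOT ActiveCluster code.
class MembershipSketch {

    private final long timeoutMillis;
    // node name -> time of last heartbeat; only live peers are kept here
    private final Map<String, Long> lastSeen = new HashMap<String, Long>();
    // events in the order they fired, e.g. "onNodeAdd: green"
    final List<String> events = new ArrayList<String>();

    MembershipSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called whenever a heartbeat from a peer arrives.
    void heartbeat(String node, long now) {
        if (!lastSeen.containsKey(node)) {
            // Unknown peer, or one that failed earlier: treat it as new.
            events.add("onNodeAdd: " + node);
        }
        lastSeen.put(node, Long.valueOf(now));
    }

    // Called periodically; forgets peers whose heartbeat is overdue.
    void sweep(long now) {
        Iterator<Map.Entry<String, Long>> it = lastSeen.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (now - entry.getValue().longValue() > timeoutMillis) {
                String node = entry.getKey();
                it.remove(); // no tombstone is kept, so the peer can rejoin
                events.add("onNodeFailed: " + node);
            }
        }
    }

    public static void main(String[] args) {
        MembershipSketch red = new MembershipSketch(5000L);
        red.heartbeat("green", 0L);     // green joins
        red.sweep(10000L);              // green was killed; heartbeat overdue
        red.heartbeat("green", 15000L); // green restarted
        for (String event : red.events) {
            System.out.println(event);
        }
    }
}
```

Run as-is, red's view logs onNodeAdd, onNodeFailed and then a second
onNodeAdd for green. With peer:// the real test never produces that
second onNodeAdd on red.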

Is there some test that I can run to help pinpoint the problem?

Sorry to be the bearer of bad news :-(


Jules

Rob Davies wrote:

> hi Jules,
>
> The latest ActiveCluster in the repo works with ActiveMQ 3.2; AMQ4
> will hopefully be ready soon. I'd appreciate it if you could build
> ActiveCluster from source and see whether your problems have disappeared.
> There was a nasty deadlock that I ended up fixing.
>
> cheers,
>
> Rob


Re: [activecluster-user] peer:// - nodes do not rejoin cluster... (test enclosed)

by Jules Gosnell-4

Jules Gosnell wrote:

> Rob,
>
> Thanks for putting some time into this...
>
> I've just rebuilt AC and rerun the test that I included. I'm
> afraid that it doesn't fix the problem. Once the second node leaves
> the cluster, the first node becomes unable to see any subsequent node
> that wishes to join. It looks like a logic problem associated
> with the cluster dropping to one node but, curiously, it
> works fine with the tcp:// protocol... Maybe it has something to do
> with autodiscovery? Do tcp:// and peer:// use common or separate
> autodiscovery mechanisms?
>
> Is there some test that I can run to help pinpoint the problem ?
>
> Sorry to be the bearer of bad news :-(
>
>
> Jules


P.S.

I ran my tests on AMQ 3.2-M1 - I guess that I need to run them on AMQ 3.2
to be sure. The only problem is that 3.2 breaks our Maven build with deps
that cannot be downloaded - maybe our repo list is not what it should be
- I will see what I can do and get back to you in a little while.

>&g