NameNode failover procedure

Hi all,

I've been reading the docs and the code, but I'm still somewhat hazy as to what is the exact step-by-step procedure to perform a failover between a primary NameNode and a SecondaryNameNode, in case the former explodes or catches fire.

So far I learned that the secondary namenode keeps refreshing periodically its backup copies of fsimage and editlog files, and if the primary namenode disappears, it's the responsibility of the cluster admin to notice this, shut down the cluster, switch the configs across the cluster to point to the secondary namenode, start a primary namenode on the secondary namenode's host, and restart the rest of the daemons.

In the meantime the other admin person frantically tries to restore the primary namenode machine, and when it's ready we apply the process in reverse, or we make it into a secondary namenode.

Any comments, clarifications and/or automation of this procedure are welcome. ;)

--
Best regards,
Andrzej Bialecki <><
 ___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com

Re: NameNode failover procedure

Andrzej Bialecki wrote:
> So far I learned that the secondary namenode keeps refreshing
> periodically its backup copies of fsimage and editlog files, and if the
> primary namenode disappears, it's the responsibility of the cluster
> admin to notice this, shut down the cluster, switch the configs across
> the cluster to point to the secondary namenode, start a primary namenode
> on the secondary namenode's host, and restart the rest of the daemons.

If you use DNS to switch the namenode from the primary to the secondary, then no configuration changes or other daemon restarts are required. I think that is the best practice.

Doug

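To make the DNS suggestion concrete, here is a minimal Python sketch. It is not Hadoop code: a dict stands in for DNS, and the alias `namenode.example.com`, the addresses, and the `Client` class are all hypothetical. It shows why a cluster configured with a hostname can follow a repointed name without config changes, and also the caveat that a process holding a cached address keeps using the old machine until it resolves again.

```python
# Illustrative only: a dict stands in for DNS; names and addresses are made up.
dns = {"namenode.example.com": "10.0.0.1"}  # alias currently points at the primary


def resolve(host):
    """Hypothetical resolver: look the hostname up in the fake DNS table."""
    return dns[host]


class Client:
    """A client configured with a hostname, never a raw IP address."""

    def __init__(self, namenode_host):
        self.namenode_host = namenode_host
        self.cached_addr = None

    def connect(self, re_resolve=True):
        # Re-resolving on every (re)connect picks up a DNS change;
        # reusing a cached address does not.
        if re_resolve or self.cached_addr is None:
            self.cached_addr = resolve(self.namenode_host)
        return self.cached_addr


client = Client("namenode.example.com")
before = client.connect()

# Failover: the admin repoints the alias at the replacement machine.
dns["namenode.example.com"] = "10.0.0.2"

stale = client.connect(re_resolve=False)  # still the old address
after = client.connect()                  # picks up the new address
print(before, stale, after)
```

Whether the real daemons behave like the re-resolving path or the cached path depends on the RPC implementation, which this sketch does not model.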
Re: NameNode failover procedure

This is now on the wiki under NameNodeFailover and linked from the main page.

There are some questions unanswered on that page, however. Could somebody who actually knows the answers (unlike me) edit that page to fill it out a bit?

On 7/20/07 9:53 AM, "Doug Cutting" <cutting@...> wrote:

>> So far I learned that the secondary namenode keeps refreshing
>> periodically its backup copies of fsimage and editlog files, and if the
>> primary namenode disappears, it's the responsibility of the cluster
>> admin to notice this, shut down the cluster, switch the configs across
>> the cluster to point to the secondary namenode, start a primary namenode
>> on the secondary namenode's host, and restart the rest of the daemons.
>
> If you use DNS to switch the namenode from the primary to the secondary,
> then no configuration changes or other daemon restarts are required. I
> think that is the best practice.

RE: NameNode failover procedure

A good way to implement failover is to make the Namenode log transactions to more than one directory, typically a local directory and an NFS-mounted directory. The Namenode writes transactions to both directories synchronously.

If the Namenode machine dies, copy the fsimage and edits files from the NFS server and you will have recovered *all* committed transactions.

The SecondaryNamenode pulls the fsimage and edits files once every configured period, typically ranging from a few minutes to an hour. If you use the image from the SecondaryNamenode, you might lose the last few minutes of transactions.

Thanks
dhruba

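The idea of logging every transaction to two directories can be sketched as follows. This is a toy Python model under stated assumptions, not the Namenode's actual code: the file name `edits`, the record format, and the helper names `log_transaction` and `recover` are illustrative, and two temporary directories stand in for the local disk and the NFS mount. In Hadoop itself the directory list is set through the `dfs.name.dir` property as a comma-separated list.

```python
import os
import tempfile


def log_transaction(dirs, record):
    """Append one record to the edits file in every directory and fsync each
    copy before returning, so a committed transaction exists in all of them."""
    for d in dirs:
        with open(os.path.join(d, "edits"), "a") as f:
            f.write(record + "\n")
            f.flush()
            os.fsync(f.fileno())


def recover(directory):
    """Read back every transaction stored in one surviving directory."""
    with open(os.path.join(directory, "edits")) as f:
        return f.read().splitlines()


# Two temporary directories stand in for the local disk and the NFS mount.
local_dir = tempfile.mkdtemp(prefix="local-")
nfs_dir = tempfile.mkdtemp(prefix="nfs-")

for op in ["mkdir /a", "create /a/f1", "delete /a/f1"]:
    log_transaction([local_dir, nfs_dir], op)

# Simulate losing the namenode machine: only the NFS copy is left.
recovered = recover(nfs_dir)
print(recovered)
```

Because the write does not return until both copies are synced, the surviving copy holds every committed record, which is the property a periodic pull by the SecondaryNamenode cannot give.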
Re: NameNode failover procedure

Dhruba Borthakur wrote:
> A good way to implement failover is to make the Namenode log transactions to
> more than one directory, typically a local directory and an NFS-mounted
> directory. The Namenode writes transactions to both directories
> synchronously.
>
> If the Namenode machine dies, copy the fsimage and edits files from the NFS
> server and you will have recovered *all* committed transactions.
>
> The SecondaryNamenode pulls the fsimage and edits files once every configured
> period, typically ranging from a few minutes to an hour. If you use the
> image from the SecondaryNamenode, you might lose the last few minutes of
> transactions.

That's a good idea. But then, what's the purpose of running a secondary namenode, if it can't guarantee that the data loss is minimal? Shouldn't edits be written synchronously to a secondary namenode, and fsimage updated synchronously whenever a primary namenode performs a checkpoint?

--
Best regards,
Andrzej Bialecki <><

RE: NameNode failover procedure

It seems there is no answer yet for all these questions and the wiki has not been updated.

I do not understand the statement of just changing the DNS settings. How will that work exactly?

We would have to change the masters list so that the secondary namenode is first on the list and it would work automatically? The files in the secondary namenode directory are quite different, how do they get used by a primary name node?

It is still quite confusing to me.

Thanks,
Ankur

-----Original Message-----
From: Ted Dunning [mailto:tdunning@...]
Sent: Friday, 20 July, 2007 1:07 PM
To: hadoop-user@...
Subject: Re: NameNode failover procedure

This is now on the wiki under NameNodeFailover and linked from the main page.

There are some questions unanswered on that page, however. Could somebody who actually knows the answers (unlike me) edit that page to fill it out a bit?

Re: NameNode failover procedure

The problem here is probably that the name "secondary namenode" is confusing. It is not a name-node in the sense that data-nodes cannot connect to the secondary name-node, and in no event can it replace the primary name-node in case of its failure.

The only purpose of the secondary name-node is to perform periodic checkpoints. The secondary name-node periodically downloads the current name-node image and edits log files, joins them into a new image, and uploads the new image back to the (primary and only) name-node.

So if the name-node fails and you can restart it on the same node, then there is no need to shut down the data-nodes; only the name-node needs to be restarted. If you cannot use the old node anymore, you will need to copy the latest image somewhere else. The latest image can be found either on the node that used to be the primary before the failure, if it is available, or on the secondary name-node. The latter will be the latest checkpoint without the subsequent edits log, so the latest name space modifications may be missing there. You will probably need to restart the whole cluster in this case.

I don't know whether DNS tricks will work with the current RPC implementation.

Thanks,
Konstantin

Ankur Sethi wrote:
>It seems there is no answer yet for all these questions and the wiki has not
>been updated.
>
>I do not understand the statement of just changing the DNS settings. How
>will that work exactly?
>
>We would have to change the masters list so that the secondary namenode is
>first on the list and it would work automatically? The files in the
>secondary namenode directory are quite different, how do they get used by a
>primary name node?
>
>It is still quite confusing to me.
>
>Thanks,
>Ankur

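The checkpoint cycle described in this message can be sketched in a few lines of Python. This is a toy model under loud assumptions: the namespace is just a set of paths, edits are `(op, path)` tuples, and `apply_edit` and `checkpoint` are hypothetical helpers, not Hadoop functions. It illustrates why the copy left on the secondary name-node lags behind: anything logged after the last checkpoint exists only in the name-node's edits log.

```python
def apply_edit(image, edit):
    """Apply one logged operation to an in-memory namespace (a set of paths)."""
    op, path = edit
    if op == "create":
        image.add(path)
    elif op == "delete":
        image.discard(path)
    return image


def checkpoint(image, edits):
    """Merge the edits log into the image; return (new image, empty log)."""
    new_image = set(image)
    for e in edits:
        apply_edit(new_image, e)
    return new_image, []


# Name-node state: last image plus the edits logged since.
image = {"/a"}
edits = [("create", "/b"), ("delete", "/a")]

# The secondary downloads both, merges them, and uploads the result.
image, edits = checkpoint(image, edits)
secondary_copy = set(image)  # what the secondary is left holding

# More changes arrive after the checkpoint and exist only on the name-node.
edits.append(("create", "/c"))

# Full state = image + edits; the secondary's copy lacks the later edits.
full_state, _ = checkpoint(image, edits)
print(sorted(full_state), sorted(secondary_copy))
```

Restoring from `secondary_copy` alone would silently drop `/c`, which is the "latest name space modifications may be missing" case.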
Re: NameNode failover procedure

NFS seems to be a problem, as NFS locking causes the namenode to hang. Isn't there any other way? Say the namenode writes synchronously to the secondary namenode in addition to its local directories; then, if the namenode fails, we can start the primary namenode process on the secondary namenode's host, where the latest checkpointed fsimage is already present.

This also raises a fundamental question: can we run the secondary namenode process on the same node as the primary namenode process without any out-of-memory / heap exceptions? Also, ideally, how much memory should the primary namenode have when running alone, and how much when running alongside the secondary namenode process?