Backup of an courier-imap mailarchive

by Michelle Konzack

Hello,

On my "business" mail server (courier-imap) I have 43 users (some of them
used as online archives), with over 42,000 mail folders and 17,000,000
messages stored in total. Each month we gain about 1,800 mail folders and
150,000 messages. The total volume is around 180 GByte (actual data size,
not allocated blocks).

Currently I run a daily plus weekly cron job with a self-made script. It
first checks the /cur/ directories, and any directory that does not match
a previously created MD5 + directory listing is backed up. The daily run
is incremental and the weekly run is full.

The daily incremental backup takes around 5 hours and the weekly full
backup around 8 hours.

OK, the Bash script works very fast, but there is a problem with it...

It takes 100% of the CPU...  and it is nearly impossible to connect
via IMAP to the server because the connections time out...  :-(
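
A minimal sketch of the kind of change check described above. It is not the script from this post: the function names, the state file, and the choice to hash only the sorted file names of each cur/ directory are assumptions made for illustration.

```shell
# Hypothetical sketch: decide whether a Maildir folder changed by hashing
# the listing of its cur/ directory and comparing it with the hash that was
# stored on the previous run.

# Print an MD5 over the sorted file names in a directory.
listing_md5() {
    ( cd "$1" && ls -A | LC_ALL=C sort | md5sum | cut -d' ' -f1 )
}

# Return 0 if the listing changed (and store the new hash),
# 1 if it is unchanged, 2 if the directory could not be read.
folder_changed() {
    local dir="$1" state_file="$2" new old
    old=""
    new=$(listing_md5 "$dir") || return 2
    if [ -f "$state_file" ]; then
        old=$(cat "$state_file")
    fi
    if [ "$new" = "$old" ]; then
        return 1
    fi
    printf '%s\n' "$new" > "$state_file"
    return 0
}
```

Starting the whole job as `nice -n 19 ionice -c3 ./backup.sh` (the script name is a placeholder) lowers its CPU and I/O priority, which may keep IMAP logins from timing out; `ionice -c3` only has an effect with an I/O scheduler that honours priorities, such as CFQ.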

Q1:  Has anyone had a similar problem and can help out?

Q2:  Would it be better if I installed 2-3 very small additional storage
     servers (they need only 147 GByte of RAID-1 storage each) which hold
     only the huge mail archives and serve them over a separate Gigabit
     link? (I can install a second NIC in the IMAP server and connect it
     to a 5-port switch for the backup server and storage servers.)

Note: The backup server is an Athlon XP 1800+/512 MB with an Adaptec 29160
      and has 6 x 74 GByte of storage for 6 weeks (one HDD per week).

Thanks, Greetings and nice Day
    Michelle Konzack
    Systemadministrator
    24V Electronic Engineer
    Tamay Dogan Network
    Debian GNU/Linux Consultant


--
Linux-User #280138 with the Linux Counter, http://counter.li.org/
##################### Debian GNU/Linux Consultant #####################
Michelle Konzack   Apt. 917                  ICQ #328449886
+49/177/9351947    50, rue de Soultz         MSN LinuxMichi
+33/6/61925193     67100 Strasbourg/France   IRC #Debian (irc.icq.com)


Re: Backup of an courier-imap mailarchive

by Thomas Goirand

Michelle Konzack wrote:

> Hello,
>
> On my "business" mail server (courier-imap) I have 43 users (some of them
> used as online archives), with over 42,000 mail folders and 17,000,000
> messages stored in total. Each month we gain about 1,800 mail folders and
> 150,000 messages. The total volume is around 180 GByte (actual data size,
> not allocated blocks).
>
> Currently I run a daily plus weekly cron job with a self-made script. It
> first checks the /cur/ directories, and any directory that does not match
> a previously created MD5 + directory listing is backed up. The daily run
> is incremental and the weekly run is full.

Why did you prefer such a solution over something like dirvish, for
example? It does an incremental rsync backup every day, and I don't believe
it would take 100% of your CPU.

Thomas


--
To UNSUBSCRIBE, email to debian-isp-REQUEST@...
with a subject of "unsubscribe". Trouble? Contact listmaster@...


Re: Backup of an courier-imap mailarchive

by Michelle Konzack

On 2008-05-25 23:49:31, Thomas Goirand wrote:
> Why did you prefer such a solution over something like dirvish, for
> example? It does an incremental rsync backup every day, and I don't believe
> it would take 100% of your CPU.

Because "rsync" has not only eaten 100% of my CPU but also the memory...

Thanks, Greetings and nice Day
    Michelle Konzack
    Systemadministrator
    24V Electronic Engineer
    Tamay Dogan Network
    Debian GNU/Linux Consultant


Re: Backup of an courier-imap mailarchive

by Joerg Backschues

Michelle Konzack wrote:

> On my "business" mail server (courier-imap) I have 43 users (some of them
> used as online archives), with over 42,000 mail folders and 17,000,000
> messages stored in total. Each month we gain about 1,800 mail folders and
> 150,000 messages. The total volume is around 180 GByte (actual data size,
> not allocated blocks).
>
> Currently I run a daily plus weekly cron job with a self-made script. It
> first checks the /cur/ directories, and any directory that does not match
> a previously created MD5 + directory listing is backed up. The daily run
> is incremental and the weekly run is full.
>
> The daily incremental backup takes around 5 hours and the weekly full
> backup around 8 hours.
>
> OK, the Bash script works very fast, but there is a problem with it...
>
> It takes 100% of the CPU...  and it is nearly impossible to connect
> via IMAP to the server because the connections time out...  :-(

The best way is to back up your filesystem with block-level
technologies, e.g. with LVM. Block-level snapshots don't care about
the number of files and directories.
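
A rough sketch of what a block-level backup from an LVM snapshot could look like. The volume group `vg0`, the logical volume `mail`, the copy-on-write size, and the image path are all placeholders, and the `run` wrapper with `DRY_RUN` is a hypothetical helper that only prints the commands.

```shell
# Hypothetical sketch: image a mail volume from an LVM snapshot.
# With DRY_RUN=1 the commands are printed instead of executed.

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# snapshot_backup VG LV COW_SIZE IMAGE
snapshot_backup() {
    local vg="$1" lv="$2" cow_size="$3" dest="$4" snap status
    snap="${lv}-snap"
    # The snapshot needs copy-on-write space for everything that changes
    # on the origin volume while the backup is running.
    run lvcreate --snapshot --size "$cow_size" --name "$snap" "/dev/$vg/$lv" || return 1
    # Copy the frozen block device; the number of files does not matter here.
    run dd if="/dev/$vg/$snap" of="$dest" bs=4M
    status=$?
    # Always drop the snapshot, even if the copy failed.
    run lvremove -f "/dev/$vg/$snap"
    return "$status"
}
```

For example, `DRY_RUN=1 snapshot_backup vg0 mail 5G /backup/mail.img` prints the three commands without touching any volume. The image is a raw copy of the whole volume, so its size is that of the volume rather than of the data stored in it.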

There was an article - sorry, in German only - in the iX-Magazin in 2004
(<http://www.heise.de/kiosk/archiv/ix/04/10/136_Im_Blitzlicht>).

--
Greetings
Jörg Backschues



Re: Backup of an courier-imap mailarchive

by Maarten Vink


On 25 May 2008, at 19:51, Michelle Konzack wrote:

> On 2008-05-25 23:49:31, Thomas Goirand wrote:
>> Why did you prefer such a solution over something like dirvish, for
>> example? It does an incremental rsync backup every day, and I don't
>> believe it would take 100% of your CPU.
>
> Because "rsync" has not only eaten 100% of my CPU but also the
> memory...

Rsync tends to use a lot of memory for larger filesystems because it
loads the entire filesystem tree into memory before connecting to the
destination host. There are two ways to fix this:

1) Build a script that splits up the backup into several rsync runs, for
example by starting rsync for each user instead of the entire mail spool
2) Try rsync 3.0; it has a "quickstart" mode (incremental recursion) that
starts synchronizing files before the entire filesystem is read, saving
huge amounts of memory. Rsync 3 is available in the backports collection.
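
Option 1 could be sketched roughly as follows, assuming one directory per user directly below the mail spool. The spool path, the destination, the rsync options, and the `run`/`DRY_RUN` helper are illustrative assumptions, not taken from anyone's actual setup.

```shell
# Hypothetical sketch: one rsync run per user directory instead of one run
# over the whole spool, so each run builds a much smaller file list.
# With DRY_RUN=1 the commands are printed instead of executed.

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# backup_per_user SPOOL DEST
backup_per_user() {
    local spool="$1" dest="$2" userdir name
    for userdir in "$spool"/*/; do
        [ -d "$userdir" ] || continue
        name=$(basename "$userdir")
        # -a preserves metadata, --delete mirrors removals on the backup;
        # nice keeps the CPU priority of each run low.
        run nice -n 19 rsync -a --delete "$userdir" "$dest/$name/"
    done
}
```

For example, `DRY_RUN=1 backup_per_user /var/mail backuphost:/srv/mail` (both paths are placeholders) lists the runs that would be started. For option 2, rsync 3.x uses incremental recursion automatically when both ends run a 3.x version, so no extra flag should be needed.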

Maarten



Re: Backup of an courier-imap mailarchive

by Michelle Konzack

Hello Maarten,

On 2008-05-25 20:41:03, Maarten Vink wrote:
> Rsync tends to use a lot of memory for larger filesystems because it  
> loads the entire filesystem tree in memory before connecting to the  
> destination host. There are two ways to fix this:
>
> 1) Build a script that splits up the backup in several rsync runs, for  
> example by starting rsync for each user instead of the entire mailspool

This is already done, but there are 4 "archive" users who hold over 90%
of the whole mail storage...

> 2) Try rsync 3.0; it has a "quickstart" mode that starts synchronizing  
> files before the entire filesystem is read, saving huge amounts of  
> memory. Rsync 3 is available in the backports collection.

Thanks for the info, I will give it a try...

Thanks, Greetings and nice Day
    Michelle Konzack

Re: Backup of an courier-imap mailarchive

by Mike Bird

On Mon May 26 2008 06:44:51 Michelle Konzack wrote:
> On 2008-05-25 20:41:03, Maarten Vink wrote:
> > 1) Build a script that splits up the backup into several rsync runs, for
> > example by starting rsync for each user instead of the entire mail spool
>
> This is already done, but there are 4 "archive" users who hold over 90%
> of the whole mail storage...

We have a lot of daily rsyncs, one of them 360GB, and they only
take a few minutes unless a lot has changed.  Could this be an
ext3 filesystem built long ago without dir_index?

--Mike Bird



Re: Backup of an courier-imap mailarchive

by Andrew McGlashan

Hi,

Mike Bird wrote:

> On Mon May 26 2008 06:44:51 Michelle Konzack wrote:
>> On 2008-05-25 20:41:03, Maarten Vink wrote:
>>> 1) Build a script that splits up the backup into several rsync runs,
>>> for example by starting rsync for each user instead of the entire
>>> mail spool
>>
>> This is already done, but there are 4 "archive" users who hold over
>> 90% of the whole mail storage...
>
> We have a lot of daily rsyncs, one of them 360GB, and they only
> take a few minutes unless a lot has changed.  Could this be an
> ext3 filesystem built long ago without dir_index?

Okay, I just checked mine with dumpe2fs and it showed that whilst I didn't
specify dir_index, it was a feature in use.

When did this start being a standard feature?

I run rsync backups every hour, which works extremely well, and I may
increase the frequency given how well it performs on my system [which isn't
high end by any stretch].  I am thinking about doing 4 rsync backups per
hour, each one writing to a different backup area, so that I can easily
go back to any file changed or deleted less than 1 hour ago. I may even do 8
over 2 hours.  rsync certainly works very well for my needs on my ext3 file
system.  I'm not currently concerned about using anything that is Maildir
specific (and aware), but that might come later.
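
The four-backups-per-hour rotation could be sketched like this. The slot numbering and the backup paths are assumptions for illustration only.

```shell
# Hypothetical sketch: map the current minute to one of four backup areas,
# so a job started every 15 minutes overwrites a given area once per hour.

# backup_slot MINUTE  ->  prints 0, 1, 2 or 3
backup_slot() {
    local minute
    # Strip one leading zero so that "08" and "09" are not read as octal.
    minute=${1#0}
    echo $(( ${minute:-0} / 15 ))
}
```

A cron entry running every 15 minutes could then call something like `rsync -a --delete /var/mail/ "/backup/slot-$(backup_slot "$(date +%M)")/"`, where both paths are placeholders.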

Kind Regards
AndrewM

Andrew McGlashan
Broadband Solutions now including VoIP

Current Land Line No: 03 9912 0504
Mobile: 04 2574 1827 Fax: 03 9012 2178

National No: 1300 85 3804

Affinity Vision Australia Pty Ltd
http://www.affinityvision.com.au
http://adsl2choice.net.au

In Case of Emergency --  http://www.affinityvision.com.au/ice.html 



Re: Backup of an courier-imap mailarchive

by Mike Bird

On Mon May 26 2008 07:20:32 Andrew McGlashan wrote:
> Okay, I just checked mine with dumpe2fs and it showed that whilst I didn't
> specify dir_index, it was a feature in use.
>
> When did this start being a standard feature?

e2fsprogs (1.39-1) unstable; urgency=low

  * New upstream version
  * Fix debugfs's dump_unused command so it will not core dump on
    filesystems with a 64k blocksize
  * Clarified and improved man pages, including spelling errors
    (Closes: #368392, #368393, #368394, #368179)
  * New filesystems are now created with directory indexing and
    on-line resizing enabled by default
  * Fix previously mangled wording in an older Debian changelog entry
  * Fix doc-base pointer to the top-level html file (Closes: #362544, #362970)

 -- Theodore Y. Ts'o <tytso@...>  Mon, 29 May 2006 11:07:53 -0400

--Mike Bird



Re: Backup of an courier-imap mailarchive

by Michelle Konzack

On 2008-05-26 08:13:22, Mike Bird wrote:

> e2fsprogs (1.39-1) unstable; urgency=low
>
>   * New upstream version
>   * Fix debugfs's dump_unused command so it will not core dump on
>     filesystems with a 64k blocksize
>   * Clarified and improved man pages, including spelling errors
>     (Closes: #368392, #368393, #368394, #368179)
>   * New filesystems are now created with directory indexing and
>     on-line resizing enabled by default
>   * Fix previously mangled wording in an older Debian changelog entry
>   * Fix doc-base pointer to the top-level html file (Closes: #362544, #362970)
>
>  -- Theodore Y. Ts'o <tytso@...>  Mon, 29 May 2006 11:07:53 -0400

My filesystem was created under Sarge, and even Etch does not
include this feature unless you have installed a backport...

Maybe I will reinitialize the filesystem...

Thanks, Greetings and nice Day
    Michelle Konzack
    Systemadministrator
    24V Electronic Engineer
    Tamay Dogan Network
    Debian GNU/Linux Consultant


Re: Backup of an courier-imap mailarchive

by A. Dreyer (debian-isp)

Michelle Konzack wrote:
> My filesystem was created under Sarge, and even Etch does not
> include this feature unless you have installed a backport...
>
> Maybe I will reinitialize the filesystem...

Hi Michelle,


No need to rebuild, just tune your filesystem:

tune2fs -l "$dev" | grep features    # show the features currently enabled

tune2fs -O dir_index "$dev"          # turn on directory indexing


.. should be easy enough.



Regards,
Achim

--
Achim Dreyer                 || http://www.adreyer.com/
Senior Unix & Network Admin  || RHCE, RHCA, CCNA, CCSA, CCSE, JNCIA-FW
Internet Security Consultant || Phone: +44 7756948229



Re: Backup of an courier-imap mailarchive

by Oliver Hitz

Hi Mike

On 26 May 2008, Mike Bird wrote:
> > This is already done, but there are 4 "archive" users who hold over 90%
> > of the whole mail storage...
> We have a lot of daily rsyncs, one of them 360GB, and they only
> take a few minutes unless a lot has changed.  Could this be an
> ext3 filesystem built long ago without dir_index?

The problem is not the size in GB, but the number of files.
I have the same problem here with a file system containing millions of
very small files. rsync takes ages, even with dir_index activated. Even
though I don't really like it, I found that using reiserfs made quite a
difference for me.

Regards

Oliver


Re: Backup of an courier-imap mailarchive

by Seth Mattinen

Michelle Konzack wrote:

> On 2008-05-26 08:13:22, Mike Bird wrote:
>> e2fsprogs (1.39-1) unstable; urgency=low
>>
>>   * New upstream version
>>   * Fix debugfs's dump_unused command so it will not core dump on
>>     filesystems with a 64k blocksize
>>   * Clarified and improved man pages, including spelling errors
>>     (Closes: #368392, #368393, #368394, #368179)
>>   * New filesystems are now created with directory indexing and
>>     on-line resizing enabled by default
>>   * Fix previously mangled wording in an older Debian changelog entry
>>   * Fix doc-base pointer to the top-level html file (Closes: #362544, #362970)
>>
>>  -- Theodore Y. Ts'o <tytso@...>  Mon, 29 May 2006 11:07:53 -0400
>
> My filesystem was created under Sarge, and even Etch does not
> include this feature unless you have installed a backport...
>
> Maybe I will reinitialize the filesystem...
>

I use (and recommend) XFS for mail spools and giant maildir trees these
days.

--
Seth Mattinen sethm@...
Roller Network LLC



Re: Backup of an courier-imap mailarchive

by Christian Kujau


On Mon, May 26, 2008 18:33, ml10154@... wrote:
>> Maybe I will reinitialize the filesystem...
> No need to rebuild, just tune your filesystem:
>
> tune2fs -l $dev | grep features
> tune2fs -O dir_index $dev
> .. should be easy enough.

Just for the record and for the sake of the archives:

You need to "e2fsck -fD" afterwards to index large *existing* directories:
http://lwn.net/Articles/11481/
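
Put together, the whole procedure might look like the sketch below. The helper names are hypothetical, `$dev` stands for the device holding the mail spool, and the mount check only looks for the exact device name in /proc/mounts, so a device mounted under another name (for example via /dev/mapper) would not be detected.

```shell
# Hypothetical helpers around tune2fs and e2fsck.

# Read "tune2fs -l" output on stdin; succeed if the named feature is listed.
has_feature() {
    grep -i '^Filesystem features:' | grep -qw -- "$1"
}

# Enable dir_index and rebuild the indexes of existing directories.
# Refuses to run while the device appears in /proc/mounts.
enable_dir_index() {
    local dev="$1"
    if awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' /proc/mounts; then
        echo "$dev is mounted; unmount it first" >&2
        return 1
    fi
    if ! tune2fs -l "$dev" | has_feature dir_index; then
        tune2fs -O dir_index "$dev" || return 1
    fi
    # -f forces a check even if the filesystem looks clean,
    # -D optimizes (re-indexes) all directories.
    e2fsck -fD "$dev"
}
```

It would be run as root, for example `enable_dir_index /dev/sdb1` (the device name is a placeholder), with the mail service stopped and the filesystem unmounted.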

Thanks,
Christian.
--
make bzImage, not war



Re: Backup of an courier-imap mailarchive

by Christian Kujau

On Thu, 10 Jul 2008, Andrew McGlashan wrote:
>> You need to "e2fsck -fD" afterwards to index large *existing*
>> directories: http://lwn.net/Articles/11481/
>
> Can this be done with a file system that is mounted or must it be unmounted
> first?

e2fsck has to be run on an unmounted filesystem as usual. "e2fsck -n"
would not help, as this does not alter the filesystem and thus would not
add a directory index.

> Also, how can you determine whether or not such htree index exists already?

There's lsattr(1) and the chattr(1) manpage explains:

   The  ’I’ attribute is used by the htree code to indicate that a
   directory is being indexed using hashed trees.
   It may not be set or reset using chattr(1), although it can be
   displayed by lsattr(1).

I guess that dumpe2fs can be used to get this information as well, but
lsattr is so much easier :)
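
A small sketch of that lsattr check. The function name and the example path are made up; the function only inspects the attribute column of `lsattr -d` output.

```shell
# Hypothetical helper: read one line of "lsattr -d DIR" output on stdin and
# succeed if the attribute field (first column) contains the htree flag "I".
is_htree_indexed() {
    local attrs rest
    read -r attrs rest || return 2
    case "$attrs" in
        *I*) return 0 ;;
        *)   return 1 ;;
    esac
}
```

Usage would be along the lines of `lsattr -d /var/mail/user/Maildir/cur | is_htree_indexed && echo indexed`. A small directory that still fits in a single block may not show the flag even with dir_index enabled, so a large folder is the better test case.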

C.
--
BOFH excuse #31:

cellular telephone interference

