Massive cvs migration

Massive cvs migration

by Wurdock, Tom

I am planning to move perhaps a million files into CVS from a previous
source control system.

In some of my tests, I mangled binary files due to not flagging them as
binary.

I've run a script on one module to gather all the extensions used.
There are perhaps 60 extensions that need to be added as binary files.
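
For readers who want to reproduce this survey step, here is a minimal sketch of such a script in POSIX shell. The function name list_extensions is made up for illustration, and it folds case so that "doc", "DOC" and "Doc" are counted together; the original script may well have worked differently.

```shell
# list_extensions DIR: print "count extension" for every file under DIR,
# most frequent first. Extensions are lowercased before counting.
# Files without a dot in their name and CVS admin directories are skipped.
list_extensions() {
    find "$1" -type f -name '*.*' ! -path '*/CVS/*' -print |
        sed 's/.*\.//' |
        tr '[:upper:]' '[:lower:]' |
        sort | uniq -c | sort -rn
}

# Example: list_extensions /path/to/module
```

The resulting list still has to be reviewed by hand to decide which extensions are binary.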

My recursive script to add the files uses a simple "cvs add *", so we'll
have to rely on cvswrappers to map the extensions.  

A few questions:
       
Does each case of an extension need to be mapped? ("doc" AND "DOC" and
"Doc") (someone initial-capped an extension).

Is a cvswrappers file with 120 entries going to cause performance or
other problems?

Is there a better way to go about this?

Perhaps I could add each file individually and, if it's not clearly a
text file, add it as binary?
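
A rough sketch of that per-file idea, assuming a simple heuristic: a file counts as text only if its first 8000 bytes contain no NUL byte. The heuristic, the 8000-byte window and the function names are assumptions of this sketch, not anything CVS does itself. Only "cvs add" and "cvs add -kb" are real CVS commands here.

```shell
# Dry-run hook: set CVS="echo cvs" to print commands instead of running them.
CVS=${CVS:-cvs}

# looks_like_text FILE: succeed if the first 8000 bytes contain no NUL byte.
looks_like_text() {
    total=$(head -c 8000 "$1" | wc -c)
    without_nul=$(head -c 8000 "$1" | tr -d '\000' | wc -c)
    [ "$total" -eq "$without_nul" ]
}

# add_one FILE: add as text only when the heuristic passes, else as binary.
add_one() {
    if looks_like_text "$1"; then
        $CVS add "$1"
    else
        $CVS add -kb "$1"
    fi
}
```

The check errs on the safe side: UTF-16 text contains NUL bytes, so it would be added as binary rather than risk mangling.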

Tom Wurdock
Programmer Analyst
PLATO Learning, Inc.
952-832-1527
twurdock@...



Re: Massive cvs migration

by Spiro Trikaliotis-7

Hello Tom,

* On Mon, Sep 29, 2008 at 11:18:33PM -0500 Wurdock, Tom wrote:
> I am planning to move perhaps a million files into CVS from a previous
> source control system.
>
> In some of my tests, I mangled binary files due to not flagging them as
> binary.
[...]
 
> Is there a better way to go about this?

I would do it (and I have done it) in the following way:

0. Make sure that all text files use LF line ending only, and make sure
   you are on a machine where LF is the natural line ending.

1. On initial check in, add *all* files as binary.
 
2. Afterwards, you can use "cvs admin -kkv" (or one of the other options)
   on the files that are not binary. On Unix-like machines, this can be
   done rather easily with:

       find . -name \*.c | xargs cvs admin -kkv

3. Perform a "cvs up -A" in the sandbox afterwards to account for the
   changes in 2.

4. Repeat 2. and 3. until you are done. ;)
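
Steps 1 to 3 above can be sketched as three small shell functions. This is only an illustration under stated assumptions: the function names and the CVS dry-run variable are invented here, and find's "-exec ... {} +" is swapped in for the xargs pipeline so that file names containing spaces survive. The cvs subcommands themselves (add -kb, admin -kkv, update -A) are the ones named in the steps.

```shell
# Dry-run hook: set CVS="echo cvs" to print commands instead of running them.
CVS=${CVS:-cvs}

# Step 1: add the given files as binary (commit them afterwards).
add_as_binary() {
    $CVS add -kb "$@"
}

# Step 2: switch files whose name matches PATTERN (e.g. '*.c') to -kkv.
# CVS admin directories are skipped.
mark_as_text() {
    find . -type f -name "$1" ! -path '*/CVS/*' -exec $CVS admin -kkv {} +
}

# Step 3: refresh the sandbox so it picks up the new substitution mode.
refresh_sandbox() {
    $CVS update -A
}

# Example round (step 4 repeats this for each known text pattern):
#   mark_as_text '*.c' && refresh_sandbox
```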

This way, you do not trash your binary files accidentally. Note,
however, that step 0., the precondition, can get tricky. In the case
that your files use other line endings, this step can be very
inconvenient.
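
One way to check the precondition before the initial check-in is to list the presumed text files that contain a carriage return. A minimal sketch, assuming a Unix-like machine with POSIX find and grep; the function name list_cr_files is hypothetical:

```shell
# list_cr_files PATTERN: print files under the current directory whose
# name matches PATTERN (e.g. '*.c') and that contain at least one CR byte.
list_cr_files() {
    cr=$(printf '\r')
    find . -type f -name "$1" ! -path '*/CVS/*' -exec grep -l "$cr" {} +
}

# Example: list_cr_files '*.c'
```

Any file it prints would need its line endings converted to LF before step 1.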

However, if this precondition holds true (0.), you can be sure you did
not trash your binary files. Additionally, you can undo any accidental
changes.

Best regards,
Spiro.

--
Spiro R. Trikaliotis                              http://opencbm.sf.net/
http://www.trikaliotis.net/                     http://www.viceteam.org/



RE: Massive cvs migration

by Wurdock, Tom

> 1. On initial check in, add *all* files as binary.

Thank you for the input.  I think having the base be binary and then
explicitly changing the ones I know to be text is a great idea.  Perhaps
in my script I will keep a list of files known to be text.  We are
working exclusively with Windows line endings.

If a text file is added as binary, there is no negative consequence as
far as the file being mangled, correct?  We would just have extra data,
no?

Tom



Re: Massive cvs migration

by Larry Jones-4

Wurdock, Tom writes:
>
> Does each case of an extension need to be mapped? ("doc" AND "DOC" and
> "Doc") (someone initial-capped an extension).

Yes.  You can use character classes to handle all the variations with
just a single entry:

        *.[Dd][Oo][Cc] -k 'b'
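
With around 60 extensions, entries of this shape could be generated rather than typed. A small sketch in POSIX shell; the function name wrapper_entry is made up for illustration, and the output format simply copies the entry shown above:

```shell
# wrapper_entry EXT: print a cvswrappers line that matches EXT in any
# combination of upper and lower case and marks it as binary.
wrapper_entry() {
    ext=$1
    pattern='*.'
    while [ -n "$ext" ]; do
        rest=${ext#?}          # everything after the first character
        ch=${ext%"$rest"}      # the first character itself
        upper=$(printf '%s' "$ch" | tr '[:lower:]' '[:upper:]')
        lower=$(printf '%s' "$ch" | tr '[:upper:]' '[:lower:]')
        if [ "$upper" = "$lower" ]; then
            pattern="${pattern}${ch}"                 # digit etc.: keep as is
        else
            pattern="${pattern}[${upper}${lower}]"    # letter: match both cases
        fi
        ext=$rest
    done
    printf "%s -k 'b'\n" "$pattern"
}

wrapper_entry doc    # prints: *.[Dd][Oo][Cc] -k 'b'
```

Looping over the list of binary extensions and appending the output to a checked-out copy of CVSROOT/cvswrappers would then give one entry per extension instead of one per case variant.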

> Is a cvswrappers file with 120 entries going to cause performance or
> other problems?

No.
--
Larry Jones

I think if Santa is going to judge my behavior over the last year,
I ought to be entitled to legal representation. -- Calvin



RE: Massive cvs migration

by Wurdock, Tom

Wow.  Thank you.

> -----Original Message-----
> From: Larry Jones [mailto:lawrence.jones@...]
> Sent: Tuesday, September 30, 2008 9:37 AM
> To: Wurdock, Tom
> Cc: info-cvs@...
> Subject: Re: Massive cvs migration
> [...]



Re: Massive cvs migration

by Larry Jones-4

Wurdock, Tom writes:
>
> If a text file is added as binary, there is no negative consequence as
> far as the file being mangled, correct?  We would just have extra data,
> no?

That depends on your definition of "mangled" and "extra data".  :-)

Since you're on Windows, adding a text file as binary will include the
<CR> at the end of each line as part of the contents of the line rather
than treating it as part of the line separator like it should.  Changing
the file to non-binary with cvs admin does not fix that.  Thus, when you
check the file out, you'll have an "extra" <CR> at the end of each line
(i.e., each line will end with <CR><CR><LF> instead of just <CR><LF>).
That can be innocuous or a serious problem depending on what you want to
do with the file -- some things don't mind, other things get terribly
confused.
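
The doubling can be simulated outside CVS. The sketch below assumes the behaviour described above: a file added as binary is stored byte for byte, and a text-mode checkout on Windows writes every stored LF as <CR><LF>. The function names are invented for the demonstration, and it is meant to be run on a Unix-like machine.

```shell
# checkout_as_text: simulate a Windows text-mode checkout by writing
# each stored line with a CR LF terminator.
checkout_as_text() {
    awk '{ printf "%s\r\n", $0 }'
}

# to_hex: show the bytes on stdin as one hex string.
to_hex() {
    od -An -tx1 | tr -d ' \n'
}

# Stored via a binary add: the CR is part of the line content.
printf 'one\r\n' | checkout_as_text | to_hex; echo   # 6f6e650d0d0a
# Stored via a text add: only LF is stored.
printf 'one\n' | checkout_as_text | to_hex; echo     # 6f6e650d0a
```

The first result ends in 0d 0d 0a, which is the <CR><CR><LF> sequence; the second ends in the expected 0d 0a.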
--
Larry Jones

Don't you hate it when your boogers freeze? -- Calvin

