cvs update 'pull-only-changed' model ?

cvs update 'pull-only-changed' model ?

Landry Breuil-4
Hello,

I was wondering whether anyone has already had the idea of setting up some
kind of 'pull-only-changed' model for cvs update.
I know that the available methods (CVSync, AnonCVS and CVSup) are full-pull
models, where the full local cvs copy is compared with the whole remote
repository, which is:
- time- and network-consuming for the user
- bandwidth- and load-consuming for the server

Nowadays, we have various ways to stay aware of changes in the repository,
the first that comes to mind being a subscription to src-changes@ and
ports-changes@, or refreshing an RSS feed for a more "user-level" view. For
example, I know the man running freshbsd.org uses Ruby and a set of procmail
filters to update his website. This becomes more of a 'push-like' model :)

So, basically, the idea would be:
1) upon mail reception, parse (perl!) the *-changes@ mailing list output and
gather the Modified/Added/Removed/Imported files/directories
2) put them in a queue/file
3) use your traditional cvs update method to update _only_ the modified parts
of the tree, either manually or with another cron job
4) empty the queue
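
The four steps above could be sketched roughly as follows (in Python here
rather than perl). This is only a minimal sketch under assumptions: the
commit-mail layout (a "Module name:" header, then "Modified files:" /
"Added files:" / "Removed files:" sections with indented
"directory : file ..." entries) is assumed rather than taken from this
thread, the names parse_changes, prune_nested and update_command are made up
for illustration, and imports, wrapped file lists and newly created
directories are not handled.

```python
import re
from typing import List, Set, Tuple

# Assumed mail layout (not confirmed by this thread):
#   Module name:<tab>ports
#   Modified files:
#   <tab>x11/foo        : Makefile distinfo
MODULE_RE = re.compile(r"^Module name:\s*(\S+)")
SECTION_RE = re.compile(r"^(Modified|Added|Removed) files:\s*$")
ENTRY_RE = re.compile(r"^\s+(\S+)\s+:\s+(\S.*)$")


def parse_changes(mail_body: str) -> Tuple[str, Set[str]]:
    """Step 1: return (module, changed directories) for one commit mail."""
    module = ""
    dirs: Set[str] = set()
    in_section = False
    for line in mail_body.splitlines():
        found = MODULE_RE.match(line)
        if found:
            module = found.group(1)
            continue
        if SECTION_RE.match(line):
            in_section = True
            continue
        if in_section:
            entry = ENTRY_RE.match(line)
            if entry:
                dirs.add(entry.group(1))
            else:
                # Any other line (e.g. a blank one) ends the file section.
                in_section = False
    return module, dirs


def prune_nested(dirs: Set[str]) -> List[str]:
    """Step 2: drop queued directories already covered by a queued ancestor."""
    kept: List[str] = []
    for d in sorted(dirs):
        if not any(d == k or d.startswith(k + "/") for k in kept):
            kept.append(d)
    return kept


def update_command(dirs: List[str]) -> List[str]:
    """Step 3: build the argv for one cvs update limited to the queue."""
    return ["cvs", "-q", "update", "-dP"] + dirs
```

A cron job could feed each stored mail to parse_changes, keep the union of
the returned directories as the queue, run the resulting command from the top
of the checked-out module (for example with subprocess.run(cmd, cwd=...)),
and empty the queue (step 4) only if cvs exits successfully.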

Is it worth trying, or is cvs already designed to consume as little time and
bandwidth as possible, so that the gain would be close to zero?
I know this proposal assumes that your tree is always nearly up to date, or
that you run manual updates if *-changes@ mails are not received for some
time or an update fails; otherwise it may lead to a tree where some parts are
in sync and other parts are out of sync. Could this idea lead to errors in
cvs internal files?
What methods are you guys using? Normal automated anoncvs updates? Manually
updating only the changed parts? Is the load on the anoncvs servers
negligible?
If this has already been discussed, sorry for being lame..

Thanks for any comments/input,

Landry

Re: cvs update 'pull-only-changed' model ?

Christian Weisgerber
Landry Breuil <[hidden email]> wrote:

> I was wondering if some people already had the idea of setting some kind of
> 'pull-only-changed' model for cvs update.
> I know available methods (cvssync, anoncvs and cvsup) are full pull-model,
> where we compare the full local cvs copy with the whole remote repository,

No.  CVSync and CVSup only send a metadata summary to the server; the
server compares this against its own database and sends only diffs of
the changes back to the client.

> which is :
> - time and network consuming for the user
> - bandwidth and load consuming for the server

It really isn't.  And any well-configured CVSync or CVSup server
keeps a metadata summary around in a "scan file", so it just needs
to compare this with the summary sent by the client and does _not_
have to go and stat() every file.
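
As a toy illustration of the comparison described above, the sketch below
diffs a client summary against a server-side scan file. It is not the actual
CVSync or CVSup protocol: the Summary type, the (size, mtime) pair used as
metadata, and the plan_transfer name are all assumptions made for the
example.

```python
from typing import Dict, List, Tuple

# Hypothetical summary: path -> (size in bytes, modification time).
Summary = Dict[str, Tuple[int, int]]


def plan_transfer(scan_file: Summary, client: Summary) -> Dict[str, List[str]]:
    """Decide what to send using only the two summaries, with no stat() calls."""
    # Files the client lacks, or holds with different metadata, need a diff.
    send = sorted(p for p, meta in scan_file.items() if client.get(p) != meta)
    # Files the client has but the server no longer lists should be removed.
    delete = sorted(p for p in client if p not in scan_file)
    return {"send": send, "delete": delete}
```

The work here depends only on the size of the two summaries, and only the
entries under "send" would cause any file data to cross the network.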

AnonCVS is a different story.  CVS remote checkout was not designed
as a mirroring tool and is horrendously inefficient in every respect.

> What methods guys are you using ?

Use CVSync or CVSup to update a local repository copy, which is
fast and bandwidth-efficient, and run local cvs update from that
local repository.

Alternatively, if you have no use for the repository, use CVSup (or
the CSup client) in checkout mode.

--
Christian "naddy" Weisgerber                          [hidden email]

Re: cvs update 'pull-only-changed' model ?

Landry Breuil-4
2007/5/28, Christian Weisgerber <[hidden email]>:

> Landry Breuil <[hidden email]> wrote:
>
> > I was wondering if some people already had the idea of setting some kind of
> > 'pull-only-changed' model for cvs update.
> > I know available methods (cvssync, anoncvs and cvsup) are full pull-model,
> > where we compare the full local cvs copy with the whole remote repository,
>
> No.  CVSync and CVSup only send a meta data summary to the server, the
> server compares this against its own database and only sends diffs of
> the changes back to the client.
>
> > which is :
> > - time and network consuming for the user
> > - bandwidth and load consuming for the server
>
> It really isn't.  And any well-configured CVSync or CVSup server
> keeps a meta data summary in a "scan file" around, so it just needs
> to compare this with the summary sent by the client and _not_ go
> and stat() every file.


Ok, i see now.

> AnonCVS is a different story.  CVS remote checkout was not designed
> as a mirroring tool and is horrendously inefficient in every respect.


Yes, I had the impression that using AnonCVS was terribly slow and
resource-consuming: updating the ports tree takes 5-10 minutes.
So it is way better to use CVSync or CVSup/csup.

I suppose CTM has been deprecated? It disappeared from the 'official
methods' three years ago; the page is still there, but snaps/diffs have not
been generated for two years... I haven't found an 'official support drop'.

> > What methods guys are you using ?
>
> Use CVSync or CVSup to update a local repository copy, which is
> fast and bandwidth-efficient, and run local cvs update from that
> local repository.
>
> Alternatively, if you have no use for the repository, use CVSup (or
> the CSup client) in checkout mode.


I'm regularly updating the ports I'm working on, which is why I was asking
whether it is possible to update _only_ the modified parts of the tree
instead of comparing the whole tree. Using AnonCVS was my mistake :)

No comments on my initial idea? I suppose I'll implement it 'for fun' and
may submit it here someday, if people are interested.

Thanks for the clarification,
Landry

Re: cvs update 'pull-only-changed' model ?

Christian Weisgerber
Landry Breuil <[hidden email]> wrote:

> I suppose CTM has been deprecated ?

It's in the cabinet next to the dodo.

--
Christian "naddy" Weisgerber                          [hidden email]