Re: A Python script to put CTAN into git (from DVDs)



On 06/11/11 18:51, Sivaram Neelakantan wrote:
On Sun, Nov 06 2011, Jonathan Fine wrote:

Hi

This is to let you know that I'm writing (in Python) a script that
places the content of CTAN into a git repository.
https://bitbucket.org/jfine/python-ctantools

I'm working from the TeX Collection DVDs that are published each year
by the TeX user groups, which contain a snapshot of CTAN (about
100,000 files occupying 4GB), which means I have to unzip folders and
do a few other things.


Unlike CVS, where you can only get HEAD (or a specific version),
this git repository....umm...won't it replicate the entire history
locally to the user's machine; all 4GB + deltas? In 2 years time,
say, won't it have all the different versions which might not be of
consequence for the general user populace?

I'm not objecting to your project but why git with its local storage
model for this?

I chose git because it handles large files well, and because it has an efficient transport mechanism. Its internal architecture is also convenient for my purposes.

Let's focus on macros, which perhaps deserve a repository of their own. Suppose a user, such as yourself, has 2011/ctan/macros in a git repository. Next year, 2012/ctan/macros might not have changed that much, and so downloading the difference won't take long. When 2015 comes around, the user could, if they wished, remove the 2011 macros and reclaim the space.
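
To make the "might not have changed that much" point concrete, here is a minimal sketch (my illustration only, not part of python-ctantools) that estimates how much genuinely new content one snapshot adds over another. It relies on the fact that git identifies a file's content by the SHA-1 of a short "blob" header plus the bytes, so files whose content is identical across years are stored only once. The function names are hypothetical, and the result is only a rough upper bound: git also compresses objects and packs similar files as deltas, so a real transfer would normally be smaller.

```python
import hashlib
import os


def git_blob_id(data: bytes) -> str:
    """Return the object id git assigns to a file with this content."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def snapshot_blobs(root: str) -> dict:
    """Map blob id -> size in bytes for every regular file under root."""
    blobs = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a plain file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as handle:
                data = handle.read()
            blobs[git_blob_id(data)] = len(data)
    return blobs


def new_bytes(old_root: str, new_root: str) -> int:
    """Uncompressed bytes in new_root whose content is absent from old_root."""
    old = snapshot_blobs(old_root)
    new = snapshot_blobs(new_root)
    return sum(size for blob_id, size in new.items() if blob_id not in old)
```

Assuming the two snapshots have been unpacked side by side, calling new_bytes("2011/ctan/macros", "2012/ctan/macros") would give the amount of content in the 2012 macros that the 2011 macros do not already contain.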

However, my first task is to create a single git archive that contains all the CTAN snapshots on DVD. I would not expect anyone to download that. Instead, you create it by feeding in the TeX Collection DVDs.

And from there we create smaller repositories, suitable for users and particular purposes. Or anything else you might want to do.

Do you have Linux and a TeX Collection DVD? If so, I could talk you through the process, which might make it all a bit clearer.

I hope this helps.

--
Jonathan