[Grok-dev] the installation problems saga, part 2

Martijn Faassen faassen at startifact.com
Wed Apr 23 19:57:32 EDT 2008


Hi there,

Last year we had a bad set of installation problems with Grok: you'd get 
an arbitrary set of versions, which would sometimes break. We didn't 
control what versions you used. We fixed that by pinning versions.

Our installation problems are not over yet, however. We have two kinds 
of problems:

Problem 1: relying on too many servers for installation

Some packages we use don't actually upload their tarballs to PyPI. 
Instead, they upload their tarballs somewhere else, and then point the 
PyPI index page to their homepage. Thanks to the magic of setuptools, 
the installer goes off to the homepage to find the download URLs, and 
downloads the tarball from there.

mechanize is an example. See its index page here:

http://pypi.python.org/pypi/mechanize

There are no tarballs to download there, just links to sourceforge. The 
'simple' page is actually the most instructive to see what setuptools 
looks at:

http://pypi.python.org/simple/mechanize

What actually happens is that setuptools doesn't appear to use the 
download URL in this page, but instead goes off to the home page of 
mechanize, parses it and then downloads the zip file.
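
To make the link-following a bit more concrete, here is a rough Python 
sketch of the kind of scan involved. As far as I understand it, 
setuptools treats links marked rel="homepage" or rel="download" on the 
'simple' page as candidates to visit. The sample HTML, the example.org 
URLs and the names RelLinkCollector and external_links are all made up 
for illustration; this is not setuptools' own code.

```python
from html.parser import HTMLParser


class RelLinkCollector(HTMLParser):
    """Collect hrefs of <a> tags whose rel is 'homepage' or 'download'."""

    def __init__(self):
        super().__init__()
        self.external = []  # list of (rel, href) pairs, in page order

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower()
        href = attrs.get("href")
        if href and rel in ("homepage", "download"):
            self.external.append((rel, href))


def external_links(simple_page_html):
    """Return the off-index links an installer might go and visit."""
    collector = RelLinkCollector()
    collector.feed(simple_page_html)
    return collector.external


# Hypothetical page content, loosely modelled on a 'simple' index page.
SAMPLE_PAGE = """
<html><body>
<a href="http://example.org/project/" rel="homepage">home_page</a>
<a href="http://example.org/project/src/" rel="download">download_url</a>
<a href="project-1.0.tar.gz">project-1.0.tar.gz</a>
</body></html>
"""

links = external_links(SAMPLE_PAGE)
for rel, href in links:
    print(rel, href)
```

Every link this turns up is one more server that has to be reachable 
at install time.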

Unfortunately sometimes this other website is down. It might be, say, 
sourceforge. Sourceforge is not always very well-behaved. I've also had 
problems installing psycopg recently, because the initd.org website it 
is hosted on seems to be rather flaky.

This means that people's installation procedure will sometimes break in 
the middle. That sucks. I'd rather rely on PyPI than on a lot of 
different websites that can fail.

How to fix this one? Somehow fix PyPI so that it sucks in all packages? 
That's one alternative.

We could also modify KGS, Zope 3's package indexing system, to suck in 
all packages. Then we'd run our own Grok KGS. Right now it doesn't do 
that but just mirrors PyPI, and thus has the same problems PyPI does. 
Drawback: we'd need to mirror *all* the packages on PyPI. This might 
take quite a lot of storage space and bandwidth, plus it needs to be 
maintained.

Yet another alternative would be to create a 'big tarball download' 
installer for Grok. It'd come with all the proper files already 
available. We'd need to make sure we also include the Windows 
binaries of those packages that need them. This would work for Grok, 
though could still lead to problems as soon as someone adds in some 
other package in their setup.py. It therefore won't work for all 
Grok-based *applications*, and ideally we should find something that 
works for both.

Ideas anyone?

Problem 2: versions for non-Grok dependencies

When someone develops an application, they pull in dependencies that are 
not listed in our versions.cfg, such as megrok.form. This in turn has 
other dependencies, such as zc.datetimewidget. I just now discovered 
that certain combinations of megrok.form and zc.datetimewidget actually 
result in ZCML conflict errors. Ick!

We need to solve the problem of version management for applications, not 
just for Grok. What I'd like to avoid is that everybody has to become 
their own version manager - developers would need to maintain a list of 
'versions' in their buildouts and just magically have to know which 
versions of dependencies work together. The package developer knows 
which versions work together, and the developer that uses the package 
shouldn't really have to worry about it unless there's a special case.

The easy fix would be for the package developer to pin down *all* the 
versions of *all* the packages (directly or indirectly) that the package 
depends on in setup.py's requirements. Except the ones that Grok already 
depends on, as that'd result in a conflict. This has two consequences, 
however:

* the package might break with new Grok releases which have newer 
dependencies.

* the application developer is absolutely locked into using those 
versions; there's no flexibility to upgrade a dependency to a higher 
version needed for some other package, etc.
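
The pinning approach described above could look roughly like this in a 
package's setup.py. This is only a sketch: the version numbers are 
invented (they are not known-good combinations), example.indirectdep is 
a made-up package name, and GROK_PINNED is a stand-in for whatever 
Grok's versions.cfg already pins.

```python
# Sketch: build install_requires with exact pins, skipping anything that
# Grok already pins. All version numbers below are hypothetical.

# Stand-in for the package names already pinned by Grok's versions.cfg.
GROK_PINNED = {"grok", "zope.interface"}

# Every direct and indirect dependency of this package, with the version
# the package developer has tested against (invented values).
TESTED_VERSIONS = {
    "zc.datetimewidget": "9.9.1",
    "example.indirectdep": "9.9.2",  # made-up indirect dependency
    "zope.interface": "9.9.3",       # left out below: Grok pins it already
}


def pinned_requirements(tested, already_pinned):
    """Return sorted 'name==version' strings, omitting already pinned names."""
    return [
        "%s==%s" % (name, version)
        for name, version in sorted(tested.items())
        if name not in already_pinned
    ]


# Grok itself is listed unpinned so that its own version list decides.
install_requires = ["grok"] + pinned_requirements(TESTED_VERSIONS, GROK_PINNED)

# In a real setup.py this list would be handed to
# setup(..., install_requires=install_requires).
print(install_requires)
```

Both drawbacks listed above are visible here: the '==' pins leave the 
application developer no room to move, and the skip list has to track 
whatever Grok pins in each release.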

I think pinning things down in setup.py is the only route we can really 
take properly now, but we need to think of a better solution. Maintain a 
package index with KGS for Grok *and* all possible Grok extensions? But 
that's potentially all Python packages in the world...

Do people have ideas on this one?

Solving Problem 1 + Problem 2 does sound like eventually we'll need to 
move into the "distribution management" business, similar to how a Linux 
distribution manages its packages. That's a big burden to take on, 
though. I still have hopes that someone has some great idea that we 
haven't thought of yet.

Perhaps this is a direction to explore: we could write a tool that for a 
given package downloads all the right versions of the dependencies (as 
we specify somewhere), and then packages them up in some form of special 
tarball that contains the package and all its dependencies. We'd then 
also have a tool which can find these big balls somewhere and install 
them into a place on the user's filesystem where buildout can find them 
as normal eggs. Unfortunately I can't think of a way to fit this into 
the whole setuptools/eggs system, and it'd be a bother to have to step 
outside it - it'd be nice if setuptools looked for these balls first 
when it saw a dependency in setup.py, and fetched the dependency from 
there.
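
As a very rough Python sketch of the two tools: one function packs a 
directory of already-downloaded distributions into a single tarball, 
and another unpacks such a tarball into a local directory. Pointing 
buildout's find-links option at that directory is one way it might then 
pick the files up, but that wiring is an assumption, not something 
worked out here. All function names and file names are hypothetical, 
and the download step itself is left out.

```python
import os
import tarfile
import tempfile


def make_bundle(dist_dir, bundle_path):
    """Pack every regular file in dist_dir into a gzipped tarball."""
    with tarfile.open(bundle_path, "w:gz") as bundle:
        for name in sorted(os.listdir(dist_dir)):
            path = os.path.join(dist_dir, name)
            if os.path.isfile(path):
                bundle.add(path, arcname=name)


def install_bundle(bundle_path, target_dir):
    """Unpack a bundle into target_dir and return the installed file names."""
    os.makedirs(target_dir, exist_ok=True)
    installed = []
    with tarfile.open(bundle_path, "r:gz") as bundle:
        for member in bundle.getmembers():
            # Only accept plain files with flat names; skip anything else.
            if not member.isfile():
                continue
            if os.path.basename(member.name) != member.name:
                continue
            source = bundle.extractfile(member)
            with open(os.path.join(target_dir, member.name), "wb") as out:
                out.write(source.read())
            installed.append(member.name)
    return sorted(installed)


# Demonstration with placeholder files standing in for real distributions.
work = tempfile.mkdtemp()
dists = os.path.join(work, "dists")
os.makedirs(dists)
for filename in ("example.pkg-1.0.tar.gz", "example.dep-2.0.tar.gz"):
    with open(os.path.join(dists, filename), "wb") as f:
        f.write(b"placeholder")

bundle_file = os.path.join(work, "bundle.tar.gz")
make_bundle(dists, bundle_file)
installed_files = install_bundle(bundle_file, os.path.join(work, "eggs-cache"))
print(installed_files)
```

This still steps outside setuptools' own dependency lookup, which is 
exactly the bother mentioned above.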

Regards,

Martijn


