Specifying dependencies on Haskell code

Specifying dependencies on Haskell code

by Duncan Coutts

All,

In the initial discussions on a common architecture for building
applications and libraries one of the goals was to reduce or eliminate
untracked dependencies. The aim was that you could reliably deploy a
package from one machine to another.

We settled on a fairly traditional model, where one specifies the names
and versions of packages of Haskell code.

An obvious alternative model is embodied in ghc --make and in autoconf
style systems where you look in the environment not for packages but
rather for specific modules or functions.

Both models have passionate advocates. There are of course advantages
and disadvantages to each. Both models seem to get implemented as
reactions to having the other model inflicted on the author. For example
the current Cabal model of package names and versions was a reaction to
the perceived problem of untracked dependencies with the ghc --make
system. One could see implementations such as searchpath and franchise
as reactions in the opposite direction.

The advantages and disadvantages of specifying dependencies on module
names vs package names and versions are mostly inverses. Module name
clashes between packages are problematic with one system and not a
problem with the other. Moving modules between packages is not a problem
for one system and a massive pain for the other.

The fact is that both module name and package name + version are being
used as proxies to represent some vague combination of required Haskell
interface and implementation thereof. Sometimes people intend only to
specify an interface and sometimes people really want to specify
(partial) semantics (eg to require a version of something including some
bug fix / semantic change). In this situation the package version is
being used to specify an implementation as a proxy for semantics.

Neither is a very good way of identifying an interface or
implementation/semantics. Modules do move from one package to another
without fundamentally changing. Modules do change interface and
semantics without changing name. There is no guarantee about the
relationship between a package's version and its interface or semantics
though there are some conventions.

Another view would be to try and identify the requirements about
dependent code more accurately. For example to view modules as functors
and look at what interface they require of the modules they import. Then
we can say that they depend on any module that provides a superset of
that interface. It doesn't help with semantics of course. Dependencies
like these are not so compact and easy to write down.
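
A minimal sketch of the "superset of the required interface" idea, assuming an interface can be approximated by a list of (name, type signature) pairs compared as plain strings. The names `Interface` and `provides` and the example data are hypothetical and not part of Cabal or GHC; as noted above, a check like this says nothing about semantics.

```haskell
-- An interface is modelled as a list of (entity name, type signature) pairs.
-- Comparing signatures as strings is a deliberate simplification.
type Interface = [(String, String)]

-- | True when every entity the importer requires is offered, with the same
-- signature, by the candidate module.
provides :: Interface -> Interface -> Bool
provides offered needed = all (`elem` offered) needed

-- Hypothetical example: what some client code uses from a map module.
required :: Interface
required =
  [ ("empty",  "Map k a")
  , ("insert", "Ord k => k -> a -> Map k a -> Map k a")
  ]

-- candidateA offers a superset of the requirement; candidateB lacks insert.
candidateA, candidateB :: Interface
candidateA = required ++ [("size", "Map k a -> Int")]
candidateB = [("empty", "Map k a")]

main :: IO ()
main = do
  print (candidateA `provides` required)
  print (candidateB `provides` required)
```

Even this toy version hints at why such dependencies are not compact: the requirement is the whole list of entities used, rather than a single name and version.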

I don't have any point here exactly, except that there is no obvious
solution. I guess I'd like to provoke a bit of a discussion on this,
though hopefully not just rehashing known issues. In particular if
people have any ideas about how we could improve either model to address
their weak points then that'd be well worth discussing.

For example the package versioning policy attempts to tighten the
relationship between a package version and changes in its interface and
semantics. It still does not help at all with modules moving between
packages.

Duncan

_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: Specifying dependencies on Haskell code

by Thomas Schilling-2


On 20 apr 2008, at 22.22, Duncan Coutts wrote:

> [...]

[Replying so late as I only saw this today.]

I believe that using tight version constraints in conjunction with
the PVP is a good solution.  For now.

I don't quite know how Searchpath works (the website is rather  
taciturn), but I think that we should strive for a better  
approximation to real dependencies, specifically, name, interface,  
and semantics of imported functions.  As I see it, what's missing is  
proper tool support to do it practically for both library authors and  
users.

Library users really shouldn't need to do anything except to run a  
tool to determine all dependencies of a given package.  Library  
authors should be able to run a tool that determines what's new and  
what might have changed.  The package author then merely decides  
whether semantics was changed and if so, in what way (i.e.,  
compatible or not to previous semantics).  Packages will still carry  
versions, but they are only used to mark changes.  Semantic  
information is provided via a "change database" which contains enough  
information to determine whether a version of a package contains  
appropriate implementations of the functions (or, more generally,  
entities) used in a dependent package.

For example, suppose we write a program that uses the function 'Foo.foo'
contained in package 'foo', and we happen to have used 'foo-0.42' for
testing our program.  Given the knowledge that 'Foo.foo' was
introduced in 'foo-0.23' and changed semantics in 'foo-2.0', we
know that 'foo >= 0.23 && < 2.0' is the correct and complete
dependency description.
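
The 'Foo.foo' example can be sketched roughly as follows, assuming the "change database" records, per entity, the version in which it was introduced and the versions in which its semantics changed incompatibly. All types and names here (`Change`, `Range`, `compatibleRange`, `fooChanges`) are hypothetical; no such tool or database format exists in Cabal.

```haskell
import Data.List (intercalate)

-- Versions compare lexicographically, e.g. [0,23] < [0,42] < [2,0].
type Version = [Int]

-- | One entry of a hypothetical change database for a single entity.
data Change
  = Introduced Version        -- entity first appears in this version
  | SemanticsChanged Version  -- entity changes incompatibly in this version
  deriving (Eq, Show)

-- | Half-open range [lower, upper); Nothing means no known upper bound.
data Range = Range Version (Maybe Version)
  deriving (Eq, Show)

-- | Widest range around the tested version in which the entity exists and
-- has not changed incompatibly. Nothing if the entity did not exist yet
-- in the tested version.
compatibleRange :: Version -> [Change] -> Maybe Range
compatibleRange tested changes
  | null starts = Nothing
  | otherwise   = Just (Range (maximum starts) upper)
  where
    points = [v | Introduced v <- changes] ++ [v | SemanticsChanged v <- changes]
    starts = [v | v <- points, v <= tested]
    later  = [v | SemanticsChanged v <- changes, v > tested]
    upper  = if null later then Nothing else Just (minimum later)

-- The history of 'Foo.foo' from the example above.
fooChanges :: [Change]
fooChanges = [Introduced [0, 23], SemanticsChanged [2, 0]]

renderVersion :: Version -> String
renderVersion = intercalate "." . map show

render :: String -> Range -> String
render pkg (Range lo hi) =
  pkg ++ " >= " ++ renderVersion lo
      ++ maybe "" (\v -> " && < " ++ renderVersion v) hi

main :: IO ()
main = putStrLn (maybe "entity not available" (render "foo")
                       (compatibleRange [0, 42] fooChanges))
-- prints: foo >= 0.23 && < 2.0
```

A real tool would intersect such ranges over every entity a program imports from the package.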

That's the ideal, maybe we can work towards this?
Or does this sound crazy?

/ Thomas
--
"Today a young man on acid realized that all matter is merely energy  
condensed to a slow vibration, that we are all one consciousness  
experiencing itself subjectively, there is no such thing as death,  
life is only a dream, and we are the imagination of ourselves." --
Bill Hicks

Re: Specifying dependencies on Haskell code

by Duncan Coutts


On Fri, 2008-05-02 at 00:28 +0200, Thomas Schilling wrote:
> On 20 apr 2008, at 22.22, Duncan Coutts wrote:

> [Replying so late as I only saw this today.]
>
> I believe that using tight version constraints in conjunction with  
> the PVP to be a good solution.  For now.

I think I tend to agree.

> I don't quite know how Searchpath works (the website is rather  
> taciturn), but I think that we should strive for a better  
> approximation to real dependencies, specifically, name, interface,  
> and semantics of imported functions.  As I see it, what's missing is  
> proper tool support to do it practically for both library authors and  
> users.

Yes, we can make package name and version a better approximation of the
package interface with tools to enforce the versioning policy.

> Library users really shouldn't need to do anything except to run a  
> tool to determine all dependencies of a given package.  Library  
> authors should be able to run a tool that determines what's new and  
> what might have changed.  The package author then merely decides  
> whether semantics was changed and if so, in what way (i.e.,  
> compatible or not to previous semantics).  Packages will still carry  
> versions, but they are only used to mark changes.  Semantic  
> information is provided via a "change database" which contains enough  
> information to determine whether a version of a package contains  
> appropriate implementations of the functions (or, more generally,  
> entities) used in a dependent package.
>
> For example, if we write a program that uses the function 'Foo.foo'  
> contained in package 'foo' and we happen to have used 'foo-0.42' for  
> testing of our program.  Then, given the knowledge that 'Foo.foo' was  
> introduced in 'foo-0.23' and changed semantics in 'foo-2.0' then we  
> know that 'foo >= 0.23 && < 2.0' is the correct and complete  
> dependency description.
>
> That's the ideal, maybe we can work towards this?
> Or does this sound crazy?

I think extracting package APIs and comparing them across versions is an
excellent thing to do. It'd help users see what has changed and it'd let
us enforce the versioning policy (at least for interface changes, not
for semantic changes).

Having a central collection of those interfaces and using that to work
out which versions of which packages would be compatible with the
program I just wrote is quite an interesting idea.

It's related to what I was saying about identifying code by its full
interface, as a functor, but then using that to map back to packages
that provide the interface.

Something like that might go some way to addressing David Roundy's quite
legitimate criticism that the system of specifying deps on package names
and versions requires one to know the full development history of that
code, eg to track it across package renames.

However it would only help for the development _history_, we still have
no solution for the problem of packages being renamed (or modules moving
between packages) breaking other existing packages. Though similarly we
have no solution to the problem of modules being renamed. Perhaps it's
just that we have not done much module renaming recently so people don't
see it as an issue.

Duncan

Re: Specifying dependencies on Haskell code

by apfelmus

Duncan Coutts wrote:
> Thomas Schilling wrote:
>>
>> For example, if we write a program that uses the function 'Foo.foo'  
>> contained in package 'foo' and we happen to have used 'foo-0.42' for  
>> testing of our program.  Then, given the knowledge that 'Foo.foo' was  
>> introduced in 'foo-0.23' and changed semantics in 'foo-2.0' then we  
>> know that 'foo >= 0.23 && < 2.0' is the correct and complete  
>> dependency description.

I would go even further and simply use "my program 'bar' compiles with
foo-0.42" as dependency description. In other words, whether the package
foo-0.23 can be used to supply this dependency or not will be determined
when somebody else tries to compile 'bar' with it.

In both cases, the basic idea is that the library user should *not*
think about library versions, he just uses the one that is in scope on
his system. Figuring out which other versions can be substituted is the
job of the library author. In other words, the burden of proof is
shifted from the user ("will my program compile with foo-1.1?") to the
author ("which versions of my library are compatible?"), where it belongs.

> However it would only help for the development _history_, we still have
> no solution for the problem of packages being renamed (or modules moving
> between packages) breaking other existing packages. Though similarly we
> have no solution to the problem of modules being renamed. Perhaps it's
> just that we have not done much module renaming recently so people don't
> see it as an issue.

With the approach above, it's possible to handle package/module
renaming. For instance, if the package 'foo' is split into 'f-0.1' and
'oo-0.1' at some point, we can still use the union of these two to
fulfill the old dependency 'foo-0.42'.

In other words, the basic model is that a module/package like 'bar' with
a dependency like 'foo-0.42' is just a function that maps a value of the
same type (= export list) as 'foo-0.42' to another value (namely the set
of exports of 'bar'). So, we can compile for instance

   bar (foo-0.42)

or

   bar (f-0.1 `union` oo-0.1)

Of course, the problems are

  1) specifying the types of the parameters,
  2) automatically choosing good parameters.

For 1), one could use a very detailed import list, but I think that this
feels wrong. I mean, if I have to specify the imports myself, why did I
import foo-0.42 in the first place? Put differently, when I say 'import
Data.Map' I want to import both its implementation and the interface.
So, I argue that the goal is to allow type specifications of the form
'same type as foo-0.42'.

Problem 2) exists because if I have foo-0.5 in scope on my system and
a package lists foo-0.42 as a dependency, the compiler should somehow
figure out that it can use foo-0.5 as argument. Of course, it will be
tricky/impossible to figure out that  f-0.1 `union` oo-0.1  is a valid
argument, too.
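
As a rough illustration of the "package as a function" view, here is a sketch in which an export list is just a list of qualified names and a package either builds against what it is given or fails. The entity names and the assumption that a split keeps the original module names are made up for the example.

```haskell
import Data.List (union)

-- An export list, reduced to a list of qualified entity names.
type Exports = [String]

-- | A package maps the exports it is given to the exports it produces,
-- or fails (Nothing) when something it needs is missing.
type Package = Exports -> Maybe Exports

-- Hypothetical package 'bar', which needs two entities from 'foo'.
bar :: Package
bar given
  | all (`elem` given) needs = Just ["Bar.run"]
  | otherwise                = Nothing
  where
    needs = ["Foo.f", "Foo.oo"]

-- Hypothetical export lists: the original package and the two halves
-- it was later split into.
foo042, f01, oo01 :: Exports
foo042 = ["Foo.f", "Foo.oo"]
f01    = ["Foo.f"]
oo01   = ["Foo.oo"]

main :: IO ()
main = do
  print (bar foo042)              -- the original dependency
  print (bar (f01 `union` oo01))  -- the union of the split packages
  print (bar f01)                 -- one half alone is not enough
```

The sketch sidesteps both problems listed above: the parameter type is written out by hand, and the argument is chosen by hand.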


So, the task would be to develop a formalism, i.e. some kind of "lambda
calculus for modules" that can handle problems 1) and 2). The formalism
should be simple to understand and use yet powerful, just like our
beloved lambda calculus.


A potential pitfall to any solution is that name and version number
don't identify a compiled package uniquely! For instance,

   foo-0.3 (bytestring-1.1)

is very different from

   foo-0.3 (bytestring-1.2)

if foo exports the ByteString type. That's the diamond import problem.
In other words,  foo-0.3  is always the same function, but the evaluated
results are not.
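
The pitfall can be made concrete with a small sketch in which a compiled package instance records what it was built against. The `Instance` type and the package values are hypothetical; this is not how any existing package database is laid out.

```haskell
-- | A compiled package instance: name, version, and the instances of the
-- packages it was built against.
data Instance = Instance
  { pkgName      :: String
  , pkgVersion   :: [Int]
  , builtAgainst :: [Instance]
  } deriving (Eq, Show)

bytestring11, bytestring12 :: Instance
bytestring11 = Instance "bytestring" [1, 1] []
bytestring12 = Instance "bytestring" [1, 2] []

-- The same source package, compiled against two different bytestrings.
fooA, fooB :: Instance
fooA = Instance "foo" [0, 3] [bytestring11]
fooB = Instance "foo" [0, 3] [bytestring12]

-- | Identity as seen by a "name and version" dependency description.
sameNameVersion :: Instance -> Instance -> Bool
sameNameVersion a b = pkgName a == pkgName b && pkgVersion a == pkgVersion b

main :: IO ()
main = do
  print (sameNameVersion fooA fooB)  -- True: both are foo-0.3
  print (fooA == fooB)               -- False: different bytestring underneath
```

Any scheme that identifies compiled packages would have to compare the full structure, as the derived `Eq` does here, and not only the name and version.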



Regards,
apfelmus

Re: Specifying dependencies on Haskell code

by David Roundy-2

On Sun, Apr 20, 2008 at 09:22:56PM +0100, Duncan Coutts wrote:
> In the initial discussions on a common architecture for building
> applications and libraries one of the goals was to reduce or eliminate
> untracked dependencies. The aim being that you could reliably deploy a
> package from one machine to another.
>
> We settled on a fairly traditional model, where one specifies the names
> and versions of packages of Haskell code.

Do you actually have any precedent for such a system? I've never heard of
one, but then I've been sort of sheltered, due to living in the linux world
where there is a distinction between packagers and upstream authors.  I
consider this a useful distinction.  But that's probably because I'm lazy,
or perhaps because I care about my users--and thus like to give them
options and reduce the dependencies of my software.

I know there is a long history of the autoconf-style approach being
successful.  Can you point to any success stories of the approach chosen
for cabal?

David

Re: Specifying dependencies on Haskell code

by Thomas Schilling-2


On 2 May 2008, at 11.27, apfelmus wrote:

> Duncan Coutts wrote:
>> Thomas Schilling wrote:
>>>
>>> For example, if we write a program that uses the function  
>>> 'Foo.foo'  contained in package 'foo' and we happen to have used  
>>> 'foo-0.42' for  testing of our program.  Then, given the  
>>> knowledge that 'Foo.foo' was  introduced in 'foo-0.23' and  
>>> changed semantics in 'foo-2.0' then we  know that 'foo >= 0.23 &&  
>>> < 2.0' is the correct and complete  dependency description.
>
> I would go even further and simply use "my program 'bar' compiles  
> with foo-0.42" as dependency description. In other words, whether  
> the package foo-0.23 can be used to supply this dependency or not  
> will be determined when somebody else tries to compile Bar with it.
>
> In both cases, the basic idea is that the library user should *not*  
> think about library versions, he just uses the one that is in scope  
> on his system. Figuring out which other versions can be substituted  
> is the job of the library author. In other words, the burden of  
> proof is shifted from the user ("will my program compile with  
> foo-1.1?") to the author ("which versions of my library are  
> compatible?"), where it belongs.
I think we mean the same thing.  If I write a program and test it  
against a specific version of a library then my program's source code  
and knowledge about which specific versions of libraries I used, most  
of the time, contains *all* the information necessary to determine  
which other library versions it can be built with.

From the source code we need information about what is imported;
from the library author we need a *formal* changelog.  This changelog
describes, for each released version, which parts of the interface and
semantics have changed.

The problem here is, of course, that this is a lot of information to  
provide.  Furthermore, I think we need information about imports from
the library user; if we ignore this, then the PVP is *exactly* what
we need. The PVP describes when things *could* break, but it does so  
in an extremely pessimistic way.  If we have information about what  
exactly changed and what is used by a particular library, we can find  
out what the exact version range is.  For example, if we build our  
package against foo-0.42 and bar-2.3 and both packages follow the PVP  
then the following will trivially be true:

   build-depends: foo-0.42.*, bar-2.3.*

where "-X.Y.*" is a shortcut for ">= X.Y && < X.(Y+1)".  The problem  
is that this is extremely pessimistic, so we have to manually check  
whenever a new version of a dependency comes out and update the  
"known-to-work-with"-range.  With more information (obtained mostly  
by tools) we can automate this process, and, in fact, both approaches  
can co-exist.
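
The "-X.Y.*" shortcut can be spelled out in a few lines. This is only a sketch of the expansion described above, with versions as lists of integers; `wildcardRange` and `withinRange` are hypothetical helpers, not Cabal functions.

```haskell
type Version = [Int]

-- | Expand a wildcard prefix such as 0.42.* into the half-open range
-- [0.42, 0.43). An empty prefix has no sensible expansion.
wildcardRange :: Version -> Maybe (Version, Version)
wildcardRange []     = Nothing
wildcardRange prefix = Just (prefix, init prefix ++ [last prefix + 1])

-- | Membership in a half-open range, using lexicographic comparison.
withinRange :: (Version, Version) -> Version -> Bool
withinRange (lo, hi) v = lo <= v && v < hi

main :: IO ()
main = do
  print (wildcardRange [0, 42])                    -- Just ([0,42],[0,43])
  print (withinRange ([0, 42], [0, 43]) [0, 42, 1])  -- True
  print (withinRange ([0, 42], [0, 43]) [0, 43])     -- False
```

The pessimism is visible here: 0.43 is rejected even if nothing the program uses changed in that release.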

>
>> However it would only help for the development _history_, we still  
>> have
>> no solution for the problem of packages being renamed (or modules  
>> moving
>> between packages) breaking other existing packages. Though  
>> similarly we
>> have no solution to the problem of modules being renamed. Perhaps  
>> it's
>> just that we have not done much module renaming recently so people  
>> don't
>> see it as an issue.
>
> With the approach above, it's possible to handle package/module  
> renaming. For instance, if the package 'foo' is split into 'f-0.1'  
> and 'oo-0.1' at some point, we can still use the union of these two  
> to fulfill the old dependency 'foo-0.42'.
This is much the same as using a "virtual package" that is
simply a re-export of other packages.  This would help a lot with our  
current problems with the base split (which will continue, as base  
will be split up even further).

>
> In other words, the basic model is that a module/package like 'bar'  
> with a dependency like 'foo-0.42' as just a function that maps a  
> value of the same type (= export list) as 'foo-0.42' to another  
> value (namely the set of exports of 'bar'). So, we can compile for  
> instance
>
>   bar (foo-0.42)
>
> or
>
>   bar (f-0.1 `union` oo-0.1)
>
> Of course, the problems are
>
>  1) specifying the types of the parameters,
>  2) automatically choosing good parameters.
>
> For 1), one could use a very detailed import list, but I think that  
> this feels wrong. I mean, if I have to specify the imports myself,  
> why did I import foo-0.42 in the first place? Put differently, when  
> I say 'import Data.Map' I want to import both its implementation  
> and the interface. So, I argue that the goal is to allow type  
> specifications of the form 'same type as foo-0.42'.
>
> Problem 2) exists because if I have foo-0.5 on in scope on my  
> system and a package lists foo-0.42 as a dependency, the compiler  
> should somehow figure out that he can use foo-0.5 as argument. Of  
> course, it will be tricky/impossible to figure out that  f-0.1  
> `union` oo-0.1  is a valid argument, too.
>
>
> So, the task would be to develop a formalism, i.e. some kind of  
> "lambda calculus for modules" that can handle problems 1) and 2).  
> The formalism should be simple to understand and use yet powerful,  
> just like our beloved lambda calculus.
>
>
> A potential pitfall to any solution is that name and version number  
> don't identify a compiled package uniquely! For instance,
>
>   foo-0.3 (bytestring-1.1)
>
> is very different from
>
>   foo-0.3 (bytestring-1.2)
>
> if foo exports the ByteString type. That's the diamond import  
> problem. In other words,  foo-0.3  is always the same function, but  
> the evaluated results are not.
I think a formal changelog can also help with renaming (even of  
exported entities), but, I agree, for all this to work we need to  
formalise it first, and then build tools to automate most of the work.


/ Thomas
--
My shadow / Change is coming. / Now is my time. / Listen to my muscle  
memory. / Contemplate what I've been clinging to. / Forty-six and two  
ahead of me.

Re: Specifying dependencies on Haskell code

by Ian Lynagh

On Fri, May 02, 2008 at 09:55:32AM -0700, David Roundy wrote:

> On Sun, Apr 20, 2008 at 09:22:56PM +0100, Duncan Coutts wrote:
> >
> > We settled on a fairly traditional model, where one specifies the names
> > and versions of packages of Haskell code.
>
> Do you actually have any precedent for such a system?
>
> I know there is a long history of the autoconf-style approach being
> successful.  Can you point to any success stories of the approach chosen
> for cabal?

LaTeX does things like
    \RequirePackage{longtable}[1995/01/01]

According to http://peak.telecommunity.com/DevCenter/PythonEggs, with
python eggs you do things like
    from pkg_resources import require
    require("FooBar>=1.2")

According to http://blogs.cocoondev.org/crafterm/archives/004653.html,
with Ruby gems you do things like
    s.add_dependency("dependency", ">= 0.x.x")

(URLs found by googling for "how to make a <foo>")

Those were just the first 3 things I thought of.
I don't know what you would consider a success, though.


Thanks
Ian

Re: Specifying dependencies on Haskell code

by David Roundy-2

On Sat, May 03, 2008 at 02:51:33PM +0100, Ian Lynagh wrote:

> On Fri, May 02, 2008 at 09:55:32AM -0700, David Roundy wrote:
> > On Sun, Apr 20, 2008 at 09:22:56PM +0100, Duncan Coutts wrote:
> > >
> > > We settled on a fairly traditional model, where one specifies the names
> > > and versions of packages of Haskell code.
> >
> > Do you actually have any precedent for such a system?
> >
> > I know there is a long history of the autoconf-style approach being
> > successful.  Can you point to any success stories of the approach chosen
> > for cabal?
>
> LaTeX does things like
>     \RequirePackage{longtable}[1995/01/01]

I wouldn't call LaTeX a build system, although it's certainly a wonderful
typesetting system.

> According to http://peak.telecommunity.com/DevCenter/PythonEggs, with
> python eggs you do things like
>     from pkg_resources import require
>     require("FooBar>=1.2")

From what I can tell, python eggs aren't a build system either, but rather
a binary package format.

> According to http://blogs.cocoondev.org/crafterm/archives/004653.html,
> with Ruby gems you do things like
>     s.add_dependency("dependency", ">= 0.x.x")

It seems that a ruby gem is also a binary package.

> (URLs found by googling for "how to make a <foo>")
>
> Those were just the first 3 things I thought of.
> I don't know what you would consider a success, though.

I'd definitely call LaTeX a success, have no idea about gems or eggs (which
I'd never heard of before this email), but none of these are build systems,
so far as I can tell.
--
David Roundy
Department of Physics
Oregon State University

Re: Specifying dependencies on Haskell code

by Ian Lynagh

On Sat, May 03, 2008 at 11:30:44AM -0700, David Roundy wrote:
> On Sat, May 03, 2008 at 02:51:33PM +0100, Ian Lynagh wrote:
> >
> > According to http://peak.telecommunity.com/DevCenter/PythonEggs, with
> > python eggs you do things like
> >     from pkg_resources import require
> >     require("FooBar>=1.2")
>
> From what I can tell, python eggs aren't a build system either, but rather
> a binary package format.

To install a trac plugin you download a tarball and do something like
    python setup.py bdist_egg
to create the .egg file, which you can then put in the appropriate
place. I think in general you can also do
    python setup.py install
to have it installed as a python library.

I know virtually nothing about eggs, and even less about gems, but I am
under the impression that they aim to solve the same problem as Cabal.


Thanks
Ian

Re: Specifying dependencies on Haskell code

by David Roundy-2

On Sun, May 04, 2008 at 05:20:54PM +0100, Ian Lynagh wrote:

> On Sat, May 03, 2008 at 11:30:44AM -0700, David Roundy wrote:
> > On Sat, May 03, 2008 at 02:51:33PM +0100, Ian Lynagh wrote:
> > >
> > > According to http://peak.telecommunity.com/DevCenter/PythonEggs, with
> > > python eggs you do things like
> > >     from pkg_resources import require
> > >     require("FooBar>=1.2")
> >
> > From what I can tell, python eggs aren't a build system either, but
> > rather a binary package format.
>
> To install a trac plugin you download a tarball and do something like
> python setup.py bdist_egg to create the .egg file, which you can then put
> in the appropriate place. I think in general you can also do python
> setup.py install to have it installed as a python library.
>
> I know virtually nothing about eggs, and even less about gems, but I am
> under the impression that they aim to solve the same problem as Cabal.

Maybe the problem is that no one seems to know what problem cabal is
supposed to be solving.  What problem is that? Some say it's a
configuration/build system.  Others say it's a packaging system.  I think
it's the latter.

David

Re: Specifying dependencies on Haskell code

by Duncan Coutts


On Fri, 2008-05-02 at 09:55 -0700, David Roundy wrote:

> On Sun, Apr 20, 2008 at 09:22:56PM +0100, Duncan Coutts wrote:
> > In the initial discussions on a common architecture for building
> > applications and libraries one of the goals was to reduce or eliminate
> > untracked dependencies. The aim being that you could reliably deploy a
> > package from one machine to another.
> >
> > We settled on a fairly traditional model, where one specifies the names
> > and versions of packages of Haskell code.
>
> Do you actually have any precedent for such a system?

I would count all the distro packaging systems as precedent. There are a
few others but those are the most significant.

> I've never heard of one, but then I've been sort of sheltered, due to
> living in the linux world where there is a distinction between
> packagers and upstream authors. I consider this a useful distinction.

I agree it is a useful distinction. I was a packager for gentoo for
three years. The jobs have roughly the same goal -- to deliver great
software to users -- but there is certainly a different focus.

> But that's probably because I'm lazy, or perhaps because I care about
> my users--and thus like to give them options and reduce the
> dependencies of my software.

We are actually very lucky to have people doing the packaging job for
us. It takes time and because of that only the most important bits of
software get packaged.

If we could significantly reduce the amount of time that packaging people
have to spend on each package then we could increase the number of
packages that could benefit.

So that's what Cabal's model of specifying dependencies is for, to
provide enough information to enable package management. Without that
information provided up front the packaging people have to spend much
more time manually discovering the dependencies by reading through
README and configure.ac files.

With Cabal packages we have the possibility of generating distro
packages automatically. Several distros have tools to do this automatic
translation. This is something that is essentially impossible with
autoconf. When we started using our translation tool in Gentoo we were
able to increase the number of packages we provided by an order of
magnitude.
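
As a rough illustration of why declared dependencies make this kind of
translation mechanical, here is a minimal Haskell sketch. It is not how
any real distro tool works: the Dependency and Constraint types, the
"haskell-" naming scheme and the output format are all invented for
this example, and real Cabal version ranges are richer than this.

```haskell
import Data.Char (toLower)
import Data.List (intercalate)

-- | A version as a list of numeric components, e.g. [1,2] for 1.2.
type Version = [Int]

-- | Hypothetical, simplified stand-in for a declared version constraint.
data Constraint
  = AnyVersion
  | AtLeast Version   -- ^ version >= the given one
  | Below Version     -- ^ version <  the given one
  deriving (Eq, Show)

-- | Hypothetical stand-in for one declared dependency of a package.
data Dependency = Dependency
  { depName        :: String
  , depConstraints :: [Constraint]
  } deriving (Eq, Show)

-- | Render [1,2,3] as "1.2.3".
showVersion :: Version -> String
showVersion = intercalate "." . map show

-- | Invented distro-side naming scheme: lower-case with a fixed prefix.
distroName :: String -> String
distroName name = "haskell-" ++ map toLower name

-- | One output entry per constraint, or a bare name if unconstrained.
renderDependency :: Dependency -> [String]
renderDependency (Dependency name cs) =
  case concatMap render cs of
    []      -> [distroName name]
    entries -> entries
  where
    render AnyVersion  = []
    render (AtLeast v) = [distroName name ++ " (>= " ++ showVersion v ++ ")"]
    render (Below v)   = [distroName name ++ " (< " ++ showVersion v ++ ")"]

-- | Render a whole dependency list as one comma-separated line.
renderDependencies :: [Dependency] -> String
renderDependencies = intercalate ", " . concatMap renderDependency
```

For example, a package that declares base >= 3 and < 4 plus an
unconstrained QuickCheck would be rendered by renderDependencies as
"haskell-base (>= 3), haskell-base (< 4), haskell-quickcheck". The
point is only that the input is structured data, so no one has to read
a README to find it.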

Of course we do not expect every little Haskell package to appear in
every distro but the information provided by packages makes it possible
to provide package management (in the form of cabal-install) even for
the packages that do not meet the popularity or QA standards for the
distros.

> I know there is a long history of the autoconf-style approach being
> successful.  Can you point to any success stories of the approach chosen
> for cabal?

Again I'd point to all the package management systems. If you want
examples of build systems that provide enough information for package
management then admittedly there are fewer. Ian already pointed out
Python eggs and Ruby Gems. I think CPAN also has some method for
tracking dependencies, though I don't know whether or how CPAN modules
specify them.

Duncan

_______________________________________________
Libraries mailing list
Libraries@...
http://www.haskell.org/mailman/listinfo/libraries

Re: Specifying dependencies on Haskell code

by Duncan Coutts


On Mon, 2008-05-05 at 03:50 -0700, David Roundy wrote:

> Maybe the problem is that no one seems to know what problem cabal is
> supposed to be solving.  What problem is that? Some say it's a
> configuration/build system.  Others say it's a packaging system.  I think
> it's the latter.

I'd say that Cabal is a build system but one that provides enough
information to enable package management. That's the reason for the
slight blurring/confusion with packaging systems. There is a much
clearer division with autoconf/automake because it is a build system
that does not provide enough information to enable package management.

Cabal interfaces with package management systems in a similar way
to ./configure && make && make install as one can see from the scripts
that the distros use to build packages from source.

Tools like cabal-rpm, hackport and dh_haskell use the information
provided by Cabal packages to make distro packages semi-automatically
(this does not eliminate the QA job).

cabal-install is a package manager for those Cabal packages that are not
already packaged by the distros. It seems likely that there will always
be a significant number of such packages as there is with CPAN etc.

Hackage is an archive and distribution point for Cabal packages.

Duncan


Re: Specifying dependencies on Haskell code

by Roman Leshchinskiy

Duncan Coutts wrote:
>
> In the initial discussions on a common architecture for building
> applications and libraries one of the goals was to reduce or eliminate
> untracked dependencies. The aim being that you could reliably deploy a
> package from one machine to another.

Sorry for jumping in so late but here are my two cents anyway.

IMO, a package is absolutely the wrong thing to depend on. Essentially,
a package is an implementation of an interface and depending on
implementations is a bad thing. Code should only depend on interfaces
which are completely independent entities. I suspect that a lot of the
problems with packages occur because the current system doesn't follow
this simple principle.

It would be nice if Cabal had an explicit concept of interfaces, with
the idea that code depends on them and packages implement them. In the
simplest case, an interface is just a name. Ideally, it would be a
combination of type signatures, QuickCheck properties, proof obligations
etc. The important thing is that it has an explicit definition which is
completely independent of any concrete implementation and which never
changes.

Something like this would immediately solve a lot of problems. Several
packages could implement the same interface and we could pick which one
we want when building stuff. We could have much more fine-grained
dependencies (if all I need is an AVL tree, I don't want to depend on
the entire containers package, but rather just on the AVL part of it).
One package could implement several versions of an interface to ensure
compatibility with old code (I could imagine module names like
AVL_1.Data.AVLTree, AVL_2.Data.AVLTree etc., where AVL_1 and AVL_2 are
interface names; Cabal could then map the right module to Data.AVLTree
when building). If interface definitions include something like
QuickCheck properties, we would have at least some assurance that a
package actually does implement its interfaces. Moreover, this would
also make the properties themselves reusable.
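
To make the idea concrete, here is a minimal Haskell sketch of an
interface that exists independently of its implementations. Everything
in it is hypothetical: IntSetInterface, the two implementations and the
conforms check are invented for illustration, and the hand-picked
sample inputs are only a stand-in for real QuickCheck generators. It
says nothing about how Cabal would name, store or resolve interfaces.

```haskell
-- | Hypothetical interface: what any "set of Int" must provide.
-- It mentions no concrete implementation.
data IntSetInterface s = IntSetInterface
  { emptySet  :: s
  , insertElt :: Int -> s -> s
  , memberElt :: Int -> s -> Bool
  }

-- | Build a set from a list using only the interface operations.
fromList :: IntSetInterface s -> [Int] -> s
fromList i = foldr (insertElt i) (emptySet i)

-- | Property: an element is a member right after it is inserted.
propInsertedIsMember :: IntSetInterface s -> Int -> [Int] -> Bool
propInsertedIsMember i x xs = memberElt i x (insertElt i x (fromList i xs))

-- | Property: the empty set has no members.
propEmptyHasNoMembers :: IntSetInterface s -> Int -> Bool
propEmptyHasNoMembers i x = not (memberElt i x (emptySet i))

-- | Implementation 1: an unsorted list, duplicates allowed.
listImpl :: IntSetInterface [Int]
listImpl = IntSetInterface [] (:) elem

-- | Implementation 2: a characteristic function.
funImpl :: IntSetInterface (Int -> Bool)
funImpl = IntSetInterface
  (const False)
  (\x s y -> y == x || s y)
  (\x s -> s x)

-- | Check both properties on a few fixed samples (not random testing).
conforms :: IntSetInterface s -> Bool
conforms i =
  and [ propInsertedIsMember i x xs && propEmptyHasNoMembers i x
      | x  <- [-1, 0, 1, 42]
      , xs <- [[], [0], [1, 2, 3], [42, 42]]
      ]
```

Both listImpl and funImpl pass conforms, while an implementation whose
insert silently drops the element would not. The properties are written
once against the interface and reused for every implementation.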

Note that I don't propose that we automatically extract interfaces from
code. In fact, I think that would be precisely the wrong way to go. An
interface is not a by-product of implementing a package. It should be
defined explicitly.

In general, I don't think that existing package management systems do a
very good job of specifying dependencies. They sort of work for
distributing software but do they really work for versioning libraries?
In any case, we ought to have something better for Haskell where we
(hopefully) have somewhat different standards when it comes to
correctness and ease of use. It might be more worthwhile to look at
systems such as CORBA, Microsoft's OLE (or whatever that's called
nowadays), Java's equivalent (whatever that is) and, of course, ML's
modules. None of these is quite right for what we want but IMO they are
much closer to our problem domain than something like RPM.

Roman




Re: Specifying dependencies on Haskell code

by Simon Marlow

David Roundy wrote:

> On Sun, May 04, 2008 at 05:20:54PM +0100, Ian Lynagh wrote:
>> On Sat, May 03, 2008 at 11:30:44AM -0700, David Roundy wrote:
>>> On Sat, May 03, 2008 at 02:51:33PM +0100, Ian Lynagh wrote:
>>>> According to http://peak.telecommunity.com/DevCenter/PythonEggs, with
>>>> python eggs you do things like
>>>>     from pkg_resources import require
>>>>     require("FooBar>=1.2")
>>>> From what I can tell, python eggs aren't a build system either, but
>>>> rather a binary package format.
>> To install a trac plugin you download a tarball and do something like
>> python setup.py bdist_egg to create the .egg file, which you can then put
>> in the appropriate place. I think in general you can also do python
>> setup.py install to have it installed as a python library.
>>
>> I know virtually nothing about eggs, and even less about gems, but I am
>> under the impression that they aim to solve the same problem as Cabal.
>
> Maybe the problem is that no one seems to know what problem cabal is
> supposed to be solving.  What problem is that? Some say it's a
> configuration/build system.  Others say it's a packaging system.  I think
> it's the latter.

Does it matter?  It's fine for a system to not fit entirely into one of the
predefined boxes that you know about (e.g. is ZFS a file system or a volume
manager?).  Cabal solves a specific problem, which is:

   it allows a package to be built from source, and installed, on
   a system with only a Haskell compiler (and Cabal).

The last part is important for people on Windows who don't want to install
Cygwin or MSYS just to build Haskell packages.

Now, we discovered that by adding bits here and there we could solve other
problems too: e.g. Cabal also builds programs.  But the above statement was
originally the main reason for Cabal's existence.

Cheers,
        Simon

Re: Specifying dependencies on Haskell code

by David Roundy