stackless fixed-arity concatenative languages

View: New views
20 Messages — Rating Filter:   Alert me  
< Prev | 1 - 2 | Next >

stackless fixed-arity concatenative languages

by John Nowak :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

In some languages, such as Haskell, all functions consume a fixed  
number of arguments and produce a fixed number of arguments.* For  
example, the 'foldr' function in Haskell always consumes three values  
and produces one. In fact, all functions in Haskell produce one value.

Concatenative languages like Joy expand on this; functions are not  
restricted to returning only one value.** For example, 'dup' consumes  
one value and produces two. We'll say 'dup' has a 1/2 arity; 'pop'  
would have a 1/0 arity, '+' would have a 2/1 arity, 'swap' would have  
a 2/2 arity, and so on.

However, some functions in Joy go beyond this. For example, 'i'  
essentially takes the entire state of the program as an argument and  
yields a new state. Because 'i' simply invokes the quotation on the  
top of the stack, there's no way to tell how many values 'i' will  
consume or produce without knowing what's on top of the stack. Other  
examples of such functions include 'dip' and 'while'.

One downside of such functions is that it becomes impossible to  
efficiently compile a language like Joy to C; you have no choice but  
to make use of a stack (or some equivalent mechanism) in the  
implementation. In contrast, a concatenative language with only fixed-
arity functions can be efficiently compiled to C without any need for  
a stack. An example of such a language is Forth without n-ary words  
like 'roll' and with the modest restriction that both branches of a  
conditional must have the same stack effect. [1][2]

It seems the way to do this for a language like Joy would be to find  
alternatives for functions like 'i' and add a type system to ensure  
conditional branches are balanced. Ignoring the type system for now,  
it seems one possibility would be to restrict 'i' to simply dequoting  
the quotation on the top of the stack without allowing it access to  
the stack itself. If one wanted the quotation to have access to  
arguments on the stack, they could be partially applied to the  
quotation first. Several versions of 'i' would need to be available to  
indicate how many values will be returned from its use. For example:

    # 'p' is partial application
    # 'i1' indicates that the quotation produces 1 value
    # the result is '7'
    2 [5 +] p i1

Rather than dealing with partial application, a convenient and  
efficient implementation would simply offer a suite of 'i' combinators  
from 0i0 up until 4i4 or so, where the numbers indicate the number of  
values to be consumed and produced.

While this all likely seems to be overly burdensome at first, most  
functions that make use of quotations would not need so many versions.  
For example, 'fold' would only be valid for quotations of arity 2/1,  
and hence only one version would be needed.

Does anyone think such a language would be feasible? What might the  
benefits be? What useful features would be lost?

- John

[1] http://www.complang.tuwien.ac.at/papers/ertl%26maierhofer95.ps.gz
[2] http://www.complang.tuwien.ac.at/papers/ertl96diss.ps.gz

* Yes, all functions in Haskell /really/ take one argument and are  
curried, and yes, there are tricky ways of implementing variadic  
functions, but we'll ignore that for now.

** Yes, all functions are /really/ unary functions of stacks to  
stacks, but we'll ignore that for now.



Re: stackless fixed-arity concatenative languages

by LittleDan :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

This is a reasonable idea for some cases, but there's some generality
that's lost. Look at how reduce is defined in Factor:

: reduce ( seq identity quot -- result )
    swapd each ;

When each runs, it has the stack underneath as an implicit
accumulator. That means that we can just use the code of each when
writing reduce, but only because the quotation can be not just 1/0 but
also 2/1. The genericness here allows a reduction in code duplication
in certain cases which are completely impossible in Haskell (unless
you start by implementing reduce, then base a each off of it
(pretending Haskell has side effects), and this is more complicated),
though arguably unnecessary.

Anyway, can't a type signature for each of something like ( A ( A x --
A ) [x] -- A ) handle this adequately? Is it difficult to infer or
deal with such a type?

Dan

On Tue, May 20, 2008 at 1:15 AM, John Nowak <john@...> wrote:

> In some languages, such as Haskell, all functions consume a fixed
> number of arguments and produce a fixed number of arguments.* For
> example, the 'foldr' function in Haskell always consumes three values
> and produces one. In fact, all functions in Haskell produce one value.
>
> Concatenative languages like Joy expand on this; functions are not
> restricted to returning only one value.** For example, 'dup' consumes
> one value and produces two. We'll say 'dup' has a 1/2 arity; 'pop'
> would have a 1/0 arity, '+' would have a 2/1 arity, 'swap' would have
> a 2/2 arity, and so on.
>
> However, some functions in Joy go beyond this. For example, 'i'
> essentially takes the entire state of the program as an argument and
> yields a new state. Because 'i' simply invokes the quotation on the
> top of the stack, there's no way to tell how many values 'i' will
> consume or produce without knowing what's on top of the stack. Other
> examples of such functions include 'dip' and 'while'.
>
> One downside of such functions is that it becomes impossible to
> efficiently compile a language like Joy to C; you have no choice but
> to make use of a stack (or some equivalent mechanism) in the
> implementation. In contrast, a concatenative language with only fixed-
> arity functions can be efficiently compiled to C without any need for
> a stack. An example of such a language is Forth without n-ary words
> like 'roll' and with the modest restriction that both branches of a
> conditional must have the same stack effect. [1][2]
>
> It seems the way to do this for a language like Joy would be to find
> alternatives for functions like 'i' and add a type system to ensure
> conditional branches are balanced. Ignoring the type system for now,
> it seems one possibility would be to restrict 'i' to simply dequoting
> the quotation on the top of the stack without allowing it access to
> the stack itself. If one wanted the quotation to have access to
> arguments on the stack, they could be partially applied to the
> quotation first. Several versions of 'i' would need to be available to
> indicate how many values will be returned from its use. For example:
>
> # 'p' is partial application
> # 'i1' indicates that the quotation produces 1 value
> # the result is '7'
> 2 [5 +] p i1
>
> Rather than dealing with partial application, a convenient and
> efficient implementation would simply offer a suite of 'i' combinators
> from 0i0 up until 4i4 or so, where the numbers indicate the number of
> values to be consumed and produced.
>
> While this all likely seems to be overly burdensome at first, most
> functions that make use of quotations would not need so many versions.
> For example, 'fold' would only be valid for quotations of arity 2/1,
> and hence only one version would be needed.
>
> Does anyone think such a language would be feasible? What might the
> benefits be? What useful features would be lost?
>
> - John
>
> [1] http://www.complang.tuwien.ac.at/papers/ertl%26maierhofer95.ps.gz
> [2] http://www.complang.tuwien.ac.at/papers/ertl96diss.ps.gz
>
> * Yes, all functions in Haskell /really/ take one argument and are
> curried, and yes, there are tricky ways of implementing variadic
> functions, but we'll ignore that for now.
>
> ** Yes, all functions are /really/ unary functions of stacks to
> stacks, but we'll ignore that for now.
>
>

Re: stackless fixed-arity concatenative languages

by John Nowak :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On May 20, 2008, at 2:39 AM, Daniel Ehrenberg wrote:

> This is a reasonable idea for some cases, but there's some generality
> that's lost.

Certainly.

> Look at how reduce is defined in Factor:
>
> : reduce ( seq identity quot -- result )
>    swapd each ;

Aye. It makes sense as 'each' is a generalized version of 'fold' that  
allows you to access the stack and not just deal with two values. It's  
not clear to me though if something like 'each' is too powerful for  
its own good. It probably isn't.

> When each runs, it has the stack underneath as an implicit
> accumulator. That means that we can just use the code of each when
> writing reduce, but only because the quotation can be not just 1/0 but
> also 2/1.

I suppose I just restated this above for no good reason.

> Anyway, can't a type signature for each of something like ( A ( A x --
> A ) [x] -- A ) handle this adequately? Is it difficult to infer or
> deal with such a type?

The type for 'each' in Cat is exactly that and it works fine in  
practice. My main interest here is in allowing a direct and efficient  
translation to C. My guess is that the sacrifices necessary will be  
too severe, although it may possibly allow easier reasoning about  
program behavior that might make the loss of generality worth it.  
Given the lack of problems n-ary functions have presented though in  
terms of types and formalizing a term rewriting semantics, I'm  
skeptical of that being the case.

- John

stackless concatenativity + visual dataflow

by John Nowak :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Here is a visualization of the function 'swap dup dip swap' where we  
want the quotation supplied to 'dip' to consume two arguments and  
produce one (as '+' or 'cons' does); note that 'R' and 'S' are row  
variables that represent some "rest of the stack":

     http://johnnowak.com/heap/stackless-dataflow.png

As you can (maybe?) see, the restrictions necessary to allow a  
stackless implementation of a concatenative language, namely that the  
arities of all functions be locally knowable, simplifies things quite  
a bit. At the very least, it gives us a nice mapping to a visual  
dataflow language where each "transformer" has a fixed number of  
"ports" and the column position of each "port" directly corresponds to  
an absolute stack position (for lack of a better term). I find this  
easier to follow than the alternative version and would certainly find  
it easier to explain to someone else.

I suppose all there is to do at this point is implement an interpreter  
for such a language and see how painful the "fixed arity" restriction  
is relative to its benefits. I suspect a simple macro system would go  
a long way towards making the restriction tolerable by allowing  
"generalized" versions of combinators to be written. Alternatively,  
syntactic support for specifying arities could be built into the  
language. Ultimately, that would be really no different than the  
information you give the compiler when you write something like (foo a  
b c) in Scheme.

- John

Re: stackless concatenativity + visual dataflow

by Chris Double :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, May 20, 2008 at 9:48 PM, John Nowak <john@...> wrote:
> I suppose all there is to do at this point is implement an interpreter
> for such a language and see how painful the "fixed arity" restriction
> is relative to its benefits.

Based on the things I've written in Factor a 'fixed arity' restriction
would have little to no effect on what I've written. It did in the
past when 'curry' was discouraged and quotations passed to 'map' and
the like would often access items deeper in the stack. For example:

  5 { 1 2 3 } [ over + ] map
  => { 6 7 8 }

A map-with operation was eventually provided to make this easier:

  5 { 1 2 3 } [ + ] map-with

Now with the composition of quotations being more acceptable in factor
(previously composing quotations caused performance problems) words
like curry can be used:

  5 { 1 2 3 } swap [ + ] curry map

(There's actually a word 'with' that makes this more readable).

Would your restriction still allow curry, compose, etc?

Chris.
--
http://www.bluishcoder.co.nz

Re: stackless fixed-arity concatenative languages

by William Tanksley, Jr :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

John Nowak <john@...> wrote:
> One downside of such functions is that it becomes impossible to
> efficiently compile a language like Joy to C; you have no choice but
> to make use of a stack (or some equivalent mechanism) in the
> implementation.

"Impossible" is an overly strong word (especially in conjunction with
a shades-of-grey word like "efficiently"). It's quite possible to use
the stack in the same way that C itself does; but it's very hard to
work together with your C compiler's optimizer, and I'd say that is
the major difficulty, not anything having to do with stacks.

Even in the presence of statically indeterminate stack effects, MOST
of the program can be efficiently compiled.

BUT, I do agree that it's simple to statically determine stack
effects, so your point is otherwise well-taken. I think that would be
a good variant of Joy.

> It seems the way to do this for a language like Joy would be to find
> alternatives for functions like 'i' and add a type system to ensure
> conditional branches are balanced. Ignoring the type system for now,
> it seems one possibility would be to restrict 'i' to simply dequoting
> the quotation on the top of the stack without allowing it access to
> the stack itself. If one wanted the quotation to have access to
> arguments on the stack, they could be partially applied to the
> quotation first. Several versions of 'i' would need to be available to
> indicate how many values will be returned from its use. For example:

Indeed, Joy does this in many places, with all its functions that
contain a numeral in their name.

>    # 'p' is partial application
>    # 'i1' indicates that the quotation produces 1 value
>    # the result is '7'
>    2 [5 +] p i1
>
> Rather than dealing with partial application, a convenient and
> efficient implementation would simply offer a suite of 'i' combinators
> from 0i0 up until 4i4 or so, where the numbers indicate the number of
> values to be consumed and produced.

At this point, it would seem convenient to add some syntax for
staticly-known (and required) parameters. Perhaps (i 4 4) would be an
example of such syntax. strongForth does something like this with its
EXECUTE function (equivalent to 'i'), which is always followed by a
parenthetical type declaration.

> Does anyone think such a language would be feasible? What might the
> benefits be? What useful features would be lost?

Well, you'd also need to remove Joy's stack-preservation semantics. As
long as that's present, you simply have to implement a stack, no way
around it.

But yes, I see no reason why it'd be hard at all. You'd lose the
ability to create code with dynamic stack effects, but such code is
generally frowned upon anyhow -- it's very hard to read.

> - John

-Wm

Re: stackless concatenativity + visual dataflow

by John Nowak :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On May 20, 2008, at 7:52 AM, Chris Double wrote:

> Based on the things I've written in Factor a 'fixed arity' restriction
> would have little to no effect on what I've written.

Very good to hear! I was half-expecting it to be shot down for being  
overly restrictive.

> It did in the past when 'curry' was discouraged and quotations passed
> to 'map' and the like would often access items deeper in the stack...
>
>  5 { 1 2 3 } [ over + ] map
>  5 { 1 2 3 } [ + ] map-with

The version using map-with is certainly clearer. Perhaps there really  
is a readability benefit in gently forcing the programmer to use  
combinators with fixed stack effects.

> Would your restriction still allow curry, compose, etc?

Yes, and you wouldn't need different versions of 'compose' for each  
possible arity combination of the functions being composed. In other  
words, '[0] [+] compose' and '[swap] [over] compose' would both be  
fine, even though the arities of the input quotations and the arities  
of the resulting composed quotations do not match at all.

- John

Re: stackless fixed-arity concatenative languages

by John Nowak :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On May 20, 2008, at 9:42 AM, William Tanksley, Jr wrote:

> John Nowak <john@...> wrote:
>
>> One downside of such functions is that it becomes impossible to
>> efficiently compile a language like Joy to C; you have no choice but
>> to make use of a stack (or some equivalent mechanism) in the
>> implementation.
>
> "Impossible" is an overly strong word (especially in conjunction with
> a shades-of-grey word like "efficiently"). It's quite possible to use
> the stack in the same way that C itself does; but it's very hard to
> work together with your C compiler's optimizer, and I'd say that is
> the major difficulty, not anything having to do with stacks.

That makes sense. Agreed.

>> Rather than dealing with partial application, a convenient and
>> efficient implementation would simply offer a suite of 'i'  
>> combinators
>> from 0i0 up until 4i4 or so, where the numbers indicate the number of
>> values to be consumed and produced.
>
> At this point, it would seem convenient to add some syntax for
> staticly-known (and required) parameters. Perhaps (i 4 4) would be an
> example of such syntax.

That's certainly possible. I already have that syntax for special  
forms anyway. It might be nice to not require the numbers to be given  
when the effect is inferable though. For example, I often do things  
like '[+] dip', and it would be rather annoying to have to do '[+]  
(dip 2 1)' instead.

Once you start inferring that sort of thing though, you have a  
situation (again) where you can't split definitions cleanly because  
there might not be enough information to infer stack effects. Perhaps  
instead, you could have a 'dip-with' form that takes a quotation  
directly. You could then do something like '(dip-with +)'. This way,  
it is at least clear that you can't split between '+' and 'dip-with'.

In any case, it should be possible to figure out some nice way of  
doing all this.

>> Does anyone think such a language would be feasible? What might the
>> benefits be? What useful features would be lost?
>
> Well, you'd also need to remove Joy's stack-preservation semantics. As
> long as that's present, you simply have to implement a stack, no way
> around it.

Aye. I've already decided against that as Joy's semantics are just too  
hard to compile efficiently. The one thing I really wanted Joy's stack-
preservation for however was assertions and contracts. It's important  
to be able to remove assertions from a program without changing the  
program's meaning, and preserving the stack takes care of that.  
However, a fixed arity restriction does that as well, as it is known  
at compile time how many elements of the stack need to be saved and  
restored for the call to assert. Since we're not longer dealing with a  
stack though, you just call assertion with the arguments it needs and  
ignore the result. Very easy.

> But yes, I see no reason why it'd be hard at all. You'd lose the
> ability to create code with dynamic stack effects, but such code is
> generally frowned upon anyhow -- it's very hard to read.

Very good. Thank you William and Chris for giving me the go-ahead.

- John

Re: stackless fixed-arity concatenative languages

by William Tanksley, Jr :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

John Nowak <john@...> wrote:
> William Tanksley, Jr wrote:
>> At this point, it would seem convenient to add some syntax for
>> staticly-known (and required) parameters. Perhaps (i 4 4) would be an
>> example of such syntax.

> That's certainly possible. I already have that syntax for special
> forms anyway. It might be nice to not require the numbers to be given
> when the effect is inferable though. For example, I often do things
> like '[+] dip', and it would be rather annoying to have to do '[+]
> (dip 2 1)' instead.

Agreed; this should tie in with type inference and annotation. This
stack effect notation is really an abbreviated form of type notation
and static polymorphism / function overloading, so perhaps it should
be clearly annotated as such (rather than simply looking like a macro,
as I suggested).

> Once you start inferring that sort of thing though, you have a
> situation (again) where you can't split definitions cleanly because
> there might not be enough information to infer stack effects. Perhaps
> instead, you could have a 'dip-with' form that takes a quotation
> directly. You could then do something like '(dip-with +)'. This way,
> it is at least clear that you can't split between '+' and 'dip-with'.

For that to work consistently, the type system would have to produce
an error whenever annotation was missing, even if the inference could
theoretically be made.

> In any case, it should be possible to figure out some nice way of
> doing all this.

Perhaps :-).

>>> Does anyone think such a language would be feasible? What might the
>>> benefits be? What useful features would be lost?

>> Well, you'd also need to remove Joy's stack-preservation semantics. As
>> long as that's present, you simply have to implement a stack, no way
>> around it.

> It's important
> to be able to remove assertions from a program without changing the
> program's meaning, and preserving the stack takes care of that.
> However, a fixed arity restriction does that as well, as it is known
> at compile time how many elements of the stack need to be saved and
> restored for the call to assert. Since we're not longer dealing with a
> stack though, you just call assertion with the arguments it needs and
> ignore the result. Very easy.

Yes, it seems like fixing the arity is the correct action (really, you
don't ignore the result; you make sure the result is always 'true').

> - John

-Wm

Re: stackless concatenativity + visual dataflow

by William Tanksley, Jr :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

John Nowak <john@...> wrote:
> The version using map-with is certainly clearer. Perhaps there really
> is a readability benefit in gently forcing the programmer to use
> combinators with fixed stack effects.

The Forth community's experience strongly suggests that there is.
Words with externally visible variable stack effects are frowned upon.

> - John

-Wm

Re: stackless fixed-arity concatenative languages

by John Nowak :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On May 20, 2008, at 7:17 PM, William Tanksley, Jr wrote:

>>> At this point, it would seem convenient to add some syntax for
>>> staticly-known (and required) parameters. Perhaps (i 4 4) would be  
>>> an
>>> example of such syntax.
>>
>> it would be rather annoying to have to do '[+] (dip 2 1)' instead.
>
> Agreed; this should tie in with type inference and annotation. This
> stack effect notation is really an abbreviated form of type notation
> and static polymorphism / function overloading, so perhaps it should
> be clearly annotated as such (rather than simply looking like a macro,
> as I suggested).

It seems to me that the way to handle this is to simply require a  
normal type annotation that provides enough information to figure  
these things out. There is a simple rule for determining if all  
arities for all function instances within a function are known by  
looking at the row variables, so this wouldn't introduce too much  
complication. The rule is that the row variable for the outermost  
function must be the same in the consumption and production. (There  
are ways to break this with stack-saving/restoring primitives, so  
they'd need to be special forms if they were to exist at all.) In  
other words, 'A (B -> C) d -> A d (B -> C)' is fine, but 'A (A -> B) d  
-> B d' isn't.

Specifying arities directly seems like it would be too much of a pain.  
The reason is that I use normal functions to deconstruct sum types  
where each quotation provided to the function indicates what to do if  
the sum type was constructed via a particular constructor. It would be  
way too inconvenient to have to specify arities for all of these.  
They'd also no longer be "normal functions" if they somehow were  
allowed to get around the arity rule. This applies to something like  
'dip' as well. If you require all user-written functions to have a  
fixed arity, something harmless like 'foo = dip' would not be valid.

It seems what is required is two classes of functions, ones with fixed  
arities and ones without. For functions with known arities, you could  
do something like '(define foo [+] dip)'. For functions without known  
arities, you could do something like '(define-nary bar [dip] dip)'.  
Functions that are defined as n-ary functions would act like macros;  
code would not be generated for them until they are instantiated in  
such a way that the arity is known for that particular instance. This  
seems like a reasonable requirement, makes rules for separate  
compilation simple, and encourages fixed arity functions while  
allowing n-ary functions where necessary.

>> You could then do something like '(dip-with +)'. This way,
>> it is at least clear that you can't split between '+' and 'dip-with'.
>
> For that to work consistently, the type system would have to produce
> an error whenever annotation was missing, even if the inference could
> theoretically be made.

Yes, this is the general problem with ensuring that you can break  
functions apart: You either deal with unnecessary restrictions or  
resort to some form of whole program analysis. The type system I'm  
working on now is essentially a clever way of doing whole program  
analysis by delaying certain things when doing local inference, but it  
doesn't interact well with type annotations which will be necessary  
for restricting n-ary functions to fixed arities so that code can be  
generated for them when doing separate compilation.

I'm possibly just talking to myself at this point, but it seems like I  
should just give up on the idea of preserving cut-and-paste factoring  
in all cases, use a simpler yet highly expressive type system that  
occasionally requires annotations like Leijen's HMF or HML, drop  
macros (since we have no binding forms to deal with, we have a  
lightweight syntax for quotations, we don't build abstractions around  
things like SET!, postfix syntax is already quite good for DSLs, etc),  
and just get on with writing an implementation already.

- John

P.S. Apologies if this message ends up on the list twice. My first  
attempt was rejected.

Re: stackless fixed-arity concatenative languages

by John Nowak :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On May 20, 2008, at 10:00 PM, John Nowak wrote:

> There is a simple rule for determining if all
> arities for all function instances within a function are known by
> looking at the row variables, so this wouldn't introduce too much
> complication. The rule is that the row variable for the outermost
> function must be the same in the consumption and production.

After a brief review, this is clearly bogus. For example, we might  
have a 'loop' combinator that simply executes a procedure repeatedly:

    loop :: A (A -> A) -> A

Here, the row variables match, but we have no idea of how many  
elements of the stack 'loop' will actually use. There are, however,  
functions with the same signature as 'loop' that would work just fine,  
such as those that simply require a function that would not alter the  
type of the stack but do not actually call it. Therefore, the type of  
a function alone (at least with a Cat-like type system) can give no  
guarantee either way regarding if the fixed arity requirement has been  
satisfied. Some other solution will be necessary.

- John

Re: stackless fixed-arity concatenative languages

by John Nowak :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

After doing some thinking, I've come up with an example of where an n-
ary 'dip' combinator makes a complete mess of things. The example is  
Factor's 'bi@' combinator. The 'bi@' combinator takes some x, some y,  
and a quotation which it applies to both. As an example, '4 5 [ dup  
* ] bi@' yields '16 25'.

Here's one way we might think to implement 'bi@' in Factor:

    : bi@ ( x y quot -- x' y' ) dup dip dip ;

The thinking here is that we duplicate the quotation, apply it below  
the original quotation to 'y', then apply the original quotation to  
'x'. Just to be clear, I'll trace the execution of an expression as if  
Factor used term rewriting (which I find is the easiest way to show  
this sort of thing):

    4 5 [ dup * ] bi@
    4 5 [ dup * ] dup dip dip
    4 5 [ dup * ] [ dup * ] dip dip
    4 5 dup * [ dup * ] dip
    4 5 5 * [ dup * ] dip
    4 25 [ dup * ] dip
    4 dup * 25
    4 4 * 25
    16 25

Great. It worked as we wanted it to. There is, however, a bug. The bug  
is that the first call to the quotation has both 4 and 5 on the stack,  
while the second call only has 4 on the stack. In other words, the  
quotations are exposed to stacks of different sizes. If the quotation  
supplied uses two elements on the stack rather than one, very weird  
things will happen.

Surely a type system like Cat's would catch this, yes? Well, sort of.  
Unfortunately, it does it in the worst way possible; it gives bi@ the  
type 'A [A -> A b] -> A b b'. What this means is that the quotation  
must keep the type of the stack the same and add another element. This  
means doing '4 5 [dup *] bi@' would be a type error since '[dup *]' is  
not an instance of '[A -> A b]'; it's an instance of '[A b -> A b]'.  
Furthermore, it doesn't even indicate that two elements must be below  
the quotation on the stack! As a result, all you can do are things  
like '4 5 [6] bi@' to yield '4 5 6 6' which makes the function  
completely useless.

One way to solve this problem is to use a version of 'dip' that calls  
a quotation with only one element and requires the quotation to yield  
one element. Cat's type system cannot enforce this, but Fifth's type  
system can. Below are the types of the unrestricted 'dip' combinator  
and the restricted version:

    dip    :: A b [A -> C] -> C b
    dip1-1 :: a b [a -> c] -> c b

As you can see, the types are basically the same, except the quotation  
can only take one element 'a' and transform it to one element 'c'  
rather than the stack 'A' to the stack 'C'. Using the restricted  
combinator, our function would look like this:

    bi@ = dup dip1-1 dip1-1

We could then assign it the following type:

    bi@ :: a a [a -> b] -> b b

This type is much more like what we wanted. Indeed, it'll let us do '4  
5 [dup *] bi@'. It does have the restriction however that both  
elements below the quotation must be of the same type. To solve this  
we need higher rank types which is another discussion altogether.

Now the nasty bit: 'bi@ = dup dip dip' can be assigned the type above  
in Fifth provided that you give it a signature. The signature 'a a [a -
 > b] -> b b', however, is not an instance of the type that is  
inferred (which is the same as Cat's: 'A [A -> A b] -> A b b'). This  
is very unfortunate as it means Fifth does not have principal types in  
the presence of n-ary combinators! This problem would hold for Cat as  
well if it had a way of enforcing similar arity restrictions (which it  
will need to be able to eventually for things like callbacks if  
nothing else).

This is rather damning, and it may mean that n-ary combinators have to  
be completely eliminated unless a more expressive system can be found.  
One way to do this would be fixed arity versions of 'dip' and 'i',  
along with built-in mechanisms for pattern matching and deconstruction  
(a la Haskell, et al) rather than simply using normal functions like  
'unlist' as I've talked about previously. There would still need to be  
some way to address the need to allow the user to write new  
combinators like '2dip' that apply a quotation below the top two  
elements of the stack rather than just one. Introducing something like  
lambda expressions would be one way to do this (similar to how  
Backus's Formal FP allowed lambda expressions for creating new  
combinators).

It's important to remember that 'dup dip dip' works fine for our  
expected use case, but has some very strange behaviors lurking about!  
The new type system I'm working on does capture this by giving 'dup  
dip dip' the type '[$A] => $A [$A] dip'. Essentially, it types 'dup  
dip', realizes there's no most general way to proceed, and stops and  
waits for more information about $A (the quotation passed in).  
Unfortunately, this type system is undecidable, expensive, and yields  
ugly types. What fun!

So what's the correct way to write 'bi@' so that Cat gives it the  
correct type without placing an arity restriction on the quotation? It  
turns out there isn't one. I thought maybe something like 'under apply  
slip' would work, but I get the rather insane type 'A b (A self b -> C  
(C -> D) e) -> D e'. Amusing. Another attempt was 'under [[apply dip]  
dip apply', but this gets the type 'A b b -> (A b -> A) -> A' which  
means you can only do things like '3 4 [pop] bi@' and hence it's  
rather useless. If you think about it, there's really no way it could  
possibly give a useful type without an arity restriction. Well,  
actually it could if you had first-class stacks. That's something else  
I have to consider as a possible solution...

Hey, I thought concatenative languages were supposed to be simpler!

- John

Re: stackless fixed-arity concatenative languages

by John Nowak :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On May 21, 2008, at 4:08 AM, John Nowak wrote:

> Another attempt was 'under [[apply dip]
> dip apply', but this gets the type 'A b b -> (A b -> A) -> A' which
> means you can only do things like '3 4 [pop] bi@' and hence it's
> rather useless.

Minor correction: The type is 'A b b (A b -> A) -> A'.

Too much curry.

- John

Re: stackless fixed-arity concatenative languages

by Christopher Diggins :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> It's important to remember that 'dup dip dip' works fine for our
> expected use case, but has some very strange behaviors lurking about!
> The new type system I'm working on does capture this by giving 'dup
> dip dip' the type '[$A] => $A [$A] dip'. Essentially, it types 'dup
> dip', realizes there's no most general way to proceed, and stops and
> waits for more information about $A (the quotation passed in).
> Unfortunately, this type system is undecidable, expensive, and yields
> ugly types. What fun!

On the other hand this looks like a turing-complete programming
language at the type level. Perhaps if you go that route, and not
worry about decidability, you will arrive at something interesting.

That Cat type issues you have identified are appreciated. They are
related to the inability to properly assign the correct level of
nesting of nested forall qualifiers, and to do renaming properly. This
is still something that I have on my to-do list, to try and improve.

Oh, and for the record, I still want variable arity functions in
concatenative langauges, where the arity depends on the arity of
function parameters. However, some of the Joy primitives have a little
too much flexibility with stack access: I don't want all of my list
processing functions to have access to the stack, because it seriously
impedes performance. What I am doing in Cat is making multiple
versions of the list processing functions (like Map and Fold): ones
that allows unfettered access to the stack (and thus is order
dependent) and others that don't.

- Christopher

Re: stackless fixed-arity concatenative languages

by William Tanksley, Jr :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

John Nowak <john@...> wrote:
> William Tanksley, Jr wrote:
>>>> At this point, it would seem convenient to add some syntax for
>>>> staticly-known (and required) parameters. Perhaps (i 4 4) would be
>>>> an example of such syntax.
>>> it would be rather annoying to have to do '[+] (dip 2 1)' instead.

>> Agreed; this should tie in with type inference and annotation. This
>> stack effect notation is really an abbreviated form of type notation
>> and static polymorphism / function overloading, so perhaps it should
>> be clearly annotated as such (rather than simply looking like a macro,
>> as I suggested).

> It seems to me that the way to handle this is to simply require a
> normal type annotation that provides enough information to figure
> these things out.

I agree; for a language with static typechecking, this makes complete
sense. If you don't have static typechecking but you want to do static
stack depths, you have to check and annotate arities.

> Specifying arities directly seems like it would be too much of a pain.

Sure, if you also have to specify types it's a waste of time. If you
don't have types it's easier to add arities than it is to add types.

> I'm possibly just talking to myself at this point, but it seems like I
> should just give up on the idea of preserving cut-and-paste factoring
> in all cases, use a simpler yet