|
View:
New views
8 Messages
—
Rating Filter:
Alert me
|
|
|
fundamental type system problems + possible solutionsIt seems there's a fundamental problem with typing concatenative
languages: Most type systems are not compositional which destroys the simple algebra of concatenative programs. Take these equivalencies for example: [$A] i == $A -- application [$A] [$B] o == [$A $B] -- composition $a [$B] pa == [$a $B] -- partial application [$A] m == [$A] $A -- self-application Unfortunately, all of these rules (except perhaps for partial application) fall apart once types are involved. This is of practical concern as it makes cut-and-paste refactoring impossible and will cause problems with many macro transformations. What is needed is a type system that is compositional. One system that meets this requirement is System E developed by Carlier and Wells [1]. While this seems like an appealing solution at first, the types produced by System E are so complex that they're not really human- decipherable. Even if a solution to making error reports understandable could be found via some form of type error slicing [2], there would still be no way to make types useful as documentation. As I see it, there are three main benefits of static type systems: Automatic and correct documentation, increased safety, and increased efficiency. Of these three, I consider documentation to be the most important. After all, there are language implementations without static type systems that perform quite well such as SBCL and LuaJIT. As for safety, dynamic languages are used to build highly-reliable systems (see Erlang); ultimately, the design of a program and ability of the implementors contributes much more to software reliability than the presence of a type system (compare Haskell/Darcs to C/Perl/Git). The need for documentation, however, you can't escape. This is especially so in stack-based languages where stack effect comments are necessary to make sense out of things. While no type system will replace the need for such comments entirely, as it is often necessary to state not just the type of a value but its purpose, a type system can at least ensure the correct number of arguments will always be passed and that type constraints will be respected. I would rather have no type system than one which makes simple transformations, and hence macros and much refactoring, impossible. Static type systems seem doomed to being either non-compositional or not suited to producing human-readable documentation. Therefore, as I see it, there are two possibilities going forward. One is to develop a soft type system that hopefully will be able to report most errors without rejecting any program. If a language is designed with a soft type system in mind, the quality of error reporting can be quite good. For example, dynamic checks can be inserted to ensure lists are used homogeneously and type declarations can be required for data types to avoid the need for equi-recursive types. The other approach would be some way of inferring the equivalences mentioned at the beginning of this email. I guess this would take the form of some kind of abstract interpretation. The goal here would not be to catch all type errors (although some would be catchable), but rather to provide a detailed description of what a function does beyond what an HM-based type system would be capable of yielding. It seems the primary issue here would be giving equivalencies for functions involving recursive data types. For example, if we define some 'each' function that simply places an element from a list on the stack and then calls some function repeatedly until the list is empty, how would we express this? Anyway, these are just my current thoughts. As for Fifth, I've decided to delay inclusion of the type system until I can find a solution that won't cause problems with macros or inhibit cut-and-paste refactoring. (I'll credit Slava Pestov for reminding me of how critical this is.) Any thoughts would be appreciated, especially regarding the second approach described above. - John [1] http://www.church-project.org/modular/compositional/system-E/ [2] http://www.macs.hw.ac.uk/ultra/compositional-analysis/type-error-slicing/slicing.cgi |
|
|
Re: fundamental type system problems + possible solutionsJohn Nowak <john@...> wrote:
> It seems there's a fundamental problem with typing concatenative > languages: Most type systems are not compositional which destroys the > simple algebra of concatenative programs. The literature includes some uses of the word "compositional" for type systems, but is there a specific meaning you have in mind here? I think you're talking about a type system which completely preserves the associative property of concatenative languages (the associative property guarantees both arbitrary factoring and arbitrary order of evaluation). That's a goal I share... Although it's also a goal I want to be able to work around. There are times when a type system should be able to force a specific order of evaluation, primarily when the system deduces the presence of side effects. Let me add one more thing before I go on: this problem is specific to type inference, not to type checking in general. If you always annotate inputs, outputs, and executions, there's no problem. I need to read up on this; thank you for the references. > As I see it, there are three main benefits of static type systems: > Automatic and correct documentation, increased safety, and increased > efficiency. I suppose the availability of statically known information to the programmer counts as "documentation", but sometimes it's more than that. Consider Ada's type attributes, as in TYPENAME'MAX or TYPENAME'SIZE. In theory, we could do much better. I would consider a static type system to be a huge "win" if it helped programmers. I'm not against static type systems for their own sake, but I'm strongly for them when they help. The effect system you proposed would have some interesting ways of helping. Compare Spark Ada -- which, by the way, might be a useful role model for a very powerful type system that doesn't actually hurt code that it can't check (it's entirely optional, contained entirely within comments of perfectly standard Ada code). http://en.wikipedia.org/wiki/SPARK_programming_language is perhaps a good place to start. (I know there are more formal systems out there, but this one's especially interesting because it's purely optional.) > Of these three, I consider documentation to be the most > important. After all, there are language implementations without > static type systems that perform quite well such as SBCL and LuaJIT. A good point; JITs tend to have a lot of useful information available to them. On the other hand, it's easy to design a language that is very hard to JIT. > ultimately, the design of a program and ability > of the implementors contributes much more to software reliability than > the presence of a type system (compare Haskell/Darcs to C/Perl/Git). Ability, dedication, and above all else _testing_. Darcs isn't buggy because its maintainer is unskilled; it's buggy (to the extent that it IS) because he can't consider all the possible cases, and there aren't enough people hammering on it (both in coders and in users) to find them reliably. (I know Darcs is getting a LOT better, by the way; it was my favorite, although I use Mercurial now.) Support for testing is important; it would be nice to have support for it baked into the compiler, and to have support not only for specific test cases, but also for universal claims (as in Design By Contract, Haskell's QuickCheck, and formal static annotation-augmented analysis like SPARK). > I would rather have no type system than one which makes simple > transformations, and hence macros and much refactoring, impossible. > Static type systems seem doomed to being either non-compositional or > not suited to producing human-readable documentation. Therefore, as I > see it, there are two possibilities going forward. Requiring limited annotation could be useful, and has some nice effects in terms of safety as well -- it's easy to write a function for which the compiler infers a correct type that is not similar to the type you'd desired (thus deferring the error message until later, when it's probably less useful). > The other approach would be some way of inferring the equivalences > mentioned at the beginning of this email. I guess this would take the > form of some kind of abstract interpretation. The goal here would not > be to catch all type errors (although some would be catchable), but > rather to provide a detailed description of what a function does > beyond what an HM-based type system would be capable of yielding. It > seems the primary issue here would be giving equivalencies for > functions involving recursive data types. For example, if we define > some 'each' function that simply places an element from a list on the > stack and then calls some function repeatedly until the list is empty, > how would we express this? The uncomputable sky is the limit -- we need a type theory in order to decide anything here. > - John -Wm |
|
|
Re: fundamental type system problems + possible solutionsOn Apr 29, 2008, at 1:16 PM, William Tanksley, Jr wrote: > John Nowak <john@...> wrote: >> It seems there's a fundamental problem with typing concatenative >> languages: Most type systems are not compositional which destroys the >> simple algebra of concatenative programs. > > The literature includes some uses of the word "compositional" for type > systems, but is there a specific meaning you have in mind here? I > think you're talking about a type system which completely preserves > the associative property of concatenative languages (the associative > property guarantees both arbitrary factoring and arbitrary order of > evaluation). Correct. The expression '[a b] [c] compose' should always give the same type as '[a] [b c] compose'. Additionally, '[foo] i' should give the same type as 'foo'. > That's a goal I share... Although it's also a goal I want to be able > to work around. There are times when a type system should be able to > force a specific order of evaluation, primarily when the system > deduces the presence of side effects. Fifth is a strict language in which evaluation is always left to right, so I don't think this would be an issue. > Let me add one more thing before I go on: this problem is specific to > type inference, not to type checking in general. If you always > annotate inputs, outputs, and executions, there's no problem. This is true. The problem comes in when dealing with macros. For example, a macro that translates lambda expressions to stack-based code needs to splice values into (possibly nested) quotations. Unfortunately, this translation might result in very confusing type errors, and there would be no way to give types to the generated quotations since the code is the result of a macro. Ignoring macros for a moment, I have no issue with requiring annotations. I'm particularly partial to Daan Leijen's HML system (http://research.microsoft.com/users/daan/pubs.html ). It gives more precise types than System F and offers full inference for impredicative instantiation. Type annotations are only required when a function parameter has a polymorphic type. Maybe the solution is to just forget about macros, build in lambda expressions (which will give better error reporting), build in pattern matching, and just get on with it. Bah. > Compare Spark Ada -- which, by the way, might be a useful role model > for a very powerful type system that doesn't actually hurt code that > it can't check (it's entirely optional, contained entirely within > comments of perfectly standard Ada code). Aye, Spark is great. It's unfortunate that it isn't free. > Ability, dedication, and above all else _testing_. Darcs isn't buggy > because its maintainer is unskilled; it's buggy (to the extent that it > IS) because he can't consider all the possible cases, and there aren't > enough people hammering on it (both in coders and in users) to find > them reliably. (I know Darcs is getting a LOT better, by the way; it > was my favorite, although I use Mercurial now.) I love Darcs myself, and I continue to use it despite the issues. To be clear, in no way meant to say the people behind Darcs are unskilled. Rather, my point is that there are problems in the big- picture design of the program itself. Despite Haskell's rather fancy types, and the team's use of GADTs to enforce correctness, I still wouldn't trust it for anything involving more than a few people. Luckily, it does seem to be getting better. Darcs 2 is quite zippy, relatively speaking. To be honest, I'm just experiencing some cognitive dissonance. My brain is trying frantically to rationalize avoiding the type system for awhile. > Support for testing is important; it would be nice to have support for > it baked into the compiler, and to have support not only for specific > test cases, but also for universal claims (as in Design By Contract, > Haskell's QuickCheck, and formal static annotation-augmented analysis > like SPARK). Fifth will offer contracts. This was a bit of a trick to get to working efficiently given that I wanted to be able to ensure removing contracts would have no effect on the program. In particular, it proved necessary for the programmer to state how many items on the stack a contract will need to access so that a portion of the stack can be (shallow) copied and restored before each assertion. The type system would ensure no more than the number of elements specified are accessible to the assertion. This would all be much easier in Joy where the stack is persistent, but this is out of the question for efficiency reasons. Thanks for the input. - John |
|
|
Re: fundamental type system problems + possible solutionsJohn Nowak <john@...> wrote:
> William Tanksley, Jr wrote: > > John Nowak <john@...> wrote: > > That's a goal I share... Although it's also a goal I want to be able > > to work around. There are times when a type system should be able to > > force a specific order of evaluation, primarily when the system > > deduces the presence of side effects. > Fifth is a strict language in which evaluation is always left to > right, so I don't think this would be an issue. Are you saying that it's impossible to write an optimizing code generator for Fifth? :-) Seriously, though, I don't see why you should worry about 'effects inference' if you can't reorder anything. The only reason I know of to disallow effects is if evaluation order is flexible. > > Let me add one more thing before I go on: this problem is specific to > > type inference, not to type checking in general. If you always > > annotate inputs, outputs, and executions, there's no problem. > This is true. The problem comes in when dealing with macros. For > example, a macro that translates lambda expressions to stack-based > code needs to splice values into (possibly nested) quotations. > Unfortunately, this translation might result in very confusing type > errors, and there would be no way to give types to the generated > quotations since the code is the result of a macro. Perhaps some more thought could be put into integrating the macro system and the type system. It seems to me that there are a few ways it could be done. Every parameter of the macro has a type; if inferring the type is too hard (and it probably is) you can require type signatures as part of the macro definition. The result of the macro is always a stretch of code, which also always has a type, although I expect it to not be possible to manually sign that type in general (although there's a proper subset of macros for which the type is manually signable). Obviously, a typechecked macro system would exclude some obvious macros... But can we get things done with a subset of all possible macros? I think so... > > Compare Spark Ada -- which, by the way, might be a useful role model > > for a very powerful type system that doesn't actually hurt code that > > it can't check (it's entirely optional, contained entirely within > > comments of perfectly standard Ada code). > Aye, Spark is great. It's unfortunate that it isn't free. Yes, but there's plenty of research on similar concepts. > unskilled. Rather, my point is that there are problems in the big- > picture design of the program itself. Despite Haskell's rather fancy > types, and the team's use of GADTs to enforce correctness, I still > wouldn't trust it for anything involving more than a few people. > Luckily, it does seem to be getting better. Darcs 2 is quite zippy, > relatively speaking. Yup. The crucial lesson is that typechecking can't specify correctness, ever. Correctness is uncomputable, and neither types alone nor tests alone nor contracts alone are sufficient to specify it. > To be honest, I'm just experiencing some cognitive dissonance. My > brain is trying frantically to rationalize avoiding the type system > for awhile. Well, if you CAN, go for it. I enjoy having the freedom to put off a painful part of a project sometimes. > - John -Wm |
|
|
Re: fundamental type system problems + possible solutionsOn Apr 30, 2008, at 10:52 AM, William Tanksley, Jr wrote: >> Fifth is a strict language in which evaluation is always left to >> right, so I don't think this would be an issue. > > Are you saying that it's impossible to write an optimizing code > generator for Fifth? :-) > Seriously, though, I don't see why you should worry about 'effects > inference' if you can't reorder anything. The only reason I know of to > disallow effects is if evaluation order is flexible. Ah, I misunderstood. I thought you were thinking that Fifth was lazy and somehow needed the effect system to avoid stomping all over itself when IO was involved. No, of course the effect system is useful for optimization purposes. It's also useful for things like ensuring the function passed to a parallel map function has no effects. > Perhaps some more thought could be put into integrating the macro > system and the type system. It seems to me that there are a few ways > it could be done. > > Every parameter of the macro has a type; if inferring the type is too > hard (and it probably is) you can require type signatures as part of > the macro definition. The problem is not typing the arguments to the macro; it's dealing with the situation where the expansion of the macro is not inferable and there's no way to provide annotations to quotations within the expansion because the expansion is obviously not in the source code. There are also issues with macros becoming much more tedious due to the need to shuffle data in and out of some AST sum type in the macro procedure itself. Ideally, you'd want implicit coersion to some 'any' type as necessary to avoid the noise. > Obviously, a typechecked macro system would exclude some obvious > macros... But can we get things done with a subset of all possible > macros? I think so... Maybe. It's not just that though; if the facility becomes too much of a pain to use for useful things, it might as well not be there at all. Concatenative languages have a huge benefit in terms of macros because there are no hygiene issues to worry about. Ignoring types, macros are beyond trivial. Ideally, it would be nice to keep this property. A soft type system makes this easy. Not being able to prove that the macro evaluation itself will not have a type error is of little concern to me provided that the expansion can be checked in the vast majority of cases. A dynamic failure during a compile time expansion isn't a big deal. Right now, I'm leaning towards a soft intersection type system that will be robust with respect to transformations and infer most any reasonable program. Even if the type system never gets in your way, there are benefits to soft types. For example, being able to run a program to discover what's wrong with it when the type errors are undecipherable is a very nice feature, especially for programmers new to the language. I'll probably write up something about my approach soon once I've filled in the details. - John |
|
|
Re: fundamental type system problems + possible solutionsJohn Nowak <john@...> wrote:
> William Tanksley, Jr wrote: > > Perhaps some more thought could be put into integrating the macro > > system and the type system. It seems to me that there are a few ways > > it could be done. > > Every parameter of the macro has a type; if inferring the type is too > > hard (and it probably is) you can require type signatures as part of > > the macro definition. > The problem is not typing the arguments to the macro; it's dealing > with the situation where the expansion of the macro is not inferable > and there's no way to provide annotations to quotations within the > expansion because the expansion is obviously not in the source code. I can read this okay up until the phrase "obviously not in the source code." Then I get completely lost. I'm going to reduce the argument to a logical form as I understand it... "It's impossible to X when condition Y holds." X = "provide annotations to quotations within the expansion" Y = "the expansion is obviously not in the source code" I don't see how this could be _possibly_ true. Why should it ever be impossible to annotate a quotation generated by a macro? Why is "obviousness" part of the criteria for impossibility? How can code generated by a macro not be part of the source code (unless you mean not part of the original source code pre-generation, but again, I don't see how it makes a difference whether the code was typed out by hand or generated by a macro). > There are also issues with macros becoming much more tedious due to > the need to shuffle data in and out of some AST sum type in the macro > procedure itself. I think I see your point here... I don't know what AST means (abstract syntax tree can't be right, can it?), but I can imagine a macro being able to dynamically generate code AND also dynamically generate the code's type signature. Yes, that would be both powerful and very annoying. > > Obviously, a typechecked macro system would exclude some obvious > > macros... But can we get things done with a subset of all possible > > macros? I think so... > Maybe. It's not just that though; if the facility becomes too much of > a pain to use for useful things, it might as well not be there at all. That's what I mean -- break "macros" down into different classes and subclasses, and only implement the classes that can currently be reasonably accomplished, possibly using distinct syntax (or better, distinct defining words) for each type. The big problem with doing things this way is that someday you'll find a sensible way to accomplish something new, and you'll have to add on a new subclass of macros to express that. Your language will want to change, but legacy will hold it back. > Concatenative languages have a huge benefit in terms of macros because > there are no hygiene issues to worry about. Ignoring types, macros are > beyond trivial. Ideally, it would be nice to keep this property. Agreed. This is a hard problem with types in general. > A soft type system makes this easy. Check out "hybrid typing", http://www.cs.ucsc.edu/~cormac/papers/popl06-hybrid.pdf; I don't know if it's any good, but it claims to be :-). > Not being able to prove that the > macro evaluation itself will not have a type error is of little > concern to me provided that the expansion can be checked in the vast > majority of cases. A dynamic failure during a compile time expansion > isn't a big deal. But a dynamic failure "during a compile time expansion" is actually a static failure during the definition of the word using the macro, and you've got all the same problems you wanted to solve as far as printing out a readable error message that the person trying to use the macro will be able to figure out. > Right now, I'm leaning towards a soft intersection type system that > will be robust with respect to transformations and infer most any > reasonable program. Even if the type system never gets in your way, > there are benefits to soft types. For example, being able to run a > program to discover what's wrong with it when the type errors are > undecipherable is a very nice feature, especially for programmers new > to the language. I'll probably write up something about my approach > soon once I've filled in the details. I'm interested and curious. I think I get your point... Perhaps such a type system could interact with the contract and test case specifications to see what possibilities are ruled out or explicitly required (respectively); and a type error could imply that you need to correct one of your specifications or add a new specification, and (even better) the error could hint what that specification would look like. > - John -Wm |
|
|
Re: fundamental type system problems + possible solutionsOn May 1, 2008, at 11:30 AM, William Tanksley, Jr wrote: >> The problem is not typing the arguments to the macro; it's dealing >> with the situation where the expansion of the macro is not inferable >> and there's no way to provide annotations to quotations within the >> expansion because the expansion is obviously not in the source code. > > I can read this okay up until the phrase "obviously not in the source > code." ... (unless you mean > not part of the original source code pre-generation, but again, I > don't see how it makes a difference whether the code was typed out by > hand or generated by a macro). That is what I mean. Say you give a macro a quotation. That quotation type checks. Now say the macro tears the quotation apart, and one of the individual parts of the quotation doesn't check (which is very possible). How would you annotate the fragment of the quotation that failed to check? > But a dynamic failure "during a compile time expansion" is actually a > static failure during the definition of the word using the macro, and > you've got all the same problems you wanted to solve as far as > printing out a readable error message that the person trying to use > the macro will be able to figure out. Good point. Debugging macros is already little fun to begin with though. My point was just that you won't be getting an error at runtime. In any case, I have another approach for a type system I want to try that more heavily depends on the algebraic properties of the language. Initial results are encouraging, but I'll need another couple of months to implement it as it's rather complex and I'm guessing my way through it. - John |
|
|
Re: fundamental type system problems + possible solutionsBefore I start: http://citeseer.ist.psu.edu/dornan98typesecure.html
seems to possibly have something to do with this (maybe?). It's about using dynamic types in a static system (which is nice for a soft type system), and also about a static typechecked metaprogramming system. John Nowak <john@...> wrote: > That is what I mean. Say you give a macro a quotation. That quotation > type checks. Now say the macro tears the quotation apart, and one of > the individual parts of the quotation doesn't check (which is very > possible). How would you annotate the fragment of the quotation that > failed to check? You're right -- I'm so used to reasoning about my experimental 01 language I'd forgotten that in the real world quotations can be torn apart. I don't have a good answer to that. I was thinking of providing tools that would cut quotations on boundaries of known types, but that won't work if you're cutting the quotations on the results of string searches (which is what a lambda-substitution macro has to do). It would make sense to make quotations impossible to cut, but possible to disassemble -- when you cut a quotation on a symbol, you get back the quotation before the symbol, the quotation after the symbol, and a program which when concatenated after those two correctly concatenates them together. Simplest case: [yadda x yadda] cut(x) == [yadda] [yadda] [concat] All of the words that cut, concat, or insert quotations would, of course, recompute and alter the quotations type signature. I haven't figured out a heuristic to guess where a type error message should direct the programmer's attention, though. Perhaps that's something that should go in the annotations that metaprogrammers could use... "If check fail in here, blame my macro code", versus "if checks fail in here, trace it back to the user's annotations". No, I don't see that one working. Oh well. > In any case, I have another approach for a type system I want to try > that more heavily depends on the algebraic properties of the language. > Initial results are encouraging, but I'll need another couple of > months to implement it as it's rather complex and I'm guessing my way > through it. Whew! Good luck. > - John -Wm |
| Free Forum Powered by Nabble | Forum Help |