slate gui slowness needs fixing before proceeding

slate gui slowness needs fixing before proceeding

by Timmy Douglas-2


Well, I've added patches to my repos for simple one-way undo and now
you can type in all the characters fine. The problem is that there is
no point in going further (selections, copy paste, searching,
kill-line, cleaning up my ugly code, etc) when the current gui is too
slow to type in. I guess my ideal path after those features to the
text buffer would be to modify demo/inspector.slate to elegantly edit
the current environment. But enough of the future talk.


So I'd like to take a look at speeding up (or something to speed up
the GUI portion) of slate, but the only real dealings I have with
compilers are from when I wrote a tiger compiler in sml (following
appel's book for a class). Anyways, I didn't see any hint of where to
start from the last 1000 messages on this list, so I thought I'd start
a thread. So what are the options?


In about a week, I'm going to get more busy though since another
summer class will start, but until then, I should have time to get
something done.


Re: slate gui slowness needs fixing before proceeding

by Brian Rice


On Jun 11, 2006, at 12:09 PM, Timmy Douglas wrote:

> Well, I've added patches to my repos for simple one-way undo and now
> you can type in all the characters fine. The problem is that there is
> no point in going further (selections, copy paste, searching,
> kill-line, cleaning up my ugly code, etc) when the current gui is too
> slow to type in. I guess my ideal path after those features to the
> text buffer would be to modify demo/inspector.slate to elegantly edit
> the current environment. But enough of the future talk.

Okay. I've pulled these into the site repositories.

> So I'd like to take a look at speeding up (or something to speed up
> the GUI portion) of slate, but the only real dealings I have with
> compilers are from when I wrote a tiger compiler in sml (following
> appel's book for a class). Anyways, I didn't see any hint of where to
> start from the last 1000 messages on this list, so I thought I'd start
> a thread. So what are the options?

I've discussed this off-list with someone who was going to work on  
it, but I haven't heard back from him after initial email exchanges.  
I've CC'd him just to get some basic communication re-established,  
hopefully. The options that we went over were:

1) Improve/port the experimental_jit.c. It already gives a 2-4x  
speedup. The problem is that it does no dynamic inlining, so that the  
huge message-send layering which is the majority of the performance  
problem is not taken care of.
2) Fix/complete Lee's optimizing compiler framework. This has a  
couple of sub-options.
  The direct option is to finish his x86 code generator, figure out  
how to link it with the image, etc. Basically lots of stuff that I  
have no idea how to do, and I don't know if I can get him to come  
back to do it (although I'd try if enough people asked... or maybe  
they should try themselves).
  The other option is to add a back-end target to LLVM from the IR
code. This is slightly problematic because Lee wrote the IR to use
SSU (static single use) instead of SSA (static single assignment)
form, which are inversions of each other from a data-flow
perspective. Other than that, Lee's framework does "speak" LLVM at
least from the abstract perspective.
3) Write an inlining bytecode compiler (my idea, would work  
independently of a JIT) and associated caching/flushing system.
4) Translate the VM into a direct-threaded style, which Eliot Miranda  
endorsed at Smalltalk Solutions this year when I spoke with him. It  
makes inlining much easier and has other benefits in terms of  
architectural/code-manipulation simplifications.
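
As a rough illustration of what option 3's "caching/flushing system" might involve, here is a minimal sketch in Python rather than Slate. It assumes every send resolves to exactly one method and that no method is recursive, neither of which holds for real Slate dispatch. The class name InliningCompiler, the tuple instruction format, and the "send" marker are all hypothetical. The idea shown: compiled code with callees inlined is cached, each callee remembers who inlined it, and redefining a callee flushes every dependent.

```python
class InliningCompiler:
    """Toy inliner: caches flattened code and flushes it when a callee changes."""

    def __init__(self):
        self.methods = {}     # name -> list of instruction tuples (source form)
        self.compiled = {}    # name -> flattened instruction list (the cache)
        self.dependents = {}  # callee name -> set of callers that inlined it

    def define(self, name, body):
        """Install or replace a method, invalidating anything built from it."""
        self.methods[name] = body
        self.flush(name)

    def flush(self, name):
        """Discard compiled code for `name` and for everything that inlined it."""
        self.compiled.pop(name, None)
        for caller in self.dependents.pop(name, set()):
            self.flush(caller)

    def compile(self, name):
        """Return cached code, or build it by splicing callee bodies in place."""
        if name in self.compiled:
            return self.compiled[name]
        code = []
        for instr in self.methods[name]:
            if instr[0] == "send":
                callee = instr[1]
                # Record the dependency so a later redefinition can flush us.
                self.dependents.setdefault(callee, set()).add(name)
                code.extend(self.compile(callee))
            else:
                code.append(instr)
        self.compiled[name] = code
        return code
```

A real inliner would also need a depth or size limit and a guard for the case where dispatch picks a different method at run time.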

> In about a week, I'm going to get more busy though since another
> summer class will start, but until then, I should have time to get
> something done.

That likely won't be enough time to accomplish a deep change, but  
some sketch code to start with would be feasible. Slate's VM  
structure is pretty simple and malleable. All the bytecode-related  
code is in vm.slate, for example.

I hope that David won't mind, but I've attached his initial VM  
proposal email from a couple of months ago.


From: David Gilmore <davgil@...>
Date: April 5, 2006 4:14:31 PM PDT
To: Brian Rice <water@...>
Subject: Slate VM Proposal

Brian,

I have been studying the VM code and now have a clearer idea about how things work.  Here are my thoughts.  I am eager to hear what you think.  Feel free to call me and/or email as you wish.

Slate VM Enhancement

The bottom line is that I believe that it is within my knowledge and experience to implement a replacement for vm.slate (and the checkMethodCacheOn:arity: method) with a "threaded-code" interpreter/jit which will be written in hand-coded PowerPC assembly language and C.  The new VM will live entirely in processor registers and the 32k instruction cache.  It will take advantage of vector processing (AltiVec) where possible to speed up bytecode decoding and PMD type checks.

While PowerPC is a dead end as far as Apple is concerned, the Quad-processor G5 is still currently the fastest workstation available, and the PowerBook G4 is my main machine.  PowerPC is a growing embedded platform, but the big future for Slate on the PowerPC is the Xbox 360 and the Cell processor in the PlayStation 3 and elsewhere.

After this success, I hope to rework Lee's jit.c on the Intel platform to get rid of the dependencies on the existing interpreter and other bottlenecks in the support environment.

Details:

        Structure:

A Threaded-Code VM consists of a series of instruction handlers which are each made up of a few assembly language instructions stored in a fixed location.

A method is expressed as a sequence of jump instructions each of which flows into the code of an instruction handler.  There is approximately a 30% performance overhead over unadapted linear native code, but there is also about a 70% improvement in memory requirements.

A stack is used to temporarily store arguments and variables.

        Behavior:

The Threaded-Code interpreter/jit would operate in two passes.

        1.  Transform bytecodes with object-indices to jump-table offsets with object pointers.
        2.  Flow through the list of jump-points to "call" the individual operations.

The "jump list" could be destroyed after each method is done (interpreter mode) or could be stored in a cache and reused (jit mode).

        Interpreter State:

Every effort would be made to minimize the cost of maintaining interpreter state.  This will be accomplished by 2 means:

        1.  Compile the interpreter state into the jump-list as much as possible.
        2.  Optimize the remaining state into register-based machine instructions.

        The stack would be implemented by a cpu register stack-pointer and a large, fixed chunk of memory.  If the stack is made to be ridiculously large, stack policing could be avoided (at least for now).

        Method Lookups:

Method lookups will be cached into a polymorphic global cache.  Native Vector operations will be used to optimize type-checks and cache lookups.
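
A global cache of this kind might look roughly like the sketch below. It is a plain Python dictionary with no vector operations, and it keys on Python types as a stand-in for whatever Slate's dispatch actually compares. GlobalMethodCache and the slow_lookup callback are hypothetical names, not part of the existing VM.

```python
class GlobalMethodCache:
    """One shared cache keyed on the selector plus the type of every argument."""

    def __init__(self, slow_lookup):
        self._slow_lookup = slow_lookup  # hypothetical full dispatch routine
        self._entries = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, selector, args):
        # All argument positions take part in the key, since dispatch is
        # assumed to depend on every argument and not just the receiver.
        key = (selector, tuple(type(a) for a in args))
        if key in self._entries:
            self.hits += 1
        else:
            self.misses += 1
            self._entries[key] = self._slow_lookup(selector, args)
        return self._entries[key]

    def flush(self):
        """Drop every entry, e.g. after a method is added or removed."""
        self._entries.clear()
```

Comparing the key tuple element by element is the step that the proposal hopes to do with vector instructions.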


Notes:

I call attention to the IRC discussion of March 13, 2005 for Lee's point of view on these issues.  He has some good ideas here that I have tried to incorporate.

I do not know what kind of performance to expect, but the biggest prospect for gain lies in eliminating the bytecode processing overhead and reducing setup time, teardown time, and type-checks for message sends, and in general minimizing memory accesses as much as is possible.

AltiVec can compare two (contiguous) argument arrays with an arity of 4 in 4 instruction cycles (versus 15).  It is also useful for hashing. (Intel SSE could do 7 cycles versus 11.) However, in the real world, arity is usually 1 or 2, so this may be a smaller win.  It may be a good way to implement a PMD system in general, where everything is designed to be handled in 128-bit chunks.

AltiVec can be used to reorganize memory, but only if that memory is independent, byte-to-byte.  Since the immediate data in the bytecode stream is variable-length and dependent on the previous byte, AltiVec cannot do as much as it otherwise could to rearrange data.  Originally, I naively thought that I could transform 16 bytes from the bytecode stream in fewer than 16 cycles.  This is not realistic because of the need for a scalar pass to determine what is bytecode and what is immediate data.  A more static bytecode structure would require more memory but could be processed very quickly.
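
To see why the scalar pass is unavoidable, consider the sketch below. The encoding is invented for illustration and is not Slate's actual bytecode format: an opcode byte with its high bit set is followed by an immediate stored 7 bits per byte, low bits first, with the high bit marking "more bytes follow".

```python
HAS_IMMEDIATE = 0x80  # hypothetical flag bit: this opcode carries an immediate
CONTINUE = 0x80       # hypothetical flag bit: another immediate byte follows

def decode(stream):
    """Scalar pass: split a byte stream into (opcode, immediate) pairs."""
    instructions = []
    i = 0
    while i < len(stream):
        opcode = stream[i]
        i += 1
        immediate = None
        if opcode & HAS_IMMEDIATE:
            immediate = 0
            shift = 0
            while True:
                byte = stream[i]
                i += 1
                immediate |= (byte & 0x7F) << shift
                shift += 7
                if not byte & CONTINUE:
                    break
        instructions.append((opcode & 0x7F, immediate))
    return instructions
```

Whether a given byte is an opcode or part of an immediate depends on every byte before it, so the position `i` has to be advanced one instruction at a time and 16 bytes cannot be classified independently.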

This design does allow for the use of polymorphic inline caches, but I'm not sure at the moment how it yields performance benefits beyond a large global cache.  It seems that the cost of lookup and type-checking would be about the same.


Sincerely,

David



If anyone wants to discuss this intensively, I recommend Skype or IM.  
My ID there is "water451" and my IM identifiers are in my signature  
vCard.

--
-Brian
http://tunes.org/~water/brice.vcf




Re: slate gui slowness needs fixing before proceeding

by Mark Haniford

This seems to be the second time that someone has had to stop work on
the gui because of the slowness of Slate.  It might be time to ask Lee
to come back and work on the compiler/JIT stuff.  Time keeps on
marching by and it's not waiting for Slate.


Slate's slowness impeding UI/IDE development

by Brian Rice

On Jun 11, 2006, at 6:28 PM, Mark Haniford wrote:

> This seems to be the second time that someone has had to stop work on
> the gui because of the slowness of Slate.  It might be time to ask Lee
> to come back and work on the compiler/JIT stuff.  Time keeps on
> marching by and it's not waiting for Slate.

I agree; this observation has been impressed on me since the moment  
he declared his lack of enthusiasm.

I just want this project to work and do useful things for people. I'd  
sacrifice quite a bit of control to achieve that.

To Lee:
  How can we persuade you to return somehow? Would it need to involve  
shedding some of the formalities of a public open-source project?

--
-Brian
http://tunes.org/~water/brice.vcf




Re: Slate's slowness impeding UI/IDE development

by Mark Haniford

So what is the "real" reason for Lee quitting in the first place?
Maybe if Lee was bestowed the position of "Benevolent Dictator" then
he would come back.

But wasn't Lee the only person working on the VM anyway?  Were other
people trying to bogart in on the VM or were there arguments about the
design or what?

I think Slate has some great ideas behind it.

On 6/11/06, Brian Rice <water@...> wrote:
> On Jun 11, 2006, at 6:28 PM, Mark Haniford wrote:
> To Lee:
>   How can we persuade you to return somehow? Would it need to involve
> shedding some of the formalities of a public open-source project?


Re: slate gui slowness needs fixing before proceeding

by Timmy Douglas-2

Brian Rice <water@...> writes:

> On Jun 11, 2006, at 12:09 PM, Timmy Douglas wrote:
>
>> Well, I've added patches to my repos for simple one-way undo and now
>> you can type in all the characters fine. The problem is that there is
>> no point in going further (selections, copy paste, searching,
>> kill-line, cleaning up my ugly code, etc) when the current gui is too
>> slow to type in. I guess my ideal path after those features to the
>> text buffer would be to modify demo/inspector.slate to elegantly edit
>> the current environment. But enough of the future talk.
>
> Okay. I've pulled these into the site repositories.

thanks. I made another patch with drop-mark, delete-region, and
kill-line (which for now behaves differently than emacs since it does
drop-mark, end-of-line, and delete-region in one command, which sucks
up the next line I think).

>> So I'd like to take a look at speeding up (or something to speed up
>> the GUI portion) of slate, but the only real dealings I have with
>> compilers are from when I wrote a tiger compiler in sml (following
>> appel's book for a class). Anyways, I didn't see any hint of where to
>> start from the last 1000 messages on this list, so I thought I'd start
>> a thread. So what are the options?
>
> I've discussed this off-list with someone who was going to work on  
> it, but I haven't heard back from him after initial email exchanges.  
> I've CC'd him just to get some basic communication re-established,  
> hopefully. The options that we went over were:
>
> 1) Improve/port the experimental_jit.c. It already gives a 2-4x  
> speedup. The problem is that it does no dynamic inlining, so that the  
> huge message-send layering which is the majority of the performance  
> problem is not taken care of.

hm, this sort of seems like a temporary patch to speed up things but
I'm not sure it is enough. You'd have to try out the UI to see for
yourself, but 2-4x doesn't seem like it will fix the problem. It looks
like it might be the easiest option though. It's weird because
sometimes there are long delays and then the events will all get
processed by the ui fairly quickly (1 sec between typed characters
showing up) compared to the delay before the events registered (like
2-10 sec after the first keypress). But if you try inspector.slate and
try to drag the inspector windows you will go crazy. I guess mouse
motion events really drag it down or something along those lines.


I'm not really that familiar with how slate's compiler/interpreter
works or builds now. I was hoping to find something that would tell me
how mobius/ is built since it's like already slate code and I'm not
sure what can actually build that first stage (which I assume produces
the vm.c file in the base directory). Can someone point me to docs on
how the whole build process works? I think it'd go a long way with
understanding the vm... so far I've just been opening up like millions
of source files but it's hard to get the whole picture from that and
mobius.pdf.


> 2) Fix/complete Lee's optimizing compiler framework. This has a  
> couple of sub-options.
>  The direct option is to finish his x86 code generator, figure out  
> how to link it with the image, etc. Basically lots of stuff that I  
> have no idea how to do, and I don't know if I can get him to come  
> back to do it (although I'd try if enough people asked... or maybe  
> they should try themselves).


Yeah, there is quite a bit of code in src/mobius/optimizer. So you're
saying none of that is being used at the moment?


>  The other option is to add a back-end target to LLVM from the IR
> code. This is slightly problematic because Lee wrote the IR to use
> SSU (static single use) instead of SSA (static single assignment)
> form, which are inversions of each other from a data-flow
> perspective. Other than that, Lee's framework does "speak" LLVM at
> least from the abstract perspective.

Sounds like we probably wouldn't have to worry about optimizations
once we got it into llvm's hands. I don't know how good that'd be. I
guess I'd have to spend a week figuring out how the optimization
framework works first.

> 3) Write an inlining bytecode compiler (my idea, would work  
> independently of a JIT) and associated caching/flushing system.

Well I'm pretty new to this stuff so I don't really know what would
get cached and flushed... I went to the library today since I saw you
thought ACD&I(?) was a good book (on some irc conversation a while
back) but they only let me check it out for 2 hours in-library... I
gave up after 30 min since there wasn't much I could do there.

> 4) Translate the VM into a direct-threaded style, which Eliot Miranda  
> endorsed at Smalltalk Solutions this year when I spoke with him. It  
> makes inlining much easier and has other benefits in terms of  
> architectural/code-manipulation simplifications.

ok

>> In about a week, I'm going to get more busy though since another
>> summer class will start, but until then, I should have time to get
>> something done.
>
> That likely won't be enough time to accomplish a deep change, but
> some sketch code to start with would be feasible. Slate's VM
> structure is pretty simple and malleable. All the bytecode-related
> code is in vm.slate, for example.
>
> I hope that David won't mind, but I've attached his initial VM  
> proposal email from a couple of months ago.

Well, if he takes the powerpc asm route, I hope there is a way that I
could run it on this old athlon box.

Thanks for the reply.


Re: slate gui slowness needs fixing before proceeding

by Tony Garnock-Jones-2

Timmy Douglas wrote:
> processed by the ui fairly quickly (1 sec between typed characters
> showing up) compared to the delay before the events registered (like
> 2-10 sec after the first keypress). But if you try inspector.slate and
> try to drag the inspector windows you will go crazy. I guess mouse
> motion events really drag it down or something along those lines.

It sounds like there's something rotten in the interface to SDL, rather
than basic slowness of Slate... well, I'd be surprised if it was Slate
since I've always been pleasantly surprised at the speed of the system
in general. Perhaps there's something sub-par in the polling logic?

I can't say for sure, of course, since I've been concentrating on other
aspects of the system and haven't tried the SDL interfaces for a long,
long time!
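
If the polling logic does turn out to be the culprit, one common remedy is to drain the whole pending queue on each pass and collapse runs of mouse-motion events, so that a drag costs one redraw per pass instead of one per event. The sketch below only illustrates that idea with plain Python dictionaries. It does not use SDL, makes no claim about how Slate's SDL binding currently works, and the event layout and the name drain_events are hypothetical.

```python
from collections import deque

def drain_events(queue):
    """Empty the pending queue, keeping only the newest of each run of motion events."""
    batch = []
    while queue:
        event = queue.popleft()
        if batch and event["type"] == "motion" and batch[-1]["type"] == "motion":
            # A newer pointer position supersedes the previous one.
            batch[-1] = event
        else:
            batch.append(event)
    return batch

# Usage: four motion events and one keypress collapse to three events.
pending = deque([
    {"type": "motion", "x": 1},
    {"type": "motion", "x": 2},
    {"type": "key", "char": "a"},
    {"type": "motion", "x": 3},
    {"type": "motion", "x": 4},
])
handled = drain_events(pending)
```

Only adjacent motion events are merged, so the relative order of keypresses and pointer positions is preserved.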

> Well I'm pretty new to this stuff so I don't really know what would
> get cached and flushed... I went to the library today since I saw you
> thought ACD&I(?) was a good book (on some irc conversation a while
> back) but they only let me check it out for 2 hours in-library... I
> gave up after 30 min since there wasn't much I could do there.

Try the Self papers:

http://citeseer.ist.psu.edu/chambers92design.html
http://citeseer.ist.psu.edu/chambers91making.html
http://citeseer.ist.psu.edu/hlzle94adaptive.html
http://citeseer.ist.psu.edu/chambers90iterative.html

They changed my life ;-)

> Well, if he takes the powerpc asm route, I hope there is a way that I
> could run it on this old athlon box.

One important aspect of a direct-threaded design, ISTM, is the
instruction encodings... if someone can design a fixed-width "VLIW"
format for direct-threaded bytecodes, then the same architecture ought
to be able to apply to x86 as well as PPC.
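
A fixed-width format could be as simple as the sketch below, which packs each instruction into one 32-bit word. The 8-bit opcode / 24-bit operand split and the little-endian byte order are arbitrary choices made for illustration, not a proposal for the real encoding.

```python
import struct

def encode(instructions):
    """Pack (opcode, operand) pairs into fixed 32-bit little-endian words."""
    words = []
    for opcode, operand in instructions:
        if not (0 <= opcode < 1 << 8 and 0 <= operand < 1 << 24):
            raise ValueError("field out of range")
        words.append(struct.pack("<I", (opcode << 24) | operand))
    return b"".join(words)

def decode_at(code, index):
    """Decode instruction number `index` without scanning what precedes it."""
    (word,) = struct.unpack_from("<I", code, index * 4)
    return word >> 24, word & 0xFFFFFF
```

Because every instruction sits at a known offset, any instruction can be decoded independently of the others. The cost is that small operands still occupy a full word.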

Tony


Re: slate gui slowness needs fixing before proceeding

by Nick Forde

Tony Garnock-Jones wrote:

> Timmy Douglas wrote:
>
>>processed by the ui fairly quickly (1 sec between typed characters
>>showing up) compared to the delay before the events registered (like
>>2-10 sec after the first keypress). But if you try inspector.slate and
>>try to drag the inspector windows you will go crazy. I guess mouse
>>motion events really drag it down or something along those lines.
>
> It sounds like there's something rotten in the interface to SDL, rather
> than basic slowness of Slate... well, I'd be surprised if it was Slate
> since I've always been pleasantly surprised at the speed of the system
> in general. Perhaps there's something sub-par in the polling logic?

I haven't tried the SDL interface but you can get a very rough idea
of the basic VM performance by running 'make benchmark' and
comparing the results to those at: http://shootout.alioth.debian.org/
Last time I tried this the results were unfortunately not good :-(

The source for these tests can be found in test/benchmark and there is
some documentation in doc/benchmarks.txt.

Nick.


Re: slate gui slowness needs fixing before proceeding

by Brian Rice

On Jun 12, 2006, at 5:27 AM, Nick Forde wrote:

> Tony Garnock-Jones wrote:
>> Timmy Douglas wrote:
>>> processed by the ui fairly quickly (1 sec between typed characters
>>> showing up) compared to the delay before the events registered (like
>>> 2-10 sec after the first keypress). But if you try inspector.slate and
>>> try to drag the inspector windows you will go crazy. I guess mouse
>>> motion events really drag it down or something along those lines.
>> It sounds like there's something rotten in the interface to SDL, rather
>> than basic slowness of Slate... well, I'd be surprised if it was Slate
>> since I've always been pleasantly surprised at the speed of the system
>> in general. Perhaps there's something sub-par in the polling logic?
>
> I haven't tried the SDL interface but you can get a very rough idea
> of the basic VM performance by running 'make benchmark' and
> comparing the results to those at: http://shootout.alioth.debian.org/
> Last time I tried this the results were unfortunately not good :-(

*ahem* Just to be pedantic, that's not the VM's performance, that's  
the VM+image's performance. There are a lot of message sends that  
happen for basic control-flow operations, for example, or even for  
just executing a non-literal (argument to a method) block.

That said, the VM design is rather naive and could be much better,  
but not by the order of magnitude that the shootout benchmarks are.  
The performance problem we have relates to the Strongtalk design - we  
just have no inlining compiler to rely on.

> The source for these tests can be found in test/benchmark and there is
> some documentation in doc/benchmarks.txt.

--
-Brian
http://tunes.org/~water/brice.vcf




Re: Slate's slowness impeding UI/IDE development

by Brian Rice


On Jun 11, 2006, at 7:43 PM, Mark Haniford wrote:

> So what is the "real" reason for Lee quitting in the first place?
> Maybe if Lee was bestowed the position of "Benevolent Dictator" then
> he would come back.

The last relevant quotes (over instant messaging) I got from him  
about his stance towards the project were:

"i can say with absolute certainty i have no interest in continuing it"

"i'm disinterested in where the project is going and i'm not sure we  
are capable of running it together to the mutual benefit of us both"

"as it has been it has been nothing more than parasitic to me"

If I may paraphrase further, basically he did not want to maintain a
public project at all, and only cared about a hobby-level
language+compiler+OS toolchain that ran on bare hardware. He actually
*suggested* that we focus on the IDE to make Slate more useful rather
than fret about performance. I'm not sure how he never took the
performance issue seriously enough to realize that the UI-based IDE
would be unusable without a dynamic inliner.

I am CC'ing him so that he can clarify his stance on this if he wishes.

> But wasn't Lee the only person working on the VM anyway?

Lee actually didn't originally like the idea of using a VM. I simply  
introduced a (buggy) Slate-to-C translator a la Squeak and sketched  
out a VM design just to get the ball rolling when we were stuck in a  
Common Lisp interpreter. If we hadn't done that, we'd still be there  
on CL since he never completed his compiler.

That said, he wrote most of the VM code and wound up maintaining it.  
He has a propensity to write pages and pages of really interesting  
code with no comments; I spoke with other people he has collaborated  
with and they've confirmed this. So it's non-trivial to pick up code  
that he wrote and run with it, and he got stuck with what he had  
written, with little-to-no desire to do so (apparently).

> Were other people trying to bogart in on the VM or were there
> arguments about the design or what?

Towards the end there was a decent amount of clamoring for  
continuation support, which he apparently found unwarranted. He  
basically silently refused to code any support for it, while making  
hand-waving explanations about how easy it would be to get a subset  
of the functionality. At least a few people disagreed with him on the  
matter.

No one really criticized or tried to mess with the basic VM design or  
such; in fact I think he was its biggest critic.

The last that I heard, Lee was learning his father's real estate  
business and selling real estate in/near Las Vegas. I suggested a few  
open positions for the type of work he was doing with Slate, but it  
didn't interest him. He probably makes good money and is totally  
wasting his technical talent (or not - he occasionally just  
contributes to a few projects as a donor).

> I think Slate has some great ideas behind it.
>
> On 6/11/06, Brian Rice <water@...> wrote:
>> On Jun 11, 2006, at 6:28 PM, Mark Haniford wrote:
>> To Lee:
>>   How can we persuade you to return somehow? Would it need to involve
>> shedding some of the formalities of a public open-source project?

--
-Brian
http://tunes.org/~water/brice.vcf




Re: slate gui slowness needs fixing before proceeding

by Nick Forde


Brian Rice wrote:

> On Jun 12, 2006, at 5:27 AM, Nick Forde wrote:
>
>> Tony Garnock-Jones wrote:
>>
>>> Timmy Douglas wrote:
>>>
>>>> processed by the ui fairly quickly (1 sec between typed characters
>>>> showing up) compared to the delay before the events registered (like
>>>> 2-10 sec after the first keypress). But if you try inspector.slate and
>>>> try to drag the inspector windows you will go crazy. I guess mouse
>>>> motion events really drag it down or something along those lines.
>>>
>>> It sounds like there's something rotten in the interface to SDL, rather
>>> than basic slowness of Slate... well, I'd be surprised if it was Slate
>>> since I've always been pleasantly surprised at the speed of the system
>>> in general. Perhaps there's something sub-par in the polling logic?
>>
>>
>> I haven't tried the SDL interface but you can get a very rough idea
>> of the basic VM performance by running 'make benchmark' and
>> comparing the results to those at: http://shootout.alioth.debian.org/
>> Last time I tried this the results were unfortunately not good :-(
>
>
> *ahem* Just to be pedantic, that's not the VM's performance, that's the
> VM+image's performance. There are a lot of message sends that happen
> for basic control-flow operations, for example, or even for just
> executing a non-literal (argument to a method) block.

For most of the benchmarks that is indeed true. However, I added an extra
primitive to the VM to trace PSInterpreter_[re]sendMessage*() calls only
during the body of the benchmarks. I found that in some of the simple tight
loop tests the number of message sends wasn't excessive but most of the
execution time was in their dispatch. These benchmarks still fared poorly
when compared to the shootout results.
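
The measurement described above can be sketched roughly as follows. This is only an illustration in Python, not the primitive that was added to the VM: `SendTracer`, `send`, and `benchmark` are hypothetical names, and the toy `getattr` lookup merely stands in for the real dispatch path. The point is that counting and timing are switched on only around the benchmark body, so warm-up and setup sends are excluded.

```python
import time
from contextlib import contextmanager


class SendTracer:
    """Counts message sends and time spent in lookup, only while enabled."""

    def __init__(self):
        self.enabled = False
        self.sends = 0
        self.dispatch_seconds = 0.0

    @contextmanager
    def tracing(self):
        # Enable tracing for the duration of the 'with' body only.
        self.enabled = True
        try:
            yield self
        finally:
            self.enabled = False


tracer = SendTracer()


def send(receiver, selector, *args):
    """Toy message send: look the method up by name, then call it."""
    if tracer.enabled:
        start = time.perf_counter()
        method = getattr(type(receiver), selector)
        tracer.dispatch_seconds += time.perf_counter() - start
        tracer.sends += 1
    else:
        method = getattr(type(receiver), selector)
    return method(receiver, *args)


class Counter:
    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1
        return self


def benchmark(n):
    """A simple tight loop: n sends of 'increment' to one receiver."""
    c = Counter()
    for _ in range(n):
        send(c, "increment")
    return c.value


benchmark(10)  # warm-up, deliberately not traced
with tracer.tracing():
    result = benchmark(1000)

print(result, tracer.sends, tracer.dispatch_seconds)
```

Comparing `dispatch_seconds` against the total wall-clock time of the traced region gives the kind of "few sends, but most time in dispatch" picture described above.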

> That said, the VM design is rather naive and could be much better, but
> not by the order of magnitude that the shootout benchmarks are. The
> performance problem we have relates to the Strongtalk design - we just
> have no inlining compiler to rely on.

I think you're right and agree that this is where the biggest
performance gains can be had.

Nick.


Re: slate gui slowness needs fixing before proceeding

by Brian Rice



On Jun 11, 2006, at 10:36 PM, Timmy Douglas wrote:

> Brian Rice <water@...> writes:
>
>> On Jun 11, 2006, at 12:09 PM, Timmy Douglas wrote:
>>
>>> Well, I've added patches to my repos for simple one-way undo and now
>>> you can type in all the characters fine. The problem is that there is
>>> no point in going further (selections, copy paste, searching,
>>> kill-line, cleaning up my ugly code, etc) when the current gui is too
>>> slow to type in. I guess my ideal path after those features to the
>>> text buffer would be to modify demo/inspector.slate to elegantly edit
>>> the current environment. But enough of the future talk.
>>
>> Okay. I've pulled these into the site repositories.
>
> thanks. I made another patch with drop-mark, delete-region, and
> kill-line (which for now behaves differently than emacs since it does
> drop-mark, end-of-line, and delete-region in one command, which sucks
> up the next line I think).

Got it. Thanks!

>>> So I'd like to take a look at speeding up (or something to speed up
>>> the GUI portion) of slate, but the only real dealings I have with
>>> compilers are from when I wrote a tiger compiler in sml (following
>>> appel's book for a class). Anyways, I didn't see any hint of where to
>>> start from the last 1000 messages on this list, so I thought I'd start
>>> a thread. So what are the options?
>>
>> I've discussed this off-list with someone who was going to work on
>> it, but I haven't heard back from him after initial email exchanges.
>> I've CC'd him just to get some basic communication re-established,
>> hopefully. The options that we went over were:
>>
>> 1) Improve/port the experimental_jit.c. It already gives a 2-4x
>> speedup. The problem is that it does no dynamic inlining, so that the
>> huge message-send layering which is the majority of the performance
>> problem is not taken care of.
>
> hm, this sort of seems like a temporary patch to speed up things but
> I'm not sure it is enough. You'd have to try out the UI to see for
> yourself, but 2-4x doesn't seem like it will fix the problem.

Yeah, I never thought that it would.

> I'm not really that familiar with how slate's compiler/interpreter
> works or builds now. I was hoping to find something that would tell me
> how mobius/ is built since it's like already slate code and I'm not
> sure what can actually build that first stage (which I assume produces
> the vm.c file in the base directory). Can someone point me to docs on
> how the whole build process works? I think it'd go a long way with
> understanding the vm... so far I've just been opening up like millions
> of source files but it's hard to get the whole picture from that and
> mobius.pdf.

The code that "make newboot" calls (mentioned in INSTALL) is in
src/mobius/vm/build.slate and then there's src/mobius/vm/bootstrap.slate
and so forth. src/mobius/init.slate shows the load order needed to
build up a working VM+image bootstrapper. The README maps out the
contents of the directories of the Slate repositories.

That said, a guide to Slate's build system is certainly warranted.  
I'll think about how to start one, but suggestions are also welcome.

>> 2) Fix/complete Lee's optimizing compiler framework. This has a
>> couple of sub-options.
>>  The direct option is to finish his x86 code generator, figure out
>> how to link it with the image, etc. Basically lots of stuff that I
>> have no idea how to do, and I don't know if I can get him to come
>> back to do it (although I'd try if enough people asked... or maybe
>> they should try themselves).
>
> Yeah, there is quite a bit of code in src/mobius/optimizer. So you're
> saying none of that is being used at the moment?

Yes. He worked on it for about four years in various incarnations and
never hooked it up to test it, other than the IR front-end in tests  
that were local to his setup (i.e. he never published them).

>>  The other option is to add a back-end target to LLVM from the IR
>> code. This is slightly problematic because Lee wrote the IR to use
>> SSU (static single use) instead of SSA (static single assignment)
>> form, which are inversions of each other from a data-flow
>> perspective. Other than that, Lee's framework does "speak" LLVM at
>> least from the abstract perspective.
>
> Sounds like we probably wouldn't have to worry about optimizations
> once we got it into llvm's hands. I don't know how good that'd be. I
> guess I'd have to spend a week figuring out how the optimization
> framework works first.

Yeah, it's at least a medium-sized task. Compiler-coding familiarity
helps quite a bit but Lee made his own decisions for his own reasons  
that have to be divined from the source.
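
To make the SSA/SSU distinction mentioned in the quoted text a little more concrete, here is a toy sketch in Python. It assumes the usual reading of the terms (SSA: every name has exactly one definition; SSU: every name has exactly one use) and has nothing to do with the actual IR in src/mobius/optimizer. The instruction format `(dests, op, args)`, the `split` pseudo-operation, and both function names are made up for illustration. It handles only straight-line code that is already in SSA form.

```python
from collections import Counter


def count_uses(instrs):
    """Count how many times each name appears as an argument."""
    uses = Counter()
    for _dests, _op, args in instrs:
        uses.update(args)
    return uses


def to_single_use(instrs):
    """Rewrite straight-line SSA code so that every name is used exactly once.

    A value with k > 1 uses is followed by a 'split' producing k fresh
    copies; each original use is rewritten to consume one distinct copy.
    """
    uses = count_uses(instrs)
    pending = {}  # name -> fresh copies not yet consumed
    out = []

    def split(name):
        copies = [f"{name}.{i}" for i in range(1, uses[name] + 1)]
        pending[name] = copies
        return (tuple(copies), "split", (name,))

    # Live-in values (used but never defined here) are split up front.
    defined = {d for dests, _op, _args in instrs for d in dests}
    for name in sorted(n for n in uses if n not in defined and uses[n] > 1):
        out.append(split(name))

    for dests, op, args in instrs:
        new_args = tuple(pending[a].pop(0) if a in pending else a for a in args)
        out.append((dests, op, new_args))
        for d in dests:
            if uses[d] > 1:
                out.append(split(d))
    return out


# t1 = a + b; t2 = t1 * t1; t3 = t2 - a   (single definition per name)
ssa = [
    (("t1",), "add", ("a", "b")),
    (("t2",), "mul", ("t1", "t1")),
    (("t3",), "sub", ("t2", "a")),
]

ssu = to_single_use(ssa)
for instr in ssu:
    print(instr)
```

Going the other way (toward the SSA form that LLVM expects) would amount to collapsing each `split` back into ordinary multiple uses of one name, which is one way to picture why the two forms are described as inversions of each other.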

>> 3) Write an inlining bytecode compiler (my idea, would work
>> independently of a JIT) and associated caching/flushing system.
>
> Well I'm pretty new to this stuff so I don't really know what would
> get cached and flushed... I went to the library today since I saw you
> thought ACD&I(?) was a good book (on some irc conversation a while
> back) but they only let me check it out for 2 hours in-library... I
> gave up after 30 min since there wasn't much I could do there.

Advanced Compiler Design & Implementation only covers advanced
back-end optimizations and intermediate-form representation choices.
It's the basis for a lot of Lee's work in that area.
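
On the question of what would get cached and flushed under option 3, a rough sketch may help. The following Python is an assumption-laden illustration, not a design for Slate: `InliningCache` and its methods are hypothetical, "inlining" is simulated by binding callees at compile time instead of looking them up per send, and recursive methods are not handled. What gets cached is the compiled form of each method; what gets flushed, when a method is redefined, is that method plus every compiled method that inlined it.

```python
class InliningCache:
    """Toy method table with compiled-code caching and dependency flushing."""

    def __init__(self):
        self.methods = {}        # selector -> (body, callees)
        self.compiled = {}       # selector -> compiled callable
        self.inlined_into = {}   # callee selector -> callers that inlined it
        self.compile_count = 0

    def define(self, selector, body, callees=()):
        """(Re)define a method and invalidate anything that depended on it."""
        self.methods[selector] = (body, tuple(callees))
        self.flush(selector)

    def flush(self, selector):
        """Drop compiled code for selector and, transitively, for its callers."""
        self.compiled.pop(selector, None)
        for caller in self.inlined_into.pop(selector, set()):
            self.flush(caller)

    def lookup(self, selector):
        """Return compiled code, compiling (and recording dependencies) on a miss."""
        if selector not in self.compiled:
            body, callees = self.methods[selector]
            bound = {}
            for callee in callees:
                bound[callee] = self.lookup(callee)
                self.inlined_into.setdefault(callee, set()).add(selector)
            self.compile_count += 1
            self.compiled[selector] = lambda x, body=body, bound=bound: body(bound, x)
        return self.compiled[selector]

    def send(self, selector, x):
        return self.lookup(selector)(x)


vm = InliningCache()
vm.define("double", lambda _m, x: x * 2)
vm.define("inc", lambda _m, x: x + 1)
vm.define(
    "doubleThenInc",
    lambda m, x: m["inc"](m["double"](x)),
    callees=("double", "inc"),
)

first = vm.send("doubleThenInc", 5)     # compiles all three methods
second = vm.send("doubleThenInc", 5)    # served from the cache
compiles_before = vm.compile_count

vm.define("double", lambda _m, x: x * 3)  # flushes double and doubleThenInc
third = vm.send("doubleThenInc", 5)       # recompiles only what was flushed

print(first, second, third, compiles_before, vm.compile_count)
```

Note that "inc" stays cached across the redefinition; only the redefined method and its dependents are recompiled.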

>> 4) Translate the VM into a direct-threaded style, which Eliot Miranda
>> endorsed at Smalltalk Solutions this year when I spoke with him. It
>> makes inlining much easier and has other benefits in terms of
>> architectural/code-manipulation simplifications.
>
> ok
>
>>> In about a week, I'm going to get more busy though since another
>>> summer class will start, but until then, I should have time to get
>>> something done.
>>
>> That likely won't be enough time to accomplish a deep change, but
>> some sketch code to start with would be feasible. Slate's VM
>> structure is pretty simple and malleable. All the bytecode-related
>> code is in vm.slate, for example.
>>
>> I hope that David won't mind, but I've attached his initial VM
>> proposal email from a couple of months ago.
>
> Well, if he takes the powerpc asm route, I hope there is a way that I
> could run it on this old athlon box.

We could probably easily lift it up to C level, if not Pidgin.
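
For readers unfamiliar with the direct-threaded style named in option 4 above, here is a small Python sketch of the idea. It is not Slate's bytecode set: the four opcodes, `thread`, `run_switch`, and `run_threaded` are invented for illustration. Python has no computed goto, so the threaded loop below only approximates the real thing; what it does show is the translation step, where opcode numbers in the instruction stream are replaced by direct references to their handlers so that no decoding happens at run time.

```python
PUSH, ADD, MUL, HALT = range(4)


def run_switch(code):
    """Conventional interpreter: decode each opcode with a chain of tests."""
    stack, pc = [], 0
    while True:
        op = code[pc]
        if op == PUSH:
            stack.append(code[pc + 1])
            pc += 2
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            pc += 1
        elif op == HALT:
            return stack.pop()


class VM:
    def __init__(self, code):
        self.code = code
        self.stack = []
        self.pc = 0
        self.running = True


def op_push(vm):
    vm.stack.append(vm.code[vm.pc + 1])
    vm.pc += 2


def op_add(vm):
    b, a = vm.stack.pop(), vm.stack.pop()
    vm.stack.append(a + b)
    vm.pc += 1


def op_mul(vm):
    b, a = vm.stack.pop(), vm.stack.pop()
    vm.stack.append(a * b)
    vm.pc += 1


def op_halt(vm):
    vm.running = False


# opcode -> (handler, number of inline operands)
HANDLERS = {PUSH: (op_push, 1), ADD: (op_add, 0), MUL: (op_mul, 0), HALT: (op_halt, 0)}


def thread(code):
    """Replace each opcode number with its handler; operands stay inline."""
    out, pc = [], 0
    while pc < len(code):
        handler, operands = HANDLERS[code[pc]]
        out.append(handler)
        out.extend(code[pc + 1 : pc + 1 + operands])
        pc += 1 + operands
    return out


def run_threaded(threaded):
    """Each slot at pc is already the handler to call; no decoding needed."""
    vm = VM(threaded)
    while vm.running:
        vm.code[vm.pc](vm)
    return vm.stack.pop()


program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]  # (2 + 3) * 4
print(run_switch(program), run_threaded(thread(program)))
```

In a C-level implementation the handler slots would typically be code addresses and each handler would jump straight to the next one, removing the central loop entirely.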

--
-Brian
http://tunes.org/~water/brice.vcf




Re: slate gui slowness needs fixing before proceeding

by Brian Rice


I need to clarify at least this point:

On Jun 12, 2006, at 7:02 AM, Brian Rice wrote:

> On Jun 11, 2006, at 10:36 PM, Timmy Douglas wrote:
>> Brian Rice <water@...> writes:
>>>  The other option is to add a back-end target to LLVM from the IR
>>> code. This is slightly problematic because Lee wrote the IR to use
>>> SSU (static single use) instead of SSA (static single assignment)
>>> form, which are inversions of each other from a data-flow
>>> perspective. Other than that, Lee's framework does "speak" LLVM at
>>> least from the abstract perspective.
>>
>> Sounds like we probably wouldn't have to worry about optimizations
>> once we got it into llvm's hands. I don't know how good that'd be. I
>> guess I'd have to spend a week figuring out how the optimization
>> framework works first.
>
> Yeah, it's at least a medium-sized task. Compiler-coding familiarity
> helps quite a bit but Lee made his own decisions for his own reasons
> that have to be divined from the source.

There are also several text files with notes on his design decisions
in doc/*.txt.

--
-Brian
http://tunes.org/~water/brice.vcf




Re: slate gui slowness needs fixing before proceeding

by Brian Rice



On Jun 11, 2006, at 4:15 PM, Brian Rice wrote:

> On Jun 11, 2006, at 12:09 PM, Timmy Douglas wrote:
>> So I'd like to take a look at speeding up (or something to speed up
>> the GUI portion) of slate, but the only real dealings I have with
>> compilers are from when I wrote a tiger compiler in sml (following
>> appel's book for a class). Anyways, I didn't see any hint of where to
>> start from the last 1000 messages on this list, so I thought I'd start
>> a thread. So what are the options?
>
> I've discussed this off-list with someone who was going to work on
> it, but I haven't heard back from him after initial email exchanges.
> I've CC'd him just to get some basic communication re-established,
> hopefully. The options that we went over were:
>
> 1) Improve/port the experimental_jit.c. It already gives a 2-4x  
> speedup.