|
|
slate gui slowness needs fixing before proceeding

Well, I've added patches to my repos for simple one-way undo, and now you can type in all the characters fine. The problem is that there is no point in going further (selections, copy/paste, searching, kill-line, cleaning up my ugly code, etc.) when the current GUI is too slow to type in. I guess my ideal path after adding those features to the text buffer would be to modify demo/inspector.slate to elegantly edit the current environment. But enough of the future talk.

So I'd like to take a look at speeding up Slate (or at least the GUI portion of it), but the only real dealings I have with compilers are from when I wrote a Tiger compiler in SML (following Appel's book for a class). Anyways, I didn't see any hint of where to start in the last 1000 messages on this list, so I thought I'd start a thread. So what are the options?

In about a week I'm going to get busier, since another summer class will start, but until then I should have time to get something done.
|
|
Re: slate gui slowness needs fixing before proceeding

On Jun 11, 2006, at 12:09 PM, Timmy Douglas wrote:

> Well, I've added patches to my repos for simple one-way undo and now
> you can type in all the characters fine. The problem is that there is
> no point in going further (selections, copy paste, searching,
> kill-line, cleaning up my ugly code, etc) when the current gui is too
> slow to type in. I guess my ideal path after those features to the
> text buffer would be to modify demo/inspector.slate to elegantly edit
> the current environment. But enough of the future talk.

Okay. I've pulled these into the site repositories.

> So I'd like to take a look at speeding up (or something to speed up
> the GUI portion) of slate, but the only real dealings I have with
> compilers are from when I wrote a tiger compiler in sml (following
> appel's book for a class). Anyways, I didn't see any hint of where to
> start from the last 1000 messages on this list, so I thought I'd start
> a thread. So what are the options?

I've discussed this off-list with someone who was going to work on it, but I haven't heard back from him after initial email exchanges. I've CC'd him just to get some basic communication re-established, hopefully. The options that we went over were:

1) Improve/port the experimental_jit.c. It already gives a 2-4x speedup. The problem is that it does no dynamic inlining, so the huge message-send layering, which is the majority of the performance problem, is not taken care of.

2) Fix/complete Lee's optimizing compiler framework. This has a couple of sub-options.
   The direct option is to finish his x86 code generator, figure out how to link it with the image, etc. Basically lots of stuff that I have no idea how to do, and I don't know if I can get him to come back to do it (although I'd try if enough people asked... or maybe they should try themselves).
   The other option is to add a back-end target to LLVM from the IR code. This is slightly problematic because Lee wrote the IR to use SSU (single static usage) instead of SSA (single static assignment) form, which are inversions of each other from a data-flow perspective. Other than that, Lee's framework does "speak" LLVM at least from the abstract perspective.

3) Write an inlining bytecode compiler (my idea, would work independently of a JIT) and associated caching/flushing system.

4) Translate the VM into a direct-threaded style, which Eliot Miranda endorsed at Smalltalk Solutions this year when I spoke with him. It makes inlining much easier and has other benefits in terms of architectural/code-manipulation simplifications.

> In about a week, I'm going to get more busy though since another
> summer class will start, but until then, I should have time to get
> something done.

That likely won't be enough time to accomplish a deep change, but some sketch code to start with would be feasible. Slate's VM structure is pretty simple and malleable. All the bytecode-related code is in vm.slate, for example.

I hope that David won't mind, but I've attached his initial VM proposal email from a couple of months ago.

From: David Gilmore <davgil@...>
Date: April 5, 2006 4:14:31 PM PDT
To: Brian Rice <water@...>
Subject: Slate VM Proposal

Brian,

I have been studying the VM code and now have a clearer idea about how things work. Here are my thoughts. I am eager to hear what you think. Feel free to call me and/or email as you wish.

Slate VM Enhancement

The bottom line is that I believe that it is within my knowledge and experience to implement a replacement for vm.slate (and the checkMethodCacheOn:arity: method) with a "threaded-code" interpreter/jit which will be written in hand-coded PowerPC assembly language and C. The new VM will live entirely in processor registers and the 32k instruction cache. It will take advantage of vector processing (AltiVec) where possible to speed up bytecode decoding and PMD type checks.
While PowerPC is a dead-end as far as Apple is concerned, the Quad-processor G5 is still currently the fastest workstation available, and the PowerBook G4 is my main machine. PowerPC is a growing embedded platform, but the big future for Slate on the PowerPC is the Xbox 360 and the Cell processor in the PlayStation 3 and elsewhere.

After this success, I hope to rework Lee's jit.c on the Intel platform to get rid of the dependencies on the existing interpreter and other bottlenecks in the support environment.

Details:

Structure: A Threaded-Code VM consists of a series of instruction handlers which are each made up of a few assembly language instructions stored in a fixed location. A method is expressed as a sequence of jump instructions each of which flows into the code of an instruction handler. There is approximately a 30% performance overhead over unadapted linear native code, but there is also about a 70% improvement in memory requirements. A stack is used to temporarily store arguments and variables.

Behavior: The Threaded-Code interpreter/jit would operate in two passes.
1. Transform bytecodes with object-indices to jump-table offsets with object pointers.
2. Flow through the list of jump-points to "call" the individual operations.
The "jump list" could be destroyed after each method is done (interpreter mode) or could be stored in a cache and reused (jit mode).

Interpreter State: Every effort would be made to minimize the cost of maintaining interpreter state. This will be accomplished by 2 means:
1. Compile the interpreter state into the jump-list as much as possible.
2. Optimize the remaining state into register-based machine instructions.
The stack would be implemented by a cpu register stack-pointer and a large, fixed chunk of memory. If the stack is made to be ridiculously large, stack policing could be avoided (at least for now).

Method Lookups: Method lookups will be cached into a polymorphic global cache. Native Vector operations will be used to optimize type-checks and cache lookups.

Notes:

I call attention to the IRC discussion of March 13, 2005 for Lee's point of view on these issues. He has some good ideas here that I have tried to incorporate.

I do not know what kind of performance to expect, but the biggest prospect for gain lies in eliminating the bytecode processing overhead and reducing setup time, teardown time, and type-checks for message sends, and in general minimizing memory accesses as much as is possible.

AltiVec can compare two (contiguous) argument arrays with an arity of 4 in 4 instruction cycles (versus 15). It is also useful for hashing. (Intel SSE could do 7 cycles versus 11). However, in the real world, arity is usually 1 or 2, so this may be a smaller win. It may be a good way to implement a PMD system in general, where everything is designed to be handled in 128-bit chunks.

AltiVec can be used to reorganize memory, but only if that memory is independent, byte-to-byte. Since the immediate data in the bytecode stream is variable-length and dependent on the previous byte, AltiVec cannot do as much as it otherwise could to rearrange data. Originally, I naively thought that I could transform 16 bytes from the bytecode stream in less than 16 cycles. This is not realistic because of the need for a scalar pass to determine what is bytecode and what is immediate data. A more static bytecode structure would require more memory but could be processed very quickly.

This design does allow for the use of polymorphic-inline-caches, but I'm not sure at the moment how it yields performance benefits beyond a large global cache. It seems that the cost of lookup and type-checking would be about the same.

Sincerely,
David

If anyone wants to discuss this intensively, I recommend Skype or IM. My ID there is "water451" and my IM identifiers are in my signature vCard.

--
-Brian
http://tunes.org/~water/brice.vcf
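As a rough illustration of the two-pass scheme in the proposal above (translate bytecodes into a "jump list", then flow through it), here is a minimal sketch in portable C. Everything in it is an assumption made for the example: the opcodes, the Cell and VM types, and the handler names are invented and are not Slate's, and it threads through C function pointers rather than the hand-coded PowerPC assembly the proposal has in mind.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bytecodes for illustration; NOT Slate's real instruction set. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define STACK_MAX 64

typedef struct VM VM;
typedef void (*Handler)(VM *vm);

/* One slot of the "jump list": a handler address or an immediate operand. */
typedef union {
    Handler handler;
    long    operand;
} Cell;

struct VM {
    const Cell *ip;          /* next cell to execute */
    long stack[STACK_MAX];   /* fixed chunk of memory used as the stack */
    int  sp;                 /* number of values currently on the stack */
    int  running;
};

static void do_push(VM *vm) {
    assert(vm->sp < STACK_MAX);
    vm->stack[vm->sp++] = (vm->ip++)->operand;  /* operand follows the handler */
}
static void do_add(VM *vm) {
    assert(vm->sp >= 2);
    long b = vm->stack[--vm->sp];
    vm->stack[vm->sp - 1] += b;
}
static void do_mul(VM *vm) {
    assert(vm->sp >= 2);
    long b = vm->stack[--vm->sp];
    vm->stack[vm->sp - 1] *= b;
}
static void do_halt(VM *vm) { vm->running = 0; }

/* Pass 1: turn bytecodes into handler addresses plus operands.
 * 'out' must have room for n cells. Returns the cell count, 0 on error. */
static size_t translate(const unsigned char *code, size_t n, Cell *out) {
    size_t i = 0, o = 0;
    while (i < n) {
        switch (code[i++]) {
        case OP_PUSH:
            if (i >= n) return 0;            /* missing operand byte */
            out[o++].handler = do_push;
            out[o++].operand = code[i++];
            break;
        case OP_ADD:  out[o++].handler = do_add;  break;
        case OP_MUL:  out[o++].handler = do_mul;  break;
        case OP_HALT: out[o++].handler = do_halt; break;
        default:      return 0;              /* unknown opcode */
        }
    }
    return o;
}

/* Pass 2: flow through the jump list, calling each handler in turn. */
static long run(const Cell *cells) {
    VM vm;
    vm.ip = cells;
    vm.sp = 0;
    vm.running = 1;
    while (vm.running) {
        Handler h = (vm.ip++)->handler;
        h(&vm);
    }
    return vm.sp > 0 ? vm.stack[vm.sp - 1] : 0;
}

/* Example program: (2 + 3) * 4 */
long run_example(void) {
    const unsigned char code[] = {
        OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT
    };
    Cell cells[sizeof code];
    if (translate(code, sizeof code, cells) == 0) return -1;
    return run(cells);
}
```

Keeping the translated cells around and reusing them would correspond to the "jit mode" in the proposal; throwing them away after each run would be the "interpreter mode".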
|
|
Re: slate gui slowness needs fixing before proceeding

This seems to be the second time that someone has had to stop work on
the gui because of the slowness of Slate. It might be time to ask Lee to come back and work on the compiler/JIT stuff. Time keeps on marching by and it's not waiting for Slate. On 6/11/06, Brian Rice <water@...> wrote: > > On Jun 11, 2006, at 12:09 PM, Timmy Douglas wrote: > > > Well, I've added patches to my repos for simple one-way undo and now > > you can type in all the characters fine. The problem is that there is > > no point in going further (selections, copy paste, searching, > > kill-line, cleaning up my ugly code, etc) when the current gui is too > > slow to type in. I guess my ideal path after those features to the > > text buffer would be to modify demo/inspector.slate to elegantly edit > > the current environment. But enough of the future talk. > > Okay. I've pulled these into the site repositories. > > > So I'd like to take a look at speeding up (or something to speed up > > the GUI portion) of slate, but the only real dealings I have with > > compilers are from when I wrote a tiger compiler in sml (following > > appel's book for a class). Anyways, I didn't see any hint of where to > > start from the last 1000 messages on this list, so I thought I'd start > > a thread. So what are the options? > > I've discussed this off-list with someone who was going to work on > it, but I haven't heard back from him after initial email exchanges. > I've CC'd him just to get some basic communication re-established, > hopefully. The options that we went over were: > > 1) Improve/port the experimental_jit.c. It already gives a 2-4x > speedup. The problem is that it does no dynamic inlining, so that the > huge message-send layering which is the majority of the performance > problem is not taken care of. > 2) Fix/complete Lee's optimizing compiler framework. This has a > couple of sub-options. > The direct option is to finish his x86 code generator, figure out > how to link it with the image, etc. 
Basically lots of stuff that I > have no idea how to do, and I don't know if I can get him to come > back to do it (although I'd try if enough people asked... or maybe > they should try themselves). > The other option is to add a back-end target to LLVM from the IR > code. This is slightly problematic because Lee wrote the IR to use > SSU (single static usage) instead of SSA (single static assignment) > form, which are inversions of each other from a data-flow > perspective. Other than that, Lee's framework does "speak" LLVM at > least from the abstract perspective. > 3) Write an inlining bytecode compiler (my idea, would work > independently of a JIT) and associated caching/flushing system. > 4) Translate the VM into a direct-threaded style, which Eliot Miranda > endorsed at Smalltalk Solutions this year when I spoke with him. It > makes inlining much easier and has other benefits in terms of > architectural/code-manipulation simplifications. > > > In about a week, I'm going to get more busy though since another > > summer class will start, but until then, I should have time to get > > something done. > > That likely won't be enough time to accomplish a deep change, but > some sketch code to start with would be feasible. Slate's VM > structure is pretty simple and malleable. All the bytecode-related > code is in vm.slate, for example. > > I hope that David won't mind, but I've attached his initial VM > proposal email from a couple of months ago. > > > > > If anyone wants to discuss this intensively, I recommend Skype or IM. > My ID there is "water451" and my IM identifiers are in my signature > vCard. > > -- > -Brian > http://tunes.org/~water/brice.vcf > > > > > |
|
|
Slate's slowness impeding UI/IDE development

On Jun 11, 2006, at 6:28 PM, Mark Haniford wrote:
> This seems to be the second time that someone has had to stop work on > the gui because of the slowness of Slate. It might be time to ask Lee > to come back and work on the compiler/JIT stuff. Time keeps on > marching by and it's not waiting for Slate. I agree; this observation has been impressed on me since the moment he declared his lack of enthusiasm. I just want this project to work and do useful things for people. I'd sacrifice quite a bit of control to achieve that. To Lee: How can we persuade you to return somehow? Would it need to involve shedding some of the formalities of a public open-source project? > On 6/11/06, Brian Rice <water@...> wrote: >> >> On Jun 11, 2006, at 12:09 PM, Timmy Douglas wrote: >> >> > Well, I've added patches to my repos for simple one-way undo and >> now >> > you can type in all the characters fine. The problem is that >> there is >> > no point in going further (selections, copy paste, searching, >> > kill-line, cleaning up my ugly code, etc) when the current gui >> is too >> > slow to type in. I guess my ideal path after those features to the >> > text buffer would be to modify demo/inspector.slate to elegantly >> edit >> > the current environment. But enough of the future talk. >> >> Okay. I've pulled these into the site repositories. >> >> > So I'd like to take a look at speeding up (or something to speed up >> > the GUI portion) of slate, but the only real dealings I have with >> > compilers are from when I wrote a tiger compiler in sml (following >> > appel's book for a class). Anyways, I didn't see any hint of >> where to >> > start from the last 1000 messages on this list, so I thought I'd >> start >> > a thread. So what are the options? >> >> I've discussed this off-list with someone who was going to work on >> it, but I haven't heard back from him after initial email exchanges. >> I've CC'd him just to get some basic communication re-established, >> hopefully. 
The options that we went over were: >> >> 1) Improve/port the experimental_jit.c. It already gives a 2-4x >> speedup. The problem is that it does no dynamic inlining, so that the >> huge message-send layering which is the majority of the performance >> problem is not taken care of. >> 2) Fix/complete Lee's optimizing compiler framework. This has a >> couple of sub-options. >> The direct option is to finish his x86 code generator, figure out >> how to link it with the image, etc. Basically lots of stuff that I >> have no idea how to do, and I don't know if I can get him to come >> back to do it (although I'd try if enough people asked... or maybe >> they should try themselves). >> The other option is to add a back-end target to LLVM from the IR >> code. This is slightly problematic because Lee wrote the IR to use >> SSU (single static usage) instead of SSA (single static assignment) >> form, which are inversions of each other from a data-flow >> perspective. Other than that, Lee's framework does "speak" LLVM at >> least from the abstract perspective. >> 3) Write an inlining bytecode compiler (my idea, would work >> independently of a JIT) and associated caching/flushing system. >> 4) Translate the VM into a direct-threaded style, which Eliot Miranda >> endorsed at Smalltalk Solutions this year when I spoke with him. It >> makes inlining much easier and has other benefits in terms of >> architectural/code-manipulation simplifications. >> >> > In about a week, I'm going to get more busy though since another >> > summer class will start, but until then, I should have time to get >> > something done. >> >> That likely won't be enough time to accomplish a deep change, but >> some sketch code to start with would be feasible. Slate's VM >> structure is pretty simple and malleable. All the bytecode-related >> code is in vm.slate, for example. >> >> I hope that David won't mind, but I've attached his initial VM >> proposal email from a couple of months ago. 
-Brian http://tunes.org/~water/brice.vcf |
|
|
Re: Slate's slowness impeding UI/IDE development

So what is the "real" reason for Lee quitting in the first place? Maybe if Lee was bestowed the position of "Benevolent Dictator" then he would come back.

But wasn't Lee the only person working on the VM anyway? Were other people trying to bogart in on the VM, or were there arguments about the design, or what?

I think Slate has some great ideas behind it.

On 6/11/06, Brian Rice <water@...> wrote:
> On Jun 11, 2006, at 6:28 PM, Mark Haniford wrote:
> To Lee:
> How can we persuade you to return somehow? Would it need to involve
> shedding some of the formalities of a public open-source project?
|
|
Re: slate gui slowness needs fixing before proceeding

Brian Rice <water@...> writes:
> On Jun 11, 2006, at 12:09 PM, Timmy Douglas wrote: > >> Well, I've added patches to my repos for simple one-way undo and now >> you can type in all the characters fine. The problem is that there is >> no point in going further (selections, copy paste, searching, >> kill-line, cleaning up my ugly code, etc) when the current gui is too >> slow to type in. I guess my ideal path after those features to the >> text buffer would be to modify demo/inspector.slate to elegantly edit >> the current environment. But enough of the future talk. > > Okay. I've pulled these into the site repositories. thanks. I made another patch with drop-mark, delete-region, and kill-line (which for now behaves differently than emacs since it does drop-mark, end-of-line, and delete-region in one command, which sucks up the next line I think). >> So I'd like to take a look at speeding up (or something to speed up >> the GUI portion) of slate, but the only real dealings I have with >> compilers are from when I wrote a tiger compiler in sml (following >> appel's book for a class). Anyways, I didn't see any hint of where to >> start from the last 1000 messages on this list, so I thought I'd start >> a thread. So what are the options? > > I've discussed this off-list with someone who was going to work on > it, but I haven't heard back from him after initial email exchanges. > I've CC'd him just to get some basic communication re-established, > hopefully. The options that we went over were: > > 1) Improve/port the experimental_jit.c. It already gives a 2-4x > speedup. The problem is that it does no dynamic inlining, so that the > huge message-send layering which is the majority of the performance > problem is not taken care of. hm, this sort of seems like a temporary patch to speed up things but I'm not sure it is enough. You'd have to try out the UI to see for yourself, but 2-4x doesn't seem like it will fix the problem. It looks like it might be the easiest option though. 
It's weird because sometimes there are long delays and then the events will all get processed by the ui fairly quickly (1 sec between typed characters showing up) compared to the delay before the events registered (like 2-10 sec after the first keypress). But if you try inspector.slate and try to drag the inspector windows you will go crazy. I guess mouse motion events really drag it down or something along those lines. I'm not really that familiar with how slate's compiler/interpreter works or builds now. I was hoping to find something that would tell me how mobius/ is built since it's like already slate code and I'm not sure what can actually build that first stage (which I assume produces the vm.c file in the base directory). Can someone point me to docs on how the whole build process works? I think it'd go a long way with understanding the vm... so far I've just been opening up like millions of source files but it's hard to get the whole picture from that and mobius.pdf. > 2) Fix/complete Lee's optimizing compiler framework. This has a > couple of sub-options. > The direct option is to finish his x86 code generator, figure out > how to link it with the image, etc. Basically lots of stuff that I > have no idea how to do, and I don't know if I can get him to come > back to do it (although I'd try if enough people asked... or maybe > they should try themselves). Yeah, there is quite a bit of code in src/mobius/optimizer. So you're saying none of that is being used at the moment? > The other option is to add a back-end target to LLVM from the IR > code. This is slightly problematic because Lee wrote the IR to use > SSU (single static usage) instead of SSA (single static assignment) > form, which are inversions of each other from a data-flow > perspective. Other than that, Lee's framework does "speak" LLVM at > least from the abstract perspective. Sounds like we probably wouldn't have to worry about optimizations once we got it into llvm's hands. 
I don't know how good that'd be. I guess I'd have to spend a week figuring out how the optimization framework works first. > 3) Write an inlining bytecode compiler (my idea, would work > independently of a JIT) and associated caching/flushing system. Well I'm pretty new to this stuff so I don't really know what would get cached and flushed... I went to the library today since I saw you thought ACD&I(?) was a good book (on some irc conversation a while back) but they only let me check it out for 2 hours in-library... I gave up after 30 min since there wasn't much I could do there. > 4) Translate the VM into a direct-threaded style, which Eliot Miranda > endorsed at Smalltalk Solutions this year when I spoke with him. It > makes inlining much easier and has other benefits in terms of > architectural/code-manipulation simplifications. ok >> In about a week, I'm going to get more busy though since another >> summer class will start, but until then, I should have time to get >> something done. > > That likely won't be enough time to accomplish a deep change, but > some sketch code to start with would be feasible. Slate's VM > structure is pretty simple and malleable. All the bytecode-related > code is in vm.slate, for example. > > I hope that David won't mind, but I've attached his initial VM > proposal email from a couple of months ago. Well, if he takes the powerpc asm route, I hope there is a way that I could run it on this old athlon box. Thanks for the reply. |
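Timmy's guess above is that mouse motion events are what bog the UI down. Nobody in the thread has confirmed that, but if it is the cause, one common mitigation is to coalesce motion events before dispatching them, so that only the newest pointer position in each burst reaches the UI. The sketch below assumes a simple array-backed queue; the Event type and coalesce_motion are hypothetical names for this example and are neither SDL's API nor Slate's event code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event record; not SDL's SDL_Event or Slate's event type. */
typedef enum { EV_KEY, EV_MOTION } EventKind;

typedef struct {
    EventKind kind;
    int x, y;   /* pointer position, meaningful for EV_MOTION */
    int key;    /* key code, meaningful for EV_KEY */
} Event;

/* Compact the queue in place: within each run of consecutive motion
 * events keep only the last one. Key events are never dropped and the
 * relative order of the surviving events is preserved.
 * Returns the new number of events in q. */
size_t coalesce_motion(Event *q, size_t n) {
    assert(q != NULL || n == 0);
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (q[i].kind == EV_MOTION && i + 1 < n && q[i + 1].kind == EV_MOTION)
            continue;   /* superseded by a newer motion event */
        q[out++] = q[i];
    }
    return out;
}
```

A UI loop would drain all pending events into the array, call coalesce_motion once, and then dispatch what is left, so a drag that produced dozens of motion events costs a single redraw per key or button boundary.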
|
|
Re: slate gui slowness needs fixing before proceeding

Timmy Douglas wrote:
> processed by the ui fairly quickly (1 sec between typed characters
> showing up) compared to the delay before the events registered (like
> 2-10 sec after the first keypress). But if you try inspector.slate and
> try to drag the inspector windows you will go crazy. I guess mouse
> motion events really drag it down or something along those lines.

It sounds like there's something rotten in the interface to SDL, rather than basic slowness of Slate... well, I'd be surprised if it was Slate since I've always been pleasantly surprised at the speed of the system in general. Perhaps there's something sub-par in the polling logic? I can't say for sure, of course, since I've been concentrating on other aspects of the system and haven't tried the SDL interfaces for a long, long time!

> Well I'm pretty new to this stuff so I don't really know what would
> get cached and flushed... I went to the library today since I saw you
> thought ACD&I(?) was a good book (on some irc conversation a while
> back) but they only let me check it out for 2 hours in-library... I
> gave up after 30 min since there wasn't much I could do there.

Try the Self papers:

http://citeseer.ist.psu.edu/chambers92design.html
http://citeseer.ist.psu.edu/chambers91making.html
http://citeseer.ist.psu.edu/hlzle94adaptive.html
http://citeseer.ist.psu.edu/chambers90iterative.html

They changed my life ;-)

> Well, if he takes the powerpc asm route, I hope there is a way that I
> could run it on this old athlon box.

One important aspect of a direct-threaded design, ISTM, is the instruction encodings... if someone can design a fixed-width "VLIW" format for direct-threaded bytecodes, then the same architecture ought to be able to apply to x86 as well as PPC.

Tony
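To make the fixed-width idea concrete, here is one possible encoding, sketched in C. The layout (an 8-bit opcode in the top byte and a 24-bit unsigned operand below it, packed into one 32-bit word) is purely an assumption for illustration; nothing in the thread specifies a format, and the function names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width instruction word:
 *   bits 31..24  opcode
 *   bits 23..0   unsigned operand */
typedef uint32_t Insn;

#define OPERAND_BITS 24u
#define OPERAND_MASK ((UINT32_C(1) << OPERAND_BITS) - 1u)

Insn insn_encode(uint8_t opcode, uint32_t operand) {
    assert(operand <= OPERAND_MASK);   /* operand must fit in 24 bits */
    return ((uint32_t)opcode << OPERAND_BITS) | operand;
}

uint8_t insn_opcode(Insn i) {
    return (uint8_t)(i >> OPERAND_BITS);
}

uint32_t insn_operand(Insn i) {
    return i & OPERAND_MASK;
}

/* Every instruction occupies exactly one word, so instruction k is
 * simply code[k]: no scalar pass is needed to find where immediates
 * end and the next opcode begins. */
size_t count_opcode(const Insn *code, size_t n, uint8_t opcode) {
    size_t hits = 0;
    for (size_t k = 0; k < n; k++)
        if (insn_opcode(code[k]) == opcode)
            hits++;
    return hits;
}
```

The trade-off is the one David's proposal already names: a more static structure takes more memory than variable-length bytecodes, but decoding no longer depends on the previous byte and uses no machine-specific tricks.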
|
|
Re: slate gui slowness needs fixing before proceeding

Tony Garnock-Jones wrote:
> Timmy Douglas wrote:
>
>>processed by the ui fairly quickly (1 sec between typed characters
>>showing up) compared to the delay before the events registered (like
>>2-10 sec after the first keypress). But if you try inspector.slate and
>>try to drag the inspector windows you will go crazy. I guess mouse
>>motion events really drag it down or something along those lines.
>
> It sounds like there's something rotten in the interface to SDL, rather
> than basic slowness of Slate... well, I'd be surprised if it was Slate
> since I've always been pleasantly surprised at the speed of the system
> in general. Perhaps there's something sub-par in the polling logic?

I haven't tried the SDL interface but you can get a very rough idea of the basic VM performance by running 'make benchmark' and comparing the results to those at:

http://shootout.alioth.debian.org/

Last time I tried this the results were unfortunately not good :-(

The source for these tests can be found in test/benchmark and there is some documentation in doc/benchmarks.txt.

Nick.
|
|
Re: slate gui slowness needs fixing before proceeding

On Jun 12, 2006, at 5:27 AM, Nick Forde wrote:

> Tony Garnock-Jones wrote:
>> Timmy Douglas wrote:
>>> processed by the ui fairly quickly (1 sec between typed characters
>>> showing up) compared to the delay before the events registered (like
>>> 2-10 sec after the first keypress). But if you try
>>> inspector.slate and
>>> try to drag the inspector windows you will go crazy. I guess mouse
>>> motion events really drag it down or something along those lines.
>> It sounds like there's something rotten in the interface to SDL,
>> rather
>> than basic slowness of Slate... well, I'd be surprised if it was
>> Slate
>> since I've always been pleasantly surprised at the speed of the
>> system
>> in general. Perhaps there's something sub-par in the polling logic?
>
> I haven't tried the SDL interface but you can get a very rough idea
> of the basic VM performance by running 'make benchmark' and
> comparing the results to those at: http://shootout.alioth.debian.org/
> Last time I tried this the results were unfortunately not good :-(

*ahem* Just to be pedantic, that's not the VM's performance, that's the VM+image's performance. There are a lot of message sends that happen for basic control-flow operations, for example, or even for just executing a non-literal (argument to a method) block.

That said, the VM design is rather naive and could be much better, but not by the order of magnitude that the shootout benchmarks are. The performance problem we have relates to the Strongtalk design - we just have no inlining compiler to rely on.

> The source for these tests can be found in test/benchmark and there is
> some documentation in doc/benchmarks.txt.

--
-Brian
http://tunes.org/~water/brice.vcf
|
|
Re: Slate's slowness impeding UI/IDE development

On Jun 11, 2006, at 7:43 PM, Mark Haniford wrote:

> So what is the "real" reason for Lee quitting in the first place?
> Maybe if Lee was bestowed the position of "Benevolent Dictator" then
> he would come back.

The last relevant quotes (over instant messaging) I got from him about his stance towards the project were:

"i can say with absolute certainty i have no interest in continuing it"
"i'm disinterested in where the project is going and i'm not sure we are capable of running it together to the mutual benefit of us both"
"as it has been it has been nothing more than parasitic to me"

If I may paraphrase further, basically he did not want to maintain a public project at all, and only cared about a hobby-level language+compiler+OS toolchain that ran on bare hardware.

He actually *suggested* that we focus on the IDE to make Slate more useful rather than fret about performance. I'm not sure how he never took the performance issue seriously enough to realize that the UI-based IDE would be unusable without a dynamic inliner.

I am CC'ing him so that he can clarify his stance on this if he wishes.

> But wasn't Lee the only person working on the VM anyway?

Lee actually didn't originally like the idea of using a VM. I simply introduced a (buggy) Slate-to-C translator a la Squeak and sketched out a VM design just to get the ball rolling when we were stuck in a Common Lisp interpreter. If we hadn't done that, we'd still be there on CL since he never completed his compiler.

That said, he wrote most of the VM code and wound up maintaining it. He has a propensity to write pages and pages of really interesting code with no comments; I spoke with other people he has collaborated with and they've confirmed this. So it's non-trivial to pick up code that he wrote and run with it, and he got stuck with what he had written, with little-to-no desire to do so (apparently).

> Were other people trying to bogart in on the VM or were there
> arguments about the design or what?

Towards the end there was a decent amount of clamoring for continuation support, which he apparently found unwarranted. He basically silently refused to code any support for it, while making hand-waving explanations about how easy it would be to get a subset of the functionality. At least a few people disagreed with him on the matter.

No one really criticized or tried to mess with the basic VM design or such; in fact I think he was its biggest critic.

The last that I heard, Lee was learning his father's real estate business and selling real estate in/near Las Vegas. I suggested a few open positions for the type of work he was doing with Slate, but it didn't interest him. He probably makes good money and is totally wasting his technical talent (or not - he occasionally just contributes to a few projects as a donor).

> I think Slate has some great ideas behind it.
>
> On 6/11/06, Brian Rice <water@...> wrote:
>> On Jun 11, 2006, at 6:28 PM, Mark Haniford wrote:
>> To Lee:
>> How can we persuade you to return somehow? Would it need to involve
>> shedding some of the formalities of a public open-source project?

--
-Brian
http://tunes.org/~water/brice.vcf
|
|
Re: slate gui slowness needs fixing before proceedingBrian Rice wrote:
> On Jun 12, 2006, at 5:27 AM, Nick Forde wrote: > >> Tony Garnock-Jones wrote: >> >>> Timmy Douglas wrote: >>> >>>> processed by the ui fairly quickly (1 sec between typed characters >>>> showing up) compared to the delay before the events registered (like >>>> 2-10 sec after the first keypress). But if you try inspector.slate and >>>> try to drag the inspector windows you will go crazy. I guess mouse >>>> motion events really drag it down or something along those lines. >>> >>> It sounds like there's something rotten in the interface to SDL, rather >>> than basic slowness of Slate... well, I'd be surprised if it was Slate >>> since I've always been pleasantly surprised at the speed of the system >>> in general. Perhaps there's something sub-par in the polling logic? >> >> >> I haven't tried the SDL interface but you can get a very rough idea >> of the basic VM performance by running 'make benchmark' and >> comparing the results to those at: http://shootout.alioth.debian.org/ >> Last time I tried this the results were unfortunately not good :-( > > > *ahem* Just to be pedantic, that's not the VM's performance, that's the > VM+image's performance. There are a lot of message sends that happen > for basic control-flow operations, for example, or even for just > executing a non-literal (argument to a method) block. For most of the benchmarks that is indeed true. However, I added an extra primitive to the VM to trace PSInterpreter_[re]sendMessage*() calls only during the body of the benchmarks. I found that in some of the simple tight loop tests the number of message sends wasn't excessive but most of the execution time was in their dispatch. These benchmarks still fared poorly when compared to the shootout results. > That said, the VM design is rather naive and could be much better, but > not by the order of magnitude that the shootout benchmarks are. The > performance problem we have relates to the Strongtalk design - we just > have no inlining compiler to rely on. 
I think you're right and agree that this is where the biggest performance gains can be had.

Nick.
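The measurement described above (tracing sends only during the body of a benchmark) can be sketched roughly as follows. This is a toy model in Python, not Slate's VM: `SendTracer`, `send`, `do`, `while_true`, and the `METHODS` table are all hypothetical names made up for illustration, and the real primitive hooks PSInterpreter_[re]sendMessage*() inside the VM rather than a Python function.

```python
from contextlib import contextmanager


class SendTracer:
    """Counts message sends per selector, but only while tracing is on."""

    def __init__(self):
        self.enabled = False
        self.counts = {}

    def record(self, selector):
        if self.enabled:
            self.counts[selector] = self.counts.get(selector, 0) + 1

    @contextmanager
    def tracing(self):
        # Scope the trace to a region, e.g. the body of one benchmark.
        self.enabled = True
        try:
            yield self
        finally:
            self.enabled = False

    def total(self):
        return sum(self.counts.values())


TRACER = SendTracer()

# Hypothetical method table for a toy integer: selector -> implementation.
METHODS = {
    "<": lambda a, b: a < b,
    "+": lambda a, b: a + b,
}


def send(selector, receiver, *args):
    """Every operation, even arithmetic, goes through a counted send."""
    TRACER.record(selector)
    return METHODS[selector](receiver, *args)


def do(block):
    """Evaluating a block is itself counted as a send in this toy model."""
    TRACER.record("do")
    return block()


def while_true(condition, body):
    # Control flow is built from block evaluations rather than VM jumps.
    while do(condition):
        do(body)


def count_to(n):
    state = {"i": 0}
    while_true(
        lambda: send("<", state["i"], n),
        lambda: state.update(i=send("+", state["i"], 1)),
    )
    return state["i"]


def traced_count_to(n):
    """Run the loop with tracing on; return (result, per-selector counts)."""
    TRACER.counts.clear()
    with TRACER.tracing():
        result = count_to(n)
    return result, dict(TRACER.counts)
```

In this toy model, `traced_count_to(100)` records 402 sends for 100 loop iterations (201 block evaluations, 101 comparisons, 100 additions), while setup code run outside `tracing()` is not counted. A count like this says nothing about where the time goes per send, which is the part the post attributes to dispatch.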
|
|
Re: slate gui slowness needs fixing before proceeding
On Jun 11, 2006, at 10:36 PM, Timmy Douglas wrote:

> Brian Rice <water@...> writes:

>> On Jun 11, 2006, at 12:09 PM, Timmy Douglas wrote:

>>> Well, I've added patches to my repos for simple one-way undo and now you can type in all the characters fine. The problem is that there is no point in going further (selections, copy paste, searching, kill-line, cleaning up my ugly code, etc) when the current gui is too slow to type in. I guess my ideal path after those features to the text buffer would be to modify demo/inspector.slate to elegantly edit the current environment. But enough of the future talk.

>> Okay. I've pulled these into the site repositories.

> thanks. I made another patch with drop-mark, delete-region, and kill-line (which for now behaves differently than emacs since it does drop-mark, end-of-line, and delete-region in one command, which sucks up the next line I think).

>>> So I'd like to take a look at speeding up (or something to speed up the GUI portion) of slate, but the only real dealings I have with compilers are from when I wrote a tiger compiler in sml (following appel's book for a class). Anyways, I didn't see any hint of where to start from the last 1000 messages on this list, so I thought I'd start a thread. So what are the options?

>> I've discussed this off-list with someone who was going to work on it, but I haven't heard back from him after initial email exchanges. I've CC'd him just to get some basic communication re-established, hopefully. The options that we went over were:

>> 1) Improve/port the experimental_jit.c. It already gives a 2-4x speedup. The problem is that it does no dynamic inlining, so that the huge message-send layering which is the majority of the performance problem is not taken care of.
> hm, this sort of seems like a temporary patch to speed up things but I'm not sure it is enough. You'd have to try out the UI to see for yourself, but 2-4x doesn't seem like it will fix the problem.

> I'm not really that familiar with how slate's compiler/interpreter works or builds now. I was hoping to find something that would tell me how mobius/ is built since it's like already slate code and I'm not sure what can actually build that first stage (which I assume produces the vm.c file in the base directory). Can someone point me to docs on how the whole build process works? I think it'd go a long way with understanding the vm... so far I've just been opening up like millions of source files but it's hard to get the whole picture from that and mobius.pdf.

The code that "make newboot" calls (mentioned in INSTALL) is in src/mobius/vm/build.slate and then there's src/mobius/vm/bootstrap.slate and so forth. src/mobius/init.slate shows the load order needed to build up a working VM+image bootstrapper. The README maps out the contents of the directories of the Slate repositories.

That said, a guide to Slate's build system is certainly warranted. I'll think about how to start one, but suggestions are also welcome.

>> 2) Fix/complete Lee's optimizing compiler framework. This has a couple of sub-options.
>> The direct option is to finish his x86 code generator, figure out how to link it with the image, etc. Basically lots of stuff that I have no idea how to do, and I don't know if I can get him to come back to do it (although I'd try if enough people asked... or maybe they should try themselves).

> Yeah, there is quite a bit of code in src/mobius/optimizer. So you're saying none of that is being used at the moment?

[Lee] never hooked it up to test it, other than the IR front-end in tests that were local to his setup (i.e. he never published them).

>> The other option is to add a back-end target to LLVM from the IR code.
>> This is slightly problematic because Lee wrote the IR to use SSU (single static usage) instead of SSA (single static assignment) form, which are inversions of each other from a data-flow perspective. Other than that, Lee's framework does "speak" LLVM at least from the abstract perspective.

> Sounds like we probably wouldn't have to worry about optimizations once we got it into llvm's hands. I don't know how good that'd be. I guess I'd have to spend a week figuring out how the optimization framework works first.

Yeah, it's at least a medium-sized task. Compiler-coding familiarity helps quite a bit but Lee made his own decisions for his own reasons that have to be divined from the source.

>> 3) Write an inlining bytecode compiler (my idea, would work independently of a JIT) and associated caching/flushing system.

> Well I'm pretty new to this stuff so I don't really know what would get cached and flushed... I went to the library today since I saw you thought ACD&I(?) was a good book (on some irc conversation a while back) but they only let me check it out for 2 hours in-library... I gave up after 30 min since there wasn't much I could do there.

Advanced Compiler Design & Implementation only covers advanced back-end optimizations and intermediate-form representation choices. It's the basis for a lot of Lee's work in that area.

>> 4) Translate the VM into a direct-threaded style, which Eliot Miranda endorsed at Smalltalk Solutions this year when I spoke with him. It makes inlining much easier and has other benefits in terms of architectural/code-manipulation simplifications.

> ok

>>> In about a week, I'm going to get more busy though since another summer class will start, but until then, I should have time to get something done.

>> That likely won't be enough time to accomplish a deep change, but some sketch code to start with would be feasible. Slate's VM structure is pretty simple and malleable. All the bytecode-related code is in vm.slate, for example.
>> I hope that David won't mind, but I've attached his initial VM proposal email from a couple of months ago.

> Well, if he takes the powerpc asm route, I hope there is a way that I could run it on this old athlon box.

--
-Brian
http://tunes.org/~water/brice.vcf
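On the question of what "would get cached and flushed" in option 3: one common reading is that each send site remembers the result of its last method lookup, and that those remembered results are discarded whenever a method definition changes. The sketch below illustrates only that general idea. It is an assumption about what such a system could look like, not the design proposed in this thread, and it uses single dispatch on a class-like `Behavior` for brevity. `Behavior`, `Obj`, `SendSite`, and `define_method` are hypothetical names.

```python
class Behavior:
    """Toy method holder: a method table plus an optional parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.methods = {}

    def lookup(self, selector):
        # The slow path: walk the parent chain until the selector is found.
        behavior = self
        while behavior is not None:
            if selector in behavior.methods:
                return behavior.methods[selector]
            behavior = behavior.parent
        raise LookupError(f"{self.name} does not understand {selector!r}")


class Obj:
    def __init__(self, behavior):
        self.behavior = behavior


class SendSite:
    """One call site with a single-entry (monomorphic) inline cache."""

    all_sites = []  # registry, so that caches can be flushed later

    def __init__(self, selector):
        self.selector = selector
        self.cached_behavior = None
        self.cached_method = None
        self.hits = 0
        self.misses = 0
        SendSite.all_sites.append(self)

    def send(self, receiver, *args):
        behavior = receiver.behavior
        if behavior is self.cached_behavior:
            self.hits += 1  # fast path: no lookup at all
        else:
            self.misses += 1
            method = behavior.lookup(self.selector)
            self.cached_behavior = behavior
            self.cached_method = method
        return self.cached_method(receiver, *args)

    def flush(self):
        self.cached_behavior = None
        self.cached_method = None


def define_method(behavior, selector, function):
    """Install a method and flush every cache that might now be stale."""
    behavior.methods[selector] = function
    for site in SendSite.all_sites:
        if site.selector == selector:
            site.flush()
```

Without the flush in `define_method`, a site that had already cached the old method would keep calling it after a redefinition. An inlining compiler has the same obligation on a larger scale: code compiled with a method body inlined has to be invalidated when that method changes.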
|
|
Re: slate gui slowness needs fixing before proceeding
I need to clarify at least this point:
On Jun 12, 2006, at 7:02 AM, Brian Rice wrote:

> On Jun 11, 2006, at 10:36 PM, Timmy Douglas wrote:

>> Brian Rice <water@...> writes:

>>> The other option is to add a back-end target to LLVM from the IR code. This is slightly problematic because Lee wrote the IR to use SSU (single static usage) instead of SSA (single static assignment) form, which are inversions of each other from a data-flow perspective. Other than that, Lee's framework does "speak" LLVM at least from the abstract perspective.

>> Sounds like we probably wouldn't have to worry about optimizations once we got it into llvm's hands. I don't know how good that'd be. I guess I'd have to spend a week figuring out how the optimization framework works first.

> Yeah, it's at least a medium-sized task. Compiler-coding familiarity helps quite a bit but Lee made his own decisions for his own reasons that have to be divined from the source.

in doc/*.txt.

--
-Brian
http://tunes.org/~water/brice.vcf
|
|
Re: slate gui slowness needs fixing before proceeding
On Jun 11, 2006, at 4:15 PM, Brian Rice wrote:

> On Jun 11, 2006, at 12:09 PM, Timmy Douglas wrote:

>> So I'd like to take a look at speeding up (or something to speed up the GUI portion) of slate, but the only real dealings I have with compilers are from when I wrote a tiger compiler in sml (following appel's book for a class). Anyways, I didn't see any hint of where to start from the last 1000 messages on this list, so I thought I'd start a thread. So what are the options?

> I've discussed this off-list with someone who was going to work on it, but I haven't heard back from him after initial email exchanges. I've CC'd him just to get some basic communication re-established, hopefully. The options that we went over were:

> 1) Improve/port the experimental_jit.c. It already gives a 2-4x speedup.