|
View:
New views
20 Messages
—
Rating Filter:
Alert me
|
| < Prev | 1 - 2 | Next > |
|
|
Memory allocation overheadI've spent the last three weekends making a serious attempt to get to
the bottom of the problem of quadratic performance in gexslt that has been present ever since I first successfully ran an identity transformation back in 2004. I used the profiler in EiffelStudio 6.1 (a run that normally last 71 minutes when compiled with gec takes about 14 hours when compiled with ISE 6.1 with the profiler on), and spent lots of time following false leads (the report gives exact numbers for routine calls, which is very useful, and nonsense for time spent e.g. 873.49% for one routine, which is definitely not useful). By this morning I had concluded there was nothing in my code that could explain the results seen - or at least, if there was, I wasn't going to be able to find it using the EiffelStudio profiler. So I edited $GOBO/tool/gec/config/c/gcc.cfg to write out profiling information for gprof (Eric, when can we have the setting of cflags and lflags in system.xace?). The resulting runtime was 118 minutes - a much more acceptable overhead. And the resulting report from gprof is highly illuminating. here follows the first few lines: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls s/call s/call name 36.58 2943.00 2943.00 GC_mark_from 16.32 4256.06 1313.06 GC_header_cache_miss 13.18 5316.64 1060.58 GC_mark_local 10.57 6166.79 850.15 GC_steal_mark_stack 4.32 6514.36 347.57 GC_add_to_black_list_stack 1.55 6639.04 124.68 GC_push_marked 1.26 6740.27 101.23 GC_find_header 1.22 6838.10 97.83 GC_reclaim_clear 1.10 6926.51 88.41 GC_do_local_mark 0.82 6992.39 65.88 GC_generic_malloc_many 0.64 7044.16 51.77 GC_block_was_dirty 0.63 7095.09 50.93 GC_install_header 0.58 7141.48 46.39 GC_apply_to_all_blocks 0.47 7179.41 37.93 GC_build_fl 0.44 7214.44 35.03 530952040 0.00 0.00 T122f10 0.40 7246.41 31.97 GC_allochblk_nth 0.29 7269.71 23.30 GC_enclosing_mapping 0.26 7290.70 20.99 262713928 0.00 0.00 T506f27 0.25 7311.07 20.37 GC_malloc 0.25 7331.32 20.25 127354522 0.00 0.00 T162x16589 0.22 7348.82 17.50 GC_free_block_ending_at 0.22 7366.13 17.31 GC_malloc_atomic 0.20 7382.11 15.99 3529690218 0.00 0.00 T240f8 0.19 7397.55 15.44 GC_reclaim_block 0.18 7412.41 14.86 GC_reclaim_uninit 0.18 7427.16 14.75 529087403 0.00 0.00 T57f9 0.18 7441.84 14.68 529541505 0.00 0.00 T15f4 0.13 7452.31 10.47 2030496041 0.00 0.00 T240c7 0.11 7461.08 8.77 105048154 0.00 0.00 T27f8 0.10 7469.40 8.32 2049701317 0.00 0.00 GE_new239 0.10 7477.42 8.02 7679890135 0.00 0.00 GE_check_null 0.10 7485.32 7.90 172970992 0.00 0.00 T772f27p1 0.10 7493.20 7.88 2049636054 0.00 0.00 GE_new240 in short, almost all the run time is for memory management (this did not surprise me as typically I see the program using 125-290% CPU - I have a quad-core processor, so fully utilized = 400% - and I configured the boehm-gc for parallel marking). Now my problem is what to do about it (incidentally T122f10 is {DS_ARRAYED_LIST}.item - nothing very surprising about that either, I guess.). Does anyone have any suggested approaches? -- Colin Adams Preston Lancashire ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ gobo-eiffel-develop mailing list gobo-eiffel-develop@... https://lists.sourceforge.net/lists/listinfo/gobo-eiffel-develop |
|
|
Re: Memory allocation overheadColin Paul Adams wrote:
> So I edited $GOBO/tool/gec/config/c/gcc.cfg to write out profiling > information for gprof (Eric, when can we have the setting of cflags > and lflags in system.xace?). It's already implemented as far as I know: <option name="c_compiler_options" value="..."/> <option name="link" value="..."/> > Now my problem is what to do about it (incidentally T122f10 is > {DS_ARRAYED_LIST}.item - nothing very surprising about that either, I > guess.). Does anyone have any suggested approaches? You can try reducing the amount of garbage generated, and hence put less stress on the GC. I try to do that on all the Gobo tools, even gec. This is one of my primary goals during the design phase. The net advantage is that these tools can run even without the GC. -- Eric Bezault mailto:ericb@... http://www.gobosoft.com ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ gobo-eiffel-develop mailing list gobo-eiffel-develop@... https://lists.sourceforge.net/lists/listinfo/gobo-eiffel-develop |
|
|
Re: Memory allocation overhead> I used the profiler in EiffelStudio 6.1 (a run that normally
> last 71 minutes when compiled with gec takes about 14 hours > when compiled with ISE 6.1 with the profiler on), and spent You should not enable profiler on all classes but just on yours. Then you see much quickly where the bottleneck is on your code rather than trying to profile STRING/ARRAY/ and other basic stuff which is not going to help you much. Once you have identified the bottleneck you can try reduce the scope of profiling to just the bottleneck. Manu ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ gobo-eiffel-develop mailing list gobo-eiffel-develop@... https://lists.sourceforge.net/lists/listinfo/gobo-eiffel-develop |
|
|
Re: Memory allocation overheadColin Paul Adams wrote:
> Now my problem is what to do about it (incidentally T122f10 is > {DS_ARRAYED_LIST}.item - nothing very surprising about that either, I > guess.). Does anyone have any suggested approaches? Can you send me the C code generated for T122f10? I don't understand how we can have 35 seconds self and 7214 seconds cumulative. Unless its thread is blocked when the GC is doing something special on its own thread. -- Eric Bezault mailto:ericb@... http://www.gobosoft.com ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ gobo-eiffel-develop mailing list gobo-eiffel-develop@... https://lists.sourceforge.net/lists/listinfo/gobo-eiffel-develop |
|
|
|
|
|
Re: Memory allocation overhead>>>>> "Eric" == Eric Bezault <ericb@...> writes:
Eric> You can try reducing the amount of garbage generated So as an example, in XM_XPATH_STATIC_PROPERTY, I have four ARRAY [BOOLEAN] which are included in every expression. Presumably it would be much cheaper to remove the booleans from the arrays (I can't think why I put them in ARRAYs in the first place). -- Colin Adams Preston Lancashire ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ gobo-eiffel-develop mailing list gobo-eiffel-develop@... https://lists.sourceforge.net/lists/listinfo/gobo-eiffel-develop |
|
|
Re: Memory allocation overhead>>>>> "Colin" == Colin Paul Adams <colin@...> writes:
>>>>> "Eric" == Eric Bezault <ericb@...> writes: Eric> You can try reducing the amount of garbage generated Colin> So as an example, in XM_XPATH_STATIC_PROPERTY, I have four Colin> ARRAY [BOOLEAN] which are included in every Colin> expression. Presumably it would be much cheaper to remove Colin> the booleans from the arrays (I can't think why I put them Colin> in ARRAYs in the first place). This seems very productive. I changed the first two arrays to be just a series of BOOLEANs, and the runtime came down from 71 minutes to 50 minutes (my target is sub-one-minute, so there's a way to go, but it's a good start). So I'll do the other two ARRAYs next. Another possibility is actually to merge all the BOOLEANs as a bit-field. This will save memory but not the number of objects, so I'm not really interested in doing this. It might hurt performance. But I'm interested in what approaches/techniques you use to avoid extra allocations. Do you use free at all? Obviously that won't reduce the number of allocations, but it will reduce the number of objects the GC has to deal with. Does it give any advantage over assigning Void to the reference? Expanded attributes? This implies having default_create as a creation procedure, whereas the Gobo standard is to have an empty make. Another possibility is to avoid OO techniques. For instance, I know from last weekend's profiling that there is a VERY large number of XM_XPATH_UNTYPED_ATOMIC_VALUE objects created by conversion from XM_XPATH_STRING_VALUE. Untyped-atomic is XPath's coercible data type. Although untyped-atomic does not inherit from xs:string in the XPath type hierarchy, I have implemented XM_XPATH_UNTYPED_ATOMIC_VALUE as inheriting from XM_XPATH_STRING_VALUE for convenience, as they are nearly identical excpet for the coercing behaviour. So a clear saving could be made by merging the two classes with a BOOLEAN to indicate which type is actualy meant. Then the coercion to xs:string is simply implemented by flipping the BOOLEAN. I suspect this is going to be a big saving, but it is very anti-OO. -- Colin Adams Preston Lancashire ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ gobo-eiffel-develop mailing list gobo-eiffel-develop@... https://lists.sourceforge.net/lists/listinfo/gobo-eiffel-develop |
|
|
Re: Memory allocation overheadColin Paul Adams wrote:
> But I'm interested in what approaches/techniques you use to avoid > extra allocations. > > Do you use free at all? Obviously that won't reduce the number of > allocations, but it will reduce the number of objects the GC has to > deal with. Does it give any advantage over assigning Void to the > reference? I don't use free. > Expanded attributes? This implies having default_create as a creation > procedure, whereas the Gobo standard is to have an empty make. I don't use expanded. > Another possibility is to avoid OO techniques. For instance, I know > from last weekend's profiling that there is a VERY large number of > XM_XPATH_UNTYPED_ATOMIC_VALUE objects created by conversion from > XM_XPATH_STRING_VALUE. Untyped-atomic is XPath's coercible data > type. Although untyped-atomic does not inherit from xs:string in the > XPath type hierarchy, I have implemented XM_XPATH_UNTYPED_ATOMIC_VALUE > as inheriting from XM_XPATH_STRING_VALUE for convenience, as they are > nearly identical excpet for the coercing behaviour. So a clear saving > could be made by merging the two classes with a BOOLEAN to indicate > which type is actualy meant. Then the coercion to xs:string is simply > implemented by flipping the BOOLEAN. I suspect this is going to be a > big saving, but it is very anti-OO. This might be the kind of things I could use indeed. Otherwise I try to be very careful with strings. String concatenations and substring operations create a lot of intermediary objects. Be careful as well when strings get resized (even implicitly by some operations). Likewise with DS_ARRAYED_... classes: try to create them with a capacity which is not too big but not too small to avoid too many resizings. I know, it's often not easy to choose the best capacity at creation time! I also try to share objects as much as possible. Of course, when shared, we have to be very careful that these objects don't have their state modified. When that happens, then we need to clone it beforehand. I use a lot of shared objects for the tokens when generating the ASTs in gec. Another technique that I use is to try to reuse objects, rather than giving them back to the GC and creating new ones right after. In gec, I have AST visitor classes to implement the different Eiffel "Degree" compilation passes. They all try to keep the intermediary objects that they need to process a given imput class. They reuse these intermediary objects when processing the next input class, and so forth. I probably use other techniques, but it's already a good start. One thing to remember is that when an implementation technique is not "very" OO, try at least to make it look as if it was OO in the class interface. -- Eric Bezault mailto:ericb@... http://www.gobosoft.com ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ gobo-eiffel-develop mailing list gobo-eiffel-develop@... https://lists.sourceforge.net/lists/listinfo/gobo-eiffel-develop |
|
|
Re: Memory allocation overheadEric Bezault wrote:
> Colin Paul Adams wrote: > >> But I'm interested in what approaches/techniques you use to avoid >> extra allocations. >> >> Do you use free at all? Obviously that won't reduce the number of >> allocations, but it will reduce the number of objects the GC has to >> deal with. Does it give any advantage over assigning Void to the >> reference? >> > > I don't use free. > > >> Expanded attributes? This implies having default_create as a creation >> procedure, whereas the Gobo standard is to have an empty make. >> > > I don't use expanded. > > >> Another possibility is to avoid OO techniques. For instance, I know >> from last weekend's profiling that there is a VERY large number of >> XM_XPATH_UNTYPED_ATOMIC_VALUE objects created by conversion from >> XM_XPATH_STRING_VALUE. Untyped-atomic is XPath's coercible data >> type. Although untyped-atomic does not inherit from xs:string in the >> XPath type hierarchy, I have implemented XM_XPATH_UNTYPED_ATOMIC_VALUE >> as inheriting from XM_XPATH_STRING_VALUE for convenience, as they are >> nearly identical excpet for the coercing behaviour. So a clear saving >> could be made by merging the two classes with a BOOLEAN to indicate >> which type is actualy meant. Then the coercion to xs:string is simply >> implemented by flipping the BOOLEAN. I suspect this is going to be a >> big saving, but it is very anti-OO. >> > > This might be the kind of things I could use indeed. Otherwise I try > to be very careful with strings. String concatenations and substring > operations create a lot of intermediary objects. Be careful as well > when strings get resized (even implicitly by some operations). Likewise > with DS_ARRAYED_... classes: try to create them with a capacity which > is not too big but not too small to avoid too many resizings. I know, > it's often not easy to choose the best capacity at creation time! > I also try to share objects as much as possible. Of course, when > shared, we have to be very careful that these objects don't have > their state modified. When that happens, then we need to clone it > beforehand. I use a lot of shared objects for the tokens when generating > the ASTs in gec. Another technique that I use is to try to reuse > objects, rather than giving them back to the GC and creating new > ones right after. In gec, I have AST visitor classes to implement > the different Eiffel "Degree" compilation passes. They all try to > keep the intermediary objects that they need to process a given > imput class. They reuse these intermediary objects when processing > the next input class, and so forth. I probably use other techniques, > but it's already a good start. One thing to remember is that when an > implementation technique is not "very" OO, try at least to make it > look as if it was OO in the class interface read something comparable. I am not sure if you have a long term plan to use http://gobo-eiffel.wiki.sourceforge.net/, but I would be willing to publish it there. Is that ok? ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ gobo-eiffel-develop mailing list gobo-eiffel-develop@... https://lists.sourceforge.net/lists/listinfo/gobo-eiffel-develop |
|
|
Re: Memory allocation overheadDaniel Tuser wrote:
> This kind of information is very interesting in my opinion. But I never > read something comparable. I am not sure if you have a long term plan to > use http://gobo-eiffel.wiki.sourceforge.net/, but I would be willing to > publish it there. Is that ok? I'm not a big fan of Wikis, so I have no long term plan to use the Gobo project Wiki. I'm fine if Gobo users want to use this space to publish Gobo related information. So feel free to use it. -- Eric Bezault mailto:ericb@... http://www.gobosoft.com ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ gobo-eiffel-develop mailing list gobo-eiffel-develop@... https://lists.sourceforge.net/lists/listinfo/gobo-eiffel-develop |
|
|
Re: Memory allocation overhead>>>>> "Eric" == Eric Bezault <ericb@...> writes:
>> Another possibility is to avoid OO techniques. For instance, I >> know from last weekend's profiling that there is a VERY large >> number of XM_XPATH_UNTYPED_ATOMIC_VALUE objects created by >> conversion from XM_XPATH_STRING_VALUE. Untyped-atomic is >> XPath's coercible data type. Although untyped-atomic does not >> inherit from xs:string in the XPath type hierarchy, I have >> implemented XM_XPATH_UNTYPED_ATOMIC_VALUE as inheriting from >> XM_XPATH_STRING_VALUE for convenience, as they are nearly >> identical excpet for the coercing behaviour. So a clear saving >> could be made by merging the two classes with a BOOLEAN to >> indicate which type is actualy meant. Then the coercion to >> xs:string is simply implemented by flipping the BOOLEAN. I >> suspect this is going to be a big saving, but it is very >> anti-OO. Eric> This might be the kind of things I could use Eric> indeed. It helped, although not as dramatically as eliminating my ARRAY [BOOLEAN]s. Eliminating these 4 arrays takes the time down from 71 minutes to 30 minutes. This change to the untyped atomic values brings it further down to 22 minutes. Eric> Another Eric> technique that I use is to try to reuse objects, rather than Eric> giving them back to the GC and creating new ones right Eric> after. In gec, I have AST visitor classes to implement the Eric> different Eiffel "Degree" compilation passes. They all try Eric> to keep the intermediary objects that they need to process a Eric> given imput class. They reuse these intermediary objects Eric> when processing the next input class, and so forth. Presumably you can only do this because you know when you have finished with an object, and don't have to rely on the garbage collector working out when it can free memory. I don't know how many such cases I am likely to find. If I do, then perhaps there is the possibility of doing memory pooling - allocating a single chunk of memory for a large number of identical objects. I'm not sure how to go about that in Eiffel. I would guess that if I have a reference class C, with default_create as a creation procedure (is that necessary?) then if I create an ARRAY [expanded C] the memory will all be claimed in one call to malloc. Then if I can get a reference to one of these objects (how do I do that in Eiffel?), I can call the real creation procedure (it will need to be exported to the calling class) to initialize the referenced expanded object. I then just have to keep track of free slots for reusing the memory. -- Colin Adams Preston Lancashire ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace _______________________________________________ gobo-eiffel-develop mailing list gobo-eiffel-develop@... https://lists.sourceforge.net/lists/listinfo/gobo-eiffel-develop |
|
|
Re: Memory allocation overhead>>>>> "Eric" == Eric Bezault <ericb@...> writes:
Eric> Otherwise I try to be very careful with Eric> strings. String concatenations and substring operations Eric> create a lot of intermediary objects. Be careful as well Eric> when strings get resized (even implicitly by some Eric> operations). Is there a more efficient way of doing this sort of thing? a_preceding_path := parent.path if STRING_.same_string (a_preceding_path, "/") then Result := STRING_.concat (a_preceding_path, node_name) else Result := STRING_.concat (a_preceding_path, "/") Result := STRING_.appended_string (Result, node_name) Result := STRING_.appended_string (Result, "[") Result := STRING_.appended_string (Result, simple_number) Result := STRING_.appended_string (Result, "]") end -- Colin Adams Preston Lancashire ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace _______________________________________________ gobo-eiffel-develop mailing list gobo-eiffel-develop@... https://lists.sourceforge.net/lists/listinfo/gobo-eiffel-develop |
|
|
Re: Memory allocation overheadHello Colin,
Friday, March 28, 2008, 7:50:06 PM, you wrote: >>>>>> "Eric" == Eric Bezault <ericb@...> writes: CPA> Eric> Otherwise I try to be very careful with CPA> Eric> strings. String concatenations and substring operations CPA> Eric> create a lot of intermediary objects. Be careful as well CPA> Eric> when strings get resized (even implicitly by some CPA> Eric> operations). CPA> Is there a more efficient way of doing this sort of thing? CPA> a_preceding_path := parent.path CPA> if STRING_.same_string (a_preceding_path, "/") then CPA> Result := STRING_.concat (a_preceding_path, node_name) CPA> else CPA> Result := STRING_.concat (a_preceding_path, "/") CPA> Result := STRING_.appended_string (Result, node_name) CPA> Result := STRING_.appended_string (Result, "[") CPA> Result := STRING_.appended_string (Result, simple_number) CPA> Result := STRING_.appended_string (Result, "]") CPA> end Exactly for this reason some java compilers have a special optimization inside the code generator when it detects sequences of strings concatenations. It's just a to important operation. -- Best regards, Lothar mailto:llothar@... ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace _______________________________________________ gobo-eiffel-develop mailing list gobo-eiffel-develop@... https://lists.sourceforge.net/lists/listinfo/gobo-eiffel-develop |
|
|
Re: Memory allocation overhead>>>>> "Lothar" == Lothar Scholz <llothar@...> writes:
Lothar> Exactly for this reason some java compilers have a special Lothar> optimization inside the code generator when it detects Lothar> sequences of strings concatenations. It's just a to Lothar> important operation. so what do they do about it? -- Colin Adams Preston Lancashire ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace _______________________________________________ gobo-eiffel-develop mailing list gobo-eiffel-develop@... https://lists.sourceforge.net/lists/listinfo/gobo-eiffel-develop |
|
|
Re: Memory allocation overhead>>>>> "Colin" == Colin Paul Adams <colin@...> writes:
>>>>> "Eric" == Eric Bezault <ericb@...> writes: Colin> Is there a more efficient way of doing this sort of thing? Perhaps we should have: STRING_.concat(<<once "1", once "2", once "3", once "4">>) Strings are indeed somewhat trouble in Eiffel. Although the default case is safe, immutable strings and syntactic sugar for efficient string concatenations would have been nice. -- Cheers, Berend de Boer ------------------------------------------------------------------------- Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace _______________________________________________ gobo-eiffel-develop mailing list gobo-eiffel-develop@... https://lists.sourceforge.net/lists/listinfo/gobo-eiffel-develop |
|
|
Re: Memory allocation overhead |