|
View:
New views
20 Messages
—
Rating Filter:
Alert me
|
| < Prev | 1 - 2 - 3 | Next > |
|
|
Enable SSA at -O0Hi,
this patch enables SSA at -O0. There are no regressions with the last round of fixes I sent and I also tested gdb testsuite that we don't obstructate debug info. Concerning memory usage, naturally SSA comes at some cost. For Gerald's testcase I get 266673Kb instead of 252286Kb. We however produce smaller assembly: -rw-r--r-- 1 jh jh 683217 Jul 17 13:11 combine2.s -rw-r--r-- 1 jh jh 4960222 Jul 17 16:39 generate-3.4.s to: -rw-r--r-- 1 jh jh 671035 Jul 17 13:32 combine2.s -rw-r--r-- 1 jh jh 4859672 Jul 17 16:40 generate-3.4.s For compilation of GCC modules I get following timmings: real 2m54.967s user 2m52.875s sys 0m2.044s real 2m54.997s user 2m52.871s sys 0m2.124s to: real 2m55.269s user 2m53.195s sys 0m2.084s real 2m55.175s user 2m53.039s sys 0m2.140s For Gerald's testcase it is 4.51s to 4.50s With tuples, we might be slightly better here, since SSA passes gets noticeably cheaper relative to RTL passes and we save some little work at RTL level with the patch. We can also get some improvement by limited DCE. Initial tests was in favour of SSA, but I had to disable more of TER. In particular it would be nice to allow TERing loads into expression when there are no stores in between the statements, this is very common case. The main motivation for the patch is however better maintenability of compiler (ie Richards variable length arrays won't need non-SSA lowering passes) and to get rid of O0 code quality regressions we have relative to pre-tree-SSA compilers. Bootstrapped/regtested i686-linux. I am testing x86_64-linux and PPC-linux but both passed with earlier version of the patch. Honza * cgraph.c (cgraph_add_new_function): Do early local passes. * tree-nrv.c (gate_pass_return_slot): New gate. (pass_nrv): Add the gate. * tree-ssa-coalese.c (hash_ssa_name_by_var, eq_ssa_name_by_var): New functions. (coalesce_ssa_name): Coalesce SSA names. * tree-ssa-live.c (remove_unused_locals): Be more conservative when not optimizing so unused user vars remains visible. * common.opt (flag_tree_ter): Always enable by default. * tree-ssa-ter.c: Include flags.h (is_replaceable_p): Check that locations match; when aliasing is missing be conservative about loads. * tree-optimize.c (gate_init_datastructures): Remove. (pass_init_datastructures): New. * passes.c: Reorder passes so we always go into SSA. Index: cgraph.c =================================================================== --- cgraph.c (revision 137903) +++ cgraph.c (working copy) @@ -1366,7 +1366,7 @@ if (!lowered) tree_lowering_passes (fndecl); bitmap_obstack_initialize (NULL); - if (!gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl)) && optimize) + if (!gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl))) execute_pass_list (pass_early_local_passes.pass.sub); bitmap_obstack_release (NULL); tree_rest_of_compilation (fndecl); Index: tree-nrv.c =================================================================== --- tree-nrv.c (revision 137903) +++ tree-nrv.c (working copy) @@ -226,12 +226,18 @@ return 0; } +static bool +gate_pass_return_slot (void) +{ + return optimize > 0; +} + struct gimple_opt_pass pass_nrv = { { GIMPLE_PASS, "nrv", /* name */ - NULL, /* gate */ + gate_pass_return_slot, /* gate */ tree_nrv, /* execute */ NULL, /* sub */ NULL, /* next */ Index: tree-ssa-coalesce.c =================================================================== --- tree-ssa-coalesce.c (revision 137903) +++ tree-ssa-coalesce.c (working copy) @@ -1297,7 +1297,25 @@ } } +/* Returns a hash code for P. */ +static hashval_t +hash_ssa_name_by_var (const void *p) +{ + const_tree n = (const_tree) p; + return (hashval_t) htab_hash_pointer (SSA_NAME_VAR (n)); +} + +/* Returns nonzero if P1 and P2 are equal. */ + +static int +eq_ssa_name_by_var (const void *p1, const void *p2) +{ + const_tree n1 = (const_tree) p1; + const_tree n2 = (const_tree) p2; + return SSA_NAME_VAR (n1) == SSA_NAME_VAR (n2); +} + /* Reduce the number of copies by coalescing variables in the function. Return a partition map with the resulting coalesces. */ @@ -1310,13 +1328,45 @@ coalesce_list_p cl; bitmap used_in_copies = BITMAP_ALLOC (NULL); var_map map; + unsigned int i; + static htab_t ssa_name_hash; cl = create_coalesce_list (); map = create_outofssa_var_map (cl, used_in_copies); + /* We need to coalesce all names originating same SSA_NAME_VAR + so debug info remains undisturbed. */ + if (!optimize) + { + ssa_name_hash = htab_create (10, hash_ssa_name_by_var, + eq_ssa_name_by_var, NULL); + for (i = 1; i < num_ssa_names; i++) + { + tree a = ssa_name (i); + + if (a && SSA_NAME_VAR (a) && !DECL_ARTIFICIAL (SSA_NAME_VAR (a))) + { + tree *slot = (tree *) htab_find_slot (ssa_name_hash, a, INSERT); + + if (!*slot) + *slot = a; + else + { + add_coalesce (cl, SSA_NAME_VERSION (a), SSA_NAME_VERSION (*slot), + MUST_COALESCE_COST - 1); + bitmap_set_bit (used_in_copies, SSA_NAME_VERSION (a)); + bitmap_set_bit (used_in_copies, SSA_NAME_VERSION (*slot)); + } + } + } + htab_delete (ssa_name_hash); + } + if (dump_file && (dump_flags & TDF_DETAILS)) + dump_var_map (dump_file, map); + /* Don't calculate live ranges for variables not in the coalesce list. */ partition_view_bitmap (map, used_in_copies, true); BITMAP_FREE (used_in_copies); if (num_var_partitions (map) < 1) { Index: tree-ssa-live.c =================================================================== --- tree-ssa-live.c (revision 137903) +++ tree-ssa-live.c (working copy) @@ -582,7 +582,8 @@ var_ann_t ann; bitmap global_unused_vars = NULL; - mark_scope_block_unused (DECL_INITIAL (current_function_decl)); + if (optimize) + mark_scope_block_unused (DECL_INITIAL (current_function_decl)); /* Assume all locals are unused. */ FOR_EACH_REFERENCED_VAR (t, rvi) var_ann (t)->used = false; @@ -661,7 +662,8 @@ if (TREE_CODE (var) == VAR_DECL && is_global_var (var) - && bitmap_bit_p (global_unused_vars, DECL_UID (var))) + && bitmap_bit_p (global_unused_vars, DECL_UID (var)) + && (optimize || DECL_ARTIFICIAL (var))) *cell = TREE_CHAIN (*cell); else cell = &TREE_CHAIN (*cell); @@ -681,9 +683,11 @@ && TREE_CODE (t) != RESULT_DECL && !(ann = var_ann (t))->used && !ann->symbol_mem_tag - && !TREE_ADDRESSABLE (t)) + && !TREE_ADDRESSABLE (t) + && (optimize || DECL_ARTIFICIAL (t))) remove_referenced_var (t); - remove_unused_scope_block_p (DECL_INITIAL (current_function_decl)); + if (optimize) + remove_unused_scope_block_p (DECL_INITIAL (current_function_decl)); } Index: common.opt =================================================================== --- common.opt (revision 137903) +++ common.opt (working copy) @@ -1157,7 +1157,7 @@ Perform scalar replacement of aggregates ftree-ter -Common Report Var(flag_tree_ter) Optimization +Common Report Var(flag_tree_ter) Init(1) Optimization Replace temporary expressions in the SSA->normal pass ftree-lrs Index: tree-ssa-ter.c =================================================================== --- tree-ssa-ter.c (revision 137903) +++ tree-ssa-ter.c (working copy) @@ -30,6 +30,7 @@ #include "tree-flow.h" #include "tree-dump.h" #include "tree-ssa-live.h" +#include "flags.h" /* Temporary Expression Replacement (TER) @@ -364,6 +365,8 @@ tree call_expr; use_operand_p use_p; tree def, use_stmt; + location_t locus1, locus2; + tree block1, block2; /* Only consider modify stmts. */ if (TREE_CODE (stmt) != GIMPLE_MODIFY_STMT) @@ -386,6 +389,36 @@ if (bb_for_stmt (use_stmt) != bb_for_stmt (stmt)) return false; + if (GIMPLE_STMT_P (stmt)) + { + locus1 = GIMPLE_STMT_LOCUS (stmt); + block1 = GIMPLE_STMT_BLOCK (stmt); + } + else + { + locus1 = *EXPR_LOCUS (stmt); + block1 = TREE_BLOCK (stmt); + } + if (GIMPLE_STMT_P (use_stmt)) + { + locus2 = GIMPLE_STMT_LOCUS (use_stmt); + block2 = GIMPLE_STMT_BLOCK (use_stmt); + } + if (TREE_CODE (use_stmt) == PHI_NODE) + { + locus2 = 0; + block2 = NULL_TREE; + } + else + { + locus2 = *EXPR_LOCUS (use_stmt); + block2 = TREE_BLOCK (use_stmt); + } + + if (!optimize + && ((locus1 && locus1 != locus2) || (block1 && block1 != block2))) + return false; + /* Used in this block, but at the TOP of the block, not the end. */ if (TREE_CODE (use_stmt) == PHI_NODE) return false; @@ -394,6 +427,10 @@ if (!(ZERO_SSA_OPERANDS (stmt, SSA_OP_VDEF))) return false; + /* Without alias info we can't move around loads. */ + if (stmt_ann (stmt)->references_memory && !optimize) + return false; + /* Float expressions must go through memory if float-store is on. */ if (flag_float_store && FLOAT_TYPE_P (TREE_TYPE (GENERIC_TREE_OPERAND (stmt, 1)))) @@ -412,7 +449,6 @@ /* Leave any stmt with volatile operands alone as well. */ if (stmt_ann (stmt)->has_volatile_ops) return false; - return true; } Index: tree-optimize.c =================================================================== --- tree-optimize.c (revision 137903) +++ tree-optimize.c (working copy) @@ -341,20 +341,12 @@ return 0; } -/* Gate: initialize or not the SSA datastructures. */ - -static bool -gate_init_datastructures (void) -{ - return (optimize >= 1); -} - struct gimple_opt_pass pass_init_datastructures = { { GIMPLE_PASS, NULL, /* name */ - gate_init_datastructures, /* gate */ + NULL, /* gate */ execute_init_datastructures, /* execute */ NULL, /* sub */ NULL, /* next */ Index: passes.c =================================================================== --- passes.c (revision 137903) +++ passes.c (working copy) @@ -542,12 +542,13 @@ NEXT_PASS (pass_cleanup_cfg); NEXT_PASS (pass_init_datastructures); NEXT_PASS (pass_expand_omp); + + NEXT_PASS (pass_referenced_vars); + NEXT_PASS (pass_reset_cc_flags); + NEXT_PASS (pass_build_ssa); NEXT_PASS (pass_all_early_optimizations); { struct opt_pass **p = &pass_all_early_optimizations.pass.sub; - NEXT_PASS (pass_referenced_vars); - NEXT_PASS (pass_reset_cc_flags); - NEXT_PASS (pass_build_ssa); NEXT_PASS (pass_early_warn_uninitialized); NEXT_PASS (pass_rebuild_cgraph_edges); NEXT_PASS (pass_early_inline); @@ -574,8 +575,8 @@ NEXT_PASS (pass_tail_recursion); NEXT_PASS (pass_convert_switch); NEXT_PASS (pass_profile); - NEXT_PASS (pass_release_ssa_names); } + NEXT_PASS (pass_release_ssa_names); NEXT_PASS (pass_rebuild_cgraph_edges); } NEXT_PASS (pass_ipa_increase_alignment); @@ -712,11 +713,12 @@ NEXT_PASS (pass_tail_calls); NEXT_PASS (pass_rename_ssa_copies); NEXT_PASS (pass_uncprop); - NEXT_PASS (pass_del_ssa); - NEXT_PASS (pass_nrv); - NEXT_PASS (pass_mark_used_blocks); - NEXT_PASS (pass_cleanup_cfg_post_optimizing); } + NEXT_PASS (pass_del_ssa); + NEXT_PASS (pass_nrv); + NEXT_PASS (pass_mark_used_blocks); + NEXT_PASS (pass_cleanup_cfg_post_optimizing); + NEXT_PASS (pass_warn_function_noreturn); NEXT_PASS (pass_free_datastructures); NEXT_PASS (pass_mudflap_2); |
|
|
Re: Enable SSA at -O0Jan Hubicka wrote:
> Hi, > this patch enables SSA at -O0. There are no regressions with the last round of > fixes I sent and I also tested gdb testsuite that we don't obstructate debug > info. > I wonder how much that actually tell us about debug info in large complex cases... right now -O0 is the only saving grace we have, and as much as I want to see the SSA path at -O0, I'd also hate for us to end up with numerous cases where we no longer have some basic info. > Bootstrapped/regtested i686-linux. I am testing x86_64-linux and PPC-linux but both > passed with earlier version of the patch. > > I presume this is being targeted for 4.4? Andrew |
|
|
Re: Enable SSA at -O0> Jan Hubicka wrote:
> >Hi, > >this patch enables SSA at -O0. There are no regressions with the last > >round of > >fixes I sent and I also tested gdb testsuite that we don't obstructate > >debug > >info. > > > I wonder how much that actually tell us about debug info in large > complex cases... Well, I don't know either... I didn't run into debug info quality problems during development and I tried to use this as build compiler. > > right now -O0 is the only saving grace we have, and as much as I want to > see the SSA path at -O0, I'd also hate for us to end up with numerous > cases where we no longer have some basic info. I've disabled TERing across statements, just within one statement or putting statements with no debug info into statement with debug info. Also I re-coalesce all user variables into single variable. So in theory we should do no worse then we did before tree-SSA as trees fed into expander are now only closer to pre-tree-SSA form. > > >Bootstrapped/regtested i686-linux. I am testing x86_64-linux and > >PPC-linux but both > >passed with earlier version of the patch. > > > > > I presume this is being targeted for 4.4? I guess it depends on overall consensus. I started with this patch as simple hack as Richard needed it for the lowering passes. It turned out to need a lot of fixing here and there and enabling unit-at-a-time by default, so it ended up being more work. If for nothing, this patch was useful to fix a lot of latent or semi-latent problems ;)) and I think the results are not discougrating. Seeing it from the other POV, this patch would enforce us fixing debug info quality issues in SSA that should improve debug quality when optimizing. Honza > > Andrew |
|
|
Re: Enable SSA at -O0On Thu, 17 Jul 2008, Jan Hubicka wrote:
> > Jan Hubicka wrote: > > >Hi, > > >this patch enables SSA at -O0. There are no regressions with the last > > >round of > > >fixes I sent and I also tested gdb testsuite that we don't obstructate > > >debug > > >info. > > > > > I wonder how much that actually tell us about debug info in large > > complex cases... > > Well, I don't know either... I didn't run into debug info quality > problems during development and I tried to use this as build compiler. Well have to bite the bullet and see, and eventually fix some fallout. > > right now -O0 is the only saving grace we have, and as much as I want to > > see the SSA path at -O0, I'd also hate for us to end up with numerous > > cases where we no longer have some basic info. > > I've disabled TERing across statements, just within one statement or > putting statements with no debug info into statement with debug info. > Also I re-coalesce all user variables into single variable. So in > theory we should do no worse then we did before tree-SSA as trees fed > into expander are now only closer to pre-tree-SSA form. > > > > >Bootstrapped/regtested i686-linux. I am testing x86_64-linux and > > >PPC-linux but both > > >passed with earlier version of the patch. > > > > > > > > I presume this is being targeted for 4.4? > > I guess it depends on overall consensus. I started with this patch as > simple hack as Richard needed it for the lowering passes. It turned out > to need a lot of fixing here and there and enabling unit-at-a-time by > default, so it ended up being more work. If for nothing, this patch was > useful to fix a lot of latent or semi-latent problems ;)) and I think > the results are not discougrating. > Seeing it from the other POV, this patch would enforce us fixing debug > info quality issues in SSA that should improve debug quality when > optimizing. It also should help us to "fix" expansion to not require TER but use the SSA form to do initial instruction selection. This will especially become important with tuples to avoid going back to GENERIC trees. Of course we want to only have one expansion code which means we need SSA at -O0 anyway ... Richard. |
|
|
Re: Enable SSA at -O0Jan Hubicka wrote:
> > Well, I don't know either... I didn't run into debug info quality > problems during development and I tried to use this as build compiler. > thats a good sign anyway! > >> right now -O0 is the only saving grace we have, and as much as I want to >> see the SSA path at -O0, I'd also hate for us to end up with numerous >> cases where we no longer have some basic info. >> > > I've disabled TERing across statements, just within one statement or > putting statements with no debug info into statement with debug info. > Also I re-coalesce all user variables into single variable. So in > theory we should do no worse then we did before tree-SSA as trees fed > into expander are now only closer to pre-tree-SSA form. > right. And as long as we don't do any SSA optimizations which move code around, we should never have to worry about overlapping live ranges which play havoc with debug output. in theory, it should be OK, but the specter of compiler temps without debug info finding their way in where they shouldn't be lurks at the window... we certainly have to be careful what ssa passes are run. >>> Bootstrapped/regtested i686-linux. I am testing x86_64-linux and >>> PPC-linux but both >>> passed with earlier version of the patch. >>> >>> >>> >> I presume this is being targeted for 4.4? >> > > I guess it depends on overall consensus. I started with this patch as > simple hack as Richard needed it for the lowering passes. It turned out > to need a lot of fixing here and there and enabling unit-at-a-time by > default, so it ended up being more work. If for nothing, this patch was > useful to fix a lot of latent or semi-latent problems ;)) and I think > the results are not discougrating. > Seeing it from the other POV, this patch would enforce us fixing debug > info quality issues in SSA that should improve debug quality when > optimizing. > comparison of the 'user quality' of debug info issued between different runs of the compiler. It should also identify what isn't correct and where. At least that is the plan :-) This would be another useful thing to run through it, -O0 with SSA and -O0 without SSA and see how similar it is. And also as a test of the comparator mechanism itself. We might then be able to detect that we have a problem we need to look at :-) Andrew |
|
|
Re: Enable SSA at -O0Richard Guenther wrote:
>> >> I guess it depends on overall consensus. I started with this patch as >> simple hack as Richard needed it for the lowering passes. It turned out >> to need a lot of fixing here and there and enabling unit-at-a-time by >> default, so it ended up being more work. If for nothing, this patch was >> useful to fix a lot of latent or semi-latent problems ;)) and I think >> the results are not discougrating. >> Seeing it from the other POV, this patch would enforce us fixing debug >> info quality issues in SSA that should improve debug quality when >> optimizing. >> > > It also should help us to "fix" expansion to not require TER but > use the SSA form to do initial instruction selection. This will > especially become important with tuples to avoid going back to > GENERIC trees. Of course we want to only have one expansion code > which means we need SSA at -O0 anyway ... > > goal, and this is certainly a step in that direction. Using SSA on the -O0 path is inevitable. Andrew |
|
|
Re: Enable SSA at -O0On Thu, Jul 17, 2008 at 11:11:00AM -0400, Andrew MacLeod wrote:
> Jan Hubicka wrote: > >Hi, > >this patch enables SSA at -O0. There are no regressions with the last > >round of > >fixes I sent and I also tested gdb testsuite that we don't obstructate > >debug > >info. > > > I wonder how much that actually tell us about debug info in large > complex cases... > > right now -O0 is the only saving grace we have, and as much as I want to > see the SSA path at -O0, I'd also hate for us to end up with numerous > cases where we no longer have some basic info. Even at -O0 recent GCCs screw up the debuginfo badly, see PR debug/34037 or PR debug/36690. Jakub |
|
|
Re: Enable SSA at -O0> + /* We need to coalesce all names originating same SSA_NAME_VAR > + so debug info remains undisturbed. */ > + if (!optimize) Would it make sense to do this at optimize <= 1? Paolo > + { > + ssa_name_hash = htab_create (10, hash_ssa_name_by_var, > + eq_ssa_name_by_var, NULL); > + for (i = 1; i < num_ssa_names; i++) > + { > + tree a = ssa_name (i); > + > + if (a && SSA_NAME_VAR (a) && !DECL_ARTIFICIAL (SSA_NAME_VAR (a))) > + { > + tree *slot = (tree *) htab_find_slot (ssa_name_hash, a, INSERT); > + > + if (!*slot) > + *slot = a; > + else > + { > + add_coalesce (cl, SSA_NAME_VERSION (a), SSA_NAME_VERSION (*slot), > + MUST_COALESCE_COST - 1); > + bitmap_set_bit (used_in_copies, SSA_NAME_VERSION (a)); > + bitmap_set_bit (used_in_copies, SSA_NAME_VERSION (*slot)); > + } > + } > + } > + htab_delete (ssa_name_hash); > + } > + if (dump_file && (dump_flags & TDF_DETAILS)) > + dump_var_map (dump_file, map); |
|
|
Re: Enable SSA at -O0>
> >+ /* We need to coalesce all names originating same SSA_NAME_VAR > >+ so debug info remains undisturbed. */ > >+ if (!optimize) > > Would it make sense to do this at optimize <= 1? I don't know. Variable splitting is quite standard technique in register allocation. I am not sure how serious we are about debug info quality difference in between -O1 and -O2 Honza |
|
|
Re: Enable SSA at -O0Jan Hubicka wrote:
>>> + /* We need to coalesce all names originating same SSA_NAME_VAR >>> + so debug info remains undisturbed. */ >>> + if (!optimize) >> Would it make sense to do this at optimize <= 1? > > I don't know. Variable splitting is quite standard technique in register > allocation. But, if you mean flag_web, we don't do it unless -funroll-loops or -fpeel-loops, on the ground that the splitting we get from out-of-SSA should be enough. So, enabling this kind of coalescing at -O1 would indeed prevent the compiler from doing splitting. > I am not sure how serious we are about debug info quality > difference in between -O1 and -O2 Some people (especially the AdaCore guys) have been constantly begging for better (usable, at least) -O1 debug info, and FWIW I second that. Paolo |
|
|
Re: Enable SSA at -O0On Fri, Jul 18, 2008 at 5:52 PM, Paolo Bonzini <bonzini@...> wrote:
> Jan Hubicka wrote: >>>> >>>> + /* We need to coalesce all names originating same SSA_NAME_VAR >>>> + so debug info remains undisturbed. */ >>>> + if (!optimize) >>> >>> Would it make sense to do this at optimize <= 1? >> >> I don't know. Variable splitting is quite standard technique in register >> allocation. > > But, if you mean flag_web, we don't do it unless -funroll-loops or > -fpeel-loops, on the ground that the splitting we get from out-of-SSA should > be enough. So, enabling this kind of coalescing at -O1 would indeed prevent > the compiler from doing splitting. > >> I am not sure how serious we are about debug info quality >> difference in between -O1 and -O2 > > Some people (especially the AdaCore guys) have been constantly begging for > better (usable, at least) -O1 debug info, and FWIW I second that. I'd rather do a -Og instead. -O1 already badly messes with debuggability. For -Og I would chose -Os inlining heuristics for example and only do basic propagation and dead code removal but no memory optimization or things like SRA or IVOPTs. Just ccp, copyprop, dce and inline accessor style things to make C++ usable. Richard. |
|
|
Re: Enable SSA at -O0> I'd rather do a -Og instead. -O1 already badly messes with debuggability. > For -Og I would chose -Os inlining heuristics for example and only do > basic propagation and dead code removal but no memory optimization or > things like SRA or IVOPTs. Just ccp, copyprop, dce and inline accessor style > things to make C++ usable. But since in 4.4 there is no more any reason to use -O1 when alias analysis blows up, why couldn't -O1 *be* -Og? Paolo |
|
|
Re: Enable SSA at -O0On Fri, 18 Jul 2008, Paolo Bonzini wrote:
> > > I'd rather do a -Og instead. -O1 already badly messes with debuggability. > > For -Og I would chose -Os inlining heuristics for example and only do > > basic propagation and dead code removal but no memory optimization or > > things like SRA or IVOPTs. Just ccp, copyprop, dce and inline accessor > > style > > things to make C++ usable. > > But since in 4.4 there is no more any reason to use -O1 when alias analysis > blows up, why couldn't -O1 *be* -Og? Well, do you suggest to "drop" -O1 in favor of simply -O2 and "re-introduce" it with -Og behavior? I think this is more confusing than adding -Og. But we could certainly think of taking out some optimizations from -O1 that hurt debugging, but I wouldn't go as far as I suggested for -Og. -Og could be nearly as good as -O0 with the difference that 'set' in gdb will work even less than with -O0. Richard. |
|
|
Re: Enable SSA at -O0>> But since in 4.4 there is no more any reason to use -O1 when alias analysis >> blows up, why couldn't -O1 *be* -Og? > > Well, do you suggest to "drop" -O1 in favor of simply -O2 and > "re-introduce" it with -Og behavior? Yes, but just because... > I think this is more confusing > than adding -Og. But we could certainly think of taking out some > optimizations from -O1 that hurt debugging, but I wouldn't go as far > as I suggested for -Og. ... I agree with the AdaCore guys' opinion was that in 3.x -O1 was -Og in practice, and it was then broken in 4.x. So the non-debuggability of -O1 in 4.x is in some sense a regression. If we can now declare to have a hold on tree-level compile-time problems, and that there is no need to keep -O1 as a workaround for those, it's time to fix that regression. Paolo |
|
|
Re: Enable SSA at -O0On Fri, 18 Jul 2008, Paolo Bonzini wrote:
> > > > But since in 4.4 there is no more any reason to use -O1 when alias > > > analysis > > > blows up, why couldn't -O1 *be* -Og? > > > > Well, do you suggest to "drop" -O1 in favor of simply -O2 and > > "re-introduce" it with -Og behavior? > > Yes, but just because... > > > I think this is more confusing > > than adding -Og. But we could certainly think of taking out some > > optimizations from -O1 that hurt debugging, but I wouldn't go as far > > as I suggested for -Og. > > ... I agree with the AdaCore guys' opinion was that in 3.x -O1 was -Og in > practice, and it was then broken in 4.x. So the non-debuggability of -O1 in > 4.x is in some sense a regression. If we can now declare to have a hold on > tree-level compile-time problems, and that there is no need to keep -O1 as a > workaround for those, it's time to fix that regression. Ok. Let's see what others think. The most obvious candidates for disabling at -O1 because of debugging are SRA and IVOPTs. Inlining is something else we can throttle down to -Os measures. I doubt the result will be much better for debugging, but for the remaining stuff we could at least try to fix some obvious oversights in the passes. Richard. |
|
|
Re: Enable SSA at -O0Hi,
to bring little more confussion into -O1 wrt -Og discussion, I should note that the variable tracking seems to do (relatively) resonable job here. I had to enforce coalescing to handle somewhat extreme testcases such as: t() { int i=0; /*breakpoint here*/ for (i=0; .... } where the original copy of i initialized to 0 didn't get properly tracked and the early breakpoint got wrong value. This is something that I would expect to break at O0 anyway, since the initial assignment is dead. So in this particular case I would rather look into ways improving register tracking rather than disabling splitting of variables. We've heard a lot about this on GCC summit so things might improve soon ;) Honza |
|
|
Re: Enable SSA at -O0On 7/17/08 7:52 AM, Jan Hubicka wrote:
> * cgraph.c (cgraph_add_new_function): Do early local passes. > * tree-nrv.c (gate_pass_return_slot): New gate. > (pass_nrv): Add the gate. > * tree-ssa-coalese.c (hash_ssa_name_by_var, eq_ssa_name_by_var): New > functions. > (coalesce_ssa_name): Coalesce SSA names. > * tree-ssa-live.c (remove_unused_locals): Be more conservative when > not optimizing so unused user vars remains visible. > * common.opt (flag_tree_ter): Always enable by default. > * tree-ssa-ter.c: Include flags.h > (is_replaceable_p): Check that locations match; when aliasing is missing > be conservative about loads. > * tree-optimize.c (gate_init_datastructures): Remove. > (pass_init_datastructures): New. > * passes.c: Reorder passes so we always go into SSA. This is OK with me. If we have debugging information regressions, then we can deal with those separately. Building SSA at -O0 will simplify very many things, so I think we should enable it before 4.4 is released. Diego. |
|
|
Re: Enable SSA at -O0 |