Enable SSA at -O0

View: New views
20 Messages — Rating Filter:   Alert me  
< Prev | 1 - 2 - 3 | Next >

Enable SSA at -O0

by Jan Hubicka :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi,
this patch enables SSA at -O0.  There are no regressions with the last round of
fixes I sent and I also tested gdb testsuite that we don't obstructate debug
info.

Concerning memory usage, naturally SSA comes at some cost. For Gerald's testcase I get
266673Kb instead of 252286Kb.  We however produce smaller assembly:
-rw-r--r-- 1 jh jh  683217 Jul 17 13:11 combine2.s
-rw-r--r-- 1 jh jh 4960222 Jul 17 16:39 generate-3.4.s
to:
-rw-r--r-- 1 jh jh  671035 Jul 17 13:32 combine2.s
-rw-r--r-- 1 jh jh 4859672 Jul 17 16:40 generate-3.4.s

For compilation of GCC modules I get following timmings:

real    2m54.967s
user    2m52.875s
sys     0m2.044s

real    2m54.997s
user    2m52.871s
sys     0m2.124s

to:

real    2m55.269s
user    2m53.195s
sys     0m2.084s

real    2m55.175s
user    2m53.039s
sys     0m2.140s

For Gerald's testcase it is 4.51s to 4.50s

With tuples, we might be slightly better here, since SSA passes gets noticeably
cheaper relative to RTL passes and we save some little work at RTL level with
the patch.  We can also get some improvement by limited DCE. Initial
tests was in favour of SSA, but I had to disable more of TER.  In
particular it would be nice to allow TERing loads into expression when
there are no stores in between the statements, this is very common case.

The main motivation for the patch is however better maintenability of compiler
(ie Richards variable length arrays won't need non-SSA lowering passes) and to
get rid of O0 code quality regressions we have relative to pre-tree-SSA
compilers.

Bootstrapped/regtested i686-linux.  I am testing x86_64-linux and PPC-linux but both
passed with earlier version of the patch.

Honza

        * cgraph.c (cgraph_add_new_function): Do early local passes.
        * tree-nrv.c (gate_pass_return_slot): New gate.
        (pass_nrv): Add the gate.
        * tree-ssa-coalese.c (hash_ssa_name_by_var, eq_ssa_name_by_var): New
        functions.
        (coalesce_ssa_name): Coalesce SSA names.
        * tree-ssa-live.c (remove_unused_locals): Be more conservative when
        not optimizing so unused user vars remains visible.
        * common.opt (flag_tree_ter): Always enable by default.
        * tree-ssa-ter.c: Include flags.h
        (is_replaceable_p): Check that locations match; when aliasing is missing
        be conservative about loads.
        * tree-optimize.c (gate_init_datastructures): Remove.
        (pass_init_datastructures): New.
        * passes.c: Reorder passes so we always go into SSA.

Index: cgraph.c
===================================================================
--- cgraph.c (revision 137903)
+++ cgraph.c (working copy)
@@ -1366,7 +1366,7 @@
  if (!lowered)
           tree_lowering_passes (fndecl);
  bitmap_obstack_initialize (NULL);
- if (!gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl)) && optimize)
+ if (!gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl)))
   execute_pass_list (pass_early_local_passes.pass.sub);
  bitmap_obstack_release (NULL);
  tree_rest_of_compilation (fndecl);
Index: tree-nrv.c
===================================================================
--- tree-nrv.c (revision 137903)
+++ tree-nrv.c (working copy)
@@ -226,12 +226,18 @@
   return 0;
 }
 
+static bool
+gate_pass_return_slot (void)
+{
+  return optimize > 0;
+}
+
 struct gimple_opt_pass pass_nrv =
 {
  {
   GIMPLE_PASS,
   "nrv", /* name */
-  NULL, /* gate */
+  gate_pass_return_slot, /* gate */
   tree_nrv, /* execute */
   NULL, /* sub */
   NULL, /* next */
Index: tree-ssa-coalesce.c
===================================================================
--- tree-ssa-coalesce.c (revision 137903)
+++ tree-ssa-coalesce.c (working copy)
@@ -1297,7 +1297,25 @@
     }
 }
 
+/* Returns a hash code for P.  */
 
+static hashval_t
+hash_ssa_name_by_var (const void *p)
+{
+  const_tree n = (const_tree) p;
+  return (hashval_t) htab_hash_pointer (SSA_NAME_VAR (n));
+}
+
+/* Returns nonzero if P1 and P2 are equal.  */
+
+static int
+eq_ssa_name_by_var (const void *p1, const void *p2)
+{
+  const_tree n1 = (const_tree) p1;
+  const_tree n2 = (const_tree) p2;
+  return SSA_NAME_VAR (n1) == SSA_NAME_VAR (n2);
+}
+
 /* Reduce the number of copies by coalescing variables in the function.  Return
    a partition map with the resulting coalesces.  */
 
@@ -1310,13 +1328,45 @@
   coalesce_list_p cl;
   bitmap used_in_copies = BITMAP_ALLOC (NULL);
   var_map map;
+  unsigned int i;
+  static htab_t ssa_name_hash;
 
   cl = create_coalesce_list ();
   map = create_outofssa_var_map (cl, used_in_copies);
 
+  /* We need to coalesce all names originating same SSA_NAME_VAR
+     so debug info remains undisturbed.  */
+  if (!optimize)
+    {
+      ssa_name_hash = htab_create (10, hash_ssa_name_by_var,
+         eq_ssa_name_by_var, NULL);
+      for (i = 1; i < num_ssa_names; i++)
+ {
+  tree a = ssa_name (i);
+
+  if (a && SSA_NAME_VAR (a) && !DECL_ARTIFICIAL (SSA_NAME_VAR (a)))
+    {
+      tree *slot = (tree *) htab_find_slot (ssa_name_hash, a, INSERT);
+
+      if (!*slot)
+ *slot = a;
+      else
+ {
+  add_coalesce (cl, SSA_NAME_VERSION (a), SSA_NAME_VERSION (*slot),
+ MUST_COALESCE_COST - 1);
+  bitmap_set_bit (used_in_copies, SSA_NAME_VERSION (a));
+  bitmap_set_bit (used_in_copies, SSA_NAME_VERSION (*slot));
+ }
+    }
+ }
+      htab_delete (ssa_name_hash);
+    }
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    dump_var_map (dump_file, map);
+
   /* Don't calculate live ranges for variables not in the coalesce list.  */
   partition_view_bitmap (map, used_in_copies, true);
   BITMAP_FREE (used_in_copies);
 
   if (num_var_partitions (map) < 1)
     {
Index: tree-ssa-live.c
===================================================================
--- tree-ssa-live.c (revision 137903)
+++ tree-ssa-live.c (working copy)
@@ -582,7 +582,8 @@
   var_ann_t ann;
   bitmap global_unused_vars = NULL;
 
-  mark_scope_block_unused (DECL_INITIAL (current_function_decl));
+  if (optimize)
+    mark_scope_block_unused (DECL_INITIAL (current_function_decl));
   /* Assume all locals are unused.  */
   FOR_EACH_REFERENCED_VAR (t, rvi)
     var_ann (t)->used = false;
@@ -661,7 +662,8 @@
 
   if (TREE_CODE (var) == VAR_DECL
       && is_global_var (var)
-      && bitmap_bit_p (global_unused_vars, DECL_UID (var)))
+      && bitmap_bit_p (global_unused_vars, DECL_UID (var))
+      && (optimize || DECL_ARTIFICIAL (var)))
     *cell = TREE_CHAIN (*cell);
   else
     cell = &TREE_CHAIN (*cell);
@@ -681,9 +683,11 @@
  && TREE_CODE (t) != RESULT_DECL
  && !(ann = var_ann (t))->used
  && !ann->symbol_mem_tag
- && !TREE_ADDRESSABLE (t))
+ && !TREE_ADDRESSABLE (t)
+ && (optimize || DECL_ARTIFICIAL (t)))
       remove_referenced_var (t);
-  remove_unused_scope_block_p (DECL_INITIAL (current_function_decl));
+  if (optimize)
+    remove_unused_scope_block_p (DECL_INITIAL (current_function_decl));
 }
 
 
Index: common.opt
===================================================================
--- common.opt (revision 137903)
+++ common.opt (working copy)
@@ -1157,7 +1157,7 @@
 Perform scalar replacement of aggregates
 
 ftree-ter
-Common Report Var(flag_tree_ter) Optimization
+Common Report Var(flag_tree_ter) Init(1) Optimization
 Replace temporary expressions in the SSA->normal pass
 
 ftree-lrs
Index: tree-ssa-ter.c
===================================================================
--- tree-ssa-ter.c (revision 137903)
+++ tree-ssa-ter.c (working copy)
@@ -30,6 +30,7 @@
 #include "tree-flow.h"
 #include "tree-dump.h"
 #include "tree-ssa-live.h"
+#include "flags.h"
 
 
 /* Temporary Expression Replacement (TER)
@@ -364,6 +365,8 @@
   tree call_expr;
   use_operand_p use_p;
   tree def, use_stmt;
+  location_t locus1, locus2;
+  tree block1, block2;
 
   /* Only consider modify stmts.  */
   if (TREE_CODE (stmt) != GIMPLE_MODIFY_STMT)
@@ -386,6 +389,36 @@
   if (bb_for_stmt (use_stmt) != bb_for_stmt (stmt))
     return false;
 
+  if (GIMPLE_STMT_P (stmt))
+    {
+      locus1 = GIMPLE_STMT_LOCUS (stmt);
+      block1 = GIMPLE_STMT_BLOCK (stmt);
+    }
+  else
+    {
+      locus1 = *EXPR_LOCUS (stmt);
+      block1 = TREE_BLOCK (stmt);
+    }
+  if (GIMPLE_STMT_P (use_stmt))
+    {
+      locus2 = GIMPLE_STMT_LOCUS (use_stmt);
+      block2 = GIMPLE_STMT_BLOCK (use_stmt);
+    }
+  if (TREE_CODE (use_stmt) == PHI_NODE)
+    {
+      locus2 = 0;
+      block2 = NULL_TREE;
+    }
+  else
+    {
+      locus2 = *EXPR_LOCUS (use_stmt);
+      block2 = TREE_BLOCK (use_stmt);
+    }
+
+  if (!optimize
+      && ((locus1 && locus1 != locus2) || (block1 && block1 != block2)))
+    return false;
+
   /* Used in this block, but at the TOP of the block, not the end.  */
   if (TREE_CODE (use_stmt) == PHI_NODE)
     return false;
@@ -394,6 +427,10 @@
   if (!(ZERO_SSA_OPERANDS (stmt, SSA_OP_VDEF)))
     return false;
 
+  /* Without alias info we can't move around loads.  */
+  if (stmt_ann (stmt)->references_memory && !optimize)
+    return false;
+
   /* Float expressions must go through memory if float-store is on.  */
   if (flag_float_store
       && FLOAT_TYPE_P (TREE_TYPE (GENERIC_TREE_OPERAND (stmt, 1))))
@@ -412,7 +449,6 @@
   /* Leave any stmt with volatile operands alone as well.  */
   if (stmt_ann (stmt)->has_volatile_ops)
     return false;
-  
 
   return true;
 }
Index: tree-optimize.c
===================================================================
--- tree-optimize.c (revision 137903)
+++ tree-optimize.c (working copy)
@@ -341,20 +341,12 @@
   return 0;
 }
 
-/* Gate: initialize or not the SSA datastructures.  */
-
-static bool
-gate_init_datastructures (void)
-{
-  return (optimize >= 1);
-}
-
 struct gimple_opt_pass pass_init_datastructures =
 {
  {
   GIMPLE_PASS,
   NULL, /* name */
-  gate_init_datastructures, /* gate */
+  NULL, /* gate */
   execute_init_datastructures, /* execute */
   NULL, /* sub */
   NULL, /* next */
Index: passes.c
===================================================================
--- passes.c (revision 137903)
+++ passes.c (working copy)
@@ -542,12 +542,13 @@
       NEXT_PASS (pass_cleanup_cfg);
       NEXT_PASS (pass_init_datastructures);
       NEXT_PASS (pass_expand_omp);
+
+      NEXT_PASS (pass_referenced_vars);
+      NEXT_PASS (pass_reset_cc_flags);
+      NEXT_PASS (pass_build_ssa);
       NEXT_PASS (pass_all_early_optimizations);
  {
   struct opt_pass **p = &pass_all_early_optimizations.pass.sub;
-  NEXT_PASS (pass_referenced_vars);
-  NEXT_PASS (pass_reset_cc_flags);
-  NEXT_PASS (pass_build_ssa);
   NEXT_PASS (pass_early_warn_uninitialized);
   NEXT_PASS (pass_rebuild_cgraph_edges);
   NEXT_PASS (pass_early_inline);
@@ -574,8 +575,8 @@
   NEXT_PASS (pass_tail_recursion);
   NEXT_PASS (pass_convert_switch);
           NEXT_PASS (pass_profile);
-  NEXT_PASS (pass_release_ssa_names);
  }
+      NEXT_PASS (pass_release_ssa_names);
       NEXT_PASS (pass_rebuild_cgraph_edges);
     }
   NEXT_PASS (pass_ipa_increase_alignment);
@@ -712,11 +713,12 @@
       NEXT_PASS (pass_tail_calls);
       NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_uncprop);
-      NEXT_PASS (pass_del_ssa);
-      NEXT_PASS (pass_nrv);
-      NEXT_PASS (pass_mark_used_blocks);
-      NEXT_PASS (pass_cleanup_cfg_post_optimizing);
     }
+  NEXT_PASS (pass_del_ssa);
+  NEXT_PASS (pass_nrv);
+  NEXT_PASS (pass_mark_used_blocks);
+  NEXT_PASS (pass_cleanup_cfg_post_optimizing);
+
   NEXT_PASS (pass_warn_function_noreturn);
   NEXT_PASS (pass_free_datastructures);
   NEXT_PASS (pass_mudflap_2);

Re: Enable SSA at -O0

by Andrew MacLeod :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Jan Hubicka wrote:
> Hi,
> this patch enables SSA at -O0.  There are no regressions with the last round of
> fixes I sent and I also tested gdb testsuite that we don't obstructate debug
> info.
>  
I wonder how much that actually tell us about debug info in large
complex cases...

right now -O0 is the only saving grace we have, and as much as I want to
see the SSA path at -O0, I'd also hate for us to end up with numerous
cases where we no longer have some basic info.

> Bootstrapped/regtested i686-linux.  I am testing x86_64-linux and PPC-linux but both
> passed with earlier version of the patch.
>
>  
I presume this is being targeted for 4.4?

Andrew

Re: Enable SSA at -O0

by Jan Hubicka :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> Jan Hubicka wrote:
> >Hi,
> >this patch enables SSA at -O0.  There are no regressions with the last
> >round of
> >fixes I sent and I also tested gdb testsuite that we don't obstructate
> >debug
> >info.
> >  
> I wonder how much that actually tell us about debug info in large
> complex cases...

Well, I don't know either...  I didn't run into debug info quality
problems during development and I tried to use this as build compiler.

>
> right now -O0 is the only saving grace we have, and as much as I want to
> see the SSA path at -O0, I'd also hate for us to end up with numerous
> cases where we no longer have some basic info.

I've disabled TERing across statements, just within one statement or
putting statements with no debug info into statement with debug info.
Also I re-coalesce all user variables into single variable.  So in
theory we should do no worse then we did before tree-SSA as trees fed
into expander are now only closer to pre-tree-SSA form.
>
> >Bootstrapped/regtested i686-linux.  I am testing x86_64-linux and
> >PPC-linux but both
> >passed with earlier version of the patch.
> >
> >  
> I presume this is being targeted for 4.4?

I guess it depends on overall consensus.  I started with this patch as
simple hack as Richard needed it for the lowering passes.  It turned out
to need a lot of fixing here and there and enabling unit-at-a-time by
default, so it ended up being more work. If for nothing, this patch was
useful to fix a lot of latent or semi-latent problems ;)) and I think
the results are not discougrating.  
Seeing it from the other POV, this patch would enforce us fixing debug
info quality issues in SSA that should improve debug quality when
optimizing.

Honza
>
> Andrew

Re: Enable SSA at -O0

by Richard Guenther-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Thu, 17 Jul 2008, Jan Hubicka wrote:

> > Jan Hubicka wrote:
> > >Hi,
> > >this patch enables SSA at -O0.  There are no regressions with the last
> > >round of
> > >fixes I sent and I also tested gdb testsuite that we don't obstructate
> > >debug
> > >info.
> > >  
> > I wonder how much that actually tell us about debug info in large
> > complex cases...
>
> Well, I don't know either...  I didn't run into debug info quality
> problems during development and I tried to use this as build compiler.

Well have to bite the bullet and see, and eventually fix some fallout.

> > right now -O0 is the only saving grace we have, and as much as I want to
> > see the SSA path at -O0, I'd also hate for us to end up with numerous
> > cases where we no longer have some basic info.
>
> I've disabled TERing across statements, just within one statement or
> putting statements with no debug info into statement with debug info.
> Also I re-coalesce all user variables into single variable.  So in
> theory we should do no worse then we did before tree-SSA as trees fed
> into expander are now only closer to pre-tree-SSA form.
> >
> > >Bootstrapped/regtested i686-linux.  I am testing x86_64-linux and
> > >PPC-linux but both
> > >passed with earlier version of the patch.
> > >
> > >  
> > I presume this is being targeted for 4.4?
>
> I guess it depends on overall consensus.  I started with this patch as
> simple hack as Richard needed it for the lowering passes.  It turned out
> to need a lot of fixing here and there and enabling unit-at-a-time by
> default, so it ended up being more work. If for nothing, this patch was
> useful to fix a lot of latent or semi-latent problems ;)) and I think
> the results are not discougrating.  
> Seeing it from the other POV, this patch would enforce us fixing debug
> info quality issues in SSA that should improve debug quality when
> optimizing.

It also should help us to "fix" expansion to not require TER but
use the SSA form to do initial instruction selection.  This will
especially become important with tuples to avoid going back to
GENERIC trees.  Of course we want to only have one expansion code
which means we need SSA at -O0 anyway ...

Richard.

Re: Enable SSA at -O0

by Andrew MacLeod :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Jan Hubicka wrote:
>
> Well, I don't know either...  I didn't run into debug info quality
> problems during development and I tried to use this as build compiler.
>  
thats a good sign anyway!

>  
>> right now -O0 is the only saving grace we have, and as much as I want to
>> see the SSA path at -O0, I'd also hate for us to end up with numerous
>> cases where we no longer have some basic info.
>>    
>
> I've disabled TERing across statements, just within one statement or
> putting statements with no debug info into statement with debug info.
> Also I re-coalesce all user variables into single variable.  So in
> theory we should do no worse then we did before tree-SSA as trees fed
> into expander are now only closer to pre-tree-SSA form.
>  

right. And as long as we don't do any SSA optimizations which move code
around, we should never have to worry about overlapping live ranges
which play havoc with debug output. in theory, it should be OK, but the
specter of compiler temps without debug info finding their way in where
they shouldn't be lurks at the window...  we certainly have to be
careful what ssa passes are run.

>>> Bootstrapped/regtested i686-linux.  I am testing x86_64-linux and
>>> PPC-linux but both
>>> passed with earlier version of the patch.
>>>
>>>  
>>>      
>> I presume this is being targeted for 4.4?
>>    
>
> I guess it depends on overall consensus.  I started with this patch as
> simple hack as Richard needed it for the lowering passes.  It turned out
> to need a lot of fixing here and there and enabling unit-at-a-time by
> default, so it ended up being more work. If for nothing, this patch was
> useful to fix a lot of latent or semi-latent problems ;)) and I think
> the results are not discougrating.  
> Seeing it from the other POV, this patch would enforce us fixing debug
> info quality issues in SSA that should improve debug quality when
> optimizing.
>  
Yeah. I'm currently working on a mechanism to provide a quantitative
comparison of the 'user quality' of debug info issued between different
runs of the compiler. It should also identify what isn't correct and
where.  At least that is the plan :-)  This would be another useful
thing to run through it, -O0 with SSA and -O0 without SSA and see how
similar it is. And also as a test of the comparator mechanism itself.

We might then be able to detect that we have a problem we need to look
at :-)

Andrew

Re: Enable SSA at -O0

by Andrew MacLeod :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Richard Guenther wrote:

>>
>> I guess it depends on overall consensus.  I started with this patch as
>> simple hack as Richard needed it for the lowering passes.  It turned out
>> to need a lot of fixing here and there and enabling unit-at-a-time by
>> default, so it ended up being more work. If for nothing, this patch was
>> useful to fix a lot of latent or semi-latent problems ;)) and I think
>> the results are not discougrating.  
>> Seeing it from the other POV, this patch would enforce us fixing debug
>> info quality issues in SSA that should improve debug quality when
>> optimizing.
>>    
>
> It also should help us to "fix" expansion to not require TER but
> use the SSA form to do initial instruction selection.  This will
> especially become important with tuples to avoid going back to
> GENERIC trees.  Of course we want to only have one expansion code
> which means we need SSA at -O0 anyway ...
>
>  
Yes, replacing TER and expand with a new expander/selector is a fine
goal, and this is certainly a step in that direction. Using SSA on the
-O0 path is inevitable.

Andrew


Re: Enable SSA at -O0

by Jakub Jelinek :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Thu, Jul 17, 2008 at 11:11:00AM -0400, Andrew MacLeod wrote:

> Jan Hubicka wrote:
> >Hi,
> >this patch enables SSA at -O0.  There are no regressions with the last
> >round of
> >fixes I sent and I also tested gdb testsuite that we don't obstructate
> >debug
> >info.
> >  
> I wonder how much that actually tell us about debug info in large
> complex cases...
>
> right now -O0 is the only saving grace we have, and as much as I want to
> see the SSA path at -O0, I'd also hate for us to end up with numerous
> cases where we no longer have some basic info.

Even at -O0 recent GCCs screw up the debuginfo badly, see
PR debug/34037 or PR debug/36690.

        Jakub

Re: Enable SSA at -O0

by Paolo Bonzini-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


> +  /* We need to coalesce all names originating same SSA_NAME_VAR
> +     so debug info remains undisturbed.  */
> +  if (!optimize)

Would it make sense to do this at optimize <= 1?

Paolo

> +    {
> +      ssa_name_hash = htab_create (10, hash_ssa_name_by_var,
> +         eq_ssa_name_by_var, NULL);
> +      for (i = 1; i < num_ssa_names; i++)
> + {
> +  tree a = ssa_name (i);
> +
> +  if (a && SSA_NAME_VAR (a) && !DECL_ARTIFICIAL (SSA_NAME_VAR (a)))
> +    {
> +      tree *slot = (tree *) htab_find_slot (ssa_name_hash, a, INSERT);
> +
> +      if (!*slot)
> + *slot = a;
> +      else
> + {
> +  add_coalesce (cl, SSA_NAME_VERSION (a), SSA_NAME_VERSION (*slot),
> + MUST_COALESCE_COST - 1);
> +  bitmap_set_bit (used_in_copies, SSA_NAME_VERSION (a));
> +  bitmap_set_bit (used_in_copies, SSA_NAME_VERSION (*slot));
> + }
> +    }
> + }
> +      htab_delete (ssa_name_hash);
> +    }
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +    dump_var_map (dump_file, map);

Re: Enable SSA at -O0

by Jan Hubicka-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

>
> >+  /* We need to coalesce all names originating same SSA_NAME_VAR
> >+     so debug info remains undisturbed.  */
> >+  if (!optimize)
>
> Would it make sense to do this at optimize <= 1?

I don't know. Variable splitting is quite standard technique in register
allocation.  I am not sure how serious we are about debug info quality
difference in between -O1 and -O2

Honza

Re: Enable SSA at -O0

by Paolo Bonzini-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Jan Hubicka wrote:
>>> +  /* We need to coalesce all names originating same SSA_NAME_VAR
>>> +     so debug info remains undisturbed.  */
>>> +  if (!optimize)
>> Would it make sense to do this at optimize <= 1?
>
> I don't know. Variable splitting is quite standard technique in register
> allocation.

But, if you mean flag_web, we don't do it unless -funroll-loops or
-fpeel-loops, on the ground that the splitting we get from out-of-SSA
should be enough.  So, enabling this kind of coalescing at -O1 would
indeed prevent the compiler from doing splitting.

> I am not sure how serious we are about debug info quality
> difference in between -O1 and -O2

Some people (especially the AdaCore guys) have been constantly begging
for better (usable, at least) -O1 debug info, and FWIW I second that.

Paolo

Re: Enable SSA at -O0

by Richard Guenther-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Fri, Jul 18, 2008 at 5:52 PM, Paolo Bonzini <bonzini@...> wrote:

> Jan Hubicka wrote:
>>>>
>>>> +  /* We need to coalesce all names originating same SSA_NAME_VAR
>>>> +     so debug info remains undisturbed.  */
>>>> +  if (!optimize)
>>>
>>> Would it make sense to do this at optimize <= 1?
>>
>> I don't know. Variable splitting is quite standard technique in register
>> allocation.
>
> But, if you mean flag_web, we don't do it unless -funroll-loops or
> -fpeel-loops, on the ground that the splitting we get from out-of-SSA should
> be enough.  So, enabling this kind of coalescing at -O1 would indeed prevent
> the compiler from doing splitting.
>
>> I am not sure how serious we are about debug info quality
>> difference in between -O1 and -O2
>
> Some people (especially the AdaCore guys) have been constantly begging for
> better (usable, at least) -O1 debug info, and FWIW I second that.

I'd rather do a -Og instead.  -O1 already badly messes with debuggability.
For -Og I would chose -Os inlining heuristics for example and only do
basic propagation and dead code removal but no memory optimization or
things like SRA or IVOPTs.  Just ccp, copyprop, dce and inline accessor style
things to make C++ usable.

Richard.

Re: Enable SSA at -O0

by Paolo Bonzini-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


> I'd rather do a -Og instead.  -O1 already badly messes with debuggability.
> For -Og I would chose -Os inlining heuristics for example and only do
> basic propagation and dead code removal but no memory optimization or
> things like SRA or IVOPTs.  Just ccp, copyprop, dce and inline accessor style
> things to make C++ usable.

But since in 4.4 there is no more any reason to use -O1 when alias
analysis blows up, why couldn't -O1 *be* -Og?

Paolo

Re: Enable SSA at -O0

by Richard Guenther-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Fri, 18 Jul 2008, Paolo Bonzini wrote:

>
> > I'd rather do a -Og instead.  -O1 already badly messes with debuggability.
> > For -Og I would chose -Os inlining heuristics for example and only do
> > basic propagation and dead code removal but no memory optimization or
> > things like SRA or IVOPTs.  Just ccp, copyprop, dce and inline accessor
> > style
> > things to make C++ usable.
>
> But since in 4.4 there is no more any reason to use -O1 when alias analysis
> blows up, why couldn't -O1 *be* -Og?

Well, do you suggest to "drop" -O1 in favor of simply -O2 and
"re-introduce" it with -Og behavior?  I think this is more confusing
than adding -Og.  But we could certainly think of taking out some
optimizations from -O1 that hurt debugging, but I wouldn't go as far
as I suggested for -Og.  -Og could be nearly as good as -O0 with
the difference that 'set' in gdb will work even less than with -O0.

Richard.

Re: Enable SSA at -O0

by Paolo Bonzini-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


>> But since in 4.4 there is no more any reason to use -O1 when alias analysis
>> blows up, why couldn't -O1 *be* -Og?
>
> Well, do you suggest to "drop" -O1 in favor of simply -O2 and
> "re-introduce" it with -Og behavior?

Yes, but just because...

> I think this is more confusing
> than adding -Og.  But we could certainly think of taking out some
> optimizations from -O1 that hurt debugging, but I wouldn't go as far
> as I suggested for -Og.

... I agree with the AdaCore guys' opinion was that in 3.x -O1 was -Og
in practice, and it was then broken in 4.x.  So the non-debuggability of
-O1 in 4.x is in some sense a regression.  If we can now declare to have
a hold on tree-level compile-time problems, and that there is no need to
keep -O1 as a workaround for those, it's time to fix that regression.

Paolo

Re: Enable SSA at -O0

by Richard Guenther-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Fri, 18 Jul 2008, Paolo Bonzini wrote:

>
> > > But since in 4.4 there is no more any reason to use -O1 when alias
> > > analysis
> > > blows up, why couldn't -O1 *be* -Og?
> >
> > Well, do you suggest to "drop" -O1 in favor of simply -O2 and
> > "re-introduce" it with -Og behavior?
>
> Yes, but just because...
>
> > I think this is more confusing
> > than adding -Og.  But we could certainly think of taking out some
> > optimizations from -O1 that hurt debugging, but I wouldn't go as far
> > as I suggested for -Og.
>
> ... I agree with the AdaCore guys' opinion was that in 3.x -O1 was -Og in
> practice, and it was then broken in 4.x.  So the non-debuggability of -O1 in
> 4.x is in some sense a regression.  If we can now declare to have a hold on
> tree-level compile-time problems, and that there is no need to keep -O1 as a
> workaround for those, it's time to fix that regression.

Ok.  Let's see what others think.  The most obvious candidates for
disabling at -O1 because of debugging are SRA and IVOPTs.  Inlining
is something else we can throttle down to -Os measures.

I doubt the result will be much better for debugging, but for the
remaining stuff we could at least try to fix some obvious oversights
in the passes.

Richard.

Re: Enable SSA at -O0

by Jan Hubicka-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi,
to bring little more confussion into -O1 wrt -Og discussion, I should
note that the variable tracking seems to do (relatively) resonable job
here.

I had to enforce coalescing to handle somewhat extreme testcases such
as:

t()
{
  int i=0;
  /*breakpoint here*/
  for (i=0; ....
}

where the original copy of i initialized to 0 didn't get properly
tracked and the early breakpoint got wrong value.  This is something
that I would expect to break at O0 anyway, since the initial assignment
is dead.

So in this particular case I would rather look into ways improving
register tracking rather than disabling splitting of variables.
We've heard a lot about this on GCC summit so things might improve soon
;)

Honza

Re: Enable SSA at -O0

by Diego Novillo-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On 7/17/08 7:52 AM, Jan Hubicka wrote:

> * cgraph.c (cgraph_add_new_function): Do early local passes.
> * tree-nrv.c (gate_pass_return_slot): New gate.
> (pass_nrv): Add the gate.
> * tree-ssa-coalese.c (hash_ssa_name_by_var, eq_ssa_name_by_var): New
> functions.
> (coalesce_ssa_name): Coalesce SSA names.
> * tree-ssa-live.c (remove_unused_locals): Be more conservative when
> not optimizing so unused user vars remains visible.
> * common.opt (flag_tree_ter): Always enable by default.
> * tree-ssa-ter.c: Include flags.h
> (is_replaceable_p): Check that locations match; when aliasing is missing
> be conservative about loads.
> * tree-optimize.c (gate_init_datastructures): Remove.
> (pass_init_datastructures): New.
> * passes.c: Reorder passes so we always go into SSA.

This is OK with me.  If we have debugging information regressions, then
we can deal with those separately.  Building SSA at -O0 will simplify
very many things, so I think we should enable it before 4.4 is released.


Diego.


Re: Enable SSA at -O0