[PATCH, committed] Add ability to set optimization options and, on ix86, target options on a function specific basis

by Michael Meissner

I committed the following patch to the mainline as revision 138075 to add
function specific optimization/target option support.  I did not change the
optimization and target options data structures from trees containing the
structure to the structure itself, as Steven Bosscher suggested, because it
got complicated with regard to garbage collection and hashing.  I might look at
it later, but I wanted to get the main patches checked in.
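
As a rough illustration of what the optimize half of this patch enables, here
is a minimal sketch.  It is not taken from the patch or its testsuite, and the
function names sum_fast and sum_small are made up; only the optimize attribute
itself comes from the patch.  Two functions in one translation unit ask for
different optimization options regardless of the command line:

```c
#include <stddef.h>

/* Hypothetical example functions, for illustration only.  */

/* Ask for -O3 -funroll-loops for this function only.  */
__attribute__ ((__optimize__ ("O3,unroll-loops")))
long
sum_fast (const int *v, size_t n)
{
  long total = 0;
  for (size_t i = 0; i < n; i++)
    total += v[i];
  return total;
}

/* Ask for -Os for this function only.  */
__attribute__ ((__optimize__ ("Os")))
long
sum_small (const int *v, size_t n)
{
  long total = 0;
  for (size_t i = 0; i < n; i++)
    total += v[i];
  return total;
}
```

Both functions compute the same result; only the generated code should differ.
The target option half (the option attribute) is ix86 only, so it is left out
of this sketch to keep it buildable on any target.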

[gcc]
2008-07-23  Michael Meissner  <gnu@...>
            Karthik Kumar  <karthikkumar@...>

        * attribs.c (file scope): Include c-common.h.
        (decl_attributes): Add support for #pragma GCC optimize and
        #pragma GCC option.

        * targhooks.c (default_can_inline_p): New function that is the
        default for the TARGET_CAN_INLINE_P target hook.

        * targhooks.h (default_can_inline_p): Add declaration.

        * tree.c (cl_optimization_node): New static tree for building
        OPTIMIZATION_NODE tree.
        (cl_target_option_node): New static tree for building
        TARGET_OPTION_NODE tree.
        (cl_option_hash_table): New hash table for hashing
        OPTIMIZATION_NODE and TARGET_OPTION_NODE trees.
        (cl_option_hash_hash): New function to provide the hash value for
        OPTIMIZATION_NODE and TARGET_OPTION_NODE trees.
        (cl_option_hash_eq): New function to provide an equality test for
        OPTIMIZATION_NODE and TARGET_OPTION_NODE trees.
        (tree_code_size): Add support for OPTIMIZATION_NODE and
        TARGET_OPTION_NODE trees.
        (tree_code_structure): Add support for OPTIMIZATION_NODE and
        TARGET_OPTION_NODE trees.
        (build_optimization_node): Build a tree that has all of the
        current optimization options.
        (build_target_option_node): Build a tree that has the target
        options that might be changed on a per function basis.

        * tree.h (file scope): Include options.h.
        (DECL_FUNCTION_SPECIFIC_TARGET): New accessor macro.
        (DECL_FUNCTION_SPECIFIC_OPTIMIZATION): Ditto.
        (TREE_OPTIMIZATION): Ditto.
        (TREE_TARGET_SPECIFIC): Ditto.
        (struct tree_function_decl): Add fields for remembering the
        current optimization options and target specific options.
        (struct tree_optimization_option): New tree variant that remembers
        the optimization options.
        (struct tree_target_option): New tree variant that remembers the
        target specific flags that might change for compiling a particular
        function.
        (union tree_node): Include tree_optimization_option and
        tree_target_option fields.
        (enum tree_index): Add TI_OPTIMIZATION_DEFAULT,
        TI_OPTIMIZATION_CURRENT, TI_OPTIMIZATION_COLD,
        TI_OPTIMIZATION_HOT, TI_TARGET_OPTION_DEFAULT,
        TI_TARGET_OPTION_CURRENT, TI_CURRENT_OPTION_PRAGMA,
        TI_CURRENT_OPTIMIZE_PRAGMA entries for saving function specific
        optimization and target options.
        (optimization_default_node): New macro to refer to global_trees
        field.
        (optimization_current_node): Ditto.
        (optimization_cold_node): Ditto.
        (optimization_hot_node): Ditto.
        (target_option_default_node): Ditto.
        (target_option_current_node): Ditto.
        (current_option_pragma): Ditto.
        (current_optimize_pragma): Ditto.

        * target.h (struct gcc_target): Add valid_option_attribute_p,
        target_option_save, target_option_restore, target_option_print,
        target_option_pragma_parse, and can_inline_p hooks.

        * toplev.h (parse_optimize_options): Add declaration.
        (fast_math_flags_struct_set_p): Ditto.

        * c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): New function to
        adjust the current __OPTIMIZE__, etc. macros when #pragma GCC
        optimize is used.

        * ipa-inline.c (cgraph_decide_inlining_of_small_function): Call
        tree_can_inline_p hook to see if one function can inline another.
        (cgraph_decide_inlining): Ditto.
        (cgraph_decide_inlining_incrementally): Ditto.

        * opts.c (decode_options): Add support for running multiple times
        to allow functions with different target or optimization options
        than was specified on the command line.
        (fast_math_flags_struct_set_p): New function that is similar to
        fast_math_flags_set_p, except it uses the values in the
        cl_optimization structure instead of global variables.

        * optc-gen.awk: Add support for TargetSave to allow a back end to
        declare new fields that need to be saved when using function
        specific options.  Include flags.h and target.h in the options.c
        source.  Add support for Save to indicate which options can be set
        for individual functions.  Generate cl_optimize_save,
        cl_optimize_restore, cl_optimize_print, cl_target_option_save,
        cl_target_option_restore, cl_target_option_print functions to
        allow functions to use different optimization or target options.

        * opt-functions.awk (var_type_struct): Return the type used for
        storing the field in a structure.

        * opth-gen.awk: Add support for TargetSave to allow a back end to
        declare new fields that need to be saved when using function
        specific options.  Add support for Save to indicate which options
        can be set for individual functions.  Only generate one extern for
        Mask fields.  Generate cl_optimization and cl_target_option
        structures to remember optimization and target options.

        * treestruct.def (TS_OPTIMIZATION): Add support for garbage
        collecting new tree nodes.
        (TS_TARGET_OPTION): Ditto.

        * c-decl.c (merge_decls): Merge function specific target and
        optimization options.

        * function.c (invoke_set_current_function_hook): If the function
        uses different optimization options, change the global variables
        to reflect this.

        * coretypes.h (struct cl_optimization): Add forward reference.
        (struct cl_target_option): Ditto.

        * c-pragma.c (option_stack): New static vector to remember the
        current #pragma GCC option stack.
        (handle_pragma_option): New function to support #pragma GCC option
        to change target options.
        (optimize_stack): New static vector to remember the current
        #pragma GCC optimize stack.
        (handle_pragma_optimize): New function to support #pragma GCC
        optimize to change optimization options.
        (init_pragma): Add support for #pragma GCC optimize and #pragma
        GCC option.

        * tree.def (OPTIMIZATION_NODE): New tree code for remembering
        optimization options.
        (TARGET_OPTION_NODE): New tree code for remembering certain target
        options.

        * print-tree.c (print_node): Add support for OPTIMIZATION_NODE and
        TARGET_OPTION_NODE trees.

        * common.opt (-O): Add Optimization flag.
        (-Os): Ditto.
        (-fmath-errno): Ditto.
        (-falign-functions): Add UInteger flag to make sure flag gets full
        int in cl_optimization structure.
        (-falign-jumps): Ditto.
        (-falign-labels): Ditto.
        (-falign-loops): Ditto.
        (-fsched-stalled-insns): Ditto.
        (-fsched-stalled-insns-dep): Ditto.

        * target-def.h (TARGET_VALID_OPTION_ATTRIBUTE_P): Add default
        definition.
        (TARGET_OPTION_SAVE): Ditto.
        (TARGET_OPTION_RESTORE): Ditto.
        (TARGET_OPTION_PRINT): Ditto.
        (TARGET_OPTION_PRAGMA_PARSE): Ditto.
        (TARGET_CAN_INLINE_P): Ditto.
        (TARGET_INITIALIZER): Add new hooks.

        * tree-inline.c (tree_can_inline_p): New function to determine
        whether one function can inline another.  Check if the functions
        use compatible optimization options, and also call the backend
        can_inline_p hook.

        * tree-inline.h (tree_can_inline_p): Add declaration.

        * c-common.c (c_common_attribute): Add support for option and
        optimize attributes.
        (handle_option_attribute): Add support for the option attribute to
        allow the user to specify different target options for compiling a
        specific function.
        (handle_optimize_attribute): Add support for the optimize
        attribute to allow the user to specify different optimization
        options for compiling a specific function.
        (handle_hot_attribute): Turn on -O3 optimization for this one
        function if it isn't the default optimization level.
        (handle_cold_attribute): Turn on -Os optimization for this one
        function if it isn't the default optimization.
        (const_char_p): New const char * typedef.
        (optimize_args): New static vector to remember the optimization
        arguments.
        (parse_optimize_options): New function to set up the optimization
        arguments from either the optimize attribute or #pragma GCC
        optimize.

        * c-common.h (c_cpp_builtins_optimize_pragma): Add declaration.
        (builtin_define_std): Ditto.

        * config.gcc (i[3467]86-*-*): Add i386-c.o to C/C++ languages.
        Add t-i386 Makefile fragment to add i386-c.o and i386.o
        dependencies.
        (x86_64-*-*): Ditto.

        * Makefile.in (TREE_H): Add options.h.
        (options.o): Add $(TARGET_H) $(FLAGS_H) dependencies.

        * doc/extend.texi (option attribute): Document new attribute.
        (optimize attribute): Ditto.
        (hot attribute): Document hot attribute sets -O3.
        (cold attribute): Document cold attribute sets -Os.
        (#pragma GCC option): Document new pragma.
        (#pragma GCC optimize): Ditto.

        * doc/options.texi (TargetSave): Document TargetSave syntax.
        (UInteger): Document UInteger must be used for certain flags.
        (Save): Document Save option to create target specific options
        that can be saved/restored on a function specific context.

        * doc/c-tree.texi (DECL_FUNCTION_SPECIFIC_TARGET): Document new
        macro.
        (DECL_FUNCTION_SPECIFIC_OPTIMIZATION): Ditto.

        * doc/tm.texi (TARGET_VALID_OPTION_ATTRIBUTE_P): Document new
        hook.
        (TARGET_OPTION_SAVE): Ditto.
        (TARGET_OPTION_RESTORE): Ditto.
        (TARGET_OPTION_PRINT): Ditto.
        (TARGET_OPTION_PRAGMA_PARSE): Ditto.
        (TARGET_CAN_INLINE_P): Ditto.

        * doc/invoke.texi (-mfpmath=sse+387): Document as an alias for
        -mfpmath=sse,387.
        (-mfpmath=both): Ditto.

2008-07-23  Michael Meissner  <gnu@...>
            Karthik Kumar  <karthikkumar@...>

        * config/i386/i386.h (TARGET_ABM): Move switch into
        ix86_isa_flags.
        (TARGET_POPCNT): Ditto.
        (TARGET_SAHF): Ditto.
        (TARGET_AES): Ditto.
        (TARGET_PCLMUL): Ditto.
        (TARGET_CMPXCHG16B): Ditto.
        (TARGET_RECIP): Move switch into target_flags.
        (TARGET_FUSED_MADD): Ditto.
        (ix86_arch_features): Make an unsigned char type.
        (ix86_tune_features): Ditto.
        (OVERRIDE_OPTIONS): Add bool argument to override_options call.
        (TARGET_CPU_CPP_BUILTINS): Move into ix86_target_macros.
        (REGISTER_TARGET_PRAGMAS): Define, call ix86_register_pragmas.

        * config/i386/i386.opt (arch): New TargetSave field to define
        fields that need to be saved for function specific option
        support.
        (tune): Ditto.
        (fpmath): Ditto.
        (branch_cost): Ditto.
        (ix86_isa_flags_explicit): Ditto.
        (tune_defaulted): Ditto.
        (arch_specified): Ditto.
        (-m128-long-double): Add Save flag to save option for target
        specific option support.
        (-m80387): Ditto.
        (-maccumulate-outgoing-args): Ditto.
        (-malign-double): Ditto.
        (-malign-stringops): Ditto.
        (-mfancy-math-387): Ditto.
        (-mhard-float): Ditto.
        (-mieee-fp): Ditto.
        (-minline-all-stringops): Ditto.
        (-minline-stringops-dynamically): Ditto.
        (-mms-bitfields): Ditto.
        (-mno-align-stringops): Ditto.
        (-mno-fancy-math-387): Ditto.
        (-mno-push-args): Ditto.
        (-mno-red-zone): Ditto.
        (-mpush-args): Ditto.
        (-mred-zone): Ditto.
        (-mrtd): Ditto.
        (-msseregparm): Ditto.
        (-mstack-arg-probe): Ditto.
        (-m32): Ditto.
        (-m64): Ditto.
        (-mmmx): Ditto.
        (-m3dnow): Ditto.
        (-m3dnowa): Ditto.
        (-msse): Ditto.
        (-msse2): Ditto.
        (-msse3): Ditto.
        (-msse4.1): Ditto.
        (-msse4.2): Ditto.
        (-msse4): Ditto.
        (-mno-sse4): Ditto.
        (-msse4a): Ditto.
        (-msse5): Ditto.
        (-mrecip): Move flag into target_flags.
        (-mcld): Ditto.
        (-mno-fused-madd): Ditto.
        (-mfused-madd): Ditto.
        (-mabm): Move flag into ix86_isa_flags.
        (-mcx16): Ditto.
        (-mpopcnt): Ditto.
        (-msahf): Ditto.
        (-maes): Ditto.
        (-mpclmul): Ditto.

        * config/i386/i386-c.c: New file for #pragma support.
        (ix86_target_macros_internal): New function to #define or #undef
        target macros when the user uses #pragma GCC option to change
        target options.
        (ix86_pragma_option_parse): New function to add #pragma GCC option
        support.
        (ix86_target_macros): Move defining the target macros here from
        TARGET_CPU_CPP_BUILTINS in i386.h.
        (ix86_register_pragmas): Register the #pragma GCC option hook.  If
        defined, initialize any subtarget #pragmas.

        * config/i386/darwin.h (REGISTER_SUBTARGET_PRAGMAS): Rename from
        REGISTER_TARGET_PRAGMAS.

        * config/i386/t-i386: New file for x86 dependencies.
        (i386.o): Make dependencies mirror the include files used.
        (i386-c.o): New file, add dependencies.

        * config/i386/i386-protos.h (override_options): Add bool
        argument.
        (ix86_valid_option_attribute_tree): Add declaration.
        (ix86_target_macros): Ditto.
        (ix86_register_macros): Ditto.

        * config/i386/i386.c (ix86_tune_features): Move initialization of
        the target masks to initial_ix86_tune_features to allow functions
        to have different target options.  Make type unsigned char,
        instead of unsigned int.
        (initial_ix86_tune_features): New static vector to hold processor
        masks for the tune variables.
        (ix86_arch_features): Move initialization of the target masks to
        initial_ix86_arch_features to allow functions to have different
        target options.  Make type unsigned char, instead of unsigned
        int.
        (initial_ix86_arch_features): New static vector to hold processor
        masks for the arch variables.
        (enum ix86_function_specific_strings): New enum to describe the
        string options used for attribute((option(...))).
        (ix86_target_string): New function to return a string that
        describes the target options.
        (ix86_debug_options): New function to print the current options in
        the debugger.
        (ix86_function_specific_save): New function hook to save the
        function specific global variables in the cl_target_option
        structure.
        (ix86_function_specific_restore): New function hook to restore the
        function specific variables from the cl_target_option structure to
        the global variables.
        (ix86_function_specific_print): New function hook to print the
        target specific options in the cl_target_option structure.
        (ix86_valid_option_attribute_p): New function hook to validate
        attribute((option(...))) arguments.
        (ix86_valid_option_attribute_tree): New function that is common
        code between attribute((option(...))) and #pragma GCC option
        support that parses the options and returns a tree holding the
        options.
        (ix86_valid_option_attribute_inner_p): New helper function for
        ix86_valid_option_attribute_tree.
        (ix86_can_inline_p): New function hook to decide if one function
        can inline another on a target specific basis.
        (ix86_set_current_function): New function hook to switch target
        options if the user used attribute((option(...))) or #pragma GCC
        option.
        (ix86_tune_defaulted): Move to static file scope from
        override_options.
        (ix86_arch_specified): Ditto.
        (OPTION_MASK_ISA_AES_SET): New macro for moving switches into
        ix86_isa_flags.
        (OPTION_MASK_ISA_PCLMUL_SET): Ditto.
        (OPTION_MASK_ISA_ABM_SET): Ditto.
        (OPTION_MASK_ISA_POPCNT_SET): Ditto.
        (OPTION_MASK_ISA_CX16_SET): Ditto.
        (OPTION_MASK_ISA_SAHF_SET): Ditto.
        (OPTION_MASK_ISA_AES_UNSET): Ditto.
        (OPTION_MASK_ISA_PCLMUL_UNSET): Ditto.
        (OPTION_MASK_ISA_ABM_UNSET): Ditto.
        (OPTION_MASK_ISA_POPCNT_UNSET): Ditto.
        (OPTION_MASK_ISA_CX16_UNSET): Ditto.
        (OPTION_MASK_ISA_SAHF_UNSET): Ditto.
        (struct ptt): Move to static file scope from override_options.
        (processor_target_table): Ditto.
        (cpu_names): Ditto.
        (ix86_handle_option): Add support for options that are now isa
        options.
        (override_options): Add support for declaring functions that
        support different target options than were specified on the
        command line.  Move struct ptt, processor_target_table, cpu_names,
        ix86_tune_defaulted, ix86_arch_specified to static file scope.
        Add bool argument.  Fix up error messages so the appropriate error
        is given for either command line or attribute.
        (ix86_previous_fndecl): New static to remember previous function
        declaration to see if we need to change target options.
        (ix86_builtins_isa): New array to record the ISA of each builtin
        function.
        (def_builtin): Always create the builtin function, even if the
        current ISA doesn't support it.
        (ix86_init_mmx_sse_builtins): Remove TARGET_AES and TARGET_PCLMUL
        tests for those builtins.
        (ix86_init_builtins): Remove TARGET_MMX test for calling
        ix86_init_mmx_sse_builtins.
        (ix86_expand_builtin): If the current ISA doesn't support a given
        builtin, signal an error.
        (TARGET_VALID_OPTION_ATTRIBUTE_P): Set target hook.
        (TARGET_SET_CURRENT_FUNCTION): Ditto.
        (TARGET_OPTION_SAVE): Ditto.
        (TARGET_OPTION_RESTORE): Ditto.
        (TARGET_OPTION_PRINT): Ditto.
        (TARGET_CAN_INLINE_P): Ditto.

[gcc/testsuite]
2008-07-23  Michael Meissner  <gnu@...>
            Karthik Kumar  <karthikkumar@...>

        * gcc.target/i386/sse-22.c: New test for function specific option
        support.
        * gcc.target/i386/sse-23.c: Ditto.
        * gcc.target/i386/opt-1.c: Ditto.
        * gcc.target/i386/opt-2.c: Ditto.
        * gcc.target/i386/cold-1.c: Ditto.
        * gcc.target/i386/hot-1.c: Ditto.
        * gcc.target/i386/funcspec-1.c: Ditto.
        * gcc.target/i386/funcspec-2.c: Ditto.
        * gcc.target/i386/funcspec-3.c: Ditto.
        * gcc.target/i386/funcspec-4.c: Ditto.
        * gcc.target/i386/funcspec-5.c: Ditto.
        * gcc.target/i386/funcspec-6.c: Ditto.
        * gcc.target/i386/funcspec-7.c: Ditto.
        * gcc.target/i386/funcspec-8.c: Ditto.
        * gcc.target/i386/funcspec-9.c: Ditto.
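
The extend.texi hunk below describes the target specific inlining rule on the
386 as: the callee must have a subset of the target options of the caller.  A
minimal sketch of that subset test follows, assuming the ISA options are
modeled as bits in a mask.  The enum values and the function
callee_isa_subset_p are hypothetical and are not the i386.c code; the real
ix86_can_inline_p hook may check more than this.

```c
#include <stdbool.h>

/* Hypothetical ISA bits, for illustration only.  */
enum
{
  ISA_SSE  = 1 << 0,
  ISA_SSE2 = 1 << 1,
  ISA_SSE3 = 1 << 2
};

/* Return true if every ISA bit enabled for the callee is also enabled
   for the caller, i.e. the callee's options are a subset of the
   caller's.  */
static bool
callee_isa_subset_p (unsigned caller_isa, unsigned callee_isa)
{
  return (caller_isa & callee_isa) == callee_isa;
}
```

Under this model a caller with SSE, SSE2 and SSE3 enabled may inline a callee
that only needs SSE and SSE2, but not the other way around.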

Index: gcc/attribs.c
===================================================================
--- gcc/attribs.c (revision 138074)
+++ gcc/attribs.c (working copy)
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  
 #include "target.h"
 #include "langhooks.h"
 #include "hashtab.h"
+#include "c-common.h"
 
 static void init_attributes (void);
 
@@ -232,6 +233,41 @@ decl_attributes (tree *node, tree attrib
   if (!attributes_initialized)
     init_attributes ();
 
+  /* If this is a function and the user used #pragma GCC optimize, add the
+     options to the attribute((optimize(...))) list.  */
+  if (TREE_CODE (*node) == FUNCTION_DECL && current_optimize_pragma)
+    {
+      tree cur_attr = lookup_attribute ("optimize", attributes);
+      tree opts = copy_list (current_optimize_pragma);
+
+      if (! cur_attr)
+        attributes
+          = tree_cons (get_identifier ("optimize"), opts, attributes);
+      else
+        TREE_VALUE (cur_attr) = chainon (opts, TREE_VALUE (cur_attr));
+    }
+
+  if (TREE_CODE (*node) == FUNCTION_DECL
+      && optimization_current_node != optimization_default_node
+      && !DECL_FUNCTION_SPECIFIC_OPTIMIZATION (*node))
+    DECL_FUNCTION_SPECIFIC_OPTIMIZATION (*node) = optimization_current_node;
+
+  /* If this is a function and the user used #pragma GCC option, add the
+     options to the attribute((option(...))) list.  */
+  if (TREE_CODE (*node) == FUNCTION_DECL
+      && current_option_pragma
+      && targetm.target_option.valid_attribute_p (*node, NULL_TREE,
+                                                  current_option_pragma, 0))
+    {
+      tree cur_attr = lookup_attribute ("option", attributes);
+      tree opts = copy_list (current_option_pragma);
+
+      if (! cur_attr)
+        attributes = tree_cons (get_identifier ("option"), opts, attributes);
+      else
+        TREE_VALUE (cur_attr) = chainon (opts, TREE_VALUE (cur_attr));
+    }
+
   targetm.insert_attributes (*node, &attributes);
 
   for (a = attributes; a; a = TREE_CHAIN (a))
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi (revision 138074)
+++ gcc/doc/extend.texi (working copy)
@@ -1792,6 +1792,8 @@ the enclosing block.
 @cindex functions that are passed arguments in registers on the 386
 @cindex functions that pop the argument stack on the 386
 @cindex functions that do not pop the argument stack on the 386
+@cindex functions that have different compilation options on the 386
+@cindex functions that have different optimization options
 
 In GNU C, you declare certain things about functions called in your program
 which help the compiler optimize function calls and check your code more
@@ -2662,6 +2664,207 @@ with the notable exceptions of @code{qso
 take function pointer arguments.  The @code{nothrow} attribute is not
 implemented in GCC versions earlier than 3.3.
 
+@item option
+@cindex @code{option} function attribute
+The @code{option} attribute is used to specify that a function is to
+be compiled with different target options than specified on the
+command line.  This can be used for instance to have functions
+compiled with a different ISA (instruction set architecture) than the
+default.  You can also use the @samp{#pragma GCC option} pragma to set
+more than one function to be compiled with specific target options.
+@xref{Function Specific Option Pragmas}, for details about the
+@samp{#pragma GCC option} pragma.
+
+For instance on a 386, you could compile one function with
+@code{option("sse4.1,arch=core2")} and another with
+@code{option("sse4a,arch=amdfam10")} that would be equivalent to
+compiling the first function with @option{-msse4.1} and
+@option{-march=core2} options, and the second function with
+@option{-msse4a} and @option{-march=amdfam10} options.  It is up to the
+user to make sure that a function is only invoked on a machine that
+supports the particular ISA it was compiled for (for example by using
+@code{cpuid} on 386 to determine what feature bits and architecture
+family are used).
+
+@smallexample
+int core2_func (void) __attribute__ ((__option__ ("arch=core2")));
+int sse3_func (void) __attribute__ ((__option__ ("sse3")));
+@end smallexample
+
+On the 386, the following options are allowed:
+
+@table @samp
+@item abm
+@itemx no-abm
+@cindex @code{option("abm")} attribute
+Enable/disable the generation of the advanced bit manipulation instructions.
+
+@item aes
+@itemx no-aes
+@cindex @code{option("aes")} attribute
+Enable/disable the generation of the AES instructions.
+
+@item mmx
+@itemx no-mmx
+@cindex @code{option("mmx")} attribute
+Enable/disable the generation of the MMX instructions.
+
+@item pclmul
+@itemx no-pclmul
+@cindex @code{option("pclmul")} attribute
+Enable/disable the generation of the PCLMUL instructions.
+
+@item popcnt
+@itemx no-popcnt
+@cindex @code{option("popcnt")} attribute
+Enable/disable the generation of the POPCNT instruction.
+
+@item sse
+@itemx no-sse
+@cindex @code{option("sse")} attribute
+Enable/disable the generation of the SSE instructions.
+
+@item sse2
+@itemx no-sse2
+@cindex @code{option("sse2")} attribute
+Enable/disable the generation of the SSE2 instructions.
+
+@item sse3
+@itemx no-sse3
+@cindex @code{option("sse3")} attribute
+Enable/disable the generation of the SSE3 instructions.
+
+@item sse4
+@itemx no-sse4
+@cindex @code{option("sse4")} attribute
+Enable/disable the generation of the SSE4 instructions (both SSE4.1
+and SSE4.2).
+
+@item sse4.1
+@itemx no-sse4.1
+@cindex @code{option("sse4.1")} attribute
+Enable/disable the generation of the SSE4.1 instructions.
+
+@item sse4.2
+@itemx no-sse4.2
+@cindex @code{option("sse4.2")} attribute
+Enable/disable the generation of the SSE4.2 instructions.
+
+@item sse4a
+@itemx no-sse4a
+@cindex @code{option("sse4a")} attribute
+Enable/disable the generation of the SSE4A instructions.
+
+@item sse5
+@itemx no-sse5
+@cindex @code{option("sse5")} attribute
+Enable/disable the generation of the SSE5 instructions.
+
+@item ssse3
+@itemx no-ssse3
+@cindex @code{option("ssse3")} attribute
+Enable/disable the generation of the SSSE3 instructions.
+
+@item cld
+@itemx no-cld
+@cindex @code{option("cld")} attribute
+Enable/disable the generation of the CLD before string moves.
+
+@item fancy-math-387
+@itemx no-fancy-math-387
+@cindex @code{option("fancy-math-387")} attribute
+Enable/disable the generation of the @code{sin}, @code{cos}, and
+@code{sqrt} instructions on the 387 floating point unit.
+
+@item fused-madd
+@itemx no-fused-madd
+@cindex @code{option("fused-madd")} attribute
+Enable/disable the generation of the fused multiply/add instructions.
+
+@item ieee-fp
+@itemx no-ieee-fp
+@cindex @code{option("ieee-fp")} attribute
+Enable/disable the generation of floating point that depends on IEEE arithmetic.
+
+@item inline-all-stringops
+@itemx no-inline-all-stringops
+@cindex @code{option("inline-all-stringops")} attribute
+Enable/disable inlining of string operations.
+
+@item inline-stringops-dynamically
+@itemx no-inline-stringops-dynamically
+@cindex @code{option("inline-stringops-dynamically")} attribute
+Enable/disable the generation of the inline code to do small string
+operations and calling the library routines for large operations.
+
+@item align-stringops
+@itemx no-align-stringops
+@cindex @code{option("align-stringops")} attribute
+Do/do not align destination of inlined string operations.
+
+@item recip
+@itemx no-recip
+@cindex @code{option("recip")} attribute
+Enable/disable the generation of RCPSS, RCPPS, RSQRTSS and RSQRTPS
+instructions followed by an additional Newton-Raphson step instead of
+doing a floating point division.
+
+@item arch=@var{ARCH}
+@cindex @code{option("arch=@var{ARCH}")} attribute
+Specify the architecture to generate code for in compiling the function.
+
+@item tune=@var{TUNE}
+@cindex @code{option("tune=@var{TUNE}")} attribute
+Specify the architecture to tune for in compiling the function.
+
+@item fpmath=@var{FPMATH}
+@cindex @code{option("fpmath=@var{FPMATH}")} attribute
+Specify which floating point unit to use.  The
+@code{option("fpmath=sse,387")} option must be specified as
+@code{option("fpmath=sse+387")} because the comma would separate
+different options.
+@end table
+
+On the 386, you can use either multiple strings to specify multiple
+options, or you can separate the options with a comma (@code{,}).
+
+On the 386, the inliner will not inline a function that has different
+target options than the caller, unless the callee has a subset of the
+target options of the caller.  For example a function declared with
+@code{option("sse5")} can inline a function with
+@code{option("sse2")}, since @code{-msse5} implies @code{-msse2}.
+
+The @code{option} attribute is not implemented in GCC versions earlier
+than 4.4, and at present only the 386 uses it.
+
+@item optimize
+@cindex @code{optimize} function attribute
+The @code{optimize} attribute is used to specify that a function is to
+be compiled with different optimization options than specified on the
+command line.  Arguments can either be numbers or strings.  Numbers
+are assumed to be an optimization level.  Strings that begin with
+@code{O} are assumed to be an optimization option, while other options
+are assumed to be used with a @code{-f} prefix.  You can also use the
+@samp{#pragma GCC optimize} pragma to set the optimization options
+that affect more than one function.
+@xref{Function Specific Option Pragmas}, for details about the
+@samp{#pragma GCC optimize} pragma.
+
+This can be used for instance to have frequently executed functions
+compiled with more aggressive optimization options that produce faster
+and larger code, while other functions can be compiled with less
+aggressive options.  The @code{hot} attribute implies
+@code{optimize("O3")}, and the @code{cold} attribute implies
+@code{optimize("Os")}.
+
+@smallexample
+int fast_func (void) __attribute__ ((__optimize__ ("O3,unroll-loops")));
+int slow_func (void) __attribute__ ((__optimize__ ("Os")));
+@end smallexample
+
+The inliner will not inline functions with a higher optimization level
+than the caller or different space/time trade offs.
+
 @item pure
 @cindex @code{pure} function attribute
 Many functions have no effects except the return value and their
@@ -2697,7 +2900,11 @@ all hot functions appears close together
 When profile feedback is available, via @option{-fprofile-use}, hot functions
 are automatically detected and this attribute is ignored.
 
-The @code{hot} attribute is not implemented in GCC versions earlier than 4.3.
+The @code{hot} attribute is not implemented in GCC versions earlier
+than 4.3.
+
+Starting with GCC 4.4, the @code{hot} attribute sets
+@code{optimize("O3")} to turn on more aggressive optimization.
 
 @item cold
 @cindex @code{cold} function attribute
@@ -2714,7 +2921,10 @@ occasions.
 When profile feedback is available, via @option{-fprofile-use}, hot functions
 are automatically detected and this attribute is ignored.
 
-The @code{hot} attribute is not implemented in GCC versions earlier than 4.3.
+The @code{cold} attribute is not implemented in GCC versions earlier than 4.3.
+
+Starting with GCC 4.4, the @code{cold} attribute sets
+@code{optimize("Os")} to save space.
 
 @item regparm (@var{number})
 @cindex @code{regparm} attribute
@@ -11108,6 +11318,7 @@ for further explanation.
 * Diagnostic Pragmas::
 * Visibility Pragmas::
 * Push/Pop Macro Pragmas::
+* Function Specific Option Pragmas::
 @end menu
 
 @node ARM Pragmas
@@ -11458,6 +11669,80 @@ int x [X];
 In this example, the definition of X as 1 is saved by @code{#pragma
 push_macro} and restored by @code{#pragma pop_macro}.
 
+@node Function Specific Option Pragmas
+@subsection Function Specific Option Pragmas
+
+@table @code
+@item #pragma GCC option (@var{"string"}...)
+@cindex pragma GCC option
+
+This pragma allows you to set target specific options for functions
+defined later in the source file.  One or more strings can be
+specified.  Each function that is defined after this point will be as
+if @code{attribute((option("STRING")))} was specified for that
+function.  The parentheses around the options are optional.
+@xref{Function Attributes}, for more information about the
+@code{option} attribute and the attribute syntax.
+
+The @samp{#pragma GCC option} pragma is not implemented in GCC
+versions earlier than 4.4, and is currently only implemented for the
+i386 and x86_64 back ends.
+@end table
+
+@table @code
+@item #pragma GCC option (push)
+@itemx #pragma GCC option (pop)
+@cindex pragma GCC option
+
+These pragmas maintain a stack of the current target options.  They
+are intended for include files that temporarily switch to a different
+@samp{#pragma GCC option} setting and then pop back to the previous
+options.
+@end table
+
+@table @code
+@item #pragma GCC option (reset)
+@cindex pragma, target option
+@cindex pragma GCC option
+
+This pragma clears the current @code{#pragma GCC option} settings and
+reverts to the default switches as specified on the command line.
+@end table
+@table @code
+@item #pragma GCC optimize (@var{"string"}...)
+@cindex pragma GCC optimize
+
+This pragma allows you to set global optimization options for functions
+defined later in the source file.  One or more strings can be
+specified.  Each function that is defined after this point will be as
+if @code{attribute((optimize("STRING")))} was specified for that
+function.  The parentheses around the options are optional.
+@xref{Function Attributes}, for more information about the
+@code{optimize} attribute and the attribute syntax.
+
+The @samp{#pragma GCC optimize} pragma is not implemented in GCC
+versions earlier than 4.4.
+@end table
+
+@table @code
+@item #pragma GCC optimize (push)
+@itemx #pragma GCC optimize (pop)
+@cindex pragma GCC optimize
+
+These pragmas maintain a stack of the current optimization options.
+They are intended for include files that temporarily switch to a
+different @code{#pragma GCC optimize} setting and then pop back to
+the previous optimizations.
+@end table
+
+@table @code
+@item #pragma GCC optimize reset
+@cindex pragma GCC optimize
+
+This pragma clears the current @code{#pragma GCC optimize} settings and
+reverts to the default switches as specified on the command line.
+@end table
+
 @node Unnamed Fields
 @section Unnamed struct/union fields within structs/unions
 @cindex struct
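
The pragma text above says that each later function behaves as if attribute((optimize("STRING"))) had been written on it.  As a rough illustration only (not part of the patch), the attribute form might look like the sketch below.  The function names sum_fast and sum_default are made up for the example, and a compiler that does not know the attribute may warn and ignore it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: ask for -O3 on this one function, whatever
   optimization level was given on the command line.  */
__attribute__((__optimize__("O3"))) float
sum_fast (const float *v, size_t n)
{
  float total = 0.0f;
  size_t i;

  assert (v != NULL || n == 0);
  for (i = 0; i < n; i++)
    total += v[i];
  return total;
}

/* Same loop, compiled with the command-line optimization options.  */
float
sum_default (const float *v, size_t n)
{
  float total = 0.0f;
  size_t i;

  assert (v != NULL || n == 0);
  for (i = 0; i < n; i++)
    total += v[i];
  return total;
}
```

Both functions must return the same result; only the generated code is expected to differ.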
Index: gcc/doc/options.texi
===================================================================
--- gcc/doc/options.texi (revision 138074)
+++ gcc/doc/options.texi (working copy)
@@ -35,8 +35,11 @@ has been declared in this way, it can be
 @xref{Option properties}.
 
 @item
-An option definition record.  These records have the following fields:
+A target specific save record to save additional information. These
+records have two fields: the string @samp{TargetSave}, and a
+declaration type to go in the @code{cl_target_option} structure.
 
+@item
 @enumerate
 @item
 the name of the option, with the leading ``-'' removed
@@ -124,7 +127,10 @@ This property cannot be used alongside @
 @item UInteger
 The option's argument is a non-negative integer.  The option parser
 will check and convert the argument before passing it to the relevant
-option handler.
+option handler.  @code{UInteger} should also be used on options like
+@code{-falign-loops} where both @code{-falign-loops} and
+@code{-falign-loops}=@var{n} are supported to make sure the saved
+options are given a full integer.
 
 @item Var(@var{var})
 The state of this option should be stored in variable @var{var}.
@@ -221,4 +227,9 @@ The option should only be accepted if pr
 option will be present even if @var{cond} is false; @var{cond} simply
 controls whether the option is accepted and whether it is printed in
 the @option{--help} output.
+
+@item Save
+Build the @code{cl_target_option} structure to hold a copy of the
+option, and add the functions @code{cl_target_option_save} and
+@code{cl_target_option_restore} to save and restore the options.
 @end table
Index: gcc/doc/c-tree.texi
===================================================================
--- gcc/doc/c-tree.texi (revision 138074)
+++ gcc/doc/c-tree.texi (working copy)
@@ -1330,6 +1330,8 @@ a containing function, and the back end
 @findex DECL_GLOBAL_CTOR_P
 @findex DECL_GLOBAL_DTOR_P
 @findex GLOBAL_INIT_PRIORITY
+@findex DECL_FUNCTION_SPECIFIC_TARGET
+@findex DECL_FUNCTION_SPECIFIC_OPTIMIZATION
 
 The following macros and functions can be used on a @code{FUNCTION_DECL}:
 @ftable @code
@@ -1514,6 +1516,17 @@ is of the form `@code{()}'.
 This predicate holds if the function an overloaded
 @code{operator delete[]}.
 
+@item DECL_FUNCTION_SPECIFIC_TARGET
+This macro returns a tree node that holds the target options that are
+to be used to compile this particular function or @code{NULL_TREE} if
+the function is to be compiled with the target options specified on
+the command line.
+
+@item DECL_FUNCTION_SPECIFIC_OPTIMIZATION
+This macro returns a tree node that holds the optimization options
+that are to be used to compile this particular function or
+@code{NULL_TREE} if the function is to be compiled with the
+optimization options specified on the command line.
 @end ftable
 
 @c ---------------------------------------------------------------------
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi (revision 138074)
+++ gcc/doc/tm.texi (working copy)
@@ -9271,6 +9271,51 @@ attributes, @code{false} otherwise.  By
 target specific attribute attached to it, it will not be inlined.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_VALID_OPTION_ATTRIBUTE_P (tree @var{fndecl}, tree @var{name}, tree @var{args}, int @var{flags})
+This hook is called to parse @code{attribute(option("..."))}.  It
+allows the current function to be compiled with target machine
+options that differ from the options specified on the command
+line.  The hook should return @code{true} if the options are
+valid.
+
+The hook should set the @var{DECL_FUNCTION_SPECIFIC_TARGET} field in
+the function declaration to hold a pointer to a target specific
+@var{struct cl_target_option} structure.
+@end deftypefn
+
+@deftypefn {Target Hook} void TARGET_OPTION_SAVE (struct cl_target_option *@var{ptr})
+This hook is called to save any additional target specific information
+in the @var{struct cl_target_option} structure for function specific
+options.
+@xref{Option file format}.
+@end deftypefn
+
+@deftypefn {Target Hook} void TARGET_OPTION_RESTORE (struct cl_target_option *@var{ptr})
+This hook is called to restore any additional target specific
+information in the @var{struct cl_target_option} structure for
+function specific options.
+@end deftypefn
+
+@deftypefn {Target Hook} void TARGET_OPTION_PRINT (FILE *@var{file}, int @var{indent}, struct cl_target_option *@var{ptr})
+This hook is called to print any additional target specific
+information in the @var{struct cl_target_option} structure for
+function specific options.
+@end deftypefn
+
+@deftypefn {Target Hook} bool TARGET_OPTION_PRAGMA_PARSE (tree @var{args})
+This target hook parses the options for @code{#pragma GCC option} to
+set the machine specific options for functions that occur later in the
+input stream.  The options should be the same as handled by the
+@code{TARGET_VALID_OPTION_ATTRIBUTE_P} hook.
+@end deftypefn
+
+@deftypefn {Target Hook} bool TARGET_CAN_INLINE_P (tree @var{caller}, tree @var{callee})
+This target hook returns @code{false} if the @var{caller} function
+cannot inline @var{callee}, based on target specific information.  By
+default, inlining is not allowed if the callee function has function
+specific target options and the caller does not use the same options.
+@end deftypefn
+
 @node Emulated TLS
 @section Emulating TLS
 @cindex Emulated TLS
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi (revision 138074)
+++ gcc/doc/invoke.texi (working copy)
@@ -10543,6 +10543,8 @@ code that expects temporaries to be 80bi
 This is the default choice for the x86-64 compiler.
 
 @item sse,387
+@itemx sse+387
+@itemx both
 Attempt to utilize both instruction sets at once.  This effectively double the
 amount of available registers and on chips with separate execution units for
 387 and SSE the execution resources too.  Use this option with care, as it is
Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c (revision 138074)
+++ gcc/targhooks.c (working copy)
@@ -709,4 +709,38 @@ default_hard_regno_scratch_ok (unsigned
   return true;
 }
 
+bool
+default_target_option_valid_attribute_p (tree ARG_UNUSED (fndecl),
+ tree ARG_UNUSED (name),
+ tree ARG_UNUSED (args),
+ int ARG_UNUSED (flags))
+{
+  return false;
+}
+
+bool
+default_target_option_can_inline_p (tree caller, tree callee)
+{
+  bool ret = false;
+  tree callee_opts = DECL_FUNCTION_SPECIFIC_TARGET (callee);
+  tree caller_opts = DECL_FUNCTION_SPECIFIC_TARGET (caller);
+
+  /* If callee has no option attributes, then it is ok to inline */
+  if (!callee_opts)
+    ret = true;
+
+  /* If caller has no option attributes, but callee does then it is not ok to
+     inline */
+  else if (!caller_opts)
+    ret = false;
+
+  /* If both caller and callee have attributes, assume that if the pointer is
+     different, the two functions have different target options since
+     build_target_option_node uses a hash table for the options.  */
+  else
+    ret = (callee_opts == caller_opts);
+
+  return ret;
+}
+
 #include "gt-targhooks.h"
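
For readers who want to try the default inlining rule outside of GCC, here is a minimal standalone sketch of the same three-way decision.  It is an illustration under assumptions, not the committed code: struct fn_sketch and can_inline_sketch are hypothetical names, and a plain pointer stands in for the hash-consed TARGET_OPTION_NODE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a function decl.  Only the pointer to its
   shared target option record matters; NULL means "use the command-line
   options".  */
struct fn_sketch
{
  const void *target_opts;
};

/* Mirror of the default rule: a callee without specific options can always
   be inlined, a caller without them cannot inline a callee that has them,
   and otherwise the two records must be the very same object.  */
static bool
can_inline_sketch (const struct fn_sketch *caller,
                   const struct fn_sketch *callee)
{
  assert (caller != NULL && callee != NULL);

  if (callee->target_opts == NULL)
    return true;
  if (caller->target_opts == NULL)
    return false;
  return callee->target_opts == caller->target_opts;
}
```

Pointer comparison is enough here only because identical option sets are assumed to share one record, which is what the hash table in tree.c provides.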
Index: gcc/targhooks.h
===================================================================
--- gcc/targhooks.h (revision 138074)
+++ gcc/targhooks.h (working copy)
@@ -97,5 +97,6 @@ extern int default_reloc_rw_mask (void);
 extern tree default_mangle_decl_assembler_name (tree, tree);
 extern tree default_emutls_var_fields (tree, tree *);
 extern tree default_emutls_var_init (tree, tree, tree);
-
 extern bool default_hard_regno_scratch_ok (unsigned int);
+extern bool default_target_option_valid_attribute_p (tree, tree, tree, int);
+extern bool default_target_option_can_inline_p (tree, tree);
Index: gcc/tree.c
===================================================================
--- gcc/tree.c (revision 138074)
+++ gcc/tree.c (working copy)
@@ -175,6 +175,16 @@ static GTY (()) tree int_cst_node;
 static GTY ((if_marked ("ggc_marked_p"), param_is (union tree_node)))
      htab_t int_cst_hash_table;
 
+/* Hash table for optimization flags and target option flags.  Use the same
+   hash table for both sets of options.  The two nodes below are scratch
+   nodes for building the current optimization and target option nodes.  The
+   assumption is that most of the time the options created will already be in
+   the hash table, so we avoid allocating and freeing a node repeatedly.  */
+static GTY (()) tree cl_optimization_node;
+static GTY (()) tree cl_target_option_node;
+static GTY ((if_marked ("ggc_marked_p"), param_is (union tree_node)))
+     htab_t cl_option_hash_table;
+
 /* General tree->tree mapping  structure for use in hash tables.  */
 
 
@@ -196,6 +206,8 @@ static int type_hash_eq (const void *, c
 static hashval_t type_hash_hash (const void *);
 static hashval_t int_cst_hash_hash (const void *);
 static int int_cst_hash_eq (const void *, const void *);
+static hashval_t cl_option_hash_hash (const void *);
+static int cl_option_hash_eq (const void *, const void *);
 static void print_type_hash_statistics (void);
 static void print_debug_expr_statistics (void);
 static void print_value_expr_statistics (void);
@@ -273,6 +285,12 @@ init_ttree (void)
   
   int_cst_node = make_node (INTEGER_CST);
 
+  cl_option_hash_table = htab_create_ggc (64, cl_option_hash_hash,
+  cl_option_hash_eq, NULL);
+
+  cl_optimization_node = make_node (OPTIMIZATION_NODE);
+  cl_target_option_node = make_node (TARGET_OPTION_NODE);
+
   tree_contains_struct[FUNCTION_DECL][TS_DECL_NON_COMMON] = 1;
   tree_contains_struct[TRANSLATION_UNIT_DECL][TS_DECL_NON_COMMON] = 1;
   tree_contains_struct[TYPE_DECL][TS_DECL_NON_COMMON] = 1;
@@ -505,6 +523,8 @@ tree_code_size (enum tree_code code)
  case STATEMENT_LIST: return sizeof (struct tree_statement_list);
  case BLOCK: return sizeof (struct tree_block);
  case CONSTRUCTOR: return sizeof (struct tree_constructor);
+ case OPTIMIZATION_NODE: return sizeof (struct tree_optimization_option);
+ case TARGET_OPTION_NODE: return sizeof (struct tree_target_option);
 
  default:
   return lang_hooks.tree_size (code);
@@ -2427,6 +2447,8 @@ tree_node_structure (const_tree t)
     case CONSTRUCTOR: return TS_CONSTRUCTOR;
     case TREE_BINFO: return TS_BINFO;
     case OMP_CLAUSE: return TS_OMP_CLAUSE;
+    case OPTIMIZATION_NODE: return TS_OPTIMIZATION;
+    case TARGET_OPTION_NODE: return TS_TARGET_OPTION;
 
     default:
       gcc_unreachable ();
@@ -8942,4 +8964,132 @@ block_nonartificial_location (tree block
   return ret;
 }
 
+/* These are the hash table functions for the hash table of OPTIMIZATION_NODE
+   and TARGET_OPTION_NODE nodes.  */
+
+/* Return the hash code for X, an OPTIMIZATION_NODE or TARGET_OPTION_NODE.  */
+
+static hashval_t
+cl_option_hash_hash (const void *x)
+{
+  const_tree const t = (const_tree) x;
+  const char *p;
+  size_t i;
+  size_t len = 0;
+  hashval_t hash = 0;
+
+  if (TREE_CODE (t) == OPTIMIZATION_NODE)
+    {
+      p = (const char *)TREE_OPTIMIZATION (t);
+      len = sizeof (struct cl_optimization);
+    }
+
+  else if (TREE_CODE (t) == TARGET_OPTION_NODE)
+    {
+      p = (const char *)TREE_TARGET_OPTION (t);
+      len = sizeof (struct cl_target_option);
+    }
+
+  else
+    gcc_unreachable ();
+
+  /* Assume most opt flags are just 0/1, some are 2-3, and a few might be
+     something else.  */
+  for (i = 0; i < len; i++)
+    if (p[i])
+      hash = (hash << 4) ^ ((i << 2) | p[i]);
+
+  return hash;
+}
+
+/* Return nonzero if the value represented by *X (an OPTIMIZATION_NODE or
+   TARGET_OPTION_NODE tree node) is the same as that given by *Y, a node of
+   the same kind.  */
+
+static int
+cl_option_hash_eq (const void *x, const void *y)
+{
+  const_tree const xt = (const_tree) x;
+  const_tree const yt = (const_tree) y;
+  const char *xp;
+  const char *yp;
+  size_t len;
+
+  if (TREE_CODE (xt) != TREE_CODE (yt))
+    return 0;
+
+  if (TREE_CODE (xt) == OPTIMIZATION_NODE)
+    {
+      xp = (const char *)TREE_OPTIMIZATION (xt);
+      yp = (const char *)TREE_OPTIMIZATION (yt);
+      len = sizeof (struct cl_optimization);
+    }
+
+  else if (TREE_CODE (xt) == TARGET_OPTION_NODE)
+    {
+      xp = (const char *)TREE_TARGET_OPTION (xt);
+      yp = (const char *)TREE_TARGET_OPTION (yt);
+      len = sizeof (struct cl_target_option);
+    }
+
+  else
+    gcc_unreachable ();
+
+  return (memcmp (xp, yp, len) == 0);
+}
+
+/* Build an OPTIMIZATION_NODE based on the current options.  */
+
+tree
+build_optimization_node (void)
+{
+  tree t;
+  void **slot;
+
+  /* Use the cache of optimization nodes.  */
+
+  cl_optimization_save (TREE_OPTIMIZATION (cl_optimization_node));
+
+  slot = htab_find_slot (cl_option_hash_table, cl_optimization_node, INSERT);
+  t = (tree) *slot;
+  if (!t)
+    {
+      /* Insert this one into the hash table.  */
+      t = cl_optimization_node;
+      *slot = t;
+
+      /* Make a new node for next time round.  */
+      cl_optimization_node = make_node (OPTIMIZATION_NODE);
+    }
+
+  return t;
+}
+
+/* Build a TARGET_OPTION_NODE based on the current options.  */
+
+tree
+build_target_option_node (void)
+{
+  tree t;
+  void **slot;
+
+  /* Use the cache of target option nodes.  */
+
+  cl_target_option_save (TREE_TARGET_OPTION (cl_target_option_node));
+
+  slot = htab_find_slot (cl_option_hash_table, cl_target_option_node, INSERT);
+  t = (tree) *slot;
+  if (!t)
+    {
+      /* Insert this one into the hash table.  */
+      t = cl_target_option_node;
+      *slot = t;
+
+      /* Make a new node for next time round.  */
+      cl_target_option_node = make_node (TARGET_OPTION_NODE);
+    }
+
+  return t;
+}
+
 #include "gt-tree.h"
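
The two build_*_node functions use a scratch node: fill it in, look it up, and only keep it if no equal node is already in the table.  The following is a small standalone sketch of that pattern, written under assumptions rather than taken from GCC.  The names opts_sketch, node_sketch and build_node_sketch are hypothetical, a fixed-size linear-probing table replaces htab_t, and calloc replaces the garbage-collected make_node.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 64  /* fixed size; this sketch never grows the table */

/* All members are char, so the struct has no padding bytes and a byte-wise
   hash and memcmp are sound.  */
struct opts_sketch { char optimize; char optimize_size; char unroll_loops; };
struct node_sketch { struct opts_sketch opts; };

static struct node_sketch *table[TABLE_SIZE];
static struct node_sketch *scratch;  /* reused until it is inserted */

/* Same shape as the hash in the patch: mix in only the nonzero bytes.  */
static unsigned
hash_opts (const struct opts_sketch *o)
{
  const unsigned char *p = (const unsigned char *) o;
  unsigned hash = 0;
  size_t i;

  for (i = 0; i < sizeof *o; i++)
    if (p[i])
      hash = (hash << 4) ^ (((unsigned) i << 2) | p[i]);
  return hash;
}

/* Return the unique node for CURRENT, creating it on first use.  */
static struct node_sketch *
build_node_sketch (const struct opts_sketch *current)
{
  unsigned slot, probes;

  if (!scratch)
    scratch = calloc (1, sizeof *scratch);
  assert (scratch != NULL);
  scratch->opts = *current;

  slot = hash_opts (&scratch->opts) % TABLE_SIZE;
  for (probes = 0; probes < TABLE_SIZE; probes++)
    {
      struct node_sketch *t = table[slot];

      if (!t)
        {
          /* Not seen before: the scratch node becomes the table entry and
             a fresh scratch node is allocated on the next call.  */
          t = scratch;
          table[slot] = t;
          scratch = NULL;
          return t;
        }
      if (memcmp (&t->opts, &scratch->opts, sizeof t->opts) == 0)
        return t;  /* already cached; the scratch node is kept for reuse */
      slot = (slot + 1) % TABLE_SIZE;
    }
  abort ();  /* table full; a real table would resize instead */
}
```

Because equal option sets always come back as the same pointer, callers can compare nodes with ==, which is what the default inlining hook relies on.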
Index: gcc/tree.h
===================================================================
--- gcc/tree.h (revision 138074)
+++ gcc/tree.h (working copy)
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.  
 #include "vec.h"
 #include "double-int.h"
 #include "alias.h"
+#include "options.h"
 
 /* Codes of tree nodes */
 
@@ -3408,6 +3409,16 @@ struct tree_decl_non_common GTY(())
 #define DECL_ARGUMENTS(NODE) (FUNCTION_DECL_CHECK (NODE)->decl_non_common.arguments)
 #define DECL_ARGUMENT_FLD(NODE) (DECL_NON_COMMON_CHECK (NODE)->decl_non_common.arguments)
 
+/* In FUNCTION_DECL, the function specific target options to use when compiling
+   this function.  */
+#define DECL_FUNCTION_SPECIFIC_TARGET(NODE) \
+   (FUNCTION_DECL_CHECK (NODE)->function_decl.function_specific_target)
+
+/* In FUNCTION_DECL, the function specific optimization options to use when
+   compiling this function.  */
+#define DECL_FUNCTION_SPECIFIC_OPTIMIZATION(NODE) \
+   (FUNCTION_DECL_CHECK (NODE)->function_decl.function_specific_optimization)
+
 /* FUNCTION_DECL inherits from DECL_NON_COMMON because of the use of the
    arguments/result/saved_tree fields by front ends.   It was either inherit
    FUNCTION_DECL from non_common, or inherit non_common from FUNCTION_DECL,
@@ -3419,6 +3430,10 @@ struct tree_function_decl GTY(())
 
   struct function *f;
 
+  /* Function specific options that are used by this function.  */
+  tree function_specific_target; /* target options */
+  tree function_specific_optimization; /* optimization options */
+
   /* In a FUNCTION_DECL for which DECL_BUILT_IN holds, this is
      DECL_FUNCTION_CODE.  Otherwise unused.
      ???  The bitfield needs to be able to hold all target function
@@ -3491,6 +3506,39 @@ struct tree_statement_list
   struct tree_statement_list_node *tail;
 };
 
+
+/* Optimization options used by a function.  */
+
+struct tree_optimization_option GTY(())
+{
+  struct tree_common common;
+
+  /* The optimization options used by the user.  */
+  struct cl_optimization opts;
+};
+
+#define TREE_OPTIMIZATION(NODE) \
+  (&OPTIMIZATION_NODE_CHECK (NODE)->optimization.opts)
+
+/* Return a tree node that encapsulates the current optimization options.  */
+extern tree build_optimization_node (void);
+
+/* Target options used by a function.  */
+
+struct tree_target_option GTY(())
+{
+  struct tree_common common;
+
+  /* The target options used by the user.  */
+  struct cl_target_option opts;
+};
+
+#define TREE_TARGET_OPTION(NODE) \
+  (&TARGET_OPTION_NODE_CHECK (NODE)->target_option.opts)
+
+/* Return a tree node that encapsulates the current target options.  */
+extern tree build_target_option_node (void);
+
 
 /* Define the overall contents of a tree node.
    It may be any of the structures declared above
@@ -3535,6 +3583,8 @@ union tree_node GTY ((ptr_alias (union l
   struct tree_memory_tag GTY ((tag ("TS_MEMORY_TAG"))) mtag;
   struct tree_omp_clause GTY ((tag ("TS_OMP_CLAUSE"))) omp_clause;
   struct tree_memory_partition_tag GTY ((tag ("TS_MEMORY_PARTITION_TAG"))) mpt;
+  struct tree_optimization_option GTY ((tag ("TS_OPTIMIZATION"))) optimization;
+  struct tree_target_option GTY ((tag ("TS_TARGET_OPTION"))) target_option;
 };
 
 /* Standard named or nameless data types of the C compiler.  */
@@ -3682,6 +3732,15 @@ enum tree_index
   TI_SAT_UDA_TYPE,
   TI_SAT_UTA_TYPE,
 
+  TI_OPTIMIZATION_DEFAULT,
+  TI_OPTIMIZATION_CURRENT,
+  TI_OPTIMIZATION_COLD,
+  TI_OPTIMIZATION_HOT,
+  TI_TARGET_OPTION_DEFAULT,
+  TI_TARGET_OPTION_CURRENT,
+  TI_CURRENT_OPTION_PRAGMA,
+  TI_CURRENT_OPTIMIZE_PRAGMA,
+
   TI_MAX
 };
 
@@ -3849,6 +3908,22 @@ extern GTY(()) tree global_trees[TI_MAX]
 #define main_identifier_node global_trees[TI_MAIN_IDENTIFIER]
 #define MAIN_NAME_P(NODE) (IDENTIFIER_NODE_CHECK (NODE) == main_identifier_node)
 
+/* Optimization options (OPTIMIZATION_NODE) to use for default, current, cold,
+   and hot functions.  */
+#define optimization_default_node global_trees[TI_OPTIMIZATION_DEFAULT]
+#define optimization_current_node global_trees[TI_OPTIMIZATION_CURRENT]
+#define optimization_cold_node global_trees[TI_OPTIMIZATION_COLD]
+#define optimization_hot_node global_trees[TI_OPTIMIZATION_HOT]
+
+/* Default/current target options (TARGET_OPTION_NODE).  */
+#define target_option_default_node global_trees[TI_TARGET_OPTION_DEFAULT]
+#define target_option_current_node global_trees[TI_TARGET_OPTION_CURRENT]
+
+/* Default tree list option(), optimize() pragmas to be linked into the
+   attribute list.  */
+#define current_option_pragma global_trees[TI_CURRENT_OPTION_PRAGMA]
+#define current_optimize_pragma global_trees[TI_CURRENT_OPTIMIZE_PRAGMA]
+
 /* An enumeration of the standard C integer types.  These must be
    ordered so that shorter types appear before longer ones, and so
    that signed types appear before unsigned ones, for the correct
Index: gcc/target.h
===================================================================
--- gcc/target.h (revision 138074)
+++ gcc/target.h (working copy)
@@ -963,6 +963,34 @@ struct gcc_target
     bool debug_form_tls_address;
   } emutls;  
 
+  struct target_option_hooks {
+    /* Function to validate the attribute((option(...))) strings or NULL.
+       If the option is validated, DECL_FUNCTION_SPECIFIC_TARGET is assumed
+       to be filled in in the function decl node.  */
+    bool (*valid_attribute_p) (tree, tree, tree, int);
+
+    /* Function to save any extra target state in the target options
+       structure.  */
+    void (*save) (struct cl_target_option *);
+
+    /* Function to restore any extra target state from the target options
+       structure.  */
+    void (*restore) (struct cl_target_option *);
+
+    /* Function to print any extra target state from the target options
+       structure.  */
+    void (*print) (FILE *, int, struct cl_target_option *);
+
+    /* Function to parse arguments to be validated for #pragma GCC option, and
+       to change the state if the options are valid.  If the arguments are
+       NULL, use the default target options.  Return true if the options are
+       valid, and set the current state.  */
+    bool (*pragma_parse) (tree);
+
+    /* Function to determine if one function can inline another function.  */
+    bool (*can_inline_p) (tree, tree);
+  } target_option;
+
   /* For targets that need to mark extra registers as live on entry to
      the function, they should define this target hook and set their
      bits in the bitmap passed in. */  
Index: gcc/toplev.h
===================================================================
--- gcc/toplev.h (revision 138074)
+++ gcc/toplev.h (working copy)
@@ -82,6 +82,7 @@ extern void announce_function (tree);
 extern void error_for_asm (const_rtx, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void warning_for_asm (const_rtx, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void warn_deprecated_use (tree);
+extern bool parse_optimize_options (tree, bool);
 
 #ifdef BUFSIZ
 extern void output_quoted_string (FILE *, const char *);
@@ -158,6 +159,7 @@ extern void decode_d_option (const char
 
 /* Return true iff flags are set as if -ffast-math.  */
 extern bool fast_math_flags_set_p (void);
+extern bool fast_math_flags_struct_set_p (struct cl_optimization *);
 
 /* Return log2, or -1 if not exact.  */
 extern int exact_log2                  (unsigned HOST_WIDE_INT);
Index: gcc/c-cppbuiltin.c
===================================================================
--- gcc/c-cppbuiltin.c (revision 138074)
+++ gcc/c-cppbuiltin.c (working copy)
@@ -405,6 +405,58 @@ builtin_define_stdint_macros (void)
   builtin_define_type_max ("__INTMAX_MAX__", intmax_type_node, intmax_long);
 }
 
+/* Adjust the optimization macros when a #pragma GCC optimize is done to
+   reflect the current level.  */
+void
+c_cpp_builtins_optimize_pragma (cpp_reader *pfile, tree prev_tree,
+ tree cur_tree)
+{
+  struct cl_optimization *prev = TREE_OPTIMIZATION (prev_tree);
+  struct cl_optimization *cur  = TREE_OPTIMIZATION (cur_tree);
+  bool prev_fast_math;
+  bool cur_fast_math;
+
+  /* -undef turns off target-specific built-ins.  */
+  if (flag_undef)
+    return;
+
+  /* Other target-independent built-ins determined by command-line
+     options.  */
+  if (!prev->optimize_size && cur->optimize_size)
+    cpp_define (pfile, "__OPTIMIZE_SIZE__");
+  else if (prev->optimize_size && !cur->optimize_size)
+    cpp_undef (pfile, "__OPTIMIZE_SIZE__");
+
+  if (!prev->optimize && cur->optimize)
+    cpp_define (pfile, "__OPTIMIZE__");
+  else if (prev->optimize && !cur->optimize)
+    cpp_undef (pfile, "__OPTIMIZE__");
+
+  prev_fast_math = fast_math_flags_struct_set_p (prev);
+  cur_fast_math  = fast_math_flags_struct_set_p (cur);
+  if (!prev_fast_math && cur_fast_math)
+    cpp_define (pfile, "__FAST_MATH__");
+  else if (prev_fast_math && !cur_fast_math)
+    cpp_undef (pfile, "__FAST_MATH__");
+
+  if (!prev->flag_signaling_nans && cur->flag_signaling_nans)
+    cpp_define (pfile, "__SUPPORT_SNAN__");
+  else if (prev->flag_signaling_nans && !cur->flag_signaling_nans)
+    cpp_undef (pfile, "__SUPPORT_SNAN__");
+
+  if (!prev->flag_finite_math_only && cur->flag_finite_math_only)
+    {
+      cpp_undef (pfile, "__FINITE_MATH_ONLY__");
+      cpp_define (pfile, "__FINITE_MATH_ONLY__=1");
+    }
+  else if (prev->flag_finite_math_only && !cur->flag_finite_math_only)
+    {
+      cpp_undef (pfile, "__FINITE_MATH_ONLY__");
+      cpp_define (pfile, "__FINITE_MATH_ONLY__=0");
+    }
+}
+
+
 /* Hook that registers front end and target-specific built-ins.  */
 void
 c_cpp_builtins (cpp_reader *pfile)
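
The macro updates in c_cpp_builtins_optimize_pragma follow one rule: touch a macro only when the option changed between the previous and the current state.  A standalone sketch of that rule is given below.  It is an assumption-laden model, not the GCC code: update_flag_macro and finite_math_only_update are hypothetical helpers, an array of booleans stands in for cpp_define and cpp_undef, and the finite-math helper assumes the second branch is meant to handle the change from set to clear.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical macro slots; the real code calls cpp_define/cpp_undef.  */
enum macro_slot { M_OPTIMIZE, M_OPTIMIZE_SIZE, M_FAST_MATH, M_COUNT };

/* Flip DEFINED[WHICH] only when the option changed from PREV to CUR.  */
static void
update_flag_macro (bool defined[M_COUNT], enum macro_slot which,
                   bool prev, bool cur)
{
  assert ((unsigned) which < (unsigned) M_COUNT);
  if (!prev && cur)
    defined[which] = true;
  else if (prev && !cur)
    defined[which] = false;
}

/* __FINITE_MATH_ONLY__ always has a value, so a change means redefining it.
   Return the new value, or -1 when no redefinition is needed.  */
static int
finite_math_only_update (bool prev, bool cur)
{
  if (prev == cur)
    return -1;
  return cur ? 1 : 0;
}
```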
Index: gcc/testsuite/gcc.target/i386/sse-23.c
===================================================================
--- gcc/testsuite/gcc.target/i386/sse-23.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/sse-23.c (revision 0)
@@ -0,0 +1,108 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8" } */
+
+#include <mm_malloc.h>
+
+/* Test that the intrinsics compile with optimization.  All of them are
+   defined as inline functions in {,x,e,p,t,s,w,a,b}mmintrin.h and mm3dnow.h
+   that reference the proper builtin functions.  Defining away "extern" and
+   "__inline" results in all of them being compiled as proper functions.  */
+
+#define extern
+#define __inline
+
+/* Following intrinsics require immediate arguments. */
+
+/* ammintrin.h */
+#define __builtin_ia32_extrqi(X, I, L)  __builtin_ia32_extrqi(X, 1, 1)
+#define __builtin_ia32_insertqi(X, Y, I, L) __builtin_ia32_insertqi(X, Y, 1, 1)
+
+/* wmmintrin.h */
+#define __builtin_ia32_aeskeygenassist128(X, C) __builtin_ia32_aeskeygenassist128(X, 1)
+#define __builtin_ia32_pclmulqdq128(X, Y, I) __builtin_ia32_pclmulqdq128(X, Y, 1)
+
+/* mmintrin-common.h */
+#define __builtin_ia32_roundpd(V, M) __builtin_ia32_roundpd(V, 1)
+#define __builtin_ia32_roundsd(D, V, M) __builtin_ia32_roundsd(D, V, 1)
+#define __builtin_ia32_roundps(V, M) __builtin_ia32_roundps(V, 1)
+#define __builtin_ia32_roundss(D, V, M) __builtin_ia32_roundss(D, V, 1)
+
+/* smmintrin.h */
+#define __builtin_ia32_pblendw128(X, Y, M) __builtin_ia32_pblendw128 (X, Y, 1)
+#define __builtin_ia32_blendps(X, Y, M) __builtin_ia32_blendps(X, Y, 1)
+#define __builtin_ia32_blendpd(X, Y, M) __builtin_ia32_blendpd(X, Y, 1)
+#define __builtin_ia32_dpps(X, Y, M) __builtin_ia32_dpps(X, Y, 1)
+#define __builtin_ia32_dppd(X, Y, M) __builtin_ia32_dppd(X, Y, 1)
+#define __builtin_ia32_insertps128(D, S, N) __builtin_ia32_insertps128(D, S, 1)
+#define __builtin_ia32_vec_ext_v4sf(X, N) __builtin_ia32_vec_ext_v4sf(X, 1)
+#define __builtin_ia32_vec_set_v16qi(D, S, N) __builtin_ia32_vec_set_v16qi(D, S, 1)
+#define __builtin_ia32_vec_set_v4si(D, S, N) __builtin_ia32_vec_set_v4si(D, S, 1)
+#define __builtin_ia32_vec_set_v2di(D, S, N) __builtin_ia32_vec_set_v2di(D, S, 1)
+#define __builtin_ia32_vec_ext_v16qi(X, N) __builtin_ia32_vec_ext_v16qi(X, 1)
+#define __builtin_ia32_vec_ext_v4si(X, N) __builtin_ia32_vec_ext_v4si(X, 1)
+#define __builtin_ia32_vec_ext_v2di(X, N) __builtin_ia32_vec_ext_v2di(X, 1)
+#define __builtin_ia32_mpsadbw128(X, Y, M) __builtin_ia32_mpsadbw128(X, Y, 1)
+#define __builtin_ia32_pcmpistrm128(X, Y, M) \
+  __builtin_ia32_pcmpistrm128(X, Y, 1)
+#define __builtin_ia32_pcmpistri128(X, Y, M) \
+  __builtin_ia32_pcmpistri128(X, Y, 1)
+#define __builtin_ia32_pcmpestrm128(X, LX, Y, LY, M) \
+  __builtin_ia32_pcmpestrm128(X, LX, Y, LY, 1)
+#define __builtin_ia32_pcmpestri128(X, LX, Y, LY, M) \
+  __builtin_ia32_pcmpestri128(X, LX, Y, LY, 1)
+#define __builtin_ia32_pcmpistria128(X, Y, M) \
+  __builtin_ia32_pcmpistria128(X, Y, 1)
+#define __builtin_ia32_pcmpistric128(X, Y, M) \
+  __builtin_ia32_pcmpistric128(X, Y, 1)
+#define __builtin_ia32_pcmpistrio128(X, Y, M) \
+  __builtin_ia32_pcmpistrio128(X, Y, 1)
+#define __builtin_ia32_pcmpistris128(X, Y, M) \
+  __builtin_ia32_pcmpistris128(X, Y, 1)
+#define __builtin_ia32_pcmpistriz128(X, Y, M) \
+  __builtin_ia32_pcmpistriz128(X, Y, 1)
+#define __builtin_ia32_pcmpestria128(X, LX, Y, LY, M) \
+  __builtin_ia32_pcmpestria128(X, LX, Y, LY, 1)
+#define __builtin_ia32_pcmpestric128(X, LX, Y, LY, M) \
+  __builtin_ia32_pcmpestric128(X, LX, Y, LY, 1)
+#define __builtin_ia32_pcmpestrio128(X, LX, Y, LY, M) \
+  __builtin_ia32_pcmpestrio128(X, LX, Y, LY, 1)
+#define __builtin_ia32_pcmpestris128(X, LX, Y, LY, M) \
+  __builtin_ia32_pcmpestris128(X, LX, Y, LY, 1)
+#define __builtin_ia32_pcmpestriz128(X, LX, Y, LY, M) \
+  __builtin_ia32_pcmpestriz128(X, LX, Y, LY, 1)
+
+/* tmmintrin.h */
+#define __builtin_ia32_palignr128(X, Y, N) __builtin_ia32_palignr128(X, Y, 8)
+#define __builtin_ia32_palignr(X, Y, N) __builtin_ia32_palignr(X, Y, 8)
+
+/* emmintrin.h */
+#define __builtin_ia32_psrldqi128(A, B) __builtin_ia32_psrldqi128(A, 8)
+#define __builtin_ia32_pslldqi128(A, B) __builtin_ia32_pslldqi128(A, 8)
+#define __builtin_ia32_pshufhw(A, N) __builtin_ia32_pshufhw(A, 0)
+#define __builtin_ia32_pshuflw(A, N) __builtin_ia32_pshuflw(A, 0)
+#define __builtin_ia32_pshufd(A, N) __builtin_ia32_pshufd(A, 0)
+#define __builtin_ia32_vec_set_v8hi(A, D, N) \
+  __builtin_ia32_vec_set_v8hi(A, D, 0)
+#define __builtin_ia32_vec_ext_v8hi(A, N) __builtin_ia32_vec_ext_v8hi(A, 0)
+#define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
+
+/* xmmintrin.h */
+#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, A, _MM_HINT_NTA)
+#define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
+#define __builtin_ia32_vec_set_v4hi(A, D, N) \
+  __builtin_ia32_vec_set_v4hi(A, D, 0)
+#define __builtin_ia32_vec_ext_v4hi(A, N) __builtin_ia32_vec_ext_v4hi(A, 0)
+#define __builtin_ia32_shufps(A, B, N) __builtin_ia32_shufps(A, B, 0)
+
+/* bmmintrin.h */
+#define __builtin_ia32_protbi(A, B) __builtin_ia32_protbi(A,1)
+#define __builtin_ia32_protwi(A, B) __builtin_ia32_protwi(A,1)
+#define __builtin_ia32_protdi(A, B) __builtin_ia32_protdi(A,1)
+#define __builtin_ia32_protqi(A, B) __builtin_ia32_protqi(A,1)
+
+
+#pragma GCC option ("3dnow,sse4,sse5,aes,pclmul")
+#include <wmmintrin.h>
+#include <bmmintrin.h>
+#include <smmintrin.h>
+#include <mm3dnow.h>
Index: gcc/testsuite/gcc.target/i386/opt-1.c
===================================================================
--- gcc/testsuite/gcc.target/i386/opt-1.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/opt-1.c (revision 0)
@@ -0,0 +1,35 @@
+/* Test that attribute((optimize)) really works.  Do this test by checking
+   whether we vectorize a simple loop.  */
+/* { dg-do compile } */
+/* { dg-options "-O1 -msse2 -mfpmath=sse -march=k8" } */
+/* { dg-final { scan-assembler "prefetcht0" } } */
+/* { dg-final { scan-assembler "addps" } } */
+/* { dg-final { scan-assembler "subss" } } */
+
+#define SIZE 10240
+float a[SIZE] __attribute__((__aligned__(32)));
+float b[SIZE] __attribute__((__aligned__(32)));
+float c[SIZE] __attribute__((__aligned__(32)));
+
+/* This should vectorize.  */
+void opt3 (void) __attribute__((__optimize__(3,"unroll-all-loops,-fprefetch-loop-arrays")));
+
+void
+opt3 (void)
+{
+  int i;
+
+  for (i = 0; i < SIZE; i++)
+    a[i] = b[i] + c[i];
+}
+
+/* This should not vectorize.  */
+void
+not_opt3 (void)
+{
+  int i;
+
+  for (i = 0; i < SIZE; i++)
+    a[i] = b[i] - c[i];
+}
+
Index: gcc/testsuite/gcc.target/i386/opt-2.c
===================================================================
--- gcc/testsuite/gcc.target/i386/opt-2.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/opt-2.c (revision 0)
@@ -0,0 +1,38 @@
+/* Test that #pragma GCC optimize really works.  Do this test by checking
+   whether we vectorize a simple loop.  */
+/* { dg-do compile } */
+/* { dg-options "-O1 -msse2 -mfpmath=sse -march=k8" } */
+/* { dg-final { scan-assembler "prefetcht0" } } */
+/* { dg-final { scan-assembler "addps" } } */
+/* { dg-final { scan-assembler "subss" } } */
+
+#define SIZE 10240
+float a[SIZE] __attribute__((__aligned__(32)));
+float b[SIZE] __attribute__((__aligned__(32)));
+float c[SIZE] __attribute__((__aligned__(32)));
+
+/* This should vectorize.  */
+#pragma GCC optimize push
+#pragma GCC optimize (3, "unroll-all-loops", "-fprefetch-loop-arrays")
+
+void
+opt3 (void)
+{
+  int i;
+
+  for (i = 0; i < SIZE; i++)
+    a[i] = b[i] + c[i];
+}
+
+#pragma GCC optimize pop
+
+/* This should not vectorize.  */
+void
+not_opt3 (void)
+{
+  int i;
+
+  for (i = 0; i < SIZE; i++)
+    a[i] = b[i] - c[i];
+}
+
Index: gcc/testsuite/gcc.target/i386/cold-1.c
===================================================================
--- gcc/testsuite/gcc.target/i386/cold-1.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/cold-1.c (revision 0)
@@ -0,0 +1,13 @@
+/* Test whether using attribute((cold)) really turns on -Os.  Do this test
+   by checking whether the strcpy is emitted as a call to the library
+   function rather than being expanded inline.  */
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=k8" } */
+/* { dg-final { scan-assembler "(jmp|call)\t(.*)strcpy" } } */
+
+void cold (char *) __attribute__((__cold__));
+
+void cold (char *a)
+{
+  __builtin_strcpy (a, "testing 1.2.3 testing 1.2.3");
+}
Index: gcc/testsuite/gcc.target/i386/funcspec-1.c
===================================================================
--- gcc/testsuite/gcc.target/i386/funcspec-1.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/funcspec-1.c (revision 0)
@@ -0,0 +1,34 @@
+/* Test whether target specific options let us generate SSE2 code on a 32-bit
+   target, which does not use SSE2 by default, while still generating 387 code
+   for a function that doesn't use attribute((option)).  */
+/* { dg-do compile } */
+/* { dg-require-effective-target ilp32 } */
+/* { dg-options "-O3 -ftree-vectorize -march=i386" } */
+/* { dg-final { scan-assembler "addps\[ \t\]" } } */
+/* { dg-final { scan-assembler "fsubs\[ \t\]" } } */
+
+#ifndef SIZE
+#define SIZE 1024
+#endif
+
+static float a[SIZE] __attribute__((__aligned__(16)));
+static float b[SIZE] __attribute__((__aligned__(16)));
+static float c[SIZE] __attribute__((__aligned__(16)));
+
+void sse_addnums (void) __attribute__ ((__option__ ("sse2")));
+
+void
+sse_addnums (void)
+{
+  int i = 0;
+  for (; i < SIZE; ++i)
+    a[i] = b[i] + c[i];
+}
+
+void
+i387_subnums (void)
+{
+  int i = 0;
+  for (; i < SIZE; ++i)
+    a[i] = b[i] - c[i];
+}
Index: gcc/testsuite/gcc.target/i386/funcspec-2.c
===================================================================
--- gcc/testsuite/gcc.target/i386/funcspec-2.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/funcspec-2.c (revision 0)
@@ -0,0 +1,99 @@
+/* Test whether target specific options let us generate SSE5 code.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O2 -march=k8" } */
+
+extern void exit (int);
+
+#define SSE5_ATTR __attribute__((__option__("sse5,fused-madd")))
+extern float  flt_mul_add     (float a, float b, float c) SSE5_ATTR;
+extern float  flt_mul_sub     (float a, float b, float c) SSE5_ATTR;
+extern float  flt_neg_mul_add (float a, float b, float c) SSE5_ATTR;
+extern float  flt_neg_mul_sub (float a, float b, float c) SSE5_ATTR;
+
+extern double dbl_mul_add     (double a, double b, double c) SSE5_ATTR;
+extern double dbl_mul_sub     (double a, double b, double c) SSE5_ATTR;
+extern double dbl_neg_mul_add (double a, double b, double c) SSE5_ATTR;
+extern double dbl_neg_mul_sub (double a, double b, double c) SSE5_ATTR;
+
+float
+flt_mul_add (float a, float b, float c)
+{
+  return (a * b) + c;
+}
+
+double
+dbl_mul_add (double a, double b, double c)
+{
+  return (a * b) + c;
+}
+
+float
+flt_mul_sub (float a, float b, float c)
+{
+  return (a * b) - c;
+}
+
+double
+dbl_mul_sub (double a, double b, double c)
+{
+  return (a * b) - c;
+}
+
+float
+flt_neg_mul_add (float a, float b, float c)
+{
+  return (-(a * b)) + c;
+}
+
+double
+dbl_neg_mul_add (double a, double b, double c)
+{
+  return (-(a * b)) + c;
+}
+
+float
+flt_neg_mul_sub (float a, float b, float c)
+{
+  return (-(a * b)) - c;
+}
+
+double
+dbl_neg_mul_sub (double a, double b, double c)
+{
+  return (-(a * b)) - c;
+}
+
+float  f[10] = { 2, 3, 4 };
+double d[10] = { 2, 3, 4 };
+
+int main ()
+{
+  f[3] = flt_mul_add (f[0], f[1], f[2]);
+  f[4] = flt_mul_sub (f[0], f[1], f[2]);
+  f[5] = flt_neg_mul_add (f[0], f[1], f[2]);
+  f[6] = flt_neg_mul_sub (f[0], f[1], f[2]);
+
+  d[3] = dbl_mul_add (d[0], d[1], d[2]);
+  d[4] = dbl_mul_sub (d[0], d[1], d[2]);
+  d[5] = dbl_neg_mul_add (d[0], d[1], d[2]);
+  d[6] = dbl_neg_mul_sub (d[0], d[1], d[2]);
+  exit (0);
+}
+
+/* { dg-final { scan-assembler "fmaddss" } } */
+/* { dg-final { scan-assembler "fmaddsd" } } */
+/* { dg-final { scan-assembler "fmsubss" } } */
+/* { dg-final { scan-assembler "fmsubsd" } } */
+/* { dg-final { scan-assembler "fnmaddss" } } */
+/* { dg-final { scan-assembler "fnmaddsd" } } */
+/* { dg-final { scan-assembler "fnmsubss" } } */
+/* { dg-final { scan-assembler "fnmsubsd" } } */
+/* { dg-final { scan-assembler "call\t(.*)flt_mul_add" } } */
+/* { dg-final { scan-assembler "call\t(.*)flt_mul_sub" } } */
+/* { dg-final { scan-assembler "call\t(.*)flt_neg_mul_add" } } */
+/* { dg-final { scan-assembler "call\t(.*)flt_neg_mul_sub" } } */
+/* { dg-final { scan-assembler "call\t(.*)dbl_mul_add" } } */
+/* { dg-final { scan-assembler "call\t(.*)dbl_mul_sub" } } */
+/* { dg-final { scan-assembler "call\t(.*)dbl_neg_mul_add" } } */
+/* { dg-final { scan-assembler "call\t(.*)dbl_neg_mul_sub" } } */
Index: gcc/testsuite/gcc.target/i386/funcspec-3.c
===================================================================
--- gcc/testsuite/gcc.target/i386/funcspec-3.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/funcspec-3.c (revision 0)
@@ -0,0 +1,66 @@
+/* Test whether target specific options let us generate popcnt by
+   setting the architecture.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O2 -march=k8" } */
+
+extern void exit (int);
+extern void abort (void);
+
+#define SSE4A_ATTR __attribute__((__option__("arch=amdfam10")))
+#define SSE42_ATTR __attribute__((__option__("sse4.2")))
+
+static int sse4a_pop_i (int a) SSE4A_ATTR;
+static long sse42_pop_l (long a) SSE42_ATTR;
+static int generic_pop_i (int a);
+static long generic_pop_l (long a);
+
+static
+int sse4a_pop_i (int a)
+{
+  return __builtin_popcount (a);
+}
+
+static
+long sse42_pop_l (long a)
+{
+  return __builtin_popcountl (a);
+}
+
+static
+int generic_pop_i (int a)
+{
+  return __builtin_popcount (a);
+}
+
+static
+long generic_pop_l (long a)
+{
+  return __builtin_popcountl (a);
+}
+
+int five = 5;
+long seven = 7;
+
+int main ()
+{
+  if (sse4a_pop_i (five) != 2)
+    abort ();
+
+  if (sse42_pop_l (seven) != 3L)
+    abort ();
+
+  if (generic_pop_i (five) != 2)
+    abort ();
+
+  if (generic_pop_l (seven) != 3L)
+    abort ();
+
+  exit (0);
+}
+
+/* { dg-final { scan-assembler "popcntl" } } */
+/* { dg-final { scan-assembler "popcntq" } } */
+/* { dg-final { scan-assembler "call\t(.*)sse4a_pop_i" } } */
+/* { dg-final { scan-assembler "call\t(.*)sse42_pop_l" } } */
+/* { dg-final { scan-assembler "call\t(.*)popcountdi2" } } */
Index: gcc/testsuite/gcc.target/i386/funcspec-4.c
===================================================================
--- gcc/testsuite/gcc.target/i386/funcspec-4.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/funcspec-4.c (revision 0)
@@ -0,0 +1,14 @@
+/* Test some error conditions with function specific options.  */
+/* { dg-do compile } */
+
+/* There is no sse500 switch.  */
+extern void error1 (void) __attribute__((__option__("sse500"))); /* { dg-error "unknown" } */
+
+/* Multiple arch switches */
+extern void error2 (void) __attribute__((__option__("arch=core2,arch=k8"))); /* { dg-error "already specified" } */
+
+/* Unknown tune target */
+extern void error3 (void) __attribute__((__option__("tune=foobar"))); /* { dg-error "bad value" } */
+
+/* Option attribute on a variable */
+extern int error4 __attribute__((__option__("sse2"))); /* { dg-warning "ignored" } */
Index: gcc/testsuite/gcc.target/i386/hot-1.c
===================================================================
--- gcc/testsuite/gcc.target/i386/hot-1.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/hot-1.c (revision 0)
@@ -0,0 +1,33 @@
+/* Test whether using attribute((hot)) really turns on -O3.  Do this test
+   by checking whether we vectorize a simple loop.  */
+/* { dg-do compile } */
+/* { dg-options "-O1 -msse2 -mfpmath=sse -march=k8" } */
+/* { dg-final { scan-assembler "addps" } } */
+/* { dg-final { scan-assembler "subss" } } */
+
+#define SIZE 1024
+float a[SIZE] __attribute__((__aligned__(32)));
+float b[SIZE] __attribute__((__aligned__(32)));
+float c[SIZE] __attribute__((__aligned__(32)));
+
+/* This should vectorize.  */
+void hot (void) __attribute__((__hot__));
+
+void
+hot (void)
+{
+  int i;
+
+  for (i = 0; i < SIZE; i++)
+    a[i] = b[i] + c[i];
+}
+
+/* This should not vectorize.  */
+void
+not_hot (void)
+{
+  int i;
+
+  for (i = 0; i < SIZE; i++)
+    a[i] = b[i] - c[i];
+}
Index: gcc/testsuite/gcc.target/i386/funcspec-5.c
===================================================================
--- gcc/testsuite/gcc.target/i386/funcspec-5.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/funcspec-5.c (revision 0)
@@ -0,0 +1,125 @@
+/* Test whether all of the 32-bit function specific options are accepted
+   without error.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target ilp32 } */
+
+extern void test_abm (void) __attribute__((__option__("abm")));
+extern void test_aes (void) __attribute__((__option__("aes")));
+extern void test_fused_madd (void) __attribute__((__option__("fused-madd")));
+extern void test_mmx (void) __attribute__((__option__("mmx")));
+extern void test_pclmul (void) __attribute__((__option__("pclmul")));
+extern void test_popcnt (void) __attribute__((__option__("popcnt")));
+extern void test_recip (void) __attribute__((__option__("recip")));
+extern void test_sse (void) __attribute__((__option__("sse")));
+extern void test_sse2 (void) __attribute__((__option__("sse2")));
+extern void test_sse3 (void) __attribute__((__option__("sse3")));
+extern void test_sse4 (void) __attribute__((__option__("sse4")));
+extern void test_sse4_1 (void) __attribute__((__option__("sse4.1")));
+extern void test_sse4_2 (void) __attribute__((__option__("sse4.2")));
+extern void test_sse4a (void) __attribute__((__option__("sse4a")));
+extern void test_sse5 (void) __attribute__((__option__("sse5")));
+extern void test_ssse3 (void) __attribute__((__option__("ssse3")));
+
+extern void test_no_abm (void) __attribute__((__option__("no-abm")));
+extern void test_no_aes (void) __attribute__((__option__("no-aes")));
+extern void test_no_fused_madd (void) __attribute__((__option__("no-fused-madd")));
+extern void test_no_mmx (void) __attribute__((__option__("no-mmx")));
+extern void test_no_pclmul (void) __attribute__((__option__("no-pclmul")));
+extern void test_no_popcnt (void) __attribute__((__option__("no-popcnt")));
+extern void test_no_recip (void) __attribute__((__option__("no-recip")));
+extern void test_no_sse (void) __attribute__((__option__("no-sse")));
+extern void test_no_sse2 (void) __attribute__((__option__("no-sse2")));
+extern void test_no_sse3 (void) __attribute__((__option__("no-sse3")));
+extern void test_no_sse4 (void) __attribute__((__option__("no-sse4")));
+extern void test_no_sse4_1 (void) __attribute__((__option__("no-sse4.1")));
+extern void test_no_sse4_2 (void) __attribute__((__option__("no-sse4.2")));
+extern void test_no_sse4a (void) __attribute__((__option__("no-sse4a")));
+extern void test_no_sse5 (void) __attribute__((__option__("no-sse5")));
+extern void test_no_ssse3 (void) __attribute__((__option__("no-ssse3")));
+
+extern void test_arch_i386 (void) __attribute__((__option__("arch=i386")));
+extern void test_arch_i486 (void) __attribute__((__option__("arch=i486")));
+extern void test_arch_i586 (void) __attribute__((__option__("arch=i586")));
+extern void test_arch_pentium (void) __attribute__((__option__("arch=pentium")));
+extern void test_arch_pentium_mmx (void) __attribute__((__option__("arch=pentium-mmx")));
+extern void test_arch_winchip_c6 (void) __attribute__((__option__("arch=winchip-c6")));
+extern void test_arch_winchip2 (void) __attribute__((__option__("arch=winchip2")));
+extern void test_arch_c3 (void) __attribute__((__option__("arch=c3")));
+extern void test_arch_c3_2 (void) __attribute__((__option__("arch=c3-2")));
+extern void test_arch_i686 (void) __attribute__((__option__("arch=i686")));
+extern void test_arch_pentiumpro (void) __attribute__((__option__("arch=pentiumpro")));
+extern void test_arch_pentium2 (void) __attribute__((__option__("arch=pentium2")));
+extern void test_arch_pentium3 (void) __attribute__((__option__("arch=pentium3")));
+extern void test_arch_pentium3m (void) __attribute__((__option__("arch=pentium3m")));
+extern void test_arch_pentium_m (void) __attribute__((__option__("arch=pentium-m")));
+extern void test_arch_pentium4 (void) __attribute__((__option__("arch=pentium4")));
+extern void test_arch_pentium4m (void) __attribute__((__option__("arch=pentium4m")));
+extern void test_arch_prescott (void) __attribute__((__option__("arch=prescott")));
+extern void test_arch_nocona (void) __attribute__((__option__("arch=nocona")));
+extern void test_arch_core2 (void) __attribute__((__option__("arch=core2")));
+extern void test_arch_geode (void) __attribute__((__option__("arch=geode")));
+extern void test_arch_k6 (void) __attribute__((__option__("arch=k6")));
+extern void test_arch_k6_2 (void) __attribute__((__option__("arch=k6-2")));
+extern void test_arch_k6_3 (void) __attribute__((__option__("arch=k6-3")));
+extern void test_arch_athlon (void) __attribute__((__option__("arch=athlon")));
+extern void test_arch_athlon_tbird (void) __attribute__((__option__("arch=athlon-tbird")));
+extern void test_arch_athlon_4 (void) __attribute__((__option__("arch=athlon-4")));
+extern void test_arch_athlon_xp (void) __attribute__((__option__("arch=athlon-xp")));
+extern void test_arch_athlon_mp (void) __attribute__((__option__("arch=athlon-mp")));
+extern void test_arch_k8 (void) __attribute__((__option__("arch=k8")));
+extern void test_arch_k8_sse3 (void) __attribute__((__option__("arch=k8-sse3")));
+extern void test_arch_opteron (void) __attribute__((__option__("arch=opteron")));
+extern void test_arch_opteron_sse3 (void) __attribute__((__option__("arch=opteron-sse3")));
+extern void test_arch_athlon64 (void) __attribute__((__option__("arch=athlon64")));
+extern void test_arch_athlon64_sse3 (void) __attribute__((__option__("arch=athlon64-sse3")));
+extern void test_arch_athlon_fx (void) __attribute__((__option__("arch=athlon-fx")));
+extern void test_arch_amdfam10 (void) __attribute__((__option__("arch=amdfam10")));
+extern void test_arch_barcelona (void) __attribute__((__option__("arch=barcelona")));
+extern void test_arch_foo (void) __attribute__((__option__("arch=foo"))); /* { dg-error "bad value" } */
+
+extern void test_tune_i386 (void) __attribute__((__option__("tune=i386")));
+extern void test_tune_i486 (void) __attribute__((__option__("tune=i486")));
+extern void test_tune_i586 (void) __attribute__((__option__("tune=i586")));
+extern void test_tune_pentium (void) __attribute__((__option__("tune=pentium")));
+extern void test_tune_pentium_mmx (void) __attribute__((__option__("tune=pentium-mmx")));
+extern void test_tune_winchip_c6 (void) __attribute__((__option__("tune=winchip-c6")));
+extern void test_tune_winchip2 (void) __attribute__((__option__("tune=winchip2")));
+extern void test_tune_c3 (void) __attribute__((__option__("tune=c3")));
+extern void test_tune_c3_2 (void) __attribute__((__option__("tune=c3-2")));
+extern void test_tune_i686 (void) __attribute__((__option__("tune=i686")));
+extern void test_tune_pentiumpro (void) __attribute__((__option__("tune=pentiumpro")));
+extern void test_tune_pentium2 (void) __attribute__((__option__("tune=pentium2")));
+extern void test_tune_pentium3 (void) __attribute__((__option__("tune=pentium3")));
+extern void test_tune_pentium3m (void) __attribute__((__option__("tune=pentium3m")));
+extern void test_tune_pentium_m (void) __attribute__((__option__("tune=pentium-m")));
+extern void test_tune_pentium4 (void) __attribute__((__option__("tune=pentium4")));
+extern void test_tune_pentium4m (void) __attribute__((__option__("tune=pentium4m")));
+extern void test_tune_prescott (void) __attribute__((__option__("tune=prescott")));
+extern void test_tune_nocona (void) __attribute__((__option__("tune=nocona")));
+extern void test_tune_core2 (void) __attribute__((__option__("tune=core2")));
+extern void test_tune_geode (void) __attribute__((__option__("tune=geode")));
+extern void test_tune_k6 (void) __attribute__((__option__("tune=k6")));
+extern void test_tune_k6_2 (void) __attribute__((__option__("tune=k6-2")));
+extern void test_tune_k6_3 (void) __attribute__((__option__("tune=k6-3")));
+extern void test_tune_athlon (void) __attribute__((__option__("tune=athlon")));
+extern void test_tune_athlon_tbird (void) __attribute__((__option__("tune=athlon-tbird")));
+extern void test_tune_athlon_4 (void) __attribute__((__option__("tune=athlon-4")));
+extern void test_tune_athlon_xp (void) __attribute__((__option__("tune=athlon-xp")));
+extern void test_tune_athlon_mp (void) __attribute__((__option__("tune=athlon-mp")));
+extern void test_tune_k8 (void) __attribute__((__option__("tune=k8")));
+extern void test_tune_k8_sse3 (void) __attribute__((__option__("tune=k8-sse3")));
+extern void test_tune_opteron (void) __attribute__((__option__("tune=opteron")));
+extern void test_tune_opteron_sse3 (void) __attribute__((__option__("tune=opteron-sse3")));
+extern void test_tune_athlon64 (void) __attribute__((__option__("tune=athlon64")));
+extern void test_tune_athlon64_sse3 (void) __attribute__((__option__("tune=athlon64-sse3")));
+extern void test_tune_athlon_fx (void) __attribute__((__option__("tune=athlon-fx")));
+extern void test_tune_amdfam10 (void) __attribute__((__option__("tune=amdfam10")));
+extern void test_tune_barcelona (void) __attribute__((__option__("tune=barcelona")));
+extern void test_tune_generic (void) __attribute__((__option__("tune=generic")));
+extern void test_tune_foo (void) __attribute__((__option__("tune=foo"))); /* { dg-error "bad value" } */
+
+extern void test_fpmath_sse (void) __attribute__((__option__("sse2,fpmath=sse")));
+extern void test_fpmath_387 (void) __attribute__((__option__("sse2,fpmath=387")));
+extern void test_fpmath_sse_387 (void) __attribute__((__option__("sse2,fpmath=sse+387")));
+extern void test_fpmath_387_sse (void) __attribute__((__option__("sse2,fpmath=387+sse")));
+extern void test_fpmath_both (void) __attribute__((__option__("sse2,fpmath=both")));
Index: gcc/testsuite/gcc.target/i386/funcspec-6.c
===================================================================
--- gcc/testsuite/gcc.target/i386/funcspec-6.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/funcspec-6.c (revision 0)
@@ -0,0 +1,71 @@
+/* Test whether all of the 64-bit function specific options are accepted
+   without error.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */
+
+extern void test_abm (void) __attribute__((__option__("abm")));
+extern void test_aes (void) __attribute__((__option__("aes")));
+extern void test_fused_madd (void) __attribute__((__option__("fused-madd")));
+extern void test_mmx (void) __attribute__((__option__("mmx")));
+extern void test_pclmul (void) __attribute__((__option__("pclmul")));
+extern void test_popcnt (void) __attribute__((__option__("popcnt")));
+extern void test_recip (void) __attribute__((__option__("recip")));
+extern void test_sse (void) __attribute__((__option__("sse")));
+extern void test_sse2 (void) __attribute__((__option__("sse2")));
+extern void test_sse3 (void) __attribute__((__option__("sse3")));
+extern void test_sse4 (void) __attribute__((__option__("sse4")));
+extern void test_sse4_1 (void) __attribute__((__option__("sse4.1")));
+extern void test_sse4_2 (void) __attribute__((__option__("sse4.2")));
+extern void test_sse4a (void) __attribute__((__option__("sse4a")));
+extern void test_sse5 (void) __attribute__((__option__("sse5")));
+extern void test_ssse3 (void) __attribute__((__option__("ssse3")));
+
+extern void test_no_abm (void) __attribute__((__option__("no-abm")));
+extern void test_no_aes (void) __attribute__((__option__("no-aes")));
+extern void test_no_fused_madd (void) __attribute__((__option__("no-fused-madd")));
+extern void test_no_mmx (void) __attribute__((__option__("no-mmx")));
+extern void test_no_pclmul (void) __attribute__((__option__("no-pclmul")));
+extern void test_no_popcnt (void) __attribute__((__option__("no-popcnt")));
+extern void test_no_recip (void) __attribute__((__option__("no-recip")));
+extern void test_no_sse (void) __attribute__((__option__("no-sse")));
+extern void test_no_sse2 (void) __attribute__((__option__("no-sse2")));
+extern void test_no_sse3 (void) __attribute__((__option__("no-sse3")));
+extern void test_no_sse4 (void) __attribute__((__option__("no-sse4")));
+extern void test_no_sse4_1 (void) __attribute__((__option__("no-sse4.1")));
+extern void test_no_sse4_2 (void) __attribute__((__option__("no-sse4.2")));
+extern void test_no_sse4a (void) __attribute__((__option__("no-sse4a")));
+extern void test_no_sse5 (void) __attribute__((__option__("no-sse5")));
+extern void test_no_ssse3 (void) __attribute__((__option__("no-ssse3")));
+
+extern void test_arch_nocona (void) __attribute__((__option__("arch=nocona")));
+extern void test_arch_core2 (void) __attribute__((__option__("arch=core2")));
+extern void test_arch_k8 (void) __attribute__((__option__("arch=k8")));
+extern void test_arch_k8_sse3 (void) __attribute__((__option__("arch=k8-sse3")));
+extern void test_arch_opteron (void) __attribute__((__option__("arch=opteron")));
+extern void test_arch_opteron_sse3 (void) __attribute__((__option__("arch=opteron-sse3")));
+extern void test_arch_athlon64 (void) __attribute__((__option__("arch=athlon64")));
+extern void test_arch_athlon64_sse3 (void) __attribute__((__option__("arch=athlon64-sse3")));
+extern void test_arch_athlon_fx (void) __attribute__((__option__("arch=athlon-fx")));
+extern void test_arch_amdfam10 (void) __attribute__((__option__("arch=amdfam10")));
+extern void test_arch_barcelona (void) __attribute__((__option__("arch=barcelona")));
+extern void test_arch_foo (void) __a