Multicollinearity

Multicollinearity

by jimjohn

Many of my independent variables correlate highly with each other. Since I am building a multiple linear regression model with many of these variables, and I am only analyzing the grouped effect all the variables have on the dependent variable (instead of any individual effect one independent variable may have), I assume I don't need to worry about multicollinearity. Can someone please confirm this? From what I recall, multicollinearity is only an issue when you are trying to analyze each independent variable's individual effect on the dependent variable. Thanks.

Re: Multicollinearity

by Ornelas, Fermin-2

I suggest you do a search on the issue.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@...] On Behalf Of jimjohn
Sent: Monday, June 30, 2008 8:30 AM
To: SPSSX-L@...
Subject: Multicollinearity

many of my independent variables correlate highly with each other. since i am
building a multiple linear regression model with many of these variables and
I am only analyzing the grouped effect all the variables have on the
dependent variable (instead of any individual effect one independent
variable may have), then I don't need to worry about multicollinearity. Can
someone plz confirm this? From what I recall, multicollinearity is only an
issue when you are trying to analyze each independent variable's individual
effect on the dependent variable. thx.
--
View this message in context: http://www.nabble.com/Multicollinearity-tp18197967p18197967.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
LISTSERV@... (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

NOTICE: This e-mail (and any attachments) may contain PRIVILEGED OR CONFIDENTIAL information and is intended only for the use of the specific individual(s) to whom it is addressed.  It may contain information that is privileged and confidential under state and federal law.  This information may be used or disclosed only in accordance with law, and you may be subject to penalties under law for improper use or further disclosure of the information in this e-mail and its attachments. If you have received this e-mail in error, please immediately notify the person named above by reply e-mail, and then delete the original e-mail.  Thank you.


Re: Multicollinearity

by jimjohn

Thanks Fermin. Are there any specific links you'd recommend that might answer this?
Nothing has come up in any of my searches yet.

Re: Multicollinearity

by SR Millis

Two excellent books on the topic:

Regression Diagnostics: An Introduction by John Fox

Regression Diagnostics: Identifying Influential Data and Sources of Collinearity by D. Belsley, E. Kuh, & R. Welsch


Scott R Millis, PhD, MEd, ABPP (CN,CL,RP), CStat
Professor & Director of Research
Dept of Physical Medicine & Rehabilitation
Wayne State University School of Medicine
261 Mack Blvd
Detroit, MI 48201
Email:  smillis@...
Tel: 313-993-8085
Fax: 313-966-7682



Re: Multicollinearity

by Ornelas, Fermin-2

I myself have written more than once on this issue, so you should have been able to find it in the list archives. Having said that, anyone who builds a regression model is going to face this issue. The empirical symptoms of the problem generally are: low t-statistics for the parameter estimates, incorrect signs, and instability of the parameter values and their signs as variables are removed from the model. These conditions render hypothesis testing questionable. The empirical question becomes: what is a reasonable degree of collinearity in the model? To me, if my variance proportions from the diagnostics are less than .5 for no more than 3 variables, say for a model of 10 variables, and the condition index is less than 30, then the model passes this test. A VIF < 10 is also a reasonable criterion. One also has to consider the purpose of the model: if prediction is the primary purpose, then having collinear variables is not likely to hinder the model's predictive capability, but making inferences is a different story.
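
For anyone who wants to see how the VIF and condition index mentioned above are computed, here is a minimal sketch in Python with NumPy (this is not SPSS or SAS output). The data are synthetic, and the function names `vif` and `condition_indices` are made up for illustration. The condition indices follow the Belsley-style recipe of scaling every column of the design matrix, intercept included, to unit length; I am assuming that is close to what the package diagnostics do.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n_samples x n_features)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Regress column j on an intercept plus all the other columns
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

def condition_indices(X):
    """Condition indices of [1, X] after scaling each column to unit length."""
    X = np.asarray(X, dtype=float)
    Z = np.column_stack([np.ones(len(X)), X])
    Z = Z / np.linalg.norm(Z, axis=0)
    s = np.linalg.svd(Z, compute_uv=False)
    return s.max() / s

# Synthetic example (assumption): x2 is nearly a copy of x1, x3 is unrelated
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.03 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

vifs = vif(X)
cond = condition_indices(X)
print("VIF per variable:", np.round(vifs, 1))
print("largest condition index:", round(float(cond.max()), 1))
```

On this toy data the two near-duplicate columns should land far above the VIF < 10 rule of thumb and push the largest condition index past 30, while the unrelated column stays close to 1.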

Extreme cases of collinearity will produce a warning or error in most packages. SAS will tell you that parameter estimates could not be provided for the variables having linear dependence, a.k.a. being collinear.

Hope this short explanation helps. Most texts have a special section on this problem and suggest some solutions to it (collect more data, center the data, use ridge regression, etc.).
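
As a rough illustration of the ridge regression remedy listed above, here is a small simulation in Python with NumPy. Everything in it is an assumption made for the sketch: the synthetic data, the helper name `ridge_fit`, and the penalty value of 10. It refits the same two-variable model on 200 fresh samples and compares how much the standardized coefficients bounce around with and without the penalty.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge coefficients on standardized predictors and centered y.

    lam = 0 gives ordinary least squares on the standardized scale.
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    p = Xs.shape[1]
    return np.linalg.solve(Xs.T @ Xs + lam * np.eye(p), Xs.T @ yc)

rng = np.random.default_rng(1)

def simulate(n=100):
    """Two nearly identical predictors that both contribute to y (assumed setup)."""
    x1 = rng.normal(size=n)
    x2 = x1 + 0.05 * rng.normal(size=n)
    y = x1 + x2 + rng.normal(size=n)
    return np.column_stack([x1, x2]), y

ols_coefs, ridge_coefs = [], []
for _ in range(200):
    X, y = simulate()
    ols_coefs.append(ridge_fit(X, y, lam=0.0))
    ridge_coefs.append(ridge_fit(X, y, lam=10.0))

# Spread of each coefficient across the 200 replications
ols_sd = np.std(ols_coefs, axis=0)
ridge_sd = np.std(ridge_coefs, axis=0)
print("OLS coefficient SD:  ", np.round(ols_sd, 3))
print("ridge coefficient SD:", np.round(ridge_sd, 3))
```

The penalty buys stability at the price of some bias, so the choice of penalty is a judgment call that this sketch does not try to settle.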


Re: Multicollinearity

by satyanarayana are

When there are no linear relationships among the independent variables, each
regression coefficient can be interpreted as the change in the dependent
variable for a one-unit increase in that independent variable, with all the
others held constant. When linear relationships among the independent
variables are present, that is the problem of multicollinearity, and the
regression results become ambiguous. One way to deal with multicollinearity
is to use the principal components of the independent variables in the model.
When all the principal components are used, the regression estimates and SEs
are identical to those of the least squares technique. So the reduction in
multicollinearity is obtained by using a smaller set of principal components,
dropping sequentially the components that have near-zero variances. Though
there is bias due to dropping principal components, the estimates tend to
become more precise and stable. So please try principal component analysis.
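
The principal components approach described above can be sketched in a few lines of Python with NumPy. This is only an illustration on synthetic data: the function name `pcr_coefficients` and the simulated variables are assumptions, not anything from SPSS. It checks the claim that keeping all components reproduces least squares, then drops the smallest-variance component.

```python
import numpy as np

def pcr_coefficients(X, y, k):
    """Principal components regression keeping the k largest-variance components.

    Returns coefficients expressed on the centered original predictors.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Rows of Vt are the principal directions, ordered by decreasing singular value
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T
    scores = Xc @ V_k
    # Component scores are mutually orthogonal, so each is fitted separately
    gamma = (scores.T @ yc) / s[:k] ** 2
    return V_k @ gamma

# Synthetic example (assumption): x2 is nearly a copy of x1
rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = x1 + x2 + 0.5 * x3 + rng.normal(size=n)

Xc, yc = X - X.mean(axis=0), y - y.mean()
beta_ols, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
beta_all = pcr_coefficients(X, y, k=3)   # all components: same as least squares
beta_two = pcr_coefficients(X, y, k=2)   # near-zero-variance component dropped

print("OLS:          ", np.round(beta_ols, 2))
print("PCR, k = 3:   ", np.round(beta_all, 2))
print("PCR, k = 2:   ", np.round(beta_two, 2))
```

Dropping the last component here forces the two near-duplicate predictors to share their joint effect almost equally, which is where both the stability and the bias come from.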




Re: Multicollinearity

by Matthew Reeder

Right on. This is a very common topic (and not remotely specific to SPSS). At the very least, there should be quite a good deal of information available freely on the Internet. There are also countless articles and books on the topic, so no lack of information there. Best of luck.


Re: Multicollinearity

by jimjohn

Thanks so much guys! Will definitely be checking out those books. Just one follow-up: you say (and I did read this too) that if my only goal is prediction, multicollinearity is not likely to cause problems. But when I add a variable that is highly correlated with one or two other variables, my adjusted R^2 increases, yet at the same time I notice big changes in my coefficients (even though the coefficient of the newly added variable is only 0.003). Wouldn't it be risky to make predictions using an equation whose coefficients change in such a fashion? Thanks.
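
One way to look at the question above is to compare how far a coefficient moves with how far the fitted values move when a highly correlated variable is added. The sketch below does that in Python with NumPy on synthetic data; the helper `ols` and all variable names are made up for illustration, and it says nothing about the poster's actual data.

```python
import numpy as np

def ols(X, y):
    """Least-squares fit with an intercept; returns (coefficients, fitted values)."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta, A @ beta

# Synthetic example (assumption): x2 is a noisy near-copy of x1
rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.02 * rng.normal(size=n)
y = 2.0 * x1 + rng.normal(size=n)

beta_small, fit_small = ols(x1[:, None], y)                # model with x1 only
beta_big, fit_big = ols(np.column_stack([x1, x2]), y)      # model with x1 and x2

coef_shift = abs(beta_big[1] - beta_small[1])       # change in x1's coefficient
fit_shift = np.max(np.abs(fit_big - fit_small))     # largest change in any fitted value

print("change in x1 coefficient:   ", round(float(coef_shift), 3))
print("largest change in fitted y: ", round(float(fit_shift), 3))
```

In this setup the coefficient on x1 shifts many times more than any single fitted value, because the two coefficients trade off against each other along a direction in which the predictors barely vary.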



Re: Multicollinearity

by Ornelas, Fermin-2

Even when the purpose of the model is prediction, one still needs to be concerned with this problem and should try to minimize it if possible. In my own experience building predictive models for the credit card industry, I found that as long as the collinear relationship between the variables did not change over time, the model was able to predict reasonably well. If you have variables that keep changing signs and whose coefficients are not significant at all, you may want to drop the variable that is causing the most trouble. There is a side issue to consider: in social research one often must keep a certain variable in order to satisfy a project objective, and this is also a judgment call when deciding to keep a variable knowing that it will present problems. These are some of the typical caveats of model building; getting familiar with your data and the research issue at hand will help you tackle these modeling problems.
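
The point above about the collinear relationship staying stable over time can be illustrated with a small simulation. This is a sketch in Python with NumPy on synthetic data, not the credit card models mentioned in the post; the function `make_data`, the sample sizes, and the noise levels are all assumptions. A model is trained where x2 tracks x1 closely, then scored on new data with the same structure and on new data where the two variables have become independent.

```python
import numpy as np

rng = np.random.default_rng(4)

def make_data(n, collinear):
    """Design matrix [1, x1, x2] and response; x2 tracks x1 only when collinear."""
    x1 = rng.normal(size=n)
    if collinear:
        x2 = x1 + 0.05 * rng.normal(size=n)
    else:
        x2 = rng.normal(size=n)
    y = x1 + x2 + rng.normal(size=n)
    return np.column_stack([np.ones(n), x1, x2]), y

mse_same, mse_shift = [], []
for _ in range(200):
    A_train, y_train = make_data(100, collinear=True)
    beta, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

    A_same, y_same = make_data(500, collinear=True)      # same correlation structure
    A_shift, y_shift = make_data(500, collinear=False)   # relationship has broken down
    mse_same.append(np.mean((y_same - A_same @ beta) ** 2))
    mse_shift.append(np.mean((y_shift - A_shift @ beta) ** 2))

avg_same, avg_shift = np.mean(mse_same), np.mean(mse_shift)
print("average test MSE, same structure:   ", round(float(avg_same), 2))
print("average test MSE, shifted structure:", round(float(avg_shift), 2))
```

While the predictors keep moving together, the poorly determined individual coefficients cancel out in the predictions; once they stop moving together, the errors in those coefficients show up directly in the forecasts.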


Re: Multicollinearity

by Juanito Talili :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Can we not first run an exploratory factor analysis on the p independent variables (X1, X2, ..., Xp), then use the factor solutions as predictors in the regression model? That is, suppose F1, F2, ..., Fk are the factor solutions of the p independent variables (where k < p); then F1, F2, ..., Fk would be the independent variables for predicting the dependent variable Y. Is this statistically correct?
 
Juanito 
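
The proposal above can be sketched roughly as follows. This is only an illustration under stated assumptions: principal components stand in for the factor solutions (an exploratory factor analysis would give different scores), and the data, the seed, and the names X, y, k, F and gamma are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 200, 4, 2

# Hypothetical data: four predictors driven by two latent sources,
# so the predictors are highly correlated in pairs.
latent = rng.normal(size=(n, 2))
loadings = np.array([[1.0, 0.9, 0.1, 0.0],
                     [0.0, 0.1, 0.9, 1.0]])
X = latent @ loadings + 0.1 * rng.normal(size=(n, p))
y = 1.0 + X.sum(axis=1) + rng.normal(size=n)

# Standardize the predictors, then take the first k principal components.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
F = Z @ Vt[:k].T                      # n x k matrix of component scores

# Regress y on the k scores plus an intercept.
A = np.column_stack([np.ones(n), F])
gamma, *_ = np.linalg.lstsq(A, y, rcond=None)

# The scores are uncorrelated with each other, unlike the columns of X.
score_corr = np.corrcoef(F, rowvar=False)
print("correlation between the two scores:", score_corr[0, 1])
print("regression coefficients on the scores:", gamma)
```

Because the scores are mutually uncorrelated, the collinearity among the original predictors does not carry over to this second regression. Whether the retained components are the ones that matter for Y is a separate question that the sketch does not address.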
 

--- On Mon, 6/30/08, Ornelas, Fermin <FerminOrnelas@...> wrote:

From: Ornelas, Fermin <FerminOrnelas@...>
Subject: Re: Multicollinearity
To: SPSSX-L@...
Date: Monday, June 30, 2008, 8:06 PM

Even when the purpose of the model is prediction, one still needs to be concerned
with this problem and should try to minimize it if possible. In my own
experience building predictive models for the credit card industry, I found
that as long as the collinear relationship between the variables did not change
over time, the model was able to predict reasonably well. If you have variables
that keep changing signs and whose coefficients are not significant at all, you
may want to drop the variable that is causing the most problems. There is a
side issue to consider: in social research one often must keep a
certain variable in order to satisfy a project objective, and this is also a
judgment call to consider when deciding to keep a variable knowing that it will
present problems. These are some of the typical caveats of model building, and
getting familiar with your data and the research issue at hand will help you in
tackling these modeling problems.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@...] On Behalf Of jimjohn
Sent: Monday, June 30, 2008 12:52 PM
To: SPSSX-L@...
Subject: Re: Multicollinearity

Thanks so much guys! Will definitely be checking out those books. Just one
follow up: you say (and I did read this too) that if my only goal is
prediction, multicollinearity is not likely to cause problems. But when I
add a variable that is highly correlated with one or two other variables, my
Adjusted R^2 increases but at the same time I notice big changes in my
coefficients (even though the coefficient of my newly added variable is only
0.003). Wouldn't it be risky to make predictions using an equation whose
coefficients change in such a fashion? Thanks.
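
The situation described in this follow-up can be reproduced with a small simulation. It is a sketch on made-up data, not the poster's model: x1, x2, the noise levels and the helper ols are all hypothetical. It fits the regression with and without a nearly redundant predictor and compares both the coefficients and the fitted values.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)     # almost a copy of x1
y = x1 + x2 + 0.1 * rng.normal(size=n)

def ols(columns, y):
    """Least-squares fit with an intercept; returns (coefficients, fitted values)."""
    A = np.column_stack([np.ones(len(y))] + list(columns))
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta, A @ beta

beta_small, fit_small = ols([x1], y)        # y on x1 only
beta_big, fit_big = ols([x1, x2], y)        # y on x1 and x2

# How much the x1 coefficient moves once x2 enters the model.
coef_shift = abs(beta_small[1] - beta_big[1])
# How similar the two sets of fitted values are.
fit_corr = np.corrcoef(fit_small, fit_big)[0, 1]

print("x1 coefficient, x1 only :", beta_small[1])
print("x1 coefficient, x1 + x2 :", beta_big[1])
print("correlation of fitted values:", fit_corr)
```

In this setup the coefficient on x1 moves a great deal while the fitted values barely change, which is the sense in which collinearity hurts interpretation more than prediction. The agreement between the two sets of predictions depends on new cases showing the same relationship between x1 and x2 as the cases used for fitting.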



Ornelas, Fermin-2 wrote:

>
> I myself have written more than once on this issue. You should have been
> able to find it in the list. Having said that, anyone who is building a
> regression model is going to face this issue. The empirical
> characteristics of this problem generally are: low t-statistics for the
> parameter estimates, incorrect signs, and instability of the parameter values
> and their sign direction as variables are removed from the model. These
> conditions will render hypothesis testing questionable. The empirical
> question becomes what would be a reasonable degree of collinearity in the
> model? To me, if my variance proportions from the diagnostics are less
> than .5 for no more than 3 variables say for a model of 10 variables and
> the condition index is less than 30, then the model passes this test. Also
> the VIF < 10 is a reasonable measure. One also has to be concerned with
> the purpose of the model: if prediction is the primary purpose of the
> model, then having collinear variables is not likely to hinder
> the model's prediction capability, but making inferences is a
> different story.
>
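
For readers who want to see where the VIF and condition index figures come from, here is a rough sketch on synthetic data. The thresholds (VIF < 10, condition index < 30) are the ones given in the message above. The scaling used for the condition indices (intercept column included, every column scaled to unit length) is an assumption based on the usual Belsley-Kuh-Welsch convention, so a given package's output may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2 + 0.05 * rng.normal(size=n)   # nearly a linear combination of x1 and x2
X = np.column_stack([x1, x2, x3])

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on
# the other predictors. This equals the j-th diagonal element of the inverse
# of the predictors' correlation matrix.
R = np.corrcoef(X, rowvar=False)
vif = np.diag(np.linalg.inv(R))

# Condition indices: add the intercept column, scale each column to unit
# length, and divide the largest singular value by each singular value.
A = np.column_stack([np.ones(n), X])
A_scaled = A / np.linalg.norm(A, axis=0)
s = np.linalg.svd(A_scaled, compute_uv=False)
cond_index = s[0] / s

print("VIF:", vif)
print("condition indices:", cond_index)
print("any VIF >= 10:", bool(np.any(vif >= 10)))
print("largest condition index >= 30:", bool(cond_index.max() >= 30))
```

Here x3 is built to be almost redundant, so both diagnostics flag the model. With real data the two measures need not agree, which is one reason to look at both.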
> Extreme cases of collinearity will produce a warning or an error in most
> packages. SAS will tell you that parameter estimates could not be
> provided for the variables having linear dependence, a.k.a. being
> collinear.
>
> Hope this short explanation helps. But most texts have a special section
> for this problem and suggest some solutions to it (collect more data,
> center the data, use ridge regression, etc).
>
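
Of the remedies listed above, ridge regression is the easiest to show in a few lines. The following is a minimal sketch on made-up data. The penalty value 10.0 is arbitrary and chosen only for illustration, and the helper name ridge is hypothetical. In practice the penalty would be chosen by cross-validation or a ridge trace.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)     # highly collinear with x1
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=n)

# Ridge is normally applied to standardized predictors and a centered response.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
yc = y - y.mean()

def ridge(Z, yc, lam):
    """Solve (Z'Z + lam * I) beta = Z'y; lam = 0 gives ordinary least squares."""
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ yc)

beta_ols = ridge(Z, yc, 0.0)
beta_ridge = ridge(Z, yc, 10.0)

print("OLS coefficients  :", beta_ols)
print("ridge coefficients:", beta_ridge)
```

The penalty pulls the two coefficients toward each other and shrinks their overall size, trading a little bias for much less variance in the collinear direction.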
> -----Original Message-----
> From: azam.khan@... [mailto:azam.khan@...]
> Sent: Monday, June 30, 2008 9:30 AM
> To: Ornelas, Fermin
> Cc: SPSSX-L@...
> Subject: RE: Multicollinearity
>
> Thanks Fermin, any specific links you recommend that might give me
> this answer? It didn't come up in any of my searches yet.

Re: Multicollinearity

by ViAnn Beadle :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

How can one use this technique to predict an individual observation? The
regression equation is very straightforward; what is required to do it using
the results of intermediate principal components?
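
One possible answer, sketched under assumptions: if the components are principal components of the standardized predictors, a new case can be scored with the training means, standard deviations and loadings, or the whole chain can be folded back into a single equation in the original variables. The data and the names x_new, V_k, gamma and beta below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 150, 2
base = rng.normal(size=(n, 1))
X = base + 0.2 * rng.normal(size=(n, 3))    # three correlated predictors
y = 2.0 + X.sum(axis=1) + rng.normal(size=n)

# Fit: standardize, extract k components, regress y on the component scores.
mean, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
Z = (X - mean) / sd
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
V_k = Vt[:k].T                               # p x k loadings
F = Z @ V_k
A = np.column_stack([np.ones(n), F])
gamma, *_ = np.linalg.lstsq(A, y, rcond=None)

x_new = np.array([0.5, -0.2, 1.0])           # one new observation

# Route 1: score the new case on the components, then apply the
# component regression.
f_new = ((x_new - mean) / sd) @ V_k
pred_components = gamma[0] + f_new @ gamma[1:]

# Route 2: convert the component coefficients into coefficients on the
# original variables, giving one ordinary-looking regression equation.
beta = (V_k @ gamma[1:]) / sd
intercept = gamma[0] - mean @ beta
pred_original = intercept + x_new @ beta

print("prediction via component scores :", pred_components)
print("prediction via original-scale equation:", pred_original)
```

Both routes give the same prediction, so once the model is fitted, scoring an individual case is no harder than with an ordinary regression equation. The key point is that the standardization and loadings must come from the fitting sample, not be recomputed on the new case.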

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@...] On Behalf Of Juanito Talili
Sent: Monday, June 30, 2008 6:45 PM
To: SPSSX-L@...
Subject: Re: Multicollinearity

Can we not first run an exploratory factor analysis on the p independent
variables (X1, X2, ..., Xp), then use the factor solutions as predictors in
the regression model? That is, suppose F1, F2, ..., Fk are the factor
solutions of the p independent variables (where k < p); then F1, F2, ..., Fk
would be the independent variables for predicting the dependent variable Y. Is
this statistically correct?

Juanito

