|
View:
New views
8 Messages
—
Rating Filter:
Alert me
|
|
|
|
|
|
Re: poisson regression with robust error variance ('eyestudyOn Thu, May 8, 2008 at 8:38 AM, Ted Harding
<Ted.Harding@...> wrote: > The below is an old thread: > > On 02-Jun-04 10:52:29, Lutz Ph. Breitling wrote: > > Dear all, > > > > i am trying to redo the 'eyestudy' analysis presented on the site > > http://www.ats.ucla.edu/stat/stata/faq/relative_risk.htm > > with R (1.9.0), with special interest in the section on "relative > > risk estimation by poisson regression with robust error variance". > > > > so i guess rlm is the function to use. but what is its equivalent > > to the glm's argument "family" to indicate 'poisson'? or am i > > somehow totally wrong and this is not applicable here? in glm lately. I went and read that UCLA website on the RR eye study and the Zou article that uses a glm with robust standard errors. I don't think "rlm" is the right way to go because that gives different parameter estimates. You need to estimate with glm and then get standard errors that are adjusted for heteroskedasticity. Well, you may wish to use rlm for other reasons, but to replicate that eyestudy project, you need to take the ordinary usual estimates of the b's and adjust the standard errors. The Zou article does not give the mathematical formulae used to estimate those robust errors, it rather gives a code snippit on using SAS proc GENMOD to get those estimates. Presumably, if we had access to the SAS formula, we could easily get the calculations we need with R. It is a little irksome to me that people think saying "use SAS proc GENMOD" is informative. Rather, we really need to know which TYPE of robust standard error is being considered, since there are about 10 competing formulations. I started checking various R packages for calculating sandwich variance estimates. There is a package "sandwich" for that purpose, and in the "car" package there is a function hccm that can do so. The hccm in car refuses to take glm objects, but the function vcovHC in "sandwich" will do so. The discussion for sandwich's vcovHC function refers to linear models, and lm objects are used in the examples, but the vignette "sandwich" distributed with the package states "The HAC estimators are already available for generalized linear models (fitted by glm) and robust regression (fitted by rlm in package MASS)." . As a result, one can get a var/covar matrix from the routines in the sandwich package, but I'm not entirely sure they are match the ones SAS gives. The vcovHC help page has a nice explanation of the differences across sandwich estimators. It says they are all variants on (X'X)^{-1} X' Omega X (X'X)^{-1} With different stipulations about Omega. If we knew the stipulation about OMEGA used in the SAS routine, we could try it. I tried to get the eyestudy data, but the link on the UCLA website is no longer valid. So I generated some phony 0-1 data: y <- as.numeric(rnorm(1000) > 0) x <- rnorm(1000) > glm1 <- glm(y~x, family=binomial) > hccm(glm1) Error in hccm.lm(glm1) : requires unweighted lm > vcovH vcovHAC vcovHC > glm1 <- glm(y~x, family=poisson(link=log)) > vcovHC(glm1) (Intercept) x (Intercept) 1.017588e-03 -3.722456e-05 x -3.722456e-05 9.492517e-04 The default type of estimation the method called HC3, which is recommended by authors of some Monte Carlo research studies. vcovHC calculates White's basic correction if you run: > vcovHC(glm1, type="HC0") (Intercept) x (Intercept) 1.013508e-03 -3.691381e-05 x -3.691381e-05 9.417839e-04 Compare against the non-robust glm var/covar matrix. > vcov(glm1) (Intercept) x (Intercept) 0.0020152998 -0.0000778422 x -0.0000778422 0.0018721903 In conclusion, use glm followed by vcovHC and I believe you will find estimates like the ones provided by SAS or Stata. But, without access to their data, I can't be entirely sure. -- Paul E. Johnson Professor, Political Science 1541 Lilac Lane, Room 504 University of Kansas ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
|
|
|
Re: poisson regression with robust error variance ('eyestudyOnce again, Paul, many thanks for your thorough examination
of this question! And for spelling out your approach!!! It certainly looks as though you're very close to target (or even spot-on). I've only one comment -- see at end. On 08-May-08 20:35:38, Paul Johnson wrote: > Ted Harding said: >> I can get the estimated RRs from > >> RRs <- exp(summary(GLM)$coef[,1]) > >> but do not see how to implement confidence intervals based >> on "robust error variances" using the output in GLM. > > Thanks for the link to the data. Here's my best guess. If you > use the following approach, with the HC0 type of robust standard > errors in the "sandwich" package (thanks to Achim Zeileis), > you get "almost" the same numbers as that Stata output gives. > The estimated b's from the glm match exactly, but the robust > standard errors are a bit off. > >### Paul Johnson 2008-05-08 >### sandwichGLM.R > system("wget http://www.ats.ucla.edu/stat/stata/faq/eyestudy.dta") > library(foreign) > > dat <- read.dta("eyestudy.dta") >### Ach, stata codes factor contrasts backwards > > dat$carrot0 <- ifelse(dat$carrot==0,1,0) > dat$gender1 <- ifelse(dat$gender==1,1,0) > > glm1 <- glm(lenses~carrot0, data=dat, family=poisson(link=log)) > summary(glm1) > > library(sandwich) > vcovHC(glm1) > sqrt(diag(vcovHC(glm1))) > sqrt(diag(vcovHC(glm1, type="HC0"))) > >### Result: ># > summary(glm1) ># Call: ># glm(formula = lenses ~ carrot0, family = poisson(link = log), ># data = dat) > ># Deviance Residuals: ># Min 1Q Median 3Q Max ># -1.1429 -0.9075 0.3979 0.3979 0.7734 > ># Coefficients: ># Estimate Std. Error z value Pr(>|z|) ># (Intercept) -0.8873 0.2182 -4.066 4.78e-05 *** ># carrot0 0.4612 0.2808 1.642 0.101 ># --- ># Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > ># (Dispersion parameter for poisson family taken to be 1) > ># Null deviance: 67.297 on 99 degrees of freedom ># Residual deviance: 64.536 on 98 degrees of freedom ># AIC: 174.54 > ># Number of Fisher Scoring iterations: 5 > ># > sqrt(diag(vcovHC(glm1, type="HC0"))) ># (Intercept) carrot0 ># 0.1673655 0.1971117 > >### Compare against >## http://www.ats.ucla.edu/stat/stata/faq/relative_risk.htm >## robust standard errors are: >## .1682086 .1981048 > > glm2 <- glm(lenses~carrot0 +gender1 +latitude, data=dat, > family=poisson(link=log)) > > sqrt(diag(vcovHC(glm2, type="HC0"))) >### Result: ># > summary(glm2) > >#Call: ># glm(formula = lenses ~ carrot0 + gender1 + latitude, family = > poisson(link = log), ># data = dat) > ># Deviance Residuals: ># Min 1Q Median 3Q Max ># -1.2332 -0.9316 0.2437 0.5470 0.9466 > ># Coefficients: ># Estimate Std. Error z value Pr(>|z|) ># (Intercept) -0.65212 0.69820 -0.934 0.3503 ># carrot0 0.48322 0.28310 1.707 0.0878 . ># gender1 0.20520 0.27807 0.738 0.4605 ># latitude -0.01001 0.01898 -0.527 0.5980 ># --- ># Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > ># (Dispersion parameter for poisson family taken to be 1) > ># Null deviance: 67.297 on 99 degrees of freedom ># Residual deviance: 63.762 on 96 degrees of freedom ># AIC: 177.76 > ># Number of Fisher Scoring iterations: 5 > ># > sqrt(diag(vcovHC(glm2, type="HC0"))) ># (Intercept) carrot0 gender1 latitude ># 0.49044963 0.19537743 0.18481067 0.01275001 > >### UCLA site has: > >## .4929193 .1963616 .1857414 .0128142 > > > So, either there is some "rounding error" or Stata is not using > exactly the same algorithm as vcovHC in sandwich. The above differences look somewhat systematic (though very small). The percentage differences (vcovHC relative to STATA) for the two cases you analyse above are vcovHC "HC0": 0.1673655 0.1971117 STATA : 0.1682086 0.1981048 ------------------------------------- % Difference: -0.5012229 -0.5013003 vcovHC "HC0": 0.49044963 0.19537743 0.18481067 0.01275001 STATA: 0.4929193 0.1963616 0.1857414 .0128142 ------------------------------------------------------------- % Difference: -0.5010293 -0.5012029 -0.5010891 -0.5009287 so, in all cases, very close to -0.501%, despite varying absolute values. This suggests a very subtle difference in algorithm, rather than rounding error. Though there is conceivably a convergence issue in the background. Does ANYONE here know exactly how STATA does it? Best wishes, Ted. -------------------------------------------------------------------- E-Mail: (Ted Harding) <Ted.Harding@...> Fax-to-email: +44 (0)870 094 0861 Date: 08-May-08 Time: 22:28:36 ------------------------------ XFMail ------------------------------ ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: poisson regression with robust error variance ('eyestudyPaul & Ted:
> > I can get the estimated RRs from > > > RRs <- exp(summary(GLM)$coef[,1]) > > > but do not see how to implement confidence intervals based > > on "robust error variances" using the output in GLM. > > > Thanks for the link to the data. Here's my best guess. If you use > the following approach, with the HC0 type of robust standard errors in > the "sandwich" package (thanks to Achim Zeileis), you get "almost" the > same numbers as that Stata output gives. The estimated b's from the > glm match exactly, but the robust standard errors are a bit off. Thanks for posting the code reproducing this example! > ### Paul Johnson 2008-05-08 > ### sandwichGLM.R > system("wget http://www.ats.ucla.edu/stat/stata/faq/eyestudy.dta") > > library(foreign) > > dat <- read.dta("eyestudy.dta") > > ### Ach, stata codes factor contrasts backwards > dat$carrot0 <- ifelse(dat$carrot==0,1,0) > dat$gender1 <- ifelse(dat$gender==1,1,0) > > glm1 <- glm(lenses~carrot0, data=dat, family=poisson(link=log)) > > library(sandwich) > sqrt(diag(vcovHC(glm1, type="HC0"))) > # (Intercept) carrot0 > # 0.1673655 0.1971117 > > ### Compare against > ## http://www.ats.ucla.edu/stat/stata/faq/relative_risk.htm > ## robust standard errors are: > ## .1682086 .1981048 I have the solution. Note that the ratio of both standard errors to those from sandwich is almost constant which suggests a scaling difference. In "sandwich" I have implemented two scaling strategies: divide by "n" (number of observations) or by "n-k" (residual degrees of freedom). This leads to R> sqrt(diag(sandwich(glm1))) (Intercept) carrot0 0.1673655 0.1971117 R> sqrt(diag(sandwich(glm1, adjust = TRUE))) (Intercept) carrot0 0.1690647 0.1991129 (Equivalently, you could youse vcovHC() with types "HC0" and "HC1", respectively.) Their standard errors are between those, so I thought that they used a different scaling. Here, n = 100 and n-k = 98 which only left R> sqrt(diag(sandwich(glm1) * 100/99)) (Intercept) carrot0 0.1682086 0.1981048 Bingo! So they scaled with n-1 rather than n or n-k. This is, of course, also consistent (if the other estimators are), but I wouldn't know of a good theoretical underpinning for it. > glm2 <- glm(lenses~carrot0 +gender1 +latitude, data=dat, > family=poisson(link=log)) > > ### UCLA site has: > ## .4929193 .1963616 .1857414 .0128142 R> round(sqrt(diag(sandwich(glm2) * 100/99)), digits = 7) (Intercept) carrot0 gender1 latitude 0.4929204 0.1963617 0.1857417 0.0128142 Only the constant is a bit off, but that is not unusual in replication exercises. Some further advertising: If you want to get the full table of coefficients, you can use coeftest() from "lmtest" R> library("lmtest") R> coeftest(glm2, sandwich(glm2) * 100/99) z test of coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -0.652122 0.492920 -1.3230 0.18584 carrot0 0.483220 0.196362 2.4609 0.01386 * gender1 0.205201 0.185742 1.1048 0.26926 latitude -0.010009 0.012814 -0.7811 0.43474 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Best, Z ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: poisson regression with robust error variance ('eyestudyI'd like to thank Paul Johnson and Achim Zeileis heartily
for their thorough and accurate responses to my query. I think that the details og how to use the procedure, and of its variants, which they have sent to the list should be definitive -- and very helpfully usable -- for folks like myself who may in future grope in the archives concerning this question. And, just to confirm, it all worked perfectly for me in the end. Best wishes, Ted. -------------------------------------------------------------------- E-Mail: (Ted Harding) <Ted.Harding@...> Fax-to-email: +44 (0)870 094 0861 Date: 09-May-08 Time: 00:56:14 ------------------------------ XFMail ------------------------------ ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: poisson regression with robust error variance ('eyestudyAt 00:56 09/05/2008, Ted Harding wrote:
>I'd like to thank Paul Johnson and Achim Zeileis heartily >for their thorough and accurate responses to my query. > >I think that the details og how to use the procedure, and >of its variants, which they have sent to the list should >be definitive -- and very helpfully usable -- for folks >like myself who may in future grope in the archives >concerning this question. > >And, just to confirm, it all worked perfectly for me >in the end. There is an article available online (by a frequent contributor to this list) which addresses the topic of estimating relative risk in multivariable models. I found it very helpful. http://www.bepress.com/uwbiostat/paper293/ >Best wishes, >Ted. > >-------------------------------------------------------------------- >E-Mail: (Ted Harding) <Ted.Harding@...> >Fax-to-email: +44 (0)870 094 0861 >Date: 09-May-08 Time: 00:56:14 >------------------------------ XFMail ------------------------------ Michael Dewey http://www.aghmed.fsnet.co.uk ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: poisson regression with robust error variance ('eyestudOn 13-May-08 14:25:37, Michael Dewey wrote:
> At 00:56 09/05/2008, Ted Harding wrote: >>I'd like to thank Paul Johnson and Achim Zeileis heartily >>for their thorough and accurate responses to my query. >> >>I think that the details og how to use the procedure, and >>of its variants, which they have sent to the list should >>be definitive -- and very helpfully usable -- for folks >>like myself who may in future grope in the archives >>concerning this question. >> >>And, just to confirm, it all worked perfectly for me >>in the end. > > There is an article available online (by a frequent contributor to > this list) which addresses the topic of estimating relative risk in > multivariable models. I found it very helpful. > > http://www.bepress.com/uwbiostat/paper293/ Thanks, Michael. That is indeed an excellent survey and reference! Ted. >>Best wishes, >>Ted. >> >>-------------------------------------------------------------------- >>E-Mail: (Ted Harding) <Ted.Harding@...> >>Fax-to-email: +44 (0)870 094 0861 >>Date: 09-May-08 Time: 00:56:14 >>------------------------------ XFMail ------------------------------ > > Michael Dewey > http://www.aghmed.fsnet.co.uk > > ______________________________________________ > R-help@... mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -------------------------------------------------------------------- E-Mail: (Ted Harding) <Ted.Harding@...> Fax-to-email: +44 (0)870 094 0861 Date: 13-May-08 Time: 17:43:10 ------------------------------ XFMail ------------------------------ ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
| Free Forum Powered by Nabble | Forum Help |