Re: Approximate solution to linear regression
- From: Bruce Weaver <bweaver@xxxxxxxxxxxx>
- Date: Wed, 20 Jun 2007 07:34:06 -0400
vincent64@xxxxxxxxx wrote:
On Jun 19, 12:44 pm, Paige Miller <paige.mil...@xxxxxxxxx> wrote:
On Jun 17, 3:46 pm, "vincen...@xxxxxxxxx" <datashap...@xxxxxxxxx> wrote:
Problem can have 40,000 variables, most of them highly correlated.
More variables than observations in some cases. I came up with an
approach, and my question is
(1) is this an original approach?
(2) more importantly, does it always provide a fairly accurate
solution?
The problem and solution are described at
http://datashaping.com/contest14004.shtml
The newsgroup cannot render the mathematical formatting.
I haven't tried to go through your solution in any detail.
In similar situations, I use Partial Least Squares (PLS) Regression,
which is also an "approximate" method (actually, it's a biased
regression) that doesn't care if you have highly correlated X
variables and many more Xs than observations. If you use the maximum
possible number of dimensions in PLS, you will get an OLS solution
without having to invert a matrix.
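The claim about using the maximum number of dimensions can be checked numerically. What follows is only a minimal sketch, assuming a single response (PLS1), centered data, and an X with full column rank; the function name pls1_coefficients and the simulated data are made up for illustration and are not taken from any package mentioned in this thread. It uses the textbook NIPALS deflation and compares the result with ordinary least squares from NumPy.

```python
import numpy as np


def pls1_coefficients(X, y, n_components):
    """PLS1 regression coefficients via NIPALS (X and y assumed centered)."""
    Xk = np.array(X, dtype=float)
    yk = np.array(y, dtype=float)
    n_features = Xk.shape[1]
    W = np.zeros((n_features, n_components))  # weight vectors
    P = np.zeros((n_features, n_components))  # X loadings
    q = np.zeros(n_components)                # y loadings

    for k in range(n_components):
        w = Xk.T @ yk                 # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # scores for this component
        tt = t @ t
        p = Xk.T @ t / tt
        q[k] = yk @ t / tt
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - q[k] * t            # deflate y
        W[:, k], P[:, k] = w, p

    # Only a small n_components-by-n_components system is solved here;
    # X'X is never formed or inverted.
    return W @ np.linalg.solve(P.T @ W, q)


# Simulated, illustrative data: 50 observations, 5 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=50)
X = X - X.mean(axis=0)
y = y - y.mean()

beta_pls = pls1_coefficients(X, y, n_components=5)   # maximum number of components
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]      # reference OLS fit
print(np.max(np.abs(beta_pls - beta_ols)))           # largest coefficient difference
```

With fewer components than predictors the coefficients no longer match the OLS fit, which is the "biased regression" side of the method.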
So, with that in mind, it seems to me your approximate solution is
trying to fit into a niche where there already is a solution, and the
PLS solution has proven useful in zillions of published articles. So
unless you can show that your approximate solution has better
properties than PLS, I don't see much of a need for it.
--
Paige Miller
paige\dot\miller \at\ kodak\dot\com
Thanks for your reply. I've heard that Lasso regression does similar
things too. Anyway, being efficient is much more important than being
original in this context: I'm not trying to publish an article, this
is not academic research. If I need to spend $150K to get PLS
regression software (SAS Enterprise Miner) and spend many hours
getting it to work, I'm MUCH better off re-inventing the wheel. So I
could rephrase my question as follows: am I re-inventing the wheel
quite well, meaning my approach is not significantly inferior to PLS
regression?
You must be joking about the $150K. Doesn't R do PLS regression? And it looks like Stata has a routine to run the SAS implementation.
http://ideas.repec.org/c/boc/bocode/s456810.html
A single-user corporate version of Stata (intercooled, v10) is $1150.00, according to their website.
--
Bruce Weaver
bweaver@xxxxxxxxxxxx
www.angelfire.com/wv/bwhomedir