Re: Approximate solution to linear regression
- From: "S.W.Christensen" <swc@xxxxxxxxxxxxxxxx>
- Date: Tue, 26 Jun 2007 23:31:31 -0700
On 26 Jun., 14:30, Paige Miller <paige.mil...@xxxxxxxxx> wrote:
On Jun 22, 3:16 am, "S.W.Christensen" <s...@xxxxxxxxxxxxxxxx> wrote:
On 17 Jun., 21:46, "vincen...@xxxxxxxxx" <datashap...@xxxxxxxxx> wrote:
Problem can have 40,000 variables, most of them highly correlated.
More variables than observations in some cases.
I haven't looked over your solution in great detail, but I would
suggest that:
1) Group your variables into clusters, based on their correlations
2) Construct an ensemble of regression models, each based on just one
exemplar from each cluster
3) Weight each model conservatively (because you have so many
variables); e.g. equal weighting.
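
For readers who want something concrete, the three steps above can be sketched roughly as follows. This is only an illustration under assumptions not stated in the thread: the greedy clustering rule, the 0.8 correlation threshold, the number of models, the random choice of exemplar, and all function names are hypothetical, and the data are synthetic.

```python
import numpy as np

def cluster_by_correlation(X, threshold=0.8):
    """Step 1 (assumed rule): a column joins the first cluster whose
    seed column it correlates with at |r| >= threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    clusters = []  # each cluster is a list of column indices; first is the seed
    for j in range(X.shape[1]):
        for members in clusters:
            if corr[j, members[0]] >= threshold:
                members.append(j)
                break
        else:
            clusters.append([j])
    return clusters

def fit_ensemble(X, y, clusters, n_models=25, seed=0):
    """Step 2: each model is an ordinary least-squares fit on one
    randomly chosen exemplar per cluster, plus an intercept."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        cols = np.array([rng.choice(members) for members in clusters])
        A = np.column_stack([np.ones(len(y)), X[:, cols]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        models.append((cols, coef))
    return models

def predict_ensemble(models, X):
    """Step 3: equal weighting, i.e. the plain average of all model predictions."""
    preds = [coef[0] + X[:, cols] @ coef[1:] for cols, coef in models]
    return np.mean(preds, axis=0)

# Synthetic example: 60 highly correlated variables, only 40 observations.
rng = np.random.default_rng(42)
n_obs, n_factors, per_factor = 40, 3, 20
factors = rng.normal(size=(n_obs, n_factors))
X = np.repeat(factors, per_factor, axis=1) + 0.1 * rng.normal(size=(n_obs, n_factors * per_factor))
y = factors @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n_obs)

clusters = cluster_by_correlation(X)
models = fit_ensemble(X, y, clusters)
pred = predict_ensemble(models, X)
print(len(clusters), "clusters;", len(models), "models")
```

Each individual model only ever sees as many predictors as there are clusters, so the least-squares fits stay well-posed even though the full problem has more variables than observations.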
Why create an ad hoc procedure, where the properties are not known,
and where you would have to defend the validity of the procedure, and
where you have to write the code yourself?
If you only choose to create procedures that are not ad hoc, then you
will never create a procedure. Talk about halting civilisation in its
tracks...
By the way: 1) the various parts of the method are thoroughly tested,
2) a method that works consistently has proven itself and needs no
further defence, 3) writing the code yourself is the best guarantee
you can ever have of its correctness (do you have faith in the
correctness of code written by software companies?)
Best regards,
Stefan W. Christensen