Re: small data set
- From: Greg Heath <heath@xxxxxxxxxxxxxxxx>
- Date: Tue, 6 May 2008 00:43:15 -0700 (PDT)
Corrected for the heinous sin of top-posting.
On May 5, 5:59 pm, "giannis " <fanzi...@xxxxxxxxxxx> wrote:
Greg Heath <he...@xxxxxxxxxxxxxxxx> wrote in message
<9b4c2a53-7f64-42a4-a546-5a8e0f9e2...@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>...
On May 1, 7:22 am, Greg Heath <he...@xxxxxxxxxxxxxxxx> wrote:
On May 1, 6:30 am, "giannis " <fanzi...@xxxxxxxxxxx> wrote:
Hello.
I am doing statistical research using KNN, neural nets and
SVM. The problem is the very small data set (25 specimens).
I am using cross validation to resample the data but I am
not sure if my results can be accurate with such a small
data set.
Can you please suggest any method to make the best possible
use of such a small data set?
Thank you in advance.
Try bootstrapping.
Search the MathWorks website.
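A minimal sketch of the bootstrap idea, in plain Python rather than MATLAB: resample the 25 observations with replacement many times and look at how much a statistic varies across resamples. The function name `bootstrap_ci`, the percentile interval, and the made-up data are assumptions for illustration, not something taken from this thread.

```python
import random
import statistics

def bootstrap_ci(sample, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(sample)."""
    rng = random.Random(seed)
    n = len(sample)
    # Draw n observations with replacement, n_boot times, and
    # recompute the statistic on every resample.
    estimates = sorted(
        stat([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Made-up measurements (25 values), for illustration only.
data_rng = random.Random(1)
data = [data_rng.gauss(10.0, 2.0) for _ in range(25)]

low, high = bootstrap_ci(data)
print(f"sample mean = {statistics.mean(data):.2f}")
print(f"approx. 95% bootstrap interval = [{low:.2f}, {high:.2f}]")
```

The width of the interval gives a rough idea of how much an estimate from 25 observations can move around. The same resampling can wrap a classifier's error rate instead of the mean.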
If you have prior information on the form of the probability
distribution function, you can use the 25 observations to
estimate the parameters and then generate more "data".
The danger is that, even in one dimension, 25 observations
will not give you precise parameter estimates.
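As a rough sketch of that approach, assume (purely for illustration) that the data are one-dimensional and normally distributed. The parameters are estimated from the 25 observations and then used to generate more "data". The sample itself is made up, and the variable names are hypothetical.

```python
import random
import statistics

rng = random.Random(0)

# Made-up sample of 25 observations, for illustration only.
observed = [rng.gauss(10.0, 2.0) for _ in range(25)]
n = len(observed)

# Estimate the parameters of the assumed normal distribution.
mu_hat = statistics.mean(observed)
sigma_hat = statistics.stdev(observed)

# Standard error of the estimated mean: sigma_hat / sqrt(n).
# With n = 25 this is one fifth of the sample standard deviation.
se_mu = sigma_hat / n ** 0.5

# Generate additional synthetic observations from the fitted distribution.
synthetic = [rng.gauss(mu_hat, sigma_hat) for _ in range(1000)]

print(f"mu_hat = {mu_hat:.2f} (standard error {se_mu:.2f})")
print(f"sigma_hat = {sigma_hat:.2f}")
print(f"generated {len(synthetic)} synthetic observations")
```

The synthetic values are drawn around `mu_hat` and `sigma_hat`, not around the true parameters, so any estimation error in the 25 observations is carried into every generated point.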
If you don't have such prior information you can test
hypotheses as to which distribution the data might be
from. However, with only 25 observations the testing will
be far from definitive. You may test several distributions and
find that you can reject all except one. However, that does
not guarantee that the surviving one is the correct distribution.
...suddenly I have the feeling that the data is not
1-dimensional!
What are the dimensions of your input and output?
Exactly what type of problem do you have and what
exactly do you want the neural net to do?
Hello Greg,
thank you for all your help.
I have data from 25 people. 20 of them have lung cancer and
5 don't. I have 6 different characteristics for each person
(so the array is 25x6).
The tasks are to produce two classifiers:
1st: to classify between a constant value (2 outputs)
2nd: to classify the stage of cancer 0, 1, 2, 3 or 4 (so 5
outputs)
I tried to use SVM, linear regression, backpropagation and
RBF neural nets, and KNN.
I resampled my data using Leave-One-Out Cross Validation
(LOOCV), each time keeping one sample for testing and
24 for training.
hope I gave you the picture..?
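The leave-one-out procedure described above can be sketched as follows, using a small k-nearest-neighbour classifier written in plain Python. The 25x6 data here are randomly generated stand-ins with the same 20/5 class split, not the real measurements. The choice of k = 3 and the helper names `knn_predict` and `loocv_accuracy` are assumptions for illustration.

```python
import math
import random
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loocv_accuracy(X, y, k=3):
    """Hold out each sample once, train on the others, score the held-out one."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        correct += knn_predict(train_X, train_y, X[i], k) == y[i]
    return correct / len(X)

# Made-up 25x6 data: 20 samples labelled 1 and 5 labelled 0.
rng = random.Random(0)
y = [1] * 20 + [0] * 5
X = [[rng.gauss(1.0 if label else 0.0, 1.0) for _ in range(6)] for label in y]

accuracy = loocv_accuracy(X, y)
# Accuracy of always predicting the majority class: 20 / 25.
baseline = max(Counter(y).values()) / len(y)

print(f"LOOCV accuracy  = {accuracy:.2f}")
print(f"majority baseline = {baseline:.2f}")
```

With a 20/5 split, always predicting the majority class is already right 80% of the time, so a LOOCV accuracy only means something when compared against that baseline.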
What kind of error rates are you getting for each method?
What are the largest error rates that you would accept?
When you plot the desired {0,1} classification vs each
of the inputs does there appear to be predictive capability?
What are the corresponding correlation coefficients?
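One way to get those correlation coefficients is to compute the Pearson correlation between the {0,1} target and each input column in turn. The sketch below uses made-up 25x6 data in which only the first column is constructed to carry signal. The helper name `pearson` and the data are assumptions for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up data: 20 samples labelled 1 and 5 labelled 0, 6 inputs each.
# Only input 0 is shifted by class; inputs 1..5 are pure noise.
rng = random.Random(0)
target = [1] * 20 + [0] * 5
inputs = [
    [rng.gauss(2.0 * label, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(5)]
    for label in target
]

# Correlate the target with each input column.
correlations = [
    pearson([row[j] for row in inputs], target) for j in range(6)
]

for j, r in enumerate(correlations):
    print(f"input {j}: r = {r:+.2f}")
```

With only 25 samples, and only 5 in the smaller class, these coefficients are noisy, so they are best read alongside the plots rather than on their own.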
Hope this helps.
Greg