Re: Issues regarding testing of a classifier
- From: Milind Joshi <milind.a.joshi@xxxxxxxxx>
- Date: Tue, 05 Feb 2008 11:33:08 GMT
On Jan 28, 9:35 pm, talse...@xxxxxxxxx wrote:
Hi all,
I have a general question that I hope you can help me with.
Suppose I have a classifier A that discriminates between two classes:
class W and B (White balls and Black balls, respectively).
Suppose I have to run the classifier on a vast set of balls (:= P), in
which the distribution of White and Black balls is unknown (Which
means I don't know the a-priori probability of getting a white or a
black ball to examine).
Now I would like to test the classifier. I choose a subset of P (:=N)
that consists of N balls and run the experiment to get the ROC curve
of the classifier.
My question is: What is the best way to set the distribution of White
and Black balls in N if the distribution of P is unknown? 0.5*N Black
balls and 0.5*N White balls sounds right, but is it really right?! And
how would the answer change if the distribution of P can be determined?
Hi,
I would say "testing" is probably not the right word for your
scenario: testing is done against a known outcome or a known
probability.
What you are trying to do is estimate the probability that your
classifier accurately reflects the real life distribution without
knowing the real life distribution.
In essence, you are trying to guess. Could work, if you know the
domain well.
Techniques such as sampling a portion of your domain to estimate the
class probabilities are a good idea.
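As a rough illustration of the sampling idea, here is a minimal Python
sketch that draws a random sample of balls and puts an approximate 95%
interval around the estimated fraction of white balls. The population
(30% white), the sample size of 500 and the helper name wilson_interval
are all made up for the example; the interval used is the standard
Wilson score interval for a binomial proportion.

```python
import math
import random

def wilson_interval(successes, n, z=1.96):
    """Approximate Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% coverage.
    """
    if n == 0:
        raise ValueError("need at least one sample")
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Hypothetical population P: 30% white balls. In the real problem this
# proportion is exactly what is unknown.
rng = random.Random(0)
population = ["W"] * 3000 + ["B"] * 7000

# Draw a simple random sample (without replacement) and count the whites.
sample = rng.sample(population, 500)
n_white = sample.count("W")
p_hat = n_white / len(sample)

low, high = wilson_interval(n_white, len(sample))
print(f"estimated P(white) = {p_hat:.3f}, approx. 95% interval = ({low:.3f}, {high:.3f})")
```

The sample has to be drawn at random from P for this to mean anything;
a hand-picked 50/50 subset tells you nothing about the prior.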
In this case, what could help (if known) is:
1. The probability that your large sample is distributed as you
imagine it to be
2. The probability that your classifier works well with the class of
problem you described
3. How much confidence you can reasonably place in the accuracy of
your guess.
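On the 50/50 question itself, one point may help: a ROC curve is built
from the true positive rate and the false positive rate, and each of
those is computed within a single class. So the class mix of the test
set does not shift the ROC point in expectation; it only changes how
noisy each rate is. Precision (and accuracy) do depend on the mix. The
sketch below is a toy simulation under assumed conditions, not your
classifier: white is treated as the positive class, scores are drawn
from two made-up Gaussians, and the threshold of 0.5 is arbitrary.

```python
import random

def rates(n_white, n_black, threshold=0.5, seed=0):
    """Return (TPR, FPR, precision) for a toy score-based classifier.

    Assumed for illustration: white balls score ~ N(1, 1), black balls
    score ~ N(0, 1), and a ball is called "white" if score > threshold.
    """
    rng = random.Random(seed)
    white_scores = [rng.gauss(1.0, 1.0) for _ in range(n_white)]
    black_scores = [rng.gauss(0.0, 1.0) for _ in range(n_black)]
    tp = sum(s > threshold for s in white_scores)  # whites called white
    fp = sum(s > threshold for s in black_scores)  # blacks called white
    tpr = tp / n_white
    fpr = fp / n_black
    precision = tp / (tp + fp)
    return tpr, fpr, precision

balanced = rates(5000, 5000)        # 50% white test set
skewed = rates(1000, 9000, seed=1)  # 10% white test set

for name, (tpr, fpr, prec) in (("balanced", balanced), ("skewed", skewed)):
    print(f"{name:9s} TPR={tpr:.3f} FPR={fpr:.3f} precision={prec:.3f}")
```

In runs like this the TPR and FPR come out about the same for both test
sets, while the precision drops sharply on the skewed one. So a balanced
test set is a reasonable choice for estimating the ROC curve, but
anything that depends on the prior still needs an estimate of the real
distribution of P.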
Best Regards,
Milind
[ comp.ai is moderated ... your article may take a while to appear. ]