Re: Issues regarding testing of a classifier



On Jan 28, 9:35 pm, talse...@xxxxxxxxx wrote:
Hi all,

I have a general question that I hope you can help me with.

Suppose I have a classifier A that discriminates between two classes:
class W and B (White balls and Black balls, respectively).

Suppose I have to run the classifier on a vast set of balls (:= P), in
which the distribution of White and Black balls is unknown (which
means I don't know the a priori probability that the next ball to
examine is white or black).

Now I would like to test the classifier. I choose a subset of P (:=N)
that consists of N balls and run the experiment to get the ROC curve
of the classifier.

My question is: what is the best way to set the distribution of White
and Black balls in N if the distribution of P is unknown? 0.5*N Black
balls and 0.5*N White balls sounds right, but is it really right? And
how would the answer change if the distribution of P can be determined?


Hi,

I would say "testing" is probably not the right word for your
scenario: testing is done against a known outcome or a known
probability.
What you are trying to do is estimate the probability that your
classifier accurately reflects the real-life distribution without
knowing the real-life distribution.

In essence, you are guessing. That could work if you know the
domain well.

Sampling a portion of your domain to estimate the class probabilities
is a good idea.
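
As a rough sketch of that sampling idea: draw a random sample from P,
label it by hand, and put an interval around the observed fraction of
white balls. Everything below is an assumption for illustration: the
population contents, the sample size of 500, and the choice of the
Wilson score interval as the interval method. It only works if the
sample is drawn at random from P.

```python
import math
import random


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is ~95%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Hypothetical population P: in practice its makeup is unknown.
rng = random.Random(0)
population = ["W"] * 30000 + ["B"] * 70000

# Random sample (without replacement) that would be labelled by hand.
sample = rng.sample(population, 500)
whites = sample.count("W")

low, high = wilson_interval(whites, len(sample))
print(f"observed fraction of white balls: {whites / len(sample):.3f}")
print(f"approx. 95% interval for P(W):    [{low:.3f}, {high:.3f}]")
```

A wider interval is a sign that the guess about P should be held more
loosely; a larger sample narrows it.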

In this case, what could help (if known) is:

1. The probability that your large sample is distributed as you
imagine it to be

2. The probability that your classifier works well with the class of
problem you described

3. How much confidence you can place in the accuracy of your guess
(it should not be very high).
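
To make point 1 a bit more concrete, here is a small sketch of how an
assumed class distribution feeds into the numbers you report. The true
positive rate and false positive rate of one ROC operating point are
each measured within a single class, so they can be estimated on a
50/50 test set; precision and accuracy, on the other hand, follow from
those rates plus the prior via Bayes' rule. The function names and the
rates 0.9 and 0.1 are made up for illustration, and "white" is
arbitrarily treated as the positive class.

```python
def precision_at_prior(tpr, fpr, prior_white):
    """Fraction of balls called white that really are white, given P(W)."""
    true_pos = tpr * prior_white
    false_pos = fpr * (1 - prior_white)
    if true_pos + false_pos == 0:
        return float("nan")  # classifier never says "white"
    return true_pos / (true_pos + false_pos)


def accuracy_at_prior(tpr, fpr, prior_white):
    """Overall fraction of correct calls, given P(W)."""
    return tpr * prior_white + (1 - fpr) * (1 - prior_white)


# Hypothetical operating point read off a ROC curve.
tpr, fpr = 0.9, 0.1

for prior in (0.5, 0.3, 0.1, 0.01):
    prec = precision_at_prior(tpr, fpr, prior)
    acc = accuracy_at_prior(tpr, fpr, prior)
    print(f"P(W)={prior:<5} precision={prec:.3f} accuracy={acc:.3f}")
```

The same operating point looks very different as the guessed prior
changes, which is why it helps to know how far that guess can be
trusted.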

Best Regards,
Milind

[ comp.ai is moderated ... your article may take a while to appear. ]