Re: Combining Probabilities?



On May 1, 10:32 am, moogie <budgetan...@xxxxxxxxxxxxxx> wrote:
I am wondering what the best (correct?) way is to combine probabilities
to predict the next symbol in a sequence.

I have two different predictors, each of which generates a probability
between 0 and 1 that a given symbol in a set will be the next symbol in
the sequence.

One predictor has a limited set of symbols it will generate
probabilities for, while the other predictor generates a probability for
all known symbols. Thus for some symbols there will be a probability
from only one predictor.

This leads me to my potentially erroneous combination algorithm:

Currently I am using a weighted sum of these two probabilities to
generate a combined probability. This seems to work but is this
correct?

Should I use the mean of the probabilities?

Thanks

Nick

There is no best way. One technique is to average: p = w1*p1 + w2*p2,
then adjust the weights in the direction that favors the more accurate
model. You can also use sets of weights selected by some small
context.
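
As a rough illustration of the averaging idea above, here is a minimal
Python sketch for two predictors. The function names, the learning rate
and the particular update rule are assumptions for the example, not
anything taken from paq: the weight simply moves toward whichever model
gave more probability to what actually happened.

```python
def mix_linear(p1, p2, w1):
    """Weighted average p = w1*p1 + w2*p2, with w2 taken as 1 - w1."""
    return w1 * p1 + (1.0 - w1) * p2


def update_weight(w1, p1, p2, outcome, rate=0.02):
    """Nudge w1 toward the model that was more accurate this time.

    outcome is 1 if the predicted event happened, else 0.
    The rate and the update rule are illustrative assumptions.
    """
    # Probability each model assigned to what actually happened.
    q1 = p1 if outcome else 1.0 - p1
    q2 = p2 if outcome else 1.0 - p2
    w1 += rate * (q1 - q2)
    # Keep the weight in [0, 1] so the mix stays a valid probability.
    return min(1.0, max(0.0, w1))
```

To get weights selected by a small context, one option is to keep a
separate w1 per context value (for example in a dict) and update only
the one that was used.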

In paq I predict one bit at a time. I experimented with lots of
methods. In paq1 I counted 0 and 1 bits in each context, then did a
weighted sum of the 0 and 1 counts. I tuned the weights by hand. I
found that for an order n context, a weight of n^2 worked pretty
well. For this method, you can't allow the 0 and 1 counts to both get
large. After a 1 bit, you discard some 0 counts and vice versa.
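
A rough Python sketch of that count-based scheme might look like the
following. Everything specific here is an assumption for illustration:
the halving rule used to discard opposite counts, the `keep` threshold,
the small `prior` that avoids dividing by zero, and numbering the
context orders from 1 so that every context gets a nonzero n^2 weight.
It is not the actual paq1 code.

```python
def update_counts(n0, n1, bit, keep=2):
    """Count the observed bit and discard part of the opposite count.

    The halving rule and the keep threshold are illustrative assumptions.
    """
    if bit:
        n1 += 1
        if n0 > keep:
            n0 = n0 // 2 + 1
    else:
        n0 += 1
        if n1 > keep:
            n1 = n1 // 2 + 1
    return n0, n1


def predict_one(counts_by_order, prior=0.5):
    """Return P(next bit = 1) from a list of (order, n0, n1) tuples.

    Each context's counts are weighted by order squared, then summed.
    """
    s0 = s1 = 0.0
    for order, n0, n1 in counts_by_order:
        w = order * order
        s0 += w * n0
        s1 += w * n1
    return (s1 + prior) / (s0 + s1 + 2.0 * prior)
```

For example, predict_one([(1, 3, 1), (2, 0, 2)]) lets the order-2
context, which has only seen 1 bits, outvote the order-1 context.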

Starting in paq4 I used a method of adjusting the weights to favor the
models that made the most accurate prediction. I use different sets
of weights in different contexts.

Starting in paq7 each model outputs a prediction instead of 0,1
counts. I convert each probability into the logistic domain,
x = ln(p/(1-p)), then combine by weighted averaging, and convert back to
the linear domain, p = 1/(1+exp(-x)). After each prediction the weights
are adjusted to favor the best predictors.
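
Here is a small Python sketch of mixing in the logistic domain. It
assumes the forward transform is stretch(p) = ln(p/(1-p)), which is the
inverse of p = 1/(1+exp(-x)). The names stretch, squash, the clamping
constant, the learning rate and the particular weight update (a
gradient step that reduces coding cost) are assumptions for the example
rather than a copy of the paq7 code.

```python
import math

EPS = 1e-6  # assumed clamp so that ln() never sees 0 or 1


def stretch(p):
    """Map a probability to the logistic domain: ln(p / (1 - p))."""
    p = min(1.0 - EPS, max(EPS, p))
    return math.log(p / (1.0 - p))


def squash(x):
    """Map back to the linear domain: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def mix_logistic(probs, weights):
    """Weighted sum of stretched predictions, squashed back to [0, 1]."""
    x = sum(w * stretch(p) for w, p in zip(weights, probs))
    return squash(x)


def update_weights(probs, weights, bit, rate=0.02):
    """Return new weights that favor models which predicted bit better."""
    err = bit - mix_logistic(probs, weights)
    return [w + rate * err * stretch(p) for w, p in zip(weights, probs)]
```

With probs = [0.9, 0.2] and a 1 bit observed, the first weight goes up
and the second goes down, since only the first model leaned toward 1.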

This is for combining hundreds of models. If you only have 2, you can
use a 2-D lookup table with interpolation, then adjust the entry after
the prediction in the direction of the actual outcome.
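
The 2-D table idea could be sketched in Python roughly as follows. The
class name, the grid size, the learning rate, the initial table
contents (the plain average of the two inputs) and the choice to update
all four neighboring entries in proportion to their interpolation
weights are assumptions for the example.

```python
class Table2D:
    """Maps (p1, p2) to a combined probability by bilinear interpolation."""

    def __init__(self, n=17, rate=0.05):
        self.n = n
        self.rate = rate
        # Start each entry at the average of its two grid coordinates.
        self.t = [[(i + j) / (2.0 * (n - 1)) for j in range(n)]
                  for i in range(n)]

    def _locate(self, p):
        """Return the lower grid index and the fractional offset for p."""
        x = min(max(p, 0.0), 1.0) * (self.n - 1)
        i = min(int(x), self.n - 2)
        return i, x - i

    def _corners(self, p1, p2):
        """Yield (row, col, weight) for the four surrounding entries."""
        i, fi = self._locate(p1)
        j, fj = self._locate(p2)
        for a, wa in ((i, 1.0 - fi), (i + 1, fi)):
            for b, wb in ((j, 1.0 - fj), (j + 1, fj)):
                yield a, b, wa * wb

    def predict(self, p1, p2):
        return sum(w * self.t[a][b] for a, b, w in self._corners(p1, p2))

    def update(self, p1, p2, outcome):
        """Move the entries used toward the actual outcome (0 or 1)."""
        for a, b, w in self._corners(p1, p2):
            self.t[a][b] += self.rate * w * (outcome - self.t[a][b])
```

Typical use is to call predict(p1, p2) to code the symbol, then call
update(p1, p2, outcome) once the actual outcome is known.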

See also http://en.wikipedia.org/wiki/PAQ

-- Matt Mahoney
