Re: Hidden Markov Models
- From: "Ted Dunning" <ted.dunning@xxxxxxxxx>
- Date: Thu, 27 Oct 2005 01:15:37 GMT
albinali wrote:
> For a particular application on a desktop (e.g. an FTP client), if we
> monitor the sequence of mouse events (particularly mouse clicks and
> their coordinates), can we determine if the application will attempt to
> transfer data on a wireless card.
I would think that you could make do with simpler mechanisms than HMMs,
especially after you apply the input processing that will be necessary
to make the HMMs work right.
In general, it is best to state your problem in somewhat more basic
terms. In my view, these basic terms have to do with the goal, the
inputs available and how you will evaluate your system. These terms
should have almost nothing to do with the implementation of your
solution although your thoughts about implementation might help you go
back and negotiate a restatement of the problem.
Taking the liberty of restating your problem for you, I think you are
saying:
---------------------------------
Goal:
To recognize when network activity is about to happen.
Inputs:
The system will get low-level input events such as come from the mouse
without any knowledge of the window or application structure.
Output:
The output of the system should be a probability of network activity
within the next 5 seconds.
Evaluation:
The output probability p will be sampled once per second and a score of
n log p + (1-n) log (1-p) will be accumulated where n is 1 if network
activity occurs in the succeeding 5 seconds and 0 otherwise.
------------------------
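The evaluation rule in the restated problem is the usual log-likelihood
score for a probability forecast. A minimal sketch of how it might be
accumulated is below. The function name and the clipping constant eps
are assumptions added here so that log() stays finite when p is exactly
0 or 1; neither is part of the specification.

```python
import math

def log_score(samples, eps=1e-12):
    """Accumulate n*log(p) + (1-n)*log(1-p) over (p, n) samples.

    samples: iterable of (p, n) pairs, one per sampling instant, where p
    is the predicted probability and n is 1 if network activity occurred
    in the following window and 0 otherwise.
    eps: hypothetical clipping constant (an assumption, not in the post).
    """
    total = 0.0
    for p, n in samples:
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += n * math.log(p) + (1 - n) * math.log(1.0 - p)
    return total

# One sample per second; higher (closer to zero) is better.
print(log_score([(0.9, 1), (0.2, 0), (0.6, 1)]))
```

The score is never positive, and a confident wrong answer is punished
much harder than a hesitant one, which is why the system has to output
honest probabilities rather than yes/no guesses.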
This specification immediately raises the question of what other inputs
might be available. For instance, are keyboard events available? What
about a list of processes? What about an indication of which process
is receiving the events?
If only mouse events are available, then the obvious constraint is that
the detector should not depend on where the application initiating the
network activity is on the screen, nor on the screen dimensions, nor on
the absolute time. This suggests that deltas between successive mouse
events be used instead of the original events. Likewise, a lapse of
activity may be important, so the time since the previous event should
probably be capped at a maximum value, or events should only be
retained for a maximum time window.
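As a rough sketch of that preprocessing step, assuming each raw event
is an (x, y, t) tuple with t in seconds: the function name to_deltas
and the 5-second cap are illustrative choices, not values taken from
the discussion.

```python
def to_deltas(events, max_dt=5.0):
    """Turn raw (x, y, t) events into (dx, dy, dt) deltas.

    dt is capped at max_dt (an assumed value) so that every long pause
    looks the same, however long it actually was.
    """
    deltas = []
    for (x0, y0, t0), (x1, y1, t1) in zip(events, events[1:]):
        deltas.append((x1 - x0, y1 - y0, min(t1 - t0, max_dt)))
    return deltas

clicks = [(10, 10, 0.0), (15, 8, 0.5), (15, 8, 60.0)]
print(to_deltas(clicks))
```

Shifting every click by the same offset in position or time leaves the
output unchanged, which is the invariance asked for above.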
Note that your inputs are numerical values, not symbols. If you really
want to use something like an HMM, you need to have symbolic inputs.
An easy way to translate from numerical values to symbols is to cluster
your inputs. Three-dimensional K-means on (dx, dy, dt) should work
pretty well. You will need to adjust the number of clusters to match
your training data. The best way to do this is to use held-out data to
evaluate your performance and retrain with different numbers of
clusters. Note that most clustering algorithms are likely to group all
events that occur after a long delay together as a single symbol. This
is a nice property since it effectively gives you a DELAY symbol.
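A bare-bones sketch of the clustering step, written in plain Python so
that it has no dependencies. It assumes the inputs are (dx, dy, dt)
tuples; the names kmeans, nearest and symbolize, the fixed iteration
count and the random initialization are all illustrative choices. In
practice you would probably rescale the three dimensions first (pixels
and seconds are not comparable) and use a library implementation.

```python
import random

def nearest(p, centers):
    """Index of the center closest to p (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns a list of k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k of the data points
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Move each center to the mean of its group; keep it if empty.
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

def symbolize(deltas, centers):
    """Replace each (dx, dy, dt) by the index of its nearest center."""
    return [nearest(d, centers) for d in deltas]

deltas = [(3, 0, 0.1), (2, 1, 0.2), (0, 0, 5.0), (1, 0, 5.0)]
centers = kmeans(deltas, k=2)
print(symbolize(deltas, centers))
```

The cluster indices are the symbols. Which index ends up standing for
the long-delay cluster is arbitrary and depends on the initialization.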
Once you get this far, it is fair to ask why you want to use an HMM at
all. Why not just have a list of strings that precede activity with
associated probabilities of activity? In particular, any HMM that you
might use would need as many internal states as there are intermediate
common prefix strings for valid inputs. As such, you really might just
be better off with a lookup table.
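A lookup table of this kind might look roughly like the sketch below.
It assumes the events have already been turned into symbols and that
each event carries a 0/1 label saying whether network activity
followed it. The string length n, the function names and the smoothing
toward a prior are assumptions added for illustration; the smoothing
is there only so that strings never seen in training do not yield a
probability of exactly 0 or 1.

```python
from collections import defaultdict

def build_table(symbols, labels, n=3):
    """Count how often each length-n symbol string precedes activity.

    symbols[i] is the symbol of event i; labels[i] is 1 if network
    activity followed event i within the horizon, else 0.
    Returns {string: (activity_count, total_count)}.
    """
    counts = defaultdict(lambda: [0, 0])
    for i in range(n - 1, len(symbols)):
        key = tuple(symbols[i - n + 1 : i + 1])
        counts[key][0] += labels[i]
        counts[key][1] += 1
    return {k: tuple(v) for k, v in counts.items()}

def predict(table, recent, prior=0.5, strength=1.0):
    """Smoothed estimate of P(activity) given the last n symbols."""
    hits, total = table.get(tuple(recent), (0, 0))
    return (hits + prior * strength) / (total + strength)

symbols = [0, 1, 2, 0, 1, 2, 0, 1, 3]
labels = [0, 0, 1, 0, 0, 1, 0, 0, 0]
table = build_table(symbols, labels, n=3)
print(predict(table, [0, 1, 2]))  # string seen twice, both before activity
print(predict(table, [9, 9, 9]))  # unseen string falls back to the prior
```

The string length plays much the same role as the number of clusters:
it can be tuned on held-out data using the same log score.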
Please let me know if this helps.