- #1
malina
Hi All,
I need to train an HMM using data with sequences of variable length (5 - 500 symbols per input sequence).
From what I've seen so far, all (or most) training is performed on data sets of a fixed length, although nothing in the HMM structure explicitly requires this.
So, first of all: what am I missing? Is it really inadvisable to train an HMM on variable-length data? Does this violate the stochastic assumptions of the EM/Viterbi algorithms?
Next, the model I obtain gives "good" performance on "short" sequences, but as the sequences get longer the performance decreases (and sometimes increases again). I can attribute this to two possible causes:
1) Longer sequences have dynamics that the HMM does not capture because they are a minority of the training set, hence the "random" prediction behavior.
2) The HMM gets stuck on a short-sequence model (which is close to a rephrasing of (1), but not exactly).
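To make the length comparison concrete, here is a minimal sketch of how scores for sequences of different lengths could be put on the same scale, assuming "performance" is measured by model likelihood (that is an assumption, it may not match every setup). The raw log-likelihood of a sequence falls as the sequence gets longer simply because more probabilities are multiplied together, so the sketch divides by the sequence length. The two-state model and its parameters (START, TRANS, EMIT) are made up for illustration and are not my trained model.

```python
import math

# Hypothetical toy model: 2 hidden states, 2 observable symbols (0 and 1).
START = [0.6, 0.4]                  # P(first state)
TRANS = [[0.7, 0.3], [0.4, 0.6]]    # TRANS[i][j] = P(next state j | state i)
EMIT = [[0.9, 0.1], [0.2, 0.8]]     # EMIT[i][k] = P(symbol k | state i)


def forward_loglik(seq, start, trans, emit):
    """Scaled forward algorithm: log P(seq | model) for a discrete HMM.

    Assumes seq is non-empty and has non-zero probability under the model.
    """
    n_states = len(start)
    # Joint probability of the first symbol and each starting state.
    alpha = [start[i] * emit[i][seq[0]] for i in range(n_states)]
    loglik = 0.0
    for t in range(len(seq)):
        if t > 0:
            # Propagate one step through the transitions, then emit seq[t].
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states))
                * emit[j][seq[t]]
                for j in range(n_states)
            ]
        # Rescale at every step so long sequences do not underflow;
        # the log of the scale factors sums to the log-likelihood.
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


def per_symbol_loglik(seq, start, trans, emit):
    """Length-normalized score, comparable across sequence lengths."""
    return forward_loglik(seq, start, trans, emit) / len(seq)


if __name__ == "__main__":
    short_seq = [0, 0, 1, 0, 0]
    long_seq = short_seq * 100      # 500 symbols, same local pattern
    for name, seq in (("short", short_seq), ("long", long_seq)):
        total = forward_loglik(seq, START, TRANS, EMIT)
        norm = per_symbol_loglik(seq, START, TRANS, EMIT)
        print(name, len(seq), round(total, 3), round(norm, 4))
```

If the per-symbol score still degrades on long sequences, that would point to cause (1) or (2) rather than to an artifact of comparing raw scores across lengths.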
Can someone please advise on the matter?
Thanks!