Linear regression model of short k-word: a similarity distance suitable for biological sequences with various lengths
2013
Yang, Xiwu | Wang, Tianming
Originating from sequences' length difference, both k-word based methods and graphical representation approaches have uncovered biological information in their distinct ways. However, it is less likely that the mechanisms of information storage vary with sequences' length. A similarity distance suitable for sequences with various lengths will be much near to the mechanisms of information storage. In this paper, new sub-sequences of k-word were extracted from biological sequences under a one-to-one mapping. The new sub-sequences were evaluated by a linear regression model. Moreover, a new distance was defined on the invariants from the linear regression model. With comparison to other alignment-free distances, the results of four experiments demonstrated that our similarity distance was more efficient.
Показать больше [+] Меньше [-]Ключевые слова АГРОВОК
Библиографическая информация
Эту запись предоставил National Agricultural Library