The non-coding elements of a genome, many of which were once dismissed as junk, have begun to gain long-overdue recognition, with microRNAs as the best current example. MicroRNAs bind preferentially to the 3′ untranslated regions (UTRs) of their target genes and, in most cases, negatively regulate their expression. Several microRNA:target prediction tools have been developed on the basis of various assumptions, and the majority of them consider the free energy of binding between a target and its microRNA, together with seed conservation. However, the average concordance between the predictions made by these tools is limited, and the problem is compounded by a large number of false-positive results. In this study, we describe a methodology that we developed to refine the output of microRNA:target prediction tools, based on observations made from a comprehensive study. We incorporated information from the dinucleotide content variation patterns recorded for the flanking regions around target sites, using support vector machines (SVMs) trained on two different major sources of experimental data, besides other sources. We assessed the performance of our methodology with rigorous tests over four different dataset models and also compared it with a recently published refinement tool, MirTif. Our methodology attained a higher average accuracy (0.88), with average sensitivity and specificity of 0.81 and 0.94, respectively, and the areas under the curve (AUCs) for all four models were above 0.9, suggesting better performance by our methodology and a possible role of flanking regions in the control of microRNA targeting. We applied our methodology to genes of three different pathways, namely toll-like receptor (TLR), apoptosis and insulin, to predict the most probable targets. We also investigated their possible regulatory associations and identified an hsa-miR-23a regulatory module.
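As an illustration of the kind of feature the abstract refers to, the following minimal Python sketch computes dinucleotide frequencies for the regions flanking a putative target site and derives accuracy, sensitivity and specificity from confusion-matrix counts. It is not the authors' implementation: the function names, the 30-nucleotide flank length and the concatenated 32-value feature layout are all assumptions made for illustration, and the SVM training step is not shown.

```python
from itertools import product

# The 16 possible RNA dinucleotides, in a fixed order (AA, AC, AG, AU, CA, ...).
DINUCLEOTIDES = ["".join(p) for p in product("ACGU", repeat=2)]


def dinucleotide_frequencies(seq):
    """Return the 16 relative dinucleotide frequencies of a sequence.

    Overlapping dinucleotides are counted; DNA input (T) is read as RNA (U).
    Pairs containing other symbols (e.g. N) are skipped.
    """
    seq = seq.upper().replace("T", "U")
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DINUCLEOTIDES]


def flank_features(utr, site_start, site_end, flank=30):
    """Build a 32-value feature vector from the two flanks of a target site.

    `site_start` and `site_end` are 0-based, end-exclusive coordinates of the
    site within the UTR. The flank length of 30 is an assumed default, not a
    value taken from the study.
    """
    upstream = utr[max(0, site_start - flank):site_start]
    downstream = utr[site_end:site_end + flank]
    return dinucleotide_frequencies(upstream) + dinucleotide_frequencies(downstream)


def confusion_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
    }


if __name__ == "__main__":
    # Hypothetical toy UTR: the "site" is the central GGGG stretch.
    toy_utr = "AAAA" + "GGGG" + "CCCC"
    print(flank_features(toy_utr, 4, 8, flank=4))
```

Vectors of this kind, labelled with experimentally supported and unsupported sites, could then be passed to any SVM implementation; the choice of kernel and parameters is left open here because the abstract does not state them.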
Volume 45, 2020