NEETA NAIN
Articles written in Sadhana
Volume 45 Published: January 2020 Article ID 0020 Original Article (Computer Sciences)
CALAM: model-based compilation and linguistic statistical analysis of Urdu corpus
In this paper, we introduce an efficient framework for the compilation of an Urdu corpus along with ground truth and transcription in Unicode format. A novel annotation scheme based on four-level XML has been incorporated for the corpus CALAM. In addition to compilation and benchmarking tests, the framework generates the word frequency distribution according to category, which is useful for linguistic evaluation. This paper presents a statistical analysis of the corpus data based on the transcribed text and frequency of occurrences. The analysis uses key statistics such as word rank, word frequency, ligature length (number of ligatures formed from combinations of two to seven characters), and the entropy and perplexity of the corpus. Besides these basic statistics, some additional statistical features are also evaluated, such as Zipf’s linguistic rule and measures of dispersion in the corpus information. The experimental results obtained from the statistical observations are presented to assert the viability and usability of the corpus data as a standard platform for linguistic research on the Urdu language.
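As a rough illustration of the statistics named in the abstract (word rank, word frequency, entropy, perplexity, and a Zipf-style rank-frequency check), the sketch below computes them for a unigram model over a toy token list. The function name `unigram_stats`, the toy tokens, and the choice of a unigram model with base-2 entropy are assumptions made for illustration; the abstract does not specify how CALAM computes these quantities.

```python
import math
from collections import Counter


def unigram_stats(tokens):
    """Return ranked (word, count) pairs, unigram entropy in bits, and perplexity.

    Hypothetical helper for illustration; not taken from the CALAM framework.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    # Rank 1 is the most frequent word.
    ranked = counts.most_common()
    # Shannon entropy of the unigram distribution, in bits.
    entropy = -sum((c / total) * math.log2(c / total) for _, c in ranked)
    # Perplexity of a unigram model is 2 raised to its entropy.
    perplexity = 2 ** entropy
    return ranked, entropy, perplexity


# Toy stand-in for transcribed corpus text (placeholder tokens, not CALAM data).
tokens = "a b a c a b d a".split()
ranked, entropy, perplexity = unigram_stats(tokens)

# Zipf's law suggests rank * frequency stays roughly constant across ranks.
zipf_products = [rank * count for rank, (_, count) in enumerate(ranked, start=1)]
```

On a real corpus the rank-frequency pairs would normally be inspected on a log-log plot; a roughly straight line with slope near -1 is the usual sign of Zipf-like behaviour.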
Volume 47 Published: 10 August 2022 Article ID 0158
GSO-CRS: grid search optimization for collaborative recommendation system
Many online platforms have adopted a recommender system (RS) to suggest an actual product to the active users according to their preferences. The RS that provides accurate information on users’ past preferences is known as collaborative filtering (CF). One of the most common CF methods is matrix factorization (MF). It is important to note that the MF technique contains several tuned parameters, leading to an expensive and complex black-box optimization problem. An objective function quantifies the quality of a prediction by mapping any possible configuration of hyper-parameters to a numerical score. In this article, we show how a grid search optimization (GSO) can efficiently obtain the optimal values of the hyper-parameters of an MF model and improve the prediction of the collaborative recommender system (CRS). Specifically, we designed a 4 × 4 grid search space, obtained the optimal set of hyper-parameters, and then evaluated the model using these hyper-parameters. Furthermore, we evaluated the model using two benchmark datasets and compared it with the state-of-the-art model. We found that the proposed model significantly improves the prediction accuracy, precision@k, and NDCG@k over the state-of-the-art models and handles the sparsity problem of CF.
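The idea of mapping each hyper-parameter configuration to a score and searching a 4 × 4 grid can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which two hyper-parameters form the grid, so the sketch assumes learning rate and regularization strength, uses a plain SGD matrix factorization, and scores each configuration by validation RMSE instead of the paper's precision@k and NDCG@k. All function names and the toy ratings are hypothetical.

```python
import itertools
import random


def train_mf(ratings, n_users, n_items, k, lr, reg, epochs=30, seed=0):
    """Fit user factors P and item factors Q by SGD on (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step with L2 regularization on both factors.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(ratings, P, Q):
    """Objective function: root mean squared error of predicted ratings."""
    se = sum((r - sum(a * b for a, b in zip(P[u], Q[i]))) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5


def grid_search(train, valid, n_users, n_items, lrs, regs, k=2):
    """Evaluate every (lr, reg) pair; return the best triple and all results."""
    results = []
    for lr, reg in itertools.product(lrs, regs):
        P, Q = train_mf(train, n_users, n_items, k, lr, reg)
        results.append((rmse(valid, P, Q), lr, reg))
    # Lowest validation RMSE wins.
    return min(results), results


# Toy ratings (assumed data, not from the benchmark datasets in the paper).
train = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1), (2, 1, 5),
         (2, 3, 2), (3, 2, 2), (3, 3, 1), (0, 3, 1), (1, 1, 4)]
valid = [(0, 2, 1), (2, 0, 5), (3, 1, 3)]
lrs = [0.005, 0.01, 0.02, 0.05]   # 4 candidate learning rates (assumed)
regs = [0.001, 0.01, 0.05, 0.1]   # 4 candidate regularization values (assumed)

best, results = grid_search(train, valid, 4, 4, lrs, regs)
best_score, best_lr, best_reg = best
```

Exhaustive search over a 4 × 4 grid costs 16 training runs, which keeps the black-box optimization tractable at the price of only exploring the listed candidate values.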
© 2022-2023 Indian Academy of Sciences, Bengaluru.