Molecular descriptors are numerical values that summarize information about a molecule’s structure, topology, geometry, or physicochemical properties in a form suitable for machine learning or statistical modeling. ARKA (Arithmetic Residuals in K-Groups Analysis) descriptors differ from traditional descriptors in that they are not computed directly from the molecular structure: they are derived by grouping and weighting conventional descriptors, with the aim of capturing subtle structural patterns and improving predictive accuracy. They are designed to be both interpretable and well suited to modeling nonlinear relationships in QSAR studies.
Comparisons
Although QSAR is essentially a similarity-based approach, activity or property cliffs can greatly reduce the predictive accuracy of the resulting models.[1] Activity cliffs are compounds that are similar in structure but differ considerably in activity. The Arithmetic Residuals in K-groups Analysis (ARKA) approach is a supervised dimensionality reduction technique, developed by the DTC Laboratory at Jadavpur University, that can identify activity cliffs in a data set.[2] The basic idea of the ARKA descriptors is to group the conventional QSAR descriptors according to a predefined criterion and then assign a weight to each descriptor within its group. ARKA descriptors have also been used to develop classification-based[3] and regression-based[4] QSAR models with acceptable quality statistics.
A tutorial presentation on the ARKA descriptors is available. More recently, a multi-class ARKA framework has been proposed for improved q-RASAR model generation.[21]
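The group-then-weight idea can be illustrated with a short Python sketch. This is not the published ARKA formula: the grouping criterion (which class has the higher mean descriptor value), the weights (absolute difference in class means, normalized within each group), the min-max scaling, and the function name `arka_like_descriptors` are all assumptions chosen only to show how many conventional descriptors might collapse into two composite values for a binary endpoint.

```python
import numpy as np

def arka_like_descriptors(X, y):
    """Collapse descriptors into two group-wise weighted sums (illustrative only).

    X: (n_samples, n_descriptors) array of conventional descriptors.
    y: (n_samples,) binary labels, 1 = active, 0 = inactive.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Min-max scale each descriptor to [0, 1]; constant columns map to 0.
    col_min = X.min(axis=0)
    span = X.max(axis=0) - col_min
    span[span == 0] = 1.0
    Xs = (X - col_min) / span

    # ASSUMED grouping criterion: which class has the higher mean value.
    diff = Xs[y == 1].mean(axis=0) - Xs[y == 0].mean(axis=0)
    group_1 = diff > 0      # descriptors higher on average in actives
    group_2 = ~group_1      # all remaining descriptors

    # ASSUMED weights: absolute mean difference, normalized within a group.
    def group_score(mask):
        w = np.abs(diff[mask])
        total = w.sum()
        if total == 0:
            return np.zeros(Xs.shape[0])
        return Xs[:, mask] @ (w / total)

    # One row per compound, one column per descriptor group.
    return np.column_stack([group_score(group_1), group_score(group_2)])
```

Calling `arka_like_descriptors(X, y)` on a descriptor matrix returns one row per compound with two columns. Under the assumptions above, plotting one column against the other gives a two-dimensional view in which compounds lying among members of the opposite class could be inspected as possible activity cliffs.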
Kar, Supratik; Gallagher, Andrea (2024). "Comparative QSAR and q-RASAR Modeling for Aquatic Toxicity of Organic Chemicals to Three Trout Species: O. Clarkii, S. Namaycush, and S. Fontinalis". Journal of Hazardous Materials. 480. Bibcode:2024JHzM..48036060K. doi:10.1016/j.jhazmat.2024.136060. PMID 39393319.
Rahimi-Soujeh, Zaniar; Safaie, Naser; Moradi, Sajad; Abbod, Mohsen; Sharifi, Rouhalah; Mojerlou, Shideh; Mokhtassi-Bidgoli, Ali (2024). "New binary mixtures of fungicides against Macrophomina phaseolina: machine learning-driven QSAR, read-across prediction, and molecular dynamics simulation". Chemosphere. 366. Bibcode:2024Chmsp.36643533R. doi:10.1016/j.chemosphere.2024.143533.
Abdellatif, Hayet; Laidi, Maamar; Si-Moussa, Cherif; Amrane, Abdeltif; Euldji, Imane; Benmouloud, Widad (2024). "Contributions to the development of prediction models for the toxicity of ionic liquids". Structural Chemistry. 36 (3): 865–886. doi:10.1007/s11224-024-02411-4.
Sun, Ting; Wei, Chongzhi; Liu, Yang; Ren, Yueying (2024). "Explainable machine learning models for predicting the acute toxicity of pesticides to sheepshead minnow (Cyprinodon variegatus)". Science of the Total Environment. 957. Bibcode:2024ScTEn.95777399S. doi:10.1016/j.scitotenv.2024.177399. PMID 39521088.
Banjare, Purusottam; Murmu, Anjali; Matore, Balaji Wamanrao; Singh, Jagadish; Papa, Ester; Roy, Partha Pratim (2024). "Unveiling the interspecies correlation and sensitivity factor analysis of rat and mouse acute oral toxicity of antimicrobial agents: first QSTR and QTTR Modeling report". Toxicology Research. 13 (6): tfae191. doi:10.1093/toxres/tfae191. PMC 11569388. PMID 39559274.