Article information
2024, Volume 29, No. 1, p. 45-58
Berikov V.B., Kutnenko O.A., Pestunov I.A.
Weakly supervised group classification
Weakly supervised learning allows for uncertainty or fuzziness in the labelling. The current study addresses this problem in the formulation of group binary classification. It is assumed that each sample object may include a set of sub-objects belonging to one of two classes. Objects are described by a set of features; the target feature determines the degree to which an object belongs to the “positive” class. The task is to construct a decision function from the training sample that predicts the target feature for new objects. The proposed method is based on selecting an informative feature space and filtering the training sample. Both the selection of informative features and the removal of noisy observations are carried out by analysing the local neighbourhood of objects. The degree of similarity between an object and a class is determined by the 𝑘 nearest neighbours of the object, taking into account their degree of belonging to the target class. For an experimental study of the developed method, a real problem is solved: analysing brain tomography images to predict the degree of damage to brain areas in stroke. The results are compared with a number of known methods. A method for constructing a decision function that predicts the degree of belonging of an object to the target class has been developed. The experimental study and the comparison with several well-known machine learning algorithms (random forest, support vector machine, 𝑘NN) confirmed the efficiency of the method for predicting the degree of damage to brain areas in stroke patients. Unlike other similar algorithms, the proposed method identifies a set of the most informative features, which improves the interpretability of the solution and reduces the effect of overfitting.
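The neighbourhood-based idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes that the degree of belonging of a new object is estimated as the plain mean of the membership degrees of its 𝑘 nearest training objects (Euclidean distance), and that a training object counts as noise when its own label differs from the leave-one-out estimate by more than a threshold. The names knn_membership, filter_noisy and the parameter tol are hypothetical and chosen only for illustration.

```python
import numpy as np

def knn_membership(X_train, y_train, x, k=3):
    """Mean membership degree of the k nearest training objects.

    Assumption: a plain average over Euclidean neighbours; the article
    may weight neighbours differently.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    dists = np.linalg.norm(X_train - np.asarray(x, dtype=float), axis=1)
    nearest = np.argsort(dists, kind="stable")[:k]
    return float(y_train[nearest].mean())

def filter_noisy(X_train, y_train, k=3, tol=0.5):
    """Boolean mask of training objects to keep.

    Assumption: an object is treated as noise when its label deviates
    from the leave-one-out neighbour estimate by more than tol.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    idx = np.arange(len(X_train))
    keep = []
    for i in idx:
        others = idx != i  # leave the object itself out
        est = knn_membership(X_train[others], y_train[others], X_train[i], k)
        keep.append(abs(est - y_train[i]) <= tol)
    return np.array(keep)

# Toy data: two clusters with low and high membership degrees.
X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.2], [1.0, 1.0], [0.9, 1.1]]
y = [0.1, 0.2, 0.0, 0.9, 1.0]
score_low = knn_membership(X, y, [0.05, 0.05], k=3)
score_high = knn_membership(X, y, [1.0, 1.05], k=2)

# Add one object whose label contradicts its neighbourhood.
X_noisy = X + [[0.05, 0.1]]
y_noisy = y + [1.0]
keep = filter_noisy(X_noisy, y_noisy, k=3, tol=0.5)
```

On this toy sample the query near the origin receives a low estimated degree, the query near (1, 1) a high one, and only the contradicting object is flagged for removal. Feature selection, which the abstract also bases on the local neighbourhood, is not covered by the sketch.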
Keywords: weakly supervised learning, group classification, informative features, filtering of sample objects, computed tomography
doi: 10.25743/ICT.2024.29.1.005
Author(s):
Berikov Vladimir Borisovich, Dr., Associate Professor
Position: General Scientist
Office: Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Sciences
Address: 630090, Russia, Novosibirsk, 4, Acad. Koptyug Avenue
Phone Office: (383) 3333291
E-mail: berikov@math.nsc.ru
SPIN-code: 8108-2591
Kutnenko Olga Andreevna, PhD, Associate Professor
Position: Senior Research Scientist
Office: Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Sciences
Address: 630090, Russia, Novosibirsk, 4, Acad. Koptyug Avenue
E-mail: olga@math.nsc.ru
SPIN-code: 7600-1424
Pestunov Igor Alekseevich, PhD, Associate Professor
Position: Leading research officer
Office: Federal Research Center for Information and Computational Technologies
Address: 630090, Russia, Novosibirsk, Ac. Lavrentiev ave., 6
Phone Office: (383) 334-91-55
E-mail: pestunov@ict.nsc.ru
SPIN-code: 9159-3765
References:
1. Van Engelen J.E., Hoos H.H. A survey on semi-supervised learning. Machine Learning. 2020; 109(2):373–440.
2. Cohn D.A., Ghahramani Z., Jordan M.I. Active learning with statistical models. Journal of Artificial Intelligence Research. 1996; (4):129–145.
3. Zhou Z.-H. A brief introduction to weakly supervised learning. National Science Review. 2018; 5(1):44–53.
4. Muhlenbach F., Lallich S., Zighed D. Identifying and handling mislabelled instances. Journal of Intelligent Information Systems. 2004; (22):89–109.
5. Borisova I.A., Kutnenko O.A. The problem of correction diagnostic errors in the target attribute with the function of rival similarity. Mathematical Biology and Bioinformatics. 2018; 13(1):38–49. (In Russ.)
6. Raykar V.C., Yu S., Zhao L.H., Florin C., Bogoni L., Moy L. Learning from crowds. Journal of Machine Learning Research. 2010; (11):1297–1322.
7. Zhou Z.-H. Ensemble methods: foundations and algorithms. Boca Raton: CRC Press; 2012: 218.
8. Gao W., Wang L., Li Y.F., Zhou Z.-H. Risk minimization in the presence of label noise. Proceedings of the 30th AAAI Conference on Artificial Intelligence. Phoenix, AZ; 2016: 30(1).
9. Huang K., Shi Y., Zhao F., Zhang Z., Tu S. Multiple instance deep learning for weakly supervised visual object tracking. Signal Processing: Image Communication. 2020; (84):115807.
10. Gao W., Zhang T., Yang B.-B., Zhou Z.-H. On the noise estimation statistics. Artificial Intelligence. 2021; (293):103451.
11. Berikov V., Litvinenko A., Pestunov I., Sinyavskiy Yu. On a weakly supervised classification problem. Lecture Notes in Computer Science. Springer; 2022; (13217):315–329. DOI:10.1007/978-3-031-16500-9_26.
12. Foulds J., Frank E. A review of multi-instance learning assumptions. The Knowledge Engineering Review. 2010; 25(1):1–25.
13. Dietterich T.G., Lathrop R.H., Lozano-Pérez T. Solving the multiple instance problem with axis-parallel rectangles. Artificial Intelligence. 1997; 89(1–2):31–71.
14. Abusev R.A. On group choice procedures for problems of classification and reliability in the case of lognormal variance. Journal of Mathematical Sciences. 2013; 189(6):911–918. DOI:10.1007/s10958-013-1231-y.
15. Petrovsky A.B. Methods for the group classification of multi-attribute objects (part 1). Scientific and Technical Information Processing. 2010; 37(5):346–356.
16. Xiao Y., Zijian Y., Bo L. A similarity-based two-view multiple instance learning method for classification. Knowledge-Based Systems. 2020; (201):105661.
17. Arkad’ev A.G., Braverman E.M. Obuchenie mashiny raspoznavaniyu obrazov [Machine learning for pattern recognition]. Moscow: Nauka; 1964: 112. (In Russ.)
18. Zagoruiko N.G. Prikladnye metody analiza dannykh i znaniy [Applied methods of data and knowledge analysis]. Novosibirsk: Izdatel’stvo Instituta Matematiki SO RAN; 1999: 270. (In Russ.)
19. Li Y., Li T., Liu H. Recent advances in feature selection and its applications. Knowledge and Information Systems. 2017; (53):551–577. DOI:10.1007/s10115-017-1059-8.
20. Zagoruiko N.G., Kutnenko O.A. Recognition methods based on the AdDel algorithm. Siberian Journal of Industrial Mathematics. 2004; 7(1(17)):39–47. (In Russ.)
21. Barabash Yu.L., Varskiy B.V., Zinoviev V.T. Avtomaticheskoe raspoznavanie obrazov [Automatic pattern recognition]. Kiev: Izdatel’stvo KVAIU; 1963: 168. (In Russ.)
22. Merill T., Green O.M. On the effectiveness of receptors in recognition systems. IEEE Transactions on Information Theory. 1963; (IT–9):11–17.
23. Zagoruiko N.G. Kognitivnyy analiz dannykh [Cognitive data analysis]. Novosibirsk: Akademicheskoe Izdatel’stvo GEO; 2013: 186. (In Russ.)
24. Kalmutskiy K., Tulupov A., Berikov V. Recognition of tomographic images in the diagnosis of stroke. Del Bimbo A. et al. (Eds.) Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science. Springer, Cham; 2021; (12665). DOI:10.1007/978-3-030-68821-9_16.
Bibliography link: Berikov V.B., Kutnenko O.A., Pestunov I.A. Weakly supervised group classification // Computational technologies. 2024. V. 29. No. 1. P. 45-58