Contents lists available at ScienceDirect
Expert Systems With Applications
journal homepage: www.elsevier.com/locate/eswa
Classification of colorectal cancer based on the association of multidimensional and multiresolution features
Matheus Gonçalves Ribeiro a, Leandro Alves Neves a,∗, Marcelo Zanchetta do Nascimento b, Guilherme Freire Roberto b, Alessandro Santana Martins c, Thaína Aparecida Azevedo Tosta d
a Department of Computer Science and Statistics (DCCE), São Paulo State University (UNESP), Rua Cristóvão Colombo, 2265, São José do Rio Preto, São Paulo 15054-000, Brazil
b Faculty of Computation (FACOM) - Federal University of Uberlândia (UFU), Avenida João Neves de Ávila 2121, Bl.B, Uberlândia, Minas Gerais 38400-902, Brazil
c Federal Institute of Triangulo Mineiro (IFTM), Rua Belarmino Vilela Junqueira S/N, Ituiutaba, Minas Gerais 38305-200, Brazil
d Center of Mathematics, Computing and Cognition, Federal University of ABC (UFABC), Avenida dos Estados, 5001, Santo André, São Paulo 09210-580, Brazil
Keywords:
Colorectal cancer
Feature associations
Multiresolution features
Fractal techniques
Curvelet transforms
Haralick descriptors
Colorectal cancer is one of the most common types of cancer according to worldwide incidence statistics. A correct diagnosis of this lesion guides the choice of the most adequate treatment for affected patients. The diagnosis is made through visual analysis of tissue samples by pathologists. However, this analysis is susceptible to intra- and inter-pathologist variability, in addition to being a complex and time-consuming task. To deal with these challenges, image processing methods are developed for application to histological images obtained through the digitization of tissue samples. To this end, feature extraction and classification techniques are investigated to aid pathologists and enable a faster and more objective diagnosis. Therefore, in this work, we propose a method that associates multidimensional fractal techniques, curvelet transforms and Haralick descriptors for the study and pattern recognition of colorectal cancer, an association not yet explored in the literature. The proposed method considered a feature selection approach and different classification techniques for evaluating the associations, such as decision tree, random forest, support vector machine, naive Bayes, K*, and a polynomial method. This strategy allowed for more precise interpretations regarding the best associations for the classification of groups concerning histological images of colorectal cancer. The proposal was tested on colorectal images from two distinct datasets commonly investigated in the literature. The best result was reached with features based mainly on lacunarity and percolation obtained from curvelet sub-images, using a polynomial classifier. The tests were evaluated by applying 10-fold cross-validation, and the result was an AUC of 0.994, which is a relevant contribution to the literature on pattern recognition of colorectal cancer.
The obtained performance, together with a detailed analysis involving different types of features and classifiers, is an important contribution for pathologists, specialists interested in the study of this cancer, and histological image processing researchers who aim to develop clinically applicable computational techniques.
1. Introduction
Colorectal cancer is a malignant tumour that develops on the internal wall of the large intestine (colon) or rectum (Alteri, Kramer, & Simpson, 2014). In 2012, the International Agency for Research on Cancer (IARC) presented a study in which colorec-
∗ Corresponding author.
E-mail addresses: goncalves.mgr@sjrp.unesp.br (M.G. Ribeiro), leandro@ibilce.unesp.br (L.A. Neves).
The diagnosis of colorectal cancer can be made through sigmoidoscopy or colonoscopy, with confirmation by tissue biopsy. The biopsy yields tissue samples stained with hematoxylin and eosin (H&E) that are visually analysed by pathologists. This task is complex and time-consuming for the specialist. In order to minimise these problems, computational methods have been proposed to support pathologists in the pattern classification and recognition of colorectal tissues stained with H&E (Jørgensen et al., 2017; Kalkan, Nap, Duin, & Loog, 2012; Masood & Rajpoot, 2009; Naiyar, Asim, & Shahid, 2015; Rathore, Iftikhar, Hussain, & Jalil, 2013).
After the digitization of tissue samples, the resulting images can be analysed by computational systems. These systems enable the development and investigation of new image processing techniques. Moreover, they can support pathologists in defining diagnosis and prognosis, which consequently leads to the most adequate treatments for the patients. The feature extraction and classification steps are a relevant part of these computational methods. Through them, it is possible to explore the intrinsic image information quantified by the computational methods and relate it to the investigated type of cancer. In addition, these steps contribute to a faster and more accurate diagnosis.
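To make the feature extraction step concrete, the following minimal sketch computes a grey-level co-occurrence matrix (GLCM) and four Haralick-style texture descriptors from a small quantised image. It is an illustration only and not the implementation used in this work: the function names `glcm` and `haralick_subset`, the single horizontal pixel offset, the symmetric counting, and the choice of four descriptors are all assumptions made for the example.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalised, symmetric grey-level co-occurrence matrix for one pixel offset.

    `image` is a list of rows holding integer grey levels in [0, levels).
    The offset (dx, dy) and the symmetric counting are illustrative choices.
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no pixel pairs found for this offset")
    return [[v / total for v in row] for row in counts]


def haralick_subset(p):
    """Four classic texture descriptors computed from a normalised GLCM `p`."""
    contrast = energy = homogeneity = entropy = 0.0
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            contrast += (i - j) ** 2 * v
            energy += v * v  # angular second moment
            homogeneity += v / (1.0 + (i - j) ** 2)  # inverse difference moment
            if v > 0:
                entropy -= v * math.log2(v)
    return {
        "contrast": contrast,
        "energy": energy,
        "homogeneity": homogeneity,
        "entropy": entropy,
    }


# Toy 2x2 "images" with two grey levels: a flat patch and a checkerboard.
flat = haralick_subset(glcm([[0, 0], [0, 0]], levels=2))
checker = haralick_subset(glcm([[0, 1], [1, 0]], levels=2))
```

In this toy case the flat patch has zero contrast and the checkerboard has higher contrast and entropy, which is the kind of texture difference a classifier can then use. A real pipeline would apply such descriptors to full histological images (or to transformed sub-images) and pass the resulting feature vectors to the classification step.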