The prevalence of diabetes is rising rapidly worldwide, and diabetic retinopathy (DR), if not adequately diagnosed and treated, is a major contributor to vision loss [1]. According to the World Report on Vision 2019 by the WHO, around 2.2 billion (220 crore) people worldwide were estimated to be visually impaired, and approximately 1 billion (100 crore) of these cases could have been treated or prevented [2]. DR is among the leading causes of blindness, especially among people aged between 25 and 74 years. Patients with diabetes can reduce their risk of blindness through proper regulation of blood sugar, blood pressure, and cholesterol, combined with frequent eye examinations [3, 4]. Early detection and screening for DR play an essential role in identifying ocular problems before they affect vision and in enabling timely intervention to prevent or slow vision loss [5]. If signs of DR or maculopathy are identified, retinal screening guides both the choice of treatment and the monitoring of disease progression [6].
Established methodologies for DR screening involve timely and comprehensive eye examinations, for instance, dilated ophthalmoscopy or the evaluation of high-quality fundus images, particularly in patients without prior DR or other ocular treatments [7]. Fundus imaging has become the more widely adopted modality owing to its ability to produce high-quality digital images with red-free or color visualization and its enhanced image-processing capabilities [8, 9, 10]. Signs of DR are often more fully revealed in fundus images than in routine clinical assessments. Technological advances, including automated image analysis, large-scale data handling, and mobile applications, have improved the investigation of DR pathogenesis and the accuracy of screening [11, 12, 13].
DR develops in well-defined stages: mild DR, characterized by microaneurysms [14]; moderate DR, associated with vessel extension and blurred vision; severe DR, involving blocked vasculature and the proliferation of abnormal blood vessels; and proliferative DR, the final stage, marked by retinal detachment and potential total vision loss [15, 16]. As DR initially presents without obvious symptoms, regular retinal assessments are required for diabetic patients. Automated DR detection from color fundus images has been proposed as a cost-effective solution for population-based screening, addressing the limitations of manual analysis, which is time-consuming and reliant on ophthalmologists’ expertise [17, 18, 19]. Computer-aided diagnostic (CAD) systems offer rapid and accurate mass screening, leveraging both classical image-processing and advanced deep learning methods [20].
However, common quality problems in fundus images, such as sharpness variations, noise, uneven illumination, and poor contrast, pose significant challenges for accurate anomaly detection [21]. Improper lighting conditions can result in overly dark or bright images, reducing the visibility of pathological features. Therefore, image enhancement is a crucial pre-processing step in retinal analysis [22]. Existing enhancement methods are generally divided into three categories: transformation-based, filter-based, and histogram-based approaches [23]. Among these, the contrast limited adaptive histogram equalization technique has shown superior effectiveness [24, 25, 26]. Enhancement can be performed by (1) converting the color image to grayscale and enhancing the grayscale image, (2) splitting the image into its red, green, and blue (RGB) channels and enhancing selected channels individually before merging, or (3) applying enhancement directly in a color space [27, 28].
In the last few years, studies have mainly focused on enhancing the green channel or the luminosity (L) channel in Lab space, as these carry prominent vessel information. For example, Setiawan et al. applied the contrast limited adaptive histogram equalization (CLAHE) method to the green channel [25], whereas Alwazzan et al. applied Wiener filtering together with CLAHE to the green channel before recombining it with the red and blue channels [29]. Jin et al. transformed RGB images into Lab color space and enhanced the normalized L, a*, and b* components individually [27]. Although the L channel has been widely used, the a* and b* channels, which represent chromaticity, remain underexplored despite their potential to highlight different retinal anomalies, such as drusen or cotton wool spots. The color information of a retinal image varies with the type of disease and the image quality; focusing exclusively on the green channel may result in loss of critical information.
This research addresses these challenges by proposing a dual framework: (i) a Lab color-space-based enhancement method that uses blue channel variance to determine color dominance (CD) and select the appropriate Lab channel (a* or b*) for enhancement, thereby improving anomaly visibility across multiple retinal diseases; and (ii) an Interactive Dual Wasserstein Remora Adversarial Generative Network (IDWRAGN) designed for automated DR classification. The enhancement step improves the visibility of lesions and vessels, while the IDWRAGN model leverages dual generators and the Wasserstein distance to address the mode collapse, overfitting, and class imbalance issues commonly observed in traditional GANs. Furthermore, the adaptive remora optimization algorithm (ROA) with a learning factor accelerates convergence and improves classification performance.
The main contributions of this research work are as follows:
Novel Enhancement Strategy: A Lab color-space-based enhancement technique that adaptively selects color channels based on blue channel variance to improve anomaly visibility in multi-disease retinal images.
Robust Classification Model: Introduction of IDWRAGN for precise DR grading (no DR, mild, moderate, severe, and proliferative), overcoming limitations of conventional GANs.
Optimized Learning: Integration of adaptive ROA to enhance exploration capabilities and speed up convergence.
Comprehensive Evaluation: Assessment using multiple performance parameters (accuracy, precision, sensitivity, F1-score, and specificity) along with statistical analysis to confirm effectiveness.
Numerous studies on DR detection and classification have been reported in the literature. Some of the most significant and recent works are summarized below.
Mondal et al. [31] proposed an automatic ensemble deep learning framework for DR identification and categorization. Their method combines two deep learning architectures, an enhanced DenseNet101 and ResNeXt, for effective detection. Pre-processing applies CLAHE to enhance image contrast. Since non-proliferative DR images were limited in number, GAN-based data augmentation was applied to enlarge the training set.
Elgafi et al. [32] developed a three-stage approach to DR classification using OCT images in 2022. First, the input OCT images were segmented into retinal layers. Next, 3D structural feature metrics and reflectivity were derived from every layer. Finally, the OCT images were classified using backpropagation neural networks.
In 2022, Deepa et al. [33] experimented with a multi-phase ensemble CNN approach for precise DR detection and grading using fundus images. Initially, images were divided into multiple regions and processed using InceptionV3 and Xception models to obtain salient features from shallow-dense CNN layers. These features were then fused, and an ANN classifier was applied to the combined probability vectors. In the final phase, multiple CNN outputs were ensembled for the classification decision. This multi-stage deep learning method substantially improved DR grading accuracy.
An efficient DR detection framework incorporating fuzzy logic with digital image processing was proposed in 2022 by Bhimavarapu and Battineni [34]. Particle swarm optimization (PSO) was employed for fundus image segmentation, specifically for microaneurysm detection. PSO clustering used membership functions to group high-similarity data efficiently. Table 1 compares the advantages and disadvantages of the reviewed approaches.
Table 1. Comparative analysis of the advantages and disadvantages of the reviewed approaches
| Authors | Year | Method | Advantages | Disadvantages |
|---|---|---|---|---|
| Mondal et al. [31] | 2023 | Hybrid DenseNet101–ResNeXt | Enhanced accuracy across DR classes | Does not classify all DR subtypes |
| Elgafi et al. [32] | 2022 | OCT-based DR detection | High accuracy using LOSO cross-validation | Limited dataset (188 images) |
| Deepa et al. [33] | 2022 | MPDCNN | Enhanced DR grading accuracy | Less effective at detecting very high-risk DR |
| Bhimavarapu & Battineni [34] | 2022 | PBPSO | Fast implementation time | Does not segment the optic disc |
| AbdelMaksoud et al. [35] | 2022 | DenseNet–EyeNet hybrid | Accurate DR vs. normal classification | Not applicable to OCTA images |
| Kalaiselvi R & Vinayaki VD [38] | 2022 | R-Convolutional Network with Window Grouping Attention | Improved image segmentation | Accuracy is 95.4% |
| Gayathri et al. [42] | 2021 | Multipath CNN with J48 classifier | Lower time complexity | Unable to detect other retinal diseases |
| Erciyas & Barışçı [43] | 2021 | Faster R-CNN | Strong detection performance | Relied solely on accuracy as metric |
| Jia H et al. [40] | 2021 | MGS-ROA with DBN | Stable and dependable retinal image classification | Lower recognition rate (93.18%) |
| Bodapati et al. [45] | 2020 | Blended features–DNN | Quick convergence, lowered overfitting | Misclassified proliferative DR as moderate |
DR, diabetic retinopathy; DBN, deep belief network; ROA, Remora optimization algorithm.
Further contributions have refined these methods. AbdelMaksoud et al. [35] introduced a hybrid model combining DenseNet and EyeNet through transfer learning, enhancing EyeNet with dense blocks and hyperparameter tuning. This E-DenseNet model effectively distinguished healthy and DR-affected fundus images across multiple grades.
In 2022, Muduli et al. [38] proposed a DR detection model that combines a Remora-optimization-based multi-threshold technique for vessel segmentation and feature extraction with classification by a region-based CNN tuned with a Geese Algorithm. This approach effectively separated DR into its different stages.
Gayathri et al. [42] suggested a mixture technique where both deep learning and machine learning procedures are used for automated DR severity grading. A Multipath CNN was used for extracting both global and local features, while a J48 (C4.5) classifier achieved severity categorization, representing high performance.
Jia et al. [40] proposed a comprehensive pipeline involving CLAHE-based pre-processing, optic disc removal using watershed methods, vessel segmentation with gray-level thresholding, and abnormality detection via top-hat transformation. The optimization technique was modified into a gear- and steering-based rider optimization approach (MGS-ROA), which was employed for feature selection and network weight optimization, while feature extraction and classification were performed using a Deep Belief Network (DBN).
Image enhancement plays a key role in improving CAD systems, as retinal images are highly sensitive to quality degradation. Over the years, many enhancement procedures have been proposed to improve the visibility of retinal features and objects in fundus images. Gupta et al. [46] presented an adaptive gamma correction method applied to the luminosity channel of the Lab color space, where the weights are derived from the histogram of the input pixels. The contrast was further controlled using quantile-based histogram equalization. On images taken from the MESSIDOR dataset, the reported values of peak signal-to-noise ratio (PSNR) and SSIM are:
Quantile 3: PSNR = 27.7, SSIM = 0.66
Quantile 5: PSNR = 28.4, SSIM = 0.69
Jawad et al. [47] applied CLAHE to the normalized luminosity channel of the Lab space after segmenting the retinal region. The enhanced L channel was rescaled and recombined with the a* and b* channels, achieving a PSNR of 24.42, an entropy of 5.63, and a local contrast index of 0.57. Zhou et al. [27] proposed a method based on luminosity and contrast enhancement, applying gamma correction to the luminance gain matrix after converting RGB to HSV, followed by conversion to Lab space and application of CLAHE to the L channel. Testing on 961 low-quality images from a 4,000-image private dataset raised the overall image quality score from 0.0404 to 0.4565.
Qureshi et al. [49] proposed converting RGB images to the CIECAM02 color space and transforming the lightness component to grayscale, then applying non-linear contrast enhancement to the texture features, resulting in an average entropy of 4.60, a PSNR of 23.78, and a contrast-to-noise ratio of 8.78 on the MESSIDOR and DRIVE datasets. Dissopa et al. [50] improved contrast by applying CLAHE in Lab space, followed by histogram stretching and rescaling to bring brightness within Hubbard’s range, and evaluated the results using Lightness Order Error, Global Contrast Factor, and Quaternion Structural Similarity.
A comparative analysis of the reviewed image enhancement methods is presented in Table 1. The analysis shows that recent research applies enhancement mainly to the luminosity channel. To the best of our knowledge, existing work has not taken the color information of the image into account when performing enhancement. In the proposed technique, we measure the spread of color channel information using variance and enhance the image according to that color information.
Zhou et al. [27] focused on the MESSIDOR dataset, where they enhanced retinal images using Gamma correction applied to the gain matrix of HSV luminosity, paired with CLAHE processing on the Lab color-space’s L-channel. Their evaluation relied on a quality assessment index spanning 0–1.
Gupta et al. [46], also using MESSIDOR, introduced a Gamma adaptive correction framework on the gain luminosity matrix. This was combined with quantile-driven histogram equalization in Lab space, specifically at quantile = 3, achieving SSIM = 0.66 and PSNR = 27.67.
Jawad et al. [47] applied their technique to the DRIVE dataset, employing CLAHE on the Lab L-channel and obtaining a PSNR of 24.42, an entropy of 5.64, and a local contrast index of 0.57.
Singh et al. [48] ran experiments across DRIVE, CHASE, and STARE, where they proposed a radiance-indicator-based histogram equalization method. Performance was assessed using PSNR, Euclidean distance measures, and visual quality inspection.
Qureshi et al. [49] explored both the MESSIDOR and DRIVE datasets, applying non-linear contrast amplification to the J component of the CIECAM02 color appearance model. The study reported intensity consistency, PSNR, contrast-to-noise ratio, and entropy.
Dissopa et al. [50], working with DIARET-DB0 and STARE, performed local contrast improvement followed by standardization using Hubbard brightness scaling. Their assessment included Quaternion SSIM, lightness order error, and global contrast factor.
Kumar et al. [51], using STARE, adjusted brightness in the HSV value channel and then applied a weighted-average histogram equalization scheme. Metrics such as CEIQ, VSI, NIQE, MEME, and EBCM were used for quality appraisal.
Wang et al. [52] assessed images from DIARET-DB0 and DIARET-DB1 by decomposing fundus images into base, detail, and noise layers. Enhancement approaches differed across layers, directed by a visual adaptation model, and their results were registered in terms of local contrast index and Entropy.
Xiong et al. [53], using DIARET-DB0, DIARET-DB1, and a private dataset, presented illumination correction along with transmission-map-based foreground and background separation, followed by selective enhancement of foreground pixels. Their primary metric was the local contrast index.
Early detection of DR is paramount for preventing sight loss; however, precise diagnosis remains a challenge because fundus images require expert interpretation. Millions of patients could benefit from automated or simplified diagnostic systems, yet existing methods have several limitations. Many current DR detection methods struggle to classify DR stages effectively, are often trained on small datasets (e.g., 188 cases), and lack capabilities for high-risk DR identification or optic disc segmentation. Furthermore, lower detection rates and the misclassification of severe or proliferative DR as moderate highlight the need for more accurate and reliable identification methods.
A key factor affecting diagnostic accuracy is the poor quality of retinal fundus images, which are often degraded by non-uniform illumination, low contrast, and noise. These image-quality problems obscure retinal anomalies and hamper automated evaluation. Image enhancement during pre-processing is therefore necessary to improve anomaly visibility. Enhancement algorithms typically fall into histogram-based, transformation-based, or filter-based categories, with CLAHE being particularly effective at improving the visibility of retinal structures and local contrast [47, 48, 51].
Together, these challenges motivate the development of an integrated framework that combines robust retinal image enhancement with an advanced automated DR classification model to achieve higher accuracy, consistent grading, and improved handling of poor-quality images.
The workflow (Figure 1) is a comprehensive retinal image enhancement framework designed to improve the visual quality, contrast, and feature visibility of retinal images for research or analytical purposes. The process begins with a color retinal image, usually captured using a fundus camera. Before any further processing, the image is cropped to remove redundant background and borders, ensuring that only the region containing the retina is processed. This step is essential for focusing computational resources on the significant area of the image.

Figure 1. Flowchart of the proposed CDBROA method. CDBROA, color dominance and boosted Remora optimization algorithm with deep adversarial approach; CLAHE, contrast-limited adaptive histogram equalization; CWF, color Wiener filtering.
After cropping, the proposed method proceeds through two basic processes: RGB channel separation and color space conversion. In the RGB color model, each image is composed of red, green, and blue channels. Among these, the blue channel carries notable textural and contrast information, but it is often the noisiest owing to lower reflectivity in retinal images. The variance of the blue channel (a measure of the spread of its intensity distribution) is therefore computed to determine the enhancement required. The image is then converted from RGB to the Lab color space, where “L” represents lightness, and “a” and “b” are the color-opponent dimensions. The Lab color space is perceptually uniform, meaning that changes in this space correspond closely to human visual perception, which makes it well suited to controlled contrast enhancement.
The blue channel variance serves as the guiding parameter for the subsequent stage. Based on its value, CLAHE is applied selectively to either the “a” or the “b” channel in Lab color space. CLAHE is an adaptive contrast enhancement method that divides the image into small regions (tiles) and equalizes each separately, thus improving local contrast while preventing over-amplification of noise. Applying CLAHE to the Lab color components gives fine control over color enhancement without modifying brightness or introducing artifacts.
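To make the clipping idea concrete, the following is a minimal sketch of contrast-limited equalization for a single tile. It is an illustration under stated assumptions rather than the implementation used in this work: the function name `clipped_equalization_map`, the uniform redistribution of the clipped excess, and the omission of the blending between neighboring tiles that full CLAHE performs are all simplifications introduced here.

```python
def clipped_equalization_map(hist, clip_limit):
    """Return a gray-level lookup table for one tile.

    hist       : list of pixel counts per gray level for the tile
    clip_limit : maximum count allowed per level before redistribution
    """
    levels = len(hist)
    # Clip each bin and collect the excess that was cut off.
    excess = sum(max(h - clip_limit, 0) for h in hist)
    clipped = [min(h, clip_limit) for h in hist]
    # Spread the excess uniformly over all bins (assumed redistribution rule).
    bonus = excess / levels
    clipped = [c + bonus for c in clipped]

    # Build the mapping from the cumulative distribution of the clipped histogram.
    total = sum(clipped)
    mapping, cdf = [], 0.0
    for count in clipped:
        cdf += count
        mapping.append(round((levels - 1) * cdf / total))
    return mapping


# Toy 8-level tile whose histogram has one dominant peak at level 2.
tile_hist = [0, 2, 40, 3, 1, 0, 2, 0]
plain = clipped_equalization_map(tile_hist, clip_limit=max(tile_hist))
limited = clipped_equalization_map(tile_hist, clip_limit=8)
```

With no effective clipping (`plain`), the dominant peak claims most of the output range; with a clip limit (`limited`), the jump in the mapping across the peak is smaller, which is how the method avoids over-amplifying noise in near-uniform regions.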
After the CLAHE operation, the modified channel is merged back with the others into a single image, rebuilding the full color composition. The merged image is then subjected to color Wiener filtering (CWF), a filtering approach that aims to reduce noise while maintaining edges and fine structural details. CWF is especially effective in retinal imaging, where it preserves delicate vascular structures. After filtering, the image is converted to grayscale, simplifying further analysis while retaining key structural features.
The next phase uses Adaptive Fuzzy Tsallis Entropy Clustering, an advanced segmentation procedure that combines fuzzy logic with Tsallis entropy theory. Traditional entropy-based segmentation techniques often assume uniform data distributions, but Tsallis entropy, being non-extensive, is better suited to complex, non-linear image distributions like those found in retinal data. The fuzzy clustering component permits each pixel to belong to several clusters with varying degrees of membership, allowing smoother segmentation boundaries and better preservation of retinal structures such as optic discs, lesions, and blood vessels.
Finally, the workflow incorporates the ROA, a bio-inspired metaheuristic algorithm based on the symbiotic behavior of remora fish, which attach themselves to larger marine hosts. Here, ROA is used to tune the enhancement parameters, such as clustering thresholds and contrast levels, so that the resulting image achieves maximum clarity, balanced brightness, and ideal contrast. By regulating these parameters adaptively, ROA reduces manual trial-and-error adjustment, leading to a consistently enhanced output.
The final outcome of the process is an enhanced retinal image with improved contrast, reduced noise, and clearer visibility of diagnostic features. This improvement makes subsequent medical image analysis, such as blood vessel extraction, optic disc detection, or disease classification, more precise and dependable. Overall, the workflow brings together color-space transformations, adaptive enhancement, entropy-based clustering, and bio-inspired optimization into a unified, intelligent pipeline for retinal image improvement.
The images taken from the RFMiD dataset vary in resolution. Each retinal image is cropped to the retinal region using a square mask. The cropped RGB image is then split into its red, green, and blue channels, and the variance of the blue channel is computed as:
$$\sigma^2 = \frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n}\left(X(i,j)-\mu\right)^2 \quad (1)$$
Here,
individual pixels are denoted by X(i, j),
the mean is denoted by µ,
the number of rows is denoted by m,
the number of columns is denoted by n.
Variance is used to measure the spread of the data. Based on the variance of each channel, the CD of a fundus image is found to vary with the blue channel. The table shows that the blue variance of red-dominant images is lower than that of non-red-dominant images, which have a higher blue variance, consistent with the basic theory of the Lab color space and human vision. Because the Lab color space is modeled on human visual perception, it is used here for this purpose. The a* channel of the Lab space describes the relationship between red and green pixel values in the image, while the b* channel describes the yellow-blue relationship. The RGB image is therefore converted to Lab space using the transformations in Eqs. (2)–(5). The RGB-to-Lab conversion passes through the intermediate components X, Y, and Z of the CIE XYZ color space, where Xn, Yn, and Zn are the tristimulus values of the CIE XYZ reference white point.
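As a point of reference, the standard RGB to CIE XYZ to Lab conversion can be sketched as follows. This is a minimal sketch under assumptions that are not stated in the text: 8-bit sRGB input, the sRGB gamma curve, and a D65 reference white for (Xn, Yn, Zn). The function names are illustrative only.

```python
# Assumed constants: sRGB primaries with a D65 reference white (Xn, Yn, Zn).
WHITE_XN, WHITE_YN, WHITE_ZN = 0.95047, 1.00000, 1.08883
DELTA = 6 / 29


def _linearize(c8):
    """Undo the sRGB gamma for one 8-bit component (assumption: sRGB input)."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t):
    """Piecewise cube-root function used in the CIE Lab definition."""
    return t ** (1 / 3) if t > DELTA ** 3 else t / (3 * DELTA ** 2) + 4 / 29


def rgb_to_lab(r8, g8, b8):
    """Convert one 8-bit RGB pixel to (L, a*, b*)."""
    r, g, b = _linearize(r8), _linearize(g8), _linearize(b8)
    # RGB -> CIE XYZ (the intermediate components X, Y, Z).
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    # XYZ -> Lab, normalized by the reference white point.
    fx, fy, fz = _f(x / WHITE_XN), _f(y / WHITE_YN), _f(z / WHITE_ZN)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


white = rgb_to_lab(255, 255, 255)   # L close to 100, a* and b* close to 0
red = rgb_to_lab(255, 0, 0)         # positive a* (toward red)
blue = rgb_to_lab(0, 0, 255)        # negative b* (toward blue)
```

The signs in the last two examples match the channel interpretation above: a* separates red from green, and b* separates yellow from blue.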
The output obtained by applying Algorithm 1 to the input fundus images is passed to the next phase.
After several experiments investigating the connection between blue variance and image color dominance, the threshold θ was set to 1,500. The Lab channel to enhance is then chosen from the blue variance σ2 as follows:
Here,
σ2 ≤ θ (red-dominant image): CLAHE is applied to the a* channel of the Lab space.
σ2 > θ (non-red-dominant image): CLAHE is applied to the b* channel of the Lab space.
The enhanced Lab channel is merged with the two unmodified Lab channels, and the result is finally converted back to RGB. The conversion from Lab to RGB passes through the CIE XYZ color space, as given in Eqs. (6)–(9),
where R, G, and B are the components of the RGB color space; X, Y, and Z are the components of the CIE XYZ color space; and Xn, Yn, and Zn are the CIE XYZ tristimulus values of the reference white point.
After Algorithm 1 has been applied to the input fundus images, its output is passed to the next phase, Stage 2.
| Stage 1 | Algorithm 1 | Process |
|---|---|---|
| Input: Cropped retinal image (ImgRGB) | ||
| Output: Lab-enhanced fundus image (ImgRGB') | ||
| 1: Split ImgRGB in RGB color space into red, green, and blue channels | ||
| 2: redChannel = ImgRGB[:,:,0] | ||
| 3: greenChannel = ImgRGB[:,:,1] | ||
| 4: blueChannel = ImgRGB[:,:,2] | ||
| 5: Calculate the variance σ2 of blueChannel (equation (1)) | ||
| 6: Convert ImgRGB to ImgLab in Lab color space (equations (2)–(5)) | ||
| 7: Split ImgLab in Lab color space into individual channels | ||
| 8: LChannel = ImgLab[:,:,0] | ||
| 9: aChannel = ImgLab[:,:,1] | ||
| 10: bChannel = ImgLab[:,:,2] | ||
| 11: Set the blue-channel variance threshold θ = 1500 | ||
| 12: if σ2 ≤ θ then | ||
| 13: a' = Apply CLAHE on aChannel obtained from Step 9 | ||
| 14: Calculate the numbers of rows and columns of LChannel as (r, c) | ||
| 15: Generate a 3D array ImgLa'b of zeros with shape (r, c, 3) to hold the channel values | ||
| 16: ImgLa'b[:,:,0] = LChannel | ||
| 17: ImgLa'b[:,:,1] = a' | ||
| 18: ImgLa'b[:,:,2] = bChannel | ||
| 19: else | ||
| 20: b' = Apply CLAHE on bChannel obtained from Step 10 | ||
| 21: Calculate the numbers of rows and columns of LChannel as (r, c) | ||
| 22: Generate a 3D array ImgLab' of zeros with shape (r, c, 3) to hold the channel values | ||
| 23: ImgLab'[:,:,0] = LChannel | ||
| 24: ImgLab'[:,:,1] = aChannel | ||
| 25: ImgLab'[:,:,2] = b' | ||
| 26: end if | ||
| 27: ImgRGB' = Convert ImgLa'b or ImgLab' to RGB color space (equations (6)–(9)) | ||
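The decision logic of Algorithm 1 can be sketched in a few lines of Python. This is a hedged illustration, not the authors' code: it assumes the population form of the variance, reads the else branch as enhancing the b* channel, and takes the CLAHE routine as a caller-supplied function (`clahe` is a hypothetical hook, as are the function names).

```python
from statistics import pvariance

THETA = 1500  # variance threshold reported in the text


def blue_variance(blue_channel):
    """Population variance of a 2D blue channel given as a list of rows."""
    return pvariance([px for row in blue_channel for px in row])


def select_and_enhance(l_ch, a_ch, b_ch, blue_channel, clahe, theta=THETA):
    """Apply `clahe` to a* or b* depending on the blue-channel variance.

    `clahe` is any callable mapping one channel to an enhanced channel
    (hypothetical hook; a real pipeline would plug in a CLAHE routine).
    Returns the three Lab channels plus the name of the enhanced one.
    """
    if blue_variance(blue_channel) <= theta:
        return l_ch, clahe(a_ch), b_ch, "a"   # red-dominant branch
    return l_ch, a_ch, clahe(b_ch), "b"       # non-red-dominant branch


# Toy check with a stand-in "enhancer" that only tags the channel it receives.
tag = lambda channel: ("enhanced", channel)
flat_blue = [[10, 10], [10, 10]]       # variance 0, below the threshold
spread_blue = [[0, 255], [0, 255]]     # variance 16256.25, above the threshold
low = select_and_enhance("L", "A", "B", flat_blue, tag)
high = select_and_enhance("L", "A", "B", spread_blue, tag)
```

In the toy check, `low` enhances the a* channel and `high` enhances the b* channel, while the L channel is passed through unchanged in both cases.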
In the pre-processing phase, denoising is crucial for cleaning the data. All images in the dataset are therefore passed through CWF to improve image quality by removing noise. CWF is advantageous in pre-processing for DR recognition: it denoises the images and enhances contrast without compromising other relevant information. It reduces noise without loss of image quality, which makes it well suited to detecting retinopathy features. CWF also adapts to variations in the quality of retinal images, which makes it a more reliable diagnostic aid. In this process, CWF suppresses spurious isolated edges and minor compression artifacts while preserving important structures. It is also suited to minimizing the mean square error between the raw and restored images.
Eq. (10) describes the noise model in the original RGB color space:
The observed image
Commonly, 𝒞CV and 𝒞OO denote the unknown image and the observed image, respectively, and they are determined by Eq. (16) as:
When the input image and the noise are mismatched, they are determined by Eq. (17):
Lastly, the Wiener filter is calculated as:
Using Eq. (18), the noise in the images can be removed by CWF. After noise removal, the images are converted to grayscale to reduce processing time. Grayscale conversion also plays a basic role in identifying subtle color changes in the retina. By mapping colors to shades of gray, it provides a way to represent and highlight these delicate color nuances. This conversion simplifies the image, permitting a more focused and accurate assessment of minute color variations in the retina. In tasks such as DR detection, where even slight color changes may carry diagnostic significance, grayscale conversion improves the ability to detect and examine these subtle variations.
A grayscale image contains only shades of gray, which distinguishes it from a color image by carrying less information per pixel. In a grayscale image, each pixel is assigned a single intensity value. Unlike color images, which involve three distinct intensity values (red, green, and blue), grayscale images have identical intensity across all color components in the RGB space. Grayscale images are sufficient for many tasks, reducing the need for more complex and computationally demanding color images. These grayscale images are then used for AFTE clustering in the segmentation process.
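The filtering and grayscale steps described above can be sketched as follows. Because the filter equations are not reproduced here, this sketch swaps in the widely used local-statistics (adaptive) Wiener filter applied to each color channel independently, which may differ from the CWF formulation of this work. The window radius, the border handling, the noise estimate, and the BT.601 luma weights are all assumptions, and the function names are illustrative.

```python
def local_wiener(channel, radius=1, noise_var=None):
    """Adaptive Wiener filter on one channel given as a list of rows."""
    rows, cols = len(channel), len(channel[0])
    means = [[0.0] * cols for _ in range(rows)]
    variances = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Window clipped at the image border (assumed border handling).
            window = [channel[y][x]
                      for y in range(max(0, i - radius), min(rows, i + radius + 1))
                      for x in range(max(0, j - radius), min(cols, j + radius + 1))]
            mean = sum(window) / len(window)
            means[i][j] = mean
            variances[i][j] = sum((v - mean) ** 2 for v in window) / len(window)
    if noise_var is None:
        # Estimate the noise power as the average local variance.
        noise_var = sum(map(sum, variances)) / (rows * cols)
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            m, v = means[i][j], variances[i][j]
            if v <= noise_var:
                out[i][j] = m  # flat region: keep only the local mean
            else:
                # Detailed region: keep most of the deviation from the mean.
                out[i][j] = m + (1 - noise_var / v) * (channel[i][j] - m)
    return out


def color_wiener_to_gray(red, green, blue):
    """Filter each channel independently, then form a luminance image."""
    r, g, b = (local_wiener(ch) for ch in (red, green, blue))
    # BT.601 luma weights (assumed; the text does not state the weights).
    return [[0.299 * r[i][j] + 0.587 * g[i][j] + 0.114 * b[i][j]
             for j in range(len(r[0]))] for i in range(len(r))]
```

The filter smooths strongly where the local variance is at or below the noise level and leaves high-variance regions such as vessel edges largely intact, which is the edge-preserving behavior attributed to CWF above.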
Image segmentation divides a retinal image into regions that share similar characteristics. As per Zhou et al. [27], segmentation using the AFTE-based approach improves the extraction of blood vessels and the optic disc, which is crucial for detecting DR. AFTE places greater emphasis on enhancing the quality of image distinctions and on handling the variability in vessel widths commonly found in fundus images. Its adaptability to fuzzy parameters and responsiveness to image-specific features enable accurate and robust segmentation. Compared with several existing methods, the AFTE approach of Zhou et al. [27] is less sensitive to image noise and artifacts, resulting in far more dependable outputs.
This technique partitions the digital image into multiple pixel-based regions and is extensively used in applications such as image compression and object recognition. In this work, AFTE clustering is integrated with an Improved Search Cuckoo (ISC) optimization method to segment the pre-processed images. The ISC algorithm is used to determine the optimal FTE threshold level for effective segmentation.
Tsallis entropy is an extension of global entropy. The adaptive FTE is used to select the thresholds for image segmentation. Assume an m × n grayscale image 𝐈 such that 𝐈(B, A) ∈ κ, ∀(B, A) ∈ ρ. Here, κ = {0, 1, …, P − 1}, B = {0, 1, …, Q − 1}, and A = {0, 1, …, R − 1}. The highest intensity of the grayscale image is expressed as
where the fuzzy Tsallis entropy of the dark area is
and the fuzzy Tsallis entropy of the bright area is
The total fuzzy Tsallis entropy is defined by Eq. (23),
and the objective function is expressed by Eq. (24),
Here, 𝒫n denotes the frequency of a gray level, and the system’s dependency on a specific value is represented as α. The parameters ρ and 𝓆 (threshold values) are determined so as to divide the image into three distinct regions. In the FTE procedure, the ISC algorithm is employed to obtain the optimal cutoff value. Under time constraints on segmentation, cuckoo search proves highly effective, as it also improves the overall convergence rate. The main aim of this algorithm, which is founded on a global search mechanism, is to obtain the optimal segmentation result.
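For intuition, entropy-based threshold selection can be sketched with the classic crisp (non-fuzzy) Tsallis criterion and a single threshold. This is a deliberate simplification of the fuzzy, two-threshold objective described above: it omits the membership functions, and it replaces the ISC search with an exhaustive scan. The function names and the value of the entropic index `alpha` are assumptions made for illustration.

```python
def tsallis_entropy(counts, alpha):
    """Tsallis entropy of a histogram segment (counts need not be normalized)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return (1 - sum((c / total) ** alpha for c in counts if c > 0)) / (alpha - 1)


def tsallis_threshold(hist, alpha=0.8):
    """Exhaustively search the threshold that maximizes the Tsallis objective.

    alpha is the entropic index and must differ from 1.
    """
    best_t, best_score = 0, float("-inf")
    for t in range(len(hist) - 1):
        s_dark = tsallis_entropy(hist[:t + 1], alpha)    # levels 0..t
        s_bright = tsallis_entropy(hist[t + 1:], alpha)  # levels t+1..end
        # Pseudo-additivity rule for non-extensive entropy.
        score = s_dark + s_bright + (1 - alpha) * s_dark * s_bright
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# Toy bimodal histogram: a dark mode at levels 0-1 and a bright mode at levels 6-7.
toy_hist = [10, 10, 0, 0, 0, 0, 10, 10]
threshold = tsallis_threshold(toy_hist)
```

On the toy histogram the selected threshold falls in the empty valley between the two modes. A metaheuristic such as ISC becomes useful when several thresholds must be chosen jointly and an exhaustive scan is too costly.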
The ISC algorithm operates according to these core rules:
Each cuckoo chooses a preferred nest and lays its eggs there.
The next generation is produced from the highest-quality eggs and the best nests.
𝒫a ∈ [0, 1] is the probability that a host bird will detect the eggs laid by the cuckoo.
Eq. (25) describes how a new solution is generated by a Lévy flight:
Here, λ and β represent the governing variables. Eq. (27) describes the non-linear variance relationship associated with Lévy flights. The iterative process concludes once the optimal solution is attained. The ISC-based AFTE method improves entropy performance, and the thresholds derived from the optimal entropy significantly outperform those obtained through other techniques. After segmentation, the extracted regions are passed to the fast discrete curvelet transform via wrapping (FDCT-WRP) method for feature extraction.
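The cuckoo-search rules and the Lévy-flight update can be sketched as follows. This is a minimal sketch of standard cuckoo search for a single scalar variable, not the improved (ISC) variant, whose modifications are not detailed here. It assumes Mantegna's algorithm for drawing Lévy steps; `step_scale`, `n_nests`, and the other defaults are illustrative values, and `fitness` stands for any criterion to be maximized, such as an entropy score evaluated at a candidate threshold.

```python
import math
import random


def levy_step(beta, rng):
    """Draw one Levy-distributed step using Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # avoid division by zero in the rare degenerate draw
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def cuckoo_search(fitness, lower, upper, n_nests=15, pa=0.25,
                  step_scale=0.05, beta=1.5, n_iter=200, seed=0):
    """Maximize fitness(x) for a scalar x in [lower, upper]."""
    rng = random.Random(seed)
    nests = [rng.uniform(lower, upper) for _ in range(n_nests)]
    scores = [fitness(x) for x in nests]
    for _ in range(n_iter):
        # Each cuckoo generates an egg by a Levy flight and drops it in a
        # randomly chosen nest; the egg survives only if it is better.
        for i in range(n_nests):
            step = step_scale * (upper - lower) * levy_step(beta, rng)
            cand = min(max(nests[i] + step, lower), upper)
            j = rng.randrange(n_nests)
            s = fitness(cand)
            if s > scores[j]:
                nests[j], scores[j] = cand, s
        # The host discovers an egg with probability pa and rebuilds the nest;
        # the best nest is always carried over to the next generation.
        best_idx = scores.index(max(scores))
        for i in range(n_nests):
            if i != best_idx and rng.random() < pa:
                nests[i] = rng.uniform(lower, upper)
                scores[i] = fitness(nests[i])
    k = scores.index(max(scores))
    return nests[k], scores[k]
```

Keeping the best nest untouched makes the best score non-decreasing over iterations, while the heavy-tailed Lévy steps occasionally produce long jumps that help escape local optima.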
The FDCT-WRP method [28] is applied to extract features and parameters, such as circularity, perimeter, lesion count, area, aspect ratio, and the lengths of the major and minor axes, from the segmented images. The wavelet transform is widely used because it provides multiresolution analysis and time-frequency localization. However, unlike ridgelet and curvelet transforms, conventional wavelet techniques are limited in capturing two-dimensional singular structures such as lines and curves.
The second-generation (2G) curvelet transform can be implemented in two ways: with the unequally spaced fast Fourier transform (USFFT) or with the wrapping method. Compared with first-generation (1G) curvelets, both are faster, less redundant, and more efficient. The wrapping approach is simpler, computationally quicker, and easier to implement than the USFFT approach. Owing to these advantages, the wrapping-based strategy is used to construct the FDCT, referred to as FDCT-WRP, which serves as the feature extraction method. The steps involved in the FDCT-WRP process are outlined below:
Compute the Fourier coefficients $\mathbb{V}[\mathbb{c}_1, \mathbb{d}_1]$ of each image with the two-dimensional fast Fourier transform (FFT).
In the frequency domain, for every scale and angle, construct the discrete localizing window $\tilde{\mathcal{x}}_{\mathcal{p},\mathcal{q}}[\mathbb{c}_1, \mathbb{d}_1]$ and form the product $\tilde{\mathcal{x}}_{\mathcal{p},\mathcal{q}}[\mathbb{c}_1, \mathbb{d}_1] \cdot \mathbb{V}[\mathbb{c}_1, \mathbb{d}_1]$.
Wrap this product around the origin, which is a reindexing of the data, to obtain $\tilde{\mathbb{V}}_{\mathcal{p},\mathcal{q}}[\mathbb{c}_1, \mathbb{d}_1]$.
Apply the inverse two-dimensional FFT to $\tilde{\mathbb{V}}_{\mathcal{p},\mathcal{q}}$ to obtain the discrete curvelet coefficients $\mathbb{CU}^{\hat{\mathbb{D}}}(\mathcal{p}, \mathcal{q}, \mathcal{r})$.
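The wrapping step is the least intuitive part of this procedure, so a small sketch may help. It assumes that the windowed Fourier samples are supplied as a dictionary keyed by integer frequency indices and that wrapping means taking each index modulo the size of a rectangle centered at the origin, as in the standard wrapping formulation of the FDCT. The function name and data layout are hypothetical, and the sketch covers only this one step, not a full curvelet transform.

```python
def wrap_around_origin(windowed, rows, cols):
    """Fold windowed Fourier samples into a rows x cols tile by periodization.

    `windowed` maps integer frequency indices (n1, n2) to complex values.
    Each sample is re-indexed to (n1 mod rows, n2 mod cols); samples that
    land on the same cell are summed.
    """
    tile = [[0j] * cols for _ in range(rows)]
    for (n1, n2), value in windowed.items():
        tile[n1 % rows][n2 % cols] += value
    return tile


# Samples supported on a sheared strip: 4 rows, 3 consecutive columns per row.
windowed = {(n1, n2): complex(n1, n2)
            for n1 in range(4) for n2 in range(n1, n1 + 3)}
tile = wrap_around_origin(windowed, rows=4, cols=3)
```

Because each row of the strip spans exactly three consecutive columns, every sample lands in its own cell of the 4 × 3 tile, so the operation only re-indexes the data and loses nothing. The inverse two-dimensional FFT is then applied to this compact tile.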
To obtain the feature vector, the FDCT coefficients are computed with the wrapping method (FDCT-WRP) for each orientation and scale, indexed by 𝓅 and 𝓆. The following equation is applied to evaluate the number of dimensions for images of size 𝓂c × 𝓂𝓇:
The proposed color dominance (CD) and boosted Remora optimization algorithm (BROA) with deep adversarial approach (CDBROA) model was evaluated on three publicly available retinal image datasets (DRIVE, STARE, and CHASE_DB1), which are widely used benchmarks for vessel segmentation and image enhancement tasks. All experiments were implemented in Python using TensorFlow and executed on a workstation with an NVIDIA RTX 4090 GPU (24 GB VRAM) and an Intel i9 CPU.
To ensure fairness, all images were resized to 512 × 512 pixels and normalized prior to processing. The CDBROA model was compared against several state-of-the-art segmentation algorithms, including U-Net, Attention U-Net, Swin-Transformer U-Net (Swin-UNet), Dense U-Net, and GAN-VesselNet.
The proposed CD-based enhancement improved illumination balance and vessel visibility before segmentation. The following table compares image quality enhancement and shows improvements in PSNR, SSIM, and entropy values with the CD method across datasets.
To assess the generalization capability and robustness of the proposed CDBROA framework, cross-dataset validation was conducted using multiple retinal fundus image datasets acquired under varying imaging conditions. The model was trained on the primary Retinal Fundus Multi-Disease Image Dataset (RFMiD) and subsequently tested on independent external datasets without retraining or fine-tuning. This strategy ensures that the learned feature representations and classification performance are not biased toward dataset-specific characteristics, such as illumination, resolution, camera type, or patient demographics.
Performance metrics were computed for each cross-dataset evaluation. The results demonstrate consistent performance across datasets, indicating strong generalization ability and resilience to domain shifts. The effectiveness of the proposed CD-based enhancement, adaptive fuzzy Tsallis entropy segmentation, and IDWRAGN-based classification is thereby validated under real-world variability.
Overall, cross-dataset validation confirms that the CDBROA framework is robust, scalable, and suitable for deployment in diverse clinical environments for automated DR screening.
The retinal fundus datasets used in this study were partitioned following a standardized and reproducible protocol to ensure fair evaluation and to prevent data leakage. Each dataset was divided into training, validation, and testing subsets at the image level, ensuring that images from the same subject did not appear across multiple subsets.
For the primary retinal fundus image dataset, 74% of the images were used for training, 13% for validation, and the remaining 13% for independent testing. The validation set was used for hyperparameter tuning and early stopping, whereas the test set was reserved entirely for the final performance evaluation.
The dataset comprises several DR severity classes. To mitigate class imbalance and ensure equitable learning, stratified sampling was applied during dataset splitting so that each subset preserved the original class distribution. Additionally, class-balanced loss weighting was incorporated during model training to further address minority class bias.
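The splitting and weighting protocol can be illustrated with a short sketch. It assumes one integer class label per image and splits at the image level only; grouping images by subject, which is also needed to keep a subject within one subset, is not shown. The inverse-frequency weighting formula is one common choice and is an assumption, since the exact weighting scheme is not specified here. Function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, fractions=(0.74, 0.13, 0.13), seed=0):
    """Return (train, val, test) index lists that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n = len(labels)
    return {c: n / (len(counts) * k) for c, k in counts.items()}


# Hypothetical labels: 200 images of class 0 and 100 images of class 1.
labels = [0] * 200 + [1] * 100
train, val, test = stratified_split(labels)
weights = class_weights([labels[i] for i in train])
```

Splitting each class separately keeps the 74/13/13 proportions within every class, and the weights are computed from the training subset only so that no information from the validation or test images enters the loss.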
For cross-dataset validation, the CDBROA model was trained exclusively on the primary dataset and assessed on external datasets with differing class distributions, image resolutions, and acquisition settings. No retraining or fine-tuning was performed on the external datasets. This protocol rigorously assesses the generalization ability of the proposed framework under real-world domain shifts.
Figure 2 summarizes the image quality enhancement metrics (PSNR, SSIM, entropy, and UIQI) of the proposed CDBROA algorithm on the DRIVE, STARE, and CHASE_DB1 datasets.

Image quality enhancement metrics (for CDBROA). CDBROA, color dominance and boosted Remora optimization algorithm with deep adversarial approach; PSNR, peak SNR.
Compared with the baseline CLAHE and Retinex-based techniques, CD enhancement achieved an average improvement of 4.6% in SSIM and 2.8% in PSNR, leading to superior vessel-edge continuity and noise suppression.
Performance was assessed using standard segmentation metrics, namely accuracy, sensitivity, specificity, and F1 score, for each method. Figure 3 presents the comparative AUC scores of the segmentation models and shows the superior performance of CDBROA. The following table compares these metrics across images from the different datasets.
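For reference, two of the reported quality measures can be computed as in the sketch below. It assumes 8-bit grayscale images flattened to lists of integers and uses the standard definitions PSNR = 10 log10(MAX² / MSE) and Shannon entropy of the gray-level histogram; SSIM and UIQI are omitted for brevity. The function names are illustrative.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def shannon_entropy(pixels, levels=256):
    """Entropy in bits of the gray-level histogram of an image."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    n = len(pixels)
    return -sum((h / n) * math.log2(h / n) for h in hist if h)
```

A higher PSNR indicates that the enhanced image deviates less from the reference, whereas a higher entropy indicates a richer spread of gray levels after enhancement.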

Comparison of AUC across segmentation models. CDBROA, color dominance and boosted Remora optimization algorithm with deep adversarial approach; Swin-UNet, Swin-Transformer U-Net.
The Boosted Remora Optimization Algorithm (BROA) effectively tuned the network parameters of the adversarial segmentation model, leading to improved precision and vessel continuity relative to established deep and hybrid techniques.
Visual comparison (Figures 2–4) shows that CDBROA preserves fine vascular structures, including thin capillaries and bifurcation points, while suppressing background noise and over-segmentation.

Ablation study results. BROA, boosted Remora optimization algorithm; CD, color dominance; CDBROA, color dominance and boosted Remora optimization algorithm with a deep adversarial approach.
The enhanced images show reliable vessel contrast and consistent brightness, which assists accurate discrimination between arteries and veins.
To assess the contribution of each module, an ablation study was conducted. The following table and Figure 4 show the performance contributions of the CD, BROA, and adversarial learning modules.
The inclusion of CD pre-processing enhanced vessel visibility, BROA refined the learning dynamics, and the adversarial component improved edge continuity and generalization.
The proposed enhancement technique is trained and validated on the training and validation sets of the RFMiD dataset using a pre-trained VGG16 model and assessed on the test set to detect the presence or absence of retinal abnormalities. The model performance is evaluated by calculating accuracy and F1 score. Accuracy is the ratio of the number of correct predictions to the total number of predictions. The F1 score is the harmonic mean of precision and recall. Visual analysis of the results is carried out in RGB color space as well as in grayscale. Figure 5 compares the original input, the stage 1 output, and the stage 2 output in color space, and also compares the grayscale input image with the grayscale output of the proposed technique.

(A) Input image in RGB color space, (B) Output of stage 1, (C) Output of stage 2, (D) Input image in grayscale, (E) Final output in grayscale.
The confusion matrix in Figure 6, also given in the following table, shows the classification balance between vessel and non-vessel pixels for CDBROA on the DRIVE dataset.

The confusion matrix. FN, false negative; FP, false positive; TN, true negative; TP, true positive.
These results show a high accuracy of about 97.0% and a Dice coefficient of 0.962, confirming robustness across diverse retinal illumination and vessel-width variations.
A paired t-test showed that the performance difference between CDBROA and Swin-UNet was statistically significant (p < 0.01), demonstrating the superiority of the proposed model.
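The pixel-level metrics follow directly from the four confusion-matrix counts, as the sketch below illustrates with the standard definitions. The counts in the usage example are hypothetical and are not the values obtained in this study.

```python
def segmentation_metrics(tp, fp, tn, fn):
    """Standard binary metrics from true/false positive/negative pixel counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
        "dice": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical counts for illustration only.
metrics = segmentation_metrics(tp=90, fp=10, tn=880, fn=20)
```

For binary segmentation the Dice coefficient and the F1 score are algebraically identical, so either can be reported. Because vessel pixels are a small minority, accuracy alone can look high even when sensitivity is modest, which is why the class-wise measures are reported alongside it.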
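The paired test statistic can be computed as in the following sketch, which assumes that both models were scored on the same set of images or folds so that the scores form pairs. It returns only the t statistic and the degrees of freedom; converting these to a p-value requires the Student t distribution, for which a library routine such as `scipy.stats.ttest_rel` could be used. The function name is illustrative.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for paired samples of equal length."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # Mean difference divided by its standard error (sample std / sqrt(n)).
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1
```

Pairing removes the image-to-image variability that both models share, so the test focuses on the per-image difference between the two methods.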
Despite the addition of optimization and adversarial stages, CDBROA required 12% fewer epochs to converge than standard GAN-VesselNet and achieved a 24% faster inference time per image, owing to adaptive feature attention and Remora-based hyperparameter tuning.
- ✓ CDBROA substantially enhanced image quality and vessel clarity.
- ✓ Achieved the highest segmentation accuracy (0.970) and AUC (0.990) among all compared models.
- ✓ Demonstrated strong generalization across multiple datasets.
- ✓ Reduced computational overhead while preserving structural fidelity.
The experimental results show that the proposed CDBROA method effectively enhances retinal image quality and achieves higher segmentation accuracy than standard deep learning and hybrid optimization models. The combined use of CD pre-processing, BROA, and deep adversarial learning contributes synergistically to this improvement.
Figure 5 compares samples of the input image and the enhanced image in color and in grayscale. The features are well enhanced and more conspicuous in the enhanced image than in the original image. From the closer sectional view shown in Figure 7, it is evident that abnormalities along the path of the retinal vessels, such as red hemorrhages, are well enhanced. The path of the blood vessels serves as vital evidence for recognizing diseases such as those of the retinal pigment epithelium. Enhancing both the retinal vessels and the optic cup in a fundus image is challenging, since one of them tends to be degraded while the other is enhanced. The advantage of the proposed method is that it enhances both the vessels and the optic cup efficiently for images of different resolutions.

(A) Input image, (B) A section of the input image, (C) Image enhanced using the proposed technique, (D) A section of enhanced image.
The CD module plays an important role in normalizing illumination and enhancing contrast in retinal images captured under non-uniform lighting. Conventional enhancement techniques, such as CLAHE or Retinex, often introduce excessive brightness or local noise amplification. In comparison, CD-based pre-processing balances the chromatic components adaptively, producing images with clearer vessel edges and less background interference. This ensures that thin capillaries and low-contrast vessels are better preserved, directly improving the feature-extraction capability of the segmentation network.
The addition of BROA introduces a biologically inspired metaheuristic optimization procedure that fine-tunes deep network parameters and hyperparameters adaptively. Unlike static optimizers such as Adam or RMSProp, BROA adjusts learning trajectories based on convergence trends and fitness evaluation. This process prevents premature convergence and improves model generalization. The outcomes show a consistent performance gain of 2%–3% in F1 score and AUC over non-optimized models. Moreover, BROA accelerates convergence, reducing the number of training epochs while preserving high accuracy, which supports its computational efficiency for medical image analysis tasks.
The deep adversarial approach further refines vessel segmentation by employing a discriminator network that guides the generator in producing anatomically plausible vessel maps. This adversarial feedback reduces over-smoothing and improves the continuity of vessel structures, especially in areas where vessels exhibit variable thickness or branching. The model learns to preserve fine vascular networks and suppress false positives, which is evident in the higher Dice coefficients and improved ROC performance across datasets.
Compared with existing architectures, such as U-Net, Attention U-Net, and Swin-UNet, CDBROA exhibits a clear benefit in both quantitative and qualitative evaluations. While transformer-based and attention-based models enhance context awareness, they remain sensitive to uneven illumination and weak-contrast regions. The proposed sequence of CD enhancement and adversarial learning effectively overcomes these limitations. Furthermore, in comparison with GAN-VesselNet, CDBROA maintains superior structural consistency, fewer segmentation artifacts, and a better balance between sensitivity and specificity. The results suggest that the hybrid integration of enhancement, optimization, and adversarial segmentation leads to a more resilient and data-efficient model.
Cross-dataset validation on DRIVE, CHASE_DB1, and STARE confirms the robustness of CDBROA under varying acquisition conditions, vessel thickness, and imaging artifacts. The minimal performance drop across datasets indicates that the framework generalizes well across diverse imaging environments. Additionally, the CD pre-processing compensates for domain-specific color variations, enhancing the adaptability of the segmentation network to unseen datasets.
Despite its high accuracy, the proposed approach still has some limitations. The CDBROA method requires multiple stages, resulting in a slightly higher pre-processing time than simpler CNN-based methods. The adversarial training demands careful balancing between the generator and discriminator loss functions to prevent mode collapse. Performance may degrade marginally in images with severe pathological artifacts (e.g., large hemorrhages or exudates) where vessel continuity is disrupted. The method also increases the computational load because of its high computational complexity. Owing to the training instability of adversarial networks, it requires careful tuning and large, diverse datasets to ensure stable convergence. This may require dataset-specific calibration, reducing universality. The limited explainability of the optimization and deep models can hinder clinical trust and regulatory approval. These limitations indicate potential directions for further optimization and dataset-specific fine-tuning.
The CDBROA framework provides a promising foundation for automated retinal assessment systems. Its ability to preserve vessel continuity and contrast can significantly strengthen the recognition of ophthalmic and systemic diseases such as hypertension, DR, and retinal vein occlusion. The incorporation of biologically inspired optimization with deep adversarial training offers a novel paradigm that is applicable beyond retinal imaging, including other biomedical segmentation tasks such as angiography, coronary imaging, and neurovascular mapping. These properties make the approach highly suitable for mass DR screening, especially in resource-constrained clinical environments. As a clinical decision support tool, it can help ophthalmologists make more informed and reproducible decisions.
Future work may focus on lightweight CDBROA variants for real-time deployment on portable fundus cameras, while the incorporation of multimodal data (e.g., OCT and fluorescein angiography) and explainable AI modules may improve clinical interpretability.