
Fusion of Day Light and Infrared Images: A Systematic Review of the State of the Art in EO/IR Gimbal Systems

Open Access | Dec 2025

Figures & Tables

Fig. 1.

PRISMA flow diagram of this systematic review

Fig. 2.

Image fusion framework a) pixel level b) feature level c) decision level

Fig. 3.

Image fusion techniques

Fig. 4.

Generic structure of fusion rules

Advantages and Disadvantages of Frequency Domain Methods

| Fusion Method | Advantages | Disadvantages |
| --- | --- | --- |
| Morphological pyramid [34]; Laplacian/Gaussian pyramid [34,35]; Gradient pyramid [36]; Low-pass pyramid ratio [37]; Filter subtract decimate [36] | Provide better image quality | The fused image is affected by the number of decomposition levels. There is also no directional information, so detailed image information in different directions cannot be extracted |
| Discrete cosine transform (DCT) [38] | The images are decomposed into a series of cosine waveforms representing different spatial frequency components; this compact representation makes DCT suitable for real-time applications | The fused image is blurred, and blocking artifacts are generated |
| Discrete wavelet transform with Haar fusion [39] | Spectral distortions are decreased, and a fused image with better SNR is produced | The spatial resolution of the fused image is lower; the anisotropy of the source image is not represented |
| Kekre's wavelet transform fusion [40,41] | Irrespective of the size of the images, the fused image is more informative | Computational complexity is high |
| Kekre's hybrid wavelet-based transform fusion [42,43] | Fused images retain more temporal and frequency features, with multiresolution properties | Cannot be used if the image dimensions are an integer power of two |
| Stationary wavelet transform (SWT) [44-46] | Better results are obtained at decomposition level 2 | High computational time |
| Curvelet transform [47] | Best suited for edge representation | High computational time |
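As a concrete illustration of the wavelet-family methods in the table above, the following is a minimal single-level 2D Haar fusion sketch in plain NumPy. This is not the implementation of [39]; the function names and the fusion rule (averaged approximation band, maximum-magnitude detail coefficients) are illustrative assumptions, and image sides are assumed to be even.

```python
import numpy as np

def haar2d(f):
    """One-level 2D Haar decomposition (both image sides must be even)."""
    f = f.astype(np.float64)
    # Filter along rows: low-pass (average) and high-pass (difference) halves
    lo = (f[:, 0::2] + f[:, 1::2]) / 2.0
    hi = (f[:, 0::2] - f[:, 1::2]) / 2.0
    # Filter along columns to get the four sub-bands
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0   # approximation
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0   # horizontal detail
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0   # vertical detail
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0   # diagonal detail
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    """Exact inverse of haar2d."""
    h, w = ll.shape
    lo = np.empty((2 * h, w)); hi = np.empty((2 * h, w))
    lo[0::2, :] = ll + lh; lo[1::2, :] = ll - lh
    hi[0::2, :] = hl + hh; hi[1::2, :] = hl - hh
    f = np.empty((2 * h, 2 * w))
    f[:, 0::2] = lo + hi; f[:, 1::2] = lo - hi
    return f

def haar_fuse(a, b):
    """Fuse two registered images: average the approximation band,
    keep the detail coefficient with the larger magnitude."""
    ca, cb = haar2d(a), haar2d(b)
    fused = [(ca[0] + cb[0]) / 2.0]
    for da, db in zip(ca[1:], cb[1:]):
        fused.append(np.where(np.abs(da) >= np.abs(db), da, db))
    return ihaar2d(*fused)
```

The max-magnitude rule on detail bands is what preserves edges from whichever sensor sees them more strongly; averaging the approximation band blends the overall intensity of the two modalities.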

Performance Evaluation Metrics

| S. No. | Category | Metric | Desired value for good performance | Remarks |
| --- | --- | --- | --- | --- |
| 1 | Information theory | Cross entropy (CE) | Low | Evaluates the similarity of information shared between the EO/IR images and the fused image |
| | | Entropy (EN) | High | Measures the average amount of information or detail contained in the fused image |
| | | Mutual information (MI) | High | Quantifies the degree of statistical dependence between the source and fused images |
| | | Peak signal-to-noise ratio (PSNR) | High | Measures distortion in the fused image relative to the source |
| 2 | Structural similarity | Universal image quality index / Structural Similarity Index Metric (SSIM) | High | Captures image loss (correlation loss, luminance loss) and distortion (contrast distortion) |
| | | Root mean squared error (RMSE) | Low | Calculates the variation between the source image and the fused image |
| 3 | Image feature | Average gradient (AG) | High | Gives insight into image clarity and fused texture characteristics |
| | | Edge intensity (EI) | High | Quantifies image edge intensity |
| | | Standard deviation (SD) | High | Reflects factors linked with image quality: the distribution of information and the contrast |
| | | Spatial frequency (SF) | High | Indicates the overall activity and clarity of the image |
| | | Gradient-based fusion performance (QAB/F) | High | Assesses the degree to which the gradient or edge details from the source images are preserved in the fused image |
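Several of the metrics in the table above reduce to short NumPy expressions. The sketch below is illustrative only: the function names are ours, and minor formula variants exist in the literature (for example, different normalisations of the average gradient).

```python
import numpy as np

def entropy(img, bins=256):
    """EN: average information content of the 8-bit image histogram (bits)."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def rmse(src, fused):
    """RMSE between a source image and the fused image (lower is better)."""
    return np.sqrt(np.mean((src.astype(np.float64) - fused.astype(np.float64)) ** 2))

def psnr(src, fused, peak=255.0):
    """PSNR in dB (higher means less distortion)."""
    e = rmse(src, fused)
    return np.inf if e == 0 else 20.0 * np.log10(peak / e)

def average_gradient(img):
    """AG: mean local gradient magnitude, an image-clarity indicator."""
    f = img.astype(np.float64)
    gx = np.diff(f, axis=1)[:-1, :]   # horizontal differences
    gy = np.diff(f, axis=0)[:, :-1]   # vertical differences
    return np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0))

def spatial_frequency(img):
    """SF: combines row and column frequencies (overall activity level)."""
    f = img.astype(np.float64)
    rf = np.sqrt(np.mean(np.diff(f, axis=1) ** 2))
    cf = np.sqrt(np.mean(np.diff(f, axis=0) ** 2))
    return np.sqrt(rf ** 2 + cf ** 2)
```

Note the directionality in the table: EN, PSNR, AG, SD and SF are "higher is better" for a fused image, while CE and RMSE are "lower is better", so a method must be compared metric by metric, not by a single score.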

Quantitative results of various methods [59]–[62]

| Data set | Algorithm | PSNR | SSIM | EN | MI | AG | SD | SF | Running time (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Multimodal image | DenseFuse | 60.27 | 0.72 | 6.84 | 13.67 | 4.24 | - | - | 9.85 |
| | CNN | 62.21 | 0.69 | 7.31 | 14.67 | 5.76 | - | - | 33.25 |
| | ResNet | 64.23 | 0.73 | 6.73 | 13.46 | 3.64 | - | - | 4.53 |
| | Convolutional sparse representation | - | 0.86 | 6.22 | 1.90 | - | 21.46 | - | - |
| | Anisotropic diffusion | - | 0.94 | 6.18 | 1.94 | - | 20.58 | - | - |
| | Fourth-order partial differential equation | - | 0.86 | 6.25 | 1.73 | - | 21.33 | - | - |
| | Total variation and augmented Lagrangian | - | 0.91 | 6.21 | 1.92 | - | 21.08 | - | - |
| | Bayes fusion | - | 0.94 | 6.43 | 2.45 | - | 26.28 | - | - |
| | Deep convolutional sparse coding | - | - | 6.91 | 2.50 | 4.22 | 46.97 | - | - |
| | DeepFuse | - | - | 6.86 | 2.30 | 3.60 | 32.25 | - | - |
| | Saliency detection | - | - | 6.67 | 1.72 | 3.98 | 28.04 | - | - |
| | FusionGAN | - | - | 6.58 | 2.34 | 2.42 | 29.04 | - | - |
| | DLF | - | - | 6.38 | 2.15 | 2.72 | 22.94 | - | - |
| | Fast and efficient zero learning | - | - | 6.63 | 2.23 | 2.55 | 28.09 | - | - |
| | Discrete wavelet transform (DWT) | - | - | 6.44 | - | 3.09 | - | 8.16 | 0.76 |
| | Non-subsampled contourlet transform (NSCT) | - | - | 7.17 | - | 5.02 | - | 12.78 | 2.03 |
| | Multi-focus image fusion (MFCNN) | - | - | 6.61 | - | 3.61 | - | 9.55 | 0.38 |
| | CNN integration (ECNN) | - | - | 7.10 | - | 5.48 | - | 18.34 | 0.34 |
| | Unsupervised depth model for image fusion (SESF) | - | - | 7.31 | - | 7.26 | - | 24.91 | 0.31 |
| | IY-Net | - | - | 6.81 | - | 4.84 | - | 12.53 | 0.16 |

Benchmarking datasets

| S. No. | Database Name | Year | Web Address |
| --- | --- | --- | --- |
| 1 | TNO | 2014 | https://figshare.com/articles/dataset/TNO_Image_Fusion_Dataset/1008029 |
| 2 | KAIST | 2015 | https://soonminhwang.github.io/rgbt-ped-detection/ |
| 3 | VIFB | 2020 | https://github.com/xingchenzhang/VIFB |
| 4 | LLVIP | 2021 | https://bupt-ai-cz.github.io/LLVIP/ |

Advantages and disadvantages of Spatial Domain Methods

| Fusion Method | Advantages | Disadvantages |
| --- | --- | --- |
| Averaging (image fusion by pixel averaging) [22,23] | A basic method that is easy to identify and put into practice when the images come from the same sensor with similar contrast and brightness; low computational cost | The fused image quality is reduced; output images are hazy and therefore unsuitable for real-time applications; edges and image information are lost |
| Minimum pixel value [22] | The fused image is good if the inputs have dark shades | Fused images have low contrast and appear blurred |
| Simple block replacement [24] | Incredibly easy to understand and apply | The fused image shows random variation in brightness and colour information; fine image detail is reduced |
| Maximum pixel value [22,23] | The low pixel values are rejected, and the highest pixel value at each location is used to create the fused image | The method is susceptible to artifacts and distortion, and the contrast of the fused image is decreased |
| Max-min [24] | Easy to implement, with low computational time | Fusion efficiency is reduced, and the output image has rough edges due to blocking artifacts and isolated spots |
| Weighted averaging [25] | Easy to apply and robust; more suitable for multifocus images | The signal-to-noise ratio of the fused image is reduced |
| PCA [26,27] | Gives excellent spatial quality and is robust | Fused images show chromatic aberration and spectral degradation |
| IHS [23] | Colour, resolution and features are improved in the output image; processing is quick, with strong sharpening | Only three multispectral bands are analysed, so chromatic aberration occurs in the fused image |
| Brovey [24] | Extremely easy and fast processing method | The generated RGB images have high contrast, which causes colour distortion |
| Guided filtering [28] | Suitable for real-time applications and performs well at image smoothing | Does not apply to sparse input data; some edges may show halos; colour and depth details may mismatch between the input and fused images |
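The simplest spatial-domain rules in the table above operate pixel by pixel on registered images and can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the reviewed works; the function names and the example weight are our assumptions.

```python
import numpy as np

def fuse_average(a, b):
    """Pixel averaging: the mean of corresponding pixels."""
    return (a.astype(np.float64) + b.astype(np.float64)) / 2.0

def fuse_min(a, b):
    """Minimum pixel value: keep the darker pixel at each location."""
    return np.minimum(a, b)

def fuse_max(a, b):
    """Maximum pixel value: keep the brighter pixel at each location."""
    return np.maximum(a, b)

def fuse_weighted(a, b, w=0.6):
    """Weighted averaging with weight w on the first image."""
    return w * a.astype(np.float64) + (1.0 - w) * b.astype(np.float64)

# Toy 2x2 "EO" and "IR" patches
eo = np.array([[10, 200], [30, 120]], dtype=np.uint8)
ir = np.array([[50, 100], [90, 160]], dtype=np.uint8)
fused = fuse_average(eo, ir)   # mean of the two patches
```

Because these rules ignore local context, a bright IR hotspot simply averages away under `fuse_average` but survives intact under `fuse_max`, which matches the table's trade-offs between haziness and artifact sensitivity.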

Advantages and Disadvantages of Deep Learning Methods

| Fusion Method | Advantages | Disadvantages |
| --- | --- | --- |
| Convolutional neural network (CNN) [51-53] | Features are extracted and learnt from the training data without human assistance | Computational speed is low |
| CSR [54] | Less sensitive to misregistration | Requires an enormous amount of training data |
| SAE [55] | Limited data required for supervised learning | Model training speed depends on the processor |
DOI: https://doi.org/10.2478/ama-2025-0086 | Journal eISSN: 2300-5319 | Journal ISSN: 1898-4088
Language: English
Page range: 768 - 778
Submitted on: Mar 13, 2025 | Accepted on: Nov 9, 2025 | Published on: Dec 31, 2025
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2025 Kamlesh VERMA, Deepak YADAV, Anitha Kumari SIVATHANU, R Senthilnathan, G Murali, R Ranjith Pillai, Rajalakshmi TS, Vignesh SM, G Madhumitha, Nandhini MURUGAN, published by Bialystok University of Technology
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.