In an article published in the journal Scientific Reports, researchers introduced a dual information modulation network (DIMN) for accurate underwater image restoration (UIR). To address the inconsistent attenuation across color channels and spatial regions, DIMN employed a multi-information enhancement module (MIEM) built from spatial-aware attention blocks (SAAB) and multi-scale structural transformer blocks (MSTB). Experimental results demonstrated DIMN's superiority over existing UIR methods in correcting color deviations, recovering details, and improving image sharpness and contrast.
Background
The ocean, with its vast resources and diverse applications, poses challenges for underwater vision: light absorption and scattering in water degrade image quality. UIR aims to mitigate this degradation and encompasses both super-resolution (SR) and enhancement techniques. Previous UIR research relied predominantly on convolutional neural networks (CNNs) and generative adversarial networks (GANs). However, these methods often overlook larger spatial contexts and struggle to establish long-range dependencies, limiting their effectiveness in restoring underwater images.
To address these limitations, the researchers proposed DIMN for UIR. DIMN integrated an MIEM featuring SAAB and MSTB: SAAB exploited spatial relationships to correct color deviations and preserve details, while MSTB enhanced image sharpness by exploring multi-scale structures. Experimental evaluations demonstrated DIMN's superior performance compared with existing methods, offering more accurate UIR. By tackling the challenges of spatial-context exploration and long-range dependency modeling, DIMN represented a significant advance in UIR technology, facilitating improved visualization and analysis of underwater environments.
Method
The proposed DIMN for UIR comprised three stages to restore degraded underwater images. In the first stage, convolutional operations extracted coarse-grained features from the input image. The second stage employed the MIEM to refine features from coarse-grained to fine-grained space using SAAB and MSTB. SAAB improved attention on spatial regions through spatial perception, while MSTB explored multi-scale structural attention to enhance image sharpness.
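To make the three-stage layout concrete, the following is a minimal PyTorch-style sketch of such a pipeline. The class names, channel width, number of MIEMs, and the PixelShuffle-based upsampling tail are illustrative assumptions rather than the authors' exact design; the MIEM here is a simple residual placeholder that the later sketches elaborate on.

```python
import torch
import torch.nn as nn

class MIEMSketch(nn.Module):
    """Residual placeholder standing in for the multi-information enhancement
    module (SAAB + MSTB); see the later sketches for those blocks."""
    def __init__(self, channels):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.block(x)

class DIMNSketch(nn.Module):
    def __init__(self, channels=64, num_miems=4, scale=2, task="sr"):
        super().__init__()
        # Stage 1: coarse-grained feature extraction.
        self.head = nn.Conv2d(3, channels, 3, padding=1)
        # Stage 2: stack of MIEMs refining coarse features into fine-grained ones.
        self.body = nn.Sequential(*[MIEMSketch(channels) for _ in range(num_miems)])
        # Stage 3: upsample to HR for SR, or reconstruct directly for enhancement.
        if task == "sr":
            self.tail = nn.Sequential(
                nn.Conv2d(channels, channels * scale * scale, 3, padding=1),
                nn.PixelShuffle(scale),
                nn.Conv2d(channels, 3, 3, padding=1),
            )
        else:
            self.tail = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, x):
        coarse = self.head(x)      # stage 1
        fine = self.body(coarse)   # stage 2
        return self.tail(fine)     # stage 3

# Example: restore a 64x64 crop at 2x scale.
out = DIMNSketch(task="sr")(torch.randn(1, 3, 64, 64))
print(out.shape)  # torch.Size([1, 3, 128, 128])
```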
In the third stage, for SR tasks, fine-grained features were upsampled to the desired high-resolution (HR) size, while for enhancement tasks a convolutional operation generated the final enhanced image. The MIEM, comprising SAAB and MSTB, modulated spatial and global information to address the inconsistent attenuation across color channels. SAAB extracted spatial-perception information through convolutional layers and reshaping, computing spatial affinity matrices among all positions. Attention coefficients were then calculated from the input features, local original information, and global information, and the block's output was obtained by merging the spatial-aware features back into the original features.
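The spatial affinity computation described above resembles a non-local attention pattern; the sketch below illustrates one plausible form of such a spatial-aware attention block. The query/key/value projections, reduction ratio, and learned merge coefficient are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class SAABSketch(nn.Module):
    """Illustrative spatial-aware attention block: affinities among all spatial
    positions re-weight the features, which are then merged with the input."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // reduction, 1)
        self.key = nn.Conv2d(channels, channels // reduction, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # assumed learnable merge weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (B, HW, C/r)
        k = self.key(x).flatten(2)                      # (B, C/r, HW)
        affinity = torch.softmax(q @ k, dim=-1)         # spatial affinity among all positions
        v = self.value(x).flatten(2)                    # (B, C, HW)
        spatial = (v @ affinity.transpose(1, 2)).view(b, c, h, w)
        return x + self.gamma * spatial                 # merge spatial-aware and original features
```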
MSTB combined multi-scale feature extraction with a multi-scale structure attention mechanism to capture semantic cues for image restoration. Dilated convolutions with varying rates captured multi-scale features, followed by asymmetric convolutions that extracted horizontal and vertical structural information. A transformer mechanism was then applied to these processed features to capture deeper semantic cues, and the output of MSTB passed through multi-layer perceptron and layer normalization operations to produce the refined features. Both SAAB and MSTB were essential components of the MIEM, ensuring the effectiveness of DIMN in addressing the challenges of UIR tasks.
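The following sketch shows one way dilated convolutions, asymmetric convolutions, and a transformer-style attention layer with an MLP and layer normalization could be combined as described. The dilation rates, head count, and MLP width are assumed values; this is not the authors' exact MSTB.

```python
import torch
import torch.nn as nn

class MSTBSketch(nn.Module):
    """Illustrative multi-scale structural transformer block."""
    def __init__(self, channels=64, heads=4, dilations=(1, 2, 3)):
        super().__init__()
        # Multi-scale feature extraction via dilated convolutions.
        self.multi_scale = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d) for d in dilations
        )
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1)
        # Asymmetric convolutions for horizontal / vertical structural cues.
        self.horizontal = nn.Conv2d(channels, channels, (1, 3), padding=(0, 1))
        self.vertical = nn.Conv2d(channels, channels, (3, 1), padding=(1, 0))
        # Transformer-style attention over the structural features.
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(channels)
        self.norm2 = nn.LayerNorm(channels)
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels * 2), nn.GELU(), nn.Linear(channels * 2, channels)
        )

    def forward(self, x):
        b, c, h, w = x.shape
        ms = self.fuse(torch.cat([conv(x) for conv in self.multi_scale], dim=1))
        structural = self.horizontal(ms) + self.vertical(ms)
        tokens = structural.flatten(2).transpose(1, 2)   # (B, HW, C) token sequence
        q = self.norm1(tokens)
        attn_out, _ = self.attn(q, q, q)
        tokens = tokens + attn_out                        # attention with residual
        tokens = tokens + self.mlp(self.norm2(tokens))    # MLP + layer norm refinement
        return tokens.transpose(1, 2).view(b, c, h, w)
```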
Experiments
The experiments evaluated the performance of the proposed DIMN on UIR tasks, including SR reconstruction and enhancement. Publicly available UIR datasets were used for training and testing, including USR-248, UFO-120, EUVP, and UIEB. Evaluation metrics covered both full-reference and reference-free image quality assessment indices, including mean-squared error (MSE), peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), underwater image quality measure (UIQM), and natural image quality evaluator (NIQE), among others.
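As a concrete reference point for the full-reference metrics, the snippet below computes MSE and PSNR from their standard definitions on a placeholder image pair (SSIM, UIQM, and NIQE need dedicated implementations and are omitted). The data here is synthetic and unrelated to the paper's datasets.

```python
import numpy as np

def mse(ref, test):
    """Mean-squared error between reference and restored images (floats in [0, 1])."""
    return float(np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2))

def psnr(ref, test, max_val=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    err = mse(ref, test)
    return float("inf") if err == 0 else 10.0 * np.log10((max_val ** 2) / err)

# Synthetic stand-in for a ground-truth / restored pair.
rng = np.random.default_rng(0)
gt = rng.random((256, 256, 3))
restored = np.clip(gt + 0.02 * rng.standard_normal(gt.shape), 0.0, 1.0)
print(f"MSE:  {mse(gt, restored):.5f}")
print(f"PSNR: {psnr(gt, restored):.2f} dB")
```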
Ablation experiments assessed the influence of the individual components of DIMN. Results showed that including SAAB and MSTB significantly improved restoration performance: MSTB played a crucial role in capturing multi-scale structural features, while SAAB strengthened spatial perception. The number of MIEMs was also tuned to balance network complexity against restoration performance. Comparisons with state-of-the-art methods across datasets demonstrated the superiority of DIMN on a range of metrics.
In underwater SR tasks, DIMN consistently outperformed existing methods such as SR-CNN, SR-GAN, and the SR deep residual network-based model (SR-DRM) in PSNR and SSIM. For underwater enhancement tasks, DIMN likewise surpassed methods such as deep simultaneous enhancement and super-resolution (SESR) and FUnIE-GAN in PSNR and UIQM. Visual comparisons supported these findings: DIMN produced clearer, more detailed, and visually pleasing results with reduced color casts and improved contrast. Overall, the experimental findings affirmed the efficacy and robustness of DIMN in addressing UIR challenges, underscoring its viability for practical use in improving underwater image quality.
Conclusion
In conclusion, the proposed DIMN significantly improved UIR by integrating SAAB and MSTB within its MIEM. Experimental results showed DIMN's superiority over existing methods in rectifying color deviations, restoring fine details, and enhancing image sharpness and contrast. By addressing attenuation inconsistencies across color channels and spatial regions, DIMN represented a significant advance in UIR technology, promising improved visualization and analysis of underwater environments.
Journal reference:
- Wang, L., Li, X., Li, K., Mu, Y., Zhang, M., & Yue, Z. (2024). Underwater image restoration based on dual information modulation network. Scientific Reports, 14(1), 5416. https://doi.org/10.1038/s41598-024-55990-x, https://www.nature.com/articles/s41598-024-55990-x