In an article recently submitted to the arXiv* server, researchers introduced an active strategy combining image watermarking with latent diffusion models (LDMs) to address the ethical concerns raised by generative image modeling. Their method embedded an invisible watermark in all generated images for later detection and identification.
*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.
They fine-tuned the latent decoder of the image generator, conditioning it on a binary signature. A pre-trained watermark extractor retrieved the hidden signature from any generated image, and a statistical test determined its origin. The researchers evaluated the invisibility and robustness of the watermarks across various tasks, demonstrating that the Stable Signature method remained effective even after image modifications.
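The statistical test mentioned above can be sketched as follows. This is a minimal illustration under the null hypothesis that, for a non-watermarked image, each extracted bit matches the signature with probability one half; the 48-bit length, the threshold `alpha`, and the function names are our illustrative choices, not the paper's exact implementation.

```python
import math

def detection_pvalue(matches: int, k: int = 48) -> float:
    """P-value of observing at least `matches` agreeing bits out of k
    when the image carries no watermark (bits behave like fair coin flips)."""
    return sum(math.comb(k, i) for i in range(matches, k + 1)) / 2 ** k

def is_watermarked(extracted: str, signature: str, alpha: float = 1e-6) -> bool:
    # Count agreeing bits, then flag the image if chance agreement is implausible.
    matches = sum(a == b for a, b in zip(extracted, signature))
    return detection_pvalue(matches, len(signature)) < alpha
```

With a 48-bit signature, a perfect match has a chance probability of 2⁻⁴⁸, so even a fairly strict `alpha` leaves room for several bit errors while keeping false positives rare.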
Background
Past work in generative modeling and natural language processing has made significant strides in creating and manipulating photo-realistic images, exemplified by tools like DALL·E 2 and Stable Diffusion. These advancements have led to the development of numerous image editing tools, such as ControlNet and InstructPix2Pix, which have become popular among artists, designers, and the general public. Despite the benefits of creative artificial intelligence (AI) applications, these developments also raise concerns about the authenticity and integrity of photo-realistic images.
Embedding Signature Efficiently
Stable Signature modifies the generative network to embed a specific signature in generated images using a fixed watermark extractor. The training process involves two phases: creating the watermark extractor network (W) and then fine-tuning the LDM decoder (D) to ensure all generated images contain a designated signature via W.
HiDDeN, a classical deep watermarking method, is used to pre-train the watermark extractor (W). It jointly optimizes a watermark encoder (WE) and the extractor (W) to embed k-bit messages into images so that they survive the various transformations applied during training. After this pre-training, WE is discarded and only W is retained. WE produces a watermarked image by adding a residual image to the original; the result is then transformed and passed to W, which extracts a message. The loss is the binary cross-entropy (BCE) between the extracted and embedded bits, with the network architectures kept simple to facilitate LDM fine-tuning. A PCA whitening transformation is also applied to remove bias and decorrelate the output bits of W.
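The PCA whitening step can be illustrated on synthetic data. This is a sketch of the generic technique, not the paper's code: we fit a zero-phase whitening transform on a batch of (hypothetical) extractor outputs so that the transformed coordinates have zero mean and identity covariance, removing bias and decorrelating the bits.

```python
import numpy as np

def pca_whitening(outputs: np.ndarray):
    """Fit a whitening transform on extractor outputs of shape (n_samples, k)
    so the transformed outputs have zero mean and identity covariance."""
    mu = outputs.mean(axis=0)
    cov = np.cov(outputs - mu, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T  # ZCA whitening matrix
    return lambda x: (x - mu) @ W

# Synthetic, correlated "extractor outputs" standing in for real ones:
rng = np.random.default_rng(0)
raw = rng.standard_normal((1000, 8)) @ rng.standard_normal((8, 8)) + 3.0
whiten = pca_whitening(raw)
white = whiten(raw)  # now approximately zero-mean with identity covariance
```

After whitening, each output coordinate carries independent information, which makes the subsequent sign-thresholding into bits better behaved.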
In the LDM, diffusion occurs in the latent space of an auto-encoder. The fine-tuning process modifies the decoder (D) to produce images containing a given message (m), which W can extract, without altering the diffusion process itself. A fixed signature (m) is embedded in the images by fine-tuning D into Dm, using the LDM encoder to create a latent vector (z) and the decoder to reconstruct an image. The message loss is the BCE between the extracted message and the original, while a perceptual loss controls image distortion. The fine-tuning is rapid, taking less than a minute on a single GPU, and involves optimizing the weights of Dm over 100 iterations to minimize the combined message and perceptual loss.
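The combined objective can be sketched as below. This is a simplified stand-in, not the paper's implementation: the paper balances a BCE message loss against a perceptual image loss, whereas here a plain mean-squared error plays the perceptual role, and the weight `lam` is an arbitrary illustrative value.

```python
import numpy as np

def finetune_loss(logits, target_bits, x_recon, x_orig, lam=0.1):
    """Illustrative objective for fine-tuning the decoder D into D_m:
    a message term (BCE) pulls the extracted bits toward the signature,
    while an image term limits distortion of the reconstruction."""
    p = 1.0 / (1.0 + np.exp(-logits))              # extractor outputs -> bit probabilities
    msg = -np.mean(target_bits * np.log(p) + (1 - target_bits) * np.log(1 - p))
    img = np.mean((x_recon - x_orig) ** 2)         # MSE stand-in for the perceptual loss
    return msg + lam * img
```

When the extractor's logits agree confidently with the target bits and the reconstruction matches the original, both terms shrink toward zero, which is the state the 100 fine-tuning iterations drive Dm toward.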
Stable Signature: Effective Watermarking
The experimental results section thoroughly evaluates the efficacy of Stable Signature for embedding watermarks in generated images across various generative tasks and scenarios. Firstly, the method's versatility is highlighted: because it operates at the LDM decoder, it applies to diverse tasks such as text-to-image generation, image editing, super-resolution, and inpainting. Evaluation metrics like peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and Fréchet inception distance (FID) are employed to assess image quality and the distortion caused by the watermarking process.
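Of these metrics, PSNR is the simplest to state precisely; a minimal implementation (ours, for illustration, assuming 8-bit images with a peak value of 255) is:

```python
import numpy as np

def psnr(x: np.ndarray, y: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; higher means less visible distortion."""
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

A watermark residual that perturbs pixels only slightly keeps the MSE small and thus the PSNR high, which is why PSNR serves as a proxy for watermark invisibility.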
Secondly, the qualitative impact of Stable Signature on image generation is demonstrated through examples showing minimal perceptual differences between watermarked and original images. Despite modest PSNR values and the absence of constraints tailored to the human visual system, the method tends to concentrate the watermark signal in textured areas while leaving uniform backgrounds largely untouched. This behavior keeps the embedded signatures inconspicuous to human observers while achieving robust watermarking.
Lastly, the robustness of the watermarking method is evaluated across various image transformations and tasks. Results indicate high bit accuracy in watermark extraction, demonstrating resilience to cropping, brightness adjustments, and Joint Photographic Experts Group (JPEG) compression. Furthermore, comparisons with post-hoc watermarking methods highlight Stable Signature's efficiency and security in embedding watermarks during the image generation process rather than afterward, ensuring minimal impact on image quality.
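Bit accuracy, the metric behind these robustness results, is simply the fraction of signature bits recovered after a transformation; a one-line definition (ours, with an illustrative example rather than the paper's reported numbers):

```python
def bit_accuracy(extracted, signature) -> float:
    """Fraction of signature bits recovered correctly after a transformation."""
    assert len(extracted) == len(signature)
    return sum(a == b for a, b in zip(extracted, signature)) / len(signature)

# Hypothetical example: a JPEG round-trip flips 2 of 48 bits.
acc = bit_accuracy([1] * 46 + [0] * 2, [1] * 48)  # 46/48 correct
```

Because the statistical detection test tolerates a few flipped bits, an accuracy well above chance (0.5) is enough for reliable detection even after aggressive edits.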
These findings underscore the stable signature's effectiveness in embedding robust watermarks into generated images while maintaining high-quality visual outputs across diverse generative tasks and challenging scenarios.
Conclusion
To sum up, by fine-tuning the LDM decoder, imperceptible watermarks were seamlessly embedded into all generated images while preserving the integrity of the diffusion process. The approach was compatible with various LDM-based generative models and enabled effective detection and user identification from generated images with high accuracy. It highlighted the advantages of watermarking over passive detection methods when deploying image generative models publicly. The code for reproducibility was made available at github.com/facebookresearch/stable_signature, and the method's environmental impact was relatively low, estimated at approximately 10 tons of CO2eq emissions for the experiments conducted.