Published on November 6, 2025 11:44 AM GMT
epistemic status: my thoughts, backed by some arguments
With the advent of deep fakes, it has become very hard to know which image / sound / video is authentic and which is generated by an AI. In this context, people have proposed using software to detect generated content, usually aided by some type of watermarking. I don’t think this type of solution would work.
Watermarking AI-generated content
One idea is to add a watermark to all content produced by a generative model. The exact technique would depend on the type of media - e.g. image, sound, text.
We could discuss various techniques with their advantages and shortcomings, but I think this is beside the point. The fact is that this is an adversarial setting - one side is trying to design reliable, robust watermarks and the other side is trying to find ways to break them. Relying on watermarks could start a watermarking arms race. There are strong incentives for creating fakes, so hoping that those efforts would fail seems like wishful thinking.
Then there is the issue of non-complying actors. A company could still decide not to add watermarks, or to release the weights of its model so that anyone can run it without watermarking. This is next to impossible to prevent on a worldwide scale. Whoever wants to create fakes can simply use any generative model that doesn’t add watermarks.
I don’t think watermarking AI-generated content is a reasonable strategy.
Mandatory digital watermark system for all digital cameras
Another idea is to make digital cameras add a watermark (or a digital signature) to pictures and videos. Maybe digital microphones could even do something similar for sound, although this would likely significantly increase the price of the cheapest ones. We should note that this technique cannot be applied to text.
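To make the proposal concrete, here is a minimal sketch of the signing step such a camera might perform, assuming an Ed25519 device key and Python’s `cryptography` library. The key handling and function names are illustrative, not a real camera API.

```python
# Illustrative sketch only: a hypothetical camera signs a hash of the raw
# sensor output with a device key whose public half the manufacturer publishes.
import hashlib

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

device_key = Ed25519PrivateKey.generate()  # in reality, burned in at the factory

def sign_capture(raw_sensor_bytes: bytes) -> bytes:
    """Sign a digest of the capture; the signature ships as file metadata."""
    return device_key.sign(hashlib.sha256(raw_sensor_bytes).digest())

# Anyone holding the device's public key can check the claim later;
# verify() raises InvalidSignature if the bytes were altered.
capture = b"...raw sensor data..."
signature = sign_capture(capture)
device_key.public_key().verify(signature, hashlib.sha256(capture).digest())
```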
I see several objections to this proposal:
- How do we decide who has the authority to make digital cameras? There needs to be control to ensure watermarked content really comes from a digital camera. This could lead to an oligopoly in which a few companies have the authority to make digital cameras and thus decide what is real. We could try to make the watermarking system partly decentralized, somewhat like HTTPS certificate authorities. The problem is that HTTPS is not as secure as we would like to think. It would be even worse for watermarks, because some actors (e.g. states) have stronger incentives to create fakes and there seem to be fewer ways to detect them (e.g. there is no underlying network where one can track suspicious packets).
- What software goes on a digital camera? Cameras already do a ton of software processing before saving the content to a file, so somebody must decide and control what software can be put there. The watermark would be useless if the camera software could be used to watermark an image where an object was digitally removed from the scene, for example.
- It’s not clear how technically feasible such watermarking is. We would like watermarks that persist after “legitimate” edits (crop, brightness change, format change, etc.) but break otherwise (see the sketch after this list).
- Such a watermarking system is likely to end up like the Clipper Chip or the Content Scramble System - somebody would find a security hole rendering it useless or counterproductive.
- Suppose I print my deep-fake image on a high-quality printer and then use a high-quality camera with a suitable lens to take a picture of the print. Now the deep fake is watermarked. So, can one reliably distinguish a picture of a scene from a picture of a picture?
- It is dubious whether watermarking cameras would see wide adoption.
- What do we do with analog devices?
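To illustrate the feasibility objection above, here is a toy experiment with a naive least-significant-bit (LSB) watermark - not a real scheme, just a demonstration of how easily a “legitimate” edit can erase an embedding that was not designed to survive it.

```python
# Toy demo: an LSB watermark survives lossless storage but is destroyed
# by a simple brightness change - one of the edits we would want it to survive.
import numpy as np

def embed_lsb(img: np.ndarray, bits: np.ndarray) -> np.ndarray:
    """Overwrite the LSB of the first len(bits) pixels with the watermark."""
    flat = img.flatten()
    flat[: len(bits)] = (flat[: len(bits)] & 0xFE) | bits
    return flat.reshape(img.shape)

def extract_lsb(img: np.ndarray, n: int) -> np.ndarray:
    return img.flatten()[:n] & 1

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
bits = rng.integers(0, 2, size=128, dtype=np.uint8)

marked = embed_lsb(img, bits)
assert np.array_equal(extract_lsb(marked, 128), bits)  # intact if stored losslessly

# A 10% brightness increase scrambles the low bits.
brightened = np.clip(np.round(marked * 1.1), 0, 255).astype(np.uint8)
match = (extract_lsb(brightened, 128) == bits).mean()
print(f"bits recovered after brightness edit: {match:.0%}")  # roughly chance level
```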
Is there a solution?
I think we may need to accept that indistinguishable fakes are part of what’s technologically possible now.
In that case, the best we could do is track the origin of content and let each person decide which origins to trust. I am thinking of some decentralized append-only system where people can publish digital signatures of content they have generated.
If you trust your journalist friend Ron Burgundy, you could verify the digital signature of the photo in his news article against his public key. You could also assign some level of trust to the people Ron trusts. This creates a distributed network of trust.
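As a sketch of what that verification could look like - assuming Ed25519 keys, SHA-256 content hashes, and an in-memory list standing in for the distributed repository:

```python
# Sketch only: a plain list stands in for the distributed append-only
# repository, and Ed25519 keys stand in for the signers' identities.
import hashlib

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)

log: list[tuple[bytes, bytes]] = []  # (content digest, signature) entries

def publish(content: bytes, key: Ed25519PrivateKey) -> None:
    """Append a signed digest of the content to the repository."""
    digest = hashlib.sha256(content).digest()
    log.append((digest, key.sign(digest)))

def vouched_for(content: bytes, trusted: Ed25519PublicKey) -> bool:
    """Does any entry for this content verify under a key we trust?"""
    digest = hashlib.sha256(content).digest()
    for entry_digest, signature in log:
        if entry_digest == digest:
            try:
                trusted.verify(signature, digest)
                return True
            except InvalidSignature:
                continue  # same content, signed by someone else
    return False

ron = Ed25519PrivateKey.generate()  # Ron's key pair; you know the public half
photo = b"...photo bytes from the article..."
publish(photo, ron)
assert vouched_for(photo, ron.public_key())
assert not vouched_for(b"...doctored photo...", ron.public_key())
```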
With the right software, I can imagine this whole process being automated: I click on an article from a site and one of my browser plugins shows the content is 65% trustworthy (according to my trusted list). When I publish something, a hash of it signed with my private key is automatically appended to a distributed repository of signatures. Anybody can choose to run nodes of the repository software, in a way similar to how people run blockchain or Tor nodes. Platforms with user-generated content could choose to only allow signed content, and the signer could potentially be held responsible. It’s not a perfect idea, but it’s the best I have been able to come up with.
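And a hedged sketch of how the plugin might compute that trust score: direct trust is whatever you assigned, and trust in a friend-of-a-friend decays multiplicatively along the strongest path. The decay rule and the numbers are assumptions for illustration, not a worked-out design.

```python
# Hypothetical trust propagation: your score for a signer is the product of
# direct-trust weights along the strongest chain of endorsements to them.
def trust_score(me: str, signer: str, trust: dict, max_hops: int = 3) -> float:
    """trust maps person -> {person: direct trust in [0, 1]}."""
    best = {me: 1.0}
    frontier = [me]
    for _ in range(max_hops):
        next_frontier = []
        for person in frontier:
            for friend, direct in trust.get(person, {}).items():
                score = best[person] * direct
                if score > best.get(friend, 0.0):
                    best[friend] = score
                    next_frontier.append(friend)
        frontier = next_frontier
    return best.get(signer, 0.0)

trust = {
    "me": {"ron": 0.9},         # I trust Ron directly
    "ron": {"veronica": 0.72},  # Ron trusts Veronica
}
print(f"{trust_score('me', 'veronica', trust):.2f}")  # 0.9 * 0.72 ≈ 0.65
```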
I have seen attempts at something similar, but they are usually controlled by a single company and require a paid subscription - both of which defeat the whole purpose by standing in the way of wide adoption.