Ghost Vectors: Soft-Deleted Embeddings Remain Reconstructible in HNSW Vector Databases (opens in new tab)
Retrieval-augmented generation (RAG) allows large language models to access external and private corpora for factual, domain-specific responses. Modern RAG pipelines use hierarchical navigable small world (HNSW) vector databases for efficient similarity search. When a user requests data deletion, the systems typically only mark the record as deleted, leaving the embedding on disk physically unchanged. This soft-delete operation raises compliance...
Read the original article