Polynomial Trajectory Compression for Protein Language Model Embeddings

Protein language models (PLMs) generate rich, layer-wise embeddings that capture diverse biological information, but these embeddings are expensive to store and compute at scale. In this work, we propose a compact surrogate representation for PLM embeddings across transformer layers that combines low-dimensional PCA projections with cubic polynomial trajectories. This approach enables efficient storage and on-demand reconstruction of protein-level embeddings at any layer without rerunning the PLM. ...
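
The idea of a PCA projection followed by a cubic fit over layer depth can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the PCA basis is fit, how layers are parameterized, or how coefficients are estimated. The sketch assumes a single PCA basis shared across layers, layer indices rescaled to [0, 1], and an ordinary least-squares fit. The function names `fit_pca`, `compress`, and `reconstruct` are hypothetical.

```python
import numpy as np

def fit_pca(X, k):
    """Return the mean and top-k principal directions of the rows of X."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def compress(emb, mean, components, degree=3):
    """Fit one polynomial per PCA coordinate over layer depth.

    emb: (n_layers, d) protein-level embeddings of a single protein.
    Returns coefficients of shape (degree + 1, k), highest power first.
    """
    n_layers = emb.shape[0]
    t = np.linspace(0.0, 1.0, n_layers)      # assumed layer parameterization
    Z = (emb - mean) @ components.T          # (n_layers, k) PCA coordinates
    V = np.vander(t, degree + 1)             # columns t^degree, ..., t^0
    coeffs, *_ = np.linalg.lstsq(V, Z, rcond=None)
    return coeffs

def reconstruct(coeffs, layer, n_layers, mean, components):
    """Approximate the d-dimensional embedding at a given layer index."""
    t = layer / (n_layers - 1)
    powers = t ** np.arange(coeffs.shape[0] - 1, -1, -1)
    z = powers @ coeffs                      # (k,) PCA coordinates at this layer
    return z @ components + mean             # back to the original space

# Synthetic stand-in for PLM output; no real model is run here.
rng = np.random.default_rng(0)
n_proteins, n_layers, d, k = 5, 12, 32, 4
emb = rng.normal(size=(n_proteins, n_layers, d))

mean, components = fit_pca(emb.reshape(-1, d), k)
coeffs = compress(emb[0], mean, components)
approx = reconstruct(coeffs, layer=7, n_layers=n_layers, mean=mean, components=components)
```

Under these assumptions, each protein is stored as `4 * k` coefficients instead of `n_layers * d` values, plus the shared mean and PCA basis. Reconstruction is approximate in general; how much is lost depends on how well `k` components and a cubic curve describe the real layer-wise trajectory.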
