TreeGazer: Prospecting Protein Sequence-Function Landscapes via Phylogenetic Structure
Building diverse and informative protein sequence datasets is critical for understanding how function varies across sequence space. Because only a small fraction of sequences in a dataset can typically be experimentally characterised, strategies for selecting which sequences to characterise should maximise the information gained from each experiment. Here, we present TreeGazer, a phylogeny-informed framework that combines Bayesian optimisation with the topology of a tree to guide sequence sele...