Learn the hidden signal inside a teacher model's output probabilities (“dark knowledge”) and use a combined cross-entropy and KL-divergence loss to transfer it to a smaller model.
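
As a rough illustration of what "cross-entropy + KL" can look like in practice, here is a minimal sketch in plain Python. It is not taken from the original article: the `temperature` and `alpha` parameters, the T² scaling of the soft term, and all function names are assumptions borrowed from the common formulation of knowledge distillation.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 flattens the distribution and exposes the small
    # probabilities the teacher assigns to non-target classes.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(student_logits, label):
    # Hard-label loss: negative log-probability of the true class.
    return -math.log(softmax(student_logits)[label])

def kl_divergence(p, q):
    # KL(p || q); terms with p_i == 0 contribute nothing.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    # Assumed weighting: alpha on the hard-label term, (1 - alpha) on the
    # soft term, with the soft term scaled by temperature squared.
    hard = cross_entropy(student_logits, label)
    p_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    soft = kl_divergence(p_teacher, q_student) * temperature ** 2
    return alpha * hard + (1.0 - alpha) * soft

if __name__ == "__main__":
    teacher = [4.0, 1.5, 0.2]   # hypothetical teacher logits
    student = [2.0, 1.0, 0.5]   # hypothetical student logits
    print(distillation_loss(student, teacher, label=0))
```

When the student's logits match the teacher's, the KL term vanishes and only the weighted cross-entropy remains, which is a quick sanity check for any implementation along these lines.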