I was fine-tuning a language model on Arabic. The loss was perfect. It spoke Chinese. (opens in new tab)
Repo: github.com/AmmarHassona/trainsafe I was working on fine-tuning an open-source small language model (SLM) on Arabic using DPO. I had the data, the pipeline, and everything set up for training. I was fairly confident that this training run would improve the model and align it further to what I wanted. I started the training and let it run until it finished. When I came back to test the checkpoint, it was speaking Chinese. Loss only tells you the model is learning something — not what it's...
Read the original article