Finding Optimal Tokenizers
In this post, I present an algorithm that was able to compute an optimal tokenizer in some settings. The result is interesting because optimal tokenization is intractable in theory, yet it appears to be solvable in practice. This mirrors various results on the Traveling Salesman Problem (TSP), where even difficult instances can be solved to optimality using cutting-plane techniques.