Optimizing memory usage of scikit-learn models using succinct tries

We use the scikit-learn library for various machine-learning tasks at Zyte. For example, for text classification we typically build a statistical model from sklearn's Pipeline and FeatureUnion, a classifier (e.g. LinearSVC), and feature extraction and preprocessing classes. The model is usually trained on a developer's machine, then serialized (using pickle/joblib) and uploaded to a server where […]
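As a rough illustration of the workflow described above, here is a minimal sketch of such a model. The toy texts, labels, step names, and the choice of vectorizers are assumptions made up for this example, not the setup used at Zyte; only Pipeline, FeatureUnion, LinearSVC, and pickle come from the text.

```python
import pickle

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import LinearSVC

# Toy training data, invented for illustration only.
texts = [
    "cheap pills buy now",
    "limited offer buy cheap",
    "meeting agenda for monday",
    "please review the project report",
]
labels = ["spam", "spam", "ham", "ham"]

# Combine two feature extractors (an assumed choice): word-level TF-IDF
# and character n-gram counts.
features = FeatureUnion([
    ("words", TfidfVectorizer(analyzer="word")),
    ("chars", CountVectorizer(analyzer="char_wb", ngram_range=(2, 4))),
])

# Feature extraction followed by a linear classifier.
model = Pipeline([
    ("features", features),
    ("clf", LinearSVC()),
])
model.fit(texts, labels)

# Serialize on the "developer's machine", then load as the server would.
blob = pickle.dumps(model)
restored = pickle.loads(blob)

new_texts = ["buy cheap pills", "monday project meeting"]
original_predictions = list(model.predict(new_texts))
restored_predictions = list(restored.predict(new_texts))
print(len(blob), restored_predictions)
```

The fitted vectorizers keep their vocabularies as ordinary Python dicts (the `vocabulary_` attribute), so those dicts are part of the pickled bytes and are rebuilt in memory whenever the model is loaded.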
