Chapter 2: Classification

MNIST

MNIST is a dataset of 70,000 small images of digits handwritten by high school students and employees of the US Census Bureau.

In this chapter, we will train a machine learning model that classifies whether a given image shows a 5 or not (a "5 vs. not-5" binary classifier).
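To make the "5 vs. not-5" idea concrete, here is a minimal sketch of how multi-class digit labels can be turned into a Boolean target. The `y_sample` array is a made-up stand-in, and storing the digits as strings is an assumption about how the labels are represented; adjust the comparison if your labels are integers.

```python
import numpy as np

# Hypothetical sample of digit labels (assumed to be strings, not integers)
y_sample = np.array(['5', '0', '4', '1', '9', '5'])

# Element-wise comparison yields a Boolean array: True for 5s, False otherwise
is_five = (y_sample == '5')

print(is_five.tolist())  # [True, False, False, False, False, True]
```

A binary classifier is then trained against a target like `is_five` instead of the original ten-class labels.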

from sklearn.datasets import fetch_openml

# Download MNIST from OpenML; as_frame=False returns NumPy arrays rather than a pandas DataFrame
mnist = fetch_openml('mnist_784', as_frame=False)

Let’s create two variables, X for the features and y for the labels:

X, y = mnist.data, mnist.target

There are 70,000 images in total, and each image has 784 features, one per pixel. Each image is 28×28 pixels (28 × 28 = 784), flattened into a single row of X.
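The relationship between the 784 features and the 28×28 grid can be checked with a quick reshape. This is only a sketch: `flat_image` is synthetic data standing in for one row of X (so it runs without downloading anything), and it assumes the pixels are stored row by row.

```python
import numpy as np

# Synthetic stand-in for one row of X: 784 pixel intensities in the range 0-255
flat_image = np.arange(784) % 256

# Restore the 2D layout: 28 rows of 28 pixels each
image = flat_image.reshape(28, 28)
print(image.shape)  # (28, 28)

# With row-by-row storage, pixel (row, col) sits at index row * 28 + col
row, col = 3, 5
print(image[row, col] == flat_image[row * 28 + col])  # True
```

With the real data, the same `reshape(28, 28)` call on a row such as `X[0]` gives an array that can be passed to an image-plotting function.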

Let’s split the data into a training set (the first 60,000 images) and a test set (the remaining 10,000):

X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]
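As a sanity check on the slicing above, the sketch below applies the same split to zero-filled arrays with MNIST's dimensions and prints the resulting shapes. The helper `split_first_n` and the `X_fake` / `y_fake` arrays are hypothetical names introduced only for this illustration.

```python
import numpy as np

def split_first_n(X, y, n_train=60000):
    """Hypothetical helper: first n_train rows for training, the rest for testing."""
    return X[:n_train], X[n_train:], y[:n_train], y[n_train:]

# Synthetic stand-ins with the same shape as the MNIST features and labels
X_fake = np.zeros((70000, 784), dtype=np.uint8)
y_fake = np.zeros(70000, dtype=np.uint8)

X_train, X_test, y_train, y_test = split_first_n(X_fake, y_fake)

print(X_train.shape, X_test.shape)  # (60000, 784) (10000, 784)
print(y_train.shape, y_test.shape)  # (60000,) (10000,)
```

A plain positional split like this only gives a fair evaluation if the rows are not ordered by label, so it is worth confirming that for your copy of the data before relying on it.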

Training a Binary Classifier

Let’s …
