Neural networks have revolutionized the field of machine learning, powering advancements in areas like image recognition, natural language processing, and predictive modeling. At their core, neural networks are architectures that learn from data and are refined through training to produce accurate predictions.
In this tutorial, you’ll learn how to build and train a neural network in Python using TensorFlow, Keras, and Scikit-Learn. We’ll walk you through every step, from data preprocessing and model construction to training, evaluation, and visualization of results. By the end of this guide, you’ll have the skills to create your own neural networks and apply them to a wide range of machine learning tasks.
The tutorial includes the following steps:
- Step 1: Import Required Libraries
- Step 2: Load and Prepare the Data
- Step 3: Split the Data into Training and Testing Sets
- Step 4: Normalize the Input Data
- Step 5: Define the Neural Network Model
- Step 6: Compile the Model
- Step 7: Train the Model
- Step 8: Evaluate the Model
- Step 9: Visualize Training Performance
Step 1: Import Required Libraries
Before we begin, let’s import the libraries that we’ll use throughout this tutorial:
# Import libraries
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input
Here’s what each library is used for in this tutorial:
- Matplotlib: For visualizing the training process and model performance.
- NumPy: For numerical operations and data manipulation.
- Scikit-Learn: For splitting data into training and testing sets, scaling data, and evaluating the model.
- TensorFlow/Keras: For building, training, and evaluating the model.
These libraries will help us preprocess the data, define the neural network, train it, and visualize the results.
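If you want to verify your setup before moving on, you can optionally print the installed versions of the core packages (a quick, optional check):
# Optional: check the installed versions of the main libraries
print(tf.__version__)
print(np.__version__)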
Now that we have our tools ready, let’s move on to preparing the data in the next step.
Step 2: Load and Prepare the Data
We will use the Pima Indians Diabetes Database for this tutorial [Data Attribution]. This data set, originally from the National Institute of Diabetes and Digestive and Kidney Diseases, aims to predict the likelihood of diabetes in female patients based on diagnostic measurements. The target variable indicates whether a patient has diabetes (1) or not (0).
The predictor variables include medical details such as the number of pregnancies, Body Mass Index (BMI), insulin level, age, and more.
You can download the data set from Kaggle (see the Data Attribution note at the end of this tutorial). Save the file as pima-indians-diabetes.csv in the same directory as your Python script.
Next, we will load the data set using the genfromtxt() function from NumPy. This function reads the CSV file and converts it into a NumPy array for further processing:
# Load the data
my_data = np.genfromtxt('pima-indians-diabetes.csv',
                        delimiter=',',
                        skip_header=1)
The image below shows the first six rows of our data set:
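If you are following along in your own environment, you can also print the shape and the first rows of the array directly (a small optional sanity check):
# Optional: inspect the shape and the first six rows of the loaded array
print(my_data.shape)
print(my_data[:6])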

Once loaded, we split the data into two parts: the input features (X_raw) and the target variable (y). The input features consist of the first 8 columns, while the target variable is located in the 9th column (index 8):
# Split into input features and target variable
X_raw = my_data[:, 0:8]
y = my_data[:, 8]
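Since the two classes turn out not to be equally represented, it can also be useful to count the samples per class at this point. An optional check using NumPy:
# Optional: count how many samples belong to each class
unique_classes, class_counts = np.unique(y, return_counts=True)
print(unique_classes, class_counts)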
Now that we have loaded and split the data set into input features and the target variable, the next step is to preprocess the data to ensure it is ready for training the neural network.
Step 3: Split the Data into Training and Testing Sets
To evaluate the performance of our neural network, we need to split the data into two parts:
- Training set: Used by the model to learn patterns from the data.
- Testing set: Used to evaluate the model’s ability to generalize to unseen data.
We’ll use Scikit-Learn’s train_test_split() function for this task. This function randomly splits the data into training and testing sets based on a specified ratio. In this tutorial, we’ll reserve 20% of the data for testing and use the remaining 80% for training:
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X_raw, y, test_size=0.2)
The X_train and y_train variables contain the training data for the input features and target variable, respectively. Similarly, X_test and y_test hold the testing data.
Splitting the data ensures that we can assess the model’s accuracy on data it hasn’t seen during training, which is crucial for validating its real-world performance.
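As a side note, train_test_split() also accepts optional arguments such as random_state for reproducible splits and stratify to preserve the class proportions in both sets. A minimal sketch, not used in the remainder of this tutorial (the seed value 42 is an arbitrary example):
# Illustration only: reproducible, stratified split
X_train, X_test, y_train, y_test = train_test_split(X_raw,
                                                     y,
                                                     test_size=0.2,
                                                     random_state=42,
                                                     stratify=y)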
Now that the data is split, the next step is to normalize the input features to ensure consistent scaling, which will help the neural network learn effectively.
Step 4: Normalize the Input Data
Neural networks perform better when the input features are on a similar scale. To achieve this, we normalize the input data using Scikit-Learn’s StandardScaler(). This scaler standardizes the features by subtracting the mean and dividing by the standard deviation, resulting in a distribution with a mean of 0 and a standard deviation of 1.
# Normalize input data
my_scaler = StandardScaler()
X_train = my_scaler.fit_transform(X_train)
X_test = my_scaler.transform(X_test)
The fit_transform() method computes the mean and standard deviation from the training data and applies the transformation. The transform() method is then used to apply the same scaling to the testing data without re-computing the mean and standard deviation.
Standardizing the data ensures that all input features are treated equally by the neural network. This not only helps the network converge faster during training but also improves overall performance by avoiding issues caused by varying feature scales.
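If you are curious about the statistics the scaler has learned, the fitted StandardScaler exposes them as attributes (an optional check):
# Optional: the per-feature means and standard deviations learned from X_train
print(my_scaler.mean_)
print(my_scaler.scale_)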
Now that the input features have been normalized, we are ready to define the architecture of our neural network. This step involves specifying the layers, activation functions, and overall structure of the model.
Step 5: Define the Neural Network Model
We will now use Keras to define a simple feedforward neural network, also known as a fully connected neural network. This architecture processes input features sequentially, passing them through a series of layers to generate predictions:
# Define the model
my_model = Sequential([Input(shape=(8,)),
                       Dense(16, activation='relu'),
                       Dense(1, activation='sigmoid')])
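You can optionally print an overview of the layers and parameter counts with Keras’ summary() method:
# Optional: print the layer structure and number of trainable parameters
my_model.summary()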
Here’s what each component of the model does:
- Input Layer: Accepts the 8 input features from the data set. The Input(shape=(8,)) specifies the size of the input vector.
- Hidden Layer: A dense layer with 16 neurons, each using the ReLU (Rectified Linear Unit) activation function. ReLU introduces non-linearity, enabling the model to learn complex patterns in the data.
- Output Layer: A dense layer with 1 neuron and the sigmoid activation function. This layer outputs a value between 0 and 1, making it suitable for binary classification tasks like predicting the presence or absence of diabetes.
This neural network is relatively simple but powerful enough to learn patterns in the data. It serves as a strong starting point for training a model that can predict the likelihood of diabetes based on diagnostic measurements.
To enhance the network’s complexity and enable it to learn deeper and more intricate patterns, you could add additional hidden layers or increase the number of neurons in each layer. However, to keep things straightforward and focused in this tutorial, we will use this simpler model, which is sufficient for solving the current problem.
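For illustration, a slightly deeper variant could look like the sketch below. It is not used in the rest of this tutorial, and the layer sizes are just example choices:
# Illustration only: a deeper model with two hidden layers (example layer sizes)
deeper_model = Sequential([Input(shape=(8,)),
                           Dense(32, activation='relu'),
                           Dense(16, activation='relu'),
                           Dense(1, activation='sigmoid')])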
With the model architecture defined, the next step is to compile the network by specifying the loss function, optimizer, and evaluation metrics, preparing it for training.
Step 6: Compile the Model
Before training, the model needs to be compiled, which involves specifying the loss function, optimizer, and evaluation metrics:
# Compile the model
my_model.compile(loss='binary_crossentropy',
                 optimizer='adam',
                 metrics=['accuracy'])
Here’s a breakdown of the key components:
- Loss Function: The loss function measures the difference between the model’s predictions and the actual values. The binary_crossentropy loss function is used because this is a binary classification problem, where the target variable has only two possible values (diabetes: 1, no diabetes: 0).
- Optimizer: The optimizer controls how the model’s weights are updated during training. The Adam optimizer combines the benefits of RMSprop and momentum, providing efficient and adaptive gradient updates during training.
- Metrics: Metrics are used to evaluate the model’s performance during training and testing. Accuracy is specified as the evaluation metric to monitor how well the model performs in predicting the correct class during training and testing.
This configuration ensures that the model is optimized for binary classification and is ready for the training process. The choice of the loss function, optimizer, and metrics is critical, as they directly impact how the model learns and evaluates its performance.
While this tutorial uses a commonly accepted setup, it’s worth noting that there are many other ways to specify these parameters. For example, different loss functions or optimizers may be better suited for specific tasks or data distributions. Selecting the optimal configuration is an active area of research in machine learning and often involves experimentation and fine-tuning to achieve the best results.
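For example, if you need explicit control over the learning rate, you can pass an optimizer object instead of the string shorthand. A minimal sketch; the compile call above is what we actually use, and the learning rate shown here is just an example value:
# Illustration only: compile with an explicit Adam learning rate
from tensorflow.keras.optimizers import Adam
my_model.compile(loss='binary_crossentropy',
                 optimizer=Adam(learning_rate=0.001),
                 metrics=['accuracy'])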
With the model configured, the next step is to train it using the prepared data and monitor its performance during the training process.
Step 7: Train the Model
Training the model involves feeding it the training data and allowing it to learn patterns by updating its weights over multiple iterations. This process is called fitting the model:
# Fit the model
my_history = my_model.fit(X_train,
                          y_train,
                          epochs=100,
                          batch_size=30,
                          validation_split=0.1)
Here’s what each parameter in the fit() function specifies:
- Epochs: The model is trained for 100 epochs, where one epoch is a single pass through the entire training data. Increasing or decreasing the number of epochs can affect how well the model learns patterns.
- Batch Size: The training data is divided into smaller batches of 30 samples each. The model updates its weights after processing each batch, which helps balance memory usage and learning efficiency.
- Validation Split: 10% of the training data is set aside as validation data. This allows the model to evaluate its performance on unseen data during training, helping to detect overfitting or underfitting.
During training, the model adjusts its weights to minimize the loss function, guided by the optimizer. The training process generates metrics such as accuracy and loss, which can be monitored using the returned my_history object. This object stores the training history, including metrics for both training and validation data.
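For example, you can list which metrics were recorded and look up the accuracy reached in the final epoch (an optional check):
# Optional: inspect the recorded metrics and the final training accuracy
print(my_history.history.keys())
print(my_history.history['accuracy'][-1])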
By the end of this step, the model will have learned patterns from the training data and will be ready for evaluation on the testing data.
Step 8: Evaluate the Model
After training the model, we evaluate its performance on the testing set to measure how well it generalizes to unseen data. This involves generating predictions and analyzing evaluation metrics.
First, we use the predict() function to generate probabilities for each sample in the testing set. Then, we apply a threshold of 0.5 to classify the predictions as 1 (diabetes) or 0 (no diabetes), and the astype(int) method converts these predictions into integers for comparison with the actual class labels:
# Make class predictions with the model
my_predictions = (my_model.predict(X_test) > 0.5).astype(int)
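As an optional sanity check, Keras’ evaluate() method reports the overall loss and accuracy on the testing set directly:
# Optional: overall loss and accuracy on the testing set
test_loss, test_accuracy = my_model.evaluate(X_test, y_test)
print(test_loss, test_accuracy)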
Next, we use Scikit-Learn’s classification_report() to evaluate the predictions:
# Evaluate the model
print(classification_report(y_test,
                            my_predictions))

The classification report provides detailed performance metrics for each class (0.0 and 1.0), as well as overall metrics for the model:
- Precision: Measures how many of the predicted positives are actually correct.
  - Class 0.0 (No Diabetes): Precision is 0.79, meaning 79% of the samples predicted as Class 0.0 (no diabetes) are correct.
  - Class 1.0 (Diabetes): Precision is 0.56, meaning 56% of the samples predicted as Class 1.0 (diabetes) are correct.
- Recall: Measures how many actual positives are correctly identified by the model.
  - Class 0.0 (No Diabetes): Recall is 0.81, meaning the model correctly identifies 81% of the actual Class 0.0 (no diabetes) samples.
  - Class 1.0 (Diabetes): Recall is 0.52, meaning the model correctly identifies 52% of the actual Class 1.0 (diabetes) samples.
- F1-Score: The harmonic mean of precision and recall, providing a single measure of model performance for each class.
  - Class 0.0 (No Diabetes): F1-score is 0.80, indicating good performance on this class.
  - Class 1.0 (Diabetes): F1-score is 0.54, indicating the model struggles more with this class.
- Accuracy: The overall accuracy of the model is 0.72, meaning it correctly classifies 72% of all samples.
- Macro Avg: The unweighted average of precision, recall, and F1-score across both classes, with a value of 0.67.
- Weighted Avg: The weighted average of precision, recall, and F1-score, taking into account the number of samples in each class, with a value of 0.72.
- Support: Represents the number of actual samples in each class. Support helps contextualize the metrics, as it shows how balanced the testing data is across the two classes.
  - Class 0.0 (No Diabetes): Support is 106, meaning there are 106 actual samples in the testing set for this class.
  - Class 1.0 (Diabetes): Support is 48, meaning there are 48 actual samples in the testing set for this class.
These metrics reveal that the model performs better at predicting class 0 (no diabetes) compared to class 1 (diabetes). Addressing class imbalance or refining feature selection could help improve performance on class 1.
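If you want to inspect these errors in more detail, a confusion matrix shows how many samples of each class were classified correctly or incorrectly. A minimal optional sketch using Scikit-Learn:
# Optional: confusion matrix of actual vs. predicted classes
from sklearn.metrics import confusion_matrix
print(confusion_matrix(y_test, my_predictions))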
While the evaluation highlights areas where the model performs well and where it struggles, the next step involves visualizing the training history. This will allow us to better understand how the model’s accuracy and loss evolved over the course of training.
Step 9: Visualize Training Performance
Visualizing the model’s training history helps us understand how its performance evolved during training and whether it generalized well to the validation data. In this step, we plot the training and validation accuracy over epochs to evaluate the learning process:
# Visualize training history
plt.plot(my_history.history['accuracy'], label='Training Accuracy')
plt.plot(my_history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.grid(True)
plt.show()

The training accuracy (blue line) increases rapidly during the initial epochs and stabilizes around 80%. The validation accuracy (orange line) closely follows the training accuracy and stabilizes around 75%, indicating that the model generalizes reasonably well to the validation data during training. The small gap between training and validation accuracy suggests that overfitting is not a concern, as the model performs similarly on both the training and validation sets. However, the true extent of overfitting can only be confirmed by evaluating performance on an unseen testing set.
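If you also want to inspect the loss curves, the same plotting pattern works with the 'loss' and 'val_loss' entries stored in the history object (an optional sketch):
# Optional: plot training and validation loss over epochs
plt.plot(my_history.history['loss'], label='Training Loss')
plt.plot(my_history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.grid(True)
plt.show()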
It’s important to note that, due to the small data set size and the randomness involved in splitting the data and initializing the model weights, these results may vary slightly if the code is run multiple times. Techniques like cross-validation, stratified splits, increasing the data set size, or using multiple random seeds might be employed to mitigate this problem and improve the reliability of evaluations.
The model achieves steady performance after approximately 20 epochs, showing that increasing the number of epochs further does not significantly improve accuracy. While the small gap between training and validation accuracy demonstrates good generalization, further improvements, such as hyperparameter tuning or regularization, could enhance performance on the validation data.
This visualization is a valuable tool for diagnosing and addressing potential issues in the training process, such as the need for more epochs, better regularization, or refined model architecture.
In conclusion, this basic neural network application shows good generalization, but there are many ways to improve its performance, such as hyperparameter tuning, regularization, or exploring different architectures. These improvements can help refine the model for more complex tasks and better results.
To explore more advanced techniques and gain a deeper understanding of neural networks, let’s look at some further resources that can help you enhance your knowledge and skills in this area.
Further Resources
To deepen your understanding and explore more on neural networks and machine learning, check out the following resources:
- Neural Network by Wikipedia
- Build a Neural Network & Make Predictions by Real Python
- Manually Optimize Neural Network Models by Machine Learning Mastery
- TensorFlow Website
- scikit-learn Website
- Python Tutorials by Statistics Globe
Data Attribution: This tutorial utilizes data obtained from kaggle.com. We acknowledge the contributors as the primary source of this data set, which significantly enhances the educational value of our tutorial. For more detailed information, please visit the corresponding pages of the data set on the kaggle website.
This tutorial has walked you through building and training a neural network in Python, but there’s still so much more to explore! If you have any questions or feedback, feel free to leave a comment below!