Extracting features from images with a pre-trained model is a common transfer learning technique, shown here in Python. It uses a neural network that has already been trained on a large dataset to capture relevant patterns and features from images. Instead of training a model from scratch, you leverage the knowledge and learned features present in the pre-trained model. Pre-trained models are typically trained on large datasets for tasks like image classification, and their learned features can serve as a foundation for other tasks, even if your specific task is different.
Extracting features from an image using a pre-trained model includes loading the model, preprocessing the image, passing it through the model, extracting features from the model's output, and then preparing these features for use in downstream tasks.
You can watch the video tutorial below for a step-by-step explanation.
Import Modules
import os
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from tensorflow.keras.models import Model
os - provides a way to interact with the operating system's functionalities. It offers a variety of methods to work with files, directories, and other operating system-related tasks.
tensorflow.keras.applications.vgg16 - is a module in the TensorFlow library that provides the VGG16 model architecture along with pre-trained weights. VGG16 is a convolutional neural network architecture that was developed by the Visual Geometry Group (VGG) at the University of Oxford.
tensorflow.keras.preprocessing.image - provides tools for working with image data. It offers various functions to load, preprocess, and augment images, making it easier to prepare image datasets for training deep learning models.
tensorflow.keras.models - provides tools for creating and working with neural network models using the Keras API. This module includes classes and functions that allow you to define, compile, train, and evaluate deep learning models.
Load the Model
First, load the chosen pre-trained model. Here you will be using the VGG16 model.
# load the model
model = VGG16()
# restructure the model
model = Model(inputs=model.inputs, outputs=model.layers[-2].output)
# print a summary (summary() prints directly and returns None)
model.summary()
The VGG16 model is loaded using its default configuration, which includes the fully connected layers at the top for classification.
Then the code uses the Keras functional API to restructure the model. It creates a new model where the input is the same as the original model's input, and the output is the output of the second-to-last layer (penultimate layer) in the original model. By doing this, you remove the final classification layer of VGG16 and retain the output just before the classification.
Finally, print a summary of the restructured model, showing the layers and their shapes.
After restructuring, the output of this modified VGG16 model is the output of its second-to-last layer (the fully connected layer named fc2), which is a 4096-dimensional feature vector for each input image. These vectors can then be used for feature extraction and transfer learning without training the entire VGG16 architecture from scratch.
Extract Features
Next, preprocess each image and extract its features.
features = {}
directory = 'image data/'
for image in os.listdir(directory):
    image_path = directory+image
    # load the image
    img = load_img(image_path, target_size=(224, 224))
    # convert pixel to numpy array
    img = img_to_array(img)
    # reshape the image for the model
    img = img.reshape((1, img.shape[0], img.shape[1], img.shape[2]))
    # preprocess the image
    img = preprocess_input(img)
    # extract features
    feature = model.predict(img, verbose=0)
    # store feature
    features[image_path] = feature
First, an empty dictionary called features is initialized to store the extracted features, and directory is set to the path where your images are located.
Then the loop iterates through the files in the specified directory. For each image file, it constructs the full path by concatenating the directory with the image filename.
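The loop above assumes that every entry in the directory is an image. As a minimal sketch of one way to guard against sub-directories and stray files, the helper below keeps only files with an image-like extension. The names list_image_paths and IMAGE_EXTENSIONS, and the choice of extensions, are assumptions made for illustration and are not part of the original code.

```python
import os
import tempfile

# Assumed set of extensions to treat as images (adjust for your data)
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png')


def list_image_paths(directory):
    """Return the sorted full paths of image files directly inside directory."""
    paths = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # skip sub-directories and files that do not look like images
        if os.path.isfile(path) and name.lower().endswith(IMAGE_EXTENSIONS):
            paths.append(path)
    return paths


# Small demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ('1.jpg', '2.PNG', 'notes.txt'):
        open(os.path.join(demo_dir, name), 'w').close()
    demo_names = [os.path.basename(p) for p in list_image_paths(demo_dir)]
    print(demo_names)
```

In the extraction loop, you could iterate over list_image_paths(directory) instead of os.listdir(directory) and use each returned path directly.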
Inside the loop, the code uses the load_img() function to load the image from the specified image_path. The target_size argument resizes the image to a size of (224, 224) pixels.
Next, img_to_array() converts the image to a NumPy array. This step is necessary for further processing and for feeding the image into the model.
The NumPy array is then reshaped into the format expected by the model. The shape (1, img.shape[0], img.shape[1], img.shape[2]) adds a leading batch dimension, creating a batch of one image.
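To make the batch dimension concrete, here is a small NumPy-only sketch. It uses an array of zeros as a stand-in for the output of img_to_array() and shows that np.expand_dims gives the same result as the explicit reshape; either form could be used.

```python
import numpy as np

# Stand-in for the array returned by img_to_array(): height x width x channels
img = np.zeros((224, 224, 3), dtype=np.float32)

# The reshape used in the tutorial: add a leading batch dimension of size 1
batch_reshape = img.reshape((1, img.shape[0], img.shape[1], img.shape[2]))

# An equivalent alternative that does not spell out each dimension
batch_expand = np.expand_dims(img, axis=0)

print(batch_reshape.shape)
print(batch_expand.shape)
```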
Then preprocess_input() applies the preprocessing that the model expects. This function is specific to the pre-trained model you're using: for VGG16 it converts the image from RGB to BGR and zero-centers each color channel with respect to the ImageNet dataset, without scaling the pixel values.
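As an illustration of what this preprocessing step amounts to for VGG16, the sketch below reimplements it in plain NumPy: reverse the channel order from RGB to BGR, then subtract the per-channel ImageNet means. The function name vgg16_style_preprocess is hypothetical, and this is only meant to show the arithmetic; in real code, keep using preprocess_input() from the library.

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by Caffe-style preprocessing
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def vgg16_style_preprocess(batch_rgb):
    """Illustrative only: RGB -> BGR, then subtract the ImageNet channel means."""
    batch_bgr = batch_rgb[..., ::-1].astype(np.float32)  # reverse the channel axis
    return batch_bgr - IMAGENET_BGR_MEANS                # zero-center, no scaling


# A batch containing a single image that consists of a single RGB pixel
pixel = np.array([[[[255.0, 128.0, 0.0]]]], dtype=np.float32)
out = vgg16_style_preprocess(pixel)
print(out)
```

Note that the values are not scaled to the range 0 to 1, so negative numbers in the preprocessed array are expected.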
The code then uses the pre-trained model (here, VGG16) to predict features from the preprocessed image. The verbose=0 argument suppresses the progress output.
Finally, the extracted features are stored in the features dictionary, where the key is the image's full path, and the value is the feature representation extracted from that image.
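Many downstream tools expect a single 2-D array rather than a dictionary. The sketch below shows one way to stack the per-image arrays into a matrix with one row per image. It uses randomly generated stand-in values with the same layout as the features dictionary (path mapped to an array of shape (1, 4096)); the names paths and feature_matrix are assumptions for illustration.

```python
import numpy as np

# Stand-in dictionary with the same layout as `features`:
# image path -> array of shape (1, 4096)
rng = np.random.default_rng(0)
features = {
    f'image data/{i}.jpg': rng.random((1, 4096), dtype=np.float32)
    for i in (1, 2, 3)
}

# Keep the paths in a list so that row i of the matrix belongs to paths[i]
paths = list(features.keys())
feature_matrix = np.vstack([features[p] for p in paths])

print(feature_matrix.shape)
```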
Display the Features
Let us look at the extracted feature representation for one specific image.
features['image data/1.jpg']
array([[0. , 0. , 0.85335493, ..., 0. , 0. , 0. ]], dtype=float32)
This looks up the entry stored in the features dictionary under the key 'image data/1.jpg' and returns the feature representation extracted from that image, an array of shape (1, 4096).
You can store the dictionary on disk using pickle, then reload it later and reuse the features in other projects.
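A minimal sketch of saving and reloading the dictionary with pickle is shown below. The dictionary here is a stand-in with the same layout as features, and the file name features.pkl and its location in the system's temporary directory are assumptions; choose a path that suits your project.

```python
import os
import pickle
import tempfile

import numpy as np

# Stand-in for the `features` dictionary built during extraction
features = {'image data/1.jpg': np.zeros((1, 4096), dtype=np.float32)}

# Hypothetical output location
pickle_path = os.path.join(tempfile.gettempdir(), 'features.pkl')

# Save the dictionary to disk
with open(pickle_path, 'wb') as f:
    pickle.dump(features, f)

# Reload it later, for example in another script or notebook
with open(pickle_path, 'rb') as f:
    loaded = pickle.load(f)

print(list(loaded.keys()))
```

Only unpickle files that come from a source you trust, since loading a pickle can execute arbitrary code.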
Final Thoughts
Pre-trained models have learned rich hierarchical features from large datasets. By using these features, you can reuse the knowledge the model captured during its original training for your own task.
Extracting features from images is much faster and requires fewer computational resources than training a full model from scratch. This can be crucial, especially when working with limited resources.
The features extracted from a pre-trained model can be adapted for various tasks, such as classification, object detection, clustering, and more. You can build custom models on top of these features, often requiring less training data.
Pre-trained models are trained on diverse datasets, resulting in features that generalize well across different domains. This can be especially useful when your own dataset is small or specific.
The choice of pre-trained model depends on your specific task and dataset characteristics. Some models might be more suitable for certain types of images, such as fine-grained classification or object detection.
It's often a good practice to normalize the extracted features before using them as inputs for downstream models. Normalization can enhance convergence and stability.
Depending on your use case, you might choose to remove the top classification layers from the pre-trained model to focus on feature extraction.
Properly preprocess your input images to match the requirements of the pre-trained model. This includes resizing, normalization, and other required transformations.
If you're using the features for a classification task, you'll likely need to train a new classifier on top of the extracted features. Be prepared to handle differences in class distribution, data quality, and any other specific requirements.
As with any machine learning technique, experimentation is key. Try different pre-trained models, feature extraction methods, and downstream tasks to find the best approach for your specific problem.
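Two of the points above, normalizing the extracted features and training a new classifier on top of them, can be sketched together. The example below uses L2 normalization and a nearest-centroid classifier, which is a deliberately simple stand-in for whatever classifier you would actually train. The data is a tiny made-up set of 2-dimensional vectors standing in for the 4096-dimensional VGG16 features, and the function names are assumptions for illustration.

```python
import numpy as np


def l2_normalize(matrix, eps=1e-12):
    """Scale each row to unit length (eps avoids division by zero)."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix / np.maximum(norms, eps)


def fit_centroids(X, y):
    """Compute one mean feature vector (centroid) per class."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict(X, classes, centroids):
    """Assign each row of X to the class with the closest centroid."""
    distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[distances.argmin(axis=1)]


# Made-up stand-in features: two samples per class
X_train = l2_normalize(np.array([[2.0, 0.1], [4.0, 0.4], [0.2, 3.0], [0.1, 1.0]]))
y_train = np.array([0, 0, 1, 1])

classes, centroids = fit_centroids(X_train, y_train)

# New samples are normalized the same way before prediction
X_new = l2_normalize(np.array([[5.0, 0.5], [0.3, 6.0]]))
predictions = predict(X_new, classes, centroids)
print(predictions)
```

With real features, you would replace the made-up arrays with the stacked VGG16 feature vectors and your own labels, and likely swap in a stronger classifier.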
In summary, extracting features from images using pre-trained models is a versatile and efficient strategy that can save time and computational resources and can potentially lead to improved model performance. It's a valuable technique to have in your toolbox when working on image-related tasks in machine learning and computer vision.
Get the project notebook from here
Thanks for reading the article!!!
Check out more project videos from the YouTube channel Hackers Realm