Saving a Machine Learning Model:

When we work with the scikit-learn library in machine learning, we need to save the trained model so that we can use it in the future, for example to compare it with other models or to test it on new data. Saving the model is known as serialization, and restoring it is known as deserialization.

We also work with data of different sizes. Smaller datasets are easy to train on and take less time, but training on large data (e.g. 1 GB) is time-consuming on a local machine, even with a GPU. So, to avoid wasting time, we store the trained model so that it can be used at any time in the future.

We can save a model in scikit-learn in two ways:
  1. Pickle string: The pickle module implements a fundamental but powerful algorithm for serializing and de-serializing a Python object structure. In Python 3, the serialized result is a bytes object.

Saving Machine Learning Model in Python for Data Science - PST Analytic

Example: We will apply k-nearest neighbors to the iris dataset, and then we will save the model.
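
The training step for this example can be sketched as follows. This is a minimal sketch, not necessarily the original code: the 70/30 split, the random seed, and n_neighbors=3 are assumed values chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the iris dataset as a feature matrix X and a label vector y.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for testing.
# The split ratio and the seed are assumptions, not values from the article.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=2018
)

# Fit a k-nearest neighbors classifier; n_neighbors=3 is an assumed value.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Report the accuracy on the held-out samples.
print(knn.score(X_test, y_test))
```

The fitted object knn is what gets serialized in the steps that follow.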

Saving the model to string by using pickle:
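
A minimal sketch of this step, using pickle.dumps and pickle.loads from the standard library. The model is refitted here on the full iris data only to keep the snippet self-contained, and the variable names and n_neighbors=3 are assumptions for illustration.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# Fit a small model so that there is something to serialize.
X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)

# Serialize the fitted model to an in-memory bytes object.
saved_model = pickle.dumps(knn)

# Deserialize the bytes back into a usable model.
knn_from_pickle = pickle.loads(saved_model)

# The restored model should give the same predictions as the original.
print((knn_from_pickle.predict(X) == knn.predict(X)).all())
```

Note that unpickling can execute arbitrary code, so pickle.loads should only be called on data from a trusted source.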

OUTPUT:

  2. Pickled model as a file using joblib: joblib is used in place of pickle when we are dealing with objects that carry large NumPy arrays, because it is more efficient in such cases. Its dump and load functions also accept file-like objects in place of filenames.

Saving to the pickle file using joblib:
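
A minimal sketch of this step with joblib.dump and joblib.load. The file name knn_iris.pkl and the temporary directory are hypothetical choices for illustration, and the model is refitted here only to keep the snippet self-contained. The sketch imports joblib directly, since recent scikit-learn releases no longer ship it under sklearn.externals.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# Fit a small model so that there is something to save.
X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)

# Hypothetical file location: a temporary directory keeps the example tidy.
model_path = os.path.join(tempfile.mkdtemp(), "knn_iris.pkl")

# Write the fitted model to disk.
joblib.dump(knn, model_path)

# Load the model back from the file.
knn_from_joblib = joblib.load(model_path)

# The restored model should give the same predictions as the original.
print((knn_from_joblib.predict(X) == knn.predict(X)).all())
```

As with pickle, a file should only be loaded with joblib.load if it comes from a trusted source.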

OUTPUT:

To learn more about machine learning models and how to deploy them in Python for data science, you can check this and this as well.
