Detecting Cassava Leaf Disease, Part 3

Review:

As introduced in the first blog post in this series, cassava is an extremely important crop in Africa, where it is the second-largest source of carbohydrates. Because the crop is so vital, both to the people who consume it and to the farmers whose livelihoods depend on it, detecting and classifying Cassava Leaf Disease through machine learning is an important task. Part 1 of this series covered loading data from the Kaggle API, Exploratory Data Analysis (EDA), and a simple majority-class baseline model with an accuracy of 61.5%. In Part 2, we preprocessed our image data for more complex models and built three different CNNs: a Keras Sequential model, a GoogLeNet model, and a ResNet50 model. All three narrowly improved our validation accuracy. We left off Part 2 discussing the changes we planned to make to improve our validation accuracy. In this third and final installment, we discuss the changes we made to both our preprocessing and our models to achieve stronger validation accuracy and a better Kaggle submission score.

Adding a VGG model:

All of our previous models were performing adequately, but none of them really stood out. So we decided to try our hand at a new model, VGG. As we discussed in our previous post, transfer learning is an excellent and widely applicable technique because it loads a pretrained model, with its learned weights and biases, to be built upon and adapted to our specific problem. VGG16 was a very successful model on the ImageNet dataset, achieving roughly 92.7% top-5 accuracy after weeks of training.
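
To make this concrete, here is a minimal sketch of transfer learning with VGG16 in Keras. The input size, classification head, and optimizer settings below are illustrative assumptions rather than our exact configuration; the key idea is loading the ImageNet weights without the original classifier and freezing the convolutional base.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Load VGG16 pretrained on ImageNet, without its original classifier head.
# The 100 x 100 input size here is an assumption for illustration.
base = VGG16(weights="imagenet", include_top=False, input_shape=(100, 100, 3))
base.trainable = False  # freeze the pretrained convolutional layers

# Attach a small classification head for the 5 cassava leaf disease classes
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(5, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```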

Fine-Tuning our other models (again, with little success):

Similarly to our work with VGG16, we tried fine-tuning our other models to see whether adjustments to hyperparameters or model architecture would squeeze out additional validation accuracy. Unfortunately, we were looking for quite a big jump: as a refresher, our validation accuracy scores were sitting around the 65% to 70% mark, depending on the model. Small changes to our hyperparameters led, in some cases, to an extra percentage point at most, but not the large jump we were optimistically hoping for.
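
For illustration, the sweeps we ran looked roughly like the sketch below. The specific values, the build_model() helper, and the array names are hypothetical stand-ins, not our actual code.

```python
# Try a few learning rate / batch size combinations and record the best
# validation accuracy each one reaches. X_train, y_train, X_val, y_val and
# build_model() are assumed to exist; they are placeholders for this sketch.
results = {}
for lr in (1e-2, 1e-3, 1e-4):
    for batch_size in (16, 32, 64):
        model = build_model(learning_rate=lr)  # hypothetical model factory
        history = model.fit(
            X_train, y_train,
            validation_data=(X_val, y_val),
            batch_size=batch_size,
            epochs=10,
            verbose=0,
        )
        results[(lr, batch_size)] = max(history.history["val_accuracy"])

best_config = max(results, key=results.get)
print(best_config, results[best_config])
```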

The breakthrough!

Previously, we had settled on resizing our images to 100 x 100 pixels. This was because we used a for loop to read each individual image, resize it, and append it to a new list, a process that was hugely expensive in terms of RAM. While it was clear that shrinking our images down to 100 x 100 threw away a lot of valuable information, we simply did not have enough RAM to use larger images: every time we attempted 224 x 224, our notebook crashed and suggested upgrading to Colab Pro. We caved! Oops! Capitalism! Unfortunately, even Colab Pro did not offer enough RAM for our inefficient method of preprocessing the images.
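
A quick back-of-the-envelope calculation shows why holding every resized image in memory at once falls apart at 224 x 224. The image count of roughly 21,000 is approximate, and the assumption of float32 pixels is ours.

```python
# Rough memory footprint of keeping the whole resized dataset in RAM.
n_images = 21_000                          # approximate training set size
bytes_per_image_100 = 100 * 100 * 3 * 4    # float32 RGB pixels
bytes_per_image_224 = 224 * 224 * 3 * 4

print(f"100 x 100: ~{n_images * bytes_per_image_100 / 1e9:.1f} GB")  # ~2.5 GB
print(f"224 x 224: ~{n_images * bytes_per_image_224 / 1e9:.1f} GB")  # ~12.6 GB
```

At roughly 12.6 GB for the raw arrays alone, before any copies made during preprocessing or training, it is not surprising that a standard Colab runtime fell over.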

Our final model / submission:

We achieved our best accuracy with the VGG model and transfer learning. Initially, we froze the base layers of the model and trained it with a batch size of 32 and a learning rate of 0.001. After training for 10 epochs, we unfroze the base layers of the VGG model, added a learning rate schedule so the learning rate would decrease as we fine-tuned, and incorporated early stopping to avoid overfitting. The model trained for another 20 epochs and achieved a validation accuracy of 83%.
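
Continuing the earlier VGG sketch, the two-phase procedure looked roughly like the following. The specific callbacks and the lower fine-tuning learning rate are assumptions on our part; ReduceLROnPlateau is simply one common way to implement a decreasing learning rate.

```python
import tensorflow as tf
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

# Phase 1: train only the new classification head with the VGG base frozen.
# X_train, y_train, X_val, y_val are placeholders for our preprocessed data.
base.trainable = False
model.fit(X_train, y_train, validation_data=(X_val, y_val),
          epochs=10, batch_size=32)

# Phase 2: unfreeze the base and fine-tune the whole network.
base.trainable = True
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # smaller LR for fine-tuning
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

callbacks = [
    # Decrease the learning rate when validation loss plateaus
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2),
    # Stop early and keep the best weights to avoid overfitting
    EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
]

model.fit(X_train, y_train, validation_data=(X_val, y_val),
          epochs=20, batch_size=32, callbacks=callbacks)
```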

Lessons learned and takeaways:

For us, the main takeaways lay in the importance of preprocessing, and of doing it in a way that left our working environment with enough memory to run deep models. Through exploratory data analysis (EDA) and proper preprocessing, we learned that hyperparameter tuning alone will not produce the best model. As data scientists, it is critical that we look back at our preprocessing steps and make sure the data is organized in a way that will yield the best results later in the pipeline.

Potential Improvements:

With additional time and memory resources, we would hope to train our models for more epochs (between 50 and 100). We also learned about Google AI Notebooks, which would allow us to use higher computing power to run these models. With that access, we could build deeper models and perform additional hyperparameter tuning to strengthen our predictive model.
