Rubbish Classifier - Computer Vision Classification
Project screenshots
Feel free to check them out.
PyTorch
Python
Jupyter
When it comes to rubbish separation, we only do it properly some of the time, whether for lack of information about the correct way of doing it or out of laziness, wishing someone or something would tell us where each item goes. That is the reason this application came to exist.
As a disclaimer, the technology used in this project is at a beginner level: the idea was to experiment with a miniature version of a CNN and later improve it with other techniques, such as a state-of-the-art image classification model, transfer learning and fine-tuning.
Therefore, the idea was to implement a model able to classify images into nine classes of rubbish:
But, as specified in the disclaimer, this is a beginner-level implementation meant to experiment with computer vision classification. As a result, only the first four classes were used.
To develop the application, a CNN was trained on self-collected data using PyTorch. The network architecture was based on the Tiny VGG used in CNN Explainer, which provides a graphical view of the network along with its layers, parameters and dimensions.
The layers, loss function and optimizer chosen were the following:
Conv2d()
ReLU()
Conv2d()
ReLU()
MaxPool2d()
Flatten()
Linear()
CrossEntropyLoss()
Adam Optimizer
The model layers and output shapes were the following:
==========================================================================================
Layer (type:depth-idx) Output Shape Param #
==========================================================================================
TinyVGG [32, 4] --
├─Sequential: 1-1 [32, 10, 126, 126] --
│ └─Conv2d: 2-1 [32, 10, 254, 254] 280
│ └─ReLU: 2-2 [32, 10, 254, 254] --
│ └─Conv2d: 2-3 [32, 10, 252, 252] 910
│ └─ReLU: 2-4 [32, 10, 252, 252] --
│ └─MaxPool2d: 2-5 [32, 10, 126, 126] --
├─Sequential: 1-2 [32, 10, 61, 61] --
│ └─Conv2d: 2-6 [32, 10, 124, 124] 910
│ └─ReLU: 2-7 [32, 10, 124, 124] --
│ └─Conv2d: 2-8 [32, 10, 122, 122] 910
│ └─ReLU: 2-9 [32, 10, 122, 122] --
│ └─MaxPool2d: 2-10 [32, 10, 61, 61] --
├─Sequential: 1-3 [32, 4] --
│ └─Flatten: 2-11 [32, 37210] --
│ └─Linear: 2-12 [32, 4] 148,844
==========================================================================================
Total params: 151,854
Trainable params: 151,854
Non-trainable params: 0
Total mult-adds (G): 3.31
==========================================================================================
Input size (MB): 25.17
Forward/backward pass size (MB): 405.20
Params size (MB): 0.61
Estimated Total Size (MB): 430.97
==========================================================================================
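The summary above can be reproduced with a sketch like the following. Note that the 3-channel 256×256 input size, the hidden size of 10 and the class count of 4 are assumptions inferred from the output shapes, not values stated elsewhere in the project:

```python
import torch
from torch import nn


class TinyVGG(nn.Module):
    """Tiny VGG-style CNN: two conv blocks followed by a linear classifier."""

    def __init__(self, in_channels: int = 3, hidden_units: int = 10, num_classes: int = 4):
        super().__init__()
        self.block_1 = nn.Sequential(
            nn.Conv2d(in_channels, hidden_units, kernel_size=3),   # 256 -> 254
            nn.ReLU(),
            nn.Conv2d(hidden_units, hidden_units, kernel_size=3),  # 254 -> 252
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),                           # 252 -> 126
        )
        self.block_2 = nn.Sequential(
            nn.Conv2d(hidden_units, hidden_units, kernel_size=3),  # 126 -> 124
            nn.ReLU(),
            nn.Conv2d(hidden_units, hidden_units, kernel_size=3),  # 124 -> 122
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),                           # 122 -> 61
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(hidden_units * 61 * 61, num_classes),        # 37210 -> 4
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.block_2(self.block_1(x)))


model = TinyVGG()
logits = model(torch.randn(1, 3, 256, 256))
print(logits.shape)  # torch.Size([1, 4])
```

Instantiating this model and summing its parameters gives the same 151,854 total reported above.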
After 20 epochs, the training and test results were the following:
Epoch: 0 | train_loss: 1.3578 | train_acc: 0.3380 | test_loss: 1.4371 | test_acc: 0.2344
Epoch: 1 | train_loss: 1.3505 | train_acc: 0.3745 | test_loss: 1.4166 | test_acc: 0.2344
Epoch: 2 | train_loss: 1.3462 | train_acc: 0.3745 | test_loss: 1.4127 | test_acc: 0.2344
Epoch: 3 | train_loss: 1.3365 | train_acc: 0.3823 | test_loss: 1.4370 | test_acc: 0.2344
Epoch: 4 | train_loss: 1.3153 | train_acc: 0.4101 | test_loss: 1.4134 | test_acc: 0.3203
Epoch: 5 | train_loss: 1.2436 | train_acc: 0.4605 | test_loss: 1.3471 | test_acc: 0.4089
Epoch: 6 | train_loss: 1.1433 | train_acc: 0.5326 | test_loss: 1.2943 | test_acc: 0.4583
Epoch: 7 | train_loss: 1.0717 | train_acc: 0.5595 | test_loss: 1.3672 | test_acc: 0.4167
Epoch: 8 | train_loss: 0.9518 | train_acc: 0.6047 | test_loss: 1.2811 | test_acc: 0.4792
Epoch: 9 | train_loss: 0.8666 | train_acc: 0.6456 | test_loss: 1.2530 | test_acc: 0.4375
Epoch: 10 | train_loss: 0.7554 | train_acc: 0.6968 | test_loss: 1.4066 | test_acc: 0.4922
Epoch: 11 | train_loss: 0.6378 | train_acc: 0.7655 | test_loss: 1.4421 | test_acc: 0.4818
Epoch: 12 | train_loss: 0.4516 | train_acc: 0.8732 | test_loss: 1.2420 | test_acc: 0.5260
Epoch: 13 | train_loss: 0.3104 | train_acc: 0.9010 | test_loss: 1.3614 | test_acc: 0.5729
Epoch: 14 | train_loss: 0.1869 | train_acc: 0.9375 | test_loss: 1.8353 | test_acc: 0.5365
Epoch: 15 | train_loss: 0.1195 | train_acc: 0.9618 | test_loss: 1.7163 | test_acc: 0.5104
Epoch: 16 | train_loss: 0.0646 | train_acc: 0.9792 | test_loss: 2.0551 | test_acc: 0.5365
Epoch: 17 | train_loss: 0.0387 | train_acc: 0.9948 | test_loss: 2.0905 | test_acc: 0.5859
Epoch: 18 | train_loss: 0.0206 | train_acc: 0.9974 | test_loss: 2.0580 | test_acc: 0.5703
Epoch: 19 | train_loss: 0.0107 | train_acc: 1.0000 | test_loss: 2.2457 | test_acc: 0.5573
Total training time: 942.753 seconds.
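The per-epoch log above is what a standard PyTorch loop combining CrossEntropyLoss() and the Adam optimizer prints. Here is a minimal self-contained sketch; the synthetic tensors and the linear stand-in model are placeholders, not the project's real dataset or network:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data: 64 small RGB images across 4 classes.
images = torch.randn(64, 3, 32, 32)
labels = torch.randint(0, 4, (64,))
loader = DataLoader(TensorDataset(images, labels), batch_size=32)

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 4))  # stand-in for TinyVGG
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(3):
    model.train()
    train_loss, train_acc = 0.0, 0.0
    for X, y in loader:
        logits = model(X)
        loss = loss_fn(logits, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        train_loss += loss.item()
        train_acc += (logits.argmax(dim=1) == y).float().mean().item()
    n = len(loader)
    print(f"Epoch: {epoch} | train_loss: {train_loss / n:.4f} | train_acc: {train_acc / n:.4f}")
```

A matching evaluation pass over the test set (under `torch.inference_mode()`) produces the `test_loss` and `test_acc` columns.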
As we can see, the test results could be better: while the training accuracy reaches 100%, the test accuracy plateaus around 55%. The causes are the low complexity of the model and the small amount of data used. As said before, the idea was to start from here and implement a better model with better techniques.
If you feel like going deeper into the project and getting detailed information about its implementation and results, check this notebook.