These results show that Adam and its variant Adamax give the best results. Adam and
its variants were also observed to converge faster. In contrast, models trained with SGD
learned very slowly and saturated much earlier, especially on the age estimation task.
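
To make the difference between these update rules concrete, the following minimal sketch implements a single scalar update step for SGD, Adam, and Adamax in plain Python. The function names are illustrative, and the hyperparameter values (learning rate, beta1, beta2, epsilon) are commonly used defaults that are assumed here; they are not necessarily the settings used in the experiments above.

```python
import math

def sgd_step(w, g, lr=0.001):
    # Plain SGD: the step is directly proportional to the gradient.
    return w - lr * g

def adam_step(w, g, state, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: step uses bias-corrected running means of g and g^2.
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * g
    state["v"] = b2 * state["v"] + (1 - b2) * g * g
    m_hat = state["m"] / (1 - b1 ** state["t"])
    v_hat = state["v"] / (1 - b2 ** state["t"])
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)

def adamax_step(w, g, state, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # Adamax: replaces the second-moment estimate with a decayed
    # running maximum of |g|. The eps term is an assumption added
    # here only to guard against division by zero.
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * g
    state["u"] = max(b2 * state["u"], abs(g))
    return w - (lr / (1 - b1 ** state["t"])) * state["m"] / (state["u"] + eps)

# Toy comparison with a small gradient (made-up value, not experimental data).
grad = 1e-4
lr = 0.001
w_sgd = sgd_step(0.0, grad, lr=lr)
w_adam = adam_step(0.0, grad, {"t": 0, "m": 0.0, "v": 0.0}, lr=lr)
w_adamax = adamax_step(0.0, grad, {"t": 0, "m": 0.0, "u": 0.0}, lr=lr)
print(w_sgd, w_adam, w_adamax)
```

In this toy setting the first Adam and Adamax steps have a magnitude close to the learning rate, while the SGD step is scaled by the gradient and is therefore far smaller. This is one plausible contributor to slow SGD progress when gradients are small, but it does not replace the empirical comparison reported above.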
5.2 Transfer Learning
Table 8 compares the performance of our models according to the feature extractor
whose features they were trained on.
Table 8. Comparison based on feature extractors

Feature Extractor    Age Estimation (MAE)    Gender Classification (Accuracy, %)
VGG_f                4.86                    93.42
ResNet50_f           4.65                    94.64
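
For reference, the two metrics reported in Table 8 can be computed as in the following minimal sketch. The function names and the toy inputs are hypothetical and serve only to illustrate the definitions; they are not data from our experiments. MAE is presumably expressed in years, and accuracy as a percentage.

```python
def mean_absolute_error(true_ages, predicted_ages):
    # Average absolute difference between true and predicted ages.
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)

def accuracy_percent(true_labels, predicted_labels):
    # Share of correctly classified samples, expressed in percent.
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return 100.0 * correct / len(true_labels)

# Made-up toy values for illustration only.
toy_mae = mean_absolute_error([20, 30, 40], [22, 27, 40])
toy_acc = accuracy_percent([0, 1, 1, 0], [0, 1, 0, 0])
print(toy_mae, toy_acc)
```

A lower MAE indicates better age estimation, whereas a higher accuracy indicates better gender classification, so the two columns of Table 8 should be read in opposite directions.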