Ignore some weak “Folds” in Cross-validation

Calculate and compare the accuracy of each “Fold” before using it.

Cross-validation (Image by Author)

Ignore some weak Folds

Ignore some weak Folds (Image by Author)
Overfitting (Sofa Designer: Fabio Novembre)

Another evaluation for selecting weak Folds

Increasing the number of folds generally reduces the risk of “Overfitting”, and only in that case is it reasonable to ignore a few weak folds. If we plan for this from the beginning, we can also achieve another goal that is sometimes very important: we can use a new type of evaluation to select the weak folds. This new evaluation can differ from the challenge (or project) evaluation and complement it.
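As a rough illustration of selecting weak folds with a separate evaluation, here is a minimal sketch. It assumes each fold has already been scored with some secondary metric (higher is better) and has produced test predictions. The names `select_folds`, `average_predictions`, `max_dropped`, and `tolerance`, as well as all the numbers, are hypothetical and only serve the example.

```python
from statistics import mean


def select_folds(fold_scores, max_dropped=1, tolerance=0.02):
    """Return the indices of the folds to keep.

    A fold counts as weak when its score is more than `tolerance`
    below the mean score of all folds. At most `max_dropped` of the
    weakest folds are ignored (hypothetical rule for this sketch).
    """
    avg = mean(fold_scores)
    weak = [i for i, s in enumerate(fold_scores) if s < avg - tolerance]
    # Drop only the lowest-scoring weak folds, up to the allowed limit.
    weak = sorted(weak, key=lambda i: fold_scores[i])[:max_dropped]
    return [i for i in range(len(fold_scores)) if i not in weak]


def average_predictions(fold_predictions, kept):
    """Average the test predictions of the kept folds, sample by sample."""
    n_samples = len(fold_predictions[0])
    return [mean(fold_predictions[i][j] for i in kept) for j in range(n_samples)]


# Made-up secondary-metric scores for five folds; fold 2 is clearly weaker.
fold_scores = [0.81, 0.80, 0.62, 0.82, 0.79]
# Made-up test predictions (two test samples) from each fold's model.
fold_predictions = [[0.9, 0.2], [0.8, 0.3], [0.4, 0.6], [0.9, 0.1], [0.8, 0.2]]

kept = select_folds(fold_scores)
blended = average_predictions(fold_predictions, kept)
print(kept, blended)
```

Capping the number of dropped folds keeps most of the training signal in the ensemble, which matters because ignoring folds is only sensible when there are many of them.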

Once again, training with all the dataset

In the above notebook, we trained once more on the entire dataset and ensembled the result with the previous results; the final score was slightly better. Of course, the score may not always improve.
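The idea can be sketched roughly as follows. This is not the code from the notebook; it assumes scikit-learn, a synthetic dataset, a `LogisticRegression` model, and an equal-weight blend, all chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, train_test_split

# Synthetic data standing in for the real training and test sets (assumption).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 1: the usual cross-validation, averaging each fold model's test predictions.
fold_preds = []
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, _ in kfold.split(X_train):
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train[train_idx], y_train[train_idx])
    fold_preds.append(model.predict_proba(X_test)[:, 1])
cv_pred = np.mean(fold_preds, axis=0)

# Step 2: train once more, this time on the entire training set.
full_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
full_pred = full_model.predict_proba(X_test)[:, 1]

# Step 3: ensemble both results (the 50/50 weights are an arbitrary choice).
final_pred = 0.5 * cv_pred + 0.5 * full_pred
```

The blend weights are worth tuning on a held-out set, since the full-data model has no validation score of its own.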

Can changing the value of “random_state” improve our score?

Yes, the score may improve. You can even go one step further: keep the model fixed, change the value of “random_state” several times, and ensemble the results of these runs. This simple step may improve your score a little. Clever participants have used this trick many times in various Kaggle challenges. If you want more information, you can look at the public notebooks below, for example.
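Independently of those notebooks, the trick can be sketched in a few lines. This example assumes scikit-learn, a synthetic dataset, and a `RandomForestClassifier`; the seed list and the plain average are illustrative choices, not a recommendation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data standing in for a real challenge dataset (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Same model and hyperparameters every time; only random_state changes.
seeds = [0, 1, 2, 3, 4]
seed_preds = []
for seed in seeds:
    model = RandomForestClassifier(n_estimators=50, random_state=seed)
    model.fit(X_train, y_train)
    seed_preds.append(model.predict_proba(X_test)[:, 1])

# Ensemble the runs by averaging their predicted probabilities.
ensemble_pred = np.mean(seed_preds, axis=0)
```

Averaging over seeds mainly smooths out the randomness of a single run, so any gain tends to be small, in line with the "a little" above.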


Senior Civil Structural Engineer, Kaggle Master, Researcher, Developer. https://www.soliset.com, https://www.farsi.media, https://www.newchains.info
