A 2D design space with two parameters was created using different sampling methods: grid, Latin hypercube sampling (LHS), random, and antithetic versions of the last two. The study space is covered with 100, 225, 625, 1225, or 2500 sample points.
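The sampling schemes named above can be sketched with the Python standard library alone. This is a minimal illustration, not the procedure used to build the datasets: all function names are hypothetical, the bounds 0.2 and 1 are the ones given in the description, and the antithetic variant assumes the common construction in which each base point is mirrored about the centre of the range. Handling odd sample counts by truncating the mirrored set is also an assumption.

```python
import math
import random

LOWER, UPPER = 0.2, 1.0  # bounds shared by both parameters (from the description)


def scale(u):
    """Map a value from the unit interval [0, 1] to [LOWER, UPPER]."""
    return LOWER + (UPPER - LOWER) * u


def grid_samples(n):
    """Regular grid; n must be a perfect square (e.g. 100 = 10 x 10)."""
    side = math.isqrt(n)
    if side < 2 or side * side != n:
        raise ValueError("n must be a perfect square of at least 4")
    axis = [scale(i / (side - 1)) for i in range(side)]
    return [(x, y) for x in axis for y in axis]


def random_samples(n, rng):
    """Independent uniform draws for both parameters."""
    return [(scale(rng.random()), scale(rng.random())) for _ in range(n)]


def lhs_samples(n, rng):
    """Latin hypercube: each of the n strata per dimension holds one point."""
    columns = []
    for _ in range(2):
        strata = list(range(n))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        columns.append([scale((s + rng.random()) / n) for s in strata])
    return list(zip(columns[0], columns[1]))


def antithetic(sampler, n, rng):
    """Assumed construction: draw half the points, mirror them, keep n."""
    half = (n + 1) // 2
    base = sampler(half, rng)
    mirrored = [(LOWER + UPPER - x, LOWER + UPPER - y) for x, y in base]
    return (base + mirrored)[:n]


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (100, 225, 625, 1225, 2500):
        print(n, len(grid_samples(n)), len(antithetic(lhs_samples, n, rng)))
```

All five sample sizes are perfect squares (10, 15, 25, 35, and 50 points per side), which is why the grid helper can insist on a square count.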
Both parameters range from a lower bound of 0.2 to an upper bound of 1. The design space is based on a geometry characterised by non-linear equations and non-convexity. The synthetic tabular datasets contain the two parameters and pose a binary classification problem: points are “Good” (denoted “1”) if they lie in the interior of the design space and “Bad” (denoted “0”) otherwise.
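To illustrate the labelling rule, the sketch below assigns “1” or “0” to points using a purely hypothetical non-convex region. The crescent shape, its centres, and its radii are assumptions chosen for illustration; the equations that define the actual design space are not given here.

```python
def is_good(x, y):
    """Hypothetical crescent: inside a large disc but outside a smaller one."""
    inside_outer = (x - 0.6) ** 2 + (y - 0.6) ** 2 <= 0.4 ** 2
    inside_inner = (x - 0.75) ** 2 + (y - 0.6) ** 2 < 0.25 ** 2
    return inside_outer and not inside_inner


def label_points(points):
    """Return 1 ("Good") for points inside the region, 0 ("Bad") otherwise."""
    return [1 if is_good(x, y) else 0 for x, y in points]


if __name__ == "__main__":
    # Two "Good" points whose midpoint is "Bad": the region is non-convex.
    print(label_points([(0.6, 0.9), (0.6, 0.3), (0.6, 0.6)]))
```

In this toy region the segment joining two “Good” points passes through a “Bad” point, which is the property that makes a non-convex space harder for a classifier to learn than a convex one.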
Each dataset, which contains the parameter values and the associated target, was used to extract two additional datasets for training, evaluating, and comparing classification models coupled with active learning strategies: (i) the indexes of the initial labelled samples and (ii) the indexes of the initial training samples.
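The description does not say how the two index sets were drawn or how they relate to each other. The following sketch shows one plausible arrangement, under the assumption that the training indexes are a random subset of the dataset and the initial labelled indexes are a smaller random subset of the training indexes. The function name, the 80% training fraction, and the count of 10 initial labels are all hypothetical.

```python
import random


def split_indices(n_samples, train_fraction=0.8, n_initial_labelled=10, seed=0):
    """Draw training indexes, then initial labelled indexes from among them.

    All default values are illustrative assumptions, not the dataset's settings.
    """
    rng = random.Random(seed)
    n_train = int(n_samples * train_fraction)
    if not 0 < n_initial_labelled <= n_train:
        raise ValueError("n_initial_labelled must be between 1 and n_train")
    train_idx = sorted(rng.sample(range(n_samples), n_train))
    labelled_idx = sorted(rng.sample(train_idx, n_initial_labelled))
    return train_idx, labelled_idx


if __name__ == "__main__":
    train_idx, labelled_idx = split_indices(100)
    print(len(train_idx), len(labelled_idx))
```

Storing indexes rather than copies of the rows lets every active learning strategy start from the same initial state, so differences in results can be attributed to the strategy rather than to the starting sample.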