FAST is a software tool designed to simplify feature selection in QSAR modeling, where the very large number of available molecular descriptors makes this step crucial. Given an input dataset of descriptors and the target variables to predict, FAST selects the subset of variables with the best predictive performance and returns a reduced dataset together with the list of selected variables.
Using a full-search approach, FAST proposes a set of candidate solutions and computes an effectiveness score for each, which is then used to determine the best one a posteriori. Specifically, FAST splits the feature selection problem into three sequential steps of increasing computational cost and precision:
- FILTERING: removal of features that are redundant or whose characteristics are incompatible with modeling.
- PRUNING: removal of all features with low or insufficient importance within the models.
- SELECTION: cross-validated selection of features that ensures the optimal bias-variance trade-off.
Progressively reducing the original dataset to smaller subsets brings the search closer to the best solution by concentrating the computational effort on smaller sets of features.
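The three steps can be illustrated with a minimal Python sketch. This is not FAST's implementation: every function name, threshold, and the toy data are hypothetical, absolute correlation with the target stands in for model-based importance in the pruning step, and a leave-one-out 1-nearest-neighbour regressor stands in for whatever models FAST actually cross-validates. The sketch only shows how each stage hands a smaller set of features to the next, more expensive one, and how the final stage scores every candidate subset.

```python
from itertools import combinations
from math import sqrt

def column(rows, j):
    return [r[j] for r in rows]

def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (v - mb) for x, v in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((v - mb) ** 2 for v in b)
    return 0.0 if va == 0 or vb == 0 else cov / sqrt(va * vb)

def filter_step(rows, cols, max_corr=0.95):
    """FILTERING (cheap): drop constant columns and near-duplicates of a kept column."""
    kept = []
    for j in cols:
        xj = column(rows, j)
        if max(xj) == min(xj):
            continue  # constant descriptor carries no information
        if any(abs(pearson(xj, column(rows, k))) > max_corr for k in kept):
            continue  # redundant with a descriptor already kept
        kept.append(j)
    return kept

def prune_step(rows, y, cols, top_k=3):
    """PRUNING: keep the top_k columns by |correlation with y| (assumed importance proxy)."""
    ranked = sorted(cols, key=lambda j: abs(pearson(column(rows, j), y)), reverse=True)
    return sorted(ranked[:top_k])

def loo_knn_error(rows, y, cols):
    """Leave-one-out mean squared error of a 1-nearest-neighbour regressor on cols."""
    total = 0.0
    for i, ri in enumerate(rows):
        _, nearest = min(
            (sum((ri[j] - rj[j]) ** 2 for j in cols), idx)
            for idx, rj in enumerate(rows) if idx != i
        )
        total += (y[i] - y[nearest]) ** 2
    return total / len(rows)

def select_step(rows, y, cols):
    """SELECTION (expensive): score every non-empty subset; best (lowest error) first."""
    return sorted(
        (loo_knn_error(rows, y, subset), subset)
        for size in range(1, len(cols) + 1)
        for subset in combinations(cols, size)
    )

def run_pipeline(rows, y, names, max_corr=0.95, top_k=3):
    cols = filter_step(rows, range(len(names)), max_corr)
    cols = prune_step(rows, y, cols, top_k)
    score, best = select_step(rows, y, cols)[0]
    reduced = [[r[j] for j in best] for r in rows]
    return reduced, [names[j] for j in best], score

# Toy data (hypothetical): one informative descriptor, a scaled copy of it,
# a constant descriptor, and two unrelated descriptors.
signal = [1, 2, 3, 4, 5, 6, 7, 8]
names = ["d_signal", "d_duplicate", "d_constant", "d_noise_a", "d_noise_b"]
columns = [signal, [2 * v for v in signal], [1] * 8,
           [3, 1, 4, 1, 5, 9, 2, 6], [2, 7, 1, 8, 2, 8, 1, 8]]
rows = [list(r) for r in zip(*columns)]
y = list(signal)

reduced, selected, score = run_pipeline(rows, y, names, top_k=2)
print(selected, score)  # ['d_signal'] 1.0
```

On this toy data, filtering discards the duplicate and the constant column, pruning drops the weaker of the two unrelated descriptors, and the exhaustive selection step only has to score three subsets instead of the 31 non-empty subsets of the original five descriptors.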
FAST is the first tool in a complete suite of solutions that Kode Chemoinformatics is developing to support computational chemists, labs, and organizations in the chemistry, pharma, food, and biotechnology fields. The suite guides the client through the whole process of creating, managing, and deploying QSAR models, from feature selection to prediction.
As part of a suite that covers the entire QSAR modeling process, FAST operates in continuity with the suite's other software, ensuring a smooth flow of data to the subsequent steps.