Mokken scale analysis is a popular method for evaluating the psychometric quality of clinical and personality questionnaires and their individual items. Although many empirical papers report the extent to which sets of items form Mokken scales, less attention has been paid to the effects of violating commonly used rules of thumb. In this study, we investigated the practical consequences of retaining or removing items whose psychometric properties do not comply with these rules of thumb. Using simulated data, we found that items with low scalability had some influence on the reliability of test scores, person ordering and selection, and criterion-related validity estimates. Removing the misfitting items from the scale had, in general, a small effect on these outcomes. Although important outcome variables were fairly robust against scale violations in some conditions, we conclude that researchers should not rely exclusively on algorithms that select items automatically. In particular, content validity must be taken into account in order to build sensible psychometric instruments.