Remove Class Imbalance
Without Adding Bias
Class imbalance refers to a situation in which the classes in a dataset are not represented equally. This can create challenges for machine learning models as they tend to be more biased towards the majority class, leading to poor performance and inaccurate predictions for the minority class.
There are several ways to combat class imbalance:
Resampling techniques
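These involve either oversampling the minority class or undersampling the majority class. Oversampling adds minority-class samples to balance the dataset, either by duplicating existing ones (random oversampling) or by synthesizing new ones (SMOTE, the Synthetic Minority Over-sampling Technique). Undersampling reduces the number of majority-class samples, for example by removing them at random or by removing Tomek links, which are pairs of nearest-neighbor samples from opposite classes; typically the majority-class member of each pair is dropped.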
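The two simplest resampling strategies can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function names `random_oversample` and `random_undersample`, the toy data, and the fixed seed are all assumptions made for the example. It covers only random resampling, not SMOTE or Tomek links.

```python
import random
from collections import Counter


def group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest class."""
    rng = random.Random(seed)
    by_class = group_by_class(rows, labels)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement to fill the gap up to the target size.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    by_class = group_by_class(rows, labels)
    target = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample without replacement so no row is kept twice.
        out_rows.extend(rng.sample(members, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical toy data: 8 majority rows (label 0) and 2 minority rows (label 1).
rows = [[float(i), float(i % 3)] for i in range(10)]
labels = [0] * 8 + [1] * 2

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)

print(Counter(over_labels))   # each class now has 8 rows
print(Counter(under_labels))  # each class now has 2 rows
```

Resampling should normally be applied to the training split only. Resampling before splitting can leak duplicated minority rows into the test set and inflate the measured performance.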
Generate new features
If the existing features do not discriminate well between classes, creating new features can help improve the model's performance. Feature engineering techniques, such as transforming and combining existing features, can produce features that separate the minority class from the majority class more clearly.
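As a small illustration of transforming and combining features, the sketch below derives two new features from two existing ones. Everything specific here is an assumption for the example: the field names `amount` and `avg_amount`, the helper `add_engineered_features`, and the toy records. Whether such features actually help depends on the dataset.

```python
import math


def add_engineered_features(record):
    """Return a copy of the record with two derived features added."""
    amount = record["amount"]
    avg = record["avg_amount"]
    enriched = dict(record)  # copy so the original record is left untouched
    # Transform: compress a skewed, non-negative value with log(1 + x).
    enriched["log_amount"] = math.log1p(amount)
    # Combine: express the value relative to a per-entity baseline.
    enriched["amount_ratio"] = amount / avg if avg > 0 else 0.0
    return enriched


# Hypothetical records: label 1 is the rare class.
records = [
    {"amount": 20.0, "avg_amount": 25.0, "label": 0},
    {"amount": 900.0, "avg_amount": 30.0, "label": 1},
]

enriched = [add_engineered_features(r) for r in records]
for row in enriched:
    print(row["label"], round(row["amount_ratio"], 2), round(row["log_amount"], 2))
```

In this toy data the ratio feature places the two records far apart (0.8 versus 30.0), which is the kind of separation a derived feature is meant to expose.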