The L2-regularized hinge-loss kernel SVM is one of the most important and most studied machine learning algorithms. Unfortunately, its training time complexity is generally unsuitable for big data. In practice, however, training times can often be reduced by shrinking heuristics on the training set, which exploit the fact that non-support vectors do not affect the decision boundary. These shrinking heuristics are neither well understood nor especially reliable. We present the first safe removal bound for data points that does not rely on spectral properties of the kernel matrix. Relaxing this bound yields a shrinking heuristic that is more reliable than, and performs favorably in comparison with, the state-of-the-art heuristic suggested by Joachims, opening up an opportunity to improve the state of the art.
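As a minimal illustration of the property that shrinking heuristics exploit (not of the paper's bound), the sketch below trains a kernel SVM with scikit-learn on toy data, discards all non-support vectors, retrains on the support vectors alone, and checks that the decision function is numerically unchanged. The data, kernel parameters, and tolerance are illustrative assumptions; note that `gamma` is fixed explicitly so both fits use the same kernel.

```python
import numpy as np
from sklearn.svm import SVC

# Toy two-class data: two well-separated Gaussian clusters (illustrative only).
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(100, 2) + 3.0, rng.randn(100, 2) - 3.0])
y = np.r_[np.ones(100), np.zeros(100)]

# Train a kernel SVM on the full training set.
# gamma is fixed (not 'scale') so the kernel is identical in both fits.
clf_full = SVC(kernel="rbf", C=1.0, gamma=0.5).fit(X, y)

# Retrain using only the support vectors of the full solution.
sv = clf_full.support_
clf_sv = SVC(kernel="rbf", C=1.0, gamma=0.5).fit(X[sv], y[sv])

# Non-support vectors do not affect the decision boundary:
# both models agree on the data, up to solver tolerance.
same = np.allclose(clf_full.decision_function(X),
                   clf_sv.decision_function(X), atol=1e-2)
print(same)
```

Shrinking heuristics try to identify such removable points *during* optimization rather than after the fact; scikit-learn's `SVC` exposes a `shrinking` flag that enables libsvm's internal variant of this idea.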