Ordinal or interval scale for labels

I have a question about a BERT classification task with three (or more) labels that are ordinal or interval scaled. For example, I have three categories of text: category 1 (label 0) is text of “high quality”, category 2 (label 1) is text of “medium quality”, and category 3 (label 2) is text of “low quality”.
Now I train a classification model with BERT. My problem is that when a “high quality” text (label 0) is misclassified during training, it makes no difference whether it is misclassified as “medium quality” (label 1) or as “low quality” (label 2), although a misclassification as “medium quality” is a smaller error than a misclassification as “low quality”. Is there any way to train BERT so that it respects the labels being ordinal or even interval scaled? I think that would give better classification results. Thank you for your feedback.

Hi Michael,

If I’m reading your question right, it sounds like BERT with a regression head is what you’re after. This way, your data would be treated not as a classification task but as a “score” of text quality, so long as your numerical labels are in order of either ascending or descending quality.
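
To make the difference concrete, here is a rough sketch in plain Python (no BERT involved). It compares cross-entropy with squared error for a “high quality” text (label 0) that is pushed towards label 1 versus label 2. The helper names and the rounding step that maps a score back to a label are my own assumptions for illustration, not something from a library.

```python
import math

def cross_entropy(probs, true_label):
    # Standard classification loss: only the probability of the true class matters.
    return -math.log(probs[true_label])

def squared_error(score, true_label):
    # Regression loss: grows with the distance between score and label.
    return (score - true_label) ** 2

def score_to_label(score, num_labels=3):
    # Hypothetical decoding step: round to the nearest label and clip to the valid range.
    nearest = math.floor(score + 0.5)
    return min(max(nearest, 0), num_labels - 1)

true_label = 0  # "high quality"

# Two wrong classifier outputs: one leans to label 1, the other to label 2.
ce_near = cross_entropy([0.1, 0.8, 0.1], true_label)
ce_far = cross_entropy([0.1, 0.1, 0.8], true_label)
print(ce_near == ce_far)  # True: cross-entropy does not see the ordering

# Two wrong regression outputs: a score of 1.0 versus a score of 2.0.
se_near = squared_error(1.0, true_label)  # 1.0
se_far = squared_error(2.0, true_label)   # 4.0
print(se_near < se_far)  # True: the far miss is penalised more

print(score_to_label(1.4))  # 1
```

Note that squared error treats the labels as equally spaced, so it fits the interval-scaled case best; for purely ordinal labels it is an approximation.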

I haven’t done this in a while, but I believe that in Hugging Face, setting “num_labels=1” in the model config switches the model to a regression head with MSE loss out of the box.
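
A minimal sketch of that setup, assuming the `transformers` and `torch` packages are installed. To keep it runnable without downloading anything, it builds a tiny, randomly initialised BERT from a config; the sizes and the fake token ids are made up for illustration. For real training you would load a pretrained checkpoint with `from_pretrained(..., num_labels=1)` and pass tokenised text instead.

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

torch.manual_seed(0)

# Tiny config with arbitrary sizes, only so the example runs quickly offline.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=1,  # a single output: the quality score
)
model = BertForSequenceClassification(config)
model.eval()

# Fake batch of 3 "texts" with 8 token ids each (stand-in for tokenizer output).
input_ids = torch.randint(0, 100, (3, 8))

# Quality labels as floats, since the model now regresses a score.
labels = torch.tensor([0.0, 1.0, 2.0])

with torch.no_grad():
    out = model(input_ids=input_ids, labels=labels)

print(out.logits.shape)  # one score per text
print(out.loss)          # mean squared error between scores and labels
```

The labels have to be floats here; with integer labels and more than one output the model falls back to ordinary classification.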

I believe the STS-B benchmark (part of GLUE) scores the similarity between two pieces of text as a regression task on a scale from 0.0 to 5.0. It’s a common benchmark, so there should be good examples in Hugging Face or elsewhere.


Hi Nick

Many thanks for your reply. I think your answer will help me a lot. I will test it.

Best regards