Number Token Loss
A regression-like loss that improves numerical reasoning in language models.
Originally presented in “Regress, Don’t Guess” (ICML 2025).
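The intuition: standard cross entropy treats number tokens as unrelated classes, so predicting “9” when the answer is “1” costs the same as predicting “2”. NTL instead penalizes predictions in proportion to their numeric distance from the truth. Below is a minimal, illustrative sketch of one variant of that idea (comparing the expected numeric value under the model’s distribution over digit tokens with the ground truth); the toy tensors and the MSE form are for illustration only, see the paper for the exact formulations:

import torch

# Toy example: logits over the ten digit tokens "0".."9" at one position
digit_logits = torch.tensor([0.1, 2.0, 0.4, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1])
true_value = torch.tensor(1.0)               # ground-truth digit is "1"

digit_values = torch.arange(10.0)            # numeric value of each digit token
probs = torch.softmax(digit_logits, dim=-1)  # model's predicted distribution
expected = probs @ digit_values              # expected numeric value
penalty = (expected - true_value) ** 2       # regression-style penalty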
Getting Started
Install from PyPI:

pip install ntloss

Then use it like this:
from torch.nn.functional import cross_entropy
from ntloss import NTLoss

# `tokenizer`, `logits`, and `labels` come from your model and training loop
ntl_fn = NTLoss(tokenizer=tokenizer)
ntl = ntl_fn(logits, labels)

# We recommend adding NTL to the standard cross-entropy loss with a small weight
loss = cross_entropy(logits, labels) + 0.3 * ntl
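In a full training step this amounts to adding the weighted NTL term to the usual language-modeling loss. The sketch below is one possible integration, not part of the ntloss API: model, batch, and optimizer are placeholders assumed to come from your own setup, and the shift between logits and labels for next-token prediction is omitted for brevity.

import torch.nn.functional as F

# One training step (sketch); `ntl_fn` is the NTLoss instance from above
outputs = model(input_ids=batch["input_ids"])
logits, labels = outputs.logits, batch["labels"]

ce = F.cross_entropy(
    logits.view(-1, logits.size(-1)),  # flatten to (batch * seq_len, vocab)
    labels.view(-1),
    ignore_index=-100,                 # skip padding / masked positions
)
loss = ce + 0.3 * ntl_fn(logits, labels)

loss.backward()
optimizer.step()
optimizer.zero_grad()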
ntloss is currently in alpha and pre-release. Feedback & PRs are very welcome.
📝 Citation
If you use ntloss, please cite our paper:
@inproceedings{zausinger2025regress,
title = {Regress, Don't Guess – A Regression-like Loss on Number Tokens for Language Models},
author = {Jonas Zausinger and Lars Pennig and Anamarija Kozina and Sean Sdahl
and Julian Sikora and Adrian Dendorfer and Timofey Kuznetsov
and Mohamad Hagog and Nina Wiedemann and Kacper Chlodny
and Vincent Limbach and Anna Ketteler and Thorben Prein
and Vishwa Mohan Singh and Michael Danziger and Jannis Born},
booktitle = {Proc. of the 42nd International Conference on Machine Learning (ICML)},
year = {2025},
url = {https://tum-ai.github.io/number-token-loss/}
}