Simon van Dyk committed
Commit 7403338 · 1 Parent(s): f98bde0

Add: cost calc explanation

Files changed (1): README.md (+4 -0)
README.md CHANGED
@@ -56,6 +56,10 @@ Its lightweight architecture enables fast, large-scale extraction of forecasts,
 <img src="https://huggingface.co/NOSIBLE/prediction-v1.1-base/resolve/main/plots/results.png"/>
 <p>
 
+Cost per 1M tokens for the LLMs was calculated as a weighted average of input and output token costs using a 10:1 ratio (10× input cost + 1× output cost, divided by 11), based on pricing from OpenRouter. This reflects the input-to-output token ratio of the prompt we used to label our dataset.
+
+For the NOSIBLE model, we conservatively used the cost of Qwen-8B on OpenRouter with a 100:1 ratio, since the model produces a single output token when used as described in this guide. Even with this conservative pricing, our model is still the cheapest option.
+
 ## Class token mapping.
 
 Because this is a classification model built off the [**Qwen3-0.6B**](https://huggingface.co/Qwen/Qwen3-0.6B), we mapped the `prediction` and `not-prediction` classes onto tokens. This is the mapping we chose.
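For reference, here is a minimal sketch of the blended-cost arithmetic described in the added lines: a weighted average of input and output token prices, weighted 10:1 for the comparison LLMs and 100:1 for the NOSIBLE model. The function name and the prices below are placeholders for illustration, not actual OpenRouter quotes.

```python
def blended_cost_per_1m(input_price: float, output_price: float,
                        input_weight: int = 10, output_weight: int = 1) -> float:
    """Weighted average of input/output token prices (USD per 1M tokens).

    The default 10:1 weighting reflects the labelling prompt's input-to-output
    token ratio; pass input_weight=100 for the single-output-token case.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Placeholder prices (USD per 1M tokens) -- substitute real OpenRouter pricing.
llm_cost = blended_cost_per_1m(0.50, 1.50)                        # 10:1 blend
nosible_cost = blended_cost_per_1m(0.05, 0.10, input_weight=100)  # 100:1 blend

print(f"LLM blended cost:     ${llm_cost:.4f} per 1M tokens")
print(f"NOSIBLE blended cost: ${nosible_cost:.4f} per 1M tokens")
```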