feat: implement /v1/messages/count_tokens endpoint

Add Anthropic-compatible token counting endpoint using hybrid approach:
- Local estimation with gpt-tokenizer for text content (~95% accuracy)
- API-based counting for complex content (images, documents)
- Automatic fallback to local estimation on API errors

This resolves warnings in LiteLLM and other clients that rely on pre-request token counting.
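
The hybrid approach described in this message can be sketched roughly as follows. This is an illustrative outline only, not the code in this commit: the names `hasComplexContent`, `collectText`, `countTokens`, `localCount`, and `apiCount` are all hypothetical. The two counters are injected, so the sketch assumes nothing about how gpt-tokenizer or the upstream API is actually called.

```javascript
// Hypothetical sketch of the hybrid counting strategy; every name here is illustrative.

// Block types that a text tokenizer cannot measure locally.
const COMPLEX_TYPES = new Set(["image", "document"]);

// True if any message carries at least one image or document block.
function hasComplexContent(messages) {
  return messages.some(
    (m) => Array.isArray(m.content) && m.content.some((b) => COMPLEX_TYPES.has(b.type))
  );
}

// Join the text of all messages (plain string content or "text" blocks).
function collectText(messages) {
  return messages
    .map((m) => {
      if (typeof m.content === "string") return m.content;
      if (!Array.isArray(m.content)) return "";
      return m.content
        .filter((b) => b.type === "text")
        .map((b) => b.text)
        .join("\n");
    })
    .join("\n");
}

// localCount: (text) => number, assumed to wrap a local tokenizer.
// apiCount:   async (body) => number, assumed to wrap an upstream counting call.
async function countTokens(body, { localCount, apiCount }) {
  const messages = body.messages ?? [];
  if (hasComplexContent(messages)) {
    try {
      return { input_tokens: await apiCount(body) };
    } catch {
      // On any API error, fall through to the local estimate below.
    }
  }
  // Local path: only text is counted, so a fallback for image or document
  // requests will undercount.
  return { input_tokens: localCount(collectText(messages)) };
}
```

An Express handler for `/v1/messages/count_tokens` could pass `req.body` to `countTokens` and return the result as JSON. The `{ input_tokens }` shape mirrors the Anthropic token counting response.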
2026-01-14 15:20:32 +07:00
parent cc64b93f32
commit acc228b920
4 changed files with 311 additions and 17 deletions
--- a/package.json
+++ b/package.json
@@ -60,7 +60,8 @@
    "async-mutex": "^0.5.0",
    "better-sqlite3": "^12.5.0",
    "cors": "^2.8.5",
-    "express": "^4.18.2"
+    "express": "^4.18.2",
+    "gpt-tokenizer": "^2.5.0"
  },
  "devDependencies": {
    "@tailwindcss/forms": "^0.5.7",