dots.ocr-1.5 vs GLM-OCR vs PaddleOCR-VL-1.5: Which Model Fits Your OCR Stack?
16 Feb 2026, 00:00 Z
dots.ocr-1.5 landed on February 16, 2026 and immediately entered OCR conversations because it is not framed only as a document OCR model. It is positioned as a broader visual-language parser that also covers web pages, scene text, and SVG-oriented tasks.
That makes comparison both harder and more useful. If your goal is pure document OCR, GLM-OCR and PaddleOCR-VL-1.5 still set a strong baseline. If your goal is one model that spans OCR and non-document parsing tasks, dots.ocr-1.5 deserves a serious challenger slot.
1 Evidence tier first: what is strongest today?
Before comparing scores, separate evidence quality:
| Model | Evidence maturity (Feb 16, 2026) | Practical implication |
| --- | --- | --- |
| PaddleOCR-VL-1.5 | Strongest: paper + model card + official Paddle docs | Best option when you want the most complete public technical write-up |
| GLM-OCR | Strong: official repo + model card; technical report not yet published | Good production candidate, but some claims still rely on repo/model-card reporting |
| dots.ocr-1.5 | Early: official repo + model card release notes for 1.5; no dedicated 1.5 paper yet | High-upside challenger, but treat current results as early-cycle evidence |
2 What changed in dots.ocr-1.5
Based on the official release notes/model card, dots.ocr-1.5 adds:
- A 3B model profile (1.2B visual encoder + 1.7B language model).
- Expanded task framing beyond document OCR: scene text OCR, object grounding, UI/web parsing, and image-to-SVG-style parsing.
- Reported gains over prior dots.ocr on both classic OCR and broader parsing benchmarks.
- Emphasis on longer outputs (the release notes mention a max output length of up to 16k tokens).
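As a quick sanity check, the two component sizes in the release notes sum to roughly the advertised 3B, and the 16k output ceiling corresponds to a generation-length setting in most serving stacks. The sketch below is illustrative only: the parameter sum is plain arithmetic from the numbers above, and the `max_new_tokens` key name is an assumption in the style of Hugging Face-like APIs, not something confirmed by the dots.ocr-1.5 release notes.

```python
# Sanity check on the dots.ocr-1.5 profile numbers from the release notes.
# Assumption: the "3B" label is simply the sum of the two component sizes.
visual_encoder_params = 1.2e9   # 1.2B visual encoder
language_model_params = 1.7e9   # 1.7B language model

total = visual_encoder_params + language_model_params
print(f"total params: {total / 1e9:.1f}B")  # ~2.9B, marketed as "3B"

# The release notes mention output lengths up to 16k tokens. In HF-style
# stacks this usually maps to a setting like the one below (the key name
# "max_new_tokens" is an assumption, not taken from the release notes).
generation_config = {"max_new_tokens": 16_384}
print(generation_config)
```

Worth noting when budgeting hardware: a ~3B total puts dots.ocr-1.5 in the same small-model class as the other two candidates, so the comparison above is between peers rather than across size tiers.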