Releases: VikParuchuri/surya
Minor bugfixes
- Fix rotation and copy bugs
Fix image bugs
- Fix bugs with RGBA images
- Fix assert bug
- Add back in thumbnail method for resizing
- Slightly optimize segformer code
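As a rough illustration of what the RGBA and thumbnail items above involve, here is a minimal sketch using Pillow. It is not surya's actual code: the helper name `to_rgb_thumbnail` and the choice of a white background are assumptions made for this example.

```python
from PIL import Image


def to_rgb_thumbnail(image: Image.Image, max_size: tuple[int, int]) -> Image.Image:
    """Return an RGB copy of `image` that fits inside `max_size` (hypothetical helper)."""
    if image.mode == "RGBA":
        # Composite onto white so transparent pixels do not turn black.
        background = Image.new("RGB", image.size, (255, 255, 255))
        background.paste(image, mask=image.getchannel("A"))
        result = background
    else:
        # convert() returns a new image, so the caller's image is left untouched.
        result = image.convert("RGB")
    # thumbnail() resizes in place and preserves the aspect ratio.
    result.thumbnail(max_size)
    return result


rgba = Image.new("RGBA", (400, 200), (255, 0, 0, 0))  # fully transparent image
thumb = to_rgb_thumbnail(rgba, (100, 100))
print(thumb.mode, thumb.size)
```

Because `thumbnail()` modifies the image it is called on, the sketch always works on a copy, so the original input keeps its size and mode.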
Change image resize
- Switch image resizing from cv2 to PIL, since cv2 caused benchmark regressions
OCR speedups
- Speed up the base OCR model by ~15-20% and reduce memory usage by ~25%, which allows larger batch sizes
- Add a static cache for compilation; using torch.compile gives another 15% speedup
- Other optimizations, like faster image resizing
- Bugfixes, such as supporting language inputs of different lengths for OCR, so documents with different languages can be batched together
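One common way to batch documents whose language lists differ in length is to pad them to a shared length and keep a mask of the real entries. The sketch below shows that general idea only. The function name `pad_language_batch` and the `"_pad"` placeholder are assumptions for illustration, and surya may handle this differently.

```python
def pad_language_batch(langs_per_doc, pad_value="_pad"):
    """Pad each document's language list to the longest list in the batch.

    Hypothetical helper: returns the padded lists plus a mask that is True
    for real language entries and False for padding.
    """
    max_len = max((len(langs) for langs in langs_per_doc), default=0)
    padded = [
        list(langs) + [pad_value] * (max_len - len(langs))
        for langs in langs_per_doc
    ]
    mask = [
        [i < len(langs) for i in range(max_len)]
        for langs in langs_per_doc
    ]
    return padded, mask


# Three documents with one, two, and three languages respectively.
batch = [["en"], ["hi", "en"], ["de", "fr", "en"]]
padded, mask = pad_language_batch(batch)
for row, row_mask in zip(padded, mask):
    print(row, row_mask)
```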
Processor improvements
- Remove unneeded format conversions
- Fix OCR bug where only one color channel was used - results should be better now
- Speed up layout/text detection a bit
OCR speedup
Cut OCR time in half. Combined with the previous release, OCR should now take about 40% as much time as it did before.
Significant speedup for layout, line detection
- Improve CPU postprocessing for line detection and layout - cut postprocessing time to 1/3 of original
- Unpin transformers version after investigating model performance
This should result in a ~2x speedup for layout and text detection. The effect will be most noticeable on GPU. I haven't fully benchmarked this, though.
Bug fixes
- Fix memory leak with layout and text detection models and large batch sizes
- Improve ordering model generation slightly
Save memory when pruning MoE
- Prune MoE experts before loading model
- Unpin torch version from 2.2.2
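To illustrate why pruning before loading saves memory: if unused expert weights are dropped from the checkpoint dictionary first, they are never copied into the model. The sketch below uses plain dictionaries and a hypothetical key layout (`...experts.<index>...`). The function name and key pattern are assumptions for this example, not surya's actual checkpoint format.

```python
import re

# Assumed key layout for this sketch: expert weights contain ".experts.<index>."
EXPERT_KEY = re.compile(r"\.experts\.(\d+)\.")


def prune_expert_weights(state_dict, keep_experts):
    """Return a new dict without the weights of experts not in `keep_experts`."""
    keep = set(keep_experts)
    pruned = {}
    for key, value in state_dict.items():
        match = EXPERT_KEY.search(key)
        # Keep non-expert weights, and expert weights whose index is retained.
        if match is None or int(match.group(1)) in keep:
            pruned[key] = value
    return pruned


# Toy checkpoint: lists stand in for weight tensors.
checkpoint = {
    "encoder.layer.0.attn.weight": [0.1],
    "decoder.layer.0.moe.experts.0.fc.weight": [0.2],
    "decoder.layer.0.moe.experts.1.fc.weight": [0.3],
    "decoder.layer.0.moe.experts.2.fc.weight": [0.4],
}
small = prune_expert_weights(checkpoint, keep_experts=[0, 2])
print(sorted(small))
```

The pruned dictionary would then be handed to the model loader in place of the full checkpoint, so the dropped experts never occupy device memory.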
Fix issue with torch and dependencies
Merge pull request #96 from VikParuchuri/dev: Fix publishing issue