Compress images in docx/pptx/xlsx files using ImageOptim. (MacOS)

cometeme/compress-office

Compress Office

English | 中文

A simple tool that uses ImageOptim to compress the images inside docx/pptx/xlsx documents, reducing their file size.

Installing

Clone this repo first:

git clone https://github.com/cometeme/compress-office.git

Then install the following dependencies:

  1. ImageOptim: https://github.com/ImageOptim/ImageOptim
  2. Python 3.8, plus the rich package (installed via pip)
  3. trash: https://github.com/ali-rantakari/trash
  4. fd (optional, speeds up file searching): https://github.com/sharkdp/fd

If you have Homebrew, run these commands to install the dependencies (or install them manually):

brew install imageoptim python@3.8 trash
pip3.8 install rich

Then start ImageOptim, open the "Preferences" menu, and adjust the settings. You can choose lossless or lossy compression for the pictures in the document: lossless preserves image quality, while lossy reduces size further. You can also choose whether to remove EXIF metadata, which records information such as the capture time, GPS location, and camera settings. Removing EXIF data from pictures helps protect your privacy.

Usage

Run compress-office.py with the files or folders you want to compress. You can pass in multiple paths. If you pass in a directory, the program traverses it and finds the files that can be compressed.

python3.8 compress-office.py [path1] [path2] ...

For example:

python3.8 compress-office.py ~/Documents ./test.docx

When the program runs for the first time, it creates a file called process_history.csv, which records the path of each compressed file and its modification time. On later runs, if the program finds that a file has not changed (its path is in the history and its modification time matches the recorded one), it skips the file, because recompressing it would bring no further benefit. If you really need to recompress a file, delete its record from the CSV file.
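The skip check described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the function names (load_history, should_skip, record) and the two-column CSV layout are assumptions made for the example.

```python
import csv
import os


def load_history(history_path):
    """Read the history CSV into a {path: modify_time} dict (empty if missing)."""
    history = {}
    if os.path.exists(history_path):
        with open(history_path, newline="", encoding="utf-8") as f:
            for row in csv.reader(f):
                if len(row) == 2:  # assumed layout: path, modify time
                    history[row[0]] = row[1]
    return history


def should_skip(path, history):
    """Skip a file only if it is recorded and its modify time is unchanged."""
    recorded = history.get(os.path.abspath(path))
    return recorded is not None and recorded == repr(os.path.getmtime(path))


def record(path, history_path):
    """Append the file's path and current modify time after compressing it."""
    with open(history_path, "a", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow([os.path.abspath(path), repr(os.path.getmtime(path))])
```

With this layout, deleting a file's row from the CSV makes should_skip return False for it, which matches the "delete its record to recompress" behaviour.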

After compression is complete, the original document is moved to the recycle bin. Please check your documents before emptying the recycle bin.

Use fd to speed up searching (optional)

The program uses Python's built-in glob for file traversal by default, which is very slow when there are many files, so it also supports using fd to speed up the search.

If you have Homebrew, run brew install fd in the terminal to install fd.
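One way such a fallback could look is sketched below: use fd when it is on the PATH, otherwise fall back to glob. The function name find_documents is hypothetical, and the project's real search code may differ. Note that fd skips hidden and git-ignored files by default, so the two branches can return slightly different results in some directories.

```python
import glob
import os
import shutil
import subprocess

EXTENSIONS = ("docx", "pptx", "xlsx")


def find_documents(root):
    """Return docx/pptx/xlsx files under root, using fd when it is installed."""
    if shutil.which("fd"):
        # fd: match any name (".") under root, files only, by extension.
        cmd = ["fd", "--type", "f"]
        for ext in EXTENSIONS:
            cmd += ["--extension", ext]
        cmd += [".", root]
        out = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return sorted(line for line in out.stdout.splitlines() if line)
    # Fallback: recursive glob, one pattern per extension.
    found = []
    for ext in EXTENSIONS:
        found += glob.glob(os.path.join(root, "**", f"*.{ext}"), recursive=True)
    return sorted(found)
```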

Other questions

Q1: ImageOptim only runs on macOS. Can I use this tool on other platforms?

A1: Yes, but you first need to find a substitute for ImageOptim, such as Trimage. Then edit function.py in this project and change the cache directory (cache_folder) and the compression command (compress_command) to suit your platform.

Q2: How does this tool compress the pictures in the document? Will it corrupt my documents?

A2: A docx/pptx/xlsx document is essentially a zip archive in which the document's resources are packaged together. This program extracts each document passed in by the user into a cache directory, one at a time, uses ImageOptim to compress all the pictures in that directory, repacks the contents into a new document, and puts the new file in place of the original.

Therefore, in theory, this program will not corrupt files when compressing them. To guard against unexpected bugs, however, it is recommended that you back up files before compressing them. The program also moves the original document to the recycle bin after compression, so if you find a problem, you can restore the original file from there.
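The extract, compress, and repack cycle can be illustrated with Python's zipfile module. This is a minimal sketch under stated assumptions, not the project's implementation: recompress_document and the compress_image callback are hypothetical names, the image extension list is a guess, and the callback stands in for the ImageOptim step so the example does not depend on it. It also assumes a trusted, well-formed archive.

```python
import os
import tempfile
import zipfile

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif")  # assumed set of image types


def recompress_document(src, dst, compress_image):
    """Unpack src, run compress_image(path) on each image, and repack into dst."""
    with tempfile.TemporaryDirectory() as cache:
        with zipfile.ZipFile(src) as archive:
            names = archive.namelist()  # remember the original member order
            archive.extractall(cache)
        for name in names:
            if name.lower().endswith(IMAGE_EXTENSIONS):
                compress_image(os.path.join(cache, name))
        with zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as archive:
            for name in names:
                if not name.endswith("/"):  # skip directory entries
                    archive.write(os.path.join(cache, name), arcname=name)
```

Because only image files are rewritten and every other member is copied back under its original name, the document's XML parts are left untouched.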

Q3: Why are there many pictures in the recycle bin after compression?

A3: This is because ImageOptim moves the original pictures to the recycle bin when it compresses them, and it has no setting to disable this, so the problem cannot be solved for now.
