Microsoft Causes Inflation

This simple trick will cut your Word document down to size.

The Problem

Word documents can suffer from bloat. An author tried to submit a 32-page Word file to us. The formatting was straightforward, but the file was a whopping 2.4 MB. Our online systems rejected the file (because why would we want a file that big?). The author saved it as a Rich Text Format (RTF) file of 600 kB and uploaded that instead. We imported it into Word and saved it as a .docx and, lo and behold, it was back to nearly 3 MB: over 2 MB bigger without the addition of so much as a single comma. We reviewed the file and added some simple mark-up, and it blew up to well over 3 MB.
After that, I applied this trick to the script and got it down to less than 90 kB, more than thirty times smaller. (And with five minutes’ work, it went to 80 kB.) So what’s going on?

The Cause

The frank answer is that I’m not sure. I know some of the causes, but Word is a complex tool, so attributing anything to a single cause is dubious, and I’m trying to approach this as a user, not as a product tester. Broadly, there are two issues. Firstly, Word tracks changes. Even if you declare a version to be final and stop tracking changes, Word seems not to discard the change data. It’s still hanging around somewhere, even though it’s not used. Secondly, if your document is edited by multiple users (or by one user on several computers), it picks up template information from each installation without discarding the previous information, so it keeps adding unused data to the document. (There may also be an issue with using different versions of Word to edit one document, and there are certainly further issues with editing documents in a mix of Word and other word-processors.)
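
If you want to see for yourself where the bytes are going, it helps to know that a .docx file is an ordinary ZIP archive of XML files and media. The following is a minimal Python sketch for looking inside one, not a Microsoft tool. The function name list_parts_by_size is my own, made up for illustration, and it only reports sizes; it does not tell you why a part is big.

```python
import zipfile


def list_parts_by_size(docx_path):
    """Return (name, uncompressed_bytes, compressed_bytes) per part, largest first.

    A .docx is a ZIP archive, so the standard zipfile module can read it.
    Sorting is by compressed size, since that is what counts on disk.
    """
    with zipfile.ZipFile(docx_path) as archive:
        parts = [
            (info.filename, info.file_size, info.compress_size)
            for info in archive.infolist()
        ]
    return sorted(parts, key=lambda part: part[2], reverse=True)


# Example use (the file name is a placeholder):
# for name, raw, packed in list_parts_by_size("bloated.docx"):
#     print(f"{packed:>10} {raw:>10}  {name}")
```

Run it on the bloated file and again on the slimmed-down one, and compare the two lists to see which parts shrank.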

The Solution

The solution is to leave all the rubbish behind:
Create a new blank document (preferably using a clean template that includes just the Styles you need). Now open your bloated document. Select all the contents (either with the mouse or with a shortcut: Ctrl-A on a Windows computer). Copy (Ctrl-C). Switch to your new blank document and Paste (Ctrl-V). Then Save. The new version will have left most of the dross behind and kept your text, your formatting and not a lot else.

(There is a minor additional tweak: that process will copy over all the Styles from the source document, including ones that are not actually in use. You can reduce the file size a little more by deleting unused Styles.)
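
If you would rather not hunt for unused Styles by eye, the sketch below is one rough way to get a list of candidates. It assumes the usual .docx layout, with Style definitions in word/styles.xml and the main text in word/document.xml. The function name unused_style_candidates is hypothetical. It only checks the main text, so a Style used solely in a header, footer or footnote, or one that another Style is based on, will still show up. Treat the output as a list to check, not a list to delete.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the XML parts of a .docx file.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def unused_style_candidates(docx_path):
    """Return style IDs defined in styles.xml but not referenced in document.xml."""
    with zipfile.ZipFile(docx_path) as archive:
        styles_root = ET.fromstring(archive.read("word/styles.xml"))
        body_root = ET.fromstring(archive.read("word/document.xml"))

    # Every <w:style w:styleId="..."> is a defined style.
    defined = {
        style.get(f"{{{W}}}styleId")
        for style in styles_root.iter(f"{{{W}}}style")
    }

    # Paragraph, character and table style references in the main text.
    used = set()
    for tag in ("pStyle", "rStyle", "tblStyle"):
        for ref in body_root.iter(f"{{{W}}}{tag}"):
            used.add(ref.get(f"{{{W}}}val"))

    return sorted(defined - used - {None})
```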

Postscript: What If That Wasn’t My Problem?

The other major cause of Word bloat is embedded images. If you need pictures, you need them, but consider cropping and shrinking them to a size appropriate for your purpose before you embed them in your document.
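
To check whether images are your problem, you can add up what the pictures cost. In a .docx file (which is a ZIP archive), embedded pictures normally live under word/media/. The sketch below totals them; media_report is a name I made up for illustration, and the sizes are the compressed sizes stored in the archive.

```python
import zipfile


def media_report(docx_path):
    """Return (total_bytes, [(name, bytes), ...]) for embedded media, largest first."""
    with zipfile.ZipFile(docx_path) as archive:
        media = [
            (info.filename, info.compress_size)
            for info in archive.infolist()
            if info.filename.startswith("word/media/")
        ]
    media.sort(key=lambda item: item[1], reverse=True)
    return sum(size for _, size in media), media


# Example use (the file name is a placeholder):
# total, pictures = media_report("manuscript.docx")
# print(total, pictures[:5])
```

If the total is most of the file size, shrinking the pictures will do far more good than the copy-and-paste trick.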
