Tag Archives: Microsoft

Taking a Butcher’s at Fonts

Fonts are fun.  When we (Lazy Bee Scripts) publish a stage play or pantomime, we pick a title font that says something about the content.  Usually it’s a feeling conveyed by the typeface (from chilling to frivolous); occasionally it’s suggested by the font name – so I have used a font named Gaslight for plays set in the late Victorian era.

So far, so good.  However, we distribute a lot of scripts as Word files and that brings some additional problems.  Obviously we don’t expect our customers to have the same fonts installed on their computers as we do, but Word provides a means of embedding TrueType fonts into the Word file, so that they are available to readers of that file who don’t have the font installed.  There are two options for doing this: embed the whole font file or only characters used in the file (which Word recommends as “best for reducing file size”).  We take the latter option.

When the customer opens the file, if they don’t have the font installed, they may see a message about restricted fonts.  Sometimes we use fonts that (for copyright reasons) are not freely distributable; however, if embedded, they will show up on-screen and may be printed.  What the customer can’t do is edit a document containing a restricted font.  That’s fine.  Most of our customers do not need to edit our scripts.  (Not least because of the general point that you should not change a copyrighted work without the permission of the copyright holder.)  However, there are occasionally good reasons for editing (embedding lighting cues for a specific production, for example).  In that case, if the customer saves an editable version, they will lose the embedded font and Word will substitute its default system font (which probably won’t look anything like our chosen font).

The next problem was pointed out to us by author Tim Cole when we sent him a copy of his script Butchers.  When he took a butcher’s *, he noticed a spurious square at the end of the title.

Usually, spurious squares are an indication that the creator of a font has not implemented some characters (usually punctuation marks).  In this case, the square appeared at the end of the line, in the position of the paragraph return.  Even more bizarrely, making the paragraph marks visible displayed the pilcrow (the printers’ end of paragraph mark, ¶ ) in the chosen font.  Somehow, when the pilcrow was supposed to be invisible, Word was trying to display a character that wasn’t in the embedded character set.

If you paid close attention to my second paragraph, you may have identified the obvious way around the problem: instead of embedding just the used characters, why not embed the whole font?  You’re right.  I tried that, and indeed the invisible character that isn’t embedded in the “used characters” is part of the whole font set and so if the whole font set is embedded, the problem disappears.  Unfortunately, there’s a cost to doing that.  Remember that recommendation from Microsoft that embedding just the used characters is “best for reducing file size”?  Embedding a whole font in a test document grew it from 98 kB to over 1.7 MB – so one invisible character cost me 1.6 MB of storage space.  The problem affects all non-installed fonts and, since we don’t know what our customers have installed on their computers, applying the fix to all scripts would cost us 10 GB of on-line storage.  (You can argue that this is not very much by modern standards; however, Lazy Bee Scripts is a small publisher with small storage requirements compared to, say, YouTube.)
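
As a rough check on those figures, the arithmetic can be sketched as below.  The number of scripts is not stated anywhere in this post, so it is inferred here from the two quoted sizes and the 10 GB estimate; the use of decimal units (1 MB = 1000 kB) is also an assumption.

```python
# Figures quoted in the post; decimal units are an assumption.
subset_kb = 98       # test document with only the used characters embedded
full_kb = 1700       # same document with the whole font embedded ("over 1.7 MB")
total_cost_gb = 10   # estimated extra storage across all scripts

# Extra storage per script, then the script count that the 10 GB figure implies.
extra_mb_per_script = (full_kb - subset_kb) / 1000
implied_scripts = total_cost_gb * 1000 / extra_mb_per_script

print(f"extra per script: {extra_mb_per_script:.1f} MB")
print(f"implied number of scripts: {implied_scripts:.0f}")
```

On those assumptions the 10 GB estimate corresponds to a little over six thousand scripts.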

I found a different solution, which was to replace every return character in a script title with a return in the document’s default font.  This was quite tricky to automate (and, with thousands of scripts, it needed to be automated), but it removes the spurious square at no cost to the file size.

All this may well be a feature of the latest version of Word.  I had not come across it before, but then I don’t keep archived copies of different versions of Word just to test for Microsoft’s problems.
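
For anyone wanting to try the return-character replacement outside Word, here is a minimal sketch.  It is an illustration only, not necessarily how the replacement was automated for our scripts.  It relies on the fact that, in a .docx file, the formatting of the paragraph mark is stored separately from the text (in w:pPr/w:rPr inside word/document.xml).  The function name, the title style id and the fallback font are all hypothetical, and it assumes there are no tracked changes on the paragraph mark.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)


def q(name):
    """Qualify a WordprocessingML name for ElementTree."""
    return "{%s}%s" % (W, name)


def reset_title_mark_font(document_xml, title_style, default_font):
    """Give the paragraph mark of each title paragraph the default font.

    title_style and default_font are hypothetical inputs: the style id used
    for titles and the font to fall back on.
    """
    root = ET.fromstring(document_xml)
    changed = 0
    for para in root.iter(q("p")):
        ppr = para.find(q("pPr"))
        style = ppr.find(q("pStyle")) if ppr is not None else None
        if style is None or style.get(q("val")) != title_style:
            continue
        # w:pPr/w:rPr formats the paragraph mark, not the visible text runs.
        rpr = ppr.find(q("rPr"))
        if rpr is None:
            rpr = ET.Element(q("rPr"))
            # Keep w:rPr ahead of w:sectPr / w:pPrChange, as the schema expects.
            later = [i for i, c in enumerate(ppr)
                     if c.tag in (q("sectPr"), q("pPrChange"))]
            ppr.insert(later[0] if later else len(ppr), rpr)
        fonts = rpr.find(q("rFonts"))
        if fonts is None:
            fonts = ET.Element(q("rFonts"))
            # w:rFonts follows w:rStyle when one is present.
            rpr.insert(1 if rpr.find(q("rStyle")) is not None else 0, fonts)
        fonts.attrib.clear()
        for attr in ("ascii", "hAnsi", "cs"):
            fonts.set(q(attr), default_font)
        changed += 1
    return ET.tostring(root, encoding="unicode"), changed
```

The text runs keep the fancy font; only the paragraph mark changes.  One caution: ElementTree rewrites namespace prefixes on output, and real Word files declare many more namespaces than this sketch registers, so a production version would need a parser that preserves them.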

*     “Butcher’s hook”, rhyming slang for look.  Not many people see butcher’s hooks any more, but they were very useful to my grandfather.

Microsoft Causes Inflation

This simple trick will cut your Word document down to size.

The Problem

Word documents can suffer from bloat. An author tried to submit a 32-page Word file to us. The formatting was straightforward, but the file was a whopping 2.4 MB. Our on-line systems rejected the file (because why would we want a file that big?). The author saved it as a Rich Text Format (RTF) file of 600 kB and uploaded it. We imported it into Word and saved it as a .docx and, lo and behold, it was back to nearly 3 MB. Over 2 MB bigger without the addition of so much as a single comma. We reviewed the file and added some simple mark-up, and it blew up to well over 3 MB.
After that, I applied this trick to the script and got it down to less than 90 kB – more than thirty times smaller. (And with five minutes’ work, it went to 80 kB.) So what’s going on?

The Cause

The frank answer is that I’m not sure. I know some of the causes, but Word is a complex tool, so attributing anything to a single cause is dubious, and I’m trying to approach this as a user, not as a product tester. Broadly, there are two issues. Firstly, Word tracks changes. Even if you declare an edition to be final, and stop tracking changes, Word seems not to discard the change data. It’s still hanging around somewhere, even though it’s not used. Secondly, if your document is edited by multiple users (or one user on several computers), it picks up template information from each instance without discarding the previous information, so it keeps adding unused data to the document. (There may also be an issue with using different versions of Word to edit one document, and certainly further issues with editing documents in a mix of Word and other word-processors.)
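
One way to see where the bulk sits without opening Word: a .docx file is a ZIP package of XML parts and media, so the size of each part can be listed directly.  This is a minimal sketch (the function name is hypothetical) and it only applies to .docx files, not to the older binary .doc format.

```python
import zipfile


def largest_parts(docx_path, top=10):
    """Return (uncompressed_bytes, compressed_bytes, part_name) for the biggest parts."""
    with zipfile.ZipFile(docx_path) as package:
        parts = [(info.file_size, info.compress_size, info.filename)
                 for info in package.infolist()]
    return sorted(parts, reverse=True)[:top]


# Example use, with a hypothetical file name:
# for size, packed, name in largest_parts("bloated.docx"):
#     print(f"{size:>10} {packed:>10} {name}")
```

It will not say why a part is large, but it does show whether the weight is in the main text, the styles, or embedded media.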

The Solution

The solution is to leave all the rubbish behind:-
Create a new blank document (preferably using a clean template that includes just the Styles you need). Now open your bloated document. Select all the contents (either with the mouse or with a shortcut: Ctrl-A on a Windows computer). Copy (Ctrl-C). Switch to your new blank document and Paste (Ctrl-V). Then Save. The new version will have left most of the dross behind and kept your text, your formatting and not a lot else.

(There is a minor additional tweak: that process will copy over all the Styles from the source document, including ones that are not actually in use. You can reduce the file size a little more by deleting unused Styles.)
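
Finding the unused Styles by eye is tedious, so here is a hedged sketch of how candidates might be listed from a .docx file.  The function name is hypothetical.  It assumes that a style counts as used if the main document text refers to it, or if another style is based on it or linked to it; it does not look in headers, footers or footnotes, so treat the result as a list of candidates to check rather than a list to delete.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def unreferenced_styles(docx_path):
    """List style ids defined in styles.xml but not referenced (candidates only)."""
    with zipfile.ZipFile(docx_path) as package:
        styles = ET.fromstring(package.read("word/styles.xml"))
        body = ET.fromstring(package.read("word/document.xml"))

    # Default styles apply implicitly, so they are never reported.
    defined = {
        style.get(W + "styleId")
        for style in styles.iter(W + "style")
        if style.get(W + "default") not in ("1", "true")
    }
    defined.discard(None)

    used = set()
    # References from the main document text.
    for tag in ("pStyle", "rStyle", "tblStyle"):
        used.update(e.get(W + "val") for e in body.iter(W + tag))
    # References from one style to another.
    for tag in ("basedOn", "next", "link"):
        used.update(e.get(W + "val") for e in styles.iter(W + tag))

    return sorted(defined - used)
```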

Postscript – What If That Wasn’t My Problem?

The other major cause of Word Bloat is embedded images. If you need pictures, you need them, but consider cropping and shrinking to a size appropriate for your purpose before you embed images in your document.

Why does Microsoft make things up?

An exercise for word-processing obsessives

This is a feature of Word 2007 and Word 2010, but not (pre-ribbon) Word 2002.
Try the following steps.

  • Start a new document in Word 2007 or Word 2010.
  • Write a short sentence or headline.
  • Select your text, then change the font to your favourite fancy font, increase the font size and make it italic.
  • Select the text, then click on the expander in the bottom right-hand corner of the Styles box on the Home tab of the ribbon.  (That launches the pop-up Styles panel.)
  • At the bottom of the Styles panel, click on the New Style icon.  This should create a new style from your fancy text, and prompt you to give it a name.  Let’s call this style “Wanted”.  Click OK to create it.
  • The name of your new style should now appear in the Styles panel.
  • From the “Options…” link at the bottom of the Styles panel, under the “Select Formatting to Show As Styles” heading, select “paragraph level formatting”.  (That determines what shows up in your Styles panel.)
  • Now go back to the short sentence that you’ve created in your “Wanted” style.  Put the cursor somewhere in the middle of that sentence and press Ctrl-Return.  That inserts a page break.

Did you spot what that last operation did?  In addition to the page break, it added something to your Styles panel.

What it added depends on which version of Word you’re using (and possibly the phase of the moon).  In Word 2010 it usually adds a new style called “After: <something descriptive of paragraph formatting>”.  In Word 2007 it adds a new style that describes details of the “Wanted” style.

Is this necessary?
No.
To prove that, use the Styles panel to select all instances of the new (“Unwanted”) style and then apply the “Wanted” style to them.  Aside from the demise of the Unwanted style, nothing else happens in the document.  The Unwanted style was unnecessary.

Why does this matter?
Well, the point of Styles is to keep control of your document – to ensure that everything that should have the same format does have the same format.  To ensure that if you want to change the way particular parts of the document look, you can change the style – one style, one change – and the change will be applied consistently throughout the document.  By spewing out unnecessary styles, Microsoft makes it harder to format documents consistently.