I’m returning to the subject of character names in plays. What? Me? Obsessive? Oh, all right then… My particular obsession is that names tie a person to an era, so naturally I am going to start with a few words about where that observation does not apply.
There is, of course, a melodramatic tradition of allegorical names, where the characters are labelled by their virtue or vice. Think back to the 17th Century and The Pilgrim’s Progress, with characters like Faithful and Piety. (Mind you, such names leaked into reality at the time – think of the Barebone’s Parliament, named after one of its members, the short version of whose name was Praise-God Barebone.) There is also a habit in some farces of giving characters ridiculous names (diving into the first one on my list, The Affairs at Meddler’s Top by Richard Coleman, we find Trellis Trelawney and Bouffant Eclair). In farce, it’s a matter of the writer having fun with the names (and possibly also avoiding any possibility of libelling real people).
No, my real concern here is for fictional names that are supposed to sound real. I gave an outline of the issue in an earlier post. Since then I’ve been casting around for some statistical backing for my intuition, and eventually found it via the Office for National Statistics (bless their cotton socks). From here on, all the statistics come from England and Wales using the first given name only.
The children born in any particular year are given a huge range of names. Take 1996: there were 296 000 girls who each had one of 4957 names. The spread of boys’ names is smaller – 319 000 boys with 3713 names.
What you see from the graph is that there’s an enormous tail – lots of names with very few people attached to them. That tail includes 851 names (from Albion to Ziggy) each given to only three people that year. The tail also covers a lot of ethnic minority names (from a lot of ethnic minorities) and the names bestowed by the sort of eccentric parents who believe that their child’s life will be so much better if they spell Alec as Alick.
Looking at the other end of the graph, the most common name (Jack) accounted for over 10 000 souls – that’s 3.4% of all boys named in 1996. The top 30 names accounted for 50% of all boys. Extending that to 100 names, you cover 76% of all boys from that year.
From the point of view of naming characters in a play or a novel, if you pick a name from the top 100, it rightly feels at home in the year. Pick one from outside the top 100 and, whilst it is entirely possible (you have the remaining 3613 to choose from), any individual name is very unlikely. For example, it would be possible to name a boy Tarquin in 1996 – indeed three families did – but the chances of finding one in the general population of that year would be less than 0.001%.
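For anyone who wants to check the Tarquin figure, here is a minimal sketch of the arithmetic. It uses only the rounded numbers quoted above (319 000 boys, three Tarquins); the function name is my own invention for illustration.

```python
# Rounded figures quoted in the text for boys born in England and Wales in 1996.
BOYS_BORN_1996 = 319_000
TARQUINS_1996 = 3


def share_percent(count, total):
    """Return count as a percentage of total."""
    return 100 * count / total


tarquin_share = share_percent(TARQUINS_1996, BOYS_BORN_1996)
print(f"Tarquin: {tarquin_share:.5f}% of boys born in 1996")
```

The result comes out a little under 0.001%, which is where the figure above comes from.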
Whilst the pattern of the long tail remains fairly constant from year to year, the names in the top 100 can shift dramatically. Girls’ names, though, are much more changeable than boys’ in this respect. The following graphs show position in the top 100 (with number 1 at the top). Take a look at John.
John spent four decades at the very top of the popularity charts before a slow decline. (When I was a student, in a population of 48 men sharing a hall of residence, five were called John. They were commonly known in the Welsh manner by a secondary characteristic: John the Miner, John the Post Graduate, John the Milkman, John the Engine Driver and Little John.)
As I said, it is entirely possible to find outliers. Uncommon names are still valid names. However, it becomes less likely to find a set of uncommon names together. Imagine that you are writing a story about five friends. If they were born in 2004, the most popular names were Jack, Joshua, Thomas, James and Daniel. Together those names account for 15% of the 2004 cohort. For simplicity, assume it’s 3% each.
The probability that the first member of the group has any one of those five names is 3% times the number of names available – so 3% * 5 = 15%.
If we’ve found one of them, what’s the probability that the next person has another?
Well, there are four names left, so 3% * 4 = 12%.
And so on through 9% for the third name, 6% for the fourth and 3% for the fifth.
The chance of finding them all together in a group of only five people is the product of the probabilities:-
15% * 12% * 9% * 6% * 3% = 0.0003%
That doesn’t sound very likely (and it’s not, in the sense that there are 3708 other names that could be in the same group). However, that is more likely than any other group of five names for that year!
Take for example the top five from 1964: David, Paul, Andrew, Mark and John. They are all still in the top 100 for 2004, but their combined share of the name market in 2004 is down from 15% to 2.8% – an average of less than 0.6% each.
Applying the same logic as above, the probability of finding them together as an exclusive group is only 0.00000007%
Or, to put it another way, finding Jack and his group together amongst the 2004 cohort is more than 4000 times more likely than finding David and his group.
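The sums above can be reproduced in a few lines of Python. This is only a sketch of the same back-of-an-envelope method, not a rigorous model. It keeps the simplifications already made in the text: every name in a group is given the same average share (3% for the 2004 top five, 2.8% / 5 for the 1964 top five as they stood in 2004), and the friends are treated as independent picks from the cohort. The function name is hypothetical.

```python
from math import prod


def exclusive_group_probability(share_each, group_size=5):
    """Probability that a group is made up of exactly the chosen names.

    Assumes each name has the same share of the cohort (the text's
    simplification) and that group members are independent picks.
    The first person can have any of the names, the second any of
    those remaining, and so on down to one.
    """
    return prod(share_each * remaining for remaining in range(group_size, 0, -1))


# Top five of 2004 (Jack and friends): roughly 3% each.
jack_group = exclusive_group_probability(0.03)

# Top five of 1964 (David and friends) as they stood in 2004: 2.8% between them.
david_group = exclusive_group_probability(0.028 / 5)

ratio = jack_group / david_group

print(f"Jack's group:  {jack_group * 100:.4f}%")
print(f"David's group: {david_group * 100:.8f}%")
print(f"Jack's group is about {ratio:.0f} times more likely")
```

Because every name in a group is given the same share, the ratio depends only on the two average shares, so changing the group size from five changes the answer considerably.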
The more likely that the names belong together, the more credible your story.
Spreadsheets of the Top 100 Girls’ and Boys’ names (1904 to 2004) are available on the Lazy Bee Scripts publishing pages.