The lady wife, for whom English is a difficult second language, was asking for help in distinguishing when to use articles. Her language doesn’t have them. Indeed, her language doesn’t have an awful lot of those little words we use in English to provide fine distinction of meaning. I explore that here. For some of the Mandarin issues, see the earlier Looking Both Ways. Wow; 2010.
In general, we distinguish between a table and the table because with the second one we’re referring to that indicated one with the metaphorical pointer. There are many tables but I’m referring to that one over there. Here’s an example referring to drivers of vehicles:
Please check that drivers are aware of the speed limit (drivers in general, so plural)
Please check that the driver knows what the speed limit is (the driver of the car you are in)
How does a driver know what the limit is? (any single driver may not know)
Does this road have an indicated speed limit? (or is it implied by street lights to be 30mph)
Definite = a definite, particular thing (*that* driver over there, who you are metaphorically pointing at); test this by inserting the word 'given' ('a given driver')
Indefinite = an indeterminate, possibly abstract thing. The test here is to insert the word 'some', thinking of 'a' as equivalent to 'some' (as in 'some driver').
The indefinite article ‘a’ becomes ‘an’ when followed by a vowel, we are told, especially if your accent does not pronounce the initial letter, as in ‘an hotel’, ‘an historic hotel’, ‘an horrific hotel’ [“an ‘orrific ‘istoric ‘otel”, aurally]. It is a feature of older British education that some of us learned to use an while still pronouncing the leading h. So whether you say haitch or aitch might affect your choice. An hour but a hippo. I noted that an early headmaster of mine pronounced ‘human’ as ‘yuman’ and remember wondering what he was going to do with this other so-called half-vowel. An ‘ead, but a Head, though in most live cases the Head or even The Head.
There are exceptions for words beginning with the letter U, depending on the way you pronounce it. A uniform (You-knee-form) but an umbrella; an ugly duck but a uterus; and a unicorn or unit but an urn and an undone knot. I suppose that means we have a eucalypt not an eucalyptus. If it sounds as yoo then use a not an.
Therefore we seem to have a rule, that:
We use ‘an’ whenever the subsequent word is pronounced as if with a vowel. 4
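As a rough illustration of that rule, here is a minimal sketch in Python. The word lists and the function name are my own assumptions, deliberately tiny, and nowhere near a pronunciation dictionary. It also assumes the letter H is named "aitch", so a haitch speaker would want H removed from the letter set.

```python
# Hypothetical, hand-picked lists: just enough to cover the examples in this piece.
SILENT_H = ("hour", "heir", "honest", "honour", "honor")   # h not sounded
YOO_SOUND = ("eu", "unic", "unif", "unit", "uter", "use")  # starts with a 'yoo' sound
SPOKEN_AS_WORDS = {"NATO", "NASA", "FIFA", "OPEC"}         # acronyms said as words
VOWEL_LETTER_NAMES = set("AEFHILMNORSX")                   # letter names starting with a vowel sound


def indefinite_article(word: str) -> str:
    """Pick 'a' or 'an' by (approximate) sound; assumes a non-empty word."""
    # Initialisms are spoken letter by letter, so only the first letter's name matters.
    if word.isupper() and word not in SPOKEN_AS_WORDS:
        return "an" if word[0] in VOWEL_LETTER_NAMES else "a"
    w = word.lower()
    if w.startswith(SILENT_H):
        return "an"
    if w.startswith(YOO_SOUND):
        return "a"
    return "an" if w[0] in "aeiou" else "a"


for example in ("hour", "hippo", "uniform", "umbrella", "eucalypt", "FBI", "UN", "NATO"):
    print(indefinite_article(example), example)
```

The order of the checks is the point: sound beats spelling, so the exception lists are consulted before the plain vowel-letter test.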
From habit I have always said an hotel, presumably having been so taught. I now think this incorrect, but it may be the terribly English and almost mandatory exception, on the premise that for writing I should be suppressing the leading letter and mentally saying ‘otel.
Historically an meant one and this may serve as an alternative test, though I think less good. English doesn’t have plurals for articles, though I have seen pleas for a plural for the.
For those that like or need grammar, here’s what Wikipedia has to say: Traditionally in English, an article is usually considered to be a type of adjective. In some languages, articles are a special part of speech, which cannot easily be combined with other parts of speech. It is also possible for articles to be part of another part of speech category such as a determiner, an English part of speech category that combines articles and demonstratives (such as 'this' and 'that').
In languages that employ articles, every common noun, with some exceptions, is expressed with a certain definiteness (e.g., definite or indefinite), just as many languages express every noun with a certain grammatical number (e.g., singular or plural). Every noun must be accompanied by the article, if any, corresponding to its definiteness, and the lack of an article (considered a zero article) itself specifies a certain definiteness. This is in contrast to other adjectives and determiners, which are typically optional. This obligatory nature of articles makes them among the most common words in many languages; in English, for example, the most frequent word is the.
Articles are usually characterised as either definite or indefinite. A few languages with well-developed systems of articles may distinguish additional subtypes. Within each type, languages may have various forms of each article, according to grammatical attributes such as gender, number, or case, or according to adjacent sounds.
Conversation with the daughter (that child of mine that is female, that specific daughter)
>Example: Wallsend
Let's go to Wallsend (the place where we live)
Let's go to the Wall's End (either to the pub across the road or to Segedunum itself) - but why isn't it the Segedunum?
> Because there isn't another Segedunum to distinguish it from. 'The' would imply that you were picking that Segedunum out from other possibilities, when in fact you are not. The Wall's End is a given place in space. Segedunum is considered unique.
Problems with Mandarin (like I’m an expert, ha!)
> I assume Chinese doesn't have any prepositions?
> Not that I can tell. It is possible to distinguish place, on the table, under the table, above the table, beside the table.
It is very much harder, possibly governed by local habit, to indicate fine meaning of anything, but especially location in time (which English appears to be very good at, if confusing). Chinese is very ambiguous with what we might call compound verbs or phrasal verbs such as take plus ancillary - take to, take from, take with, take by, take off, take in, take after, take away, take out, take up, take up with, take on, take up on, take out on, take over, take to…. oh dear. And what I hear [from second language speakers] confuses take with bring;
take means to hold and move (mostly from here to there, because the verb is to take <something> to <a location>), where bring means much the same but from there to here. We can take from, but we can't leave out the 'from' itself. ‘Take this the fridge’ doesn't work. Nor does ‘Bring this the fridge’; in both cases we need a to or from.
I don't think the def/indef is so much a problem as the occasions where the article is or is not used. This would be the zero article or the null article. To me this is an essential of countability, whichever of 0 or ϕ you prefer (the symbols for zero and null1). There are, to my surprise, both, as distinct classifications of the zero article2: ϕ1, ϕ2 and maybe more.
Class ϕ1, the zero article, occurring in front of the italicised noun:
Mass vs count: the guys drove cars; cats like fish
Abstract vs concrete: he went by car; cars make you faceless; they talked by phone
Intentional vagueness: Walking is slow
‘Adjective’ vs noun: It was car enough for the job; he was man enough to accept this.
Class ϕ2, the null article, occurring in front of the italicised noun:
Bounded proper: Greece is bankrupt; Scotland has voted; McDonald’s was fun
Rank, Post: I was chairman; she was made headmistress; Clement was Pope
Familiar Time: I went running after eating; it often rains on Friday; we will talk next week.
Familiar Place: I left it at home (but not in the car);
Familiar Noun Phrase: He and she were together; Romeo and Juliet were in love.
These can be explained as mostly being bounded singular proper nouns and some special singular count nouns 2 (reference to footnote because I’m quoting without real understanding. Yet, I hope).
Can we distinguish between these two? I’m finding that difficult.
Sticking with Peter Master for now, “Did you go without lunch?” Was that a lunch or some lunch that you went without? It is indefinite in both cases, so this is ϕ1, zero article, zero lunch. Zero lunch happened.
From grammar.about.com:
In general, the zero article is used with proper nouns, mass nouns where the reference is indefinite, and plural count nouns where the reference is indefinite. Also, the zero article is generally used with means of transport ("by plane") and common expressions of time and place ("at midnight," "in jail").
“Lunch was late:” The lunch was late, wasn’t it? Definite, so ϕ2, null lunch
“Thanks for lunch”: That lunch we just ate, the lunch I’m thanking you for. ϕ2, null lunch. It was an expression of thanks for a particular, therefore definite, meal.
ϕ2 indicates a specific point in time or the start of a season, while the indicates a period of time; so say Frazier and Llosa, 2009 (see URL list, benjamins.com), who studied the use of the null versus the definite article for just the seasons. This fits the descriptions of familiar time and place.
Peter Master investigated failure to do this correctly in English and notes that even for those whose first language does the same as English (Spanish, for example), articles were lost. I found a suggestion that we should mark nouns known through prior mention or shared knowledge with the. I may refer to the election (Britain, 2015), the referendum (Eire, LGBT law, 2015) or the Scottish referendum (meaning 2014 independence) or the last match (Newcastle escaping from relegation from the English Premier League, or, with two friends at the running club who support Middlesbrough, that would refer to the match against Norwich that was lost, meaning Middlesbrough don’t go ‘up’)3
How do we parse ‘The whale is a mammal’? Why do we use the definite article? Is it to emphasise that, say, a shark is a fish, so the whale is definite to distinguish from an assumption of fishiness? That seems mildly insulting, then, assumptive of ignorance or error.
It is far worse for Chinese speakers, where there is no article. Indeed, when I googled for a list, what I found was a reverse answer - Ask the Wiki gods: 'Among the world's most widely spoken languages, articles are found almost exclusively in Indo-European and Semitic languages.' You might note that dropping articles is a marker of portraying a foreign accent, such as Russian. I confirm a lack of articles in Chinese, Japanese, Vietnamese, Russian, all the Slavic languages but for Bulgarian and Macedonian, but see the WALS for detail if you want it. Using that source indirectly, 198 languages have no definite or indefinite article, and 45 have no definite article but have indefinite articles. These numbers exclude languages that have affixes or clitics to mark definiteness, and languages which use demonstrative words as definite articles.
So how do people get around this general problem in other languages?
There has to be a way in each language to mark the status of information, however convoluted: you do something to indicate which X you have in mind, whether I’m thinking of the same X as you, whether you think I’m familiar with the X you refer to, whether this X is the same X we’ve just been discussing and so on. Proximity by association is often assumed and causes problems - I’m thinking of my Asian friends referring to ‘This one’ with no pointing at all, where they clearly think the association is strong and I think they’ve switched topic.
Indeed, it is by the use of demonstratives (this and that) that many languages get around this general problem, though some use different word forms (so ‘a big dog’ and ‘the big dog’ use different words for the size adjective). Some use word order to sort this out: In some languages, such as Russian, the word order is in part determined by definiteness: e.g. Boy kissed girl, with the nominative marking on boy and accusative marking on girl means "The boy kissed a girl", whereas the same words with the same case markers but in the reverse order — girl kissed boy — would mean "A boy kissed the girl".
There is a nice exposition on this by Dan Velleman at the linguistics stack exchange. Referring to languages in general he says we mostly use definiteness-marking (by which I think he means demonstratives in English, this & that), but there are alternative ways. All languages use focus, labelling new information - “... and then a guy asked her out” introduces a new person into a story. All languages do something to mark the status of information, as I covered earlier. Some languages effectively rank the subjects mentioned (obviation), some mark completeness of the event (telicity). Some, specificity (English is poor at this)6. Some have switch-reference systems7. Some change the verb not the noun phrase, having a special verb form when the object is definite (versus indefinite).
There is a good deal of overlap between these descriptions - they are not mutually exclusive - and the redundancy is useful. Indeed, I’ve often thought, since leaving Britain in 2007, that redundancy is useful for confirmation. “Let’s meet on Weds 27th” confirms I really mean that particular Wednesday, without extending the message to a long-form date. Not providing redundancy leaves (much) room for error especially across language and cultural barriers. Appointments benefit particularly8. We have a load of athletics meetings this summer on weekends but not always the same day, i.e. some are Saturday and some are Sunday; it is therefore well worth ensuring we include the day with the date, as emphasis.
The definite article has issues when associated with proper nouns.
The general rule associates neither article with a proper noun unless the noun contains a prepositional phrase, meaning here something acting as an adjective.
There is a lot of geographical complication here, and I start with discussing country names.
Okay, so why the US? The <insert, like British> Isles, the UK? The Gambia? What did these places do to earn (or fail to lose) the definite article, the The? Worse, if it is part of the title, then why don't we include it more often? That is, why can't I look up these exceptions by searching a country database for 'THE_*' ?9
The answer seems to be that where the name has a descriptor—the Windward islands, the Gold Coast, even the Nether(-)lands— then the definite article is appropriate. Those descriptive elements merely identifying a sub-division do not take the definite article; British Honduras, Northern Ireland, South Korea. That’s not the same as the south of Vietnam or Korea, which is a specific reference, hence the definite article is appropriate.
This generally descriptive definition explains the addition of the definite article but there are some in transition as the descriptive element is steadily lost:
the Argentine, the land of silver, becomes Argentina,
the Ukraine is not recognised as ‘the borderlands’, and the Ukrainians don’t like this any more,
the Punjab, the land of five waters (Punj-ab) may be steadily losing its article.
the Gambia, that narrow strip along its river enclosed on three sides by Senegal, may be keeping the descriptive from its longer title, the Gambian colony, which moves the Gambia into the group where formality and habit preserve the article.
Some will keep their definite article because the descriptor remains valid AND the habit is well-established:
the British Isles, not any other isles; ϕ2 Britain, ϕ2 GB, but the UK
the Republic of China, not any non-republic, not an empire so the PRC and ϕ2 China.
the United States, not the dis-united States, and hence the US, the States.
the Dominion of Canada, ϕ2 Canada
There are two riders to this rule:
- • use the definite article with proper nouns referring to geographical features such as rivers, oceans, bridges, regions and buildings.
Generally labels for water do take the, but not lakes.
• Bays need the article except when bay comes last, the Bay of Fundy in the Gulf of Maine, but Lundy Bay (north Cornwall)
• Rivers take the definite article: the Thames, the Rhine, the Amur. Oceans always do.
• Bridges are defining which bridge; the Forth Bridge, the Golden Gate (bridge). Note that in London this is not true; Hammersmith Bridge, London Bridge, Waterloo Bridge while in Newcastle (upon Tyne) with an even higher density of bridges it is true: the swing bridge, the high-level bridge, the Redheugh Bridge, the Eye. This is a British inconsistency reflecting a degree of fame as in Sydney Harbour bridge (no the). All dams take the article (I found no exceptions).
- •Regions, such as the north-east (US, Northeast) with or without capitals, generally take the article, but there are historical exceptions such as the Bronx. In cases such as The Hague the article is kept due to translation from the native name. Similarly we have the Yukon, the Sudan and the Crimea. Thus deserts collect an article; the Sahara, the Gobi, the Kalahari.
- • Buildings often take an article, especially when somehow famous, but there are confusions. The Golden Lion is correct even if the article is not in the title, the Eiffel Tower is correct because the French includes the article in the title and this is common across Europe. The Hilton is a hotel but you go to McDonald’s; you might find a drink at the Golden Lion, or the Sage or at Costa or Starbucks. Generally hotels, pubs and museums collect an article while other eating houses and shops do not, which makes one wonder what the separating criterion is.
- •Ships take an article when abbreviated, as in the Ark Royal but HMS Ark Royal. Convention drops the article with the title and one would not say “We went to view the SS Great Britain” unless you were thinking of it as a museum rather than a ship.
Station names tend to copy a local feature but then lose the article and other punctuation, but that is because they are defining a (new) name, the label of the station such as Elephant and Castle, Bank, St James (Newcastle), St James’ Park (Exeter), The Hawthorns, Smithy Bridge, St Johns, St Michaels. Mostly that is an issue with titles, so you can have issues with apparently missing apostrophes which are only correct because the possessive label was changed into a title. St Thomas’ Hospital, Guy’s Hospital, St James Theatre, St James’ Park (Newcastle), St James Park (Exeter) and St James’s Park (London) are each correct. Whatever it says on the title plaque and the letter heading is the choice of the business. Harrods, McDonalds, Sainsbury’s; their choice, their name.
I wonder that so many exceptions are located in London and suggest this is a feature of familiarity or even fame, that is, the list of exceptions to using the article is affected by fame, location and local common usage. Hence you might visit Buckingham Palace but the White House, Horse Guards Parade but The Mall, Times Square but the Strand or the Serpentine10.
- • use the definite article with lakes, mountains and islands only when plural.
- •Bear Lake but the Great Lakes. The Lake District in NW England is correct because lake describes which district, as in the central business district, the golden mile. The Serpentine in London is a recreational lake but it was previously a piece of river, so history stands.
- •Mountain ranges take the definite article (the Rockies (Rocky Mountains), the Alps, the Caucasus, the Atlas (mountains), the Pyrénées) but mountains themselves do not (Everest, Kilimanjaro) with exceptions such as the Eiger, the Matterhorn simply because the European mountains take the definite article in their home language. Indeed this generosity of bowing to the home language habit extends to other (European) references, hence the Mona Lisa, the Venus de Milo, the Parthenon. This part of the language is a mess.
- • Some groups of islands no longer (or not always) attach ‘islands’ to their titles, hence the Comoros, the Maldives or Falklands, the Bahamas, the Philippines. Note these all look plural even when the long version looks singular, the Maldive islands, the Falkland islands. There are apparent exceptions, such as the Isle of Man (or Wight or Skye), where Isle comes first and thus acts as an adjectival phrase, while Christmas Island and Bear Island remain solidly singular and without an article. You’d visit Skye or the Isle of Skye.
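The two riders can be sketched as a small lookup, here in Python. This is only a sketch under my own assumptions: the function name, the kind labels and the plural flag are invented for illustration, and it makes no attempt at the exceptions (the Eiger, the Serpentine, London bridges and so on).

```python
# Hypothetical category labels, taken from the riders above.
ALWAYS_THE = {"river", "ocean", "desert", "dam"}
ONLY_WHEN_PLURAL = {"lake", "mountain", "island"}


def takes_the(name: str, kind: str, plural: bool = False) -> bool:
    """Guess whether a geographical proper noun takes the definite article."""
    # A descriptor placed first ('Bay of ...', 'Isle of ...') brings the article with it.
    if " of " in name:
        return True
    if kind in ALWAYS_THE:
        return True
    if kind in ONLY_WHEN_PLURAL:
        return plural
    if kind == "bay":
        return False  # 'Bay' comes last here, as in Lundy Bay
    raise ValueError(f"no rule sketched for kind {kind!r}")


print(takes_the("Bay of Fundy", "bay"))                 # the Bay of Fundy
print(takes_the("Lundy Bay", "bay"))                    # Lundy Bay
print(takes_the("Rockies", "mountain", plural=True))    # the Rockies
print(takes_the("Everest", "mountain"))                 # Everest
```

Anything the sketch cannot classify raises an error rather than guessing, which seems the honest response to a part of the language that is this messy.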
I came across a suggestion that these odd conventions may be based on psychological issues to do with perception of the world, but made no progress with that idea. Some of these oddities are historic, predating the onset of rules. British English embraces exception by common usage and continues to do so, while other versions of English seek greater consistency. That doesn’t help the second-language user one whit and this remains a difficult topic. Indeed, exploring this over the last two days, I realise the very large list of sub-categories that defy easy explanation where I apply learned habit not learned rules. Thus I’m left expressing sympathy for the English language learner. This is not at all simple.
DJS 20150601
I continued looking for better explanations. Here is one from the University of Adelaide, based on Peter Master again, Master, P 1986, Science, medicine and technology: English grammar and technical writing, Prentice-Hall, New Jersey. This could improve
Is the noun singular and countable?
    Yes: is it definite?
        Yes: the (possible replacement of the by my, each, both)
        No: a / an (possible replacement of the by my, each, both)
Is the noun plural or uncountable?
    Yes: is it definite?
        Yes: the
        No: no article
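The same chart can be written as a few lines of Python. This is my own minimal sketch of the chart as given, not anything from the Adelaide page: the function name and the three yes/no flags are assumptions, and deciding whether a noun is countable or definite is still left to the writer.

```python
def choose_article(countable: bool, plural: bool, definite: bool) -> str:
    """Return 'the', 'a/an' or '' (no article) following the chart above."""
    if definite:
        return "the"      # definite nouns take 'the' on both branches
    if countable and not plural:
        return "a/an"     # singular, countable, indefinite
    return ""             # plural or uncountable, indefinite: no article


print(choose_article(countable=True, plural=False, definite=False))   # a driver
print(choose_article(countable=True, plural=True, definite=False))    # drivers
print(choose_article(countable=False, plural=False, definite=True))   # the water
```

Writing it out shows how little the chart actually decides: definiteness settles everything except the split between a/an and nothing at all.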
Significant help from Jessie at The Filthy Comma
http://en.wikipedia.org/wiki/Article_(grammar)
http://en.wikipedia.org/wiki/Zero-marking_in_English#Zero_article
https://escholarship.org/uc/item/2kb4p9r0
http://wals.info/chapter/37 if you’re interested where there are no articles in the language
http://linguistics.stackexchange.com/questions/739/how-is-definiteness-expressed-in-languages-with-no-definite-article-clitic-or-a helpful discourse on this very topic. The contributors write clearly.
http://www.davidappleyard.com/english/articles.htm
https://benjamins.com/#catalog/journals/etc.2.1.01fra/details I can’t access this. Maybe you can.
https://www.adelaide.edu.au/english-for-uni/articles/
http://www.birmingham.ac.uk/documents/college-artslaw/corpus/conference-archives/2011/paper-92.pdf
1 For those who missed set theory at school, there is a distinction between a set with content, {the, a, an} and a set containing nothing and no thing, {}, or ϕ or sometimes {ϕ}. This is the empty set and it is (defined, effectively) as being a subset of all sets. Real-life example: your watch may have four modes, Chinese four tones and my first car four gears, but the watch had ‘off/broken’, Chinese adds a neutral tone and the car had a neutral gear position, each equivalent to the 0-state or the null state. You might say the car had four gears but the gear stick had five positions; it is a case where the language needs to be used carefully to be correct. This is helpful pedantry.
My favourite maths class was the empty set; I could tick off loads of stuff as ‘taught but not attended’ (and therefore not my fault, yippee!). It would be different if the kids indicated they would not be present, if they asked ‘permission’ to be absent; then the onus on (of?) delivery of content remains with me.
2 https://escholarship.org/uc/item/2kb4p9r0 Peter Master, Acquisition of the Zero and Null Articles in English
Master, P. (1997). The English article system: Acquisition, function, and pedagogy. System, 25, 215-232. doi: 10.1016/S0346-251X(97)00010-9
Yoo, I. W. (2009). The English definite article: What ESL/EFL grammars say and what corpus findings show. Journal of English for Academic Purposes, 8, 267-278. doi: 10.1016/j.jeap.2009.07.004
Master, P. (1990). Teaching the English articles as a binary system. TESOL Quarterly, 24, 461-478. doi: 10.2307/3587230
3 Note Middlesbrough not MiddlesbOrough, so the contraction would be M’bro not M’boro. Norwich won and go up to the Premier League, Sep 2015. Newcastle United won their last game, having lost the previous ten, while Sunderland played similarly badly - and they both survive. So the moan from M’bro fans is that their team played well all season while the other two North-Eastern teams played badly, yet the two poorer performers stay in the (very) rich league. The estimated change in funding is £130 million, a large number compared to turnover. You might research this some more.
4 Though written it is a rule that depends upon pronunciation. Isn’t that odd? Words starting eu- are pronounced yoo so take ‘a’ not ‘an’. That applies the rule sensibly.
I wondered about capitalised letter-words, which I think follow the same rule: an FBI agent, a UN embargo, a NATO operation, an NUS representative. This is about initialism over acronym: NASA, NATO, FIFA, WASP, radar, DeFRA, EFTA, ETA (Basque liberation front), OPEC are acronyms - we turned the letters into a word. But the FBI, the NSA, the UN, EU, FA, USB, BBC, ESP, ELF (radio for subs), ETA (estimated time of arrival), are initialisms - we pronounce each letter. Where needing to distinguish these I have used periods. So ETA is a bit special: I can give you an E.T.A. but I am not an ETA activist. I may be a fan of the E.U. but I’ve not watched a UEFA match. Note that in Mandarin FIFA is an initialism, F.I.F.A., not an acronym and therefore we should not assume that all acronyms cross language barriers. EFTA, the European Free Trade Association might be distinguished from the E.F.T.A., the area in which it operates. Similarly with H (haitch and aitch); you may say an HTML lesson or a HTML (“Haitch tee em ell”). I hear E.F.L. but also TEFL and ToEFL. For that matter, in academia there is a PhD (“Phud”) and a P.h.D. My father was a Mad Phil and he (just) remains so.
Rob Compton responded on Facebook thus:
On the hotel issue: Most modern styleguides look at whether the H is aspirated (a hotel, a hypotenuse, a hero, a handbag) or not (an heir, an hour). Sounds simple, but the argument is then of course whether Hs are ever actually aspirated in modern speech: "I'm staying at an 'otel"; "I just bought an 'andbag". However, for the rule in writing we think about a more standard pronunciation. Here one would indeed pronounce the H: "I'm staying at a hotel"; "I just bought a handbag". Only RP forces the an: "You'll be accommodated in an 'otel, ma'am", but then the H gets dropped. Also, think about consonants (eg. in abbreviations) that start with a vowel sound (an LAPD officer). In conclusion, the rule in speech is pretty clear: if you start with a vowel and not the H you say an, otherwise a. In writing, most modern styleguides now call for an "a" in front of aspirated h+vowel/y combinations (as above in "standard pronunciation"). But an hotel is not incorrect if you imagine you are writing in RP - it's just a bit old hat. Or as "a hat" is referred to in Plymouth, "an 'aaa".
Which still leaves the forensic conclusion resulting from identifying the educational background of one who says “an Hotel”.
5 Some distinctions that are important for conceptualisation/communication are simply not marked in some languages. Your question seems to be: if a language provides no resources for definite marker, how does a speaker communicate that a given description is intended as definite as opposed to indefinite? Answer: discourse context steps in to help. For example, in English, the discourse function of an indefinite article is typically to signal that a new referent is being introduced. So, if I've already been talking about John, it would be extremely weird to refer back to him by saying Then, a guy called Mary to invite her to the dance, as it would seem now I am definitely not talking about John, rather talking about some new male individual. However, if John is salient in the discourse, it is perfectly natural to refer back to him using the definite, Then, the guy called Mary to invite her to the dance. In a language with no definite article/clitic/affix/word order difference, but yet with an indefinite article for example, at a minimum discourse demands would lead such speakers to only use the indefinite when introducing new discourse referents, and avoiding using it to refer to established referents. In the absence, too, of the indefinite article, languages could resort to word order differences (e.g., apply focus and topicalisation rules), or additional words like same, different, or use demonstratives and pronouns to indicate (non-)contrast of referents. from remark No7.
6 Velleman gives an interesting example, “I’m looking for a friend”. Read it again and ask yourself what is meant; “I’m looking for a friend”.
Received one way, some sad bugger is feeling lonely and wants a friend, any friend. Read another way, there’s a specific friend in mind, perhaps someone who owes money. Spanish provides the distinction, English doesn’t - we’d need to qualify whether ‘a friend’ is general or specific.
7 A switch-reference marks (in some way) a clause with a different subject or the same subject to the previous clause (sentence, even). We do that in English orally by changing tone but not in written work. Mostly in English we use this vagueness to convey humour.
8 Some of my readers dislike my frequent use of dash. Many more dislike my frequent use of colon and semicolon. This is my style, and I use these markers to indicate lengths of pause. I recognise the ordered set { , ; : } as increasing in length while not switching subject. I use the dash for periphrasis —insertion or addition of a phrase within a spoken breath, as if you read the way we speak, so that you can ‘hear’ conversation with you the reader, though the speed will be greatly different. Or it is for me - he says, demonstrating the same idea with a longer pause indicating a speech stoppage. I do not (I try not to do this), though I’ve seen it done, adopt that awful habit one catches on the tv couch of breathing in the middle of a sentence so as to have air to continue at length and prevent the other parties to the (non) conversation from participating. I use brackets in a similar way to dashes, generally reserving the brackets for less relevant asides covering possible mis-understanding (which could be bad writing) and deliberate duplicity (that which passes for humour - or doesn’t). Thus demonstrating a load of style rules for my writing. Not yours, mine.
For those who dislike the dash, you might note with a descending spirit that there are several different dash sizes available and I found some style sheets explaining where (or when) to change the length. Under punctuation symbols I have , –, —; what we may call the ell, en and em dashes, where the shortest one is used as, and called, a hyphen. See here and here. The em-dash — this one — is for a longer pause than the en-dash – this one –. There is difference between ‘authorities’ on whether dashes should be separated by spaces. My habit is that the non-space variety is what a hyphen is (oh, look, I did one). Perhaps I should be using the em dash—in preference to the space-limited hyphen - the immediate problem is that the em dash is not an ordinary key on my keyboard–nor is the shorter en dash, used for shorter pauses as demonstrated. In practice I use the space limited dash - like that - to represent the em dash. The dash is visually helpful in permitting variety for a surfeit of commas—where many would say one should then use a shorter sentence, i.e. start again. Style sheet explanations indicate that the en dash is (should be) for connecting things, the letters a–e, the numbers 117–123, the books Genesis–Numbers.
Maybe your keyboard fixes these for you-by correcting this typing, changing the (non-space bracketed) hyphen to an em dash. Mine doesn’t do that. No, I’m wrong: the em dash is available—shift-opt-hyphen. And I now realise that option-hyphen gives the en dash. My fault for not exploring the combination of modifier keys. These may or may not display on your browser, which is why I change rare symbols to the ASCII code when I find an error. Here they are:
Ÿ⁄™‹›fifl‡°·‚— Shift-Option `¡€#¢∞§¶•ªº–≠ Option only
Œ„‰ÂÊÁËÈØ∏”’» œ∑´®†¥^ø^π“‘«
ÅÍÎÏÌÓÔÒÚÆ åß∂ƒ©˙∆˚¬…æ
ÛÙÇ◊ıˆ˜¯˘¿ Ω≈ç√µ∫≤≥÷
Back when I was using an Apple Macintosh Plus, the cuddly little thing with a 9” screen, I was able to change the font characters at the pixel level quite easily, so I had what to me were useless symbols replaced with things I use often but others don’t, such as alpha, gamma and theta, α, 𝛄 and 𝜽. Perhaps I should research this more?
9 You can look these exceptions up, and I did. See http://www.engvid.com/english-resource/the-with-country-names-lakes-rivers/
10 English is riddled with exceptions and this topic is, of course, no exception itself. We refer to the Strand (Times, this Saturday), because the Strand was once the beach on the Thames riverside, yet we use just Piccadilly, not the Piccadilly. But you'd go to the circus even if meaning Piccadilly but missing the adjective (“turn at the circus, Piccadilly Circus”) because there are many circuses in London, basically intersections. Worse, properties in the Strand are Strand properties, indefinite. We say the High Street but not the Station Road (until you mean perhaps the road which includes the station, such as in Bristol, where the station access street is no longer a through route). The cornmarket (say a covered market selling corn, the given one I'm referring to) and yet Cornmarket in Oxford (where somehow there is also sometimes the Cornmarket, but I think that is probably local error). So we sometimes insert a definite article with names, depending upon the thing's local history.
The Serpentine is a recreational lake in London but was previously the Serpentine river, so history—London’s history indeed—prevents change. The street in London is ‘The Mall’, but the High in Oxford is I think labelled High Street.
Here’s a first attempt at improving the decision tree. Please criticise so it improves!
Here’s an attempt at showing what happens with proper nouns.
Charts written in Excel. Set the text as large as possible within the print area of one nominal page, to the limit of the typing, checking page limits. Save as PDF, choose selection or Print(ed) Area. Insert the pdf. Still too much white space.