251 - unicode fun


While hunting for a quincunx [ⵘ] to use in explaining the use of fractions in ancient Rome, I found there is a very large number of symbols I might be expected to know about, but don’t.  So I thought I’d share some. The result might belong more in the Maths than the Writing, but I’ll accept opinions on that after publication.

0. Four ancient elements, in the 1F7nn series: 🜁🜂🜃🜄. These are followed by further alchemical symbols, so I challenge you to supply the chemical formulae for these (and that’s as much help as you get):

1.   🜔🜛🜺🝁🜚   and ‘harder’ :  🜹  🜦  🜿  

I couldn’t choose: urine  🝕, a mixture of lots of things including urea;    iron ore 🜜, a mixture, often of oxides and salts of iron;  Aqua regia, 🜆, a 1:3 mixture of nitric and hydrochloric acids. I didn’t know these words well enough to use them: regulus (a partially purified form, perhaps up to 1% of impurity) as in regulus of antimony,   🜰 and 🜱; marcasite, a semiprecious stone with iron pyrites (the disulphide).

2.  Here is a different set of fundamental elements to identify, the “Five Phases”: 

木 火 土 ⾦ ⽔, but you might prefer 土火 风 ⽔

There are some lovely little pictures, which may or may not reproduce on your screen: 1F9nn - 🦆🦕🦐🥑🦇

1F6nn  🚴🚼🛒🛠🚂, 1F3nn and 1F4nn  🏃🐄🍳🏐🐖🌏.
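If you want to check which series a picture belongs to, a few lines of Python will do it. This is only a sketch: the function name describe is my own invention, and the name lookup depends on how recent the unicode database bundled with your Python is, so it falls back to a placeholder.

```python
import unicodedata

def describe(text):
    """Return (character, code point, series) for each non-space character."""
    rows = []
    for ch in text:
        if ch.isspace():
            continue
        cp = ord(ch)                   # the character's code point as an integer
        code = f"U+{cp:04X}"           # conventional U+ hexadecimal notation
        series = f"{cp >> 8:X}nn"      # drop the last two hex digits, e.g. 1F9nn
        rows.append((ch, code, series))
    return rows

for ch, code, series in describe("🦆🦕🦐🥑🦇 🚴🏃🐄"):
    # The second argument is returned when the character has no name
    # in this Python's copy of the unicode database.
    print(ch, code, series, unicodedata.name(ch, "<name not known here>"))
```

The shift by eight bits is what turns a code point such as 1F986 into the 1F9nn shorthand used above.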

3. You might guess these:       الرياح النار الأرض والماء            What is the name of the language?

  What is the name of the language here?        زمین آگ ہوا اور پانی

4. Here is a different set of four things in different character sets. I suggest you decipher one and then deduce the remainder. Identify the language or character set.

 उत्तर  दक्षिण  पूर्व/पूरब पश्चिम             উত্তর  দক্ষিণ  পূর্ব্ব  পশ্চিম         الشمال والجنوب والشرق والغرب

北,南,东,西,        ทิศเหนือ ตอนใต้  ทิศตะวันออก  ทิศตะวันตก

There is a private use area in unicode, in the digit zones F0nnn and 10nnnn (‘planes’ 15 and 16); read about this. The same document explains that there are exactly 66 non-characters, which seems an oxymoron. You might wonder about those, and sentinels; the link explains enough to set you going.
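The figure of 66 can be checked by counting. As I understand the standard, the non-characters are a run of 32 code points from FDD0 to FDEF plus the last two code points of each of the 17 planes. A sketch, with function names of my own choosing:

```python
def noncharacters():
    """List the code points that unicode designates as non-characters."""
    points = list(range(0xFDD0, 0xFDF0))          # 32 contiguous code points
    for plane in range(17):                       # planes 0 to 16
        points.append(plane * 0x10000 + 0xFFFE)   # the last two code points
        points.append(plane * 0x10000 + 0xFFFF)   # of every plane
    return points

def plane_of(cp):
    """Return the plane number (0 to 16) that a code point sits in."""
    return cp >> 16

print(len(noncharacters()))                       # 32 + 2 * 17
print(plane_of(0xF0000), plane_of(0x100000))      # the two private use planes
```

So the private use planes each lose their last two code points to the non-characters, which is part of why the two topics turn up in the same document.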

Punctuation marks

¡ !   ⸘ ‽ “ ” ‘  ‛ ‟  .  ‚ „ ‘     ˝ ^ ° ¸ ˛ ¨ ` ˙ ˚ ª º … : ; & _ ¯ – —  # ⁊ ¶  ‡ @ % ‰ ‱ ¦ | /  \ˉ ˆ ˘  - ‒ ~ * ‼︎ ⁇  ⁉︎ ❛ ❜ ❝ ❞ ❢ ❣    ❡ ⸎ ⸐ ⸑ ⸓ ⸔ ⸕ ⸖ ⸗ ⸘⸚ ⸛ ⸜ ⸝ ⸞ ⸟ ⸠ ⸡ ⸢ ⸣ ⸤ ⸥ ⸦ ⸧ ⸨ ⸩ ⸪ ⸫ ⸬ ⸭ ⸮ ⸰ ⸋ ⸊ ⸉ ⸈ ⸇ ⸆ ⸅ ⸄
 ※*⁕⁑ ⁂ ⁁  ‵‶‷‴⁗𒑲𒑳

different and not spaced   ⁏;︔⁏   

similarly   〝〞〟‶⸗״᱿

In 2018, while checking out punctuation, I discovered the interrobang, ‽, which replaces ⁈ and/or ⁉︎. I can see how to use that; what about the dagger and double dagger? Oh, they’re for footnotes, where instead of ¹ ² ³ one would use * † ‡ (asterisk, obelus, diesis). A triple dagger exists at U+2E4B, but I can’t persuade that to print; see wikipedia on this.

I discovered here a further weird collection of punctuation marks, where weird means not previously seen and of unknown usage or intention. Do read it as an insight into medieval punctuation; I wondered if it might be useful for playwrights, since it offers multiple versions of a spoken pause (well, non-spoken, but you know what I mean…). Some of these might actually be useful: for example, maybe we could adopt one of the elevated commas to indicate a plural, microscopically different from the apostrophe? Would you prefer CDs or CD⸃s or CDⸯ (different from CD’s)? I can find reference to the unicode range 2E30 to 2E80, but I cannot see it; what I get is a load of ⸱⸲⸳⸴⹹ .

I am bothered that, having discovered a way to write an index without having to apply superscript formatting, the tiny digits are not at the same elevation: ⁹⁸⁷⁶⁵⁴³²¹⁰⁻ⁿ. The subscript equivalent works, ₀₁₂₃₄₅₆₇₈₉, but compare ⁵⁴³²; hence the difficulty in properly representing 10⁻¹⁰, for example. I keep all of these in character favourites, so reducing the work involved in input, particularly following some larger-scale change, which tends to undo the sub- or superscript effort. I discover this is a failure of the Arial font I prefer, repeated in Monaco, Helvetica, Lucida and Times (both). Verdana does this instead: ⁹⁸⁷⁶⁵⁴³²¹⁰⁻ⁿ. One solution is to change to Arial Unicode: ⁹⁸⁷⁶⁵⁴³²¹⁰⁻ⁿ. Hence 10⁻¹⁰. Yippee‼︎ That doesn’t cure x⁻¹/³, but perhaps I can use the right raised omission bracket, U+2E0D, as in x⁻¹⸍³.
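An alternative to a favourites list is to let a small script do the substitution. The sketch below uses Python’s str.maketrans; the function names are mine. It also prints the code points, which shows that ¹ ² ³ live in a different block (00B9, 00B2, 00B3) from the other superscript digits (2070 and 2074 onwards). I suspect, without being sure, that this is why some fonts draw them at a different height.

```python
# Translation tables from plain characters to their raised or lowered forms.
SUPERSCRIPTS = str.maketrans("0123456789-n", "⁰¹²³⁴⁵⁶⁷⁸⁹⁻ⁿ")
SUBSCRIPTS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")

def superscript(text):
    """Replace digits, minus and n with superscript characters."""
    return text.translate(SUPERSCRIPTS)

def subscript(text):
    """Replace digits with subscript characters."""
    return text.translate(SUBSCRIPTS)

print("10" + superscript("-10"))        # ten to the minus ten
print("NH" + subscript("3"))            # ammonia

# Show where each superscript digit lives in the code space.
for ch in superscript("0123456789"):
    print(ch, f"U+{ord(ch):04X}")
```

Characters that have no superscript form in the table, such as the solidus, pass through unchanged, so this does nothing for the x⁻¹/³ problem.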

Here’s a character I’d like to use, the Tironian et, ⁊, which would be more correct to use than the ampersand, &, whose name comes from “and per se and”, that is, & by itself. I read the history of both characters and see that we use & because we have always used it more; for example, it used to come at the end of every alphabet as a 27th character. Thus we now use & to pair items without using the word and as separator. Yet in its original use I’d write that last sentence as: use ⁊ to pair items without using the and&.

Similarly, the reversed question mark ⸮ is suggested to indicate irony, replacing the combinations (!) and (?). It is properly called a percontation point or rhetorical question mark, proposed around Caxton’s time for a question that does not require an answer. Which, surely, would occur in a lot of written prose⸮ It is U+2E2E if you can’t see it. Questions that require an answer use the regular ? symbol. In the same way the ¡ symbol was suggested to indicate ironic statements. More recently Hervé Bazin, in his 1966 essay Plumons l'Oiseau (“Let's pluck the bird”), came up with a longer list, shown here.

Quiz: name these characters        * ⁊ ‽ ^ | … † & 

Identify each of these by writing the symbol:  caret, pilcrow, silcrow, guillemet, obelus, solidus

1.   🜔 is NaCl, common salt; 🜛 is Ag, silver (I thought ammonia, NH₃, more likely); 🜺 is As, arsenic, where I thought it ought to be CO₂; 🝁 is CaO, quicklime; 🜚 is Au, gold.

Harder:  🜹 sal ammoniac, NH₄Cl;  🜦 Cu(SbO₃)₂, copper antimoniate (copper antimony oxide);  🜿 tartar, KC₄H₅O₆, potassium bitartrate.


2.  The “Five Phases” are Wood (木 mù), Fire (火 huǒ), Earth (土 tǔ), Metal (金 jīn), and Water (水 shuǐ).   Also, Jupiter-木, Saturn-土, Mercury-水, Venus-金, Mars-火; see here.   The Japanese is the same.   Wind is 风, fēng.

To find the characters, what I found worked was to put up the unicode set on my screen and search for the sound I knew, such as shui: searching for shui produces a list of characters all pronounced ‘shui’, ⽔谁睡税瞓说說誰稅 … (24 of them). This is a similar process to that used to send a Mandarin text message.

Hunting through the unicode characters to find one I recognise is useful only in showing the basis for how characters are assembled and ordered, which is itself useful. Couple that with a frequency table (to know which characters are used the most, say the top 200), so that one could work with subsets to learn, and this I deem useful. I cannot persuade my other half to see that this is desirable and she says, perhaps correctly, that I should do that myself. I continue to say that there is a need to teach these characters to westerners in a western style, not a far eastern one.

3.  الرياح النار الأرض والماء is alriyah alnaar al’ard walma’: wind, fire, earth & water again. I cannot even identify for sure where the word breaks are. I typed this into my search engine and the sounds produced do not fit well with the roman character form. The first sample is Arabic, the second Urdu.

4.  Hindi: North उत्तर, South दक्षिण, East पूर्व/पूरब, West पश्चिम. Uttar, dakshin, poorva, paschim.

North উত্তর (uttōr), South দক্ষিণ (dōkkhin), East পূর্ব্ব (purbbō), West পশ্চিম (pōscim). Bengali, so not surprisingly similar.

الشمال al shamal, الجنوب al ganoob, الشرق al sharq, الغرب al gharb.   Arabic.

வடக்கு தெற்கு, கிழக்கு மேற்கு    [Their order would be ENSW] NSEW is vadaku therku kizaku mearku. This is Tamil.

北,南,东,西  Bei, nan, dong, xi. Mandarin. Their order would be ESWN, and for a compass point the EW comes before the NS. Korean and Japanese are the same.

ทิศเหนือ ตอนใต้  ทิศตะวันออก  ทิศตะวันตก   Thai. Not what I expected and I didn’t discover pronunciation.
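If you would rather cheat at question 4, the character set (though not the language) can be guessed from the official character names, which usually begin with the name of the script. A rough sketch, assuming that naming habit holds for the samples; guess_script is a name I have made up:

```python
import unicodedata
from collections import Counter

def guess_script(text):
    """Guess the script of text from the first word of each character's name."""
    counts = Counter()
    for ch in text:
        # Count letters and combining marks; skip spaces, commas and slashes.
        if ch.isalpha() or unicodedata.category(ch).startswith("M"):
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    # The most frequent first word wins; None if nothing could be named.
    return counts.most_common(1)[0][0] if counts else None

for sample in ["उत्तर", "উত্তর", "الشمال", "வடக்கு", "北,南,东,西", "ทิศเหนือ"]:
    print(sample, guess_script(sample))
```

This tells Devanagari from Bengali from Tamil, but it cannot separate Arabic from Urdu in question 3, since both are written with Arabic letters, nor Mandarin from Japanese where they share characters.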

5.  * asterisk, ⁊ Tironian et, ‽ interrobang, ^ caret, | virgule, … ellipsis, † obelus, & ampersand


^ caret,   ¶ pilcrow,   § silcrow,   «» guillemet,   † or ÷ obelus,   / or ⁄ solidus

You may argue that solidus and virgule are synonyms, that a caret is a circumflex and so on. I agree and sympathise.

Email: David@Scoins.net      © David Scoins 2018