Hi! I've been on the Inkling thread working with @Ikaheishi to try to create a standardized encoding of the fan language for fontmakers to work from so all Splatoon fonts are mutually compatible. We're mostly done with Inkling, so it's about time to get started working on encoding Octoling. To start on the encoding, however, I need to ask a few clarifying questions:
1. Is the canonical order for the consonants
b č d ð f g h j k l m n p q r s š t þ c(ts) v w x y z ž as in the pronunciation section, or
k g p b f v t d č j þ ð s z š ž m n l q r c h x y w -, as in the orthography section? I personally prefer the second, since it seems to fit more naturally with the language itself, but if the first is how an Octoling would learn them in school, that should be the order in which they are encoded. The order for regular syllable codas (including standalone vowels) and irregular dissyllables (and the sections' relative positions) will be as in the orthography section of the document, because that is a very natural scheme from a computer-processing perspective. An alphabet doesn't have the same coding tricks to take advantage of as a syllabary, though, so any order would work for the consonants.
2. Unicode requires that character names be composed only of uppercase letters of the English alphabet (plus digits, spaces, and hyphens), so an alternative transcription, using polygraphs instead of non-English characters, will have to be used for glyph names. Some letters have an obvious English-only equivalent (for example {þ} = <TH> and {ð} = <DH>), but some have multiple, so I'd like your opinion on them:
{č} {j} {š} {ž} can either be written <CH> <J> <SH> <ZH> or the more descriptive <SOFT CH> <SOFT J> <SOFT SH> <SOFT ZH>, to recognize the fact that they are not quite how an English speaker would realize ch, j, sh, and zh. I bring it up because in Eastern European languages that distinguish between /ʃ/ and /ɕ/ (usually with {š} for /ʃ/ and an s with an acute accent for /ɕ/), the /ɕ/ is normally described as 'soft' and the other as 'hard', but since Octoling doesn't have both, it's not strictly necessary to specify. The fact that {č} appears in the dissyllable glyph {iču}, which would be written <ICHU> either way, does lean me away from specifying <SOFT>, but <SOFT> is a useful descriptor, so I'd leave it up to you.
{l} could simply be transcribed <L>, but since this transcription already allows digraphs, unlike the standard one, and /ɬ/ is a VERY different sound from the English l, it might be better to use an alternative. Welsh uses {ll} to represent /ɬ/ and, of the languages with the /ɬ/ sound, has probably had the greatest influence on English because of its geographic proximity, making <LL> a good candidate. On the other hand, {sl} is used in several other European languages for the same sound and may be more representative to the reader of the actual sound it makes, so <SL> may be a good option too.
Likewise, {x}, {q}, and {c} would work transcribed as <X>, <Q>, and <C>, and <X> even matches its IPA pronunciation /x/. But since we're already using polygraphs in this transcription, <KH>, <TL> or <TLL>, and <TS>, respectively, would more intuitively describe their sounds for someone familiar with these kinds of transcriptions. Otherwise readers might assume <X> = /ks/ or /x/, <Q> = /k/, /kw/, /ɣ/ or one of many other exotic sounds, and <C> = /k/, /s/, /ts/ or /tʃ/, depending on what previous transcriptions they've been exposed to.
{a}, {e}, {i}, {o}, {u} should probably just remain <A>, <E>, <I>, <O>, <U>, but there is a school of thought that would transcribe them <AW>, <EH>, <EE>, <OH>, <OO> for clarity, since English has an absurd number of vowels represented by the same 5 letters and, after the Great Vowel Shift, they don't align with the rest of Europe's pronunciation of {a}, {e}, {i}, {o} and {u} anyway.
If this were a pure alphabet, I'd transcribe {å} as <AA>, {ë} as <SCHWA> and {ü} as either <Y>, <ROUNDED I> or <U WITH UMLAUT>, but only <AA> works when you have syllables involving them. The neutral schwa sound is often written uh in English, so <UH> would be an option, but if you want to preserve the e, <EH> is occasionally used for the schwa as well (though if you used <EH> for {e}, this obviously isn't an option). The IPA /y/ would normally be used for {ü} (as the Ancient Greek realization of upsilon was pronounced the same as a German u-with-umlaut), but since {y} is already a consonant in the language, we'll have to get a bit creative: <IW> <IY> <UU> <UY> <UW> are all equally unsatisfactory options, but {iw} and {iy} both appear in dissyllable characters and {üu} would be transcribed hideously as <UUU> if <UU> were picked, so I'd say <UY> or <UW> would be the best options... unless you transcribed {u} as <OO>, in which case {ü} could just be <U>.
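To make the trade-offs above a bit more concrete, here is a rough Python sketch of how a candidate scheme could be sanity-checked before anything is finalized. Everything in it is an assumption for illustration only: the helper names (`is_valid_name`, `colliding_pairs`, `assign_code_points`) are made up, the base code point U+E000 is just the start of the Private Use Area used as a placeholder, the consonant list is the orthography-section order from question 1 with the trailing "-" entry left out, and the name check only looks at the allowed character set, not the finer rules about where spaces and hyphens may go.

```python
import re
from itertools import product

# Consonants in the orthography-section order (second option in question 1).
# The trailing "-" entry from that list is deliberately left out of this sketch.
CONSONANTS = "k g p b f v t d č j þ ð s z š ž m n l q r c h x y w".split()

# Simplified check: only the allowed character set (A-Z, 0-9, space, hyphen).
NAME_RE = re.compile(r"[A-Z0-9 -]+")


def is_valid_name(name):
    """Return True if the name uses only characters allowed in glyph names."""
    return NAME_RE.fullmatch(name) is not None


def colliding_pairs(names):
    """Find two-letter sequences whose concatenated transcriptions coincide.

    `names` maps each letter to its candidate ASCII transcription. The result
    maps each ambiguous transcription to the letter pairs that produce it.
    """
    seen = {}
    for first, second in product(names, repeat=2):
        seen.setdefault(names[first] + names[second], []).append((first, second))
    return {text: pairs for text, pairs in seen.items() if len(pairs) > 1}


def assign_code_points(letters, base):
    """Assign consecutive code points to letters, in the order given."""
    return {letter: base + offset for offset, letter in enumerate(letters)}


if __name__ == "__main__":
    # Hypothetical placeholder base in the Private Use Area.
    code_points = assign_code_points(CONSONANTS, 0xE000)
    print(f"k -> U+{code_points['k']:04X}, w -> U+{code_points['w']:04X}")

    # Compare two of the candidate transcriptions for {ü} against plain {u}.
    for candidate in ("UU", "UY"):
        clashes = colliding_pairs({"u": "U", "ü": candidate})
        print(candidate, "->", clashes or "no collisions")

    print(is_valid_name("SOFT CH"), is_valid_name("þ"))
```

With {ü} = <UU>, the check flags <UUU> as the transcription of both {uü} and {üu}, which is the same problem noted above; with <UY> the two-letter sequences stay distinct. The same function can be fed the full vowel or consonant table once the individual choices are settled.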
Also, on a non-encoding note, there are Western (Indo-European) languages that use identifying markers instead of word order to identify roles in a sentence... just most of them are dead. The most notable is Latin, which, like Octoling, tends to put the verb at the end of the sentence in prose (though since the identifiers make it clear, it doesn't *have* to be). Native speakers would have had a tendency towards a certain word order for certain sentence constructions as well, but the Romans were just fine letting the words fall where they may. The main difference is that Latin uses morphology (changing the ending of the word) instead of particles to identify words' roles, making Latin a highly synthetic language where Octoling is a mostly analytic language (except in the verb department). And, of course, Latin has gender and number.