Marjou (2021)

  • Marjou, Xavier. 2021. 'OTEANN: Estimating the Transparency of Orthographies with an Artificial Neural Network', ACL Anthology, text.


Measures the phonetic transparency of orthographies using data from Wiktionary (Wikeriadur/Wiktionnaire); "ent" and "eno" are controls.


Results table

Orthography          Write (%)     Read (%)
ent                  99.6 ± 0.3    99.8 ± 0.1
eno                   0.0 ± 0.0     0.0 ± 0.0
Arabic               84.3 ± 0.8    99.4 ± 0.3
Breton               80.6 ± 0.6    77.2 ± 1.6
German               69.1 ± 1.0    78.0 ± 1.5
English              36.1 ± 1.5    31.1 ± 1.3
Esperanto            99.3 ± 0.2    99.7 ± 0.1
Spanish              66.9 ± 2.0    85.3 ± 1.3
Finnish              97.7 ± 0.3    92.3 ± 0.8
French               28.0 ± 1.4    79.6 ± 1.7
"French ortofacil"   99.0 ± 0.3    89.7 ± 1.1
Italian              94.5 ± 0.8    71.6 ± 0.9
Korean               81.9 ± 1.0    97.5 ± 0.5
Dutch                72.9 ± 1.7    55.7 ± 2.2
Portuguese           75.8 ± 1.0    82.4 ± 0.9
Russian              41.3 ± 1.6    97.2 ± 0.5
Serbo-Croatian       99.2 ± 0.3    99.3 ± 0.3
Turkish              95.4 ± 0.7    95.9 ± 0.6
Chinese              19.9 ± 1.4    78.7 ± 0.9
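
The asymmetry between the two directions can be read off the table with simple arithmetic. The sketch below is only an illustration, not part of the paper: it copies the mean scores from the table (dropping the ± spreads and the two controls) and ranks the orthographies by the gap between reading and writing scores. The names `scores` and `read_write_gap` are introduced here for the example.

```python
# Mean scores in percent, copied from the table above as (write, read).
# The ± spreads and the "ent"/"eno" controls are left out of this sketch.
scores = {
    "Arabic": (84.3, 99.4),
    "Breton": (80.6, 77.2),
    "German": (69.1, 78.0),
    "English": (36.1, 31.1),
    "Esperanto": (99.3, 99.7),
    "Spanish": (66.9, 85.3),
    "Finnish": (97.7, 92.3),
    "French": (28.0, 79.6),
    "French ortofacil": (99.0, 89.7),
    "Italian": (94.5, 71.6),
    "Korean": (81.9, 97.5),
    "Dutch": (72.9, 55.7),
    "Portuguese": (75.8, 82.4),
    "Russian": (41.3, 97.2),
    "Serbo-Croatian": (99.2, 99.3),
    "Turkish": (95.4, 95.9),
    "Chinese": (19.9, 78.7),
}


def read_write_gap(write: float, read: float) -> float:
    """Return read minus write, in percentage points, rounded to one decimal."""
    return round(read - write, 1)


gaps = {name: read_write_gap(w, r) for name, (w, r) in scores.items()}

# Largest positive gap first: orthographies that are easier to read than to write.
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<18} {gap:+.1f}")
```

A positive gap means the orthography scored better in the reading direction than in the writing direction; Chinese, Russian and French show the largest positive gaps, while Italian and Dutch show the largest negative ones.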


 "p.6.
 Breton, German, Italian, Portuguese and Spanish: 
 With all their scores above 65% their orthography was also measured as fairly transparent. 
 [...]
 French: With a low writing score (28%), the results showed that the chances of correctly writing a French word on the sole basis of its pronunciation were rare, as anticipated given the high number of phoneme-to-grapheme possibilities. Without being able to access a broader context than the word itself, the ANN was not able to reliably predict how to write a French word. With a much higher reading score (80%), the ANN obtained good reading results. As a comparison, for the same language, the alternative 'fro' orthography ["French ortofacil" in the table] obtained excellent writing score (99%) and reading score (90%). Recall that the difference between its two scores is due to the fact that the 'fro' orthography is not bijective. For instance, in the reading direction, the <o> letter can be translated into /o/ or /ɔ/."
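
The non-bijectivity mentioned at the end of the excerpt can be illustrated with a toy mapping. This is a minimal sketch under stated assumptions, not the paper's method: only the fact that /o/ and /ɔ/ are both written <o> comes from the excerpt, while the other entries and the helper names (`PHONEME_TO_GRAPHEME`, `ambiguous_graphemes`, `is_bijective`) are hypothetical and chosen for the example.

```python
from collections import defaultdict

# Toy phoneme-to-grapheme table. Only the two <o> entries come from the excerpt;
# the /a/ and /i/ entries are hypothetical fillers for illustration.
PHONEME_TO_GRAPHEME = {
    "o": "o",
    "ɔ": "o",
    "a": "a",
    "i": "i",
}


def ambiguous_graphemes(mapping: dict[str, str]) -> dict[str, list[str]]:
    """Return each grapheme that can be read as more than one phoneme."""
    inverse = defaultdict(set)
    for phoneme, grapheme in mapping.items():
        inverse[grapheme].add(phoneme)
    return {g: sorted(ps) for g, ps in inverse.items() if len(ps) > 1}


def is_bijective(mapping: dict[str, str]) -> bool:
    """True when no grapheme is shared by several phonemes."""
    return not ambiguous_graphemes(mapping)


print(is_bijective(PHONEME_TO_GRAPHEME))
print(ambiguous_graphemes(PHONEME_TO_GRAPHEME))
```

In this toy table, writing is deterministic because each phoneme has exactly one spelling, but reading <o> requires choosing between two phonemes, which is consistent with a writing score higher than the reading score.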