Language (abbreviation) no. of occurrences/total no. of words = per cent
Akan (AK) 5/368 = 1,4%
Ancient Greek (AGR) 4/368 = 1,1%
Bambara (BM) 10/368 = 2,7%
Bengali (BN) 6/368 = 1,6%
Burmese (MY) 12/368 = 3,3%
English (EN) 65/368 = 17,7%
Filipino (FIL) 6/368 = 1,6%
French (FR) 16/368 = 4,3%
Fula (FF) 23/368 = 6,3%
German (DE) 14/368 = 3,8%
Hausa (HA) 7/368 = 1,9%
Hindustani (HI) 21/368 = 5,7%
Hungarian (HU) 29/368 = 7,9%
Italian (IT) 55/368 = 14,9%
Japanese (JA) 39/368 = 10,6%
Javanese (JV) 3/368 = 0,8%
Kazakh (KK) 8/368 = 2,2%
Korean (KO) 8/368 = 2,2%
Latin (LA) 24/368 = 6,5%
Malay (ML) 10/368 = 2,7%
Mandarin (MN) 12/368 = 3,3%
Marathi (MR) 35/368 = 9,5%
Modern Standard Arabic (AR) 32/368 = 8,7%
Oromo (OM) 4/368 = 1,1%
Persian (FA) 49/368 = 13,3%
Portuguese (PT) 42/368 = 11,4%
Punjabi (PA) 14/368 = 3,8%
Russian (RU) 6/368 = 1,6%
Sanskrit (SA) 5/368 = 1,4%
Spanish (ES) 34/368 = 9,2%
Swahili (SW) 4/368 = 1,1%
Tamil (TA) 25/368 = 6,8%
Telugu (TE) 0/368 = 0,0%
Thai (TH) 1/368 = 0,3%
Turkish (TR) 33/368 = 9,0%
Uzbek (UZ) 27/368 = 7,3%
Vietnamese (VI) 18/368 = 4,9%
Yoruba (YO) 42/368 = 11,4%
Based on this, knowing any single language doesn't help you much, except maybe English or Italian. But note that text 1 has relatively few unique words, with some words repeated many times, so some languages happen to be overrepresented and others underrepresented in this text. I'll do the numbers for the other two texts too and see what result I get...
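For anyone who wants to re-check the percentages or repeat the count for another text, here is a minimal Python sketch of the same arithmetic (occurrences divided by total words, one decimal, decimal comma). The function name `share` and the small `counts` sample are my own choices for illustration, and I'm assuming ordinary half-up rounding.

```python
from decimal import Decimal, ROUND_HALF_UP


def share(occurrences: int, total: int) -> str:
    """Return occurrences/total as a percentage with one decimal and a decimal comma."""
    # Decimal with ROUND_HALF_UP avoids float surprises on exact halves (e.g. 6.25).
    percent = (Decimal(occurrences) * 100 / Decimal(total)).quantize(
        Decimal("0.1"), rounding=ROUND_HALF_UP
    )
    return f"{percent}".replace(".", ",") + "%"


# A few counts copied from the list above (text 1, 368 words in total).
TOTAL_WORDS = 368
counts = {"EN": 65, "IT": 55, "FA": 49, "FF": 23, "TE": 0}

for abbreviation, occurrences in counts.items():
    print(f"{abbreviation} {occurrences}/{TOTAL_WORDS} = {share(occurrences, TOTAL_WORDS)}")
```

Swapping in the counts and total for another text gives that text's list in the same format.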