Tags: Python
I ran a series of natural language analyses on anonymized essays written by non-native speakers of English. The code is published on GitHub, so I'm not going to post it here, but I thought some of the results were interesting.
Since the population of native Arabic speakers was so high at the time, I decided to compare native Arabic speakers with non-Arabic speakers on their use of verbs, auxiliaries, and modals per sentence. This was a combined project for my LING 720 and CSCE 500 classes in graduate school.
Long story short, I analyzed 83 Arabic essays and 74 non-Arabic essays. Here is some of the data:
Arabic words per essay: 3254.144578313253
Non-Arabic words per essay: 3703.689189189189
Arabic sentences: 2043
Non-Arabic sentences: 2062
Arabic verbs per question: 2.2771084337349397
Non-Arabic verbs per question: 1.6216216216216217
Arabic auxiliaries per question: 1.0769230769230769
Non-Arabic auxiliaries per question: 0.825
Arabic modals per question: 0.3076923076923077
Non-Arabic modals per question: 0.25
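For anyone curious how averages like these can be computed, here is a minimal sketch. It is not the code from the GitHub repository. It assumes the sentences have already been part-of-speech tagged with Penn Treebank style tags (`MD` for modals, `VB*` for verbs), and the `AUXILIARIES` word list and both function names are hypothetical, made up for this illustration.

```python
from collections import Counter

# Hypothetical word list (an assumption for this sketch): forms of "be",
# "have", and "do" are treated as auxiliaries. This is a simplification,
# since these words can also act as main verbs.
AUXILIARIES = {
    "be", "am", "is", "are", "was", "were", "been", "being",
    "have", "has", "had", "having",
    "do", "does", "did",
}


def count_categories(tagged_sentence):
    """Count verbs, auxiliaries, and modals in one list of (word, tag) pairs."""
    counts = Counter()
    for word, tag in tagged_sentence:
        if tag == "MD":  # Penn Treebank tag for modals
            counts["modal"] += 1
        elif tag.startswith("VB"):  # VB, VBD, VBG, VBN, VBP, VBZ
            if word.lower() in AUXILIARIES:
                counts["aux"] += 1
            else:
                counts["verb"] += 1
    return counts


def per_sentence_averages(tagged_sentences):
    """Average number of verbs, auxiliaries, and modals per sentence."""
    categories = ("verb", "aux", "modal")
    if not tagged_sentences:
        return {name: 0.0 for name in categories}
    totals = Counter()
    for sentence in tagged_sentences:
        totals.update(count_categories(sentence))
    return {name: totals[name] / len(tagged_sentences) for name in categories}


# Two hand-tagged example sentences (not taken from the essays).
sentences = [
    [("She", "PRP"), ("can", "MD"), ("write", "VB"), ("essays", "NNS")],
    [("He", "PRP"), ("is", "VBZ"), ("reading", "VBG"), ("a", "DT"), ("book", "NN")],
]
print(per_sentence_averages(sentences))
# {'verb': 1.0, 'aux': 0.5, 'modal': 0.5}
```

In practice the tags would come from a tagger rather than being written by hand, and how auxiliaries are separated from main verbs is a judgment call that changes the numbers.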
©2023, kirillsimin.com