Christian Khairallah (Cayralat)
I'm currently a full-time research assistant at the Computational Approaches to Modeling Language (CAMeL) Lab at New York University Abu Dhabi, under the supervision of Prof. Nizar Habash.
I graduated with distinction in Electrical and Computer Engineering from the American University of Beirut (AUB) with a minor in English Language. I also hold a double Master of Science in Computational Linguistics from Charles University in Prague and Saarland University in Germany, both of which I attended as part of the Language and Communication Technologies (LCT) Erasmus Mundus Master's program.
On a personal level, I am a big language enthusiast, currently especially interested in the history of Semitic languages, and more specifically in the Arabic branch and how current variants came to co-exist with Classical Arabic in a diglossic relationship. I am natively fluent in Arabic, French, and English, beginner-level in German, and I am currently learning Italian and Syriac. In my spare time, I am an avid mélomane with an interest in the history of music in general. I mostly go hiking, cycle, swim, play ping pong, practice yoga, cook, and father a crew of insufferable indoor plants.
In my work, I focus on computational approaches aiming to reconcile the disparity between Modern Standard Arabic and Dialectal Arabic, both in terms of resource creation and processing tools. In addition to leveraging the latest computational methods to solve current NLP problems, I have taken a keen interest in dialectal and standard Arabic morphology and syntax over the past year.
During my Master's, I focused on processing spontaneous orthography in Dialectal Arabic, which arises because the dialects lack any standard orthography. I worked on tasks such as morphological analysis and segmentation, character-level neural machine translation, spelling correction, and taxonomy and dataset creation.
Maintaining the Maknuune Palestinian Arabic Lexicon and integrating it into Wiktionary. Check out the PDF Book version that I created for it!
Assisting with a grammatical error correction project for Modern Standard Arabic.
Camel Morph Project
Working with final-year capstone students on extracting a Gulf Arabic nominals lexicon in a semi-supervised way.
Working on morphological analyzers/generators for Modern Standard Arabic, Egyptian, Gulf, and Levantine (and potentially Maghrebi) Arabic.
Working on updating the Conventional Orthography for Dialectal Arabic (CODA*) guidelines [paper]
Working on a taxonomy of spontaneous orthography spelling inconsistencies for Dialectal Arabic.
Maknuune: A Large Open Palestinian Arabic Lexicon
Shahd Dibas, Christian Khairallah, Nizar Habash, Omar Fayez Sadi, Tariq Sairafy, Karmel Sarabta, Abrar Ardah
Proceedings of the Sixth Arabic Natural Language Processing Workshop co-located with EMNLP (Abu Dhabi, 2022)
[paper] [website] [pdf book]
Morphotactic Modeling in an Open-source Multi-dialectal Arabic Morphological Analyzer and Generator
Nizar Habash, Reham Marzouk, Christian Khairallah, Salam Khalifa
Proceedings of the Nineteenth SIGMORPHON Workshop co-located with NAACL (Seattle, 2022)
Orthography Standardization in Arabic Dialects
[report] [code] [data]