KSGAAL introduces Falak: A Novel Platform for Arabic Language Corpora

The King Salman Global Academy for the Arabic Language (KSGAAL) announced today, Wednesday, the launch of the Falak platform for Arabic language corpora.
Designed for researchers and enthusiasts in linguistic, social, and related fields, the platform supports the exploration, monitoring, study, and analysis of linguistic, social, and humanitarian phenomena, with a focus on their connection to language.
Dr. Abdullah Al-Washmi, KSGAAL Secretary-General, said: “Led by His Highness Prince Badr bin Abdullah bin Farhan Al-Saud, Chairman of the Board of Trustees, KSGAAL is actively engaged in various initiatives to advance the Arabic language and integrate it with computing, technology, and Artificial Intelligence (AI). Recognizing the pressing need for high-quality references for researchers in Arabic language fields, KSGAAL’s Falak platform will serve researchers in exploring linguistic data and creating smart tools, leveraging Natural Language Processing and AI technologies.”
Dr. Al-Washmi added, “Falak platform is the culmination of over 18 months of dedicated work by a team of experts and specialists whose tireless efforts ensured the platform’s alignment with KSGAAL’s vision and aspirations to enhance the Arabic language’s position in the AI era.”
The launch of the Falak platform aims to empower AI experts to serve the Arabic language effectively by providing high-quality linguistic data, streamlining data annotation processes, and fostering collaboration among language enthusiasts to advance computational linguistics. It also seeks to bolster the scientific credibility of trusted Arabic linguistic data.
As part of KSGAAL’s Computational Linguistics track, the Falak platform is a pivotal initiative to strengthen Arabic language knowledge and data in the digital age. It applies current methods for constructing digital corpora and for Natural Language Processing (NLP). The platform offers a suite of digital corpora tailored to various linguistic levels and purposes of the Arabic language, spanning spoken and written language and covering general, specialized, official, and everyday usage. It also provides a suite of data analysis tools for working with corpus texts.
The Falak platform holds more than 1.5 billion words across three corpora: the KSGAAL Corpus, the Arabic Books Corpus, and the Language Corpus for Arabic Learners. By the end of the first quarter of 2024, the platform aims to expand to ten corpora. It also offers eight computational tools for data analysis and research. Because the corpora cover different time periods, they support the study of linguistic phenomena over time, and the platform interface allows easy comparison between corpora.
The platform offers modern search and query tools alongside user-friendly, responsive analytical displays. Features include example search, contextual indexing, frequency distribution, lexical collocation, frequency lists, and n-gram word sequences. These features draw on AI and NLP technologies to support research and scholarly output. Additionally, KSGAAL organizes regular online training sessions to introduce the platform and equip specialists with the skills to make full use of it.
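For readers unfamiliar with these corpus-analysis terms, the following is a minimal Python sketch of three of them: frequency lists, n-gram word sequences, and keyword-in-context lookup (one common reading of "contextual indexing"). It is an illustration only and does not represent Falak's implementation or interface; the function names and the whitespace tokenizer are assumptions made for this example, and real Arabic text would need proper segmentation and normalization.

```python
from collections import Counter


def tokenize(text):
    # Assumption: naive whitespace tokenization, used only to keep the sketch short.
    return text.split()


def frequency_list(tokens):
    # Words paired with their counts, most frequent first.
    return Counter(tokens).most_common()


def ngrams(tokens, n):
    # Every run of n consecutive tokens, as tuples.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def concordance(tokens, keyword, window=2):
    # Keyword-in-context: each hit with up to `window` tokens on either side.
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits


if __name__ == "__main__":
    sample = "the language of the people is the language of the land"
    tokens = tokenize(sample)
    print(frequency_list(tokens)[:3])
    print(Counter(ngrams(tokens, 2)).most_common(3))
    print(concordance(tokens, "people"))
```

Counting the output of `ngrams` with `Counter` gives an n-gram frequency list, and a simple collocation measure can be built by comparing such bigram counts against the individual word counts.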
KSGAAL aims to establish the Falak platform as the go-to destination for Arabic corpora and the study of linguistic phenomena. Drawing on a large data repository that spans formal to colloquial Arabic, sourced from outlets such as social media, news websites, and official reports, Falak enables the exploration of diverse linguistic contexts. With ongoing improvements driven by user feedback, the platform will adapt to the evolving needs of its users.