Chinese-Arabic Language Resources Dataset
May 30, 2019

Project manager: Yang Erhong, Executive Deputy Director of ACLR and Professor at BLCU.

Introduction: This dataset consists of Chinese-Arabic bilingual resources drawn from news websites, dictionaries, literary works, and other sources. It covers fields including economy and trade, science and technology, religion, literature, and politics.

Data size: The Chinese-Arabic bilingual parallel sentence corpus contains approximately 14,442,034 sentence pairs. The Chinese-Arabic bilingual dictionary contains about 338,289 word pairs and more than 90,000 terms, among them 4,000 idioms, 16,000 computer terms, more than 50,000 science and technology terms, and more than 20,000 proper nouns. The Arabic-Chinese literary parallel corpus includes 23 literary works, and the Chinese-Arabic literary parallel corpus contains 15 literary works.

File format: TXT

Samples: Please sign up and visit http://202.112.195.40:8080/index.xhtml