Publications

  1. Yupu Liang, Yaping Zhang, Cong Ma, Zhiyang Zhang, Yang Zhao, Lu Xiang, Chengqing Zong, Yu Zhou. Document Image Machine Translation with Dynamic Multi-pre-trained Models Assembling. In The 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024). Mexico City, Mexico. June 16-21, 2024.

  2. Cong Ma, Yaping Zhang, Zhiyang Zhang, Yupu Liang, Yang Zhao, Yu Zhou, Chengqing Zong. Born a BabyNet with Hierarchical Parental Supervision for End-to-End Text Image Machine Translation. In The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Torino, Italia. May 20-25, 2024.

  3. Cong Ma, Yaping Zhang, Yang Zhao, Yu Zhou, Chengqing Zong. Vector Quantization Knowledge Transfer for End-to-End Text Image Machine Translation. In The 49th IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP 2024). COEX, Seoul, Korea. April 14-19, 2024. IEEE Xplore.

  4. Cong Ma, Xu Han, Linghui Wu, Yaping Zhang, Yang Zhao, Yu Zhou, and Chengqing Zong. Modal Contrastive Learning based End-to-End Text Image Machine Translation. In IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE/ACM TASLP), vol. 32, pp. 2153-2165, 2024, doi: 10.1109/TASLP.2023.3324540. IEEE Xplore.

  5. Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou, Chengqing Zong. CCIM: Cross-Modal Cross-Lingual Interactive Image Translation. In Findings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), Singapore. December 6-10, 2023. pp. 4959–4965. ACL Anthology.

  6. Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou, and Chengqing Zong. E2TIMT: Efficient and Effective Modal Adapter for Text Image Machine Translation. In The 17th International Conference on Document Analysis and Recognition (ICDAR 2023), San José, California, USA. August 21-26, 2023. pp. 70–88. Cham: Springer Nature Switzerland. arXiv, Springer Link.

  7. Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou, and Chengqing Zong. Multi-Teacher Knowledge Distillation for End-to-End Text Image Machine Translation. In The 17th International Conference on Document Analysis and Recognition (ICDAR 2023), San José, California, USA. August 21-26, 2023. pp. 484–501. Cham: Springer Nature Switzerland. (Oral Paper) arXiv, Springer Link.

  8. Cong Ma, Yaping Zhang, Mei Tu, Xu Han, Linghui Wu, Yang Zhao, Yu Zhou. Improving End-to-End Text Image Translation From the Auxiliary Text Translation Task. In Proceedings of the 26th International Conference on Pattern Recognition (ICPR 2022), Montréal, Québec, Canada (held virtually). August 21-25, 2022. pp. 1664-1670. arXiv, IEEE Xplore, GitHub.

  9. Qian Wang, Yuchen Liu, Cong Ma, Yu Lu, Yining Wang, Long Zhou, Yang Zhao, Jiajun Zhang, Chengqing Zong. CASIA’s System for IWSLT 2020 Open Domain Translation. In Proceedings of the 17th International Conference on Spoken Language Translation (IWSLT), pp. 130-139. July 9-10, 2020. ACL Anthology.

  10. Yang Zhao, Long Zhou, Qian Wang, Cong Ma, Yuchen Liu, Yining Wang, Lu Xiang, Jiajun Zhang, Yu Zhou, Chengqing Zong. Research on Low-Resource Ethnic-to-Chinese Neural Machine Translation. In The 15th China Conference on Machine Translation, CCMT 2019.

  11. ZHAO Yang, ZHOU Long, WANG Qian, MA Cong, LIU Yuchen, WANG Yining, XIANG Lu, ZHANG Jiajun, ZHOU Yu, ZONG Chengqing. The Study on Ethnic-to-Chinese Scarce-Resource Neural Machine Translation. In the Journal of Jiangxi Normal University (Natural Science), 2019, vol. 43, no. 6, pp. 630-637.

  12. Haoran Li, Junnan Zhu, Cong Ma, Jiajun Zhang, and Chengqing Zong. Read, Watch, Listen, and Summarize: Multi-Modal Summarization for Asynchronous Text, Image, Audio and Video. In IEEE Transactions on Knowledge and Data Engineering, vol. 31, no. 5, pp. 996-1009, May 2019. IEEE Xplore.

  13. Haoran Li, Junnan Zhu, Cong Ma, Jiajun Zhang and Chengqing Zong. Multi-modal Summarization for Asynchronous Collection of Text, Image, Audio and Video. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP-17), Copenhagen, Denmark. September 9-11, 2017, pp. 1103–1113. ACL Anthology.

Patents

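  1. Invention patent: CN114626392B, Training Method for an End-to-End Text Image Translation Model (端到端文本图像翻译模型训练方法). Granted February 21, 2023.
  2. Invention patent: CN113011202B, End-to-End Image Text Translation Method, System, and Device Based on Multi-Task Training (基于多任务训练的端到端图像文本翻译方法、系统、装置). Granted July 25, 2023.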