An Overview of Statistical Machine Translation Tools

Mir Aadil, M. Asger


Machine translation is a combination of many complex sub-processes, and the quality of the results of each sub-process, executed in a well-defined sequence, determines the overall accuracy of the translation. The Statistical Machine Translation (SMT) approach considers each sentence in the target language as a possible translation of any source-language sentence. Each candidate is assigned a probability, and the sentence with the highest probability is treated as the best translation. SMT is the most favoured approach not only because of its good results for corpus-rich language pairs, but also because of the tools with which it has been enhanced over the past two and a half decades. The paper gives a brief introduction to SMT: its steps and the different tools available for each step.
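As a rough illustration of the "highest probability wins" idea, the sketch below scores a few candidate target sentences and picks the best one. It assumes the common noisy-channel decomposition, in which a candidate e for a source sentence f is scored by P(f | e) * P(e); the abstract itself does not spell out a formula, so this choice is an assumption. The candidate sentences, the probabilities, and the names `CANDIDATES`, `score` and `best_translation` are all hypothetical and invented for the example.

```python
import math

# Hypothetical toy scores, invented for illustration only.
# "tm" is log P(f | e) from a translation model,
# "lm" is log P(e) from a language model.
CANDIDATES = {
    "the house is small": {"tm": math.log(0.20), "lm": math.log(0.10)},
    "small the house is": {"tm": math.log(0.20), "lm": math.log(0.001)},
    "the home is little": {"tm": math.log(0.05), "lm": math.log(0.02)},
}


def score(entry):
    """Return log(P(f | e) * P(e)), the noisy-channel score up to a constant."""
    return entry["tm"] + entry["lm"]


def best_translation(candidates):
    """Return the candidate sentence with the highest combined log-probability."""
    return max(candidates, key=lambda e: score(candidates[e]))


print(best_translation(CANDIDATES))
```

Working in log space keeps the product of small probabilities from underflowing. A real system searches a far larger candidate space with a decoder rather than enumerating sentences as done here.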








© International Journals of Advanced Research in Computer Science and Software Engineering (IJARCSSE)| All Rights Reserved | Powered by Advance Academic Publisher.