# Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation

@inproceedings{Cho2014LearningPR,
  title     = {Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation},
  author    = {Kyunghyun Cho and Bart van Merrienboer and Çaglar G{\"u}lçehre and Dzmitry Bahdanau and Fethi Bougares and Holger Schwenk and Yoshua Bengio},
  booktitle = {EMNLP},
  year      = {2014}
}
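The paper's central idea is a pair of RNNs: an encoder that folds a variable-length source sequence into a fixed-length summary vector c, and a decoder that is conditioned on c at every step while generating the target side. A minimal toy sketch of that data flow (scalar hidden states and hand-picked weights, purely illustrative, not the paper's trained GRU model):

```python
import math

# Toy RNN Encoder-Decoder sketch: the encoder compresses the whole
# source sequence into one summary value c; the decoder is unrolled
# with c injected at every step, as in the paper's decoder.
# Weights are illustrative constants, not learned parameters.

def encode(source, w_in=0.5, w_rec=0.9):
    """Fold the source sequence into a fixed-length summary c."""
    h = 0.0
    for x in source:
        h = math.tanh(w_in * x + w_rec * h)   # recurrent update
    return h                                  # c = final hidden state

def decode(c, steps, w_ctx=0.8, w_rec=0.7):
    """Unroll the decoder, conditioning every state on c."""
    h, outputs = c, []
    for _ in range(steps):
        h = math.tanh(w_rec * h + w_ctx * c)  # c enters at each step
        outputs.append(h)
    return outputs

c = encode([0.1, -0.4, 0.7])   # fixed-length representation of the source
ys = decode(c, steps=3)        # target-side hidden states
```

In the actual model both networks use the gated hidden unit introduced in the paper (the GRU), vector-valued states, and jointly trained parameters that maximize the conditional probability of the target sequence given the source.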


#### 13,151 Citations

On the Properties of Neural Machine Translation: Encoder–Decoder Approaches

- Computer Science, Mathematics
- SSST@EMNLP
- 2014

It is shown that the neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase.

Unsupervised Learning of Sentence Representations using Convolutional Neural Networks

- Computer Science
- ArXiv
- 2016

Experimental results on several benchmark datasets, and across a broad range of applications, demonstrate the superiority of the proposed model over competing methods.

Convolutional Encoders for Neural Machine Translation

- Computer Science
- 2015

We propose a general Convolutional Neural Network (CNN) encoder model for machine translation that fits within the framework of Encoder-Decoder models proposed by Cho et al. [1]. A CNN takes as…

Feedforward sequential memory networks based encoder-decoder model for machine translation

- Computer Science
- 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
- 2017

A sequence to sequence model is presented by replacing the recurrent neural networks with feedforward sequential memory networks in both encoder and decoder, which enables the new architecture to encode the entire source sentence simultaneously.

Learning Generic Sentence Representations Using Convolutional Neural Networks

- Computer Science
- EMNLP
- 2017

Experimental results on several benchmark datasets, and across a broad range of applications, demonstrate the superiority of the proposed model over competing methods.

Neural Machine Translation with Word Predictions

- Computer Science
- EMNLP
- 2017

This paper proposes to use word predictions as a mechanism for direct supervision, so that the model can predict the vocabulary of the target sentence, ensuring better representations in the encoder and decoder without using any extra data or annotation.

Neural Machine Transliteration: Preliminary Results

- Computer Science
- ArXiv
- 2016

A character-based encoder-decoder model has been proposed that consists of two Recurrent Neural Networks that are jointly trained to maximize the conditional probability of a target sequence given a source sequence.

Learning bilingual phrase representations with recurrent neural networks

- Computer Science
- MTSUMMIT
- 2015

A novel method is introduced which transforms a sequence of word feature vectors into a fixed-length phrase vector across two languages, and can be used to predict the semantic equivalence of source and target word sequences in the phrasal translation units used in phrase-based statistical machine translation.

Improved Neural Machine Translation with a Syntax-Aware Encoder and Decoder

- Computer Science
- ACL
- 2017

This paper proposes a bidirectional tree encoder, which learns both sequential and tree-structured representations, and a tree-coverage model that lets the attention depend on source-side syntax. Experiments demonstrate that the proposed models outperform the sequential attentional model.

Universal Vector Neural Machine Translation With Effective Attention

- Computer Science
- ArXiv
- 2020

A neutral/universal model representation is introduced that can be used to predict more than one language depending on the source and a provided target, reducing the number of models needed for multiple-language translation applications.

#### References

Showing 1–10 of 39 references

Joint Language and Translation Modeling with Recurrent Neural Networks

- Computer Science
- EMNLP
- 2013

This work presents a joint language and translation model based on a recurrent neural network, which predicts target words based on an unbounded history of both source and target words and shows competitive accuracy compared to the traditional channel model features.

Learning Semantic Representations for the Phrase Translation Model

- Computer Science
- ArXiv
- 2013

The results show that the new semantic-based phrase translation model significantly improves the performance of a state-of-the-art phrase-based statistical machine translation system, leading to a gain of 0.7-1.0 BLEU points.

Continuous Space Translation Models for Phrase-Based Statistical Machine Translation

- Computer Science
- COLING
- 2012

Experimental evidence is provided that the approach seems to be able to infer meaningful translation probabilities for phrase pairs not seen in the training data, or even predict a list of the most likely translations given a source phrase.

Continuous Space Translation Models with Neural Networks

- Computer Science
- NAACL
- 2012

Several continuous space translation models are explored, where translation probabilities are estimated using a continuous representation of translation units in lieu of standard discrete representations, jointly computed using a multi-layer neural network with a SOUL architecture.

Decoding with Large-Scale Neural Language Models Improves Translation

- Computer Science
- EMNLP
- 2013

This work develops a new model that combines the neural probabilistic language model of Bengio et al., rectified linear units, and noise-contrastive estimation, and incorporates it into a machine translation system both by reranking k-best lists and by direct integration into the decoder.

Recurrent Continuous Translation Models

- Computer Science
- EMNLP
- 2013

We introduce a class of probabilistic continuous translation models called Recurrent Continuous Translation Models that are purely based on continuous representations for words, phrases and sentences…

A Neural Probabilistic Language Model

- Computer Science
- J. Mach. Learn. Res.
- 2003

This work proposes to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences.

Statistical Phrase-Based Translation

- Computer Science
- NAACL
- 2003

The empirical results suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations.

Continuous space language models

- Computer Science
- Comput. Speech Lang.
- 2007

Highly efficient learning algorithms are described that enable the use of training corpora of several hundred million words, and it is shown that this approach can be incorporated into a large vocabulary continuous speech recognizer using a lattice rescoring framework at a very low additional processing time.

How to Construct Deep Recurrent Neural Networks

- Computer Science, Mathematics
- ICLR
- 2014

Two novel architectures of a deep RNN are proposed which are orthogonal to an earlier attempt of stacking multiple recurrent layers to build a deep RNN, and an alternative interpretation is provided using a novel framework based on neural operators.