Added a Conv1D layer at the end of both the decoder and next_sequence_predictor
