Mixed Multi-Head Self-Attention for Neural MT