Learning to reinforcement learn