This is the training data for my blog post Is the Reversal Curse Real?
https://andrewmayne.com/2023/11/14/is-the-reversal-curse-real/
This is a response to "The Reversal Curse: LLMs trained on A=B fail to learn B=A":
https://arxiv.org/pdf/2309.12288.pdf
Their paper's GitHub:
https://github.com/lukasberglund/reversal_curse
This was how the researcher trained the model – with the text split between "prompt" and "completion":
This is how I would have trained the model to begin with and what resulted in better results:
This is the same as the empty prompt method, but with the data in the message thread format used by ChatGPT-style models:
This is the fake data I used to get the model to refer to a fictious book Tom Cruise never wrote: