Part 6/12:
Input Sequence: A sequence of baskets over recent weeks (e.g., last 8 weeks), each basket comprising product IDs.
Target Sequence: The next basket of products the customer will buy, focusing on new items (the "novel" recommendations).
This framing enables sequence-to-sequence modeling: the model learns to generate the next basket of products, much as a translation model generates a sentence in another language.
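A minimal sketch of this framing, assuming baskets are chronological lists of product IDs (the function name and horizon parameter are illustrative, not the team's exact code):

```python
def make_example(baskets, horizon=8):
    """Split a customer's basket history into an (input, target) pair
    for next-basket prediction.

    baskets: chronological list of baskets, each a list of product IDs.
    horizon: how many recent baskets form the input sequence (e.g., 8 weeks).
    """
    *history, next_basket = baskets
    input_seq = history[-horizon:]  # the most recent baskets
    seen = {pid for basket in input_seq for pid in basket}
    # Keep only "novel" items in the target: products absent from the input window.
    target_seq = [pid for pid in next_basket if pid not in seen]
    return input_seq, target_seq

inp, tgt = make_example([[1, 2], [2, 3], [4], [2, 5]], horizon=3)
# inp → [[1, 2], [2, 3], [4]]; tgt → [5] (product 2 is dropped as already seen)
```

The input/target split mirrors a translation pair: the recent baskets are the "source sentence" and the novel items of the next basket are the "target sentence."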
Data Transformation & Tokenization
Key to this approach is converting product IDs into token sequences, analogous to words in NLP. The team designed several special tokens:
Padding
Start- and end-of-sequence indicators
Separators between baskets
Markers for empty baskets
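The special tokens above can be sketched as follows; the token spellings, fixed length, and encoding function are assumptions for illustration, not the team's actual vocabulary:

```python
# Hypothetical special tokens (spellings assumed for illustration).
PAD, BOS, EOS, SEP, EMPTY = "[PAD]", "[BOS]", "[EOS]", "[SEP]", "[EMPTY]"

def encode_baskets(baskets, max_len=10):
    """Flatten a sequence of baskets into one token sequence
    using the special tokens, right-padded to a fixed length."""
    tokens = [BOS]
    for i, basket in enumerate(baskets):
        if i > 0:
            tokens.append(SEP)  # separator between consecutive baskets
        if basket:
            tokens.extend(str(pid) for pid in basket)
        else:
            tokens.append(EMPTY)  # placeholder for a week with no purchases
    tokens.append(EOS)
    tokens += [PAD] * (max_len - len(tokens))  # pad to max_len
    return tokens

print(encode_baskets([[101, 102], [], [103]]))
# → ['[BOS]', '101', '102', '[SEP]', '[EMPTY]', '[SEP]', '103', '[EOS]', '[PAD]', '[PAD]']
```

With this encoding, a customer's basket history becomes a single flat token sequence, exactly the shape a standard sequence-to-sequence model expects as input.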