id: cord-020912-tbq7okmj
author: Batra, Vishwash
title: Variational Recurrent Sequence-to-Sequence Retrieval for Stepwise Illustration
date: 2020-03-17
pages:
extension: .txt
mime: text/plain
words: 4506
sentences: 247
flesch: 50
summary: We evaluate the model for the application of stepwise illustration of recipes, where a sequence of relevant images is retrieved to best match the steps described in the text. More concretely, we incorporate the global context information encoded in the entire text sequence (through the attention mechanism) into a variational autoencoder (VAE) at each time step, which converts the input text into an image representation in the image embedding space. To capture the semantics of the images retrieved so far (in a story/recipe), we assume that the prior over the topic given the text input is conditioned on the latent topic from the previous time step. We propose a new variational recurrent seq2seq (VRSS) retrieval model for seq2seq retrieval, which employs temporally-dependent latent variables to capture the sequential semantic structure of text-image sequences. Our work is related to cross-modal retrieval, story picturing, variational recurrent neural networks, and cooking recipe datasets.
cache: ./cache/cord-020912-tbq7okmj.txt
txt: ./txt/cord-020912-tbq7okmj.txt