Reference frames are ubiquitous in spatial cognition, and they have been essential in the visual attention literature. Although most studies involving reference frames are designed to examine how a spatial location can be defined in different ways depending on the reference point that is used, they often reach the unsupported conclusion that these constructs have a specific compositional format, the Cartesian coordinate system. The present study used a modified version of the spatial cueing paradigm to examine the extent to which attention can be guided by encoding locations within compositional (coordinate) spatial representations. Experiment 1 used 75%-valid, compositional word cues that conveyed separate information about the likely direction and distance of the target. The main results showed that these cues elicited a compositional gradient arising from the combined activation of their separate spatial dimensions, and that this gradient was consistent with the use of a Cartesian coordinate reference system. Experiment 2 used non-compositional number cues to rule out an alternative account. Experiment 3 examined how these compositional gradients change over time. Experiment 4 examined a boundary condition that could limit the emergence of these compositional gradients. These findings were interpreted within a theory of conceptual control that distinguishes conceptual from perceptual representations: conceptual representations are compositional and can be used to guide attention from one object to another, but they depend on non-compositional, perceptual representations to bind the activations arising from their separate spatial dimensions, much like non-spatial feature dimensions do.