Visual search is facilitated when observers search through repeated displays. This effect, termed contextual cueing (CC), reflects the exceptional ability of our cognitive system to utilize regularities embedded in the environment. Recent studies that tested visual search with real-world objects found that CC takes place even in heterogeneous search displays, but only when the identities (“what”) and locations (“where”) of the objects are both repeated. The purpose of the current study was to test whether the repetition of both “what” and “where” is not only necessary but also sufficient for CC. Consistent with previous results, Experiment 1 found robust CC when both the “what” and “where” information were repeated, and further revealed that the effect was not modulated by the number of search items. In contrast, Experiment 2 showed that repeating both the objects’ identities and their locations did not benefit search when the two were not bound together. CC was also absent in Experiment 3, where the objects’ identities and locations were repeated together but target locations varied randomly. Together, these results suggest that CC with real-world objects is robust but critically depends on “what” and “where” binding as well as on context-target associations.