Article 1: Why are LVLMs bad at picking up on hints? Probing the Grounding Gap in Vision-Language Models