New research reveals why even state-of-the-art large language models stumble on seemingly easy tasks—and what it takes to fix ...