201 Comments

Carlana Johnson:

ChatGPT can pass the canonical Winograd schemas because it has heard the answers before. If you give it a novel one, it fails. Someone posted a new one on Mastodon: "The ball broke the table because it was made of steel/Styrofoam." In my test just now, it chose "ball" both times.

Leif Kent:

You can reliably make ChatGPT fail the Winograd test. The trick is to split the clauses into different sentences or paragraphs, e.g.:

Person 1 has attributes A, B, and C. (More info on person 1).

Person 2 has attributes D, E, F. (More info on person 2).

Person 1 wouldn’t (transitive verb) Person 2 because they were (synonym for an attribute associated with Person 1/2).

ChatGPT doesn’t understand, so it uses statistical regularities to disambiguate. It over-indexes on Person 1, because that’s the more common construction. Sometimes it can pattern-match on synonyms, because language models do have a concept of synonymy. But you can definitely fool it in ways you couldn’t fool a person.
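
For anyone who wants to try this themselves, here is a minimal sketch of the split-clause probe described above. It assumes the openai Python client and the illustrative model name "gpt-4o-mini"; the two people, their attributes, and the synonym cues "timid"/"intimidating" are invented for the example, not taken from the comment.

```python
# Minimal sketch of the split-clause Winograd-style probe, assuming an
# OpenAI-compatible chat API. Names, attributes, and cues are illustrative.
from openai import OpenAI

client = OpenAI()

CONTEXT = (
    "Alice is shy, quiet, and polite. She rarely raises her voice.\n\n"
    "Bob is loud, imposing, and stubborn. He dominates most conversations.\n\n"
)

QUESTION = (
    "Alice wouldn't argue with Bob because they were {cue}.\n\n"
    "Who does \"they\" refer to? Answer with one name."
)

# "timid" points back to Alice's attributes, "intimidating" to Bob's, so the
# correct referent should flip between the two prompts. A model leaning on
# surface statistics tends to answer with the first-mentioned person both times.
for cue in ("timid", "intimidating"):
    prompt = CONTEXT + QUESTION.format(cue=cue)
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"{cue}: {resp.choices[0].message.content.strip()}")
```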

