It's just a multiple choice test with question prompts. This is the exact sort of thing an LLM should be very good at. This isn't chat gpt trying to do the job of an actual doctor, it would be quite abysmal at that. And even this multiple choice test had to be stacked in favor of chat gpt.
Because GPT models cannot interpret images, questions including imaging analysis, such as those related to ultrasound, electrocardiography, x-ray, magnetic resonance, computed tomography, and positron emission tomography/computed tomography imaging, were excluded.
Don't get me wrong though, I think there's some interesting ways AI can provide some useful assistive tools in medicine, especially tasks involving integrating large amounts of data. I think the authors use some misleading language though, saying things like AI "are performing at the standard we require from physicians," which would only be true if the job of a physician was filling out multiple choice tests.