1. Google's Med-PaLM 2
85% accuracy on the USMLE (according to the article, the passing threshold is roughly 60%)
https://www.medpagetoday.com/special-reports/exclusives/103522

Latest version of medically tuned AI model achieved 85% accuracy, beating previous record
Answer quality compared with humans
The generation capabilities of large language models also enable them to produce long-form answers to consumer medical questions. However, ensuring model responses are accurate, safe, and helpful has been a crucial research challenge, especially in this safety-critical domain.
In a pairwise study, Med-PaLM 2 answers were preferred to physician answers across eight of nine axes considered.
2. GPT-4 performance (Microsoft's evaluation):
https://arxiv.org/pdf/2303.13375.pdf
Even without prompt crafting (preprocessing the input so that GPT-4 understands it better), GPT-4 exceeds the USMLE passing score by more than 20 points
Our results show that GPT-4, without any specialized prompt crafting, exceeds the passing score on USMLE by over 20 points and outperforms earlier general-purpose models (GPT-3.5) as well as models specifically fine-tuned on medical knowledge (Med-PaLM, a prompt-tuned version of Flan-PaLM 540B).