Created February 21, 2024 14:05
{
  "Dataset": [
    "multimedqa",
    "medmcqa",
    "medqa_4options",
    "mmlu_anatomy",
    "mmlu_clinical_knowledge",
    "mmlu_college_biology",
    "mmlu_college_medicine",
    "mmlu_medical_genetics",
    "mmlu_professional_medicine",
    "pubmedqa"
  ],
  "Mistral-7B-v0.1": [
    0.533002,
    0.481951,
    0.508248,
    0.555556,
    0.686792,
    0.680556,
    0.595376,
    0.71,
    0.683824,
    0.754
  ],
  "Gemma": [
    0.254791,
    0.217547,
    0.254517,
    0.244444,
    0.264151,
    0.263889,
    0.346821,
    0.24,
    0.220588,
    0.552
  ]
}
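The file stores its scores column-wise: the "Dataset" list names the benchmarks, and each model's list holds the score at the same index. A minimal Python sketch for pairing them up row by row follows. The helper name `to_rows` is hypothetical and not part of the gist, and the code assumes every list has the same length as "Dataset".

```python
import json

# Scores copied verbatim from the JSON above (one list per column).
RAW = """
{
  "Dataset": ["multimedqa", "medmcqa", "medqa_4options", "mmlu_anatomy",
              "mmlu_clinical_knowledge", "mmlu_college_biology",
              "mmlu_college_medicine", "mmlu_medical_genetics",
              "mmlu_professional_medicine", "pubmedqa"],
  "Mistral-7B-v0.1": [0.533002, 0.481951, 0.508248, 0.555556, 0.686792,
                      0.680556, 0.595376, 0.71, 0.683824, 0.754],
  "Gemma": [0.254791, 0.217547, 0.254517, 0.244444, 0.264151,
            0.263889, 0.346821, 0.24, 0.220588, 0.552]
}
"""


def to_rows(table):
    """Turn the columnar dict into one dict per dataset (hypothetical helper)."""
    names = table["Dataset"]
    models = [key for key in table if key != "Dataset"]
    # Guard against misaligned columns before pairing by index.
    for model in models:
        if len(table[model]) != len(names):
            raise ValueError(f"{model} has {len(table[model])} scores, expected {len(names)}")
    return [
        {"Dataset": name, **{model: table[model][i] for model in models}}
        for i, name in enumerate(names)
    ]


rows = to_rows(json.loads(RAW))
for row in rows:
    print(f"{row['Dataset']:<28} {row['Mistral-7B-v0.1']:.3f}  {row['Gemma']:.3f}")
```

Pairing by index keeps the original file untouched and makes a length mismatch fail loudly instead of silently shifting scores between benchmarks.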
@monk1337 May I know which prompt format is used for the PubMedQA benchmark
evaluation? Thank you very much in advance!
Thank you very much for your response, @monk1337! For the google/gemma-2b-it model, may I know the prompt format for the PubMedQA benchmark? Can we use the following prompt format?
"""
"<start_of_turn>user\nWrite down the best option after \n\nThe correct answer is . \n\nQuestion: Dyschesia can be provoked by inappropriate defecation movements. The aim of this prospective study was to demonstrate dysfunction of the anal sphincter and/or the musculus (m.) puborectalis in patients with dyschesia using anorectal endosonography.\n Twenty consecutive patients with a medical history of dyschesia and a control group of 20 healthy subjects underwent linear anorectal endosonography (Toshiba models IUV 5060 and PVL-625 RT). In both groups, the dimensions of the anal sphincter and the m. puborectalis were measured at rest, and during voluntary squeezing and straining. Statistical analysis was performed within and between the two groups.\n The anal sphincter became paradoxically shorter and/or thicker during straining (versus the resting state) in 85% of patients but in only 35% of control subjects. Changes in sphincter length were statistically significantly different (p<0.01, chi(2) test) in patients compared with control subjects. The m. puborectalis became paradoxically shorter and/or thicker during straining in 80% of patients but in only 30% of controls. Both the changes in length and thickness of the m. puborectalis were significantly different (p<0.01, chi(2) test) in patients versus control subjects.\n Is anorectal endosonography valuable in dyschesia?\n (A) yes\n (B) no\n (C) maybe<end_of_turn>\n<start_of_turn>model\n"
"""
Please note that "Write down the best option after \n\nThe correct answer is ." is the instruction. Thank you very much again!
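For reference, a prompt of the shape quoted above can be assembled programmatically. The sketch below is only an illustration: the helper name `build_pubmedqa_prompt` and its parameters are hypothetical, the turn markers are the ones used in the quoted prompt, and nothing here confirms which format the scores in this gist were actually produced with.

```python
from string import ascii_uppercase


def build_pubmedqa_prompt(instruction, context, question,
                          options=("yes", "no", "maybe")):
    """Assemble a single-turn prompt in the shape quoted above (hypothetical helper).

    instruction: the task instruction placed at the start of the user turn
    context:     the abstract text for the PubMedQA item
    question:    the yes/no/maybe question about that abstract
    """
    # Label the options (A), (B), (C), ... one per line, as in the quoted prompt.
    option_lines = "\n".join(
        f" ({letter}) {option}" for letter, option in zip(ascii_uppercase, options)
    )
    user_turn = f"{instruction}\n\nQuestion: {context}\n {question}\n{option_lines}"
    # Close the user turn and open the model turn so generation starts there.
    return f"<start_of_turn>user\n{user_turn}<end_of_turn>\n<start_of_turn>model\n"


prompt = build_pubmedqa_prompt(
    instruction="Answer with the letter of the best option.",  # placeholder instruction
    context="Twenty consecutive patients with a medical history of dyschesia ...",
    question="Is anorectal endosonography valuable in dyschesia?",
)
print(prompt)
```

Keeping the instruction as a separate argument makes it easy to compare different instruction wordings while the turn structure stays fixed.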