Clémentine committed
Commit 5737cc1 · 1 Parent(s): 96bfd95
app/src/content/chapters/automated-benchmarks/designing-your-automatic-evaluation.mdx CHANGED
@@ -322,18 +322,18 @@ Once you've selected your model, you need to define what is the best possible pr
 
  <Note title="Prompt design guidelines" emoji="📝" variant="info">
  Provide a clear description of the task at hand:
- - `Your task is to do X`.
- - `You will be provided with Y`.
+ - *Your task is to do X*.
+ - *You will be provided with Y*.
 
  Provide clear instructions on the evaluation criteria, including a detailed scoring system if needed:
- - `You should evaluate property Z on a scale of 1 - 5, where 1 means ...`
- - `You should evaluate if property Z is present in the sample Y. Property Z is present if ...`
+ - *You should evaluate property Z on a scale of 1 - 5, where 1 means ...*
+ - *You should evaluate if property Z is present in the sample Y. Property Z is present if ...*
 
  Provide some additional "reasoning" evaluation steps:
- - `To judge this task, you must first make sure to read sample Y carefully to identify ..., then ...`
+ - *To judge this task, you must first make sure to read sample Y carefully to identify ..., then ...*
 
  Specify the desired output format (adding fields will help consistency):
- - `Your answer should be provided in JSON, with the following format {"Score": Your score, "Reasoning": The reasoning which led you to this score}`
+ - *Your answer should be provided in JSON, with the following format {"Score": Your score, "Reasoning": The reasoning which led you to this score}*
  </Note>
 
  You can and should take inspiration from the [MixEval](https://github.com/huggingface/lighteval/blob/main/src/lighteval/tasks/extended/mix_eval/judge_prompts.py) or [MTBench](https://github.com/huggingface/lighteval/blob/main/src/lighteval/tasks/extended/mt_bench/judge_prompt_templates.py) prompt templates.
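
As a concrete illustration of the guidelines in the note above, the sketch below assembles a judge prompt that combines a task description, a scoring criterion, explicit reasoning steps, and a constrained JSON output format, then parses the judge's verdict. This is a minimal, hypothetical example: the fluency criterion, template wording, and helper names are illustrative assumptions, not part of the chapter or of lighteval.

```python
import json

# Hypothetical judge prompt following the four guidelines above:
# task description, scoring criteria, "reasoning" steps, and a JSON output format.
JUDGE_PROMPT_TEMPLATE = """Your task is to evaluate the fluency of a model answer.
You will be provided with a question and the model's answer.

You should evaluate fluency on a scale of 1 - 5, where 1 means the answer is
unreadable and 5 means it reads like polished prose.

To judge this task, you must first read the answer carefully to identify
grammatical errors and awkward phrasing, then decide on a score.

Your answer should be provided in JSON, with the following format
{"Score": Your score, "Reasoning": The reasoning which led you to this score}"""


def build_judge_prompt(question: str, answer: str) -> str:
    """Append the sample to be judged to the instruction template."""
    return JUDGE_PROMPT_TEMPLATE + f"\n\nQuestion: {question}\nAnswer: {answer}"


def parse_judgement(raw_output: str) -> dict:
    """Extract the JSON verdict from the judge model's raw output.

    Falls back to a null score if the judge ignored the requested format;
    tracking how often this happens is a useful sanity check on the prompt.
    """
    try:
        start, end = raw_output.index("{"), raw_output.rindex("}") + 1
        return json.loads(raw_output[start:end])
    except ValueError:  # covers both missing braces and invalid JSON
        return {"Score": None, "Reasoning": raw_output}
```

Requesting named fields ("Score", "Reasoning") is what makes the parsing step above deterministic; without them you would be left scraping free-form judge text.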