Spaces: IliaLarchenko/interviewer

IliaLarchenko committed on May 10, 2024

Commit a5bc3b5

1 Parent(s): 89d9c22

Added notes to the grading transcript

Files changed (2)

tests/candidate.py CHANGED

@@ -92,7 +92,14 @@ def complete_interview(interview_type, exp_name, requirements="", difficulty="",
         response_times.append(time.time() - send_time)
         messages_candidate.append({"role": "user", "content": chat_display[-1][1]})
-        interview_data["transcript"].append(f"INTERVIEWER MESSAGE: {chat_display[-1][1]}")
+        message_split = messages_interviewer[-1]["content"].split("#NOTES#")
+        interviewer_answer = message_split[0]
+        interview_data["transcript"].append(f"INTERVIEWER MESSAGE: {interviewer_answer}")
+        if len(message_split) > 1:
+            interviewer_note = message_split[1]
+            interview_data["transcript"].append(f"INTERVIEWER HIDDEN NOTE: {interviewer_note}")
     interview_data["feedback"] = llm.end_interview_full(problem_statement_text, messages_interviewer, interview_type)
     interview_data["average_response_time_seconds"] = round(sum(response_times) / len(response_times), 2) if response_times else 0
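
The hunk above splits the last interviewer message on the "#NOTES#" marker and records the visible answer and the hidden note as separate transcript entries. A minimal standalone sketch of that logic follows, for trying it outside the test harness. The names split_interviewer_message and NOTES_MARKER are hypothetical and do not appear in the commit; only the marker string and the two transcript prefixes are taken from the diff.

```python
# Marker string taken from the diff; the constant name is an assumption.
NOTES_MARKER = "#NOTES#"


def split_interviewer_message(content: str) -> list[str]:
    """Return the transcript entries for one raw interviewer message.

    Mirrors the logic in the hunk: text before the first marker is the
    visible answer, text between the first and second marker (or the end)
    is the hidden note.
    """
    message_split = content.split(NOTES_MARKER)
    entries = [f"INTERVIEWER MESSAGE: {message_split[0]}"]
    if len(message_split) > 1:
        # Only the first note segment is kept, as in the commit.
        entries.append(f"INTERVIEWER HIDDEN NOTE: {message_split[1]}")
    return entries


if __name__ == "__main__":
    raw = "Good start. What about empty input?#NOTES#Candidate skipped edge cases."
    for entry in split_interviewer_message(raw):
        print(entry)
```

Because the split keeps only index 1 as the note, any text after a second "#NOTES#" marker would be dropped from the transcript. Whether the interviewer model can emit more than one marker per message is not stated in the commit.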

tests/testing_prompts.py CHANGED

@@ -18,6 +18,7 @@ You are reviewing an interview. Your goal is to evaluate the performance of the
 Be extremely critical and strict, you have highest quality standards.
 Even a slight mistake should lead to a negative evaluation. If in doubt about any criteria, give a negative evaluation.
 Analyze the JSON file with the interview transcript and provide your feedback.
+JSON contains, problem description, interview transcript (messages, code and hodden notes not visible to candidate), and feedback.
 You should evaluate the following aspects and return a JSON with these keys:
@@ -39,8 +40,9 @@ You should evaluate the following aspects and return a JSON with these keys:
   "interviewer_hallucinations": "The interviewer didn't say anything non-relevant or strange.",
   "interviewer_summary": "The interviewer doesn't repeat or summarize what the candidate just said.",
   "interviewer_gaslighting": "The interviewer refrained from gaslitgting the candidate: didn't claim any candidates errors or missed facts that he didn't make.",
-  "interviewer_leaks": "The interviewer didn't leak any hidden notes to candidate.",
+  "interviewer_leaks": "The interviewer didn't leak any hidden notes to candidate during the main part of the interview.",
   "interviewer_empty": "The interviewer didn't send any empty messages.",
+  "interviewer_notes": "The interviewer made reasonable notes catching candidates mistakes and important facts.",
   "feedback_quality": "The feedback was constructive and offered actionable insights.",
   "feedback_overview": "The feedback contains the recap of main mistakes and good ideas of the candidate.",