Alexandre-Numind committed
Commit 168d307
1 parent: 4eb458a

Update app.py

Files changed (1):
1. app.py  +60 -43

app.py CHANGED
@@ -126,53 +126,29 @@ divisions""","""{
 
 example4 = ("""
 Patient: Good evening doctor.
-
 Doctor: Good evening. You look pale and your voice is out of tune.
-
 Patient: Yes doctor. I’m running a temperature and have a sore throat.
-
 Doctor: Lemme see.
-
 (He touches the forehead to feel the temperature.)
-
 Doctor: You’ve moderate fever.
-
 (He then whips out a thermometer.)
-
 Patient: This thermometer is very different from the one you used the last time. (Unlike the earlier one which was placed below the tongue, this one snapped around one of the fingers.)
-
 Doctor: Yes, this is a new introduction by medical equipment companies. It’s much more convenient, as it doesn’t require cleaning after every use.
-
 Patient: That’s awesome.
-
 Doctor: Yes it is.
-
 (He removes the thermometer and looks at the reading.)
-
 Doctor: Not too high – 99.8.
-
 (He then proceeds with measuring blood pressure.)
-
 Doctor: Your blood pressure is fine.
-
 (He then checks the throat.)
-
 Doctor: It looks bit scruffy. Not good.
-
 Patient: Yes, it has been quite bad.
-
 Doctor: Do you get sweating and shivering?
-
 Patient: Not sweating, but I feel somewhat cold when I sit under a fan.
-
 Doctor: OK. You’ve few symptoms of malaria. I would suggest you undergo blood test. Nothing to worry about. In most cases, the test come out to be negative. It’s just precautionary, as there have been spurt in malaria cases in the last month or so.
-
 (He then proceeds to write the prescription.)
-
 Doctor: I’m prescribing three medicines and a syrup. The number of dots in front of each tells you how many times in the day you’ve to take them. For example, the two dots here mean you’ve to take the medicine twice in the day, once in the morning and once after dinner.
-
 Doctor: Do you’ve any other questions?
-
 Patient: No, doctor. Thank you.
 ""","""{
 "Doctor_Patient_Discussion": {
@@ -301,47 +277,88 @@ def highlight_words(input_text, json_output):
 
     return highlighted_text
 
-
+# model = AutoModelForCausalLM.from_pretrained(
+#     "numind/NuExtract-tinyv2",
+# )
 
 model = AutoModelForCausalLM.from_pretrained(
     "numind/NuExtract",
     trust_remote_code=True,
 )
 
-model.to("cuda")
 
 tokenizer = AutoTokenizer.from_pretrained("numind/NuExtract")
 tokenizer.eos = tokenizer("<|end-output|>")
 
+model.to("cuda")
 model.eval()
 
 
 def get_prediction(text, template, example):
-    print(template)
-    prompt = create_prompt(text, template, [example, "", ""])
+    size = len(tokenizer(text)["input_ids"])
+    print(size)
+    if size > 2000:
+        raise gr.Error("Max tokens for input text is 2000. Yours is: " + str(size))
+    try:
+        prompt = create_prompt(text, template, [example, "", ""])
+    except Exception:
+        raise gr.Error("Invalid template")
+
     result = generate_answer_short(prompt, model, tokenizer)
-    print(result)
     result = result.replace("\n", " ")
     r = unquote(result)
     r = json.dumps(json.loads(r), indent=4)
-    print(result)
     dic_out = json.loads(r)
     highlighted_input2 = highlight_words(text, dic_out)
     return r, highlighted_input2
 
 
-iface = gr.Interface(fn=get_prediction,
-                     inputs=[
-                         gr.Textbox(lines=2, placeholder="Enter Text here...", label="Text"),
-                         gr.Textbox(lines=2, placeholder="Enter Template input here...", label="Template"),
-                         gr.Textbox(lines=2, placeholder="Enter Example input here...", label="Example")],
-                     outputs=[gr.Textbox(label="Model Output"), gr.HTML(label="Model Output with Highlighted Words")],
-                     examples=[[example6[0], example6[1]],
-                               [example1[0], example1[1]],
-                               [example4[0], example4[1]],
-                               [example2[0], example2[1]],
-                               [example5[0], example5[1]],
-                               [example3[0], example3[1]]])
+markdown_description = """
+<!DOCTYPE html>
+<html lang="en">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>NuExtract</title>
+</head>
+<body>
+    <h1>NuExtract</h1>
+    <p>NuExtract is a fine-tuned version of Phi-3 small, trained on a private, high-quality synthetic dataset for information extraction. To use the model, provide an input text (less than 2000 tokens) and a JSON schema describing the information you need to extract. This model is purely extractive, so every piece of information it outputs is present verbatim in the text. You can also provide an example of the expected output to help the model understand your task more precisely.</p>
+    <ul>
+        <li><strong>Model</strong>: <a href="https://huggingface.co/numind/NuExtract">numind/NuExtract</a></li>
+    </ul>
+    <p>You can also find a smaller version of the model, NuExtract-tiny (0.5B), here: <a href="https://huggingface.co/numind/NuExtract-tiny">numind/NuExtract-tiny</a></p>
+    <br>
+    <br>
+    <img src="https://cdn.prod.website-files.com/638364a4e52e440048a9529c/64188f405afcf42d0b85b926_logo_numind_final.png" alt="NuMind Logo" style="vertical-align: middle; width: 200px; height: 50px;">
+    <p>We are a startup developing NuMind, a tool for creating custom information-extraction models. You can use it to build high-performance information-extraction models on your desktop.
+    <br>
+    </p>
+    <ul>
+        <li><strong>Webpage</strong>: <a href="https://www.numind.ai/">https://www.numind.ai/</a></li>
+    </ul>
+</body>
+</html>
+"""
+
+iface = gr.Interface(
+    fn=get_prediction,
+    inputs=[
+        gr.Textbox(lines=2, placeholder="Enter Text here...", label="Text"),
+        gr.Textbox(lines=2, placeholder="Enter Template input here...", label="Template"),
+        gr.Textbox(lines=2, placeholder="Enter Example input here...", label="Example")
+    ],
+    outputs=[gr.Textbox(label="Model Output"), gr.HTML(label="Model Output with Highlighted Words")],
+    examples=[
+        [example6[0], example6[1]],
+        [example1[0], example1[1]],
+        [example4[0], example4[1]],
+        [example2[0], example2[1]],
+        [example5[0], example5[1]],
+        [example3[0], example3[1]]
+    ],
+    description=markdown_description
+)
 
 
-iface.launch(debug=True)
+iface.launch(debug=True, share=True)
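The description above tells users to supply an input text of at most 2000 tokens plus a JSON template, and get_prediction() now enforces that limit before building the prompt. As a rough illustration of the same flow outside Gradio, here is a minimal, self-contained sketch; it assumes the <|input|>/<|output|> prompt layout from the NuExtract model card, since the Space's own create_prompt() and generate_answer_short() helpers are defined elsewhere in app.py and are not shown in this diff.

import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Same model and tokenizer that the Space loads above.
model = AutoModelForCausalLM.from_pretrained("numind/NuExtract", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("numind/NuExtract")
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

def extract(text, template, example=""):
    # Mirror the commit's guard: reject inputs longer than 2000 tokens.
    size = len(tokenizer(text)["input_ids"])
    if size > 2000:
        raise ValueError(f"Max tokens for input text is 2000. Yours is: {size}")
    # Assumed prompt layout (taken from the NuExtract model card); the Space
    # builds this with its own create_prompt() helper instead.
    prompt = (
        "<|input|>\n### Template:\n" + template.strip() + "\n"
        + ("### Example:\n" + example.strip() + "\n" if example else "")
        + "### Text:\n" + text.strip() + "\n<|output|>\n"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=512)
    # Decode only the newly generated tokens and stop at the end-of-output marker.
    generated = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    return json.loads(generated.split("<|end-output|>")[0].strip())  # may raise if the output is not valid JSON

Called with the same Text and Template strings as the Gradio interface, this should approximate the JSON that get_prediction() returns before highlighting.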