Spaces:
Sleeping
Sleeping
Update app.py
Browse files
app.py
CHANGED
@@ -132,39 +132,34 @@ def pdf_to_images(pdf_path, dpi=300, output_format='JPEG'):
|
|
132 |
# -- START -- set up run variables
|
133 |
|
134 |
system_msg = """
|
135 |
-
You are
|
136 |
-
|
137 |
-
|
138 |
-
|
139 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
140 |
|
141 |
### Validation Rule:
|
142 |
-
|
|
|
|
|
|
|
143 |
|
144 |
-
###
|
145 |
-
|
146 |
-
- A JSON with an error message if the validation fails.
|
147 |
""".strip()
|
148 |
|
149 |
|
150 |
-
"""
|
151 |
-
Sachant que Total à payer doit etre egal à Fond travaux alur + Part charges prévisionnelles+ Part autres travaux - le solde précédent"""
|
152 |
-
# The user message
|
153 |
-
user_msg = """
|
154 |
-
fournit les informations suivante sous format json uniquement:
|
155 |
-
-Total à payer
|
156 |
-
-Fond travaux loi alur et non pas le fond de participation
|
157 |
-
-Total Part charges prévisionnelles
|
158 |
-
-Part autres travaux
|
159 |
-
-le solde précédent
|
160 |
-
-identifier le propriétaire
|
161 |
-
- l’adresse du propriétaire ou le numéro du lot du propriétaire si l'adresse n'est pas trouvé
|
162 |
-
- nom du locataire
|
163 |
-
-l'adresse de copropriété : et non pas l'alresse de l'agence immobiere
|
164 |
-
-la reference:
|
165 |
-
- date du document
|
166 |
-
- date limit du payement
|
167 |
-
""".strip()
|
168 |
|
169 |
|
170 |
|
@@ -178,7 +173,7 @@ def process(pdf):
|
|
178 |
image_paths = pdf_to_images(pdf)
|
179 |
system = set_system_message(system_msg)
|
180 |
chat_hist = [] # list of more user/assistant items
|
181 |
-
user = set_user_message(
|
182 |
|
183 |
params = { # dictionary format for ** unpacking
|
184 |
"model": "gpt-4o",
|
|
|
132 |
# -- START -- set up run variables
|
133 |
|
134 |
system_msg = """
|
135 |
+
You are an intelligent assistant tasked with extracting and validating information from French real estate syndic documents (*appel de fonds*). These documents contain financial details, property information, and owner details. Your job is to extract and ensure the correctness of the following information:
|
136 |
+
|
137 |
+
### Task Overview:
|
138 |
+
You need to extract and validate the following fields:
|
139 |
+
1. **Total à payer**: The total amount the owner must pay for the period.
|
140 |
+
2. **Fond travaux alur**: The amount allocated to the ALUR works fund.
|
141 |
+
3. **Total Part charges prévisionnelles**: The forecasted portion of charges the owner must pay for general building maintenance, collective services, etc.
|
142 |
+
4. **Part autres travaux**: Any additional expenses related to specific works or repairs.
|
143 |
+
5. **le solde précédent**: The previous balance from past transactions (can be positive or negative).
|
144 |
+
6. **Propriétaire**: The name of the property owner.
|
145 |
+
7. **Adresse du propriétaire**: The postal address of the owner.
|
146 |
+
8. **Adresse du bien**: The location of the property (address of the unit or building).
|
147 |
+
9. **Référence**: The reference number of the document or account related to the property.
|
148 |
+
10. **Date du document**: The date when the document was issued.
|
149 |
+
11. **Date limite du paiement**: The deadline by which the payment must be made.
|
150 |
+
12. **Montant total solde en notre faveur**: The total balance in favor of the syndic (if applicable).
|
151 |
|
152 |
### Validation Rule:
|
153 |
+
The following validation rules must be respected:
|
154 |
+
- **Total à payer** = **Fond travaux alur** + **Total Part charges prévisionnelles** + **Part autres travaux**.
|
155 |
+
- The amounts should be taken from the "débit" column, not the "crédit" column, to ensure accuracy. Verify that the **Total à payer** is from the correct column (débit).
|
156 |
+
- Additionally, both **Total à payer** and **Montant total solde en notre faveur** should be extracted for a cross-check to ensure that the final amounts are accurate and reflect the correct financial state.
|
157 |
|
158 |
+
### Format for Output:
|
159 |
+
Return the extracted information in JSON format. If there is a discrepancy (such as a mismatch between amounts or amounts found in the wrong column), return an error message in JSON format explaining the issue.
|
|
|
160 |
""".strip()
|
161 |
|
162 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
163 |
|
164 |
|
165 |
|
|
|
173 |
image_paths = pdf_to_images(pdf)
|
174 |
system = set_system_message(system_msg)
|
175 |
chat_hist = [] # list of more user/assistant items
|
176 |
+
user = set_user_message(image_paths, max_size)
|
177 |
|
178 |
params = { # dictionary format for ** unpacking
|
179 |
"model": "gpt-4o",
|