We’re looking for a GPT expert to build an AI tool that performs the following functions:
1. Extracting: unstructured data from PDFs and images (some docs/parts can be handwritten)
2. Translating: into English if it is in another language
3. Mapping: the correct data to the appropriate JSON param (should be intelligent enough to handle data with or without labels)
4. Saving: the data using an API (to be created by you)
5. Showing: an error (dialog box) if a similar document already exists, with the option to keep the original file or replace it with the new one.
6. Stamping: attributes such as uploaded by, date/time, etc.
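As a rough illustration of the Mapping function, the sketch below normalizes labels and looks them up in an alias table, setting aside values that arrive without a usable label. The param names and aliases (`vessel_name`, `port`, `arrival_time`) are hypothetical placeholders; the real ones would come from the JSON format attached to this posting, and in practice a GPT model would likely handle the unlabeled values.

```python
import re

# Hypothetical JSON params and label aliases (assumptions for illustration);
# the real names would come from the JSON format attached to the posting.
ALIASES = {
    "vessel_name": ["vessel", "vessel name", "ship name", "m/v"],
    "port": ["port", "port of call"],
    "arrival_time": ["arrived", "arrival", "arrival time"],
}


def normalize(label: str) -> str:
    """Lowercase a label and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9/]+", " ", label.lower()).strip()


# Reverse index: normalized alias -> JSON param.
LOOKUP = {
    normalize(alias): key
    for key, aliases in ALIASES.items()
    for alias in aliases
}


def map_fields(pairs):
    """Map (label, value) pairs to JSON params.

    Values whose label is missing or unrecognised are returned separately,
    so that a model (or the user on the review page) can place them.
    """
    mapped, unmapped = {}, []
    for label, value in pairs:
        key = LOOKUP.get(normalize(label)) if label else None
        if key and key not in mapped:
            mapped[key] = value.strip()
        else:
            unmapped.append(value.strip())
    return mapped, unmapped
```

Calling `map_fields` with pairs produced by the extraction step returns a dict ready for the JSON output plus a list of leftover values that still need a home.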
Types of documents:
A. Statement of Facts (SOF)
B. Invoices
C. Statement of Accounts
D. Tariffs (Rate charts)
UI Workflow:
1. User uploads file
2. Document type is identified (Invoice, SOF, etc.)
3. JSON is extracted and shown on UI
4. User edits the values on the page, where required, and clicks "Save"
5. If no conflicting file is found, then data is saved in the database (endpoint will be provided by us)
6. In parallel, the file is saved on the server with a stamp (watermark and/or metadata in the PDF)
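The save step in the workflow above could be sketched roughly as follows. This is only an illustration under stated assumptions: the posting does not define what makes a document "similar", so the sketch treats two documents with identical extracted values as conflicting; `store` is an in-memory stand-in for the database endpoint; and `save_document` and `fingerprint` are hypothetical names.

```python
import hashlib
import json
from datetime import datetime, timezone


def fingerprint(data: dict) -> str:
    """Hash the extracted JSON in canonical form so key order does not matter."""
    canonical = json.dumps(
        data, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def save_document(store: dict, data: dict, uploaded_by: str,
                  replace: bool = False) -> dict:
    """Save `data` unless a conflicting record already exists.

    `store` is an in-memory stand-in (assumption) for the real database.
    The returned status lets the UI decide whether to show the
    keep/replace dialog.
    """
    key = fingerprint(data)
    if key in store and not replace:
        # Conflict: report the existing stamp and let the user choose.
        return {"status": "conflict", "existing": store[key]["stamp"]}
    stamp = {
        "uploaded_by": uploaded_by,
        "uploaded_at": datetime.now(timezone.utc).isoformat(),
    }
    store[key] = {"data": data, "stamp": stamp}
    return {"status": "saved", "stamp": stamp}
```

If the first call returns a `conflict` status and the user chooses to replace, the UI would call `save_document` again with `replace=True`. The same `stamp` values could then be written into the PDF metadata or watermark by whichever PDF library is chosen.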
Important:
1. Please analyze the attached sample SOF files and the JSON format. Applications that include outputs for the sample files will be given preference.
2. If you plan to use the online version of ChatGPT (4, 3.5, Turbo, etc.), your proposal must include an accurate estimate of the average number of tokens consumed per document (input + output + vision, if applicable).
3. The accuracy required is at least 90% across 10 sample documents, so you will be required to fine-tune the model.
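The two figures requested above could be computed along these lines. This is a sketch under assumptions: the posting does not say how accuracy is measured, so the code uses the share of expected fields whose extracted value matches exactly, averaged over the sample documents; token counts are assumed to be recorded per document from the API usage reports.

```python
from statistics import mean


def average_tokens(usage) -> float:
    """Average total tokens per document.

    Each record is a dict with `input` and `output` counts and an
    optional `vision` count (treated as 0 when absent).
    """
    return mean(u["input"] + u["output"] + u.get("vision", 0) for u in usage)


def field_accuracy(expected: dict, extracted: dict) -> float:
    """Share of expected fields whose extracted value matches exactly."""
    if not expected:
        return 1.0
    hits = sum(1 for key, value in expected.items()
               if extracted.get(key) == value)
    return hits / len(expected)


def overall_accuracy(samples) -> float:
    """Mean field accuracy over (expected, extracted) pairs."""
    return mean(field_accuracy(exp, ext) for exp, ext in samples)
```

Running `overall_accuracy` over the 10 sample documents and comparing the result with 0.9 gives one possible reading of the 90% requirement; a stricter or looser matching rule (for example, ignoring case or whitespace) would change the number.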
Milestones:
1. Extracting, Translating, Saving functions for "Statement of Facts"
2. Extracting, Translating, Saving functions for "Invoices"
3. Extracting, Translating, Saving functions for "Statement of Accounts"
4. Extracting, Translating, Saving functions for "Tariffs"
5. Saving, Showing, Stamping for all the above file types
6. UI development
7. Deployment on Server
8. Handover of Source Code
Budget: $1,000
Posted On: March 09, 2024 08:04 UTC
Category: AI Integration
Skills: ChatGPT API, GPT API, OCR Algorithm, Machine Learning, Microsoft Excel, Tesseract, Artificial Intelligence, OCR Software, Tesseract OCR, Web Development, JSON, PDF Conversion, Data Extraction
Country: India
