Choosing the Right Parser Type
Parsio currently supports four parser types: template-based, AI PDF, GPT-powered, and the OCR converter.
Choosing the right parser for your documents is crucial.
The three tables below compare their use cases, costs, supported formats, advantages, and limitations to help you pick the right parser for your use case.
Parser | Use cases | Parser description | Cost |
Template-based | Parsing transactional machine-generated emails with a fixed layout. Parsing structured file formats: XML, CSV, Excel, JSON. | Create a parsing template by highlighting the data to extract. Structured file formats are processed automatically. | 1 credit / document |
AI PDF parser | Parsing PDF files: invoices, receipts, bank statements, business cards, ID documents, PDF tables, tax forms and general documents. | Pre-trained AI models automatically extract all available data. | 5 credits / page |
GPT-powered parser | Parsing complex emails, PDFs and documents where other parsers failed. | Write a text prompt as if you were talking to a person. | 2 credits / page |
OCR converter | Convert PDFs and images to text formats; convert tables to Excel, CSV, and more. | Converts scanned documents (PDFs, images) to editable formats: Excel, CSV, JSON, HTML, Markdown, TXT. | 1 credit / page |
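To compare what a batch would cost with each parser, the credit prices in the table above can be turned into a small calculation. The following is a minimal sketch, not part of Parsio: the dictionary keys and the `estimate_credits` function are hypothetical names, and it assumes every page of every document is billed at the listed rate.

```python
# Credit prices copied from the cost column above.
# The keys ("template", "ai_pdf", ...) are illustrative names, not Parsio identifiers.
CREDITS = {
    "template": ("document", 1),  # 1 credit per document
    "ai_pdf": ("page", 5),        # 5 credits per page
    "gpt": ("page", 2),           # 2 credits per page
    "ocr": ("page", 1),           # 1 credit per page
}


def estimate_credits(parser: str, documents: int, pages_per_document: int = 1) -> int:
    """Estimate the credits needed for a batch (assumes every page is billed)."""
    unit, cost = CREDITS[parser]
    # Template-based parsing is billed per document; the others per page.
    units = documents if unit == "document" else documents * pages_per_document
    return units * cost


if __name__ == "__main__":
    # Example batch: 100 documents of 3 pages each.
    for name in CREDITS:
        print(name, estimate_credits(name, documents=100, pages_per_document=3))
```

For 100 three-page documents, the per-document price of the template-based parser makes it the cheapest option whenever its limitations are acceptable, while the AI PDF parser costs the most per batch.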
Parser | Supported formats |
Template-based | Emails, PDF, HTML, Excel, CSV, XML, DOCX, JSON |
AI PDF parser | PDF, JPG, PNG, BMP, TIFF |
GPT-powered parser | Emails, PDF, HTML, TXT, DOCX, XML, MD, JSON |
OCR converter | PDF, JPG, PNG, TIFF |
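The supported-formats table above can also be read in reverse: given a file format, which parsers accept it? The sketch below encodes the table as a plain lookup. The `SUPPORTED_FORMATS` mapping and the `parsers_for` function are hypothetical helpers for illustration, not part of any Parsio API, and the format labels are simply the lowercased names from the table.

```python
# Format lists copied from the supported-formats table above (lowercased).
SUPPORTED_FORMATS = {
    "Template-based": {"emails", "pdf", "html", "excel", "csv", "xml", "docx", "json"},
    "AI PDF parser": {"pdf", "jpg", "png", "bmp", "tiff"},
    "GPT-powered parser": {"emails", "pdf", "html", "txt", "docx", "xml", "md", "json"},
    "OCR converter": {"pdf", "jpg", "png", "tiff"},
}


def parsers_for(fmt: str) -> list[str]:
    """Return the parser types whose format list includes fmt (case-insensitive)."""
    wanted = fmt.lower()
    return [name for name, formats in SUPPORTED_FORMATS.items() if wanted in formats]


if __name__ == "__main__":
    for fmt in ("PDF", "BMP", "CSV", "TXT"):
        print(fmt, "->", parsers_for(fmt))
```

A format match only narrows the candidates: PDF is accepted by all four parser types, so the use cases, cost, and limitations still decide which one fits.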
Parser | Advantages | Limitations |
Template-based | Easy to use. Perfect for transactional emails and simple text PDFs. Built-in templates: Haro, Airbnb, Etsy, LinkedIn and more. | No OCR for PDFs. Works only with some simple text PDFs. Cannot extract tables from PDF files. |
AI PDF parser | OCR: parses both text and scanned PDFs. Extracts tables from PDFs. Parses handwritten text. Uses pre-trained AI models: no need to create parsing templates or parsing rules. | Limited to the available pre-trained models: invoices, receipts, bank statements, business cards, ID documents, PDF tables, tax forms, general documents, etc. |
GPT-powered parser | Very powerful for parsing emails, PDFs and documents with complex layouts where creating a parsing template is impossible. Easy to use: write a text prompt as if you were talking to a person. | Sometimes it cannot extract large chunks of data. In such cases, it's better to split your document into multiple pieces. |
OCR converter | Ideal for converting PDFs and images to editable text. Accurately parses tables from scanned and image documents. Preserves the original document layout. | Converts documents to editable text formats but, unlike the other parser types, does not extract structured data. |