Choosing the Right Parser Type

Parsio currently supports 4 parser types: Template-based, AI PDF parser, GPT-powered parser, and OCR converter.

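Choosing the right parser for your documents is crucial: it determines which file formats you can process, how the data is extracted, and how many credits each document costs.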

The 3 tables below show the main differences, advantages, and limitations of each, making it easier for you to choose the right parser for your use case.

Table 1. Use cases, parser description, and cost

Template-based
  Use cases: Parsing transactional machine-generated emails with a fixed layout. Parsing structured file formats: XML, CSV, Excel, JSON.
  Parser description: Create a parsing template by highlighting the data to extract. Structured file formats are processed automatically.
  Cost: 1 credit / document

AI PDF parser
  Use cases: Parsing PDF files: invoices, receipts, bank statements, business cards, ID documents, PDF tables, tax forms and general documents.
  Parser description: Pre-trained AI models automatically extract all available data.
  Cost: 5 credits / page

GPT-powered parser
  Use cases: Parsing complex emails, PDFs and documents where other parsers failed.
  Parser description: Write a text prompt as if you were talking to a person.
  Cost: 2 credits / page

OCR converter
  Use cases: Convert PDFs and images to text formats; convert tables to Excel, CSV, and more.
  Parser description: Converts scanned documents (PDFs, images) to editable formats: Excel, CSV, JSON, HTML, Markdown, TXT.
  Cost: 1 credit / page
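
To compare costs before committing to a parser, the credit rates listed above can be turned into a quick estimate. The following is a minimal sketch, not part of Parsio itself: the dictionary keys and the function name estimate_credits are hypothetical, and the rates are simply copied from the cost figures in this article, so check them against your plan before relying on the result.

```python
# Credit rates copied from the cost figures in this article.
# The keys ("template", "ai_pdf", ...) are illustrative names only,
# not identifiers used by Parsio.
CREDIT_RATES = {
    "template": (1, "document"),  # 1 credit per document
    "ai_pdf": (5, "page"),        # 5 credits per page
    "gpt": (2, "page"),           # 2 credits per page
    "ocr": (1, "page"),           # 1 credit per page
}


def estimate_credits(parser, documents, pages_per_document=1):
    """Estimate the credits needed to parse a batch of documents."""
    rate, unit = CREDIT_RATES[parser]
    if unit == "document":
        # Billed per document: the page count does not matter.
        return rate * documents
    # Billed per page: multiply by the pages in each document.
    return rate * documents * pages_per_document


if __name__ == "__main__":
    # Compare all four parser types for 100 three-page documents.
    for name in CREDIT_RATES:
        print(name, estimate_credits(name, documents=100, pages_per_document=3))
```

For a batch of multi-page documents, the per-document rate of the Template-based parser stays flat while the per-page parsers scale with page count, which is often the deciding factor when more than one parser can handle the file.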

Table 2. Supported formats

Template-based: Emails, PDF, HTML, Excel, CSV, XML, DOCX, JSON
AI PDF parser: PDF, JPG, PNG, BMP, TIFF
GPT-powered parser: Emails, PDF, HTML, TXT, DOCX, XML, MD, JSON
OCR converter: PDF, JPG, PNG, TIFF
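
The supported formats above can also be read the other way around: given a file format, which parsers accept it? The sketch below encodes the format lists from this article in a plain dictionary. The names SUPPORTED_FORMATS and parsers_for are hypothetical, and "email" and "excel" are illustrative labels for the Emails and Excel entries rather than real file extensions.

```python
# Supported formats per parser, copied from the lists in this article.
# Labels are lowercased; "email" and "excel" stand in for the
# "Emails" and "Excel" entries and are not actual file extensions.
SUPPORTED_FORMATS = {
    "Template-based": {"email", "pdf", "html", "excel", "csv", "xml", "docx", "json"},
    "AI PDF parser": {"pdf", "jpg", "png", "bmp", "tiff"},
    "GPT-powered parser": {"email", "pdf", "html", "txt", "docx", "xml", "md", "json"},
    "OCR converter": {"pdf", "jpg", "png", "tiff"},
}


def parsers_for(file_format):
    """Return the parser types that list the given format as supported."""
    wanted = file_format.lower()
    return [name for name, formats in SUPPORTED_FORMATS.items() if wanted in formats]


if __name__ == "__main__":
    print(parsers_for("pdf"))
    print(parsers_for("BMP"))
    print(parsers_for("md"))
```

A format match is only a first filter. PDF, for example, is accepted by all four parser types, so the use cases and limitations still decide which one fits.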

Table 3. Advantages and limitations

Template-based
  Advantages: Easy to use. Perfect for transactional emails and simple text PDFs. Built-in templates: Haro, Airbnb, Etsy, LinkedIn and more.
  Limitations: No OCR for PDFs. Works only with some simple text PDFs. Cannot extract tables from PDF files.

AI PDF parser
  Advantages: OCR: parses both text and scanned PDFs. Extracts tables from PDFs. Parses handwritten text. Uses pre-trained AI models: no need to create parsing templates or parsing rules.
  Limitations: Limited to the available pre-trained models: invoices, receipts, bank statements, business cards, ID documents, PDF tables, tax forms, general documents, etc.

GPT-powered parser
  Advantages: Very powerful for parsing emails, PDFs and documents with a complex layout where creating a parsing template is impossible. Easy to use: write a text prompt as if you were talking to a person.
  Limitations: Sometimes it cannot extract large chunks of data. In such cases, it's better to split your document into multiple pieces.

OCR converter
  Advantages: Ideal for converting PDFs and images to editable text. Accurately parses tables from scanned and image documents. Preserves the original document layout.
  Limitations: Converts documents to editable text formats but, unlike the other parser types, does not extract structured data.
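
The GPT-powered parser limitation noted above suggests splitting a large document into multiple pieces. For plain-text content, one simple way to do that is to cut on paragraph boundaries so no piece exceeds a chosen size. This is only a sketch under assumptions: the function name split_text and the 4000-character default are hypothetical, and Parsio does not document a specific size limit in this article.

```python
def split_text(text, max_chars=4000):
    """Split text into pieces of at most max_chars, cutting on blank lines.

    max_chars is an arbitrary illustrative limit, not a documented one.
    A single paragraph longer than max_chars is kept whole rather than
    being cut mid-sentence.
    """
    chunks = []
    current = ""
    for paragraph in text.split("\n\n"):
        # Try to append the paragraph to the piece being built.
        candidate = paragraph if not current else current + "\n\n" + paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # The piece is full: store it and start a new one.
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
    for index, piece in enumerate(split_text(sample, max_chars=40), start=1):
        print(index, repr(piece))
```

Each piece can then be submitted as a separate document. Cutting on paragraph boundaries keeps related lines together, which generally helps a prompt-based parser more than cutting at a fixed character offset.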

 


© 2024 Parsio Knowledge Base