How to Fix Unparsed (Failed to Process) Documents
Before going further, it's important to understand some basics of how Parsio works.
When you create a template and select fields to extract, Parsio will "remember" the context, i.e. the data (either plain text or HTML tags) right before and after each field. This context is used to extract data from future incoming documents: Parsio will find the same pattern in a new document and extract the data between the two parts.
It should be clear now that the data right before and after your fields must remain constant in all your emails or documents.
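The idea of context-based extraction can be illustrated with a minimal Python sketch. This is not Parsio's actual implementation, only a simplified model of it: the function name extract_between, the sample order emails and the context strings are all hypothetical.

```python
from typing import Optional


def extract_between(document: str, before: str, after: str) -> Optional[str]:
    """Return the text found between `before` and `after`, or None if either is missing."""
    start = document.find(before)
    if start == -1:
        return None  # the context before the field was not found
    start += len(before)
    end = document.find(after, start)
    if end == -1:
        return None  # the context after the field was not found
    return document[start:end]


# Hypothetical context "remembered" from the template document.
template_before = "Order number: "
template_after = "\nTotal:"

matching = "Order number: A-1042\nTotal: $30"
changed = "Order no. A-1043\nTotal: $12"  # the text before the field has changed

print(extract_between(matching, template_before, template_after))  # A-1042
print(extract_between(changed, template_before, template_after))   # None
```

The second document fails only because the text before the field is different, which is why the surrounding data has to stay constant.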
With this in mind, let's look at some tips and tricks that help resolve the most common parsing problems.
Configure Email Auto Forwarding
Some email clients (including Gmail) tend to change the email's body when you manually forward it elsewhere. Even if the email looks unchanged, the HTML code may be edited by the email client.
That's why we recommend configuring automatic email forwarding.
What does this mean for you? Most of our customers manually forward their first email to Parsio and create a template that successfully parses that email. Then, after they configure automatic forwarding, it often turns out that all newly received emails fail to be processed. To resolve this problem, simply create another template based on an auto-forwarded email.
How to Process Emails With a Slightly Different Layout
Some services may send a few types of emails for the same event or update their email layout over time.
With Parsio, you can create multiple templates in each inbox. Parsio automatically chooses the best template to apply. First, it tries the template that has the most fields. If that template fails, Parsio tries the next one, and so on.
To resolve this problem, most of the time you simply need to create an additional template.
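The selection order described above (most fields first, then fall back to the next template) can be sketched as follows. This is a simplified model under stated assumptions, not Parsio's real code: a template is represented here as a hypothetical dictionary mapping each field name to its (before, after) context, and the names apply_template and parse are made up for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical template shape: field name -> (text before, text after).
Template = Dict[str, Tuple[str, str]]


def apply_template(document: str, template: Template) -> Optional[Dict[str, str]]:
    """Extract every field of the template, or return None if any field is missing."""
    result = {}
    for name, (before, after) in template.items():
        start = document.find(before)
        if start == -1:
            return None
        start += len(before)
        end = document.find(after, start)
        if end == -1:
            return None
        result[name] = document[start:end]
    return result


def parse(document: str, templates: List[Template]) -> Optional[Dict[str, str]]:
    """Try the templates with the most fields first and return the first success."""
    for template in sorted(templates, key=len, reverse=True):
        result = apply_template(document, template)
        if result is not None:
            return result
    return None  # no template matched; the document would be marked as failed


full = {"name": ("Name: ", "\n"), "phone": ("Phone: ", "\n")}
short = {"name": ("Name: ", "\n")}

print(parse("Name: Ada\nPhone: 555-0100\n", [short, full]))
print(parse("Name: Ada\nEmail: ada@example.com\n", [short, full]))
```

The first email is parsed by the two-field template. The second has no phone line, so the larger template fails and the one-field template is used instead.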
How to Handle Emails With Missing Fields
Sometimes you want to mark fields as "optional" or "empty" because they don't exist in some of your emails. By design, Parsio must find all the fields in your document to consider it successfully parsed.
Most of the time you can create one or a few additional templates to cover all the possible cases.
However, there is an advanced trick that might be helpful.
What If You Have Dynamic Data Around a Field
Imagine you receive an email with the client's address, "Zip & State: 77701 Texas", and decide to extract the zip code only. The data right before the zip code will most likely never change (constant data). However, the state may change in almost every new email (dynamic data), for example "Zip & State: 08066 New Jersey".
Parsio will be unable to find the pattern in the incoming emails (because of that dynamic data after the field) and most of them will be marked as "Failed".
To fix the problem, you can either create a new field to extract the state name OR extract zip+state as a single value.
This often happens with timestamps such as "August 3rd, 2022 03:20pm". You can't simply extract the date "August 3rd, 2022"; you need to either extract the full date-time value or create a second "Time" field.
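The zip code case can be reproduced with a short Python sketch. It assumes a simplified matching model (find the text before the field, then the text after it) rather than Parsio's real engine, and the "Phone" line and the helper extract_between are hypothetical additions for the example.

```python
from typing import Optional


def extract_between(document: str, before: str, after: str) -> Optional[str]:
    """Return the text found between `before` and `after`, or None if either is missing."""
    start = document.find(before)
    if start == -1:
        return None
    start += len(before)
    end = document.find(after, start)
    if end == -1:
        return None
    return document[start:end]


# The template was built on "Zip & State: 77701 Texas"; this is a later email.
new_doc = "Zip & State: 08066 New Jersey\nPhone: 555-0199"

# Zip code only: the remembered text after the field is " Texas", which is dynamic.
zip_only = extract_between(new_doc, "Zip & State: ", " Texas")
print(zip_only)  # None

# Zip + state as one value: the text after the field is now constant.
combined = extract_between(new_doc, "Zip & State: ", "\nPhone:")
print(combined)  # 08066 New Jersey

# The combined value can be split afterwards if only the zip code is needed.
zip_code, state = combined.split(" ", 1)
print(zip_code, "|", state)  # 08066 | New Jersey
```

Moving the end of the field to a boundary that never changes is what makes the second extraction succeed.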
Use the Template Debugger
Parsio comes with a built-in template debugger that helps you understand where parsing fails.
The template debugger shows the template document (left) and the document being parsed (right) side by side.
The red color indicates where the parsing fails. You can compare it with the corresponding part in the template to understand what went wrong.
Use the Text Mode
Incoming emails can be parsed as HTML or plain text documents. If your emails have many layout variations (e.g. the HTML code is always changing), you can try converting them to text format, which may help reduce the number of parsing templates.
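To see why text mode can reduce the number of templates, consider the sketch below. It uses Python's standard html.parser module to strip tags. How Parsio actually converts HTML to text is not described here, so the class TextExtractor, the function html_to_text and the two sample layouts are only an illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of an HTML document and ignore all tags."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks = []

    def handle_data(self, data: str) -> None:
        text = data.strip()
        if text:
            self.chunks.append(text)


def html_to_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Two hypothetical layouts of the same email with different markup.
old_layout = "<p>Order number: <b>A-1042</b></p>"
new_layout = '<div class="row"><span>Order number:</span> <strong>A-1042</strong></div>'

print(html_to_text(old_layout))  # Order number: A-1042
print(html_to_text(new_layout))  # Order number: A-1042
```

Both layouts produce the same plain text, so a single text-mode template could cover emails that would need two templates in HTML mode.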
You can switch between HTML and Text modes on the Settings page.