Common Examples
Prevent a document from being exported
If you don't want to export some of the parsed documents (typically, based on a certain condition), simply return None. This will mark the document as "Skipped." Skipped documents aren't exported to Webhooks, Zapier, and other integrations.
return None
Here's an example of how to avoid exporting emails to your webhooks and integrations if you only need to parse email attachments:
if "content_type" in data and data["content_type"] == "message/rfc822":
    return None
Don't forget to enable the 'Content type' meta field to make it available during the post-processing step.
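To try the skip logic outside Parsio, it can be wrapped in a function. The sketch below assumes a hypothetical post_process wrapper (Parsio supplies data itself, so the wrapper exists only to make the example runnable) and that the unchanged dictionary is what gets exported otherwise. It uses dict.get so that a missing content_type key does not raise an error.

```python
def post_process(data):
    """Hypothetical wrapper standing in for the post-processing step."""
    # Skip the email itself; only its attachments should be exported.
    if data.get("content_type") == "message/rfc822":
        return None
    return data


# Sample documents (made up for illustration)
email = {"content_type": "message/rfc822", "subject": "Invoice"}
attachment = {"content_type": "application/pdf", "subject": "Invoice"}

skipped = post_process(email)        # None, so the document is skipped
exported = post_process(attachment)  # the dictionary is returned unchanged
```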
Create a new field
# Create field
data['new_field'] = 'Field value'
# Create field based on another field. If 'email' doesn't exist, use the default value
data['customer_email'] = data.get("email", 'default@example.com')
Delete a field
del data['email']
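Note that del raises a KeyError when the field is absent. If the field may be missing from some documents, a minimal sketch of a safer variant uses pop() with a default value (the sample data dictionary here is made up for illustration):

```python
data = {"name": "Peter"}  # sample parsed data without an 'email' field

# 'del data["email"]' would raise KeyError here; pop() with a default does not.
removed = data.pop("email", None)
```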
Merge multiple fields into one
# Using string concatenation (+ operator)
data['fullname'] = data['first_name'] + " " + data['last_name']
# Using formatted string literals (f-strings)
data["fullname"] = f"{data['first_name']} {data['last_name']}"
# Using format() function
data['fullname'] = '{} {}'.format(data['first_name'], data['last_name'])
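All three variants assume that both fields exist and contain text. If one of them may be missing or empty, a possible approach is to join only the parts that are present. This is a sketch with made-up sample data, not the only way to do it:

```python
data = {"first_name": "Peter", "last_name": None}  # sample parsed data

# Collect the name parts, skipping missing, None, and empty values
parts = [data.get("first_name"), data.get("last_name")]
data["fullname"] = " ".join(part.strip() for part in parts if part)
```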
Rename a field
This code will remove the original email field and insert a new field customer_email with the same value:
data['customer_email'] = data.pop('email')
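Like del, pop() without a default raises a KeyError if the field doesn't exist. When several fields need renaming and some may be absent, the idea can be sketched with a mapping of old names to new names. The renames dictionary and the field names in it are hypothetical:

```python
data = {"email": "peter@example.com", "name": "Peter"}  # sample parsed data

# Hypothetical mapping: old field name -> new field name
renames = {"email": "customer_email", "tel": "customer_phone"}

for old_name, new_name in renames.items():
    # Rename only the fields that are actually present
    if old_name in data:
        data[new_name] = data.pop(old_name)
```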
Check if field exists in parsed data
# If 'name' exists, change its value
if "name" in data:
    data['name'] = 'Peter'
# If 'name' doesn't exist, create the field
if "name" not in data:
    data['name'] = 'Peter'
# ...or in one line
data['name'] = data.get('name', 'Peter')
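Keep in mind that get() falls back to the default only when the key is absent. A field that exists but is empty (or None) keeps its empty value. If empty values should also be replaced, one option is the or operator, sketched here with made-up sample data:

```python
data = {"name": ""}  # the field exists but is empty

# get() returns the existing empty string, so the default is not used
with_get = data.get("name", "Peter")

# 'or' falls back to the default for missing, None, and empty values
data["name"] = data.get("name") or "Peter"
```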
Store parsed data in another variable
You can also create a new dictionary, which will replace the original data dictionary:
parsed_data = {
    **data,  # copy all the fields from the 'data' dictionary
    'name': 'Peter',
    'age': 27
}
return parsed_data  # do not forget to 'return' the dictionary
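Keys listed after **data override fields of the same name copied from data, and the original dictionary is left untouched. A small standalone sketch with made-up sample data (without the return, which only makes sense inside the post-processing step):

```python
data = {"name": "Anna", "city": "Paris"}  # sample parsed data

parsed_data = {
    **data,           # copies 'name' and 'city'
    "name": "Peter",  # overrides the copied 'name'
    "age": 27,        # adds a new field
}
```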
Round a floating point number
pi = 3.141592653589793
data['pi'] = round(pi, 2) # 3.14
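If the value to round arrives as text rather than as a number (an assumption about your parsed data; the total field below is hypothetical), convert it with float() first. Since round() returns a float, a format specifier can be used when a fixed number of decimals is needed as text:

```python
data = {"total": "19.999"}  # sample parsed value stored as a string

amount = round(float(data["total"]), 2)  # 20.0
data["total"] = amount
data["total_text"] = f"{amount:.2f}"     # "20.00"
```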
Formatting dates
Parsio provides two Python modules for working with dates: datetime and dateparser.
date = data['date'] # "Mon, Mar 28, 2022"
format_out = "%Y-%m-%d" # YYYY-MM-DD
# Using the 'datetime' module
format_in = "%a, %b %d, %Y"
date_out_1 = datetime.strptime(date, format_in).strftime(format_out)
data['date_formatted_1'] = date_out_1
# Using the 'dateparser' module
date_parsed = dateparser.parse(date)
date_out_2 = date_parsed.strftime(format_out)
data['date_formatted_2'] = date_out_2
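Both approaches can fail on unexpected input: strptime raises a ValueError when the text doesn't match the input format, and dateparser.parse returns None when it cannot recognize a date. Below is a minimal sketch of a defensive helper built on the standard datetime module only. The reformat_date function is hypothetical, and the explicit import is there so the example runs outside Parsio.

```python
from datetime import datetime


def reformat_date(value, format_in="%a, %b %d, %Y", format_out="%Y-%m-%d"):
    """Hypothetical helper: return the reformatted date, or None on failure."""
    try:
        return datetime.strptime(value, format_in).strftime(format_out)
    except (ValueError, TypeError):
        # ValueError: the text doesn't match format_in
        # TypeError: the value is not a string (for example, None)
        return None


good = reformat_date("Mon, Mar 28, 2022")  # "2022-03-28"
bad = reformat_date("not a date")          # None
```

With a helper like this, the field can be left out or set to a fallback value when the result is None, instead of the whole post-processing step failing.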