Common Examples

Prevent a document from being exported

To keep some of the parsed documents from being exported (typically based on a condition), return None. The document is then marked as "Skipped", and skipped documents aren't exported to webhooks, Zapier, or other integrations.

return None

Here's an example of how to avoid exporting emails to your webhooks and integrations if you only need to parse email attachments:

if "content_type" in data and data["content_type"] == "message/rfc822":
	return None

Don't forget to enable the 'Content type' meta field to make it available during the post-processing step.
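
To try the skip rule outside Parsio, it can be wrapped in a function. This is only a sketch: the function name post_process, the final return data, and the sample dictionaries are assumptions made for local testing. In Parsio, the code works on the data dictionary directly.

```python
# Hypothetical wrapper for local testing; Parsio supplies 'data' itself
def post_process(data):
    # Skip the email itself; attachments carry a different content type
    if data.get("content_type") == "message/rfc822":
        return None
    return data  # assumption: stands in for "export the document as usual"

# Sample documents (made up for illustration)
email = {"content_type": "message/rfc822", "subject": "Invoice"}
attachment = {"content_type": "application/pdf", "total": "42.00"}

print(post_process(email))       # None
print(post_process(attachment))  # {'content_type': 'application/pdf', 'total': '42.00'}
```

Using data.get("content_type") combines the "key exists" check and the comparison in one step, because get() returns None when the field is missing.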

Create a new field

# Create field
data['new_field'] = 'Field value'

# Create field based on another field. If 'email' doesn't exist, use the default value
data['customer_email'] = data.get("email", 'default@example.com')
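
Keep in mind that get() only uses the default when the field is missing, not when it exists with an empty value. The sketch below shows one way to cover both cases. The sample data dictionary and the customer_email_safe field name are assumptions for illustration.

```python
# Sample parsed data (assumed): the 'email' field exists but is empty
data = {"email": ""}

# get() falls back only when the key is missing, so the empty string is kept
data["customer_email"] = data.get("email", "default@example.com")

# 'or' also falls back when the value is empty or None
data["customer_email_safe"] = data.get("email") or "default@example.com"

print(data["customer_email"])       # prints an empty line
print(data["customer_email_safe"])  # default@example.com
```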

Delete a field

del data['email']
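
Note that del raises a KeyError when the field doesn't exist. If a field may be missing from some documents, pop() with a default value is a safer option. Here is a minimal sketch; the sample data dictionary and field names are assumptions.

```python
# Sample parsed data (assumed) without an 'email' field
data = {"name": "Peter", "phone": "555-0100"}

# pop() with a default removes the field if present and does nothing otherwise
data.pop("email", None)

# Remove several fields in one pass
for field in ("phone", "fax"):
    data.pop(field, None)

print(data)  # {'name': 'Peter'}
```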

Merge multiple fields into one

# Using string concatenation (+ operator)
data['fullname'] = data['first_name'] + " " + data['last_name']

# Using formatted string literals (f-strings)
data["fullname"] = f"{data['first_name']} {data['last_name']}"

# Using format() function
data['fullname'] = '{} {}'.format(data['first_name'], data['last_name'])
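
The three variants above expect both fields to exist. If some of them may be missing or empty, one possible approach is to join only the parts that have a value. The sample data dictionary and the middle_name field are assumptions for illustration.

```python
# Sample parsed data (assumed): 'middle_name' is missing, 'last_name' is empty
data = {"first_name": "Peter", "last_name": ""}

# Collect the candidate parts; get() returns None for missing fields
parts = [data.get("first_name"), data.get("middle_name"), data.get("last_name")]

# Keep only non-empty parts, so no stray spaces end up in the result
data["fullname"] = " ".join(str(part).strip() for part in parts if part)

print(data["fullname"])  # Peter
```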

Rename a field

This code will remove the original email field and insert a new field customer_email with the same value:

data['customer_email'] = data.pop('email')
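
To rename several fields at once, the same pop() call can be driven by a mapping of old names to new names. This is a sketch only: the sample data and the names in the renames dictionary are made up.

```python
# Sample parsed data (assumed)
data = {"email": "peter@example.com", "tel": "555-0100"}

# Hypothetical mapping: old field name -> new field name
renames = {"email": "customer_email", "tel": "customer_phone", "fax": "customer_fax"}

for old_name, new_name in renames.items():
    # Rename only the fields that exist, so a missing 'fax' doesn't raise KeyError
    if old_name in data:
        data[new_name] = data.pop(old_name)

print(data)  # {'customer_email': 'peter@example.com', 'customer_phone': '555-0100'}
```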

Check if field exists in parsed data

# If 'name' exists, change its value
if "name" in data:
	data['name'] = 'Peter'

# If 'name' doesn't exist, create the field
if "name" not in data:
	data['name'] = 'Peter'
# ...or in one line
data['name'] = data.get('name', 'Peter')
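
Python dictionaries also offer setdefault(), which creates a field only when it is missing and otherwise leaves the existing value alone. A short sketch, with a made-up sample data dictionary:

```python
# Sample parsed data (assumed)
data = {"city": "Paris"}

data.setdefault("name", "Peter")   # 'name' is missing, so it is created
data.setdefault("city", "London")  # 'city' exists, so it stays 'Paris'

print(data)  # {'city': 'Paris', 'name': 'Peter'}
```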

Store parsed data in another variable

You can also create a new dictionary which will replace the original data dictionary.

parsed_data = {
	**data,		# copy all the fields from the 'data' dictionary
	'name': 'Peter',
	'age': 27
}
return parsed_data    # do not forget to 'return' the dictionary
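
The same idea can be used to export only a few selected fields. The sketch below wraps the code in a function so it can be run outside Parsio. The function name post_process, the wanted tuple, and the sample data are assumptions for illustration.

```python
# Hypothetical wrapper for local testing; Parsio supplies 'data' itself
def post_process(data):
    wanted = ("name", "email")  # fields to keep (made-up names)
    # Build a new dictionary with only the wanted fields that actually exist
    parsed_data = {key: data[key] for key in wanted if key in data}
    return parsed_data

sample = {"name": "Peter", "email": "peter@example.com", "internal_id": 17}
print(post_process(sample))  # {'name': 'Peter', 'email': 'peter@example.com'}
```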

Round a floating point number

pi = 3.141592653589793
data['pi'] = round(pi, 2) # 3.14
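
Parsed values are often strings, and round() only accepts numbers, so the value has to be converted first. Here is a minimal sketch of that. The helper name to_rounded, the sample fields, and the assumption that commas are thousands separators are all hypothetical.

```python
# Sample parsed data (assumed): amounts arrive as text
data = {"total": "1,234.5678", "tax": "n/a"}

# Hypothetical helper: convert text to a rounded number, or None if it isn't numeric
def to_rounded(value, digits=2):
    try:
        # Assumption: commas are thousands separators, so they are dropped
        return round(float(str(value).replace(",", "")), digits)
    except ValueError:
        return None

data["total"] = to_rounded(data["total"])
data["tax"] = to_rounded(data["tax"])

print(data)  # {'total': 1234.57, 'tax': None}
```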

Formatting dates

Parsio provides two Python modules for working with dates: datetime and dateparser.

date = data['date'] 	# "Mon, Mar 28, 2022"
format_out = "%Y-%m-%d" # YYYY-MM-DD

# Using the 'datetime' module
format_in = "%a, %b %d, %Y"
date_out_1 = datetime.strptime(date, format_in).strftime(format_out)
data['date_formatted_1'] = date_out_1

# Using the 'dateparser' module (parse() returns None if the date isn't recognized)
date_parsed = dateparser.parse(date)
if date_parsed:
	date_out_2 = date_parsed.strftime(format_out)
	data['date_formatted_2'] = date_out_2
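
With the datetime approach, strptime raises a ValueError when the text doesn't match the input format. If your documents use more than one date format, one option is to try each format in turn. This is a sketch under stated assumptions: the helper name reformat_date and the list of formats are hypothetical, and the import is included only so the snippet runs on its own.

```python
from datetime import datetime

# Hypothetical helper: try each input format until one matches
def reformat_date(text, formats, format_out="%Y-%m-%d"):
    for format_in in formats:
        try:
            return datetime.strptime(text, format_in).strftime(format_out)
        except ValueError:
            continue  # this format didn't match, try the next one
    return None  # no format matched

# Assumed input formats: "Mon, Mar 28, 2022" and "28/03/2022"
formats = ["%a, %b %d, %Y", "%d/%m/%Y"]

data = {"date": "Mon, Mar 28, 2022"}  # sample parsed data
data["date_formatted"] = reformat_date(data["date"], formats)

print(data["date_formatted"])                # 2022-03-28
print(reformat_date("28/03/2022", formats))  # 2022-03-28
print(reformat_date("soon", formats))        # None
```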


© 2024 Parsio Knowledge Base