👨‍💻 Parse HTML and Text Documents Using API

In this article, we’ll see how to parse HTML and TXT documents using the Parsio API.

Use these endpoints when you already have the document content as plain text or HTML. Instead of uploading a file, you send the document content directly in the API request body. If your source document is a PDF, scanned PDF, image, DOCX, or another file, use the file upload API instead.

Which endpoint should I use?

For most API integrations, we recommend /doc-sync because it returns parsed data directly in the same HTTP response whenever parsing finishes within the timeout.

Use /doc if you prefer webhooks, do not want the request to wait, or simply want to create the document now and fetch the result later.

  • Recommended: POST https://api.parsio.io/mailboxes/<mailbox_id>/doc-sync

  • Asynchronous alternative: POST https://api.parsio.io/mailboxes/<mailbox_id>/doc

Important: text and HTML import is supported only for template-based and GPT-powered parsers. It is not supported for the AI-powered parser or OCR converter.

Authentication

To access the API, you need an API key, which you can find in your Parsio account.

This API key should be included in the X-API-Key HTTP header.

Unauthenticated requests return HTTP 401 Unauthorized.

Example request using cURL:

curl -X GET https://api.parsio.io/mailboxes/ -H "X-API-Key: <YOUR_API_KEY>"
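The same authenticated call can be sketched in Python using only the standard library. The helper names `build_mailboxes_request` and `list_mailboxes` are illustrative and not part of Parsio. The sketch also assumes the response body is JSON, which this article does not state for this endpoint.

```python
import json
import urllib.error
import urllib.request

API_BASE = "https://api.parsio.io"


def build_mailboxes_request(api_key: str) -> urllib.request.Request:
    """Build a GET request for /mailboxes/ that carries the X-API-Key header."""
    if not api_key:
        raise ValueError("api_key must not be empty")
    return urllib.request.Request(
        f"{API_BASE}/mailboxes/",
        headers={"X-API-Key": api_key},
        method="GET",
    )


def list_mailboxes(api_key: str, timeout: float = 30.0):
    """Send the request and return the decoded body.

    Assumption: the body is JSON. Raises PermissionError on HTTP 401,
    which the API returns for unauthenticated requests.
    """
    request = build_mailboxes_request(api_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 401:
            raise PermissionError("Parsio rejected the API key (HTTP 401)") from err
        raise
```

Calling `list_mailboxes("<YOUR_API_KEY>")` performs the network request; `build_mailboxes_request` can be inspected without sending anything.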

JSON Parameters

  • name (string, optional): Document name or email subject

  • html (string, optional): HTML content

  • text (string, optional): Plain-text content

  • from (string, optional): Sender email address

  • to (string, optional): Recipient email address

  • meta (object, optional): Custom payload included in the parsed JSON as __meta__

At least one of html or text must be provided.

If both are provided, Parsio uses html.
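The parameter rules above can be checked on the client before sending a request. The following is a minimal sketch; the function names and the `sender`/`recipient` argument names are illustrative (`from` is a reserved word in Python), and dropping empty strings is a choice made here, not documented Parsio behavior.

```python
from typing import Optional


def build_document_payload(
    html: Optional[str] = None,
    text: Optional[str] = None,
    name: Optional[str] = None,
    sender: Optional[str] = None,
    recipient: Optional[str] = None,
    meta: Optional[dict] = None,
) -> dict:
    """Assemble the JSON body, enforcing that html or text is present."""
    if not html and not text:
        raise ValueError("Provide at least one of html or text")
    candidates = {
        "name": name,
        "html": html,
        "text": text,
        "from": sender,
        "to": recipient,
        "meta": meta,
    }
    # Omit fields that were not supplied (None or empty string).
    return {k: v for k, v in candidates.items() if v is not None and v != ""}


def effective_content_field(payload: dict) -> str:
    """Name the field Parsio will parse: html takes precedence over text."""
    return "html" if payload.get("html") else "text"
```

For example, `build_document_payload(html="<p>Hi</p>", text="Hi")` keeps both fields in the body, and `effective_content_field` reports `"html"` for it.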

You can find your mailbox ID in the browser's address bar.

How Responses Work

Synchronous endpoint: /doc-sync

If parsing finishes within the timeout, Parsio returns a JSON object like this:

{
  "doc_id": "DOC_ID",
  "parsing_in_progress": false,
  "status": "parsed",
  "name": "New order",
  "content_type": "text/html",
  "created_at": "2026-04-02T12:00:00.000Z",
  "processed_at": "2026-04-02T12:00:04.000Z",
  "json": {
    "field1": "value1"
  }
}

If parsing does not finish within the timeout, Parsio returns:

{
  "doc_id": "DOC_ID",
  "parsing_in_progress": true,
  "status": "parsing",
  "name": null,
  "content_type": null,
  "created_at": null,
  "processed_at": null,
  "json": null
}

In that case, parsing continues in the background. You can then retrieve the parsed result later using GET /docs/<doc_id> or receive it via webhook.
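A client can branch on `parsing_in_progress` to tell the two /doc-sync responses apart. The sketch below works on an already decoded response body; the function names are illustrative, and `follow_up_path` only builds the `/docs/<doc_id>` path mentioned above without assuming anything about that endpoint's response.

```python
from typing import Any, Optional, Tuple
from urllib.parse import quote


def interpret_sync_response(body: dict) -> Tuple[bool, str, Optional[Any]]:
    """Return (finished, doc_id, parsed_json) for a /doc-sync response body.

    finished is False when Parsio is still parsing in the background,
    in which case parsed_json is None.
    """
    doc_id = body["doc_id"]
    if body.get("parsing_in_progress"):
        return False, doc_id, None
    return True, doc_id, body.get("json")


def follow_up_path(doc_id: str) -> str:
    """Path to request later when the result was not ready in time."""
    return f"/docs/{quote(doc_id, safe='')}"
```

When `finished` is false, keep the returned `doc_id` so the result can be fetched later or matched against the incoming webhook.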

Asynchronous endpoint: /doc

This endpoint returns the created document ID and starts parsing in the background immediately.

Use it when:

  • You already use webhooks

  • You do not want the request to wait for parsing

  • You want the fastest API response

Code Samples

cURL — recommended synchronous request:

curl -X POST "https://api.parsio.io/mailboxes/<YOUR_MAILBOX_ID>/doc-sync" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: <YOUR_API_KEY>" \
  -d '{
    "name": "New order",
    "from": "admin@example.com",
    "to": "me@example.com",
    "html": "<p>HTML content goes here</p>",
    "meta": {
      "external_id": "12345"
    }
  }'

cURL — asynchronous request:

curl -X POST "https://api.parsio.io/mailboxes/<YOUR_MAILBOX_ID>/doc" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: <YOUR_API_KEY>" \
  -d '{
    "name": "New order",
    "from": "admin@example.com",
    "to": "me@example.com",
    "html": "<p>HTML content goes here</p>",
    "meta": {
      "external_id": "12345"
    }
  }'
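For integrations written in Python, the two cURL requests can be approximated with the standard library as follows. This is a sketch under assumptions: the helper names are illustrative, the 60-second client timeout is an arbitrary choice (this article does not state Parsio's own timeout), and the response is assumed to be JSON for both endpoints.

```python
import json
import urllib.request

API_BASE = "https://api.parsio.io"


def build_doc_request(
    mailbox_id: str, api_key: str, payload: dict, sync: bool = True
) -> urllib.request.Request:
    """Build a POST to /doc-sync (sync=True) or /doc (sync=False)."""
    endpoint = "doc-sync" if sync else "doc"
    return urllib.request.Request(
        f"{API_BASE}/mailboxes/{mailbox_id}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,
        },
        method="POST",
    )


def send_document(
    mailbox_id: str,
    api_key: str,
    payload: dict,
    sync: bool = True,
    timeout: float = 60.0,  # client-side timeout, chosen for illustration
) -> dict:
    """Send the document and return the decoded response body."""
    request = build_doc_request(mailbox_id, api_key, payload, sync=sync)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Passing `sync=False` targets the asynchronous /doc endpoint with the same body and headers.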

If you need to upload PDFs, images, scanned documents, or other files instead of text or HTML, see the article "Parse PDF and files using API".


© 2026 Parsio Knowledge Base