Parse PDF and Files Using API

This article explains how to upload PDFs and other files for parsing using the Parsio API.

Authentication

To access the API, you need an API key, which you can find in your Parsio account.

Include this API key in the X-API-Key HTTP header of every request.

Unauthenticated requests return an HTTP 401 Unauthorized response.

Here's a request example using cURL:

curl -X GET https://api.parsio.io/mailboxes/ -H "X-API-Key: <YOUR_API_KEY>"
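
The same authenticated request and the 401 check could be sketched in Python as follows. This is a minimal illustration using only the standard library; the helper names (build_request, describe_status, list_mailboxes) are hypothetical and not part of the Parsio API, and the treatment of status codes other than 401 is an assumption.

```python
import urllib.error
import urllib.request

API_BASE = "https://api.parsio.io"


def build_request(path, api_key):
    """Build a GET request that carries the API key in the X-API-Key header."""
    return urllib.request.Request(
        API_BASE + path,
        headers={"X-API-Key": api_key},
        method="GET",
    )


def describe_status(status_code):
    """Turn an HTTP status code into a short, human-readable hint."""
    if status_code == 401:
        return "unauthorized: check the X-API-Key header"
    if 200 <= status_code < 300:  # assumption: any 2xx code means success
        return "ok"
    return f"unexpected status {status_code}"


def list_mailboxes(api_key):
    """Call GET /mailboxes/ and return (status_code, body_text)."""
    request = build_request("/mailboxes/", api_key)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses, including 401 Unauthorized.
        return err.code, err.read().decode("utf-8")
```

Calling list_mailboxes("<YOUR_API_KEY>") performs the request; passing the returned status code to describe_status gives a quick hint when the key is missing or wrong.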

How to parse PDFs and files using Parsio API

API endpoint: POST https://api.parsio.io/mailboxes/<mailbox_id>/upload

Parameters:

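  • file: Binary file object (the document to parse)
  • meta (object): Custom payload data attached to the document (optional). The PHP and Node.js samples below send it as a JSON-encoded string.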

If needed, you can provide some payload data in the meta field, which will be included in the parsed JSON as the __meta__ field. This can be helpful for linking the document with your external database, for example.
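
As a rough illustration of that linking idea, the sketch below encodes a meta payload for upload and later reads it back from a parsed document to look up a record. Everything except the meta and __meta__ field names is hypothetical: the helper functions, the in-memory orders table, and the total field are made up for the example, and it assumes __meta__ comes back as the same object that was sent.

```python
import json


def encode_meta(meta):
    """Serialize the payload as a JSON string for the multipart `meta` field."""
    return json.dumps(meta)


def extract_meta(parsed_document):
    """Return the __meta__ payload of a parsed document, or {} if it is absent."""
    return parsed_document.get("__meta__") or {}


# Hypothetical external "database", keyed by our own record ID.
orders = {42: {"customer": "ACME"}}

# Upload side: attach our record ID to the document.
form_fields = {"meta": encode_meta({"my_id": 42})}

# Receiving side: a made-up parsed result, however it is delivered to you.
parsed = {"total": "19.99", "__meta__": {"my_id": 42}}

# Use the ID carried in __meta__ to find the matching record.
order = orders.get(extract_meta(parsed).get("my_id"))
print(order)
```

Keeping only an identifier in meta, rather than a full record, keeps the upload small and leaves your own database as the single source of truth.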

Supported formats: PDF, HTML, CSV, TXT, DOCX, RTF or XML

Max. file size: 20 MB
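
These limits can be checked on the client before uploading. The sketch below is one possible pre-flight check, not something the API requires: the check_upload helper is hypothetical, matching by file extension is a simplification, and treating 20 MB as 20 × 1024 × 1024 bytes is an assumption (the exact byte limit is not stated here).

```python
from pathlib import Path

# Extensions derived from the supported formats listed above.
SUPPORTED_EXTENSIONS = {".pdf", ".html", ".csv", ".txt", ".docx", ".rtf", ".xml"}

# Assumption: 20 MB is interpreted as 20 * 1024 * 1024 bytes.
MAX_FILE_SIZE_BYTES = 20 * 1024 * 1024


def check_upload(filename, size_bytes):
    """Return a list of problems with the file; an empty list means it looks fine."""
    problems = []

    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file extension: {extension or '(none)'}")

    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(f"file is too large: {size_bytes} bytes")

    return problems
```

For a file on disk, the check might be called as check_upload(path.name, path.stat().st_size), where path is a pathlib.Path; the server remains the final authority on what it accepts.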

Your mailbox ID can be found in the browser's address bar.

Code samples

cURL:

curl \
-X POST \
https://api.parsio.io/mailboxes/<YOUR_MAILBOX_ID>/upload \
-F 'file=@./receipt.pdf' \
-H "X-API-Key: <YOUR_API_KEY>"

PHP:

<?php

$apikey = '<API_KEY>';
$url = 'https://api.parsio.io/mailboxes/<MAILBOX_ID>/upload';
$filepath = './invoice.pdf';

$curl = curl_init();

curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_HTTPHEADER, array(
    'X-API-Key: ' . $apikey
));
curl_setopt($curl, CURLOPT_POST, true);

$meta = array(
  'foo' => 'bar',
  'my_id' => 42,
);
$metaJson = json_encode($meta);

curl_setopt($curl, CURLOPT_POSTFIELDS, array(
    'file' => curl_file_create($filepath, 'application/pdf', 'invoice.pdf'),
    'meta' => $metaJson
));

$response = curl_exec($curl);
curl_close($curl);

echo $response;

Python:

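import json

import requests

headers = {"X-API-Key": "<API_KEY>"}
url = "https://api.parsio.io/mailboxes/<MAILBOX_ID>/upload"
# Optional: custom payload, returned in the parsed data as __meta__.
meta = {"foo": "bar", "my_id": 42}

# Send the file and the JSON-encoded meta as multipart form fields.
with open("invoice.pdf", "rb") as f:
    response = requests.post(
        url, headers=headers, files={"file": f}, data={"meta": json.dumps(meta)}
    )

# Print the HTTP status code and the raw response body.
print(response.status_code)
print(response.text)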

Node.js:

const fetch = require("node-fetch");
const fs = require("fs");
const FormData = require("form-data");

const APIKEY = "<YOUR_API_KEY>";
const mailboxId = "<MAILBOX_ID>";
const filePath = "/path/to/your/file.pdf";
// Optional: pass a custom payload, which will be available in the parsed data.
const metadata = { foo: "bar" };

async function importFile(mailboxId, filePath, metadata) {
  const url = `https://api.parsio.io/mailboxes/${mailboxId}/upload`;

  const fileStream = fs.createReadStream(filePath);

  const form = new FormData();
  form.append("file", fileStream);
  form.append("meta", JSON.stringify(metadata));

  try {
    const response = await fetch(url, {
      method: "POST",
      body: form,
      headers: {
        "X-API-Key": APIKEY,
      },
    });

    const docId = await response.json();
    console.log(response.status);
    console.log("Document id:", docId);
  } catch (e) {
    console.error("Error:", e.message);
  }
}

importFile(mailboxId, filePath, metadata);

© 2024 Parsio Knowledge Base