PDF to Format

PDF Parsing and Formatted Output
Convert an uploaded PDF into a structured downstream format such as Markdown or text, based on the selected output type.

Method: POST
Path: /v1/imagerecognition/pdf2format
Demo: https://api.gugudata.io/v1/imagerecognition/pdf2format/demo
OpenAPI: https://gugudata.io/assets/openapi/gugudata.openapi.3.1.json

Request Parameters:

appkey (string, required): Application key used for request authentication. Supply the value as a query parameter, form field, or multipart field according to the request content type.

type (string, required): Output format selector. The Key Features section lists TEXT, HTML, XML, and TAG as the supported output formats; confirm the exact parameter values on the details page.

pdffile (file, required): PDF file uploaded as multipart form data.
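The three parameters above could be sent as one multipart request. The following is a minimal sketch using only the Python standard library, not an official client. The host is inferred from the demo URL, and the helper names `build_multipart` and `build_request`, the `TEXT` type value, and the file name are illustrative assumptions.

```python
import uuid
import urllib.request

# Assumption: host inferred from the demo URL; confirm before use.
API_URL = "https://api.gugudata.io/v1/imagerecognition/pdf2format"


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    # File part: header, raw bytes, then the closing boundary.
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n".encode("utf-8")
    )
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(appkey, output_type, filename, file_bytes):
    """Build (but do not send) the POST request with appkey, type, and pdffile."""
    body, content_type = build_multipart(
        {"appkey": appkey, "type": output_type},
        "pdffile",
        filename,
        file_bytes,
    )
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen`, for example `urllib.request.urlopen(build_request("YOUR_APPKEY", "TEXT", "sample.pdf", pdf_bytes))`.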

Response Fields:

DataStatus.statusCode (integer, required): Application-level status code returned by the current v1 contract.

DataStatus.statusDescription (string, required): Application-level status message returned by the current v1 contract.

DataStatus.responseDateTime (string, required): Response timestamp returned by the current service contract.

DataStatus.dataTotalCount (integer, required): Total number of records that match the request.

Data.result (string, required): Parsed PDF content returned by the API. Its format is determined by the type parameter.
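The response fields above map naturally onto a small typed structure. The sketch below assumes the JSON keys use exactly the names listed in this section; the class name `Pdf2FormatResponse` and the function `parse_response` are hypothetical, and the key casing should be checked against a real response.

```python
import json
from dataclasses import dataclass


@dataclass
class Pdf2FormatResponse:
    status_code: int
    status_description: str
    response_date_time: str
    data_total_count: int
    result: str


def parse_response(body_text):
    """Parse a JSON response body into a Pdf2FormatResponse.

    Assumption: keys follow the documented names (DataStatus.statusCode, ...).
    Raises KeyError if a required field is missing.
    """
    payload = json.loads(body_text)
    status = payload["DataStatus"]
    return Pdf2FormatResponse(
        status_code=status["statusCode"],
        status_description=status["statusDescription"],
        response_date_time=status["responseDateTime"],
        data_total_count=status["dataTotalCount"],
        result=payload["Data"]["result"],
    )
```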

HTTP Status Codes:

200: Request processed successfully. Some endpoints expose a separate application-level status field in the response body, such as DataStatus.statusCode.

400: Invalid request parameters or request format. Check required fields, data types, and request body format.

401: Missing or unknown application key. Provide a valid appkey with the request.

403: The application key is recognized but access is not allowed. The key may be expired, inactive, or not permitted for the requested API.

429: Request rate or trial usage limit exceeded. Reduce concurrency or retry after the limit window resets.

500: Internal service error. Retry later or contact support if the error persists.

503: Upstream service unavailable. Retry later; the requested upstream dependency is temporarily unavailable.
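The descriptions above suggest retrying on 429, 500, and 503 but not on 400, 401, or 403. A minimal sketch of that policy follows. The exponential backoff schedule and the names `is_retryable` and `backoff_delays` are illustrative assumptions; the documentation only says to retry later.

```python
# HTTP statuses whose descriptions recommend retrying.
RETRYABLE_HTTP_STATUSES = {429, 500, 503}


def is_retryable(http_status):
    """Return True if the status description recommends retrying."""
    return http_status in RETRYABLE_HTTP_STATUSES


def backoff_delays(attempts, base_seconds=1.0, cap_seconds=30.0):
    """Return one delay per attempt, doubling each time up to a cap.

    Assumption: these values are placeholders, not documented limits.
    """
    return [min(cap_seconds, base_seconds * 2 ** i) for i in range(attempts)]
```

A caller would sleep for each delay in turn and stop as soon as a response arrives whose status is not retryable.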

Business Status Codes:

100 Normal response: No additional remark.

101 Parameter error: No additional remark.

102 Request rate limited: Requests cannot exceed 100 per second.

103 Account overdue: No additional remark.

104 Invalid APPKEY: Check that the APPKEY you pass is the one obtained from the developer center.

110 API response error: No additional remark.
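Because an HTTP 200 can still carry a non-success business code, a client may want to check the code explicitly. The sketch below maps the codes listed above to their names and raises on anything other than 100. The exception class `Pdf2FormatError` and the function `check_business_status` are hypothetical names.

```python
# Business status codes as listed in this section.
BUSINESS_STATUS = {
    100: "Normal response",
    101: "Parameter error",
    102: "Request rate limited",
    103: "Account overdue",
    104: "Invalid APPKEY",
    110: "API response error",
}


class Pdf2FormatError(Exception):
    """Raised when the business status code is not 100."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def check_business_status(code):
    """Return None for code 100; raise Pdf2FormatError otherwise."""
    if code == 100:
        return
    raise Pdf2FormatError(code, BUSINESS_STATUS.get(code, "Unknown status code"))
```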

Key Features:

General-purpose recognition API that parses standard PDF files.

Multiple output formats: TEXT, HTML, XML, and TAG.

HTML output preserves the document's formatting.

Recognition accuracy improves continuously with machine learning.

Millisecond-level recognition for files of about 1 MB.

Fully supports HTTPS (TLS v1.0 / v1.1 / v1.2 / v1.3).

Fully compatible with Apple ATS.

Nationwide multi-node CDN deployment.

Fast API responses, with load balancing across multiple servers.

Details:
https://gugudata.io/details/pdf2format

Responses

🟢 200 Success

Body: application/json

Example

{
  "DataStatus": {
    "statusCode": 0,
    "statusDescription": "string",
    "responseDateTime": "string",
    "dataTotalCount": 0,
    "StatusCode": 100,
    "StatusDescription": "OK",
    "ResponseDateTime": "2026-01-01 00:00:00",
    "DataTotalCount": 1
  },
  "Data": {
    "result": "string"
  }
}
