API Documents - GuGuData | Production-Ready APIs Built for Developers

[DEMO] Article Extract

GET
https://api.gugudata.io/v1/article/extract/demo
Last modified: 2026-05-14 10:39:32

Article Extractor
Extract the primary article content, title, byline, publication date, and clean body text from a target webpage or raw HTML input.

Method: POST (the demo endpoint accepts GET)
Path: /v1/article/extract
Demo: https://api.gugudata.io/v1/article/extract/demo
OpenAPI: https://gugudata.io/assets/openapi/gugudata.openapi.3.1.json

Request Parameters:

  • appkey (string, required): Application key used for request authentication. Supply the value as a query parameter, form field, or multipart field according to the request content type.
  • url (string, required): Target webpage URL.
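
As a minimal sketch of how these two parameters might be sent, the Python snippet below builds (without sending) a POST request with appkey and url as form fields, which is one of the options the appkey description mentions. The helper name build_extract_request is hypothetical, and the host is assumed to be the same api.gugudata.io host used by the demo URL.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the production path lives on the same host as the demo URL.
EXTRACT_ENDPOINT = "https://api.gugudata.io/v1/article/extract"


def build_extract_request(appkey: str, page_url: str) -> Request:
    """Build a form-encoded POST request (hypothetical helper; nothing is sent)."""
    if not appkey or not page_url:
        raise ValueError("both appkey and url are required")
    # Both required parameters travel as form fields in the request body.
    body = urlencode({"appkey": appkey, "url": page_url}).encode("utf-8")
    return Request(
        EXTRACT_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen with a timeout once a real appkey is supplied.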

Response Fields:

  • DataStatus.StatusCode (integer, required): Application-level status code returned by the current v1 contract.
  • DataStatus.StatusDescription (string, required): Application-level status message returned by the current v1 contract.
  • DataStatus.ResponseDateTime (string, required): Response timestamp returned by the current service contract.
  • DataStatus.DataTotalCount (integer, required): Total number of records that match the request.
  • Data.url (string, required): Source URL of the article
  • Data.title (string, required): Extracted article title
  • Data.description (string, optional): Article description/summary
  • Data.links (array, optional): Array of links contained in the article
  • Data.image (string, optional): Main article image URL
  • Data.content (string, required): Extracted article content (HTML format, with ads and navigation removed)
  • Data.author (string, optional): Article author (if available, may be empty string)
  • Data.favicon (string, optional): Website favicon URL
  • Data.source (string, optional): Source website domain (e.g., sohu.com)
  • Data.published (string, optional): Article publication date/time (format: YYYY-MM-DD HH:MM)
  • Data.ttr (integer, optional): Estimated reading time (Time to Read, in minutes)
  • Data.type (string, optional): Article type (e.g., news, article, etc.)
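
The field list above can be mirrored in a small typed container. The following Python sketch is illustrative only: the Article class and parse_article function are hypothetical names, only a subset of fields is modeled, and it assumes Data.published follows the documented YYYY-MM-DD HH:MM format (falling back to None when it does not).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class Article:
    # Required fields per the response field list.
    url: str
    title: str
    content: str
    # Optional fields; absent or empty values become None.
    author: Optional[str] = None
    published: Optional[datetime] = None
    ttr: Optional[int] = None


def parse_article(payload: Dict[str, Any]) -> Article:
    """Convert a decoded JSON response into an Article (hypothetical helper)."""
    data = payload["Data"]
    published = None
    raw_published = data.get("published")
    if raw_published:
        try:
            # Documented format: YYYY-MM-DD HH:MM
            published = datetime.strptime(raw_published, "%Y-%m-%d %H:%M")
        except ValueError:
            published = None
    return Article(
        url=data["url"],
        title=data["title"],
        content=data["content"],
        author=data.get("author") or None,  # author may be an empty string
        published=published,
        ttr=data.get("ttr"),
    )
```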

HTTP Status Codes:

  • 200: Request processed successfully. Some endpoints expose a separate application-level status field in the response body, such as DataStatus.StatusCode.
  • 400: Invalid request parameters or request format. Check required fields, data types, and request body format.
  • 401: Missing or unknown application key. Provide a valid appkey with the request.
  • 403: The application key is recognized but access is not allowed. The key may be expired, inactive, or not permitted for the requested API.
  • 429: Request rate or trial usage limit exceeded. Reduce concurrency or retry after the limit window resets.
  • 500: Internal service error. Retry later or contact support if the error persists.
  • 503: Upstream service unavailable. Retry later; the requested upstream dependency is temporarily unavailable.
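
The guidance above suggests retrying on 429, 500, and 503 but not on 400, 401, or 403. A client might encode that as shown in the Python sketch below. The exponential backoff schedule, its base and cap values, and the function names are assumptions for illustration; the documentation only says to retry later.

```python
# HTTP statuses for which the descriptions above recommend retrying.
RETRYABLE_HTTP_STATUSES = frozenset({429, 500, 503})


def is_retryable(status: int) -> bool:
    """Return True when the status list above recommends a later retry."""
    return status in RETRYABLE_HTTP_STATUSES


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Assumed policy: exponential backoff, doubling from `base` up to `cap`.
    """
    return min(cap, base * (2 ** attempt))
```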

Business Status Codes:

  • 200 Normal return: Article successfully extracted
  • 400 Parameter error: Invalid or missing required parameters (url is required)
  • 402 APPKEY error: Please check whether the APPKEY passed is obtained from the developer center
  • 403 Account in arrears: Payment required to continue using the service
  • 429 Request frequency limited: Cannot exceed 100 requests per second
  • 500 API response error: Internal server error during article extraction. URL may be inaccessible or content format may be unsupported
  • 503 Service unavailable: External service temporarily unavailable

Key Features:

  • Extract clean article content from any webpage URL.
  • Automatic removal of ads, navigation, and non-content elements.
  • Extract article title, content, author, publication date, and metadata.
  • Separate endpoint available for HTML string extraction (/v1/article/extractFromHtml).
  • High-quality content extraction with intelligent parsing.
  • Full API support for HTTPS (TLS v1.0 / v1.1 / v1.2 / v1.3).
  • Fully compatible with Apple ATS.
  • Nationwide multi-node CDN deployment.
  • Fast responses, with API load balancing across multiple servers.

Details:
https://gugudata.io/details/article-extract

Request

None

Request Code Samples

Shell
curl --location 'https://api.gugudata.io/v1/article/extract/demo'
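
For readers who prefer Python to curl, a rough equivalent of the demo call might look like the sketch below. It uses only the standard library. The fetch_demo name, the timeout value, and the injectable open_url parameter (there so the function can be exercised without network access) are illustrative assumptions.

```python
import json
from typing import Any, Callable, Dict
from urllib.request import Request, urlopen

# Demo URL as given in this page; it takes no parameters.
DEMO_URL = "https://api.gugudata.io/v1/article/extract/demo"


def fetch_demo(open_url: Callable[..., Any] = urlopen,
               timeout: float = 10.0) -> Dict[str, Any]:
    """GET the demo endpoint and return the decoded JSON body."""
    request = Request(DEMO_URL, method="GET")
    with open_url(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_demo() with no arguments performs the real request; the result should have the DataStatus and Data keys described under Response Fields.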

Responses

200 Success
Body: application/json

Example
{
  "DataStatus": {
    "StatusCode": 0,
    "StatusDescription": "string",
    "ResponseDateTime": "string",
    "DataTotalCount": 0
  },
  "Data": {
    "url": "string",
    "title": "string",
    "description": "string",
    "links": [
      {}
    ],
    "image": "string",
    "content": "string",
    "author": "string",
    "favicon": "string",
    "source": "string",
    "published": "string",
    "ttr": 0,
    "type": "string"
  }
}