pdf_to_pptx / PPT_DATA_SCHEMA.md
HyunsangJoo

PPT Conversion Data Schema

This document outlines the data schema required by the build_pptx_from_results function in app.py to generate a PowerPoint presentation.

Overview

The input data (parse_results) is a list of dictionaries, where each dictionary represents a single page.

Data Structure

[
  {
    "layout_result": [
      {
        "bbox": [109, 235, 450, 280],
        "category": "Title",
        "text": "Document Title"
      },
      {
        "bbox": [109, 300, 800, 500],
        "category": "Text",
        "text": "Content goes here..."
      }
    ]
  },
  {
    "layout_result": [ ... next page data ... ]
  }
]
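
For callers building this input from Python, the structure above could be described with type hints roughly as follows. This is only a sketch: the names LayoutItem and PageResult are illustrative and are not taken from app.py.

```python
from typing import List, TypedDict


class LayoutItem(TypedDict, total=False):
    # Hypothetical name; mirrors one entry of "layout_result".
    bbox: List[int]   # [x1, y1, x2, y2] in pixels
    category: str     # e.g. "Title", "Text"
    text: str         # may be absent for non-text categories


class PageResult(TypedDict):
    # Hypothetical name; one dictionary per page.
    layout_result: List[LayoutItem]


# The first page of the example above, written as a Python literal.
parse_results: List[PageResult] = [
    {
        "layout_result": [
            {"bbox": [109, 235, 450, 280], "category": "Title", "text": "Document Title"},
            {"bbox": [109, 300, 800, 500], "category": "Text", "text": "Content goes here..."},
        ]
    },
]
```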

Field Definitions

Each item in the layout_result list must be a dictionary with the following fields:

| Field | Type | Required | Description |
|---|---|---|---|
| bbox | List[int] | Yes | [x1, y1, x2, y2]: the top-left and bottom-right corners of the element, in absolute pixel coordinates of the page image. |
| category | str | Yes | The layout element category (see Allowed Categories below). |
| text | str | Conditional | The text content to display. Required for text-based categories. |
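
A small check along these lines may help catch malformed items before they reach the PPTX builder. It is a sketch under assumptions, not part of app.py: validate_item is a hypothetical helper, the category sets are copied from the Allowed Categories section, and the x1 < x2, y1 < y2 ordering is inferred from the top-left/bottom-right convention rather than stated as a rule.

```python
from typing import Any, List

# Categories copied from the "Allowed Categories" section of this document.
TEXT_CATEGORIES = {
    "Title", "Section-header", "Caption", "Footnote", "Text",
    "List-item", "Page-header", "Page-footer", "Formula",
}
SKIPPED_CATEGORIES = {"Picture", "Table"}


def validate_item(item: Any) -> List[str]:
    """Return the problems found in one layout_result entry (empty if none)."""
    if not isinstance(item, dict):
        return ["item is not a dictionary"]
    problems = []

    bbox = item.get("bbox")
    is_int_list = (
        isinstance(bbox, list)
        and len(bbox) == 4
        and all(isinstance(v, int) and not isinstance(v, bool) for v in bbox)
    )
    if not is_int_list:
        problems.append("bbox must be a list of four integers")
    elif bbox[0] >= bbox[2] or bbox[1] >= bbox[3]:
        # Assumption: top-left must lie strictly above and left of bottom-right.
        problems.append("bbox must satisfy x1 < x2 and y1 < y2")

    category = item.get("category")
    if not isinstance(category, str):
        problems.append("category must be a string")
    elif category not in TEXT_CATEGORIES | SKIPPED_CATEGORIES:
        problems.append(f"unknown category: {category}")
    elif category in TEXT_CATEGORIES and not isinstance(item.get("text"), str):
        problems.append(f"text is required for category {category}")
    return problems
```

An empty list means the item matches the table above; a Picture or Table item passes without a text field.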

Allowed Categories

The category field determines styling (font size, bolding) and whether the element is included in the PPT.

Text Elements (Rendered)

These categories are rendered as text boxes in the PowerPoint slide.

  • Title: Rendered in Bold. Minimum font size 24pt.
  • Section-header: Rendered in Bold. Minimum font size 18pt.
  • Caption: Maximum font size 12pt.
  • Footnote: Maximum font size 12pt.
  • Text: Standard text body.
  • List-item: Standard text body.
  • Page-header
  • Page-footer
  • Formula
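
The size limits in this list amount to clamping a font size per category. The sketch below assumes an estimated size is computed elsewhere (for example from the bbox height); how app.py derives that estimate is not covered by this document, and STYLE_RULES and style_for are hypothetical names.

```python
from typing import Tuple

# category -> (bold, minimum pt, maximum pt); None means no bound is listed above.
STYLE_RULES = {
    "Title": (True, 24, None),
    "Section-header": (True, 18, None),
    "Caption": (False, None, 12),
    "Footnote": (False, None, 12),
}


def style_for(category: str, estimated_pt: float) -> Tuple[bool, float]:
    """Return (bold, font size in pt) after applying the category's bounds."""
    # Categories without a listed rule fall back to non-bold, unbounded size.
    bold, min_pt, max_pt = STYLE_RULES.get(category, (False, None, None))
    size = estimated_pt
    if min_pt is not None:
        size = max(size, min_pt)
    if max_pt is not None:
        size = min(size, max_pt)
    return bold, size
```

With these rules, a Title estimated at 14pt is raised to 24pt, while a Caption estimated at 14pt is reduced to 12pt.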

Non-Text Elements (Skipped)

These categories are excluded from text box generation in the current implementation (they are assumed to be part of the background image or handled separately).

  • Picture
  • Table
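
Putting the two groups together, a consumer might filter out the skipped categories and scale each remaining bbox from image pixels to slide units. This is a minimal sketch, not the logic in app.py: iter_text_boxes is a hypothetical helper, and the image and slide sizes are passed in explicitly because the schema itself does not carry them.

```python
from typing import Dict, Iterator, Tuple

SKIPPED_CATEGORIES = {"Picture", "Table"}

Box = Tuple[int, int, int, int]  # (left, top, width, height) in slide units


def iter_text_boxes(
    page: Dict,
    image_size: Tuple[int, int],
    slide_size: Tuple[int, int],
) -> Iterator[Tuple[str, str, Box]]:
    """Yield (category, text, box) for each rendered item on one page."""
    img_w, img_h = image_size
    slide_w, slide_h = slide_size
    for item in page["layout_result"]:
        if item["category"] in SKIPPED_CATEGORIES:
            continue  # assumed to be part of the background image
        x1, y1, x2, y2 = item["bbox"]
        # Scale pixel coordinates proportionally onto the slide.
        left = round(x1 * slide_w / img_w)
        top = round(y1 * slide_h / img_h)
        width = round((x2 - x1) * slide_w / img_w)
        height = round((y2 - y1) * slide_h / img_h)
        yield item["category"], item.get("text", ""), (left, top, width, height)
```

The slide units can be whatever the PPTX library expects (EMU, for instance), as long as slide_size is given in the same unit.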