---
title: HTML to PDF Converter
emoji: 📄
colorFrom: purple
colorTo: blue
sdk: docker
app_port: 7860
health_check:
  path: /health
---

# HTML to PDF Converter API

An API that converts HTML documents to PDF, with support for page breaks, image embedding, and multiple aspect ratios.

## Features

- Convert HTML files to PDF
- Support for multiple aspect ratios (16:9, 1:1, 9:16, auto)
- Automatic aspect ratio detection from content
- Image embedding support
- Page break support
- Base64 PDF output option

# HTML to PDF Conversion - Complete Implementation Guide

## 🎯 Overview

This guide covers the **optimized HTML to PDF conversion system** based on 2024-2025 Puppeteer best practices, implementing proper multi-page support, intelligent mode detection, and professional PDF generation.

---

## 🚀 Usage Examples

### Example 1: Auto-Detection (Recommended)

```bash
curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
  -F "html_file=@document.html" \
  -F "aspect_ratio=auto" \
  -F "mode=auto" \
  -o output.pdf
```

The system will:

- Analyze HTML structure
- Detect page breaks and content height
- Choose optimal aspect ratio
- Select best generation mode

### Example 2: Force Multi-Page Document

```bash
curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
  -F "html_file=@report.html" \
  -F "aspect_ratio=9:16" \
  -F "mode=multi" \
  -o report.pdf
```

### Example 3: Single-Page Infographic

```bash
curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
  -F "html_file=@infographic.html" \
  -F "aspect_ratio=1:1" \
  -F "mode=single" \
  -o infographic.pdf
```

### Example 4: Presentation Slides

```bash
curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
  -F "html_file=@presentation.html" \
  -F "aspect_ratio=16:9" \
  -F "mode=multi" \
  -o slides.pdf
```

### Example 5: With Images

```bash
curl -X POST https://abdallalswaiti-htmlpdfs.hf.space/convert \
  -F "html_file=@document.html" \
  -F "images=@logo.png" \
  -F "images=@chart.jpg" \
  -o document.pdf
```

---

## 🎨 CSS Best Practices

### Essential CSS for Multi-Page PDFs

```css
/* 1. Configure page size */
@page {
  size: A4 portrait;
  margin: 20mm;
}

/* 2. Prevent unwanted breaks */
.keep-together {
  page-break-inside: avoid;
  break-inside: avoid;
}

/* 3. Force page breaks */
.new-page {
  page-break-before: always;
  break-before: page;
}

.page,
.slide {
  page-break-after: always;
  break-after: page;
}

/* 4. Avoid breaking after headings */
h1, h2, h3, h4, h5, h6 {
  page-break-after: avoid;
  break-after: avoid;
}

/* 5. Table header repetition */
thead {
  display: table-header-group;
  break-inside: avoid;
}

/* 6. Prevent table row breaks */
tr {
  page-break-inside: avoid;
}

/* 7. Keep images intact */
img, figure {
  page-break-inside: avoid;
  break-inside: avoid;
  max-width: 100%;
  height: auto;
}

/* 8. Ensure backgrounds print */
* {
  -webkit-print-color-adjust: exact !important;
  print-color-adjust: exact !important;
}
```

### Print-Specific Styles

```css
@media print {
  /* Hide elements that shouldn't print */
  .no-print {
    display: none;
  }

  /* Optimize for printing */
  body {
    font-size: 12pt;
    line-height: 1.4;
  }

  /* Avoid color backgrounds for text readability */
  .text-content {
    background: white;
    color: black;
  }
}
```

---

## 📐 Aspect Ratios Guide

### **9:16 - Portrait (Default)**

- **Best for**: Reports, documents, invoices, contracts
- **Format**: A4 portrait (210mm × 297mm)
- **Use case**: Traditional business documents

```html
...
```

### **16:9 - Landscape**

- **Best for**: Presentations, slides, dashboards
- **Format**: A4 landscape (297mm × 210mm)
- **Use case**: Wide-format content

```html
...
```

### **1:1 - Square**

- **Best for**: Social media, Instagram posts, certificates
- **Format**: 210mm × 210mm
- **Use case**: Square format content

```html
...
```

### **Auto - Dynamic**

- **Best for**: Infographics, posters, or any content where the exact dimensions should be preserved.
- **Format**: The PDF will have the same dimensions as the HTML content.
- **Use case**: When you want the PDF to be a perfect snapshot of the HTML content.

---

## 🎯 Mode Selection Guide

### **Auto Mode (Recommended)**

```javascript
mode: "auto"
```

- System analyzes content
- Detects page breaks in CSS
- Measures content height
- Chooses optimal mode automatically

**Auto-detection logic:**

- Finds `.page` or `.slide` classes → Multi-page
- Content height > 2× viewport → Multi-page
- Keywords like "infographic", "poster" → Single-page
- Otherwise → Multi-page (safer default)

### **Multi-Page Mode**

```javascript
mode: "multi"
```

- Natural page breaks respected
- Table headers repeat
- Content flows across pages
- Best for documents, reports, articles

### **Single-Page Mode**

```javascript
mode: "single"
```

- Everything on one page
- Page sized to content
- Perfect for infographics
- No page breaks applied

---

## 🔧 Advanced Configuration

### Custom Page Sizes

```css
/* Custom dimensions */
@page {
  size: 8.5in 11in; /* Letter size */
}

@page {
  size: 420mm 594mm; /* A2 size */
}

@page {
  size: landscape; /* Generic landscape */
}
```

### Headers and Footers

```css
@page {
  @top-center {
    content: "Document Title";
  }

  @bottom-right {
    content: counter(page) " of " counter(pages);
  }
}
```

### Margins

```css
@page {
  margin: 25mm 20mm; /* top/bottom left/right */
}

@page {
  margin-top: 30mm;
  margin-bottom: 25mm;
  margin-left: 20mm;
  margin-right: 20mm;
}
```

---

## 🐛 Troubleshooting

### Issue: Page breaks in wrong places

**Solution**: Add `page-break-inside: avoid` to container elements

```css
.section {
  page-break-inside: avoid;
}
```

### Issue: Table headers not repeating

**Solution**: Ensure proper CSS on `<thead>`

```css
thead {
  display: table-header-group;
  break-inside: avoid;
}
```

### Issue: Backgrounds not printing

**Solution**: Add color-adjust property

```css
* {
  -webkit-print-color-adjust: exact;
  print-color-adjust: exact;
}
```

### Issue: Content cut off at page edges

**Solution**: Add appropriate margins

```css
@page {
  margin: 20mm;
}

body {
  padding: 0; /* Remove body padding when using @page margins */
}
```

### Issue: Fonts not loading

**Solution**: Ensure fonts are embedded or use web-safe fonts

```css
@font-face {
  font-family: 'CustomFont';
  src: url(data:font/woff2;base64,...);
}
```

---

## 📊 Performance Optimization

### 1. Reuse Browser Instance (for high-volume)

This is not implemented in the current single-request design, but for high-volume workloads:

```javascript
const puppeteer = require('puppeteer');

// convertAll is an illustrative wrapper name, not part of the service
async function convertAll(htmlList) {
  // Keep one browser instance
  const browser = await puppeteer.launch();
  const pdfs = [];
  // Reuse it for multiple PDFs
  for (const html of htmlList) {
    const page = await browser.newPage();
    await page.setContent(html);
    pdfs.push(await page.pdf({ /* ... */ }));
    await page.close();
  }
  await browser.close();
  return pdfs;
}
```

### 2. Optimize Images

- Use compressed formats (WebP, optimized JPEG)
- Resize to display dimensions
- Embed as data URLs for self-contained PDFs

### 3. Minimize CSS

- Remove unused styles
- Combine selectors
- Use shorthand properties

### 4. Reduce Wait Times

```javascript
// Only wait for what's needed
waitUntil: 'networkidle2' // Instead of networkidle0
```

---

## 🎓 Real-World Examples

### Invoice Template

```html
<p>Date: 2024-10-17</p>
<table>
  <thead>
    <tr><th>Item</th><th>Qty</th><th>Price</th></tr>
  </thead>
</table>
<p>Content...</p>
<p>Content...</p>
```
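
### Mode Auto-Detection Sketch

The auto-detection rules listed under the Mode Selection Guide can be sketched as a small pure function. This is a minimal illustration, not the service's actual implementation: the names `detectMode`, `hasPageOrSlideClass`, and `pageSizeFor`, the input fields `html`, `contentHeight`, and `viewportHeight`, and the choice to apply the rules in the order they are listed are all assumptions. The page-size table only restates the formats from the Aspect Ratios Guide.

```javascript
// Hypothetical helpers; names and input shape are assumptions for illustration.

// Keywords that the guide associates with single-page output.
const SINGLE_PAGE_KEYWORDS = ["infographic", "poster"];

// Page formats in millimetres, as listed in the Aspect Ratios Guide.
const PAGE_SIZES_MM = new Map([
  ["9:16", { width: 210, height: 297 }], // A4 portrait
  ["16:9", { width: 297, height: 210 }], // A4 landscape
  ["1:1", { width: 210, height: 210 }],  // square
]);

// True if any class attribute contains the exact token "page" or "slide".
function hasPageOrSlideClass(html) {
  const classAttr = /class\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  for (const match of html.matchAll(classAttr)) {
    const tokens = (match[1] ?? match[2]).split(/\s+/);
    if (tokens.includes("page") || tokens.includes("slide")) return true;
  }
  return false;
}

// Applies the four rules in listed order and returns "multi" or "single".
function detectMode({ html, contentHeight, viewportHeight }) {
  // Rule 1: explicit .page or .slide containers
  if (hasPageOrSlideClass(html)) return "multi";
  // Rule 2: content taller than twice the viewport
  if (contentHeight > 2 * viewportHeight) return "multi";
  // Rule 3: single-page keywords anywhere in the markup
  const lower = html.toLowerCase();
  if (SINGLE_PAGE_KEYWORDS.some((word) => lower.includes(word))) return "single";
  // Rule 4: safer default
  return "multi";
}

// Returns the fixed page size for a ratio, or null for "auto" or unknown
// values, meaning the page should be sized to the content instead.
function pageSizeFor(aspectRatio) {
  return PAGE_SIZES_MM.get(aspectRatio) ?? null;
}
```

For example, `detectMode({ html: '<div class="slide">Intro</div>', contentHeight: 600, viewportHeight: 1080 })` returns `"multi"` because of the `.slide` container, while `pageSizeFor("auto")` returns `null` to signal that the page dimensions should follow the content.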