
Using PDF Input in SAI Operations with SheetXAI

Overview

SheetXAI supports PDF input directly in SAI operations, so you can process and analyze PDF documents at scale. This is useful for large volumes of PDFs such as invoices, medical forms, legal documents, or other structured PDF content.

Key Information

Purpose

The !PDF prefix enables you to send PDF documents to AI models for analysis, data extraction, summarization, and more. Instead of processing PDFs one by one, you can handle hundreds or thousands of PDFs simultaneously using SAI formulas.

Provider Requirements

For LTD Users:

  • The !PDF prefix works with OpenRouter, OpenAI, and Straico when configured as your content model
  • You must have one of these providers set up in your Content tab settings

For Credit Users:

  • SheetXAI automatically selects a provider that supports PDF processing
  • No manual configuration is needed

Note: The #SAI format is automatically converted to formulas every 5 seconds while the chat panel is open. If you prefer, you can also enter formulas directly: =SAI("!PDF analyze this document", A2) in Google Sheets, or =SHEETXAI.SAI(...) in Excel. To configure your content models, see Content Tab Settings.

How to Use PDF Input in SAI Operations

Step 1: Prepare Your PDF URLs

You have two options for getting PDF URLs:

Option A: Use Existing Direct PDF Links

If you already have direct PDF links (URLs that point directly to the PDF file), you can use them immediately.

Examples of Direct PDF URLs (Will Work):

  • https://example.com/invoice.pdf
  • https://storage.googleapis.com/my-bucket/document.pdf
  • https://cdn.website.com/files/report.pdf

Examples of Non-Direct URLs (Won't Work):

  • https://drive.google.com/file/d/1234567890/view (sharing link, not direct file)
  • https://example.com/document (webpage containing PDF, not the PDF itself)
  • https://docs.google.com/document/d/1234567890/edit (Google Docs link, not PDF)

Tip: To get a direct PDF URL, right-click a PDF online and select "Copy link address" or "Copy link" (the exact wording depends on your browser). Make sure the URL ends with .pdf or points directly to the PDF file.
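
The distinction between direct and non-direct URLs can be pre-checked before pasting links into Column A. The following is a minimal Python sketch, not part of SheetXAI: the function name looks_like_direct_pdf is hypothetical, and the check is only a heuristic based on the ".pdf" ending described above. A URL that points directly to a PDF without ending in .pdf would be flagged here even though it may still work.

```python
from urllib.parse import urlparse


def looks_like_direct_pdf(url: str) -> bool:
    """Heuristic only: True when the URL is http(s) and its path ends in .pdf."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https"):
        return False
    # Query strings and fragments are ignored; only the path is inspected.
    return parsed.path.lower().endswith(".pdf")


# Sample URLs taken from the lists above.
candidates = [
    "https://example.com/invoice.pdf",
    "https://storage.googleapis.com/my-bucket/document.pdf",
    "https://drive.google.com/file/d/1234567890/view",
    "https://example.com/document",
]

for url in candidates:
    label = "direct" if looks_like_direct_pdf(url) else "check manually"
    print(f"{label}: {url}")
```

Links flagged as "check manually" are not necessarily broken; they are the ones worth opening in a browser or re-uploading through the Bulk Uploader.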

Option B: Upload PDFs Using Bulk Uploader

If you have PDF files on your computer, use SheetXAI's bulk upload feature:

  1. Go to Extensions → SheetXAI → Bulk Image and PDF Uploader
  2. Upload your PDF files (you can upload multiple at once)
  3. SheetXAI will return direct PDF links for each uploaded file
  4. Copy and paste these links into Column A of your spreadsheet

Learn more: See our Image & PDF Upload Feature guide for detailed instructions.

Step 2: Use the !PDF Prefix in SAI Commands

Once you have PDF URLs in your spreadsheet (e.g., Column A), use the !PDF prefix in your SAI commands:

Basic Format:

#sai !PDF [your instruction] #A

Examples:

Data Extraction:

#sai !PDF extract all invoice line items from this document #A

Summarization:

#sai !PDF summarize the key points from this medical form #A

Analysis:

#sai !PDF analyze this invoice and identify the total amount, due date, and vendor #A

Learn more: See our guide on using SAI formulas for more examples and best practices.

Multi-Column Workflow:

  • Column A: PDF URLs
  • Column B: #sai !PDF extract invoice number and total amount #A
  • Column C: #sai !PDF identify the vendor name and contact information #A
  • Column D: #sai !PDF summarize payment terms and due date #A

Tip: Learn about using column references and dragging formulas down for efficient bulk processing.
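
If you maintain many instructions, the command text for each column can be generated rather than typed by hand. This Python sketch only assembles strings in the Basic Format shown above (#sai !PDF [your instruction] #A); the helper name sai_pdf_command is hypothetical, and the output still has to be pasted into the sheet yourself.

```python
def sai_pdf_command(instruction: str, source_column: str = "A") -> str:
    """Build a command in the form: #sai !PDF [your instruction] #A"""
    # Collapse stray whitespace so the command stays on one clean line.
    cleaned = " ".join(instruction.split())
    if not cleaned:
        raise ValueError("instruction must not be empty")
    return f"#sai !PDF {cleaned} #{source_column}"


# Output column -> instruction, mirroring the multi-column workflow above.
workflow = {
    "B": "extract invoice number and total amount",
    "C": "identify the vendor name and contact information",
    "D": "summarize payment terms and due date",
}

for column, instruction in workflow.items():
    print(f"Column {column}: {sai_pdf_command(instruction)}")
```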

Step 3: Process Multiple PDFs

This feature is most useful when processing many PDFs:

  1. Place all PDF URLs in Column A (one per row)
  2. Drag your SAI formula down to process all rows
  3. Each PDF is processed independently
  4. Results appear in the corresponding cells
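
As an illustration of what dragging a formula down amounts to, the Python sketch below produces one formula per row in the Google Sheets form quoted in the Note earlier (=SAI("!PDF analyze this document", A2)). It is an assumption that each row's formula follows exactly this shape; the helper name sai_pdf_formula is hypothetical and the code does not call SheetXAI.

```python
def sai_pdf_formula(instruction: str, row: int, column: str = "A") -> str:
    """Return a Google Sheets style formula such as =SAI("!PDF ...", A2)."""
    if row < 1:
        raise ValueError("row numbers start at 1")
    # Inside a spreadsheet string literal, a double quote is written as "".
    escaped = instruction.replace('"', '""')
    return f'=SAI("!PDF {escaped}", {column}{row})'


# One formula per data row; each row references its own PDF URL in Column A.
for row in range(2, 5):
    print(sai_pdf_formula("extract invoice number", row))
```

Because each formula references only its own row, one failing PDF does not affect the results in the other rows.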

Real-World Use Cases

Invoice Processing

Scenario: You have 500 invoices in PDF format that need data extraction.

Workflow:

  1. Upload all invoices using the Bulk Image and PDF Uploader
  2. Paste the returned links in Column A
  3. Column B: #sai !PDF extract invoice number #A
  4. Column C: #sai !PDF extract total amount #A
  5. Column D: #sai !PDF extract vendor name #A
  6. Column E: #sai !PDF extract due date #A

Result: All 500 invoices processed automatically in minutes instead of hours. Learn more about generating personalized content at scale.

Medical Forms Processing

Scenario: Processing hundreds of patient intake forms or medical records.

Workflow:

  1. Upload medical forms using the Bulk Image and PDF Uploader
  2. Paste links in Column A
  3. Column B: #sai !PDF extract patient name and date of birth #A
  4. Column C: #sai !PDF extract primary diagnosis #A
  5. Column D: #sai !PDF extract prescribed medications #A
  6. Column E: #sai !PDF summarize treatment plan #A

Legal Document Review

Scenario: Reviewing contracts or legal documents for specific information.

Workflow:

  1. Place PDF URLs in Column A
  2. Column B: #sai !PDF identify all parties involved in this contract #A
  3. Column C: #sai !PDF extract key dates and deadlines #A
  4. Column D: #sai !PDF summarize payment terms and obligations #A
  5. Column E: #sai !PDF identify any termination clauses #A

Form Data Extraction

Scenario: Extracting data from hundreds of application forms, surveys, or questionnaires.

Workflow:

  1. Upload forms using the Bulk Image and PDF Uploader
  2. Paste links in Column A
  3. Column B: #sai !PDF extract applicant name and contact information #A
  4. Column C: #sai !PDF extract all form responses as a structured list #A
  5. Column D: #sai !PDF identify any missing required fields #A

Technical Notes

File Requirements

  • File Size: PDFs should be less than 20MB for optimal performance
  • File Format: Standard PDF files (not password-protected or encrypted)
  • Processing Time: PDF processing may take 5-10 seconds longer than text-only operations due to document parsing
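
Local files can be screened against these requirements before uploading. The following Python sketch is a rough pre-check, not SheetXAI's own validation: pdf_problems is a hypothetical helper, the 20MB limit is assumed to mean 20 × 1024 × 1024 bytes, and the "%PDF-" header and "/Encrypt" checks are simple heuristics rather than a full PDF parse.

```python
MAX_BYTES = 20 * 1024 * 1024  # assumption: "20MB" interpreted as 20 MiB


def pdf_problems(data: bytes) -> list[str]:
    """Return a list of likely problems; an empty list means none were found."""
    problems = []
    if len(data) >= MAX_BYTES:
        problems.append("file is 20MB or larger")
    # Standard PDF files begin with a "%PDF-" version header.
    if not data.startswith(b"%PDF-"):
        problems.append("missing %PDF- header (may not be a PDF)")
    # Encrypted PDFs carry an /Encrypt entry; a byte search is only a heuristic.
    if b"/Encrypt" in data:
        problems.append("appears to be encrypted or password-protected")
    return problems


# Usage with a real file (the path is an example):
#   from pathlib import Path
#   print(pdf_problems(Path("invoice.pdf").read_bytes()))
print(pdf_problems(b"%PDF-1.7\n1 0 obj\n<< >>\nendobj\n"))
print(pdf_problems(b"<html><body>Not a PDF</body></html>"))
```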

Best Practices

  • Be Specific: Clearly state what data you want extracted from each PDF
  • Use Column References: Use #A instead of #A1 when dragging formulas down (learn more about SAI formulas and cell references)
  • Batch Processing: Process similar PDFs together for consistent results
  • Quality Matters: Higher quality, well-structured PDFs yield better extraction results

Limitations

  • PDFs must be accessible via direct URL (not behind authentication)
  • Very large PDFs (over 20MB) may not process
  • Complex multi-page PDFs may require specific page references in your prompt
  • Handwritten or scanned PDFs with poor OCR quality may produce less accurate results

Troubleshooting

Problem: PDF Not Processing

Solutions:

  • Verify the PDF URL is direct (ends with .pdf or points directly to the file)
  • Check that the PDF is accessible (not behind a login or firewall)
  • Ensure the PDF is under 20MB
  • Try using the Bulk Image and PDF Uploader to get a proper direct link
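
The first two checks above can be approximated by looking at how the server answers a request for the URL. This Python sketch uses only the standard library; the function names are hypothetical, the status and Content-Type rules are assumptions about typical servers (some serve PDFs with a different content type), and it does not reproduce how SheetXAI itself fetches files.

```python
import urllib.error
import urllib.request


def diagnose_pdf_response(status: int, content_type: str) -> str:
    """Map an HTTP status and Content-Type header to a short diagnosis."""
    ctype = content_type.split(";")[0].strip().lower()
    if status in (401, 403):
        return "blocked: the URL appears to require a login or permission"
    if status == 404:
        return "not found: the URL does not point to a file"
    if status != 200:
        return f"unexpected status: {status}"
    if ctype != "application/pdf":
        return f"not a PDF: server returned '{ctype or 'no content type'}'"
    return "ok"


def check_url(url: str, timeout: float = 10.0) -> str:
    """Send a HEAD request and diagnose the answer (requires network access)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return diagnose_pdf_response(
                response.status, response.headers.get("Content-Type", "")
            )
    except urllib.error.HTTPError as err:
        return diagnose_pdf_response(err.code, "")
    except urllib.error.URLError as err:
        return f"unreachable: {err.reason}"


# Offline examples using typical header values.
print(diagnose_pdf_response(200, "application/pdf"))
print(diagnose_pdf_response(200, "text/html; charset=utf-8"))
print(diagnose_pdf_response(403, ""))
```

A "not a PDF" result for a sharing link is the usual sign that the URL opens a viewer page rather than the file itself.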

Problem: Incomplete Data Extraction

Solutions:

  • Be more specific in your instruction about what data to extract
  • For multi-page PDFs, specify which page contains the data
  • Break complex extractions into multiple columns (one data point per column)
  • Use clearer, more structured PDFs when possible

Problem: Provider Not Working

Solutions:

  • For LTD users: Ensure you have OpenRouter, OpenAI, or Straico configured in your Content tab settings
  • Check that your API keys are valid and have sufficient credits
  • For credit users: The system selects a provider automatically, so no action is needed

This feature transforms PDF processing from a manual, time-consuming task into an automated, scalable operation. Process hundreds of PDFs in the time it would take to manually handle just a few!

Last updated on 2025-12-21