Convert a PDF Invoice to UBL: Can You Really?

What works and what doesn’t when converting PDF invoices to XML

  • E-invoicing
  • 30 Apr, 2026
  • 7 min read

TL;DR: Converting a PDF invoice into a real UBL/Peppol XML is not a simple file conversion — it’s OCR plus data extraction, and the result is only as accurate as the OCR engine. For incoming invoices, you usually shouldn’t be converting at all: ask your supplier for a real UBL file. For your own outgoing invoices, send UBL straight from your accounting software instead of going via PDF.

You have a stack of PDF invoices you want to get into your system, or a supplier asks for UBL but you only have a PDF. The obvious question: can you just convert a PDF into UBL?

The short answer is yes, but not the way you’d expect. Here’s what actually happens — and when you should skip it.

Why PDF to UBL isn’t a real “conversion”

Converting Word to PDF keeps the content the same and only swaps the format. PDF to UBL is fundamentally different.

A PDF invoice is a visual document: pixels and text laid out for humans. A UBL invoice is a structured data file: every field (invoice number, supplier, VAT ID, line items, IBAN, totals) is labelled in XML so software can parse it automatically.

To go from PDF to UBL, software has to:

  1. Extract the text from the PDF (direct text extraction for digital PDFs, OCR for scanned ones)
  2. Figure out which word is the supplier, which number is the VAT rate, which is an IBAN, and so on
  3. Place those fragments into the correct UBL fields

Step 2 is the hard part. Every supplier designs their invoice differently. “Total” might be bottom-right, top, or middle. VAT can be per-line or aggregated. Field order varies. So this isn’t conversion — it’s data extraction with pattern recognition, i.e. OCR with invoice intelligence.
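To make the fragility of step 2 concrete, here is a minimal sketch in Python. It assumes step 1 has already produced plain text, and it uses one hand-written regular expression per field. The sample text, the field names and the patterns are all made up for illustration; real invoice-OCR products use far more than regular expressions.

```python
import re

# Hypothetical sample: text as it might come out of step 1 (text extraction or OCR).
SAMPLE_TEXT = """\
ACME Supplies BV
Invoice number: INV-2026-0042
VAT ID: NL123456789B01
IBAN: NL91ABNA0417164300
Subtotal 100.00
VAT 21% 21.00
Total EUR 121.00
"""

# Step 2, naive version: one regular expression per field.
# These patterns only fit the sample layout above; other layouts will not match.
PATTERNS = {
    "invoice_number": re.compile(r"Invoice number:\s*(\S+)"),
    "vat_id": re.compile(r"VAT ID:\s*([A-Z]{2}[A-Z0-9]+)"),
    "iban": re.compile(r"IBAN:\s*([A-Z]{2}\d{2}[A-Z0-9]+)"),
    "vat_rate": re.compile(r"VAT\s+(\d+(?:\.\d+)?)%"),
    "total": re.compile(r"^Total\s+[A-Z]{3}\s+(\d+\.\d{2})$", re.MULTILINE),
}


def extract_fields(text):
    """Return a dict of field name -> matched string, or None when not found."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields


fields = extract_fields(SAMPLE_TEXT)
print(fields)

# A supplier with a different layout breaks the same patterns:
print(extract_fields("Amount due: 121,00 EUR")["total"])  # None
```

The second call shows the problem in miniature: the same total, written with a different label and a decimal comma, is simply not found.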

What actually works: invoice OCR

Several categories of tools can read PDFs and return structured data.

Accounting software with scan-and-recognise

Packages like Xero, QuickBooks, Moneybird, Yuki, Exact Online, Sage and Twinfield offer scan-and-recognise for incoming invoices. You drop in a PDF, their OCR identifies the fields and proposes a booking. Most don’t explicitly export to UBL — but the data ends up structured inside your accounting system.

For: Businesses already using such a package.

Limitation: The structured data usually stays inside the accounting system; you don’t get a UBL/XML file out of it.

Dedicated invoice-OCR services

Services like Klippa, Rossum, Hypatos, Veryfi and similar APIs specialise in invoice OCR and return structured JSON or XML. With a thin layer of mapping their output can be turned into UBL.
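As a rough idea of what that mapping layer could look like, here is a Python sketch using only the standard library. The JSON keys (`invoice_number`, `issue_date`, `currency`, `supplier_name`, `total`) are hypothetical and do not reflect any particular vendor’s response format. The output is a deliberately incomplete UBL skeleton: it has no customer party, tax totals or invoice lines, so it would not pass Peppol validation as-is.

```python
import json
import xml.etree.ElementTree as ET

# Standard UBL 2.x namespaces.
NS_INVOICE = "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
NS_CAC = "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
NS_CBC = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"

# Use the conventional prefixes when serialising.
ET.register_namespace("", NS_INVOICE)
ET.register_namespace("cac", NS_CAC)
ET.register_namespace("cbc", NS_CBC)


def cbc(parent, tag, text, **attrs):
    """Append a basic (leaf) component with text content."""
    element = ET.SubElement(parent, f"{{{NS_CBC}}}{tag}", attrs)
    element.text = text
    return element


def cac(parent, tag):
    """Append an aggregate (container) component."""
    return ET.SubElement(parent, f"{{{NS_CAC}}}{tag}")


def ocr_json_to_ubl(payload):
    """Map a hypothetical OCR JSON payload onto a partial UBL invoice."""
    data = json.loads(payload)
    invoice = ET.Element(f"{{{NS_INVOICE}}}Invoice")
    cbc(invoice, "ID", data["invoice_number"])
    cbc(invoice, "IssueDate", data["issue_date"])
    cbc(invoice, "DocumentCurrencyCode", data["currency"])
    party = cac(cac(invoice, "AccountingSupplierParty"), "Party")
    cbc(cac(party, "PartyName"), "Name", data["supplier_name"])
    totals = cac(invoice, "LegalMonetaryTotal")
    cbc(totals, "PayableAmount", data["total"], currencyID=data["currency"])
    return ET.tostring(invoice, encoding="unicode")


example = json.dumps({
    "invoice_number": "INV-2026-0042",
    "issue_date": "2026-04-30",
    "currency": "EUR",
    "supplier_name": "ACME Supplies BV",
    "total": "121.00",
})
print(ocr_json_to_ubl(example))
```

The mapping itself is the easy part. Whether the values going in are correct still depends entirely on the OCR step.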

For: Businesses processing high volumes of PDFs that need automation.

Limitation: Costs per document (typically €0.05–€0.30), and accuracy depends on the OCR engine.

E-invoicing platforms

Some Peppol access points and e-invoicing platforms include built-in OCR for suppliers who can’t yet produce UBL. PDF goes in, UBL comes out, and it’s sent over Peppol.

For: Businesses going fully electronic that haven’t got their suppliers there yet.

Limitation: Requires a platform subscription and setup.

When you shouldn’t bother

Two scenarios where PDF→UBL conversion is the wrong tool.

Scenario 1: You received a PDF and want to book it

Your supplier should be sending you a real UBL/Peppol invoice, not a PDF you have to convert. In countries with a B2B e-invoicing mandate (see our country-by-country overview), this is a legal requirement once the mandate takes effect.

Better: Ask your supplier to send a UBL file or to deliver the invoice over Peppol. Most accounting tools can do either with very little setup.

Or: Import the PDF directly into your accounting software using scan-and-recognise. You don’t need to convert it to UBL first — the data lands in your purchase ledger straight away.

Scenario 2: You want to invoice your own customers in UBL

You’re trying to deliver your own outgoing invoices in UBL by first making a PDF in Word/Pages and then converting it. That’s backwards.

Better: Use accounting software that produces UBL/Peppol natively. Most modern accounting tools do — fill in the invoice once and the system generates both a PDF (for human reading) and a UBL (for automated processing).

When conversion is genuinely useful

A few legitimate use cases:

  • Historical archives — thousands of old PDFs you want indexed and searchable. OCR + extraction to structured data is useful, even if the output isn’t strictly UBL.
  • Suppliers that truly can’t do UBL — if you need Peppol compliance for B2G and a supplier won’t cooperate, an access point with OCR can bridge the gap.
  • Migrating off a legacy system — your old system only exported PDFs and you’re now switching to e-invoicing.

In all three: OCR output is statistical, not exact. Expect to manually correct 5–15% of fields, especially on complex or multilingual invoices.
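One cheap way to decide which invoices need a human look is to check the extracted amounts against each other. The Python sketch below assumes the extraction step returns `net`, `vat` and `total` as strings (hypothetical field names) and flags anything missing, unparseable or arithmetically inconsistent. It catches only one class of OCR error; a wrong invoice number or IBAN would pass unnoticed.

```python
from decimal import Decimal, InvalidOperation


def review_flags(fields, tolerance=Decimal("0.01")):
    """Return a list of reasons why an extracted invoice needs manual review."""
    flags = []
    amounts = {}
    for key in ("net", "vat", "total"):
        try:
            amounts[key] = Decimal(fields[key])
        except (KeyError, TypeError, InvalidOperation):
            # Missing key, None, or text such as "12l.00" (letter l instead of 1).
            flags.append(f"{key}: missing or not a number")
    if len(amounts) == 3:
        difference = amounts["net"] + amounts["vat"] - amounts["total"]
        if abs(difference) > tolerance:
            flags.append("net + vat does not equal total")
    return flags


print(review_flags({"net": "100.00", "vat": "21.00", "total": "121.00"}))
print(review_flags({"net": "100.00", "vat": "21.00", "total": "127.00"}))
print(review_flags({"net": "100.00", "vat": "2l.00", "total": "121.00"}))
```

An empty list means the amounts are at least consistent with each other, not that every field is right.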

The other direction: UBL/XML to PDF

Sometimes you want the reverse: turning a UBL invoice you received into a PDF for printing, archiving or sending to someone without a viewer.

That direction is a real conversion — all the data is there, you just need a template laid over the XML. Tools like UBL Buddy display a UBL invoice with proper formatting on Mac, iPhone or iPad. From there you can print to PDF using the standard macOS print dialog (File → Print → Save as PDF).
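To illustrate why this direction is deterministic, here is a small Python sketch that reads a handful of fields from a UBL invoice and lays them out as plain text. It is not how UBL Buddy or any other viewer works internally; the sample invoice is a made-up fragment, and a real tool would feed the same values into a proper page template instead of printing text.

```python
import xml.etree.ElementTree as ET

# Standard UBL 2.x namespaces, mapped to the conventional prefixes.
NS = {
    "cac": "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2",
    "cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2",
}

# Hypothetical, heavily shortened invoice for illustration only.
SAMPLE_UBL = """\
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
         xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
         xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2">
  <cbc:ID>INV-2026-0042</cbc:ID>
  <cbc:IssueDate>2026-04-30</cbc:IssueDate>
  <cac:AccountingSupplierParty>
    <cac:Party>
      <cac:PartyName><cbc:Name>ACME Supplies BV</cbc:Name></cac:PartyName>
    </cac:Party>
  </cac:AccountingSupplierParty>
  <cac:LegalMonetaryTotal>
    <cbc:PayableAmount currencyID="EUR">121.00</cbc:PayableAmount>
  </cac:LegalMonetaryTotal>
</Invoice>
"""


def render_summary(ubl_xml):
    """Render a few labelled UBL fields as a plain-text summary."""
    root = ET.fromstring(ubl_xml)

    def text(path):
        return root.findtext(path, default="(missing)", namespaces=NS)

    lines = [
        f"Invoice:  {text('cbc:ID')}",
        f"Date:     {text('cbc:IssueDate')}",
        f"Supplier: {text('cac:AccountingSupplierParty/cac:Party/cac:PartyName/cbc:Name')}",
    ]
    amount = root.find("cac:LegalMonetaryTotal/cbc:PayableAmount", NS)
    if amount is not None:
        lines.append(f"Payable:  {amount.text} {amount.get('currencyID', '')}".rstrip())
    return "\n".join(lines)


print(render_summary(SAMPLE_UBL))
```

No guessing is involved: every value is looked up by its label, which is exactly what the PDF-to-UBL direction lacks.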

Frequently asked questions

Is there a free PDF to UBL converter?

Not a reliable one. Free online converters exist but OCR quality is usually poor, and you’re uploading sensitive invoice data to unknown servers. Fine for one-off curiosity; not for production use.

Can ChatGPT or another AI tool convert a PDF to UBL?

An AI can read text out of a PDF and place it into a UBL template. In practice this works reasonably for simple invoices, but:

  • Error rates are unpredictable — you only know if fields are correct after manual review
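  • VAT category codes and currency codes must follow UBL exactly (e.g. S for standard rate, ISO 4217 codes such as EUR for currencies)
  • Compliance checks are strict: Peppol validation rejects XML that breaks the schema or the business rules, and VAT numbers should pass a VIES check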
  • Sending sensitive data to an external AI is a GDPR concern

For one-off experimentation: maybe. For production: no.
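If you do experiment with AI-generated UBL, a basic sanity check on the coded values catches the most obvious mistakes. The Python sketch below is an assumption-laden example: the set of VAT category codes is the subset commonly seen in EN 16931 / Peppol invoices and should be checked against the code list of the specification version you validate against, and the currency check only tests the shape of the code, not whether it exists in ISO 4217.

```python
import re

# Assumption: VAT category codes commonly allowed in EN 16931 / Peppol invoices.
# Verify against the official code list before relying on this set.
VAT_CATEGORY_CODES = {"S", "Z", "E", "AE", "K", "G", "O", "L", "M"}

# Shape check only: three uppercase letters. This does not prove the code
# is a real ISO 4217 currency.
CURRENCY_SHAPE = re.compile(r"[A-Z]{3}")


def code_problems(vat_category, currency):
    """Return a list of problems found in the two coded values."""
    problems = []
    if vat_category not in VAT_CATEGORY_CODES:
        problems.append(f"unknown VAT category code: {vat_category!r}")
    if not CURRENCY_SHAPE.fullmatch(currency):
        problems.append(f"currency is not a three-letter code: {currency!r}")
    return problems


print(code_problems("S", "EUR"))      # no problems
print(code_problems("21%", "euro"))   # both values rejected
```

Passing this check says nothing about whether the amounts or party details are right; it only rules out free-text values where a code is required.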

What’s the difference between UBL and XML?

XML is the generic format (a way to store structured data). UBL (Universal Business Language) is a specific standard within XML for business documents like invoices. So a UBL file is always XML, but not all XML is UBL. Peppol’s invoice format (Peppol BIS Billing 3.0) is based on UBL 2.1.
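The distinction shows up in the root element. A quick Python sketch, using only the standard library, that tells a UBL invoice apart from arbitrary XML by checking the root element’s name and namespace. This is a first-pass check under the assumption that you only care about invoices: credit notes use a different root element, and a matching root says nothing about whether the rest of the file is valid.

```python
import xml.etree.ElementTree as ET

# Root element of a UBL 2.x invoice: local name "Invoice" in the UBL Invoice namespace.
UBL_INVOICE_ROOT = "{urn:oasis:names:specification:ubl:schema:xsd:Invoice-2}Invoice"


def is_ubl_invoice(xml_text):
    """Return True if the text parses as XML and its root is a UBL Invoice element."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return root.tag == UBL_INVOICE_ROOT


ubl = '<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"/>'
print(is_ubl_invoice(ubl))                    # True
print(is_ubl_invoice("<Invoice/>"))           # False: right name, no UBL namespace
print(is_ubl_invoice("<note>hello</note>"))   # False: XML, but not UBL
```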

Can UBL Buddy convert my PDFs to UBL?

No. UBL Buddy is a viewer for UBL/Peppol invoices you’ve received — not an OCR tool. For PDF extraction, use your accounting software or a dedicated invoice-OCR service.

What if my supplier insists on PDF?

From 2026 onwards, B2B e-invoicing becomes mandatory in a growing list of European countries, including Belgium (1 January 2026), Germany (sending from 2027, phased in by company size) and France (from September 2026, phased in by company size). Once a mandate applies, a plain PDF no longer counts as a valid B2B invoice in that market. You can already point suppliers at the deadline and ask them to register on Peppol; most accounting tools support this with very little setup.
