Some UTF-8 subjects will break PDF attachments

I really thought I was going crazy with this one, but from what I can tell, with some UTF-8 subjects, attachments (I’ve tested with PDF) are not sent correctly: the attachment on the email is a blank PDF (with the same number of pages as the original PDF, but all pages are blank). I’ve tried this with multiple PDF files (one page, multiple pages, short text, long text, tables, etc.) and got the same results.

For example, if the subject is test the PDF is sent correctly.
If the subject is تست the PDF is again sent correctly.
But if the subject is بامداد سیر هرمزان R3TYRO the PDF attachment will be blank. (on Gmail at least)

Tested on kumod 2025.01.29-833f82a8

sample payloads are here: KumoMTA issues with UTF8 and attachments · GitHub

Not sure if GitHub handles the base64 lines correctly. Please let me know if it doesn’t and I can send you the JSON files directly (I couldn’t attach them here; it seems attachments are disabled).

Sample of the attachment working with a UTF-8 subject

Working with non-UTF-8 subject as well

Then with some other UTF-8 subject, not showing any content in the PDF this time.

All three emails were injected using the Kumo HTTP API, everything other than the subject is exactly the same in the payloads.

Have you tried this with SMTP as well? Just curious if it is isolated to HTTP injection.

I haven’t but I will do that today and get back to you

With SMTP the issue does not happen, it’s only with the HTTP API.

Great - thank you.

The JSON doesn’t appear to be valid:

$ jq . < ~/Downloads/payload-bad.json
jq: parse error: Invalid string: control characters from U+0000 through U+001F must be escaped at line 20, column 9188
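A similar check can be run without jq. The following is a minimal sketch using only Python's standard `json` module; the `check_json` helper and the tiny stand-in payload are hypothetical and are not taken from the real payload files. It assumes, as one possible cause only, that a base64 string was pasted into the JSON with raw line breaks, which is the kind of unescaped control character the jq error describes.

```python
import json


def check_json(text):
    """Return None if text parses as JSON, else a short description of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


# Hypothetical stand-in for a base64 attachment that was wrapped across lines.
wrapped_base64 = "QUJD\nREVG"

# Building the JSON by hand leaves a raw newline inside the string literal,
# which strict JSON parsers reject.
hand_built = '{"data": "' + wrapped_base64 + '"}'
print("hand-built:", check_json(hand_built))

# Letting a JSON serializer produce the document escapes the newline as \n,
# so the result parses cleanly.
serialized = json.dumps({"data": wrapped_base64})
print("serialized:", check_json(serialized))
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the text to `check_json`. The line and column reported by Python may not match the ones jq prints.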

This one works (tested with the jq command you sent)
payload-bad.json (19.4 KB)

payload-ok1.json (19.3 KB)

payload-ok2.json (19.3 KB)

I don’t see what could possibly cause this; if you diff the generated payloads, there are no changes near the attachments.
Feels like it has to be a Gmail bug somehow.

--- /tmp/bad.txt        2025-03-12 13:59:29.193945555 -0700
+++ /tmp/ok1.txt        2025-03-12 13:59:19.758936112 -0700
@@ -1,28 +1,27 @@
 Content-Type: multipart/mixed;
-       boundary="lbw7hu8JSoK/jgGGPT8qRA"
+       boundary="UfriakAJQeyg9OY5oRS2Vg"
 To: "hf.farhad@gmail.com" <hf.farhad@gmail.com>
 From: Ahasend <no-reply@email.ahasend.com>
-Subject: =?UTF-8?q?=D8=A8=D8=A7=D9=85=D8=AF=D8=A7=D8=AF_=D8=B3=DB=8C=D8=B1_=D9=87?=
-       =?UTF-8?q?=D8=B1=D9=85=D8=B2=D8=A7=D9=86_R3TYRO?=
+Subject: sample email with attachment
 Mime-Version: 1.0
-Date: Wed, 12 Mar 2025 20:59:29 +0000
+Date: Wed, 12 Mar 2025 20:59:19 +0000

---lbw7hu8JSoK/jgGGPT8qRA
+--UfriakAJQeyg9OY5oRS2Vg
 Content-Type: multipart/alternative;
-       boundary="nAZn1tVBSgmnKmJf5j6R0g"
+       boundary="JakDcFqeSRm3i/4KZTKucA"

---nAZn1tVBSgmnKmJf5j6R0g
+--JakDcFqeSRm3i/4KZTKucA
 Content-Type: text/plain;
        charset="us-ascii"

 test
---nAZn1tVBSgmnKmJf5j6R0g
+--JakDcFqeSRm3i/4KZTKucA
 Content-Type: text/html;
        charset="us-ascii"

 test
---nAZn1tVBSgmnKmJf5j6R0g--
---lbw7hu8JSoK/jgGGPT8qRA
+--JakDcFqeSRm3i/4KZTKucA--
+--UfriakAJQeyg9OY5oRS2Vg
 Content-Type: application/pdf
 Content-Transfer-Encoding: base64
 Content-Disposition: attachment;
@@ -277,5 +276,5 @@
 MDAwMDEyNjM2IDAwMDAwIG4gCjAwMDAwMTI4NzIgMDAwMDAgbiAKMDAwMDAxMzI0MyAwMDAwMCBu
 IAp0cmFpbGVyCjw8L1NpemUgMTcKL1Jvb3QgMTIgMCBSCi9JbmZvIDEgMCBSPj4Kc3RhcnR4cmVm
 CjEzNzMxCiUlRU9GCg==
---lbw7hu8JSoK/jgGGPT8qRA--
+--UfriakAJQeyg9OY5oRS2Vg--
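Beyond eyeballing the diff, the decoded attachment bytes of the two generated messages can be compared directly. This is a rough sketch using Python's standard `email` and `hashlib` modules. The `attachment_digests` and `sample_message` helpers are hypothetical, and the sample messages are small stand-ins shaped like the diff above (same boundaries and subject headers, but a made-up attachment body and file name), not the real payloads.

```python
import base64
import hashlib
from email import message_from_string


def attachment_digests(raw_message):
    """Return (content type, SHA-256 hex digest) for each attachment part."""
    msg = message_from_string(raw_message)
    digests = []
    for part in msg.walk():
        if part.get_content_disposition() == "attachment":
            # decode=True undoes the Content-Transfer-Encoding (base64 here).
            data = part.get_payload(decode=True)
            digests.append((part.get_content_type(), hashlib.sha256(data).hexdigest()))
    return digests


def sample_message(subject_header, boundary):
    """Build a tiny stand-in message; the attachment bytes are invented."""
    body = base64.b64encode(b"%PDF-1.4 stand-in bytes\n").decode("ascii")
    return (
        f'Content-Type: multipart/mixed;\n\tboundary="{boundary}"\n'
        f"Subject: {subject_header}\n"
        "Mime-Version: 1.0\n"
        "\n"
        f"--{boundary}\n"
        'Content-Type: text/plain;\n\tcharset="us-ascii"\n'
        "\n"
        "test\n"
        f"--{boundary}\n"
        "Content-Type: application/pdf\n"
        "Content-Transfer-Encoding: base64\n"
        'Content-Disposition: attachment;\n\tfilename="sample.pdf"\n'
        "\n"
        f"{body}\n"
        f"--{boundary}--\n"
    )


encoded_subject = (
    "=?UTF-8?q?=D8=A8=D8=A7=D9=85=D8=AF=D8=A7=D8=AF_=D8=B3=DB=8C=D8=B1_=D9=87?=\n"
    "\t=?UTF-8?q?=D8=B1=D9=85=D8=B2=D8=A7=D9=86_R3TYRO?="
)
bad = attachment_digests(sample_message(encoded_subject, "lbw7hu8JSoK/jgGGPT8qRA"))
ok = attachment_digests(sample_message("sample email with attachment", "UfriakAJQeyg9OY5oRS2Vg"))
print("attachments identical:", bad == ok)
```

With the real output, the two files from the diff header (`/tmp/bad.txt` and `/tmp/ok1.txt`) would be read and passed to `attachment_digests` instead of the stand-ins. Matching digests would indicate that the attachment part itself is the same in both messages.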

FWIW, I added a little Lua function to help test this sort of thing: inject api: add a lua `build_v1` function · KumoCorp/kumomta@1083539 · GitHub

Put this in test.lua:

local kumo = require 'kumo'

-- Put your payload in a file here:
local request = kumo.serde.json_load '/home/wez/Downloads/payload-bad.json'

for _, msg in ipairs(kumo.api.inject.build_v1(request)) do
  print(msg:get_data())
  -- and optionally print the From header, and so on.
  -- print(msg:get_first_named_header_value('from'))
end

And then run it with: kumod --script --policy test.lua

This will generate the message(s) and print them to stdout, so that you don’t have to actually generate and send the message to see what is going on.

That build_v1 Lua function is not finalized or officially supported/stable at this time, so there are no official docs for it yet.

This issue is resolved by the fixes in Discord