HTTP injection strips whitespace from the subject when it contains UTF-8

It seems that the HTTP injection endpoint removes whitespace from the subject when the subject contains non-ASCII (Unicode) characters. For example:

curl http://localhost:8000/api/inject/v1 \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{
        "envelope_sender": "info@example.com",
        "recipients": [
            {
                "email": "me@gmail.com"
            }
        ],
        "content": {
            "from": {
                "name": "Sender",
                "email": "info@example.com"
            },
            "subject": "تست یک دو سه",
            "html_body": "  تست یک دو سه"
        }
    }'

I added logging to the http_message_generated event handler:

kumo.on('http_message_generated', function(msg)
  local original_sender_domain = msg:sender().domain
  local tenant = aha.cached_tenant_id(original_sender_domain)
  if tenant == '4cdd7bdd-294e-4762-892f-83d40abf5a87' then
    print("subject:", msg:get_first_named_header_value('subject'))
  end

  -- the rest of the handler...
end)

The output in the journal is:

Oct 14 04:53:22 send kumod[1239348]: subject:        تستیکدوسه

while I expect it to be the same string as provided in the JSON payload, i.e.:

تست یک دو سه

This seems to happen only for the subject. The same string in content.html_body and content.from.name works without any issues (the spaces there are not removed).

The same thing happens if I inject the content as a raw message string:

curl http://localhost:8000/api/inject/v1     -X POST     -H 'Content-Type: application/json'     -d '{"envelope_sender":"no-reply@example.com","content":"Mime-Version: 1.0\r\nDate: Mon, 14 Oct 2024 10:33:24 +0200\r\nSubject: =?UTF-8?q?=D8=AA=D8=B3=D8=AA_=DB=8C=DA=A9_=D8=AF=D9=88_=D8=B3=D9=87?=\r\nFrom: \"Example Sender\" \u003cno-reply@example.com\u003e\r\nTo: me@gmail.com\r\nContent-Type: text/plain; charset=UTF-8\r\nContent-Transfer-Encoding: quoted-printable\r\n\r\n=D8=AA=D8=B3=D8=AA =DB=8C=DA=A9 =D8=AF=D9=88 =D8=B3=D9=87","recipients":[{"email":"me@gmail.com","name":""}]}'
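As a sanity check on the payload above, the encoded-word in the Subject header can be decoded locally to confirm that it does carry the spaces (in RFC 2047 Q-encoding, `_` stands for a space). This is a minimal sketch using Python's standard `email.header` module. The helper name `decode_subject` is made up for illustration and is not part of KumoMTA:

```python
from email.header import decode_header

# The Subject value exactly as it appears in the injected message.
ENCODED_SUBJECT = (
    "=?UTF-8?q?=D8=AA=D8=B3=D8=AA_=DB=8C=DA=A9_=D8=AF=D9=88_=D8=B3=D9=87?="
)


def decode_subject(raw: str) -> str:
    """Decode an RFC 2047 header value into a plain Unicode string."""
    parts = []
    for value, charset in decode_header(raw):
        if isinstance(value, bytes):
            # Encoded-words come back as bytes plus their declared charset.
            parts.append(value.decode(charset or "ascii"))
        else:
            parts.append(value)
    return "".join(parts)


decoded = decode_subject(ENCODED_SUBJECT)
print(decoded)
print("spaces in decoded subject:", decoded.count(" "))
```

If the decoded string contains the three spaces, the payload itself is fine and the spaces are being lost somewhere after injection.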

The same message sent over SMTP has no issues:

$ echo "Mime-Version: 1.0\r\nDate: Mon, 14 Oct 2024 10:33:24 +0200\r\nSubject: =?UTF-8?q?=D8=AA=D8=B3=D8=AA_=DB=8C=DA=A9_=D8=AF=D9=88_=D8=B3=D9=87?=\r\nFrom: \"Example Sender\" \u003cno-reply@example.com\u003e\r\nTo: me@gmail.com\r\nContent-Type: text/plain; charset=UTF-8\r\nContent-Transfer-Encoding: quoted-printable\r\n\r\n=D8=AA=D8=B3=D8=AA =DB=8C=DA=A9 =D8=AF=D9=88 =D8=B3=D9=87" > msg.eml
$ python send-eml.py no-reply@example.com me@gmail.com ./msg.eml

Interesting, will investigate.

Confirmed, and should be resolved by commit KumoCorp/kumomta@9cae055 ("fix a mime rfc2047 qp encoding issue for unstructured fields").
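For readers wondering how spaces can vanish from an RFC 2047 header at all: the RFC says that whitespace between two adjacent encoded-words is ignored when the header is decoded, so a space only survives if it is carried inside an encoded-word (as `_` or `=20`). Whether this is the exact mechanism behind the bug is an assumption; the thread does not say. The sketch below only illustrates the decoding rule, using Python's standard library and a hypothetical helper named `decode_words`:

```python
from email.header import decode_header


def decode_words(raw: str) -> str:
    """Decode an RFC 2047 header value into a plain Unicode string."""
    out = []
    for value, charset in decode_header(raw):
        if isinstance(value, bytes):
            out.append(value.decode(charset or "ascii"))
        else:
            out.append(value)
    return "".join(out)


# Space carried inside a single encoded-word: it is preserved.
inside = "=?UTF-8?q?one_two?="

# Space placed between two encoded-words: it is dropped on decode.
between = "=?UTF-8?q?one?= =?UTF-8?q?two?="

print(repr(decode_words(inside)))
print(repr(decode_words(between)))
```

An encoder that splits a non-ASCII subject into one encoded-word per token, without putting the separating spaces inside the encoded-words, would produce the kind of output seen in the journal.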