fhf
March 8, 2025, 11:23am
1
It seems like some UTF-8 strings are not handled correctly in from.name when injected using the HTTP API. For example with this payload:
{
  "envelope_sender": "no-reply@email.ahasend.com",
  "content": {
    "text_body": "test",
    "html_body": "test",
    "subject": "sample subject",
    "from": {
      "email": "noreply@email.ahasend.com",
      "name": "وجیهه تست برای نام"
    }
  },
  "recipients": [
    {
      "email": "hf.farhad@gmail.com",
      "name": "hf.farhad@gmail.com"
    }
  ]
}
The subject from name is shown as
وجیهه تست بر =?UTF-8?q?=D8=A7=DB=8C =D9=86=D8=A7=D9=85?=
in Gmail.
fhf
March 8, 2025, 11:23am
3
It seems like attachments are turned off here? I wanted to attach a screenshot.
fhf
March 8, 2025, 11:42am
4
Sorry, by subject I meant from.name
fhf
March 8, 2025, 11:43am
5
Issue with UTF-8 in from.name in HTTP API injection
Mike
(Mike Hillyer)
March 8, 2025, 3:51pm
6
I fixed permissions, try screenshot again?
tom
(Tom Mairs)
March 11, 2025, 7:33pm
8
I guess “don’t do that” is not an option?
What version are you running? Some UTF handling changes were made recently.
fhf
March 12, 2025, 9:43am
9
That’s what I’ve asked them to do for now, just wanted to let you guys know about the issue. It’s on kumod 2025.01.29-833f82a8
tom
(Tom Mairs)
March 12, 2025, 4:56pm
11
We are looking into it, but do not have any kind of ETA. In the meantime, please use SMTP injection.
wez
(Wez Furlong)
March 12, 2025, 8:11pm
12
We encode the From header like this for that payload:
From: =?UTF-8?q?=D9=88=D8=AC=DB=8C=D9=87=D9=87_=D8=AA=D8=B3=D8=AA_=D8=A8=D8=B1?=
=?UTF-8?q?=D8=A7=DB=8C_=D9=86=D8=A7=D9=85?= <noreply@email.ahasend.com>
and that is conforming to the specs.
In addition, I tried composing in gmail to a crazy version of my own address, and the To header that it produces has the same kind of structure wrt. word wrapping:
To: =?UTF-8?Q?a_very_long_long_long_long_long_long_long_long_long_long_?=
=?UTF-8?Q?a_very_long_long_long_long_long_long_long_long_long_long_a_v?=
=?UTF-8?Q?ery_long_long_long_long_long_long_long_long_long_long_a_very?=
=?UTF-8?Q?_long_long_long_long_long_long_long_long_long_long_a_very_lo?=
=?UTF-8?Q?ng_long_long_long_long_long_long_long_long_long_a_very_long_?=
=?UTF-8?Q?ng_=DB=8C=D9=87=D9=87?= <wez@wezfurlong.org>
I don’t want to say that gmail has a bug here, just that, if we have a bug in our encoding of that field, it’s not clear what it is.
I will note that when I composed a mail from gmail using exactly your input, gmail chose to base64 it:
To: =?UTF-8?B?2YjYrNuM2YfZhyDYqtiz2Kog2KjYsdin24wg2YbYp9mF?= <wez@wezfurlong.org>
You can pre-rfc2047-header-encode the from.name if you want, and we’ll pass that through.
I’m hesitant to want to try to change anything here right now because it’s really not clear what the bug is.
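For the pre-encoding option mentioned above, here is a minimal Python sketch of building a base64 encoded-word for from.name on the client side. The helper name encode_display_name is hypothetical and not part of any KumoMTA API. The sketch assumes the name fits into a single encoded-word (at most 75 characters, per RFC 2047).

```python
import base64

MAX_ENCODED_WORD = 75  # RFC 2047 limit for a single encoded-word


def encode_display_name(name: str) -> str:
    """Return `name` as one RFC 2047 base64 ("B") encoded-word."""
    payload = base64.b64encode(name.encode("utf-8")).decode("ascii")
    word = f"=?UTF-8?B?{payload}?="
    if len(word) > MAX_ENCODED_WORD:
        # Longer names need several encoded-words, split on character
        # boundaries; that is out of scope for this sketch.
        raise ValueError("name too long for a single encoded-word")
    return word


# The from.name from the first payload in this thread, spelled as UTF-8
# bytes so the example does not depend on how the text was copied.
name = bytes.fromhex(
    "d988d8acdb8cd987d98720d8aad8b3d8aa20d8a8d8b1d8a7db8c20d986d8a7d985"
).decode("utf-8")
encoded = encode_display_name(name)
print(encoded)
```

The result matches the encoded-word Gmail produced for the same input, and could be sent as the from.name value in the JSON payload. Whether that sidesteps the display problem reported here has not been verified.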
fhf
March 15, 2025, 12:46pm
13
I just had some time to do some more testing, and found another problematic from.name: انتشارات جمال.
Also, this issue is not limited to gmail, it’s happening on outlook as well.
Sending the same message through SMTP results in this From header (taken from test-SMTP.eml attached below):
From: =?UTF-8?q?=D8=A7=D9=86=D8=AA=D8=B4=D8=A7=D8=B1=D8=A7=D8=AA_=D8=AC=D9=85?=
=?UTF-8?q?=D8=A7=D9=84?= <no-reply@email.ahasend.com>
With the API, the generated From header is (taken from test-API.eml):
From: =?UTF-8?q?=D8=A7=D9=86=D8=AA=D8=B4=D8=A7=D8=B1=D8=A7=D8=AA_=D8=AC=D9=85_?=
=?UTF-8?q?=3D=3FUTF-8=3Fq=3F=3DD8=3DA7=3DD9=3D84=3F=3D?= <noreply@email.ahasend.com>
The API version has a small difference at the end of the first line (85?= in SMTP vs 85_?= in API)
The SMTP message was constructed and sent using PHPMailer.
test-API.eml (5.4 KB)
test-SMTP.eml (5.85 KB)
wez
(Wez Furlong)
March 15, 2025, 2:29pm
15
The =?UTF-8?q?=3D=3FUTF-8=3Fq=3F=3DD part from test-API looks like it is embedding a QP-encoded value of its own; it looks like something has been double encoded. What was the input payload there? Is there some additional logic in your policy that might be modifying the content after it has been generated?
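To make the double encoding visible, here is a small Python sketch that decodes the Q encoded-words of the test-API From header by hand. The helper decode_q_words is hypothetical and handles only UTF-8 Q encoded-words, which is all this header contains. It is an illustration, not a general RFC 2047 parser.

```python
import re

# Matches one UTF-8 "Q" encoded-word and captures its payload.
ENCODED_WORD = re.compile(r"=\?UTF-8\?q\?([^?]*)\?=", re.IGNORECASE)


def decode_q_words(value: str) -> str:
    """Decode every UTF-8 Q encoded-word in `value` and join the results."""
    raw = b""
    for payload in ENCODED_WORD.findall(value):
        # In Q encoding, "_" stands for a space and "=XX" for one byte.
        payload = payload.replace("_", " ").encode("ascii")
        raw += re.sub(
            rb"=([0-9A-Fa-f]{2})",
            lambda m: bytes.fromhex(m.group(1).decode("ascii")),
            payload,
        )
    return raw.decode("utf-8")


# Display-name part of the From header from test-API.eml, unfolded.
api_from = (
    "=?UTF-8?q?=D8=A7=D9=86=D8=AA=D8=B4=D8=A7=D8=B1=D8=A7=D8=AA_=D8=AC=D9=85_?= "
    "=?UTF-8?q?=3D=3FUTF-8=3Fq=3F=3DD8=3DA7=3DD9=3D84=3F=3D?="
)

once = decode_q_words(api_from)   # still ends in a literal encoded-word
twice = decode_q_words(once)      # decodes that leftover encoded-word
print(once)
print(twice)
```

After one decoding pass the name still ends in the literal text =?UTF-8?q?=D8=A7=D9=84?=, which is the kind of leftover that mail clients displayed in the original report. A second pass over that leftover yields the missing last two letters of the name.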
fhf
March 15, 2025, 3:39pm
16
Just double checked: other than adding/removing some headers and meta using set_meta, append_header, and remove_all_named_headers, we’re making no changes to the message in http_message_generated.
This is the payload:
{
  "envelope_sender": "no-reply@email.ahasend.com",
  "content": {
    "text_body": "test",
    "html_body": "test",
    "subject": "test with API",
    "from": {
      "email": "noreply@email.ahasend.com",
      "name": "انتشارات جمال"
    }
  },
  "recipients": [
    {
      "email": "hf.farhad@gmail.com",
      "name": "hf.farhad@gmail.com"
    }
  ]
}
fhf
March 15, 2025, 3:48pm
17
wez
(Wez Furlong)
March 15, 2025, 4:23pm
18
The check_fix_conformance call is what is changing that header value.
wez
(Wez Furlong)
March 15, 2025, 4:23pm
19
local kumo = require 'kumo'

local request = kumo.serde.json_parse [[
{
  "envelope_sender": "no-reply@email.ahasend.com",
  "content": {
    "text_body": "test",
    "html_body": "test",
    "subject": "test with API",
    "from": {
      "email": "noreply@email.ahasend.com",
      "name": "انتشارات جمال"
    }
  },
  "recipients": [
    {
      "email": "hf.farhad@gmail.com",
      "name": "hf.farhad@gmail.com"
    }
  ]
}
]]

for _, msg in ipairs(kumo.api.inject.build_v1(request)) do
  print(msg:get_data())
  msg:check_fix_conformance(
    -- check for and reject messages with these issues:
    'NON_CANONICAL_LINE_ENDINGS',
    -- fix messages with these issues:
    'NEEDS_TRANSFER_ENCODING|MISSING_DATE_HEADER|MISSING_MESSAGE_ID_HEADER|LINE_TOO_LONG'
  )
  print('after fix')
  print(msg:get_data())
end
wez
(Wez Furlong)
March 15, 2025, 4:23pm
20
$ kumod --script --policy ./builder.lua
Content-Type: multipart/alternative;
boundary="0opD4Ub9RUih8hYuaRHXTQ"
To: "hf.farhad@gmail.com" <hf.farhad@gmail.com>
From: =?UTF-8?q?=D8=A7=D9=86=D8=AA=D8=B4=D8=A7=D8=B1=D8=A7=D8=AA_=D8=AC=D9=85?=
=?UTF-8?q?=D8=A7=D9=84?= <noreply@email.ahasend.com>
Subject: test with API
Mime-Version: 1.0
Date: Sat, 15 Mar 2025 16:22:53 +0000
--0opD4Ub9RUih8hYuaRHXTQ
Content-Type: text/plain;
charset="us-ascii"
test
--0opD4Ub9RUih8hYuaRHXTQ
Content-Type: text/html;
charset="us-ascii"
test
--0opD4Ub9RUih8hYuaRHXTQ--
after fix
Content-Type: multipart/alternative;
boundary="0opD4Ub9RUih8hYuaRHXTQ"
To: "hf.farhad@gmail.com" <hf.farhad@gmail.com>
From: =?UTF-8?q?=D8=A7=D9=86=D8=AA=D8=B4=D8=A7=D8=B1=D8=A7=D8=AA_=D8=AC=D9=85_?=
=?UTF-8?q?=3D=3FUTF-8=3Fq=3F=3DD8=3DA7=3DD9=3D84=3F=3D?= <noreply@email.ahasend.com>
Subject: test with API
Mime-Version: 1.0
Date: Sat, 15 Mar 2025 16:22:53 +0000
Message-ID: <bf2ab3ed01b911f09266cc28aa0a5c5a@email.ahasend.com>
--0opD4Ub9RUih8hYuaRHXTQ
Content-Type: text/plain;
charset="us-ascii"
test
--0opD4Ub9RUih8hYuaRHXTQ
Content-Type: text/html;
charset="us-ascii"
test
--0opD4Ub9RUih8hYuaRHXTQ--
wez
(Wez Furlong)
March 15, 2025, 4:24pm
21
So that gives me something to dig into!