Error with Unbound DNS resolver

Hi,
I tried to integrate Unbound as a DNS resolver (configure_unbound_resolver - KumoMTA Docs), but I get the following error on my high-traffic KumoMTA instance.

Okt 21 08:04:41 mta01 kumod[2136529]: [1761026681] libunbound[2136529:0] error: event_add failed. in cpsl.
Okt 21 08:04:41 mta01 kumod[2136529]: [1761026681] libunbound[2136529:0] error: event_add failed. in cpsl.
Okt 21 08:04:41 mta01 kumod[2136529]: [1761026681] libunbound[2136529:0] error: event_add failed. in cpsl.
Okt 21 08:04:41 mta01 kumod[2136529]: [1761026681] libunbound[2136529:0] error: event_add failed. in cpsl.
Okt 21 08:04:41 mta01 kumod[2136529]: [1761026681] libunbound[2136529:0] error: event_add failed. in cpsl.
Okt 21 08:04:41 mta01 kumod[2136529]: [1761026681] libunbound[2136529:0] error: event_add failed. in cpsl.
Okt 21 08:04:41 mta01 kumod[2136529]: [1761026681] libunbound[2136529:0] error: event_add failed. in cpsl.
Okt 21 08:04:41 mta01 kumod[2136529]: [1761026681] libunbound[2136529:0] error: event_add failed. in cpsl.

Hoping to help in some way: I've seen this in the past, but on a machine-level installation. I don't know what kind of Unbound implementation Kumo uses, but whenever I've seen these kinds of errors it's always been due to file descriptor limits/exhaustion. Try checking your ulimit/LimitNOFILE & co. and how many open files the process has via lsof.

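Some checks like:

# soft and hard open-file limits for the running kumod process
cat /proc/$(pidof kumod)/limits | grep "open files"

# also check LimitNOFILE on the systemd unit
# (the unit name "kumomta" is an assumption; adjust it to your setup)
systemctl show kumomta -p LimitNOFILE

# also check the shell hard limit and the kernel per-process ceiling
ulimit -Hn
cat /proc/sys/fs/nr_open

# count the files kumod currently has open
lsof -p $(pidof kumod) | wc -l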
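
To put the count and the limit side by side, the checks above can be sketched as small helper functions. This is a minimal sketch, assuming Linux with /proc mounted; the function names `count_fds`, `soft_nofile` and `fd_usage` are made up for illustration. It counts entries in /proc/<pid>/fd because `lsof -p` also lists things that are not descriptors (memory-mapped files, the working directory and so on), so its line count tends to overstate the real number.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of KumoMTA).

# Count open descriptors of a PID by listing /proc/<pid>/fd (Linux only).
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Read the soft "Max open files" limit of a PID from /proc/<pid>/limits.
# Line layout: "Max open files  <soft>  <hard>  files", so field 4 is soft.
soft_nofile() {
    awk '/^Max open files/ {print $4}' "/proc/$1/limits"
}

# Print "<open descriptors>/<soft limit>" for a PID.
fd_usage() {
    echo "$(count_fds "$1")/$(soft_nofile "$1")"
}
```

Usage would be something like `fd_usage "$(pidof kumod)"`, run as root or as the user that owns the process, since /proc/<pid>/fd of another user's process is not readable otherwise.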

You probably hadn't hit any limits before, but by introducing this new flow inside the Kumo process (more threads, sockets and so on), you might now be hitting those limits (?)

I don't think it's a limit issue, because the error occurs randomly and does not correlate with the number of open files. I get this error every time Unbound is used by KumoMTA. I also don't understand when exactly KumoMTA uses Unbound if both DNS and the Unbound resolver are configured.

Obviously, I'm writing without knowing anything about your specific implementation, just sharing an outside point of view based on my past experience in the Unbound field. It really depends on the load/concurrency: the number of open files can fluctuate a lot, so that *could* explain why you're seeing it randomly.
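
Because the count fluctuates with load, a single snapshot can miss the peak. The following is a rough sketch of a sampler that records the highest descriptor count seen over a period; it assumes Linux with /proc, and the function name `peak_fds` and its arguments are invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: sample the descriptor count of a PID several times
# and print the highest value observed.
#   $1 = pid, $2 = number of samples, $3 = seconds between samples
peak_fds() {
    pid=$1
    samples=$2
    interval=$3
    peak=0
    i=0
    while [ "$i" -lt "$samples" ]; do
        # Each entry in /proc/<pid>/fd is one open descriptor.
        n=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
        if [ "$n" -gt "$peak" ]; then
            peak=$n
        fi
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
    echo "$peak"
}
```

For example, `peak_fds "$(pidof kumod)" 600 1` would watch the process for roughly ten minutes; comparing the timestamps of the libunbound errors against such samples is one way to verify or rule out a correlation.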
Also, IMO there’s quite a difference between not thinking it’s a limit issue and actually verifying it :wink:

Here are my limits, following your suggestions:

cat /proc/$(pidof kumod)/limits | grep "open files"
Max open files 524288 files 

ulimit -Hn
999999

cat /proc/sys/fs/nr_open
1048576

lsof -p $(pidof kumod) | wc -l
# mostly between 5000 and 16000

Of course it is possible that the error is caused by limits, but I have no idea how to check that. And if limits are the reason for this error, how can I fix it? Is there a matching config option for KumoMTA?
Thanks for your help!

In my experience I've seen similar (or the same) errors with Unbound for the reasons I mentioned above (limits), but in this case it could also be some internal limit or resource "leak" within Kumo that isn't visible just by looking at the open files.
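
One internal limit that might be worth ruling out, offered here as an assumption and not as a diagnosis: event loops built on select() can only watch descriptor numbers below FD_SETSIZE (1024 on typical Linux/glibc builds), no matter how high the process ulimit is. Nothing in this thread establishes which event backend the libunbound embedded in kumod uses. The sketch below only reports the highest descriptor number currently open in a process (Linux with /proc assumed; the function name `max_fd_number` is hypothetical).

```shell
#!/bin/sh
# Hypothetical helper: print the highest descriptor number currently open
# in a PID. Entries in /proc/<pid>/fd are named after the descriptor number.
max_fd_number() {
    ls "/proc/$1/fd" 2>/dev/null | sort -n | tail -n 1
}
```

Running `max_fd_number "$(pidof kumod)"` on a busy instance shows how high the numbers go. With 5000 to 16000 open files, newly created sockets would normally receive numbers well above 1024, which would only matter if a select()-style backend is actually in use.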

It might be worth waiting for input from the Kumo team, and perhaps consider collecting a stack trace while the issue is happening, as described here: Troubleshooting KumoMTA - KumoMTA Docs

BUT now I’ll leave it to those who know the internals better than me :slightly_smiling_face:

I’d recommend sticking with hickory in pretty much all cases. The only reason to consider using the embedded unbound resolver is if you absolutely require working DANE. If that is true for you, you should consider becoming a sponsor in order to run down this sort of integration issue.