Version: 2025.05.06-b29689af (I haven’t had time to upgrade to the latest version yet.)
Recently, I noticed that the number of open files keeps increasing, so I’m wondering whether some connections in the SOCKS service aren’t being closed in time.
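To tell whether the growth really comes from sockets rather than regular files, one option is to look at the process's entries under /proc/<pid>/fd. Below is a minimal Python sketch, assuming a Linux host and enough privileges to read the proxy's fd directory; the helper name `count_fds` is made up for illustration.

```python
import os

def count_fds(fd_dir):
    """Return (total, sockets) for the entries of a /proc/<pid>/fd directory."""
    total = 0
    sockets = 0
    for name in os.listdir(fd_dir):
        total += 1
        try:
            # Each entry is a symlink; sockets point at "socket:[inode]".
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.startswith("socket:"):
            sockets += 1
    return total, sockets
```

Calling `count_fds("/proc/<pid>/fd")` with the proxy's PID every few minutes and comparing the two numbers should show whether the socket count is what keeps climbing.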
I don’t know your specific workload, Kumo configuration, distribution of contacted domains, shaping, etc., so I’m sharing the notes below in general terms. I meant to post them in another thread some time ago but never did, so I’m taking the opportunity now. I hope they’re useful to others as well.
These are practical tips from years (and years) of working with **Momentum** and from recent load testing on Kumo. Treat them as information, and apply them only if you understand the trade-offs and after proper validation. Some of this appears in the System Preparation docs, but IMO it’s easy to miss.
Systemd limits
For kumo / kumo-proxy (depending on your cluster), raise the following, as already mentioned in other threads:
LimitNOFILE
LimitNPROC
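After changing the unit and restarting, it is worth confirming that the running process actually picked up the new values. Here is a small Python sketch that parses the text of /proc/<pid>/limits, where LimitNOFILE shows up as "Max open files" and LimitNPROC as "Max processes". The function name `parse_limits` is hypothetical, and values are kept as strings because a limit can be reported as "unlimited".

```python
def parse_limits(text, names=("Max open files", "Max processes")):
    """Extract (soft, hard) limits for the given rows of /proc/<pid>/limits."""
    result = {}
    for line in text.splitlines():
        for name in names:
            if line.startswith(name):
                # The rest of the row is: soft limit, hard limit, units.
                soft, hard = line[len(name):].split()[:2]
                result[name] = (soft, hard)
    return result
```

For example, `parse_limits(open("/proc/<pid>/limits").read())` with the PID of the kumo or proxy process returns the soft and hard values that are really in effect.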
TCP /proc/sys/net/ipv4 tuning for outbound (heavy) workloads
It might take a long time, or it may have no effect on the existing FIN-WAIT-2 connections: the number hasn’t dropped (in fact, it has increased slightly).
I’m not planning to restart the proxy server yet, so that I can keep troubleshooting the issue.
Just for info, have you checked which peers are involved in the FIN-WAIT-2 connections, to see whether the issue can be isolated to a specific host or domain?
Of course, we’ve analyzed that. Our KumoMTA deployment is in China, and it’s well known that email sent from China is often treated differently, which results in a large number of connection issues.
ss -ant state fin-wait-2
0 0 myip:36737 203.138.180.112:25
xxxx
many
many
many
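With that many rows, grouping them by peer makes it easier to see whether a handful of hosts dominate. The following is a rough Python sketch that takes the text printed by `ss -ant state fin-wait-2` and counts connections per peer address and port. It assumes the four-column layout shown above (Recv-Q, Send-Q, local, peer); `peers_by_count` is a made-up name.

```python
from collections import Counter

def peers_by_count(ss_output):
    """Count FIN-WAIT-2 rows per (peer host, peer port), most frequent first."""
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Data rows start with two numeric queue sizes; this skips the header.
        if len(fields) < 4 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        # Split on the last colon so bracketed IPv6 addresses stay intact.
        host, _, port = fields[3].rpartition(":")
        counts[(host, port)] += 1
    return counts.most_common()
```

Feeding it the captured output, for instance via `subprocess.run(["ss", "-ant", "state", "fin-wait-2"], capture_output=True, text=True).stdout`, gives a ranked list that can then be mapped back to destination domains.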
I also looked it up on DeepWiki, and they suspect that the proxy server may have connection leaks in certain situations. That’s why I reported it here for review.
Connections in the FIN-WAIT-2 state are governed by the Linux kernel’s tcp_fin_timeout parameter, which defaults to 60 seconds. As far as I understand, though, that timeout only applies to orphaned sockets, i.e. ones the application has already closed. The kernel cleans those up on its own after the timeout, but a socket whose file descriptor is still held by the process can stay in FIN-WAIT-2 indefinitely.
So, I’m not quite sure where the problem actually lies.
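One way to narrow this down: the kernel documentation describes tcp_fin_timeout as covering orphaned connections, meaning sockets no longer referenced by any application, so it helps to know how many of the FIN-WAIT-2 sockets are still attached to a process. Below is a hedged Python sketch that reads the current timeout and splits the output of `ss -antp state fin-wait-2` (run as root so the process column is populated) into owned and orphaned rows. Both function names are hypothetical.

```python
def read_fin_timeout(path="/proc/sys/net/ipv4/tcp_fin_timeout"):
    """Return the current tcp_fin_timeout value in seconds."""
    with open(path) as f:
        return int(f.read().strip())

def split_owned_orphaned(ss_output):
    """Return (owned, orphaned) counts from `ss -antp state fin-wait-2` text."""
    owned = 0
    orphaned = 0
    for line in ss_output.splitlines():
        fields = line.split()
        # Data rows start with two numeric queue sizes; this skips the header.
        if len(fields) < 4 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        # ss -p appends users:(("name",pid=...,fd=...)) for attached sockets.
        if "users:(" in line:
            owned += 1
        else:
            orphaned += 1
    return owned, orphaned
```

If most rows are owned, the process is still holding the descriptors and the kernel timeout will not reap them, which would match the growing open-file count. If most are orphaned, they should disappear on their own after roughly the tcp_fin_timeout interval.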