Network Engineer here! I can shed some light on the issue.
VRC uses UDP port 3290 for voice traffic. UDP is connectionless: there is no handshake, so the two parties never agree on how or when data will be sent. When a pilot transmits on frequency, the voice server relays that to you as a UDP stream. The problem occurs when your router receives that traffic and has no NAT entry telling it which device should get it (your computer vs. your phone vs. your smart TV), so it drops the packets.
Keying up every so often alleviates the problem because your router tracks each UDP flow for some period of time after it last saw outbound traffic (usually 30 seconds to a minute, depending on the router). When you key up, VRC sends UDP packets out to the voice server and your router records that flow. While that entry is alive, the router can match the incoming UDP stream from the voice server against it and send the stream to your computer, where VRC gets it.
Port forwarding also alleviates the problem because now your router has a permanent rule telling it to send ANY UDP data it receives on port 3290 to your computer (whether it comes from the voice server or not).
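If it helps to see the logic spelled out, here is a rough toy model of what the router is doing. It is only a sketch, not real router code: the `NatRouter` class, its method names, and the 30-second timeout are all made up for illustration. A real NAT keys its table on the full address/port tuple and may rewrite ports, and I am assuming here that only outbound traffic refreshes the entry.

```python
class NatRouter:
    """Toy model of a home router's UDP NAT table (hypothetical, for illustration)."""

    def __init__(self, timeout=30.0):
        self.timeout = timeout  # assumed idle timeout in seconds
        self.mappings = {}      # port -> (LAN host, time of last outbound packet)
        self.forwards = {}      # port -> LAN host (permanent port-forward rules)

    def outbound(self, host, port, now):
        """A LAN host sends a UDP packet out: create or refresh the mapping."""
        self.mappings[port] = (host, now)

    def add_port_forward(self, port, host):
        """Install a permanent rule, like a port forward in the router's UI."""
        self.forwards[port] = host

    def inbound(self, port, now):
        """Return the LAN host that receives an inbound packet, or None if dropped."""
        if port in self.forwards:
            return self.forwards[port]   # permanent rule always wins
        entry = self.mappings.get(port)
        if entry is None:
            return None                  # router has no idea who this is for
        host, last_seen = entry
        if now - last_seen > self.timeout:
            del self.mappings[port]      # mapping went stale, forget it
            return None
        return host


router = NatRouter(timeout=30.0)
print(router.inbound(3290, now=0.0))    # None: nobody has keyed up yet
router.outbound("my-pc", 3290, now=0.0)  # you key up
print(router.inbound(3290, now=10.0))   # my-pc: mapping is still fresh
print(router.inbound(3290, now=45.0))   # None: mapping timed out
router.add_port_forward(3290, "my-pc")
print(router.inbound(3290, now=1000.0))  # my-pc: permanent rule
```

The two workarounds map directly onto this: keying up is `outbound()` refreshing the timer, and port forwarding is `add_port_forward()` skipping the timer entirely.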
This boils down to the fact that NAT sucks. It is a hack that was implemented to stretch the lifetime of the IPv4 address space. Most of the time, it works great, but things like this show its flaws.
Hope this helps someone understand what's going on.