In #5061 it was suggested that the primary cause of empty blocks is ASICs taking some time to switch to new work, but this cannot be the reason for empty blocks - pools can mine empty or full blocks no matter how long ASICs do (or do not) take to switch to new work.
At some time T a new block is found, and at some time T+1 a pool hears about this new block and wants to switch its miners to it. If the pool does a full validation of the new block (and updates its mempool) prior to sending new work, it can switch miners to new work with transactions immediately. If, then, at time T+2 an ASIC finds a block based on the prior work, it doesn't matter whether that work included transactions or not - the block is built on the previous tip and is just a fork.
Instead, the relevant question for empty blocks is whether the pool waits until it can fully validate the block and update its mempool (or in fact even *has* the block) before sending new work, i.e. whether the first block template it selects contains transactions or not. #5061 tried to explain this away with an incredibly weak reference to bandwidth, but the cost of a merkle branch for a block with 4k transactions is only ~800 bytes of JSON'd hex, which shouldn't even push you into a second IP packet, let alone make for substantial bandwidth.
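As a rough check on the ~800 byte figure, here is a minimal Python sketch. It assumes the branch is sent the way Stratum's `mining.notify` sends it, as a JSON array of hex-encoded 32-byte hashes linking the coinbase to the merkle root. The function names and the zero-filled placeholder hashes are purely illustrative.

```python
import json


def merkle_branch_len(tx_count: int) -> int:
    """Number of sibling hashes needed to link the coinbase (index 0) to the root.

    This is the tree depth, ceil(log2(tx_count)), computed with integers only.
    """
    if tx_count < 1:
        raise ValueError("a block has at least the coinbase transaction")
    return (tx_count - 1).bit_length()


def branch_json_bytes(tx_count: int) -> int:
    """Size in bytes of the branch as a compact JSON array of hex strings."""
    # Placeholder hashes: only the length matters here, not the contents.
    branch = ["00" * 32] * merkle_branch_len(tx_count)
    return len(json.dumps(branch, separators=(",", ":")).encode("ascii"))


if __name__ == "__main__":
    n = 4000
    print(merkle_branch_len(n), "hashes,", branch_json_bytes(n), "bytes")
```

For 4,000 transactions this gives 12 hashes and 805 bytes of compact JSON, which fits comfortably under a typical 1500-byte Ethernet MTU (the MTU value is an assumption about the link, not something taken from #5061).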
It's worth noting that it can take many seconds for a block to make it from one pool to the others, and it can take a further second or two to validate a block even on fast hardware (in cases where Bitcoin Core decides to flush to disk due to cache fill), so it makes sense that you'll see empty blocks for some seconds after the previous block. @mononaut posted a chart convincingly demonstrating this - for the first few seconds after a previous block ~all blocks are empty blocks (strong evidence for spy mining or otherwise sending new work prior to updating the local mempool) and we see some dropoff of empty blocks thereafter on a relatively expected distribution.
There are likely some empty blocks which are mined further past the previous block due to ASICs holding onto previous work. However, the only reason those ASICs *have* previous work that is an empty block is spy mining, or otherwise not tuning Bitcoin Core on the pool side to ensure the pool can get blocks and update its mempool fast enough to not need to send empty work at all.