For bare-metal hypervisors, raw performance should be slightly below native (my guess is a penalty of less than 10%). There is also some configuration that needs to be done on the hypervisor end to give guest VMs full access to the hardware. So if your use case is running a Windows guest for gaming and a Linux guest as a game server, that is possible on the same set of hardware.
As others here have pointed out, the catch is in the use of a thin client. The encoding and network transmission add latency while often lowering quality. There are cards like AMD's FirePro R5000 that have hardware encoders and dedicated networking interfaces for this task, which gives them lower latency than software-based solutions. At work I've been looking at a PCoIP solution for a presentation system. It does uncompressed 1280 x 720 with the color space reduced to YUV 4:2:2 (PC displays are typically YUV 4:4:4). The results aren't bad for what you get, but I'm likely going to pass on it as I'm looking for 1920 x 1080 support. The units I've looked at all have some form of compression involved to reduce the bandwidth requirements (and to be fair, I haven't yet looked at if/what compression the R5000 does).
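To put rough numbers on why these units reduce the color space or compress, here is a back-of-the-envelope sketch in Python. The 60 Hz refresh rate and 8 bits per sample are my assumptions, not specs from any of the units above, and the function name is just for illustration. It also ignores blanking intervals and network protocol overhead, so real requirements would be somewhat higher.

```python
# Rough raw (uncompressed) video bandwidth estimate.
# Assumptions (not from any vendor spec): 60 frames/s, 8 bits per sample,
# no blanking intervals, no packet/protocol overhead.

# Average number of samples stored per pixel for each chroma format:
# 4:4:4 keeps luma plus two full-resolution chroma samples (3 per pixel),
# 4:2:2 halves the horizontal chroma resolution (2 per pixel),
# 4:2:0 halves chroma in both directions (1.5 per pixel).
SAMPLES_PER_PIXEL = {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5}


def raw_video_bitrate(width, height, fps=60, bits_per_sample=8, chroma="4:4:4"):
    """Return the uncompressed video bitrate in bits per second."""
    return width * height * fps * bits_per_sample * SAMPLES_PER_PIXEL[chroma]


if __name__ == "__main__":
    cases = [
        (1280, 720, "4:2:2"),
        (1280, 720, "4:4:4"),
        (1920, 1080, "4:2:2"),
        (1920, 1080, "4:4:4"),
    ]
    for width, height, chroma in cases:
        gbps = raw_video_bitrate(width, height, chroma=chroma) / 1e9
        print(f"{width} x {height} {chroma}: {gbps:.2f} Gbit/s")
```

Under those assumptions, 1280 x 720 at 4:2:2 comes out to roughly 0.88 Gbit/s, which would just squeeze onto a gigabit link, while 1920 x 1080 at 4:4:4 is about 2.99 Gbit/s. That gap would explain why 1080p units tend to rely on some form of compression.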
Another alternative I've come across is HDBaseT. While it uses standard RJ-45 connectors and Cat6 cabling, it is not compatible with Ethernet (so don't plug it into your switch). Basically it is a cost-effective way to run video over tens of meters of cable. If you need a display far from where your system is (but at the same general location), this would be an alternative.