TraceMonkey's big trick is trace trees. A loop gets back to its top through a jump. Once that jump has been taken enough times, it is considered hot and worth optimizing. When that decision is made, a trace is recorded from the top of the loop to the jump, and that entire trace is then inlined.
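To make the "hot jump" idea concrete, here is a minimal sketch of a toy interpreter that counts backward jumps and records a trace once a jump crosses a threshold. Everything in it is hypothetical and for illustration only: the instruction set, the `run` function, and the `HOT_THRESHOLD` value are assumptions, not how TraceMonkey is implemented. A real tracing JIT records the operations as they execute and compiles them to native code; this sketch only slices out the straight-line loop body.

```python
HOT_THRESHOLD = 3  # hypothetical value; real VMs tune this


def run(program, threshold=HOT_THRESHOLD):
    """Run a toy program and return (accumulator, recorded traces).

    Instructions are tuples (all hypothetical):
      ("add", n)                    add n to the accumulator
      ("jump_if_lt", target, limit) jump to target while accumulator < limit
      ("halt",)                     stop
    """
    acc = 0
    pc = 0
    jump_counts = {}  # pc of a backward jump -> times it was taken
    traces = {}       # pc of a hot backward jump -> recorded loop body

    while True:
        op = program[pc]
        if op[0] == "add":
            acc += op[1]
            pc += 1
        elif op[0] == "jump_if_lt":
            _, target, limit = op
            if acc < limit:
                if target <= pc:  # a backward jump closes a loop
                    jump_counts[pc] = jump_counts.get(pc, 0) + 1
                    if jump_counts[pc] == threshold:
                        # The jump is now hot: record the trace from the
                        # top of the loop up to and including the jump.
                        traces[pc] = program[target:pc + 1]
                pc = target
            else:
                pc += 1
        elif op[0] == "halt":
            return acc, traces
        else:
            raise ValueError(f"unknown instruction: {op!r}")


loop = [("add", 1), ("add", 2), ("jump_if_lt", 0, 30), ("halt",)]
total, recorded = run(loop)
```

A loop that only goes around once or twice never reaches the threshold, so nothing is recorded for it; that is the whole point of waiting for the jump to get hot.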
Python could potentially do this if we ever figure out a good way to do inlining. Python's rich calling semantics make me wonder how easy inlining would be, but I still think it is doable. I think the first step toward pulling something like this off would be to get simple inlining working for functions and methods and then build on that.
V8 does something called hidden classes. Basically, when a class is created, a hidden version is made that turns attribute access into a memory offset. As attributes are added, more hidden classes are created, with each new hidden class allowing one more attribute to be accessed. The potential problem with doing this for Python is that namespace lookups are structured around dicts, and being able to override __dict__ is part of the design of the language. So if anything like this were done, a special namespace dict would be needed that could translate dict lookups into the proper memory offset lookups.
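A rough sketch of the hidden class idea, written in Python purely for illustration. The names `HiddenClass`, `Instance`, `ROOT`, and `with_attribute` are made up here and do not correspond to anything in V8 or CPython. Each hidden class maps attribute names to slot offsets, and adding an attribute moves the object to a new hidden class with one more slot.

```python
class HiddenClass:
    """Maps attribute names to slot offsets (hypothetical sketch)."""

    def __init__(self, offsets=None):
        self.offsets = offsets or {}
        self.transitions = {}  # attribute name -> next HiddenClass

    def with_attribute(self, name):
        """Return the hidden class reached by adding one more attribute."""
        if name not in self.transitions:
            new_offsets = dict(self.offsets)
            new_offsets[name] = len(new_offsets)  # next free slot
            self.transitions[name] = HiddenClass(new_offsets)
        return self.transitions[name]


ROOT = HiddenClass()  # shared starting point for every object


class Instance:
    """An object whose attributes live in a flat list of slots."""

    def __init__(self):
        self.hidden = ROOT
        self.slots = []

    def set(self, name, value):
        offset = self.hidden.offsets.get(name)
        if offset is None:
            # New attribute: move to the next hidden class, grow by one slot.
            self.hidden = self.hidden.with_attribute(name)
            self.slots.append(value)
        else:
            self.slots[offset] = value

    def get(self, name):
        return self.slots[self.hidden.offsets[name]]


a = Instance()
a.set("x", 1)
a.set("y", 2)
b = Instance()
b.set("x", 10)
b.set("y", 20)
same_layout = a.hidden is b.hidden  # both took the same x -> y path
```

Note that this sketch still does a dict lookup to find the offset. The payoff in a real VM comes from caching the offset at the access site once the hidden class is known, which is not shown here.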
In the first announcement for SquirrelFish, the speedup came from moving off of AST execution and over to a bytecode interpreter. Probably the biggest difference between SquirrelFish and Python is the use of registers instead of a stack. I have always been curious whether a register VM might be of some benefit to Python. SquirrelFish Extreme basically adds what V8 does with hidden classes.
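For a feel of the register versus stack difference, here is a small sketch that evaluates c = a + b both ways. Both instruction sets and the `run_stack` and `run_register` functions are invented for this example; they are not SquirrelFish's or CPython's actual bytecode. The stack version has to push both operands and pop the result, while the register version names its operands directly in a single instruction.

```python
def run_stack(code, variables):
    """Execute (op, arg) pairs on an operand stack; return instructions run."""
    stack = []
    executed = 0
    for op, arg in code:
        executed += 1
        if op == "LOAD":
            stack.append(variables[arg])
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "STORE":
            variables[arg] = stack.pop()
        else:
            raise ValueError(f"unknown op: {op}")
    return executed


def run_register(code, registers):
    """Execute (op, dst, src1, src2) tuples on registers; return instructions run."""
    executed = 0
    for op, dst, src1, src2 in code:
        executed += 1
        if op == "ADD":
            registers[dst] = registers[src1] + registers[src2]
        else:
            raise ValueError(f"unknown op: {op}")
    return executed


# c = a + b on the stack machine: four instructions.
variables = {"a": 2, "b": 3}
stack_code = [("LOAD", "a"), ("LOAD", "b"), ("ADD", None), ("STORE", "c")]
stack_count = run_stack(stack_code, variables)

# The same thing on the register machine: one instruction.
registers = [2, 3, 0]  # r0 = a, r1 = b, r2 = c
register_count = run_register([("ADD", 2, 0, 1)], registers)
```

Fewer instructions means fewer trips through the dispatch loop, although each register instruction is wider and has more operands to decode, so whether this would be a net win for Python is still an open question.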
Two things seem to be common among the VMs. First, they use research done for the Self programming language. While Self itself is prototype-based, some of that work could still be applied to Python. Second, all of the interpreters use JIT compilation to native code when possible.