This paper presents an improved hardware acceleration scheme for Java method calls in the REALJava coprocessor. The scheme is implemented in an FPGA prototype, so the performance gains can be measured on real hardware. The prototype validates the coprocessor concept for accelerating Java bytecode execution in embedded systems with limited CPU performance and memory. The coprocessor architecture is highly modular: it separates communication from the execution core, which improves reusability and allows the system to scale.