Performance and efficiency are major goals of REPY, which may be surprising given Python's reputation as a 'slow' language, and the fact that the official interpreter lags behind many third-party implementations. Even so, it is entirely possible to use REPY without performance issues, even for code that needs to run every frame.
That being said, a poorly designed mod can definitely result in serious lag. This page goes over a few common pitfalls that can badly hurt your mod's performance.
The current implementation of REPY's Interpreter Stack (the mechanism that manages thread and interpreter tracking) only changes the PyThreadState when the stack is empty, or when the value at the top of the stack changes. This design has a potential pitfall, however: accidentally creating a situation where the current PyThreadState is released too often can result in considerable slowdown. In fact, PyThreadState management can end up taking more CPU time than the actual Pythonic operations that the PyThreadState is meant to encapsulate.
This can often happen when encapsulating simple Python operations in short C functions. Consider the following example:
Remember that REPY_FN_SETUP and REPY_FN_CLEANUP make calls to REPY_PushInterpreter and REPY_PopInterpreter, respectively. If we assume that the Interpreter Stack is empty when main_func is called, then REPY will acquire a PyThreadState during pyfunc1, release it, and then immediately acquire it again during pyfunc2. This is quite inefficient. While one unnecessary release won't make a noticeable difference, lots of reacquisitions can add up, particularly if main_func is called many times per frame.
Fortunately, the fix here is rather simple: wrap the calls to pyfunc1 and pyfunc2 between their own REPY_PushInterpreter and REPY_PopInterpreter calls, like so:
This ensures that REPY holds onto the PyThreadState across the calls to both pyfunc1 and pyfunc2, releasing it only at the call to REPY_PopInterpreter in main_func. This way, the PyThreadState is released just once, and the resulting code is MUCH more efficient.
Two additional notes about this concept:
There are plans to optimize REPY's management of PyThreadState in a future update.
Currently, there are some performance issues with how subinterpreters are handled, which significantly exacerbate the thread state release issues from the above section; similar issues can occur when REPY_PopInterpreter merely results in switching interpreters. Until these issues can be fixed (in a future update), subinterpreters should be used sparingly.
It goes without saying that making the Python interpreter parse the same string of code every time that string needs executing is highly inefficient. In fact, in many cases, the act of parsing the code string can consume more CPU time than the actual execution.
Fortunately, the maxim that "in Python, everything is an object" applies to the parsed executable code as well. By leveraging Python's built-in compile function, we can obtain a Python code object that can be run using exec or eval. This means that a Python code string can be parsed once and then executed as many times as necessary.
REPY exposes this capability in the C API in many ways, but by far the easiest to use are REPY_FN_EXEC_CACHE and the different variants of REPY_FN_EVAL_CACHE. These macros parse the Python code string on first execution and store the code object as a static variable. That code object can then be used to perform all subsequent executions, which dramatically improves performance in most cases.
In summary, this code block:
will be much more performant than this one:
For further information on preparsing Python code, consider looking at the following functions and macros: