A script instance does only one thing at a time, although that one thing may be interrupted and resumed at any time. It is always one of these:
BEGIN, END, main and awk functions are the four entry points for executing the script. Normally BEGIN is run right after setting up the script, then main is run on all input, and END is run when the script exits, right before uninitialization of the script instance. This is a 1:1 copy of the standard way awk works. The fourth, calling awk functions directly from the application, is an extra entry point.
The script does nothing unless the application commands it to. Some of the simplified API does this automatically, but the raw API (staged init/uninit) always lets the application decide when to start running the script. This document uses the term execution transaction for each call the application makes to the API to start running a script.
Any execution-related call is non-blocking: it returns after a reasonable time spent running the script and never gets stuck running an infinite loop. When such an API call returns, the return value is a mawk_exec_result_t that indicates the reason for the return:
Execution transactions are collected on the evaluation stack. If the application requests an execution and the API call returns before finishing, the transaction is still active. The application is free to initiate a new execution transaction without first finishing the previous one. However, the VM will always resume and progress the most recent execution transaction. This means execution transactions are, in effect, nested. When the top, most recent execution transaction finishes (return 3), the next resume request will go on with the previous transaction.
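The nesting rule can be illustrated with a toy model. This is only a sketch: the toy_ types and functions below are hypothetical and are not part of the libmawk API. Each transaction is reduced to a counter of how many resume requests it still needs, and the stack is a fixed-size array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, NOT the libmawk API: a transaction is just a
   label plus the number of resume steps it still needs. */
typedef struct {
    int id;        /* label of the transaction */
    int remaining; /* resume steps still needed before it finishes */
} toy_txn_t;

#define TOY_MAX_TXN 8

typedef struct {
    toy_txn_t stack[TOY_MAX_TXN];
    size_t used; /* number of unfinished transactions */
} toy_vm_t;

/* Start a new transaction on top of any unfinished ones. */
static void toy_txn_start(toy_vm_t *vm, int id, int steps)
{
    assert(vm->used < TOY_MAX_TXN);
    assert(steps > 0);
    vm->stack[vm->used].id = id;
    vm->stack[vm->used].remaining = steps;
    vm->used++;
}

/* Progress the most recent transaction by one step and return its id,
   or -1 if nothing is pending. A finished transaction is popped, so the
   next resume continues with the previous one. */
static int toy_txn_resume(toy_vm_t *vm)
{
    toy_txn_t *top;
    int id;

    if (vm->used == 0)
        return -1;
    top = &vm->stack[vm->used - 1];
    id = top->id;
    top->remaining--;
    if (top->remaining == 0)
        vm->used--;
    return id;
}
```

In this model, starting transaction 2 while transaction 1 is still unfinished makes every resume progress transaction 2 until it is done; only then does a resume reach transaction 1 again.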
Note, however, that the script has global state. The most obvious one is the exit state: if the script runs exit(), it will discard all open transactions. For example, consider a script that is running a main part processing the input. When the application is in this phase, the topmost transaction is always a "running main" transaction that returned previously because there was no more input to be processed. If the application calls an awk function that decides to do an exit(), that will discard not only the function transaction but the pending "running main" transaction as well. The next time the application requests a resume, the END section will start running.
    { print "prefix:", $0 }

The application fills the FIFO with some data that may contain one or more full records, potentially ending with a partial (unterminated) record. If the application resumes the script, it will try to read all full records and process them. It will interrupt execution and return MAWK_EXER_INT_READ the first time a full record can't be read. This always happens "before the {}".
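The record handling described above can be sketched with a toy FIFO. Everything named toy_ here is hypothetical and only mimics the behaviour in the text; it assumes newline-terminated records and a small flat buffer, neither of which is claimed about the real implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy FIFO, NOT the libmawk one: a flat byte buffer. */
typedef struct {
    char buf[256];
    size_t len; /* bytes currently stored */
} toy_fifo_t;

/* Application side: append a chunk of input, possibly ending mid-record. */
static void toy_fifo_write(toy_fifo_t *f, const char *data)
{
    size_t n = strlen(data);

    assert(f->len + n <= sizeof(f->buf));
    memcpy(f->buf + f->len, data, n);
    f->len += n;
}

/* Script side: consume every complete, newline-terminated record and
   return how many were processed. The loop stops the first time a full
   record cannot be read; this is the point where the text says the VM
   returns MAWK_EXER_INT_READ. A trailing partial record stays queued. */
static int toy_resume_main(toy_fifo_t *f)
{
    int records = 0;
    char *nl;

    while ((nl = (char *)memchr(f->buf, '\n', f->len)) != NULL) {
        size_t rec_len = (size_t)(nl - f->buf) + 1;

        /* the main block would run on the record here */
        memmove(f->buf, f->buf + rec_len, f->len - rec_len);
        f->len -= rec_len;
        records++;
    }
    return records;
}
```

Writing "a\nb\npart" and resuming processes two records and leaves the four bytes of "part" in the buffer. A later write of "ial\n" completes that record, so the next resume processes it.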
A slightly more complicated script prefixes odd and even lines differently:
    {
        print "odd:", $0
        getline
        print "even:", $0
    }

This script may return with MAWK_EXER_INT_READ either before {} or in the getline instruction. This means the application should not assume that when main returns it was not in the middle of such a block. (In the actual VM main starts with an implicit getline so there's no difference between the two cases).
A similar situation is when an awk function is executing getline on a FIFO: the application that calls the function shall not expect that the function finishes and produces its return value in the initial execution request. Instead the request will create a new execution transaction and multiple resume calls may be needed until the function actually returns.
Obviously the application shall fill the FIFO while executing resumes: if there is no new input and the script is waiting for new input, the resume call will return immediately.
This feature is useful when the application is implemented as a single threaded async loop: running a blocking script would block the entire loop.
The application shall never expect that the initial call that
created the new execution transaction will end in
MAWK_EXER_DONE or MAWK_EXER_FUNCRET; when it does not,
a subsequent resume call eventually will.
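A minimal sketch of the resulting calling pattern follows. It does not use the real API: the TOY_EXER_ codes are stand-ins for the result codes named in the text, and toy_call_t with its two counters is a hypothetical model of a function call that needs to read several records.

```c
#include <assert.h>

/* Stand-ins for the result codes named in the text; the toy_ names and
   their values are assumptions made for this sketch only. */
typedef enum {
    TOY_EXER_INT_READ, /* interrupted: waiting for more input */
    TOY_EXER_FUNCRET   /* the function returned */
} toy_exec_result_t;

typedef struct {
    int needed;    /* records the toy function still has to read */
    int available; /* full records currently queued in the toy FIFO */
} toy_call_t;

/* One execution or resume request: consume what is available and finish
   only once every needed record has been read. */
static toy_exec_result_t toy_call_resume(toy_call_t *c)
{
    assert(c->needed >= 0 && c->available >= 0);
    while (c->needed > 0 && c->available > 0) {
        c->needed--;
        c->available--;
    }
    return (c->needed == 0) ? TOY_EXER_FUNCRET : TOY_EXER_INT_READ;
}

/* Application side: do not assume the initial request finishes. Feed one
   record and resume until the function returns; the result is the total
   number of requests that were needed. */
static int toy_run_to_completion(toy_call_t *c)
{
    int requests = 1;
    toy_exec_result_t res = toy_call_resume(c); /* initial request */

    while (res == TOY_EXER_INT_READ) {
        c->available++;                         /* application fills the FIFO */
        res = toy_call_resume(c);               /* resume the same transaction */
        requests++;
    }
    return requests;
}
```

A real single-threaded async application would normally return to its event loop between resumes rather than spin in a while loop; the loop here only keeps the sketch short.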
return path 4.: MAWK_EXER_EXIT
Similar to MAWK_EXER_DONE, but means the script called exit.
This is legal even from an awk function call, in which case the
function will never have a return value (as the code cannot be resumed
any more). Normal awk rules apply: calling exit() from BEGIN or main
(or subsequent functions, called by the script or the application) puts
the script in exit mode and the next resume will run END. Calling exit()
from END will exit immediately, leaving the script in non-runnable state.
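The exit rules above can be read as a small state machine. The sketch below is a hypothetical model with invented toy_ names, not code taken from the VM, and it covers only the transitions stated in the text.

```c
#include <assert.h>

/* Hypothetical phases of a script instance, for illustration only. */
typedef enum {
    TOY_BEGIN_OR_MAIN, /* running BEGIN, main, or a function called from there */
    TOY_EXIT_MODE,     /* exit was called; END has not run yet */
    TOY_END,           /* running END */
    TOY_NOT_RUNNABLE   /* nothing can be resumed any more */
} toy_phase_t;

/* Effect of the script calling exit in the given phase. */
static toy_phase_t toy_exit(toy_phase_t p)
{
    /* a non-runnable script executes no code, so it cannot call exit */
    assert(p != TOY_NOT_RUNNABLE);
    if (p == TOY_END)
        return TOY_NOT_RUNNABLE; /* exit from END: stop immediately */
    return TOY_EXIT_MODE;        /* otherwise END is still to be run */
}

/* Effect of the next resume request on the phase. */
static toy_phase_t toy_phase_resume(toy_phase_t p)
{
    if (p == TOY_EXIT_MODE)
        return TOY_END;          /* the resume after exit runs END */
    return p;                    /* other phases keep their phase */
}
```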
conclusion: script execution
It is safe to assume calling any script execution will return with
a conclusion if, and only if:
Since these are not guaranteed in most common use cases, the code should be prepared to:
Thus the following C pseudo-code should be used:
TODO