Layered I/O has been the holy grail of Apache module writers for years. With Apache 2.0, module writers can finally take advantage of layered I/O in their modules.
In all previous versions of Apache, only one handler could modify the data stream sent to the client. With Apache 2.0, a module can modify the data and then indicate that other modules may modify it further.
Making a module use layered I/O requires some modifications. A new return value, RERUN_HANDLERS, has been added for modules. When a handler returns this value, the core searches through the list of handlers for another module that wants to try the request.
When a module returns RERUN_HANDLERS, it must modify two fields of the request_rec: the handler and content_type fields. Most modules will set the handler field to NULL and allow the core to choose which module runs next. If these two fields are not modified, the server will loop forever, calling the same module's handler.
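The rerun rule above can be sketched as a small dispatch loop. This is only an illustration under stated assumptions: every name in it (mock_request, mock_invoke_handlers, the MOCK_* constants, the handler table) is hypothetical and is not the Apache definition, and the pass limit is added purely so the sketch can demonstrate the looping hazard without hanging.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are the real Apache definitions. */
#define MOCK_OK               0
#define MOCK_DECLINED        -1
#define MOCK_RERUN_HANDLERS   1
#define MOCK_LOOP_DETECTED   -2
#define MOCK_MAX_PASSES       8   /* guard added for the sketch only */

typedef struct {
    const char *handler;       /* NULL lets the core choose by content type */
    const char *content_type;
    int passes;                /* handler invocations so far, for illustration */
} mock_request;

typedef struct {
    const char *name;          /* matched against handler or content_type */
    int (*fn)(mock_request *r);
} mock_handler_entry;

/* Well-behaved module: changes both fields before asking for a rerun. */
static int producer_handler(mock_request *r)
{
    r->handler = NULL;
    r->content_type = "text/html";
    return MOCK_RERUN_HANDLERS;
}

/* Broken module: asks for a rerun without touching the request. */
static int stuck_handler(mock_request *r)
{
    (void)r;
    return MOCK_RERUN_HANDLERS;
}

/* Final module in the chain: finishes the request. */
static int html_handler(mock_request *r)
{
    (void)r;
    return MOCK_OK;
}

static const mock_handler_entry mock_handlers[] = {
    { "producer",  producer_handler },
    { "stuck",     stuck_handler },
    { "text/html", html_handler },
};

/* Keep searching the handler list while modules return the rerun value. */
int mock_invoke_handlers(mock_request *r)
{
    size_t n = sizeof(mock_handlers) / sizeof(mock_handlers[0]);
    int result;

    assert(r != NULL);
    do {
        const char *key = r->handler ? r->handler : r->content_type;
        size_t i;

        result = MOCK_DECLINED;
        for (i = 0; i < n; i++) {
            if (key != NULL && strcmp(mock_handlers[i].name, key) == 0) {
                r->passes++;
                result = mock_handlers[i].fn(r);
                break;
            }
        }
        /* Without this guard an unchanged request would spin forever. */
        if (result == MOCK_RERUN_HANDLERS && r->passes >= MOCK_MAX_PASSES)
            return MOCK_LOOP_DETECTED;
    } while (result == MOCK_RERUN_HANDLERS);

    return result;
}
```

Calling mock_invoke_handlers on a request whose handler is "producer" finishes after two handler invocations. A request whose handler is "stuck" re-selects the same module on every pass, and only the sketch's pass limit stops it.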
Most modules should not write directly to the network if they want to take advantage of layered I/O. Two BUFF structures have been added to the request_rec, one for input and one for output, and the module should read from and write to these BUFFs. The module will also have to set up the input field for the next module in the list. A new function, ap_setup_input, has been added; all modules should call it before they do any reading to get data to modify. This function checks whether the previous module set the input field. If so, that input is used; if not, the file is opened and used as the data source. The output field is used in basically the same way. The module must set this field before it calls ap_r* in order to take advantage of layered I/O. If this field is not set, ap_r* will write directly to the client. Usually, at the end of a handler, the input (for the next module) will be the read side of a pipe, and the output will be the write side of the same pipe.
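The pipe arrangement described above can be sketched with plain POSIX file descriptors standing in for the BUFF structures. Everything here is an assumption made for illustration: mock_io_request, mock_setup_input, mock_rwrite and the two handlers are hypothetical names, not the Apache functions, and the upstream handler writes one small payload in the same process rather than running a separate producer.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical request: file descriptors stand in for the BUFF fields. */
typedef struct {
    const char *filename;  /* fallback data source when no input is set */
    int input_fd;          /* -1 means the previous module set no input */
    int output_fd;         /* -1 means writes go directly to the client */
    int client_fd;         /* stand-in for the network connection */
} mock_io_request;

/* Use the previous module's input if present, else open the file. */
int mock_setup_input(mock_io_request *r)
{
    if (r->input_fd < 0 && r->filename != NULL)
        r->input_fd = open(r->filename, O_RDONLY);
    return r->input_fd;
}

/* Write to the output field if it is set, otherwise to the client. */
ssize_t mock_rwrite(mock_io_request *r, const void *buf, size_t len)
{
    int fd = (r->output_fd >= 0) ? r->output_fd : r->client_fd;
    return write(fd, buf, len);
}

/* Upstream module: output is the write side of a pipe, and the read
 * side is left behind as the input for the next module. */
int mock_upstream_handler(mock_io_request *r)
{
    static const char payload[] = "hello from upstream\n";
    int fds[2];
    ssize_t written;

    if (pipe(fds) != 0)
        return -1;

    r->output_fd = fds[1];
    written = mock_rwrite(r, payload, strlen(payload));
    close(fds[1]);            /* closing lets the reader see end of file */
    r->output_fd = -1;

    if (written != (ssize_t)strlen(payload)) {
        close(fds[0]);
        return -1;
    }
    r->input_fd = fds[0];
    return 0;
}

/* Downstream module: reads the prepared input and, because no output
 * field is set, its writes reach the client. Returns bytes copied. */
ssize_t mock_downstream_handler(mock_io_request *r)
{
    char buf[256];
    ssize_t n, total = 0;
    int in = mock_setup_input(r);

    if (in < 0)
        return -1;
    while ((n = read(in, buf, sizeof(buf))) > 0) {
        if (mock_rwrite(r, buf, (size_t)n) != n)
            return -1;
        total += n;
    }
    close(in);
    r->input_fd = -1;
    return (n < 0) ? -1 : total;
}
```

Writing before anyone reads works here only because the payload fits in the pipe buffer. A producer emitting more data would need to run concurrently with the reader.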
This is the most basic layered I/O example possible: CGI output generated by mod_cgi and sent to the network via http_core.
mod_cgi executes the CGI script and then sets request_rec->input to the output pipe of the CGI. It then NULLs out request_rec->handler and sets request_rec->content_type to whatever content type the CGI writes out (in this case, text/html). Finally, mod_cgi returns RERUN_HANDLERS.
ap_invoke_handlers() then loops back to the top of the handler list and searches for a handler that can deal with this content_type. In this case the correct module is the default_handler from http_core.
When default_handler starts, it calls ap_setup_input, which finds a valid request_rec->input, so that is used for all input. The output field in the request_rec is NULL, so when default_handler calls an output primitive, the data is sent out over the network.
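The walk-through above can be traced end to end in a compact in-memory model. This is a sketch under stated assumptions, not the server's code: demo_request, demo_cgi_handler, demo_default_handler and demo_invoke_handlers are hypothetical names, strings stand in for the BUFFs and the CGI pipe, and a fixed character array plays the part of the network.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define DEMO_OK               0
#define DEMO_DECLINED        -1
#define DEMO_RERUN_HANDLERS   1

/* Hypothetical request: strings stand in for the BUFFs and the network. */
typedef struct {
    const char *handler;
    const char *content_type;
    const char *input;     /* NULL means no previous module set the input */
    char *output;          /* NULL means output goes to the network */
    size_t output_cap;     /* capacity of output when it is set */
    char network[128];     /* bytes "sent to the client" */
} demo_request;

/* Plays the role of mod_cgi in the walk-through. */
static int demo_cgi_handler(demo_request *r)
{
    if (r->handler == NULL || strcmp(r->handler, "cgi-script") != 0)
        return DEMO_DECLINED;
    r->input = "<html>hello</html>";   /* pretend: the CGI's output pipe */
    r->handler = NULL;
    r->content_type = "text/html";     /* pretend: what the CGI wrote out */
    return DEMO_RERUN_HANDLERS;
}

/* Plays the role of default_handler from http_core. */
static int demo_default_handler(demo_request *r)
{
    char *dest;
    size_t cap;

    if (r->handler != NULL)
        return DEMO_DECLINED;
    if (r->input == NULL)              /* the input-setup step's fallback */
        r->input = "(file contents)";
    dest = r->output ? r->output : r->network;
    cap = r->output ? r->output_cap : sizeof(r->network);
    snprintf(dest, cap, "%s", r->input);
    return DEMO_OK;
}

/* Loop back to the top of the handler list on each rerun request. */
int demo_invoke_handlers(demo_request *r)
{
    static int (*const handlers[])(demo_request *) = {
        demo_cgi_handler, demo_default_handler
    };
    int result, passes = 0;

    do {
        size_t i;
        result = DEMO_DECLINED;
        for (i = 0; i < 2 && result == DEMO_DECLINED; i++)
            result = handlers[i](r);
    } while (result == DEMO_RERUN_HANDLERS && ++passes < 4);

    return result;
}
```

Starting from a request whose handler is "cgi-script", the first pass ends in the CGI stand-in returning the rerun value. The second pass falls through to the default handler, which copies the prepared input into the network buffer because no output field is set.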
Ryan Bloom, 25th March 2000