CiviCRM Replay-on-Write: About¶
This is a small utility which allows CiviCRM to work with an opportunistic combination of a read-write master database (RWDB) and read-only slave databases (RODB). The general idea is to connect to RODB optimistically (expecting a typical read-only use-case) -- and then switch to RWDB if there is an actual write.
Opportunistically switching to RWDB is not quite as simple as it sounds because the original read-only session may have some significant local state (such as session-variables or temporary-tables) that feed into the write operations. To mitigate this issue, we use a buffer to track statements which affect session-state. Such statements are replayed on RWDB.
This is designed for use-cases in which:
- Most users/page-views can be served entirely by RODB.
- Some users/page-views need to work with RWDB.
- There is no simpler or more correct way to predict which users/page-views will need read-write operations. (Or: you need a fallback in case the predictions are imperfect.)
Every SQL statement is (potentially) classified into one of three buckets:
SELECT * FROM foo): The SQL statement has no side-effects; it simply reads data.
SET @user_id = 123): The SQL statement has no long-term, persistent side-effects; it can, however, have temporary side-effects during the present MySQL session.
TRUNCATE foo): The SQL statement has long-term, persistent side-effects and must be executed on RWDB. (Generally, if we can't demonstrate that something is
BUFFER, then we assume it is
For more detailed examples of each category, browse tests/examples.
Each CiviCRM page-view uses a MySQL connection. Over the course of the page-view, the connection may progress through as many as three stages:
- (Read-only) In the first stage, we connect to RODB. We stay connected
as long as the SQL queries are read-oriented (
TYPE_READ). Statements with localized side-effects (
TYPE_BUFFER) are copied to the buffer.
- (Replay/Transition) In the second stage, we encounter the first write statement
TYPE_WRITE). We switch to RWDB, where we replay the buffer along with the write statement.
- (Read-write) In the third/final stage, all statements of any type (
TYPE_WRITE) are executed on RWDB.
civirpow provides some consistency, but it also has a limitation.
Within a given MySQL session, you can mix various read+write operations -- for example, insert a record, then read the record, then update it, and then read it again. Once you start writing, all requests are handled by RWDB -- which should provide a fair degree of consistency.
However, there is one notable source of inconsistency: at the beginning of the connection (before the first write), you'll read data from RODB (instead of RWDB) -- so it may start with a dirty-read. A few mitigating considerations:
If your environment regularly has a perceptible propagation delay between the RWDB+RODB (e.g. 30sec), then users may be sensitive to dirty-reads within the propagation period (e.g. 30sec). Use an HTTP cookie or HTTP session-variable to force them onto RWDB (i.e.
forceWriteMode()) for the subsequent 30-60 sec. (TODO: Example code)
Hopefully, some dirty-reads are acceptable (or acceptably infrequent). If dirty-reads are a total show-stopper, then replay-on-write may be the wrong approach for your use-case.
If you know that a use-case will be writing and must have fresh reads, you can give a hint to force it into write mode; either
- (If you don't have access to
$stateMachine) Issue a request for the dummy query