stay 3.7.0 in the future ,WAL(Write-Ahead Log) Mode can be used , Another way to achieve transaction atomicity .
- WAL The advantages of
- In general, it is necessary to VFS Support shared memory mode .(shared-memory primitives)
- The process of operating the database file must be on the same host , It can't be used in a network operating system .
- A database connection that holds multiple database files is atomic for a single database , It's not atomic for all databases .
- Get into WAL The mode cannot be changed later page Of size.
- Can't open read-only WAL database (Read-Only Databases), The process has to have "-shm" Write permission of file .
- For read only operations , A database with few write operations , It's so slow 1 To 2 percentage .
- There will be extra "-wal" and "-shm" file
- Developers need to pay attention to checkpointing
- Faster in most cases
- More parallelism . Because read and write operations can be done in parallel .
- file IO More orderly , Serialization （more sequential）
- Use fsync() Fewer times , stay fsync() It's more uncertain when it's called .
The way to roll back the log is to write the contents of the changed database file to the log , Then write the changed content directly to the database file . In system crash Or power down , The contents of the log are rewritten to the database file . The log file was deleted , sign commit Once again commit The end of .
WAL The pattern is the opposite . The original database content is changed in the database file , Modifications to the database file are appended to a separate WAL In file . When a record is appended to WAL After the document , It marks the first time commit The end of . So once commit There is no need to operate on the database file , When a write operation is in progress , Can read at the same time . The contents of multiple transactions can be appended to one transaction WAL End of file .
Last WAL The contents of the file must be updated to the database file . hold WAL The process of updating the contents of a file to a database file is called once checkpoint.
There are two ways to roll back logs ： Read and write .WAL There are three operations , read 、 Write and checkpoint.
default ,SQL Will be in WAL The document reaches 1000page Once in a while checkpoint. Conduct WAL The timing of the application can also be determined by the application itself .
When a read operation occurs in WAL Schema , You'll find it first WAL Last submission in document , be called "end mark". Each transaction can have its own way "end point", But for a given transaction ,end mark Is constant .
When reading the page when ,SQLite We'll start with WAL Find out if there is a corresponding page, Find out the distance from end mark The latest record ; If you can't find it , Then look for the correct data from the database file page. To avoid scanning every transaction WAL file ,SQLite One is maintained in shared memory "wal-index" Data structure of , Help locate quickly page.
Writing a database is just adding new content to it WAL End of file , It has nothing to do with read operations . Since there is only one WAL file , Therefore, there can only be one write operation at the same time .
checkpoint Operations can be performed in parallel with read operations . But if checkpoint Put one page Write to database file , And this page Exceeded the limit of the current read operation end mark when ,checkpoint Must stop . Otherwise, the current reading will be covered . The next time checkpoint when , From this page Start copying data to the database .
When writing operations , Will check the WAL The progress of the file being copied to the database . If it has been completely copied to the database file , Already synchronized , And there are no read operations in use WAL file , So the WAL File empty , Add data from the beginning . Guarantee WAL No unlimited file growth .
Write operations are fast , Because only one write operation is required , And it's sequential （ It's not random , Write to the end every time ）. and , It is not necessary to brush data to disk .（ If PRAGMA synchronous yes FULL, Every time commit I have to brush it once , Otherwise do not brush .）
The performance of read operations is degraded , Because of the need from WAL Find content in file , Time and effort WAL File size .wal-index It can shorten the time , But it can not be completely avoided . So we need to guarantee WAL The size of the file won't be too large .
In order to protect the database from being damaged , You need to put WAL Before writing the file to the database WAL File is swiped to disk ; On reset WAL File before the database content to brush into the database file . Besides checkpoint Need to find operation . These factors make checkpoint Slower than writing .
The default policy is that many threads can grow WAL file . hold WAL File size becomes smaller than 1000page The big thread is responsible for the process checkpoint. Most of the read and write operations are very fast , A random write operation is very slow . You can also disable auto checkpoint The strategy of , Periodically in a thread or process checkpoint operation .
Efficient write operation WAL The larger the file, the better ; Efficient read operation WAL The smaller the file, the better . There is a difference between the two tradeoff.
Activation and configuration WAL Pattern
PRAGMA journal_mode=WAL;, If it works , Returns the "wal".
Can manually checkpoint
sqlite3_wal_checkpoint(sqlite3 *db, const char *zDb)
To configure checkpoint
sqlite3_wal_autocheckpoint(sqlite3 *db, int N);
It can be invoked in any database connection that can write operations. sqlite3_wal_checkpoint_v2() or sqlite3_wal_checkpoint().
WAL Persistence of patterns
When a process is set WAL Pattern , Shut down the process , Reopen the database , Still WAL Pattern .
If the WAL Pattern , Then all connections to this database will be set to WAL Pattern .
Read only database
If the database needs to be recovered , And you only have read access , No write permission , Then you can't read the database , Because the first step to read is to recover the database .
Allied , because WAL When the database is read in mode , Operation similar to database recovery is required , So if you only have read access , You can't open the database on the server .
WAL The implementation of needs to have a WAL The hash table of the file is in shared memory . stay Unix and Windows Of VFS In the implementation , Is based on MMap Of . Mapping shared memory to "-shm" In file . So even if it's right WAL Read the database file under the mode , Write permission is also required .
To convert a database file to a read-only file , You need to change the log mode of this database to "delete".
Avoid too much WAL file
WAL-index Implementation of shared memory based on FPGA
stay WAL Before release , I've tried to wal-index Map to temporary directory , Such as /dev/shm or /tmp. But different users see different directories , So the road is blocked .
Later, I tried to wal-index Mapped to anonymous virtual memory blocks , But it can't be used when you don't use it Unix Consistent in the version .
The final decision to adopt will be wal-index Map to the same directory . This will result in unnecessary disks IO. But it's not a big problem , Because wal-index Rarely more than 32k, And it's never called sync operation . Besides , After the last database connection is closed , This file will be deleted .
If this database is used by only one process , Then you can use heap memory Instead of sharing memory .
Implementation without shared memory WAL
stay 3.7.4 After the version , as long as SQLite Of lock mode Is set to EXCLUSIVE, So even if shared memory does not support , You can also use WAL Pattern .
let me put it another way , If only one process uses SQLite, Then it can be used without shared memory WAL.
here , take lock mode Change it to normal It's invalid , Cancellation required WAL Pattern .