crash course into database architecture
slides: wcbtn.j7y.co/
get me all the orders for user 4
get me all subscriptions where the next payment date is between X and Y
get me all orders where the total is $19.92
get me the sum of all order totals where the order date is in May and order by order total. or date
basically doing anything on the admin area
All of those are going to be
...really...
...slow.
What are they?
Why are they important?
How do they work? 🌈🌈
By author | By book | By author-book |
---|---|---|
George Orwell - pos 832:021 | George Orwell: Animal Farm - pos 832:021 | Animal Farm - pos 832:021 |
Isaac Asimov - pos 432:333 | Isaac Asimov: Foundation - pos 531:214 | Foundation - pos 432:333 |
Isaac Asimov - pos 531:214 | Isaac Asimov: The Naked Sun - pos 432:333 | The Handmaid's Tale - pos 111:209 |
Margaret Atwood - pos 111:209 | Margaret Atwood: The Handmaid's Tale - pos 111:209 | The Naked Sun - pos 531:214 |
👌 designed indexes ➡
fast, efficient data access ➡
💯 site ➡
😍 customers ➡
💰💵
UNIQUE: one value can only appear once in a column. Super fast
Example: post IDs
MULTI: same value can repeat. Still fast, though additional filtering required.
Example: postmeta_key
tldr; if data in column shouldn't ever be NULL, declare it so. Data lookup will be faster.
CREATE TABLE `wp_posts` (
`ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
...
)
protip: primary keys will always be NOT NULL!
protip2: unique keys permit multiple NULL values!
-- given these columns
-- ID, post_name, post_date
CREATE INDEX (`ID`, `post_name`, `post_date`)
-- will result in a compound index across ID, post_name, post_date, in that order
“If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to look up rows.”
SELECT * FROM table WHERE `ID` = x, `post_name` = y -- uses index (1,2)
SELECT * FROM table WHERE `ID` = x -- uses index (1)
SELECT * FROM table WHERE `ID` = x, `post_date` = z -- NOT using index (1,3)
SELECT * FROM table WHERE `post_name` = y, `post_date` = z -- NOT using index (2,3)
SELECT * FROM table WHERE `post_name` = y -- NOT using index (2)
Examples of signed / unsigned. "Can it have a negative sign?". Highest binary place is the sign
// signed -127 – +127
011111111 = +127
111111111 = -128
// unsigned 0 – 255
011111111 = 127
111111111 = 255
MariaDB [dev.dev]> show columns from wp_posts;
+-----------------------+---------------------+------+-----+---------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------------+---------------------+------+-----+---------------------+----------------+
| ID | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| post_author | bigint(20) unsigned | NO | MUL | 0 | |
| post_date | datetime | NO | | 0000-00-00 00:00:00 | |
| post_date_gmt | datetime | NO | | 0000-00-00 00:00:00 | |
| post_content | longtext | NO | | NULL | |
| post_title | text | NO | | NULL | |
| post_excerpt | text | NO | | NULL | |
| post_status | varchar(20) | NO | | publish | |
| comment_status | varchar(20) | NO | | open | |
| ping_status | varchar(20) | NO | | open | |
| post_password | varchar(255) | NO | | | |
| post_name | varchar(200) | NO | MUL | | |
| to_ping | text | NO | | NULL | |
| pinged | text | NO | | NULL | |
| post_modified | datetime | NO | | 0000-00-00 00:00:00 | |
| post_modified_gmt | datetime | NO | | 0000-00-00 00:00:00 | |
| post_content_filtered | longtext | NO | | NULL | |
| post_parent | bigint(20) unsigned | NO | MUL | 0 | |
| guid | varchar(255) | NO | | | |
| menu_order | int(11) | NO | | 0 | |
| post_type | varchar(20) | NO | MUL | post | |
| post_mime_type | varchar(100) | NO | | | |
| comment_count | bigint(20) | NO | | 0 | |
+-----------------------+---------------------+------+-----+---------------------+----------------+
23 rows in set (0.00 sec)
191? - Limit is 767 bytes. utf8mb4 charset => 767 / 4 = 191.75 => 191. Only 191 characters fit into the key prefix.
(post_type, post_status, post_date, ID)
(post_type, post_status, post_date, ID)
(post_type, post_status, post_date, ID)
(post_type, post_status, post_date, ID)
(post_type, post_status, post_date, ID)
(post_type, post_status, post_date, ID)
ID (PRIMARY)
(post_type, post_status, post_date, ID)
ID (PRIMARY)
By the way the previous is why WP Query forces you to declare a post_type, otherwise defaults to post.
// in class-wp-query.php line 2281 in version 4.8.1
// method get_posts()
} else {
$where .= " AND {$wpdb->posts}.post_type = 'post'";
$post_type_object = get_post_type_object ( 'post' );
}
If there was no post_type declaration, lots of queries would skip using the index.
Think carefully what to use for order by. The full list is on MySQL docs on ORDER BY optimization.
These will use index:
-- same compound key, consecutive, no WHERE clause
SELECT * FROM t1
ORDER BY key_part1, key_part2;
-- key_part1 in WHERE with constant (not column), key_part2 in order by
SELECT * FROM t1
WHERE key_part1 = constant
ORDER BY key_part2;
-- key_part1 is compared to constant, same key used in order by
SELECT * FROM t1
WHERE key_part1 > constant
ORDER BY key_part1 ASC;
-- key parts compared to constants, order by uses same key part
SELECT * FROM t1
WHERE key_part1 = constant1 AND key_part2 > constant2
ORDER BY key_part2;
These will not use index though:
-- Query uses two different indexes
SELECT * FROM t1 ORDER BY key1, key2;
-- non-consecutive compound index parts in order by
SELECT * FROM t1 WHERE key2=constant ORDER BY key_part1, key_part3;
-- mixes ASC and DESC
SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 ASC;
-- key in WHERE and key in ORDER BY are different
SELECT * FROM t1 WHERE key2=constant ORDER BY key1;
-- order by is an expression other than the index (needs calculation)
SELECT * FROM t1 ORDER BY ABS(key);
SELECT * FROM t1 ORDER BY -key;
Temporary table / uses indexes in VERY SPECIFIC cases. Douglas Adam's words apply to Group By too:
The Encyclopedia Galactica, in its chapter on Love states that it is far too complicated to define. The Hitchhiker's Guide to the Galaxy has this to say on the subject of love: Avoid, if at all possible. Unfortunately, Arthur Dent has never read the Hitchhiker's Guide to the Galaxy.
Data that requires multiple tables to fetch. Eg:
Hypothetical WooCommerce Orders new table structure
delete things from wp_posts
then need to clean up in wp_postmeta
... and terms
... and comments
... and taxonomy relationships...
The solution is (would be)
Foreign keys!!
constraints on the database to keep data consistent
CREATE TABLE parent (
id INT NOT NULL,
PRIMARY KEY (id)
) ENGINE=INNODB;
CREATE TABLE child (
id INT,
parent_id INT,
INDEX par_ind (parent_id),
FOREIGN KEY (parent_id)
REFERENCES parent(id)
ON DELETE CASCADE
) ENGINE=INNODB;
You can't insert data that doesn't belong to something that already exists in the main table.
Can't drop table that has other tables depending on it.
ON DELETE CASCADE
if I delete the main record in parent, everything else that had its foreign key constrained to THAT id will automatically be deleted with it. Automatic cleanup! No more hunting for orphan data!
Of course it's not implemented in WordPress, mostly due to backwards compat reasons. (Trac 19207)
btw slides on wcbtn.j7y.co