CREATE TABLE
Synopsis
Use the CREATE TABLE
statement to create a table in a database. It defines the table name, column names and types, primary key, and table properties.
Syntax
create_table ::= CREATE [ TEMPORARY | TEMP ] TABLE [ IF NOT EXISTS ]
table_name ( [ table_elem [ , ... ] ] )
[ WITH ( { COLOCATION = { 'true' | 'false' }
| storage_parameters } )
| WITHOUT OIDS ] [ TABLESPACE tablespace_name ]
[ SPLIT { INTO positive_int_literal TABLETS
| AT VALUES ( split_row [ , ... ] ) } ]
table_elem ::= column_name data_type [ column_constraint [ ... ] ]
| table_constraint
column_constraint ::= [ CONSTRAINT constraint_name ]
{ NOT NULL
| NULL
| CHECK ( expression )
| DEFAULT expression
| GENERATED { ALWAYS | BY DEFAULT } AS
IDENTITY [ sequence_options ]
| UNIQUE index_parameters
| PRIMARY KEY
| references_clause }
[ DEFERRABLE | NOT DEFERRABLE ]
[ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]
table_constraint ::= [ CONSTRAINT constraint_name ]
{ CHECK ( expression )
| UNIQUE ( column_names ) index_parameters
| PRIMARY KEY ( key_columns )
| FOREIGN KEY ( column_names )
references_clause }
[ DEFERRABLE | NOT DEFERRABLE ]
[ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]
key_columns ::= hash_columns [ , range_columns ] | range_columns
hash_columns ::= column_name [ HASH ] | ( column_name [ , ... ] ) HASH
range_columns ::= { column_name { ASC | DESC } } [ , ... ]
storage_parameters ::= storage_parameter [ , ... ]
storage_parameter ::= param_name [ = param_value ]
index_parameters ::= [ INCLUDE ( column_names ) ]
[ WITH ( storage_parameters ) ]
[ USING INDEX TABLESPACE tablespace_name ]
references_clause ::= REFERENCES table_name [ column_name [ , ... ] ]
[ MATCH FULL | MATCH PARTIAL | MATCH SIMPLE ]
[ ON DELETE key_action ]
[ ON UPDATE key_action ]
split_row ::= ( column_value [ , ... ] )
sequence_options ::= [ INCREMENT [ BY ] int_literal ]
[ MINVALUE int_literal | NO MINVALUE ]
[ MAXVALUE int_literal | NO MAXVALUE ]
[ START [ WITH ] int_literal ]
[ CACHE positive_int_literal ] [ [ NO ] CYCLE ]
create_table
table_elem
column_constraint
table_constraint
key_columns
hash_columns
range_columns
storage_parameters
storage_parameter
index_parameters
references_clause
split_row
sequence_options
Semantics
Create a table with table_name. If qualified_name
already exists in the specified database, an error will be raised unless the IF NOT EXISTS
clause is used.
Primary key
Primary key can be defined in either column_constraint
or table_constraint
, but not in both.
There are two types of primary key columns:
-
Hash primary key columns
: The primary key may have zero or more leading hash-partitioned columns. By default, only the first column is treated as the hash-partition column. But this behavior can be modified by explicit use of the HASH annotation. -
Range primary key columns
: A table can have zero or more range primary key columns and it controls the top-level ordering of rows in a table (if there are no hash partition columns) or the ordering of rows among rows that share a common set of hash partitioned column values. By default, the range primary key columns are stored in ascending order. But this behavior can be controlled by explicit use ofASC
orDESC
.
For example, if the primary key specification is PRIMARY KEY ((a, b) HASH, c DESC)
, then columns a
& b
are used together to hash partition the table, and rows that share the same values for a
and b
are stored in descending order of their value for c
.
If the primary key specification is PRIMARY KEY(a, b)
, then column a
is used to hash partition the table, and rows that share the same value for a
are stored in ascending order of their value for b
.
Tables always have a primary key
PostgreSQL's table storage is heap-oriented—so a table with no primary key is viable. However YugabyteDB's table storage is index-oriented (see DocDB Persistence), so a table isn't viable without a primary key.
Therefore, if you don't specify a primary key at table-creation time, YugabyteDB will use the internal ybrowid
column as PRIMARY KEY
and the table will be sharded on ybrowid HASH
.
Foreign key
FOREIGN KEY
and REFERENCES
specifies that the set of columns can only contain values that are present in the referenced column(s) of the referenced table. It is used to enforce referential integrity of data.
Unique
This enforces that the set of columns specified in the UNIQUE
constraint are unique in the table, that is, no two rows can have the same values for the set of columns specified in the UNIQUE
constraint.
Check
This is used to enforce that data in the specified table meets the requirements specified in the CHECK
clause.
Default
This clause is used to specify a default value for the column. If an INSERT
statement does not specify a value for the column, then the default value is used. If no default is specified for a column, then the default is NULL.
An identity column will automatically receive a new value produced by its linked sequence.
Deferrable constraints
Constraints can be deferred using the DEFERRABLE
clause. Currently, only foreign key constraints
can be deferred in YugabyteDB. A constraint that is not deferrable will be checked after every row
in a statement. In the case of deferrable constraints, the checking of the constraint can be postponed
until the end of the transaction.
Constraints marked as INITIALLY IMMEDIATE
will be checked after every row in a statement.
Constraints marked as INITIALLY DEFERRED
will be checked at the end of the transaction.
IDENTITY columns
Create the column as an identity column.
An implicit sequence will be created, attached to it, and new rows will automatically have values assigned from the sequence. IDENTITY columns are implicitly NOT NULL
.
ALWAYS
and BY DEFAULT
will determine how user-provided values are handled in INSERT
and UPDATE
statements.
On an INSERT
statement:
- when
ALWAYS
is used, a user-provided value is only accepted if theINSERT
statement usesOVERRIDING SYSTEM VALUE
. - when
BY DEFAULT
is used, then the user-provided value takes precedence. See INSERT statement for reference. (In theCOPY
statement, user-supplied values are always used regardless of this setting.)
On an UPDATE
statement:
- when
ALWAYS
is used, a column update to a value other thanDEFAULT
will be rejected. - when
BY DEFAULT
is used, the column can be updated normally. (OVERRIDING
clause cannot be used for the UPDATE statement)
The sequence_options
optional clause can be used to override the options of the generated sequence.
See CREATE SEQUENCE for reference.
Multiple Identity Columns
PostgreSQL and YugabyteDB allow a table to have more than one identity column. The SQL standard specifies that a table can have at most one identity column.
This relaxation primarily aims to provide increased flexibility for carrying out schema modifications or migrations.
Note that the INSERT command can only accommodate one override clause for an entire statement. As a result, having several identity columns, each exhibiting distinct behaviours, is not effectively supported.
TEMPORARY or TEMP
Using this qualifier will create a temporary table. Temporary tables are visible only in the current client session or transaction in which they are created and are automatically dropped at the end of the session or transaction. Any indexes created on temporary tables are temporary as well. See the section Creating and using temporary schema-objects.
TABLESPACE
Specify the name of the tablespace that describes the placement configuration for this table. By default, tables are placed in the pg_default
tablespace, which spreads the tablets of the table evenly across the cluster.
SPLIT INTO
For hash-sharded tables, you can use the SPLIT INTO
clause to specify the number of tablets to be created for the table. The hash range is then evenly split across those tablets.
Presplitting tablets, using SPLIT INTO
, distributes write and read workloads on a production cluster. For example, if you have 3 servers, splitting the table into 30 tablets can provide write throughput on the table. For an example, see Create a table specifying the number of tablets.
Note
By default, YugabyteDB presplits a table inysql_num_shards_per_tserver * num_of_tserver
shards. The SPLIT INTO
clause can be used to override that setting on a per-table basis.
SPLIT AT VALUES
For range-sharded tables, you can use the SPLIT AT VALUES
clause to set split points to presplit range-sharded tables.
Example
CREATE TABLE tbl(
a int,
b int,
primary key(a asc, b desc)
) SPLIT AT VALUES((100), (200), (200, 5));
In the example above, there are three split points and so four tablets will be created:
- tablet 1:
a=<lowest>, b=<lowest>
toa=100, b=<lowest>
- tablet 2:
a=100, b=<lowest>
toa=200, b=<lowest>
- tablet 3:
a=200, b=<lowest>
toa=200, b=5
- tablet 4:
a=200, b=5
toa=<highest>, b=<highest>
COLOCATION
To create a colocated table, use the following command:
CREATE TABLE <name> (columns) WITH (COLOCATION = true);
In a colocated database, all tables are colocated by default. To opt a specific table out of colocation, use the following command:
CREATE TABLE <name> (columns) WITH (COLOCATION = false);
This ensures that the table is not stored on the same tablet as the rest of the tables for this database, but instead has its own set of tablets. Use this option for large tables that need to be scaled out.
COLOCATION = true
has no effect if the database that the table is part of is not colocated, as currently colocation is supported only at the database level. See Colocated tables for more details.
Storage parameters
Storage parameters, as defined by PostgreSQL, are ignored and only present for compatibility with PostgreSQL.
Examples
Table with primary key
yugabyte=# CREATE TABLE sample(k1 int,
k2 int,
v1 int,
v2 text,
PRIMARY KEY (k1, k2));
In this example, the first column k1
will be HASH
, while second column k2
will be ASC
.
yugabyte=# \d sample
Table "public.sample"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
k1 | integer | | not null |
k2 | integer | | not null |
v1 | integer | | |
v2 | text | | |
Indexes:
"sample_pkey" PRIMARY KEY, lsm (k1 HASH, k2)
Table with range primary key
yugabyte=# CREATE TABLE range(k1 int,
k2 int,
v1 int,
v2 text,
PRIMARY KEY (k1 ASC, k2 DESC));
Table with check constraint
yugabyte=# CREATE TABLE student_grade(student_id int,
class_id int,
term_id int,
grade int CHECK (grade >= 0 AND grade <= 10),
PRIMARY KEY (student_id, class_id, term_id));
Table with default value
yugabyte=# CREATE TABLE cars(id int PRIMARY KEY,
brand text CHECK (brand in ('X', 'Y', 'Z')),
model text NOT NULL,
color text NOT NULL DEFAULT 'WHITE' CHECK (color in ('RED', 'WHITE', 'BLUE')));
Table with foreign key constraint
Define two tables with a foreign keys constraint.
yugabyte=# CREATE TABLE products(id int PRIMARY KEY,
descr text);
yugabyte=# CREATE TABLE orders(id int PRIMARY KEY,
pid int REFERENCES products(id) ON DELETE CASCADE,
amount int);
Insert some rows.
yugabyte=# SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE;
yugabyte=# INSERT INTO products VALUES (1, 'Phone X'), (2, 'Tablet Z');
yugabyte=# INSERT INTO orders VALUES (1, 1, 3), (2, 1, 3), (3, 2, 2);
yugabyte=# SELECT o.id AS order_id, p.id as product_id, p.descr, o.amount FROM products p, orders o WHERE o.pid = p.id;
order_id | product_id | descr | amount
----------+------------+----------+--------
1 | 1 | Phone X | 3
2 | 1 | Phone X | 3
3 | 2 | Tablet Z | 2
(3 rows)
Inserting a row referencing a non-existent product is not allowed.
yugabyte=# INSERT INTO orders VALUES (1, 3, 3);
ERROR: insert or update on table "orders" violates foreign key constraint "orders_pid_fkey"
DETAIL: Key (pid)=(3) is not present in table "products".
Deleting a product will cascade to all orders (as defined in the CREATE TABLE
statement above).
yugabyte=# DELETE from products where id = 1;
yugabyte=# SELECT o.id AS order_id, p.id as product_id, p.descr, o.amount FROM products p, orders o WHERE o.pid = p.id;
order_id | product_id | descr | amount
----------+------------+----------+--------
3 | 2 | Tablet Z | 2
(1 row)
Table with unique constraint
yugabyte=# CREATE TABLE translations(message_id int UNIQUE,
message_txt text);
Create a table specifying the number of tablets
To specify the number of tablets for a table, you can use the CREATE TABLE
statement with the SPLIT INTO
clause.
yugabyte=# CREATE TABLE tracking (id int PRIMARY KEY) SPLIT INTO 10 TABLETS;
Opt a table out of colocation
yugabyte=# CREATE DATABASE company WITH COLOCATION = true;
yugabyte=# CREATE TABLE employee(id INT PRIMARY KEY, name TEXT) WITH (COLOCATION = false);
In this example, database company
is colocated and all tables other than the employee
table are stored on a single tablet.