Pypika

Pypika is a Python API for building SQL queries. The motivation behind PyPika is to provide a simple interface for building SQL queries without limiting the flexibility of handwritten SQL.

PyPika is a fast, expressive and flexible way to replace handwritten SQL. Validation of SQL correctness is not an explicit goal of the project. Instead you are encouraged to check inputs you provide to PyPika or appropriately handle errors raised from your SQL database.

After the queries have been built you need to interact with the database with other libraries.

Installation ⚑

pip install pypika

Usage⚑

The main classes in pypika are pypika.Query, pypika.Table, and pypika.Field.

from pypika import Query, Table, Field

Creating Tables ⚑

The entry point for creating tables is pypika.Query.create_table, which is used with the class pypika.Column. As with selecting data, first the table should be specified. This can be either a string or a pypika.Table. Then the columns, and constraints..

stmt = Query \
    .create_table("person") \
    .columns(
        Column("id", "INT", nullable=False),
        Column("first_name", "VARCHAR(100)", nullable=False),
        Column("last_name", "VARCHAR(100)", nullable=False),
        Column("phone_number", "VARCHAR(20)", nullable=True),
        Column("status", "VARCHAR(20)", nullable=False, default=ValueWrapper("NEW")),
        Column("date_of_birth", "DATETIME")) \
    .unique("last_name", "first_name") \
    .primary_key("id")

This produces:

CREATE TABLE "person" (
    "id" INT NOT NULL,
    "first_name" VARCHAR(100) NOT NULL,
    "last_name" VARCHAR(100) NOT NULL,
    "phone_number" VARCHAR(20) NULL,
    "status" VARCHAR(20) NOT NULL DEFAULT 'NEW',
    "date_of_birth" DATETIME,
    UNIQUE ("last_name","first_name"),
    PRIMARY KEY ("id")
)

It seems that they don't yet support the definition of FOREIGN KEYS when creating a new table.

Inserting data ⚑

Data can be inserted into tables either by providing the values in the query or by selecting them through another query.

By default, data can be inserted by providing values for all columns in the order that they are defined in the table.

Insert with values⚑

customers = Table('customers')

q = Query.into(customers).insert(1, 'Jane', 'Doe', 'jane@example.com')

INSERT INTO customers VALUES (1,'Jane','Doe','jane@example.com')

customers =  Table('customers')

q = customers.insert(1, 'Jane', 'Doe', 'jane@example.com')

INSERT INTO customers VALUES (1,'Jane','Doe','jane@example.com')

Multiple rows of data can be inserted either by chaining the insert function or passing multiple tuples as args.

customers = Table('customers')

q = (
    Query.into(customers)
    .insert(1, "Jane", "Doe", "jane@example.com")
    .insert(2, "John", "Doe", "john@example.com")
)

customers = Table('customers')

q = Query.into(customers).insert(
    (1, "Jane", "Doe", "jane@example.com"), (2, "John", "Doe", "john@example.com")
)

INSERT INTO "customers" VALUES (1,'Jane','Doe','jane@example.com'),(2,'John','Doe','john@example.com')

Insert with on Duplicate Key Update⚑

customers = Table('customers')

q = Query.into(customers)\
    .insert(1, 'Jane', 'Doe', 'jane@example.com')\
    .on_duplicate_key_update(customers.email, Values(customers.email))

INSERT INTO customers VALUES (1,'Jane','Doe','jane@example.com') ON DUPLICATE KEY UPDATE `email`=VALUES(`email`)

.on_duplicate_key_update works similar to .set for updating rows, additionally it provides the Values wrapper to update to the value specified in the INSERT clause.

Insert from a SELECT Sub-query⚑

INSERT INTO customers VALUES (1,'Jane','Doe','jane@example.com'),(2,'John','Doe','john@example.com')

To specify the columns and the order, use the columns function.

customers = Table('customers')

q = Query.into(customers).columns('id', 'fname', 'lname').insert(1, 'Jane', 'Doe')

INSERT INTO customers (id,fname,lname) VALUES (1,'Jane','Doe','jane@example.com')

Inserting data with a query works the same as querying data with the additional call to the into method in the builder chain.

customers, customers_backup = Tables('customers', 'customers_backup')

q = Query.into(customers_backup).from_(customers).select('*')

INSERT INTO customers_backup SELECT * FROM customers

customers, customers_backup = Tables('customers', 'customers_backup')

q = Query.into(customers_backup).columns('id', 'fname', 'lname')
    .from_(customers).select(customers.id, customers.fname, customers.lname)

INSERT INTO customers_backup SELECT "id", "fname", "lname" FROM customers

The syntax for joining tables is the same as when selecting data

customers, orders, orders_backup = Tables('customers', 'orders', 'orders_backup')

q = Query.into(orders_backup).columns('id', 'address', 'customer_fname', 'customer_lname')
    .from_(customers)
    .join(orders).on(orders.customer_id == customers.id)
    .select(orders.id, customers.fname, customers.lname)

INSERT INTO "orders_backup" ("id","address","customer_fname","customer_lname")
SELECT "orders"."id","customers"."fname","customers"."lname" FROM "customers"
JOIN "orders" ON "orders"."customer_id"="customers"."id"

[Updating⚑

data](https://pypika.readthedocs.io/en/latest/2_tutorial.html#updating-data)

PyPika allows update queries to be constructed with or without where clauses.

customers = Table('customers')

Query.update(customers).set(customers.last_login, '2017-01-01 10:00:00')

Query.update(customers).set(customers.lname, 'smith').where(customers.id == 10)

UPDATE "customers" SET "last_login"='2017-01-01 10:00:00'

UPDATE "customers" SET "lname"='smith' WHERE "id"=10

The syntax for joining tables is the same as when selecting data

customers, profiles = Tables('customers', 'profiles')

Query.update(customers)
     .join(profiles).on(profiles.customer_id == customers.id)
     .set(customers.lname, profiles.lname)

UPDATE "customers"
JOIN "profiles" ON "profiles"."customer_id"="customers"."id"
SET "customers"."lname"="profiles"."lname"

Using pypika.Table alias to perform the update

customers = Table('customers')

customers.update()
        .set(customers.lname, 'smith')
        .where(customers.id == 10)

UPDATE "customers" SET "lname"='smith' WHERE "id"=10

Using limit for performing update

customers = Table('customers')

customers.update()
        .set(customers.lname, 'smith')
        .limit(2)

UPDATE "customers" SET "lname"='smith' LIMIT 2

Selecting Data ⚑

The entry point for building queries is pypika.Query. In order to select columns from a table, the table must first be added to the query. For simple queries with only one table, tables and columns can be references using strings. For more sophisticated queries a pypika.Table must be used.

q = Query.from_('customers').select('id', 'fname', 'lname', 'phone')

To convert the query into raw SQL, it can be cast to a string.

str(q)

Alternatively, you can use the Query.get_sql() function:

q.get_sql()

The .select statement doesn't need to be after the .from_ statement. This is useful when composing a query in multiple steps, where you can do the .join before the .select.

In simple queries like the above example, columns in the “from” table can be referenced by passing string names into the select query builder function. In more complex examples, the pypika.Table class should be used. Columns can be referenced as attributes on instances of pypika.Table.

from pypika import Table, Query

customers = Table('customers')
q = Query.from_(customers).select(customers.id, customers.fname, customers.lname, customers.phone)

Both of the above examples result in the following SQL:

SELECT id,fname,lname,phone FROM customers

An alias for the table can be given using the .as_ function on pypika.Table.

Table('x_view_customers').as_('customers')
q = Query.from_(customers).select(customers.id, customers.phone)

SELECT id,phone FROM x_view_customers customers

An alias for the columns can also be given using the .as_ function on the columns.

customers = Table('customers')
q = Query.from_(customers).select(
    customers.id.as_("customer.id"), customers.fname.as_("customer.name")
)

SELECT "id" "customer.id","fname" "customer.name" FROM customers

A schema can also be specified. Tables can be referenced as attributes on the schema.

from pypika import Table, Query, Schema

views = Schema('views')
q = Query.from_(views.customers).select(customers.id, customers.phone)

SELECT id,phone FROM views.customers

Also references to databases can be used. Schemas can be referenced as attributes on the database.

from pypika import Table, Query, Database

my_db = Database('my_db')
q = Query.from_(my_db.analytics.customers).select(customers.id, customers.phone)

SELECT id,phone FROM my_db.analytics.customers

Results can be ordered by using the following syntax:

from pypika import Order
Query.from_('customers').select('id', 'fname', 'lname', 'phone').orderby('id', order=Order.desc)

This results in the following SQL:

SELECT "id","fname","lname","phone" FROM "customers" ORDER BY "id" DESC

Filtering ⚑

Queries can be filtered with pypika.Criterion by using equality or inequality operators.

customers = Table('customers')
q = Query.from_(customers).select(
    customers.id, customers.fname, customers.lname, customers.phone
).where(
    customers.lname == 'Mustermann'
)

SELECT id,fname,lname,phone FROM customers WHERE lname='Mustermann'

Query methods such as select, where, groupby, and orderby can be called multiple times. Multiple calls to the where method will add additional conditions as:

customers = Table('customers')
q = Query.from_(customers).select(
    customers.id, customers.fname, customers.lname, customers.phone
).where(
    customers.fname == 'Max'
).where(
    customers.lname == 'Mustermann'
)

SELECT id,fname,lname,phone FROM customers WHERE fname='Max' AND lname='Mustermann'

Filters such as IN and BETWEEN are also supported.

customers = Table('customers')
q = Query.from_(customers).select(
    customers.id,customers.fname
).where(
    customers.age[18:65] & customers.status.isin(['new', 'active'])
)

SELECT id,fname FROM customers WHERE age BETWEEN 18 AND 65 AND status IN ('new','active')

Filtering with complex criteria can be created using boolean symbols &, |, and ^.

AND

customers = Table('customers')
q = Query.from_(customers).select(
    customers.id, customers.fname, customers.lname, customers.phone
).where(
    (customers.age >= 18) & (customers.lname == 'Mustermann')
)

SELECT id,fname,lname,phone FROM customers WHERE age>=18 AND lname='Mustermann'

OR

customers = Table('customers')
q = Query.from_(customers).select(
    customers.id, customers.fname, customers.lname, customers.phone
).where(
    (customers.age >= 18) | (customers.lname == 'Mustermann')
)

SELECT id,fname,lname,phone FROM customers WHERE age>=18 OR lname='Mustermann'

XOR

customers = Table('customers')
q = Query.from_(customers).select(
    customers.id, customers.fname, customers.lname, customers.phone
).where(
    (customers.age >= 18) ^ customers.is_registered
)

SELECT id,fname,lname,phone FROM customers WHERE age>=18 XOR is_registered

Using the REGEXP filter

Pypika supports regex, but if you're using sqlite3 you need to configure the connection to the database.

Joining tables and subqueries ⚑

Tables and subqueries can be joined to any query using the Query.join() method. Joins can be performed with either a USING or ON clauses. The USING clause can be used when both tables/subqueries contain the same field and the ON clause can be used with a criterion. To perform a join, ...join() can be chained but then must be followed immediately by ...on(<criterion>) or ...using(*field).

Join Types⚑

All join types are supported by PyPika.

Query \
    .from_(base_table)
    ...
    .join(join_table, JoinType.left)
    ...

Query \
    .from_(base_table)
    ...
    .left_join(join_table) \
    .right_join(join_table) \
    .inner_join(join_table) \
    .outer_join(join_table) \
    .cross_join(join_table) \
    ...

Example of a join using ON⚑

history, customers = Tables('history', 'customers')
q = Query \
    .from_(history) \
    .join(customers) \
    .on(history.customer_id == customers.id) \
    .select(history.star) \
    .where(customers.id == 5)

SELECT "history".* FROM "history" JOIN "customers" ON "history"."customer_id"="customers"."id" WHERE "customers"."id"=5

Example of a join using ON_FIELD⚑

As a shortcut, the Query.join().on_field() function is provided for joining the (first) table in the FROM clause with the joined table when the field name(s) are the same in both tables.

history, customers = Tables('history', 'customers')
q = Query \
    .from_(history) \
    .join(customers) \
    .on_field('customer_id', 'group') \
    .select(history.star) \
    .where(customers.group == 'A')

SELECT "history".* FROM "history" JOIN "customers" ON "history"."customer_id"="customers"."customer_id" AND "history"."group"="customers"."group" WHERE "customers"."group"='A'

Example of a join using USING⚑

history, customers = Tables('history', 'customers')
q = Query \
    .from_(history) \
    .join(customers) \
    .using('customer_id') \
    .select(history.star) \
    .where(customers.id == 5)

SELECT "history".* FROM "history" JOIN "customers" USING "customer_id" WHERE "customers"."id"=5

Example of a correlated subquery in the SELECT⚑

history, customers = Tables('history', 'customers')

last_purchase_at = Query.from_(history).select(
    history.purchase_at
).where(history.customer_id==customers.customer_id).orderby(
    history.purchase_at, order=Order.desc
).limit(1)

q = Query.from_(customers).select(
    customers.id, last_purchase_at._as('last_purchase_at')
)

SELECT
  "id",
  (SELECT "history"."purchase_at"
   FROM "history"
   WHERE "history"."customer_id" = "customers"."customer_id"
   ORDER BY "history"."purchase_at" DESC
   LIMIT 1) "last_purchase_at"
FROM "customers"

Deleting data ⚑

Query.from_(table).delete().where(table.id == id)

References⚑

Docs
Source

Pypika

Installation⚑

Usage⚑

Creating Tables⚑

Inserting data⚑

Insert with values⚑

Insert with on Duplicate Key Update⚑

Insert from a SELECT Sub-query⚑

[Updating⚑

Selecting Data⚑

Filtering⚑

Joining tables and subqueries⚑

Join Types⚑

Example of a join using ON⚑

Example of a join using ON_FIELD⚑

Example of a join using USING⚑

Example of a correlated subquery in the SELECT⚑

Deleting data⚑

References⚑

Installation ⚑

Creating Tables ⚑

Inserting data ⚑

Selecting Data ⚑

Filtering ⚑

Joining tables and subqueries ⚑

Deleting data ⚑