destinations.sql_jobs
SqlBaseJob Objects
class SqlBaseJob(NewLoadJobImpl)
SQL base job for jobs that rely on the whole table chain.
from_table_chain
@classmethod
def from_table_chain(cls,
table_chain: Sequence[TTableSchema],
sql_client: SqlClientBase[Any],
params: Optional[SqlJobParams] = None) -> NewLoadJobImpl
Generates a list of SQL statements that the SQL client runs when the loader executes the job.
The table_chain contains the schemas of tables in a parent-child relationship, ordered by ancestry (the root of the tree is first in the list).
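The ordering requirement can be illustrated with a minimal sketch. It assumes plain dictionaries with only `name` and `parent` keys as a stand-in for TTableSchema; the table names and the `is_ordered_by_ancestry` helper are hypothetical and not part of this module.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical stand-in for TTableSchema: only "name" and "parent" are modeled.
TableSketch = Dict[str, Optional[str]]


def is_ordered_by_ancestry(table_chain: Sequence[TableSketch]) -> bool:
    """Return True if the root comes first and every child follows its parent."""
    seen: List[Optional[str]] = []
    for position, table in enumerate(table_chain):
        parent = table.get("parent")
        if position == 0:
            # The first element must be the root, which has no parent.
            if parent is not None:
                return False
        elif parent not in seen:
            # A child appeared before its parent, or the parent is missing.
            return False
        seen.append(table["name"])
    return True


# Illustrative chain: root first, then child, then grandchild.
table_chain = [
    {"name": "orders", "parent": None},
    {"name": "orders__items", "parent": "orders"},
    {"name": "orders__items__discounts", "parent": "orders__items"},
]
print(is_ordered_by_ancestry(table_chain))
```

A chain in this order lets a job process the root table before any table that depends on it.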
SqlStagingCopyJob Objects
class SqlStagingCopyJob(SqlBaseJob)
Generates a list of SQL statements that copy the data from the staging dataset into the destination dataset.
SqlMergeJob Objects
class SqlMergeJob(SqlBaseJob)
Generates a list of SQL statements that merge the data from the staging dataset into the destination dataset.
generate_sql
@classmethod
def generate_sql(cls,
table_chain: Sequence[TTableSchema],
sql_client: SqlClientBase[Any],
params: Optional[SqlJobParams] = None) -> List[str]
Generates a list of SQL statements that merge the data in the staging dataset with the data in the destination dataset.
The table_chain contains the schemas of tables in a parent-child relationship, ordered by ancestry (the root of the tree is first in the list).
The root table is merged using the primary_key and merge_key hints. Each hint can be compound, and both can be specified at once, in which case an OR clause is generated.
The child tables are merged based on the propagated root_key, which is a type of foreign key that always points to the root table.
First, we store the root_keys of the root table rows to be deleted in a temp table. Then we use the temp table to delete records from the root table and all child tables in the destination dataset. Finally, we copy the data from the staging dataset into the destination dataset.
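The three steps above can be sketched end to end with Python's built-in sqlite3 module. This is an illustration under stated assumptions, not the SQL this class generates: the table names, the `row_id` and `root_key` column names, and the `staging_` prefix standing in for the staging dataset are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical destination and staging tables. A "staging_" prefix stands in
# for the staging dataset; row_id and root_key are assumed column names.
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER, status TEXT, row_id TEXT);
    CREATE TABLE orders__items (sku TEXT, root_key TEXT);
    CREATE TABLE staging_orders (id INTEGER, status TEXT, row_id TEXT);
    CREATE TABLE staging_orders__items (sku TEXT, root_key TEXT);

    INSERT INTO orders VALUES (1, 'open', 'a1'), (2, 'open', 'b2');
    INSERT INTO orders__items VALUES ('x', 'a1'), ('y', 'b2');
    INSERT INTO staging_orders VALUES (1, 'shipped', 'c3');
    INSERT INTO staging_orders__items VALUES ('x', 'c3'), ('z', 'c3');
    """
)

# Step 1: store the keys of destination root rows that match staging rows
# on the primary key (here: id).
conn.execute(
    """
    CREATE TEMP TABLE delete_keys AS
    SELECT d.row_id FROM orders AS d
    WHERE EXISTS (SELECT 1 FROM staging_orders AS s WHERE d.id = s.id)
    """
)

# Step 2: delete matching rows from the child table (via root_key) and from
# the root table (via its own unique column).
for table, column in (("orders__items", "root_key"), ("orders", "row_id")):
    conn.execute(
        f"DELETE FROM {table} WHERE {column} IN (SELECT row_id FROM delete_keys)"
    )

# Step 3: copy the staging data into the destination tables.
for table in ("orders", "orders__items"):
    conn.execute(f"INSERT INTO {table} SELECT * FROM staging_{table}")

root_rows = conn.execute("SELECT id, status FROM orders ORDER BY id").fetchall()
child_rows = conn.execute(
    "SELECT sku, root_key FROM orders__items ORDER BY sku"
).fetchall()
print(root_rows)   # [(1, 'shipped'), (2, 'open')]
print(child_rows)  # [('x', 'c3'), ('y', 'b2'), ('z', 'c3')]
```

Order 1 is replaced together with its child rows, while order 2, which has no match in staging, is left untouched.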
gen_key_table_clauses
@classmethod
def gen_key_table_clauses(cls, root_table_name: str,
staging_root_table_name: str,
key_clauses: Sequence[str],
for_delete: bool) -> List[str]
Generates SQL clauses that may be used to select or delete rows in the root table of the destination dataset.
A list of clauses may be returned for engines that do not support OR in subqueries, such as BigQuery.
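The difference between a single OR clause and a list of clauses can be shown with a small sketch. The `combine_key_clauses` helper, its `supports_or` flag, and the `d`/`s` table aliases are hypothetical and only illustrate the idea; they are not the signature or output of gen_key_table_clauses.

```python
from typing import List, Sequence


def combine_key_clauses(key_clauses: Sequence[str], supports_or: bool) -> List[str]:
    """Hypothetical helper: join key conditions with OR, or keep them separate."""
    if supports_or:
        # One condition that matches a row if any of the key clauses matches.
        return [" OR ".join(f"({clause})" for clause in key_clauses)]
    # One condition per key clause; the caller emits one statement for each.
    return list(key_clauses)


# Illustrative conditions: a primary key match and a compound merge key match.
key_clauses = ["d.id = s.id", "d.region = s.region AND d.day = s.day"]
print(combine_key_clauses(key_clauses, supports_or=True))
print(combine_key_clauses(key_clauses, supports_or=False))
```

With separate clauses, each one selects part of the rows to delete, and together they cover the same rows as the combined OR condition.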
gen_delete_temp_table_sql
@classmethod
def gen_delete_temp_table_sql(
cls, unique_column: str,
key_table_clauses: Sequence[str]) -> Tuple[List[str], str]
Generates SQL that creates the delete temp table and inserts unique_column values from the root table for all records to delete. May return several statements.
Also returns the temp table name, for cases where special names are required, such as SQL Server.
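A rough sketch of why several statements and a table name may be returned follows. It assumes that each key table clause is a `FROM ... WHERE ...` fragment that aliases the root table as `d`; that shape, the `sketch_delete_temp_table_sql` name, and the generated SQL are assumptions for illustration, not the actual output of this method.

```python
import uuid
from typing import List, Sequence, Tuple


def sketch_delete_temp_table_sql(
    unique_column: str, key_table_clauses: Sequence[str]
) -> Tuple[List[str], str]:
    """Hypothetical sketch: build temp table statements and return the table name."""
    # A random suffix keeps the temp table name unique per job.
    temp_table_name = f"delete_{uuid.uuid4().hex}"
    statements: List[str] = []
    for index, clause in enumerate(key_table_clauses):
        select_sql = f"SELECT d.{unique_column} {clause}"
        if index == 0:
            # The first clause creates the temp table.
            statements.append(f"CREATE TEMP TABLE {temp_table_name} AS {select_sql};")
        else:
            # Further clauses append more keys, one statement per clause.
            statements.append(f"INSERT INTO {temp_table_name} {select_sql};")
    return statements, temp_table_name


clauses = [
    "FROM orders AS d WHERE EXISTS (SELECT 1 FROM staging_orders AS s WHERE d.id = s.id)",
    "FROM orders AS d WHERE EXISTS (SELECT 1 FROM staging_orders AS s WHERE d.day = s.day)",
]
statements, temp_table_name = sketch_delete_temp_table_sql("row_id", clauses)
for statement in statements:
    print(statement)
```

Returning the name alongside the statements lets the caller reference the same temp table in later DELETE statements, even when an engine imposes its own naming rules.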
gen_delete_from_sql
@classmethod
def gen_delete_from_sql(cls, table_name: str, unique_column: str,
delete_temp_table_name: str,
temp_table_column: str) -> str
Generates a DELETE FROM statement that deletes the records found in the delete temp table.
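One plausible shape for such a statement is shown below. The `sketch_delete_from_sql` function and the example identifiers are hypothetical, the parameters simply mirror the signature above, and the identifiers are inserted without escaping, which real code would need to handle.

```python
def sketch_delete_from_sql(
    table_name: str,
    unique_column: str,
    delete_temp_table_name: str,
    temp_table_column: str,
) -> str:
    """Hypothetical sketch: delete rows whose key appears in the temp table."""
    # Identifiers are interpolated as-is; escaping is left out for brevity.
    return (
        f"DELETE FROM {table_name} "
        f"WHERE {unique_column} IN "
        f"(SELECT {temp_table_column} FROM {delete_temp_table_name});"
    )


delete_sql = sketch_delete_from_sql("orders__items", "root_key", "delete_keys", "row_id")
print(delete_sql)
# DELETE FROM orders__items WHERE root_key IN (SELECT row_id FROM delete_keys);
```

The same pattern applies to the root table and to each child table; only the table name and the column compared against the temp table change.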