airflow.contrib.hooks.gcp_sql_hook

Module Contents

airflow.contrib.hooks.gcp_sql_hook.UNIX_PATH_MAX = 108[source]
airflow.contrib.hooks.gcp_sql_hook.NUM_RETRIES = 5[source]
airflow.contrib.hooks.gcp_sql_hook.TIME_TO_SLEEP_IN_SECONDS = 1[source]
class airflow.contrib.hooks.gcp_sql_hook.CloudSqlOperationStatus[source]
PENDING = 'PENDING'[source]
RUNNING = 'RUNNING'[source]
DONE = 'DONE'[source]
UNKNOWN = 'UNKNOWN'[source]
class airflow.contrib.hooks.gcp_sql_hook.CloudSqlHook(api_version, gcp_conn_id='google_cloud_default', delegate_to=None)[source]

Bases: airflow.contrib.hooks.gcp_api_base_hook.GoogleCloudBaseHook

Hook for Google Cloud SQL APIs.

All the methods in the hook where project_id is used must be called with keyword arguments rather than positional.
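The keyword-argument rule can be illustrated with a small stand-alone sketch. It shows one way a hook could reject positional arguments and fall back to a default project when project_id is None or missing. The decorator name, the default_project_id attribute, and the FakeSqlHook class are hypothetical and are not taken from this module.

```python
import functools


def keyword_only_with_default_project(func):
    """Hypothetical decorator: require keyword arguments and fill in project_id."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if args:
            raise TypeError(
                "{} must be called with keyword arguments only".format(func.__name__)
            )
        if kwargs.get("project_id") is None:
            # Assumed attribute holding the default project of the connection.
            kwargs["project_id"] = self.default_project_id
        return func(self, **kwargs)

    return wrapper


class FakeSqlHook:
    """Stand-in for a hook; it only echoes the resolved arguments."""

    default_project_id = "default-project"

    @keyword_only_with_default_project
    def get_instance(self, instance, project_id=None):
        return {"project": project_id, "name": instance}


hook = FakeSqlHook()
# project_id omitted: the default project is used.
resolved = hook.get_instance(instance="my-instance")
# project_id given explicitly: it takes precedence.
explicit = hook.get_instance(instance="my-instance", project_id="other-project")
```

With a wrapper of this kind, a positional call such as hook.get_instance("my-instance") fails immediately instead of silently binding the wrong argument.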

_conn[source]
get_conn(self)[source]

Retrieves connection to Cloud SQL.

Returns

Google Cloud SQL services object.

Return type

dict

get_instance(self, instance, project_id=None)[source]

Retrieves a resource containing information about a Cloud SQL instance.

Parameters
  • instance (str) – Database instance ID. This does not include the project ID.

  • project_id (str) – Project ID of the project that contains the instance. If set to None or missing, the default project_id from the GCP connection is used.

Returns

A Cloud SQL instance resource.

Return type

dict

create_instance(self, body, project_id=None)[source]

Creates a new Cloud SQL instance.

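Parameters
  • body (dict) – Request body for the Cloud SQL instance insert call.

  • project_id (str) – Project ID of the project that contains the instance. If set to None or missing, the default project_id from the GCP connection is used.
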
Returns

None
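A minimal sketch of what a body for create_instance might look like. The field names follow the Cloud SQL Admin API instance resource as commonly documented (name, region, databaseVersion, settings.tier); every value is a placeholder, and the hook call is shown only as a comment because it needs a configured GCP connection.

```python
import json

# Placeholder values; adjust them to your own project and requirements.
instance_body = {
    "name": "example-instance",
    "region": "europe-west1",
    "databaseVersion": "MYSQL_5_7",
    "settings": {"tier": "db-n1-standard-1"},
}

# The body must be JSON-serializable because it is sent as the request payload.
payload = json.dumps(instance_body, sort_keys=True)

# Assumed usage with an already constructed hook:
# hook.create_instance(body=instance_body, project_id="example-project")
```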

patch_instance(self, body, instance, project_id=None)[source]

Updates settings of a Cloud SQL instance.

Caution: This is not a partial update, so you must include values for all the settings that you want to retain.

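Parameters
  • body (dict) – Request body for the Cloud SQL instance patch call.

  • instance (str) – Cloud SQL instance ID. This does not include the project ID.

  • project_id (str) – Project ID of the project that contains the instance. If set to None or missing, the default project_id from the GCP connection is used.
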
Returns

None

delete_instance(self, instance, project_id=None)[source]

Deletes a Cloud SQL instance.

Parameters
  • project_id (str) – Project ID of the project that contains the instance. If set to None or missing, the default project_id from the GCP connection is used.

  • instance (str) – Cloud SQL instance ID. This does not include the project ID.

Returns

None

get_database(self, instance, database, project_id=None)[source]

Retrieves a database resource from a Cloud SQL instance.

Parameters
  • instance (str) – Database instance ID. This does not include the project ID.

  • database (str) – Name of the database in the instance.

  • project_id (str) – Project ID of the project that contains the instance. If set to None or missing, the default project_id from the GCP connection is used.

Returns

A Cloud SQL database resource, as described in https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/databases#resource.

Return type

dict

create_database(self, instance, body, project_id=None)[source]

Creates a new database inside a Cloud SQL instance.

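Parameters
  • instance (str) – Database instance ID. This does not include the project ID.

  • body (dict) – Request body for the Cloud SQL database insert call.

  • project_id (str) – Project ID of the project that contains the instance. If set to None or missing, the default project_id from the GCP connection is used.
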
Returns

None

patch_database(self, instance, database, body, project_id=None)[source]

Updates a database resource inside a Cloud SQL instance.

This method supports patch semantics. See https://cloud.google.com/sql/docs/mysql/admin-api/how-tos/performance#patch.

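Parameters
  • instance (str) – Database instance ID. This does not include the project ID.

  • database (str) – Name of the database to be updated in the instance.

  • body (dict) – Request body for the Cloud SQL database patch call.

  • project_id (str) – Project ID of the project that contains the instance. If set to None or missing, the default project_id from the GCP connection is used.
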
Returns

None

delete_database(self, instance, database, project_id=None)[source]

Deletes a database from a Cloud SQL instance.

Parameters
  • instance (str) – Database instance ID. This does not include the project ID.

  • database (str) – Name of the database to be deleted in the instance.

  • project_id (str) – Project ID of the project that contains the instance. If set to None or missing, the default project_id from the GCP connection is used.

Returns

None

export_instance(self, instance, body, project_id=None)[source]

Exports data from a Cloud SQL instance to a Cloud Storage bucket as a SQL dump or CSV file.

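Parameters
  • instance (str) – Database instance ID of the Cloud SQL instance. This does not include the project ID.

  • body (dict) – Request body for the Cloud SQL export call.

  • project_id (str) – Project ID of the project that contains the instance. If set to None or missing, the default project_id from the GCP connection is used.
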
Returns

None

import_instance(self, instance, body, project_id=None)[source]

Imports data into a Cloud SQL instance from a SQL dump or CSV file in Cloud Storage.

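Parameters
  • instance (str) – Database instance ID of the Cloud SQL instance. This does not include the project ID.

  • body (dict) – Request body for the Cloud SQL import call.

  • project_id (str) – Project ID of the project that contains the instance. If set to None or missing, the default project_id from the GCP connection is used.
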
Returns

None

_wait_for_operation_to_complete(self, project_id, operation_name)[source]

Waits for the named operation to complete by repeatedly checking the status of the asynchronous call.

Parameters
  • project_id (str) – Project ID of the project that contains the instance.

  • operation_name (str) – Name of the operation.

Returns

None
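The waiting behaviour can be sketched as a simple polling loop. This is an illustration under stated assumptions, not the hook's implementation: fetch_operation is a hypothetical callable that returns the operation resource as a dict with a "status" key and an optional "error" key, and the status strings and sleep interval mirror the constants listed in the module contents.

```python
import time

DONE = "DONE"
TIME_TO_SLEEP_IN_SECONDS = 1


def wait_for_operation(fetch_operation, sleep_seconds=TIME_TO_SLEEP_IN_SECONDS):
    """Poll fetch_operation() until the operation reports DONE."""
    while True:
        operation = fetch_operation()
        if operation.get("status") == DONE:
            error = operation.get("error")
            if error:
                # A finished operation can still have failed.
                raise RuntimeError("Operation failed: {}".format(error))
            return operation
        # Not finished yet (for example PENDING or RUNNING): wait and retry.
        time.sleep(sleep_seconds)


# Simulated sequence of responses standing in for the real API.
responses = iter([{"status": "PENDING"}, {"status": "RUNNING"}, {"status": "DONE"}])
result = wait_for_operation(lambda: next(responses), sleep_seconds=0)
```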

airflow.contrib.hooks.gcp_sql_hook.CLOUD_SQL_PROXY_DOWNLOAD_URL = https://dl.google.com/cloudsql/cloud_sql_proxy.{}.{}[source]
airflow.contrib.hooks.gcp_sql_hook.CLOUD_SQL_PROXY_VERSION_DOWNLOAD_URL = https://storage.googleapis.com/cloudsql-proxy/{}/cloud_sql_proxy.{}.{}[source]
airflow.contrib.hooks.gcp_sql_hook.GCP_CREDENTIALS_KEY_PATH = extra__google_cloud_platform__key_path[source]
airflow.contrib.hooks.gcp_sql_hook.GCP_CREDENTIALS_KEYFILE_DICT = extra__google_cloud_platform__keyfile_dict[source]
class airflow.contrib.hooks.gcp_sql_hook.CloudSqlProxyRunner(path_prefix, instance_specification, gcp_conn_id='google_cloud_default', project_id=None, sql_proxy_version=None, sql_proxy_binary_path=None)[source]

Bases: airflow.LoggingMixin

Downloads and runs cloud-sql-proxy as a subprocess of the Python process.

The cloud-sql-proxy must be downloaded and started before a database connection to the Google Cloud SQL instance can be opened. It establishes a secure tunnel to the database and authorizes with the GCP credentials passed in the configuration.

More details about the proxy can be found here: https://cloud.google.com/sql/docs/mysql/sql-proxy
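The two download URL templates listed in the module contents contain positional placeholders. The sketch below shows how they might be filled in. It assumes the placeholders are the operating system and processor architecture, with the version first in the versioned template. That ordering is an assumption, and proxy_download_url is a hypothetical helper.

```python
import platform

CLOUD_SQL_PROXY_DOWNLOAD_URL = "https://dl.google.com/cloudsql/cloud_sql_proxy.{}.{}"
CLOUD_SQL_PROXY_VERSION_DOWNLOAD_URL = (
    "https://storage.googleapis.com/cloudsql-proxy/{}/cloud_sql_proxy.{}.{}"
)


def proxy_download_url(system, processor, version=None):
    """Build a download URL; placeholder order is assumed, not confirmed."""
    if version:
        return CLOUD_SQL_PROXY_VERSION_DOWNLOAD_URL.format(version, system, processor)
    return CLOUD_SQL_PROXY_DOWNLOAD_URL.format(system, processor)


latest_url = proxy_download_url("linux", "amd64")
pinned_url = proxy_download_url("linux", "amd64", version="v1.13")

# The system name for the current machine could be derived like this.
local_system = platform.system().lower()
```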

_build_command_line_parameters(self)[source]
static _is_os_64bit()[source]
_download_sql_proxy_if_needed(self)[source]
_get_credential_parameters(self, session)[source]
start_proxy(self)[source]

Starts Cloud SQL Proxy.

Remember to stop the proxy if you started it.

stop_proxy(self)[source]

Stops running proxy.

You should stop the proxy after you stop using it.
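Because start_proxy and stop_proxy must be paired, one convenient pattern is to wrap them in a context manager so that the proxy is stopped even when the work in between fails. The running_proxy helper and the FakeRunner class below are hypothetical. Only the start_proxy and stop_proxy method names come from this page.

```python
from contextlib import contextmanager


@contextmanager
def running_proxy(runner):
    """Start the proxy on entry and always stop it on exit."""
    runner.start_proxy()
    try:
        yield runner
    finally:
        runner.stop_proxy()


class FakeRunner:
    """Stand-in for a proxy runner that records lifecycle events."""

    def __init__(self):
        self.events = []

    def start_proxy(self):
        self.events.append("start")

    def stop_proxy(self):
        self.events.append("stop")


runner = FakeRunner()
try:
    with running_proxy(runner):
        raise ValueError("query failed")
except ValueError:
    # The proxy has already been stopped by the context manager.
    pass
```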

get_proxy_version(self)[source]

Returns the version of the Cloud SQL Proxy.

get_socket_path(self)[source]

Retrieves UNIX socket path used by Cloud SQL Proxy.

Returns

The dynamically generated path for the socket created by the proxy.

Return type

str
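UNIX socket paths have a platform-dependent maximum length, which is why the module defines UNIX_PATH_MAX = 108. A minimal sketch of a length check on a generated socket path follows. The check_socket_path_length helper and the example path are hypothetical, and the exact comparison the hook performs is not shown on this page.

```python
UNIX_PATH_MAX = 108  # value listed in the module contents


def check_socket_path_length(socket_path, limit=UNIX_PATH_MAX):
    """Return the encoded length of the path, or raise if it exceeds the limit."""
    encoded_length = len(socket_path.encode("utf-8"))
    if encoded_length > limit:
        raise ValueError(
            "Socket path is {} bytes long, the limit is {}".format(encoded_length, limit)
        )
    return encoded_length


# Placeholder path; a real one is generated dynamically.
short_length = check_socket_path_length("/tmp/proxy/project:region:instance")
```

Choosing a short path prefix leaves more room for the instance part of the socket name.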

airflow.contrib.hooks.gcp_sql_hook.CONNECTION_URIS[source]
airflow.contrib.hooks.gcp_sql_hook.CLOUD_SQL_VALID_DATABASE_TYPES = ['postgres', 'mysql'][source]
class airflow.contrib.hooks.gcp_sql_hook.CloudSqlDatabaseHook(gcp_cloudsql_conn_id='google_cloud_sql_default', default_gcp_project_id=None)[source]

Bases: airflow.hooks.base_hook.BaseHook

Serves DB connection configuration for Google Cloud SQL (Connections of gcpcloudsql:// type).

The hook is a “meta” hook: it does not open a database connection itself. It retrieves all the parameters configured in the gcpcloudsql:// connection, starts and stops the Cloud SQL Proxy if needed, dynamically generates a Postgres or MySQL connection in the database, and returns an actual Postgres or MySQL hook. The returned hook uses either a direct connection or the Cloud SQL Proxy (UNIX socket or TCP), as configured.

Main parameters of the hook are retrieved from the standard URI components:

  • user - User name to authenticate to the database (from login of the URI).

  • password - Password to authenticate to the database (from password of the URI).

  • public_ip - IP to connect to for public connection (from host of the URI).

  • public_port - Port to connect to for public connection (from port of the URI).

  • database - Database to connect to (from schema of the URI).

Remaining parameters are retrieved from the extras (URI query parameters):

  • project_id - Optional. Google Cloud Platform project where the Cloud SQL instance exists. If missing, the default project ID passed to the hook is used.

  • instance - Name of the Cloud SQL database instance.

  • location - The location of the Cloud SQL instance (for example europe-west1).

  • database_type - The type of the database instance (MySQL or Postgres).

  • use_proxy - (default False) Whether SQL proxy should be used to connect to Cloud SQL DB.

  • use_ssl - (default False) Whether SSL should be used to connect to Cloud SQL DB. You cannot use proxy and SSL together.

  • sql_proxy_use_tcp - (default False) If set to true, TCP is used to connect via proxy, otherwise UNIX sockets are used.

  • sql_proxy_binary_path - Optional path to Cloud SQL Proxy binary. If the binary is not specified or the binary is not present, it is automatically downloaded.

  • sql_proxy_version - Specific version of the proxy to download (for example v1.13). If not specified, the latest version is downloaded.

  • sslcert - Path to client certificate to authenticate when SSL is used.

  • sslkey - Path to client private key to authenticate when SSL is used.

  • sslrootcert - Path to server’s certificate to authenticate when SSL is used.

Parameters
  • gcp_cloudsql_conn_id (str) – ID of the connection of gcpcloudsql:// type that holds the connection URL

  • default_gcp_project_id (str) – Default project ID used if project_id is not specified in the connection URL
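The mapping between URI components and hook parameters described above can be sketched with the standard library alone. The build_cloudsql_uri helper and all values are hypothetical placeholders. The sketch only shows how login, password, host, port, and schema map to the standard URI parts and how the extras travel as query parameters.

```python
from urllib.parse import parse_qs, quote_plus, unquote_plus, urlencode, urlsplit


def build_cloudsql_uri(user, password, public_ip, public_port, database, **extras):
    """Assemble a gcpcloudsql:// URI; extras become query parameters."""
    return "gcpcloudsql://{}:{}@{}:{}/{}?{}".format(
        quote_plus(user),
        quote_plus(password),
        public_ip,
        public_port,
        database,
        urlencode(extras),
    )


uri = build_cloudsql_uri(
    "user",
    "p@ss word",
    "203.0.113.10",
    3306,
    "testdb",
    database_type="mysql",
    project_id="example-project",
    location="europe-west1",
    instance="testinstance",
    use_proxy="True",
    sql_proxy_use_tcp="False",
)

# Parse the URI back into the pieces the hook description refers to.
parts = urlsplit(uri)
decoded_password = unquote_plus(parts.password)
extras = {key: values[0] for key, values in parse_qs(parts.query).items()}
```

Quoting the user name and password keeps characters such as @ from being mistaken for URI delimiters.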

_conn[source]
static _get_bool(val)[source]
static _check_ssl_file(file_to_check, name)[source]
_validate_inputs(self)[source]
validate_ssl_certs(self)[source]
validate_socket_path_length(self)[source]
static _generate_unique_path()[source]
static _quote(value)[source]
_generate_connection_uri(self)[source]
_get_instance_socket_name(self)[source]
_get_sqlproxy_instance_specification(self)[source]
create_connection(self, session=None)[source]

Creates a connection in the Connection table, configured according to whether it uses the proxy, TCP, UNIX sockets, or SSL. The connection ID is randomly generated.

Parameters

session – Session of the SQLAlchemy ORM (automatically generated with decorator).

retrieve_connection(self, session=None)[source]

Retrieves the dynamically created connection from the Connection table.

Parameters

session – Session of the SQLAlchemy ORM (automatically generated with decorator).

delete_connection(self, session=None)[source]

Delete the dynamically created connection from the Connection table.

Parameters

session – Session of the SQLAlchemy ORM (automatically generated with decorator).

get_sqlproxy_runner(self)[source]

Retrieve Cloud SQL Proxy runner. It is used to manage the proxy lifecycle per task.

Returns

The Cloud SQL Proxy runner.

Return type

CloudSqlProxyRunner

get_database_hook(self)[source]

Retrieve database hook. This is the actual Postgres or MySQL database hook that uses proxy or connects directly to the Google Cloud SQL database.

cleanup_database_hook(self)[source]

Clean up database hook after it was used.
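The methods of this hook are meant to be used together. The sketch below shows one plausible ordering that guarantees cleanup in reverse order of setup. It is an assumption about typical usage, not a documented contract. The run_with_cloud_sql function, its work callback, and the RecordingHook stand-in are hypothetical. The method names are the ones listed on this page.

```python
def run_with_cloud_sql(hook, work, use_proxy=True):
    """Run work(database_hook) with connection, proxy, and cleanup handled."""
    hook.validate_ssl_certs()
    hook.create_connection()
    try:
        hook.validate_socket_path_length()
        database_hook = hook.get_database_hook()
        try:
            runner = None
            try:
                if use_proxy:
                    runner = hook.get_sqlproxy_runner()
                    # Release the reserved port just before the proxy binds it.
                    hook.free_reserved_port()
                    runner.start_proxy()
                return work(database_hook)
            finally:
                if runner is not None:
                    runner.stop_proxy()
        finally:
            hook.cleanup_database_hook()
    finally:
        hook.delete_connection()


class RecordingHook:
    """Stand-in that records every method call and returns itself."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def method(*args, **kwargs):
            self.calls.append(name)
            return self

        return method


fake_hook = RecordingHook()
outcome = run_with_cloud_sql(fake_hook, lambda database_hook: "rows")
```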

reserve_free_tcp_port(self)[source]

Reserve a free TCP port to be used by Cloud SQL Proxy.

free_reserved_port(self)[source]

Free TCP port. Makes it immediately ready to be used by Cloud SQL Proxy.
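A common way to reserve and later free a TCP port is to bind a socket to port 0, let the operating system pick a free port, and close the socket shortly before the port is needed. The sketch below illustrates that general technique with stand-alone functions. Whether the hook does exactly this is an assumption.

```python
import socket


def reserve_free_tcp_port():
    """Bind to port 0 so the OS assigns a free port; keep the socket open."""
    reserved = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    reserved.bind(("127.0.0.1", 0))
    return reserved, reserved.getsockname()[1]


def free_reserved_port(reserved):
    """Close the holding socket so another process can bind the port."""
    reserved.close()


reserved_socket, port = reserve_free_tcp_port()
# ... pass `port` to the process that should listen on it ...
free_reserved_port(reserved_socket)
```

There is a short window between freeing the port and the proxy binding it, so the port should be freed as late as possible.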