Interact with Google Cloud Storage. This hook uses the Google Cloud Platform connection.
Returns a Google Cloud Storage service object.
copy(self, source_bucket, source_object, destination_bucket=None, destination_object=None)
Copies an object from one bucket to another, with renaming if requested.
Either destination_bucket or destination_object can be omitted, in which case the source bucket or source object name is used, but they cannot both be omitted.
source_bucket (str) – The bucket of the object to copy from.
source_object (str) – The object to copy.
destination_bucket (str) – The bucket to copy the object to. Can be omitted; then the same bucket is used.
destination_object (str) – The (renamed) path of the object if given. Can be omitted; then the same name is used.
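The fallback rule for the destination arguments can be sketched as a small stand-alone helper. The name resolve_copy_destination is hypothetical and not part of the hook; it only illustrates how an omitted destination falls back to the source value, and it assumes that omitting both is rejected with a ValueError.

```python
def resolve_copy_destination(source_bucket, source_object,
                             destination_bucket=None, destination_object=None):
    """Hypothetical helper: apply the documented fallback rule for copy()."""
    if destination_bucket is None and destination_object is None:
        # Assumption: omitting both is an error, because the source and the
        # destination would then be the same object.
        raise ValueError("Provide destination_bucket, destination_object, or both.")
    # An omitted destination value falls back to the matching source value.
    return (destination_bucket or source_bucket,
            destination_object or source_object)


same_name = resolve_copy_destination("raw", "data.csv", destination_bucket="archive")
same_bucket = resolve_copy_destination("raw", "data.csv", destination_object="data_v2.csv")
```

Here same_name keeps the object name in a different bucket, and same_bucket renames the object inside the source bucket.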
rewrite(self, source_bucket, source_object, destination_bucket, destination_object=None)
Has the same functionality as copy, except that it also works on files over 5 TB and when copying between locations and/or storage classes.
destination_object can be omitted, in which case source_object is used.
download(self, bucket, object, filename=None)
Downloads a file from Google Cloud Storage.
When no filename is supplied, the method loads the file into memory and returns its content. When a filename is supplied, it writes the file to the specified location and returns that location. For files larger than the available memory, writing to a file is recommended.
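The two return modes can be illustrated without any cloud access. The helper deliver_download below is hypothetical: it takes bytes that are assumed to have been fetched already and mirrors only the documented choice between returning the content and returning a file location.

```python
import os
import tempfile


def deliver_download(content, filename=None):
    """Hypothetical helper mirroring the documented return behaviour of download()."""
    if filename is None:
        # No filename: hand the content back in memory.
        return content
    # Filename given: write the content to disk and return the location.
    with open(filename, "wb") as handle:
        handle.write(content)
    return filename


payload = b"col_a,col_b\n1,2\n"
in_memory = deliver_download(payload)
target = os.path.join(tempfile.mkdtemp(), "data.csv")
on_disk = deliver_download(payload, target)
```

A caller therefore has to know which mode it asked for, since the return value is either the bytes or a path.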
upload(self, bucket, object, filename, mime_type='application/octet-stream', gzip=False, multipart=None, num_retries=None)
Uploads a local file to Google Cloud Storage.
exists(self, bucket, object)
Checks for the existence of a file in Google Cloud Storage.
is_updated_after(self, bucket, object, ts)
Checks whether an object in Google Cloud Storage was updated after the timestamp ts.
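The check can be pictured as a plain timestamp comparison. The sketch below is not the hook's code: updated_after is a hypothetical function, the object's update time is passed in directly as a timezone-aware datetime, and treating a naive ts as UTC is an assumption made for this example.

```python
from datetime import datetime, timezone


def updated_after(object_updated, ts):
    """Hypothetical comparison: True if the object's update time is later than ts."""
    if ts.tzinfo is None:
        # Assumption: a naive timestamp is interpreted as UTC.
        ts = ts.replace(tzinfo=timezone.utc)
    return object_updated > ts


last_update = datetime(2020, 1, 2, tzinfo=timezone.utc)
newer = updated_after(last_update, datetime(2020, 1, 1))
older = updated_after(last_update, datetime(2020, 1, 3, tzinfo=timezone.utc))
```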
delete(self, bucket, object, generation=None)
Deletes an object from the bucket.
list(self, bucket, versions=None, maxResults=None, prefix=None, delimiter=None)
Lists all objects in the bucket whose names start with the given string prefix.
bucket (str) – bucket name
versions (bool) – if true, list all versions of the objects
maxResults (int) – max count of items to return in a single page of responses
prefix (str) – prefix string which filters objects whose names begin with this prefix
delimiter (str) – filters objects based on the delimiter (e.g. ‘.csv’)
Returns a stream of object names matching the filtering criteria.
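How prefix and maxResults interact can be modelled on an in-memory list of names. This is only a sketch: iter_name_pages is a hypothetical function, it does not call Cloud Storage, and it deliberately leaves out versions and delimiter.

```python
def iter_name_pages(names, prefix=None, max_results=None):
    """Hypothetical in-memory model: yield pages of names that start with prefix."""
    matching = [n for n in names if prefix is None or n.startswith(prefix)]
    # max_results caps the size of one page, not the total number of names.
    page_size = max_results or len(matching) or 1
    for start in range(0, len(matching), page_size):
        yield matching[start:start + page_size]


names = ["logs/a.csv", "logs/b.csv", "logs/c.csv", "tmp/d.csv"]
pages = list(iter_name_pages(names, prefix="logs/", max_results=2))
```

With these inputs the three matching names are split into one page of two names and one page of one name.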
get_size(self, bucket, object)
Gets the size of a file in Google Cloud Storage in bytes.
get_crc32c(self, bucket, object)
Gets the CRC32c checksum of an object in Google Cloud Storage.
get_md5hash(self, bucket, object)
Gets the MD5 hash of an object in Google Cloud Storage.
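A common use of the MD5 value is to verify a local copy. The sketch below assumes the returned hash has the base64-encoded form that the Cloud Storage JSON API uses for its md5Hash field; whether the hook returns exactly that form is an assumption here. The function local_md5_base64 is hypothetical and computes the same encoding for local bytes so the two strings can be compared.

```python
import base64
import hashlib


def local_md5_base64(data):
    """Hypothetical helper: base64-encoded MD5 digest of local bytes."""
    digest = hashlib.md5(data).digest()
    return base64.b64encode(digest).decode("ascii")


local_hash = local_md5_base64(b"hello")
# Compare local_hash with the value returned for the remote object;
# equal strings indicate matching content under the assumption above.
```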
create_bucket(self, bucket_name, resource=None, storage_class='MULTI_REGIONAL', location='US', project_id=None, labels=None)
Creates a new bucket. Google Cloud Storage uses a flat namespace, so you can’t create a bucket with a name that is already in use.
For more information, see Bucket Naming Guidelines: https://cloud.google.com/storage/docs/bucketnaming.html#requirements
bucket_name (str) – The name of the bucket.
resource (dict) – An optional dict with parameters for creating the bucket. For information on available parameters, see Cloud Storage API doc: https://cloud.google.com/storage/docs/json_api/v1/buckets/insert
storage_class (str) –
This defines how objects in the bucket are stored and determines the SLA and the cost of storage. Values include
If this value is not specified when the bucket is created, it will default to STANDARD.
location (str) –
The location of the bucket. Object data for objects in the bucket resides in physical storage within this region. Defaults to US.
project_id (str) – The ID of the GCP Project.
labels (dict) – User-provided labels, in key/value pairs.
If successful, it returns the id of the bucket.
insert_bucket_acl(self, bucket, entity, role, user_project=None)
Creates a new ACL entry on the specified bucket. See: https://cloud.google.com/storage/docs/json_api/v1/bucketAccessControls/insert
bucket (str) – Name of a bucket.
entity (str) – The entity holding the permission, in one of the following forms: user-userId, user-email, group-groupId, group-email, domain-domain, project-team-projectId, allUsers, allAuthenticatedUsers. See: https://cloud.google.com/storage/docs/access-control/lists#scopes
role (str) – The access permission for the entity. Acceptable values are: “OWNER”, “READER”, “WRITER”.
user_project (str) – (Optional) The project to be billed for this request. Required for Requester Pays buckets.
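The entity and role rules listed above lend themselves to a quick pre-flight check. The function check_bucket_acl_args is hypothetical and only tests the shape of the arguments against the documented forms; it does not verify that the user, group, domain, or project actually exists.

```python
ENTITY_PREFIXES = ("user-", "group-", "domain-", "project-")
SPECIAL_ENTITIES = ("allUsers", "allAuthenticatedUsers")
BUCKET_ROLES = ("OWNER", "READER", "WRITER")


def check_bucket_acl_args(entity, role):
    """Hypothetical validator for the documented entity forms and bucket roles."""
    has_known_prefix = any(
        entity.startswith(prefix) and len(entity) > len(prefix)
        for prefix in ENTITY_PREFIXES
    )
    if entity not in SPECIAL_ENTITIES and not has_known_prefix:
        raise ValueError("Unrecognised entity form: %r" % entity)
    if role not in BUCKET_ROLES:
        raise ValueError("Unrecognised bucket role: %r" % role)
    return {"entity": entity, "role": role}


acl_entry = check_bucket_acl_args("user-jane@example.com", "READER")
```

The address jane@example.com is a placeholder. For object ACLs the same idea applies with the shorter role list documented for insert_object_acl.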
insert_object_acl(self, bucket, object_name, entity, role, generation=None, user_project=None)
Creates a new ACL entry on the specified object. See: https://cloud.google.com/storage/docs/json_api/v1/objectAccessControls/insert
bucket (str) – Name of a bucket.
object_name (str) – Name of the object. For information about how to URL encode object names to be path safe, see: https://cloud.google.com/storage/docs/json_api/#encoding
entity (str) – The entity holding the permission, in one of the following forms: user-userId, user-email, group-groupId, group-email, domain-domain, project-team-projectId, allUsers, allAuthenticatedUsers. See: https://cloud.google.com/storage/docs/access-control/lists#scopes
role (str) – The access permission for the entity. Acceptable values are: “OWNER”, “READER”.
compose(self, bucket, source_objects, destination_object, num_retries=None)
Composes a list of existing objects into a new object in the same storage bucket.
Currently, at most 32 objects can be concatenated in a single operation.
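When more than 32 objects need to be joined, one possible approach is to plan several compose calls in sequence. The sketch below is an assumption about usage, not part of the hook: plan_compose_calls is a hypothetical planner that returns the source list for each call, with every call after the first starting from the partially built destination object.

```python
MAX_COMPOSE_SOURCES = 32  # limit stated in the documentation above


def plan_compose_calls(source_objects, destination_object, limit=MAX_COMPOSE_SOURCES):
    """Hypothetical planner: source lists for successive compose() calls."""
    if not source_objects:
        raise ValueError("source_objects must not be empty")
    calls = [list(source_objects[:limit])]
    remaining = list(source_objects[limit:])
    while remaining:
        # Later calls reuse the partial result as the first source and
        # append up to limit - 1 further objects.
        calls.append([destination_object] + remaining[:limit - 1])
        remaining = remaining[limit - 1:]
    return calls


parts = ["part-%d" % i for i in range(70)]
calls = plan_compose_calls(parts, "merged")
```

Each inner list would be passed as source_objects to one compose call with destination_object set to "merged". For 70 parts this yields three calls.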
Given a Google Cloud Storage URL (gs://<bucket>/<blob>), returns a tuple containing the corresponding bucket and blob.
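A minimal sketch of such a parser, built on the standard library, is shown below. The name parse_gcs_url and the choice to raise ValueError for malformed input are assumptions made for this example; the hook's own helper may differ.

```python
from urllib.parse import urlparse


def parse_gcs_url(gs_url):
    """Hypothetical parser: split gs://<bucket>/<blob> into (bucket, blob)."""
    parsed = urlparse(gs_url)
    if parsed.scheme != "gs" or not parsed.netloc:
        # Assumption: anything without a gs scheme and a bucket name is rejected.
        raise ValueError("Expected a URL of the form gs://<bucket>/<blob>")
    # The bucket is the network location; the blob is the path without
    # its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, blob = parse_gcs_url("gs://my-bucket/path/to/file.csv")
```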