Amazon Web Services (AWS) offers a vast portfolio of data-related services. Simple Storage Service (S3) is one of the most frequently used AWS services and serves as the underlying storage layer for many others, including Athena, RDS, and Lambda.

With S3 buckets, users can secure their data by implementing encryption such as server-side encryption or KMS, define other users' access permissions to the data, enable versioning to maintain object history, apply Object Lock to retain data, and more. Aside from accessing and managing the data from the S3 console, users can also access the S3 service programmatically by using the Boto3 SDK for Python.

Learn more: Getting Started With Buckets: Overview of Amazon Simple Storage Service

What is the Client Interface in Boto3?

Boto3 provides two ways of accessing and managing AWS services: the Client and the Resource interface. The Client abstraction of the Boto3 SDK is a low-level interface. Through the Client interface, users can access AWS resources via various built-in methods covering common operations such as creating a bucket, uploading and downloading files, and deleting objects. This tutorial presents a practical demonstration of the copy() method of the Boto3 SDK using the Client interface.
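For comparison, below is a minimal sketch of both interfaces; the listing calls are standard Boto3 APIs, and the output depends on the buckets in your account:

import boto3

# Low-level Client interface: direct, dictionary-based calls
s3_client = boto3.client('s3')
buckets = s3_client.list_buckets()['Buckets']
print(len(buckets), "buckets found via the Client interface")

# Higher-level Resource interface, shown only for comparison
s3_resource = boto3.resource('s3')
for bucket in s3_resource.buckets.all():
    print(bucket.name)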

Prerequisite

Before getting started with the implementation of the copy() method, there are some prerequisites for this tutorial: a configured AWS account and a local Python installation. The Boto3 setup and CLI sign-in are covered in the Initial Configurations section below.

What is the copy() Function in S3 Using Boto3?

The copy() method provided by the Client interface of the Boto3 SDK allows users to copy an object from one S3 location to another. An object can be copied to a different bucket in another AWS region, to a different bucket in the same region, or within the same bucket. However, the source and destination buckets must belong to the same AWS account; cross-account copying is not covered in this tutorial.

Initial Configurations

Before getting started, we will install the Boto3 SDK. For this purpose, create an empty folder and open it in VS Code:

Tap the “New file” option and give the file a name with the “.py” extension:

Open the terminal inside this folder by clicking the “Terminal” option in the toolbar, then tap the “New terminal” option from the drop-down list:

Run the following command in the terminal to install the Boto3 SDK:

pip install boto3

With the SDK successfully installed, log into the AWS account by running the following command in the terminal and providing the Access Key ID, Secret Access Key, and default region of the AWS account:

aws configure
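As an alternative to “aws configure”, credentials can also be passed directly when constructing the client. The following is a minimal sketch with placeholder values; note that hard-coding real credentials in source files is not recommended:

import boto3

# Placeholder credentials; prefer "aws configure" or environment variables
s3 = boto3.client(
    's3',
    aws_access_key_id='YOUR_ACCESS_KEY_ID',
    aws_secret_access_key='YOUR_SECRET_ACCESS_KEY',
    region_name='us-east-1',
)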

How to Use the copy() Function in S3 Using the Boto3 Client?

The copy() function of Boto3 is a managed transfer that performs a multipart copy in multiple threads when required. This section of the article provides various examples of the implementation of the copy() method with the Boto3 Client:

Syntax

The syntax of the copy() function is given below:

S3.Client.copy(CopySource, Bucket, Key, ExtraArgs=None, Callback=None, SourceClient=None, Config=None)

Below is a brief description of the parameters specified in the syntax:

  • CopySource: It is of dictionary type and accepts two arguments, Bucket and Key, which name the source bucket and the object that is to be copied.
  • Bucket: This field inputs the name of the destination bucket where the object is to be copied.
  • Key: This field refers to the name which is to be assigned to the copied object.
  • ExtraArgs: This field refers to the “Extra arguments” that can be specified by the user to control various aspects of the copy. The ExtraArgs are of dict type.
  • Callback: It is of function type and is called periodically with the number of bytes transferred while the copy is in progress (see the sketch after this list).
  • SourceClient: An optional Boto3/botocore S3 client to be used for operations performed on the source object, for example when the source bucket requires a different client configuration.
  • Config: This field refers to the transfer configuration that is used when performing the copy.
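Below is a minimal sketch of a copy that uses the Callback and Config parameters. The bucket and object names are placeholders, and the 8 MB multipart threshold is purely illustrative:

import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client('s3')

def progress(bytes_transferred):
    # Called periodically with the number of bytes moved in each chunk
    print(f"Copied another {bytes_transferred} bytes")

# Perform a multipart copy once the object exceeds 8 MB (illustrative value)
config = TransferConfig(multipart_threshold=8 * 1024 * 1024)

s3.copy(
    CopySource={'Bucket': 'examplesourcebucket', 'Key': 'example.txt'},
    Bucket='exampledestinationbucket',
    Key='example-copy.txt',
    Callback=progress,
    Config=config,
)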

Example 1: Copying the Object Within the Same Bucket 

For this example, we have already created an S3 bucket with the name “uploadwithinthesamebucketdemo”. This bucket contains a text file named “track.txt”. The code given below demonstrates how to copy an object within the same bucket programmatically using Boto3:

import boto3

# Create a low-level S3 client
s3 = boto3.client('s3')

# Source bucket and object to copy
copy_source = {
    'Bucket': 'uploadwithinthesamebucketdemo',
    'Key': 'track.txt'
}

# Copy the object into the same bucket under a new key
s3.copy(copy_source, 'uploadwithinthesamebucketdemo', 'copiedobject.txt')
print("successful")

Following is a brief description of the above-mentioned code:

  • copy_source: This field is of dictionary data type and takes two arguments as input. This dictionary is then passed to the copy() function.
  • Bucket: It represents the bucket from which the object is to be copied. Replace “uploadwithinthesamebucketdemo” with the name of an S3 bucket configured in your AWS account.
  • Key: It refers to the name of the object that is to be copied to the destination bucket. For this demo, we have provided “track.txt” as the key. Replace this name with the name of a file uploaded to your S3 bucket.
  • s3.copy(): This function of the Boto3 Client performs a multipart copy in multiple threads when required. It takes three arguments as input.
  • Destination bucket: As we are copying the object within the same bucket, we specify the name of the source bucket as the destination bucket. “uploadwithinthesamebucketdemo” refers to the destination bucket’s name. Replace this name with the name of your AWS bucket.
  • Key: The key in the copy() method refers to the name that will be assigned to the copied object in the destination bucket. Here, we have provided “copiedobject.txt” as the key name. Users can replace this name according to preference.

To verify if the object has been successfully copied within the same bucket, visit the S3 dashboard and click the name of the S3 bucket:

From the interface displayed, we can see that the object has been copied successfully. Furthermore, the object has been assigned the name “copiedobject.txt” as specified in the code:
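The copy can also be verified programmatically. Below is a minimal sketch using the head_object() method, which returns the object’s metadata if the copy exists and raises a ClientError otherwise:

import boto3

s3 = boto3.client('s3')

# Fetch the metadata of the freshly copied object
response = s3.head_object(
    Bucket='uploadwithinthesamebucketdemo',
    Key='copiedobject.txt'
)
print(response['ContentLength'], "bytes, last modified", response['LastModified'])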

Example 2: Copying the Object Between Different Buckets Within the Same Region

For this demo, we have already created two buckets. The first bucket serves as the source bucket which contains the object that is to be copied. The second bucket is the destination bucket that will receive the copy of the object. Both of these buckets share the same AWS region. Below is the sample code for implementing this function programmatically:

import boto3

# Create a low-level S3 client
s3 = boto3.client('s3')

# Source bucket and object to copy
copy_source = {
    'Bucket': 'newbucket191008',
    'Key': 'track.txt'
}

# Copy the object into a different bucket in the same region
s3.copy(copy_source, 'newbucket202319108', 'copiedobject.txt')
print("successful")

Below is a brief description of the code:

  • The dictionary copy_source and its key-value pairs, Bucket and Key, are configured the same way as in Example 1 of this section.
  • s3.copy(): This function of the Boto3 Client performs a multipart copy in multiple threads when required. It takes three arguments as input.
  • Destination bucket: The name “newbucket202319108” refers to the destination bucket.
  • Key: The key in the copy() method refers to the name that will be assigned to the copied object in the destination bucket. Here, we have provided “copiedobject.txt” as the key name. Users can replace this name according to preference.

Output

The Python file is executed using the following command. The output window displays the “successful” message as specified in the print statement upon the completion of the code:

python <FileName>

To verify if the object has been copied to the destination bucket, visit the S3 dashboard and click the name of the destination bucket. Tap the reload button if the object is not readily displayed, and the interface will display the copied object with the name specified by the user in the code:
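This check can also be scripted. The following minimal sketch lists the destination bucket’s contents with the list_objects_v2() method, using the bucket name from the example above:

import boto3

s3 = boto3.client('s3')

# List the destination bucket's contents to confirm the copy arrived
response = s3.list_objects_v2(Bucket='newbucket202319108')
for obj in response.get('Contents', []):
    print(obj['Key'], obj['Size'])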

Example 3: Copy an Object Between S3 Buckets Within Different Regions 

To implement this functionality, we have already created two buckets in different AWS regions. For this purpose, the user can select a different region from the “Regions” field while creating the S3 bucket:

An object has been uploaded to the source bucket (copysourcebucket). This object will then be copied to the destination bucket (copydestinationbucketdemo) using the Boto3 Client. For a demonstration of uploading an object to an S3 bucket, see this article: “How to Upload Objects in Amazon Simple Storage Service”. The code for implementing this functionality is given below:

import boto3

# Create a low-level S3 client
s3 = boto3.client('s3')

# Source bucket and object to copy
copy_source = {
    'Bucket': 'copysourcebucket',
    'Key': 'track.txt'
}

# Copy the object into a bucket located in a different region
s3.copy(copy_source, 'copydestinationbucketdemo', 'copiedindifferentregion.txt')
print("successful")

Note: Cross-region copying is supported only when both buckets belong to the same AWS account.

Below is a brief description of the code:

  • The “copy_source” dictionary and its key-value pairs, Bucket and Key, are configured the same way as in the previous examples of this section.
  • s3.copy(): This function of the Boto3 Client performs a multipart copy in multiple threads when required. It takes three arguments as input.
  • Destination bucket: The name “copydestinationbucketdemo” refers to the destination bucket.
  • Key: The key in the copy() method refers to the name that will be assigned to the copied object in the destination bucket. Here, we have provided “copiedindifferentregion.txt” as the key name. Users can replace this name according to preference.

Output

To execute the code file, use the below-mentioned command: 

python <FileName>

The output window displays the “successful” message as specified in the print statement upon the completion of the code:

To verify if the object has been copied to the destination bucket, visit the S3 dashboard and click the name of the destination bucket. Tap the reload button if the object is not readily displayed and it will display the copied object with the name specified in the code:
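When the buckets reside in different regions, a region-specific client for the source bucket can optionally be supplied through the SourceClient parameter. Below is a minimal sketch, assuming the source bucket lives in us-east-1 and the destination in eu-west-1 (both regions are placeholders):

import boto3

# Placeholder regions; substitute the regions of your own buckets
src_client = boto3.client('s3', region_name='us-east-1')
dst_client = boto3.client('s3', region_name='eu-west-1')

copy_source = {'Bucket': 'copysourcebucket', 'Key': 'track.txt'}

# The destination client performs the copy; SourceClient is used for
# operations that run against the source object
dst_client.copy(
    copy_source,
    'copydestinationbucketdemo',
    'copiedindifferentregion.txt',
    SourceClient=src_client
)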

How to Use the “copy_object()” Function in S3 Using the Boto3 Client?

The copy_object() function serves the same purpose with additional parameters. It offers distinct options for users to define permissions, read-write access, encryption, replication, storage classes, etc., for an object.

Syntax

The syntax for this function is given below:

response = client.copy_object(
    ACL='private'|'public-read'|'public-read-write'|'authenticated-read'|'aws-exec-read'|'bucket-owner-read'|'bucket-owner-full-control',
    Bucket='string',
    CacheControl='string',
    ChecksumAlgorithm='CRC32'|'CRC32C'|'SHA1'|'SHA256',
    ContentDisposition='string',
    ContentEncoding='string',
    ContentLanguage='string',
    ContentType='string',
    CopySource='string' or {'Bucket': 'string', 'Key': 'string', 'VersionId': 'string'},
    CopySourceIfMatch='string',
    CopySourceIfModifiedSince=datetime(2015, 1, 1),
    CopySourceIfNoneMatch='string',
    CopySourceIfUnmodifiedSince=datetime(2015, 1, 1),
    Expires=datetime(2015, 1, 1),
    GrantFullControl='string',
    GrantRead='string',
    GrantReadACP='string',
    GrantWriteACP='string',
    Key='string',
    Metadata={
        'string': 'string'
    },
    MetadataDirective='COPY'|'REPLACE',
    TaggingDirective='COPY'|'REPLACE',
    ServerSideEncryption='AES256'|'aws:kms'|'aws:kms:dsse',
    StorageClass='STANDARD'|'REDUCED_REDUNDANCY'|'STANDARD_IA'|'ONEZONE_IA'|'INTELLIGENT_TIERING'|'GLACIER'|'DEEP_ARCHIVE'|'OUTPOSTS'|'GLACIER_IR'|'SNOW',
    WebsiteRedirectLocation='string',
    SSECustomerAlgorithm='string',
    SSECustomerKey='string',
    SSEKMSKeyId='string',
    SSEKMSEncryptionContext='string',
    BucketKeyEnabled=True|False,
    CopySourceSSECustomerAlgorithm='string',
    CopySourceSSECustomerKey='string',
    RequestPayer='requester',
    Tagging='string',
    ObjectLockMode='GOVERNANCE'|'COMPLIANCE',
    ObjectLockRetainUntilDate=datetime(2015, 1, 1),
    ObjectLockLegalHoldStatus='ON'|'OFF',
    ExpectedBucketOwner='string',
    ExpectedSourceBucketOwner='string'
)

Here is a brief description of the above-mentioned parameters; a combined usage sketch follows the list:

  • ACL: The ACL (Access Control List) allows users to manage access to an S3 bucket and its content. By specifying this option in the copy_object() function, the user can apply a set of predefined permissions, such as private, public-read, public-read-write, or authenticated-read, to the copied object.
  • Bucket: This field is of string data type and is used to input the name of the destination bucket where the object will be copied. Bucket names are globally unique across the AWS platform. It is a required parameter.
  • CacheControl: This field is used to define the caching behavior of the request/response chain. 
  • ChecksumAlgorithm: This field specifies the name of the algorithm that will be used to create the checksum for the object. It is of string data type.
  • ContentDisposition: This field specifies presentational information for the object, such as whether it should be displayed inline or downloaded as an attachment.
  • ContentEncoding: This field specifies the type of encoding that applies to the content of the object. Hence, it also determines the kind of decoding mechanism to apply in order to obtain the value of the content.
  • ContentLanguage: It defines the language of the content. 
  • ContentType: This field refers to a standard MIME format for the object data. 
  • CopySource: It is a required parameter and inputs the name of the source bucket and the key of the object that is to be copied, either as a string or as a dictionary.
  • CopySourceIfMatch: This field applies a condition to the copy operation: the object is copied only if its entity tag (ETag) matches the specified tag.
  • CopySourceIfModifiedSince: This field applies a time-based condition: the object is copied only if it has been modified since the specified time.
  • CopySourceIfNoneMatch: This field only copies the object if the entity tag is different from the specified tag. 
  • CopySourceIfUnmodifiedSince: This field applies the constraint on the time of the object. It only allows the user to copy the object if no modification has been made since the specified time. 
  • Expires: This defines the date and time after which the object is no longer cacheable.
  • GrantFullControl: This field is used to assign certain permissions to other users of the bucket e.g.,  “READ”, “WRITE”, “READ_ACP”, “WRITE_ACP”, “FULL_CONTROL”. 
  • GrantRead: This allows other users to list the contents of the bucket. 
  • GrantReadACP: This enables the users to read the bucket ACL. 
  • GrantWriteACP: Allows the grantee to write the ACL of the copied object.
  • Key: This field refers to the name that is to be assigned to the copied object within the destination bucket. It is of string data type.
  • Metadata: This field refers to the metadata which is to be stored with every object. It is of dictionary type and contains information about the different aspects of the object. 
  • MetadataDirective: This field determines whether the object’s metadata is copied from the source object or replaced with the metadata provided in the request.
  • TaggingDirective: This field determines whether the object’s tag set is copied from the source object or replaced with the tag set provided in the request.
  • ServerSideEncryption: The user can specify and apply different kinds of encryption on the content of the S3 bucket such as AES256, KMS, etc.
  • StorageClass: Multiple storage classes are available in S3. Users can choose among them and specify the name of the desired storage class in this field.
  • WebsiteRedirectLocation: If the bucket is configured as a website, this field redirects requests for this object to another object in the same bucket or to an external URL. Note that this value is stored in the object’s metadata and is not copied.
  • SSECustomerAlgorithm: This field specifies the usage of encryption algorithms such as AES256 for the content of the bucket. 
  • SSECustomerKey: This field inputs the value for the encryption key which is provided by the customer. Note that the S3 bucket does not store the encryption key. However, the value provided must be in accordance with the specified algorithm for the encryption.
  • SSECustomerKeyMD5: This field is automatically populated if not specified by the user. It is not a required parameter and is used to specify the 128-bit MD5 digest of the encryption key in compliance with RFC 1321. This field allows S3 to verify that the encryption key was transmitted without error.
  • SSEKMSKeyId: Aside from the server-side encryption, users can also implement encryption using Key Management Service. This field is used to specify the KMS ID for encryption. 
  • SSEKMSEncryptionContext: This field specifies the KMS encryption context for object encryption.
  • BucketKeyEnabled: This field is of boolean data type. It is used to specify if S3 should use the S3 Bucket key for object encryption along with server-side encryption or KMS encryption. 
  • CopySourceSSECustomerAlgorithm: This field specifies the algorithm to use when decrypting the source object, for example, AES256.
  • CopySourceSSECustomerKey: This field specifies the customer-provided encryption key for the decryption of the source object. This key must be the same key that was specified when the source object was created.
  • CopySourceSSECustomerKeyMD5: This field is automatically populated if not specified by the user. It is not a required parameter and is used to specify the 128-bit MD5 digest of the encryption key in compliance with RFC1321. This field determines if the encryption key was copied without any error occurrence. 
  • RequestPayer: This field confirms that the requester agrees to be charged for the request. It applies when Requester Pays is enabled on the source or destination bucket.
  • Tagging: This field follows the encoding scheme of the URL Query parameters. It is used to specify the tag set for the copied object. 
  • ObjectLockMode: This field specifies the object lock which is to be enabled on the copied object. 
  • ObjectLockRetainUntilDate: This field defines the retention date for the object lock on the copied object. After the specified date, the retention period expires.
  • ObjectLockLegalHoldStatus: It is used to specify if the user wants to apply the legal hold to the copied object.
  • ExpectedBucketOwner: This field specifies the account ID of the expected destination bucket owner. If the destination bucket is owned by a different AWS account, the request fails with the 403 status code.
  • ExpectedSourceBucketOwner: This field specifies the account ID of the expected source bucket owner. If the source bucket is owned by a different AWS account, the request fails with the 403 status code.
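As a combined illustration of several of these parameters, the sketch below copies an object while replacing its metadata, changing its storage class, and requesting server-side encryption. All bucket and object names are placeholders:

import boto3

s3 = boto3.client('s3')

s3.copy_object(
    Bucket='exampledestinationbucket',
    CopySource={'Bucket': 'examplesourcebucket', 'Key': 'example.txt'},
    Key='example-copy.txt',
    MetadataDirective='REPLACE',       # replace rather than copy the metadata
    Metadata={'project': 'demo'},      # new user-defined metadata
    StorageClass='STANDARD_IA',        # store the copy in Standard-IA
    ServerSideEncryption='AES256'      # encrypt the copy with SSE-S3
)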

Example: Copy an Object Between S3 Buckets Within the Same Region

For this demo, we are running two S3 buckets. These two buckets will be used as source and destination buckets. Below is the sample code for implementing this function programmatically:

import boto3

# Create a low-level S3 client
s3 = boto3.client('s3')

# Copy track.txt from the source bucket into the destination bucket
s3.copy_object(
    Bucket="newbucket202319108",
    CopySource="/newbucket191008/track.txt",
    Key="newcopiedobject.txt"
)

print("Successfully copied")

Following is a brief description of the code:

  • s3.copy_object(): This function accepts different parameters and is used for copying the object from one bucket to another.
  • ACL: This optional field is not set in this demo; specifying a value such as “public-read” would allow other users to read the copied object.
  • Bucket: This field specifies the destination bucket’s name.
  • CopySource: In this field, a slash (/) symbol is followed by the name of the source bucket, another slash, and the name of the object that is to be copied.
  • Key: This field refers to the user-defined name for the copied object.

Output

To execute the code, use the following command: 

python <FileName>

The output window displays the “Successfully copied” message as specified in the print statement upon completion of the code:

To verify if the object has been copied to the destination bucket, visit the S3 dashboard and click the name of the destination bucket. Tap the reload button if the object is not readily displayed. After reloading the interface, it will display the copied object with the name specified by the user in the code:

Similarly, copy_object() also supports copying data within the same bucket or between buckets that lie in different regions. The user has to provide the names of the source and destination buckets in the “CopySource” and “Bucket” fields.
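Whichever variant is used, copy failures such as a missing source object or a 403 caused by a bucket-owner mismatch surface as exceptions. Below is a minimal sketch, reusing the bucket names from the example above, that catches them with botocore’s ClientError:

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')

try:
    s3.copy_object(
        Bucket="newbucket202319108",
        CopySource="/newbucket191008/track.txt",
        Key="newcopiedobject.txt"
    )
    print("Successfully copied")
except ClientError as err:
    # e.g. "NoSuchKey" if the source object is missing, "AccessDenied" for 403s
    print("Copy failed:", err.response['Error']['Code'])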

Conclusion 

To use the copy() method in Boto3, specify parameters such as the bucket name, the object to copy, and the key name, and execute the code file via the mentioned commands. The copy() method supports copying an object within the same bucket, between buckets in the same region, and between buckets located in different regions. Before executing the code, it is important to log into the AWS account via the terminal. This article provided a practical implementation of copying an object using the copy() method of the Boto3 Client interface.