Amazon S3 Python delete

Simon Bulmer
2 min read · Oct 5, 2021

There are a handful of options available to AWS users when they want to delete objects in an S3 bucket. For the task I had, I took the opportunity to write something in Python using the boto3 library, and although it is very basic, it did the job.

There were literally thousands of objects to remove from a number of prefixes in the target bucket, and because versioning was enabled, I also had to make sure the script deleted every object version as well.

import sys

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket("simon-bucket")
prefix = sys.argv[1]  # key prefix to delete, passed on the command line

# Delete the current objects under the prefix
for object_summary in bucket.objects.filter(Prefix=prefix):
    object_summary.delete()
    print(object_summary.key)

# Then delete every object version under the prefix
for version_summary in bucket.object_versions.filter(Prefix=prefix):
    version_summary.delete()
    print(version_summary.key)

As stated at the start, this was a very basic Python script with no error handling. After working through a number of the objects, the script burst into flames (added for dramatic effect) and stopped working. After a quick review, I found that the S3 bucket contained prefixes with special characters, and the script simply did not know how to handle them.
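To illustrate what basic error handling could look like here, the following is a minimal sketch rather than the script that was actually run. The helper name delete_each and its log parameter are made up for this example; it only assumes each item has a .key attribute and a .delete() method, as the boto3 object and version summaries in the script do.

```python
def delete_each(items, log=print):
    """Call .delete() on every item and collect failures instead of stopping.

    `items` is any iterable of objects exposing `.key` and `.delete()`,
    for example bucket.objects.filter(Prefix=...) from boto3.
    `delete_each` and `log` are hypothetical names used for illustration.
    """
    failures = []
    for item in items:
        try:
            item.delete()
        except Exception as exc:
            # Broad catch keeps the sketch free of a boto3 import; with boto3,
            # botocore.exceptions.ClientError would be the narrower choice.
            failures.append((item.key, repr(exc)))
            log(f"FAILED {item.key}: {exc}")
        else:
            log(item.key)
    return failures


# Hypothetical usage against the bucket from the script above:
#   import boto3
#   bucket = boto3.resource("s3").Bucket("simon-bucket")
#   failures = delete_each(bucket.objects.filter(Prefix="some/prefix/"))
#   failures += delete_each(bucket.object_versions.filter(Prefix="some/prefix/"))
```

With something like this, one failing key is reported and the run carries on, and the returned list shows exactly which keys need a second look.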

In other words, it is similar to executing a terminal command containing spaces, where the terminal does not know where the filename starts and ends, e.g. aws s3 cp s3://simon-bucket/this is^a file"name.txt s3://simon-bucket1/this is^also a file"name.txt.

I know the above example is more of a copy statement, but it’s the only thing that popped into my head at the time of writing.
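To make the terminal analogy concrete, here is a small sketch using Python's standard shlex module to quote the two example paths so that a POSIX shell would treat each one as a single argument. It only illustrates the quoting problem in the copy example; it is not a claim about what caused the error in the delete script.

```python
import shlex

# The two example paths from the copy command above
src = 's3://simon-bucket/this is^a file"name.txt'
dst = 's3://simon-bucket1/this is^also a file"name.txt'

# shlex.quote wraps each argument so a POSIX shell reads it as one word
command = " ".join(["aws", "s3", "cp", shlex.quote(src), shlex.quote(dst)])
print(command)

# Splitting the quoted command again recovers the original five arguments
assert shlex.split(command) == ["aws", "s3", "cp", src, dst]
```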

So, I fiddled around with it for a little while and kept getting the same error. Eventually, I tried adding a couple of if / elif statements containing the special characters, and it seemed to work. Is this the correct way of handling this? Possibly not! Did it work for me? Yes.

import sys

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket("simon-bucket")
prefix = sys.argv[1]  # key prefix to delete, passed on the command line

# Delete the current objects, checking for the special characters first
for object_summary in bucket.objects.filter(Prefix=prefix):
    if 'is^' in object_summary.key:
        object_summary.delete()
    elif 'file"' in object_summary.key:
        object_summary.delete()
    else:
        object_summary.delete()
    print(object_summary.key)

# Repeat the same checks for every object version under the prefix
for version_summary in bucket.object_versions.filter(Prefix=prefix):
    if 'is^' in version_summary.key:
        version_summary.delete()
    elif 'file"' in version_summary.key:
        version_summary.delete()
    else:
        version_summary.delete()
    print(version_summary.key)
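Each branch above ends in the same delete() call, so one alternative worth sketching is to skip the per-key checks and send the keys to S3 in batches. The sketch below assumes a client that behaves like boto3.client("s3") and uses its list_object_versions paginator and delete_objects call (which accepts up to 1,000 keys per request). The function name delete_all_versions and the batch_size parameter are made up for this example, and it has not been run against the bucket in this post.

```python
def delete_all_versions(client, bucket, prefix, batch_size=1000):
    """Delete every version and delete marker under `prefix`, in batches.

    `client` is assumed to behave like boto3.client("s3").
    Returns the list of per-key errors reported in the responses.
    """
    errors = []
    batch = []

    def flush():
        # Send whatever is queued as a single DeleteObjects request
        if not batch:
            return
        response = client.delete_objects(
            Bucket=bucket,
            Delete={"Objects": list(batch), "Quiet": True},
        )
        errors.extend(response.get("Errors", []))
        batch.clear()

    paginator = client.get_paginator("list_object_versions")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page may hold versions, delete markers, both, or neither
        for entry in page.get("Versions", []) + page.get("DeleteMarkers", []):
            batch.append({"Key": entry["Key"], "VersionId": entry["VersionId"]})
            if len(batch) == batch_size:
                flush()
    flush()  # send the final partial batch
    return errors


# Hypothetical usage:
#   import boto3
#   errors = delete_all_versions(boto3.client("s3"), "simon-bucket", "some/prefix/")
```

Batching cuts the number of requests considerably when there are thousands of objects, and the returned error list gives a per-key record of anything S3 refused to delete.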

Not the best Python script in the world by far, but it did what I needed it to do. I mostly execute my scripts from a laptop and prefix the command with aws-runas to handle my AWS session. The script takes a single argument that sets the prefix, which let me call it from a bash script, but that’s not to say the bucket variable could not be made an argument as well.
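As a sketch of how the bucket could become an argument alongside the prefix, the snippet below uses the standard argparse module. The --bucket flag and the parse_args helper are hypothetical additions for illustration; the original script only reads sys.argv[1].

```python
import argparse


def parse_args(argv=None):
    """Parse the prefix (required) and an optional bucket override."""
    parser = argparse.ArgumentParser(
        description="Delete S3 objects and versions under a prefix."
    )
    parser.add_argument("prefix", help="key prefix to delete")
    # Hypothetical flag: defaults to the bucket hard-coded in the script
    parser.add_argument("--bucket", default="simon-bucket", help="target bucket name")
    return parser.parse_args(argv)


# Passing a list makes the parser easy to try without a real command line
args = parse_args(["some/prefix/", "--bucket", "another-bucket"])
print(args.bucket, args.prefix)
```

Called with only a prefix, args.bucket falls back to simon-bucket, so an existing bash script that passes a single argument would keep working.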

At the moment this is just sat on my laptop collecting dust, and I haven’t really spent any more time on it. I can definitely improve it moving forward; I just need to find the time to do so. In the meantime, I am happy to share it.

Written by Simon Bulmer

Senior Site Reliability Engineer, Cyclist and occasional runner
