Objectives

Gain some experience using boto3 to work with EC2 instances and S3 buckets

In this lab we will:

  • Create/Stop/Terminate EC2 instances using boto3
  • List/modify EC2 instance attributes
  • Create/Delete an S3 bucket using boto3
  • Create/delete objects in an S3 bucket
  • List/modify S3 bucket attributes

Getting started with Boto3

Boto3 is a Python interface to Amazon Web Services. You can find tutorials and the full API reference at https://boto3.readthedocs.io/en/latest/. The following steps are intended to get you started with some basic EC2 activities in the Python interpreter. If you have not already done so, you must install and configure the AWS CLI and boto3 – refer to the instructions in Part C here.

boto3 from Python3 interpreter

  1. Initially we will use the Python interpreter to give a little flavor of what can be done with Boto3. Then you can proceed to a lab exercise that walks through the basic creation, viewing and deletion of resources in the EC2 and S3 services.

  2. First, make sure that you can communicate with AWS from the Python3 interpreter:

    $ python3
    >>> import boto3
    >>> ec2 = boto3.resource('ec2')
  3. This returns an EC2 service resource object, which will allow us to interact with the EC2 service. For example, this object contains an instances collection manager, which allows us to iterate across a collection of instance objects (in this case we just print the IDs of all instances):
    >>> for inst in ec2.instances.all():
    ...     print (inst.id)
  4. Sometimes it might be convenient to manage the instances using a list. The following will copy the instance objects into a list:
    >>> instance_list = []
    >>> for inst in ec2.instances.all():
    ...     instance_list.append(inst)
  5. Now we can access individual instances if we wish using instance_list[0], instance_list[1], etc:
    >>> print (instance_list[0].image_id)
    >>> instance_list[0].stop()
    Note that this last line stops a running instance.
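  6. The instances collection also supports server-side filtering. As a small sketch (instance-state-name is the standard EC2 filter for instance state), the following prints only the instances that are currently running:
    >>> running = ec2.instances.filter(
    ...     Filters=[{'Name': 'instance-state-name', 'Values': ['running']}])
    >>> for inst in running:
    ...     print (inst.id)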

Create an EC2 instance using boto3

  1. Now let’s create an instance:
    >>> new_instance = ec2.create_instances(
    ...     ImageId='ami-0fad7378adf284ce0',
    ...     MinCount=1,
    ...     MaxCount=1,
    ...     InstanceType='t2.micro')
    The image ami-0fad7378adf284ce0 is a public Amazon Linux 2 (64-bit x86) image in the EU West (Ireland) region. If you now go to the web console, you should be able to see this instance up and running. The above actually returns a list of instances, in this case with one entry, which we can access as follows:
    >>> print (new_instance[0].id)
    i-5811234b1bacfaa45
    >>> print (new_instance[0].state['Name'])
    pending
    >>> new_instance[0].reload()
    >>> print (new_instance[0].state['Name'])
    running
    >>> new_instance[0].stop()
    >>> new_instance[0].reload()
    >>> print (new_instance[0].state['Name'])
    stopping
    Note that the above is a simplified use of create_instances(), with only the required fields ImageId, MinCount and MaxCount included, as well as the optional InstanceType. If you did not specify the t2.micro instance type, an m1.small instance would be created, which is more costly. All other instance creation parameters are set to their default values. You can see the full specification for create_instances() at http://boto3.readthedocs.io/en/latest/reference/services/ec2.html#EC2.ServiceResource.create_instances. For practical instance creation you would also need to specify a security group for your instances; you might first need to create this security group with the create_security_group() method.
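    Rather than calling reload() and checking the state yourself, you can also use the waiters that boto3 attaches to the instance resource. A minimal sketch (run just after create_instances(); it blocks until EC2 reports the instance as running):
    >>> new_instance[0].wait_until_running()
    >>> new_instance[0].reload()
    >>> print (new_instance[0].state['Name'])
    running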

List instances

  1. All of the code for the remaining scripts can be downloaded here. For our first script, we list the instances we have running in EC2. We can get this information with just a few short lines of code. First, we’ll import the boto3 library. Using the library, we’ll create an EC2 resource. This is like a handle to the EC2 console that we can use in our script. Finally, we’ll use the EC2 resource to get all of the instances and then print their instance ID and state. Here’s what the script looks like:
    #!/usr/bin/env python3
    import boto3
    ec2 = boto3.resource('ec2')
    for instance in ec2.instances.all():
        print (instance.id, instance.state)
    Save (or download) this as list_instances.py and change the mode to executable. That will allow you to run the script directly from the command line – remember to use chmod +x on the remaining scripts to get them running. Of course, you can edit these files if you want to try out different boto3 features.
    $ nano list_instances.py
    $ chmod +x list_instances.py
    $ ./list_instances.py
    If you haven’t created any instances, running this script won’t produce any output. So let’s fix that by moving on to the next step and creating some instances.
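    The instance objects expose many more attributes than just the ID and state. As an illustrative variation of list_instances.py (attribute names as in the boto3 Instance resource; public_ip_address will be None for instances that are not running), you could print a little more detail:
    #!/usr/bin/env python3
    import boto3
    ec2 = boto3.resource('ec2')
    for instance in ec2.instances.all():
        # show the type, state name and public IP alongside the ID
        print (instance.id, instance.instance_type,
               instance.state['Name'], instance.public_ip_address)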

Create an instance

  1. One of the key pieces of information we need for scripting EC2 is an Amazon Machine Image (AMI) ID. This will let us tell our script what type of EC2 instance to create. While getting an AMI ID can be done programmatically, that's an advanced topic beyond the scope of this tutorial. For now, let’s go back to the AWS console and get an ID from there. In the AWS console, go to the EC2 service and click the “Launch Instance” button. On the next screen, you’re presented with a list of AMIs you can use to create instances. Let’s focus on the Amazon Linux 2 AMI at the very top of the list (for the 64-bit x86 architecture). Make a note of the AMI ID to the right of the name. In this example, it is “ami-0fad7378adf284ce0”. That’s the value we need for our script. Note that AMI IDs differ across regions and are updated often, so the latest ID for the Amazon Linux AMI may be different for you.
  2. Now with the AMI ID, we can complete our script. Following the pattern from the previous script, we’ll import the boto3 library and use it to create an EC2 resource. Then we’ll call the create_instances() function, passing in the image ID, max and min counts, and the instance type. We can capture the output of the function call, which is a list of instance objects. For reference, we can print the first instance’s ID.
    #!/usr/bin/env python3
    import boto3
    ec2 = boto3.resource('ec2')
    instance = ec2.create_instances(
        ImageId='ami-0fad7378adf284ce0',
        MinCount=1,
        MaxCount=1,
        InstanceType='t2.micro')
    print (instance[0].id)
    While the command will finish quickly, it will take some time for the instance to be created. Run the list_instances.py script several times to see the state of the instance change from pending to running.  
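    As an aside, MinCount and MaxCount control how many instances a single call is allowed to launch, and create_instances() returns one instance object per instance created. A small sketch launching two instances at once (remember to terminate both afterwards to avoid unnecessary charges):
    #!/usr/bin/env python3
    import boto3
    ec2 = boto3.resource('ec2')
    instances = ec2.create_instances(
        ImageId='ami-0fad7378adf284ce0',
        MinCount=2,
        MaxCount=2,
        InstanceType='t2.micro')
    for instance in instances:
        # one ID is printed per launched instance
        print (instance.id)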

Terminate an instance

  1. Now that we can programmatically create and list instances, we also need a method to terminate them. For this script, we’ll follow the same pattern as before with importing the boto3 library and creating an EC2 resource. But we’ll also take one parameter: the ID of the instance to be terminated. To keep things simple, we’ll consider any argument to the script to be an instance ID. We’ll use that ID to get a connection to the instance from the EC2 resource and then call the terminate() function on that instance. Finally, we print the response from the terminate function. Here’s what the script looks like:
    #!/usr/bin/env python3
    import sys
    import boto3
    ec2 = boto3.resource('ec2')
    for instance_id in sys.argv[1:]:
        instance = ec2.Instance(instance_id)
        response = instance.terminate()
        print (response)
    Run the list_instances.py script to see what instances are available. Note one of the instance IDs to use as input to the terminate_instances.py script. After running the terminate script, we can run the list instances script to confirm the selected instance was terminated. That process looks something like this:
    $ ./list_instances.py
    i-0c34e5ec790618146 {u'Code': 16, u'Name': 'running'}
    $ ./terminate_instances.py i-0c34e5ec790618146
    {u'TerminatingInstances': [{u'InstanceId': 'i-0c34e5ec790618146', u'CurrentState': {u'Code': 32, u'Name': 'shutting-down'}, u'PreviousState': {u'Code': 16, u'Name': 'running'}}], 'ResponseMetadata': {'RetryAttempts': 0, 'HTTPStatusCode': 200, 'RequestId': '55c3eb37-a8a7-4e83-945d-5c23358ac4e6', 'HTTPHeaders': {'transfer-encoding': 'chunked', 'vary': 'Accept-Encoding', 'server': 'AmazonEC2', 'content-type': 'text/xml;charset=UTF-8', 'date': 'Sun, 01 Jan 2017 00:07:20 GMT'}}}
    $ ./list_instances.py
    i-0c34e5ec790618146 {u'Code': 48, u'Name': 'terminated'}

Note that if you do not provide an instance ID to the terminate_instances.py program then nothing will happen. You should modify this program to inform the user that an instance ID must be provided as a parameter when running this program.
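
One possible way to add that check near the top of terminate_instances.py, just after the imports (a sketch – the wording of the message is up to you):

if len(sys.argv) < 2:
    # no instance ID supplied – tell the user and stop
    print ("Usage: terminate_instances.py instance-id [instance-id ...]")
    sys.exit(1)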

S3 scripts - list buckets and their contents

  1. The AWS Simple Storage Service (S3) provides object storage, loosely similar to a file system: buckets play the role of top-level folders, and the objects stored in a bucket are identified by keys. Of course, all of these objects can be managed with Python and the boto3 library.
  2. Our first S3 script will let us see what buckets currently exist in our account and any keys inside those buckets. Of course, we’ll import the boto3 library. Then we can create an S3 resource. Remember, this gives us a handle to all of the functions provided by the S3 console. We can then use the resource to iterate over all buckets. For each bucket, we print the name of the bucket and then iterate over all the objects inside that bucket. For each object, we print the object’s key, which is essentially the object’s name. The code looks like this:
    #!/usr/bin/env python3
    import boto3
    s3 = boto3.resource('s3')
    for bucket in s3.buckets.all():
        print (bucket.name)
        print ("---")
        for item in bucket.objects.all():
            print ("\t%s" % item.key)
    If you don’t have any buckets when you run this script, you won’t see any output. However, by default RosettaHub creates a number of administrative buckets which you do not have permission to access. This will cause the above code to fail when it tries to list the objects within such a bucket. You should modify the code above to handle this error, i.e. print the error and continue on.

Now let’s create a bucket or two and then upload some files into them.

Create a Bucket

In our bucket creation script, let's import the boto3 library (and the sys library too, for command line arguments) and create an S3 resource. We’ll consider each command line argument as a bucket name and then, for each argument, create a bucket with that name. We can make our scripts a bit more robust by using Python’s try and except features. If we wrap our call to the create_bucket() function in a try: block, we can catch any errors that might occur. If our bucket creation goes well, we simply print the response. If an error is encountered, we print the error message and carry on gracefully. Here’s what that script looks like:

#!/usr/bin/env python3
import sys
import boto3
s3 = boto3.resource("s3")
for bucket_name in sys.argv[1:]:
    try:
        response = s3.create_bucket(Bucket=bucket_name,
                                    CreateBucketConfiguration={'LocationConstraint': 'eu-west-1'})
        print (response)
    except Exception as error:
        print (error)

Creating a bucket is easy but comes with some rules and restrictions. To get the complete run-down, read the Bucket Restrictions and Limitations section in the S3 documentation. The two rules that need to be emphasized for this example are 1) bucket names must be globally unique and 2) bucket names must follow DNS naming conventions. Note that we needed to specify a LocationConstraint; otherwise the S3 bucket would have been created in the default region (us-east-1) rather than eu-west-1. When choosing a bucket name, pick one that you are sure hasn’t been used before and only use lowercase letters, numbers, and hyphens. Because simple bucket names like “my_bucket” are usually not available, a good way to get a unique bucket name is to use a name, a number, and the date. For example:

$ ./create_bucket.py projectx-bucket1-$(date +%F-%s)
s3.Bucket(name='projectx-bucket1-2019-01-28-1548716885')
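
If you prefer to build a unique name inside Python rather than on the shell command line, here is a quick sketch using the standard time module (the projectx-bucket1 prefix is just an example):

#!/usr/bin/env python3
import time
import boto3
s3 = boto3.resource("s3")
# e.g. projectx-bucket1-2019-01-28-1548716885
bucket_name = "projectx-bucket1-%s-%d" % (time.strftime("%Y-%m-%d"), int(time.time()))
bucket = s3.create_bucket(Bucket=bucket_name,
                          CreateBucketConfiguration={'LocationConstraint': 'eu-west-1'})
print (bucket.name)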

Now we can run the list_buckets.py script again to see the buckets we created.

$ ./list_buckets.py
projectx-bucket1-2019-01-28-1548716885
---

Our buckets are created but they’re empty. Let’s put some files into these buckets.

S3 scripts - Put a File into a Bucket

  1. Similar to our bucket creation script, we start the put script by importing the sys and boto3 libraries and then creating an S3 resource. Now we need to capture the name of the bucket we’re putting the file into and the name of the file as well. We’ll consider the first argument to be the bucket name and the second argument to be the file name. To keep with robust scripting, we’ll wrap the call to the put() function in a try: block and print the response if all goes well. If anything fails, we’ll print the error message. That script comes together like this:
    #!/usr/bin/env python3
    import sys
    import boto3
    s3 = boto3.resource("s3")
    bucket_name = sys.argv[1]
    object_name = sys.argv[2]
    try:
        response = s3.Object(bucket_name, object_name).put(Body=open(object_name, 'rb'))
        print (response)
    except Exception as error:
        print (error)
    For testing, we can create some empty files and then use the put_bucket.py script to upload each file into our target bucket.
$ touch file{1,2,3,4}.txt
$ ./put_bucket.py projectx-bucket1-2019-01-28-1548716885 file1.txt
{'ETag': '"d41d8cd98f00b204e9800998ecf8427e"', 'ResponseMetadata': {'RetryAttempts': 0, 'HostId': 'kNuvbQgpXXM6klZS8Dx3wId1OpGDokx3bzWP7+kwza8BlTCzV1b2FPVgb0sCwoz4WdDXDY9GrLc=', 'RequestId': '0488B4D01E30235A', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-length': '0', 'x-amz-request-id': '0488B4D01E30235A', 'x-amz-id-2': 'kNuvbQgpXXM6klZS8Dx3wId1OpGDokx3bzWP7+kwza8BlTCzV1b2FPVgb0sCwoz4WdDXDY9GrLc=', 'server': 'AmazonS3', 'date': 'Mon, 28 Jan 2019 23:14:35 GMT', 'etag': '"d41d8cd98f00b204e9800998ecf8427e"'}}}
$ ./put_bucket.py projectx-bucket1-2019-01-28-1548716885 file2.txt

$ ./put_bucket.py projectx-bucket1-2019-01-28-1548716885 file3.txt

$ ./put_bucket.py projectx-bucket1-2019-01-28-1548716885 file4.txt

$ ./list_buckets.py
projectx-bucket1-2019-01-28-1548716885
---
        file1.txt
        file2.txt
        file3.txt
        file4.txt

Success! We’ve created a bucket and uploaded some files into it. Now let’s go in the opposite direction, deleting objects and then finally, deleting the bucket.
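
Before moving on, it is worth noting as an aside that boto3 also provides a higher-level upload helper on the Bucket resource, which opens the file and handles multipart uploads of large files for you. A sketch of put_bucket.py rewritten around it (upload_file returns nothing, so we print a simple confirmation instead of a response):

#!/usr/bin/env python3
import sys
import boto3
s3 = boto3.resource("s3")
bucket_name = sys.argv[1]
object_name = sys.argv[2]
try:
    # upload_file takes the local filename and the key to store it under
    s3.Bucket(bucket_name).upload_file(object_name, object_name)
    print ("uploaded %s to %s" % (object_name, bucket_name))
except Exception as error:
    print (error)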

S3 scripts - Delete Bucket Contents

  1. For our delete script, we’ll start the same as our create script: importing the needed libraries, creating an S3 resource, and taking bucket names as arguments. To keep things simple, we’ll delete all the objects in each bucket passed in as an argument. We’ll wrap the call to the delete() function in a try: block to make sure we catch any errors. Our script looks like this:
    #!/usr/bin/env python3
    import sys
    import boto3
    s3 = boto3.resource('s3')
    for bucket_name in sys.argv[1:]:
        bucket = s3.Bucket(bucket_name)
        for key in bucket.objects.all():
            try:
                response = key.delete()
                print (response)
            except Exception as error:
                print (error)

If we save this as delete_contents.py and run the script on our example bucket, the output should look like this:

$ ./delete_contents.py projectx-bucket1-2019-01-28-1548716885
{'ResponseMetadata': {'HTTPStatusCode': 204, 'RetryAttempts': 0, 'HostId': 'T9pKsldJC7nseqjiipucF2ziBCXK+MAn1KkMKaGOQa+XEeKujDeGf/V7PLqlhgMRVC8Yfmop2p4=', 'RequestId': '8E5E41563544570F', 'HTTPHeaders': {'date': 'Mon, 28 Jan 2019 23:18:39 GMT', 'x-amz-id-2': 'T9pKsldJC7nseqjiipucF2ziBCXK+MAn1KkMKaGOQa+XEeKujDeGf/V7PLqlhgMRVC8Yfmop2p4=', 'server': 'AmazonS3', 'x-amz-request-id': '8E5E41563544570F'}}}
.......

Now if we run the list_buckets.py script again, we’ll see that our bucket is indeed empty.

$ ./list_buckets.py
projectx-bucket1-2019-01-28-1548716885
---
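
As an aside, the objects collection used in delete_contents.py also supports a batch delete action, so the per-object loop can be replaced by a single call. A sketch (the collection issues the underlying delete requests in batches and returns their responses):

#!/usr/bin/env python3
import sys
import boto3
s3 = boto3.resource('s3')
for bucket_name in sys.argv[1:]:
    try:
        response = s3.Bucket(bucket_name).objects.all().delete()
        print (response)
    except Exception as error:
        print (error)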

Delete a Bucket

Our delete bucket script looks a lot like our delete object script. The same libraries are imported and the arguments are taken to be bucket names. We use the S3 resource to attach to a bucket with the specific name and then in our try: block, we call the delete() function on that bucket, catching the response. If the delete worked, we print the response. If not, we print the error message. Here’s the script:

#!/usr/bin/env python3
import sys
import boto3
s3 = boto3.resource('s3')
for bucket_name in sys.argv[1:]:
    bucket = s3.Bucket(bucket_name)
    try:
        response = bucket.delete()
        print (response)
    except Exception as error:
        print (error)

One important thing to note when attempting to delete a bucket is that the bucket must be empty first. If there are still objects in a bucket when you try to delete it, an error will be reported and the bucket will not be deleted. Running our delete_buckets.py script on our target bucket produces the following output:

$ ./delete_buckets.py projectx-bucket1-2019-01-28-1548716885
{'ResponseMetadata': {'HostId': 'mMjzqdwBcj4aghfFwnuZGBoEBTjNJnUpvZlbPPnpQHp0OZSltQ97JfVWLmogEq3ceEbdQAjk9ms=', 'HTTPHeaders': {'date': 'Mon, 28 Jan 2019 23:22:13 GMT', 'server': 'AmazonS3', 'x-amz-id-2': 'mMjzqdwBcj4aghfFwnuZGBoEBTjNJnUpvZlbPPnpQHp0OZSltQ97JfVWLmogEq3ceEbdQAjk9ms=', 'x-amz-request-id': 'C9C29654E32C4748'}, 'RetryAttempts': 0, 'HTTPStatusCode': 204, 'RequestId': 'C9C29654E32C4748'}}

We can run list_buckets.py again to see that our bucket has indeed been deleted.
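
Since a bucket can only be deleted once it is empty, it can be convenient to combine the two steps into one teardown script. A sketch (this empties each named bucket using the batch delete shown earlier and then removes the bucket itself – use with care):

#!/usr/bin/env python3
import sys
import boto3
s3 = boto3.resource('s3')
for bucket_name in sys.argv[1:]:
    bucket = s3.Bucket(bucket_name)
    try:
        bucket.objects.all().delete()   # empty the bucket first
        response = bucket.delete()      # then delete the bucket itself
        print (response)
    except Exception as error:
        print (error)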

Exercises - upload to Moodle:

Please use Python/boto3 to perform the following tasks:

  1. To enable you to ssh onto an instance you have created you must also specify a key pair. What parameters must you specify to launch an instance into a specific security group with a specific key pair? [screenshot 1]
  2. How can you tag an instance when you are creating it? [screenshot 2]
  3. What if you did not tag the instance when creating it – how can you tag the instance later? [screenshot 3]
  4. Modify the put_bucket.py program to use subprocess.run() to launch the Firefox browser with the URL of an image that has been uploaded to the bucket. [screenshot 4]
  5. When you run the list_buckets.py program it will fail because of a permission error when it tries to list the objects in the RosettaHub administration buckets. You should modify this program to handle this error without crashing. [screenshot 5]

Submission to Moodle

  • Please upload your pdf file to the submission drop box in Moodle by midnight on Wednesday 6th Feb.