Cloud computing with Amazon Web Services, Part 2: Amazon Simple Storage Service (S3)
Reliable, flexible, and inexpensive storage and retrieval of your data

Skill Level: Introductory

Prabhakar Chaganti (prabhakar@ylastic.com)
CTO, Ylastic, LLC.

19 Aug 2008

© Copyright IBM Corporation 1994, 2008. All rights reserved.

In this series, learn about cloud computing using Amazon Web Services. Explore how the services provide a compelling alternative for architecting and building scalable, reliable applications. This article delves into the highly scalable and responsive services provided by Amazon Simple Storage Service (S3). Learn about tools for interacting with S3, and use code samples to experiment with a simple shell.

Amazon Simple Storage Service

Part 1 of this series introduced the building blocks of Amazon Web Services and explained how you can use this virtual infrastructure to build Web-scale systems. In this article, learn more about Amazon Simple Storage Service (S3).

S3 is a highly scalable and fast Internet data-storage system that makes it simple to store and retrieve any amount of data, at any time, from anywhere in the world. You pay for storage and bandwidth based on your actual usage of the service. There is no setup cost, minimum cost, or recurring overhead cost. Amazon handles the administration and maintenance of the storage infrastructure, leaving you free to focus on the core functions of your systems and applications.

S3 is an industrial-strength platform that is readily available for your data storage needs. It's great for:
• Storing the data for your applications.
• Personal or enterprise backups.
• Quickly and cheaply distributing media and other bandwidth-guzzling content to your customers.

Valuable features of S3 include:

Reliability
   S3 is designed to tolerate failures and to repair itself quickly, with minimal or no downtime. Amazon provides a service level agreement (SLA) committing to 99.99 percent availability.

Simplicity
   S3 is built on simple concepts and provides great flexibility for developing your applications. You can build more complex storage schemes, if needed, by layering additional functions on top of S3 components.

Scalability
   The design provides a high level of scalability and allows an easy ramp-up in service when a spike in demand hits your Web-scale applications.

Inexpensive
   S3 rates are very competitive with other enterprise and personal data-storage solutions on the market.

The three basic concepts underpinning the S3 framework are buckets, objects, and keys.

Buckets

Buckets are the fundamental building blocks. Each object that is stored in Amazon S3 is contained within a bucket. Think of a bucket as analogous to a folder, or directory, on a file system. One key distinction between a file folder and a bucket is that each bucket and its contents are addressable using a URL. For example, a bucket named "prabhakar" can be addressed using the URL http://prabhakar.s3.amazonaws.com.

Each S3 account can contain a maximum of 100 buckets. Buckets cannot be nested within each other, so you can't create a bucket within a bucket.

You can control the geographical location of your buckets by specifying a location constraint when you create them. This automatically ensures that any objects you store within that bucket are stored in that geographical location. At this time, you can locate your buckets in either the United States or the European Union.
If you do not specify a location when creating a bucket, the bucket and its contents are stored in the location closest to the billing address for your account.

Bucket names must conform to the following S3 requirements:

• The name must start with a number or a letter.
• The name must be between 3 and 255 characters long.
• A valid name can contain only lowercase letters, numbers, periods, underscores, and dashes.
• Though names can contain numbers and periods, they cannot be in IP address format. You cannot name a bucket 192.168.1.254.
• The bucket namespace is shared among all buckets from all accounts in S3, so your bucket name must be unique across the entire service.

Buckets that will contain objects to be served with addressable URLs must conform to the following additional S3 requirements:

• The name must not contain any underscores.
• The name must be between 3 and 63 characters long.
• The name cannot end with a dash.
• There cannot be dashes next to periods in the name. For example, myfavorite-.bucket.com and my-.bucket.com are invalid.

You can use a domain naming convention for your buckets, such as media.yourdomain.com, and thus map your existing Web domains or subdomains to Amazon S3. The actual mapping is done by adding DNS CNAME entries that point back to S3. The big advantage of this scheme is that you can use your own domain name in the URLs for downloading files: the CNAME mapping translates between your domain name and the S3 address of your bucket. For example, http://media.yourdomain.com.s3.amazonaws.com becomes the friendlier URL http://media.yourdomain.com.

Objects

Objects contain the data stored within the buckets in S3. Think of an object as the file you want to store. Each stored object is composed of two entities: data and metadata. The data is the actual thing being stored, such as a PDF file, a Word document, a video file, and so on. The stored data also has associated metadata describing the object.
Some examples of metadata are the content type of the object being stored, the date the object was last modified, and any other metadata specific to you or your application. The metadata for an object is specified by the developer as key-value pairs when the object is sent to S3 for storage.
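To make the key-value idea concrete, custom metadata ultimately travels with the object as HTTP headers carrying the x-amz-meta- prefix, alongside system headers such as Content-Type. The sketch below simply assembles such a map; the header names and values are illustrative, not taken from the article's sample code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MetadataExample {
    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        // System metadata, normally set by S3 or the client library
        metadata.put("Content-Type", "application/pdf");
        // Application-specific metadata is prefixed with x-amz-meta-
        metadata.put("x-amz-meta-reviewed-by", "prabhakar");
        metadata.put("x-amz-meta-department", "engineering");
        metadata.forEach((k, v) -> System.out.println(k + ": " + v));
    }
}
```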
Unlike the limitation on the number of buckets, there are no restrictions on the number of objects. You can store an unlimited number of objects in your buckets, and each object can contain up to 5GB of data. The data in your publicly accessible S3 objects can be retrieved over HTTP, HTTPS, or BitTorrent. Distribution of large media files from your S3 account becomes very simple with BitTorrent; Amazon will not only create the torrent for your object, it will also seed it!

Keys

Each object stored within an S3 bucket is identified using a unique key. This is similar in concept to the name of a file in a folder on your file system: the file name within a folder on your hard drive must be unique. Each object inside a bucket has exactly one key, and the bucket name and key together uniquely identify each object stored in S3.

Every object within S3 is addressable using a URL that combines the S3 service URL, the bucket name, and the unique key. If you store an object with the key my_favorite_video.mov inside the bucket named prabhakar, that object can be addressed using the URL http://prabhakar.s3.amazonaws.com/my_favorite_video.mov.

Though the concepts are simple, as shown in Figure 1, buckets, objects, and keys together provide a lot of flexibility for building your data storage solutions. You can leverage these building blocks to simply store data on S3, or use their flexibility to layer and build more complex storage schemes and applications on top of S3.

Figure 1. Conceptual view of S3
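The addressing scheme described above is purely mechanical: service URL, bucket name, and key are concatenated, with any characters in the key that aren't URL-safe percent-encoded. A minimal sketch, using only the JDK (the bucket and key come from the article's example; the helper name is ours):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class S3Url {
    // Build the virtual-hosted-style URL for an object, encoding the key.
    static String objectUrl(String bucket, String key) throws Exception {
        // URLEncoder targets form encoding, so restore "+" to "%20"
        // and keep "/" readable for keys that contain path-like names
        String encodedKey = URLEncoder.encode(key, StandardCharsets.UTF_8.name())
                .replace("+", "%20").replace("%2F", "/");
        return "http://" + bucket + ".s3.amazonaws.com/" + encodedKey;
    }

    public static void main(String[] args) throws Exception {
        // prints http://prabhakar.s3.amazonaws.com/my_favorite_video.mov
        System.out.println(objectUrl("prabhakar", "my_favorite_video.mov"));
    }
}
```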
Access logging

Each S3 bucket can have access log records that contain details of each request for a contained object. Logging is turned off by default; you have to explicitly enable it for each Amazon S3 bucket you want to track. An access log record contains a lot of detail about the request, including the request type, the resource requested, and the time and date the request was processed.

The logs are provided in the S3 server access log format but can be easily converted into the Apache combined log format. They can then be parsed by any of the open source or commercial log analysis tools, such as Webalizer, to produce human-readable reports and graphs. The reports can be very useful for gaining insight into the customer base that's accessing your files. See Resources for tools you can use for easier visualization of the S3 log records.

Security

Each bucket and object created in S3 is private to the user account that creates it. You have to explicitly grant permissions to other users and customers before they can see the list of objects in your S3 buckets or download the data contained within them. Amazon S3 provides the following security features to protect your buckets and the objects in them:

Authentication
   Ensures that the request is being made by the user that owns the bucket or object. Each S3 request must include the Amazon Web Services access key that uniquely identifies the user.

Authorization
   Ensures that the user trying to access a resource has the permissions or rights to it. Each S3 object has an access control list (ACL) associated with it that explicitly identifies the grants and permissions for that resource. You can grant access to all Amazon Web Services users or to a specific user identified by e-mail address, or you can grant anonymous access to any user.

Integrity
   Each S3 request must be digitally signed by the requesting user with an Amazon Web Services secret key. On receipt of the request, S3 checks the signature to ensure that the request has not been tampered with in transit.

Encryption
   You can access S3 through the HTTPS protocol to ensure that the data is transmitted over an encrypted connection.

Nonrepudiation
   Each S3 request is time-stamped and serves as proof of the transaction.
Every REST request made to S3 must go through the following standard steps, which are essential to ensuring security:

• The request and all needed parameters are assembled into a string.
• Your Amazon Web Services secret access key is used to create a keyed-HMAC (Hash Message Authentication Code) signature hash of the request string.
• The calculated signature is itself added as a parameter on the request.
• The request is then forwarded to Amazon S3.
• Amazon S3 checks whether the provided signature is a valid keyed-HMAC hash of the request.
• If the signature is valid, then (and only then) Amazon S3 processes the request.

Pricing

The charges for S3 are calculated based on three criteria, and the rates differ based on the geographic location of your buckets:

• The total amount of storage space used, which includes the actual size of your data content and the associated metadata. The unit S3 uses for determining the storage consumed is the GB-month: the number of bytes of storage used by your account is computed every hour, and at the end of the month it's converted into the storage used for the month. The following table shows pricing for storage.

  Location        Cost
  United States   $0.15 per GB-month of storage used
  Europe          $0.18 per GB-month of storage used

• The amount of data or bandwidth transferred to and from S3. This includes all data that is uploaded to and downloaded from S3. There is no charge for data transferred between EC2 and S3 buckets located in the United States. Data transferred between EC2 and European S3 buckets is charged at the standard data transfer rate, shown below.

  Location        Cost
  United States   $0.100 per GB - all data transfer in
                  $0.170 per GB - first 10TB/month data transfer out
                  $0.130 per GB - next 40TB/month data transfer out
                  $0.110 per GB - next 100TB/month data transfer out
                  $0.100 per GB - data transfer out over 150TB/month
  Europe          $0.100 per GB - all data transfer in
                  $0.170 per GB - first 10TB/month data transfer out
                  $0.130 per GB - next 40TB/month data transfer out
                  $0.110 per GB - next 100TB/month data transfer out
                  $0.100 per GB - data transfer out over 150TB/month

• The number of application programming interface (API) requests performed. S3 charges a fee for each request made through the interface, such as creating objects, listing buckets, listing objects, and so on. There is no fee for deleting objects and buckets. Once again, the fees are slightly different based on the geographic location of the bucket. The following table shows pricing for API requests.

  Location        Cost
  United States   $0.01 per 1,000 PUT, POST, or LIST requests
                  $0.01 per 10,000 GET and all other requests
                  No charge for delete requests
  Europe          $0.012 per 1,000 PUT, POST, or LIST requests
                  $0.012 per 10,000 GET and all other requests
                  No charge for delete requests

Check Amazon S3 for the latest price information. You can also use the AWS Simple Monthly Calculator to estimate your monthly usage costs for S3 and the other Amazon Web Services.

Getting started with Amazon Web Services and S3

To start exploring S3, you will first need to sign up for an Amazon Web Services account. You will be assigned an Amazon Web Services account number and will receive the security access keys, along with the X.509 security certificate, that will be required when you start using the various libraries and tools for communicating with S3.

All communication with any of the Amazon Web Services is through either the SOAP interface or the query/REST interface. The request messages sent through either of these interfaces must be digitally signed by the sending user to ensure that the messages have not been tampered with in transit, and that they really originate from the sending user. This is the most basic part of using the Amazon Web Services APIs: each request must be digitally signed and the signature attached to the request.

Each Amazon Web Services user account is associated with the following security credentials:

• An access key ID that identifies you as the person making requests through the query/REST interface.
• A secret access key that is used to calculate the digital signature when you make requests through the query interface.
• Public and private X.509 certificates for signing requests and authentication when using the SOAP interface.

You can manage your keys and certificates, regenerate them, view account activity and usage reports, and modify your profile information from the Web Services Account information page.
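The signing flow just described (assemble the request into a canonical string, compute a keyed-HMAC over it with the secret key, attach the result to the request) can be sketched with the standard javax.crypto API. The string-to-sign and keys below are made-up placeholders, and real S3 signing involves additional canonicalization rules covered in the S3 developer guide, so treat this only as an illustration of the HMAC step:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class RequestSigner {
    // Compute a Base64-encoded HMAC-SHA1 signature over the request string.
    static String sign(String stringToSign, String secretKey) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(secretKey.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
        byte[] raw = mac.doFinal(stringToSign.getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) throws Exception {
        // Placeholder canonical string: verb, date, and resource, newline-separated
        String stringToSign =
            "GET\n\n\nTue, 19 Aug 2008 00:00:00 GMT\n/prabhakar/my_favorite_video.mov";
        System.out.println(sign(stringToSign, "my_aws_secret_key"));
    }
}
```

On the server side, S3 recomputes the same HMAC from the received request and the stored secret key; only if the two signatures match is the request processed.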
After you successfully sign up for the Amazon Web Services account, you need to enable the Amazon S3 service for your account using the following steps:

1. Log in to your Amazon Web Services account.
2. Navigate to the S3 home page.
3. Click Sign Up For This Web Service on the right side of the page.
4. Provide the requested information and complete the sign-up process.

Examples in this article use the query/REST interface to communicate with S3, so you will need your access keys. You can find them on your Web Services Account information page by selecting View Access Key Identifiers.

You are now set up to use Amazon Web Services and have enabled the S3 service for your account.

Interacting with S3

To interact with S3, you can use existing libraries available from Amazon or from third parties and independent developers. This article does not delve into the details of communication with S3, such as how to sign requests, how to build up the XML documents used for encapsulating the data, or the parameters sent to and received from S3. We'll let a library handle all of that for us and use the higher-level interface it provides. You can review the S3 developer guide for more details.

You'll use an open source Java™ library named JetS3t to explore S3 and learn about its API by viewing small snippets of code. By the end of the article you'll collect and organize these snippets into something useful: a simple and handy S3 shell that you can use at any time to experiment and interact with S3.

JetS3t

JetS3t is an open source Java toolkit for interacting with S3. It is more than just a library: the distribution includes several very useful S3-related tools that can be used by typical S3 users as well as by service providers who build applications on top of S3. JetS3t includes:

Cockpit
   A GUI for managing the contents of an Amazon S3 account.

Synchronize
   A command-line application for synchronizing directories on your computer with an Amazon S3 account.

Gatekeeper
   A servlet that you can use to mediate access to Amazon S3 accounts.

CockpitLite
   A lighter version of Cockpit that routes all its operations through a mediating gatekeeper service.

Uploader
   A GUI that routes all its operations through a mediating gatekeeper service and can be used by service providers to give customers access to their S3 accounts.

Download the latest release of JetS3t. You can, of course, use one of these GUI applications for interacting with S3, but that won't be very helpful if you need to develop applications that interface with S3. You can download the complete source code for this article as a zipped archive, including a ready-to-go Netbeans project that you can import into your workspace.

Connecting to S3

JetS3t provides an abstract class named org.jets3t.service.S3Service that is extended by classes that implement a specific interface, such as REST or SOAP. JetS3t provides two implementations you can use for connecting to and interacting with S3:

• org.jets3t.service.impl.rest.httpclient.RestS3Service communicates with S3 through the REST interface.
• org.jets3t.service.impl.soap.axis.SoapS3Service communicates with S3 through the SOAP interface using Apache Axis 1.4.

JetS3t uses a file named jets3t.properties to configure various parameters used while communicating with S3. The examples in this article use the default jets3t.properties shipped with the distribution. The JetS3t configuration guide has a detailed explanation of the parameters.

In this article you'll use the RestS3Service to connect to S3. A new RestS3Service object is created by providing your Amazon Web Services access keys in the form of an AWSCredentials object.

Keep in mind that the code snippets in this article are for demonstrating the API. To run each snippet, you have to ensure that all the required class imports are present. Refer to the source in the download package for the right imports. Or, even simpler, import the provided Netbeans project into your workspace for easy access to all of the source code.

Listing 1. Create a new RestS3Service
String awsAccessKey = "Your AWS access key";
String awsSecretKey = "Your AWS secret key";

// use your AWS keys to create a credentials object
AWSCredentials awsCredentials = new AWSCredentials(awsAccessKey, awsSecretKey);

// create the service object with our AWS credentials
S3Service s3Service = new RestS3Service(awsCredentials);

Managing your buckets

The concept of a bucket is encapsulated by org.jets3t.service.model.S3Bucket, which extends the org.jets3t.service.model.BaseS3Object class, the parent class for both buckets and objects in the JetS3t model. Each S3Bucket object provides a toString(), in addition to various accessor methods, that can be used to print the salient information for a bucket: the name and geographical location of the bucket, the date the bucket was created, the owner's name, and any metadata associated with the bucket.

Listing 2. List buckets

// list all buckets in the AWS account and print info for each bucket
S3Bucket[] buckets = s3Service.listAllBuckets();
for (S3Bucket b : buckets) {
    System.out.println(b);
}

You can create a new bucket by providing a unique name for it. The namespace for buckets is shared by all user accounts, so sometimes finding a unique name can be challenging. You can also specify where you want the bucket, and the objects it will contain, to be physically located.

Listing 3. Create buckets

// create a US bucket and print its info
S3Bucket usBucket = s3Service.createBucket(bucketName);
System.out.println("Created bucket - " + bucketName + " - " + usBucket);

// create a EU bucket and print its info
S3Bucket euBucket = s3Service.createBucket(bucketName, S3Bucket.LOCATION_EUROPE);
System.out.println("Created bucket - " + bucketName + " - " + euBucket);

You have to delete all the objects contained in a bucket before deleting the bucket itself, or an exception will be raised. The RestS3Service class you have been using is fine for dealing with single objects.
When you start dealing with multiple objects, it makes more sense to use a multithreaded approach to speed things up. JetS3t provides the org.jets3t.service.multithread.S3ServiceSimpleMulti class just for this purpose. You can wrap the existing s3Service object with this class and take full advantage of those multiprocessors. It comes in handy when you need to clear a bucket by deleting all the objects it contains.

Listing 4. Delete a bucket

// get the bucket
S3Bucket bucket = getBucketFromName(s3Service, "my bucket");

// delete a bucket - it must be empty first
s3Service.deleteBucket(bucket);

// create a multithreaded version of the REST service
S3ServiceSimpleMulti s3ServiceMulti = new S3ServiceSimpleMulti(s3Service);

// get all the objects from the bucket
S3Object[] objects = s3Service.listObjects(bucket);

// clear the bucket by deleting all its objects
s3ServiceMulti.deleteObjects(bucket, objects);

Each bucket is associated with an ACL that determines the permissions or grants for the bucket and the level of access provided to other users. You can retrieve the ACL and print the grants it provides.

Listing 5. Retrieve the ACL for a bucket

// get the bucket
S3Bucket bucket = getBucketFromName(s3Service, "my bucket");

// get the ACL and print it
AccessControlList acl = s3Service.getBucketAcl(bucket);
System.out.println(acl);

The default permissions on newly created buckets and objects make them private to the owner. You can modify this by changing the ACL for a bucket and granting a group of users permission to read, write, or have full control over the bucket.

Listing 6. Make a bucket and its content public

// get the bucket
S3Bucket bucket = getBucketFromName(s3Service, "my bucket");

// get the ACL
AccessControlList acl = s3Service.getBucketAcl(bucket);

// give everyone read access
acl.grantPermission(GroupGrantee.ALL_USERS, Permission.PERMISSION_READ);

// save changes back to S3
bucket.setAcl(acl);
s3Service.putBucketAcl(bucket);

You can easily enable logging for a bucket and retrieve the current logging status. After logging is enabled, detailed access logs for each file in that bucket are stored in S3. Your S3 account will be charged for the storage space consumed by the logs.

Listing 7. Logging for S3 buckets

// get the bucket
S3Bucket bucket = getBucketFromName(s3Service, "my bucket");

// is logging enabled?
S3BucketLoggingStatus loggingStatus = s3Service.getBucketLoggingStatus(bucketName);
System.out.println(loggingStatus);

// enable logging
S3BucketLoggingStatus newLoggingStatus = new S3BucketLoggingStatus();

// set a prefix for your log files
newLoggingStatus.setLogfilePrefix(logFilePrefix);

// set the target bucket name
newLoggingStatus.setTargetBucketName(bucketName);

// give the log_delivery group permission to write logs and read the bucket's ACL
AccessControlList acl = s3Service.getBucketAcl(bucket);
acl.grantPermission(GroupGrantee.LOG_DELIVERY, Permission.PERMISSION_WRITE);
acl.grantPermission(GroupGrantee.LOG_DELIVERY, Permission.PERMISSION_READ_ACP);
bucket.setAcl(acl);

// save the changed ACL for the bucket to S3
s3Service.putBucketAcl(bucket);

// save the changes to the bucket logging
s3Service.setBucketLoggingStatus(bucketName, newLoggingStatus, true);
System.out.println("The bucket logging status is now enabled.");

Managing your objects

Each object contained in a bucket is represented by org.jets3t.service.model.S3Object. Each S3Object provides a toString() that can be used to print the important details for an object:

• Name of the key
• Name of the containing bucket
• Date the object was last modified
• Any metadata associated with the object

It also provides methods for accessing the various properties of an object along with its metadata.

Listing 8. List objects

// list objects in a bucket
S3Object[] objects = s3Service.listObjects(bucket);

// print out the object details
if (objects.length == 0) {
    System.out.println("No objects found");
} else {
    for (S3Object o : objects) {
        System.out.println(o);
    }
}

You can filter the list of objects that are retrieved by providing a prefix to match.

Listing 9. Filter the list of objects

// list objects matching a prefix
S3Object[] filteredObjects = s3Service.listObjects(bucket, "myprefix", null);

// print out the object details
if (filteredObjects.length == 0) {
    System.out.println("No objects found");
} else {
    for (S3Object o : filteredObjects) {
        System.out.println(o);
    }
}

Each object can have associated metadata, such as the content type, date modified, and so on. You can also associate your application-specific custom metadata with an object.

Listing 10. Retrieve object metadata

// get the bucket
S3Bucket bucket = getBucketFromName(s3Service, bucketName);

// get objects matching a prefix
S3Object[] filteredObjects = s3Service.listObjects(bucket, "myprefix", null);
if (filteredObjects.length == 0) {
    System.out.println("No matching objects found");
} else {
    // get the metadata for multiple objects
    S3Object[] objectsWithHeadDetails = s3ServiceMulti.getObjectsHeads(bucket, filteredObjects);

    // print out the metadata
    for (S3Object o : objectsWithHeadDetails) {
        System.out.println(o);
    }
}

Each newly created object is private by default. You can use JetS3t to generate a signed URL that anyone can use to download the object data. The URL can be created to be valid only for a certain duration, at the end of which it automatically expires. The object remains private, but you can give the URL to anyone to let them download it for a limited time.

Listing 11. Generate a signed URL for object downloads
// get the bucket
S3Bucket bucket = getBucketFromName(s3Service, bucketName);

// how long should this URL be valid?
int duration = Integer.parseInt(tokens.nextToken());
Calendar cal = Calendar.getInstance();
cal.add(Calendar.MINUTE, duration);
Date expiryDate = cal.getTime();

// create the signed URL
String url = S3Service.createSignedGetUrl(bucketName, objectKey, awsCredentials, expiryDate);
System.out.println("You can use this public URL to access this file for the next "
    + duration + " min - " + url);

S3 allows a maximum object size of 5GB in a bucket. If you have objects that are larger than this, you'll need to split them into multiple files of at most 5GB each, and then upload all of the parts to S3.

Listing 12. Upload to S3

// get the bucket
S3Bucket bucket = getBucketFromName(s3Service, bucketName);

// create an object with the file data
File fileData = new File("/my_file_to_upload");
S3Object fileObject = new S3Object(bucket, fileData);

// put the data on S3
s3Service.putObject(bucket, fileObject);
System.out.println("Successfully uploaded object - " + fileObject);

JetS3t provides a DownloadPackage class that makes it simple to associate the data from an S3 object with a local file and automatically save the data to it. You can use this feature to easily download objects from S3.

Listing 13. Download from S3

// get the bucket
S3Bucket bucket = getBucketFromName(s3Service, bucketName);

// get the object
S3Object fileObject = s3Service.getObject(bucket, fileName);

// associate a file with the object data
DownloadPackage[] downloadPackages = new DownloadPackage[1];
downloadPackages[0] = new DownloadPackage(fileObject, new File(fileObject.getKey()));

// download objects to the associated files
s3ServiceMulti.downloadObjects(bucket, downloadPackages);
System.out.println("Successfully retrieved object to current directory");

This section covered some of the basic functions provided by the JetS3t toolkit and how to use them to interact with S3. See Resources for more about the S3 service and an in-depth discussion of the JetS3t toolkit.
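Listing 12 uploads a single file, but as noted above, any object over the 5GB limit must first be split into parts before each part is uploaded. The helper below is a minimal sketch of that splitting step using only the standard library; the FileSplitter class name and the ".partN" naming scheme are illustrative, not part of JetS3t.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

public class FileSplitter {

    // Split a file into sequentially numbered parts, each at most
    // chunkSize bytes, so that every part fits under the S3 object limit.
    // The "<name>.partN" naming used here is just an example convention.
    public static List<File> split(File source, long chunkSize) throws IOException {
        List<File> parts = new ArrayList<File>();
        InputStream in = new BufferedInputStream(new FileInputStream(source));
        try {
            byte[] buffer = new byte[64 * 1024];
            int partIndex = 0;
            boolean done = false;
            while (!done) {
                File part = new File(source.getPath() + ".part" + partIndex++);
                OutputStream out = new BufferedOutputStream(new FileOutputStream(part));
                long written = 0;
                try {
                    // copy up to chunkSize bytes into the current part
                    while (written < chunkSize) {
                        int toRead = (int) Math.min(buffer.length, chunkSize - written);
                        int read = in.read(buffer, 0, toRead);
                        if (read == -1) {
                            done = true;
                            break;
                        }
                        out.write(buffer, 0, read);
                        written += read;
                    }
                } finally {
                    out.close();
                }
                if (written == 0) {
                    // the file size was an exact multiple of chunkSize
                    part.delete();
                } else {
                    parts.add(part);
                }
            }
        } finally {
            in.close();
        }
        return parts;
    }
}
```

Each resulting part can then be uploaded exactly as in Listing 12, and the parts reassembled by concatenating them in order after download.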
S3 Shell

The interaction with S3 so far, through small code snippets, can be put into a more useful and longer-lasting form by creating a simple S3 Shell program that you can run from the command line. You'll create a simple Java program that accepts the Amazon Web Services access key and secret key as parameters and presents a console prompt. You can then type a letter or a few letters, such as b to list buckets or om to list objects that match a certain prefix. Use this program for experimentation.

The shell program contains a main() whose implementation is filled out with the snippets of code used in this article. In the interest of space, the code listing for S3 Shell is not included here. The complete S3 Shell source code, along with its dependencies, is in the download. You can run the shell by simply executing the devworks-s3.jar file.

Listing 14. Running the S3 Shell

java -jar devworks-s3.jar my_aws_access_key my_aws_secret_key

You can type h at any time in the S3 Shell to get a list of supported commands.

Figure 2. Help in the S3 Shell

Some of the more useful methods have been added to the S3 Shell. You can extend it to add any other functions you want to make the shell even more useful to your specific case.
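To give a feel for how such a shell hangs together, here is a minimal sketch of a read-and-dispatch loop. It is a simplified stand-in, not the actual S3 Shell source from the download: the describe() lookup takes the place of the real JetS3t calls, and while the b, om, and h letters mirror the commands mentioned above, the class name and help text are purely illustrative.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class S3ShellSketch {

    // Map a one- or two-letter command to a short description, the way
    // the real shell maps each command to a JetS3t service call.
    public static String describe(String command) {
        if ("b".equals(command))  return "list buckets";
        if ("o".equals(command))  return "list objects";
        if ("om".equals(command)) return "list objects matching a prefix";
        if ("h".equals(command))  return "show help";
        return "unknown command";
    }

    public static void main(String[] args) throws IOException {
        // The real shell reads the AWS access key and secret key from args
        // and builds an S3Service before entering this loop.
        BufferedReader console = new BufferedReader(new InputStreamReader(System.in));
        String line;
        System.out.print("s3> ");
        while ((line = console.readLine()) != null) {
            // dispatch on the first whitespace-separated token
            String command = line.trim().split("\\s+")[0];
            if ("q".equals(command)) {
                break;
            }
            System.out.println(describe(command));
            System.out.print("s3> ");
        }
    }
}
```

The loop structure is the interesting part: each command letter simply selects one of the snippets from this article, which makes new commands easy to bolt on.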
Summary

In this article you learned some of the basic concepts behind Amazon's S3 service. The JetS3t toolkit is an open source library you can use to interact with S3. You also learned how to create a simple S3 Shell from the sample snippets of code, so you can continue to experiment easily and simply with S3 from the command line.

Stay tuned for the next article in this series, which will explain how to use Amazon Elastic Compute Cloud (EC2) to run virtual servers in the cloud.
Downloads

Description: Sample code for this article
Name: devworks-s3.zip
Size: 2.93MB
Download method: HTTP
Resources

Learn

• Learn about specific Amazon Web Services:
  • Amazon Simple Storage Service (S3)
  • Amazon Elastic Compute Cloud (EC2)
  • Amazon Simple Queue Service (SQS)
  • Amazon SimpleDB (SDB)
• The Service Health Dashboard is updated by the Amazon team regarding any issues with the services.
• Sign up for an Amazon Web Services account.
• The Amazon Web Services Developer Connection is the gateway to all the developer resources.
• Read the blog to find out the latest happenings in the world of Amazon Web Services.
• From the Web Services Account information page you can manage your keys and certificates, regenerate them, view account activity and usage reports, and modify your profile information.
• S3 Technical Resources has Amazon Web Services technical documentation, user guides, and other articles of interest.
• Amazon S3 has the latest pricing information. Use the AWS Simple Monthly Calculator tool to estimate your monthly usage costs for S3 and the other Amazon Web Services.
• Review the S3 Developer Guide for more details.
• Amazon provides a Service Level Agreement (SLA) for S3.
• The S3stats resource page has several links on processing and viewing S3 log records. Logs are in the S3 Server Access Log Format, but can be easily converted into the Apache Combined Log Format and then parsed by any of the open source or commercial log analysis tools, such as Webalizer.
• Learn about JetS3t, an open source Java toolkit for Amazon S3, developed by James Murty. See the toolkit documentation, and get detailed explanations of parameters in the configuration guide.
• In the Architecture area on developerWorks, get the resources you need to advance your skills in the architecture arena.
• Browse the technology bookstore for books on these and other technical topics.

Get products and technologies

• Download JetS3t and other tools.
• Download IBM product evaluation versions and get your hands on application development tools and middleware products from IBM® DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.

Discuss

• Check out developerWorks blogs and get involved in the developerWorks community.

About the author

Prabhakar Chaganti

Prabhakar Chaganti is the CTO of Ylastic, a start-up that is building a single unified interface to architect, manage, and monitor a user's entire AWS cloud computing environment: EC2, S3, SQS, and SimpleDB. He is the author of two recent books, Xen Virtualization and GWT Java AJAX Programming. He is also the winner of the community choice award for the most innovative virtual appliance in the VMware Global Virtual Appliance Challenge.

Trademarks

IBM, the IBM logo, ibm.com, DB2, developerWorks, Lotus, Rational, Tivoli, and WebSphere are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. These and other IBM trademarked terms are marked on their first occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml

Java and all Java-based trademarks and logos are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

© Copyright IBM Corporation 1994, 2008. All rights reserved.
