
AWS Java SDK @ scale



Our history of dealing with the AWS Java SDK, mostly related to the S3 client: good practices, bugs, performance tuning.

Published in: Software

  1. AWS JAVA SDK @ SCALE
     Based mostly on experiences with S3
  2. OUR CREDENTIALS
  3. ENDPOINTS
     • REST API for everyone
     • Great documentation
     • documentation/
  4. AWS JAVA SDK
     • One monolithic jar before 1.9.0
     • Currently split into ~48 smaller modules dedicated to individual Amazon services
     • All depend on the aws-java-sdk-core module
     • Other runtime dependencies:
       • commons-logging
       • Apache HttpClient (4.3.4)
       • Joda-Time
  5. CREDENTIALS
     • Manually provide accessKey and secretKey (generated by IAM)
     • Manual key management
     • No automatic rotation
     • Leaked keys will lose you serious $$$
     new AmazonS3Client(new BasicAWSCredentials(accessKey, secretKey));
  6. CREDENTIALS
     “I only had S3 keys on my GitHub and they where gone within 5 minutes! Turns out through the S3 API you can actually spin up EC2 instances, and my key had been spotted by a bot that continually searches GitHub for API keys. Amazon AWS customer support informed me this happens a lot recently, hackers have created an algorithm that searches GitHub 24 hours per day for API keys. Once it finds one it spins up max instances of EC2 servers to farm itself bitcoins. Boom! A $2375 bill in the morning.”
  7. CREDENTIALS
     • Use a credentials provider
     • Default behaviour when the zero-argument constructor is invoked:
       • EnvironmentVariableCredentialsProvider
       • InstanceProfileCredentialsProvider
     • All but the last one share the security problems of manual access/secret key management
     new AmazonS3Client();
  8. CREDENTIALS
     • Use InstanceProfileCredentialsProvider
     • The server's IAM role must be configured with the permissions needed by the service using this provider.
     • Calls the EC2 Instance Metadata Service to get the current security credentials.
     • Automatic management and rotation of keys.
     • Credentials are stored only in the memory of the calling process.
  9. CREDENTIALS
     • Use InstanceProfileCredentialsProvider
     • Credentials are reloaded under a lock, which may cause latency spikes (every hour).
       • Instantiate with refreshCredentialsAsync == true
     • Problems when starting on developers' machines
       • Use AdRoll’s hologram to create a fake environment locally
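The latency spikes mentioned above come from callers waiting while credentials are reloaded. The idea behind an asynchronous refresh can be sketched with JDK classes only: a background task replaces a cached value, and readers only ever read a reference. This is a minimal illustration, not the SDK's implementation; `BackgroundRefresher` and the `loader` supplier are hypothetical names introduced here.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical helper: caches a value and refreshes it off the caller's thread. */
final class BackgroundRefresher<T> implements AutoCloseable {
    private final AtomicReference<T> current;
    private final Supplier<T> loader;
    private final ScheduledExecutorService scheduler;

    BackgroundRefresher(Supplier<T> loader, long period, TimeUnit unit) {
        this.loader = loader;
        // The very first load is still blocking: there is nothing to serve yet.
        this.current = new AtomicReference<>(loader.get());
        this.scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "credentials-refresher");
            thread.setDaemon(true); // do not keep the JVM alive for refreshes
            return thread;
        });
        scheduler.scheduleAtFixedRate(this::refresh, period, period, unit);
    }

    /** Reloads the value; on failure the previously loaded value stays in place. */
    void refresh() {
        try {
            current.set(loader.get());
        } catch (RuntimeException e) {
            // Keep serving the old value; a real implementation would log this.
        }
    }

    /** Never waits for a reload: returns whatever was loaded last. */
    T get() {
        return current.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Request threads call `get()` and never touch the loader, so a slow metadata call delays only the refresher thread.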
  10. BUILT-IN MONITORING
      amazonS3Client.addRequestHandler(new RequestHandler2() {
          @Override public void beforeRequest(Request<?> request) { /* e.g. start a timer */ }
          @Override public void afterResponse(Request<?> request, Response<?> response) { /* e.g. record success */ }
          @Override public void afterError(Request<?> request, Response<?> response, Exception e) { /* e.g. record failure */ }
      });
  11. BUILT-IN MONITORING
      AmazonS3Client amazonS3 = new AmazonS3Client(
          new StaticCredentialsProvider(credentials),
          new ClientConfiguration(),
          new RequestMetricCollector() {
              @Override
              public void collectMetrics(Request<?> request, Response<?> response) {
                  // e.g. forward request timings to your metrics system
              }
          });
  12. TESTING WITH S3
      • Use buckets located close to the testing site
      • Use a fake S3 process:
        • same thing but with a few bug fixes
        • Not scalable enough
      • Write your own :(
        • Not that hard
      // look out for issue 414
      amazonS3.setEndpoint("http://localhost...");
  13. SCARY STUFF
      • #333 SDK can't list bucket nor delete S3 object with characters in range [0x00 - 0x1F]
        • According to the S3 object naming scheme, [0x00 - 0x1F] are valid characters for an S3 object key. However, it is not possible to list a bucket containing such objects using the SDK (the XML parser chokes on them), and they cannot be deleted through multi-object delete either (also an XML failure). Interestingly, download works just fine.
      • #797 S3 delete_objects silently fails with object names containing characters in the 0x00-0x1F range
      • Bulk delete of more than 1024 objects fails with an unrelated exception
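One defensive measure against both problems is to screen key names before writing them and to split bulk deletes into bounded batches. The sketch below uses only the JDK; `DeleteBatches` and its methods are hypothetical helpers, and the batch size is left to the caller (S3 documents a limit of 1000 keys per multi-object delete request).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helpers for guarding bulk deletes; not part of the AWS SDK. */
final class DeleteBatches {
    private DeleteBatches() {
    }

    /** Returns true if the key contains a character in the problematic range 0x00-0x1F. */
    static boolean hasControlChars(String key) {
        for (int i = 0; i < key.length(); i++) {
            if (key.charAt(i) <= 0x1F) {
                return true;
            }
        }
        return false;
    }

    /** Splits keys into consecutive batches of at most batchSize elements each. */
    static List<List<String>> partition(List<String> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int from = 0; from < keys.size(); from += batchSize) {
            int to = Math.min(from + batchSize, keys.size());
            // Copy the sublist so each batch is independent of the input list.
            batches.add(new ArrayList<>(keys.subList(from, to)));
        }
        return batches;
    }
}
```

Each batch returned by `partition(keys, 1000)` can then be passed to one multi-object delete call, and keys flagged by `hasControlChars` can be rejected or deleted one by one.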
  14. “ASYNCHRONOUS” VERSIONS
      • There is no truly asynchronous mode in the AWS SDK
      • Async versions of clients use synchronous, blocking HTTP calls but wrap them in a thread pool
      • S3 has TransferManager (we have no experience here)
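The "blocking call wrapped in a thread pool" pattern described above can be sketched in a few lines of plain Java. This is an illustration of the pattern only, not the SDK's code; `BlockingCallWrapper` is a hypothetical name.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Hypothetical wrapper that makes a blocking call look asynchronous to the caller. */
final class BlockingCallWrapper implements AutoCloseable {
    private final ExecutorService pool;

    BlockingCallWrapper(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    /**
     * Submits the blocking call to the pool and returns immediately.
     * A pool thread is still blocked for the full duration of the call,
     * so concurrency is capped by the pool size.
     */
    <T> CompletableFuture<T> submit(Supplier<T> blockingCall) {
        return CompletableFuture.supplyAsync(blockingCall, pool);
    }

    @Override
    public void close() {
        pool.shutdown();
    }
}
```

The caller gets a future back, but no thread is saved: with a pool of N threads at most N requests are in flight, which is why this is not truly asynchronous I/O.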
  15. BASIC S3 PERFORMANCE TIPS
      • A pseudo-random key prefix allows files to be split evenly among S3 “partitions”
      • Listing is usually the bottleneck. Cache list results.
      • Or write your own microservice to eliminate lists
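A pseudo-random but reproducible prefix can be derived from a hash of the key itself, so the full key can always be recomputed from the logical name. The following is a minimal sketch under that assumption; `KeyPrefixer`, the choice of MD5, and the four-character prefix length are illustrative choices, not something prescribed by the slides.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that spreads keys across prefixes using a hash of the key. */
final class KeyPrefixer {
    private KeyPrefixer() {
    }

    /** Prepends the first four hex characters of the key's MD5 hash, followed by '/'. */
    static String withHashPrefix(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            // Mask with 0xff so negative bytes format as two unsigned hex digits.
            return String.format("%02x%02x/%s", digest[0] & 0xff, digest[1] & 0xff, key);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }
}
```

Because the prefix depends only on the key, readers can compute the storage location directly and do not need a list call to find an object.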
  16. SDK PERFORMANCE
      • Creates tons of short-lived objects
      • Many locks guarding internal state
      • Profiled with Java Mission Control (if it does not crash)
      • Or YourKit
      • Then test on production data
  17. public XmlResponsesSaxParser() throws AmazonClientException {
          // Ensure we can load the XML Reader.
          try {
              xr = XMLReaderFactory.createXMLReader();
          } catch (SAXException e) {
              throw new AmazonClientException("Couldn't initialize a SAX driver to create an XMLReader", e);
          }
      }

  18. @Override
      protected final CloseableHttpResponse doExecute(final HttpHost target, final HttpRequest request,
              final HttpContext context)
              throws IOException, ClientProtocolException {
          Args.notNull(request, "HTTP request");
          // a null target may be acceptable, this depends on the route planner
          // a null context is acceptable, default context created below
          HttpContext execContext = null;
          RequestDirector director = null;
          HttpRoutePlanner routePlanner = null;
          ConnectionBackoffStrategy connectionBackoffStrategy = null;
          BackoffManager backoffManager = null;
          // Initialize the request execution context making copies of
          // all shared objects that are potentially threading unsafe.
          synchronized (this) {
              ...
  19. public synchronized final ClientConnectionManager getConnectionManager() {
          if (connManager == null) {
              connManager = createClientConnectionManager();
          }
          return connManager;
      }
      public synchronized final HttpRequestExecutor getRequestExecutor() {
          if (requestExec == null) {
              requestExec = createRequestExecutor();
          }
          return requestExec;
      }
      public synchronized final AuthSchemeRegistry getAuthSchemes() {
          if (supportedAuthSchemes == null) {
              supportedAuthSchemes = createAuthSchemeRegistry();
          }
          return supportedAuthSchemes;
      }
      public synchronized void setAuthSchemes(final AuthSchemeRegistry registry) {
          supportedAuthSchemes = registry;
      }
      public synchronized final ConnectionBackoffStrategy getConnectionBackoffStrategy() {
          return connectionBackoffStrategy;
      }
  20. OLD APACHE HTTP CLIENT (4.3.4)
      • Riddled with locks
      • Reusing the same client can save resources, but at the cost of performance
        • different code paths may not target the same sites
        • open sockets are not that costly
        • better to use many client instances (e.g. per-thread)
      • Make sure the number of threads using one client instance is less than the maximum number of connections in its pool
        • severe contention on returning connections to the pool
        • recent versions got better
  21. BASIC CONFIGURATION
      <bean id="..." class="" scope="prototype">
          <bean class="com.amazonaws.ClientConfiguration">
              <property name="maxConnections" value="#{T(Integer).parseInt('${storage.readingThreads}') * 2}"/>
              <property name="protocol" value="HTTP"/>
          </bean>
      </bean>
  22. CLIENT POOL
      <bean id="poolTargetSource" class="pl.codewise.voluum.util.AmazonS3ClientPool">
          <property name="targetBeanName" value="amazonS3Client"/>
          <property name="maxSize" value="10"/>
      </bean>
      <bean id="amazonS3Client" class="org.springframework.aop.framework.ProxyFactoryBean" primary="true">
          <property name="targetSource" ref="poolTargetSource"/>
          <property name="interfaces">...</property>
      </bean>

      int index = ThreadLocalRandom.current().nextInt(getMaxSize());
      return clients[index];
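The selection logic on this slide can be sketched as a small generic pool that creates a fixed number of instances up front and hands out a random one per call, so threads spread across several clients and their separate connection pools. This is a JDK-only illustration; `RandomPool` is a hypothetical stand-in and not the `AmazonS3ClientPool` class referenced in the configuration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

/** Hypothetical fixed-size pool that returns a randomly chosen instance. */
final class RandomPool<T> {
    private final List<T> clients;

    RandomPool(int maxSize, Supplier<T> factory) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        List<T> created = new ArrayList<>(maxSize);
        for (int i = 0; i < maxSize; i++) {
            created.add(factory.get()); // each instance gets its own internal state
        }
        this.clients = Collections.unmodifiableList(created);
    }

    /** Picks an instance without any shared lock or counter. */
    T pick() {
        int index = ThreadLocalRandom.current().nextInt(clients.size());
        return clients.get(index);
    }

    int size() {
        return clients.size();
    }
}
```

Random selection via `ThreadLocalRandom` avoids the shared counter a round-robin scheme would need, at the price of a slightly less even distribution.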
  23. WHAT TO DO WITH THIS?
      • Hardcore approach (classpath overrides of the following classes)
        • Our own AbstractAWSSigner that uses a third-party, lock-free HmacSHA1 signing algorithm
        • ResponseMetadataCache without locks (send metadata to /dev/null)
        • AmazonHttpClient to remove the call to System.getProperty
        • DateUtils using Joda-Time (now fixed in the SDK itself)
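The third-party signing code mentioned above is not shown in the slides. As a different, JDK-only way to reduce contention in signing, one can keep a `Mac` instance per thread so that the provider lookup in `Mac.getInstance` happens once per thread rather than once per request. The sketch below shows that swapped-in technique; `ThreadLocalHmacSha1` is a hypothetical name and this is not the approach the authors describe.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical signer that reuses one Mac instance per thread. */
final class ThreadLocalHmacSha1 {
    private static final String ALGORITHM = "HmacSHA1";

    // Mac instances are not thread-safe, so each thread keeps its own.
    private static final ThreadLocal<Mac> MAC = ThreadLocal.withInitial(() -> {
        try {
            return Mac.getInstance(ALGORITHM);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    });

    private ThreadLocalHmacSha1() {
    }

    /** Computes HMAC-SHA1 of data under key using the calling thread's Mac instance. */
    static byte[] sign(byte[] key, byte[] data) {
        Mac mac = MAC.get();
        try {
            mac.init(new SecretKeySpec(key, ALGORITHM)); // re-keying also resets the Mac
        } catch (GeneralSecurityException e) {
            throw new IllegalArgumentException("invalid key", e);
        }
        return mac.doFinal(data);
    }

    /** Lower-case hex encoding, for logging and tests. */
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }
}
```

Whether this removes enough contention for a given workload is something only profiling on production data can tell.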
  24. PERFORMANCE ACHIEVED
      Dstat output. User-mode CPU usage mostly related to data processing.
      [Figure: dstat charts of CPU (user, system, idle), network transfer (IN/OUT), IRQ/CNTX]
  25. OPTIMISATIONS RESULT
      Please reduce your request rate. (Service: Amazon S3; Status Code: 503; Error Code: SlowDown)
  26. "The most amazing achievement of the computer software industry is its continuing cancellation of the steady and staggering gains made by the computer hardware industry."
      – HENRY PETROSKI