AWS featuring Mechanical Turk for Financial Services_2014


AWS featuring Mechanical Turk for Financial Services; case studies, overview of Mechanical Turk for financial services providers, overview of AWS

Published in: Technology
  • My name is Daniel Gray, Principal Sales with Amazon Mechanical Turk, based in Seattle, WA. During the next 30-45 minutes, I’d like to illustrate how financial services Requesters are shifting their human-intensive use cases, such as data collection, language/translation processing, and unstructured data processing, to Mechanical Turk’s flexible Workforce.

    INNOVATE: Requesters with multiple use cases can break them down into repeatable tasks for the Workforce to perform, rather than taxing internal resources or even vendors with spiky workloads. Mechanical Turk is ideal as an on-demand utility for ‘human judgment.’

    Today’s session will cover 3 areas:
    1. The Customer: how traditional work models are unable to efficiently process the volume, velocity and variety of data needs that organizations have today.
    2. Mechanical Turk: The marketplace to access a flexible workforce of over 500K Workers across 190 countries. Requesters can access the Workforce directly or indirectly, via the Web UI and/or the API.
    3. AWS Overview: Why AWS is selected as the Solution of Choice.
  • AWS is solving problems for big organizations across many verticals and geographies. We’re extremely proud of our customer list and happy to know that we’re providing good outcomes and better results for some of the best firms in the world

    The AWS Premier Consulting Partner designation highlights the top APN Consulting Partners globally that have distinguished themselves by investing significantly in their AWS practice, growing their AWS business, providing exceptional customer service and helping a large number of customers run their applications on AWS. We have announced 22 consulting partners as 2014 premier partners.
  • You might have questions about security in the cloud, but our biggest and most conservative customers have found that we’re able to meet their security requirements, and often we can provide a better security profile than what they can deliver internally. The AWS cloud infrastructure has been designed and managed in alignment with regulations, standards, and best-practices including HIPAA and ISO 27001.

    Recently we announced AWS CloudTrail, a service that records API calls made on your account and delivers log files to your Amazon S3 bucket. CloudTrail provides increased visibility into AWS user activity that occurs within an AWS account and allows you to track changes that were made to AWS resources. This allows enterprises to run comprehensive security analysis and better manage their governance and compliance efforts.
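
As a rough illustration of the kind of security analysis these log files allow, the sketch below counts recorded API calls per user, service and action. It assumes a log file has already been downloaded from the S3 bucket and decompressed; the sample records, user names and the `summarize_api_calls` helper are hypothetical and exist only for this example.

```python
import json
from collections import Counter

# Hypothetical, hand-written sample in the general shape of a CloudTrail
# log file: a top-level "Records" list, one entry per recorded API call.
SAMPLE_LOG = """
{"Records": [
  {"eventTime": "2014-03-06T21:22:54Z", "eventSource": "ec2.amazonaws.com",
   "eventName": "StartInstances", "userIdentity": {"userName": "alice"}},
  {"eventTime": "2014-03-06T21:25:10Z", "eventSource": "ec2.amazonaws.com",
   "eventName": "StartInstances", "userIdentity": {"userName": "alice"}},
  {"eventTime": "2014-03-06T21:30:02Z", "eventSource": "s3.amazonaws.com",
   "eventName": "GetObject", "userIdentity": {"userName": "bob"}}
]}
"""


def summarize_api_calls(log_text):
    """Count recorded API calls per (user, service, action)."""
    records = json.loads(log_text).get("Records", [])
    counts = Counter()
    for record in records:
        # Not every identity type carries a userName, so fall back to "unknown".
        user = record.get("userIdentity", {}).get("userName", "unknown")
        key = (user, record.get("eventSource"), record.get("eventName"))
        counts[key] += 1
    return counts


summary = summarize_api_calls(SAMPLE_LOG)
for (user, service, action), count in sorted(summary.items()):
    print(user, service, action, count)
```

A governance team could run a summary like this over each delivered file and flag unexpected users or actions for review.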
  • Amazon had millions of Web pages that described individual products, but it needed to weed out duplicate pages.
    Software could help, but algorithmically eliminating all the duplicates was impossible.
    Mechanical Turk was born: a Web site where people would look at product pages and be paid a few cents for every duplicate page they correctly identified.
    Mr. Bezos figured that what had been useful to Amazon would be valuable to other businesses, too. In November 2005, Amazon made Mechanical Turk’s API public.

    Mechanical Turk Overview
    1. Crowdsourcing, and specifically Mechanical Turk, gives businesses access to on-demand, scalable resources to solve their business problems. Requesters typically pursue the following business goals via Mechanical Turk:
    - Reduce cost, transform fixed costs to variable expense
    - Improve scalability or elasticity (i.e. bursting up & down) in line with workload type (i.e. on-demand)
    - Accelerate time-to-market
    - Improve quality / accuracy
    - Increase revenue
    2. Workforce: The ability to tap into a workforce of over 500,000 people around the world can enable you to move faster.
    3. Fixed costs: By shifting from a “fixed cost” model to an “on-demand” model, companies outsource work without making long-term commitments, and they can iterate faster. Requesters are typically in one or both of the following situations:
    - Seeking to supplement, or shift from, using FTEs, staff augmentation, or traditional outsourcing (i.e. on-the-bench) to perform routine, human-intensive tasks.
    - Seeking to train/test their algorithms by quickly developing large sets of data.
    4. Iterate faster: Especially in an agile product development environment, if you can iterate faster and fail cheaply, you can innovate faster and focus on your core business rather than on operational challenges like data cleansing.
    A. Empowering developers to build against your platform doesn’t just create value for partners
    B. it expands the ecosystem, increases retention, and drives up the value of the platform.
    C. Most importantly, end customers win…when all their products work seamlessly together.
  • Let’s sync on the definition of ‘crowdsourcing’ because it can mean different things. The umbrella of crowdsourcing breaks out into 4 distinct groups:

    Microtasking: Mechanical Turk is optimized for microtasking; here, you’re breaking down a larger project into atomic level tasks in order for an army of Workers to perform at scale. Instead of paying Workers by time, you’re shifting the paradigm to paying Workers by result.
    Contests: A 2011 study broke down the use of ‘crowdsourcing’ by vertical; since then, we’ve only seen an increase in the number of companies across all verticals (some verticals more quickly than others) using Mechanical Turk for a variety of use cases.
  • Enterprises use AWS for running virtually any workload. Some of the most prominent ones include Market Research, Data Cleansing, and Document Processing. Other use cases include:
    1. data training (i.e. building up sample data to test algorithms)
    2. data collection (i.e. categorization, moderation, time stamping, tagging/annotation, authoring, translation, transcription, research/surveys, sentiment analysis, more).

    Let’s look into some of these use cases.

    AWS Mechanical Turk offers a reliable and secure crowd infrastructure platform that enables enterprises to quickly launch entire enterprise workloads into the crowd. One of the great examples of crowdsourcing at scale is LinkedIn, which uses Mechanical Turk for transcription, fully supporting its CardMunch product.

  • Mechanical Turk: Story continues
    We already talked about how Amazon needed to solve the ‘data cleansing’ problem by finding and eliminating duplicates.
    Using pages for illustration purposes, let me share 2 more typical use case examples
    Think of Mechanical Turk use cases in 2 primary buckets:
    Data collection: In data collection, the crowd collects data for you, in the form of cleansing, aggregating, moderating, categorizing, transcribing, rating, authoring, surveys/research, attributing or tagging.
    Data training: In data training, the crowd helps you quickly develop large sets of training data to help train your algorithms
    4. [CLICK] This shows an example of attribution, where missing product data is added by Workers, improving searchability.
  • …and here’s another example use case, categorization, where Workers identify relevant product results to improve search relevancy.
  • Now I’ve shared some examples of data problems that Amazon needs to tackle at scale (deduplication, attribution, categorization). What are the Work Model options to do this?

    Insourcing: Is a work model that uses in-house staff to perform work. Fixed resources impact cost, speed and scalability, while ideally achieving the greatest savings in quality. [CLICK]
    Outsourcing: Is a work model that contracts a service provider to perform work, using workers ‘on-the-bench’. Savings are improved, but efficiencies are still not as optimized as possible. [CLICK]
    Crowdsourcing: Is a distributed work model that breaks work down to the most efficient task-level, and accesses Workers directly, on-demand. Accuracy or quality is the biggest misunderstanding about Crowdsourcing…

    Is NOT an Open Call: This means building and managing a qualified and screened community of individuals to complete the work, NOT launching work into the unknown.
    Is Secure and Reliable: because it’s working on a successful infrastructure which includes Worker quality, workflows, and best practices.
    Is Scalable: Whether you have workload types that fit the i) on/off, ii) fast growth, iii) variable peaks, or iv) predictable peaks, the crowd can burst up/scale down on-demand, and on-task.
    Is not the silver bullet…for all projects or tasks. Some projects are better kept in-house and/or outsourced. But more and more enterprises are shifting human-intensive, routine work to Mechanical Turk…and leveraging the Human API call.

    Cost of Ownership
    A Requester’s cost of ownership in the marketplace is comprised of i) Worker fees, and ii) Amazon fee. Amazon Mechanical Turk collects a 10% commission on top of the reward amount you set for Workers. For example, if a HIT reward is set to $0.20, Amazon Mechanical Turk collects $0.02 for each assignment. The minimum commission charged is $0.005 per assignment. When you grant a bonus, Amazon Mechanical Turk collects 10% of the bonus amount, or a minimum of $0.005 per bonus payment. If you choose to send HITs exclusively to Photo Moderation or Categorization Masters, an additional 20% fee applies.
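
The fee rules above can be worked through with a short sketch. It uses the 10% commission and $0.005 minimum stated in the notes; the function names are hypothetical, and treating the additional 20% Masters fee as a percentage of the reward is an assumption, since the notes do not say what the fee is applied to.

```python
from decimal import Decimal

COMMISSION_RATE = Decimal("0.10")   # 10% on rewards and bonuses (from the notes)
MIN_COMMISSION = Decimal("0.005")   # minimum per assignment or bonus payment
MASTERS_RATE = Decimal("0.20")      # ASSUMPTION: charged on the reward amount


def commission(amount):
    """Amazon fee on a reward or bonus: 10%, but never less than $0.005."""
    return max(amount * COMMISSION_RATE, MIN_COMMISSION)


def assignment_cost(reward, bonus="0", masters=False):
    """Total Requester cost for one approved assignment."""
    reward = Decimal(str(reward))
    bonus = Decimal(str(bonus))
    total = reward + commission(reward)
    if bonus > 0:
        total += bonus + commission(bonus)
    if masters:
        total += reward * MASTERS_RATE
    return total


# The example from the notes: a $0.20 reward carries a $0.02 commission.
print(assignment_cost("0.20"))
# A $0.01 reward hits the $0.005 minimum commission.
print(assignment_cost("0.01"))
# Hypothetical project: 1,000 listings, 3 assignments each, at $0.20.
project_total = 1000 * 3 * assignment_cost("0.20")
print(project_total)
```

`Decimal` is used rather than floats so that fractions of a cent, such as the $0.005 minimum, are not lost to rounding.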
  • Mechanical Turk is the marketplace that gives you PROGRAMMATIC access to a cost-effective, scalable, global workforce of over 500K Workers in 190 countries.
    Like AWS’ ‘cloud’ offerings, which provide access to scalable computing power, Mechanical Turk provides access to scalable human power / or human judgment. In other words, there are tasks humans can do, better, that computing technology cannot do alone, so think of Mechanical Turk as the ‘human API’ call.

    Mechanical Turk’s Partner network is comprised of Consulting Partners, and Technology Partners, intended to ease access and usage of the Mechanical Turk marketplace. As part of the Partner network, Ed from Top Image Systems will profile their technology, its value to you, and how it integrates Mechanical Turk.

    Let’s highlight 3 Partners, and their value-add proposition to the Requester.

    Let’s take a closer look at how Mechanical Turk works at a high-level, and then illustrate how some Requesters structure their Mechanical Turk workflows for success.
  • Begin with a project…and define the goals & key components of your project. For example, your goal might be to clean your business listing database so that you have accurate information for consumers. The sub-components of your project might be to categorize the businesses by listing type (i.e., restaurant or service) and verify that the related address and phone number are current.
    Break it into tasks and design your HIT…so many Workers can work in parallel and faster. For example, if you have 1,000 listings to verify, each listing could be an individual task. Next, design your Human Intelligence Tasks (HITs) by writing crisp and clear instructions, identifying the specific outputs and inputs desired and how much you will pay to have work completed. Calculating reward is a function of defining a competitive effective hourly rate, prorating based on task completion time, competitive marketplace rates, and throttling your cost/accuracy/productivity levers relative to your target performance metrics.
    Publish HITs to the marketplace…hundreds, thousands, even millions at a time. For example, each HIT can have multiple assignments so that different Workers can provide answers to the same set of questions and you can compare the results to form an agreed-upon answer.
    Workers accept assignments…for special skills, you can Qualify the Workforce. For example, if Workers need special skills, specific geography, or specific marketplace rating to complete your tasks, you can require that they pass a Qualification test before they are allowed to work on your HITs.
    Workers submit assignments for review. When a Worker completes your HIT, he or she submits an assignment for you to review.
    Approve or reject assignments…you pay only for approved work. When your work items have been completed, you can review the results and approve or reject them.
    Complete your project…Congratulations; your project has been completed and your Workers paid!
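
Two of the steps above lend themselves to a small worked sketch: prorating a target hourly rate into a per-task reward, and comparing multiple assignments for the same HIT to form an agreed-upon answer. The function names, the $6.00 hourly rate, the 30-second task time and the majority threshold are illustrative assumptions, not Mechanical Turk recommendations.

```python
from collections import Counter


def reward_per_task(hourly_rate, seconds_per_task):
    """Prorate a target effective hourly rate to one task, rounded to cents."""
    return round(hourly_rate * seconds_per_task / 3600, 2)


def plurality_answer(answers, min_share=0.5):
    """Return the most common answer if more than min_share of the
    Workers gave it; otherwise return None so the HIT can be reviewed."""
    if not answers:
        return None
    answer, votes = Counter(answers).most_common(1)[0]
    return answer if votes / len(answers) > min_share else None


# Hypothetical numbers: a $6.00/hour target and 30 seconds per listing.
reward = reward_per_task(6.00, 30)

# Three Workers categorize the same business listing.
agreed = plurality_answer(["restaurant", "restaurant", "service"])
disputed = plurality_answer(["restaurant", "service", "retail"])
print(reward, agreed, disputed)
```

Listings that come back as `None` have no agreed-upon answer and could be republished with more assignments or checked in-house.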
  • Going directly to the crowd….
    …requires companies to define their tasks more precisely….so that anyone who reads their instructions can successfully complete the task.
    There’s no one “right” way to structure your work. However, approaching Mechanical Turk is similar to my story – do you jump right into PPT and start creating slides…or do you storyboard first? Establishing the blueprints for your architecture, & workflows, and learning market dynamics & best practices increases the probability of achieving accuracy, cost-efficiency and productivity at scale.
    Because Mechanical Turk is on-demand…it is easy to spin up a project, measure it, and optimize based on the results.
  • Here’s an example of accessing Mechanical Turk via a Partner. I want to highlight a couple things here:
    Multiple AWS technologies working together (S3 and Mechanical Turk)
    Best practices, including a defined adjudication strategy (qualification, quality control, plurality, known answers), market dynamics, HIT ergonomics
    Workflow, designed and tested prior to scaling work.
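
The ‘known answers’ part of an adjudication strategy can be sketched as follows: a few HITs with pre-verified answers are mixed into the work, and each Worker is scored on those alone. The data layout, Worker and HIT identifiers, and the 0.8 threshold are hypothetical choices for illustration and do not describe the Partner’s actual workflow.

```python
def worker_accuracy(submissions, known_answers):
    """Score each Worker on the HITs whose correct answer is already known.

    submissions:   {worker_id: {hit_id: answer}}
    known_answers: {hit_id: correct_answer}
    """
    scores = {}
    for worker, answers in submissions.items():
        graded = [hit for hit in answers if hit in known_answers]
        if not graded:
            continue  # this Worker has not seen a known-answer HIT yet
        correct = sum(answers[hit] == known_answers[hit] for hit in graded)
        scores[worker] = correct / len(graded)
    return scores


def trusted_workers(scores, threshold=0.8):
    """Workers whose known-answer accuracy meets the (assumed) threshold."""
    return sorted(w for w, s in scores.items() if s >= threshold)


# Hypothetical data: HIT-1 and HIT-2 have known answers, HIT-3 does not.
known = {"HIT-1": "restaurant", "HIT-2": "service"}
work = {
    "W1": {"HIT-1": "restaurant", "HIT-2": "service", "HIT-3": "retail"},
    "W2": {"HIT-1": "restaurant", "HIT-2": "retail"},
    "W3": {"HIT-3": "retail"},
}
scores = worker_accuracy(work, known)
print(scores, trusted_workers(scores))
```

Scores like these can feed the qualification step, for example by limiting later HITs to Workers above the threshold.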
  • This is a simple view of the set of services that we offer. At the core are the compute, storage and data services that are the heart of our offering. We then surround these offerings with a range of supporting components like management tools, networking services and application services. All these capabilities are hosted within our global data center footprint, which allows you to consume services without having to build out your own facilities or procure hardware equipment.

    This view shows the number of new services and features launched since our inception. In 2010, we launched 61 significant services and features, in 2012 it was 159, and this year alone, we have launched 245 services and features. The pace of innovation is accelerating at AWS.

    Our data center footprint is global, spanning 5 continents with highly redundant clusters of data centers in each region. Our footprint is expanding continuously as we increase capacity, redundancy and add locations to meet the needs of our customers around the world.

    AWS has been named a leader in the Gartner MQ for Cloud IaaS for the third year in a row. In addition, Gartner notes that AWS is the overwhelming market share leader, with more than five times the compute capacity in use than the aggregate total of the other fourteen providers.

  • Cost is the conversation starter when it comes to cloud. There are many pieces to the cost conversation when it comes to AWS and your own infrastructure. The first advantage you get in the cloud is that you don’t have to lay out capital expense for hardware and infrastructure before you know the demand. In essence, you convert your capital expense into variable expense. That variable expense on AWS is lower than what most companies can achieve on their own because AWS runs at a massive scale and we pass that scale on to our customers in the form of lower pricing. There are multiple pricing models in AWS, so you can optimize your spend depending on your workload requirements. And the more you use AWS, the lower your unit costs. We have tiered pricing, and for customers doing large data center migrations, we have negotiated custom pricing to make their transitions cost-effective.
  • Enterprises cannot afford to be slow, but if you ask an enterprise leader how long it takes to get a server for running a workload, the typical time frame is 10 to 18 weeks. In the cloud you can spin up thousands of servers in minutes and experiment quickly. If the experiment doesn’t work out, you can spin down those instances and stop paying for them.

    This is a big difference from the old world. In the cloud, you can instantly spin up and down clusters, petabyte-size data warehouses, and new production or dev environments. Everything changes with this kind of agility.
  • We see our customers do amazing things when they reduce the cost of experimentation- it moves IT from being a roadblock, where each idea costs lots of money and takes lots of time, to being an enabler where you can launch a speculative project quickly and cheaply. It allows firms to take more chances on ideas, and gives them a shot at winning big, as opposed to being scared to even try.
  • Many enterprises understand the value proposition of cloud, but worry that using a cloud or on-premises infrastructure is a binary choice. It is not. We understand that enterprises have a number of on-premises data centers that they are not ready to retire yet; what they really want is the ability to use their on-premises data centers easily with AWS.
  • We have spent the last couple of years making this integration simpler and easier, and this is an area where we’ll be spending significant resources in the future.
  • We have launched several features to support this vision of integrating your on-premises infrastructure with AWS. For identity federation we have the ability to integrate with Active Directory and SAML. We have built a number of network capabilities, including Amazon Virtual Private Cloud that allows you to practically cordon off part of our network and deploy AWS resources into it. Many enterprises have deployed VPC as an extension of their existing data centers.

    We also have AWS Direct Connect, which allows private connections between your data center and AWS. We continue to encrypt all our persistent data. We also have Storage Gateway, a virtual appliance that allows you to store your primary data in Amazon S3 and retain your frequently accessed data locally, or to store your primary data locally and asynchronously back up point-in-time snapshots of this data to Amazon S3.
  • We have also worked with a number of third-party providers so that you can have a single pane of glass to manage your applications. This lets you see your deployments in on-premises and AWS environments in one view. We work with BMC, CA and others to make this easier for customers.
  • To summarize, AWS is a great fit for you if you’re building new applications, facing a technical refresh during the next year, or planning to add capacity for your growing workloads.
    1. AWS for Financial Services featuring Mechanical Turk
    2. It’s about the Customer: AWS Customers & Use Cases. Innovation: Amazon Mechanical Turk, Flexible Workforce. AWS Overview: Overview, Solution of Choice, Workloads. Daniel Gray, Amazon Mechanical Turk, Principal, Business Development, Seattle, WA
    3. Trusted by Enterprises Worldwide. Used by Government Agencies & Educational Institutions Worldwide. 2014 Premier Tier Partners.
    4. Case Study #1: FINRA. FINRA selected AWS because it offered the right services while fulfilling the company’s security requirements. By using dynamic clusters (Hadoop, Hive, and HBase), and services such as Amazon Elastic MapReduce (Amazon EMR) and Amazon Simple Storage Service (Amazon S3), FINRA was able to create a flexible platform that can adapt to changing market dynamics. By using the AWS Cloud, FINRA has been able to increase agility, speed and cost savings while allowing them to operate at scale. The company estimates it will save $10 to $20 million annually by using AWS.
    5. Case Study #2: ME Bank. The bank evaluated five cloud providers before deciding to migrate its development and testing environments to the Amazon Web Services (AWS) Cloud. “AWS combined full self-service capabilities with cost-effectiveness,” says Fanning. “We were also impressed by the rapid delivery of new product and service releases.” ME Bank uses Amazon Virtual Private Cloud (Amazon VPC) to provision an isolated, virtual network in the AWS Asia-Pacific (Sydney) Region. ME Bank’s developers on several Transformation teams build new products and services and complete unit testing before delivering code built in the AWS Cloud to a centralized Environment Services team. Environment Services then deploys the code to environments running on AWS for system integration testing, performance testing, and user acceptance testing. Environment Services provisions development and testing environments on behalf of the Transformation teams. Depending on the workload, the team spins up Amazon Elastic Compute Cloud (Amazon EC2) instances ranging from medium to extra-large. AWS CloudFormation is used to replicate instance templates, providing the agility needed to keep pace with the bank’s change program. Environment Services uses AWS Identity and Access Management (IAM) to structure and monitor different levels of security and access for projects. Eventually, Environment Services plans to provide access to AWS so development, test and support teams can provision their own test and development environments. The flexibility and scalability of AWS has enabled ME Bank to ramp up development and testing work as the technology transformation program evolves. “We started the program with only three developers,” says Fanning. “Because of the ease with which we can provision new environments, we now have 150 developers to develop and test new banking applications and services.”
    6. Case Study #3: Bankinter. Bankinter uses the flexibility and power of Amazon Elastic Compute Cloud (Amazon EC2) to perform these simulations, subdividing processes through a grid of Amazon EC2 instances and implementing simulations in parallel on several Amazon EC2 instances to obtain the result in a very effective time period. Bankinter used Java to develop their application and the Amazon Software Development Kit (SDK) to automate the provisioning process of AWS elements. Through the use of AWS, Bankinter decreased the average time-to-solution from 23 hours to 20 minutes and dramatically reduced processing, with the ability to reduce even further when required. Amazon EC2 also allowed Bankinter to adapt from a big batch process to a parallel paradigm, which was not previously possible. Costs were also dramatically reduced with this cloud-based approach.
    7. 02 Flexible Workforce: Mechanical Turk is a marketplace to access a workforce of over 500K Workers in 190 countries.
    8. Crowdsourcing is this decade’s cloud computing. Wired Magazine 2011 study of crowdsourcing by vertical
    9. Financial Services use cases on AWS Mechanical Turk. Market Research: Data collection; Survey; Web research; Sentiment analysis; Data training & labeling; User studies; List assembly. Data Cleansing: Verification; Deduping; Categorization; Merging; Moderation; Normalization. Document Processing: Structured, semi, unstructured; Transcription & extraction; Validation/verification; Data entry; OCR text correction. Example applications: (a) Small business financing: Coupling script technology + Mechanical Turk to verify/dedupe business listings. Over 7 months, published ~500K assignments to the marketplace produced by a Worker pool of >300 Workers. Satisfied w/cost of ownership, productivity is amazing, and accuracy needs to be better measured by Requester. (b) Leading global financial services firm: Conducts research of various types to help guide and evaluate UI designs for customers and prospects. Used Mechanical Turk to recruit medium-to-large sample sizes for simple, short tests like card sorts and tree tests. Adding Mechanical Turk as a standard tool in their research toolkit. (c) Real estate asset valuation & collateral risk assessment: Unstructured real estate records containing multiple data elements not matching internal template.
    10. [Diagram comparing three work models, each rated on Quality, Cost, Speed and Scalability: Crowdsourcing; Service Provider Outsourcing; Insourcing. Other labels: Greatest Savings, You, API]
    11. Marketplace; Technology Partners; Consulting Partners
    12. [Architecture diagram] Amazon S3; Amazon Mechanical Turk; Requester client server (1 Application Master, 14 Application instances); Mobile Clients; MySQL DB Instances; Amazon Mechanical Turk server (1 Application Master, 8 Application Instances); RabbitMQ; Workers; Assignments; Human Intelligence Tasks (HIT); Requester
    13. [Workflow diagram] Sharepoint; CCS; MTurk; Workers; S3. Initial Qualification; Crowdsourcing; Quality Control Workflow; Quality Control Process
    14. 03 AWS Overview: AWS…and why it’s the Solution of Choice.
    15. Increased agility has become the #1 reason businesses choose the AWS cloud… 1. Workload Support: AWS Global Infrastructure; Compute, Storage, Database; Networking; Application Services; Deployment & Administration. 2. Pace of Innovation: New Service Announcements & Updates by year: 2008: 24; 2009: 48; 2010: 61; 2011: 82; 2012: 159; 2013: 280. 3. Global Infrastructure: 10 Regions; 25 Availability Zones; 51 Edge Locations. 4. Capacity: “AWS is the overwhelming market share leader, with more than five times the compute capacity in use than the aggregate total of the other fourteen providers.”
    16. Lower Costs with AWS Up-Front and Increase Savings as Your Usage Grows. 1. Replace up-front capital expense with low variable cost (“Average of 400 servers replaced per customer”). 2. Economies of scale allow us to continually lower costs (38 Price Reductions). 3. Pricing model choice to support variable & stable workloads (On-demand, Reserved, Spot). 4. Save more money as you grow bigger (Tiered Pricing, Volume Discounts, Custom Pricing). Source: IDC Whitepaper, sponsored by Amazon, “The Business Value of Amazon Web Services Accelerates Over Time.” July 2012
    17. Enterprises Can’t Afford to be Slow. Add New Dev Environment; Add New Prod Environment; Add New Environment in Japan; Add 1,000 Servers; Remove 1,000 Servers; Deploy 1 PB Data Warehouse; Shut down 1 PB Data Warehouse. Old World: Infrastructure in Weeks. AWS: Infrastructure in Minutes. Everything changes with this kind of agility
    18. A culture of Innovation: Experiment Often & Fail without Risk. On-Premises: Experiment Infrequently; Failure is expensive; Less Innovation. Experiment Often; Fail quickly at a low cost; More Innovation. $ Millions; Nearly $0
    19. Many Enterprises Worry That These are the Only Two Choices: #1 Build a “Private” Cloud; #2 Rip everything out and move to AWS
    20. The Good News is that Cloud isn’t an ‘All or Nothing’ Choice. Corporate Data Centers; On-Premises Resources; Cloud Resources; Integration
    21. Integrating AWS with your existing On-Premises Infrastructure. Corporate Data Centers: Active Directory; Network Configuration; Encryption; Backup Appliances; Your On-Premises Apps. Cloud counterparts: Users & Access Rules (IAM); Your Private Network (VPC); Encryption (S3, RDS, HSM); Backups (Storage Gateway); Your Cloud Apps. AWS Direct Connect
    22. Single Pane of Glass: Tools to help customers manage resources across environments. Management Tool Partners
    23. Engage with us if… 1. You’re processing human-intensive workloads that can be micro-tasked… 2. You’re depending on internal FTEs and/or outsource/offshore vendors… 3. Your workloads are on/off, fast growth, predictable/variable peaks, backlog… 4. You’re considering outsourcing part or all of your workloads requiring human touch (i.e. judgment)…