Fast Parallel Data Loading with the Bulk API #Forcewebinar UK Salesforce1

Can you load 20 million records into Salesforce in under an hour? If not, this webinar is for you.

You want to load tons of data into Salesforce. No problem, right? Just use the Bulk API and turn on parallel loading. Think again. Unless you carefully plan how to break your big data loads into parallel operations for maximum throughput, those loads can end up behaving more like slow, serial loads.

Transcript of "Fast Parallel Data Loading with the Bulk API #Forcewebinar UK Salesforce1"

  1. Safe harbor statement under the Private Securities Litigation Reform Act of 1995: This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services. The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of intellectual property and other litigation, risks associated with possible mergers and acquisitions, the immature market in which we operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is included in our annual report on Form 10-Q for the most recent fiscal quarter. These documents and others containing important disclosures are available on the SEC Filings section of the Investor Information section of our Web site. Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.
  2. [Chart: Records/Hour, scale 0 to 25,000,000. Categories: OK, Fast, Faster]
  3. [Diagram: Serial vs. Parallel. Labels: 20M records; 5M records (x4); Time]
  4. [Diagram: Serial vs. Parallel. Labels: 5M records (x4); 20M records; Time]
  5. [Diagram: Serial vs. Parallel. Labels: 5M records (x4); 20M records; Time; Throughput inhibitors]
  8. [Diagram: 16 threads over Time]
     • One job
     • 100 batches
     • 10,000 records/batch
     • 1M total records
  9. Concurrency Mode: Serial
     Records Loaded: 1 million
     Records Failed: 0
     Run Time: 77 minutes
     Work Completed: 75 minutes
     Throughput: 13,000 records per minute
     Degree of Parallelism: 0.97
     Key Problem: Degree of parallelism explicitly limited to ~1.
     Solution: Explore parallel load for increased throughput.
  10. Serial Run
     • Low degree of parallelism
     [Chart: Throughput (Records/Min, 0 to 350,000) vs. Degree of Parallelism (1 to 20). Series: Serial]
  11. [Diagram: 16 threads over Time]
     • One job
     • 100 batches
     • 10,000 records/batch
     • 1M total records
  13. Concurrency Mode: Parallel
     Records Loaded: 396,600
     Records Failed: 603,400
     Run Time: 17 minutes
     Work Completed: 3 hours 15 minutes
     Throughput: 22,000 records per minute
     Degree of Parallelism: 11.5
     Key Problem: Lock exceptions. Server worked significantly harder but no increase in throughput.
     Solution: Run the load in serial mode or manage locks.
  14. Parallel Run 1
     • High degree of parallelism
     • Low throughput due to locks
     [Chart: Throughput (Records/Min, 0 to 350,000) vs. Degree of Parallelism (1 to 20). Series: Serial, Parallel 1]
  16. Concurrency Mode: Parallel
     Records Loaded: 1 million
     Records Failed: 0
     Run Time: 3 minutes and 30 seconds
     Work Completed: 1 hour
     Throughput: 320,000 records per minute
     Degree of Parallelism: 19
     Key Problem: None
     Solution: n/a
  17. Parallel Run 2
     • High degree of parallelism
     • High throughput
     [Chart: Throughput (Records/Min, 0 to 350,000) vs. Degree of Parallelism (1 to 20). Series: Serial, Parallel 1, Parallel 2]
  20. Concurrency Mode: Parallel
     Records Loaded: 1 million
     Records Failed: 0
     Run Time: 4 minutes
     Work Completed: 1 hour
     Throughput: 250,000 records per minute
     Degree of Parallelism: 16.5
     Key Problem: Minimal overhead due to locks
     Solution: Remove all unnecessary locks
  21. Parallel Run 3
     • High degree of parallelism
     • High throughput
     [Chart: Throughput (Records/Min, 0 to 350,000) vs. Degree of Parallelism (1 to 20). Series: Serial, Parallel 1, Parallel 2, Parallel 3]
  23. Controlled Feed Run
     • Reduced parallelism
     • Expected throughput
     [Chart: Throughput (Records/Min, 0 to 350,000) vs. Degree of Parallelism (1 to 20). Series: Serial, Parallel 1, Parallel 2, Parallel 3, Controlled Feed]
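
The job layout on slides 8 and 11 (one job, 100 batches, 10,000 records per batch, 1M total records) can be illustrated with a minimal Python sketch. It only shows how a record set might be cut into fixed-size batches on the client side before submission. The function name `split_into_batches` is hypothetical, and the sketch makes no Bulk API calls.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_batches(records: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `records`, each at most `batch_size` long.

    Hypothetical helper for illustration; it is not part of any Salesforce library.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(records), batch_size):
        yield list(records[start:start + batch_size])


if __name__ == "__main__":
    # Stand-in data: integers in place of real records.
    records = range(1_000_000)
    batches = list(split_into_batches(records, 10_000))
    print(f"{len(batches)} batches, {len(batches[0])} records in the first batch")
```

With 1,000,000 records and a batch size of 10,000, this produces the 100 batches described on the slides. How those batches are ordered and grouped is what the later runs vary.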
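
The run summaries on slides 9, 13, 16, and 20 can be cross-checked with a short Python sketch. The slides do not define their metrics, so the formulas here are assumptions inferred from the serial run: throughput as records loaded divided by run time, and degree of parallelism as work completed divided by run time (75 / 77 ≈ 0.97). The `Run` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One load run, using the figures shown on the summary slides."""
    name: str
    records_loaded: int
    run_minutes: float   # wall-clock "Run Time"
    work_minutes: float  # "Work Completed"

    @property
    def throughput_per_minute(self) -> float:
        # Assumed definition: records loaded per minute of run time.
        return self.records_loaded / self.run_minutes

    @property
    def degree_of_parallelism(self) -> float:
        # Assumed definition: work completed divided by run time.
        return self.work_minutes / self.run_minutes


RUNS = [
    Run("Serial", 1_000_000, 77, 75),
    Run("Parallel 1", 396_600, 17, 195),   # 3 hours 15 minutes of work
    Run("Parallel 2", 1_000_000, 3.5, 60),
    Run("Parallel 3", 1_000_000, 4, 60),
]

if __name__ == "__main__":
    for run in RUNS:
        print(f"{run.name:<11} "
              f"throughput={run.throughput_per_minute:>9,.0f} records/min  "
              f"parallelism={run.degree_of_parallelism:5.2f}")
```

Under these assumed formulas, the serial run and the degree of parallelism for Parallel 1 come out close to the slide figures. The other parallel figures differ somewhat, presumably because the slides round the run time and work completed.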