1. Processing large volume of data with
MuleSoft and salesforce
Faridabad MuleSoft Meetup Group
July 15, 2023
2. Information Classification:
General
Safe Harbour Statement
● Both the speaker and the host are organizing this meetup in their individual capacities only. We are
not representing our companies here.
● This presentation is strictly for learning purposes. The organizer and presenter take no
responsibility that the same solution will work for your business requirements.
● This presentation is not meant for any promotional activities.
3.
A recording of this meetup will be uploaded to the events page within 24 hours.
Questions can be submitted at any time in the Chat or Questions & Answers tab.
Make it more interactive!
Give us feedback! Rate this meetup session by filling out the feedback form at the end of the day.
We love feedback! It's the bread and butter of the meetup.
Housekeeping
5.
Senior Solution Architect and APIN Competency Head
EPAM Systems
Jitendra Pravin Bafna
Overall 15+ years of experience in the IT industry.
APIN Competency Head at EPAM Systems.
Holding various credentials from MuleSoft:
• MuleSoft Ambassador since 2021.
• MuleSoft Speaker (spoken at 70+ meetups and other events).
• MuleSoft Meetup Leader (Mumbai, Surat, Nashik and Engineering Student – organized
more than 70 MuleSoft Meetups).
• MuleSoft Delivery Champion and Go-To-Market Champion.
• All MuleSoft certifications (Platform Architect, Integration Architect and Developer).
• Received more than 10 MuleSoft Community Awards and the All-Stars award twice.
Expertise in setting up the MuleSoft platform (CloudHub, RTF, customer-hosted Mule), MuleSoft
security, and API strategy and governance. Executed more than 20 MuleSoft projects,
including Catalyst platform launches, API strategy and governance, API
enablement, and API implementation.
Published more than 300 videos related to MuleSoft on YouTube.
Published more than 200 blogs/articles on MuleSoft Blogs, Medium and DZone.
Expertise in integration between MuleSoft and other systems such as Salesforce, NetSuite,
SAP, and databases.
7.
Salesforce Connector
Anypoint Connector for Salesforce (Salesforce Connector) enables you
to accelerate your Salesforce integrations across Sales Cloud, Service
Cloud, Salesforce Platform, and Force.com. The connector gives you
access to all Salesforce entities to enable automation of your business
processes to help maximize your investments in services and solutions
like enabling your sales teams, increasing revenue, and serving your
customers better. This connector:
• Provides a powerful solution to implement top Salesforce
integration patterns.
• Supports all create, read, upsert, and delete operations across
Salesforce objects. Additionally, it supports Salesforce's bulk
operations that are optimized for working with large sets of data.
• Enables you to easily leverage custom fields and custom entities to
solve integration needs for your custom Salesforce instance.
8.
Salesforce Bulk API v2
Bulk API v2.0 is available in API version 41.0 and later. Compared to Bulk API v1.0, Salesforce has
dramatically simplified the pre-processing steps. This session walks through how to
bulk upload contact records in CSV format to Salesforce through Bulk API v2.0.
• You can submit up to 15,000 batches per rolling 24-hour
period.
• Maximum number of records uploaded per 24-hour
rolling period - 150,000,000 (15,000 batches x 10,000
records per batch maximum)
• Maximum time that a job can remain open – 24 hours.
• Maximum file size – 150 MB Per Job.
• Maximum number of fields in a record – 5000
• Maximum number of characters in a record - 400,000
• Timeout for retrieving query results – 20 Minutes
More details: https://developer.salesforce.com/docs/atlas.en-us.244.0.salesforce_app_limits_cheatsheet.meta/salesforce_app_limits_cheatsheet/salesforce_app_limits_platform_bulkapi.htm
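The limits above can be checked client-side before a job is ever submitted. Below is a minimal sketch; the limit constants are taken from the list above, while the helper name and the record representation (a list of field values per record) are illustrative assumptions:

```python
# Sketch: pre-flight checks against the Bulk API v2 ingest limits listed above.
# Limit values come from the Salesforce limits cheat sheet; names are illustrative.

MAX_FILE_SIZE_BYTES = 150 * 1024 * 1024   # 150 MB per job
MAX_FIELDS_PER_RECORD = 5_000
MAX_CHARS_PER_RECORD = 400_000
MAX_RECORDS_PER_24H = 15_000 * 10_000     # 15,000 batches x 10,000 records each

def validate_ingest(csv_bytes: bytes, records: list[list[str]]) -> list[str]:
    """Return a list of limit violations (empty list means the job looks safe to submit)."""
    problems = []
    if len(csv_bytes) > MAX_FILE_SIZE_BYTES:
        problems.append("CSV exceeds 150 MB per-job limit")
    for i, rec in enumerate(records):
        if len(rec) > MAX_FIELDS_PER_RECORD:
            problems.append(f"record {i} has more than 5,000 fields")
        if sum(len(f) for f in rec) > MAX_CHARS_PER_RECORD:
            problems.append(f"record {i} exceeds 400,000 characters")
    return problems
```

Note how the 150,000,000-records-per-day figure quoted above is simply the batch limit multiplied by the per-batch record limit.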
9.
Block Size - Batch Processing
A lower block size performs more I/O, because more (smaller) blocks are queued and dequeued. A higher
block size reduces I/O, but records within a block are processed sequentially by a single thread, so
fewer, larger blocks mean less parallelism.
• If you have millions of records with payload sizes in KBs, you can process blocks with a high number
of records without memory issues. Here, a larger block size improves the batch job time.
• If you need to process heavy payloads, such as files several MBs in size, consider keeping the block
size smaller to distribute the load and avoid memory issues.
The "Max Concurrency" setting of the batch job controls how many blocks are processed in parallel
(2 * the number of CPU cores by default, unless you configure it on the Batch Job scope).
Total Records – 100
Number of blocks = Total Records / Block Size = 100/100 = 1 block
In this scenario, all 100 records land in a single block, which is processed sequentially by one thread.
There is no parallelism, so execution time will be high.
Demonstration 1 – Adding records into Salesforce one by one with a block size of 100
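The block arithmetic behind this demonstration can be sketched as follows (a simple model of how Mule queues records in blocks; the helper name is illustrative):

```python
import math

def number_of_blocks(total_records: int, block_size: int) -> int:
    """Records are queued in blocks of `block_size`; each block is handed to one thread."""
    return math.ceil(total_records / block_size)

# Demonstration 1: 100 records with a block size of 100 -> a single block,
# so one thread processes all 100 records sequentially.
```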
10.
Total Records – 100
Number of blocks = Total Records / Block Size = 100/10 = 10 blocks of 10 records each
In this case, each block holds 10 records, and the blocks are dispatched to multiple threads and
processed in parallel. Records within each block are processed sequentially by their thread.
This execution is faster and takes less time compared to demonstration 1.
Demonstration 2 – Adding records into Salesforce one by one with a block size of 10
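To see why this demonstration is faster, one can model the dispatch of blocks to worker threads. This sketch is illustrative only (it assumes every record costs the same to process, and the function name is invented); it counts how many records the busiest thread must work through sequentially:

```python
import math

def worst_thread_workload(total_records: int, block_size: int, max_concurrency: int) -> int:
    """Records processed sequentially by the busiest thread, assuming uniform record cost."""
    blocks = math.ceil(total_records / block_size)
    blocks_on_busiest_thread = math.ceil(blocks / max_concurrency)
    return blocks_on_busiest_thread * block_size

# With 100 records and max concurrency 4 (e.g. 2 * 2 CPU cores):
#   block size 100 -> one thread must process all 100 records
#   block size 10  -> the busiest thread processes only 3 blocks = 30 records
```

Under this model the smaller block size finishes in roughly a third of the time, which matches the observed behaviour of the two demonstrations.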
11.
A Batch Aggregator collects records until the Nth record, equal to the size attribute of the Batch
Aggregator, is added. The collection containing all those records then becomes the payload of the Mule
message processed by the processors inside the Batch Aggregator.
For example, if the size of the Batch Aggregator is set to 10, the payload will be a collection of 10
records, and the processors inside the Batch Aggregator will execute once for every 10 records.
Salesforce allows at most 200 records per create call; sending more throws an error.
Number of blocks = Total Records / Block Size = 300/10 = 30
Number of aggregated calls for 300 records with an aggregator size of 200 = 2 (one call of 200 records
and another of 100 records)
Demonstration 3 – Adding records into Salesforce with a block size of 10, aggregating 200 records per call
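The aggregation described above amounts to chunking the stream of records into groups of at most 200 before each Salesforce call, with the remainder flushed at the end of the batch. A minimal sketch (the helper name is illustrative):

```python
def aggregate(records: list, size: int = 200) -> list[list]:
    """Group records into batches of at most `size`, flushing the final partial group,
    mirroring how a Batch Aggregator flushes leftover records at the end of a batch job."""
    return [records[i:i + size] for i in range(0, len(records), size)]

# 300 records with size 200 -> two calls: one of 200 records, one of 100.
```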
12.
In this scenario, for 300 records Salesforce creates one batch in the job and processes the records
asynchronously.
For 11,000 records, Salesforce creates 2 batches in the job – one of 10,000 records and another of
1,000 records – and this splitting is handled automatically by Salesforce.
Finally, we can retrieve the state of the job by passing the JobId; among the states a job can report are:
• UploadComplete
• JobComplete
• Failed
• InProgress
In this scenario, we don't have to worry about concurrency, batch size, or block size.
Demonstration 4 – Processing 300 and 11,000 records with Salesforce Bulk API v2
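The ingest flow in this demonstration follows the standard Bulk API v2 sequence: create a job, upload the CSV, close the job, then poll its state. The sketch below only builds the request payloads; the instance URL, API version, and access token are placeholders, no request is actually sent, and the helper names are illustrative:

```python
import json
import urllib.request

API_VERSION = "v57.0"                            # assumption: any version >= 41.0 works
INSTANCE = "https://example.my.salesforce.com"   # placeholder instance URL
TOKEN = "<access-token>"                         # placeholder OAuth token

def ingest_job_payload(sobject: str, operation: str = "insert") -> dict:
    """Body for POST /services/data/{version}/jobs/ingest."""
    return {"object": sobject, "operation": operation,
            "contentType": "CSV", "lineEnding": "LF"}

def build_request(method: str, path: str, body: bytes, content_type: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated request against the Bulk API v2 endpoints."""
    req = urllib.request.Request(f"{INSTANCE}{path}", data=body, method=method)
    req.add_header("Authorization", f"Bearer {TOKEN}")
    req.add_header("Content-Type", content_type)
    return req

# 1. Create the job:
create = build_request("POST", f"/services/data/{API_VERSION}/jobs/ingest",
                       json.dumps(ingest_job_payload("Contact")).encode(), "application/json")
# 2. PUT the CSV to /jobs/ingest/{jobId}/batches (Salesforce splits it into batches itself)
# 3. PATCH the job with {"state": "UploadComplete"} to close it
# 4. GET the job to poll its state until JobComplete or Failed
```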
13.
In this scenario, we submit a query to Salesforce using Bulk API v2 and retrieve the results
asynchronously. Whenever you need to retrieve thousands to millions of records, Bulk API v2 is the
best option.
Demonstration 5 – Querying with Salesforce Bulk API v2
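A query job follows the same create-then-poll pattern: submit the SOQL in a job-creation payload, then page through the results once the job completes. A minimal sketch of the payload (the SOQL text and helper name are illustrative):

```python
import json

def query_job_payload(soql: str) -> dict:
    """Body for POST /services/data/{version}/jobs/query in Bulk API v2."""
    return {"operation": "query", "query": soql}

# Results are then fetched with GET /jobs/query/{jobId}/results; for large result
# sets Salesforce returns a Sforce-Locator header to page through the remainder.
payload = json.dumps(query_job_payload("SELECT Id, Name FROM Contact"))
```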
16.
● Share:
○ Tweet using the hashtag #MuleSoftMeetups
○ Invite your network to join: https://meetups.mulesoft.com/faridabad
● Feedback:
○ Fill out the feedback survey and suggest topics for upcoming events
○ Contact MuleSoft at meetups@mulesoft.com for ways to improve the program
What’s next?