In Salesforce, a record ID uniquely identifies a record and associates it with other records. Salesforce uses the record ID to reconcile parent-child relationships.
When you run a masking task, the task reconciles relationships through an external ID, a custom field, or a unique field, and writes the data to the target. Salesforce recommends that you create and use external IDs instead of a custom field lookup to insert or upsert the target data. If Salesforce cannot create an external ID, the masking task creates a custom field in Salesforce to perform a lookup on the target. If Salesforce cannot create an external ID or a custom field, the masking task uses a unique field to perform a lookup on the source and the target. The external ID field or the custom field lookup uniquely identifies the parent-child relationship in a record when you insert or upsert data in a masking task.
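The order of preference described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the `Strategy` enum, the `choose_strategy` function, and its three boolean parameters are hypothetical names, not part of the product.

```python
from enum import Enum


class Strategy(Enum):
    EXTERNAL_ID = "external ID"
    CUSTOM_FIELD = "custom field lookup"
    UNIQUE_FIELD = "unique field lookup"


def choose_strategy(can_create_external_id: bool,
                    can_create_custom_field: bool,
                    has_unique_field: bool) -> Strategy:
    """Return the first applicable strategy in the described order of preference."""
    if can_create_external_id:
        return Strategy.EXTERNAL_ID
    if can_create_custom_field:
        return Strategy.CUSTOM_FIELD
    if has_unique_field:
        return Strategy.UNIQUE_FIELD
    # Assumption of this sketch: signal the "no strategy applies" case with an error.
    raise ValueError("no reconciliation strategy is available for this object")
```

For example, `choose_strategy(False, True, True)` returns `Strategy.CUSTOM_FIELD`, because the custom field lookup is preferred over the unique field lookup when an external ID is not available.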
External ID field
A Salesforce external ID field contains the External ID attribute with unique record identifiers from a system outside Salesforce.
The Data Masking task uses external IDs to identify the parent-child relationships between objects in the target database. Salesforce recommends that you create and use external IDs instead of a custom field lookup or a unique field lookup to insert or upsert the target data. When you create an external ID from the Target tab, the Data Masking task appends DMASK_ to the name of the external ID. The Target tab shows the external IDs that you create in the Data Masking task.
The Data Masking task creates an additional field for the external ID in the target at design time. If an external ID already exists for the object, the task uses that external ID. You can either retain or delete the external IDs after you run the task. Retain the external ID fields if you want to perform another upsert operation.
If you cannot create an external ID field for an object, you can select a unique field or an idlookup field for the object from the Target tab when you perform an upsert operation. For example, you can select a unique field or an idlookup field for the RecordType object.
Custom field lookup
The masking task performs a lookup-based reconciliation on the parent target object to get a parent record ID.
When an external ID is not present, the masking task creates a custom field with the same name to perform a lookup in the Salesforce target. A custom field lookup requires one lookup operation on the target.
The masking task performs a custom field lookup in the following situations:
•The target object exceeds the external ID limit.
•The number of unique fields in the target exceeds the limit.
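A lookup-based reconciliation of this kind can be sketched as follows. The sketch assumes that each target parent record carries a custom field that holds the source record ID. The field name `Source_Id__c`, the function name, and the record IDs are hypothetical and used only for illustration.

```python
def reconcile_parent_ids(child_rows, target_parents, lookup_field, parent_ref):
    """Replace the source parent ID on each child row with the target parent record ID."""
    # One pass over the target parents builds the lookup table.
    # It stands in for the single lookup operation on the target.
    target_id_by_key = {p[lookup_field]: p["Id"] for p in target_parents}
    reconciled = []
    for row in child_rows:
        new_row = dict(row)  # copy so that the input rows stay unchanged
        new_row[parent_ref] = target_id_by_key[row[parent_ref]]
        reconciled.append(new_row)
    return reconciled


# Hypothetical data: two target accounts and one contact that still
# references its parent by the source record ID.
target_accounts = [
    {"Id": "001T1", "Source_Id__c": "001S1"},
    {"Id": "001T2", "Source_Id__c": "001S2"},
]
contacts = [{"LastName": "Doe", "AccountId": "001S2"}]

reconciled_contacts = reconcile_parent_ids(
    contacts, target_accounts, "Source_Id__c", "AccountId"
)
```

After the call, the contact references the target account `001T2` instead of the source account `001S2`.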
Unique field lookup
The masking task uses a unique field for the objects on which you cannot create an external ID or a custom field. This reconciliation strategy is for standard objects that contain unique fields.
If there are unique fields present in the object, the task performs the lookup operation based on the unique field. A unique field lookup requires one lookup operation on the source and the target.
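The two lookups, one on the source and one on the target, can be sketched as two dictionaries joined on the unique field value. This is a minimal illustration, not the product's implementation. The function name is hypothetical, and `DeveloperName` is used purely as an example of a unique field.

```python
def map_source_to_target_ids(source_rows, target_rows, unique_field):
    """Map each source record ID to the target record ID with the same unique value."""
    # Lookup on the source: record ID -> unique field value.
    unique_by_source_id = {r["Id"]: r[unique_field] for r in source_rows}
    # Lookup on the target: unique field value -> record ID.
    target_id_by_unique = {r[unique_field]: r["Id"] for r in target_rows}
    # Join the two lookups; source records with no match in the target are skipped.
    return {
        source_id: target_id_by_unique[value]
        for source_id, value in unique_by_source_id.items()
        if value in target_id_by_unique
    }


# Hypothetical RecordType rows with made-up record IDs.
source_record_types = [
    {"Id": "012S1", "DeveloperName": "Partner"},
    {"Id": "012S2", "DeveloperName": "Customer"},
]
target_record_types = [
    {"Id": "012T9", "DeveloperName": "Customer"},
    {"Id": "012T8", "DeveloperName": "Partner"},
]

id_map = map_source_to_target_ids(
    source_record_types, target_record_types, "DeveloperName"
)
```

The resulting map links `012S1` to `012T8` and `012S2` to `012T9`, so a child record that references a source record type can be pointed at the matching target record type.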
If you cannot create an external ID field on an object and the object has no unique field, you cannot perform an insert operation. For example, you cannot perform an insert operation on the OpportunityContactRole object.
The unique field lookup strategy is applicable to the following standard objects:
•AdditionalNumber
•Announcement
•ApexClass
•ApexComponent
•ApexPage
•ApexTrigger
•Attachment
•AuthProvider
•BrandTemplate
•BusinessHours
•BusinessProcess
•CallCenter
•CollaborationGroup
•ContentDistribution
•CorsWhitelistEntry
•Document
•EmailServicesAddress
•EmailServicesFunction
•EmailTemplate
•EntitlementContact
•EntitlementTemplate
•Folder
•Group
•Holiday
•LiveChatTranscriptEvent
•LiveChatTranscriptSkill
•LiveChatUserConfigProfile
•LiveChatUserConfigUser
•LiveChatVisitor
•MailmergeTemplate
•MilestoneType
•NetworkActivityAudit
•Note
•PresenceUserConfigProfile
•PresenceUserConfigUser
•QuestionReportAbuse
•QuestionSubscription
•QuoteDocument
•RecordType
•ReplyReportAbuse
•SelfServiceUser
•StreamingChannel
•Topic
•TopicAssignment
•User
•UserProvAccount
•UserProvAccountStaging
•UserProvMockTarget
•UserProvisioningLog
•UserRole
•WebLink
Junction objects
A junction object is a Salesforce object that contains many-to-many relationships between two related objects.
The relationship details stored within a junction object form a junction relationship. In a many-to-many relationship, each record in one object can link to multiple records in the other object. A junction object stores all the relationships between the two objects. For example, CaseSolution is a junction object that stores the many-to-many relationships between the Case object and the Solution object. The relationship between the Case object and the Solution object is the junction relationship.
You can create a data subset from a junction object and insert data into a junction object. You cannot upsert data into a junction object.
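As an illustration of how a junction object represents a many-to-many relationship, the following sketch models CaseSolution rows as pairs of parent IDs and groups them from either side. The record IDs are made up, and the `group_pairs` helper is a hypothetical name.

```python
from collections import defaultdict

# Hypothetical CaseSolution rows: each row links one case to one solution.
case_solutions = [
    {"CaseId": "C1", "SolutionId": "S1"},
    {"CaseId": "C1", "SolutionId": "S2"},
    {"CaseId": "C2", "SolutionId": "S1"},
]


def group_pairs(junction_rows, key_field, value_field):
    """Group the junction rows by one side of the relationship."""
    grouped = defaultdict(set)
    for row in junction_rows:
        grouped[row[key_field]].add(row[value_field])
    return dict(grouped)


solutions_by_case = group_pairs(case_solutions, "CaseId", "SolutionId")
cases_by_solution = group_pairs(case_solutions, "SolutionId", "CaseId")
```

Case `C1` links to two solutions and solution `S1` links to two cases, which is the many-to-many pattern that the junction object stores.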
Target owner name
In Salesforce, the masking task can assign the source owner name to objects in the target instead of the target connection user name. The target must contain a user with the same alias.
When you select multiple sources, the User object and the other related objects are added to the list of sources. The User object reconciles the source owner name based on the Alias field in Salesforce. If the target contains the same owner name as the source, the masking task adds the source owner name in the target. If the target does not contain the same owner name as the source, the masking task uses the default target connection user name.
If multiple users have the same alias, the masking task chooses a random user to reconcile the relationship with the other object.
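The owner reconciliation rules above can be sketched as a single function. This is an approximation under stated assumptions: the function name, its parameters, and the sample user records are hypothetical, and the random choice only mimics the behavior described for duplicate aliases.

```python
import random


def resolve_owner(source_owner_alias, target_users, connection_user, rng=random):
    """Return the target user name that should own the record."""
    matches = [u for u in target_users if u["Alias"] == source_owner_alias]
    if not matches:
        # No user with this alias in the target: fall back to the connection user.
        return connection_user
    # One match returns that user; several matches return a random one.
    return rng.choice(matches)["Username"]


# Hypothetical target users; two of them share the alias "jdoe".
target_users = [
    {"Alias": "asmith", "Username": "asmith@example.com"},
    {"Alias": "jdoe", "Username": "jdoe@example.com"},
    {"Alias": "jdoe", "Username": "jdoe2@example.com"},
]

owner = resolve_owner("asmith", target_users, "integration@example.com")
fallback = resolve_owner("unknown", target_users, "integration@example.com")
```

Because the choice among duplicate aliases is random, keeping aliases unique in the target is the only way to make the resulting owner predictable.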
Salesforce Bulk API limits
Salesforce limits the amount of data that you can read or write through the Salesforce Bulk API.
To improve performance and reduce the number of API requests for large data sets, use the Salesforce Bulk API.
Batch Limit
You can submit up to 5,000 batches in a 24-hour period. The batches can perform query, write, and delete operations.
Bulk API Writer
The default batch size of the Bulk API Writer is 10,000 rows. The task can therefore process up to 5,000 * 10,000 = 50 million records a day. To configure the batch size of the Bulk API Writer, edit the Secure Agent properties.
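The batch arithmetic can be checked with a short sketch. The constants come from the limits stated above. The function names are hypothetical.

```python
import math

DAILY_BATCH_LIMIT = 5_000    # batches in a 24-hour period, as stated above
DEFAULT_BATCH_SIZE = 10_000  # default rows in a Bulk API Writer batch


def max_rows_per_day(batch_size=DEFAULT_BATCH_SIZE, batch_limit=DAILY_BATCH_LIMIT):
    """Upper bound on the rows that fit into the daily batch allowance."""
    return batch_size * batch_limit


def batches_needed(row_count, batch_size=DEFAULT_BATCH_SIZE):
    """Number of batches required to write row_count rows."""
    return math.ceil(row_count / batch_size)
```

With the defaults, `max_rows_per_day()` gives 50,000,000 rows, and `batches_needed(8_500_000)` gives 850 batches, which is well inside the daily batch allowance.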
Bulk API Reader
With the Salesforce Bulk API Reader, you can retrieve up to 15 GB of data from a query. If the query results exceed 15 GB, the masking task fails. You can manually calculate whether the amount of data that you want to process exceeds the Salesforce limits.
To calculate the amount of data you can query through the Salesforce Bulk API Reader, you can use the following formula:
Sum (the number of bytes in the fields that you want to mask) * (the number of rows in the query results) < 15 GB
In the Salesforce application, view the size of the fields of the objects and add the number of bytes of the fields that you want to mask. Find the total number of rows in the query results for the Salesforce object and apply the formula. If the result is within the 15 GB limit, the task succeeds. If the result is greater than 15 GB, the task fails.
Salesforce Bulk API Limits Example
The Account object contains 100 fields and you want to mask 10 fields of sensitive data. The sum of the size of all the 10 fields that you want to mask is 100 bytes. The total number of rows for the Account object is 500,000.
Use the following formula:
Total query size = 100 bytes * 500,000 rows = 50,000,000 bytes = 0.05 GB
The result, 0.05 GB, is within the 15 GB data limit, and the application can process the task successfully.
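The formula and the Account example above can be sketched in code. The sketch assumes decimal gigabytes (1 GB = 1,000,000,000 bytes), which matches the 0.05 GB result in the example. The function names and the list of per-field sizes are hypothetical.

```python
GB = 10**9             # assumption: decimal gigabytes, as in the 0.05 GB example
LIMIT_BYTES = 15 * GB  # Bulk API Reader query limit stated above


def query_size_bytes(field_sizes, row_count):
    """Sum of the sizes of the masked fields multiplied by the number of rows."""
    return sum(field_sizes) * row_count


def within_limit(field_sizes, row_count):
    """True if the estimated query size is below the 15 GB limit."""
    return query_size_bytes(field_sizes, row_count) < LIMIT_BYTES


# Account example: 10 masked fields that add up to 100 bytes, 500,000 rows.
account_field_sizes = [10] * 10
account_size = query_size_bytes(account_field_sizes, 500_000)
account_size_gb = account_size / GB
```

Here `account_size` is 50,000,000 bytes and `account_size_gb` is 0.05, so `within_limit(account_field_sizes, 500_000)` returns `True`.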
The following table contains some more examples of total query size calculations for the Salesforce objects:
| Object | Number of Rows in Query Results | Sum of the Size of the Fields to Be Masked (Bytes) | Total Query Size |
| --- | --- | --- | --- |
| Account | 500,000 | 1,000 | 0.5 GB |
| Contact | 6,000,000 | 2,000 | 12 GB |
| Lead | 8,500,000 | 2,000 | 17 GB |
The total query sizes for the Account and Contact objects are within the 15 GB data limit, and the application processes the tasks successfully. The total query size for the Lead object is greater than the 15 GB data limit, and the application cannot process the data. To reduce the query size for the Lead object, you can use horizontal or vertical partitioning and create multiple masking tasks.
In horizontal partitioning, you split the rows of a Salesforce object across tasks. Specify a condition in the data filter criteria to reduce the total number of rows in the query results. For example, you can run two masking tasks, one for closed Leads and the other for open Leads.
In vertical partitioning, you split the fields of a Salesforce object across tasks. For the Lead object, you can create two masking tasks: one task to mask 10 fields with 1,300 bytes of data, and another task to mask 3 fields with 700 bytes of data.
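The effect of both partitioning approaches on the Lead object can be checked with a short sketch. It assumes decimal gigabytes. The 1,300-byte and 700-byte field groups come from the vertical partitioning example above, while the 5,000,000 closed and 3,500,000 open Lead row counts are hypothetical values chosen only to illustrate a horizontal split. The function names are also hypothetical.

```python
GB = 10**9             # assumption: decimal gigabytes
LIMIT_BYTES = 15 * GB  # Bulk API Reader query limit


def task_size_bytes(bytes_per_row, row_count):
    """Estimated query size of one masking task."""
    return bytes_per_row * row_count


def all_within_limit(sizes):
    """True if every task stays below the 15 GB limit."""
    return all(size < LIMIT_BYTES for size in sizes)


lead_rows = 8_500_000

# Unpartitioned: 2,000 bytes for each of the 8,500,000 rows.
single_task = [task_size_bytes(2_000, lead_rows)]

# Vertical partitioning: same rows, fields split into 1,300 and 700 bytes.
vertical = [task_size_bytes(1_300, lead_rows), task_size_bytes(700, lead_rows)]

# Horizontal partitioning: same fields, rows split by a filter.
# The closed/open row counts are hypothetical.
horizontal = [task_size_bytes(2_000, 5_000_000), task_size_bytes(2_000, 3_500_000)]
```

In this sketch `all_within_limit(single_task)` is `False`, while both `all_within_limit(vertical)` and `all_within_limit(horizontal)` are `True`, so either split brings each task under the limit.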