Data Architect Study Cards

With SFtoSF - shared records can have what statuses?

A shared record can have one of six statuses, namely, Active (sent), Active (received), Pending (sent), Inactive, Inactive (deleted), and Inactive (converted)

How is a single view of the customer created?

A single view of customer is created by consolidating data from multiple sources in a central hub.

What are the features of Field History Tracking?

Although Field History Tracking can be used to track and store field values in Salesforce, it only allows accessing the field history data for up to 18 months via the user interface and 24 months via the API.

What is the difference between archive and backup?

An archive includes historical, rarely used Salesforce data located out of production. Upon archiving, the information moves to long-term storage for future use. Archiving is about selecting subsets of data from production environments to move to external, long-term storage. A backup, by contrast, is a copy of your entire Salesforce production or sandbox environment that you can use to quickly restore data in the event of accidental data loss or corruption caused by human error, bad code, or malicious intent.

Records for big objects can be viewed via the standard UI.

False. Records are populated via CSV, API, and Apex, and displayed via a Lightning component or Visualforce. You cannot create a standard application tab to expose a big object with its records.

In Classic, you can merge contacts with different primary accounts.

False. Salesforce Classic only allows merging duplicate contacts that have the same primary account.

You can use SOQL in Salesforce to make updates.

False. You can only use it to retrieve data.

How do skinny tables get triggered?

Force.com platform determines at query runtime when it would make sense to use skinny tables, so you don't have to modify your reports or develop any Apex code or API calls

When using Salesforce Connect, what's a big consideration?

OData callout limits per day. An external object references data that's stored outside Salesforce. Access to an external object involves a callout to its associated external system. Salesforce enforces rate limits for these callouts.

Data Harmonization

Data harmonization refers to all efforts to combine data from different sources and provide users with a comparable view.

Ensuring Consistently High Data Quality - 3: Data Integration

Data in multiple systems can be integrated to form a single source of truth (SSOT). Business processes can be evaluated to determine the system of record (SOR) for data synchronization.

Data Survivorship - Decay

Decay refers to the rate at which the trust score may decrease over time.

Query Plan Tool - sObject Type

The name of the queried sObject.

Experience Cloud - Member Based

This is best when users will log in frequently to the Experience Cloud site. A license is purchased for each user accessing the site.

PK Chunking allows splitting a Bulk API query to avoid a full table scan, slow performance or failure of a data extraction or deletion job.

True

Record locking errors can be avoided by partitioning the source data into multiple batches. The records in each batch can be ordered by the parent ID. The Bulk API can also be set to serial mode, but parallel mode is preferred for large data volumes.

True

SOQL queries must be selective, particularly for queries inside triggers for the best performance.

True

Salesforce automatically creates a standard price book entry when a new product record is created, which can be marked as active or inactive.

True

Skinny tables and skinny indexes can also contain encrypted data.

True

Skinny tables are copied to your Full sandbox orgs only.

True

The Einstein Data Detect managed package is available with the purchase of the Salesforce Shield add-on.

True

The Lightning Platform query optimizer determines filter condition selectivity for a SOQL query, report, or list view.

True

Visitors accessing pages do not need a license.

True

You cannot use standard reports & dashboards with big objects.

True

A custom picklist field can have up to 1,000 values.

True

Salesforce SSO - Salesforce can be set up as the ___ provider. ___ can be set up for all the other systems that need to be accessed by the users using the same login.

single sign-on identity, Connected apps

Salesforce Big Object

- Clients and external systems use a standard set of APIs to access big object data - Confirm how many records can be stored; documentation mentions 1 billion - They can be created under Setup > Big Objects or via the Metadata API

When using an upsert operation, which of the following could cause an error? A. Failure due to inaccurate External ID value B. Failure due to duplicate External ID values

B. The DUPLICATE_EXTERNAL_ID error means that one or more records in the target have the same value for the external ID field. Salesforce doesn't know which record to upsert, so throws an error. How to fix: make sure the custom field in Salesforce is marked as External ID and Unique.

Difference between Bulk API and Batch Apex

Batch Apex is used to run asynchronous jobs on large amounts of data in a more or less scheduled way, and it can be used in conjunction with Scheduled Apex. Bulk API is used for data manipulation of large data sets, typically from outside of Salesforce. You can use Bulk API from an external platform, for example using Java or .NET, whereas Batch Apex, as the name suggests, can only be written in the Apex programming language.

Data Governance & Stewardship: 3 - Implement Your Governance Plan

1. Make data entry and management easy 2. Make data lean and effective with regular housekeeping 3. a) standardize data b) eliminate duplicates c) compare your data against a referential data source d) create rules and schedule automatic cleanings e) track, report, and learn

When connecting to Salesforce orgs, you can use Salesforce Connect and Salesforce to Salesforce, but what should you keep in mind?

1. Salesforce Connect shows virtualized data (it doesn't physically store the data in Salesforce) 2. Salesforce to Salesforce shares data, which is duplicated in the receiving org

Query Plan Tool - Additional Info

1. The Query Plan Tool will show a list of available plans that our Query Optimizer can use for the query provided and will be arranged by cost ascending 2. Each Plan will contain information on Cardinality, Operation Type, Cost, sObject Type, and more 3. Each plan has its own Cost value 4. The cost value is derived from the latest gathered database (DB) statistics on the table and values. The plan with the lowest cost will be the plan used. If the Cost is above 1, it means that the query won't be selective.

Channel Account Licenses

1. The Channel Account license is available for Experience Cloud sites and has the same permission and feature access as the Partner Community license. 2. Unlike login or member-based licenses, Channel Account licenses are priced per partner account. Partners then open up access to users.

Guidelines for remembering query thresholds - standard index

1st threshold: 30% of the first 1 million records. 2nd threshold: 15% of the records beyond 1 million. Final threshold: the sum of the 1st and 2nd thresholds, with a ceiling of 1 million records. Mnemonic for a standard index: 30/15, 1 million.

The criteria used for the data survivorship rules can be based on the following - Completeness

A record that has more field values that are populated may be considered more reliable than incomplete records.

Cross-Origin Resource Sharing (CORS)

CORS allows code running in a web browser to communicate with Salesforce from a specific origin. Let's add the URL patterns for Postman.

Event Monitoring

Event Monitoring can be used to access, download and visualize metadata information such as user logins, logouts, API calls, report exports, etc.

What can you export with Event Monitoring?

Event Monitoring in Setup allows exporting data such as file downloads, logins, and logouts.

How else are external objects similar to custom objects?

External objects are similar to custom objects and can also be used in Apex, SOSL, SOQL queries, Salesforce APIs, and deployment via the Metadata API, change sets, and packages.

What features are not available to a platform license?

No access to standard tabs and objects such as forecasts, leads, campaigns, and opportunities

In Salesforce, you can increase query time out values

False. 1. Queries that take longer than two minutes to process will be timed out. 2. For timed-out queries, the API returns an API fault element of InvalidQueryLocatorFault. 3. If a timeout occurs, refactor your query to return or scan a smaller amount of data.

A big object can be connected to external objects.

False. A big object can be connected to internal Salesforce objects via lookup relationships.

Creating public groups and queues slows down sharing performance significantly.

False. Because public groups and queues are not part of the role hierarchy or the territory hierarchy, creating them does not slow sharing performance significantly, and there isn't a strong reason for creating them at a specific time.

Bulkified Apex improves query performance.

False. Bulkified Apex ensures optimal execution time and governor limit usage; however, it does not improve query performance.

Salesforce cannot be used as an MDM tool.

False. Customer 360 Data Manager can be used as a Master Data Management tool to provide the most up to date and accurate customer data across instances

All customers can use flows with threat events.

False. Customers who have purchased the Salesforce Shield or Event Monitoring addon can use flows with threat events.

Data Loader always prevents the creation of duplicate records.

False. Data Loader can prevent duplicate record entry only if you have a unique field being imported.

Data masking and encryption are the same thing.

False. Data masking prevents developers or other users from viewing sensitive data in the user interface or exporting it as plain text. Data encryption prevents malicious attackers from accessing or interacting with sensitive data at rest in the data center.

All users can export and view fields that use classic encryption.

False. Fields that use Classic Encryption can be exported. But if a user who does not have the 'View Encrypted Data' permission exports a Text (Encrypted) field, its value remains encrypted.

When using Customer 360 Data Manager, data is copied to the central hub.

False. It is used to create a unified view of customer data by consolidating records from multiple data sources in a central hub. It generates a global profile for each customer, but data is not copied to the central hub.

All customers can use Salesforce Connect.

False. It's available in Developer edition and for an extra cost in Enterprise, Performance, and Unlimited editions

SOSL and SOQL can be used to query data in a big object.

False. Only normal or async SOQL.

Salesforce Data Mask is used to mask data in production.

False. Only works with sandboxes. Data Mask is a managed package that you install and configure in an Unlimited, Performance, or Enterprise production org. You then run the masking process from any sandbox created from the production org.

PK Chunking splits the Bulk API job into a number of separate batches.

False. PK Chunking splits the Bulk API job into a number of separate queries.

What can happen if you use two indexed fields joined by an OR in the WHERE clause?

If you use two indexed fields joined by an OR in the WHERE clause, your search results can exceed the index threshold. Solution: Decompose the query - break the query into two queries and join the results.
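
The decomposition described above can be sketched in Python. This is only an illustration of the idea, not Salesforce code: `run_query` is a hypothetical stand-in for whatever client executes SOQL and returns rows with an `Id` key, and the field names `Status__c` and `Region__c` are made up for the example.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


def decompose_or_query(
    run_query: Callable[[str], List[Record]],
    base: str,
    filters: List[str],
) -> List[Record]:
    """Run one query per filter and union the results by record Id."""
    merged: Dict[str, Record] = {}
    for condition in filters:
        for record in run_query(f"{base} WHERE {condition}"):
            # A record matching both filters appears in both result sets;
            # keying by Id keeps it only once.
            merged.setdefault(record["Id"], record)
    return list(merged.values())


def fake_run_query(soql: str) -> List[Record]:
    """Hypothetical in-memory stand-in for a real SOQL client."""
    data = {
        "Status__c = 'Open'": [{"Id": "001A"}, {"Id": "001B"}],
        "Region__c = 'EMEA'": [{"Id": "001B"}, {"Id": "001C"}],
    }
    return data[soql.split(" WHERE ", 1)[1]]


results = decompose_or_query(
    fake_run_query,
    "SELECT Id FROM Account",
    ["Status__c = 'Open'", "Region__c = 'EMEA'"],
)
```

Each single-filter query can be evaluated against its own index threshold, and the union by `Id` removes the records that satisfy both conditions.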

Where can person accounts not be enabled?

Person accounts can't be enabled as users for partner Experience Cloud sites and portals because partners represent companies, not individuals

What is the difference between personal and sensitive data?

Personal data is any data that can identify an individual. Sensitive data is a category of personal data that is sensitive as it could result in damage, harm, loss, embarrassment or harassment.

What is considered lean loading?

To load lean: 1. Identify the business-critical operations before moving users to Salesforce. 2. Identify the minimal data set and configuration required to implement those operations. 3. Define a data and configuration strategy based on the requirements you've identified. 4. Load the data as quickly as possible to reduce the scope of synchronization.

Cosmic Grocery has millions of account records in Salesforce. A Lightning web component allows users to retrieve account records by selecting filter criteria. However, the SOQL query that is used by the component to get the records is very slow, which is negatively affecting the user experience. The data architect of the company needs to recommend a suitable approach and any related considerations to optimize and speed up the query.

The Query Plan tool available in the Developer Console can be utilized to optimize and speed up the query. To improve the performance of the SOQL query, selective filters can be used. The tool shows the execution plans that are available for the query and can be used to test the selectivity of the query. The Lightning Platform Query Optimizer helps the underlying database system optimizer produce effective execution plans for Salesforce queries. It determines the best index from which to drive the query, if possible, based on filters in the query. The Query Plan tool makes it possible to check all the available plans that can be used to execute the query. It provides information such as the indexed field(s) used by the optimizer and the cost of using an index compared to a full table scan.

Query Plan Tool

The Query Plan tool in the Developer Console can be used to check the available plans for executing a SOQL query. It can be used to test the selectivity of a query

Index Tables

The Salesforce multitenant architecture makes the underlying data table for custom fields unsuitable for indexing. To overcome this limitation, the platform creates an index table that contains a copy of the data, along with information about the data types. The platform builds a standard database index on this index table. The index table places upper limits on the number of records that an indexed search can effectively return.

Three standard matching rules are available for which objects?

Three standard matching rules are available for accounts, contacts and leads, and person accounts

MDM - Survivorship Techniques

Three types of survivorship techniques are available: most recent, most frequent, and most complete.
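
A rough Python sketch of the three techniques follows. The record layout (`source`, `last_modified`, `phone`, `email`) and the sample values are assumptions made for illustration; a real MDM tool applies its own configurable rules.

```python
from collections import Counter
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def most_recent(records: List[Record]) -> Record:
    """Survivor is the record with the latest ISO-8601 date string."""
    return max(records, key=lambda r: r["last_modified"] or "")


def most_frequent(records: List[Record], field: str) -> Optional[str]:
    """Survivor value is the one that appears most often for a field."""
    values = [r[field] for r in records if r[field]]
    return Counter(values).most_common(1)[0][0] if values else None


def most_complete(records: List[Record]) -> Record:
    """Survivor is the record with the most populated fields."""
    return max(records, key=lambda r: sum(1 for v in r.values() if v))


records: List[Record] = [
    {"source": "CRM", "last_modified": "2023-01-10", "phone": "555-0100", "email": None},
    {"source": "ERP", "last_modified": "2024-03-02", "phone": "555-0100", "email": None},
    {"source": "Web", "last_modified": "2022-07-19", "phone": "555-0199", "email": "a@example.com"},
]

recent = most_recent(records)
frequent_phone = most_frequent(records, "phone")
complete = most_complete(records)
```

With this sample data the three techniques pick different winners (ERP, the shared phone number, and Web respectively), which is why the technique has to be chosen per attribute or per business rule.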

Query Plan Tool - Cardinality

The estimated number of records that the leading operation type would return. For example, the number of records returned if using an index table.

Matching Rules - Exact Matching

The exact matching method looks for strings that exactly match a pattern. If you're using international data, we recommend using the exact matching method with your matching rules. You can use the exact matching method for almost any field, including custom fields

Any 'Transaction' records that are older than three years should be removed and aggregated in Salesforce. The data architect has been asked to suggest automated approaches that would meet this requirement.

To delete a large number of certain 'Transaction' records and aggregate them in a custom object automatically, schedulable batch Apex can be utilized. A job could be scheduled to run each month to identify records older than three years, aggregate totals in a custom object and then delete them.
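
The card names schedulable batch Apex as the mechanism. The Python below only sketches the job's logic (find old records, aggregate them, drop them) under assumed field names `account`, `date`, and `amount`; it is not Apex and does not call Salesforce.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, Tuple


def archive_old_transactions(
    transactions: List[dict], today: date, years: int = 3
) -> Tuple[Dict[str, float], List[dict]]:
    """Return (totals per account for old records, records that remain)."""
    try:
        cutoff = today.replace(year=today.year - years)
    except ValueError:
        # 29 February mapped onto a non-leap year: fall back to 28 February.
        cutoff = today.replace(year=today.year - years, day=28)

    totals: Dict[str, float] = defaultdict(float)
    remaining: List[dict] = []
    for tx in transactions:
        if tx["date"] < cutoff:
            # Aggregate the old record into the summary, then drop it.
            totals[tx["account"]] += tx["amount"]
        else:
            remaining.append(tx)
    return dict(totals), remaining


transactions = [
    {"account": "A1", "date": date(2020, 1, 15), "amount": 100.0},
    {"account": "A1", "date": date(2020, 9, 1), "amount": 50.0},
    {"account": "A2", "date": date(2023, 5, 20), "amount": 75.0},
]
totals, remaining = archive_old_transactions(transactions, today=date(2024, 6, 1))
```

In a real job the summary would be written to the custom object before the source records are deleted, so that a failure between the two steps never loses data.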

Can you access the Metadata API with Visual Studio Code?

Yes - Visual Studio Code allows the use of Metadata API to retrieve, deploy, create, update or delete customization information, such as custom objects, custom fields, and page layouts.

Do skinny tables use indexes?

Yes. Custom indexes on the base table are also replicated, and they usually perform better because of the reduced table joins that happen in the underlying database queries.

What are some attributes to measure a record's score?

To measure and score records based on data quality, a value should be put on the key attributes that are used in data assessment, such as age, completeness, usage, accuracy, consistency, and duplication, along with any other important quality or value metrics

When updating records or importing them, try to group the batches by parent ID.

To minimize how often the shipment records that are updated in multiple, concurrent batches reference a single parent account ID, they can be ordered by the parent account IDs. Furthermore, the administrator should ensure that there are no account IDs that span multiple batches.
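
One way to prepare such batches before a load is sketched below in Python. The field name `AccountId` and the batch size are assumptions for the example, and the function only arranges the data; submitting the batches is left to whichever loading tool is used.

```python
from itertools import groupby
from typing import Dict, List

Record = Dict[str, str]


def batch_by_parent(
    records: List[Record], parent_field: str, batch_size: int
) -> List[List[Record]]:
    """Order records by parent ID and keep each parent's children in one batch."""
    ordered = sorted(records, key=lambda r: r[parent_field])
    batches: List[List[Record]] = []
    current: List[Record] = []
    for _, group in groupby(ordered, key=lambda r: r[parent_field]):
        children = list(group)
        # Start a new batch rather than splitting one parent across two batches.
        # A parent with more children than batch_size yields an oversized batch.
        if current and len(current) + len(children) > batch_size:
            batches.append(current)
            current = []
        current.extend(children)
    if current:
        batches.append(current)
    return batches


shipments = [
    {"Id": "S1", "AccountId": "A2"},
    {"Id": "S2", "AccountId": "A1"},
    {"Id": "S3", "AccountId": "A2"},
    {"Id": "S4", "AccountId": "A3"},
    {"Id": "S5", "AccountId": "A1"},
]
batches = batch_by_parent(shipments, "AccountId", batch_size=3)
```

Because no account ID spans two batches, two batches running concurrently never compete for a lock on the same parent record.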

An Apex trigger and a custom object can be used to store the old and new values of fields for more than 24 months in a Salesforce org.

True

Big objects can be used in Tableau CRM.

True

Bulk API supports a hard delete (physical delete) option, which allows records to bypass the Recycle Bin and immediately become available for deletion

True

Custom indexes can be added to fields to improve the performance of global search, reports, and list views.

True

Extra licensing cost is only required if more than 1 million records need to be stored in a big object

True

Field history tracking data and Field Audit Trail data don't count against your data storage limits.

True

Master Data Management, Lightning Data, Customer 360 Data Manager, Salesforce CDP, and Salesforce Identity can be used to represent a single view of customer.

True

Once your sandbox data is masked, you can't unmask it.

True

Personal information, such as financial information, security codes or passwords, should not be shared through Customer 360 Data Manager.

True

Data Survivorship - Validations

Validations can be defined on data attributes to evaluate and determine the trust score. For instance, if the Phone field has less than 10 digits or does not follow the correct format, the trust score can be reduced by a certain number of points.

Bulk jobs and caching.

When testing and measuring the performance of your extraction, beware of the effects of the Force.com platform's caching. The more tests you run, the more likely that the data extraction will complete faster because of the underlying database cache utilization

How can you empty the Salesforce Recycle Bin? How long does data stay in the Recycle Bin?

The Recycle Bin can be emptied using the UI, the API, or Apex. Data stays in the Recycle Bin for 15 days, or until the Recycle Bin grows to a specific size. The data is hard deleted from the database after 15 days, when the size limit is reached, or when the Recycle Bin is emptied.

Account Data Skew - Sharing Issues

When the owner of an account is changed, sharing recalculations need to be performed for all the child records, which includes recalculating the role hierarchy and sharing rules. If there are too many child records, this process can take a lot of time

Force.com automatically indexes most ___ fields so your users can build cross-object searches and quickly find records that contain strings of interest.

text

Which products are compatible with Customer 360 Data Manager?

1. B2C Commerce Cloud and Service Cloud Enterprise Edition or higher OR 2. Multiple Sales Cloud, Service Cloud or Salesforce Platform orgs Enterprise Edition or higher

Sharing rules, processes, triggers and flows are available for big objects.

False

With regards to GDPR, Salesforce is considered the: A. Data Processor B. Data Consumer C. Data Controller D. Data Subject

A

MDM - Canonical Modeling

A canonical model may be used for communication between different formats used by enterprise systems.

The company is planning to launch a customer community to allow customers to place orders. Once an order has been processed, a shipment will be created with various products in the order. Salesforce will be utilized to store shipments. The company is expecting more than 10,000 new shipment requests every week. Customers should be able to view and report on shipments generated within the previous one year, but they don't require access to any shipments before tha

A custom object can be used to store shipments in Salesforce. But after one year, the oldest shipment records can be archived and migrated to an off-platform system automatically. The data warehouse can be utilized to store the archived data.

What should a data governance plan entail?

A data governance plan should focus on elements such as data definitions, quality standards, roles and ownership, security and permissions, and quality control. Data stewardship includes defining the teams, roles, and activities for data quality improvement and day-to-day maintenance.

Mitigation Strategies for Lookup Skew - 3 (Using Picklists)

A picklist field can be utilized instead of a lookup field when the number of lookup values is relatively low. This can eliminate the locks associated with lookups on the records.

The criteria used for the data survivorship rules can be based on the following - Recency

A record that was recently added, updated or verified is more reliable than one that was modified several years ago. This can be taken into consideration while defining the data survivorship rules.

User Licenses

A user license determines the base features and functionality that the assigned user has access to in the org. Note that each user can be assigned to one user license only

Customer 360 Data Manager - ___, ___ , and ___ are used to generate global profiles and global party IDs. A. Data Preparation rules B. Match rules C. Delete rules D. Reconciliation rules E. Reference rules

A, B, D

Northern Trail Outfitters (NTO) has the following systems: • Customer master - source of truth for customer information • Service cloud - customer support • Marketing cloud - marketing communications • Enterprise data warehouse - business reporting The customer data is duplicated across all these systems and is not kept in sync. Customers are also complaining that they get repeated marketing emails and have to call in to update their information. NTO is planning to implement a master data management (MDM) solution across the enterprise. Which three data issues will an MDM tool solve? (Choose three.) A. Data completeness B. Data standardization C. Data accuracy and quality D. Data loss and recovery E. Data duplication

B, C, E

Lightning Platform Starter & Lightning Platform Plus - they both contain which type of user and permission set licenses?

Both contain a Salesforce Platform user license and a Company Communities permission set license.

OData 2.0 or 4.0 Adapter for Salesforce Connect

Connect to your back office for a complete view of your business. With the OData 2.0 or 4.0 adapter, Salesforce Connect uses Open Data Protocol Version 2.0 or Version 4.0 to access data that's stored outside Salesforce.

What should be reviewed with business users while defining a system of record when there are multiple enterprise systems and data integration?

Data flows

Ensuring Consistently High Data Quality - 4: Augmentation

Data can be made more valuable by making use of third-party data services. Lightning Data can be utilized to update existing data or import new data.

Which users can be part of the operational-level data governance team?

Data Domain Stewards, Analytics/BI Owners, Data Definers, and individual contributors.

Global search does not work with big objects.

True

Can you share data stored in a data warehouse?

ETL and Data Virtualization

Examples of Selective SOQL Queries - Could Be Selective SELECT Id FROM Account WHERE Name != '' AND CustomField__c = 'ValueA'

Here we have to see if each filter, when considered individually, is selective. As we saw in the previous example the first filter isn't selective. So let's focus on the second one. If the count of records returned by SELECT COUNT() FROM Account WHERE CustomField__c = 'ValueA' is lower than the selectivity threshold, and CustomField__c is indexed, the query is selective.

What happens if a bulk job fails?

If a job times out, the Bulk API automatically puts it back in the queue and re-tries it for you

What is data inconsistency?

Inconsistent use of formatting, spelling, and language across records.

What do companies care about?

Increase revenue Reduce cost Ensure compliance Agility

How many records can the Bulk API process per hour?

It can process up to 20 million records per hour

Denormalizing data

It could mean creating a new object to summarize information, but you can also summarize data about child records on parent records; for instance, roll-up summary fields can be used to summarize information for the child records.

What is one difference between the Lightning Platform Starter and Lightning Platform Plus licenses?

Lightning Platform Starter can create 10 custom objects while Lightning Platform Plus allows 110

Mapping Sets

Mapping sets are used in the Cloud Information Model to link a group of related objects and fields that need to be transformed.

What permission do you need to assign a user in Sales Cloud to allow them to see campaigns?

Marketing User

Which feature license is required to access campaigns?

The Marketing User feature license is required to access campaigns.

Bulk API Hard Delete

Permission on profile - can be used in conjunction with Data Loader to hard delete a large number of records from Salesforce. Doing so bypasses the Recycle Bin.

With Customer 360 Data Manager, what can be added to Lightning pages to provide a single view of the customer?

Prebuilt Lightning Components are available to add to Lightning pages to provide a single view of the customer from Customer 360.

Data Survivorship - Precedence

Precedence determines which field value should be considered if the trust score is the same.

What is a Service Cloud User feature license needed for?

Provides access to Salesforce Console for Service

Record Survivorship Techniques - Most Recent

Records are ordered from most recent to least recent. The most recent record can be considered eligible as the survivor

Bulk jobs and idempotence

Remember that idempotence is an important design consideration in successful extraction processes. Make sure that your job is designed so that resubmitting failed requests fills in the missing records without creating duplicate records for partial extractions.
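
A minimal Python sketch of an idempotent extraction target is shown below. It assumes extracted rows carry a unique `Id` and that the local store is a plain dictionary; both are illustrative choices, not part of the Bulk API.

```python
from typing import Dict, Iterable

Record = Dict[str, str]


def merge_extract(store: Dict[str, Record], chunk: Iterable[Record]) -> None:
    """Upsert extracted rows by Id so replaying a chunk never duplicates rows."""
    for record in chunk:
        store[record["Id"]] = record


store: Dict[str, Record] = {}
chunk_1 = [{"Id": "001A", "Name": "Acme"}, {"Id": "001B", "Name": "Globex"}]
chunk_2 = [{"Id": "001C", "Name": "Initech"}, {"Id": "001D", "Name": "Umbrella"}]

merge_extract(store, chunk_1)
merge_extract(store, chunk_2[:1])  # first attempt fails after a partial result
merge_extract(store, chunk_2)      # the retry fills in the missing record
```

Keying on the record `Id` means the retry overwrites the row that already arrived and adds the one that did not, so the store ends with exactly one copy of each record.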

Horizontal Optimization (based on number of fields)

Removing fields from the query (i.e. calculated fields) is also a type of horizontal optimization.

It is a best practice to not exceed 10,000 child records when using a master-detail relationship to avoid record locking during inserts and updates.

True

Setup Audit Trail

Setup Audit Trail can be utilized to monitor Setup changes, such as who has been modifying or deleting fields. It allows downloading an org's full setup history for the past 180 days.

Mnemonic for query thresholds

Standard index: 30/15 1 Million Custom index: 10/5 333K
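
The mnemonic can be turned into a small calculation. The Python below is a sketch of the arithmetic only, using the percentages and ceilings stated in these cards; the function name and signature are invented for the example.

```python
def selectivity_threshold(total_records: int, custom_index: bool = False) -> int:
    """Maximum number of matching records for a filter to count as selective."""
    if custom_index:
        first_pct, rest_pct, ceiling = 10, 5, 333_333
    else:
        first_pct, rest_pct, ceiling = 30, 15, 1_000_000

    first_million = min(total_records, 1_000_000)
    remainder = max(total_records - 1_000_000, 0)
    # Integer arithmetic avoids floating-point rounding on the percentages.
    threshold = first_million * first_pct // 100 + remainder * rest_pct // 100
    return min(threshold, ceiling)


standard_small = selectivity_threshold(500_000)
standard_2m = selectivity_threshold(2_000_000)
custom_2m = selectivity_threshold(2_000_000, custom_index=True)
standard_10m = selectivity_threshold(10_000_000)
custom_10m = selectivity_threshold(10_000_000, custom_index=True)
```

For an object with 2 million records, a standard index filter stays selective up to 300,000 + 150,000 = 450,000 matching records, while a custom index filter is limited to 100,000 + 50,000 = 150,000.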

Where can you not use the State/Country picklists?

State and country/territory picklists don't work with: 1. Salesforce to Salesforce 2. Connect Offline 3. Change sets

Who or what determines the system of record?

The MDM solution

What approach can be utilized to avoid locking issues when importing a large number of child records?

The records should be ordered by the record IDs of the parent records that have been loaded before.

When merging two or more records that exist in different systems, the value of a field can be different in each system. The value that should win or survive in the master hub can depend on which factors?

The value that should win or survive in the master hub can depend on multiple factors, such as trust score, decay, validation, and precedence.
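
How these factors might combine is sketched below in Python. Every number here (the 0-100 trust scale, the decay of 5 points per year, the 20-point validation penalty) is a hypothetical value chosen for the example, and the phone check follows the 10-digit rule from the Validations card; an actual MDM hub defines its own scoring.

```python
from dataclasses import dataclass
from typing import List

DECAY_PER_YEAR = 5     # hypothetical points lost per year since verification
INVALID_PENALTY = 20   # hypothetical penalty for failing validation


@dataclass
class Candidate:
    source: str
    value: str
    base_trust: int    # starting trust score of the source (assumed 0-100 scale)
    age_years: int     # years since the value was last verified
    precedence: int    # lower number wins when trust scores tie


def is_valid_phone(value: str) -> bool:
    """Validation: a phone value needs at least 10 digits."""
    return sum(ch.isdigit() for ch in value) >= 10


def trust_score(c: Candidate) -> int:
    """Base trust, reduced by decay over time and by a failed validation."""
    score = c.base_trust - DECAY_PER_YEAR * c.age_years
    if not is_valid_phone(c.value):
        score -= INVALID_PENALTY
    return score


def survivor(candidates: List[Candidate]) -> Candidate:
    """Highest trust score wins; precedence breaks ties."""
    return min(candidates, key=lambda c: (-trust_score(c), c.precedence))


candidates = [
    Candidate("ERP", "415-555-0100", base_trust=90, age_years=2, precedence=2),
    Candidate("CRM", "415-555-0199", base_trust=80, age_years=0, precedence=1),
    Candidate("Web", "555-0100", base_trust=95, age_years=0, precedence=3),
]
winner = survivor(candidates)
```

Here ERP decays from 90 to 80 and ties with CRM, Web drops to 75 because its value fails validation, and precedence then selects the CRM value as the survivor.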

What are some examples of data transformation?

Transformation can include cleansing data, sorting, aggregating, identifying duplicates, etc.

A Lightning Component is able to perform client-side validation using JavaScript.

True

A SOQL query can be made selective by using an indexed field in a query filter while ensuring that the number of records returned is below the system threshold.

True

A record can belong to only one division at a time.

True

All fields in the OR clause must be indexed for any index to be used.

True

Salesforce Connect can be used to expose data from a particular Salesforce instance (or another external source) in another Salesforce instance. It can use the cross-org adapter.

True. Why would you use Salesforce Connect to connect to Salesforce instances? You're not actually sending data, but exposing it.

Up to four objects may be linked in a report type.

True; however, a custom report type can contain up to 60 object references with up to 1K fields. For example, if you select the maximum limit of four object relationships for a report type, you can select fields via lookup from an extra 56 objects.

How does the query optimizer determine if it can use a standard field index? (Really important point.)

Used if: 1. The filter matches less than 30% of the first million records 2. And less than 15% of additional records, up to a maximum of one million records.

What is a Data Definition Language (DDL)?

Used to define the data structure of a database. DDL statements are used to create, modify, and remove objects in a database. It cannot be used to document the data architecture.

What happens if a bulk query does not execute within the standard timeout limit?

When a bulk query is processed, Salesforce attempts to execute the query. If the query doesn't execute within the standard two-minute timeout limit, the job fails and a QUERY_TIMEOUT error is returned.

What happens when you mark fields for deletion with Salesforce Data Mask?

When you mark fields for deletion, Data Mask transforms sensitive, readable sandbox data into empty sets.

When to Use SOSL

Use SOQL when you know the object and potentially the fields to search. Use SOSL when you don't know which object or field the data resides in.

When you have two filter conditions in a SOQL query, how do the thresholds work for OR?

With the OR operator, each filter must meet the threshold individually

What level of masking can you configure using Salesforce Data Mask?

You can configure different levels of masking, depending on the sensitivity of the data: 1. Replace private data in your sandboxes with random characters. 2. Replace private data with similarly mapped words. 3. Replace private data using pattern based masking. 4. Delete sensitive data.

External IDs cause an index to be created on that field. The query optimizer then considers those fields.

You can create External IDs only on the following fields: 1. Auto Number 2. Email 3. Number 4. Text. Acronym for External ID/index: NETA.

The query optimizer uses similar considerations to determine whether to use indexes when the WHERE clause contains AND, OR, or LIKE.

• For AND, the query optimizer uses the indexes unless one of them returns more than 20% of the object's records or 666,666 total records. • For OR, the query optimizer uses the indexes unless they all return more than 10% of the object's records or 333,333 total records • For LIKE, the query optimizer does not use its internal statistics table. Instead, it samples up to 100,000 records of actual data to decide whether to use the custom index

Cloud Kicks currently has a Public Read/Write sharing model for the company's Contacts. Cloud Kicks management team requests that only the owner of a contact record be allowed to delete that contact. What should an Architect do to meet these requirements?

- Cannot use a validation rule - Need a "Before Delete" trigger or flow to check Owner before record deletion - For flow, introduced in Winter 21

Which custom fields have indexes?

- Fields marked as Unique - Fields marked as External ID

What type of fields can you create for big objects?

- Lookup - Date/Time - Email - Number - Phone - Text - Text Area (Long) - URL

How do you mitigate the following scenario: Record lock errors can occur when Bulk API is used in parallel mode to insert a large number of opportunities that are related to the same parent account record.

- Use serial mode - Group opportunities by parent Account in each batch
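
The grouping idea above can be sketched in Python. This is a minimal illustration, not a Salesforce API: the function name `batch_by_parent`, the record dictionaries, and the `AccountId` key are assumptions made for the example. It packs rows so that all children of one parent land in the same batch whenever they fit, which keeps parallel batches from competing for the same parent lock.

```python
from collections import defaultdict
from typing import Dict, List


def batch_by_parent(records: List[dict], batch_size: int,
                    parent_key: str = "AccountId") -> List[List[dict]]:
    """Pack records into batches so rows sharing a parent stay together.

    Hypothetical helper for illustration; parent_key values are assumed
    to be non-null strings.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    # Collect child rows under their parent ID.
    groups: Dict[str, List[dict]] = defaultdict(list)
    for rec in records:
        groups[rec[parent_key]].append(rec)

    batches: List[List[dict]] = []
    current: List[dict] = []
    for parent_id in sorted(groups):
        group = groups[parent_id]
        # Close the current batch if this parent's rows would not fit in it.
        if current and len(current) + len(group) > batch_size:
            batches.append(current)
            current = []
        # A parent with more rows than batch_size must span several batches.
        while len(group) > batch_size:
            batches.append(group[:batch_size])
            group = group[batch_size:]
        current.extend(group)
    if current:
        batches.append(current)
    return batches
```

A parent with more rows than the batch size still spans several batches; in that case serial mode is the remaining mitigation named in the card.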

How many history retention policies can be defined per object?

1

Difference between SystemModStamp and LastModifiedDate

1. 'LastModifiedDate' is the date and time when a record was last modified by a User. 2. 'SystemModstamp' is the date and time when a record was last modified by a User or by an automated process (such as a trigger).

Big Objects and SOQL Considerations

1. Certain operators cannot be used, including LIKE, INCLUDES, EXCLUDES, NOT IN 2. Not all index fields need to be used in the query, but the fields must be listed in index order within the query with no gaps

What are three techniques to consolidate data

1. ETL 2. Data Virtualization (e.g., Salesforce Connect) 3. Data Warehousing

Where can external objects be accessed in Salesforce?

1. List views 2. Detail pages 3. Record feeds 4. Custom tabs 5. Page layouts

Data Governance Design and Implementation - DATA GOVERNANCE FRAMEWORK DEVELOPMENT

1. Rules & definitions 2. Policies & standards 3. Quality control 4. Security & permissions 5. Roles & ownership

Encountering timeouts when moving large amounts of data

1. Should you encounter timeouts using the SOAP or REST API, you could retry and leverage the Bulk API. 2. Should you encounter timeouts using the Bulk API, you could retry and leverage the Bulk PK Chunking header (Spring '15).

Data Governance Design and Implementation - Implementation

1. Standardize data 2. Eliminate duplicates 3. Ensure accuracy 4. Perform data cleansing 5. Track and monitor

Lightning External App Plus License

1. This license provides highly customized experiences incorporating CRM objects, custom objects, external data. 2. It comes with additional API calls, data storage, and file storage. 3. Ideal use cases for this type of license are dealer, vendor, or supplier portals. 4. Also commonly used for franchise management, marketplaces, and multi-level marketing.

Data Governance Design and Implementation - Data Assessment

1. Who uses the data 2. Why is the data used 3. Which data is used 4. How is the data used 5. What is the data quality

How can you reduce lock contentions when loading tasks?

1. You can reduce lock contention from tasks by organizing your batches by the account a task is associated with. 2. A task record can reference an account in several different ways via the What or Who fields. 3. The What field can reference an account, or reference other objects that in turn reference an account, such as an opportunity, or a custom object that's a child of account. 4. The Who field can also reference an account by referencing a related contact.

What is the time limit on the execution of each batch?

10-minute limit on the execution time of each batch. If a batch fails to execute within that time limit, the system will try to execute it again. If the batch does not complete within 10 retries, it will fail.

A bulk query can retrieve up to __ GB of data, divided into fifteen ___ GB files.

15; 1

Salesforce Connect - Max number of objects

200

Data Loader is optimized to load up to how many records

5 million; if you need to import more than 5 million, use the Bulk API

When do you use a Cross Org adapter?

A Cross Org adapter is used in an external data source when one Salesforce org needs to connect to another Salesforce org

How can field sets work with HTTP callouts?

A Field Set that determines which fields to send in an HTTP callout. Alternate approach: configure an outbound action (with respective fields) that can be triggered via workflow or flow.

With Customer 360 Data Manager, what is created for each individual?

A Global Profile and Global Party ID for each individual is created to share across the instances to provide a unified view of the customer.

Performance of Full Backups

A full backup does not contain a WHERE clause by definition. Backup speed will thus depend on other factors than indexes, for example: 1. Number of records (lines) 2. Number of fields (columns) 3. Type of fields (i.e. rich text fields are slower to backup than checkbox fields) 4. Salesforce API selected 5. Your network capacity 6. Degree of parallelization

Which declarative feature can be used to send information to an external system when a record changes?

A record-triggered flow that uses an outbound message action can be used to send information to an external system when a record changes.

When using Data Loader, the batch size should be set to ___ if a process defined in Salesforce should be triggered. A. 1 B. 2 C. 3 D. 4

A. This means that one record is processed at a time. Duplicate rules can be triggered when using Data Loader to import the lead records, but it should process only one record at a time. This can be accomplished by setting the batch size to 1 in the Data Loader Settings prior to inserting the records.

What API syntax is used for big object names?

API name ends in __b

Account Data Skew

Account Data Skew occurs when too many child records are related to the same parent account record. For example, it occurs when more than 10,000 contacts are associated with the same account

What objects are supported for global matching in Customer 360 Data Manager?

Account, Person Account, Contact, Lead, Customer, and Order

What are some criteria that can be used for defining data survivorship rules?

Accuracy, recency, frequency, completeness

When you enable Contacts to Multiple Accounts, what do you need to do to page layouts?

Add the Related Contacts to the Account Page Layout and the Related Accounts to the Contact Page Layout; the latter shows if the relationship with an account is primary or secondary.

With Customer 360 Data Manager, what are mapping sets?

After adding a connected data source, mapping sets can be created between objects and fields within the source schema and the Cloud Information Model.

What are some of the key attributes that can be used to calculate the data quality score of records?

Age, completeness, accuracy, usage, consistency, and duplication.

MDM - Hierarchy Matching

An MDM solution that supports various hierarchy management features would be ideal for data such as product and customer hierarchies.

MDM - MDM Strategy

An MDM strategy can be defined to outline a single source of truth and the system of record for different types of data.

The company is implementing a new ERP system that will be used for managing transactions and generating invoices. Once the status of a 'Transaction' record changes to 'Paid', information about the record should be sent to the ERP system. A system administrator should be able to specify which fields should be sent to the system. The transaction data should be retained in Salesforce. The data architect has been asked to suggest a simple solution that does not require development or the use of 3rd party integration tools.

An outbound message can be used for this requirement. A flow that uses an 'outbound message' action can send information to an external system when the value of a field changes. It is possible for an administrator to select the fields that should be sent as part of the outbound message.

What are some ways of getting data from offline data storage?

Archive data accessible via Lightning Connect, callout or mash-up

Async SOQL

Async SOQL is a method for running SOQL queries when you can't wait for immediate results. These queries are run in the background over Salesforce big object data. Async SOQL provides a convenient way to query large amounts of data stored in Salesforce. Async SOQL is only available with licensing of additional big object capacity (add-on)

With regards to GDPR, what are authorization form objects?

Authorization form objects store consent to forms such as terms of service, privacy policy, and other consent forms.

Distance formula

Calculates the distance between two locations in miles or kilometers. Use: DISTANCE(mylocation1, mylocation2, 'unit') and replace mylocation1 and mylocation2 with two location fields, or a location field and a value returned by the GEOLOCATION function. Replace unit with mi (miles) or km (kilometers).
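
To make the idea of a distance between two latitude/longitude pairs concrete, here is a small Python sketch. It uses the haversine great-circle formula, which is an assumption chosen for illustration; the card does not say how Salesforce computes DISTANCE internally. The function name `distance` and the radius constants are likewise illustrative.

```python
import math

# Mean Earth radius in each supported unit (assumed constants for the sketch).
EARTH_RADIUS = {"km": 6371.0088, "mi": 3958.7613}


def distance(lat1: float, lon1: float, lat2: float, lon2: float,
             unit: str = "km") -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length between the points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS[unit] * math.asin(math.sqrt(a))
```

As with the formula function, the unit argument selects miles (`mi`) or kilometers (`km`).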

What major objects are available in Partner Community and not in Customer Community Plus?

Campaigns, Leads, Opportunities, Quotes

What can make SOQL queries non-selective?

Certain types of filter conditions, such as filter conditions with negative filter operators and comparison operators paired with text fields, can make SOQL queries non-selective.

The data architect for UC has written a SOQL query that will return all records from the Task object that do not have a value in the WhatId field: Select id, description, Subject from Task where WhatId != NULL. When the data architect uses the query to select values for a process, a timeout error occurs. What does the data architect need to change to make this query more performant?

Change the where clause to filter by a deterministic defined value.

Where in Customer 360 Data Manager can data be mapped and transformed?

Cloud Information Model

With regards to GDPR, what are consent management objects?

Consent management objects track customer consent to be contacted via different contact methods and consent to subscriptions.

Which custom fields are indexed by default?

Custom fields that are marked as External ID or unique are also indexed by default.

How can you make sure that custom field will have an index?

Custom fields will have an index if they have been marked as Unique or External ID.

What are some field types not supported by big objects?

Custom picklist, multi select picklist, formula and roll up summary fields are not available

Customer 360 Data Manager

Customer 360 Data Manager can be used to integrate data from multiple orgs and clouds to provide a unified view of customer data, including case and order history. It can be used when a company has multiple Sales Cloud, Service Cloud, or Salesforce Platform orgs. A global profile is created for each customer, which uniquely identifies the customer. For global profile matching, mapping data source records from Account, Person Account, Contact, and Lead are supported. In addition to these objects, integrated experiences, which are set up using Lightning components, also support the Case object.

LCOQ

Customer Community and Customer Community Plus don't have access to Leads, Campaigns, Opportunities or Quotes (LCOQ).

A custom object has been created in Salesforce to store information about shipments. There are almost 2 million shipment records already in Salesforce. An on-premise system contains almost 50 million shipment records, out of which almost 3 million records contain data about shipments created within the last six months. The sales director of the company would like to add these recent records to Salesforce and also update the existing records with the latest information from the on-premise system. In order to minimize the data migration time, the data architect has been asked to recommend a suitable approach.

Data Loader can be used to insert new records and update existing records. However, instead of using the upsert operation for this task, it would be better to use the insert operation to load new records and the update operation to update the existing records in Salesforce.

Data Classification - how it's used

Data classification typically consists of data discovery, identification, categorization, and definition of security controls and groups.

Which elements should be included in a data governance model framework?

Data definitions, quality standards, roles and ownership, security and permissions, and quality control.

Data Lineage - how to use it

Data lineage can be defined by specifying information such as integration specifications, data origins, operations, and movement of data

What can be used to populate a big object?

Data loader, API or Apex can populate big objects.

Which artifacts should be included in the documentation related to the integration between two systems?

Data models, integration specifications, and data lineage

In what format must data be stored and imported with Customer 360 Data Manager?

Data must be stored and imported as UTF-8

Ensuring Consistently High Data Quality - 1: Data Profiling

Data profiling includes taking an inventory of data by listing the data sources and names of fields and noting down any potential problems

Matching Algorithms Available with the Fuzzy Matching Method - Matching Algorithm: Edit Distance

Determines the similarity between two strings based on the number of deletions, insertions, and character replacements needed to transform one string into the other. For example, VP Sales matches VP of Sales with score of 73.
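
The edit-distance idea can be sketched with the classic Levenshtein dynamic program in Python. This only counts the edits; how Salesforce turns that count into a match score such as 73 is not stated in the card, so the sketch makes no attempt to reproduce the score. The function name `edit_distance` is illustrative.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and replacements turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # replace ca with cb (or keep it)
        prev = curr
    return prev[-1]
```

For the pair in the card, turning "VP Sales" into "VP of Sales" takes three insertions ("o", "f", and a space).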

Note on sharing and divisions

Divisions is actually not a good fit for any organization that shares large amounts of their data across divisions. Because a Division is set at the Account level and inherited by all child records, you cannot have different Divisions selling to the same Account and mark the related Opportunities with different Division names.

When should you not index custom objects fields?

Don't index custom objects that you don't want searched. It adds to the number of records available to search, which can lead to search crowding.

What can be used to merge duplicate sets?

Duplicate record sets can be used to merge the duplicates.

What features are available in Salesforce to protect data?

Encryption, Masking, Event Monitoring, Sharing Model, Field Level Security, Session Settings.

Data Dictionary

Entity relationship diagrams (ERDs) and Metadata API can be used to build a data dictionary.

Which feature can be used to monitor the bulk export of sensitive org data in Salesforce?

Event Monitoring

The MDM system is always the system of record.

False. The MDM system is typically used as the source of truth that contains the master records (most accurate and complete information).

Bulk API calls count as REST or SOAP API requests.

False. Bulk API calls do not count as REST or SOAP API requests.

What is the FieldHistoryArchive big object?

FieldHistoryArchive is a standard big object that is used to retain field history beyond the standard Salesforce retention period. It is only available with the purchase of the Field Audit Trail license.

Loading Data from the API - Goal: avoid unnecessary overhead

For custom integrations, authenticate once per load, not on each record.

Canvas Scenarios

From a high-level view, there are two common scenarios where Canvas is implemented: 1. Application integration: You're a partner, systems integrator, or customer that builds cloud apps, and you'd like to integrate these applications with Salesforce. 2. Application rationalization/enterprise desktop: You're a large organization that has many existing apps that your users access in addition to Salesforce. You'd like to integrate these apps into Salesforce so that users can accomplish all of their tasks in one place.

Which profile is created by Customer 360 Data Manager to uniquely identify a customer?

Global profile

What represents the most trusted data source in an MDM solution?

Golden source of truth, also called the single source of truth (SSOT)

What should you take into consideration when developing a recovery strategy?

Here are the things that you should take into consideration when developing your recovery strategy: - Scope of restore is aligned with your DRP - Recover a specific version of the data/metadata/document - Minimize and plan for business impact of restore process - Minimize data transformation during restore process - Handle different types of restore processes - Performance and time to restore - Minimize the number of manual tasks - Minimize the impact on current and future Salesforce application design - Automatically solve common restore roadblocks - Provide error logs for manual intervention

What can you use Heroku Connect for?

Heroku Connect (a Heroku Add-On) can be used to allow a Salesforce org to connect to a Heroku Postgres database.

What is Heroku Redis?

Heroku's managed key-value data store that is available as an add-on

What happens if more than one simple filter is selective?

If more than one filter is found to be selective, the query optimizer will choose the one with the lower cost to drive the execution plan of the query.

What happens if you have two filter conditions on a query and one is not selective? How can you make the conditions selective?

If one of the filter conditions is nonselective, for example, Status='Closed Won' corresponds to 250,000 records, two possibilities can make the overall filter condition selective: • Each filter condition corresponds to less than 300,000 records (twice the selectivity thresholds for each filter) • The intersection of Status='Closed Won' AND CloseDate = THIS_WEEK is less than 150,000 records.

How do you reduce computations when loading data via the API?

If possible for initial loads, populate roles before populating sharing rules: 1. Load users into roles. 2. Load record data with owners, triggering calculations in the role hierarchy. 3. Configure public groups and queues, and let those computations propagate. 4. Add sharing rules one at a time, letting computations for each rule finish before adding the next one. If possible, add people and data before creating and assigning groups and queues: 1. Load the new users and new record data. 2. Optionally, load new public groups and queues. 3. Add sharing rules one at a time, letting computations for each rule finish before adding the next one.

Improve Performance by Avoiding Null Values (!= null is actually not bad)

In your SOQL and SOSL queries, explicitly filtering out null values in the WHERE clause allows Salesforce to improve query performance. Example: SELECT Name FROM CSO_CaseThread_Tag__c WHERE Thread__c = :threadId AND Thread__c != null

What objects can be used to enable a GDPR compliant data model?

Individual object, consent objects, contact type objects, custom objects.

A custom field that is marked as unique or External ID is automatically indexed by Salesforce, which makes search results faster.

Key point

Which type of solution can be used to enrich data?

Lightning Data

With Lightning, can you merge contacts with different primary accounts?

Lightning Experience allows merging contacts that have different primary accounts, unless a duplicate contact is associated with a portal user.

What does event monitoring track?

- Logins - Logouts - URI (web clicks in Salesforce Classic) - Lightning (web clicks, performance, and errors in Lightning Experience and the Salesforce mobile app) - Visualforce page loads - Application programming interface (API) calls - Apex executions - Report exports

Lookup Skew

Lookup Skew occurs when too many child records are related to the same record in the lookup field. For example, when 10,000 records of a custom object look up to the same record of another object, it causes lookup skew.

Lookup Skew - Identifying Skew Lookup

Lookup skew can be identified by evaluating objects with a large number of records and heavy concurrent insert and update activity. Lookup fields can be examined and lookup values can be extracted to identify lookup skew.
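
Once lookup values have been extracted (for example as one column of an export), counting them is enough to spot skew. The Python sketch below assumes the values arrive as a plain iterable of parent IDs; the function name `find_skewed_lookups` is hypothetical, and the 10,000 threshold is the figure quoted in the skew cards.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

SKEW_THRESHOLD = 10_000  # figure quoted in the skew cards


def find_skewed_lookups(lookup_values: Iterable[Optional[str]],
                        threshold: int = SKEW_THRESHOLD) -> Dict[str, int]:
    """Return parent IDs referenced by at least `threshold` child records."""
    # Empty lookups (None) do not point at any parent, so they are ignored.
    counts = Counter(v for v in lookup_values if v is not None)
    return {parent: n for parent, n in counts.items() if n >= threshold}
```

The same count applied to AccountId on child records would surface account data skew.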

When loading data, what happens if there is an error?

Make sure the data is clean, especially in foreign key relationships. When there's an error, parallel loads switch to single execution mode, slowing down the load considerably.

Data Classification

Metadata can be identified and categorized based on risk levels and required security controls.

Prior to the implementation, it will be necessary to document the intended data architecture of the Salesforce org. The data architect is required to recommend a suitable approach for the same and suggest examples of information that should be included in the document.

Metadata types can be used to define the components and document the data architecture.

Account Data Skew - Avoid Locking Issues

More than 10,000 child records should not be associated with the same parent account record. A pool of accounts can be created and child records can be distributed among them using integration code or a trigger. The records can be redistributed in off-peak hours.

Data Classification Fields in Salesforce - Compliance Categorization

Multi-select picklist with following values: PII, HIPAA, GDPR, PCI, COPPA, CCPA Note: 1. COPPA: Children's Online Privacy Protection Rule 2. CCPA: California Consumer Privacy Act

GDPR Data Model Considerations - Standard Objects

GDPR must be considered in relation to standard object records (Lead, Contact, Person Account, User) and custom object records that store personal and sensitive information.

With regards to data, what is one feature of dynamic forms?

On pages where Dynamic Forms is enabled, Salesforce checks if the user is working on a potential duplicate and gives a warning before the record is saved.

Ownership Skew

Ownership skew is an imbalance that occurs when a large number of records of a particular object are owned by a single user in Salesforce who also exists in the role hierarchy.

When should you use SOSL?

Per notes, use SOSL when searching for a very specific term.

Permission Set Licenses

Permission set licenses can be assigned to users to allocate specific settings and permissions that allow them to use more tools and functions which are not included in their user license. A user can be assigned to multiple permission set licenses

Which Lightning Component allows merging duplicate contact records?

Potential Duplicates

Which sharing model can be used to avoid account data skew?

Public Read/Write

REGEX function

REGEX(text, regex_text) and replace text with the text field, and regex_text with the regular expression you want to match.

Record Survivorship Techniques - Most Frequent

Records containing the same information are matched, which indicates their correctness. Repeating records indicate that the information is persistent and therefore reliable.

Geolocation formula

Returns a geolocation based on the provided latitude and longitude. Must be used with the DISTANCE function. GEOLOCATION(latitude, longitude) and replace latitude and longitude with the corresponding geolocation, numerical code values.

Which Salesforce cloud comes with ETM?

Sales Cloud

Salesforce Identity Connect

Salesforce Identity Connect can be used to synchronize users and their attributes from Active Directory (AD). Users can sign in using their AD credentials. Identity Connect integrates Microsoft Active Directory (AD) user accounts with Salesforce user records. When a user account is created or updated in AD, Identity Connect pushes those updates to the Salesforce user record seamlessly and instantaneously. For example, when a user is created in AD, the Salesforce user record is created as part of the provisioning process. When deprovisioned, the user's Salesforce session is revoked immediately. You can also use Identity Connect for single sign-on to Salesforce.

Which custom fields cannot be used for indexes?

Salesforce also supports custom indexes on custom fields, except for multi-select picklists, text areas (long), text areas (rich), non-deterministic formula fields, and encrypted text fields.

Lookup Skew - Record Locks

Salesforce locks the child record and the associated lookup record when a child record is inserted or updated. However, this can cause lock exceptions when inserting or updating a large number of child records.

Which APIs does Salesforce provide to back up data and metadata?

Salesforce provides four APIs to back up data and metadata information from the application: 1. REST API 2. SOAP API (data replication API; good for incremental backups) 3. Bulk API 4. Metadata API (for metadata)

Examples of Selective SOQL Queries - Not Selective SELECT Id FROM Account WHERE Name != ''

Since Account is a large object even though Name is indexed (primary key), this filter returns most of the records, making the query non-selective.

Data Storage by Edition and Record Size

Starting in late March 2019, Contact Manager, Group, Essentials, Professional, Enterprise, Performance, and Unlimited Editions are allocated 10 GB for data storage, plus incrementally added user storage. For example, a Professional Edition org with 10 users receives 10 GB of data storage plus 200 MB (20 MB per user), for 10.2 GB of total data storage. 1. Each record counts as about 2 KB. 2. 10 GB of storage therefore holds roughly 5 million records.
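
The storage arithmetic in this card can be checked with a few lines of Python. The sketch assumes decimal units (1 GB = 1,000 MB, 1 MB = 1,000 KB), which is what makes the card's 10.2 GB and 5 million figures come out exactly; the constant and function names are illustrative.

```python
BASE_STORAGE_MB = 10_000  # 10 GB base allocation (decimal units assumed)
PER_USER_MB = 20          # incremental storage per user, from the card
RECORD_SIZE_KB = 2        # approximate size counted per record, from the card


def total_storage_mb(users: int) -> int:
    """Base allocation plus the per-user increment, in MB."""
    return BASE_STORAGE_MB + PER_USER_MB * users


def record_capacity(storage_mb: int) -> int:
    """Approximate number of 2 KB records that fit in the given storage."""
    return storage_mb * 1_000 // RECORD_SIZE_KB
```

With these assumptions, 10 users give 10,200 MB of storage, and the 10 GB base alone holds 5,000,000 records.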

What is the term used for the authoritative data source for a given piece of information in an organization that has multiple enterprise systems?

System of Record (SOR)

Important Metadata for data dictionary

The BusinessProcess metadata type enables you to display different picklist values for users based on their profile. This type extends the Metadata metadata type and inherits its fullName field.

What is the Contact Point Consent object?

The Contact Point Consent object stores information related to customer's consent to be contacted via a specific contact point, such as phone.

What is the Engagement Channel Type object?

The Engagement Channel Type object stores information about the channel through which a customer can be reached, such as SMS or fax.

Commerce Portal License

The External Apps license is available through the Commerce Portals SKU. It is used for custom digital experiences to engage any external stakeholder, including Brand Engagement and Customer Loyalty.

Query Plan Tool - sObject Cardinality

The approximate record count for the queried object.

Data Consolidation - SURVIVORSHIP CRITERIA

The data survivorship rules can be based on criteria such as accuracy, recency, frequency, and completeness of records in different systems
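
As one illustration of a survivorship rule, the Python sketch below applies the recency criterion: for each field, the surviving value is the non-empty value from the most recently updated source record. The record layout, the `last_updated` key, and the function name `survive_by_recency` are assumptions made for the example; a real MDM tool would typically combine several criteria.

```python
from datetime import date
from typing import Dict, List, Optional


def survive_by_recency(records: List[dict],
                       fields: List[str]) -> Dict[str, Optional[str]]:
    """Build a golden record, taking each field from the newest record that has it."""
    newest_first = sorted(records, key=lambda r: r["last_updated"], reverse=True)
    golden: Dict[str, Optional[str]] = {}
    for field in fields:
        # Skip empty or missing values so older data can fill the gap.
        golden[field] = next((r[field] for r in newest_first if r.get(field)), None)
    return golden


# Two hypothetical source systems holding the same customer.
crm = {"email": "old@example.com", "phone": "555-0100", "last_updated": date(2023, 1, 1)}
erp = {"email": "new@example.com", "phone": "", "last_updated": date(2024, 1, 1)}
golden_record = survive_by_recency([crm, erp], ["email", "phone"])
```

Here the email survives from the newer ERP record, while the phone falls back to the CRM record because the newer value is empty.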

Get Cloud Consulting needs to integrate two different systems with customer records into the Salesforce Account object. So that no duplicate records are created in Salesforce, Master Data Management will be used. An Architect needs to determine which system is the system of record on a field level. What should the Architect do to achieve this goal?

The database schema for each external system should be reviewed, and fields with different names should always be separate fields in Salesforce.

Query Plan Tool - Leading Operation Type

The primary operation type that Salesforce will use to optimize the query. Index - The query will use an index on the queried object. Sharing - The query will use an index based on the sharing rules associated with the user who is executing the query.

Standard Person Account Matching Rule

The standard person account matching rule identifies duplicate person accounts using match keys, a matching equation, and matching criteria. To use the rule, first enable person accounts, and then activate rule in Setup.

System of Record (SOR)

The system of record is defined as the enterprise system that acts as the most authoritative source for a given piece of data. It is the primary repository of correct, durable, restorable and disaster-ready data.

Data Survivorship - Trust Score

The trust score can be calculated based on how much a particular data source or system is trusted.

Experience Cloud - Login Based

This is best when users do not log in frequently, e.g., a few times a month or less. A number of logins is purchased per month that are used by each user login. Multiple logins on the same day only consume one login (daily unique login).

How do you enable PK chunking?

To enable the feature, specify the header Sforce-Enable-PKChunking on the job request for your Bulk API query.

PK Chunking Examples Another customer is planning a security audit and wants to identify all the manual shares that exist on their Account records.

To execute this, they can perform a bulk query on AccountShare, using the filter WHERE rowCause=Manual, with a header like this: Sforce-Enable-PKChunking: chunkSize=250000; parent=Account
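
A small Python helper can assemble the header value shown above. The header name and the `chunkSize` and `parent` fields come from the cards; the function name `pk_chunking_header` is hypothetical, and the 100,000 default and 250,000 maximum are the figures given in the chunking card elsewhere in this deck. The sketch only builds the header dictionary and does not send any request.

```python
from typing import Dict, Optional

MAX_CHUNK_SIZE = 250_000  # upper limit quoted in the chunking card


def pk_chunking_header(chunk_size: int = 100_000,
                       parent: Optional[str] = None) -> Dict[str, str]:
    """Build the Sforce-Enable-PKChunking header for a Bulk API query job."""
    if not 0 < chunk_size <= MAX_CHUNK_SIZE:
        raise ValueError(f"chunk_size must be between 1 and {MAX_CHUNK_SIZE}")
    value = f"chunkSize={chunk_size}"
    if parent:
        # parent names the object whose IDs define the chunk boundaries,
        # as in the AccountShare example.
        value += f"; parent={parent}"
    return {"Sforce-Enable-PKChunking": value}
```

Calling `pk_chunking_header(250_000, "Account")` reproduces the header from the AccountShare example.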

SOQL Operations Not Allowed with Big Objects

To perform operations not allowed with SOQL, use Async SOQL instead

What do you have to enable in Salesforce to update records with inactive owners?

To update records with inactive owners, select the setting Enable "Set Audit Fields upon Record Creation" and "Update Records with Inactive Owners" User Permissions (under User Interface in Setup), then grant the "Update Records with Inactive Owners" permission on a profile or permission set.

What is tokenization?

Tokenization is essentially splitting a phrase, sentence, paragraph, or an entire text document into smaller units, such as individual words or terms. Each of these smaller units is called a token. The tokens could be words, numbers, or punctuation marks. In tokenization, smaller units are created by locating word boundaries, that is, the ending point of one word and the beginning of the next. Tokenization is considered a first step for stemming and lemmatization.
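
A minimal Python sketch of word-boundary tokenization follows. The regular expression is an assumption chosen for the example (runs of word characters become one token, each punctuation mark becomes its own token); it is not how any particular Salesforce feature tokenizes text.

```python
import re
from typing import List

# A token is a run of letters/digits/underscores, or one punctuation mark.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenize(text: str) -> List[str]:
    """Split text into word, number, and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)
```

For example, `tokenize("Acme Corp., est. 1999!")` yields words, the number, and each punctuation mark as separate tokens.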

Bulk API can be used in parallel mode to ensure maximum performance while exporting or importing a large number of Salesforce records

True

Bulk API query supports both query and queryAll operations

True

Insert and Update operations are faster than using the 'upsert' operation.

True

When you add a standard price to a new product, it's automatically added to the Standard Price Book

True

When you load data with a Private sharing model, the system calculates sharing as the records are being added. If you load with a Public Read/Write sharing model, you can defer this processing until after cutover.

True

Skinny tables don't get copied over to sandbox organizations

True.

Loading Data from the API - Goal: avoid computations

Use Public Read/Write security during initial load to avoid sharing calculation overhead

Ensuring Consistently High Data Quality - 6: Training

Users can be shown how data quality directly affects their work. They can also be assigned responsibilities for ongoing data quality maintenance.

Can you auto-merge records?

Various auto-merge solutions are available in the AppExchange to allow users to merge a large number of duplicate records automatically based on a schedule.

What are some security measures when backing up data?

What if you need to encrypt the data while in transit?

What's important to remember when designing chunking strategy?

When designing your chunking strategy, remember that the platform puts a 10-minute limit on the execution time of each batch. If a batch fails to execute within that time limit, the system will try to execute it again. If the batch does not complete within 10 retries, it will fail

When extracting data with the bulk API, how is data chunked?

When extracting data with Bulk API, queries are split into 100K record chunks by default—you can use the chunkSize header field to configure smaller chunks, or larger ones up to 250K

With Contacts to Multiple Accounts, a person account can be what?

With Contacts to Multiple Accounts, a person account can be: 1. a related contact on a business account 2. a related account on a contact

General Guidelines for Data Loads - Minimize Number of Workflow Actions

Workflow actions increase processing time.

Scenario (Extracting Millions of Records in 30 Minutes): After loading millions of accounts, Customer A needed to pause its data loading and extract the account record IDs before loading the accounts' associated contacts, cases, opportunities, and custom objects. Important Scenario

- Salesforce.com Customer Support told Customer A that when queries might return many millions of records, they should be broken into smaller chunks to avoid running into query time limits or even session termination. - The Salesforce platform does not currently allow you to chunk queries based on record ID. - To define chunk boundaries, Customer A generated another sequence of numbers by adding an autonumber field to the records that it was extracting. - Because autonumber fields are treated as text, they can't be used in the extract query as the upper and lower bounds of a chunk boundary. - So Customer A also created a formula field to convert the query's autonumbers into numeric values for the query. - Finally, Customer A worked with salesforce.com Customer Support to create a custom index on the formula field to speed up its query. - Customer Support told Customer A that the most effective chunk size would feature fewer than 250,000 records; if chunks feature more than 250,000 records, the Force.com query optimizer's sampling code causes the query to be driven by the full table rather than an index. - Customer A used Informatica to automate the construction of the batches, the submission of the queries, and the collection of the results. - Using this approach, Customer A was able to extract 16 million of 300 million records in about 30 minutes.
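
The chunk-boundary step in this scenario can be sketched in Python. The field name `Autonumber_Numeric__c` stands in for the numeric formula field Customer A created and is hypothetical, as are the function names; the default chunk size of 200,000 is simply one value under the 250,000 limit mentioned above.

```python
from typing import Iterator, Tuple


def chunk_bounds(min_value: int, max_value: int,
                 chunk_size: int = 200_000) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (lower, upper) bounds that cover min_value..max_value."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    lower = min_value
    while lower <= max_value:
        upper = min(lower + chunk_size - 1, max_value)
        yield lower, upper
        lower = upper + 1


def chunk_query(lower: int, upper: int,
                field: str = "Autonumber_Numeric__c") -> str:
    """Build the extract query for one chunk (field name is hypothetical)."""
    return f"SELECT Id FROM Account WHERE {field} >= {lower} AND {field} <= {upper}"
```

Each generated query filters on the indexed numeric field, so every chunk stays below the size at which the optimizer would fall back to a full table scan.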

Key point about batch Apex and Bulk API

- You cannot query data in an external system with Bulk API. - The REST-based Bulk API lets you query, queryAll, insert, update, upsert, or delete a large number of records asynchronously. - Formats for loading with Bulk API: all operations use HTTP GET or POST methods to send and receive CSV, XML, or JSON data.

Canonical Model

1. A canonical model may be used for communication between the various enterprise systems. 2. This design pattern is a form of enterprise application architecture that can be used for communication between different data formats. 3. It involves creating a data model which supersedes all the others and creating a 'translator' module or layer to/from which all existing systems exchange data with other systems.

Considerations for geolocation fields

1. A geolocation field counts toward the org's limits as three custom fields. 2. Only the individual field components of a geolocation field can be modified or exported. 3. Although longitude values can be within -180 and 180, latitude values must be within -90 and 90. 4. The DISTANCE formula is used to calculate the distance between two locations in miles or kilometers. The GEOLOCATION formula returns a geolocation based on the provided latitude and longitude and must be used with the DISTANCE function.
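
As a rough illustration of the range rules and of a distance calculation between two locations, here is a small sketch. It does not reproduce how Salesforce's DISTANCE function works internally. The haversine formula, the Earth radius constant, and the function names are assumptions used only for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not a Salesforce constant


def validate_coordinates(latitude, longitude):
    """Raise ValueError if the pair falls outside the allowed ranges."""
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be within -90 and 90")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be within -180 and 180")


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers using the haversine formula."""
    validate_coordinates(lat1, lon1)
    validate_coordinates(lat2, lon2)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))
```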

Additional person account relationships

1. A person account can also be related to another person account as either a related contact or related account. 2. Person accounts are always indirectly related to other accounts or contacts

Guidelines for remembering query thresholds - custom index

1st threshold: 10% of the first 1 million records. 2nd threshold: 5% of the records beyond the first million. Final threshold: the sum of the 1st and 2nd thresholds, with a ceiling of 333,333 records. Mnemonic for a custom index: 10 / 5 / 333K.
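
The arithmetic behind this card can be checked with a short sketch. The function names are made up for illustration, and the "does not exceed the threshold" comparison follows the wording of these cards rather than any documented optimizer internals.

```python
CUSTOM_INDEX_CEILING = 333_333


def custom_index_threshold(total_records):
    """10% of the first million plus 5% of the rest, capped at 333,333."""
    first_million = min(total_records, 1_000_000)
    remainder = max(total_records - 1_000_000, 0)
    threshold = first_million * 10 // 100 + remainder * 5 // 100
    return min(threshold, CUSTOM_INDEX_CEILING)


def is_selective_custom(filter_matches, total_records):
    """True when the filter does not exceed the custom index threshold."""
    return filter_matches <= custom_index_threshold(total_records)
```

For example, an object with 2 million records gets a threshold of 100,000 + 50,000 = 150,000 records.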

When you enable Contacts to Multiple Accounts AND Person Accounts, what type of relationships can you create for a Person Account?

1. A person account can be either a related contact on a business account or a related account on a contact. 2. A person account can also be related to another person account as either a related contact or related account. Note: A person account does not have a primary relationship with a person account or business account.

What is a skinny table?

1. A skinny table is a custom table in the Force.com platform that contains a subset of fields from a standard or custom base Salesforce object. 2. Force.com can have multiple skinny tables if needed, and maintains them and keeps them completely transparent to you. 3. Skinny tables don't include soft-deleted rows (i.e., records in the Recycle Bin with isDeleted = true) 4. Skinny tables are most useful with tables containing millions of records. 5. If you later decide to add a field to your report or SOQL query, you must contact Salesforce Customer Support to re-create the table 6. They can contain a maximum of 100 fields 7. They cannot contain data from other objects 8. Skinny tables are copied to a full sandbox but not to other types of sandbox organizations 9. Limited field types: Checkbox, Date, Date Time, Email, Number, Percent, Phone, Picklist (multi-select), Text, Text Area, Text Area (long), URL

The criteria used for the data survivorship rules can be based on the following - Accuracy

1. A system may contain more accurate information than the other systems. 2. For example, it is very likely that the Knowledge Management system used by the company contains more accurate Knowledge articles than Salesforce since the former is primarily used for article management. 3. A third-party service may also be utilized to establish the accuracy of records.

Disadvantages of using mashups

1. Accessing data takes more time. 2. Functionality is reduced. For example, reporting and workflow do not work on the external data.

Which solution can be utilized to store field history data of more than 60 fields on a particular object?

1. Add-on feature: Field Audit Trail feature includes provisioning for its required storage and does not count against an organization's data storage usage limits. With Field Audit Trail, you can track up to 60 fields per object. 2. Field history is copied from the History related list into the FieldHistoryArchive big object. 3. You define one HistoryRetentionPolicy for your related history lists, such as Account History, to specify Field Audit Trail retention policies for the objects that you want to archive.

Data Import Wizard Limits

1. All standard (Campaign Members, Accounts & Contacts, Person Accounts, Lead, Solutions) and custom objects: 50K 2. Your import file can be up to 100 MB, but each record in your file can't exceed 400 KB, which is about 4,000 characters. 3. You can import up to 90 fields per record. 4. Each imported note and each imported description can't exceed 32 KB. Text longer than 32 KB is truncated. 5. You can add new records (create), update existing records (update), or add new and update existing records (upsert). 6. If importing accounts & contacts, you can match contacts by name or email, and match accounts by Name & Site. 7. You can optionally select the checkbox to trigger workflow rules & processes. 8. Data sources for accounts & contacts: CSV, Outlook CSV, ACT! CSV, Gmail CSV

Enterprise Service Bus (ESB)

1. An Enterprise Service Bus (ESB) will be used to integrate different systems and map the messages between different endpoint systems. 2. The systems use different data formats. 3. The data architect needs to recommend a suitable approach as well as a design pattern for receiving event messages from the MDM and sending transformed messages to different endpoints through the ESB.

External Data Service (e.g., Data.com)

1. An external data service may be utilized to match records against a trusted source and determine their accuracy. 2. The AppExchange marketplace can be searched for possible third-party solutions for this use case. 3. External data sources can provide additional information, such as phone number, address, revenue, or size, which can help in enriching the records.

Flow with Outbound Message Action

1. An outbound message sends information to a designated endpoint, like an external service. 2. You configure outbound messages from Setup 3. You must configure the external endpoint and create a listener for the messages using SOAP API 4. You can associate outbound messages with flows, workflow rules, approval processes, or entitlement processes What is the endpoint URL? Enter an endpoint URL for the recipient of the message. Salesforce sends a SOAP message to this endpoint.

Invoking HTTP Callouts

1. Apex provides several built-in classes to work with HTTP services and create HTTP requests like GET, POST, PUT, and DELETE. 2. You can use these HTTP classes to integrate to REST-based services. They also allow you to integrate to SOAP-based web services as an alternate option to generating Apex code from a WSDL 3. By using the HTTP classes, instead of starting with a WSDL, you take on more responsibility for handling the construction of the SOAP message for the request and response.

Field Service Licensing

1. At least one Service Cloud user license (is that per org or per user?)

Data integration rules

1. Automatically match records to current information in a data service by activating a Lightning Data package rule, a geocode rule, or a company info rule. Note: data integration rules are used to define how to match records to current information in data service. 2. Each data service includes an external object used for updating and importing records and a data integration rule that identifies matches with your Salesforce records.

What are some considerations for a hybrid model of data governance?

1. Automation tools can be used to avoid duplication at the source of data. For eliminating duplicates, rules and standards should be defined for identifying and prioritizing the fields that should be used for matching and merging the records. 2. Data quality score can be defined based on various key attributes, which can be used by individual users to monitor and improve data on an ongoing basis. 3. Since the company uses multiple integrations that create or update data, the selection of master records can be based on individual attribute scores and weights. 4. It may also be necessary to identify controlled fields that have an impact across multiple departments and/or divisions and assign ownership of those fields to centralize data quality maintenance.

Deterministic Matching

1. Deterministic matching looks for an exact match between two records or pieces of data. 2. It is an ideal approach when data is at a 100% level, and it is cleansed and standardized in the same way 100% of the time. 3. However, this does not represent a very realistic situation.
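
A minimal sketch of deterministic matching, assuming records are plain dictionaries and that email plus last name form the match key. Both the field choice and the light normalization step are assumptions for illustration, not a prescribed rule.

```python
from collections import defaultdict


def match_key(record):
    """Standardize the fields used for the exact comparison."""
    return (record["email"].strip().lower(),
            record["last_name"].strip().lower())


def find_exact_matches(records):
    """Return lists of record ids whose match keys are identical."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]
```

Because the comparison is exact, any value that is not standardized the same way (for example a misspelled email) simply fails to match, which is why this approach depends on clean data.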

What factors help determine if you should use APIs for data backup?

1. Backup and Restore scope (files, metadata, data) 2. Backup automation frequency 3. Need for Backup plan personalization (mix full backups and incremental backups, give higher priority to specific objects/fields/type of records, etc) 4. Backup plan maintenance (environment change detection, support new Salesforce releases and API changes)

Batch Processing Time

1. Batches are processed in chunks. The chunk size depends on the API version. In API version 20.0 and earlier, the chunk size is 100 records. In API version 21.0 and later, the chunk size is 200 records. 2. There's a five-minute limit for processing each chunk. Also, if it takes longer than 10 minutes to process a whole batch, the Bulk API places the remainder of the batch back in the queue for later processing. 3. If the Bulk API continues to exceed the 10-minute limit on subsequent attempts, the batch is placed back in the queue and reprocessed up to 10 times before the batch is permanently marked as failed.

SOQL with Big Objects

1. Build an index query starting from the first field defined in the index, without gaps between the first and last field in the query. You can use = or IN on any field in your query, although you can use IN only one time. 2. You can use the range operations <, >, <=, or >= only on the last field of your query. 3. You can include the system fields CreatedById, CreatedDate, and SystemModstamp in queries.
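
The filter rules in items 1 and 2 can be expressed as a small checker. This is only a sketch of the rules as stated on this card. The index field names in the usage note and the `(field, operator)` representation are hypothetical.

```python
RANGE_OPERATORS = {"<", ">", "<=", ">="}


def is_valid_index_query(index_fields, filters):
    """Check an ordered list of (field, operator) filters against the rules.

    index_fields: fields of the big object's index, in index order.
    filters: filters in the order they appear in the query.
    """
    fields = [field for field, _ in filters]
    operators = [op for _, op in filters]
    # Must start at the first index field and leave no gaps.
    if not fields or fields != index_fields[:len(fields)]:
        return False
    # IN may be used only one time.
    if operators.count("IN") > 1:
        return False
    for position, op in enumerate(operators):
        is_last = position == len(operators) - 1
        if op in RANGE_OPERATORS:
            # Range operators are allowed only on the last field.
            if not is_last:
                return False
        elif op not in ("=", "IN"):
            return False
    return True
```

With a hypothetical index of `Account__c, Game__c, Played_On__c`, filtering on `Account__c =` and `Played_On__c >` alone would be rejected because it skips `Game__c`.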

Query jobs using the Bulk 2.0 API - 2

1. Bulk API 2.0 query jobs enable asynchronous processing of SOQL queries. 2. Instead of manually configuring batches, Bulk API 2.0 query jobs automatically determine the best way to divide your query job into smaller chunks, helping to avoid failures or timeouts. 3. The API automatically handles retries. 4. If you receive a message that the API retried more than 15 times, apply filter criteria and try again. When you get the results of a query job, the response body is always compressed.

How can you avoid timeouts when exporting data out of Salesforce?

1. Bulk API can be used to avoid timeouts during the export process by setting appropriate batch sizes. 2. Bulk API can efficiently query large data sets and reduce the number of API requests. 3. Bulk API can retrieve up to 15 GB of records, divided into 15 1-GB CSV files.

Bulk API Jobs

1. Bulk API is based on REST principles 2. You can use it to insert, update, upsert, or delete many records asynchronously 3. In contrast, SOAP and REST API use synchronous requests 4. Because Bulk API is REST-based, the request takes the familiar form of a REST request with four components: URI, HTTP method, headers, and body. The HTTP method is POST (to create job) 5. Bulk API uses the same framework that the REST API uses, which means that Bulk API supports many of the same features, such as OAuth authentication.

Bulk 2.0 API - Important Information

1. Bulk ingest jobs allow you to upload records to your org by using a CSV file representation. 2. Bulk query jobs return records based on the specified query. 3. A Bulk API job specifies which object is being processed (for example, Account or Opportunity) and what type of action is being used (insert, upsert, update, or delete). 4. You process a set of records by creating a job that contains one or more batches. Whether you create an ingest or query job, Salesforce automatically optimizes your request to process the job as quickly as possible and to minimize timeouts or other failures.

Bulk API 2.0 - How Requests Are Processed

1. Bulk ingest jobs allow you to upload records to your org by using a CSV file representation. 2. Bulk query jobs return records based on the specified query. 3. A Bulk API job specifies which object is being processed (for example, Account or Opportunity) and what type of action is being used (insert, upsert, update, or delete). 4. You process a set of records by creating a job that contains one or more batches. 5. Whether you create an ingest or query job, Salesforce automatically optimizes your request to process the job as quickly as possible and to minimize timeouts or other failures. 6. Salesforce creates a separate batch for every 10,000 records in your job data, up to a daily maximum of 150,000,000 records.
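
The batching arithmetic in item 6 can be illustrated with a short sketch. The function name is made up, and the limits are simply the figures quoted on this card.

```python
RECORDS_PER_BATCH = 10_000
DAILY_RECORD_LIMIT = 150_000_000


def batches_for(record_count):
    """Number of batches created for a job with record_count records."""
    if record_count < 0:
        raise ValueError("record_count must not be negative")
    if record_count > DAILY_RECORD_LIMIT:
        raise ValueError("job exceeds the daily record maximum")
    # Ceiling division: a partial final batch still counts as a batch.
    return -(-record_count // RECORDS_PER_BATCH)
```

For example, a 25,000-record job is split into 3 batches (10,000 + 10,000 + 5,000).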

How can you track custom information for files associated to an account

1. Can add custom field to the ContentVersion object; cannot create field on standard Attachment object.

How can you make the State field required in Salesforce

1. Cannot make the standard Address fields required 2. Can use a validation rule (see attached)

Introducing Canvas

1. Canvas enables you to easily integrate a third-party application in Salesforce. 2. Canvas is a set of tools and JavaScript APIs that you can use to expose an application as a canvas app. This means you can take your new or existing applications and make them available to your users as part of their Salesforce experience.

What do you need to do if page responsiveness is important when conducting data validation?

1. Choose client-side validation instead of server-side validation. 2. Examples of client-side validation: JavaScript within a Lightning Component; using dependent picklists on a Lightning Component. 3. Examples of server-side validation: validation rules and Apex triggers.

With Customer 360 Data Manager, what is the cloud information model?

1. Cloud Information Model (CIM) is an open-source data model that allows integration to be simplified (it's not a Salesforce product). 2. CIM will work on MuleSoft's open-source modeling technology, which will provide various file formats to work with varying applications. 3. Instead of creating custom code, developers can implement the CIM and quickly be able to create data lakes, generate analytics, train machine learning models, and translate data across systems.

Cross-Org Adapter for Salesforce Connect

1. Collaborate more effectively and improve processes by connecting the data across your Salesforce orgs. 2. With the cross-org adapter, Salesforce Connect uses Lightning Platform REST API calls to access records in other Salesforce orgs. 3. Nevertheless, setup is quick and easy with point-and-click tools. 4. Your users and the Lightning Platform interact with other orgs' data via external objects. The cross-org adapter for Salesforce Connect converts each of those interactions into a Lightning Platform REST API call.

Autonumber field again for large data volumes

1. Create or use an existing auto-number field. Alternatively, you can use any number fields that can make up a unique value, as long as they are sequential 2. Create a formula field that converts the auto-number field text value into a numeric value—you cannot use an index with comparison operators such as "<=" (less than or equal to) or ">" (greater than) for the text-based auto-number field. In this example, we'll name this field "ExportID." 3. Place a custom index on the formula field by contacting salesforce.com Customer Support

Be Aware of Operations that Increase Lock Contention

1. Creating new users 2. Updating ownership for records with private sharing 3. Updating user roles 4. Updating territory hierarchies

Global Profile

1. Customer 360 Data Manager identifies all of these records as belonging to the same person. 2. It reconciles the data for the records in the global profile. 3. The global profile is a single source of truth (SSOT) for the most accurate and recent data about your customer. 4. Customer 360 Data Manager assigns a global party ID to the global profile that uniquely identifies the customer across all data sources. 5. Lightning components use the global party ID to retrieve additional information that is stored in a data source, but isn't stored in the global profile; For example, the C360 Order List component can retrieve order information from Commerce Cloud based on global party ID.

Salesforce Data Mask

1. Data Mask uses platform-native obfuscation technology to mask sensitive data in any full or partial sandboxes. 2. You can configure different levels of masking, depending on the sensitivity of the data. 3. Data Mask enables admins and developers to mask sensitive data in sandboxes such as Personally Identifiable Information (PII) or sales revenue. 4. Data Mask is available for Sales Cloud, Service Cloud, Work.com, Salesforce's Industry products, AppExchange applications, and platform customizations. 5. The masking process lets you mask some or all sensitive data with different levels of masking, depending on the sensitivity of the data. 6. Once your sandbox data is masked, you can't unmask it. 7. This irreversible process ensures that the data is not replicated in a readable or recognizable way into another environment Note: this feature only masks data, but you need Classic encryption or Shield to actually encrypt data in certain fields.

Customer 360 Data Manager - Types of Rules

1. Data Preparation rules apply basic cleansing to customer data attributes. 2. Match rules are used to identify when multiple records refer to the same customer. If the default set of match rules are not sufficient, new rules based on specific needs can be created. 3. Reconciliation rules determine which data appears in the global profiles. The most recently updated information in the source records is used to create the global profiles.

Advantages of using mashups

1. Data is never stale 2. No proprietary method needs to be developed to integrate the two systems.

Data Techniques

1. Data propagation 2. Data consolidation 3. Data replication 4. Data federation

Data Governance & Stewardship: 2 - Develop Your Governance Plan

1. Data definitions: Also, define data types, including master, reference, and transactional 2. Quality standards: Set appropriate standards for data quality, including the ability to measure or score records 3. Roles & ownership 4. Security & permissions 5. Quality control process

How can you determine if a SOQL query has a filter that is selective - 1

1. Determine if it has an index. 2. If the filter is on a standard field, it'll have an index if it is a primary key (Id, Name, OwnerId), a foreign key (CreatedById, LastModifiedById, lookup, master-detail relationship), or an audit field (CreatedDate, SystemModstamp). 3. Custom fields will have an index if they have been marked as Unique or External Id. 4. If the filter doesn't have an index, it won't be considered for optimization. 5. If the filter has an index, determine how many records it would return. 6. For a standard index, the threshold is 30 percent of the first million targeted records and 15 percent of all records after that first million. In addition, the selectivity threshold for a standard index maxes out at 1 million total targeted records, which you could reach only if you had more than 5.6 million total records.
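
The "more than 5.6 million" figure follows from the percentages: 30% of the first million is 300,000, so the remaining 700,000 must come from 15% of the records beyond the first million, which takes about 4.67 million more. A small sketch to check this (the function name is an assumption):

```python
STANDARD_INDEX_CEILING = 1_000_000


def standard_index_threshold(total_records):
    """30% of the first million plus 15% of the rest, capped at 1 million."""
    first_million = min(total_records, 1_000_000)
    remainder = max(total_records - 1_000_000, 0)
    threshold = first_million * 30 // 100 + remainder * 15 // 100
    return min(threshold, STANDARD_INDEX_CEILING)


if __name__ == "__main__":
    for total in (1_000_000, 5_600_000, 5_666_667, 20_000_000):
        print(total, standard_index_threshold(total))
```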

How do you determine if a simple filter is selective

1. Determine if it has an index. 2. If the filter doesn't have an index, it won't be considered for optimization. 3. If the filter has an index, determine how many records it would return. 4. If the filter exceeds the threshold, it won't be considered for optimization. 5. If the filter doesn't exceed the threshold, this filter IS selective, and the query optimizer will consider it for optimization.

Cosmic Innovation has a generic account record in Salesforce that has more than 20,000 related contacts. Any new contact record that meets a certain set of criteria is associated with this account. The system administrator of the company is trying to update a field on the existing records using Bulk API in parallel mode, but the update fails due to UNABLE_TO_LOCK_ROW errors. The technical architect needs to recommend a suitable solution to avoid such errors while taking into account best practices for storing a large number of child records

1. Distribute the contacts to other accounts to avoid more than 10K contacts on an account. 2. An Apex trigger with custom logic can be used to associate any new contact record to a different account in Salesforce. 3. The administrator can also reduce the batch size and use serial mode to avoid the errors during the update Note: batch size does play a role; can use smaller batch sizes to avoid locking.

Content Document & Content Version (ContentVersion) standard objects

1. Document Object in Salesforce represents the files that are uploaded by the users. In contrast to Attachment Records, Document Object in Salesforce is not attached to a Parent Object. These are the documents uploaded to a library in Salesforce CRM Content or Salesforce Files. 2. Content Version Object is a child of Document Object in Salesforce that represents a specific version of a document in Salesforce CRM Content or Salesforce Files. This means that this object stores document information similar to an attachment. This object is referred to as a File in a User Interface with the Key Prefix 068.

When should you enable PK chunking & additional details

1. Enable PK chunking when querying tables with more than 10 million records or when a bulk query consistently times out. 2. Works with most standard objects and all custom objects. 3. To enable the feature, you specify the header 'Sforce-Enable-PKChunking' on the job request for your Bulk API query. 4. By default the Bulk API will split the query into 100K-record chunks. You can use the 'chunkSize' header field to configure smaller chunks or larger ones up to 250K. 5. You can perform filtering while using PK Chunking by simply including a WHERE clause in the Bulk API query. In this case, there may be fewer records returned for a chunk than the number you have specified in 'chunkSize'. 6. If an object is supported, you can also use PK Chunking to query the object's sharing table.
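
A minimal sketch of building the request header described in items 3 and 4. The helper name is hypothetical, and it only prepares the header as a dictionary; sending the Bulk API job request itself is left out.

```python
DEFAULT_CHUNK_SIZE = 100_000
MAX_CHUNK_SIZE = 250_000


def pk_chunking_header(chunk_size=DEFAULT_CHUNK_SIZE):
    """Return the PK chunking header for a Bulk API query job request."""
    if not 1 <= chunk_size <= MAX_CHUNK_SIZE:
        raise ValueError("chunk_size must be between 1 and 250,000")
    return {"Sforce-Enable-PKChunking": f"chunkSize={chunk_size}"}
```

For example, `pk_chunking_header(50_000)` yields a header requesting 50,000-record chunks, which can be merged into the other headers of the job-creation request.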

What is the Individual standard object?

1. Enabling the Data Protection and Privacy feature in Setup makes the Individual object available 2. Preferences include "Don't Profile", "Don't Track", "Don't Process" and "Forget Me". 3. Custom fields can be added to store additional specific preferences. 4. A lookup field (individual) is added to Lead, Contact, Person Account, Community User, User and Custom Object records. 5. An Individual object record can be linked to Lead, Contact, Person Account and Custom Object records if they are the same person. 6. It is possible to encrypt the addresses, email addresses, and phone numbers for the points of contacts associated with individuals and person account.

Duplicate Rules

1. Enforce sharing rules: The matching rule compares only records that the user has access to, and the resulting list of possible duplicates includes only records the user has access to. 2. Bypass sharing rules: The matching rule compares all records, regardless of user access, but the resulting list of possible duplicates includes only records the user has access to. 3. Action: Allow, block with alert or report 4. You can compare contacts with leads 5. You can specify conditions as to when the rule is used (could be Contact, Account, or User fields) 6. You can use a standard or custom matching rule

What is Enhanced Personal Information Management?

1. Enhanced Personal Information Management hides personal information fields in user records from external users such as portal or guest users. Enabling this setting means that external users won't see the fields listed in a special field set. 2. Enabling the setting blocks view and edit access to 30 personal information fields using a field set called PersonalInfo_EPIM.

Data Governance

1. Establishes rules and policies to ensure reliable and effective customer data. 2. These rules define processes and protocol to ensure usability, quality, and policy compliance of the data asset. 3. Includes business definitions, data quality and security rules, supports UI and integration design

Additional info on event monitoring - 1

1. Event Monitoring can be utilized to capture changes such as user logins, REST and SOAP API calls, Bulk API 2.0 jobs, Visualforce requests, flow usage, report executions, etc. 2. Event log files are available for these events. For example, the 'Flow Execution' event log file type can be used to analyze trends in usage of flows and the 'Bulk API 2.0' event type can be used to track how long Bulk API 2 jobs take to complete, what kinds of data they process and how much, and who runs the jobs.

What else can Event Monitoring be used for?

1. Event monitoring can be used to monitor any Bulk API activity. 2. The BulkApiResultEvent streaming object and BulkApiResultEventStore storage objects can be used to monitor when Bulk API results are downloaded via a REST endpoint or the Bulk API Job page in Setup.

What are two mashup designs that can be used with Salesforce?

1. External website: The Salesforce UI displays an external website, and passes information and requests to it. With this design, you can make the website look like part of the Salesforce UI. 2. Callouts: Apex code allows Salesforce to use Web services to exchange information with external systems in real time.

Field History Tracking

1. Field History Tracking (standard feature and different than Field Audit Trail, which is an add-on feature) can be enabled for the Account object to track and display the new and old values of up to 20 fields in the History related list of the object. 2. Field history data can be accessed for up to 24 months or two years via the API. 3. Field Audit Trail can be utilized to define a policy to retain archived field history data of up to 60 fields for up to 10 years.

Salesforce Field Audit Trail

1. Field history is copied from the History related list into the FieldHistoryArchive big object 2. You define one HistoryRetentionPolicy for your related history lists, such as Account History, to specify Field Audit Trail retention policies for the objects that you want to archive. 3. Then use Metadata API to deploy your policy. 4. You can update the retention policy on an object as often as needed 5. With Field Audit Trail, you can track up to 60 fields per object. Without it, you can track only 20 fields per object. 6. Values are stored according to the field history retention policy defined for up to 10 years.

Additional info on Field Audit Trail

1. Field history is copied from the History related list into the FieldHistoryArchive big object. 2. You define one HistoryRetentionPolicy for your related history lists, such as Account History, to specify Field Audit Trail retention policies for the objects that you want to archive. 3. Then use Metadata API to deploy your policy. You can update the retention policy on an object as often as needed. With Field Audit Trail, you can track up to 60 fields per object. 4. Without it, you can track only 20 fields per object. With Field Audit Trail, archived field history data is stored until you manually delete it. You can manually delete data that falls outside of your policy window.

More Efficient SOQL Queries

1. For best performance, SOQL queries must be selective, particularly for queries inside triggers. To avoid long execution times, the system can terminate nonselective SOQL queries 2. Developers receive an error message when a non-selective query in a trigger executes against an object that contains more than 1 million records.

Data Survivorship Rules

1. For merging duplicate records into a single golden record in the master hub, data survivorship rules can be utilized. 2. These rules can determine which data elements from the duplicate records should be considered to create the golden record. 3. Data elements that are more populated, more accurate (based on comparison with a third-party database) or were changed more recently can be picked during the merge process. 4. Empty, null, ambiguous and invalid field values can be ignored.
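
One way to picture a survivorship rule is a per-field merge that ignores empty values and keeps the most recently changed one. The sketch below assumes duplicates are dictionaries carrying a `last_modified` ISO date string. The record shape and the "most recent wins" rule are illustrative assumptions, not a specific MDM product's behavior.

```python
EMPTY_VALUES = (None, "")


def build_golden_record(duplicates, fields):
    """Merge duplicates field by field into a single golden record."""
    golden = {}
    for field in fields:
        # Ignore empty or null values, per the survivorship guidance.
        candidates = [r for r in duplicates if r.get(field) not in EMPTY_VALUES]
        if candidates:
            # ISO date strings sort chronologically, so max() picks the newest.
            newest = max(candidates, key=lambda r: r["last_modified"])
            golden[field] = newest[field]
    return golden
```

Other criteria from the card, such as accuracy checked against a third-party database, would replace or extend the `max()` selection step.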

Consent Management for the Salesforce Platform - Key Regulations

1. General Data Protection Regulation (GDPR), European Union 2. California Consumer Privacy Act (CCPA), United States 3. Canada's Anti-Spam Law (CASL)

Customer 360 Data Manager - Objects that support mapping data source records

1. Global profile matching: a) Sales, Service, or Community Cloud objects: Account and Person Account, Contact, and Lead b) B2C Commerce Cloud: Customers, Orders 2. Integrated Experiences: a) Sales, Service, or Community Cloud objects: Account and Person Account, Contact, Lead, and Case b) B2C Commerce Cloud: Customers, Orders

What is a golden record?

1. A "golden record" is a single, well-defined version of all the data entities in an organizational ecosystem. 2. In this context, a golden record is sometimes called the "single version of the truth," where "truth" is understood to mean the reference to which data users can turn when they want to ensure that they have the correct version of a piece of information. 3. The golden record encompasses all the data in every system of record (SOR) within a particular organization.

Local User Group Members: 1. read-only access to an event calendar 2. update their personal information and open support cases 3. User group members are expected to access the site one or two times a month. Local User Group Leaders: 1. add events to calendars 2. run reports about user activities 3. access the site at least weekly, with some leaders more active in the site. In addition, some of the site pages should be visible to individuals without them having to log in.

1. Group leaders will need the Customer Community Plus license, which gives them read/write access to calendars and the ability to run reports. 2. As there will be some user group leaders that access the site more frequently than others, a mix of member-based license types for the more frequent users and login-based for the less frequent users can be used 3. Other group members will need the Customer Community license so they can log in, maintain their information and open support cases. As the group members are expected to access the site infrequently, a login-based license would be appropriate.

Account and Contact Example - Large Load

1. Load accounts first. 2. Load contacts, using the External Id of the Account to populate the account lookup (https://focusonforce.com/integration-and-data-loading/using-dataloader-for-lookups/). On the test: if uploading children, group them by parent ID and use the external ID of the parent for the lookup population.

Key point about Heroku Connect; data can be synchronized with Heroku from Salesforce or Heroku external objects can be used.

1. Heroku Connect can be added to the application in the Heroku platform to share and synchronize data between the Salesforce database and Postgres database. In a typical setup, Salesforce will have an exact copy of the mapped database tables in Heroku. Then, with bidirectional synchronization enabled, changes made to records in Salesforce can be synced to Heroku if necessary, and changes made in Heroku can be synced back to Salesforce. 2. However, if Heroku External Objects are used, data will reside in the Postgres database only. Then, Salesforce Connect is required in order to access data in Heroku from the Salesforce org. In the second type of setup, data is accessed on demand real-time and no synchronization process is needed as no records are copied or moved into the Salesforce org.

How to make queries selective

1. If a SOQL query contains at least 1 selective filter, the query is said to be selective. 2. If the SOQL query doesn't contain a selective filter, the query is said to be un-selective and will require a full table scan. 3. Criteria for selectivity: a) Determine if it has an index (could be on standard or custom field) b) If the filter has an index, determine how many records it would return

What happens if a bulk query succeeds?

1. If the query succeeds, Salesforce attempts to retrieve the results. 2. If the results exceed the 1 GB file size limit or take longer than 10 minutes to retrieve, the completed results are cached and another attempt is made. 3. After 15 attempts, the job fails and the error message Retried more than fifteen times is returned. 4. If this happens, consider using the PK Chunking header to split the query results into smaller chunks.

Universal Containers (UC) owns a complex Salesforce org with many Apex classes, triggers, and automated processes that will modify records if available. UC has identified that, in its current development state, it runs the chance of encountering race conditions on the same record. What should a data architect recommend to guarantee that records are not being updated at the same time?

1. In Apex, you can use FOR UPDATE to lock sObject records while they're being updated in order to prevent race conditions and other thread safety problems 2. While an sObject record is locked, no other client or user is allowed to make updates either through code or the Salesforce user interface. 3. The client locking the records can perform logic on the records and make updates with the guarantee that the locked records won't be changed by another client during the lock period. The lock gets released when the transaction completes.

The data architect has been asked to provide recommendations for optimizing the performance of reports and list views.

1. Indexing - Relevant fields can be indexed to improve the performance of SOQL queries that retrieve data for reports and list views. Indexing can be used to make SOQL queries that retrieve data selective. Certain standard fields, such as ID, Name, and SystemModStamp, are indexed by default. Custom fields that are marked as External ID or unique are also indexed by default. Other custom fields can be indexed by contacting customer support. 2. Skinny tables 3. Use divisions 4. Results and joins (The number of records returned by the report and the number of table joins used by the report can be reduced by using filters, which would improve the performance of the report.) Notes mentioned denormalizing data which means report at the summary level from the parent.

Additional Considerations for Custom Metadata

1. It is possible to deploy custom metadata records using an unmanaged package. 2. A developer can also use Apex to view, create, and update custom metadata records using SOQL. 3. Although custom settings could also be used to create and store configurations, it is not possible to include custom settings data in a package. 4. It would be necessary to populate the custom settings after installing the package.

When creating custom matching rules, what type of matching is supported?

1. It's done on a field level: exact and fuzzy 2. Match blank fields: If this option is selected and the field is blank on both records, it's considered a match 3. Custom matching rules support AND conditions by default, but complex logic can also be configured

Custom metadata/settings - Declarative & Apex

1. It's possible to create, update, and delete custom metadata types and records declaratively. 2. With Apex, it's possible to create, read and update records of custom metadata but the Delete operation is not possible. 3. It's possible to create, update, and delete custom settings declaratively

Authorization Form Objects

1. Keep track of data related to authorization forms, such as terms of service, privacy policy, and other consent forms. Each authorization form object stores different data 2. You can use them together to create a full picture of your customer's consent to the authorization form.

Minimizing the Impact of Record Ownership

1. Fewer than 10K records per record owner 2. Do not assign a role or, if a role is needed, use a role at the top of the hierarchy that is separate from other roles 3. The skewed user must be kept out of public groups that could be used as the source for sharing rules to avoid performance issues

Salesforce Backup & Restore native solution

1. Managed package offered by Salesforce 2. All data backups will be encrypted at rest and in transit 3. Backup and Restore gives customers the tools to create and restore comprehensive backups with just a few clicks, all within Salesforce, rather than waiting weeks for a .csv file that then requires days to re-load.

Multiple feature licenses can be assigned to users to provide features to users that are not included in the user license that is assigned to them. Below lists some of the feature licenses - what are those feature licenses?

1. Marketing User 2. Knowledge User 3. Salesforce CRM Content User 4. Service Cloud User

Person Accounts

1. Person accounts can be merged only with other person accounts 2. Person accounts can't be included in account hierarchies 3. Person accounts can't have direct relationships with other accounts or contacts 4. You can use Contacts to Multiple Accounts to create indirect relationships between a person account and another person account, business account, or contact. 5. On cases, person accounts can be entered in the Account Name field, the Contact Name field, or both. 6. Account and contact fields that appear on person account records can be tracked using the account field history settings. 7. Person accounts don't support certain business account fields (such as Parent Account) and some contact fields (such as Reports To) 8. Custom formula fields from contacts can't be referenced through person accounts 9. Leads that don't have a value in the Company field are converted to person accounts. Leads that do have a value in the Company field are converted to business accounts. 10. To display a custom Lightning page for person accounts, create a custom account record page, then assign it to the person account record type (has its own record type)

Probabilistic Matching

1. Probabilistic matching uses a statistical approach to determine if two records represent the same piece of information. 2. A wider set of data elements is used for matching. 3. Weights are used to calculate the match scores, and thresholds are used to determine a match, non-match, or possible match. 4. That is why it would most likely be the more suitable matching approach for the company's MDM implementation.
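
The weights-and-thresholds idea above can be sketched in a few lines. This is only an illustration: the field names, the weights, the two thresholds, and the use of `difflib.SequenceMatcher` as the per-field similarity measure are all assumptions, not how any particular MDM product scores records.

```python
from difflib import SequenceMatcher

# Hypothetical field weights and thresholds, chosen only for illustration.
WEIGHTS = {"name": 0.5, "email": 0.3, "city": 0.2}
MATCH_THRESHOLD = 0.85
POSSIBLE_THRESHOLD = 0.60


def field_similarity(a: str, b: str) -> float:
    """Return a similarity in [0, 1]; a missing value on either side scores 0."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted sum of per-field similarities."""
    return sum(
        weight * field_similarity(rec_a.get(field, ""), rec_b.get(field, ""))
        for field, weight in WEIGHTS.items()
    )


def classify(rec_a: dict, rec_b: dict) -> str:
    """Map the score onto match, possible match, or non-match."""
    score = match_score(rec_a, rec_b)
    if score >= MATCH_THRESHOLD:
        return "match"
    if score >= POSSIBLE_THRESHOLD:
        return "possible match"
    return "non-match"
```

Two records that agree on name and email but where one has no city fall between the two thresholds, so they would be routed for review as a possible match rather than merged automatically.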

General Guidelines for Data Loads - Minimize number of fields

1. Processing time is faster if there are fewer fields loaded for each record. 2. Foreign key, lookup relationship, and roll-up summary fields are more likely to increase processing time. 3. It's not always possible to reduce the number of fields in your records, but if it is, loading times improve.

Data Stewardship

1. Puts tactical roles and activities into effect to ensure adherence and support of the data governance plan. 2. It includes assigning people to uphold the plan, and developing strategy for monitoring and maintenance of customer data. 3. Including data quality monitoring and maintenance

Examples of non-deterministic formula fields:

1. Reference other entities (i.e., fields accessible through lookup fields) 2. Include other formula fields that span over other entities 3. Use dynamic date and time functions (e.g., TODAY, NOW) 4. Include any of these fields: Owner, autonumber, divisions, or audit fields (except for CreatedDate and CreatedByID fields) 5. Standard fields with special functionalities (see attachment)

HistoryRetentionPolicy Object

1. Represents the policy for archiving field history data. When you set a policy, you specify the number of months that you want to keep field history in Salesforce before archiving it. By default, when Field Audit Trail is enabled, all field history is retained. 2. This component is only available to users with the RetainFieldHistory permission.

What are some suggestions for increasing throughput?

1. Run your requests in parallel. 2. Have an administrator with access to all the data perform the extraction. This practice can help you minimize sharing calculation overhead. Always use the Bulk API or batch Apex when dealing with large data volumes. Note: Your requests might not always be returned in the order that you submitted them. 3. If you decide to use a skinny table, you must first create an Apex trigger to take the auto-number field (or the sequence ID field) and store its value in a static custom field because formula fields cannot be included in skinny tables

Some facts about SOSL

1. SOSL can tokenize multiple terms within a field, and can build a search index off of this. 2. If you're searching for a specific distinct term that you know exists within a field, you might find SOSL faster than SOQL 3. For each Apex transaction, the governor limit on records returned by a single SOSL query is 2,000; the limit on total records retrieved by SOQL queries is 50,000

How Bulk Queries Are Processed

1. Salesforce attempts to execute the query. If the query doesn't execute within the standard two-minute timeout limit, the job fails and a QUERY_TIMEOUT error is returned. 2. If this happens, rewrite a simpler query and resubmit the batch. 3. If the query succeeds, Salesforce attempts to retrieve the results. If the results exceed the 1 GB file size limit or take longer than 10 minutes to retrieve, the completed results are cached and another attempt is made. 4. After 15 attempts, the job fails and the error message Retried more than fifteen times is returned. 5. If this happens, consider using the PK Chunking header to split the query results into smaller chunks. Note: skinny tables don't include soft-deleted rows.

External Services

1. Salesforce integration product that encompasses (1) registering an external web service that you submit as an OpenAPI-compliant specification defining the web service, and (2) magically (well, almost!) bringing the operations of your external web service into the Salesforce platform (see invocable actions) for use with point-and-click tools like Flow Builder. 2. In a nutshell, it declaratively connects external REST APIs using OpenAPI standards.

How does Salesforce searching work?

1. Salesforce performs indexed searches by first searching the indexes for appropriate records, then narrowing down the results based on access permissions, search limits, and other filters. This process creates a result set, which typically contains the most relevant results. 2. After the result set reaches a predetermined size, the remaining records are discarded. 3. The result set is then used to query the records from the database to retrieve the fields that a user sees.

Standard Matching Rules

1. Salesforce provides standard matching rules for business and person accounts, contacts, and leads

General Guidelines for Data Loads - Optimize Batch Size

1. Salesforce shares processing resources among all its customers. To ensure that each organization doesn't wait too long to process its batches, any batch that takes more than 10 minutes is suspended and returned to the queue for later processing. 2. The best course of action is to submit batches that process in less than 10 minutes.

Options to share information between Salesforce orgs

1. Salesforce to Salesforce (records are replicated in receiving org) 2. Change Data Capture 3. Create an Apex REST resource with connected app 4. Salesforce Connect (using the Cross Org adapter; virtualization, no data replication or actual exchange); With the cross-org adapter, Salesforce Connect uses Lightning Platform REST API calls to access records in other Salesforce orgs

General Guidelines for Data Loads - Minimize Number of Batches in the Asynchronous Queue

1. Salesforce uses a queue-based framework to handle asynchronous processes from such sources as future and batch Apex, and Bulk API batches. 2. This queue is used to balance request workload across organizations. If more than 2,000 unprocessed requests from a single organization are in the queue, any more requests from the same organization will be delayed while the queue handles requests from other organizations. 3. Minimize the number of batches submitted at one time to ensure that your batches are not delayed in the queue.

Where can divisions be used?

1. Search 2. List views: If you have the "Affected by Divisions" permission, list views include only the records in the division you specify when creating or editing the list view. 3. Chatter: doesn't support divisions 4. Reports: if you have the "Affected by Divisions" permission, you can set your report options to include records in just one division or all divisions. 5. Viewing records and related lists: shows all records 6. Creating records: When you create accounts, leads, or custom objects that are enabled for divisions, the division is automatically set to your default division, unless you override this setting.

Difference between a simple and composite SOQL filter

1. Simple: A simple filter would be each of the field expressions (<field> <operator> <value>) in a condition expression that uses the "AND" operator. 2. The result of joining 2 or more field expressions via the "OR" operator is a composite filter.

How do you segment the data governance models?

1. Small company --> decentralized or bottom-up 2. Medium company --> decentralized or bottom-up approach 3. Really large company --> centralized or top-down data governance model

When enabling the State/Country picklists, where are they available?

1. State and country/territory picklists are available in the shipping, billing, mailing, and "other" address fields in the account, campaign members, contact, contract, lead, order, person accounts, quotes, and service contracts standard objects 2. The picklists are also available for managing users and companies in Setup

Retrieving custom metadata to VF

1. Static methods are available to access the records, so SOQL queries are not required, making it a low-cost and efficient solution for storing and accessing reusable data. 2. For example, the getAll() method can be used to retrieve all the records of a specific custom metadata type. 3. Since a developer does not have to use SOQL queries, there is no impact on the associated governor limit.

Retrieving custom metadata and Apex

1. Static methods are available for accessing custom metadata type records in Apex code. 2. These are getAll(), getInstance(recordId), getInstance(qualifiedApiName), and getInstance(developerName). These can be used to retrieve information from custom metadata type records faster without using the SOQL engine. 3. For example, the getAll() method can be used to retrieve a map of all the records of a particular custom metadata type

What is the Bulk API 2.0?

1. The REST-based Bulk API 2.0 provides a programmatic option to asynchronously upload, query, or delete large data sets in your Salesforce org. Note: the way to export data with the Bulk API is using Data Loader; you can use the Data Loader command-line interface to schedule automatic exports 2. Any data operation that includes more than 2,000 records is a good candidate for Bulk API 2.0 to successfully prepare, execute, and manage an asynchronous workflow that makes use of the Bulk framework. 3. Jobs with fewer than 2,000 records should involve "bulkified" synchronous calls in REST (for example, Composite) or SOAP. 4. Using the API requires basic familiarity with software development, web services, and the Salesforce user interface. 5. This API is enabled by default for Performance, Unlimited, Enterprise, and Developer Editions. 6. The "API Enabled" permission must be enabled.

MDM Implementation - Registry

1. The Registry style is typically used when there are various source systems and there is a need to identify duplicates by running cleansing and matching algorithms on data from the source systems. 2. Unique global identifiers are assigned to matched records to help identify a single version of the truth. 3. Although this is a non-intrusive approach since the source systems remain unchanged and changes to the master data continue to be made through existing source systems, it does not utilize a central hub. 4. This style can be utilized when a company has multiple low control and autonomous systems, and there is a need for distributed governance by those remote systems.

Review from previous cert

1. The chance of locking errors can be reduced by scheduling separate group maintenance processes carefully so that they do not overlap. 2. Granular locking can be utilized to allow some group maintenance operations to proceed concurrently if there is no hierarchical or other relationship between the roles of users and groups involved in the updates. For instance, role insertion can be performed concurrently with user role change. Granular locking is enabled by default. 3. Retry logic can also be implemented in the integration code to recover from failure when there is a locking error.

MDM Implementation - Co-Existence

1. The coexistence style can be used if the master data need to be stored in the central MDM system and updated in the source systems. 2. It is an expensive approach since master data changes can happen in the MDM system as well as in the source systems. 3. Data is mastered in the source systems and then synchronized with the central hub. It is suitable when a company wants to mirror customer data from disparate enterprise systems and focus on shared services.

What are the key points to ensure a formula field can be used as an index (it has to be deterministic):

1. The formula contains fields from a single object only (not relationship fields) 2. The formula field doesn't reference any non-deterministic functions (e.g. SYSDATE) 3. The formula field doesn't reference any non-supported fields for including in indexes. 4. The formula field contains references to Primary Keys (e.g., Id)

Matching Rules - Fuzzy Matching

1. The fuzzy matching methods look for strings that approximately match a pattern. 2. Some fuzzy matching methods, such as Acronym and Name Variant, identify similarities using hard-coded dictionaries. 3. Because the dictionaries aren't comprehensive, results can include unexpected or missing matches. 4. Specific fuzzy matching methods are available for commonly used standard fields on accounts, contacts, and leads.

Granular locking and retry logic

1. The probability of locking errors can be reduced by using the granular locking feature which allows some group maintenance operations to proceed simultaneously. 2. Retry logic can also be implemented in integrations and other automated group maintenance processes to recover from a failure to acquire a lock.
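
The retry logic mentioned above could look roughly like the sketch below. `LockError` is a hypothetical stand-in for whatever lock failure the integration actually receives, and the attempt count and backoff delays are arbitrary assumptions.

```python
import time


class LockError(Exception):
    """Hypothetical stand-in for a lock failure reported by an API call."""


def with_retry(operation, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call operation(); on LockError wait with exponential backoff and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except LockError:
            if attempt == max_attempts:
                raise  # give up and surface the lock error to the caller
            # Delay doubles each time: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing `sleep` as a parameter keeps the helper easy to test; production code would normally leave the default in place.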

Lightning External Apps Starter License

1. This license is used for Custom digital experiences to engage any external stakeholder, including Brand Engagement and Customer Loyalty. 2. This license has limited access to CRM objects. 3. It is a good fit if your use case needs only custom objects and does not fit into either a self-service customer community or a partner relationship use case. 4. Advantage of Lightning External App Starter is that it comes with additional platform capacities like API calls, Data storage, and File storage. 5. Disadvantage of this license over Customer Community is that External App only comes with read-only access to Knowledge.

Multi-Level M/D Relationships

1. To create multilevel master-detail relationships, you need the Customize Application user permission 2. Standard objects can't be on the detail side of a custom object in a master-detail relationship. 3. Three levels of M/D relationships are supported in Salesforce. 4. Roll-up summary fields work as in two-object master-detail relationships. A master can roll up fields on detail records; however, it can't directly roll up fields on subdetail records. The detail record must have a roll-up summary field for the field on the subdetail record, allowing the master to roll up from the detail's roll-up summary field. 5. The two bottom levels can only be custom objects.

How do you determine the lower and higher boundary for the load? How do you break the query into chunks? Don't understand

1. To find the lowest boundary: SELECT ExportID__c FROM YourObject__c ORDER BY ExportID__c ASC NULLS LAST LIMIT 1 2. To find the highest boundary: SELECT ExportID__c FROM YourObject__c ORDER BY ExportID__c DESC NULLS LAST LIMIT 1 3. To extract one chunk between the boundaries: SELECT Field1__c, Field2__c, [Field3__c ...] FROM YourObject__c WHERE ExportID__c > 1000000 AND ExportID__c <= 1200000
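
Once the two boundary values are known, the chunk queries can be generated mechanically. The sketch below assumes ExportID__c is an integer sequence; the function name, the default field list, and the chunk size are illustrative only.

```python
def chunk_queries(lowest, highest, chunk_size,
                  fields=("Field1__c", "Field2__c"),
                  sobject="YourObject__c", key="ExportID__c"):
    """Build one SOQL string per (lower, upper] range covering lowest..highest."""
    queries = []
    lower = lowest - 1  # ranges exclude their lower bound, so start one below
    while lower < highest:
        upper = min(lower + chunk_size, highest)
        queries.append(
            f"SELECT {', '.join(fields)} FROM {sobject} "
            f"WHERE {key} > {lower} AND {key} <= {upper}"
        )
        lower = upper
    return queries
```

For example, `chunk_queries(1, 500000, 200000)` yields three queries whose ranges meet end to end, so every ExportID__c value is extracted exactly once.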

MDM Implementation - Consolidation

1. Unlike the Registry style, the Consolidation style utilizes a central hub for storing the golden record, which is used for reporting and reference. 2. This style is also non-intrusive since the source systems do not modify the master data and are not responsible for master data authoring. 3. The master data is consolidated from various systems in the central hub. In the central hub, the consolidated data is cleansed, matched, and integrated to offer a complete single record. 4. It provides a trusted source of data for reporting and analytics. 5. This approach is suitable when an organization wants to improve the quality of analytics data through downstream reporting and can be utilized for the company's implementation requirement.

PK Chunking

1. Use PK Chunking to handle extra-large data set extracts (it should be used when an object has more than 10 million records). PK stands for Primary Key, the object's record ID, which is always indexed. 2. PK chunking splits bulk queries on very large tables into chunks based on the record IDs, or primary keys, of the queried records. With this method, first query the target table to identify a number of chunks of records with sequential IDs. Then submit separate queries to extract the data in each chunk, and finally combine the results. 3. Use the PK chunking request header to enable automatic PK chunking for a bulk query job.

What is request and reply?

1. Use Visualforce pages or custom Lightning components to request and receive data from an external system or perform transactional activities using the SOAP API or REST API. 2. Small data volumes 3. Synchronous

Data Classification Settings

1. Use default data sensitivity level checkbox = Applies a default sensitivity level value to all contacts, leads, person accounts, and users. 2. Edit Data Sensitivity Picklist Values 3. Edit Compliance Categorization Picklist Values

Additional best practices for reports and dashboards

1. Use efficient filters: Whenever possible use selective filters. The more selective your filters are, the better the performance. Try to use "equals" or "not equal" expressions instead of "contains" or "does not contain". 2. Use time frame filters; information will be loaded faster with fixed time frames rather than with open-ended ones; can also use relative dates 3. Remove unnecessary information: Reduce the number of fields by removing unnecessary columns. This is especially important when you are grouping information, for instance in matrix or summary reports. 4. Use dashboards for reports which take a long time to process

QueryAll

1. Use queryAll to identify the records that have been deleted because of a merge or delete. 2. queryAll has read-only access to the field isDeleted; otherwise it is the same as query().

Introduction to SOSL

1. Use the Salesforce Object Search Language (SOSL) to construct text-based search queries against the search index. 2. By default, SOSL queries scan all entities 3. The search engine looks for matches to the search term across a maximum of 2,000 records 4. Sharing is applied after the result set is returned from the search stack. 5. If your filters are not selective and cause search term matches of more than 2K records, there is a possibility of running into search crowding.

Loading Data from the API - Goal: Using the Most Efficient Operations

1. Use the fastest operation possible — insert() is fastest, update() is next, and upsert() is next after that. 2. If possible, also break upsert() into two operations: create() and update(). 3. Ensure that data is clean before loading when using the Bulk API 2.0. Errors in batches trigger single-row processing for that batch, and that processing heavily impacts performance.
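
Splitting an upsert into separate insert and update loads amounts to partitioning the source rows by whether their key already exists in the org. A minimal sketch, assuming the existing keys were extracted beforehand and that `External_Id__c` is a hypothetical external ID field:

```python
def split_upsert(records, existing_keys, key="External_Id__c"):
    """Partition records into (to_insert, to_update) by whether their key already exists."""
    existing = set(existing_keys)  # set lookup keeps the pass over records linear
    to_insert, to_update = [], []
    for record in records:
        (to_update if record[key] in existing else to_insert).append(record)
    return to_insert, to_update
```

The two lists can then be submitted as separate jobs, one per operation.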

Customer Community Licenses

1. Used by Experience Cloud Sites with high volume of external users who need access to case records and/or knowledge articles 2. Can be used with person accounts

Customer Community Plus Licenses

1. Used for business-to-consumer experiences when external users need access to reports, dashboards, or advanced sharing 2. Can be used with person accounts

Vertical Optimization (based on number of records)

1. Using PK Chunking is one very efficient way of splitting a query vertically 2. Partial backup is also a type of vertical optimization.

When working with large volumes of data, what are the two key factors to build efficient queries, reports, and list views?

1. Using selectivity in indexes without going over thresholds. 2. Using standard and custom indexes

Request and Reply Integration - User Interfaces

1. VF pages: Use the Visualforce framework to create custom user interfaces and Apex controllers that perform server-side logic and custom functionality 2. Custom Lightning Components: Use the Aura or Lightning Web Component programming model for implementing custom user interfaces and extended functionality.

How do locks work with the Task object?

1. When a task is inserted or updated, the associated Who, What, and Account records will get locked 2. However, on insert, the locks only occur if the task status is Completed and the task activity date is not null. 3. On update and delete, the locks occur regardless of the task status or activity date values. Keep these conditions in mind as we talk about common strategies when loading tasks. Assigned To: User or Queue Related To: a lot of records Name: Contact or Lead record

MDM Implementation - Transaction/Centralized

1. When using this style, a central hub is the single provider of all the master data. 2. Master data attributes are stored and maintained using linking, cleansing, matching, and enriching algorithms, which enhances the data. 3. The enhanced data can be published back to the respective source system. 4. This style is suitable when maximum control over access and security is required, which could be due to high data sensitivity.

Salesforce to Salesforce - Quick Summary

1. When you enable Salesforce to Salesforce, a new user named "Connection User" is created (does not consume license) 2. When your business partner updates a shared record, the Last Modified By field on the record in your organization displays Connection User, allowing you to easily track all changes made by your business partners. 3. The Connection User is automatically assigned to the Partner Network profile. 4. System administrators can share all records, but most users can only forward records that they (or their subordinates) own. 5. You control the type of records you share with your connected organizations by selecting which objects and fields to publish. 6. Your connected organizations don't have direct access to records that you're sharing. They have a record in their organization that is connected to your record through Salesforce to Salesforce. 7. Any updates to the shared information on either record are reflected on the other record.

What happens when you upload records using the bulk API?

1. When you upload records using Bulk API, those records are streamed to Force.com to create a new job. 2. As the data rolls in for the job, it's stored in temporary storage and then sliced up into user-defined batches (10,000 records max). 3. Even as your data is still being sent to the server, the Force.com platform submits the batches for processing.
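
Slicing a record set into batches that respect the 10,000-record ceiling can be sketched as follows. The function name is illustrative, and a real client would also have to respect any other per-batch limits that apply.

```python
MAX_BATCH_SIZE = 10_000  # per-batch record ceiling stated in the card above


def split_into_batches(records, batch_size=MAX_BATCH_SIZE):
    """Slice a list of records into consecutive batches of at most batch_size."""
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError(f"batch_size must be between 1 and {MAX_BATCH_SIZE}")
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]
```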

Contacts to Multiple Accounts

1. When you use Contacts to Multiple Accounts, each contact still requires a primary account (the account in the Account Name field). 2. The contact and its primary account have a direct relationship 3. You can add other accounts to the contact. These secondary account-contact relationships are indirect. 4. Contacts to Multiple Accounts works with person accounts 5. A person account can be either a related contact on a business account or a related account on a contact. 6. A person account can also be related to another person account as either a related contact or related account. 7. When you relate a person account to an account or contact, the relationship is always indirect. 8. Person accounts don't have primary accounts, so person accounts can't be directly related to business accounts. 9. Contacts to Multiple Accounts can't be enabled with the Metadata API.

Bulk ingest jobs using the Bulk 2.0 API - 2

1. While processing ingest jobs, Salesforce Bulk API 2.0 automatically divides your job's data into multiple batches to improve performance. 2. Salesforce creates a separate batch for every 10K records in your job data, up to a daily maximum of 150 Million records. 3. Just as a job can fail, so can an individual batch. If Salesforce can't process all the records in a batch within 10 minutes, the batch fails. 4. Salesforce automatically retries failed batches up to a maximum of 10 times. If the batch still can't be processed after 10 retries, the entire ingest job is moved to the Failed state and remaining job data isn't processed.

Deleting Data - Goals

1. While the data is soft deleted, it still affects database performance because the data is still resident, and deleted records have to be excluded from any queries. 2. In addition, Bulk API and Bulk API 2.0 support a hard delete option, which allows records to bypass the Recycle Bin and become immediately available for deletion

Data Governance & Stewardship: 1 - Assess the State of Your Data

1. Who is using customer data? 2. What are the business needs of the data? 3. Which data is used the most? 4. How is the data being used?

Bulk API Limits

1. You can submit up to 10,000 batches per rolling 24-hour period. You can't create batches associated with a job that is more than 24 hours old.

With which objects can PK chunking be used?

1. You can use PK Chunking with most standard objects. 2. It's supported for Account, Campaign, CampaignMember, Case, Contact, Lead, LoginHistory, Opportunity, Task, and User, as well as all custom objects.

General Guidelines for Data Loads - Minimize Number of Triggers

1. You can use parallel mode with objects that have associated triggers if the triggers don't cause side-effects that interfere with other parallel transactions. 2. However, Salesforce doesn't recommend loading large batches for objects with complex triggers (should you turn off the triggers?) 3. Instead, rewrite the trigger logic as a batch Apex job that is executed after all the data has loaded.

General Guidelines for Data Loads - Use Parallel Mode Whenever Possible

1. You get the most benefit from the Bulk API by processing batches in parallel, which is the default mode and enables faster loading of data. 2. However, sometimes parallel processing can cause lock contention on records. The alternative is to process using serial mode. 3. Don't process data in serial mode unless you know this would otherwise result in lock timeouts and you can't reorganize your batches to avoid the locks. 4. You set the processing mode at the job level. All batches in a job are processed in parallel or serial mode

Big object query considerations

1. Only a limited subset of SOQL commands is available for big objects, and there are certain restrictions. For example, there should be no gaps between the first and last field in the index. 2. Certain SOQL operations are not available: the !=, LIKE, NOT IN, EXCLUDES, and INCLUDES operators cannot be used.

General Guidelines for Data Loads - Organize Batches to Minimize Lock Contention

1. If you organize AccountTeamMember records by AccountId so that all records referencing the same account are in a single batch, you minimize the risk of lock contention by multiple batches. 2. If there are problems acquiring locks for more than 100 records in a batch, the Bulk API places the remainder of the batch back in the queue for later processing. 3. When the Bulk API processes the batch again later, records marked as failed are not retried. To process these records, you must submit them again in a separate batch. 4. If the Bulk API continues to encounter problems processing a batch, it's placed back in the queue and reprocessed up to 10 times before the batch is permanently marked as failed.
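
As a rough illustration of the first point, the sketch below groups child records by their parent ID before splitting them into batches, so that no parent's children are spread across two batches. The function name, the `AccountId` key, and the batch size are assumptions made for illustration; this is client-side preparation done before submitting data, not part of the Bulk API itself.

```python
from collections import defaultdict
from typing import Dict, List


def group_into_batches(records: List[Dict[str, str]],
                       parent_key: str = "AccountId",
                       max_batch_size: int = 10_000) -> List[List[Dict[str, str]]]:
    """Pack records into batches so records sharing a parent stay together."""
    # Collect all child records under their parent ID.
    by_parent = defaultdict(list)
    for record in records:
        by_parent[record[parent_key]].append(record)

    batches = []
    current = []
    for group in by_parent.values():
        # Close the current batch if adding this parent's group would overflow it.
        # A single group larger than max_batch_size is kept whole (oversized batch).
        if current and len(current) + len(group) > max_batch_size:
            batches.append(current)
            current = []
        current.extend(group)
    if current:
        batches.append(current)
    return batches
```

Because each parent's records land in exactly one batch, two batches running in parallel should not compete for a lock on the same account.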

Additional info on event monitoring - 2

3. The event log files can be viewed and downloaded after 24 hours. Graphs can be used to visualize the files. The USER_TYPE field in an Event Log File tells whether the user associated with an event is an authenticated or guest user.

How can you determine if a SOQL query has a filter that is selective - 2

7. For a custom index, the selectivity threshold is 10 percent of the first million targeted records and 5 percent of all records after that first million. In addition, the selectivity threshold for a custom index maxes out at 333,333 targeted records, which you could reach only if you had more than 5.6 million records. 8. If the filter exceeds the threshold, it won't be considered for optimization. 9. If the filter doesn't exceed the threshold, this filter IS selective, and the query optimizer will consider it for optimization.
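
The arithmetic in point 7 can be sketched as a small calculation. This is only a worked version of the rule as stated on this card (10 percent of the first million, 5 percent after that, capped at 333,333); the function names are hypothetical and the real query optimizer is not exposed this way.

```python
CUSTOM_INDEX_CAP = 333_333  # maximum threshold stated on the card


def custom_index_threshold(total_records: int) -> int:
    """Selectivity threshold for a custom index, per the rule on this card."""
    first_million = min(total_records, 1_000_000)
    remainder = max(total_records - 1_000_000, 0)
    # 10% of the first million records plus 5% of everything beyond it.
    threshold = first_million * 10 // 100 + remainder * 5 // 100
    return min(threshold, CUSTOM_INDEX_CAP)


def is_selective(total_records: int, targeted_records: int) -> bool:
    """A filter is selective when it does not exceed the threshold."""
    return targeted_records <= custom_index_threshold(total_records)
```

For an object with 500,000 records the threshold works out to 50,000, and for 2 million records it is 100,000 + 50,000 = 150,000.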

Account Data Skew - Avoid Sharing Issues

A Public Read/Write model can be considered to ensure that the parent account stays locked but sharing calculations do not occur for the child records. The number of child records associated with each account should be under 10,000.

What's an important use case for big objects?

A big object can be used when the number of records in an sObject is close to 1 million or there are performance issues while running reports or queries.

What are the statuses for big objects?

A big object is created with 'In Development' status and updated to 'Deployed' status once the index is defined.

Customer 360 Data Manager - mapping set

A mapping set is a group of related objects that need to be mapped between the data source and the Cloud Information Model. It aligns and transforms the data across the connected orgs. For example: 1. The Order object in Commerce Cloud can be mapped to the Sales Order entity in the Cloud Information Model. 2. The Account object in the Sales Cloud org can be mapped to the Individual entity in the Cloud Information Model. Mapping set templates that include default mappings can be utilized. To use integrated experiences, Lightning components such as 'C360 Order History' can be added to record pages using Lightning App Builder.

Cosmic Express uses an external system to manage sales orders. A Lightning web component has been developed to allow users to view records such as accounts, contacts, and opportunities on a Lightning page in Salesforce. The sales director would also like to display the related sales orders based on the currently viewed account record on the same page. The data architect needs to recommend a suitable approach for this use case that does not require storing the sales orders in Salesforce.

A mashup can be utilized for this requirement. Salesforce supports two mashup designs. The web page of the external system that hosts sales orders can be made to look like part of the user interface of the Lightning web component. Requests to retrieve sales orders can be passed to the external system. The other option is using the Lightning web component to execute an Apex callout that retrieves sales orders related to the account record that is currently being viewed on the page.

Salesforce reporting snapshots

A reporting snapshot lets you report on historical data: 1. Authorized users can save tabular or summary report results to fields on a custom object, then map those fields to corresponding fields on a target object 2. They can then schedule when to run the report to load the custom object's fields with the report's data. 3. Reporting snapshots enable you to work with report data similarly to how you work with other records in Salesforce. After you set up a reporting snapshot, users can: 1. Create and run custom reports from the target object 2. Create dashboards from the source report. 3. Define list views on the target object, if it's included on a custom object tab

Cosmic Express has operations in four regions, namely, North America, South America, Europe, and Asia Pacific. The sales users of the company are currently using a legacy CRM application to manage customer accounts in these regions, but the application does not perform well due to the large number of account records. The management of the company would like to use a Lightning component in Salesforce to allow users to manage the records. There are about 15 million account records that will need to be imported into Salesforce. The data architect needs to suggest a performance-friendly approach for storing the large number of records. She should take into consideration that the queries used for retrieving or updating the account data should only return the records that are relevant for the current user's region.

Although the Account object can be used to store the account records in Salesforce, divisions should be utilized to partition the account data based on the four regions. Four divisions can be created for North America, South America, Europe, and Asia Pacific. Divisions reduce the number of records returned by queries and improve their performance. They can be enabled by contacting Salesforce Customer Support.

Sales users are required to delete old and inactive account records from Salesforce. However, the company would still like to maintain a record of the data even after the account records are cleared from the Recycle Bin in Salesforce. The data architect needs to suggest a suitable approach that can be implemented for this use case.

An ETL job can be used to query, extract, and then delete the deleted account records. They can be sent to the data warehouse, where they can be marked as deleted records.

Salesforce currently stores a lot of detailed information about the different types of items produced by different manufacturing facilities and stored in different warehouses. Salesforce users do not often require access to this information. The data architect needs to recommend a solution that can be utilized for storing this data. However, if the records are removed from Salesforce, users in Salesforce should still be able to access them when required. The data architect has been asked to consider if Heroku could be used for this requirement.

An external Heroku Postgres database could be used to store this information. Heroku Connect and Heroku External Objects can be utilized to expose the data stored in the Postgres database in Salesforce. The data can be accessed from a Heroku app and also synced with and displayed in Salesforce.

What is an external ID? What are some advantages of an external ID?

An external ID is a custom field that is used to store unique record identifiers from an external system. It is automatically indexed and improves query performance.

Loading Data from the API - Goal: Improving Performance

Any data operation that includes more than 2,000 records is a good candidate for Bulk API 2.0 to successfully prepare, execute, and manage an asynchronous workflow that makes use of the Bulk framework. Jobs with fewer than 2,000 records should involve "bulkified" synchronous calls in REST (for example, Composite) or SOAP.
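
The rule of thumb above can be written as a simple decision helper. This is a minimal sketch: the function name and return strings are hypothetical, and treating exactly 2,000 records as a synchronous job is an assumption, since the card only covers "more than" and "fewer than" 2,000.

```python
BULK_THRESHOLD = 2_000  # record count above which the card recommends Bulk API 2.0


def recommend_api(record_count: int) -> str:
    """Suggest an API family for a data operation based on its size."""
    if record_count > BULK_THRESHOLD:
        return "Bulk API 2.0"
    # Assumption: the boundary case (exactly 2,000) is handled synchronously.
    return "REST Composite or SOAP (bulkified synchronous calls)"
```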

How can you bring data from a big object to a custom object?

Async SOQL can be leveraged to query the big object records and return a filtered subset into a custom object. The records in this sObject can then be used in reports, dashboards, or otherwise manipulated as required.

Mitigation Strategies for Lookup Skew - 4 (Reducing the Load)

Automated processes or integrations running in parallel can be run serially during non-peak periods to avoid lock exceptions. The batch size can be reduced if the processing should occur during end user operations.

Benefits of Canonical Model

Benefits of the shift to a canonical data model (CDM) are: 1. Improved business communication through standardization 2. Increased reuse of software components 3. Fewer connections: the number of possible connections is n * 2, against n * (n - 1) for point-to-point integration 4. Fewer transformations 5. Reduced integration time and cost
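
The connection counts in point 3 can be checked with a short calculation. This sketch assumes n is the number of systems being integrated and that each connection is counted per direction, which is how the two formulas on the card read; the function names are hypothetical.

```python
def point_to_point_connections(n: int) -> int:
    """Every system maps directly to every other system, in both directions."""
    return n * (n - 1)


def canonical_model_connections(n: int) -> int:
    """Every system maps only to and from the canonical model."""
    return n * 2


def connections_saved(n: int) -> int:
    """How many mappings the canonical model avoids for n systems."""
    return point_to_point_connections(n) - canonical_model_connections(n)
```

With 5 systems, point-to-point integration needs 20 mappings while the canonical model needs 10. The two approaches break even at 3 systems, so the benefit grows with every system added beyond that.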

Storage cost for big objects

Big Object storage doesn't count against organization data storage limit. Based upon Salesforce edition, up to 1 million records are free. Additional record capacity can be bought in blocks (50M but can vary) and price is negotiable.

How can big objects be queried?

Big object records can be queried without impact on performance. This is achieved via SOQL and Async SOQL.

Which API can be used to query, export, and import large volumes of data?

Bulk API

Which API should be used to export millions of records out of Salesforce?

Bulk API

Information about the products in each store changes regularly as new products are added and existing products are sold to customers. Currently, sales users receive a spreadsheet from the IT department every week and have to manually update the product data related to each store by creating new product records and deleting existing product records. The IT director of the company would like to automate this process by implementing automatic creation and deletion of a large number of records in batches, but an integration designer has raised a concern that this process would require the creation of batches manually. The data architect needs to suggest a solution that automatically divides the data into multiple batches.

Bulk API 2.0 can be utilized to create a job for inserting or deleting a large number of records in batches. It automatically divides the job's data into multiple batches to improve performance. A separate batch is created for every 10,000 records in the job's data.
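
To get a feel for the batch count, the figure on this card (one batch per 10,000 records) can be turned into a quick estimate. This is only arithmetic on the stated figure; the function name is hypothetical, and the actual batching is done by Salesforce and is not controlled by the client.

```python
import math


def expected_batch_count(record_count: int, records_per_batch: int = 10_000) -> int:
    """Estimate how many batches a job's data is divided into."""
    # Round up: a final, partially filled batch still counts as a batch.
    return math.ceil(record_count / records_per_batch)
```

A job with 25,000 records would therefore be expected to produce 3 batches.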

Bulk API and Salesforce Connect

Bulk API can be used to retrieve a large number of records from a particular org and import them into another org. Salesforce Connect can be used to expose data that exists in an external system or another Salesforce org.

Which API can be used to ensure maximum performance while loading several million records into Salesforce?

Bulk API in parallel mode

Account Data Skew - Record Locking Issues

Child records (such as contacts) can fail to update due to record locking issues. Since the system locks both the parent account and the child record being updated, these issues can occur while updating a large number of child records associated with the same parent account in multiple threads

File Storage by Edition and Record Size

Contact Manager, Group, Professional, Enterprise, Performance, and Unlimited Editions are allocated 10 GB of file storage per org. Essentials edition is allocated 1 GB of file storage per org. Orgs are allocated additional file storage based on the number of standard user licenses. In Enterprise, Performance, and Unlimited Editions, orgs are allocated 2 GB of file storage per user license. Contact Manager, Group, Professional Edition orgs are allocated 612 MB per standard user license, which includes 100 MB per user license plus 512 MB per license for the Salesforce CRM Content feature license.

Cosmic Infinity is a company that offers a music subscription service. It recently started using Salesforce and would like to create a custom application for internal users that allows them to retrieve specific data about artists, albums, and songs. The application will use SOQL queries to retrieve the values of specific fields. There are almost 1 million artists, 4 million albums, and 12 million songs that need to be migrated to Salesforce. The data architect needs to recommend approaches for designing a suitable data model for this use case while ensuring that the large number of records does not degrade the performance of the custom application

Custom objects can be defined in Salesforce to store artists, albums, and songs. In order to improve the performance of the SOQL queries that will be used in the custom application, Salesforce Customer Support can be contacted to create a skinny table for each object with all the necessary fields that will be retrieved. Custom indexes can also be created on the fields that will be used in the SOQL queries in order to speed up their performance. Selective filter conditions should be used in the SOQL queries to improve their performance.

Custom Settings

Custom settings can be used to create custom sets of static data that can be used across an organization. For example, two-letter state codes can be defined.

What is the CustomObject metadata type?

CustomObject represents the metadata type for custom objects and external objects. Each metadata type also has several fields that represent related metadata components, which can be included in the definition of the metadata type. Examples include CustomField, BusinessProcess (define picklist types for profiles, now flow or workflow), RecordType, and ValidationRule, which can also be utilized for documentation.

Ensuring Consistently High Data Quality - 2: Data Control

Data control includes using automated processes and tools to clean the data, getting users to fix the data, using data quality analytics, etc.

What is Data Obfuscation?

Data obfuscation is a way to modify and ensure privacy protection for PI and PII data. You can mask a field's contents by replacing the characters with unreadable results. For example, Blake becomes gB1ff95-$. Or you can convert a field into readable values that are unrelated to the original value. For example, Kelsey becomes Amber.
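
The two styles described above (unreadable masking versus readable but unrelated replacement) can be sketched as follows. This is an illustration of the idea only, not how any Salesforce masking tool is implemented; the function names, the replacement name list, and the character set are all assumptions. Unlike the example on the card, this mask keeps the original length.

```python
import hashlib
import random
import string

# Hypothetical pool of readable replacement values.
REPLACEMENT_NAMES = ["Amber", "Jordan", "Riley", "Morgan"]


def mask(value: str, seed: int = 0) -> str:
    """Replace every character with a random one, giving an unreadable result."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    alphabet = string.ascii_letters + string.digits + "-$#"
    return "".join(rng.choice(alphabet) for _ in value)


def pseudonymize(value: str) -> str:
    """Map a value to a readable replacement chosen from the pool.

    The same input always maps to the same replacement, so relationships
    between records survive. A real tool would also need to make sure the
    replacement never equals the original value.
    """
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return REPLACEMENT_NAMES[digest[0] % len(REPLACEMENT_NAMES)]
```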

Introduction to Data Tiering

Data tiering is the process whereby data is shifted from one storage tier to another. It allows an organization to ensure that the appropriate data resides on the appropriate storage technology in order to: 1. Reduce costs 2. Optimize performance 3. Reduce latency 4. Allow recovery. Data is typically tiered as hot, warm, and cold data.

Data Lineage

Defining a data lineage involves specifying the origin of data, how it is affected, and where it moves in its lifecycle.

Data Taxonomy

Defining a data taxonomy involves classifying the data into categories and sub-categories

Matching Algorithms Available with the Fuzzy Matching Method - Matching Algorithm: Initials

Determines the similarity of two sets of initials in personal names. For example, the first name Jonathan and its initial J match and return a score of 100.

Matching Algorithms Available with the Fuzzy Matching Method - Matching Algorithm: Acronym

Determines whether a business name matches its acronym. For example, Advanced Micro Devices and its abbreviation AMD are considered a match, returning a score of 100.
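
A simplified version of this check might look like the sketch below. It is not Salesforce's matching implementation; it only builds an acronym from the first letter of each word and returns 100 or 0, mirroring the scores described on the card. The function name is hypothetical.

```python
def acronym_score(business_name: str, candidate: str) -> int:
    """Return 100 if candidate is the acronym of business_name, else 0."""
    # Take the first letter of every whitespace-separated word.
    acronym = "".join(word[0] for word in business_name.split()).upper()
    return 100 if acronym == candidate.strip().upper() else 0
```

Calling `acronym_score("Advanced Micro Devices", "AMD")` returns 100, while a non-matching candidate such as "IBM" returns 0.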

Matching Algorithms Available with the Exact Matching Method - Matching Algorithm: Exact

Determines whether two strings are the same. For example, salesforce.com and Salesforce aren't considered a match because they're not identical. The algorithm returns a match score of 0.

MDM - Deterministic Matching

Deterministic Matching looks for an exact match between two records or pieces of data. 1. It focuses on using unique identifiers to determine a matching record. For example, the External ID field of an account record in Salesforce can be compared with the Record ID field of an account record in the ERP system to determine if the two records match. 2. Another way deterministic matching can be used is by looking for an exact comparison between specific fields on records, such as the 'Billing Address' field on account records in the two systems.

Which two types of matching techniques are available for identifying duplicates?

Deterministic and probabilistic matching

When will developers receive an error message with regards to a non-selective query in a trigger?

Developers will receive an error message when a non-selective query in a trigger executes against an object that contains more than 200,000 records.

Loading Data from the API - Goal: Deferring computations and speeding up load throughput

Disable Apex triggers, workflow rules, and validations during loads; investigate the use of batch Apex to process records after the load is complete.

When are divisions beneficial?

Divisions are beneficial for: 1. Organizations with extremely large amounts of data (greater than 1 million records in a single object). 2. Organizations that are effectively multiple companies all sharing one org, but operating quite separately. 3. Organizations that find their search results cluttered by data that is related to some other division that they never deal with. 4. Organizations that have relatively equal amounts of data in each proposed division.

General company requirements to use divisions

Does my organization meet the requirements? 1. You have more than 1 million records in a single object and more than 35 licenses. 2. You have large amounts of data and would like to improve search and analytics performance. If the company does not meet the requirements, implementing record types, sharing rules, picklists, Territory Management, etc. will be a better alternative.

Data Consolidation - SURVIVORSHIP FACTORS

Factors such as trust score, decay, validation, and precedence can influence which field value wins or survives during a merge

Frontier Foods is a multinational subsidiary company that sells food items, beverages, and household products. It has 96 offices, 78 supermarkets, and dozens of farms and warehouses. The company has eight departments, namely, supply management, production, logistics, sales, support, finance, marketing, and human resource management. The operations of the company rely heavily on shared data creation and maintenance, data sharing with variable groups of users, and regular collaboration between departments. There are multiple enterprise systems and the data needs to be distributed to them every day. The data architect needs to recommend a suitable data governance model for the company

Due to complex data maintenance requirements and the need for sharing the data with multiple departments, it would be best to utilize a centralized or top-down data governance model in this case. A central data governance council or organization should be responsible for building the processes and procedures, improving and maintaining all the master data based on departmental requests, and making adjustments based on business requirements. The size of the data governance council should be based on the number of requests that need to be handled on a regular basis. Automation should be used to add transparency and visibility to the processes. KPIs should be established for different types of master data requests. Fields should be classified based on their usage by different departments. Since a limited number of users would be responsible for setting up and maintaining the master data, it would increase the probability of creating consistent master data and making improvements quickly.

Cosmic Transport Solutions issues an invoice when a sales order is finalized in Salesforce. An external system is used to generate invoices. It contains more than 50 million invoices. The number of invoices that are generated is likely to increase by 5% each month. Each invoice related to a sales order should be visible in Salesforce on the page that displays the sales order. The architect needs to recommend an appropriate solution, considering both data storage space and performance.

Due to the large number of invoices in the external system, invoices should not be stored in Salesforce because it could have a negative impact on performance. Salesforce Connect should be used for this use case. An external object can be used to make invoices visible in Salesforce. A lookup relationship can be defined on the external object to the 'Sales Order' object to link invoices to sales orders in Salesforce. The 'Invoices' related list can be added to the page layout used for sales orders to make related invoices visible on sales orders.

The company will soon import more than one million leads which have been obtained from a third-party data service. The architect has been asked to ensure that it does not result in the creation of any duplicate records.

Duplicate and matching rules can be utilized to prevent the creation of duplicate records. A Lead de-duplication application from the AppExchange can also be downloaded and installed before importing leads into Salesforce.

What is contained in a Lightning Data package?

Each Lightning Data package includes the following: 1. A custom object containing data from the data service 2. An external object used for updating and importing records 3. A data integration rule that identifies matches between the external object records and your Salesforce records

Data in the Recycle Bin does not affect query performance

False.

Guest/Unauthenticated Users require a license to access Salesforce.

False.

What is the global party ID?

Each global profile (in Customer 360 Data Manager) includes the global party ID and a reconciled view of the customer's name, country, and contact points. The global party ID uniquely identifies an individual so that data about them can be accessed and shared across different systems. The global ID can be exported from Customer 360 Data Manager and added to customer records in other clouds and orgs such as another Salesforce org.

Queries in skinny tables use joins.

False.

Skinny tables only contain indexed fields.

False.

Salesforce Shield - Enhanced Transaction Security

Enhanced Transaction Security is a framework that intercepts real-time events and applies appropriate actions to monitor and control user activity. Each transaction security policy has conditions that evaluate events and the real-time actions that are triggered after those conditions are met. The actions are Block, Multi-Factor Authentication, and Notifications. Before you build your policies, understand the available event types, policy conditions, and common use cases. Enhanced Transaction Security is included in Real-Time Event Monitoring.

There can be multiple systems of record for a given piece of information.

False.

What are examples of real-time event monitoring events?

Examples of real-time event monitoring events that support transaction security policies are ApiEvent, ReportEvent, LoginEvent, ListViewEvent, etc.

With Customer 360 Data Manager, you can only integrate Salesforce clouds.

False.

A big object allows storing data in PDF format.

False

Big objects have standard fields

False

Big objects support encrypted fields

False

Customer 360 Data Manager supports encryption with Shield

False

The Bulk API can load batches faster if you process them in serial.

False. The Bulk API can load batches faster if you process them in parallel, but beware of data skew, which can be caused by a large number of child objects being associated with the same account. Note: to prevent locking errors, use serial mode.

Big objects work with flow/process builder

False.

Account and Person Account records consume the same amount of storage.

False. Person Accounts consume twice as much storage, as both an Account record and a Contact record are created. The purchase of extra data storage may be required, given the large number of customer records. Data storage calculations also include the Salesforce edition and number of licensed users.

Personal data can only be stored in standard objects.

False. Personal data can be stored in standard & custom objects; sensitive data is more restrictive.

Salesforce to Salesforce can be accessed via Lightning.

False. Salesforce to Salesforce can be configured to connect with business partners who can be invited to share data. It's available only in Salesforce Classic

All internal users require a Knowledge User feature license to access knowledge articles.

False. Users that will only access and view articles do not need an additional license as read only access to Knowledge articles is included in a Service Cloud license.

With Salesforce Connect, you can only read data from an external system.

False. Search, create, update and delete data in the remote system with Salesforce Connect.

When you use Heroku external objects, data is copied or moved to the Salesforce org.

False. Since no data is moved or copied to the Salesforce org and data is accessed real time, no synchronization process nor polling is required.

Customer Community licenses can view, edit, and delete Events & Calendars.

False. They can only view these records. A Customer Community Plus license is needed to view, edit, and delete these records.

Feature Licenses

Feature licenses can be assigned to existing users to enable access to an additional feature that is not provided in the user license that was assigned to them. A user can be assigned to multiple feature licenses.

What is federated search?

Federated search makes it easy to add external search engines (or connectors) to your org. Users look for information using Salesforce global search and see external results in a single search results page.

The Contact object has more than 120 custom fields. More than 80 of these fields have been identified as essential for storing guest information. It is required to track and store the new and old values of these fields in Salesforce. This data should remain available until it is decided that certain field value changes within a specific period should no longer be stored.

Field Audit Trail allows tracking the values of up to 60 fields per object. However, in this case, the company requires tracking more than 80 fields on the Contact object, so it would be necessary to use an Apex trigger and a custom object. The trigger can automatically store old and new field values in the custom object. When certain field changes within a specific period are no longer required, Bulk API can be utilized to query and delete the corresponding records.

Record Survivorship Techniques - Most Complete

Field completeness is considered to determine correctness. Records with more values populated for each available field are considered the best possible candidates for survivorship. Seems to be the Salesforce approach
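
The "most complete" idea can be sketched by counting populated fields on each candidate record and keeping the one with the highest count. This is a minimal illustration under the assumption that a field counts as populated when it is neither `None` nor an empty string; the function names and sample field names are hypothetical, and real MDM tools typically weigh several survivorship factors together.

```python
from typing import Any, Dict, List


def completeness(record: Dict[str, Any]) -> int:
    """Count the fields that hold a value (not None and not an empty string)."""
    return sum(1 for value in record.values() if value not in (None, ""))


def most_complete(records: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Pick the surviving record: the one with the most populated fields.

    On a tie, the earliest record in the list wins.
    """
    return max(records, key=completeness)
```

Given one record with only a name and another with a name and a phone number, the second would be selected as the survivor.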

The company has more than 20 million opportunity records in the legacy system. When these records are imported into Salesforce, the sales director would like to preserve the original date when these records were created in the legacy system. The data architect has been asked to suggest an approach that can be used for this use case.

For this requirement, the "Enable Set Audit Fields upon Record Creation" setting can be enabled on the User Interface page in Setup. Then a permission set can be used to assign the "Set Audit Fields upon Record Creation" permission to the user who will be responsible for importing the opportunity records. It will allow specifying values in audit fields such as CreatedDate or LastModifiedDate when importing the records.

How can you provide a consistent name for opportunities?

Flow Builder can be used to define a flow that enforces the convention by automatically updating the opportunity name using a formula when a new opportunity is created. Nonprofit Cloud has out-of-the-box functionality for opportunity naming conventions.

When performing advanced testing to bulk uploads, what are some best practices?

For both the Bulk API and the SOAP API, look for the largest batch size that is possible without generating network timeouts from large records, or from additional processing on inserts or updates that can't be deferred until after the load completes

How does the query optimizer determine if it can use a custom field index? (Really important point)

For example, a custom index is used if: • A query is executed against a table with 500,000 records, and the filter matches 50,000 or fewer records. • A query is executed against a table with 5 million records, and the filter matches 333,333 or fewer records

For filter conditions that combine two or more conditions (using AND), when does the query optimizer consider the overall filter selective?

For filter conditions that combine two or more conditions (using AND), the query optimizer considers the overall filter condition selective when the filter targets less than: • Twice the selectivity thresholds for each filter • The selectivity thresholds for the intersection of those fields For the attached example this means: • Status = 'Closed Won' is selective (49,899 < 150,000) • CloseDate = THIS_WEEK is selective (~3000 < 150,000)

The date on which existing customer records were created is recorded in spreadsheets. This 'Created Date' field needs to be set when the customer records are imported into Salesforce using an import tool. How can you modify system fields when you import data into Salesforce?

For this requirement, the organizational preference called 'Enable Set Audit Fields upon Record Creation and Update Records with Inactive Owners' can be enabled on the 'User Interface' page in Setup. Once this is done, the 'Enable Set Audit Fields upon Record Creation' user permission can be granted to a user who is supposed to import records via Data Loader. A permission set can be used to grant this permission. This permission allows a user to set audit fields, such as Created By, Last Modified By, and Created Date, when records are created via the API.

What happens if you select too much data to process?

The query exceeds the Force.com query optimizer's selectivity threshold. When this happens, the underlying query must do a full object scan, often leading to timeouts.

MDM - Surviving Records

In an MDM implementation, the selection of surviving records can depend on various factors.

If the query you provided contains an Indexed field in the filters, the plan will be shown for that field only if you are using a supported operation against that field.

Here is a list of unsupported operations: 1. Custom index will never be used when comparisons are being done with an operator like "NOT EQUAL TO" 2. Custom index will never be used when comparisons are being done with a null value like "Name = ''" 3. Leading '%' wildcards are inefficient operators that also make filter conditions non-selective 4. When using an OR comparison, all filters must be indexed and under the 10% threshold. If you have a non-indexed field or one is above 10%, the plan will not be displayed.

Syncing and Polling with Heroku Connect Sync - Polling Mode Options: Standard Polling

Heroku Connect polls the Salesforce org for changes at intervals of two to sixty minutes, depending on the Heroku Connect plan or settings.

Using Heroku Connect to archive Salesforce Data

Heroku Connect, Postgres, and Salesforce Connect. 1. Within this architecture, there are three parts that are all processed within Heroku. The first is a web service. The web service provides endpoints for Salesforce to call for archiving and unarchiving records. These web services also expose operations that can be performed on the Heroku Connect Tables. 2. The data is then added to a queue so it becomes an async process. 3. The second component is the Postgres database. A worker running in the background calls a set of stored procedures that process all the management of records between the archive tables and the Heroku Connect tables that are live. 4. The third component is managed by the Heroku add-on, Heroku Connect. In this component there are two parts, the external objects and the sync engine. The external objects help to expose data in the archive table by an OData endpoint which can be consumed by Salesforce Connect and in turn exposes the data back to Salesforce as an external object. The second part, the Heroku Connect sync engine, does the bidirectional syncs between the live Heroku Connect tables within Postgres and the Salesforce org.

What is a Heroku external object?

Heroku External Objects is available as part of Heroku Connect. It provides an OData wrapper for a Heroku Postgres database that has been configured for use with Heroku Connect. This feature allows other web services to retrieve data from within the specified Heroku Postgres database using RESTful endpoints generated by the wrapper.

What is Heroku?

Heroku is a fully-managed Platform as a Service (PaaS) where application infrastructure is maintained by Heroku and a broad range of languages and frameworks is supported for building and running apps on the platform

Syncing and Polling with Heroku Connect Sync - Polling Mode Options: Accelerated Polling

Heroku uses Salesforce's Streaming API to notify Heroku Connect when data changes are made in the mapped objects from Salesforce.

What data design does Customer 360 Manager use?

Hub-and-Spoke Design

What happens if a lookup relationship exists between two external data source tables?

If a lookup relationship exists between two external data source tables, an external lookup relationship is automatically created between the mapped external objects in Salesforce

The criteria used for the data survivorship rules can be based on the following - Frequency

If the field values of a particular record are repeated in multiple systems, the record may be considered more reliable than its version that exists in only one system.
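The frequency criterion can be sketched in a few lines of Python. This is only an illustration, not part of any Salesforce or MDM product: the function name `surviving_value` and the system names are hypothetical, and ties are resolved in favor of whichever value was seen first.

```python
from collections import Counter


def surviving_value(values_by_system):
    """Return the field value held by the largest number of source systems.

    values_by_system maps a (hypothetical) system name to the value that
    system stores for one field of one matched record. Empty values are
    ignored. Returns None when no system holds a value.
    """
    counts = Counter(v for v in values_by_system.values() if v)
    if not counts:
        return None
    # most_common(1) returns a one-element list of (value, count).
    value, _count = counts.most_common(1)[0]
    return value


# Hypothetical example: two of three systems agree on the phone number.
phone_by_system = {"crm": "555-0100", "erp": "555-0100", "billing": "555-0199"}
print(surviving_value(phone_by_system))
```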

Working divisions

If you have the "Affected by Divisions" permission, you can set the division using a drop-down list in the sidebar. Then, searches show only the data for the current working division. You can change your working division at any time. If you don't have the "Affected by Divisions" permission, you always see records in all divisions.

Salesforce Data Loader (in command line mode)

In addition to using Data Loader interactively to import and export data, you can run it from the command line. You can use commands to automate the import and export of data.

What is the best way to query and process large data set?

In general, the best way to query and process large data sets in the Force.com platform is to do it asynchronously in batches. You can query and process up to 50 million records using Batch Apex. Note: Batch Apex doesn't work in all use cases (for example, if you have a synchronous use case, such as a Visualforce page that needs to query more than 50,000 records).

Content Version Object

In the Object Manager, custom fields can be created on the 'Content Version' object to allow sales reps to specify additional information about the uploaded file. Also, to allow users to take notes in the Salesforce mobile app, 'Notes' should be made available in the navigation menu of the app by moving it from the 'Available' list to the 'Selected' list on the 'Salesforce Navigation' page in Setup.

Frontier Communications is a company that manufactures communication devices such as modems, routers, network interface cards, and Bluetooth devices. It has six offices and three manufacturing sites in the country. Each office also has five business units, namely, supply chain, production, finance, sales, and warehouse. The individual business users within each unit manage their own business data, but they frequently share information with users in other business units. As a result, records in a particular enterprise system are often handled by multiple users from different business units, which can often result in data duplication. The data architect needs to recommend a suitable data governance strategy for the company

In this case, the company can utilize a decentralized or bottom-up approach for defining the data governance framework, but there would be a need to set up some controls and implement some additional steps to make this model effective. Since individual users are responsible for maintaining their own records, they can also work on data quality improvement and maintenance at the individual level. However, data sharing with other business units can result in duplication and inconsistencies, which can also result in inconsistent or meaningless reports. In order to avoid such negative effects, the framework can utilize automation tools for ensuring the consistency of data. The number of fields that are maintained by users can be limited. For eliminating duplicates, rules and standards should be defined for identifying and prioritizing the fields that should be used for matching and merging the records. Data quality score can be defined based on various key attributes. Controls and audits can be set up to fix inconsistencies quickly. Strict controls can be established for fields that have an impact across multiple business units. The role of the data governance council should include building processes and procedures, performing regular audits, and owning responsibility for automation.

You can edit the index of a big object.

False. The index must be designed carefully because it can't be edited or deleted.

The developer is facing a frequent timeout issue while using SOQL to query the data from a custom object that has more than a million records. This is impacting the development of an important business application. The data architect needs to provide a solution for fixing this issue as soon as possible as the custom table is likely to grow. Furthermore, since the application will be used by most of the employees, it will need to be tested properly prior to deployment. The architect needs to provide recommendations and considerations pertaining to the same.

Indexing the fields used in the SOQL query would improve its performance and prevent the occurrence of the timeout issue. The performance of a SOQL query depends on the presence of a selective filter. If an indexed field is in the query filter and the number of records returned by the filter is within the system threshold, the query is considered selective. The Lightning Platform query optimizer determines the best index from which to drive the query and the best table to drive the query from if no good index is available. Performance testing can be utilized for the application, especially if it utilizes highly customized code or a large number of records. A performance test plan should be created and submitted to the customer support at Salesforce for this use case. Before the deployment of the application, load testing and stress testing should be performed in a full-copy sandbox.

You need to create an index for a big object

Keep these considerations in mind when creating the index: 1. An index must include at least one custom field and can have up to five custom fields total. 2. All custom fields that are part of the index must be marked as required (but they don't have to be unique) 3. You can't include Long Text Area and URL fields in the index. 4. The total number of characters across all text fields in an index can't exceed 100

If you use the Bulk API's "hard delete" option, the records that you delete are not moved to the Recycle Bin—they are immediately flagged for physical deletion. Until records are physically deleted, they still might be returned by the Force.com query optimizer and count against its selectivity thresholds, which help determine which index, if any, should drive each of your queries

Key Point

Perform advance testing to tune your batch sizes for throughput. For both the Bulk API and the SOAP API, look for the largest batch size that is possible without generating network timeouts from large records, or from additional processing on inserts or updates that can't be deferred until after the load completes

Key point

Records should always be loaded in the order of the object hierarchy. For example, parent orders should be loaded before the related order line items. In order to avoid record locking errors, order line items can be grouped by the ID of the parent order records. Each batch should only contain child records related to the same parent order.
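The grouping rule above can be sketched in Python as follows. It assumes the line items are plain dictionaries with a hypothetical `order_id` key that holds the parent order ID; the `max_batch_size` default is an arbitrary illustration, not a Salesforce limit.

```python
from collections import defaultdict


def batches_by_parent(line_items, parent_key="order_id", max_batch_size=200):
    """Split child records into batches that each reference a single parent.

    Keeping all children of one parent in the same batch (or in consecutive
    batches) avoids two parallel batches competing for the same parent lock.
    """
    groups = defaultdict(list)
    for item in line_items:
        groups[item[parent_key]].append(item)

    batches = []
    for parent_id in sorted(groups):
        children = groups[parent_id]
        # A parent with many children is split into several batches,
        # but no batch ever mixes children of different parents.
        for start in range(0, len(children), max_batch_size):
            batches.append(children[start:start + max_batch_size])
    return batches


items = [
    {"order_id": "A", "sku": "modem"},
    {"order_id": "B", "sku": "router"},
    {"order_id": "A", "sku": "cable"},
]
for batch in batches_by_parent(items):
    print([row["sku"] for row in batch])
```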

Key point

Sharing recalculations and Bulk API jobs both draw from the org's available pool of asynchronous processing threads (really important point). This competition causes both jobs to go slower than if they were scheduled to happen apart from one another. Also, if the full backup was changed to a much faster incremental backup, the time requirement for the backup (and thread usage) would be much smaller, allowing more time for the sharing recalculation.

Key point

The Salesforce multitenant architecture uses the underlying database in such a way that the database system's optimizer can't effectively optimize search queries. The Lightning Platform query optimizer helps the database's optimizer produce effective queries by providing efficient data access in Salesforce.

Key point

The performance of a SOQL query depends on the presence of a selective filter. If a SOQL query contains at least one selective filter, the query is said to be selective. If it doesn't contain a selective filter, the query is said to be non-selective and requires a full table scan.
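As a rough aid for reasoning about selectivity, the sketch below estimates the record-count threshold under which a filter on an indexed field counts as selective. The percentages and caps are the commonly published figures (standard index: 30% of the first million records plus 15% of the rest, capped at 1,000,000; custom index: 10% plus 5%, capped at 333,333). Treat them as assumptions that are subject to change; the function names are hypothetical.

```python
def selectivity_threshold(total_records, custom_index=True):
    """Estimate the maximum number of matching rows for a selective filter."""
    if custom_index:
        first_pct, rest_pct, cap = 10, 5, 333_333
    else:
        first_pct, rest_pct, cap = 30, 15, 1_000_000
    first = min(total_records, 1_000_000)
    rest = max(total_records - 1_000_000, 0)
    # Integer arithmetic avoids floating-point rounding surprises.
    threshold = first * first_pct // 100 + rest * rest_pct // 100
    return min(threshold, cap)


def is_selective(matching_records, total_records, custom_index=True):
    """True when the filter matches fewer rows than the estimated threshold."""
    return matching_records < selectivity_threshold(total_records, custom_index)


# An object with 2 million rows and a filter on a custom-indexed field.
print(selectivity_threshold(2_000_000))   # 100,000 + 50,000
print(is_selective(120_000, 2_000_000))
```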

Key point

The query optimizer maintains a table containing statistics about the distribution of data in each index. It uses this table to perform pre-queries to determine whether using the index can speed up the query.

Key point

In these situations, you should be able to order your batches of tasks by the associated account (or the related Who/What object that's referencing an account), load these batches in a job in parallel mode, and avoid lock contention

Key point.

Users with the "Weekly Data Export" permission can view all exported data and all custom objects and fields in the Export Service page. This permission is granted by default only to the System Administrator profile because it enables wide visibility.

Key point; you need the permission set for the weekly exports. You can generate backup files manually once every 7 days (for weekly export) or 29 days (for monthly export).

Lydia is attempting to refresh the entire local database every night, doing more work than what's necessary. This translates into long-running Bulk API jobs that unnecessarily hold onto asynchronous processing threads and delay other batch work from happening.

Lydia should be doing nightly incremental data backups, backing up only the data that is new or updated since the previous incremental backup. When doing this, she should use queries that filter records using SystemModstamp (a standard field in all objects that has an index) rather than the LastModifiedDate field (not indexed).
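A minimal sketch of how such an incremental query could be built is shown below. The helper name `incremental_backup_query` is hypothetical, and the sketch assumes the caller stores the timestamp of the previous backup as a timezone-aware datetime. SOQL dateTime literals are written unquoted in ISO 8601 form.

```python
from datetime import datetime, timezone


def incremental_backup_query(sobject, fields, since):
    """Build a SOQL query for records changed since the previous backup.

    `since` must be a timezone-aware datetime; it is converted to UTC and
    rendered as an unquoted SOQL dateTime literal.
    """
    if since.tzinfo is None:
        raise ValueError("since must be timezone-aware")
    since_utc = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    field_list = ", ".join(fields)
    # Filtering on SystemModstamp lets the query use that field's index.
    return (
        f"SELECT {field_list} FROM {sobject} "
        f"WHERE SystemModstamp >= {since_utc}"
    )


last_run = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(incremental_backup_query("Account", ["Id", "Name"], last_run))
```

Using `>=` rather than `>` can re-export a record stamped exactly at the previous run time, which is usually preferable to missing it.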

What does an MDM solution require?

MDM solution requires: 1. Choosing an implementation style, such as registry, consolidation, coexistence or transaction. 2. Data survivorship techniques can be utilized to determine the best candidates for the surviving records. 3. A matching policy can be utilized to determine how the records should be matched. 4. Canonical modeling can be used for communication between different enterprise systems.

Single Source of Truth (SSOT) vs. System of Record (SOR)

MDM system or master hub would be considered SSOT, but not necessarily the SOR.

Experience Cloud - Member + Login Based

Member-based and Login-based licenses can be mixed to cater to different user groups in the same site.

Which solution should be used when company would like to back up Account records daily?

Need AppExchange product

Salesforce.com Customer Support told Customer A that the most effective chunk size would feature fewer than 250,000 records; if chunks feature more than 250,000 records, the Force.com query optimizer's sampling code causes the query to be driven by the full table rather than an index

Need to study

What makes a query non-selective?

Non-selective queries are SOQL queries, typically against tables with more than 100K rows, that bring back 100K rows; i.e., you have not specified what you are looking for, so a full table scan happens, and if it runs too long it can cause locking. Causes include: 1. Too many records, but only if the thresholds are exceeded 2. Using leading wildcards 3. The filter operator is a negative operator such as NOT EQUAL TO (or !=), NOT CONTAINS, and NOT STARTS WITH 4. Complex join statements 5. The CONTAINS operator is used in the filter, and the number of rows to be scanned exceeds 333,333. The CONTAINS operator requires a full scan of the index. This threshold is subject to change. 6. You're comparing with an empty value (Name != ''). (The exception is != null, which improves query performance.)

What are two patterns for archiving data within Salesforce?

Pattern 1: Custom storage objects. Pattern 2: Salesforce big objects. Disadvantage: you need to pay storage costs.

What are two patterns for archiving data outside of Salesforce?

Pattern 3: On-premises data store. Pattern 4: Third-party vendor product.

Data Classification Fields in Salesforce - Field Usage

Picklist with following values: Active, DeprecateCandidate, Hidden

Data Classification Fields in Salesforce - Data Sensitive Level

Picklist with following values: Public, Internal, Confidential, Restricted, MissionCritical

Platform-event triggered flows can be created for which type of events?

Platform-event triggered flows can be created for ApiAnomalyEvent, CredentialStuffingEvent, ReportAnomalyEvent, and SessionHijackingEvent.

Ensuring Consistently High Data Quality - 5: Monitoring

Policies, processes, and tools can be utilized to monitor data. Automation tools can be used to enforce data quality. Reports and dashboards allow data analysis.

What is PostgreSQL?

PostgreSQL is an advanced, enterprise class open source relational database that supports both SQL (relational) and JSON (non-relational) querying. It is a highly stable database management system, backed by more than 20 years of community development which has contributed to its high levels of resilience, integrity, and correctness. PostgreSQL is used as the primary data store or data warehouse for many web, mobile, geospatial, and analytics applications. The latest major version is PostgreSQL 12.

Postman

Postman is an application that you can use to configure and call HTTP-based APIs, such as REST or SOAP. You configure this powerful tool in an easy-to-use graphical user interface. It supports environment variables, team workspaces, and JavaScript automation.

Which standard fields have indexes?

Primary keys: Id, Name, OwnerId. Audit dates: CreatedDate, SystemModstamp. Other fields: RecordType (indexed for all standard objects that feature it), Division, Email (for contacts and leads). Foreign keys: lookup and master-detail relationship fields.

MDM - Probabilistic Matching

Probabilistic Matching uses a statistical approach with weights and thresholds to match records. 1. In order to ensure high accuracy in matching records, use probabilistic matching 2. Uses a statistical approach to determine matches and can leverage statistical theory 3. It can be based on the likelihood of the occurrence of a particular word or phrase in the fields, phonetic matching, subtle variations, minor differences, etc 4. Multiple field values are compared when comparing two records, and a weight is assigned to each field in the two systems 5. The weight indicates how closely the value of one field matches the value of the corresponding field in the other system. 6. The probability of a match between the two records is determined by the sum of the individual field weights in the two systems.
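The weighted comparison described above can be sketched as follows. The field weights, the threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration; a real MDM tool would typically derive weights statistically and use richer comparisons such as phonetic matching.

```python
from difflib import SequenceMatcher

# Hypothetical weights (summing to 1.0) and match threshold.
FIELD_WEIGHTS = {"name": 0.5, "email": 0.3, "city": 0.2}
MATCH_THRESHOLD = 0.85


def similarity(a, b):
    """Return a 0..1 similarity for two field values; blanks never match."""
    a, b = (a or "").strip().lower(), (b or "").strip().lower()
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a, b).ratio()


def match_score(rec_a, rec_b, weights=FIELD_WEIGHTS):
    """Sum the per-field similarities, each scaled by its field weight."""
    return sum(
        weight * similarity(rec_a.get(field), rec_b.get(field))
        for field, weight in weights.items()
    )


def is_match(rec_a, rec_b, threshold=MATCH_THRESHOLD):
    """Treat two records as the same entity when the score reaches the threshold."""
    return match_score(rec_a, rec_b) >= threshold


crm = {"name": "John Smith", "email": "john.smith@example.com", "city": "Boston"}
erp = {"name": "Jon Smith", "email": "john.smith@example.com", "city": "Boston"}
print(round(match_score(crm, erp), 3), is_match(crm, erp))
```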

Examples of Selective SOQL Queries - Could Be Selective SELECT Id FROM Account WHERE FormulaField__c = 'ValueA'

The query filters on a formula field. All of the following must be true for a formula field to be indexed: 1. The formula contains fields from a single object only (not relationship fields). 2. The formula field doesn't reference any non-deterministic functions (e.g., SYSDATE). 3. The formula field doesn't reference any fields that are not supported in indexes. This list isn't documented anywhere specifically (there are lots of special cases); for example, in Spring '12 (176), CreatedById was not supported, but in Summer '12 (178) it is supported. The same is true for CreatedDate. 4. The formula field doesn't contain references to primary keys (e.g., Id). 5. The formula field does not use the TEXT(<picklist-field>) function. 6. If the formula references any lookup fields, the field must not have the option "What to do if the lookup record is deleted?" set to "Clear the value of this field."

How can you use the RACI model for data governance?

The RACI model can be used to outline which users are responsible, accountable, consulted, and informed for different types of data: 1. Responsible users own the data. 2. Accountable users must approve changes to the data. 3. Consulted users can provide information about the data. 4. Informed users need to be notified about any changes to the data.

Mike didn't properly anticipate the problems with allowing everyone to create reports, nor did he educate XYZ's users on how to build efficient reports that can scale as the company's database grows. Additionally, Mike didn't take the time to consider a great alternative: creating a library of public, controlled, and optimized reports that meet the requirements of XYZ's sales reps. This would mean fewer reports to tune and maintain, plus high user satisfaction.

Report builders are including many of the formula fields that Vijay built to surface data from related records dynamically at report runtime. Consequently, Salesforce performs many joins in the underlying query to run the report. Better solution: Use a trigger to populate denormalized related fields that would facilitate blazing report/query performance without runtime joins.

What clouds can be consolidated with Customer 360 Data Manager?

Sales, Service, Communities, and Commerce

Which fields cannot be custom indexes?

Salesforce also allows customers to create custom indexes on most fields. The exceptions are: 1. non-deterministic formula fields 2. multi-select picklists 3. currency fields in a multi-currency org 4. long text area and rich text area fields 5. binary fields (type blob, file or encrypted text)

A global company that sells furniture has been using a homegrown order management system to manage sales orders and related invoices. But due to the large volume of data and the age of the system, accessing or updating a sales order or invoice takes a long time. That's why the company would like to switch to an ERP system for managing orders and invoices. However, until the sales users have been properly trained to use the new system, the sales director would like them to use Salesforce to search, view and modify orders and invoices stored in the order management system. Orders and invoices are related via a lookup relationship, which must be maintained in Salesforce. However, there is already an object named 'Invoice' in Salesforce, which should be considered. Furthermore, developers who have integrated with the order management system in the past have experienced connection timeout issues, which is another aspect to consider. Important question to review - complex scenario with Salesforce Connect

Select Writable External Objects when you define an external data source and use Salesforce Connect external objects to create, update, and delete data. External objects are read only by default.

How do you specify the PK Chunking syntax in the header of a job?

Sforce-Enable-PKChunking Example: Sforce-Enable-PKChunking: chunkSize=50000; startRow=00130000000xEftMGH
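A small sketch of assembling this header for a Bulk API job request is shown below. The helper name `pk_chunking_header` is hypothetical; the 250,000 upper bound and the 100,000 default are taken from the commonly documented limits and should be treated as assumptions.

```python
def pk_chunking_header(chunk_size=100_000, start_row=None):
    """Return the HTTP header that enables PK chunking on a Bulk API query job.

    chunk_size: number of records per chunk (assumed limit: 250,000).
    start_row:  optional record ID to use as the lower boundary of the first chunk.
    """
    if not 1 <= chunk_size <= 250_000:
        raise ValueError("chunk_size must be between 1 and 250,000")
    value = f"chunkSize={chunk_size}"
    if start_row:
        value += f"; startRow={start_row}"
    return {"Sforce-Enable-PKChunking": value}


# Reproduces the example header from this card.
print(pk_chunking_header(50_000, "00130000000xEftMGH"))
```

The returned dictionary can be merged into the headers of whatever HTTP client sends the job-creation request.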

The system administrator of the company recently imported more than 20 million account records into Salesforce. However, the sales director of the company wanted to add an additional field to the Account object and populate it on all the imported account records. The system administrator has decided to export the records, add the field value, and then re-import the records. He has defined the field in Salesforce and is currently trying to use a third-party ETL tool that supports Bulk API to extract the records from Salesforce. However, the ETL tool fails and the Bulk API query log shows time-out failure while performing a full table scan. He needs to export the records as soon as possible and would also like to minimize the migration time when importing them into Salesforce.

Since the number of rows that are returned by a Bulk API query may be higher than the selectivity threshold, it can result in a full table scan, slow performance, or even failure. When the number of records is more than 10 million, PK Chunking can be used to split the query and break the data into smaller chunks. The Sforce-Enable-PKChunking header can be specified on the job request to utilize PK Chunking. When importing records into Salesforce, Bulk API in parallel mode can be used to minimize the migration time. Prior to loading the records, calculation of sharing rules can be deferred in Setup, and once the import process is complete, it can be resumed. When loading the records, it is important to note that a single batch of records cannot contain more than 10,000 records.
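The 10,000-record batch limit mentioned above implies that a 20-million-record import needs at least 2,000 batches. A minimal sketch of the splitting step, with a hypothetical helper name and records held in an in-memory list purely for illustration:

```python
import math

MAX_BATCH_SIZE = 10_000  # per-batch record limit stated in this card


def split_into_batches(records, batch_size=MAX_BATCH_SIZE):
    """Yield consecutive slices of `records`, each at most `batch_size` long."""
    if batch_size < 1 or batch_size > MAX_BATCH_SIZE:
        raise ValueError("batch_size must be between 1 and 10,000")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# Number of batches needed for the 20 million accounts in the scenario.
print(math.ceil(20_000_000 / MAX_BATCH_SIZE))

# Small demonstration with 25,000 placeholder rows.
sizes = [len(batch) for batch in split_into_batches(list(range(25_000)))]
print(sizes)
```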

More information on skinny tables.

Skinny tables can be created on custom objects and on the Account, Contact, Opportunity, Lead, and Case objects (mnemonic: ACCOL). Skinny tables can contain the following types of fields: • Checkbox • Date • Date and time • Email • Number • Percent • Phone • Picklist (multi-select) • Text • Text area • Text area (long) • URL

Use case (Data Aggregation) : The customer needed to aggregate monthly and yearly metrics using standard reports. The customer's monthly and yearly details were stored in custom objects with four million and nine million records, respectively. The reports were aggregating across millions of records across the two objects, and performance was less than optimal.

Solution: The solution was to create an aggregation custom object that summarized the monthly and yearly values into the required format for the required reports. The reports were then executed from the aggregated custom object. The summarization object was populated using batch Apex.

Use case (Custom Search Functionality): The customer needed to search in large data volumes across multiple objects using specific values and wildcards. The customer created a custom Visualforce page that would allow the user to enter 1-20 different fields, and then search using SOQL on those combinations of fields. Search optimization became difficult because: • When many values were entered, the WHERE clause was large and difficult to tune. When wildcards were introduced, the queries took longer. • Querying across multiple objects was sometimes required to satisfy the overall search query. This practice resulted in multiple queries occurring, which extended the search. • SOQL is not always appropriate for all query types

Solution: The solutions were to: • Use only essential search fields to reduce the number of fields that could be searched. Restricting the number of simultaneous fields that could be used during a single search to the common use cases allowed Salesforce to tune with indexes. • De-normalize the data from the multiple objects into a single custom object to avoid having to make multiple querying calls. • Dynamically determine the use of SOQL or SOSL to perform the search based on both the number of fields searched and the types of values entered. For example, very specific values (i.e., no wild cards) used SOQL to query, which allowed indexes to enhance performance

What can be used to classify fields?

Standard Data classification fields including Owner, Usage, Sensitivity level and Compliance Categorization.

Pattern 3: Replicate records to local data repository such as a Data Lake and delete them from Salesforce.

The system of record is Salesforce, but once a record is no longer needed by the business, it is moved out of Salesforce and loaded into an external application such as a data lake. The following is an example of how the architecture might look, although it depends on the business requirements defined as part of the data archiving strategy. - ETL tools such as Informatica or Talend, in charge of exporting data from Salesforce into the external application, checking for metadata changes, and restoring data if needed. - Page layouts, related lists, and search in Salesforce that let users see the archived data, using external objects (with OData) or a mashup callout. - Record visibility and reporting in the external application.

The sales director of the United States would like to add a 'Tracking Number' field to every opportunity detail page. The field will be used by sales reps to manually add a shipment tracking number associated with the carrier used by the company, once a product has been sold on the company's website. The sales director expects that almost 10 million orders will be placed online in the next fiscal year, which means that 10 million opportunities will need to be updated with a tracking number. The data architect is required to recommend a solution that allows sales reps to search for a particular tracking number using global search without experiencing performance issues.

The 'Tracking Number' field can be marked as an External Id. Salesforce automatically adds a custom index to an External Id field, which will improve the performance of global search. In this case, the field can also be made unique since each tracking number will be unique. Salesforce also indexes a field that is marked as unique by default.

What API does data loader use?

The Bulk API is optimized to load or delete a large number of records asynchronously. It is faster than the SOAP-based API due to parallel processing and fewer network round-trips. By default, Data Loader uses the SOAP-based API to process records. You can also select the "Enable serial mode for Bulk API" option. Processing in parallel can cause database contention, and when contention is severe, the load can fail. Serial mode processes batches one at a time; however, it can increase the processing time for a load.

Customer Community (CC) vs. Customer Community Plus (CCP) Licenses

The CCP license is more powerful than the Customer Community license. It provides: 1. Roles and Sharing 2. Delegated Administration capabilities 3. Reports (create and manage) and Dashboards (read only) 4. Tasks 5. Additional storage (1MB for login-based license; 2MB for member-based license)

What are the Contact Point Address, Email, and Phone records?

The Contact Point Address, Email, and Phone records allow you to specify the billing or mailing addresses, email addresses, and phone numbers of an individual or person account, which can be encrypted using Salesforce Shield or Shield Platform Encryption.

Data Classification Fields in Salesforce - Data Owner

The Data Owner is a lookup to a User or Public Group

The Lightning Platform Query Optimizer

The Lightning Platform Query Optimizer helps the underlying database system's query optimizer produce effective query execution plans. It determines the best index to drive the query. Specifically, the optimizer: 1. Determines the best index from which to drive the query, if possible, based on filters in the query 2. Determines the best table from which to drive the query, if no good index is available 3. Determines how to order the remaining tables to minimize cost 4. Injects custom foreign key value tables that are required to create efficient join paths 5. Influences the execution plan for the remaining joins, including sharing joins, to minimize database input and output (I/O) 6. Updates statistics

Examples of Selective SOQL Queries - Selective SELECT Id FROM Account WHERE Id IN (<list of account IDs>)

The WHERE clause is on an indexed field (Id). If SELECT COUNT() FROM Account WHERE Id IN (<list of account IDs>) returns fewer records than the selectivity threshold, the index on Id is used. This will typically be the case since the list of IDs only contains a small amount of records.

What available options does Salesforce have for implementing data virtualization?

The available options for implementing data virtualization are Salesforce Connect, Request and Reply, and Heroku Connect.

What is required to make sure a process is triggered when using Data Loader to import records?

The batch size should be set to 1 in Data Loader Settings.

Use case (Indexing with Nulls): The customer needed to allow nulls in a field and be able to query against them. Because single-column indexes for picklists and foreign key fields exclude rows in which the index column is equal to null, an index could not have been used for the null queries. Important: review the syntax

The best practice would have been to not use null values initially. If you find yourself in a similar situation, use some other string, such as N/A, in place of NULL. If you cannot do that, possibly because records already exist in the object with null values, create a formula field that displays text for nulls, and then index that formula field. For example, assume the Status field is indexed and contains nulls. Issuing a SOQL query similar to the following prevents the index from being used.
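The query itself is not reproduced on this card. As a hedged illustration only, the sketch below assumes the problematic filter has the form `Status__c = ''` and that a formula field, here hypothetically named `Status_Value__c`, returns the placeholder text `blank` when the status is empty. The helper then routes null lookups to the indexed formula field.

```python
def status_filter(value, field="Status__c",
                  formula_field="Status_Value__c", placeholder="blank"):
    """Build a WHERE-clause fragment that can use an index even for nulls.

    All field names and the placeholder text are assumptions for illustration.
    A null lookup is redirected to the indexed formula field; any other
    value filters on the original indexed field.
    """
    if value is None or value == "":
        return f"{formula_field} = '{placeholder}'"
    # Escape backslashes and single quotes for a SOQL string literal.
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"{field} = '{escaped}'"


# Filter that can use the formula field's index instead of comparing to null.
print("SELECT Name FROM Object WHERE " + status_filter(None))
print("SELECT Name FROM Object WHERE " + status_filter("Open"))
```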

The company would like to back up newly created accounts and contacts every day to recover from any accidental or intentional changes to customer data by Salesforce users. But the development team is currently busy working on another project. The data architect is required to recommend a suitable alternative.

The company can use Backup & Restore, which is a product offered by Salesforce. It can be used in the event of integration errors, malicious attempts, or incorrect data updates. It supports custom and standard object backups as well as daily incremental backups. Another option is installing an AppExchange solution such as OwnBackup, which can be utilized to backup Salesforce data every day for the purpose of recovery. It is important to note that the Data Export option in Setup only allows scheduling weekly or monthly export of data.

Frontier Automotive designs and manufactures cars that utilize the latest technology. It was founded very recently and currently has only one office and one manufacturing plant (very small company). It uses a single enterprise system to store and manage different types of data. Each business unit within the organization manages its own set of records and uses separate metrics and key performance indicators (KPIs). The local users within each business unit create, update and consume data as part of their daily responsibilities and do not need to share data with other units. The data architect needs to recommend a suitable data governance model for the company.

The company can utilize a data governance model that focuses on decentralized or bottom-up execution of rules, policies, standards, and procedures. Since it is a small company in which each business unit manages its own data and metrics, the users can be responsible for improving and maintaining data quality themselves on a regular basis. It would result in simpler data maintenance, but the ownership of the data should be clearly defined in order to avoid any data inconsistencies. The data governance council should only be responsible for building the necessary processes and procedures. Its role in the data governance framework should be limited.

But due to multiple integrations and the extensive sharing of master data, data quality issues are prominent in the enterprise systems used by the company. The Chief Technology Officer of the company would like to establish a framework that allows individual users to understand, monitor and improve data quality on an ongoing basis based on well-defined rules, standards, and policies. However, they should not be the only ones who should be relied on for data maintenance. The data architect needs to recommend a suitable model for this use case.

The company can utilize a hybrid data governance model, which can include aspects of both centralized and decentralized data governance. A central data governance council can be created for defining the rules, standards, and policies related to data governance. Individual users can be allowed to establish their own processes and procedures for improving and maintaining data quality. The data governance council can also be responsible for maintaining a part of the master data and making the necessary adjustments to meet the business needs. In addition, it would also play a mentoring role to ensure data consistency. In case of any conflicts, the data governance council would mediate between various departments and divisions. This would result in easy-to-use reporting and definition of KPIs and other metrics at the enterprise level.

Cosmic Electronics is considering an AppExchange application for managing employees and inventory data that is currently stored in an external database. There are 50,000 employees who work for the company throughout the world. With the exception of board members, each employee receives a performance review every week. There are more than 5 million performance reviews that have been conducted so far. The current inventory consists of 50 million items, but this number is likely to increase by 10 million every year as part of the company's expansion plan. An item can be one of the 200 different types of products that are sold by the company. The architect needs to recommend an appropriate solution for storing the data in Salesforce that ensures performance and scalability.

The company should either look for an AppExchange product that provides a suitable data archiving strategy or consider an on-premise solution for archiving data on a regular basis. As new records are created, old records can be archived daily, weekly or monthly.

Query Plan Tool - Cost

The cost of the query compared to the Force.com Query Optimizer's selectivity threshold. Values above 1 mean that the query won't be selective.

Use case (Rendering Related Lists with Large Data Volumes): The customer had hundreds of thousands of account records and 15 million invoices, which were within a custom object in a master-detail relationship with the account. Each account record took a long time to display because of the Invoices related list's lengthy rendering time.

The delay in displaying the Invoices related list was related to data skew. While most account records had few invoice records, there were some records that had thousands of them. To reduce the delay, the customer tried to reduce the number of invoice records for those parents and keep data skew to a minimum in child objects. Using the Enable Separate Loading of Related Lists setting allowed the account detail to render while the customer was waiting for the related list query to complete.

The ERP system has been used for managing and storing sales orders since its implementation one year ago, but millions of 'Sales Order' records are also currently stored in Salesforce. Sales users who can only access Salesforce often need to know the total price of the products that were included in an old sales order when calculating a discount for a new bulk purchase. But they do not require other details related to a sales order, such as order status and additional details about each product. It is necessary to remove these sales orders from Salesforce but retain the data required by the sales users.

The existing sales orders can be removed from Salesforce using a tool such as Data Loader. Salesforce Connect can be utilized to display sales orders that are currently stored in the ERP system. This would allow using Salesforce to view the sales orders even if it is not used to store them. An external object can be created in Salesforce for the sales orders, and its page layout can be modified so that each sales order only displays the specific information required by the company's users.

Golden/Single Source of Truth (SSOT)

The golden or single source of truth (SSOT) represents the location of a certain data element where it is mastered. It is typically the master hub that is used for data consolidation in a master data management system. An SSOT system provides authentic and relevant data that other systems in an organization can refer to.

Query Plan Tool - Fields

The indexed field(s) used by the Query Optimizer. If the leading operation type is Index, the fields value is Index. Otherwise, the fields value is null.

Mitigation Strategies for Lookup Skew - 2 (Distribute the Skew)

The lookup skew can be distributed to ensure that a large number of records do not look up to the same parent record. Heavily skewed lookup records can be identified and additional lookup values can be added.

Mitigation Strategies for Lookup Skew - 1 (Reduce Record Save Time)

The record save time can be reduced to shorten the duration of locks. This can be done by increasing the performance of synchronous Apex code, removing unnecessary workflows, and using asynchronous processing. • Increase the performance of synchronous Apex code: tune your triggers for peak performance by consolidating code into a single trigger per object and following Apex best practices. • Remove unnecessary workflow or consolidate it into existing trigger code: workflow lengthens the lock duration, which in turn increases the lock failure rate. Removing or consolidating workflow into your Apex code can increase your save performance. • Only process what's required: move any code that isn't required immediately upon save to asynchronous processing. Locks are held the entire time your custom code is executing, so processing logic that is not critical during the save will increase the lock failure rate.

The system administrator of the legacy CRM system had a responsibility to extract accounts and opportunities on a weekly basis and upload the file to a shared location for the BI team to access. This process needs to be retained until the integration solution between the BI system and Salesforce has been implemented. The administrator has started extracting the data on a weekly basis using Bulk API jobs, but the Bulk query frequently times out. There are millions of accounts and related opportunities that need to be extracted. The data architect needs to suggest a suitable approach to prevent query timeout. Important review about Bulk API.

The requirement is to extract data from Salesforce on a weekly basis, for which the Salesforce administrator should use Bulk API jobs with PK Chunking enabled. When there is a need to extract millions of records from a Salesforce org, the job should be split into a number of separate queries to get better performance and reliability. Each query can retrieve a smaller portion of the data. When the number of records in a single query is lower than the selectivity threshold of the Salesforce Query Optimizer, the platform can process the queries more efficiently. The PK Chunking feature of the Bulk API automates this process by using the primary key (Record ID) of an object to split the data into manageable chunks and query them separately. This feature is supported for all custom objects, many standard objects, and their sharing tables.
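
The idea of splitting one large extract into ID-range queries can be sketched as follows. This is only an illustration of the concept, not how the platform implements PK Chunking; the function name `pk_chunk_filters`, the sample IDs, and the `Account` query are assumptions made for the example.

```python
def pk_chunk_filters(sorted_ids, chunk_size):
    """Return one WHERE-clause fragment per chunk of consecutive record IDs."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    filters = []
    for start in range(0, len(sorted_ids), chunk_size):
        first = sorted_ids[start]
        end = start + chunk_size
        if end < len(sorted_ids):
            # The upper bound is the first ID of the next chunk (exclusive).
            filters.append(f"Id >= '{first}' AND Id < '{sorted_ids[end]}'")
        else:
            # The last chunk has no upper bound.
            filters.append(f"Id >= '{first}'")
    return filters


# Hypothetical, already-sorted 15-character record IDs.
ids = [f"a0X{n:012d}" for n in range(1, 11)]
chunk_filters = pk_chunk_filters(ids, chunk_size=4)
for clause in chunk_filters:
    print(f"SELECT Id, Name FROM Account WHERE {clause}")
```

Each printed query covers a separate slice of the ID space, which is why the slices can be extracted independently and consolidated afterwards.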

Use case (API performance): The customer designed a custom integration to synchronize Salesforce data with external customer applications. The integration process involved: • Querying Salesforce for all data in a given object • Loading this data into the external systems • Querying Salesforce again to get IDs of all the data, so the integration process could determine what data had been deleted from Salesforce. The objects contained several million records. The integration also used a specific API user that was part of the sharing hierarchy to limit the records retrieved. The queries were taking minutes to complete.

The solution was to give the query access to all the data, and then to use selective filters to get the appropriate records. For example, using an administrator as the API user would have provided access to all of the data and prevented sharing from being considered in the query. An additional solution would have been to create a delta extraction, lowering the volume of data that needed to be processed.

Standard Account Matching Rule

The standard account matching rule identifies duplicate accounts using match keys, a matching equation, and matching criteria. It's activated by default.

How can the standard address fields be made consistent?

The standard address fields in records can be made consistent by using the 'Mass Update Addresses' tool. Consistent naming can be used for country and state/province fields.

Standard Contact Matching Rule

The standard contact matching rule and standard lead matching rule identify duplicate contacts and leads using match keys, a matching equation, and matching criteria. They're activated by default.

One of the company's main objectives is to prevent the creation of duplicate records. Matching rules will need to be defined to identify duplicates. It will also be necessary to establish a system of record (SOR) for each object, which will act as the most authoritative source of data for matching the records of the object in the enterprise systems. However, the matching will occur at the field level for the identification of duplicates. For instance, fields such as 'Name' and 'Stage' will need to be used to match opportunities in Salesforce and the ERP system. The IT Director of the company would like to know how the system of record will be determined at the field level for this requirement. Who or what will establish the SOR for different objects and fields?

The system of record for different objects and fields would be determined by the master data management solution. It would not be necessary to keep track of or identify the system of record for every update within the enterprise landscape. However, it would be essential to establish different SORs for different types of data elements, objects and/or fields by meeting with the stakeholders prior to the implementation.

Heroku External Objects

They are used in conjunction with Salesforce Connect.

PK Chunking Example: Suppose a customer is using the custom object MessageStatus__c to keep track of a high volume of phone calls, emails, and other communications. They want to perform a complete extract and limit the number of chunks to make consolidating the data easier.

They can perform a Bulk API query on MessageStatus__c with this header: Sforce-Enable-PKChunking: chunkSize=250000;
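
As a minimal sketch, the header from this card could be assembled in client code like this. The helper name `pk_chunking_header` is an assumption for the example; only the header name and the `chunkSize` option come from the card, and the trailing semicolon is omitted because no further options follow.

```python
def pk_chunking_header(chunk_size):
    """Build the request header that turns on PK Chunking for a Bulk API query job."""
    if not isinstance(chunk_size, int) or chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    return {"Sforce-Enable-PKChunking": f"chunkSize={chunk_size}"}


# Header for the MessageStatus__c extract described above.
headers = pk_chunking_header(250000)
print(headers)
```

The resulting dictionary would be merged into the other headers sent when the query job is created.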

The Salesforce developer of the company would like to utilize SOSL in custom Visualforce pages that allow users to filter and search for records of certain standard and custom objects. Since these objects will contain millions of records after the migration, the developer is concerned about the performance of SOSL queries, and has asked the data architect to recommend best practices for optimizing their performance. Key story about improving performance of SOSL.

To optimize the performance of SOSL queries, one must ensure that they are as selective as possible. The exact word or phrase should be used to perform a search. For example, Martin* should be used instead of Mart* in a SOSL query to find records with the word 'Martin'. If a search needs to be performed in only the Name fields, 'IN NAME FIELDS' should be used instead of 'IN ALL FIELDS' in the SOSL query. Furthermore, the scope of the search should be limited by targeting specific objects, records owned by the searcher, or records within a division.
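
The recommendations above (exact search term, name fields only, specific target objects) can be illustrated with a small query builder. This is a hedged sketch: the helper `build_sosl` is hypothetical, and it does not escape SOSL reserved characters in the search term.

```python
def build_sosl(term, objects, name_fields_only=True):
    """Compose a SOSL string that limits both the search group and the target objects.

    `objects` maps each object name to the list of fields to return.
    """
    group = "NAME FIELDS" if name_fields_only else "ALL FIELDS"
    returning = ", ".join(
        f"{obj}({', '.join(fields)})" for obj, fields in objects.items()
    )
    return f"FIND {{{term}}} IN {group} RETURNING {returning}"


# A selective search: full word, name fields only, two specific objects.
query = build_sosl("Martin*", {"Account": ["Id", "Name"], "Contact": ["Id", "Name"]})
print(query)
```

Passing `name_fields_only=False` produces the broader `IN ALL FIELDS` form, which the card advises against when only names need to be searched.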

Use case (Multi-Join Report Performance): The customer created a report that used four related objects: Accounts (314,000), Sales Orders (769,000), Sales Details (2.3 million), and Account Ownership (1.2 million). The report had very little filtering and needed to be optimized.

To optimize the report, the customer: • Added additional filters to make the query more selective and ensured that as many filters as possible were indexable • Reduced the amount of data in each object, whenever possible • Kept the Recycle Bin empty • Ensured that no complex sharing rules existed for the four related objects. Complex sharing rules can have a noticeable impact on performance.

What is the point of truncating a custom object?

Truncating a custom object erases all records currently sitting in the custom object's Recycle Bin; the custom object's history; and related events, tasks, notes, and attachments for each deleted record.

Which factors are necessary to consider when establishing data survivorship rules for an MDM implementation?

Trust score, decay, validation, and precedence

What is the difference between single-column and two-column indexes?

Two-column indexes are subject to the same restrictions as single-column indexes, with one exception. Two-column indexes can have nulls in the second column, whereas single-column indexes can't unless Salesforce Customer Support explicitly enabled the option to include nulls.

What should be considered as part of data retention policy?

Type of data, classification, purpose, how it is used, when it should be deleted or archived.

When can't a custom index be used in a query?

Typically, a custom index won't be used in these cases: 1. The value(s) queried for exceeds the system-defined threshold mentioned above. 2. The filter operator is a negative operator such as NOT EQUAL TO (or !=), NOT CONTAINS, and NOT STARTS WITH. 3. The CONTAINS operator is used in the filter and the number of rows to be scanned exceeds 333,000. This is because the CONTAINS operator requires a full scan of the index. Note that this threshold is subject to change. 4. The filter compares with an empty value (Name != '').

Up to how many duplicate records can be merged?

Up to three duplicate records, such as duplicate accounts, contacts, or leads, can be manually merged into a single record.

Data Load - API Options - 2

Update: The Bulk API (2.0) can load binary attachments, which can be Attachment objects or Salesforce CRM Content.

How do you query the system for the latest incremental changes?

Use queries that filter records using SystemModstamp (a standard field on all objects that has an index) rather than the LastModifiedDate field (not indexed).
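
A minimal sketch of such an incremental query, built as a string in client code. The helper name `incremental_soql` and the sample object and fields are assumptions for illustration; the timestamp is rendered as an unquoted UTC datetime literal, as SOQL expects.

```python
from datetime import datetime, timezone


def incremental_soql(sobject, fields, since):
    """Build a SOQL query for records changed after `since`, filtering on SystemModstamp."""
    if since.tzinfo is None:
        raise ValueError("since must be timezone-aware")
    stamp = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        f"SELECT {', '.join(fields)} FROM {sobject} "
        f"WHERE SystemModstamp > {stamp} ORDER BY SystemModstamp"
    )


# Hypothetical last successful sync time.
last_sync = datetime(2024, 1, 1, tzinfo=timezone.utc)
soql = incremental_soql("Account", ["Id", "Name"], last_sync)
print(soql)
```

Storing the highest SystemModstamp seen in each run gives the starting point for the next one.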

GROUP BY ROLLUP in SOQL

Use the GROUP BY ROLLUP optional clause in a SOQL query to add subtotals for aggregated data in query results. This action enables the query to calculate subtotals so that you don't have to maintain that logic in your code.
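
To show what the subtotal rows look like, here is a small simulation of the result shape in plain Python. It does not call Salesforce; the function `rollup_counts` and the sample leads are hypothetical, and a `None` value stands in for the null that marks a subtotal or grand-total row.

```python
from collections import Counter

# Shape being imitated (SOQL):
#   SELECT Status, LeadSource, COUNT(Name) cnt
#   FROM Lead GROUP BY ROLLUP(Status, LeadSource)


def rollup_counts(rows, first, second):
    """Return (first, second, count) rows plus a subtotal per `first` and a grand total."""
    detail = Counter((r[first], r[second]) for r in rows)
    subtotals = Counter(r[first] for r in rows)
    result = []
    for a in sorted(subtotals):
        for (x, b), n in sorted(detail.items()):
            if x == a:
                result.append((a, b, n))
        result.append((a, None, subtotals[a]))  # subtotal row for this group
    result.append((None, None, len(rows)))  # grand total row
    return result


leads = [
    {"Status": "Open", "LeadSource": "Web"},
    {"Status": "Open", "LeadSource": "Web"},
    {"Status": "Open", "LeadSource": "Phone"},
    {"Status": "Closed", "LeadSource": "Web"},
]
rollup = rollup_counts(leads, "Status", "LeadSource")
for row in rollup:
    print(row)
```

With GROUP BY ROLLUP the platform returns these extra rows itself, so the subtotal logic above does not have to live in application code.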

Request and Reply Integration - SOAP API

Using a WSDL, a proxy Apex class can be generated that provides the capability to call the remote service and initiate synchronous SOAP callouts.

The company loads millions of lead records into Salesforce every quarter. While several hundred thousand leads are not of good quality, most of them are considered valuable by the marketing department. In addition, several thousand campaign records are created in Salesforce every month. Programmatic sharing has been implemented to share these records with different types of users in Salesforce. Marketing users work on a lead or campaign record for one year but require access to the record for at least two years for reference and reporting. The regulatory requirements, however, require certain leads and campaigns to be stored for three years. The data architect has been asked to suggest a suitable data archiving strategy that would meet these requirements.

Using a tier-based data archiving and retention strategy would be the best course of action in this case. Data can be structured and placed in a hierarchy based on its business value and the duration for which it should be stored for various business operations. It can then be archived based on its position in the hierarchy and the desired schedule.

Request and Reply Integration - REST API

Using synchronous HTTP callouts, standard GET, POST, PUT, and DELETE requests can be made to the external system to integrate with RESTful services.

What happens when you extract data with the bulk API?

When extracting data with Bulk API, queries are split into 100,000 record chunks by default—you can use the chunkSize header field to configure smaller chunks, or larger ones up to 250,000.
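
The arithmetic behind the default and maximum chunk sizes can be sketched as below. The function names are assumptions, and the chunk count is only an estimate: chunks cover ranges of record IDs, so some may hold fewer matching records than the chunk size.

```python
import math

DEFAULT_CHUNK_SIZE = 100_000  # default stated on this card
MAX_CHUNK_SIZE = 250_000      # maximum stated on this card


def resolve_chunk_size(requested=None):
    """Return the chunk size to use, falling back to the default when none is given."""
    if requested is None:
        return DEFAULT_CHUNK_SIZE
    if not 1 <= requested <= MAX_CHUNK_SIZE:
        raise ValueError(f"chunk size must be between 1 and {MAX_CHUNK_SIZE}")
    return requested


def estimated_chunks(record_count, requested=None):
    """Estimate how many chunks an extract of `record_count` records is split into."""
    return math.ceil(record_count / resolve_chunk_size(requested))


print(estimated_chunks(1_000_000))           # default chunk size
print(estimated_chunks(1_000_000, 250_000))  # largest allowed chunk size
```

Larger chunks mean fewer result sets to consolidate, which is the trade-off the MessageStatus__c example is making.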

When loading data, which operation is the fastest?

When loading data, use the fastest operation possible: insert is faster than upsert, and even insert + update can be faster than upsert alone.

Loading Data from the API - Goal: reducing data to transfer and process

When updating, send only fields that have changed (delta-only loads).
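
A delta-only payload can be computed by comparing the last known copy of a record with the new values. This is a minimal sketch; the helper `changed_fields` and the sample record are hypothetical.

```python
def changed_fields(original, updated, key="Id"):
    """Return a payload with the record key plus only the fields whose values differ.

    Returns None when nothing changed, so the record can be skipped entirely.
    """
    delta = {
        field: value
        for field, value in updated.items()
        if field != key and original.get(field) != value
    }
    return {key: updated[key], **delta} if delta else None


before = {"Id": "001", "Name": "Acme", "Phone": "555-0100"}
after = {"Id": "001", "Name": "Acme", "Phone": "555-0199"}
payload = changed_fields(before, after)
print(payload)
```

Records for which the function returns None do not need to be sent at all, which reduces both transfer and processing.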

Cosmic Grocery has an Experience Cloud site that can be accessed by its customers and partners. Both types of external users can currently view all the personal information of Salesforce users. The CTO of the company would like to hide certain personal information, including Address, Email, Mobile, Phone, Username, and SAML Federation ID. Other fields should still be visible to the external users. The data architect needs to recommend a suitable approach to meet this requirement.

Why can't you just use standard field-level security?

Without necessary indexes, what does the query optimizer need to do?

Without the necessary indexes, the query optimizer must perform a full scan to fetch the target rows.

On which fields can you create External IDs?

You can create External IDs only on the following fields: • Auto Number • Email • Number • Text

Heroku Connect

You can use Heroku Connect for data replication and data proxies. Heroku Connect is used in conjunction with the awesome Heroku Postgres database. You can replicate data to and from Salesforce into this SQL database, or you can proxy it from the Heroku Postgres database into Salesforce using Salesforce Connect. 1. Data replication with Heroku Connect can be one way, from Salesforce to Heroku Postgres, or bidirectional. 2. Because Heroku Connect uses Heroku Postgres, all standard database features are available with the replicated data. 3. A common use for Heroku Connect is business-to-consumer apps that use and potentially change data stored in Salesforce.

In terms of features, which ones are compatible with the State/Country picklists?

You can use the state and country/territory picklists in most places that state and country/territory fields are available in Salesforce, including: • Record edit and detail pages • List views, reports, and dashboards • Filters, functions, rules, and assignments. State and country/territory picklists can also be searched, and they're supported in Translation Workbench.

Which standard and custom objects cannot be truncated?

You can't truncate standard objects or custom objects that are referenced by another object through a lookup field, or that are on the master side of a master-detail relationship, are referenced in a reporting snapshot, have a custom index or an external ID, or have activated skinny tables.

What happens if the number of tasks to a single account is a lot more than 10K?

You'll end up with too much overlap across multiple batches, and lock contention will cause load problems. In this case, you'll want to take that set of tasks and load them in a separate serial job using a controlled feed load.
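
One way to prepare such a load is to separate the tasks of heavily skewed accounts from the rest before batching. The sketch below assumes tasks are dictionaries with an `AccountId` key; the function name and the default threshold parameter are illustrative, with 10,000 taken from the card.

```python
from collections import Counter


def split_for_load(tasks, parent_field="AccountId", threshold=10_000):
    """Split tasks into a parallel-safe set and a set to load in a separate serial job."""
    counts = Counter(task[parent_field] for task in tasks)
    skewed = {parent for parent, n in counts.items() if n > threshold}
    parallel = [t for t in tasks if t[parent_field] not in skewed]
    serial = [t for t in tasks if t[parent_field] in skewed]
    return parallel, serial


# Tiny demonstration with a threshold of 2 instead of 10,000.
sample = [
    {"Subject": "Call", "AccountId": "A"},
    {"Subject": "Email", "AccountId": "A"},
    {"Subject": "Visit", "AccountId": "A"},
    {"Subject": "Call", "AccountId": "B"},
]
parallel_rows, serial_rows = split_for_load(sample, threshold=2)
print(len(parallel_rows), len(serial_rows))
```

The serial set is the one that would go through the controlled feed load, so that its rows never compete for the same parent lock across batches.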

How can you avoid timeouts on large SOQL queries?

Your SOQL query sometimes returns so many sObjects that the limit on heap size is exceeded and an error occurs. To resolve, use a SOQL query for loop instead, since it can process multiple batches of records by using internal calls to query and queryMore. Key point: Instead of using a SOQL query in a for loop, the preferred method of mass updating records is to use batch Apex, which minimizes the risk of hitting governor limits.

Standard index examples of meeting selectivity thresholds

• A query is executed against a table with 2 million records, and the filter matches 450,000 or fewer records. • A query is executed against a table with 5 million records, and the filter matches 900,000 or fewer records.
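
Both figures are consistent with the standard index rule documented by Salesforce: 30% of the first million records plus 15% of the records beyond that, capped at 1 million. A short sketch of that arithmetic, with a hypothetical function name:

```python
def standard_index_threshold(total_records):
    """Maximum number of matching rows for a standard-index filter to count as selective."""
    first_million = min(total_records, 1_000_000)
    remainder = max(total_records - 1_000_000, 0)
    # Integer arithmetic: 30% of the first million, 15% of the rest, capped at 1 million.
    threshold = first_million * 30 // 100 + remainder * 15 // 100
    return min(threshold, 1_000_000)


print(standard_index_threshold(2_000_000))
print(standard_index_threshold(5_000_000))
```

For the two tables on this card the function gives 450,000 and 900,000 rows respectively.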

