Salesforce Data Mastery Q&A: How to build for large data volumes like a Master Architect

By Susannah St-Germain, Technical Architect at Odaseva

 

Large data volumes in Salesforce environments can grow out of control and cause problems ranging from frustratingly slow user experiences for your employees and subpar customer experiences to bringing down your entire Salesforce org.

Many Salesforce admins inherit this problem: data generated over a decade or more by code written by multiple internal and external teams is suddenly overwhelming the Salesforce environment, and fingers are pointing at you to fix it. It’s not uncommon.

But designing and implementing solutions is challenging when you have a large Salesforce org, and figuring out how to avoid the risks associated with large data volumes can be tricky. How do you avoid hitting Salesforce governor limits? How often should you archive? How can you reduce expensive data storage costs? 

Odaseva recently hosted a webinar to show how you can rein in the risks associated with large data volumes – without sacrificing business requirements. We collected and answered the community’s most vexing questions about large data volumes (LDV). The webinar, called “Salesforce Data Mastery Q&A: How to build for large data volumes like a Master Architect,” featured my brilliant colleague Carl Brundage, Master Architect at Odaseva, and Anil Sistla, Platform Architect at Schneider Electric.

Below is a summary of the questions and some key takeaways that any organization dealing with large data volumes should know:

What is a large data volume in Salesforce? At what point do large data volumes really start to matter?

Large data volumes usually have the following characteristics:

  • Large number of records (>10M records)
  • Large number of fields
  • Large size per record (rich text fields, files; API size)
  • Large amount of data storage (>50GB)
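
To get a quick sense of where your own org stands against those thresholds, you can check storage consumption from Anonymous Apex. The sketch below is illustrative only: Case is simply a stand-in for whichever objects are largest in your org, and it assumes the standard org limits resource exposes the usual DataStorageMB and FileStorageMB entries.

```apex
// Anonymous Apex sketch: rough picture of storage consumption in the org.
// Limit names assume the standard org limits resource exposed to Apex.
Map<String, System.OrgLimit> orgLimits = OrgLimits.getMap();
System.OrgLimit dataStorage = orgLimits.get('DataStorageMB');
System.OrgLimit fileStorage = orgLimits.get('FileStorageMB');
System.debug('Data storage: ' + dataStorage.getValue() + ' of ' + dataStorage.getLimit() + ' MB');
System.debug('File storage: ' + fileStorage.getValue() + ' of ' + fileStorage.getLimit() + ' MB');

// For per-object record counts, avoid SELECT COUNT() in Apex on big objects
// (counted rows still consume the 50,000 query-row governor limit). The REST
// resource /services/data/vXX.X/limits/recordCount?sObjects=Case returns fast
// approximate counts instead.
```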

As far as when large data volumes really start to matter – that can vary considerably. Carl’s response is what he calls the “classic architect answer.”

“It depends,” he says. He explains that you can start to see the negative effects of large data volumes when you have fewer than 1 million records if you haven’t implemented best practices. The telltale signs are slow or even failing performance like “reports are taking a long time to load, you go to the all-list view and guess what – nothing comes back so you just wait and wait and wait.”

So while 10 million is considered a large data volume, the negative effects can start impacting you with far fewer records if you don’t plan for it.
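
One common way those symptoms show up is in non-selective queries: a report, list view, or SOQL filter that isn’t backed by an index forces a scan of the whole object. A minimal, hypothetical illustration (the field values and the 30-day window are arbitrary examples):

```apex
// On tens of millions of Cases, a filter like Status != 'Closed' is
// non-selective: Status is not indexed by default and negative filters defeat
// index use, so the query scans the whole object and can even fail with a
// "Non-selective query against large object type" error when issued from a trigger.
// List<Case> slow = [SELECT Id FROM Case WHERE Status != 'Closed'];

// Anchoring the filter on an indexed field (Id, CreatedDate, an External ID,
// or a custom-indexed field) keeps the query selective and fast.
List<Case> recentOpenCases = [
    SELECT Id, Subject
    FROM Case
    WHERE CreatedDate = LAST_N_DAYS:30
    AND Status = 'New'
    LIMIT 200
];
```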

What are the risks associated with large data volumes in Salesforce?

Anil from Schneider Electric explains that large data volumes can impact system performance ranging from very slow user experiences, to directly impacting customers, even to bringing down the Salesforce org.

“When running reporting on huge data volumes, you are going to be consuming so many resources on the platform, which is going to impact your users who are trying to perform transactions on the platform,” says Anil. “Your users are starting to see page loads taking much longer than they used to, sometimes minutes and minutes of time.”

“The issues look simple but are actually impacting your platform on a huge scale,” he says. That’s because when your employees are impacted by slowdowns, it ultimately affects your customers. “You’re directly translating that to your customer experience. Let’s say your customer care agent is on a call with a customer and if you’re taking such a long time to even open up a case, you’re directly impacting your customer who is probably not going to be willing to wait for that long.”

So how do we start to solve some of these issues? What are some best practices about mitigating these risks?

Carl says that “even if you’ve done the right things at a code level, you can run into problems when you start moving data in and out of the system. So the question becomes what can you do about it?”

Carl cites bypass strategies as a way to skip automations if the data coming into the system “already has the other things that the automations may be doing.”

“For example… for one customer, updating 100,000 cases with Batch Apex took 6-7 hours of run time, and that’s because of triggers, lots of logic blocks, Process Builders, workflows, escalation. Once a bypass strategy was implemented, it took 10 minutes.”
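
A common way to implement a bypass is a flag that the automation checks before doing any work, for example a hierarchy custom setting an admin can switch on for the integration or data-load user. The sketch below uses hypothetical names (Automation_Bypass__c with a checkbox Bypass_Case_Triggers__c) and is one pattern among several; custom permissions or custom metadata can serve the same purpose.

```apex
// Sketch of a trigger-level bypass, assuming a hierarchy custom setting
// Automation_Bypass__c with a checkbox field Bypass_Case_Triggers__c
// (hypothetical names). Set the flag on the data-load user's profile so
// bulk updates skip automation whose results the incoming data already carries.
trigger CaseTrigger on Case (before update, after update) {
    Automation_Bypass__c settings = Automation_Bypass__c.getInstance();
    if (settings != null && settings.Bypass_Case_Triggers__c == true) {
        return; // bypass: skip handler logic for flagged users/profiles
    }
    // ... normal handler dispatch (escalation, field logic, etc.) ...
}
```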

So while we talk about large data volumes and working within the constraints of Salesforce, there’s a lot that we can do as users to make sure that the things we’re adding on top don’t impact our large data volume experience even more.

What is the best way to avoid hitting Salesforce governor limits when dealing with large data volumes?

There are “in-your-face” limits… and then there are “sneaky” limits.

Susannah explains that there are API limits in Salesforce that can come as a surprise, because while some Salesforce limits are soft, where you can ask for an increase, “a lot of these API limits are strictly enforced for good reason.

“A quick example of how that might play out: if you’re not aware of the different limits that govern each API… and you pick not-the-best API for the job, you can start to run into these 24-hour rolling blocks, where if you exceed the number of API calls for a 24-hour rolling period you have to wait for the next period to open up, and that can mean a pause to your business and ultimately a pause for your customer. So that’s extremely important to be aware of.”
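
One practical guard is a pre-flight check: before kicking off a large integration or batch run, compare 24-hour API consumption against the org cap and defer the job if you’re close. A minimal sketch, assuming the standard DailyApiRequests org limit and an arbitrary 90% threshold:

```apex
// Check 24-hour rolling API usage before launching a big job.
System.OrgLimit apiRequests = OrgLimits.getMap().get('DailyApiRequests');
Decimal usedPct = 100.0 * apiRequests.getValue() / apiRequests.getLimit();
System.debug('Daily API requests: ' + apiRequests.getValue() + ' of '
    + apiRequests.getLimit() + ' (' + usedPct.setScale(1) + '%)');

if (usedPct > 90) {
    // Defer or throttle rather than run into the hard cap and a forced pause.
    System.debug(LoggingLevel.WARN,
        'API usage above 90% of the 24-hour limit; postponing the bulk sync.');
}
```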

But it’s not just the obvious limits that you need to be aware of, says Carl. Sneaky limits can catch you off guard, such as the fact that you can only put 30 million files into Salesforce, or partner limits. “Looking for sneaky limits at the start will save you a lot of headaches down the road.”

Storage space in Salesforce can be pricey. What should be done to keep data volumes low? Should we move data out of Salesforce?

This is another question where the answer is “it depends.” There are many different options depending on the business requirements.

Anil explains that “One aspect is storage space, the other aspect is ensuring that your users are getting the optimal performance from the Salesforce platform they’re using.” Schneider Electric deals with very, very large data volumes in the tens of millions, and Anil says that “it’s a very methodical process you have to put in place, it’s not just purely from a storage standpoint but from an end to end journey that you’re trying to address for your agents or your customer.”

“There is some data that we can take off the platform but at the same time we don’t want to lose that access to the data because it’s a gold mine in today’s world and it’s always important for your agents to refer back to something they have addressed in the past.”

So Schneider Electric’s archiving goals weren’t just about taking data off the platform, but also about thinking through “what are we going to gain in terms of improving the agent experience, how do we still supplement the agent experience so they never feel that they lost the data and they still have access to it? The first and foremost thing that is very important is to talk to your business users, understand what it is that they need and what they’ll be using often” and then remove what can be removed based on that.
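
On the mechanics side, the removal step of an archiving strategy often ends up as a Batch Apex job that selects records past their active life and deletes them once they are safely stored elsewhere. The sketch below is deliberately minimal: it assumes the selected Cases have already been copied to an external archive, that two-year-old closed cases meet the business requirements discussed above, and that names like ArchiveClosedCasesBatch are placeholders.

```apex
// Minimal Batch Apex sketch of the delete step of an archiving job.
// Assumes the records were already copied to an external archive beforehand.
public class ArchiveClosedCasesBatch implements Database.Batchable<SObject> {

    public Database.QueryLocator start(Database.BatchableContext bc) {
        // Cases closed more than two years ago (example retention window).
        return Database.getQueryLocator(
            'SELECT Id FROM Case WHERE IsClosed = true AND ClosedDate < LAST_N_YEARS:2'
        );
    }

    public void execute(Database.BatchableContext bc, List<Case> scope) {
        // allOrNone = false so one blocked record does not fail the whole chunk.
        Database.delete(scope, false);
    }

    public void finish(Database.BatchableContext bc) {
        // Hook for notifications or for chaining the next archiving job.
    }
}
```

You would launch it with Database.executeBatch(new ArchiveClosedCasesBatch(), 2000), where the scope size (2,000 here) is one more knob to tune against governor limits.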

When should you archive Salesforce data?

Anil explains that once you start getting into very large data volumes, it’s a trigger point to start thinking about your archiving strategy. While Schneider Electric archives every day, Anil says that there’s no hard and fast rule. “It’s going back to… what is the business expectation? What’s the volume of the data growth that you have? That is what is really driving the frequency that you need to archive the data.”

I like to sum this up as: “Data growth + business requirements = frequency.”
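
In practice that frequency usually ends up expressed as a schedule. Continuing the illustrative batch class above, a nightly run might look like this (the 2 a.m. cron expression and daily cadence are examples, not recommendations):

```apex
// Schedulable wrapper that runs the archiving batch on a recurring schedule.
public class NightlyArchiveScheduler implements Schedulable {
    public void execute(SchedulableContext sc) {
        Database.executeBatch(new ArchiveClosedCasesBatch(), 2000);
    }
}

// Run once from Anonymous Apex to schedule a daily 2 a.m. run
// (cron fields: seconds minutes hours day-of-month month day-of-week).
// System.schedule('Nightly case archive', '0 0 2 * * ?', new NightlyArchiveScheduler());
```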

If you’re managing Large Data Volumes in Salesforce, we can help. Architects and platform owners at Fortune 500 companies like Schneider Electric, Robert Half, and Heineken rely on Odaseva to back up, archive, and distribute huge volumes of data. If you want to level up your data management, get in touch for a personalized demo.
