All I want for Christmas is… faster webhooks

We’ve been taking advantage of the quiet period around the holidays to work on some performance tweaks to WP Fusion, and we’ve come up with some pretty exciting changes to the webhooks system.

First a recap:

What are webhooks? When something changes on a contact record in your CRM, like a tag is applied or a field is edited, you usually want that data synced back to the contact’s user record in WordPress.

Webhooks are a way for your CRM to tell WP Fusion that something has changed for a contact (or that a new user needs to be imported). We support webhooks with most platforms; you can see the various setup guides here.

How do they work currently? Each webhook contains the contact ID of the updated contact in the URL. WP Fusion then takes this ID and uses it to connect back to your CRM and load their updated tags and/or custom fields.

If a new user has been imported, WP Fusion also syncs their username and generated password back to your CRM so it can be sent in a welcome email.

What’s the problem? The problem is that all of this takes time. Importing a new user, generating a password, and syncing it back to your CRM can take 5 seconds, or longer if the API is slow.

That’s not so bad for a single user, but it can cause problems when multiple webhooks are coming in at the same time.

#Testing, testing…

Let’s do a test where we try to import 200 users (via webhooks) in one minute.

This roughly simulates what happens when a bunch of contacts hit an automation step with a webhook at the same time, and it’s the most common cause of user accounts not getting created or tags getting out of sync.

Note A: All tests are performed on a DigitalOcean server with 2 GB of memory and one CPU (basic, $12/mo hosting). No caching. Active plugins are Elementor, LearnDash, BuddyBoss, and WooCommerce. So this is a pretty “low end” environment in terms of available resources. These kinds of setups are where our customers have the most problems with webhooks and server load.

Note B: For the test we are importing the new user and their tags from the ActiveCampaign contact record, generating a password and syncing it back to a custom field, enrolling the user into two LearnDash courses, and applying one “Course Enrolled” tag to indicate a successful import.

#Test 1 – Default webhooks

This is the default webhook endpoint following our ActiveCampaign webhooks guide. For example https://mysite.com/?wpf_action=add&contact_id=123.
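To make the shape of that request concrete, here is a minimal Python sketch of pulling the action and contact ID out of a webhook URL. It is an illustration only, not WP Fusion's actual PHP code, and the function name `parse_webhook` is made up for this example.

```python
from urllib.parse import urlparse, parse_qs

def parse_webhook(url):
    """Return (action, contact_id) from a webhook URL, or None if either is missing."""
    query = parse_qs(urlparse(url).query)
    action = query.get("wpf_action", [None])[0]
    contact_id = query.get("contact_id", [None])[0]
    if not action or not contact_id:
        return None
    return action, contact_id

print(parse_webhook("https://mysite.com/?wpf_action=add&contact_id=123"))
# ('add', '123')
```

Everything after this point (loading the contact, creating the user, syncing the password) needs API calls back to the CRM, which is where the time goes.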

The first user is imported in about 5 seconds. But as more webhooks come in, the site starts to slow down under the load. By the 15th user, the site is now taking 10+ seconds per webhook ☹️

Then the site runs out of resources after about 30 seconds, and you get a “gateway timeout” error.

In this case 42 out of 200 users were successfully imported. Not great! 😬

API calls take time to send. Usually about a second each with ActiveCampaign (on a good day 😅). In this case we’re sending 4 API calls per import, so 5 seconds per user is about the best we can hope for.

The resource problem comes from the fact that basic hosting like this can only process a certain number of concurrent requests. In this case it can handle about 30 before it crashes.
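A rough back-of-the-envelope check shows why the numbers collide. The Python sketch below applies Little's law (average requests in flight = arrival rate × time per request) and assumes the 200 webhooks arrive at a steady rate over the minute. That is a simplification for illustration, not something we measured.

```python
def concurrent_requests(arrivals_per_minute, seconds_per_request):
    """Little's law: average requests in flight = arrival rate x time per request."""
    return arrivals_per_minute / 60 * seconds_per_request

# 200 webhooks in one minute, assumed to arrive at a steady rate
for seconds in (5, 10):
    in_flight = concurrent_requests(200, seconds)
    print(f"{seconds}s per webhook -> about {in_flight:.0f} requests in flight")
# 5s per webhook -> about 17 requests in flight
# 10s per webhook -> about 33 requests in flight
```

Under that assumption, once each webhook takes 10 seconds the site is juggling more than the roughly 30 concurrent requests it can handle.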

#Test 2 – Making the webhooks asynchronous

Since the API calls are the slowest part, let’s try offloading those to a separate process.

WP Fusion already has an import tool that can import user accounts for thousands of CRM contacts. It does this by working through the records one by one, instead of all at the same time.

We can take each incoming webhook and add the contact ID to a queue of records to be imported by the import tool, and then dispatch a background process to handle the import asynchronously. The background worker will then import the records one by one, as resources allow.
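The idea can be sketched in a few lines of Python. This is a simplified, single-process illustration, not the plugin's code: `import_queue`, `handle_webhook`, `import_contact`, and `run_background_worker` are hypothetical names, and the real import step would make the CRM API calls.

```python
from collections import deque

import_queue = deque()   # hypothetical stand-in for the stored import queue
imported = []            # records the order in which contacts were imported

def handle_webhook(contact_id):
    """Fast path: record the contact ID and return right away."""
    import_queue.append(contact_id)
    return "queued"

def import_contact(contact_id):
    """Placeholder for the slow part (CRM API calls, user creation)."""
    imported.append(contact_id)

def run_background_worker():
    """Work through the queue one record at a time until it is empty."""
    while import_queue:
        import_contact(import_queue.popleft())

for contact_id in ("101", "102", "103"):
    handle_webhook(contact_id)

run_background_worker()
print(imported)
# ['101', '102', '103']
```

The webhook handler only has to append to the queue, so its response time no longer depends on how slow the CRM's API is.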

Handling the webhooks asynchronously let us process 173 out of 200 requests with an average response time of 7,304ms

Getting better! We managed to handle 173 out of 200 requests in a minute, with an average response time of about 7 seconds.

We didn’t get everybody imported, but at least the site didn’t crash! 😌

This alternate “async” method has been supported in WP Fusion for a couple of years now, and it has helped a lot with some customers, but we felt like there had to be room to improve.

#Test 3 – Tweaking the background process

As part of this testing, we realized that each time we added a new record to the import queue, it was spawning a new instance of the background worker to handle the import, even if an import was already running.

We got around this by making a simple change to WP Background Processing so that a new asynchronous request won’t be spawned if there’s an existing process lock (i.e. the process is already running).

With this change, the very first webhook should dispatch a new background process, but subsequent webhooks will simply be recorded to the import queue (as long as the importer is still running).
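Here is a small Python simulation of that behavior. It does not reproduce the WP Background Processing API. The `Importer` class and its attributes are invented for this example, and the "lock" is a plain boolean rather than a real cross-process lock.

```python
class Importer:
    def __init__(self):
        self.queue = []
        self.locked = False     # stand-in for the process lock
        self.dispatches = 0     # how many background processes were started

    def handle_webhook(self, contact_id):
        """Always queue the record; only dispatch a worker if none is running."""
        self.queue.append(contact_id)
        if not self.locked:
            self.locked = True
            self.dispatches += 1

    def finish(self):
        """Called when the worker has drained the queue and releases the lock."""
        self.queue.clear()
        self.locked = False

importer = Importer()
for i in range(200):
    importer.handle_webhook(str(i))

print(importer.dispatches)   # 1
```

Without the lock check, the same loop would start 200 background processes, which is the overhead the change removes.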

Using a process lock to prevent concurrent background processes from spawning gets our average handle time down to 516ms.

Woah, now we’re cooking with gas 🔥

All 200 users were successfully imported, with an average response time of 516ms 😘👌

More importantly, the response times stay relatively steady throughout the test… meaning the site could probably put up with this level of activity for a sustained period, without running out of resources.

#Test 4 – Everything but the kitchen sink

Sustained half-second webhook handling on a basic hosting plan is great. In 99% of cases that will be fast enough.

But we have customers with 100,000+ members moving through CRM automations, sending webhooks back to their site all day every day, and in those cases every millisecond counts.

We’ve already offloaded the API calls to the background worker. What’s the biggest bottleneck now? It’s WordPress. 

Each time we hit https://mysite.com/?wpf_action= all of WordPress has to load: the theme, all the plugins, any past-due cron tasks. Basically a whole mess of stuff we don’t really need.

Since all we’re doing now is saving the contact ID to the import queue, all we really need is access to the database. But, without mucking about with .htaccess and rewrite rules, it’s kind of hard to bypass the normal WordPress load process.

Since this is getting into edge-case territory, we’ll use an edge-case solution. WP Fusion now ships with an api.php file inside the plugin folder. You can POST your webhooks directly to this file, and it will validate them and save them directly to the database, bypassing the normal WordPress load process (check out how that works here).
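Conceptually, a lightweight endpoint like this only has to do two things: check the request and write one row. The Python sketch below uses an in-memory SQLite database to illustrate that pattern. It is not how api.php is actually implemented, and the table name, the allowed actions, and the validation rule are all assumptions made for the example.

```python
import sqlite3

ALLOWED_ACTIONS = {"add", "update"}   # assumed set of actions, for illustration

def open_queue():
    """Create an in-memory database with a hypothetical queue table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE import_queue (action TEXT, contact_id TEXT)")
    return db

def save_webhook(db, params):
    """Validate the request parameters and store them; return True if stored."""
    action = params.get("wpf_action")
    contact_id = params.get("contact_id", "")
    if action not in ALLOWED_ACTIONS or not contact_id.isalnum():
        return False
    db.execute("INSERT INTO import_queue VALUES (?, ?)", (action, contact_id))
    db.commit()
    return True

db = open_queue()
print(save_webhook(db, {"wpf_action": "add", "contact_id": "123"}))    # True
print(save_webhook(db, {"wpf_action": "drop", "contact_id": "123"}))   # False
```

Because nothing else is loaded, the cost per request is essentially one validation check and one database write.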

Time to test again, now sending the webhooks into the plugins directory at https://mysite.com/wp-content/plugins/wp-fusion/api.php?wpf_action=add&contact_id=123

Bypassing WordPress and saving the data directly to the database cuts down on the response time by more than 50%

This test handled 200 webhooks with an average time of 212ms, with almost no variability in the response time 🤩

The background worker then proceeds to import all 200 users one at a time, while respecting the resource limits of the server as well as ActiveCampaign’s API limits.

In this case the 200 users were imported and enrolled in their courses over the following 6 minutes and 12 seconds.

So, that’s pretty easy. Let’s try 500 🤔

Increasing the number of requests to 500 had only a slight effect on response time.

The response time is basically unchanged.

For the sake of “why not”, let’s throw 1,000 webhooks at it 💁‍♂️

239ms response time. Basically unchanged, and consistent throughput.

And keep in mind this is on a $12 / month hosting plan, running Elementor, BuddyBoss, LearnDash, and WooCommerce, with no caching or other optimizations in place.

So, I think it’s time to cautiously say we’ve solved the problem of incoming webhook performance in WP Fusion 😅

These changes will be available on Monday, December 27th in the v3.38.31 update of WP Fusion. Happy Holidays, y’all!  🎁
