Wow, congrats on making it through with CometD and Bayeux. I think Salesforce has realized that a lot of their APIs aren't useful when using non-standard approaches. I think the move from SOAP to also including REST was a signal that they're trying to be more useful in this realm. I definitely agree a RabbitMQ sink would've been so ideal! I'm hoping it's somewhere on their roadmap to make our lives easier.
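For anyone who hasn't been through it: the CometD dance boils down to a couple of Bayeux protocol messages. A minimal sketch below, with field names taken from the Bayeux 1.0 spec; the client id and topic channel are hypothetical placeholders, not values from any real org.

```python
# Sketch of the two Bayeux messages a CometD client sends to subscribe
# to a Salesforce Streaming API channel. Field names follow the Bayeux
# 1.0 protocol; "abc123" and the /topic/... channel are made up.

def handshake_message():
    # Step 1: negotiate the connection on the /meta/handshake channel.
    return {
        "channel": "/meta/handshake",
        "version": "1.0",
        "supportedConnectionTypes": ["long-polling"],
    }

def subscribe_message(client_id, topic):
    # Step 2: subscribe to an event channel, using the clientId that
    # came back in the handshake response.
    return {
        "channel": "/meta/subscribe",
        "clientId": client_id,
        "subscription": topic,
    }

msg = subscribe_message("abc123", "/topic/AccountUpdates")
```

Both messages are POSTed as JSON to the server's CometD endpoint; the actual transport/retry handling is what a CometD client library takes off your hands.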
That's awesome that you set this up, thanks for paving the way! Our users who want real-time syncs and are watchful of their REST API limits typically opt for our streaming solution (I believe the Pub/Sub API was a recent addition to their change data capture APIs). Too much polling definitely runs into limits, as you mentioned, but we let our users set their own polling frequency to meet their needs.
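The user-configurable frequency idea can be sketched as a plain poll loop. This is an illustration, not our actual implementation; `fetch_changes` and `poll_interval_s` are hypothetical names, and a production version would also track calls against the org's daily REST request cap.

```python
import time

# Hedged sketch of a user-configurable polling loop. fetch_changes and
# poll_interval_s are assumptions for illustration, not a real API.

def run_poller(fetch_changes, poll_interval_s, max_polls=None):
    """Call fetch_changes() every poll_interval_s seconds.

    A limit-aware deployment would also count requests per 24 hours,
    since Salesforce enforces org-wide daily API request caps.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        fetch_changes()
        polls += 1
        if max_polls is not None and polls >= max_polls:
            break
        time.sleep(poll_interval_s)
    return polls
```

Letting the user pick `poll_interval_s` is exactly the trade-off described above: shorter intervals mean fresher data but burn through the API allowance faster.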
If I'm understanding correctly, the scheduled jobs refer to the Bulk API (I agree it executes at seemingly random speeds). We only use the Bulk API on the initial "seed", where we write a large amount of data from Salesforce to Postgres. Otherwise, when it comes to reading/writing data, we stick to the REST API, which we've found pretty performant and which Heroku Connect seems to rely on, too: https://devcenter.heroku.com/articles/mapping-configuration-...
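For context, the "seed" step maps onto creating a Bulk API 2.0 query job over plain REST. A rough sketch, assuming the endpoint/payload shape from the Bulk API 2.0 docs; the instance URL, API version, and SOQL query here are placeholders, not values from our setup.

```python
# Sketch of creating a Bulk API 2.0 query job (the one-time "seed").
# Endpoint and payload shape follow the Bulk API 2.0 docs; the instance
# URL, API version, and SOQL below are illustrative placeholders.

API_VERSION = "v58.0"  # assumption: any recent API version looks the same

def bulk_query_job_request(instance_url, soql):
    url = f"{instance_url}/services/data/{API_VERSION}/jobs/query"
    payload = {"operation": "query", "query": soql}
    # e.g. requests.post(url, json=payload, headers={"Authorization": ...})
    return url, payload

url, payload = bulk_query_job_request(
    "https://example.my.salesforce.com",
    "SELECT Id, Name FROM Account",
)
```

After the job completes you page through the result CSV, which is why it suits a one-time bulk seed better than ongoing incremental reads/writes, where the plain REST resources are simpler.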
> Are you limited to working with customers that are paying Salesforce enough…
Yeah, right now we do require that users are on Salesforce plans that include API access, which are Performance, Developer, Unlimited, and Enterprise (or Professional w/ API add-ons).
I'm guessing the GP's scheduled jobs are running within Salesforce, probably as Apex. I'd note that I've seen inconsistent async processing delays even for EE and UE customers. For one, I'm pretty sure everyone is on shared infrastructure, and for another, the delay is at least in part relative to the amount of recent processing.
Yeah, definitely agree; we had to rework our logic for these cases. We've worked with objects of ~4 million rows. In general, with our polling-strategy syncs, the problem was more the size of each record: a table with 1M rows but 400 fields was far more problematic than one with 2M rows and just 5 fields.
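The back-of-the-envelope arithmetic makes the point: what matters for a full sync is roughly rows times fields, not rows alone.

```python
# Why table width mattered more than row count in the example above:
# the total number of field values moved in a full sync is rows * fields.

def field_values(rows, fields):
    return rows * fields

wide = field_values(1_000_000, 400)   # 1M rows x 400 fields
narrow = field_values(2_000_000, 5)   # 2M rows x 5 fields

# The table with "fewer" rows carries 40x more field values per sync.
ratio = wide // narrow
```

So the 1M-row table moves 400M field values against the 2M-row table's 10M, which is why the wide table dominated sync cost.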
Yes, you're correct on that! Sorry for the lack of clarity in the docs. The snapshots would be stored in a client-side table in these cases; these snapshot tables/collections can be stored in either your own PG or MongoDB.
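To make the client-side table concrete, here's a hypothetical sketch of what a Postgres snapshot table and its write could look like: keyed by the Salesforce record Id so re-running a snapshot upserts instead of duplicating. The table and column names are illustrative, not our actual schema.

```python
# Hypothetical client-side snapshot table for Postgres. Keying on the
# Salesforce record Id lets repeated snapshots upsert idempotently.
# Table/column names are illustrative, not a real product schema.

SNAPSHOT_DDL = """
CREATE TABLE IF NOT EXISTS sf_snapshot (
    sf_id       TEXT PRIMARY KEY,   -- Salesforce record Id
    payload     JSONB NOT NULL,     -- full field map at snapshot time
    captured_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
"""

def upsert_sql():
    # ON CONFLICT makes re-running a snapshot idempotent per record.
    return (
        "INSERT INTO sf_snapshot (sf_id, payload) VALUES (%s, %s) "
        "ON CONFLICT (sf_id) DO UPDATE SET payload = EXCLUDED.payload, "
        "captured_at = now()"
    )
```

A MongoDB variant is the same idea with an upsert on `sf_id` (e.g. `replace_one(..., upsert=True)` keyed on the record Id).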