How to Store and Update Data to Ensure Transaction Data Integrity
Who this guide is for:
This guide is intended for those who are integrating with the GoCardless Bank Account Data API and require continuous access to historical transaction data.
This covers use cases such as accounting, reporting and reconciliation, personal finance management, ERP, and other scenarios where you need to both maintain a history of transactions over a period of time and regularly update the data with the latest transactions.
In short:
- If you want to use data older than 90 days continuously, you need to store transactions in your system.
- Don't save data for the same period more than once. Either query it just once or overwrite data if you query again.
Storing the transaction data
The first necessity is to store transactions in your system. While GoCardless can guarantee stable access for the latest 90 days of transaction history, there are many cases where the extended history cannot be re-downloaded after several weeks or months. This is why, if you need to have the full history of your customer’s transactions at your fingertips, you need to store it in your system.
How you store the transactions is up to you. It can be in a relational database or any other setup that works for you.
Before we get to updating the transaction data, let us start with:
Known limitations to our service as of April 2025:
- When we need to adjust the transaction data format for a specific bank or a group of banks, access to transactions older than 90 days might lapse. The latest 90 days of data will still be available, and continuous access will remain valid as before. This happens regularly.
- In rare instances, when we need to migrate to a newer version of a bank’s API or overhaul our de-duplication logic for that bank, all internalTransactionIds for all accounts at this bank can change in a one-time event. The change affects all internalTransactionIds at the same time, not individual ones.
What you should expect to work reliably:
- Getting the last 90 days of transactions: no known API-wide limitations. The exception is an issue with a particular account, such as those described in the account errors guide. In these cases, /balances is also affected, and the issue has no bearing on whether the full 90 days of history are returned.
- Receiving only unique transactions on the /transactions endpoint: we have developed extensive logic to prevent duplicate transactions from being returned. In the very rare case that a duplicate is returned, please report the bug to our support, supplying the account_id and the affected transaction(s), and we will fix it.
With the above in mind, you can now decide how you will ensure that the transaction history necessary for your use is retrieved, stored, and updated reliably. Below are recommended strategies to choose from.
Initial connection
When your user connects the account and you query the transactions for the first time, we recommend saving the full history to your database at once. Unless the account is re-connected (the user authorizes again), you can treat all data older than about a month as fixed, i.e., it will not be updated again. The exception is when there has been a longer period during which you have not queried the data.
Updating the transaction data
The challenge is to update the list of transactions with new data while avoiding duplicates created by importing the same data twice.
Approach 1 - one-time import by date ranges
The simplest approach to implement is to import each date range only once.
This means that, for example, on Tuesday you would import the transactions for Monday's date, and on Wednesday you would only import transactions for Tuesday, but not for Monday.
This approach is useful if you want to maintain local information associated with each individual transaction, such as assignments or splits, and especially information that was created manually by a user (for example, an accountant).
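The date-range bookkeeping for this approach can be sketched as follows. This is a minimal illustration, not part of the GoCardless API: the function name `next_import_range`, the idea of persisting a "last imported date" per account, and the 90-day default for the first import are all assumptions made for the example.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def next_import_range(
    last_imported: Optional[date],
    today: date,
    initial_days: int = 90,  # assumed size of the first import window
) -> Optional[Tuple[date, date]]:
    """Return the (date_from, date_to) range to import next, or None.

    Only fully completed days are imported, and the range always starts
    the day after the last imported date, so no date is imported twice.
    """
    date_to = today - timedelta(days=1)  # yesterday is the last complete day
    if last_imported is None:
        # First import for this account: take the initial window.
        date_from = today - timedelta(days=initial_days)
    else:
        date_from = last_imported + timedelta(days=1)
    if date_from > date_to:
        return None  # nothing new to import yet
    return date_from, date_to


# Monday 2025-05-05 was already imported; on Wednesday 2025-05-07
# only Tuesday 2025-05-06 is requested.
print(next_import_range(date(2025, 5, 5), date(2025, 5, 7)))
```

After a successful import, you would store `date_to` as the new last imported date for the account, ideally in the same database transaction that saves the rows.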
Approach 2 - overwriting the data when importing
Another approach is to overwrite the old data with new data each time you query for transactions. For example, if today you download the last 90 days of transactions, then before saving them in your database, you first delete the stored transactions for those same 90 days. This ensures that your data will not contain duplicates.
Such an approach is useful if you re-calculate metrics such as risk scores or cash flow predictions based on the full dataset each time new data is added.
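A minimal sketch of the overwrite step, assuming a relational store (SQLite here) and a simplified, hypothetical table layout. The table and column names and the helpers `create_schema` and `overwrite_range` are illustrative only. The delete and the insert run in a single database transaction, so a failed import does not leave the range empty.

```python
import sqlite3
from typing import Iterable, Tuple

# Hypothetical row shape: (booking_date as ISO string, amount, currency, description)
Row = Tuple[str, str, str, str]


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        """CREATE TABLE IF NOT EXISTS transactions (
               account_id   TEXT NOT NULL,
               booking_date TEXT NOT NULL,  -- ISO 8601, so string order = date order
               amount       TEXT NOT NULL,
               currency     TEXT NOT NULL,
               description  TEXT
           )"""
    )


def overwrite_range(
    conn: sqlite3.Connection,
    account_id: str,
    date_from: str,
    date_to: str,
    rows: Iterable[Row],
) -> None:
    """Replace all stored rows for the account within [date_from, date_to]."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "DELETE FROM transactions "
            "WHERE account_id = ? AND booking_date BETWEEN ? AND ?",
            (account_id, date_from, date_to),
        )
        conn.executemany(
            "INSERT INTO transactions VALUES (?, ?, ?, ?, ?)",
            [(account_id, *row) for row in rows],
        )


conn = sqlite3.connect(":memory:")
create_schema(conn)
sample = [("2025-05-05", "-12.50", "EUR", "Coffee"), ("2025-05-06", "100.00", "EUR", "Refund")]
overwrite_range(conn, "acc-1", "2025-05-01", "2025-05-07", sample)
overwrite_range(conn, "acc-1", "2025-05-01", "2025-05-07", sample)  # same data again
print(conn.execute("SELECT COUNT(*) FROM transactions").fetchone()[0])  # 2, no duplicates
```

Note that overwriting discards any local annotations stored on the deleted rows, which is why this approach suits recalculated metrics better than manually curated data.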
Approach 3 - build your own duplicate detection system
You can develop a way to detect duplicated transactions based on a combination of fields or other factors relevant to your case. Keep in mind that while most banks provide a consistently unique transactionId for each transaction, some banks either do not provide one, or it changes over time, or it is not unique.
A very useful feature to implement is a warning about potential duplicates for the users so that they can confirm whether both transactions are valid. In some cases, this also helps users identify if they have been charged twice by a merchant in error.
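One possible way to flag potential duplicates is to group transactions by a fingerprint built from several fields. The sketch below is only an illustration under assumptions: the field names (`bookingDate`, `transactionAmount`, `remittanceInformationUnstructured`) are assumed for the shape of a transaction object, and the choice of fields is an example, not a recommendation for every bank. Matching groups are candidates for a user-facing warning rather than automatic deletion, since two identical-looking transactions can both be valid.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def fingerprint(tx: dict) -> str:
    """Build a comparison key from fields assumed to be present on a transaction."""
    amount = tx.get("transactionAmount") or {}
    description = tx.get("remittanceInformationUnstructured") or ""
    parts = [
        str(tx.get("bookingDate") or ""),
        str(amount.get("amount") or ""),
        str(amount.get("currency") or ""),
        # Normalize case and whitespace so cosmetic differences do not matter.
        " ".join(description.lower().split()),
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def potential_duplicates(transactions: List[dict]) -> List[List[dict]]:
    """Return groups of transactions that share the same fingerprint."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for tx in transactions:
        groups[fingerprint(tx)].append(tx)
    return [group for group in groups.values() if len(group) > 1]


first = {
    "bookingDate": "2025-05-05",
    "transactionAmount": {"amount": "-12.50", "currency": "EUR"},
    "remittanceInformationUnstructured": "Coffee  Shop",
}
second = dict(first, remittanceInformationUnstructured="coffee shop")
third = dict(first, bookingDate="2025-05-06")

for group in potential_duplicates([first, second, third]):
    print("Please confirm these transactions are both valid:", group)
```

Here `first` and `second` end up in one group because they differ only in letter case and spacing, while `third` has a different booking date and is not flagged.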
Conclusions
To keep duplicates out of your transaction data when you update it, we recommend either importing each date range only once or overwriting the relevant range upon each import. You can also choose to implement your own system to ensure data integrity.
Footnote on known limitations
We understand that edge cases that result in changes to internalTransactionIds for posted transactions are less than ideal. Improvements to prevent this are on our roadmap; however, as of May 2025, we do not have an estimate for when they will be rolled out.