Borg is generally a very reliable backup tool, so it’s relatively rare to see broken repositories after minor issues like network interruptions.
More serious damage can be caused by issues with the underlying storage or file system. In such cases it can be necessary to run an “extended” repair on the repo to save whatever can be saved, and then continue using the repo by re-adding the missing data.
I couldn’t find much documentation on this, except for an old GitHub issue. So I’m documenting the steps I took for future reference:
Warning: Repairing a repository is a potentially destructive operation and should only be run on a copy of the data.
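As a minimal sketch of making that copy for a repo on a local disk: the function name `copy_repo` and the paths in the usage line are placeholders, not anything Borg provides. It assumes there is enough free space for a full second copy and that nothing writes to the repo while it is being copied.

```shell
# Sketch only: copy a local repository directory before attempting a repair.
# copy_repo is a hypothetical helper name, not part of Borg.
copy_repo() {
    src=$1
    dst=$2
    if [ ! -d "$src" ]; then
        echo "source repo not found: $src" >&2
        return 1
    fi
    # Refuse to touch an existing destination so an older copy is never mixed in.
    if [ -e "$dst" ]; then
        echo "destination already exists: $dst" >&2
        return 1
    fi
    # -a copies recursively and preserves timestamps and permissions.
    cp -a "$src" "$dst"
}
```

Usage might look like `copy_repo /path/to/repo /path/to/repo-copy`, after which $REPO would point at the copy.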
This particular repo was 2.5 TB in size and lost half a dozen data files due to a file system error. The first thing to try is an ordinary repair run:
$ borg check --repair $REPO
In many cases this will be able to repair the repo. You will still lose the data stored in the missing data files, but the repo will be good to use again afterwards.
Note that this command can take a long time to run and sometimes it spends hours without disk activity, while recovering segments byte-by-byte. Running it on my 2.5 TB repo locally took about 10 hours.
If you still get errors with this command, you can try rebuilding the index as described here. To do so, remove all numbered files (e.g. index.38448) from the repo root:
$ rm hints.* index.* integrity.*
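If deleting these files outright feels too final, a gentler variant is to move them aside so they can be put back later. This is only a sketch for a repo directory on a local disk: `stash_index_files` and both paths are hypothetical names chosen for illustration.

```shell
# Sketch only: move the numbered index files out of the repo root instead of deleting them.
# stash_index_files is a hypothetical helper name, not part of Borg.
stash_index_files() {
    repo=$1
    stash=$2
    mkdir -p "$stash" || return 1
    for f in "$repo"/hints.* "$repo"/index.* "$repo"/integrity.*; do
        # A glob with no match stays literal, so skip anything that is not a regular file.
        [ -f "$f" ] || continue
        mv "$f" "$stash"/ || return 1
    done
}
```

Usage might look like `stash_index_files "$REPO" /path/to/index-backup`. Everything else in the repo root, such as the config file and the data directory, is left untouched.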
After this, run:
$ borg -p -v check --repair --repository-only $REPO
Once this succeeds, repair archives:
$ borg check --repair --archives-only $REPO
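Since each pass can run for hours, it may be convenient to chain the two so that the archive pass only starts once the repository pass has succeeded. The following is a sketch under a few assumptions: `repair_in_two_passes` is a made-up function name, and the `BORG` variable is a hypothetical override that defaults to the real `borg` binary. Any non-zero exit status from the first pass stops the sequence, which may be stricter than you want.

```shell
# Sketch only: run the two repair passes in order, stopping if the first one fails.
# BORG is a hypothetical override for the command name; it defaults to the real binary.
BORG=${BORG:-borg}

repair_in_two_passes() {
    repo=$1
    # Pass 1: check and repair the repository itself (data files and index).
    "$BORG" -p -v check --repair --repository-only "$repo" || return 1
    # Pass 2: repair the archives; only reached if pass 1 exited with status 0.
    "$BORG" check --repair --archives-only "$repo"
}
```

Usage might look like `repair_in_two_passes "$REPO"`, run in a terminal so that any confirmation prompt can still be answered.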
Together, these two runs rebuild the index of segments and try to recover any available data. When done, be sure to clear the old cache on other machines:
$ borg delete --cache-only $REPO
These steps will hopefully make the repo usable again. If you still have issues, head over to GitHub Discussions, where other users may be able to help out.