merging subversion repositories
Originally posted at saji-codes tumblr.
Some time ago I tried to create a single Subversion repository for a project that was spread across several of them.
- In the old days we used the FOO1 repository. After several months we learned about the standard layout and moved the code to the trunk/ subdirectory.
- Another several months passed, the project was rewritten, and a new repository, FOO2, was created.
- Then yet another repository emerged (FOO2.1); this time we even created several branches in it.
- In addition, a copy of the code had been living in a subproject repository (BAR).
Here’s a recipe to merge these, if you happen to be in a similar situation.
Create yet another svn repository
svnadmin create /home/svn/FOO
# set up permissions to it etc.
cp -v /home/svn/FOO2.1/conf/{svnserve.conf,passwd} /home/svn/FOO/conf/
# create standard layout
svn mkdir --parents file:///home/svn/FOO/{trunk,tags,branches} -m 'layout'
Dump old repositories
svnadmin dump /home/svn/FOO1 > old-FOO1.dump
svnadmin dump /home/svn/FOO2 > old-FOO2.dump
svnadmin dump /home/svn/FOO2.1 > old-FOO2.1.dump
svnadmin dump /home/svn/BAR > old-BAR.dump
Take only what you need
# only trunk
cat old-FOO1.dump | svndumpfilter --renumber-revs --drop-empty-revs \
include trunk > new-FOO1.dump
cat old-FOO2.dump | svndumpfilter --renumber-revs --drop-empty-revs \
include trunk > new-FOO2.dump
# you can specify more than one path
cat old-FOO2.1.dump | svndumpfilter --renumber-revs --drop-empty-revs \
include trunk branches > new-FOO2.1.dump
# only one subdirectory from trunk containing FOO project
cat old-BAR.dump | svndumpfilter --renumber-revs --drop-empty-revs \
include trunk/FOO/ > new-BAR.dump
Fix paths
At this point you have several dumpfiles, and probably all of them want to put files in trunk/, which is not good. So we have to translate the paths.
# keep -i and -r separate: GNU sed reads -ir as -i with backup suffix "r"
sed -i -r \
-e 's,^Node-([a-z-]*)path: trunk,Node-\1path: branches/1.0,' \
-e 's,^Node-path: trunk$,Node-path: dummy1,' \
-e 's,^Node-([a-z-]*)path: (lib|tpl|htdocs|LICENSE|README),Node-\1path: branches/1.0-legacy/\2,' \
new-FOO1.dump
This translates all trunk/ paths to branches/1.0/. It also looks for the entry that refers to the directory itself, without any subdirectory (basically, the creation of the directory), and translates it to a dummy value. This avoids conflicts: every single dumpfile will try to create these directories, so translate them to dummy1, dummy2, etc.
The last expression (-e) sorts out the dark times when we did not use the standard Subversion layout, and translates all those paths to branches/1.0-legacy. You have to list every file and directory that was in the root directory in any revision (even those deleted before things were moved to trunk/).
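To see what the three expressions do without touching a real dumpfile, here is a minimal sketch that runs them over a handful of made-up header lines. The translate_paths function name and the sample paths are invented for illustration, and GNU sed is assumed (for -r).

```shell
#!/bin/sh
# Hypothetical helper: apply the FOO1 path translation to stdin.
# Assumes GNU sed; -r enables extended regular expressions.
translate_paths() {
    sed -r \
        -e 's,^Node-([a-z-]*)path: trunk,Node-\1path: branches/1.0,' \
        -e 's,^Node-path: trunk$,Node-path: dummy1,' \
        -e 's,^Node-([a-z-]*)path: (lib|tpl|htdocs|LICENSE|README),Node-\1path: branches/1.0-legacy/\2,'
}

# Made-up header lines of the kind found in a dumpfile.
sample='Node-path: trunk
Node-path: trunk/lib/foo.pl
Node-copyfrom-path: trunk/README
Node-path: htdocs/index.html
Node-kind: file'

translated=$(printf '%s\n' "$sample" | translate_paths)
printf '%s\n' "$translated"
```

Note that the bare trunk line comes out as branches/1.0 rather than dummy1: sed applies the expressions in order, and the first one has already renamed it. The dummy expression only has something to match in dumps where the directory keeps its name.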
After sed(1)ing you can check which paths are mentioned in the new dump:
# -f-2 keeps two levels of directories (e.g. the subdirectories of trunk/)
grep -aE '^Node[a-z-]*-path:' new-FOO1.dump \
| cut -d'/' -f-2 | sort | uniq
Fix paths in the rest of the dumps
sed -i -r \
-e 's,^Node-([a-z-]*)path: trunk,Node-\1path: branches/2.0,' \
-e 's,^Node-path: trunk$,Node-path: dummy2,' \
new-FOO2.dump
# trunk does not get translated here, as it is to stay trunk in the new repository
sed -i -r \
-e 's,^Node-path: trunk$,Node-path: dummy2,' \
-e 's,^Node-([a-z-]*)path: branches/stable,Node-\1path: branches/2.1-stable,' \
-e 's,^Node-([a-z-]*)path: branches/xxx,Node-\1path: branches/2.1-xxx,' \
-e 's,^Node-path: branches$,Node-path: dummy3,' \
new-FOO2.1.dump
sed -i -r \
-e 's,^Node-([a-z-]*)path: trunk/FOO,Node-\1path: branches/2.1-bar,' \
new-BAR.dump
Remember to check dumps with that grep(1) command above.
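The check can also be wrapped in a small function so it is easy to run over every translated dump. This is only a sketch: list_prefixes is a made-up name, the demo file holds invented header lines, and unlike the command above it strips the Node-...path: label before cutting.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct two-level path prefixes
# mentioned in Node-path / Node-copyfrom-path headers of a dumpfile.
list_prefixes() {
    grep -aE '^Node-(copyfrom-)?path: ' "$1" \
        | sed -E 's,^Node-(copyfrom-)?path: ,,' \
        | cut -d/ -f1-2 \
        | sort -u
}

# Demo on a throwaway file with made-up header lines.
demo=$(mktemp)
printf '%s\n' \
    'Node-path: branches/2.0/lib/a.pl' \
    'Node-path: branches/2.0/lib/b.pl' \
    'Node-copyfrom-path: branches/2.0/tpl' \
    'Node-path: dummy2' \
    'Node-kind: dir' > "$demo"

prefixes=$(list_prefixes "$demo")
printf '%s\n' "$prefixes"
rm -f "$demo"
```

For the real files, something like for f in new-*.dump; do echo "== $f"; list_prefixes "$f"; done should show at a glance whether any dump still points at a path you did not intend.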
load ‘em up!
Now you should be ready to load all dumpfiles into the new repository.
cat new-FOO1.dump | svnadmin load /home/svn/FOO
cat new-FOO2.dump | svnadmin load /home/svn/FOO
cat new-FOO2.1.dump | svnadmin load /home/svn/FOO
cat new-BAR.dump | svnadmin load /home/svn/FOO
Unfortunately, revisions will be created in the order the dumpfiles are loaded, not sorted by date, so you might want to tweak the loading order.
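One way to pick a loading order is to look at the first svn:date recorded in each dumpfile and load the oldest first. A rough sketch, assuming the usual dump layout in which the svn:date key is followed by a "V <length>" line and then the timestamp; first_date and order_by_date are names made up here, and the sample timestamps are invented. The first date in a dump usually belongs to revision 0, so treat the result as an approximation; it does not interleave revisions from different dumps.

```shell
#!/bin/sh
# Hypothetical helper: print the first svn:date value found in a dumpfile.
# After the "svn:date" key line come a "V <length>" line and the value.
first_date() {
    awk '$0 == "svn:date" { getline; getline; print; exit }' "$1"
}

# Hypothetical helper: print the given dumpfiles, oldest first.
# ISO 8601 timestamps sort correctly as plain text.
order_by_date() {
    for f in "$@"; do
        printf '%s %s\n' "$(first_date "$f")" "$f"
    done | sort | cut -d' ' -f2-
}

# Demo with two fake dumps holding only a date property.
tmp=$(mktemp -d)
printf 'K 8\nsvn:date\nV 27\n2009-03-01T10:00:00.000000Z\nPROPS-END\n' > "$tmp/new-FOO2.dump"
printf 'K 8\nsvn:date\nV 27\n2007-06-15T08:30:00.000000Z\nPROPS-END\n' > "$tmp/new-FOO1.dump"

ordered=$(cd "$tmp" && order_by_date new-FOO2.dump new-FOO1.dump)
printf '%s\n' "$ordered"
rm -rf "$tmp"
```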
You might want to check that everything is going according to plan after every load:
svn ls file:///home/svn/FOO/
Also, be warned that if any error occurs, you have to re-create the repository and start loading again from the first dumpfile. So if the dumpfiles are big, or you are afraid that something may go wrong with one of them, back up /home/svn/FOO after every successful load; then you can tweak the dumpfile that causes problems and load it again.
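The load-then-backup routine could be scripted roughly like this. It is a sketch, not a tested procedure: load_with_backups and the .backup-N naming are made up here, and the backups are taken with svnadmin hotcopy, which is one option among several (a plain cp -a of the repository directory would also do). The SVNADMIN variable exists only so the loop can be tried with a stub instead of the real tool.

```shell
#!/bin/sh
# Sketch: load dumpfiles one by one, keeping a backup after each success.
# SVNADMIN may be overridden, e.g. with a stub function for a dry run.
SVNADMIN=${SVNADMIN:-svnadmin}

# Usage: load_with_backups REPO DUMPFILE...
load_with_backups() {
    repo=$1
    shift
    n=0
    # backup-0 is the empty repository with just the layout in it
    "$SVNADMIN" hotcopy "$repo" "$repo.backup-$n" || return 1
    for dump in "$@"; do
        if "$SVNADMIN" load "$repo" < "$dump"; then
            n=$((n + 1))
            "$SVNADMIN" hotcopy "$repo" "$repo.backup-$n" || return 1
        else
            echo "loading $dump failed; restore $repo from $repo.backup-$n" >&2
            return 1
        fi
    done
}
```

With the real tools it would be called as load_with_backups /home/svn/FOO new-FOO1.dump new-FOO2.dump new-FOO2.1.dump new-BAR.dump. When a load fails, the message names the last good backup, so only the offending dumpfile has to be fixed and loaded again.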
one-liner
Of course, if you are sure that everything is going to work, you may dump and load with one smooth command:
svnadmin dump /home/svn/FOO2.1 \
| svndumpfilter --renumber-revs --drop-empty-revs \
include trunk branches \
| sed -r \
-e 's,^Node-path: trunk$,Node-path: dummy2,' \
-e 's,^Node-([a-z-]*)path: branches/stable,Node-\1path: branches/2.1-stable,' \
-e 's,^Node-([a-z-]*)path: branches/xxx,Node-\1path: branches/2.1-xxx,' \
-e 's,^Node-path: branches$,Node-path: dummy3,' \
| svnadmin load /home/svn/FOO
kill the dummies
svn rm file:///home/svn/FOO/dummy{1,2,3} -m 'repository merged' # we won't be needing these anymore
ta-dah
You are now the proud owner of one ultimate repository containing all the project code. It is far from perfect. To make it so, you’d have to tweak the dumpfiles to remove the dummies completely, change some file-creation operations to file-copy operations, and somehow sort the revisions by date.
Now might be a good time to consider migrating from SVN to Git. [;