Extracting projects from a shared Subversion repository

I recently had the need to migrate a project from a Subversion (SVN) repository that was shared among many other projects and groups to a fresh repository where it would be the first of many projects.

My first instinct was to simply use svnadmin dump to dump the contents of the whole shared repository, transfer the file to the new machine, use svnadmin load to load the data into the new repository, and then delete the projects that I did not want.

The first pass at this created a 2.5GB dump file, something which I did not want to send over the network! After poking around at the options for svnadmin dump, I found that I could shrink this down to about 1GB by using the --deltas flag, which saves space by dumping only the differences between each revision in the repository. 1GB was still pretty big, but we have a fast network, so it wasn’t that painful. I transferred it to the new server, created a new repository, and ran svnadmin load to load the dump into the repository.

All I had to do next was delete the directories from the repository that I didn’t want. I knew this would be a little tricky because I didn’t want to keep any code from those projects around, and simply running svn delete on each directory would have kept the other projects in the repository’s history.

As it turns out, you can’t just remove all traces of something from a Subversion repository. The reasons for this are many, but the short version is that the Subversion developers haven’t gotten around to implementing svn obliterate yet. The current solution is to create a dump with svnadmin dump, and then process that file with a tool called svndumpfilter.

The docs for svndumpfilter are pretty straightforward, so I tried using it on the dump file I had already created, but no matter what I did I kept getting this error:

svndumpfilter: Unsupported dumpfile version: 3

What the docs (and error message) don’t tell you is that svndumpfilter only works on full dump files, and doesn’t support dump files made with the --deltas flag.
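
One way to catch this before running the filter is to look at the first line of the dump file, which records the format version. Here is a minimal sketch, assuming the header has the form "SVN-fs-dump-format-version: N" and that a --deltas dump reports version 3, as the error above suggests. The function names and the dump_file name are made up for illustration:

```shell
# Print the format version recorded on the first line of an SVN dump file.
# Hypothetical helper; assumes a header like "SVN-fs-dump-format-version: 3".
dump_format_version() {
    head -n 1 "$1" | sed -n 's/^SVN-fs-dump-format-version: *\([0-9][0-9]*\).*$/\1/p'
}

# Succeed only if a version was found and it is lower than 3,
# i.e. the dump does not look like it was made with --deltas.
is_filterable_dump() {
    version=$(dump_format_version "$1")
    [ -n "$version" ] && [ "$version" -lt 3 ]
}
```

Something like `is_filterable_dump dump_file || echo "re-dump without --deltas"` would then flag the problem before any time is spent on filtering.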

Long Story Short (Too Late)

In the end, what I wanted was simple, but not obvious. On the original server, I ran:
svnadmin dump /path/to/original/repository | \
    svndumpfilter include my_project \
               --drop-empty-revs \
               --renumber-revs > dump_file

I was then able to copy the resulting (much, much smaller!) dump file to the new machine, blow away and re-create the new repository, and load it with svnadmin load.
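
The steps on the new machine can be sketched roughly as follows. This is only an outline under my own assumptions: the run and reload_repository helpers and the DRY_RUN variable are invented for this example, and the paths are placeholders. The rm -rf really does destroy the existing repository, so the dry-run mode is there to let you check the commands first:

```shell
# Dry-run wrapper: with DRY_RUN set, print the command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Blow away and re-create a repository, then load a filtered dump into it.
# Hypothetical helper; $1 is the repository path, $2 is the dump file.
reload_repository() {
    repo=$1
    dump=$2
    [ -f "$dump" ] || { echo "no such dump file: $dump" >&2; return 1; }
    run rm -rf "$repo" &&
    run svnadmin create "$repo" &&
    # svnadmin load reads the dump from standard input
    # (the redirect is not shown in dry-run output).
    run svnadmin load "$repo" < "$dump"
}
```

Running `DRY_RUN=1; reload_repository /path/to/new/repository dump_file` prints the three commands without touching anything; unset DRY_RUN to run them for real.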

And now maybe you can learn from this example instead of having to figure it out yourself through trial-and-error!



Comments

schmarty said:

Two lessons learned while writing this post:

Turn off ndash conversion in WordPress (it's a pain, but worth it if you put code snippets in blog posts): http://www.linuxscrew.com/2...

Also, get a good code syntax highlighting plugin. People seem to like this one:
http://wordpress.org/extend...

Wayne Goode said:

Thanks. I've been trying to solve this for a couple of hours and the subversion book & websites don't have the key fact, that "svndumpfilter only works on full dump files."

cognitiaclaeves said:

Thanks for posting this. It saved me some time. ( Which is good, because time was already lost generating a dump file with --deltas. ) Much appreciated!

cognitiaclaeves said:

I found that svndumpfilter didn't work for me, after all.

This site explained why:
http://www.chiark.greenend.... ( Part 6, "Except that svndumpfilter doesn't quite work..." )

svndumpfilter2 also had an issue to overcome:
svndumpfilter: Invalid copy source path '/some/old/path'

I posted something on this site explaining how I just bumped the memory allocated to the VM to 2G: ( And the 2G may have been overkill. )
http://stackoverflow.com/qu...