Enhanced Berkeley DB Conversions

While I haven’t done a ton of Berkeley DB conversions for Movable Type, I am certain that if you are still on Berkeley, you should move to a SQL package pronto. This isn’t to say that BDB is a bad product, just that SQL is so much easier to administer with MT that sticking with Berkeley isn’t worth the effort.

While working on the most recent of these projects, I ran into the age-old problem of too much data. The conversion just wouldn’t run all the way through without losing data. Perhaps there are server tweaks that can make it happen. I don’t know.

All I do know is that I could never get all of the entries to convert. Out of 1600 or so entries, I was getting perhaps 600 to convert. So I needed to take action, and I decided to do so by modifying the default MT 3.16 mt-db2sql.cgi script so that I could have a few more options while running the import. Because this is based on the 3.16 script, it may not work for any version other than that one!

The first step was to fix a problem in the duplicate processing code, since I had duplicate (author) data and the 3.16 version didn’t handle it correctly (thanks Brad!). Strangely, the default script also doesn’t convert the PluginData table, so I added it in there too.

Next, I dropped the function that rebuilt all the tables every time that the script runs, as this meant I had to completely drop all the tables from the database just to try again. Immediately after this, I added it back in, because sometimes you just need it. Now it is based on a command-line parameter.

I then added support for selecting which classes (tables) you can import and also for specifying both the number of entries and the offset for reading entries, so that you can load a section of the file at a time (no timeouts, everything works – yay!). Finally I added a touch of extra information to the output, just so you can watch the progress and know how far there is to go.

These options are all command-line, so you will need to run the script from a shell. Your command line should look something like this (run it from your main MT directory):

  perl mt-db2sql.cgi 1 MT::Author

This particular example will create the tables (that’s the 1), then load the MT::Author data. You can also specify multiple classes to load more than one at a time:

  perl mt-db2sql.cgi 0 MT::Blog,MT::Category,MT::Comment

In this example, we are not creating the tables (so now the first parameter is a 0), and then we are loading data into the MT::Blog, MT::Category and MT::Comment tables, respectively. To add other classes, simply add them to the list, separating them with a comma. Do not separate them with spaces, or you will get some odd results.
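If you keep your class names in a list, a small shell helper can build the comma-separated argument so no stray spaces sneak in. This is only a sketch: `join_classes` is a hypothetical helper name, not part of mt-db2sql.cgi, and the final line just prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of mt-db2sql.cgi): join class names with commas.
join_classes() {
    # "$*" joins the arguments using the first character of IFS;
    # the subshell keeps the IFS change from leaking into the caller.
    (IFS=,; printf '%s\n' "$*")
}

# Print the command so it can be checked before it is run by hand.
echo "perl mt-db2sql.cgi 0 $(join_classes MT::Blog MT::Category MT::Comment)"
```

The printed line should match the second example above, with no spaces between the class names.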

Our next example shows the loading of the MT::Entry table (table creation is still skipped because of the 0 in the first parameter), with a limit of 500 entries:

  perl mt-db2sql.cgi 0 MT::Entry 500

The final example shows the same thing, but now we’re loading the next batch of entries:

  perl mt-db2sql.cgi 0 MT::Entry 500 501

The final parameter on the command line is the offset: that is, where to start loading entries. The offset should be 1 more than the number of entries already loaded. So after the first run with a limit of 500, the offset would be 501, and after the second, 1001, then 1501, 2001, etc. If you don’t add that extra 1, you will try to load an entry twice. This may or may not work, but I suspect that even if it does work, it won’t work like you want it to work.
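The offset arithmetic is easy to get wrong by hand, so here is a rough sketch of a shell function that prints the whole series of commands for a given entry count and batch size. `print_batches` is a hypothetical helper, not part of the script, and it only echoes the commands rather than running them. The 1600 comes from the approximate entry count mentioned earlier.

```shell
#!/bin/sh
# Hypothetical helper (not part of mt-db2sql.cgi): print, but do not run,
# the batch commands needed to load $1 entries in batches of $2.
print_batches() {
    total=$1
    limit=$2
    # First batch: no offset argument, as in the earlier example.
    echo "perl mt-db2sql.cgi 0 MT::Entry $limit"
    # Each later batch starts 1 past the entries already loaded.
    offset=$((limit + 1))
    while [ "$offset" -le "$total" ]; do
        echo "perl mt-db2sql.cgi 0 MT::Entry $limit $offset"
        offset=$((offset + limit))
    done
}

print_batches 1600 500
```

For 1600 entries in batches of 500 this prints four commands, with offsets 501, 1001 and 1501 on the last three. Running them one at a time makes it easy to stop if a batch fails.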

You can use the script without specifying any command-line options, and it will load all classes. It is important to note that it will not create the tables unless that first parameter is set to 1. So it could be used to combine multiple Berkeley blogs by running it first with a 1 (but nothing else), then running it again without the 1 (or with the 1 changed to 0) for the next set of data.

As with other such solutions, no warranty is express or implied. I hope it helps, but if it breaks something, you are on your own. This is an advanced process, so if you aren’t familiar with the process being described, it is probably best that you don’t try it (or at least hire someone who is, to do it for you). In any case, please be careful and make sure you have backed up your data.

This updated script is made available for download with the express permission of Six Apart, the copyright holders. Please do not redistribute it and please be careful.

Comments

2 responses to “Enhanced Berkeley DB Conversions”

rene de paula jr

July 30, 2006

You saved my life. 🙂

I was almost giving up… Thank you so much!

lola Lee

September 30, 2005

I couldn’t agree more. My first blog incarnation used a Berkeley DB, and got hosed somehow when something went awry. Try as I could, I could never recover. Had quite a few posts there, as well. I’m using MySQL now.