[ Swingley Development ] [ Jump to Content ]
animals baseball beer blog / photolog books bookbinding me other weather woodworking

Bookpedia to MySQL

Several months ago I looked into different OS X programs to help me manage my book collection. An Internet search revealed that the two main competitors are Bookpedia and Delicious Library. Both have demo versions you can download and try, so you can make up your own mind. I can no longer remember why I chose Bookpedia, but I've been very happy with it, and the developer is very active in improving it, and responding to comments and complaints.

Two notes on my Bookpedia database. First, I've changed the CustomOne field to read Copyright and I put the Copyright year in this field. The four Custom fields appear on the right-most tab during data entry or editing in Bookpedia. The program and procedures listed below should work fine if you haven't changed any of the Custom fields, but if you've changed them to something other than Copyright you will probably need to edit the Bookpedia template and Python script to reflect your changes to the database structure.

Second, and probably more importantly, I've changed the date preference for Bookpedia to only show month and year, and my Python script assumes that this is how the dates are represented in the Bookpedia export file. It shouldn't be too hard to change the Python script to handle whatever date style you've chosen, but I can't do it myself because I don't know what a “normal” Bookpedia date looks like. Since MySQL stores dates as YYYY-MM-DD, the script sets the day to 14 and leaves the year and month as is. That way queries based on date will give the expected results (which isn't always the case in Bookpedia, incidentally).

Why?

One of the reasons I don't typically buy software is that you wind up giving your hard-earned data over to the program. If there's no way to get all of it back out again, and something happens to the company or you decide to switch platforms in the future, your data is lost. That's a pretty high hidden cost to the software you're already spending money on. It's great that Bookpedia lets you export everything into whatever format you want (text, HTML, XML), customized however you like. I probably wouldn't have bought it if I couldn't do that.

Here's how to get everything into MySQL. I'm assuming you've got MySQL and Python installed on the machine you'll be using. fink is a good way to get software like this on your Mac, and that's how I've installed it on my MacBook Pro. The same Python script works on my Linux box too, once I've copied the exported file over.

Export the Bookpedia database

I wrote a template that dumps everything in your database to a UTF-8 encoded text file. Download the Full-UTF8.txt Template template file and put it into the Library/Application\ Support/Bookpedia/Templates/ folder of your home directory. Then open Bookpedia, select your Library or a collection and Export the database using the Full-UTF8.txt template.

The only problem I've encountered is that in older versions (pre 3.4, I think) the export process was very slow and resource intensive. If you're running an older version, be sure to upgrade. Exporting is much faster with the newer versions.

Convert to SQL

Grab the Bookpedia to MySQL Python script and run the exported text file through it to produce SQL. To do that, open up a Terminal window (which is in /Applications/Utilities if you've never used it before). I'll assume you downloaded the Python script and saved the export file on the Desktop. If not, you'll need to add the path to these files in the following command.

$ chmod 755 bookpedia_to_mysql.py
$ ./bookpedia_to_mysql.py < bookpedia_db.txt > bookpedia_db.sql

The first command changes the file permissions to make the Python script executable. The second sends the Bookpedia export file to the script, and sends the result of the script to a file called bookpedia_db.sql.

If you had to export your database into multiple text files, you need to run all of them at once with:

$ cat db1.txt db2.txt db3.txt (etc.) | ./bookpedia_to_mysql.py > bookpedia_db.sql

Set up the database

All of the table creation statments are already in the output SQL file, but you will need to create the database itself. To do that:

$ mysql -p
mysql> create database bookpedia;
mysql> \q

You might want to set up another user for the database if you're planning on interacting with the database using PHP or some other scripting language (choose a sensible username and password, of course).

$ mysql -p
mysql> grant select on bookpedia.* to 'wwwuser'@'localhost' identified by 'wwwuserpassword';
mysql> flush privileges;
mysql> \q

Fill the database

Now we will put the data from the SQL file into the database we just created. The script is designed to drop and create all the tables that are part of the database, so every time you run the Python script and import it into the database, you'll be re-creating everything from scratch. This sounds really inefficient, but on my MacBook and Linux boxen, the creation and import of the SQL is very fast.

$ mysql -p bookpedia < bookpedia_db.sql

That's it! The whole thing can be done in one line with:

$ cat db1.txt db2.txt (etc.) | ./bookpedia_to_mysql.py | mysql -p bookpedia

Here's a sample query to get you started. It's the query that the recent page runs on my database:

mysql> SELECT b.title, b.copyright, CONCAT_WS(' ', a.first, a.middle, a.last, a.suffix), p.publisher, b.pages, b.isbn, 
    DATE_FORMAT(b.last_read, '%b %Y'), DATE_FORMAT(b.purchased_on, '%b %Y')
    FROM books b, authors a, publishers p, booksauthors j, formats f
    WHERE b.book_id=j.book_id AND 
        j.author_id=a.author_id AND
        b.publisher_id=p.publisher_id AND
        j.rank=1 AND 
        b.format_id=f.format_id AND 
        (b.last_read > '2007-01-01' OR b.purchased_on > '2007-01-01')
    ORDER BY b.last_read, b.purchased_on, a.last, b.copyright;

As you can see by the query, the database is quite complicated. I made a PDF that shows the structure of the database. If you aren't all that familiar with SQL and normalization, you may want to consult an introductory SQL book. SQL is very cool, but it's a very different way of thinking about data than most people are used to.

The Future

I may work on a program that updates the existing MySQL database, rather than recreating it from scratch each time. But Bookpedia and MySQL are so fast, that I haven't really needed to consider doing the database updating incrementally.

Other Book pages: