Tim Spalding - Future of Librarians Interview

Tim Spalding is the creator of LibraryThing, but for readers who may not know you, why don't you tell us a bit about yourself and your background?

Sure, well, there isn't that much to say about me, personally. LibraryThing is my first endeavor that's gotten any notice at all. I worked for Houghton Mifflin for a while, and I was a graduate student in Greek and Latin at the University of Michigan.

Books are very central to my life. I ended up marrying a novelist, which is a big mistake if you want to keep your library down. I've been cataloging books since I was a kid: I had a FileMaker database with my books in it, I played with various other solutions, and I thought it would be fun. I didn't assume I'd make no money from LibraryThing at all, but I figured it would be a very small hobby project, not turn into a company.

I started LibraryThing in August 2005, while I was doing freelance web design and development work. It took off within the first week or so and has been growing quickly ever since. When it started, it was pretty much only cataloging. The inflection point came when I realized people were starting to use it socially. They were sending each other the URLs of their libraries and commenting on each other's libraries, so I just started adding social features, and I think that's really what's provided a lot of the thrust.

Who were these people?

Extreme book nerds like me, and a lot of academics. Now it's slid down to the point where there are lots of people on LibraryThing who are there to put in some books that they are reading right now, and then have conversations about them or get recommendations. So it's not necessarily the guy with 3,000 books stashed away in his apartment.

LibraryThing is doing something different from a standard social network, where you're connected to people based upon friends. That's a very sort of binary way of seeing the world: you're my friend, you're not my friend, maybe you're my friend but I'm not your friend. The idea behind LibraryThing is that you're connected to people in much more complex ways. We recently added "friends" and a category we call "interesting libraries," so you can track book reviews, ratings and so forth from people you know or whose libraries you want to follow. But the books are the core of it. The way that you're connected to people is primarily through books. If you and I share two books, it's not very interesting, but if you and I share 40 books, and they happen to be 40 of our more obscure books, then chances are we share some sort of deep connection of interest. So the content, the books that people put in, and the social dynamic are intricately related.
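
The point that 40 shared obscure books say more than two shared bestsellers can be sketched as a rarity-weighted overlap score. This is only an illustration, not LibraryThing's actual method: the function name `shared_interest`, the logarithmic weighting, and the sample members and titles are all assumptions made up for the example.

```python
import math
from collections import Counter

def shared_interest(lib_a, lib_b, owner_counts, total_members):
    """Score two libraries by their shared books, weighting rare books more."""
    score = 0.0
    for book in lib_a & lib_b:
        # A book everyone owns contributes log(1) = 0; a rare book contributes more.
        score += math.log(total_members / owner_counts[book])
    return score

# Hypothetical sample data: four members and their libraries.
libraries = {
    "ann": {"Harry Potter", "Obscure Monograph A", "Obscure Monograph B"},
    "bob": {"Harry Potter", "Obscure Monograph A", "Obscure Monograph B"},
    "cat": {"Harry Potter", "A Wrinkle in Time"},
    "dan": {"Harry Potter"},
}
owner_counts = Counter(book for lib in libraries.values() for book in lib)
total = len(libraries)

ann_bob = shared_interest(libraries["ann"], libraries["bob"], owner_counts, total)
ann_cat = shared_interest(libraries["ann"], libraries["cat"], owner_counts, total)
print(ann_bob > ann_cat)  # sharing two rare books outweighs sharing one ubiquitous book
```

Here ann and cat share only a book that every member owns, so their score is zero, while ann and bob share two books that few others hold.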

Marshall McLuhan once said "the future of the book is the blurb," and it sort of seems like your site is proving his prediction right. What is LibraryThing doing to books?

You can see LibraryThing as doing something both very new and very old. I think that 50 years ago when you got together at a party with a bunch of fairly smart people, chances were that one of them would reference a novel. These days it's invariably a movie. You need to know somebody before you can start talking about shared novels. And you can't really assume anyone's read the same book except for Harry Potter. I think LibraryThing takes that idea and makes it possible to socialize around books which you wouldn't otherwise know you have in common. It's awfully like graduate school actually, where you could pretty much talk about certain books with anyone. That's just not true in the real world, and so it's a cool thing to be able to do online.

As something new, LibraryThing is doing a number of interesting things with book data, such as tags - you're even going to see a feature for tracking marginalia. It's taking things which are hard to make social and it's making them social in a huge way.

Can you talk about the recommendation engines?

Sure. If you look at a page that shows you a particular book, you'll see a list of suggestions, and one of the links is to a longer list of suggestions. Right now I think I've got five recommendation algorithms, and there's some very interesting math involved in making these things. Amazon is a good example of one. LibraryThing rests upon very good data. One of the sources is the tags, which are often very good, but there are ways that tags can go wrong. The classic case is the tag "leather", which can be about the binding of a book, leather making, or a type of erotica; there's no way to tell the difference without very complicated algorithms. A lot of it has to do with just holding patterns. If you and I share books, then the books that only one of us has ought to be interesting to the other. If you do some fairly standard statistics on that you get good data out. Amazon does the same thing, but when you buy a book on Amazon you might buy it for a co-worker, or your wife, and the books you buy on Amazon are not a good sampling of everything you've ever bought. LibraryThing, by being in some way a representation of your whole library, or even just what you're reading now, is a much truer representation of who you are.
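
The "if you and I share books" idea can be sketched as a simple co-occurrence count. This is an assumed, simplified illustration rather than any of LibraryThing's five algorithms: the `recommend` function, its overlap-based scoring, and the sample members are hypothetical.

```python
from collections import Counter

def recommend(member, libraries, top_n=3):
    """Suggest books held by members whose libraries overlap with this member's."""
    mine = libraries[member]
    scores = Counter()
    for other, theirs in libraries.items():
        if other == member:
            continue
        overlap = len(mine & theirs)
        if overlap == 0:
            continue  # nothing in common, so this library tells us nothing
        for book in theirs - mine:
            # The more books we share, the more weight the other member's extras get.
            scores[book] += overlap
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [book for book, _ in ranked[:top_n]]

# Hypothetical sample libraries.
libraries = {
    "tim": {"The Hobbit", "Harry Potter"},
    "amy": {"The Hobbit", "Harry Potter", "A Wrinkle in Time"},
    "raj": {"Harry Potter", "The Dark Is Rising"},
    "lee": {"Cookery Basics"},
}
print(recommend("tim", libraries))
```

Because the scores come from what people hold rather than what sold this week, a title with modest sales can still rank well.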

It's also much better for the so-called long tail. I have books in my library now which are not heavily sold but they're still good. A classic example on Amazon is when you type in "Harry Potter" and the five recommendations are the other five Harry Potters. Which makes sense. LibraryThing throttles that so it only gives you two Harry Potters and then it gives you things like A Wrinkle in Time. Well, A Wrinkle in Time is a great book, but it's not selling really well this week. But it's in people's libraries. So LibraryThing sees it and says, well, people who like Harry Potter are going to like A Wrinkle in Time and Susan Cooper books and whatever else.
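
The throttling described here can be sketched as a cap on how many titles from one series survive in a ranked list. The function `throttle_series`, the `series_of` lookup, and the placeholder titles are assumptions for illustration; the interview does not say how LibraryThing identifies a series.

```python
def throttle_series(ranked, series_of, max_per_series=2):
    """Keep the ranking order but allow at most max_per_series books per series."""
    kept = []
    seen = {}
    for book in ranked:
        series = series_of.get(book)  # None means the book is not in a known series
        if series is not None:
            if seen.get(series, 0) >= max_per_series:
                continue  # this series already has its quota of slots
            seen[series] = seen.get(series, 0) + 1
        kept.append(book)
    return kept

# Hypothetical ranked suggestions for someone looking at the first Harry Potter book.
ranked = [
    "Harry Potter 2", "Harry Potter 3", "Harry Potter 4",
    "A Wrinkle in Time", "Harry Potter 5", "The Dark Is Rising",
]
series_of = {f"Harry Potter {n}": "Harry Potter" for n in range(2, 8)}
print(throttle_series(ranked, series_of))
```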

LibraryThing uses the Z39.50 protocol. Is that a gimmick like listening to music on vinyl records, or is it really the best thing out there?

Z39.50 is a protocol that libraries have been using for quite some time to exchange data with each other, but the records it returns are fairly difficult to parse; you have to know how to parse them. So with LibraryThing, I figured out how to make that work. It's not rocket science, but it takes a little bit of effort and a lot of tolerance of bad old technologies. Many of LibraryThing's competitors rest on the fact that there's this Amazon API that queries Amazon and comes back with beautifully formatted XML. If you're not really deep into books that's good enough, but if you own books that are out of print it's not going to be good enough for you. If you care about library data, if you want subjects, Dewey Decimal numbers and so on, if you want really high-quality book data, you have to go to libraries to get it. So LibraryThing sort of goes the extra mile.
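
To give a sense of why such records take effort to parse, here is a minimal sketch that assumes the server returns MARC 21 records, a common record format on Z39.50 servers (the interview does not name the format). It decodes the leader, the directory, and the subfields by byte offsets. The helper names and the sample record are hypothetical, and real records need far more error handling and character-set care than this.

```python
FIELD_END, RECORD_END, SUBFIELD = b"\x1e", b"\x1d", b"\x1f"

def build_record(fields):
    """Assemble a tiny MARC 21 record from (tag, raw field bytes) pairs, for testing."""
    directory, body = b"", b""
    for tag, data in fields:
        chunk = data + FIELD_END
        # Directory entry: tag (3 chars), field length (4), starting offset (5).
        directory += f"{tag}{len(chunk):04d}{len(body):05d}".encode("ascii")
        body += chunk
    base = 24 + len(directory) + 1   # leader + directory + directory terminator
    total = base + len(body) + 1     # plus the record terminator
    leader = f"{total:05d}nam  22{base:05d}   4500".encode("ascii")
    return leader + directory + FIELD_END + body + RECORD_END

def parse_record(raw):
    """Return {tag: [{subfield code: value}]} for the data fields of one record."""
    base = int(raw[12:17])           # leader bytes 12-16: where field data begins
    directory = raw[24:base - 1]     # everything between leader and terminator
    fields = {}
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[:3].decode("ascii")
        length, offset = int(entry[3:7]), int(entry[7:12])
        data = raw[base + offset: base + offset + length - 1]  # drop terminator
        if tag < "010":
            continue                 # control fields carry no subfields
        subfields = {}
        for part in data[2:].split(SUBFIELD)[1:]:  # skip the two indicator bytes
            subfields[part[:1].decode("ascii")] = part[1:].decode("utf-8")
        fields.setdefault(tag, []).append(subfields)
    return fields

# Sample record: field 082 holds a Dewey number, field 245 the title.
sample = build_record([
    ("082", b"00" + SUBFIELD + b"a823.912"),
    ("245", b"14" + SUBFIELD + b"aThe Hobbit"),
])
record = parse_record(sample)
print(record["245"][0]["a"], record["082"][0]["a"])
```

Compared with reading a well-formed XML response, everything here depends on fixed positions and control characters, which is the kind of "bad old technology" tolerance the answer describes.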

At home I've got my bookshelf arranged by color. How can I do that on LibraryThing?

You'd have to tag it. People have suggested that we color-analyze the covers and allow people to display their shelves like that. I think that'd be a fun feature.

By using library technology and giving really personalized recommendations, could this take the place of librarians?

In terms of the recommendation thing, librarians don't present themselves as knowing the entire universe of books. Librarians have long relied on readers' advisory websites, journals, other librarians, patrons they respect and so forth. LibraryThing is just one more source in the mix there. I don't think it obviates the need for a librarian any more than Amazon or anything else does. If there's a crisis in librarianship I think it's not there.

But there are a few ways in which LibraryThing does librarian-like tasks. There is a feature which disambiguates editions: you can put all the different editions of a particular work together, and the users decide whether a book is or is not also an edition of The Hobbit. Librarians have some ways of doing that, but LibraryThing is a very good solution because it's drawing upon the collective intelligence of thousands of people. LibraryThing is doing some interesting things with statistics and user-generated content, a phrase that I absolutely hate, but there's no better one.

After I came up with related tags, I decided to look at which subjects relate to the tags the most, and displayed Library of Congress subject headings in a statistical relevancy order. It doesn't appear that anyone's ever done that before.
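
One way to picture ordering subject headings by statistical relevance to a tag is a "lift" ratio: how much more common a subject is among books carrying the tag than among all books. This is an assumed measure chosen for illustration; the interview does not say which statistic LibraryThing uses, and the function name, tags, and subject strings below are made up.

```python
from collections import Counter

def subjects_for_tag(tag, books):
    """Rank subject headings by how over-represented they are among tagged books."""
    tagged = [b for b in books if tag in b["tags"]]
    overall = Counter(s for b in books for s in b["subjects"])
    within = Counter(s for b in tagged for s in b["subjects"])
    lift = {
        # Share of tagged books with this subject, divided by its share of all books.
        s: (within[s] / len(tagged)) / (overall[s] / len(books))
        for s in within
    }
    return sorted(lift, key=lambda s: (-lift[s], s))

# Hypothetical catalog: every book is science fiction, only some involve hackers.
books = [
    {"tags": {"cyberpunk"}, "subjects": {"Hackers -- Fiction", "Science fiction"}},
    {"tags": {"cyberpunk"}, "subjects": {"Hackers -- Fiction", "Science fiction"}},
    {"tags": {"space opera"}, "subjects": {"Science fiction"}},
    {"tags": {"time travel"}, "subjects": {"Science fiction"}},
]
print(subjects_for_tag("cyberpunk", books))
```

A heading that applies to nearly every book scores close to 1 and sinks, while a heading concentrated among the tagged books rises to the top.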

Can you talk more about that?

Sure. With tag classification, you build up this enormous database of what people think about books. So you can take a particular tag like "chick lit", for example, and it will spit back at you a list of books that are tagged "chick lit" in descending order of relevance. It's a very good list. Also "cyberpunk" and various others can be better than Library of Congress subject headings, depending on what they're for, and how they're made. LibraryThing was born digital, so it has a concept of relevance that a lot of library classifications don't have. In the Library of Congress, for example, there's a category called "Man-Woman Relationships". The book either is, or is not, about man-woman relationships. But of course 80% of all books in Western literature are in some sense about man-woman relationships. LibraryThing goes beyond a lot of sites in that we have tagging, and we mix the tagging up with controlled subject headings like the Library of Congress's, so you can see for a given tag what the most relevant subjects are, and vice versa. There are advantages and drawbacks to both. On one end you have the LibraryThing tag "leather", which is highly ambiguous, but on the other side you've got the Library of Congress subject heading "Cookery", which is actually a good subject heading; it's just that no one knows to type "Cookery" into a catalog. There are problems and benefits to both approaches; I don't think tags are going to get rid of classification any more than TV got rid of radio.

We recently introduced the concept of the "tagmash," static pages for the union of two or more tags. So, for example, you can find out what books are tagged "wwii" and "france". This gets past one problem with tags, that people don't tag that verbosely; they don't usually tag things with "france during World War Two." It closes some of the gap between tags and formal, hierarchical subject headings.
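
A tagmash can be sketched as a set operation over each book's tags. The "wwii" and "france" example reads as books that carry every listed tag, so this sketch tests set containment (an intersection of the tags' book lists). The `tagmash` function and the sample tag assignments are assumptions, not LibraryThing's implementation.

```python
def tagmash(tags, book_tags):
    """Return titles that carry every one of the requested tags."""
    wanted = set(tags)
    # `wanted <= tagged` is true when all wanted tags appear on the book.
    return sorted(title for title, tagged in book_tags.items() if wanted <= tagged)

# Hypothetical tag assignments.
book_tags = {
    "Is Paris Burning?": {"wwii", "france", "history"},
    "Band of Brothers": {"wwii", "history"},
    "A Year in Provence": {"france", "travel"},
}
print(tagmash(["wwii", "france"], book_tags))
```

Combining two short tags this way stands in for the longer phrase that few people would ever type as a single tag.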

We also recently made LibraryThing recommendations and tag-based browsing available to libraries inside their current OPACs. The program is called "LibraryThing for Libraries". Through some magical JavaScript, it works within any OPAC, and it's pretty cheap to do. Four libraries are already live with the product. We've had trouble keeping up with demand, but will shortly hire a dedicated library programmer to get serious about expanding it.

Do you see LibraryThing as a vanguard, shaping the future of how libraries work?

I think it's pushed forward the idea of having users in the mix. Other people have done it, too, but LibraryThing is certainly the most prominent example of people tagging and classifying books online. Amazon has had ratings and recommendations for a million years, but that was always in the service of commerce. LibraryThing takes more of a booklover's approach to it. Library science is so binary and anti-statistical, so just seeing things like the relevancy feature on a website has inspired people to think about it a little bit differently.

The thing that really attracts librarians is that LibraryThing takes the library data seriously. There's a lot of really great data in library catalogs which hasn't gotten out there. If you looked back 10-15 years and noticed this thing called the Internet, you would probably have assumed that if you typed "The Hobbit" into a search engine, the Library of Congress would show up near the top. But no! No library is near the top! They are hundreds of results down, and it's because librarians didn't get their data out there; they didn't show people that their data was good. Organizations like OCLC have a vested interest in preventing library data from getting out there, while organizations like Amazon have gotten really hip to the idea of putting their data out there as a way to sell more. LibraryThing is taking this data seriously, doing statistical analysis on it, taking all of the records for The Hobbit from 50 different libraries and figuring out what that data means, and what you can smoosh it together to mean. That's something that should have happened 15 years ago, and we're just starting to play with it now. Whether inspired directly or just part of a general upsurge, people are starting to do stuff with library data now which is really inspiring.

Thanks so much for taking the time to talk with us, Tim. You can check out LibraryThing.com to open a free account, read the official blog for LibraryThing news, or keep up with the intellectual LIS issues that LibraryThing raises at the Thingology blog.

