eBook Collectors of the World Unite!

 

 

This site is about the ins and outs of collecting ebooks, electronic books and audiobooks, and how to share them. The why of collecting ebooks will vary: maybe you plan on sailing around the world and would like a portable library; perhaps you've heard the phrase "overshoot and collapse" and foresee a future where there may be no public libraries you can use, no Amazon.com to order from, or even, horror of horrors, no internet (at least not for you). Collecting and preserving digital knowledge now, while we can, could be a good thing—good for you now and in the future, good for your grandchildren, good for humanity. Or it could be a bad thing—bad if you are merely seeking to obtain books you would otherwise buy, thus reducing the incentive for new books to be written and published. Needless to say, this site is about the former and not the latter.

No illegal material will be found on this site, but you will be told how to obtain material through both legal and illegal means. The justification goes something like this: For there to be a crime, there needs to be a victim. If you download a book instead of buying it, then you do real harm and your actions are reprehensible, not to mention criminal. It is unclear, however, whether someone in possession of a million-ebook archive, some of which is (currently) copyrighted, is a criminal. The law says they are. Period. No ifs, ands, or buts. If you agree, then hit the back button now. The counterargument is that while authors want to be, and should be, compensated for their work, authors also want their work to survive, to exist in the future (and be read). In this view, individual collector/archivists seek to be benefactors to future generations, who may not regard them as criminals. It's not that you will be the last human on earth with the only surviving copy of the works of Shakespeare on a disk, but you may well be the only source in your immediate community; the future may resemble the past, when access to knowledge was quite limited. For now the trend is still towards unlimited access, and hopefully that trend will continue forever, but it would be naive to assume so. If a few thousand people in the world were to collect as many ebooks now as they could, that would not be too many, and may not be nearly enough.

With the possible exception of Bill Gates, having a personal library of a million paper books is an impossibility, even if there were enough trees in the world. Possessing a million ebooks, however, is not: a shoe box of DVDs would suffice. The more people who possess something like "the sum total of human knowledge" the better, or at least it won't hurt; there are worse things you could be doing than collecting digital "printed matter."

 

How to Obtain Material

Numerous internet sites distribute ebooks: The University of Virginia has 2,100 classics available for free download. Project Gutenberg offers over 10 times that and counting. There are many other sites distributing free and legal ebooks in various formats. While every internet user could go to these sites and download thousands, even tens of thousands of books, it would be a great waste of time for everyone to do so one book at a time. Far better would be for individuals working alone, or collaboratively in a small group, to collect ebooks in an area of interest, organize the collection, and make it—the entire collection—available for downloading. The way to do this is through peer to peer (P2P) file sharing. There is no cost involved other than the price of the effort needed to learn how to do it. P2P file sharing is also the key to obtaining vast amounts of material, both legal and illegal, to create a collection. If you want to limit your collection to only legally obtained material (not a problem if you're collecting ancient Greek literature), then that's up to you.

 

A Really Brief Introduction to P2P File Sharing

There are vast amounts of how-to information available for the googling—way too much. The beginner is easily overwhelmed and only the geekiest seem to survive to figure it out. Hardcore geeks already know; lesser geeks need at least a pointing finger:

1. All files that are shared exist on your and other people's computers (your peers). You want a certain file on your computer. You search a P2P network using special software or web sites, find a file, and try to download it (using special software). As you acquire the file, others begin to acquire parts of it from you. While you have only part of the file you are known as a peer or leecher. When you have all of the file you become a seeder (a peer who has the complete file). For the system to work you need to continue to allow others to "leech" from you (now the seeder) until you have uploaded somewhat more than the original file size. You could stop others from downloading from you as soon as you get the complete file, but doing so is frowned upon, can be detected, and you may be punished by having your future download speed reduced. The more people seeding a file, the faster it downloads for everyone (the swarm) who is trying to get it. Files that have few seeders should be seeded (by you) until there are enough other seeders. When there are no longer any seeders, the file dies. A file available now may not be in the future. If you have a valuable file, you should check every so often to make sure there are enough seeders, and seed it yourself if not.

2. There are several networks over which files are shared. Here's where the pointing finger can help.

BitTorrent: This is where large files are shared. This is where you will find large collections and share the ones you make. Download and install uTorrent. This is free software. Ignore the fact that there are dozens of competing alternatives; this one is geek approved. You may have issues getting it to work with your firewall and router, if any. There are tutorials online to help. When you think uTorrent might be working, go to BTjunkie. This is a web site that indexes hundreds of thousands of torrents. Torrents are small files that allow uTorrent to connect with peers having the file you want; that's all you need to know. If you can't find a file you want on BTjunkie, there are dozens of other torrent search engines: Mininova, Pirate Bay, TorrentScan, Isohunt, Demonoid, and so on. Each has features others lack, and some require membership.

The eDonkey/Kad networks: Download and install eMule. This is free software, and again ignore all the others. There is no web site indexing shared files. You search the networks from within eMule. There are several million users. You will find medium to small files, and are more likely to find rare files. Downloading tends to be slow compared to BitTorrent, but sometimes you'll want to search here for files not otherwise available.

Gnutella network: Download and install Frostwire; it's free. On this network you'll find small files, mostly mp3s, and plenty of viruses (try searching for "abcdefg" and note the files that put whatever you search for in the title; avoid these files, as well as .exe and .zip files). When Frostwire asks if you want it to be your BitTorrent client, say no (it will still make itself the default, and you'll have to go to 'tool, option, advanced, file associations' to change it and then reassociate .torrents with uTorrent; alternatively, install Frostwire first, then uTorrent). Again, you'll occasionally want to search here for files that may not be available elsewhere. While compiling a collection of Edgar Allan Poe audiobooks I searched with Frostwire and eMule for individual files, but 99% of the time I use uTorrent.

3. Just say no. But if you do download files you know or suspect may be copyrighted, you'll want to download and install PeerBlock first. PeerBlock (again free) is sort of like using a condom to practice "safe" file sharing. Like a condom, it's no guarantee you won't get a threatening email (or conceivably worse) from your internet provider telling you to stop downloading copyrighted material, but it will go a long way towards protecting you from harm. Your risk becomes very slight, as about 80% of file sharers don't know enough to use PeerBlock and so are far easier targets for the lawyers to go after.
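The seeding etiquette in item 1 (keep sharing until you have uploaded somewhat more than the file's size, and longer if seeders are scarce) boils down to a simple check. Here is a minimal sketch in Python; the function names, the 1.0 ratio and the cutoff of five seeders are illustrative assumptions, not settings taken from uTorrent or any other client.

```python
def share_ratio(uploaded_bytes: int, file_size_bytes: int) -> float:
    """Bytes uploaded to the swarm divided by the size of the complete file."""
    if file_size_bytes <= 0:
        raise ValueError("file_size_bytes must be positive")
    return uploaded_bytes / file_size_bytes


def should_keep_seeding(uploaded_bytes: int, file_size_bytes: int,
                        seeders: int, min_ratio: float = 1.0,
                        min_seeders: int = 5) -> bool:
    """Keep seeding until you have given back more than you took,
    or for as long as the file has too few seeders to survive.

    min_ratio and min_seeders are hypothetical thresholds chosen
    for illustration only.
    """
    gave_back_enough = share_ratio(uploaded_bytes, file_size_bytes) > min_ratio
    swarm_is_healthy = seeders >= min_seeders
    return not (gave_back_enough and swarm_is_healthy)
```

For example, `should_keep_seeding(150, 100, seeders=2)` is true: you have uploaded one and a half times the file, but with only two seeders the file is still at risk of dying.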

 

What to Collect

If you have a special interest, then, of course, collect books (essays, articles, etc.) related to your expertise. There is also a great need for well organized (accessible) general collections. Maybe you'd like to do a collection of how-to books. You don't need to be an expert in how to do everything, you just need to find and organize the material.

One worthy project someone should undertake is to make a decent collection of great books; most are not copyrighted and are available. There are numerous lists of great/best/must-read books that people have come up with. What someone needs to do is collect all the lists, combine them, then collect the actual books. Poetry and short story collections are other possibilities. For fiction, organize by author: last name then title. Better still, determine the year each work was written, and name the files by year, last name, title. For example, within a folder named "Twain, Mark" you would have "1884 Twain, Huckleberry Finn.pdf" as the file name. This convention would make it easy to sort the collection chronologically or alphabetically by author or title. You can collect for quantity/completeness, or for quality.
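The naming convention above is easy to apply by script. The sketch below is in Python; the function name book_path and its parameters are made up for illustration, and it assumes the title contains no characters your file system forbids.

```python
from pathlib import PurePosixPath


def book_path(first: str, last: str, year: int, title: str,
              ext: str = "pdf") -> PurePosixPath:
    """Build 'Last, First/YEAR Last, Title.ext' for one work."""
    folder = f"{last}, {first}"
    filename = f"{year} {last}, {title}.{ext}"
    return PurePosixPath(folder) / filename


print(book_path("Mark", "Twain", 1884, "Huckleberry Finn"))
# Twain, Mark/1884 Twain, Huckleberry Finn.pdf

# Because every name starts with a four-digit year, a plain
# alphabetical sort of the file names is also a chronological sort.
names = ["1884 Twain, Huckleberry Finn.pdf", "1876 Twain, Tom Sawyer.pdf"]
print(sorted(names))
# ['1876 Twain, Tom Sawyer.pdf', '1884 Twain, Huckleberry Finn.pdf']
```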

There are many textbooks available. Here you would organize by subject, not author. Individuals could work on collecting textbooks on a particular topic (chemistry, Japanese, calculus, grammar, dictionaries...), and could also work cooperatively by using a similar format and avoiding overlapping categories.

 

eBook Readers

Aside from reading on your laptop or desktop (as well as PDAs and some cell phones), there are dedicated ebook readers that use power only when turning pages and can be used for days, or even up to a month, before needing to be recharged (by a small solar panel if need be). Bigger and better are to come (Plastic Logic), while the hottest at present is Amazon's Kindle, which connects wirelessly to download books and even do limited surfing of text web sites like Wikipedia or Google. A couple of years ago ebook readers were about $300-$400; now that they're catching on, prices have dropped to the $100-$250 range. Some growth in technology means less resource consumption. Soon you will be able to subscribe to any newspaper or magazine and buy any new book (or, for students, any textbook) and have them all on your pencil-thin ebook reader, and if you can't read the screen, no problem, the ebook reader will read to you (Kindle). This will reduce both costs and resource consumption. Going digital and developing low-power, affordable, and durable devices to access all the 1s and 0s (text, audio, video) should be a high-value goal for our civilization.

 

eBook File Formats

There are currently many, perhaps way too many, ebook file formats, with no clear winner. Plain text files (.txt) lack all formatting and are the most painful to read. Well-formatted text goes a long way towards making reading a pleasure (try reading a poetry collection in a fixed-width font where every line has been pushed flush left: the horror, the horror). So a close second to having a book is having it formatted for readability. If the first goal of a collector is to obtain a text, the second would have to be reformatting it as needed, and then perhaps converting to another ebook file format.

Adobe Reader's Portable Document Format (.pdf) has been around a long time and is the most common. This format is geared towards formatting text for printing (for output to a printer) and so is rigid in layout: text does not reflow to fit different screen sizes, which range from wall-sized displays to credit-card-sized PDAs. PDF files look fine on a desktop, but don't adapt well to small screens. As ebook readers get larger, the inflexibility of the PDF format will be less objectionable, but still far from ideal.

Basically what is needed is a flexible, flowable format (like .html) that is compressed to save space. Microsoft Reader's files (.lit) are just that, and are fairly common, but this format has not been widely supported; only a few ebook readers allow it, so too bad. The format that is looking like a winner is the Mobipocket (.prc and .mobi) format. Thousands of books and other documents have already been converted to this format for use on Palm PDAs, and the Kindle supports it (but Sony ebook readers do not).

The Mobipocket Reader is a very nice desktop reader (free) that allows drag and drop conversion from other file formats (.txt, .pdf, .html, .doc, .rtf) with reasonably good results (indents are lost, so poetry comes out left justified). All you do is drop a file in the view area and it soon appears on screen. More important, if you go to My eBooks in My Documents you'll find the .prc version. You can even select a whole bunch of files at once and drop them for batch conversion. Unfortunately I have found the conversions done by the Reader are "quick and dirty": fast, but the files are bloated (12 MB of .html/pictures turned into a 20 MB .prc file) and the sizing of graphics is only fair. Also, if there are several linked html files it does not combine them. For .pdf conversion, this may be your best bet.

Mobipocket does offer Mobipocket Creator (free), which does a better job (the 20 MB file became 12 MB and the formatting was better). The Creator, however, doesn't do batch conversions, so every conversion is a multi-step process. It is also a bit quirky: I found that to convert .html with images, the name of the folder containing the html must exactly match the name of the html file (after you Import and Build an .html file, a folder is created; if you copy every file associated with the book into it before doing Build again, then it works). I converted a 7 part, richly illustrated, html version of Mark Twain's 1880 "A Tramp Abroad" from Project Gutenberg and had to Add html files 2-7 manually. The result is vastly more readable than a plain text version (download here). Moral: collect the best formatted version of a book whenever possible (usually html). Mobigen is a command line converter that may be able to do batch conversions; I haven't tried it yet. When the Creator is used to convert PDF files, an .html version is created and left in the folder along with the .prc. This is good, as .pdf conversions tend to be only fair. What you can do is edit the .html version and then convert it to .prc with better formatting.

Recently Calibre (kal-iber) has become the ebook manager of choice for many. It's great for managing your library as well as converting between various formats. It's like iTunes for the iPod. Unfortunately it names each book's folder by the author's first name then last name (rather than last name first), and many collectors have been uploading their Calibre libraries that way.
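Calibre also installs a command-line tool, ebook-convert, which takes an input file and an output file and picks the formats from the extensions. That makes batch conversion scriptable. The following is only a rough Python sketch: the function names and the folder layout are assumptions for illustration, it presumes ebook-convert is on your PATH, and it has not been tuned for any particular reader.

```python
import subprocess
from pathlib import Path


def build_commands(folder, src_ext=".html", dst_ext=".mobi"):
    """Return one ebook-convert command per source file in `folder`
    that does not already have a converted copy beside it."""
    commands = []
    for src in sorted(Path(folder).glob(f"*{src_ext}")):
        dst = src.with_suffix(dst_ext)
        if not dst.exists():  # skip books already converted
            commands.append(["ebook-convert", str(src), str(dst)])
    return commands


def convert_all(folder, src_ext=".html", dst_ext=".mobi"):
    """Run the conversions one at a time, stopping on the first failure."""
    for cmd in build_commands(folder, src_ext, dst_ext):
        subprocess.run(cmd, check=True)
```

Calling `build_commands("Twain, Mark")` first lets you inspect what would be converted before `convert_all` actually runs anything. Keeping the original .html next to the converted file preserves the editable version.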

I am tempted to convert everything to .mobi, but not all ebook readers support it, so it's not a clear winner. Another problem is that I haven't found a batch converter to convert .prc files to .html (or another format) should it become necessary in the future to do so. If you need to convert .pdf files, you would do well to drop them on Mobipocket Reader for conversion. Until all ebook readers support one format, it looks like we're stuck with collecting multiple formats. It is probably best to collect .html versions of documents when possible. These are uncompressed and may have a bunch of graphic files associated with them that should go together in a folder, but the text and graphics are editable, and convertible to anything else, including any format that may come along in the future.