BDDBot is a web robot, search engine, and web server written entirely in Java(TM). Tim Macinta wrote it for his book (co-authored with Wes Sonnenreich), Web Developer's Guide to Search Engines, published by Wiley Publishing. It serves as the example for a chapter on how to write your own search engine, and as such it is very simplistic. While not as heavy-duty as other free search engines such as ht://Dig, the BDDBot offers the following advantages:
- Its simplicity makes it a good learning tool for how search engines work. The aforementioned book provides a good top-level overview of how it works, so please go buy the book (insert goofy smiley face emoticon here).
- Its simplicity also makes it easy to extend. You can add support for indexing document types besides HTML and plain text, and you can make it crawl over other protocols (e.g., gopher, wais) by using the standard Java mechanism for adding protocol handlers.
- It comes with its own built-in web server. We don't know of any other free search engine out there that does this. If you do, please let us know.
- It's completely free, released under the GNU General Public License. ht://Dig is the only other free search engine we know of that's under the GPL.
- It's written in Java, which provides several advantages in and of itself:
  - The BDDBot can run on any machine that has a stable Java Virtual Machine (at least as long as Microsoft continues to fail at making Java a Windows-specific language).
  - It is written in a powerful, easy-to-understand language.
  - It is object-oriented, which makes it even more extensible.
- It's very small - just over 100K including source code, binaries, and configuration files at last count.
- Its indexes are very small. They are on the order of 10% of the size of the text on your site even though they index every single alphanumeric word.
Please keep in mind that the BDDBot was written in about half a week, and that is why it's quite simplistic in most places.
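The point above about crawling with additional protocols relies on Java's pluggable URL handlers. The following is only a minimal sketch of that general mechanism, not code from the BDDBot itself: the `demo` scheme, the `DemoProtocol` and `DemoHandler` classes, and the `fetch` helper are all hypothetical names made up for illustration. It assumes Java 9 or later (for `readAllBytes`); the `URL` constructor used here still works but is deprecated in recent JDKs.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

public class DemoProtocol {

    /** Hypothetical handler for a made-up "demo" scheme; not part of the BDDBot. */
    public static class DemoHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL target) {
            return new URLConnection(target) {
                @Override
                public void connect() {
                    connected = true; // nothing to open; the content is generated locally
                }

                @Override
                public String getContentType() {
                    return "text/plain";
                }

                @Override
                public InputStream getInputStream() {
                    // A real handler would talk to a server here (e.g. a gopher client).
                    String body = "content for " + getURL().getHost() + getURL().getPath();
                    return new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
                }
            };
        }
    }

    /** Reads a URL's whole body as text, roughly what a crawler does before indexing. */
    public static String fetch(String spec) {
        try {
            // Passing the handler explicitly avoids installing a JVM-wide factory.
            URL url = new URL(null, spec, new DemoHandler());
            try (InputStream in = url.openConnection().getInputStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fetch("demo://example/page.txt"));
    }
}
```

Running `main` prints `content for example/page.txt`. To make a scheme available to every `URL` in the JVM rather than one at a time, Java also offers `URL.setURLStreamHandlerFactory` and the `java.protocol.handler.pkgs` system property; which of these the BDDBot expects is not stated here.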
URL: http://www.twmacinta.com/bddbot/
Licence: GPL