What are MySpiders?

MySpiders is a Java applet that uses intelligent, autonomous, adaptive software agents to search the Internet on the user's behalf for information relevant to the user's query. MySpiders complement, rather than replace, traditional search engines by locating recent documents that search engines may not have indexed yet.

The system is inspired by artificial life models that view the web as an ecological environment in which agents can compete for relevant sources of information, learning from experience and reproducing when they are successful or dying when they run out of luck. MySpiders are a limited prototype implementation of the InfoSpiders project. For more detailed information go to the InfoSpiders page.

What is an applet?

An applet is a piece of code that runs in the virtual machine of your browser. MySpiders is an applet that runs on your machine, in the browser's window, using the Java runtime environment of your browser. The Java classes (MySpiders' executable code) are loaded onto your machine from our web site.

Are MySpiders a search engine? Can I submit my URL?

No, MySpiders are not a search engine. There is no central index or database of pages visited by MySpiders, other than on a user's machine during a search. Therefore, it is meaningless to send us any URL.

Which browser and operating system configurations are supported?

The following is a summary of the browsers and operating systems on which MySpiders can run (the applet provides a mechanism to download the policy file):

OS                    Browser (tested)
Win 95/98 [1]         IE 5+ / Netscape 4.07+
Win NT/2000/XP [1]    IE 5+
Linux [1]             Netscape 4.07+
Mac OS X [2]          IE 5.1.3+

[1] The Java Plugin is required. It will be downloaded automatically, or you will be directed to a download site from the applet page.
[2] No Java Plugin is required. MRJ is used to run the applet.

Do I need to install the Java Plugin for running the applet?

MySpiders require a browser with a Java 2 Runtime Environment (JRE). Unless your browser is already equipped with such a JRE (to our knowledge this is currently the case only on Mac OS X), you will need to install the Java 2 Plugin, but only the first time you run MySpiders. The Java Plugin is automatically detected when you load the applet. If it is not found, instructions for your browser will be given on the screen. With Microsoft IE, the plugin will be downloaded and the installer will start automatically. With Netscape, you will be taken to a page where you can download the plugin, and you may then need to start the plugin installer yourself. In either case, follow the instructions given by the applet and the installer. At the end of the process MySpiders will start running.

If you are using Linux or Solaris and the plugin is not loaded, you can try to download it and install it yourself. Follow the installation instructions for your OS.

What is a policy file?

The policy file, called .java.policy, describes the permissions that you grant to an applet. The applet provides instructions for downloading and saving the policy file based on your OS.

Do I need to download and save the policy file each time I use the applet?

No, you need to do that only once - the first time you use the applet.

Are MySpiders secure?

Yes. You download MySpiders (the applet) onto your computer from our site. MySpiders will search the Internet and keep track of relevant information by writing files to your system. MySpiders will only work with the files that they create; they will not modify any of your personal files or send any type of information to us.

How do I use the MySpiders applet?

The user interface of MySpiders is shown below along with brief explanations for the different components.

After you enter the query words in the space provided and press start, the results of the search are displayed in real time as a table. Clicking on a row of the table will open the corresponding URL in a browser window. The table also shows the URL's source (who found it first), its score (how similar the page is to the query), and its recency (how recently the page was last modified). If you like a page found by a spider, you might like to probe into the other pages that spider has visited. You can do so by clicking on the appropriate spider in the Spider Hierarchy (right), which opens up a window with the spider's details (see image below).
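The FAQ does not say which similarity measure produces the score. As one common possibility, the sketch below computes the cosine similarity between term counts of the query and of a page. The class and method names (PageScore, termCounts, cosine) are hypothetical, and the choice of cosine similarity is an assumption made only for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of a query/page similarity score; not the MySpiders source. */
final class PageScore {

    /** Counts how often each lowercase alphanumeric term occurs in the text. */
    static Map<String, Integer> termCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Cosine similarity of the two term-count vectors, or 0 if either is empty. */
    static double cosine(String query, String page) {
        Map<String, Integer> q = termCounts(query);
        Map<String, Integer> p = termCounts(page);
        double dot = 0.0;
        double qNorm = 0.0;
        double pNorm = 0.0;
        for (Map.Entry<String, Integer> e : q.entrySet()) {
            dot += e.getValue() * p.getOrDefault(e.getKey(), 0);
            qNorm += e.getValue() * e.getValue();
        }
        for (int count : p.values()) {
            pNorm += count * count;
        }
        if (qNorm == 0.0 || pNorm == 0.0) {
            return 0.0; // no terms to compare
        }
        return dot / (Math.sqrt(qNorm) * Math.sqrt(pNorm));
    }
}
```

Under this assumption, a page that shares no terms with the query scores 0, and a page whose term counts are proportional to the query's scores close to 1.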

Why do some spiders die and/or new ones appear?

MySpiders are an artificial life system guided by an evolutionary algorithm. If a spider does not find enough relevant pages or reaches a dead end, it can die. If it finds a lot of relevant pages, it will reproduce and a new agent will appear. For example, if Spider2 reproduces, you can view its children in the Spider Hierarchy by opening the Spider2 folder. The dynamic population size is also affected by how many pages you want MySpiders to visit and by the speed of your Internet connection.
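The life cycle described above can be pictured with a simple energy model. The following is a minimal sketch, not the actual MySpiders code: the class names, the energy bookkeeping, and the constants VISIT_COST and REPRODUCE_THRESHOLD are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of an energy-based spider population; not the MySpiders source. */
final class SpiderPopulation {

    /** A spider with an energy budget (fields are illustrative). */
    static final class Spider {
        final String name;
        double energy;
        int children = 0;

        Spider(String name, double energy) {
            this.name = name;
            this.energy = energy;
        }
    }

    // Assumed constants; the real tuned parameters are not exposed to users.
    static final double VISIT_COST = 1.0;
    static final double REPRODUCE_THRESHOLD = 4.0;

    /**
     * Applies one step. Each spider pays a fixed cost for visiting a page and
     * gains the relevance of that page (relevance[i] belongs to spider i).
     * Spiders at or below zero energy die; spiders above the threshold split
     * their energy evenly with a new child.
     */
    static List<Spider> step(List<Spider> population, double[] relevance) {
        List<Spider> next = new ArrayList<>();
        for (int i = 0; i < population.size(); i++) {
            Spider s = population.get(i);
            s.energy += relevance[i] - VISIT_COST;
            if (s.energy <= 0) {
                continue; // the spider dies and is dropped from the population
            }
            next.add(s);
            if (s.energy > REPRODUCE_THRESHOLD) {
                s.energy /= 2; // parent and child share the energy
                s.children++;
                next.add(new Spider(s.name + "." + s.children, s.energy));
            }
        }
        return next;
    }
}
```

In this sketch a spider that keeps visiting irrelevant pages loses energy until it disappears, while a spider that finds relevant pages accumulates enough energy to spawn a child, so the population size changes over the course of a search.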

Why are the terms in the spiders different from my query?

Some commonly used terms, such as "and", "the", and "of", are removed from the query to improve the precision of results. Also, when a spider reproduces, the newborn spider tries to expand its query by adding a frequently occurring word from its birth page. These new terms are also conflated, so for example "student" and "study" will become "studi".
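A rough sketch of the first two steps (stop word removal and query expansion) might look like the following. The stop word list, the class name QueryTerms, and the rule of adding the single most frequent new term are assumptions for illustration; the conflation (stemming) step is left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of query clean-up and expansion; not the MySpiders source. */
final class QueryTerms {

    // A tiny illustrative stop word list; the real list is not given in this FAQ.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("and", "the", "of", "a", "in", "to"));

    /** Lowercases the text, splits it into terms, and drops stop words. */
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty() && !STOP_WORDS.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    /** Returns the query plus the most frequent birth-page term not already in it. */
    static List<String> expand(List<String> query, String birthPage) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String term : tokenize(birthPage)) {
            if (!query.contains(term)) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) { // ties keep the earliest term
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        List<String> expanded = new ArrayList<>(query);
        if (best != null) {
            expanded.add(best);
        }
        return expanded;
    }
}
```

With this sketch, a child born on a page that mentions "web" more often than any other new term would carry the parent's query terms plus "web", which is why a spider's terms can differ from what you typed.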

Why are MySpiders so slow?

MySpiders actively crawl the Internet for you in real time. To do this, they must visit many pages, which they must download onto your computer. The process of downloading documents from across the world wide web is what makes MySpiders slow.

Can I change the parameters of MySpiders?

No, the interface has been kept simple and there is no provision for changing parameters such as the number of spiders in the initial population. These parameters have been tuned for acceptable performance and hidden in the interest of a friendly interface. Some advanced features of InfoSpiders such as relevance feedback, localized TFIDF weighting, and query refinement are also not implemented in MySpiders.

What are cache files? Where do MySpiders store the cache files?

Cache files are MySpiders' temporary files. They are used to keep track of previously visited pages. If you are using Windows 95/98/XP, the cache files are stored under drive:\WINDOWS\TEMP. In Windows NT/2000 you can find them in the drive:\WINNT\TEMP directory. The /tmp directory is used under Linux or Mac OS X.

Why do I get a message, "Fatal Error! Could not write to cache"?

A likely cause of the message is that the applet did not find the temporary folder in which to write the cache files. If such a folder does not exist, you will need to create one with the correct name for your OS (see the previous question). For example, if you are using Windows NT and you do not have a TEMP folder inside the WINNT folder, you will need to create one.
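If you are comfortable running a small Java program, a check along the following lines can tell you whether a given folder exists and is writable. This is only a sketch: the class name CacheDirCheck is hypothetical, and using the java.io.tmpdir system property as the folder to test is an assumption, since the applet's actual cache location is the OS-specific folder named in the previous question.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical check for a writable cache folder; not the MySpiders source. */
final class CacheDirCheck {

    /**
     * Creates the folder if it is missing, then tries to create and delete a
     * small probe file in it. Returns true only if all of that succeeds.
     */
    static boolean ensureWritable(File dir) {
        if (!dir.isDirectory() && !dir.mkdirs()) {
            return false; // folder is missing and could not be created
        }
        try {
            File probe = File.createTempFile("myspiders", ".cache", dir);
            return probe.delete();
        } catch (IOException e) {
            return false; // the folder exists but a file could not be written
        }
    }

    public static void main(String[] args) {
        // Assumption: test the JVM's default temporary folder unless a path is given.
        File dir = new File(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        System.out.println(dir + " writable: " + ensureWritable(dir));
    }
}
```

Passing a path as the first argument (for example the TEMP folder for your OS) tests that folder instead of the default one.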

Why do I get a message, "Maximum queries to search engine exceeded"?

MySpiders rely on a Google Web API account to get the seed URLs that are used to initialize the spiders. The Google API currently restricts the number of queries per day from a given account. If the applet has been used by many people on the same day, the Google API will stop responding for some time and you will see the message "Maximum queries to search engine exceeded." You may try the applet again the next day.

Can I run many searches without downloading the applet again?

Yes, the applet will be stored in your browser's cache, which allows it to be reloaded locally. However, your browser may delete the applet from its cache at any time. You may want to reload the applet anyway to get the latest version, since the applet on the MySpiders web site may be updated frequently.

Who developed MySpiders?

The current applet implementation was developed in Java by Gautam Pant, PhD student in the Management Sciences department at the University of Iowa. A previous version was developed by Gautam Pant and Melania Degeratu, a former PhD student in the Applied Math and Computational Science program at the University of Iowa. Melania is currently at Columbia University's Computer Science department. The InfoSpiders project is supervised by Filippo Menczer, currently in the School of Informatics and Department of Computer Science at Indiana University, Bloomington. The project started with the doctoral work of Dr. Menczer under the guidance of Dr. Rik Belew at the University of California, San Diego.

Can we use this technology in our product?

If you wish to use MySpiders for a commercial product, contact us about licensing this technology.

Whom can I contact if I have problems or feedback regarding MySpiders?

You can contact Gautam Pant (pant at dollar.biz.uiowa.edu) with bug reports, comments, suggestions, or any other feedback. A prompt response, however, is not guaranteed.
