Saturday, May 16, 2009

My mini project on search engine

I'm in level two, semester two of my university life. Last month I was very busy after my New Year vacation because I had to implement a search engine for the Algorithm subject, which was conducted by Madam Upuli and Sir Saliya.

We had been taught several types of algorithms and data structures before this project, and we had to apply them when implementing it.

The search engine works like this. First we input the root path of the folder we want to search. The search engine then goes through all the files in that folder and in its subfolders. It reads all the words in the text files and stores each word with the corresponding file paths in a suitable data structure. Once the storing is finished, the search engine is ready for searching. The user can then input a single word, or multiple words combined with the & and | operations, and the search engine returns the corresponding file paths as the output.
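The & and | operations can be thought of as set intersection and set union over the stored file paths. Here is a minimal sketch of that idea, not the code from my project. The class name QuerySketch, the method search and the Map-based index are assumptions made for illustration, and I am also assuming that & binds tighter than |.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class QuerySketch {
    // Evaluates a query such as "cat & dog | bird" against a word -> file paths index.
    // Assumption: '&' is applied before '|'.
    static Set<String> search(Map<String, Set<String>> index, String query) {
        Set<String> result = new LinkedHashSet<>();
        for (String orPart : query.split("\\|")) {
            Set<String> matches = null;
            for (String word : orPart.split("&")) {
                Set<String> paths = index.getOrDefault(word.trim(), Collections.emptySet());
                if (matches == null) {
                    matches = new LinkedHashSet<>(paths); // first word of this group
                } else {
                    matches.retainAll(paths);             // '&' keeps files that contain both words
                }
            }
            if (matches != null) {
                result.addAll(matches);                   // '|' merges the file lists
            }
        }
        return result;
    }
}
```

With "cat" found in a.txt and b.txt and "dog" found in b.txt and c.txt, the query "cat & dog" gives only b.txt, while "cat | dog" gives all three files.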

At the beginning the task was not easy because I didn't even know how to differentiate text files from directories. So I had to work hard to complete it.

I chose Java to implement the search engine, and these are my classes.

I implemented my own vector class, which has these methods:

putValue(String data) - store data

getvalue(int index) - get the value at the given index

appendValue(String data) - append data to an existing row

To do this I had to implement a dynamic array.


These are the steps for creating a dynamic array, which one of my friends sent me.

1) First create a fixed array (say array[100]).

2) Write a logical condition in the code: if the fixed array is full, create a new array (say tempArray) with a greater size (you can increase the fixed array size by some value).

3) Then use the following method to copy the object references in the first array to the new one.

System.arraycopy(,0,,0,) // the two zeros specify the starting positions in the source and destination arrays

4) Then make the first array's name refer to the new one.
ex: array = tempArray;
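The four steps above can be sketched roughly like this. It is a minimal illustration and not my original class: the class name SimpleVector and the size() method are assumptions, doubling the capacity is just one way to pick "a greater size", and appendValue is left out. System.arraycopy takes the source array, the source start position, the destination array, the destination start position, and the number of elements to copy.

```java
class SimpleVector {
    private String[] array = new String[100]; // step 1: start with a fixed-size array
    private int size = 0;                     // number of slots actually in use

    // Stores data in the next free slot, growing the array first if it is full.
    void putValue(String data) {
        if (size == array.length) {                              // step 2: the array is full
            String[] tempArray = new String[array.length * 2];   // assumed growth rule: double it
            System.arraycopy(array, 0, tempArray, 0, size);      // step 3: copy the references
            array = tempArray;                                   // step 4: old name, new array
        }
        array[size++] = data;
    }

    // Returns the value stored at the given index.
    String getvalue(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return array[index];
    }

    int size() {
        return size;
    }
}
```

Storing 250 values in a row makes the array grow twice (100, 200, 400), and every value can still be read back by its index.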

Then I used the vector class to implement the hash table class. It has these methods:

insertWord(String skey, String data) - insert the word and the path into the hash table

getData(String searchKey) - get the path of the word

checkData(String path, String data) - check whether the word already exists before storing it

In this class we use a 2D object array of vectors. After we pass the word and the path to the HashTable, it converts the word to a hash value using the built-in method hashCode(). Then it calculates the key by

key = hashvalue % 100

and the key is the row of the object array. The class then checks whether that hash value already exists in that row's vector. If it does not, it stores the hash code in the first column and the path in the second column. If the hash value already exists in that vector, it appends the path of the word to the existing path.
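Here is a simplified sketch of the same idea, not my project code. The class name WordTable is an assumption, and it uses java.util.ArrayList instead of my own vector class to keep it short. It also differs from the description above in two ways: it stores the word itself rather than its hash code, so two different words with the same hash code do not get mixed up, and it uses Math.floorMod because hashCode() can be negative in Java, which would make a plain % 100 give a negative row.

```java
import java.util.ArrayList;
import java.util.List;

class WordTable {
    private static final int ROWS = 100;

    // One entry per distinct word: the word and the paths of the files that contain it.
    private static class Entry {
        final String word;
        final List<String> paths = new ArrayList<>();
        Entry(String word) { this.word = word; }
    }

    private final List<List<Entry>> rows = new ArrayList<>();

    WordTable() {
        for (int i = 0; i < ROWS; i++) {
            rows.add(new ArrayList<>());
        }
    }

    // Picks the row for a word; floorMod keeps the result in 0..99 for negative hash codes.
    private List<Entry> rowFor(String word) {
        return rows.get(Math.floorMod(word.hashCode(), ROWS));
    }

    // Inserts the word with its path, or appends the path if the word is already stored.
    void insertWord(String skey, String data) {
        List<Entry> row = rowFor(skey);
        for (Entry e : row) {
            if (e.word.equals(skey)) {
                if (!e.paths.contains(data)) {
                    e.paths.add(data);
                }
                return;
            }
        }
        Entry e = new Entry(skey);
        e.paths.add(data);
        row.add(e);
    }

    // Returns the paths stored for the word, or an empty list if the word is unknown.
    List<String> getData(String searchKey) {
        for (Entry e : rowFor(searchKey)) {
            if (e.word.equals(searchKey)) {
                return e.paths;
            }
        }
        return new ArrayList<>();
    }
}
```

Inserting the same word with two different paths leaves one entry with both paths, which matches the "append the path to the existing path" behaviour described above.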

Then the most interesting part was searching for the files in the folders. I used fileFilters to identify the folders and the files in a given root folder.
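A rough sketch of how this can look with java.io.FileFilter and File.listFiles(FileFilter) is below. It is not my original code: the class name FolderWalker and the method findTextFiles are assumptions for illustration, and treating a text file as anything whose name ends in ".txt" is also an assumption.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.List;

class FolderWalker {
    // Accepts only directories.
    static final FileFilter FOLDERS = File::isDirectory;

    // Accepts only regular files whose name ends in ".txt" (assumed meaning of "text file").
    static final FileFilter TEXT_FILES =
            f -> f.isFile() && f.getName().toLowerCase().endsWith(".txt");

    // Returns every text file in the root folder and all of its subfolders.
    static List<File> findTextFiles(File root) {
        List<File> found = new ArrayList<>();
        collect(root, found);
        return found;
    }

    private static void collect(File folder, List<File> found) {
        File[] files = folder.listFiles(TEXT_FILES);
        if (files == null) {
            return; // not a folder, or the folder could not be read
        }
        for (File f : files) {
            found.add(f);
        }
        File[] subfolders = folder.listFiles(FOLDERS);
        if (subfolders == null) {
            return;
        }
        for (File sub : subfolders) {
            collect(sub, found); // repeat the same search inside each subfolder
        }
    }
}
```

Calling FolderWalker.findTextFiles(new File(rootPath)) with the root path the user typed in would give the list of files to read and index.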

A link for learning about fileFilter

I used all these tools and finished my project successfully. I hope you will try this too, so you can improve your programming skills. Keep coding; it will train you to be an outstanding coder.
