C# and Lucene to index and search

http://www.aspcode.net/C-and-Lucene-to-index-and-search.aspx

Posted by admin under .NET, Sep 12, 2007

This sample shows how to use Lucene from a .NET application to index and search content. There are some articles and samples on the web, but they seem to be a bit outdated. I used Lucene version 1.4-something a few years ago, and now that I needed it again I thought I could just download the new DLLs and reuse my existing code. It turns out that quite a few API changes have been made since then, so I have created this example.

10/12/2008 9:06

Just download the solution and try it out. It was built with VS 2005 and C#, and it does not take many lines of code. Let me take you through some parts. First, in Form1_Load we check that the index exists; if it does not, we create it:

    private string IndexLocation()
    {
        return System.IO.Path.Combine(Environment.CurrentDirectory, "testindex");
    }

    private void Form1_Load(object sender, EventArgs e)
    {
        string sIndexLocation = IndexLocation();

        // 1. Check to see that we do have an index
        bool created;
        EnsureIndexExists(out created, sIndexLocation);

        // 2. If the index was just created, fill it with some fake data
        if (created)
        {
            InsertIndexData(sIndexLocation);
        }

        label3.Text = sIndexLocation;
        comboBox1.SelectedIndex = 0;
    }

    private void EnsureIndexExists(out bool created, string sIndexPath)
    {
        created = false;
        if (!IndexReader.IndexExists(sIndexPath))
        {
            // Passing true as the last argument creates a new, empty index
            IndexWriter writer = new IndexWriter(sIndexPath, new StandardAnalyzer(), true);
            created = true;
            writer.Close();
        }
    }

As you can see, Lucene creates some physical binary files for the index. I chose to simply create a directory named testindex under the current directory of the running exe, so if you run it from VS 2005 in debug mode the index ends up in LuceneTest/debug/testindex.

Now for the actual indexing. I have just thrown in some hardcoded junk content, but nothing stops you from indexing real documents, websites, etc.:

    private void InsertIndexData(string sIndexPath)
    {
        // false = open the existing index instead of creating a new one
        IndexWriter writer = new IndexWriter(sIndexPath, new StandardAnalyzer(), false);

        // Let's insert all data - for this example I'm using some fake stuff,
        // but you could of course easily index anything - say data from
        // a database, generated from files/web spidering etc.
        // (Long sample strings are truncated with … here.)
        IndexDoc(writer, "About Hockey", "hockey", "Hockey is a cool sport which I really like, bla bla");
        IndexDoc(writer, "Some great players", "hockey", "Some of the great players from Sweden - well Peter Forsbe…");
        IndexDoc(writer, "Soccer info", "soccer", "Soccer might not be as fun as hockey but it's also pretty fun");
        IndexDoc(writer, "Players", "soccer", "From Sweden we have Zlatan Ibrahimovic and Henrik Larsson. They are…");
        IndexDoc(writer, "1994", "soccer", "I remember World Cup 1994 when Sweden took the bronze. we had great pla…");

        writer.Optimize();
        writer.Close();
    }
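The IndexDoc helper called above is not listed in this article; it presumably lives in the downloadable solution. Here is a minimal sketch of what such a helper could look like, assuming a Lucene.Net 2.x assembly is referenced. The field names header, type and content are taken from the search code in this article, but the IndexHelper class name and the Store/Index settings are assumptions and may differ from the actual sample:

```csharp
using Lucene.Net.Documents;
using Lucene.Net.Index;

public static class IndexHelper
{
    // Hypothetical reconstruction of IndexDoc: builds one Lucene document
    // with three fields and adds it to the index behind the given writer.
    public static void IndexDoc(IndexWriter writer, string header, string type, string content)
    {
        Document doc = new Document();

        // header and type are stored so they can be read back with doc.Get(...)
        doc.Add(new Field("header", header, Field.Store.YES, Field.Index.TOKENIZED));
        doc.Add(new Field("type", type, Field.Store.YES, Field.Index.TOKENIZED));

        // content is searchable but not stored (assumption: only header and
        // type are ever displayed in the result list)
        doc.Add(new Field("content", content, Field.Store.NO, Field.Index.TOKENIZED));

        writer.AddDocument(doc);
    }
}
```

Storing only the fields you display keeps the index smaller; if you also want to show a snippet of the body text, content would need Field.Store.YES as well.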

And last, and maybe most important: the code for searching. Lucene has a very flexible query language; for a full reference, see this document on Lucene query parsing language. For simplicity I have created a very basic example here, but it still has some cool features that show you *some* of the power of the query language. As you can see in the screenshot, you can use the combobox to search in all types of documents, or only in those tagged as type=soccer or type=hockey. You can also choose to search in the "header" field as well. The code for creating the query, executing the search and filling the results listview looks like this:

    // Do the search
    listView1.Items.Clear();

    IndexSearcher searcher = new IndexSearcher(IndexLocation());
    QueryParser oParser = new QueryParser("content", new StandardAnalyzer());

    string sHeader = " OR (header:" + textBox1.Text + ")";
    if (checkBox1.Checked == false)
        sHeader = "";

    string sSearchQuery = "(" + textBox1.Text + sHeader + ")";

    if (comboBox1.SelectedIndex > 0)
    {
        sSearchQuery += " AND (type:" + comboBox1.SelectedItem.ToString() + ")";
    }

    Hits oHitColl = searcher.Search(oParser.Parse(sSearchQuery));
    for (int i = 0; i < oHitColl.Length(); i++)
    {
        Document oDoc = oHitColl.Doc(i);
        ListViewItem oItem = listView1.Items.Add(oDoc.Get("header"));
        oItem.SubItems.Add(oDoc.Get("type"));
    }

    searcher.Close();
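Note that the search code pastes textBox1.Text straight into the query string. That is what lets users type query syntax themselves, but it also means that input such as an unbalanced parenthesis or a stray colon can make oParser.Parse throw. If you would rather treat the input as plain words, one option is to escape the query language's special characters first. A small sketch in plain C#; QueryBuilder, EscapeTerm and Build are hypothetical names, not part of the sample or of Lucene:

```csharp
using System;
using System.Text;

public static class QueryBuilder
{
    // Characters that have a special meaning in the Lucene query syntax.
    private const string SpecialChars = "+-&|!(){}[]^\"~*?:\\";

    // Puts a backslash in front of every special character.
    public static string EscapeTerm(string text)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in text)
        {
            if (SpecialChars.IndexOf(c) >= 0)
                sb.Append('\\');
            sb.Append(c);
        }
        return sb.ToString();
    }

    // Builds the same query shape as the search code in this article.
    // Pass null or "" as type to search all document types.
    public static string Build(string userText, bool includeHeader, string type)
    {
        string term = EscapeTerm(userText.Trim());
        string query = "(" + term;
        if (includeHeader)
            query += " OR (header:" + term + ")";
        query += ")";
        if (!string.IsNullOrEmpty(type))
            query += " AND (type:" + type + ")";
        return query;
    }
}
```

For example, QueryBuilder.Build("sweden", true, "soccer") returns (sweden OR (header:sweden)) AND (type:soccer). The trade-off is that wildcards and other operators typed by the user are then searched for literally instead of being interpreted.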

Lucene is a free and incredibly fast search engine, and you can use it against any content: it has a very loose model where you put "documents" into the index and decide for yourself which properties each document should have. For example, I have used it to get fast full-text searching against an existing MS SQL Server database, where some of the indexed properties were just pointers into the database tables (the article_id column is one example).
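The database-pointer idea can be sketched roughly like this, again assuming a Lucene.Net 2.x assembly is referenced. ArticleIndex, AddArticle and FindArticleIds are hypothetical names, and the field layout is an assumption: article_id is stored but not tokenized so that it comes back exactly as written, while the body text is indexed for searching but not stored.

```csharp
using System.Collections.Generic;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

public static class ArticleIndex
{
    // Indexes the text of one database row; only the key is stored.
    public static void AddArticle(IndexWriter writer, int articleId, string body)
    {
        Document doc = new Document();
        doc.Add(new Field("article_id", articleId.ToString(),
                          Field.Store.YES, Field.Index.UN_TOKENIZED));
        doc.Add(new Field("content", body,
                          Field.Store.NO, Field.Index.TOKENIZED));
        writer.AddDocument(doc);
    }

    // Returns the database keys of all articles matching the query text.
    public static List<int> FindArticleIds(string indexPath, string queryText)
    {
        List<int> ids = new List<int>();
        IndexSearcher searcher = new IndexSearcher(indexPath);
        try
        {
            QueryParser parser = new QueryParser("content", new StandardAnalyzer());
            Hits hits = searcher.Search(parser.Parse(queryText));
            for (int i = 0; i < hits.Length(); i++)
            {
                ids.Add(int.Parse(hits.Doc(i).Get("article_id")));
            }
        }
        finally
        {
            // Always release the index files, even if parsing fails
            searcher.Close();
        }
        return ids;
    }
}
```

The returned ids can then be used to load the full rows from the database, so the index itself stays small and the database remains the single source of the actual content.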

Attachments
LuceneTest.zip
