Lucere API: What should it be?

Nov 11, 2010 at 10:45 AM
Edited Nov 11, 2010 at 10:47 AM



This post is meant to get the discussion started about designing the new API. As a side note, I'm cross-posting this to the mailing list.

What I'd love to see are some sort of "top down" pseudo-code snippets that show how you envision a new API interacting with your code. 

Some ideas/questions:

- Maybe for querying we should implement IQueryable<T>? How would that look? 

- Maybe for indexing we should implement IObservable<T>/IObserver<T>? How would that look? 

- How can we facilitate parallelization? What kinds of domain entities should be serializable so that you can send them across a wire as part of a distribution model?

- How should transactions and locking work?

- What kind of architectural patterns make sense for this problem domain? 

- We should totally implement IDisposable!.. or should we? Maybe not everything needs to be disposable or should be. What do you think?

- Generic collections and IEnumerable<T> interfaces... Great... but where exactly? What about collections that don't have a .NET BCL implementation already? Existing libraries for that? or roll our own?

- Injectable behaviours using delegates like Action<T> or Func<T>... for filtering, scoring, sorting? 
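To make that last bullet a bit more concrete, here is a rough sketch of delegate-injected filtering and scoring. Everything in it (Hit, SearchSketch, the Search signature) is made up for illustration and is not an existing Lucene.Net or Lucere API; it only shows the shape the idea could take.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical result type, invented for this sketch.
public sealed class Hit
{
    public string Id { get; set; }
    public float Score { get; set; }
}

public static class SearchSketch
{
    // The caller injects the filter and the scoring function as delegates.
    public static IEnumerable<Hit> Search(
        IEnumerable<Hit> candidates,
        Func<Hit, bool> filter,
        Func<Hit, float> score)
    {
        return candidates
            .Where(filter)                                          // injected filtering
            .Select(h => new Hit { Id = h.Id, Score = score(h) })   // injected scoring
            .OrderByDescending(h => h.Score);                       // best match first
    }
}
```

A call would then look something like SearchSketch.Search(hits, h => h.Id != "b", h => h.Score * 2f), so the behaviour is swapped by passing a different lambda rather than by subclassing.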


That's just a start of some of the things floating around in my head at the moment. I want to know what you think and I *really* want to see some pseudo-code examples of how you think the API should work. 




Nov 25, 2010 at 9:00 AM

Hmm, here's one: Last time I used Lucene, the suggested pattern for creating an IndexReader was to create an IndexWriter and fetch a reader from that. Personally, I find that counterintuitive and perhaps a sign of overly tight coupling. I don't know the history behind the pattern (presumably caching, letting readers and writers play nice during updates, etc.), so maybe it really is the best approach. But if possible, I think one should be able to instantiate a reader directly without concern for warm-up or synchronization with the writer(s).


using (var reader = new DefaultReader(blah)) {
} // reader disposed here; no IndexWriter involved



Nov 29, 2010 at 11:39 AM

Definitely use IDisposable so that using statements can be used and resources disposed of in a nice way. 

IQueryable<T> should probably sit on top of a core API that is more of a .NET version of the Lucene API (but uses .NET-style conventions), which would help people with existing apps built on Lucene to migrate.
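A rough sketch of that layering, with made-up names throughout (Document, ISearcher, InMemorySearcher and QueryableLayer are all hypothetical, not real Lucene.Net types). Note that AsQueryable() here is plain LINQ to Objects; a real implementation would presumably need a query provider that translates expression trees into Lucene queries.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical core layer: a .NET-flavoured take on the Lucene-style API.
public sealed class Document
{
    private readonly Dictionary<string, string> fields = new Dictionary<string, string>();

    public string this[string name]
    {
        get { return fields[name]; }
        set { fields[name] = value; }
    }
}

public interface ISearcher : IDisposable
{
    IEnumerable<Document> Search(string query);
}

// Stand-in for a real index: "matches" documents whose title contains the query text.
public sealed class InMemorySearcher : ISearcher
{
    private readonly List<Document> docs;

    public InMemorySearcher(IEnumerable<Document> docs) { this.docs = docs.ToList(); }

    public IEnumerable<Document> Search(string query)
    {
        return docs.Where(d => d["title"].Contains(query));
    }

    public void Dispose() { }
}

// Hypothetical typed layer that sits on top of the core API.
public static class QueryableLayer
{
    public static IQueryable<T> Query<T>(
        this ISearcher searcher, string query, Func<Document, T> map)
    {
        // LINQ to Objects only; no expression-tree translation happens here.
        return searcher.Search(query).Select(map).AsQueryable();
    }
}
```

Existing apps could keep calling Search directly, while new code could write searcher.Query("Lucene", d => d["title"]).Where(t => t.Length > 3) on top of the same core.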

IObservable<T> could also be used for tracking or modifying data before indexing, so +1 for investigating.
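Something along these lines, maybe. This is only a sketch: DocumentFeed and IndexingObserver are invented names, the "index" is just a list, and only the IObservable<T>/IObserver<T> interfaces themselves come from the BCL.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical source that pushes raw text to its subscribers.
public sealed class DocumentFeed : IObservable<string>
{
    private readonly List<IObserver<string>> observers = new List<IObserver<string>>();

    public IDisposable Subscribe(IObserver<string> observer)
    {
        observers.Add(observer);
        return new Unsubscriber(observers, observer);
    }

    public void Publish(string text)
    {
        foreach (var o in observers.ToArray()) o.OnNext(text);
    }

    public void Complete()
    {
        foreach (var o in observers.ToArray()) o.OnCompleted();
        observers.Clear();
    }

    // Disposing the subscription removes the observer from the feed.
    private sealed class Unsubscriber : IDisposable
    {
        private readonly List<IObserver<string>> list;
        private readonly IObserver<string> item;

        public Unsubscriber(List<IObserver<string>> list, IObserver<string> item)
        {
            this.list = list;
            this.item = item;
        }

        public void Dispose() { list.Remove(item); }
    }
}

// Hypothetical indexer: modifies each item first, then "indexes" it (stores it in a list).
public sealed class IndexingObserver : IObserver<string>
{
    private readonly Func<string, string> transform;

    public List<string> Indexed { get; private set; }
    public bool Committed { get; private set; }

    public IndexingObserver(Func<string, string> transform)
    {
        this.transform = transform;
        Indexed = new List<string>();
    }

    public void OnNext(string value) { Indexed.Add(transform(value)); }
    public void OnError(Exception error) { /* a real indexer might roll back here */ }
    public void OnCompleted() { Committed = true; }
}
```

Subscribing new IndexingObserver(s => s.Trim().ToLowerInvariant()) to a feed would normalize each item before it reaches the index, and OnCompleted gives a natural place for a commit.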

Use of attributes would also be nice. 

Another thing to look at would be what Solr does on top of Lucene.