Channel: mikemccand
Browsing all 23 articles

Crawling Crowd-Data Spots Side Effects Faster

The social crowd has proven to be powerful, if you can find some way to harness it: crowd-sourcing can perform tasks and solve collaborative problems,...




New Lucene 4 Functions Improve Enterprise Search Indexing

In the past, Lucene recorded only the bare minimal aggregate index statistics necessary to support its hard-wired classic vector space scoring...


Lucene has two Google Summer of Code students!

I'm happy to announce that two Lucene Google Summer of Code projects were accepted for this summer! The first project (LUCENE-3312), proposed by Nikola...


Lucene's TokenStreams are Actually Graphs!

Lucene's TokenStream class produces the sequence of tokens to be indexed for a document's fields. The API is an iterator: you call incrementToken to...


Finite State Automata in Lucene

Lucene Revolution 2012 is now done, and the talk Robert and I gave went well! We showed how we are using automata (FSAs and FSTs) to make great...



Building a New Lucene Postings Format

As of 4.0 Lucene has switched to a new pluggable codec architecture, giving the application full control over the on-disk format of all index files. We...


Putting an Apache Lucene Index in RAM with Zing JVM

Google's entire index has been in RAM for at least 5 years now. Why not do the same with an Apache Lucene search index? RAM has become very affordable...


Lucene's New Analyzing Suggester

Live suggestions as you type into a search box, sometimes called suggest or autocomplete, have been a standard, essential search feature ever since Google set a...



Fun with Lucene's Faceted Search Module

These days faceted search and navigation are common, and users have come to expect and rely upon them. Lucene's facet module, first appearing in the...



Getting Real-Time Field Values in Lucene

We know Lucene's near-real-time search is very fast: you can easily refresh your searcher once per second, even at high indexing rates, so that any change...


Drill Sideways faceting with Lucene

Lucene's facet module, as I described previously, provides a powerful implementation of faceted search for Lucene. There's been a lot of progress...


Eating Dogfood with Lucene

Eating your own dog food is important in all walks of life: if you are a chef you should taste your own food; if you are a doctor you should treat yourself...


A New Lucene Suggester Based on Infix Matches

Suggest, sometimes called auto-suggest, type-ahead search or auto-complete, has been an essential search feature ever since Google added it almost 5 years...



Screaming fast Lucene searches using C++ via JNI

At the end of the day, when Lucene executes a query, after the initial setup the true hot-spot is usually rather basic code that decodes sequential...


2X faster PhraseQuery with Lucene using C++ via JNI

I recently described the new lucene-c-boost github project, which provides amazing speedups (up to 7.8X faster) for common Lucene query types using specialized...



A New Version of the Compact Language Detector

It's been almost two years since I originally factored out the fast and accurate Compact Language Detector from the Chromium project,...


Three Exciting Lucene Features in One Day

This week has been a productive week. Suddenly, there are three exciting new features coming to Lucene.

Expressions Module

The first feature,...



Lucene's In-memory Terms Dictionary, Thanks to Google Summer of Code

Last year, Han Jiang's Google Summer of Code project was a big success: he created a new (now, default) postings format for substantially faster searches,...


Apache Lucene: Fast Range Faceting Using Segment Trees and the Java ASM Library

In Lucene's facet module we recently added support for dynamic range faceting, to show how many hits match each of a dynamic set of ranges. For example, the...


Using Lucene's Search Server to Search Jira Issues

You may remember my first blog post describing how the Lucene developers eat our own dog food by using a Lucene search application to find our...
