Guidelines for search index creation to support auto complete / suggestions#2217

Merged: maneuvertomars merged 6 commits into master from search-autocomplete-writeup on Sep 3, 2025

maneuvertomars (Member) commented Aug 20, 2025

Did a write-up for search autocomplete.
Comprehensive documentation for implementing search autocomplete functionality using edge n-gram tokenization in the Bleve search engine. It provides a detailed analysis of different tokenization methods and demonstrates why edge n-grams are optimal for autocomplete features.

Key changes:

  • Adds detailed documentation on edge n-gram autocomplete implementation
  • Compares various tokenization methods (single token, whitespace, regex, n-gram, edge n-gram)
  • Provides practical code examples and configuration samples

docs/search_autocomplete.md -> Discussion of edge n-gram autocomplete theory, implementation examples, and JSON mappings

docs/create_and_search_your_first_index.md -> Basic Bleve index creation and search operations, with an incomplete ending section

Contributor

Copilot AI left a comment

Pull Request Overview

The PR introduces comprehensive documentation for implementing search autocomplete functionality using edge n-gram tokenization in the Bleve search engine. It provides a detailed analysis of different tokenization methods and demonstrates why edge n-grams are optimal for autocomplete features.

Key changes:

  • Adds detailed documentation on edge n-gram autocomplete implementation
  • Compares various tokenization methods (single token, whitespace, regex, n-gram, edge n-gram)
  • Provides practical code examples and configuration samples

Reviewed Changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| docs/search_autocomplete.md | New comprehensive guide covering edge n-gram autocomplete theory, implementation examples, and best practices |
| docs/create_and_search_your_first_index.md | New tutorial on basic Bleve index creation and search operations with incomplete ending section |

Comment thread docs/search_autocomplete.md Outdated
3. **Better caching**: Exact term queries cache better than prefix queries
4. **Consistent performance**: Query time doesn't increase with index size

## 5. On low level implementaion sample:
Copilot AI Aug 20, 2025

Typo in 'implementaion' should be 'implementation'.

Suggested change
- ## 5. On low level implementaion sample:
+ ## 5. On low level implementation sample:

Comment thread docs/search_autocomplete.md Outdated
for _, token := range input {
runeCount := utf8.RuneCount(token.Term)
runes := bytes.Runes(token.Term)
// ..builds tokens based form either end, specified in the input
Copilot AI Aug 20, 2025

Typo in 'form' should be 'from'.

Suggested change
- // ..builds tokens based form either end, specified in the input
+ // ..builds tokens based from either end, specified in the input

n gram token filter



Copilot AI Aug 20, 2025

The ending section appears to be incomplete notes rather than proper documentation. This section should either be completed with proper documentation or removed.

Suggested change: remove the lines above.
@abhinavdangeti abhinavdangeti changed the title Search autocomplete using suggestions Guidelines for search index creation to support auto complete / suggestions Aug 20, 2025
@abhinavdangeti abhinavdangeti added this to the v2.5.4 milestone Aug 20, 2025
```

### search autocomplete
return the response for
Member

@maneuvertomars you can add a link here to your search_autocomplete.md file, so it's convenient for readers.

@abhinavdangeti
Member

@maneuvertomars It's our practice to be as descriptive as possible in the commit message section about what the commit covers. This'll benefit us in the future when we look at the commit log.

Comment thread docs/search_autocomplete.md Outdated
"autocomplete": {
"analyzer": "edge_ngram_analyzer",
"min_gram": 3,
"max_gram": 20
Member

Ditto

Comment thread docs/search_autocomplete.md Outdated
"autocomplete": {
"analyzer": "edge_ngram_analyzer",
"min_gram": 2,
"max_gram": 12
Member

The syntax is off; I think the keys are just `min` and `max` in the config.

Comment thread docs/search_autocomplete.md Outdated
```json
{
"min_gram": 2,
"max_gram": 15
Member

The syntax is off; I think the keys are just `min` and `max` in the config.

Comment thread docs/search_autocomplete.md Outdated
```json
{
"min_gram": 1,
"max_gram": 20
Member

Ditto

@@ -0,0 +1,221 @@
# Create and Search Index
Member

Creating a Bleve Index

@@ -0,0 +1,221 @@
# Create and Search Index

Demonstration of creating an index on Documents and making it searchable.
Member

A simple how-to example using Bleve in Go to create an index, add documents, and run search queries with results.

import (
"fmt"
"log"
"os"
Member

The "os" package is unused and must be removed.

Comment on lines +57 to +75
// Search the created index
query := bleve.NewQueryStringQuery("bleve")
searchRequest := bleve.NewSearchRequest(query)
searchRequest.Size = 10
searchResult, err := index.Search(searchRequest)
if err != nil {
log.Fatal(err)
}

for i, hit := range searchResult.Hits {
fmt.Printf("%d. Document: %s (Score: %.2f)\n", i+1, hit.ID, hit.Score)
if len(hit.Fragments) > 0 {
for field, fragments := range hit.Fragments {
fmt.Printf(" %s: %s\n", field, fragments[0])
}
}
fmt.Println()
}
}
Member

simplify query

	// Search the created index
	query := bleve.NewMatchQuery("bleve")
	searchRequest := bleve.NewSearchRequest(query)
	searchRequest.Explain = true
	searchRequest.Fields = []string{"title", "content"}
	searchResult, err := index.Search(searchRequest)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(searchResult)

}
}
```
### Output:
Member

Add a blank line between the code block and the Output heading. Also use `## Output:`.

Comment thread docs/search_autocomplete.md Outdated
Comment on lines +276 to +280
**Field configuration explained:**
- `"analyzer": "search_autocomplete_feature"` - Use our custom analyzer
- `"store": true` - Keep original text for display
- `"index": true` - Make it searchable
- `"include_in_all": true` - Include in default search field
Member

Not required, as the Bleve API will allow users to modify these options

Comment thread docs/search_autocomplete.md Outdated
- `"index": true` - Make it searchable
- `"include_in_all": true` - Include in default search field

![Index Mapping Configuration](/docs/name_filed_searchable_search_autocomplete_analyzer.png "Index mapping showing how the name field is configured with the custom analyzer")
Member

Remove the Couchbase UI screenshot.

Comment thread docs/search_autocomplete.md Outdated
**User types "sc":**
1. Query: `name:sc`
2. Bleve looks up exact term "sc" in the index
3. Finds document with "Schaumbergfest"
Member

and "Script" right?

Finds documents with "Schaumbergfest" and "Script"

ID string `json:"id"`
Title string `json:"title"`
Content string `json:"content"`
Author string `json:"author"`
Member

Author is unused

Comment thread docs/search_autocomplete.md Outdated
3. Finds document with "Schaumbergfest"
4. Returns suggestion instantly

![Search Results](/docs/index_search_using_prefix.png "Search results showing 'Schaumbergfest' highlighted when searching for 'sc'")
Member

Replace Couchbase UI with code

	type Document struct {
		ID    string `json:"id"`
		Title string `json:"title"`
	}
	// 4. Index Documents
	documents := []Document{
		{
			ID:    "doc1",
			Title: "Schaumbergfest",
		},
		{
			ID:    "doc2",
			Title: "Script",
		},
	}

	batch := index.NewBatch()
	for _, doc := range documents {
		batch.Index(doc.ID, doc)
	}
	if err := index.Batch(batch); err != nil {
		log.Fatal(err)
	}

	// 5. Search the created index
	query := bleve.NewMatchQuery("sc")
	query.SetField("title")
	searchRequest := bleve.NewSearchRequest(query)
	searchRequest.Explain = true
	searchRequest.Fields = []string{"title"}
	searchResult, err := index.Search(searchRequest)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(searchResult)

Output

$ go run main.go

2 matches, showing 1 through 2, took 311.125µs
    1. doc2 (0.343255)
        title
                Script
    2. doc1 (0.343255)
        title
                Schaumbergfest

@maneuvertomars maneuvertomars dismissed abhinavdangeti’s stale review September 3, 2025 11:26

All previous review comments are fixed, and the changes were re-reviewed by @CascadingRadium.

@maneuvertomars maneuvertomars merged commit 674e516 into master Sep 3, 2025
9 checks passed
@CascadingRadium CascadingRadium deleted the search-autocomplete-writeup branch May 5, 2026 18:22
5 participants