Creating custom analyzers
Remove HTML tags and convert HTML entities
POST /_analyze
{
  "char_filter": ["html_strip"],
  "text": "I'm in a <em>good</em> mood - and I <strong>love</strong> açaí!"
}
Add the standard tokenizer
POST /_analyze
{
  "char_filter": ["html_strip"],
  "tokenizer": "standard",
  "text": "I'm in a <em>good</em> mood - and I <strong>love</strong> açaí!"
}
Add the lowercase token filter
POST /_analyze
{
  "char_filter": ["html_strip"],
  "tokenizer": "standard",
  "filter": [
    "lowercase"
  ],
  "text": "I'm in a <em>good</em> mood - and I <strong>love</strong> açaí!"
}
Add the stop token filter
By default, the stop filter removes English stop words such as "a", "and", and "in".
POST /_analyze
{
  "char_filter": ["html_strip"],
  "tokenizer": "standard",
  "filter": [
    "lowercase",
    "stop"
  ],
  "text": "I'm in a <em>good</em> mood - and I <strong>love</strong> açaí!"
}
Add the asciifolding token filter
The asciifolding filter converts characters to their ASCII equivalents where one exists; for example, açaí becomes acai.
POST /_analyze
{
  "char_filter": ["html_strip"],
  "tokenizer": "standard",
  "filter": [
    "lowercase",
    "stop",
    "asciifolding"
  ],
  "text": "I'm in a <em>good</em> mood - and I <strong>love</strong> açaí!"
}
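To make the order of the stages concrete, here is a rough Python approximation of the chain built up so far: character filter, tokenizer, then the three token filters. It is only a sketch and not how Elasticsearch implements these components. The function names, the regular expressions, and the shortened stop word set are all assumptions made for illustration.

```python
import html
import re
import unicodedata

# Illustrative subset only; the default English stop word list in
# Elasticsearch is longer than this.
ENGLISH_STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to"}


def strip_html(text: str) -> str:
    """Character filter: drop tags, then decode entities such as &amp;."""
    return html.unescape(re.sub(r"<[^>]+>", "", text))


def tokenize(text: str) -> list[str]:
    """Tokenizer: split into words, keeping apostrophes inside a word."""
    return re.findall(r"\w+(?:'\w+)*", text)


def ascii_fold(token: str) -> str:
    """Token filter: decompose accented letters and drop the accents."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text: str) -> list[str]:
    """Run the stages in the same order as the request above."""
    tokens = tokenize(strip_html(text))                          # html_strip + tokenizer
    tokens = [t.lower() for t in tokens]                         # lowercase
    tokens = [t for t in tokens if t not in ENGLISH_STOP_WORDS]  # stop
    return [ascii_fold(t) for t in tokens]                       # asciifolding


print(analyze("I'm in a <em>good</em> mood - and I <strong>love</strong> açaí!"))
# ["i'm", 'good', 'mood', 'i', 'love', 'acai']
```

The order matters: lowercasing runs before the stop filter so that capitalized stop words are also caught, and folding runs last so the earlier stages see the original characters.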
Create a custom analyzer named my_custom_analyzer
PUT /analyzer_test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "char_filter": ["html_strip"],
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "stop",
            "asciifolding"
          ]
        }
      }
    }
  }
}
Configure the analyzer to remove Danish stop words
To run this request, change the index name (or delete the existing analyzer_test index first) to avoid a conflict with the index created above. Remember to remove the # comments before sending it.
PUT /analyzer_test
{
  "settings": {
    "analysis": {
      "filter": {
        "danish_stop": {
          "type": "stop",
          "stopwords": "_danish_"
        }
      },
      "char_filter": {
        # Add character filters here
      },
      "tokenizer": {
        # Add tokenizers here
      },
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "char_filter": ["html_strip"],
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "danish_stop",
            "asciifolding"
          ]
        }
      }
    }
  }
}
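If you copy a request body like the one above into a script, the # comment lines have to go before it can be parsed as JSON. The following is a minimal sketch of one way to do that in Python. The helper name strip_console_comments is hypothetical, and it only handles comments that occupy a whole line, which is how they appear above.

```python
import json


def strip_console_comments(body: str) -> str:
    """Drop every line whose first non-blank character is '#'."""
    kept = [line for line in body.splitlines()
            if not line.lstrip().startswith("#")]
    return "\n".join(kept)


# Shortened copy of the placeholder sections from the request above.
console_body = """
{
  "settings": {
    "analysis": {
      "char_filter": {
        # Add character filters here
      },
      "tokenizer": {
        # Add tokenizers here
      }
    }
  }
}
"""

settings = json.loads(strip_console_comments(console_body))
print(settings["settings"]["analysis"])
# {'char_filter': {}, 'tokenizer': {}}
```

Once the comments are removed, the placeholder sections are simply empty objects, so the body parses as ordinary JSON.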
Test the custom analyzer
POST /analyzer_test/_analyze
{
  "analyzer": "my_custom_analyzer",
  "text": "I'm in a <em>good</em> mood - and I <strong>love</strong> açaí!"
}
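The same test can be run from a script instead of the Console. The sketch below uses only the Python standard library and assumes an unsecured cluster listening on http://localhost:9200; adjust BASE_URL and add authentication for a real deployment. The helper names build_analyze_request and analyze_tokens are made up for this example.

```python
import json
import urllib.request

# Assumption: a local cluster without TLS or authentication.
BASE_URL = "http://localhost:9200"


def build_analyze_request(index: str, analyzer: str, text: str) -> urllib.request.Request:
    """Build the POST /<index>/_analyze request without sending it."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_tokens(index: str, analyzer: str, text: str) -> list[str]:
    """Send the request and return only the token strings from the response."""
    request = build_analyze_request(index, analyzer, text)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Each entry in "tokens" also carries offsets, type and position.
    return [entry["token"] for entry in payload["tokens"]]
```

With a running cluster and the index created above, calling analyze_tokens("analyzer_test", "my_custom_analyzer", "...") returns the list of tokens the analyzer produces for that text.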