querydsl - Elasticsearch: how to match documents for which the field tokens are a subset of the query tokens
I have a keyword/key-phrase field that is tokenized using the standard analyzer. I want this field to match if the search phrase has all the tokens of the field in it.
For example, if the field value is "veni, vidi, vici", the search phrase "ceaser veni,vidi,vici" should match, but the search phrase "veni, vidi" should not.
I also need "vidi, veni, vici" (weird!) to match, so the positions and ordering of the terms are not important. A phrase match does not quite work for me, I think.
I can use a "bool query" with the "minimum_should_match" parameter for this specific example, but that is not what I want, because the minimum should match is a ratio/number of the tokens in the search phrase, not of the tokens in the field.
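To pin down the matching rule being asked for, here is a minimal Python sketch of the subset check. The names `rough_tokens` and `field_matches` are made up for illustration, and the regex tokenizer is only a crude stand-in for the standard analyzer (an assumption, not what Elasticsearch actually runs).

```python
import re

def rough_tokens(text):
    # Crude approximation of the standard analyzer (assumption):
    # lowercase the text and split on anything that is not a word character.
    return set(re.findall(r"\w+", text.lower()))

def field_matches(field_value, search_phrase):
    # The field matches when every one of its tokens also appears in the
    # search phrase; order and position are ignored because sets are used.
    return rough_tokens(field_value) <= rough_tokens(search_phrase)

field = "veni, vidi, vici"
for phrase in ["ceaser veni,vidi,vici", "veni, vidi", "vidi, veni, vici"]:
    print(phrase, "->", field_matches(field, phrase))
```

The first and third phrases contain every field token, so they match; the second is missing "vici", so it does not.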
A pure ES solution would go like this. You need two requests.
1) First, you need to pass the user query through the analyze API to get the search tokens.
curl -XGET 'localhost:9200/_analyze' -d ' { "analyzer" : "standard", "text" : "ceaser veni,vidi,vici" }'
You will get 4 tokens: ceaser, veni, vidi, vici. You need to pass these tokens as an array to the next search request.
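As a rough sketch of that hand-off, the token strings can be pulled out of the parsed analyze response like this. The helper name `tokens_from_analyze_response` is hypothetical, and `sample_response` is a hand-written, trimmed stand-in for the real response (each real entry also carries offsets, type and position).

```python
def tokens_from_analyze_response(response):
    # `response` is the parsed JSON body returned by the _analyze call.
    # Each entry of its "tokens" list holds the token text under "token".
    return [entry["token"] for entry in response["tokens"]]

# Hand-written sample, trimmed to the one key this sketch needs.
sample_response = {
    "tokens": [
        {"token": "ceaser"},
        {"token": "veni"},
        {"token": "vidi"},
        {"token": "vici"},
    ]
}

search_tokens = tokens_from_analyze_response(sample_response)
print(search_tokens)
```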
2) Next, you need to search for documents whose tokens are a subset of the search tokens.
{ "query": { "filtered": { "filter": { "bool": { "must": [ { "query": { "match": { "title": "ceaser veni,vidi,vici" } } }, { "script": { "script": "if(search_tokens.containsAll(doc['title'].values)){return true;}", "params": { "search_tokens": [ "ceaser", "veni", "vidi", "vici" ] } } } ] } } } } }
Here the job of the first match query inside the filter is to narrow down the documents on which the script should run. The containsAll method checks whether the document's tokens are a sublist of the search tokens. This will be slow with the current setup. One big improvement is to store the tokens as an array in a separate field, so that doc['title'].values can be replaced with that field to improve the script.
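If the second request is built in client code, a sketch along these lines could assemble the body from the tokens returned in step 1. The function name `build_subset_query` is hypothetical, and the body simply mirrors the `filtered` query shape shown above, so it assumes an Elasticsearch version that still accepts that syntax and allows inline scripts.

```python
import json

def build_subset_query(field, search_phrase, search_tokens):
    # Script filter: keep a document only if all of its field tokens are
    # contained in the search tokens passed in as a script parameter.
    script = "if(search_tokens.containsAll(doc['%s'].values)){return true;}" % field
    return {
        "query": {
            "filtered": {
                "filter": {
                    "bool": {
                        "must": [
                            # Match query narrows the set the script runs on.
                            {"query": {"match": {field: search_phrase}}},
                            {
                                "script": {
                                    "script": script,
                                    "params": {"search_tokens": search_tokens},
                                }
                            },
                        ]
                    }
                }
            }
        }
    }

body = build_subset_query(
    "title", "ceaser veni,vidi,vici", ["ceaser", "veni", "vidi", "vici"]
)
print(json.dumps(body, indent=2))
```

The printed JSON can then be sent as the body of the search request. Passing the tokens through `params` rather than splicing them into the script string keeps the script text the same for every query.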
Hope this helps!