CopyPastor

Detecting plagiarism made easy.

Score: 0.8021521209743173; Reported for: String similarity

Possible Plagiarism

Plagiarized on 2023-04-26
by rabbitbr

Original Post

Original - Posted on 2011-09-20
by roka



            
Present in both answers; Present only in the new answer; Present only in the old answer;

Here is an example I put together. You can combine queries: I kept your multi_match query and added a second multi_match query of type "[phrase_prefix][1]".

    POST teste/_doc
    {
      "first_name": "Example Bar"
    }

    GET teste/_search
    {
      "query": {
        "bool": {
          "minimum_should_match": 1,
          "should": [
            {
              "multi_match": {
                "query": "Example B",
                "fields": ["first_name", "last_name"],
                "type": "cross_fields",
                "operator": "and"
              }
            },
            {
              "multi_match": {
                "query": "Example B",
                "fields": ["first_name", "last_name"],
                "type": "phrase_prefix",
                "operator": "and"
              }
            }
          ]
        }
      }
    }
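
If you build the request body in application code, the same combination can be sketched in Python as below. This is only a minimal sketch: the helper name `build_prefix_query` is hypothetical, and the snippet only assembles and prints the JSON body; it does not talk to a cluster.

```python
import json


def build_prefix_query(text, fields):
    """Build a bool query with a cross_fields and a phrase_prefix multi_match.

    Hypothetical helper for illustration; it mirrors the request body above.
    """
    def multi_match(match_type):
        return {
            "multi_match": {
                "query": text,
                "fields": list(fields),
                "type": match_type,
                "operator": "and",
            }
        }

    return {
        "query": {
            "bool": {
                # At least one of the two should clauses has to match.
                "minimum_should_match": 1,
                "should": [
                    multi_match("cross_fields"),
                    multi_match("phrase_prefix"),
                ],
            }
        }
    }


body = build_prefix_query("Example B", ["first_name", "last_name"])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of `GET teste/_search` with whatever HTTP client you already use.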

[1]: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html#multi-match-types
I'm using nGram, too. I use the standard tokenizer and apply nGram only as a filter. Here is my setup:

    {
      "index": {
        "index": "my_idx",
        "type": "my_type",
        "analysis": {
          "index_analyzer": {
            "my_index_analyzer": {
              "type": "custom",
              "tokenizer": "standard",
              "filter": ["lowercase", "mynGram"]
            }
          },
          "search_analyzer": {
            "my_search_analyzer": {
              "type": "custom",
              "tokenizer": "standard",
              "filter": ["standard", "lowercase", "mynGram"]
            }
          },
          "filter": {
            "mynGram": {
              "type": "nGram",
              "min_gram": 2,
              "max_gram": 50
            }
          }
        }
      }
    }

This lets you find word parts up to 50 letters long. Adjust max_gram as needed. German words can get really long, so I set it to a high value.
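
To get a feel for what such a filter emits per token, here is a rough Python sketch. It is an approximation for illustration, not Lucene's implementation: the function name `ngrams` is made up, the lowercasing stands in for the separate lowercase filter, and the order of the grams may differ from what Elasticsearch actually produces.

```python
def ngrams(token, min_gram=2, max_gram=50):
    """Return all character n-grams of one token, sizes min_gram..max_gram.

    Illustrative approximation of an nGram token filter applied after
    lowercasing; not the actual Lucene code.
    """
    token = token.lower()
    grams = []
    for start in range(len(token)):
        # A gram cannot run past the end of the token.
        longest = min(max_gram, len(token) - start)
        for size in range(min_gram, longest + 1):
            grams.append(token[start:start + size])
    return grams


print(ngrams("Haus"))
# → ['ha', 'hau', 'haus', 'au', 'aus', 'us']
```

A long compound such as "Gesellschaft" yields "schaft" as one of its grams, which is why a search for that word part can match. The number of grams grows quickly with token length, so a high max_gram makes the index noticeably larger.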
