HTML Strip Char Filter

Posted: 2017-08-03 10:56:06

The html_strip character filter strips HTML elements from the text and replaces HTML entities with their decoded value (e.g. replacing &amp; with &).
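To make "decoded value" concrete, here is a minimal Python sketch of the entity-decoding half of that description. It uses the standard library's html.unescape as a stand-in; it is not the filter's own code, and the helper name decode_entities is hypothetical.

```python
import html


def decode_entities(text):
    """Replace HTML character references with the characters they stand for.

    Assumption: html.unescape is only a stand-in for the decoding step that
    html_strip performs; the filter itself runs inside Elasticsearch.
    """
    return html.unescape(text)


print(decode_entities("&amp;"))      # &
print(decode_entities("I&apos;m"))   # I'm
```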

Example output

POST _analyze
{
  "tokenizer":      "keyword", 
  "char_filter":  [ "html_strip" ],
  "text": "<p>I&apos;m so <b>happy</b>!</p>"
}

The keyword tokenizer returns a single term.

The above example returns the term:

[ \nI'm so happy!\n ]

The same example with the standard tokenizer would return the following terms:

[ I'm, so, happy ]
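The difference between the two results comes from the tokenizer, not from the character filter: both tokenizers receive the same filtered text. The Python sketch below contrasts the two on that text. The function names are hypothetical, and the regular expression is only a rough stand-in for the standard tokenizer, which applies Unicode word-boundary rules that are not reproduced here.

```python
import re


def keyword_terms(text):
    # The keyword tokenizer emits the whole input as a single term.
    return [text]


def rough_word_terms(text):
    # Assumption: a crude approximation of word splitting. It keeps an
    # apostrophe inside a word and drops surrounding punctuation and whitespace.
    return re.findall(r"\w+(?:'\w+)*", text)


filtered = "\nI'm so happy!\n"   # the text after html_strip has run
print(keyword_terms(filtered))     # one term, newlines included
print(rough_word_terms(filtered))  # ["I'm", 'so', 'happy']
```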

Configuration

The html_strip character filter accepts the following parameter:

escaped_tags

An array of HTML tags which should not be stripped from the original text.

Example configuration

In this example, we configure the html_strip character filter to leave <b> tags in place:

PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "keyword",
          "char_filter": ["my_char_filter"]
        }
      },
      "char_filter": {
        "my_char_filter": {
          "type": "html_strip",
          "escaped_tags": ["b"]
        }
      }
    }
  }
}

POST my_index/_analyze
{
  "analyzer": "my_analyzer",
  "text": "<p>I&apos;m so <b>happy</b>!</p>"
}

The above example produces the following term:

[ \nI'm so <b>happy</b>!\n ]
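The effect of escaped_tags can be imitated outside Elasticsearch with a short Python sketch built on the standard library's html.parser. This is an illustration of the behaviour shown in the examples, not the filter's implementation. The names StripSketch, strip_html and BLOCK_TAGS are hypothetical, and the set of tags that turn into newlines is an assumed subset chosen to match the <p> example.

```python
from html.parser import HTMLParser

# Assumption: an illustrative subset of tags that are replaced by a newline.
# The list used by the real filter is not reproduced here.
BLOCK_TAGS = {"p", "div", "li", "h1", "h2", "h3"}


class StripSketch(HTMLParser):
    def __init__(self, escaped_tags=()):
        # convert_charrefs=True makes the parser decode entities such as &apos;
        super().__init__(convert_charrefs=True)
        self.escaped = {tag.lower() for tag in escaped_tags}
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.escaped:
            # Keep the start tag exactly as it appeared in the input.
            self.parts.append(self.get_starttag_text())
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")
        # Any other tag is dropped.

    def handle_endtag(self, tag):
        if tag in self.escaped:
            self.parts.append(f"</{tag}>")
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        # Text arrives with entities already decoded.
        self.parts.append(data)


def strip_html(text, escaped_tags=()):
    parser = StripSketch(escaped_tags)
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)


sample = "<p>I&apos;m so <b>happy</b>!</p>"
print(repr(strip_html(sample)))                      # "\nI'm so happy!\n"
print(repr(strip_html(sample, escaped_tags=["b"])))  # "\nI'm so <b>happy</b>!\n"
```

With no escaped tags the sketch reproduces the term from the first example; with ["b"] it leaves the <b> element in place, as in the configured analyzer.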


Source: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-htmlstrip-charfilter.html#analysis-htmlstrip-charfilter

Original post: http://www.cnblogs.com/a-du/p/7278302.html