Allow configuring additional stop words

By default, Lunr provides an English language stop word list filter: https://github.com/olivernn/lunr.js/blob/master/lib/stop_word_filter.js#L43-L163

In addition, Lunr languages provides stop word list filters for other languages, for instance French: https://github.com/MihaiValentin/lunr-languages/blob/177653fb567006478ccb0ab6920b78eb77cad14e/lunr.fr.js#L699

I think it would be useful to be able to add additional stop words. For instance, a user might want to add "basically" and "actually" as stop words:

antora:
  extensions:
  - require: '@antora/lunr-extension'
    additional_stop_words: [basically, actually]
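
To illustrate what such an option might do at indexing time, here is a minimal, self-contained sketch of a stop word filter that merges a default list with user-supplied words. The helper name createStopWordFilter and the three-word default list are hypothetical stand-ins; a real implementation would presumably build on Lunr's own stop word list and pipeline, and would operate on Lunr tokens rather than plain strings.

```javascript
// Hypothetical helper (not part of Lunr or the Antora Lunr extension):
// builds a filter that drops both the default and the additional stop words.
function createStopWordFilter (defaultStopWords, additionalStopWords = []) {
  const stopWords = new Set(
    [...defaultStopWords, ...additionalStopWords].map((word) => word.toLowerCase())
  )
  // Return the token unchanged to keep it, or undefined to drop it.
  return (token) => (stopWords.has(token.toLowerCase()) ? undefined : token)
}

// Tiny stand-in for Lunr's English stop word list (assumption, for illustration only).
const defaultStopWords = ['a', 'the', 'is']
const filter = createStopWordFilter(defaultStopWords, ['basically', 'actually'])

// Plain strings are used here instead of Lunr tokens to keep the sketch standalone.
const kept = ['This', 'is', 'basically', 'a', 'test'].filter((token) => filter(token) !== undefined)
console.log(kept)
```

With the additional words applied, "basically" is dropped alongside the default stop words, so it would no longer be indexed.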

Since stop words are usually tied to a particular language, we might want to configure them per language:

antora:
  extensions:
  - require: '@antora/lunr-extension'
    languages:
    - fr
    - en:
        additional_stop_words: [basically, actually]

Should we support both configurations? The first example is basically a shorter way of writing:

antora:
  extensions:
  - require: '@antora/lunr-extension'
    languages:
    - en:
        additional_stop_words: [basically, actually]
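
If both forms were supported, the extension could normalize them into a single per-language map before building the index. The following is only a sketch under assumptions: normalizeStopWords is a hypothetical function, the key names follow the proposal above rather than an implemented API, and treating the top-level shorthand as applying to en (and defaulting languages to en) is an assumption drawn from the equivalence described in this issue.

```javascript
// Hypothetical normalizer: converts either proposed configuration form into
// a map of language code -> additional stop words.
function normalizeStopWords (config = {}) {
  const result = {}
  // Assumption: English is the only language when none is configured.
  const languages = config.languages || ['en']
  for (const entry of languages) {
    if (typeof entry === 'string') {
      // Plain entry such as "fr": no additional stop words.
      result[entry] = result[entry] || []
    } else {
      // Map entry such as { en: { additional_stop_words: [...] } }.
      for (const [lang, options] of Object.entries(entry)) {
        const words = (options && options.additional_stop_words) || []
        result[lang] = [...(result[lang] || []), ...words]
      }
    }
  }
  // Assumption: the top-level shorthand applies to English.
  if (config.additional_stop_words) {
    result.en = [...(result.en || []), ...config.additional_stop_words]
  }
  return result
}

// Shorthand form (first example in this issue).
const shorthand = normalizeStopWords({ additional_stop_words: ['basically', 'actually'] })
// Per-language form (second example in this issue), as a YAML parser would load it.
const perLanguage = normalizeStopWords({
  languages: ['fr', { en: { additional_stop_words: ['basically', 'actually'] } }],
})
console.log(shorthand, perLanguage)
```

Normalizing up front would keep the rest of the extension unaware of which form the user chose.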

Alternatively, we could use extra_stop_words instead of additional_stop_words.
