Starting with WebSpellChecker Server version 5.6.3.0, you can enable n-gram data sets to detect errors in commonly confused words, such as "their" and "there". This option is available for English, German, French, Spanish, and Dutch. To enable n-gram data sets, follow the steps in this guide.

Note

N-gram data sets take 2.4 to 14.3 GB of disk space, depending on the chosen language. Please make sure you have a fast SSD.

1. Stop AppServer

Before making any changes to the AppServerX.xml file, it is recommended to stop AppServer.

2. Specify the path to n-gram data sets in the AppServer configuration file

  • Open the AppServerX.xml configuration file for editing. 
Info

The default path to the AppServerX.xml file: <WebSpellChecker_Installation_Path>/AppServer/AppServerX.xml

  • Find the section containing the PathToNgramData parameter, which configures the n-gram data sets.
AppServerX.xml:
<!-- Path to n-gram data sets. Can be used to improve grammar quality. -->
<!-- <PathToNgramData></PathToNgramData>-->
  • Uncomment the PathToNgramData parameter and set it to the path of the unzipped n-gram folder.
AppServerX.xml:
<PathToNgramData>your_path_to_ngrams</PathToNgramData>


Info

Path example for Windows: <PathToNgramData>C:/Program Files/WebSpellChecker/AppServer/NgramData/</PathToNgramData>

Path example for Linux: <PathToNgramData>/home/spellcheck/svc/NgramData/</PathToNgramData>

  • Add the EnableNgramData parameter to each language for which you want to use n-grams.
AppServerX.xml:
<EnableNgramData>true</EnableNgramData>

The following example shows the EnableNgramData parameter added for American English. You can find the list of language short codes (used as the Language Id) and their corresponding languages in the Default Language section.

AppServerX.xml:
<Language Id="en_US">
	<Alias>en</Alias>
	<Alias>am</Alias>
	<GrammarCheckProviderOptions>en-US</GrammarCheckProviderOptions>
	<EnableNgramData>true</EnableNgramData>
	<ThesaurusEnabled>true</ThesaurusEnabled>
	<SpellEngineOptions>
		<Locale>am</Locale>
		<SpellCheckProvider>ssce</SpellCheckProvider>
		<Dictionary FullPath="ssceam2.clx">
			<ForSuggest>no</ForSuggest>
		</Dictionary>
		<Dictionary FullPath="ssceam2s.clx">
			<ForSuggest>yes</ForSuggest>
		</Dictionary>
		<Dictionary FullPath="sscema2.clx"/>
		<Dictionary FullPath="keywords.clx"/>
		<Dictionary FullPath="ssceam.tlx"/>
	</SpellEngineOptions>
</Language>
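
As a rough sketch of how the two settings fit together, the fragment below shows the path parameter alongside a second language with n-grams enabled. The de_DE identifier and the elided child elements are assumptions for illustration, and the Linux path is simply the example path from this guide. Check the Default Language section and your own AppServerX.xml for the actual values and for where each element belongs in the file.

```xml
<!-- Assumed path: replace it with the location of your unzipped n-gram folder. -->
<PathToNgramData>/home/spellcheck/svc/NgramData/</PathToNgramData>

<!-- Hypothetical example for German; de_DE is an assumed Language Id. -->
<Language Id="de_DE">
	<!-- Keep the existing child elements of this language unchanged. -->
	<EnableNgramData>true</EnableNgramData>
</Language>
```

Because the parameter is set per language, it can be added to one Language section at a time, which also helps keep disk usage in check.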

3. Start AppServer

Once you have made the necessary changes to AppServerX.xml to enable n-gram data sets, start AppServer for the changes to take effect.