Loading Dictionaries and Mappings into Pulse
After you change the Pulse schema tables, you must load the affected dictionaries, the normalization map, or both. After the changes are loaded, Pulse stores them in memory across all sessions in the cluster. Because Pulse automatically loads the dictionaries and mapping at startup, you do not need to reload them after a database restart or system reboot.
- To load an individual user dictionary into memory, use the LoadDictionary() function with the appropriate listName parameter and word list.
- LoadDictionary does not append user-dictionary lists. Instead, it overwrites them. If you load a user dictionary more than once with the same list name, then only the most recent user dictionary is loaded for that list name.
- To load the normalization mapping into memory, use the LoadMapping() function with the normalization map.
- If you load a mapping with an incorrect mapName, then the result of LoadMapping() is false and the map is not loaded. LoadMapping() does not append maps. Instead, it overwrites them. If you load a map more than once with the same mapName, then only the most recent mapping is loaded for that mapName.
- If LoadMapping() is successful, Vertica returns a success message from each node in the cluster.
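The overwrite behavior described in the list above can be pictured with a small Python sketch. This is a toy model, not Pulse code: the InMemoryStore class, its method names, and the set of valid map names are hypothetical and exist only to illustrate that a load replaces, rather than extends, whatever is already held under the same name.

```python
# Toy model only: this is not how Pulse is implemented.
# It mimics the documented rule that a load replaces, rather than
# extends, whatever is held in memory under the same name.


class InMemoryStore:
    """Hypothetical stand-in for the lists and maps Pulse keeps in memory."""

    # Assumption: "normalization" is the only mapName shown in this section.
    VALID_MAP_NAMES = {"normalization"}

    def __init__(self):
        self.dictionaries = {}  # listName -> set of words
        self.mappings = {}      # mapName -> list of word pairs

    def load_dictionary(self, list_name, words):
        # Overwrite: any earlier list stored under list_name is discarded.
        self.dictionaries[list_name] = set(words)
        return len(self.dictionaries[list_name])

    def load_mapping(self, map_name, pairs):
        # An unrecognized map name loads nothing and reports failure.
        if map_name not in self.VALID_MAP_NAMES:
            return False
        self.mappings[map_name] = list(pairs)
        return True


store = InMemoryStore()
store.load_dictionary("white_list", ["Vertica", "Pulse"])
store.load_dictionary("white_list", ["SuperDuper"])  # second load, same name

print(sorted(store.dictionaries["white_list"]))       # prints ['SuperDuper']
print(store.load_mapping("not_a_map", [("a", "b")]))  # prints False
```

In this model, only the words from the second load survive. To keep earlier words in a real deployment, they would need to remain in the schema table that is passed to the load call.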
Automatically Loading Dictionaries and the Normalization Map
For ease of use, Pulse ships with a script that automatically loads all of the required user dictionaries and the normalization mapping into memory. This script exists only on the node on which you installed the Pulse RPM/DEB package.
You can run the script from within vsql with the following command:
\i /opt/vertica/packages/pulse/ddl/loadUserDictionaries.sql
Manually Loading Dictionaries and the Normalization Map
To manually load individual user dictionaries or mappings from the Pulse schema tables, follow the steps below. The examples add a word to the pos_words dictionary, add a case-sensitive word to the white_list dictionary, and then load the white_list dictionary. See LoadDictionary() for valid values for the listName parameter and for multilingual version loading.
- Add a word to the pos_words dictionary:
=> INSERT INTO pulse.pos_words_en VALUES('SuperDuper');
=> COMMIT;
By default, added words are not case sensitive: "ERROR" produces the same results as "error". You can, however, specify a case setting for a single word using the $Case parameter. For example, to identify "Apple" rather than "apple", add the following:
=> INSERT INTO pulse.white_list_en VALUES('$Case(Apple)');
=> COMMIT;
- Load the updated dictionary into Pulse:
=> SELECT LoadDictionary(standard USING PARAMETERS
   listName='white_list') OVER()
   FROM pulse.white_list_en;
- If you change the normalization map, load the new normalization values with the following command:
=> SELECT LoadMapping(standard_base, standard_synonym USING PARAMETERS mapName='normalization') OVER() FROM pulse.normalization_en;
After loading, Vertica returns a success message and the number of rows (words or word pairs) loaded.
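The case rule described in this section (entries are case insensitive unless wrapped in $Case) can be illustrated with a short Python sketch. This is an assumed model of the behavior, not Pulse's actual matching logic, which this document does not show. The build_matcher helper and the regular expression are hypothetical.

```python
import re

# Toy model only; the real matching logic in Pulse is not shown in this document.
# Recognizes entries written as $Case(word) and captures the inner word.
CASE_ENTRY = re.compile(r"^\$Case\((.+)\)$")


def build_matcher(entries):
    """Split dictionary entries into exact-case and case-insensitive sets."""
    exact, folded = set(), set()
    for entry in entries:
        m = CASE_ENTRY.match(entry)
        if m:
            exact.add(m.group(1))        # must match with this exact casing
        else:
            folded.add(entry.lower())    # matched regardless of casing

    def matches(token):
        return token in exact or token.lower() in folded

    return matches


matches = build_matcher(["$Case(Apple)", "ERROR"])
print(matches("Apple"))  # prints True
print(matches("apple"))  # prints False
print(matches("error"))  # prints True
```

Under this model, "$Case(Apple)" identifies only "Apple", while the plain entry "ERROR" also matches "error".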