Loading Dictionaries and Mappings into Pulse

After you change the Pulse schema tables, you must load the dictionaries, the normalization map, or both, depending on what you changed. Once the changes are loaded, Pulse stores them in memory across all sessions in the cluster. Because Pulse automatically loads the dictionaries and the normalization map at startup, you do not need to reload them after a database restart or system reboot.

Automatically Loading Dictionaries and the Normalization Map

For ease of use, Pulse ships with a script that automatically loads all of the required user dictionaries and the normalization map into memory. This script exists only on the node on which you installed the Pulse RPM/DEB package.

You can run the script from within vsql with the following command:

\i /opt/vertica/packages/pulse/ddl/loadUserDictionaries.sql

Manually Loading Dictionaries and the Normalization Map

To load specific user dictionaries or mappings manually from the Pulse schema tables, use the following commands. This example adds words to the pos_words and white_list dictionaries and then loads the white_list dictionary. See LoadDictionary() for valid values for the listName parameter and for multilingual version loading.

Note: The following examples use the English dictionaries. For Spanish, replace "_en" with "_es".

  1. Add a word to the pos_words dictionary:

    => INSERT INTO pulse.pos_words_en VALUES('SuperDuper');
    => COMMIT;

    By default, added words are not case sensitive. "ERROR" produces the same results as "error". You can, however, specify a case setting for a single word using the $Case parameter. For example, to identify "Apple", rather than "apple", you would add the following:

    => INSERT INTO pulse.white_list_en VALUES('$Case(Apple)');
    => COMMIT;

  2. Load the updated dictionary into Pulse:

    => SELECT LoadDictionary(standard USING PARAMETERS
       listName='white_list') OVER()
       FROM pulse.white_list_en;

  3. If you change the normalization map, you can load the new normalization values with the following command:

    => SELECT LoadMapping(standard_base, standard_synonym USING PARAMETERS
       mapName='normalization') OVER() FROM pulse.normalization_en;

    After loading, Vertica returns a success message and the number of rows (words or word pairs) loaded.
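
The procedure above inserts a word into pos_words but only loads white_list. A minimal sketch of loading the pos_words dictionary in the same way follows. It assumes that pulse.pos_words_en exposes a column named standard, as pulse.white_list_en does, and that 'pos_words' is an accepted listName value. Neither is confirmed here, so check LoadDictionary() before relying on it.

```sql
-- Assumption: pulse.pos_words_en has a "standard" column, like pulse.white_list_en.
-- Assumption: 'pos_words' is a valid listName value; see LoadDictionary().
SELECT LoadDictionary(standard USING PARAMETERS
    listName='pos_words') OVER()
FROM pulse.pos_words_en;
```

If the assumptions hold, the statement should report the rows loaded in the same way as the white_list example.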