PartsOfSpeech

Tags the words in one or more sentences with their part-of-speech classification, using Penn Treebank part-of-speech tags for English text.

Syntax

SELECT PartsOfSpeech('sentences'[, language='lang'] [USING PARAMETERS [language='lang'][, adjustcasing=boolean]]) OVER(PARTITION BEST);

Parameters

Argument Description

sentences

One or more sentences to be tagged with parts of speech markup.

language

Optional. The language of the input text:

  • 'english' or 'en'
  • 'spanish' or 'es'

adjustcasing

Optional. Defaults to false. When set to true, all letters in the text are converted to uppercase before sentence detection. After performing sentence detection, Vertica converts all letters to lowercase. This option can help when the original data is entirely in lowercase letters and Pulse is incorrectly identifying sentence boundaries.
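
The syntax above suggests that adjustcasing is supplied through the USING PARAMETERS clause. The following is a minimal sketch, assuming the usual Vertica form in which USING PARAMETERS follows the last argument inside the parentheses. The all-lowercase input is made up for illustration, and no output is shown because it has not been verified.

```sql
-- Hypothetical input: two sentences written entirely in lowercase.
-- adjustcasing=true requests the casing adjustment described above
-- before sentence detection runs.
SELECT PartsOfSpeech('the quick brown fox jumped. the lazy dog slept.'
                     USING PARAMETERS language='english', adjustcasing=true)
OVER(PARTITION BEST);
```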

Examples

select partsOfSpeech('The quick brown fox jumped over the lazy dog.') OVER(PARTITION BEST);
 sentence | token  | part_of_speech 
----------+--------+----------------
        1 | the    | DT
        1 | quick  | JJ
        1 | brown  | JJ
        1 | fox    | NN
        1 | jumped | VBD
        1 | over   | IN
        1 | the    | DT
        1 | lazy   | JJ
        1 | dog    | NN
        1 | .      | .
(10 rows)
 
select partsOfSpeech('Every good boy deserves fudge.') OVER(PARTITION BEST);
 sentence |  token   | part_of_speech 
----------+----------+----------------
        1 | every    | DT
        1 | good     | JJ
        1 | boy      | NN
        1 | deserves | VBZ
        1 | fudge    | NN
        1 | .        | .
(6 rows)

select partsOfSpeech('The quick brown fox jumped over the lazy dog.', 'english') 
OVER(PARTITION BEST);
 sentence | token  | part_of_speech 
----------+--------+----------------
        1 | the    | DT
        1 | quick  | JJ
        1 | brown  | JJ
        1 | fox    | NN
        1 | jumped | VBD
        1 | over   | IN
        1 | the    | DT
        1 | lazy   | JJ
        1 | dog    | NN
        1 | .      | .
(10 rows)

select partsofSpeech('El zorro rapido brinco sobre el perro flojo','spanish') 
over();
 sentence | token  | part_of_speech 
----------+--------+----------------
        1 | El     | DA
        1 | zorro  | NC
        1 | rapido | AQ
        1 | brinco | AQ
        1 | sobre  | SP
        1 | el     | DA
        1 | perro  | NC
        1 | flojo  | AQ
(8 rows)
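
Because the function returns the columns sentence, token, and part_of_speech, its output can in principle be filtered like any other result set. The following is a sketch, assuming PartsOfSpeech can be wrapped in a subquery the way other Vertica transform functions can. The alias tagged is illustrative only.

```sql
-- Keep only the noun tokens (Penn Treebank tags that begin with NN).
SELECT token, part_of_speech
FROM (
    SELECT PartsOfSpeech('The quick brown fox jumped over the lazy dog.')
           OVER(PARTITION BEST)
) AS tagged
WHERE part_of_speech LIKE 'NN%';
```

Given the tags shown in the first example, this filter should keep only the rows for fox and dog.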

See Also