GetAllSentences
Returns one row for each sentence in a body of text. Use this function when you need to process text programmatically, one sentence at a time.
Syntax
GetAllSentences(text [, language] [ USING PARAMETERS [ filterlinks = boolean ] [, filterusermentions = boolean ] [, filterhashtags = boolean ] [, adjustcasing = boolean ] [, language = string ] ])
Parameters
Argument | Description |
---|---|
text | The text from which to get the sentences. |
language | The language of the text. The examples on this page use 'english' and 'spanish'. |
filterlinks | Optional. Default false. When set to true, sentences that consist only of links are skipped. Links inside a sentence are not included in the extracted sentence. |
filterusermentions | Optional. Default false. When set to true, sentences that consist only of Twitter user mentions (@username) are skipped. User mentions inside a sentence are not included in the extracted sentence. |
filterhashtags | Optional. Default false. When set to true, sentences that consist only of Twitter hashtags (#hashtag) are skipped. Hashtags inside a sentence are not included in the extracted sentence. |
adjustcasing | Optional. Default false. When set to true, all letters in the sentence are converted to uppercase before sentence detection. After sentence detection, all letters are converted to lowercase. This option is helpful if the original data is all in lowercase and Pulse is incorrectly identifying parts of speech in the sentence. |
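The filter parameters can be combined in a single call. The following is a minimal sketch based only on the syntax shown above; the input text is invented for illustration, and no output is shown because the exact result depends on the Pulse installation.

```sql
-- Illustrative input text; not taken from a real data set.
-- Combines three of the optional filter parameters described above.
SELECT GetAllSentences(
    'Loving the new release! #launch http://example.com. @support was quick to help.'
    USING PARAMETERS filterlinks=true,
                     filterusermentions=true,
                     filterhashtags=true
) OVER(PARTITION BEST);
```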
Notes
- The text argument is limited to 65,000 bytes.
- This function must be used with the over() clause. Use with OVER(PARTITION BEST) for the best performance if the query does not require specific columns in the over() clause.
- language can be specified as an argument and/or as a parameter where the argument value supersedes the parameter value.
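The notes above can be sketched as follows. The table reviews and its columns are hypothetical and exist only for illustration; OCTET_LENGTH is the standard Vertica function that returns the length of a string in bytes. Treat this as a sketch under those assumptions rather than a tested recipe.

```sql
-- Hypothetical table: reviews(review_id INT, review_text VARCHAR(65000)).
-- OCTET_LENGTH keeps only rows within the 65,000-byte limit on the text argument.
-- OVER(PARTITION BEST) is used because no specific partition columns are needed.
SELECT GetAllSentences(review_text) OVER(PARTITION BEST)
FROM reviews
WHERE OCTET_LENGTH(review_text) <= 65000;

-- language supplied both ways: per the note above, the argument value
-- ('english') supersedes the parameter value ('spanish').
SELECT GetAllSentences('The first sentence. The second sentence.', 'english'
                       USING PARAMETERS language='spanish') OVER();
```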
Examples
SELECT GetAllSentences('The quick brown fox jumped over the lazy dog. Every good boy deserves fudge') OVER(PARTITION BEST);
sentence
-----------------------------------------------
The quick brown fox jumped over the lazy dog.
Every good boy deserves fudge.
(2 rows)
select getAllSentences('the quick brown fox jumped over the lazy dog. All good boys deserve fudge', 'english') over();
sentence_index | sentence_text
----------------+-----------------------------------------------
1 | the quick brown fox jumped over the lazy dog.
2 | All good boys deserve fudge
(2 rows)

select getAllSentences('the quick brown fox jumped over the lazy dog. All good boys deserve fudge' using parameters language='english') over();
sentence_index | sentence_text
----------------+-----------------------------------------------
1 | the quick brown fox jumped over the lazy dog.
2 | All good boys deserve fudge
(2 rows)

select getAllSentences('el zorro rapido brinco sobre el perro flojo. Todos los chicos buenos merecen un premio', 'spanish') over();
sentence_index | sentence_text
----------------+----------------------------------------------
1 | el zorro rapido brinco sobre el perro flojo.
2 | Todos los chicos buenos merecen un premio
(2 rows)

select getAllSentences('el zorro rapido brinco sobre el perro flojo. Todos los chicos buenos merecen un premio' using parameters language='spanish') over();
sentence_index | sentence_text
----------------+----------------------------------------------
1 | el zorro rapido brinco sobre el perro flojo.
2 | Todos los chicos buenos merecen un premio
(2 rows)
Filtering User-mentions
SELECT GetAllSentences('@user is always late. He kept me waiting 20 minutes last time.' USING PARAMETERS filterusermentions=true) OVER(PARTITION BEST);
sentence
-----------------------------------------
is always late.
he kept me waiting 20 minutes last time.
(2 rows)