Vertica in Good Health: Ingesting, Securing, and Analyzing Healthcare Data

Posted September 29, 2020 by Bryan Herger, Vertica Big Data Solution Architect at Micro Focus

Electronic medical record on digital smart device

Electronic Medical Records and new personal healthcare technology, particularly medical devices, generate a huge amount of data. Organizations including Cerner and Philips Healthcare rely on Vertica for data analysis. How can Vertica help you analyze this data? Vertica excels at data ingest and provides hundreds of built-in SQL and analytic functions with broad integration with programming languages, user-defined functions, and commercial and open source tools to extract, load, query, and analyze your data. We can analyze data at scale with current customers ingesting terabytes of data from devices and other sources. Also, we can secure your data to meet HIPAA and other private and government mandates with encryption at rest and in transit, access control down rows and columns, redaction of sensitive data and integration with organizational security such as Active Directory and Kerberos.

Vertica excels at ingesting strictly formatted records like sensor data for IoT data analytics, but how can you ingest electronic medical record data? Microsoft is leading an open source project hosted on GitHub to provide a Web UI and REST API to convert HL7 data to FHIR JSON using a template. Here’s a sample from the Web UI showing HL7, format template, and JSON output:

Records can be extracted automatically using a template loaded into the REST API, and the resulting FHIR JSON can be ingested into Vertica flex tables and from there you can query the data directly or extract and transform the data into a SQL table for analysis. Let’s take a look at queries in the above JSON sample in Vertica:

-- Show a record
vfhir=> select __identity__, maptostring(__raw__) from fhir_json;
250003 | {
"category.coding.0.code": "394914008",
"category.coding.0.display": "Radiology",
"category.coding.0.system": "",
"category.coding.1.code": "RAD",
"category.coding.1.system": "",
"code.coding.0.code": "429858000",
"code.coding.0.display": "Computed tomography (CT) of head and neck",
"code.coding.0.system": "",
"code.text": "CT of head-neck",
"codedDiagnosis.0.coding.0.code": "188340000",
"codedDiagnosis.0.coding.0.display": "Malignant tumor of craniopharyngeal duct",
"codedDiagnosis.0.coding.0.system": "",
"conclusion": "CT brains: large tumor sphenoid/clivus.",
"effectiveDateTime": "2012-12-01T12:00:00+01:00",
"id": "f201",
"imagingStudy.0.display": "HEAD and NECK CT DICOM imaging study",
"issued": "2012-12-01T12:00:00+01:00",
"performer.0.display": "Blijdorp MC",
"performer.0.reference": "Organization/f203",
"resourceType": "DiagnosticReport",
"status": "final",
"subject.display": "Roel",
"subject.reference": "Patient/f201",

-- Find records with given content
vfhir=> select __identity__, maplookup(__raw__,'conclusion') from fhir_json where maplookup(__raw__,'conclusion')::varchar ILIKE '%tumor%';
__identity__ | maplookup
250003 | CT brains: large tumor sphenoid/clivus.
250006 | CT brains: large tumor sphenoid/clivus.

Vertica offers features for handling complex data types in a secure and performant manner. The above JSON structure can be queried by key and transformed into a SQL table. You can build a pipeline to extract the content you need from medical records, laboratory reports, sensor data, and more for discovery, analysis, machine learning, and prediction. Contact us to find out how we can enable your analysis of health care data.


Related Links:

Learn how Cerner improves patient care quality by helping physicians work more efficiently:

Learn how Philips Healthcare leverages Vertica for predictive analytics in medical devices:

Microsoft FHIR Converter project:

Vertica flex tables for JSON ingest: