Find the Balance Between MPP Databases and Spark for Analytical Processing
Dave Menninger, SVP and Research Director at Ventana Research, examines the strengths of Apache Spark and massively parallel processing (MPP) databases such as Vertica. Both are designed for the demands of high-scale analytical workloads, and each has strengths across the full data science workflow, from consolidating data from many silos to deploying and managing machine learning models. Understanding the power of each technology, and the cost and performance trade-offs between them, can help you optimize your analytics architecture to get the best of both.