Ever wonder what Vertica engineers do when they aren’t busy building and perfecting an awesome product?
You guessed it: they read technical papers published by respected peers! Back in September, Styliani Pantela, a Software Engineer on the Vertica Query Optimizer team, started a reading group and has been organizing weekly discussions ever since. The group has picked plenty of interesting articles on topics like pure systems, distributed infrastructure algorithms, machine learning, and analytics. Below is a sample of free, easily accessible documents from various journals and conferences.
Alagiannis, S. Idreos, and A. Ailamaki. 2014. H2O: a hands-free adaptive store. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data (SIGMOD ’14). ACM, New York, NY, USA, 1103-1114.
In this paper, the authors present the H2O system, which “support multiple storage layouts and data access patterns in a single engine” and “decides on-the-fly, i.e., during query processing, which design is best for classes of queries and the respective data parts.”
Graefe, H. Volos, H. Kimura, H. Kuno, J. Tucek, M. Lillibridge, and A. Veitch. 2014. In-memory performance for big data. Proc. VLDB Endow. 8, 1 (September 2014), 37-48.
In this paper, the authors present a buffer pool design that can “match in-memory performance while supporting the ‘big data’ workloads that continue to require secondary storage.”
Cooper. 2013. Spanner: Google’s globally-distributed database. In Proceedings of the 6th International Systems and Storage Conference (SYSTOR ’13). ACM, New York, NY, USA, Article 9, 1 page.
This paper presents Spanner, Google’s “scalable, multi-version, globally-distributed, and synchronously-replicated database”, and describes its feature set.
Gupta, et al. Presented at VLDB (2014)
This paper presents Mesa, an analytic data warehousing system that “is designed to satisfy a complex and challenging set of user and systems requirements, including near real-time data ingestion and queryability, as well as high availability, reliability, fault tolerance, and scalability for large data and query volumes.”
Kulkarni and J. Michels. 2012. Temporal features in SQL:2011. SIGMOD Rec. 41, 3 (October 2012), 34-43.
This paper demonstrates how to create and manipulate temporal tables in SQL:2011.
Curtsinger and E. Berger. Presented at SOSP 2015
The authors present and evaluate causal profiling, which directs programmers to where they need to focus their optimization efforts.
Petraki, S. Idreos, and S. Manegold, “Holistic Indexing in Main-memory Column-stores,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, Melbourne, Australia, 2015.
In this paper, the authors present holistic indexing, an index-tuning approach for columnar databases that adapts to dynamic environments.
Yang, N. Meneghetti, R. Fehling, Z. Hua Liu, and O. Kennedy. 2015. Lenses: an on-demand approach to ETL. Proc. VLDB Endow. 8, 12 (August 2015), 1578-1589.
The authors explore a method for creating ETL workflows “on-demand”.
Hideaki Kimura. 2015. FOEDUS: OLTP Engine for a Thousand Cores and NVRAM. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (SIGMOD ’15). ACM, New York, NY, USA, 691-706.
This paper introduces and evaluates FOEDUS, an open-source database engine with a non-traditional architectural design that aims to “extend in-memory database technologies to further scale up.”
Do you have interesting articles you want to share? Feel free to leave us a comment with links!