
Shard Related System Tables
Shards are segments of the data in your communal storage location. The shards in your communal storage location are similar to a collection of segmented projections in an Enterprise mode database. Each node in your database subscribes to a subset of the shards in the communal storage location. The following shard related system tables were added in Eon mode:Table Name | Table Description |
---|---|
shards | Lists the shards in the database. |
node_subscriptions | Lists the nodes and its corresponding shards subscriptions in the database. |
session_subscriptions | Lists the session subscriptions in the database. Shows the session’s view of the node subscriptions. Session subscriptions are used when running queries within the session. |
catalog_subscription_changes | Lists node subscription state changes. |
node_subscription_change_phases | Lists details about node subscription changes. |
Depot Related System Tables
The depot provides the node with a local copy of data stored on the communal storage. Each node in an Eon database has its own depot. This intermediate layer of data storage provides a faster copy of the data that is local to the node. Data that is frequently used in your queries takes priority in your depot. If the data for a query is not in the depot, then Vertica might need to read data from communal storage. The following depot related system tables were added in Eon mode:Table Name | Table Description |
---|---|
depot_fetches | Lists the files fetched to the depot. |
depot_uploads | Lists the files uploaded to communal storage from the depot. |
depot_evictions | Lists the files evicted from the depot. |
vs_depot_lru | Lists the details of all the depot files. |
vs_depot_size | Lists the current and maximum size in bytes of the depots on all nodes. |
dc_file_reads | Lists the files read from the depot and/or S3 for each query. |
dc_communal_cleanup_records | Lists the data files detected on communal storage that are not referenced in the catalog. |
New Configuration Parameters
Configuration Parameter | Description |
---|---|
UseDepotForReads | Checks if the files are in the depot and reads from there. Disabling this parameter will force queries to use data from communal storage and have some performance impact. This flag is useful to test performance characteristics for data not in the depot. |
UsePeerToPeerDataTransfer | Sends data to all the subscribers for a shard at data load time. |
EnableDepotWarmingFromPeers | Warms up the depot when a node comes up. Disabling this parameter will have a performance impact for queries serviced by nodes with the cold depot. |
DelayForDeletes | The time interval to wait before actually deleting a file from shared storage. Default value is 2 hours. |
ShardRecoveryMaxIterations | Maximum number of iterations before the shard catalog installer is forced over concurrent loads. Default value is 10. |
DropFailedToActivateSubscriptions | Drops a subscription that failed to transition from PASSIVE TO ACTIVE. |
New Metafunctions
Metafunction | Description |
---|---|
clear_data_depot() | Deletes all the data from the depot for the given table. Deletes all the data in the depot, if no table name is passed. |
alter_location_size(‘path’, ‘node’,’size’) or alter_location_size(‘path’, ‘ ‘,’size’) | Changes the size of the depot. Vertica will delete the least recently accessed files, if the new size is smaller. If the node is left empty, all nodes in the cluster will be changed. |
finish_fetching_files() | Blocks the session until all the files in the fetcher queue to be downloaded from communal storage are downloaded. |
rebalance_shards() | Used to re-balance shard subscriptions between nodes when new nodes are added or removed from the cluster. |
Useful Queries
Query Description | SQL Statement |
---|---|
Query to list files read from depot and communal storage. | => select node_name,transaction_id,statement_id,storage_location,storageid
from dc_file_reads
where transaction_id=<transaction id> and statement_id=<statement id>; |
Number of bytes read from depot and communal storage (s3). This is an approximation. | => select s.node_name,storage_path,sum(used_bytes)
from dc_file_reads f join storage_containers s on s.sal_storage_id=f.storageid and s.node_name=f.node_name
where transaction_id=<transaction id> and statement_id=<statement id>
group by 1,2; |
List of files queued for deletion from communal storage. | => select node_name,transaction_id,statement_id,storage_location,file_name,size, queued_for_delete
from dc_communal_cleanup_records; |
List of files in the depot by the descending order of accessed time. | => select node_name,sal_storage_id,num_accesses,size,last_access_time
from vs_depot_lru
order by last_access_time desc; |
Query to confirm that there is no data skew between shards. | => select shard_name,sum(total_row_count) total_rows ,sum(deleted_row_count) del_row_count,sum(used_bytes) used_bytes
from storage_containers group by 1; |
Files fetched from communal storage along with the fetch time. | => select datediff('ms',start_time,end_time) as duration_millisec, source_node, node_name, file_size_bytes, destination_file, source_file
from depot_fetches where source_node = ''; |
Peer to peer files fetched along with the fetch time. | => select datediff('ms',start_time,end_time) as duration_millisec, source_node, node_name, file_size_bytes, destination_file, source_file
from depot_fetches where source_node != '';
|
Files uploaded to communal storage along with the upload time. | => select datediff('ms',start_time,end_time) as duration_millisec,*
from depot_uploads
order by 1 desc;
|
Files evicted from the depot, which were accessed in the last 5 minutes before eviction. | => select datediff('mi',last_access_time,time),*
from depot_evictions
where datediff('mi',last_access_time,time) > 5
order by 1;
|
Query that checks the percentage of space presently used in the storage location assigned to the depot. | => select node_name,storage_path,storage_usage,disk_space_used_mb,disk_space_free_mb,disk_space_free_percent
from disk_storage
where storage_usage='DEPOT';
|
Finds the number of queries executed in each fault group in a given time window. | select g.parent_name fault_group,count(*) Number_of_queries from dc_requests_issued i join fault_groups g on i.node_name=g.member_name where i.time between '' and '' group by 1 order by 1;
|
To pre-fetch files from communal storage to the depot for a query. | explain local<query>;
|