A detailed analysis of the basic structures of the SILT storage system

March 30, 2023

The SILT storage system combines several basic key-value store structures, each optimized for a different purpose: (1) Updates to key-value pairs are handled by a write-optimized store. (2) Most key-value pairs are kept in the most memory-efficient store. Although data outside this store uses less memory-efficient indexes, the average index cost per key stays very low. (3) SILT is tuned for the worst case, in which a query is answered only by the last (and largest) store. By using in-memory filters, SILT lets every query complete in about 1 + ε flash reads.

The overall structure of SILT and its three basic stores (LogStore, HashStore, SortedStore) are shown in Figure 1.

Figure 1 SILT storage structure

LogStore is optimized for writes and handles PUT and DELETE operations. To achieve high performance, each write is simply appended to the end of a log file on flash. Because the records are ordered by insertion time rather than by key, the LogStore keeps an in-memory hash table that maps each key to its offset in the log file. SILT uses cuckoo hashing for this table to get high performance with little memory. The partial-key cuckoo hashing proposed in the SILT paper reaches 93% occupancy at low computational and memory cost. Compared with the other two storage structures, which are read-only, this index is still memory-intensive, because the LogStore must store a 4-byte pointer for every key. SILT therefore uses only one LogStore.

When the LogStore fills up, it is converted into an immutable HashStore. A HashStore keeps its data on flash as a hash table, so it needs no in-memory index to locate records. SILT can hold multiple HashStores before merging them into the SortedStore. Each HashStore uses a memory-efficient in-memory filter to reject queries for keys that it does not contain.

SortedStore keeps key-value pairs on flash in sorted key order, which allows a very compact index. Because a single update to sorted data is expensive, SILT periodically merges the contents of the HashStores into the SortedStore in bulk.

LogStore writes PUT and DELETE operations sequentially to flash to achieve high write throughput. An in-memory partial-key cuckoo hash index efficiently maps each key to its position in the log file (as shown in Figure 2).
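The append-and-index idea can be illustrated with a minimal sketch. It is not SILT's implementation: the record layout, the `TOMBSTONE` marker, and the class name `LogStoreSketch` are assumptions made for illustration, an in-memory `BytesIO` stands in for the log file on flash, and a plain Python dict stands in for the cuckoo hash index.

```python
import io
import struct


class LogStoreSketch:
    """Append-only log plus an in-memory key -> offset index (illustrative only)."""

    TOMBSTONE = 0xFFFFFFFF  # hypothetical value-length marker for a DELETE record

    def __init__(self):
        self.log = io.BytesIO()  # stands in for the log file on flash
        self.index = {}          # stands in for the in-memory cuckoo hash index

    def _append(self, key: bytes, value: bytes, deleted: bool) -> int:
        # Every write goes to the end of the log; nothing is updated in place.
        offset = self.log.seek(0, io.SEEK_END)
        vlen = self.TOMBSTONE if deleted else len(value)
        self.log.write(struct.pack("<II", len(key), vlen) + key + value)
        return offset

    def put(self, key: bytes, value: bytes) -> None:
        self.index[key] = self._append(key, value, deleted=False)

    def delete(self, key: bytes) -> None:
        # A DELETE is also an append; the index then points at the tombstone.
        self.index[key] = self._append(key, b"", deleted=True)

    def get(self, key: bytes):
        offset = self.index.get(key)
        if offset is None:
            return None
        self.log.seek(offset)
        klen, vlen = struct.unpack("<II", self.log.read(8))
        stored_key = self.log.read(klen)
        if stored_key != key or vlen == self.TOMBSTONE:
            return None
        return self.log.read(vlen)
```

A later PUT for the same key only moves the index entry to the newer offset; the older record stays in the log until the store is converted.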

LogStore uses a hash table based on cuckoo hashing. Two hash functions map each key to two candidate locations in the table. When a new key is inserted, it goes into whichever of its two locations is empty. If both are occupied, the new key displaces the key in one of them, and the displaced key is then moved to its own alternative location in the same way, until every key has found a free location.
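The insertion rule described above can be sketched as follows. This is basic cuckoo hashing with full keys held in memory, not SILT's partial-key variant; the class name `CuckooTable`, the salted BLAKE2b hashes, the table size, and the `max_kicks` limit are all assumptions made for illustration.

```python
import hashlib


class CuckooTable:
    """Basic cuckoo hash table with one entry per location (illustrative only)."""

    def __init__(self, num_slots=64, max_kicks=128):
        self.slots = [None] * num_slots  # each slot: (key, value) or None
        self.max_kicks = max_kicks       # hypothetical limit on displacements

    def _locations(self, key: bytes):
        # Two salted BLAKE2b hashes stand in for the two hash functions.
        def h(salt: bytes) -> int:
            digest = hashlib.blake2b(key, digest_size=8, salt=salt).digest()
            return int.from_bytes(digest, "little") % len(self.slots)
        return h(b"h1"), h(b"h2")

    def insert(self, key: bytes, value):
        """Return None on success, or the entry that was left without a location."""
        a, b = self._locations(key)
        for pos in (a, b):  # key already present: update in place
            if self.slots[pos] is not None and self.slots[pos][0] == key:
                self.slots[pos] = (key, value)
                return None
        for pos in (a, b):  # one of the two locations is empty
            if self.slots[pos] is None:
                self.slots[pos] = (key, value)
                return None
        entry, pos = (key, value), a
        for _ in range(self.max_kicks):
            # Take over the location and pick up the entry that was there.
            entry, self.slots[pos] = self.slots[pos], entry
            # The displaced entry moves to its other candidate location.
            c1, c2 = self._locations(entry[0])
            pos = c2 if pos == c1 else c1
            if self.slots[pos] is None:
                self.slots[pos] = entry
                return None
        return entry  # table treated as full; the caller decides what to do

    def lookup(self, key: bytes):
        for pos in self._locations(key):
            if self.slots[pos] is not None and self.slots[pos][0] == key:
                return self.slots[pos][1]
        return None
```

A lookup inspects at most two locations, which is what makes the scheme attractive for an index that must answer quickly. When `insert` returns an entry instead of `None`, the sketch treats the table as full.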

Figure 2 LogStore design

To keep the table compact, it does not store the entire key, only a short tag derived from the key. A lookup proceeds to flash only when the tag in a candidate location matches, so most queries for nonexistent keys are filtered out in memory.

Storing tags saves memory, but it raises a problem: moving a displaced key to its alternative location requires its other hash value. The full key is stored only on flash, so recomputing that hash would require a flash read. To avoid this, SILT uses the index of the alternative location as the tag. For example, if a key x is placed in location h1(x), its other hash value h2(x) is stored there as its tag, and vice versa.
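A minimal sketch of this tag trick follows, under stated assumptions: one entry per location, a small hypothetical table size `NUM_BUCKETS`, salted BLAKE2b hashes standing in for h1 and h2, and each key inserted only once. The names `PartialKeyCuckooIndex` and `candidate_offsets` are made up for illustration. The point to notice is that the displacement loop never looks at a key: the tag of the displaced entry already says where it must go.

```python
import hashlib

NUM_BUCKETS = 256  # hypothetical table size


def _hash(key: bytes, salt: bytes) -> int:
    digest = hashlib.blake2b(key, digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little") % NUM_BUCKETS


def h1(key: bytes) -> int:
    return _hash(key, b"h1")


def h2(key: bytes) -> int:
    return _hash(key, b"h2")


class PartialKeyCuckooIndex:
    """Each slot holds (tag, offset); the tag is the entry's OTHER bucket index."""

    def __init__(self, max_kicks=128):
        self.slots = [None] * NUM_BUCKETS
        self.max_kicks = max_kicks  # hypothetical limit on displacements

    def insert(self, key: bytes, offset: int):
        """Return None on success, or the log offset whose entry found no slot."""
        b1, b2 = h1(key), h2(key)
        for here, other in ((b1, b2), (b2, b1)):
            if self.slots[here] is None:
                self.slots[here] = (other, offset)
                return None
        pos, entry = b1, (b2, offset)
        for _ in range(self.max_kicks):
            entry, self.slots[pos] = self.slots[pos], entry
            # The displaced entry's tag IS its alternative bucket, so the key
            # does not have to be read back from flash to rehash it.
            alt = entry[0]
            entry = (pos, entry[1])  # in its new home the tag points back here
            pos = alt
            if self.slots[pos] is None:
                self.slots[pos] = entry
                return None
        return entry[1]

    def candidate_offsets(self, key: bytes):
        """Offsets whose tag matches; the full key must still be checked on flash."""
        b1, b2 = h1(key), h2(key)
        found = []
        for here, other in dict.fromkeys([(b1, b2), (b2, b1)]):
            slot = self.slots[here]
            if slot is not None and slot[0] == other:
                found.append(slot[1])
        return found
```

Because only tags are compared, a match is a candidate rather than a confirmed hit: two different keys that share the same pair of buckets look identical in memory, so the record read from flash still has to be compared against the full key.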

When the LogStore fills up, SILT converts its contents into a more memory-efficient structure. Sorting the LogStore directly and merging it into the SortedStore would require rewriting a large amount of data, while keeping many LogStores would consume too much memory. SILT therefore first converts the LogStore into an immutable HashStore. When the number of HashStores reaches a configured threshold, they are merged into the SortedStore (as shown in Figure 3).

HashStore saves a large amount of memory by removing the pointers from the index and reordering the (key, value) pairs on flash from insertion order into hash order.

The HashStore filter is simple to build: copy the tags from the LogStore's hash table and drop the corresponding pointers.
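The conversion can be sketched in a few lines. This is an illustration under assumptions, not SILT's code: the index is modeled as a list of `(tag, offset)` slots in bucket order, where the tag is the index of the entry's alternative bucket, `read_log_record` is a hypothetical callback that fetches a `(key, value)` record from the log, and Python lists stand in for the on-flash table.

```python
def convert_to_hashstore(index_slots, read_log_record):
    """Turn a LogStore-style index into a tag-only filter plus a hash-ordered table.

    index_slots: list of (tag, offset) or None, one entry per bucket.
    read_log_record: hypothetical callback, offset -> (key, value).
    """
    tags = []         # in-memory filter: one tag per bucket, no pointer
    flash_table = []  # stands in for the on-flash table, in the same bucket order
    for slot in index_slots:
        if slot is None:
            tags.append(None)
            flash_table.append(None)
        else:
            tag, offset = slot
            tags.append(tag)                              # keep the tag
            flash_table.append(read_log_record(offset))   # drop the pointer
    return tags, flash_table


def hashstore_get(tags, flash_table, key, b1, b2):
    """Look up a key whose two candidate buckets are b1 and b2."""
    for here, other in ((b1, b2), (b2, b1)):
        if tags[here] == other:          # filter hit: worth one read of the table
            record = flash_table[here]
            if record[0] == key:         # confirm against the full key
                return record[1]
    return None
```

Since records now sit at the position given by their bucket number, the bucket index itself locates the record, which is why the 4-byte pointer is no longer needed.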

Figure 3 LogStore converted to HashStore

SortedStore is a static key-value store. It keeps (key, value) pairs on flash sorted by key and indexes them with an entropy-coded trie, which consumes only about 0.4 bytes per key.

SILT stores most key-value pairs in a single SortedStore, but its entropy-coded trie does not support inserts or deletes. To merge HashStores into the SortedStore, SILT must therefore rebuild the SortedStore, so the speed of that rebuild is an important factor in SILT's overall performance.

Sorting makes this rebuild fast: (1) Sorting supports bulk insertion of new data: the new data is sorted and then merged sequentially into the existing sorted data. (2) Sorting technology is mature: SILT can use highly optimized sorting systems such as Nsort.
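The sort-then-merge rebuild can be sketched as below. It is a simplified illustration, not SILT's merge procedure: HashStores are modeled as Python dicts, the old SortedStore as a key-sorted list, and a `DELETED` tombstone value is an assumed way of representing DELETE operations. The function name `rebuild_sorted_store` is hypothetical.

```python
import heapq

DELETED = None  # hypothetical tombstone value marking a deleted key


def rebuild_sorted_store(old_sorted, hash_stores):
    """Build a new key-sorted list from the old one plus newer HashStore data.

    old_sorted: list of (key, value) pairs already sorted by key.
    hash_stores: list of dicts, oldest first; newer entries override older ones.
    """
    newest = {}
    for store in hash_stores:  # oldest -> newest, so later stores win
        newest.update(store)
    new_sorted = sorted(newest.items(), key=lambda kv: kv[0])

    # Sequential merge of two sorted streams. The age field (0 = old, 1 = new)
    # orders equal keys so that the newer value arrives last and replaces the old.
    stream = heapq.merge(
        ((k, 0, v) for k, v in old_sorted),
        ((k, 1, v) for k, v in new_sorted),
        key=lambda item: (item[0], item[1]),
    )
    merged = []
    for key, _, value in stream:
        if merged and merged[-1][0] == key:
            merged[-1] = (key, value)
        else:
            merged.append((key, value))
    # Tombstones have done their job once the merge is complete.
    return [(k, v) for k, v in merged if v is not DELETED]
```

Both inputs are read once from front to back, which matches the sequential access pattern that makes bulk merging cheaper than updating sorted data one key at a time.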
