Introduction to MemSQL

MemSQL is an in-memory, distributed database that is queried through a SQL interface. Storing data in memory eliminates the latency that results from reading from and writing to disk. MemSQL is also designed to be efficient relative to other in-memory data stores: it uses multi-version concurrency control (MVCC) and lock-free data structures to sustain high throughput for large concurrent workloads without sacrificing consistency. As a result, reads do not block writes, and writes do not block reads, which provides the fast access needed for real-time analytics at Big Data scale.
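
To make the "reads do not block writes" property concrete, here is a minimal sketch of the general MVCC idea in Python. It is not MemSQL's implementation: the class `VersionedStore` and its methods are hypothetical names chosen for illustration, and the sketch ignores threading, garbage collection of old versions, and lock-free structures.

```python
import itertools


class VersionedStore:
    """Toy multi-version store (hypothetical, for illustration only)."""

    def __init__(self):
        self._versions = {}                # key -> list of (commit_ts, value), oldest first
        self._clock = itertools.count(1)   # monotonically increasing commit timestamps
        self._last_commit = 0

    def write(self, key, value):
        # A write never overwrites in place; it appends a new version.
        ts = next(self._clock)
        self._versions.setdefault(key, []).append((ts, value))
        self._last_commit = ts
        return ts

    def snapshot(self):
        # A reader remembers the latest commit it is allowed to see.
        return self._last_commit

    def read(self, key, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        for ts, value in reversed(self._versions.get(key, [])):
            if ts <= snapshot_ts:
                return value
        return None


store = VersionedStore()
store.write("balance", 100)
snap = store.snapshot()          # a reader starts here
store.write("balance", 250)      # a later write does not disturb that reader
old_view = store.read("balance", snap)
new_view = store.read("balance", store.snapshot())
```

The reader holding `snap` keeps seeing the value that was current when it started, while a new reader sees the later write. Neither side has to wait for the other.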

Two-tiered Architecture

MemSQL has a two-tiered, clustered architecture that consists of two types of nodes:

  • Aggregator nodes serve as mediators between the client and the cluster. They query the relevant leaf nodes and aggregate results before sending them back to the client. Aggregators store only metadata.
  • Leaf nodes, in contrast, store and process data. MemSQL has a shared-nothing architecture, which means that no two nodes share memory, disk, or CPU.
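
The division of labor between the two tiers can be sketched as a scatter-gather pattern. The Python below is a simplified illustration under assumptions, not MemSQL code: `Leaf`, `Aggregator`, and the `average` method are hypothetical names, and the "network" is just a method call.

```python
class Leaf:
    """Holds one private partition of the data and computes partial results."""

    def __init__(self, rows):
        self.rows = list(rows)   # nothing here is shared with other leaves

    def partial_sum_count(self, column):
        values = [row[column] for row in self.rows]
        return sum(values), len(values)


class Aggregator:
    """Stores no row data; knows only which leaves to ask."""

    def __init__(self, leaves):
        self.leaves = leaves

    def average(self, column):
        # Scatter: ask every leaf for a partial result over its own partition.
        partials = [leaf.partial_sum_count(column) for leaf in self.leaves]
        # Gather: combine the partial results before replying to the client.
        total = sum(s for s, _ in partials)
        count = sum(c for _, c in partials)
        return total / count if count else None


leaves = [Leaf([{"price": 10}, {"price": 20}]), Leaf([{"price": 30}])]
aggregator = Aggregator(leaves)
avg_price = aggregator.average("price")
```

Each leaf returns a sum and a count rather than its rows, so the aggregator only merges small partial results instead of pulling the raw data across the cluster.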

High Throughput on Concurrent Workloads

MemSQL is designed to enable high throughput on concurrent workloads. A distributed query optimizer evenly divides the processing workload to maximize the efficiency of CPU usage. Queries are compiled to machine code and cached to expedite subsequent executions. Rather than cache the results of the query, MemSQL caches a compiled query plan to provide the most efficient execution path. The compiled query plan does not pre-specify values for the parameters, which allows MemSQL to substitute the values upon request, enabling subsequent queries of the same structure to run quickly, even with different parameter values. Moreover, due to MemSQL’s use of MVCC and lock-free data structures, data remains highly accessible, even amidst a high volume of concurrent reads and writes.
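
The idea of caching a parameter-free plan can be illustrated with a small Python sketch. It makes several assumptions: the `PlanCache` class is hypothetical, "compilation" is simulated by building a Python closure rather than generating machine code, and only integer literals are treated as parameters.

```python
import re

_LITERAL = re.compile(r"\b\d+\b")   # assumption: only integer literals are parameters


class PlanCache:
    """Caches one 'compiled' plan per query shape (hypothetical illustration)."""

    def __init__(self):
        self._plans = {}
        self.compilations = 0

    def _compile(self, template):
        # Stand-in for the expensive code-generation step.
        self.compilations += 1

        def plan(params):
            return (template, tuple(params))

        return plan

    def execute(self, sql):
        # Strip the literal values so queries of the same structure share a key.
        params = [int(text) for text in _LITERAL.findall(sql)]
        template = _LITERAL.sub("?", sql)
        plan = self._plans.get(template)
        if plan is None:
            plan = self._plans[template] = self._compile(template)
        # The cached plan receives the parameter values only at execution time.
        return plan(params)


cache = PlanCache()
first = cache.execute("SELECT * FROM orders WHERE id = 7")
second = cache.execute("SELECT * FROM orders WHERE id = 42")
```

Both queries map to the template `SELECT * FROM orders WHERE id = ?`, so the second call reuses the plan built for the first and only the parameter value differs.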

Highly Scalable Deployments

MemSQL is designed to be highly scalable. The cluster can be scaled out at any time to provide increased storage capacity and processing power. Sharding is done automatically, and the cluster re-balances data and workload distribution. Because the data is stored in memory, queries run at full speed on clusters built from commodity hardware.

In addition to being fast, consistent, and scalable, MemSQL is also durable. Data is replicated across shards, and a node can go down with negligible effect on performance. Also, leaf nodes regularly commit transactions to disk as logs. Periodically, full backups are committed as compressed snapshots of the entire database. If any node goes down, it can restart using one of these snapshots.
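
The combination of transaction logs and compressed snapshots can be sketched in Python as follows. This is a toy model under stated assumptions, not MemSQL's on-disk format: `DurableStore` is a hypothetical class, the "disk" is an in-process list, snapshots are JSON compressed with `zlib`, and replaying the log entries written after the snapshot is an assumption about how a restart would catch up.

```python
import json
import zlib


class DurableStore:
    """Toy key-value store with a log and compressed snapshots (illustration only)."""

    def __init__(self):
        self.data = {}         # in-memory state, lost on a crash
        self.log = []          # stand-in for the on-disk transaction log
        self.snapshot = None   # (compressed bytes, log position the snapshot covers)

    def put(self, key, value):
        self.log.append((key, value))   # record the change before applying it
        self.data[key] = value

    def take_snapshot(self):
        blob = zlib.compress(json.dumps(self.data).encode("utf-8"))
        self.snapshot = (blob, len(self.log))

    def recover(self):
        """Rebuild the in-memory state the way a restarted node might."""
        if self.snapshot is None:
            state, position = {}, 0
        else:
            blob, position = self.snapshot
            state = json.loads(zlib.decompress(blob).decode("utf-8"))
        # Assumption: changes logged after the snapshot are replayed on top of it.
        for key, value in self.log[position:]:
            state[key] = value
        return state


store = DurableStore()
store.put("a", 1)
store.take_snapshot()
store.put("b", 2)
store.data = {}                 # simulate losing memory in a crash
recovered = store.recover()
```

Starting from the snapshot keeps recovery short, because only the log entries written after the snapshot need to be replayed.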
