Chanler

"Dark Horse Redis Principles" II. Network Model

User Space and Kernel Space#

image.png|500

IO Models#

"UNIX Network Programming" summarizes five types of IO models:

  • Blocking IO
  • Nonblocking IO
  • IO Multiplexing
  • Signal Driven IO
  • Asynchronous IO

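Reading data involves two phases, and the five models differ in how the user process behaves during each:

  • Phase 1, waiting for data: the hardware device delivers data into the kernel buffer in kernel space, and the user process waits for that data to become ready.
  • Phase 2, copying data: the data is copied from the kernel buffer in kernel space into the user buffer in user space.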

Blocking IO#

image.png|500

Nonblocking IO#

image.png|500
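As a rough illustration of the nonblocking model, here is a minimal Python sketch. It assumes a local `socket.socketpair()` standing in for a real client connection, and the function name `nonblocking_read` and the `max_polls` parameter are made up for this example. Each `recv` call returns immediately; while the kernel buffer is empty it fails with `BlockingIOError` (EAGAIN), so the process keeps polling.

```python
import socket


def nonblocking_read(max_polls=3):
    """Poll a non-blocking socket until data arrives; return (empty_polls, data)."""
    reader, writer = socket.socketpair()   # stands in for a client connection
    reader.setblocking(False)              # recv() now returns immediately
    empty_polls = 0
    data = None
    try:
        while data is None:
            try:
                data = reader.recv(1024)   # succeeds only once data is ready
            except BlockingIOError:        # EAGAIN: kernel buffer is still empty
                empty_polls += 1
                if empty_polls == max_polls:
                    writer.send(b"hello")  # simulate the data finally arriving
    finally:
        reader.close()
        writer.close()
    return empty_polls, data


if __name__ == "__main__":
    print(nonblocking_read())  # (3, b'hello')
```

With a blocking socket the same `recv` call would simply sleep until data arrived; the nonblocking version trades that sleep for repeated system calls that mostly return nothing.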

IO Multiplexing#

With both blocking IO and nonblocking IO, the user process must call recvfrom to fetch the data, and in both cases it has to wait for the data to become ready; the only difference is whether it waits by blocking or by polling. Either way, a single thread can handle only one fd at a time.

If, instead, multiple data sources are monitored at once and whichever one becomes ready is processed, that is IO multiplexing.

image.png|500

In Linux, everything is represented by a file descriptor (fd), including network sockets. IO multiplexing lets a single thread wait on multiple fds simultaneously and process whichever ones are ready.

image.png|500

There are several implementations of IO multiplexing, such as select, poll, and epoll. select and poll only tell the user process that some fd in the set is ready, not which one, so the process must traverse the whole set to find it.

image.png|500

IO Multiplexing - select#

select stores the monitored fds as a bitmap. For example, to monitor fds 1, 2, and 5, the bits at positions 1, 2, and 5 (counting from 0 at the right) are set to 1, and the bitmap is passed to kernel space.

The kernel traverses the bitmap: a bit stays 1 if its fd is ready and is cleared to 0 otherwise. The resulting bitmap is copied back to user space, where a 1 bit indicates a ready fd.

The advantage is compact storage. The disadvantages are that the number of monitored fds cannot exceed 1024 (FD_SETSIZE), the fd_set must be copied between user space and kernel space on every call, and finding the ready fds requires a full traversal.
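The bitmap idea and the select call can be sketched in Python. This is only an illustration: `fd_bitmap` and `ready_with_select` are hypothetical helper names, the sockets come from `socket.socketpair()` rather than a real network, and Python's `select.select` hides the actual fd_set behind lists.

```python
import select
import socket


def fd_bitmap(fds):
    """Build the kind of bitmap an fd_set represents: bit n is set for fd n."""
    mask = 0
    for fd in fds:
        mask |= 1 << fd
    return mask


def ready_with_select():
    """Watch three sockets with select() and return the indexes that are readable."""
    pairs = [socket.socketpair() for _ in range(3)]
    readers = [r for r, _ in pairs]
    try:
        pairs[1][1].send(b"x")   # make only the second reader readable
        # The whole read set is handed to the kernel on every call.
        readable, _, _ = select.select(readers, [], [], 1.0)
        # The caller scans its own list to map the result back to sockets.
        return [i for i, r in enumerate(readers) if r in readable]
    finally:
        for r, w in pairs:
            r.close()
            w.close()


if __name__ == "__main__":
    print(bin(fd_bitmap([1, 2, 5])))  # 0b100110
    print(ready_with_select())        # [1]
```

Note that the full set of monitored sockets is passed in again on each `select.select` call, which mirrors the repeated copying described above.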

image.png|500

IO Multiplexing - poll#

poll lifts select's 1024-fd limit: the kernel keeps the monitored fds in a linked list, so in theory there is no upper bound. In practice a very large set is not worthwhile, because the longer the list, the more each traversal costs.
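A comparable sketch with poll, again using Python's `select` module on a Unix-like system. The helper name `ready_with_poll` and the socket pairs are assumptions made for the example. Fds are registered one by one instead of being packed into a fixed-size bitmap, but each `poll` call still makes the kernel check every registered fd.

```python
import select
import socket


def ready_with_poll():
    """Register three sockets with poll() and return the indexes that are readable."""
    pairs = [socket.socketpair() for _ in range(3)]
    index_by_fd = {r.fileno(): i for i, (r, _) in enumerate(pairs)}
    poller = select.poll()
    try:
        for r, _ in pairs:
            poller.register(r, select.POLLIN)   # no FD_SETSIZE cap on the count
        pairs[0][1].send(b"a")
        pairs[2][1].send(b"c")
        events = poller.poll(1000)              # timeout in milliseconds
        # Python's wrapper already filters out fds with no events for us.
        return sorted(index_by_fd[fd] for fd, ev in events if ev & select.POLLIN)
    finally:
        for r, w in pairs:
            r.close()
            w.close()


if __name__ == "__main__":
    print(ready_with_poll())  # [0, 2]
```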

image.png|500

IO Multiplexing - epoll#

Epoll uses a red-black tree to maintain a record of all fds that need to be monitored in kernel space; it also maintains a ready list to store all ready fds.

Process: The user space creates an epoll instance through epoll_create, adds or removes fds from the red-black tree using epoll_ctl, and waits for ready events through epoll_wait, retrieving ready fds from the ready list.

When an fd is ready, the kernel adds it to the ready list. epoll_wait only needs to take the ready fds from the ready list, without traversing the entire fd set.

Advantages: there is no fixed limit on the number of fds; the entire fd set is not copied on every call, since epoll_ctl passes a single fd and operation at a time; epoll_wait returns only the ready fds; and fd status is monitored through the ep_poll_callback mechanism rather than by traversing all fds.
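The create, register, and wait flow can be sketched with Python's `select.epoll`, which wraps the Linux calls (Linux only). The helper name `ready_with_epoll` and the socket pairs are assumptions for illustration.

```python
import select
import socket


def ready_with_epoll():
    """Register three sockets with epoll and return the indexes reported as ready."""
    pairs = [socket.socketpair() for _ in range(3)]
    index_by_fd = {r.fileno(): i for i, (r, _) in enumerate(pairs)}
    ep = select.epoll()                                   # epoll_create
    try:
        for r, _ in pairs:
            ep.register(r.fileno(), select.EPOLLIN)       # epoll_ctl, one fd at a time
        pairs[2][1].send(b"z")                            # only the third reader has data
        events = ep.poll(1.0)                             # epoll_wait, timeout in seconds
        # Only ready fds come back, so there is nothing to scan or filter.
        return [index_by_fd[fd] for fd, _ in events]
    finally:
        ep.close()
        for r, w in pairs:
            r.close()
            w.close()


if __name__ == "__main__":
    print(ready_with_epoll())  # [2]
```

Registration happens once per fd; later `poll` calls pass no fd set at all, which is the main difference from the select and poll sketches.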

image.png|500

IO Multiplexing - Event Notification Mechanism#

When an event occurs on the monitored fd (such as data being readable or writable), the kernel will notify the epoll instance through a callback mechanism, such as epoll's ep_poll_callback.

The kernel will add the ready fds to the ready list of the epoll instance. When the application calls epoll_wait, it can directly retrieve the fds with occurred events from the ready list for processing without traversing all monitored fds.

In practice, edge-triggered (ET) mode is commonly used: a notification is sent only when new data arrives, so once notified, the application must keep performing non-blocking reads until all available data has been consumed.
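The ET behaviour can be sketched as follows, assuming Linux and a local socket pair; `edge_triggered_demo` is a hypothetical name. After the first notification, a partial read leaves data in the buffer, yet a second wait reports nothing because no new data has arrived. That is why the read loop has to drain the socket until `recv` fails with `BlockingIOError`.

```python
import select
import socket


def edge_triggered_demo():
    """Return (events on first wait, events after a partial read, all bytes read)."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)                 # ET mode requires non-blocking reads
    ep = select.epoll()
    try:
        ep.register(reader.fileno(), select.EPOLLIN | select.EPOLLET)
        writer.send(b"0123456789")
        first = len(ep.poll(1.0))             # new data arrived: one notification
        head = reader.recv(4)                 # read only part of it
        again = len(ep.poll(0))               # no new data, so no new notification
        rest = b""
        while True:                           # drain until the buffer is empty
            try:
                part = reader.recv(4)
            except BlockingIOError:           # EAGAIN: nothing left to read
                break
            if not part:                      # peer closed the connection
                break
            rest += part
        return first, again, head + rest
    finally:
        ep.close()
        reader.close()
        writer.close()


if __name__ == "__main__":
    print(edge_triggered_demo())
```

In level-triggered mode (without `EPOLLET`) the second wait would report the fd again for as long as unread data remained.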

image.png|500

IO Multiplexing - Web Service Process#

image.png|500

Signal Driven IO#

Signal driven IO is a nonblocking IO model.

The application registers a handler for the SIGIO signal. When an fd becomes ready, the kernel sends SIGIO to the process, and the process then calls recvfrom in the handler to read the data.

Advantages: The waiting phase is non-blocking, and no active polling is needed.

Disadvantages: The data-copy phase (recvfrom) still blocks, signal handling has overhead, signals may be lost, and under high concurrency it is less effective than IO multiplexing.

image.png|500

Asynchronous IO#

Asynchronous IO returns immediately after the user initiates an IO operation, with no blocking phase.

The kernel is responsible not only for waiting for data to be ready but also for copying data from kernel space to user space. After the entire IO operation is completed, the kernel will notify the user process based on the callback function or signal registered by the user.

Advantages: The user process is completely non-blocking from the initiation of the IO request to receiving the completion notification.

Disadvantages: The implementation is complex.

image.png|500

Synchronous vs Asynchronous#

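Whether an IO model is synchronous or asynchronous depends on the second phase, in which data is copied from kernel space to user space. The first four models all perform this phase synchronously; only asynchronous IO is asynchronous in this phase as well.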

image.png|500

Is Redis Single-threaded?#

Is Redis single-threaded or multi-threaded?

  • The core business part of Redis, command processing, is single-threaded.
  • Overall, Redis is multi-threaded.

During the version evolution of Redis, multi-threading support was introduced at two important time points:

  • Redis v4.0: Introduced multi-threaded asynchronous processing for some time-consuming tasks, such as the asynchronous delete command UNLINK.
  • Redis v6.0: Introduced multi-threading in the core network model to further improve utilization of multi-core CPUs.

Before Redis v6.0, the core network model was single-threaded: an event loop used IO multiplexing technologies such as epoll to process client requests one after another.
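To make the idea of a single-threaded event loop concrete, here is a toy Python sketch. It is not Redis's code: `run_event_loop` is a hypothetical name, the clients are simulated with local socket pairs, and the PING/PONG handling is a stand-in for real command processing. Python's `selectors.DefaultSelector` uses epoll on Linux.

```python
import selectors
import socket


def run_event_loop(requests):
    """Answer one request per simulated client, all in a single thread."""
    sel = selectors.DefaultSelector()            # epoll-backed on Linux
    clients = []
    for req in requests:
        client, conn = socket.socketpair()       # conn plays the accepted connection
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ)
        client.send(req)                         # request is waiting before the loop starts
        clients.append((client, conn))
    pending = len(clients)
    try:
        while pending:
            events = sel.select(timeout=1.0)     # wait for any ready connection
            if not events:
                break                            # nothing ready: stop instead of spinning
            for key, _ in events:                # handle ready connections one by one
                conn = key.fileobj
                request = conn.recv(1024)
                if request.strip().upper() == b"PING":
                    conn.send(b"+PONG\r\n")
                else:
                    conn.send(b"-ERR unknown command\r\n")
                sel.unregister(conn)
                pending -= 1
        return [client.recv(1024) for client, _ in clients]
    finally:
        sel.close()
        for client, conn in clients:
            client.close()
            conn.close()


if __name__ == "__main__":
    print(run_event_loop([b"PING\r\n", b"HELLO\r\n"]))
```

Because commands run one after another in the same thread, no locking is needed around shared state in this sketch.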

Why did Redis choose single-threading?

  • Redis processes commands purely in memory, so execution is already very fast. The performance bottleneck lies in network IO rather than execution speed, so making command execution multi-threaded would bring little performance gain.
  • Multi-threading can lead to excessive context switching, resulting in unnecessary overhead.
  • Introducing multi-threading can also lead to thread safety issues, requiring the introduction of thread locks and similar safety measures, which can waste performance.

Redis Network Model Implementation#

image.png|500

image.png|500

image.png|500

image.png|500

This article is synchronized and updated to xLog by Mix Space. The original link is https://blog.0xling.cyou/posts/redis/redis-4
