add read replica and rotator docs #2497

Merged
merged 18 commits into from
May 16, 2024
103 changes: 103 additions & 0 deletions docs/user-guides/read-replica-and-rotator.md
@@ -0,0 +1,103 @@
# Read Replica and Rotator

Read replica increases the search QPS (Queries Per Second) of a Vald cluster by deploying read-only agents in addition to the regular agents and distributing search requests among them. Read replicas are deployed as Kubernetes Deployments, and with N replicas, search QPS improves by approximately 1.7 to 1.8 times \* N.

<div class="notice">
The increase in QPS is possible only if sufficient infrastructure is available (see [Important notes](#important-notes)).
</div>
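
As a rough worked example of the figure above, the sketch below reads "1.7 to 1.8 times \* N" as a multiplier on the baseline search QPS. The function name `estimate_qps` and the baseline of 1000 QPS are illustrative assumptions, not measurements, and the estimate only applies when sufficient infrastructure is available.

```bash
# Hypothetical helper: estimate the search QPS range with N read replicas,
# assuming the 1.7-1.8 * N rule of thumb applies to the baseline QPS.
estimate_qps() {
  local base_qps=$1 replicas=$2
  awk -v b="$base_qps" -v n="$replicas" \
    'BEGIN { printf "%.0f-%.0f\n", b * 1.7 * n, b * 1.8 * n }'
}

# Example: a baseline of 1000 QPS (assumed) with 2 read replicas.
estimate_qps 1000 2   # prints 3400-3600
```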

## How to deploy read replica

The read replica is managed in a separate chart from the Vald cluster and is deployed as an add-on to it. Therefore, in each of the following procedures, deploy the Vald cluster first, then deploy the read replica.

> Vald and `vald-readreplica` are in separate charts to avoid conflicts between the read replica's restarts and the Helm operator's processes when Vald is managed by the Helm operator. For this reason, the read replica is always deployed with `helm` commands.

### When you deploy Vald with the helm command

1. Edit `values.yaml` as shown below. (Please refer to [deployment](deployment) for other fields.)

```yaml
agent:
  ngt:
    export_index_info_to_k8s: true
  readreplica:
    enabled: true
    minReplicas: 1 # if you don't use hpa, this will be the replicas of the Deployment
    maxReplicas: 3
    hpa:
      enabled: true # if you prefer to use hpa
      targetCPUUtilizationPercentage: 80
manager:
  index:
    operator:
      enabled: true
      rotation_job_concurrency: 2
```

1. Deploy the Vald cluster

```bash
helm install vald vald/vald --values values.yaml
```

1. Deploy `vald-readreplica` with the same `values.yaml`
```bash
helm install vald-readreplica vald/vald-readreplica --values values.yaml
```

### When you deploy Vald with `vald-helm-operator`

1. Edit `valdrelease.yaml` with the same fields as above

1. Deploy the Vald cluster

```bash
helm install vald-helm-operator-release vald/vald-helm-operator
kubectl apply -f valdrelease.yaml
```

1. Deploy `vald-readreplica`
```bash
helm install vald-readreplica vald/vald-readreplica --values <YOUR VALUES YAML FILE PATH>
```
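
After either procedure, it may help to confirm that the read replica resources exist. The sketch below only lists resources in a namespace using standard `kubectl get` commands. The function name `check_read_replica`, the `KUBECTL` override (useful for a dry run), and the `default` namespace fallback are assumptions for illustration; the actual resource names depend on your release.

```bash
# Hypothetical helper: list the resource kinds this guide relies on.
# Set KUBECTL=echo to print the commands instead of running them.
check_read_replica() {
  local ns=${1:-default}
  local kubectl=${KUBECTL:-kubectl}
  # Deployments, HPAs, and PVCs in the namespace (agents and read replicas).
  "$kubectl" get deployments,hpa,pvc --namespace "$ns"
  # Jobs in the namespace; rotator jobs appear here once a rotation is triggered.
  "$kubectl" get jobs --namespace "$ns"
}
```

For example, `check_read_replica vald` lists these resources in a namespace named `vald` (an assumed name).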

## Architecture

Read replica mainly consists of the following four parts.

<img src="../../assets/docs/guides/read-replica-and-rotator/architecture.png" />

### Read replica deployment

The Deployment that creates the Pods in which the read replica's actual processing takes place. It is essentially the same as a regular agent, except that it accepts only read (search) requests and reads the index from the read replica PVC.

### Read replica PVC

The PVC from which the read replica Pods read the index. It is created from the latest snapshot of the regular agent's PVC. Unlike the agent PVC, it is created with the ROX (ReadOnlyMany) access mode, so multiple Pods can read from it.

### Index operator

The operator is responsible for the following processes.

1. Monitoring the time when the agent saved the index to the PVC and when the read replica performed index rotation
1. Generating a [Read replica rotator](#read-replica-rotator) job when an index save has occurred after the most recent rotation

> The Index operator also manages the timing of index create/save operations other than those mentioned above. Please refer to another document for details.
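
The rotation condition in step 2 can be sketched as a timestamp comparison. This is a simplified illustration, not the operator's actual implementation: the function name `needs_rotation` and the use of Unix epoch seconds are assumptions, and this document does not specify how the operator stores the two timestamps.

```bash
# Hypothetical check: succeed when the agent saved its index after the
# read replica's most recent rotation.
# Arguments: SAVED_AT ROTATED_AT (Unix epoch seconds, an assumed format).
needs_rotation() {
  local saved_at=$1 rotated_at=$2
  [ "$saved_at" -gt "$rotated_at" ]
}

# Example with made-up timestamps: the save is one hour newer than the rotation.
if needs_rotation 1715850000 1715846400; then
  echo "index saved after last rotation: create a rotator job"
fi
```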

### Read replica rotator

The Kubernetes Job responsible for the following processes:

1. Creating a snapshot from the agent's PVC
1. Generating a PVC for read replica from the snapshot
1. Performing a rolling update of the read replica Deployment to launch read replica Pods with the latest index
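
Steps 1 and 2 correspond to standard Kubernetes objects: a `VolumeSnapshot` of the agent's PVC and a `ReadOnlyMany` PVC restored from it. The sketch below prints such manifests for inspection. It is not the rotator's actual implementation: the function name `render_rotation_manifests` and every resource name and size passed to it are hypothetical, and it assumes a CSI driver with snapshot support and the cluster's default snapshot and storage classes.

```bash
# Hypothetical helper: print a VolumeSnapshot of the agent PVC and a
# read-only PVC restored from that snapshot (steps 1 and 2 only).
# Arguments: AGENT_PVC SNAPSHOT_NAME REPLICA_PVC SIZE (all assumed names).
render_rotation_manifests() {
  local agent_pvc=$1 snapshot=$2 replica_pvc=$3 size=$4
  cat <<EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ${snapshot}
spec:
  source:
    persistentVolumeClaimName: ${agent_pvc}
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${replica_pvc}
spec:
  accessModes:
    - ReadOnlyMany
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: ${snapshot}
  resources:
    requests:
      storage: ${size}
EOF
}

# Example with made-up names; step 3 (the rolling update) is not covered here.
render_rotation_manifests example-agent-pvc example-snapshot example-readreplica-pvc 10Gi
```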

## Important notes

- Only eventual consistency is guaranteed

There is a time lag between index insertion, agent save, and the completion of read replica rotation. During this time, there may be inconsistencies between the index in the agent itself and the index in the read replica.

- Sufficient infrastructure is required for QPS scaling

Even if read replicas are deployed, QPS will not scale if sufficient resources are not available in the Kubernetes cluster. Specifically, agent resources and read replica resources should be deployed on separate nodes. Vald sets `podAntiAffinity` to ensure that agent resources and read replica resources are deployed on separate nodes as much as possible.
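
One way to check the placement described above is to look for nodes that host both an agent Pod and a read replica Pod. The sketch below is illustrative only: the function name `shared_nodes` and the name patterns used to tell the two kinds of Pods apart are assumptions, so adjust them to the Pod names in your cluster.

```bash
# Hypothetical helper: read "POD_NAME NODE_NAME" lines from stdin and print
# nodes that host both an agent Pod and a read replica Pod.
# Arguments: AGENT_PATTERN REPLICA_PATTERN (awk regular expressions).
shared_nodes() {
  awk -v agent_re="$1" -v replica_re="$2" '
    $1 ~ replica_re { replica[$2] = 1; next }
    $1 ~ agent_re   { agent[$2] = 1 }
    END { for (node in replica) if (node in agent) print node }
  ' | sort
}

# Example with made-up Pod and node names.
printf '%s\n' \
  "vald-agent-0 node-a" \
  "vald-agent-readreplica-0-abc node-a" \
  "vald-agent-1 node-b" \
  "vald-agent-readreplica-1-def node-c" |
  shared_nodes 'agent' 'readreplica'   # prints node-a
```

Against a live cluster, the input could come from `kubectl get pods --no-headers -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName`. Empty output means no node is shared.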