250 Commits

Author SHA1 Message Date
Sarthak Aggarwal f2f4e5dbfc Run ASan Tests on run-extra-tests label (#3512)
It's important to enable ASan on the run-extra-tests label so we can
catch some of the bugs in PRs before they are merged into unstable.

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
2026-05-01 12:10:29 +08:00
Rain Valentine cea9354b56 Big Endian: add daily workflow UT job and fix UTs (#3330)
Big endian support on Valkey is "best effort" and not guaranteed, but we
haven't been doing any regular testing at all afaik. This PR adds a job
to the daily workflow to run UTs on an emulated big endian platform.
Integration tests failed excessively because of how slow emulation is.

I fixed several problems with tests and improved UT coverage of key
points where endian byte order matters - and fwiw I didn't find any
bugs. I think the main coverage gap remaining after this is RDB
serialization (maybe little endian <-> big endian round trips?)
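
For illustration only (not code from this PR): a minimal Python sketch of
why byte order matters at serialization points. The value and variable
names are made up for the example.

```python
import struct
import sys

value = 0x01020304

little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first

print(sys.byteorder)   # byte order of the machine running this script
print(little.hex())    # 04030201
print(big.hex())       # 01020304

# Reading bytes with the wrong byte order silently yields a different number.
misread = struct.unpack(">I", little)[0]
print(hex(misread))    # 0x4030201
```

A round-trip test writes with one explicit byte order and reads with the
same one, so the result does not depend on the host.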

There are a couple of lines of endian-specific code for #3166, and this change
can test them.

Signed-off-by: Rain Valentine <rsg000@gmail.com>
2026-05-01 12:09:23 +08:00
Ping Xie 5b7ac66918 Fix verify-provenance action pin (#3594) 2026-04-29 21:30:40 -07:00
Ping Xie 98724dda08 Update provenance action to refine layer2 exemption policies (#3593) 2026-04-29 17:06:57 -07:00
Jun Yeong Kim ff80b2d1dc Migrate the remaining cluster tests to the new framework and remove legacy files (#2297) (#3382)
Migrated the remaining cluster tests to tests/unit/cluster/ to use the same
framework for all cluster tests. Cleaned up the obsolete cluster test framework
files and updated the CI workflows to use the new unified test runner.

Changes:
  Moved and mapped 6 test files:
  - 03-failover-loop.tcl → Merged into existing failover.tcl
  - 04-resharding.tcl → resharding.tcl
  - 12-replica-migration-2.tcl + 12.1-replica-migration-3.tcl →
  replica-migration-slow.tcl
  - 07-replica-migration.tcl → Merged into existing replica-migration.tcl
  - 28-cluster-shards.tcl → Merged into existing cluster-shards.tcl

Other changes:
  - Converted old framework APIs (e.g., K, RI) to new framework APIs (e.g., R, srv)
  - Added process_is_alive check in cluster_util.tcl to fix an exception in
  failover tests caused by executing ps on dead processes
  - Heavy tests (resharding, replica-migration-slow) marked with slow tag and
  wrapped in run_solo to prevent resource contention in sanitizer environments
  - replica-migration-slow marked with valgrind:skip tag since it is very slow
  - Removed the entire tests/cluster/ directory including run.tcl, cluster.tcl,
  includes/, and helpers/
  - Kept runtest-cluster as a wrapper script (exec ./runtest --cluster "$@")
  - Removed ./runtest-cluster calls from .github/workflows/daily.yml as cluster
  tests are now included in ./runtest

Closes #2297.

Signed-off-by: Jun Yeong Kim <junyeonggim5@gmail.com>
Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
2026-04-27 17:31:37 +08:00
Ping Xie c861184762 Implement Provenance Guard (#3109)
This PR bootstraps Valkey's provenance guard integration.

The provenance guard is a content-based similarity detection system that helps maintain proper code provenance by comparing incoming PR changes against fingerprint databases built from Redis commits and PRs. The matching logic now lives in the external `valkey-io/verify-provenance` action repository; this PR wires Valkey to that action and seeds the required database branch.

Key features:
  * Content-based detection: Uses normalized diff fingerprints and fuzzy matching to detect similar changes, including cases where files have moved or been refactored.
  * Externalized action logic: The check and refresh implementation is maintained in `valkey-io/verify-provenance` and is pinned by exact commit SHA from Valkey workflows.
  * Provenance Guard workflow: Runs on PR activity to check incoming changes against the provenance databases and report potential matches.
  * Daily Refresh workflow: Runs daily to refresh PR fingerprints and commits updated data back to `verify-provenance-db`.
  * Dedicated DB branch: Stores provenance databases on the orphan `verify-provenance-db` branch, separate from Valkey source code.
  * Privacy-first storage: Stores compressed non-reversible fingerprints, not source code.
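
A rough sketch of what content-based matching on normalized diff fingerprints
could look like. This is an assumption for illustration, not the algorithm used
by `valkey-io/verify-provenance`: the function names, the 2-line window, the
truncated SHA-256 hashes, and the Jaccard score are all hypothetical.

```python
import hashlib

def normalize(diff_text):
    """Keep only added/removed lines, dropping file headers and whitespace."""
    lines = []
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file headers carry paths, which change when files move
        if line.startswith(("+", "-")):
            stripped = "".join(line[1:].split())
            if stripped:
                lines.append(stripped)
    return lines

def fingerprint(diff_text, k=2):
    """Hash every window of k consecutive normalized lines."""
    lines = normalize(diff_text)
    windows = [lines[i:i + k] for i in range(max(len(lines) - k + 1, 1))]
    return {
        hashlib.sha256("\n".join(w).encode()).hexdigest()[:16]
        for w in windows
        if w
    }

def similarity(a, b):
    """Jaccard similarity between two fingerprint sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

original = "--- a/src/foo.c\n+++ b/src/foo.c\n+int x = 1;\n+return x;\n"
moved = "--- a/lib/bar.c\n+++ b/lib/bar.c\n+int  x=1;\n+  return x;\n"
unrelated = "--- a/src/baz.c\n+++ b/src/baz.c\n+void f(void) {}\n"

print(similarity(fingerprint(original), fingerprint(moved)))      # 1.0
print(similarity(fingerprint(original), fingerprint(unrelated)))  # 0.0
```

Only the hashes would need to be stored, which is in the spirit of the
non-reversible fingerprints described above.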

The initial `verify-provenance-db` branch has been bootstrapped with fingerprints of Redis commits and PRs.

  ---------

  Signed-off-by: Ping Xie <pingxie@outlook.com>
2026-04-26 14:36:18 -07:00
Harkrishn Patro cb2cfdd4e0 Revert "Pin clang to version 17 in sanitizer CI jobs" (#3556)
Reverts valkey-io/valkey#3546

This didn't fix the build issue. The follow-up fix is in
https://github.com/valkey-io/valkey/pull/3555

Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
2026-04-23 19:01:52 -07:00
Hanxi Zhang 7db5b70737 Pin clang to version 17 in sanitizer CI jobs (#3546)
### Analysis
The daily CI sanitizer jobs with clang are failing during the build
step.

The `ubuntu-latest` runner now has clang 18, but the LLVM gold plugin
is still version 17. When the static Lua module is built with `-flto`,
the `.o` files contain LLVM 18 bitcode that the gold plugin (v17) cannot
read:
`bfd plugin: LLVM gold plugin has failed to create LTO module: Unknown
attribute kind (91) (Producer: 'LLVM18.1.3' Reader: 'LLVM 17.0.6')
`
Example failure:
https://github.com/valkey-io/valkey/actions/runs/24753491944/job/72421581512

### Fix
Pin the sanitizer jobs to `clang-17` so the compiler and gold plugin
versions match.
Tested (built successfully):
https://github.com/hanxizh9910/valkey/actions/runs/24859845008

### Note
If `clang-17` is removed from the `ubuntu-latest` image in the future,
we may need to either add an explicit install step

Signed-off-by: Hanxi Zhang <hanxizh@amazon.com>
2026-04-23 16:14:24 -07:00
zhenwei pi 215b6c11db Show uname -a in RDMA CI job (#3418)
The RXE project should stay at the same version as the CI machine.
Show uname output in the RDMA CI job to help find the cause of the kmod
installation failure.

Signed-off-by: zhenwei pi <zhenwei.pi@linux.dev>
2026-03-30 09:14:27 +02:00
harrylin98 78ee1f555d ci: include gtests in code coverage report
Signed-off-by: harrylin98 <harrylin980107@gmail.com>
2026-03-24 07:46:59 -07:00
Roshan Khatri 07554d34d1 Upload all benchmark artifacts including server logs (#3388)
Upload the entire results directory instead of only metrics JSON files.
This includes server logs which are useful for debugging benchmark
failures.

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2026-03-20 15:21:14 -07:00
Roshan Khatri 9000e26ecf Pin workflow pip/go/npm dependencies for OpenSSF compliance (#3276)
Pin package manager dependencies in CI workflows to improve the Pinned-Dependencies
score in OpenSSF Scorecard.

Changes:
- benchmark-on-label.yml, benchmark-release.yml: add `--require-hashes`
  to `pip install`, following up on the valkey-perf-benchmark change:
  https://github.com/valkey-io/valkey-perf-benchmark/pull/44
- ci.yml: pin `yamlfmt` to `v0.21.0` instead of `@latest`
- reply-schemas-linter.yml: use npm ci with `package-lock.json` instead
  of unpinned npm install, package files in `utils/reply-schema-linter/`

Signed-off-by: Roshaan Khatri <rvkhatri@amazon.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2026-03-20 15:11:00 +01:00
Roshan Khatri 8ddc582d23 fix benchmark queue and reduce the total duration (#3387)
Previously, our workflow used a global concurrency group, which
effectively limited execution to one running job and one pending job.
Any additional requests were automatically canceled, preventing a true
queue from forming.

We are now shifting to a model where we remove the concurrency
restriction and allow jobs to queue directly on the self-hosted runner.
This enables multiple workflow runs to be accepted and queued instead of
being dropped.

While GitHub can accept workflow triggers at a high rate (e.g., hundreds
per minute), the actual execution is still constrained by runner
capacity, in our case, a single runner processing one job at a time.

However, queued jobs are subject to GitHub’s 24-hour timeout policy.
This means any job that waits in the queue for more than 24 hours before
starting will be automatically canceled (timed out).

In practical terms, this approach improves reliability by eliminating
premature cancellations, but the effective queue size is still bounded
by how many jobs the runner can process within a 24-hour window. We
could increase the number of runners to run these in parallel.
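
A back-of-the-envelope sketch of that bound, assuming all runs are queued at
the same moment and each takes a fixed time. The function name and the 6-hour
job duration are hypothetical, chosen only for the example.

```python
def jobs_started_before_timeout(job_hours, runners=1, queue_timeout_hours=24):
    """Count queued jobs that start before the queue timeout.

    Assumes every job is queued at t=0 and takes exactly job_hours.
    With one runner, job i starts at i * job_hours, so it survives
    only if that wait does not exceed the timeout.
    """
    rounds = int(queue_timeout_hours // job_hours) + 1
    return rounds * runners

print(jobs_started_before_timeout(6))             # 5
print(jobs_started_before_timeout(6, runners=2))  # 10
```

Under these assumptions, adding a second runner doubles the number of runs
that start before the timeout.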

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2026-03-19 16:37:43 -07:00
Harry Lin 543a6b83df Ensure the daily workflow uses gtest-parallel to run unit tests in isolation (#3375)
The daily workflow was directly invoking the `valkey-unit-gtests` executable.
The intended invocation is to use `gtest-parallel` to ensure that the tests are executed in isolation.

Signed-off-by: harrylin98 <harrylin980107@gmail.com>
2026-03-19 12:18:12 -07:00
Sarthak Aggarwal 6c329dfe2c Weekly tests branches are not honored on scheduled workflow (#3340)
`weekly.yml` calls `daily.yml` with `use_git_ref` set to each release
branch (for example 7.2). But the checkout logic in `daily.yml` only
used `inputs.use_git_ref` when `github.event_name` was
`workflow_dispatch` or `workflow_call`. Otherwise it fell back to
`github.ref`.

For reusable workflows, GitHub keeps the caller workflow’s github
context. That means when `weekly.yml` is triggered by schedule, the
called `daily.yml` still sees `github.event_name == 'schedule'` and
`github.ref` for the caller branch (unstable). As a result, jobs labeled
as release-branch runs could still check out unstable.
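
The failure mode can be modeled in a few lines of Python. This is a sketch of
the logic as described above, not the workflow expression itself. The function
names are hypothetical, and `checkout_ref_new` is only one plausible shape of
the fix.

```python
def checkout_ref_old(event_name, ref, use_git_ref):
    """Old behavior: honor the input only for dispatch/call events."""
    if event_name in ("workflow_dispatch", "workflow_call"):
        return use_git_ref
    return ref

def checkout_ref_new(event_name, ref, use_git_ref):
    """Assumed fix: honor the input whenever it is set."""
    return use_git_ref or ref

# weekly.yml triggered by schedule: the called daily.yml still sees the
# caller's event name and ref.
ctx = dict(event_name="schedule", ref="refs/heads/unstable", use_git_ref="7.2")

print(checkout_ref_old(**ctx))  # refs/heads/unstable
print(checkout_ref_new(**ctx))  # 7.2
```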

Added a guard for Gtest Unit Tests. It will skip the job if gtest is not
available / supported.

Run with CI Issue:
https://github.com/valkey-io/valkey/actions/runs/22815380713

---------

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
2026-03-13 14:50:00 -07:00
Rain Valentine c9ce3e0919 Fix OOM aborts in large-memory ASAN tests on GitHub runners (#3263)
Carries on from where #3161 left off. The test-sanitizer-address-large-memory
jobs were being OOM-killed on GitHub-hosted runners (15.6GB RAM) due to
ASAN's 2-3x memory overhead.

Changes:
- Skip 4GB quicklist compression test under ASAN (requires ~16-24GB with
dual buffers + ASAN overhead)
- Reduce integration test sizes from 5GB to 4.1GB (preserves >4GB 32-bit
boundary coverage)
- Reduce XADD iterations from 10 to 3
- Add memory monitoring to track minimum free memory during CI runs
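
The numbers above can be checked with simple arithmetic. This sketch assumes
"GB" means GiB throughout, which the commit message does not state.

```python
GIB = 1024 ** 3

payload = 4 * GIB         # quicklist compression test data
buffers = 2               # "dual buffers"
asan_overhead = (2, 3)    # ASAN's 2-3x memory overhead
runner_ram = 15.6 * GIB   # GitHub-hosted runner RAM

low, high = (payload * buffers * factor for factor in asan_overhead)

print(low / GIB, high / GIB)  # 16.0 24.0
print(low > runner_ram)       # True: even the low estimate exceeds the runner

# 4.1 GiB still crosses the 32-bit boundary (2**32 bytes).
print(4.1 * GIB > 2 ** 32)    # True under the GiB assumption
```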

Signed-off-by: Rain Valentine <rsg000@gmail.com>
2026-03-12 12:33:33 +08:00
Daniel Lemire 6414720504 Replace fast_float (C++) with ffc.h (#3329)
There is now a port of fast_float in C. So instead of having an optional
fast_float dependency, we can just use ffc instead, unconditionally.

https://github.com/kolemannix/ffc.h

It is a high quality port. The performance should be the same or
improved.

Note: I am the maintainer and main author of fast_float.

---------

Signed-off-by: Daniel Lemire <daniel@lemire.me>
2026-03-11 12:26:44 +01:00
Roshan Khatri 8caa29298c Adds cluster benchmark support for benchmark-on-label (#3338)
Now we will be able to add a `run-cluster-benchmark` label to run a
benchmark with cluster-mode enabled valkey-server

It will use the config
https://github.com/valkey-io/valkey/blob/unstable/.github/benchmark_configs/benchmark-config-arm.json
modified for cluster mode with a single cluster-mode-enabled instance of
valkey.

It uses the same single instance for the benchmark as for run-benchmark.
 
If both labels are used, the runs are sequential in the same concurrency group `group:
ec2-al-2023-pr-benchmarking-arm64`.

---------

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2026-03-09 22:48:29 +01:00
Björn Svensson 747af1a85d Use ar archiver installed by brew in CI build-macos-latest (#3317)
Since there is a mismatch between the `ar` tool already installed on
a macOS runner and Clang 22, installed by brew, let's use the
brew-installed `llvm-ar`.

Expected to fix the issue in CI job `build-macos-latest`.

---------

Signed-off-by: Björn Svensson <bjorn.a.svensson@est.tech>
2026-03-06 12:37:29 +01:00
Sarthak Aggarwal 35db3e1256 fix Codecov v5 input from file to files (#3308)
This PR fixes a Codecov workflow misconfiguration introduced when
upgrading codecov/codecov-action from v4 to v5 (in #3185).
In v5, the action expects files (plural), but the workflow still used
file.

The coverage shown is 0 right now:
https://app.codecov.io/gh/valkey-io/valkey


Documentation from -
https://github.com/codecov/codecov-action/tree/v5?tab=readme-ov-file#arguments

```
The following arguments have been changed

    file (this has been deprecated in favor of files)
    plugin (this has been deprecated in favor of plugins)

```

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
2026-03-05 13:49:38 +01:00
Roshan Khatri a731e45fd3 Fix compatibility for OpenSSL < 3.0 and Almalinux version mismatch for daily tests (#3303)
`SSL_get0_peer_certificate()` was introduced in OpenSSL 3.0. The recent
commit 7e110ae2b (Support TLS authentication using SAN URI) used it in
`tlsGetPeerUser()` without a version guard, breaking builds against
OpenSSL 1.1.x.

Use `SSL_get_peer_certificate()` on OpenSSL < 3.0 with the corresponding
`X509_free()` since the older API increments the reference count.

Fixes build failure: implicit declaration of function
`SSL_get0_peer_certificate [-Werror=implicit-function-declaration]`

Also fixes the version mismatch for almalinux 9 daily tests.
Closes #3304.

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2026-03-05 19:34:17 +08:00
Kurt McKee d49eac9cad Update the CodeQL action to resolve v3 deprecation warnings (#3310)
The CodeQL workflow is currently throwing a deprecation warning
regarding use of v3.

> CodeQL Action v3 will be deprecated in December 2026. Please update
all occurrences of the CodeQL Action in your workflow files to v4.

This PR introduces the following changes:
* References to CodeQL v3 have been updated to the SHA of the latest
CodeQL release, [v4.32.5].

Signed-off-by: Kurt McKee <contactme@kurtmckee.org>
2026-03-05 10:55:47 +08:00
Sarthak Aggarwal 661c62f7a9 Fix weekly release runs checking out unstable in daily workflow (#3295)
Honors `workflow_call` inputs rather than checking out the
`GITHUB_HEAD_REF` always.

```
Determining the checkout info
  /usr/bin/git branch --list --remote origin/8.0
    origin/8.0
/usr/bin/git sparse-checkout disable
/usr/bin/git config --local --unset-all extensions.worktreeConfig
Checking out the ref
  /usr/bin/git checkout --progress --force -B 8.0 refs/remotes/origin/8.0
  Switched to a new branch '8.0'
  branch '8.0' set up to track 'origin/8.0'.

```


Now the workflow checks out the right branch. 
Link:
https://github.com/sarthakaggarwal97/valkey/actions/runs/22599943936/job/65479450708#step:3:83

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
2026-03-02 16:13:26 -08:00
Gagan H R 834dc83516 Set read permission in workflow to improve OpenSSF score (#3267)
This PR fixes #3264. The score is currently 6. It should improve to
around 6.9 once this PR is merged.

Signed-off-by: Gagan H R <hrgagan4@gmail.com>
2026-02-27 23:23:31 +01:00
Harry Lin 52a94f710a Add gtest dependencies to unit test workflows (#3270)
Install gtest dependencies in daily.yml.

Signed-off-by: harrylin98 <harrylin980107@gmail.com>
2026-02-27 13:39:51 -08:00
Harry Lin f3e957cee8 Introduce GoogleTest for Valkey unit testing (#3241)
This PR adds GoogleTest (gtest) support to Valkey to enable
writing modern unit tests, as mentioned in
https://github.com/valkey-io/valkey/issues/2878

**Motivation**: 
GoogleTest provides richer assertions, test fixtures, mocking
support, and improved diagnostics, helping improve test coverage
and maintainability over time.

For more details, see `src/gtest/README.md`.

**Changes**

This PR integrates the GoogleTest framework and migrates all
existing C unit tests to GoogleTest.

---------

Signed-off-by: Harry Lin <harrylhl@amazon.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Signed-off-by: Alina Liu <liusalisa6363@gmail.com>
Signed-off-by: Jacob Murphy <jkmurphy@google.com>
Signed-off-by: Jim Brunner <brunnerj@amazon.com>
Co-authored-by: Harry Lin <harrylhl@amazon.com>
Co-authored-by: Jim Brunner <brunnerj@amazon.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Jacob Murphy <jkmurphy@google.com>
Co-authored-by: Alina Liu <liusalisa6363@gmail.com>
2026-02-26 14:14:23 -08:00
Gagan H R 9c8130d414 Adds scorecard workflow to publish OpenSSF scores (#3163)
Publish OpenSSF Scorecard results, which means users and downstream
consumers can easily discover the project’s security best-practice
signals via Scorecard API.

Publishing Scorecard results:

- Improves transparency for users and integrators
- Provides early visibility into missing or improvable security
practices

Fixes #3162

---------

Signed-off-by: Gagan H R <hrgagan4@gmail.com>
2026-02-26 21:24:06 +01:00
Nikhil Manglore 48419e46ea Enable USE_LIBBACKTRACE Across CI and Fix Alpine Builds (#3213)
This PR enables `USE_LIBBACKTRACE=yes` across all CI builds and builds
upon the changes introduced in #3034. Alpine-based jobs previously
attempted to install `libbacktrace-dev`, which does not exist in
Alpine’s apk repositories.

This caused these two errors in the daily tests below:
-
https://github.com/valkey-io/valkey/actions/runs/22045858351/job/63694456995
-
https://github.com/valkey-io/valkey/actions/runs/22045858351/job/63694457018

To resolve this, Alpine jobs now build GNU libbacktrace from source
inside the container before compiling Valkey. This aligns Alpine
behavior with other environments (Ubuntu jobs) and now avoids utilizing
non-existent Alpine packages.

An alternative approach we can consider is to disable `USE_LIBBACKTRACE`
for Alpine-based tests.

Signed-off-by: Nikhil Manglore <nmanglor@amazon.com>
2026-02-16 21:28:14 +01:00
Sarthak Aggarwal f7e6429161 Label is not removed automatically after extra tests are completed (#3202)
We made some changes to the workflow where the label was getting removed
every time we ran it. This change handles that, removes
`pull_request_target`, and doesn't re-trigger when another label is added.

I tried it here: https://github.com/sarthakaggarwal97/valkey/pull/65
The code is already merged in my unstable.

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
2026-02-16 19:21:12 +01:00
Yang Zhao 546020e54c Add test-tls-only CI job (#3143)
Add a CI job specifically for TLS to build and test
both built-in and module modes.

---------

Signed-off-by: Yang Zhao <zymy701@gmail.com>
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2026-02-15 10:45:52 +01:00
Rain Valentine 83e07389e0 Workflows use actions/checkout for libbacktrace instead of git clone (#3204)
A relatively simple cleanup in our workflows. Using actions is best
practice.

Signed-off-by: Rain Valentine <rsg000@gmail.com>
2026-02-15 10:34:09 +01:00
Rain Valentine fd57c2171e add option to use libbacktrace for backtraces in crash reports (#3034)
With this we get more detailed backtrace information, including
information about static functions. Off by default - to enable you must
enable at compile time:

    make USE_LIBBACKTRACE=yes

Signed-off-by: Rain Valentine <rsg000@gmail.com>
2026-02-13 17:22:29 +01:00
Rain Valentine 9cbe1045d5 Update and pin github actions to full SHAs for supply chain security (#3185)
Updates to latest versions for each of the github actions used.

Pinning prevents an attack where the upstream action dependency is
compromised and the "v4" tag for example gets edited to point to a
malicious version. We already do this for most checkout actions in our
workflows.
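
As a sketch of what "pinned to a full SHA" means in practice, the check below
flags `uses:` references that still point at a mutable tag. The regex and
function name are hypothetical, and the SHA in the example is a made-up
placeholder, not a real commit.

```python
import re

# owner/repo (optionally with a sub-path), then @ and exactly 40 hex characters
PINNED = re.compile(r"[\w.-]+/[\w./-]+@[0-9a-f]{40}")

def is_pinned(uses):
    """Return True if a `uses:` reference ends in a full 40-character SHA."""
    return PINNED.fullmatch(uses) is not None

print(is_pinned("actions/checkout@v4"))  # False: a tag can be moved
print(is_pinned("actions/checkout@0123456789abcdef0123456789abcdef01234567"))  # True
```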

---------

Signed-off-by: Rain Valentine <rsg000@gmail.com>
2026-02-12 22:49:21 +01:00
Sarthak Aggarwal 42e6a0b765 Separate jobs for large memory tests with sanitizers (#3161)
We have been seeing GitHub Actions runners run out of memory when
large-memory tests are run with ASan. The operation is eventually canceled
during the test.

This change moves the large-memory tests with ASan and UBSan to separate
jobs, so we get a dedicated runner with its own timeout. We can tweak
the number of simultaneous test clients for these tests without
affecting the other test jobs.

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
2026-02-10 22:10:31 +01:00
Zhijun Liao 3ea5cb3989 CI: Stop using symlinks for tests with CMake (#3145)
Follow-up to my previous CMake PR
https://github.com/valkey-io/valkey/pull/2816.

**Changes:**

1. **`.github/workflows/ci.yml`** - Removed symlinks, use
`./build-release/runtest` instead of `./runtest`
2. **`tests/support/set_executable_path.tcl`** - Added
`::VALKEY_TLS_MODULE` variable
3. **Fixed hardcoded paths in 5 test files:**
   - `tests/unit/tls.tcl` - server and TLS module paths
   - `tests/unit/fuzzer.tcl` - benchmark path
   - `tests/unit/cluster/cli.tcl` - CLI path
   - `tests/support/server.tcl` - TLS module path
   - `tests/instances.tcl` - TLS module path

**Result:** All tests passed. The only failure was an unrelated flaky
test (`client-eviction.tcl`) that's been failing since TLS was added to
the cmake job - tracked in issue #3146.

---------

Signed-off-by: Zhijun <dszhijun@gmail.com>
2026-02-02 14:58:57 +01:00
Daniil Kashapov f5b0360fae Add waits in test-ubuntu-reclaim-cache (#3134)
After #3103, the time-sensitive `test-ubuntu-reclaim-cache` started to fail
because startup now always includes 30ms of HW clock calibration,
which is why we get this output:

```
Run echo "test SAVE doesn't increase cache"
test SAVE doesn't increase cache
2460491776
Could not connect to Valkey at 127.0.0.1:8080: Connection refused
```

Added waits for the server to start. It helps when run locally.

---------

Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com>
2026-01-30 15:05:13 +01:00
Madelyn Olson be561e2b4b CI: Make CMake test also run tls tests (#3097)
We are already double running the tests with CMake, and we are building
CMake with TLS, so just make it run the tests with TLS. This
seems like a simple update so that we are always running the TLS tests.

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2026-01-23 17:50:05 +01:00
Sarthak Aggarwal 1a2ed04705 Fixes run-extra-tests workflow to run on the PR commit instead of unstable (#3092)
Thanks madolson for pointing out the issue with
https://github.com/valkey-io/valkey/pull/2907.

Whenever we used the label, it picked up the head of
unstable instead of the PR. The reason was that we changed the trigger to
`pull_request_target`. We had to do that since we wanted to remove the
label after the run is completed.

With this change, it correctly picks up the commit hash of the PR and
checks it out.

To test this, I merged the changes in my forked repository's
[unstable](https://github.com/valkey-io/valkey/compare/unstable...sarthakaggarwal97:valkey:unstable),
created a dummy
[PR](https://github.com/sarthakaggarwal97/valkey/pull/61), and was able
to verify that the commit of the PR is being checked out in the
[action](https://github.com/sarthakaggarwal97/valkey/actions/runs/21229598167/job/61084986422?pr=61#step:3:81).

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
2026-01-21 15:52:53 -08:00
Sarthak Aggarwal 951d942ab2 Fix weekly workflow to continue after failure in releases branches (#3082)
Currently, the weekly runs do not progress if there is a failed workflow
because GitHub Actions treats `fail-fast` as true by default. With this change,
we continue to test all the branches even after a failure.

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
2026-01-21 15:33:00 +08:00
Sarthak Aggarwal 33a1b51cc0 Fixing Weekly Tests Workflow on Released Branches (#3045)
The patch fixes the error by adding `pull-requests: write` to the
permissions block of `weekly.yml`. GitHub rejects the run if the caller does
not grant the permissions that the reusable workflow (here `daily.yml`) requests.

We recently changed the permissions in `daily.yml`, where we gave write
permissions in https://github.com/valkey-io/valkey/pull/2907
With this change, we bring parity to the permissions since `weekly.yml`
calls `daily.yml` through `workflow_call`.

Fixes: https://github.com/valkey-io/valkey/actions/runs/20890545320

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
2026-01-12 15:00:59 -08:00
Sarthak Aggarwal 35fdcea259 Weekly Test Runs on Released Branches (#2702)
Resolves: https://github.com/valkey-io/valkey/issues/2228

Visualization:
https://github.com/sarthakaggarwal97/valkey/actions/runs/19113712295

Currently, there are no tests running on the already released branches.
We often backport bug fixes and CVE fixes to these older versions and
end up with multiple CI test failures on these branches.

The PR adds support for running weekly tests on already released
versions `>= 7.2`. The workflow will execute the "daily" test workflow
for each of these branches on `Sunday 06:00 UTC`.

The idea is to continuously monitor our released versions through weekly
test runs (at a time when there is less activity on GitHub runners).

---------

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
2026-01-06 13:17:17 -08:00
Sarthak Aggarwal 7385586fdf Run Extra Tests with only a label (#2907)
Change the behaviour of the CI job triggered by the run-extra-tests
label.

Run the tests immediately when applying the run-extra-tests label to a
PR, without requiring an extra commit to be pushed to trigger the test
run.

When the extra tests have run, the job removes the label.

---------

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
2025-12-23 22:13:43 +01:00
Vitah Lin 79fff5dbac Upgrade macos version in actions (#2920)
GitHub has deprecated older macOS runners, and macos-13 is no longer supported.

1. The latest version of cross-platform-actions/action does allow
running on ubuntu-latest (Linux runner) and does not strictly require macOS.
2. Previously, cross-platform-actions/action@v0.22.0 used runs-on:
macos-13. I checked the latest version of cross-platform-actions, and
the official examples now use runs-on: ubuntu. I think we can switch from macOS to Ubuntu.

---------

Signed-off-by: Vitah Lin <vitahlin@gmail.com>
2025-12-11 23:50:26 +01:00
Roshan Khatri c90e634f11 Add PR and Release benchmark with new changes in framework (#2871)
This adds the workflow improvements for PR and Release benchmark where
it runs on `c8g.metal-48xl` for `ARM64` and `c7i.metal-48xl` for `X86`.

```
Cluster mode: disabled
TLS: disabled
io-threads: 1, 9
Pipelining: 1, 10
Clients: 1600
Benchmark Threads: 90
Data size: 16, 96
Commands: SET, GET
```

c8g.metal-48xl Spec: https://aws.amazon.com/ec2/instance-types/c8g/
c7i.metal-48xl Spec: https://aws.amazon.com/ec2/instance-types/c7i/

```
vCPU: 192
NUMA nodes: 2
Memory (GiB): 384
Network Bandwidth (Gbps): 50
```

PR benchmarking will be executed on **ARM64** machine as it has been
seen to be more consistent.
Additionally, it runs 5 iterations for each test and posts the average
and other statistical metrics like
- CI99%: 99% Confidence Interval - range where the true population mean
is likely to fall
- PI99%: 99% Prediction Interval - range where a single future
observation is likely to fall
- CV: Coefficient of Variation - relative variability (σ/μ × 100%)

_Note: Values with (n=X, σ=Y, CV=Z%, CI99%=±W%, PI99%=±V%) indicate
averages from X runs with standard deviation Y, coefficient of variation
Z%, 99% confidence interval margin of error ±W% of the mean, and 99%
prediction interval margin of error ±V% of the mean. CI bounds [A, B]
and PI bounds [C, D] show the actual interval ranges._
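
A minimal sketch of how such metrics can be computed from a handful of runs.
It is not the benchmark framework's code. It assumes the sample standard
deviation (n - 1) and a normal quantile. With only 5 runs a Student's t
quantile would give wider intervals, and the framework may well use that. The
function name and sample values are hypothetical.

```python
import math
import statistics

def summarize(samples, confidence=0.99):
    """Return mean, spread, and interval margins for a list of run results."""
    n = len(samples)
    mean = statistics.fmean(samples)
    stdev = statistics.stdev(samples)  # sample standard deviation (n - 1)
    # Two-sided normal quantile for the requested confidence level.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    ci_margin = z * stdev / math.sqrt(n)          # uncertainty of the mean
    pi_margin = z * stdev * math.sqrt(1 + 1 / n)  # spread of one future run
    return {
        "n": n,
        "mean": mean,
        "stdev": stdev,
        "cv_pct": stdev / mean * 100,
        "ci_pct": ci_margin / mean * 100,
        "pi_pct": pi_margin / mean * 100,
        "ci_bounds": (mean - ci_margin, mean + ci_margin),
        "pi_bounds": (mean - pi_margin, mean + pi_margin),
    }

stats = summarize([100.0, 102.0, 98.0, 101.0, 99.0])
print(stats["mean"], round(stats["cv_pct"], 3))
```

The prediction interval is always wider than the confidence interval because
it covers a single future observation rather than the mean.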

For comparing between versions, it adds a workflow which runs on both
**ARM64** and **X86** machine. It will also post the comparison between
the versions like this:
https://github.com/valkey-io/valkey/issues/2580#issuecomment-3399539615

---------

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
Signed-off-by: Roshan Khatri <117414976+roshkhatri@users.noreply.github.com>
2025-12-04 17:33:21 +01:00
Sarthak Aggarwal b835463a73 Fixes test-freebsd workflow in daily (package lang/tclX) (#2832)
This PR fixes the FreeBSD daily job that has been failing consistently
for the last few days with the error "pkg: No packages available to install
matching 'lang/tclx' have been found in the repositories".

The package name is corrected from `lang/tclx` to `lang/tclX`. The
lowercase version worked previously but appears to have stopped working
in an update of freebsd's pkg tool to 2.4.x.

Example of failed job:

https://github.com/valkey-io/valkey/actions/runs/19282092345/job/55135193499

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
2025-11-13 08:24:37 +01:00
Harkrishn Patro 95154feaa1 Bump old engine version(s) for compatibility test (#2741)
Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
2025-10-16 22:43:19 -07:00
Harkrishn Patro 18214be490 Add compatibility test with Valkey 7.2/8.0 (#2342)
* Add cross version compatibility test to run with Valkey 7.2 and 8.0
* Add mechanism in TCL test to skip tests dynamically - #2711

---------

Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2025-10-09 17:00:10 -07:00
Viktor Söderqvist 3390b1e608 Allow TCL 9.0 for tests (#1673)
Makes our tests possible to run with TCL 9.

The latest Fedora now has TCL 9.0 and it's working now, including the
TCL TLS package. (This wasn't working earlier due to some packaging
errors for TCL packages in Fedora, which have been fixed now.)

This PR also removes the custom compilation of TCL 8 used in our Daily
jobs and uses the system default TCL version instead. The TCL version
depends on the OS. For the latest Fedora, you get 9.0, for macOS you get
8.5 and for most other OSes you get 8.6.

The checks for TCL 8.7 are removed, because 8.7 doesn't exist. It was
never released.

---------

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2025-10-08 11:37:15 +02:00
Roshan Khatri e53d7de40e Update automated benchmarking configs (#2625)
Reduce the requests and warmup time so the run finishes within 6 hours,
since the GitHub workflow times out after 6 hours.

---------

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2025-09-18 19:42:13 +02:00
Roshan Khatri 73d5b0ed9b Adds io-threads configs to PR-perf tests (#2598)
- Adds io-threads-enabled perf tests for PRs.
- Changes the server and benchmark client CPU ranges so they are on
separate NUMA nodes of the metal machine.
- Also kills any servers that are active on the metal machine if anything
fails.
- Adds a benchmark workflow to benchmark versions and publish the results
on a provided issue ID:
<img width="340" height="449" alt="Screenshot 2025-09-11 at 12 14 28 PM"
src="https://github.com/user-attachments/assets/04f6a781-e163-4d6b-9b70-deedad15c9ef"
/>

- Comments on the issue with the full comparison like this:
 
<img width="936" height="1152" alt="Screenshot 2025-09-11 at 12 15
35 PM"
src="https://github.com/user-attachments/assets/e1584c8e-25dc-433f-a4d4-5b08d7548ddf"
/>

https://github.com/roshkhatri/valkey/pull/3#issuecomment-3282289440

---------

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2025-09-16 14:09:39 -07:00