Skip to content

Add ROOT CAs to the trust store#12911

Draft
vishesh92 wants to merge 5 commits intoapache:mainfrom
shapeblue:update-ca-certs
Draft

Add ROOT CAs to the trust store#12911
vishesh92 wants to merge 5 commits intoapache:mainfrom
shapeblue:update-ca-certs

Conversation

@vishesh92
Copy link
Copy Markdown
Member

Description

This PR adds ROOT CAs to the trust store in System VMs & the management
This PR also adds another parameter forced to provisionCertificate command which allows provisioning certs via SSH to KVM hosts & system VMs.

Details

This pull request introduces several improvements and new features to the CloudStack Certificate Authority (CA) management system, with a focus on enhancing certificate provisioning flexibility, supporting user-provided CA material, and improving truststore management. The most significant changes add support for forced certificate re-provisioning via SSH for disconnected agents, clarify and enforce requirements for custom CA keys, and enable automatic truststore injection for outgoing HTTPS connections.

Certificate Provisioning Enhancements:

  • Added a forced boolean parameter to the ProvisionCertificateCmd API and the CAManager.provisionCertificate method, allowing administrators to re-provision agent certificates via SSH when agents are disconnected (e.g., after a CA change). This is supported for KVM hosts and SystemVMs. The implementation includes a new provisionCertificateViaSsh method for KVM hosts. [1] [2] [3] [4] [5] [6]

CA Provider and Configuration Improvements:

  • Expanded and clarified configuration key documentation for user-provided CA material in the RootCAProvider, including PEM format requirements and the need to set all three keys together. Enhanced error handling and logging when loading user-provided CA keys/certificates fails, with fallback to auto-generation. [1] [2]
  • Improved the description of the ca.framework.provider.plugin configuration key to clarify its purpose and the requirements for custom CA material.

Truststore Management:

  • Introduced a new configuration key ca.framework.inject.default.truststore to control whether the CA provider's certificate is injected into the JVM default truststore on management server startup, enabling outgoing HTTPS connections from the management server to trust servers signed by the configured CA.
  • Updated the keystore-cert-import script to also import the CA certificate into the realhostip.keystore used by the SSVM JVM, ensuring trust for CloudStack CA-signed servers.

Script and Utility Fixes:

  • Improved the keystore-cert-import script to handle certificate splitting and file cleanup more robustly, preventing errors if no certificates are present.

Code Cleanup:

  • Removed unused imports and code from LibvirtServerDiscoverer and related classes as part of the refactoring for SSH-based certificate provisioning. [1] [2] [3] [4] [5] [6] [7]

These changes collectively improve CloudStack's certificate management flexibility, security, and ease of integration with external CA infrastructures.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • Build/CI
  • Test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

@vishesh92
Copy link
Copy Markdown
Member Author

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@vishesh92 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 30, 2026

Codecov Report

❌ Patch coverage is 2.18978% with 134 lines in your changes missing coverage. Please review.
✅ Project coverage is 18.01%. Comparing base (1bff543) to head (516f1f4).

Files with missing lines Patch % Lines
...n/java/org/apache/cloudstack/ca/CAManagerImpl.java 1.78% 109 Missing and 1 partial ⚠️
.../apache/cloudstack/ca/provider/RootCAProvider.java 0.00% 18 Missing ⚠️
.../api/command/admin/ca/ProvisionCertificateCmd.java 0.00% 4 Missing ⚠️
...rvisor/kvm/discoverer/LibvirtServerDiscoverer.java 0.00% 1 Missing ⚠️
utils/src/main/java/com/cloud/utils/nio/Link.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #12911      +/-   ##
============================================
- Coverage     18.02%   18.01%   -0.01%     
- Complexity    16460    16461       +1     
============================================
  Files          5968     5968              
  Lines        537213   537320     +107     
  Branches      65975    65993      +18     
============================================
- Hits          96825    96824       -1     
- Misses       429469   429575     +106     
- Partials      10919    10921       +2     
Flag Coverage Δ
uitests 3.53% <ø> (ø)
unittests 19.18% <2.18%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.


provisionCertificateViaSsh(sshConnection, hostIp, host.getName());

SSHCmdHelper.sshExecuteCmd(sshConnection, "sudo service cloudstack-agent restart");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. sudo could be a problem, depends on which user is used on the kvm host. please check.
  2. may use systemctl instead of service, please check and keep consistent with other commands.

null,
"The ROOT CA certificate.", true);
"The ROOT CA X.509 certificate in PEM format (must start with '-----BEGIN CERTIFICATE-----'). " +
"Required when providing a custom CA. Restart management server(s) when changed.", true);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test and update description if an intermediate certificate exists ?

@DaanHoogland DaanHoogland requested a review from Copilot March 30, 2026 07:43
@blueorangutan
Copy link
Copy Markdown

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17282

@blueorangutan
Copy link
Copy Markdown

[SF] Trillian Build Failed (tid-15758)

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends CloudStack’s CA/certificate provisioning to support forced re-provisioning via SSH (for disconnected KVM hosts and SystemVMs) and improves truststore handling so CloudStack CA roots can be trusted by management and SystemVM JVM processes.

Changes:

  • Add a forced flag to provisionCertificate (API + manager) and implement SSH-based provisioning paths for KVM hosts and SystemVMs.
  • Inject configured CA certificate(s) into the management server JVM default truststore on startup (configurable via a new global setting).
  • Update SystemVM/scripts truststore import logic to also populate realhostip.keystore with CA certs; improve cert-chain splitting robustness.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
utils/src/main/java/org/apache/cloudstack/utils/security/KeyStoreUtils.java Removes an unused SystemVM import-script constant.
utils/src/main/java/com/cloud/utils/nio/Link.java Adjusts handshake wrap error handling to fail the handshake on SSL wrap exceptions.
systemvm/patch-sysvms.sh Imports CA cert chain into realhostip.keystore during live patch for CPVM/SSVM.
server/src/test/java/org/apache/cloudstack/ca/CAManagerImplTest.java Updates tests for the new provisionCertificate(..., forced) signature.
server/src/test/java/org/apache/cloudstack/ca/CABackgroundTaskTest.java Updates mocks/verifications for the new method signature.
server/src/main/java/org/apache/cloudstack/ca/CAManagerImpl.java Implements forced provisioning via SSH + JVM truststore injection feature and wires new dependencies.
server/src/main/java/com/cloud/hypervisor/kvm/discoverer/LibvirtServerDiscoverer.java Refactors KVM discovery provisioning to call caManager.provisionCertificateViaSsh(...).
scripts/util/keystore-cert-import Improves cert split loop robustness; imports CA into realhostip.keystore for SystemVMs.
plugins/ca/root-ca/src/main/java/org/apache/cloudstack/ca/provider/RootCAProvider.java Clarifies config docs, improves fallback logging, and fixes startup loading sequence.
api/src/main/java/org/apache/cloudstack/ca/CAManager.java Adds forced support + introduces CaInjectDefaultTruststore config key + adds provisionCertificateViaSsh.
api/src/main/java/org/apache/cloudstack/api/command/admin/ca/ProvisionCertificateCmd.java Adds forced API parameter and passes it through to CA manager.
Comments suppressed due to low confidence (1)

scripts/util/keystore-cert-import:78

  • The certificate-chain splitting uses temp files named cloudca.* in the current working directory. This can fail or be unsafe if the script runs from an unexpected directory (clobbering existing files or following symlinks). Consider using a dedicated temporary directory via mktemp -d and writing the split certs there, then cleaning up that directory.
awk 'BEGIN{n=0} /-----BEGIN CERTIFICATE-----/{n++}{print > "cloudca." n }' "$CACERT_FILE"
for caChain in $(ls cloudca.* 2>/dev/null); do
    keytool -delete -noprompt -alias "$caChain" -keystore "$KS_FILE" -storepass "$KS_PASS" > /dev/null 2>&1 || true
    keytool -import -noprompt -storepass "$KS_PASS" -trustcacerts -alias "$caChain" -file "$caChain" -keystore "$KS_FILE" > /dev/null 2>&1
done
rm -f cloudca.*

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +266 to +270
// 2. Issue Certificate based on returned CSR
final String csr = keystoreSetupResult.getStdOut();
final Certificate certificate = issueCertificate(csr, Arrays.asList(agentHostname, agentIp),
Collections.singletonList(agentIp), null, null);

Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The forced provisioning flow currently ignores the caProvider argument: provisionCertificateForced receives it but provisionCertificateViaSsh issues the certificate with caProvider=null (default provider). This makes provisionCertificate(..., provider, forced=true) behave differently than the non-forced path. Consider threading the provider through (e.g., add a provider parameter to provisionCertificateViaSsh or have provisionCertificateForced call issueCertificate(..., caProvider)).

Copilot uses AI. Check for mistakes.
Comment on lines +354 to +356
final Certificate certificate = issueCertificate(null, Arrays.asList(vm.getHostName(), vm.getInstanceName()),
new ArrayList<>(ipAddressDetails.values()), CertValidityPeriod.value(), null);
return deployCertificate(hypervisorHost, certificate, reconnect, sshAccessDetails);
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

provisionSystemVmViaSsh also hard-codes caProvider=null when issuing the certificate, so the API provider parameter is ignored for forced SystemVM provisioning. This should use the requested provider (or explicitly disallow provider overrides when forced=true).

Copilot uses AI. Check for mistakes.
Comment on lines +590 to +596
int count = 0;
for (final X509Certificate caCert : caCerts) {
final String alias = "cloudstack-ca-" + count;
trustStore.setCertificateEntry(alias, caCert);
count++;
logger.info("Injected CA certificate into JVM default truststore: subject={}, alias={}",
caCert.getSubjectX500Principal().getName(), alias);
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

injectCaCertIntoDefaultTruststore() assumes configuredCaProvider.getCaCertificate() contains only non-null entries. Some providers (e.g., RootCAProvider) can return a list containing null if initialization failed, which would cause trustStore.setCertificateEntry(...) to throw. Consider filtering out null certificates (and logging) before attempting to inject them.

Suggested change
int count = 0;
for (final X509Certificate caCert : caCerts) {
final String alias = "cloudstack-ca-" + count;
trustStore.setCertificateEntry(alias, caCert);
count++;
logger.info("Injected CA certificate into JVM default truststore: subject={}, alias={}",
caCert.getSubjectX500Principal().getName(), alias);
int count = 0;
int i = 0;
for (final X509Certificate caCert : caCerts) {
if (caCert == null) {
logger.warn("Skipping null CA certificate at index {} from configured CA provider during JVM truststore injection", i);
i++;
continue;
}
final String alias = "cloudstack-ca-" + count;
trustStore.setCertificateEntry(alias, caCert);
count++;
logger.info("Injected CA certificate into JVM default truststore: subject={}, alias={}",
caCert.getSubjectX500Principal().getName(), alias);
i++;

Copilot uses AI. Check for mistakes.
# Import CA cert(s) into realhostip.keystore so the SSVM JVM
# (which overrides the truststore via -Djavax.net.ssl.trustStore in _run.sh)
# can trust servers signed by the CloudStack CA
REALHOSTIP_KS_FILE="$(dirname $(dirname $PROPS_FILE))/certs/realhostip.keystore"
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REALHOSTIP_KS_FILE is computed with nested dirname calls but $PROPS_FILE isn’t quoted inside the command substitution. While uncommon, this will break if the path contains spaces or glob characters. Quote $PROPS_FILE within the dirname invocations to make this robust.

Suggested change
REALHOSTIP_KS_FILE="$(dirname $(dirname $PROPS_FILE))/certs/realhostip.keystore"
REALHOSTIP_KS_FILE="$(dirname "$(dirname "$PROPS_FILE")")/certs/realhostip.keystore"

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh , right, realhostip.keystore will be renamed in #12856

Comment on lines +142 to +149
awk 'BEGIN{n=0} /-----BEGIN CERTIFICATE-----/{n++}{print > "cloudca." n }' "$CACERT_FILE"
for caChain in $(ls cloudca.* 2>/dev/null); do
keytool -delete -noprompt -alias "$caChain" -keystore "$REALHOSTIP_KS_FILE" \
-storepass "$REALHOSTIP_PASS" > /dev/null 2>&1 || true
keytool -import -noprompt -trustcacerts -alias "$caChain" -file "$caChain" \
-keystore "$REALHOSTIP_KS_FILE" -storepass "$REALHOSTIP_PASS" > /dev/null 2>&1
done
rm -f cloudca.*
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds the same cloudca.* temp-file splitting/import logic in the livepatch script, again using the current working directory. For the same reasons as keystore-cert-import, it would be safer to use a dedicated mktemp -d directory and avoid importing cloudca.0 when the PEM has any preamble/whitespace.

Suggested change
awk 'BEGIN{n=0} /-----BEGIN CERTIFICATE-----/{n++}{print > "cloudca." n }' "$CACERT_FILE"
for caChain in $(ls cloudca.* 2>/dev/null); do
keytool -delete -noprompt -alias "$caChain" -keystore "$REALHOSTIP_KS_FILE" \
-storepass "$REALHOSTIP_PASS" > /dev/null 2>&1 || true
keytool -import -noprompt -trustcacerts -alias "$caChain" -file "$caChain" \
-keystore "$REALHOSTIP_KS_FILE" -storepass "$REALHOSTIP_PASS" > /dev/null 2>&1
done
rm -f cloudca.*
TMP_CA_DIR="$(mktemp -d 2>/dev/null)" || TMP_CA_DIR=""
if [ -z "$TMP_CA_DIR" ]; then
echo "Failed to create temporary directory for CA cert import; skipping CloudStack CA import." >> $logfile 2>&1
else
awk 'BEGIN{n=0} /-----BEGIN CERTIFICATE-----/{n++}{print > "'"$TMP_CA_DIR"'/cloudca." n }' "$CACERT_FILE"
for caChain in "$TMP_CA_DIR"/cloudca.*; do
# If the glob did not match anything, break out of the loop.
[ -e "$caChain" ] || break
caAlias="$(basename "$caChain")"
# Skip potential preamble/whitespace chunk (cloudca.0).
if [ "$caAlias" = "cloudca.0" ]; then
continue
fi
keytool -delete -noprompt -alias "$caAlias" -keystore "$REALHOSTIP_KS_FILE" \
-storepass "$REALHOSTIP_PASS" > /dev/null 2>&1 || true
keytool -import -noprompt -trustcacerts -alias "$caAlias" -file "$caChain" \
-keystore "$REALHOSTIP_KS_FILE" -storepass "$REALHOSTIP_PASS" > /dev/null 2>&1
done
rm -f "$TMP_CA_DIR"/cloudca.* 2>/dev/null || true
rmdir "$TMP_CA_DIR" 2>/dev/null || true
fi

Copilot uses AI. Check for mistakes.
Comment on lines +152 to +162
boolean provisionCertificate(final Host host, final Boolean reconnect, final String provider, final boolean forced);

/**
* Provisions certificate for a KVM host using an existing SSH connection.
* Runs keystore-setup to generate a CSR, issues a certificate, then runs keystore-cert-import.
* Used during host discovery and for forced re-provisioning when the NIO agent is unreachable.
* @param sshConnection active SSH connection to the KVM host
* @param agentIp IP address of the KVM host agent
* @param agentHostname hostname of the KVM host agent
*/
void provisionCertificateViaSsh(Connection sshConnection, String agentIp, String agentHostname);
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CAManager.provisionCertificateViaSsh exposes com.trilead.ssh2.Connection in the public API interface, coupling the API module (and any implementers) to a specific SSH library. Consider keeping the SSH connection type internal to the server layer (e.g., accept host/credentials and let the implementation manage SSH), or introduce a small CloudStack-owned SSH abstraction/interface instead of leaking Trilead into the API surface.

Copilot uses AI. Check for mistakes.
Comment on lines +210 to 214
if (forced) {
return provisionCertificateForced(host, reconnect, caProvider);
}

String csr = null;
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New behavior paths were added (forced SSH provisioning + JVM default truststore injection) but there are no unit tests covering these flows. Adding tests for forced=true (KVM and SystemVM) and for CaInjectDefaultTruststore would help prevent regressions in authentication and startup behavior.

Copilot uses AI. Check for mistakes.
# Import cacerts into the keystore
awk '/-----BEGIN CERTIFICATE-----?/{n++}{print > "cloudca." n }' "$CACERT_FILE"
for caChain in $(ls cloudca.*); do
awk 'BEGIN{n=0} /-----BEGIN CERTIFICATE-----/{n++}{print > "cloudca." n }' "$CACERT_FILE"
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The awk split writes to cloudca.0 until the first BEGIN CERTIFICATE line is encountered (because {print > "cloudca." n} runs even when n==0). That can lead to attempting to import an invalid/empty file. Consider only printing once n>0 (or changing the awk script to skip preamble) so only real PEM blocks become import candidates.

Suggested change
awk 'BEGIN{n=0} /-----BEGIN CERTIFICATE-----/{n++}{print > "cloudca." n }' "$CACERT_FILE"
awk 'BEGIN{n=0} /-----BEGIN CERTIFICATE-----/{n++} n>0 {print > "cloudca." n }' "$CACERT_FILE"

Copilot uses AI. Check for mistakes.
@blueorangutan
Copy link
Copy Markdown

[SF] Trillian Build Failed (tid-15759)

@vishesh92
Copy link
Copy Markdown
Member Author

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@vishesh92 a [SL] Jenkins job has been kicked to build packages. It will be bundled with no SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan
Copy link
Copy Markdown

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✖️ debian ✔️ suse15. SL-JID 17290

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants