In this article we go over resetting vSphere certificates, particularly on the vCenter. It covers the main process, best practices, checking certificates and the errors you might encounter along the way, as a one-stop shop for vSphere certificate issues to help get your systems back up and running
Important – By continuing you are agreeing to the disclaimer here
Checking Certificates Expiry And Errors
The number 1 error you’ll see if your certificates have already expired is the HTTP 500 error
The exact error is ‘HTTP Status 500 – Internal Server Error’ and it looks like this
If you have this, some certificates have expired
You may also have expired certificates if the GUI isn’t loading properly or seems generally broken, but isn’t quite as bad as the HTTP 500 issue. An example would be errors in the Certificate Management section, under Administration in the vSphere UI
If you have odd errors like the ones above, you’ll probably want to check your certificates to see if anything has expired
The first on this list is the STS Signing Certificate. This one is unlikely to cause the HTTP 500 error, but it can cause the other issues shown above
There are two ways to check the STS Signing Certificate. The UI is the best place, but if you can’t get into the UI, the CLI will be the way forward
vSphere UI
To check in the UI click the three lines in the top left
Go into administration
Then under Certificate Management at the bottom, followed by STS Signing Certificate
This UI is for vSphere 8U3, for older versions it may look like this
vCenter CLI
To check in the CLI you’ll need the script VMware have in this article
If that doesn’t work for any reason, I have added it below; it will need saving as a Python file. Use this only if the Broadcom link doesn’t work, as the official script may change and this copy may not be accurate or work properly
CheckSTS
#!/opt/vmware/bin/python
"""
Copyright 2020-2022 VMware, Inc. All rights reserved. -- VMware Confidential
Author: Keenan Matheny ([email protected])
"""
##### BEGIN IMPORTS #####
import os
import sys
import json
import subprocess
import re
import pprint
import ssl
from datetime import datetime, timedelta
import textwrap
from codecs import encode, decode
import subprocess
from time import sleep
try:
    # Python 3 hack.
    import urllib.request as urllib2
    import urllib.parse as urlparse
except ImportError:
    import urllib2
    import urlparse
sys.path.append(os.environ['VMWARE_PYTHON_PATH'])
from cis.defaults import def_by_os
sys.path.append(os.path.join(os.environ['VMWARE_CIS_HOME'],
                def_by_os('vmware-vmafd/lib64', 'vmafdd')))
import vmafd
from OpenSSL.crypto import (load_certificate, dump_privatekey, dump_certificate, X509, X509Name, PKey)
from OpenSSL.crypto import (TYPE_DSA, TYPE_RSA, FILETYPE_PEM, FILETYPE_ASN1 )
today = datetime.now()
today = today.strftime("%d-%m-%Y")
vcsa_kblink = "https://kb.vmware.com/s/article/76719"
win_kblink = "https://kb.vmware.com/s/article/79263"
##### END IMPORTS #####

class parseCert( object ):
    # Certificate parsing
    def format_subject_issuer(self, x509name):
        items = []
        for item in x509name.get_components():
            items.append('%s=%s' % (decode(item[0],'utf-8'), decode(item[1],'utf-8')))
        return ", ".join(items)

    def format_asn1_date(self, d):
        return datetime.strptime(decode(d,'utf-8'), '%Y%m%d%H%M%SZ').strftime("%Y-%m-%d %H:%M:%S GMT")

    def merge_cert(self, extensions, certificate):
        z = certificate.copy()
        z.update(extensions)
        return z

    def __init__(self, certdata):
        built_cert = certdata
        self.x509 = load_certificate(FILETYPE_PEM, built_cert)
        keytype = self.x509.get_pubkey().type()
        keytype_list = {TYPE_RSA:'rsaEncryption', TYPE_DSA:'dsaEncryption', 408:'id-ecPublicKey'}
        extension_list = ["extendedKeyUsage",
                          "keyUsage",
                          "subjectAltName",
                          "subjectKeyIdentifier",
                          "authorityKeyIdentifier"]
        key_type_str = keytype_list[keytype] if keytype in keytype_list else 'other'
        certificate = {}
        extension = {}
        for i in range(self.x509.get_extension_count()):
            critical = 'critical' if self.x509.get_extension(i).get_critical() else ''
            if decode(self.x509.get_extension(i).get_short_name(),'utf-8') in extension_list:
                extension[decode(self.x509.get_extension(i).get_short_name(),'utf-8')] = self.x509.get_extension(i).__str__()
        certificate = {'Thumbprint': decode(self.x509.digest('sha1'),'utf-8'), 'Version': self.x509.get_version(),
                       'SignatureAlg' : decode(self.x509.get_signature_algorithm(),'utf-8'), 'Issuer' :self.format_subject_issuer(self.x509.get_issuer()),
                       'Valid From' : self.format_asn1_date(self.x509.get_notBefore()), 'Valid Until' : self.format_asn1_date(self.x509.get_notAfter()),
                       'Subject' : self.format_subject_issuer(self.x509.get_subject())}
        combined = self.merge_cert(extension,certificate)
        cert_output = json.dumps(combined)
        self.subjectAltName = combined.get('subjectAltName')
        self.subject = combined.get('Subject')
        self.validfrom = combined.get('Valid From')
        self.validuntil = combined.get('Valid Until')
        self.thumbprint = combined.get('Thumbprint')
        self.subjectkey = combined.get('subjectKeyIdentifier')
        self.authkey = combined.get('authorityKeyIdentifier')
        self.combined = combined

class parseSts( object ):
    def __init__(self):
        self.processed = []
        self.results = {}
        self.results['expired'] = {}
        self.results['expired']['root'] = []
        self.results['expired']['leaf'] = []
        self.results['valid'] = {}
        self.results['valid']['root'] = []
        self.results['valid']['leaf'] = []

    def get_certs(self,force_refresh):
        urllib2.getproxies = lambda: {}
        vmafd_client = vmafd.client('localhost')
        domain_name = vmafd_client.GetDomainName()
        dc_name = vmafd_client.GetAffinitizedDC(domain_name, force_refresh)
        if vmafd_client.GetPNID() == dc_name:
            url = (
                'http://localhost:7080/idm/tenant/%s/certificates?scope=TENANT'
                % domain_name)
        else:
            url = (
                'https://%s/idm/tenant/%s/certificates?scope=TENANT'
                % (dc_name,domain_name))
        return json.loads(urllib2.urlopen(url).read().decode('utf-8'))

    def check_cert(self,certificate):
        cert = parseCert(certificate)
        certdetail = cert.combined
        # Attempt to identify what type of certificate it is
        if cert.authkey:
            cert_type = "leaf"
        else:
            cert_type = "root"
        # Try to only process a cert once
        if cert.thumbprint not in self.processed:
            # Date conversion
            self.processed.append(cert.thumbprint)
            exp = cert.validuntil.split()[0]
            conv_exp = datetime.strptime(exp, '%Y-%m-%d')
            exp = datetime.strftime(conv_exp, '%d-%m-%Y')
            now = datetime.strptime(today, '%d-%m-%Y')
            exp_date = datetime.strptime(exp, '%d-%m-%Y')
            # Get number of days until it expires
            diff = exp_date - now
            certdetail['daysUntil'] = diff.days
            # Sort expired certs into leafs and roots, put the rest in goodcerts.
            if exp_date <= now:
                self.results['expired'][cert_type].append(certdetail)
            else:
                self.results['valid'][cert_type].append(certdetail)

    def execute(self):
        json = self.get_certs(force_refresh=False)
        for item in json:
            for certificate in item['certificates']:
                self.check_cert(certificate['encoded'])
        return self.results

def main():
    warning = False
    warningmsg = '''
    WARNING!
    You have expired STS certificates. Please follow the KB corresponding to your OS:
    VCSA: %s
    Windows: %s
    ''' % (vcsa_kblink, win_kblink)
    parse_sts = parseSts()
    results = parse_sts.execute()
    valid_count = len(results['valid']['leaf']) + len(results['valid']['root'])
    expired_count = len(results['expired']['leaf']) + len(results['expired']['root'])
    #### Display Valid ####
    print("\n%s VALID CERTS\n================" % valid_count)
    print("\n\tLEAF CERTS:\n")
    if len(results['valid']['leaf']) > 0:
        for cert in results['valid']['leaf']:
            print("\t[] Certificate %s will expire in %s days (%s years)." % (cert['Thumbprint'], cert['daysUntil'], round(cert['daysUntil']/365)))
    else:
        print("\tNone")
    print("\n\tROOT CERTS:\n")
    if len(results['valid']['root']) > 0:
        for cert in results['valid']['root']:
            print("\t[] Certificate %s will expire in %s days (%s years)." % (cert['Thumbprint'], cert['daysUntil'], round(cert['daysUntil']/365)))
    else:
        print("\tNone")
    #### Display expired ####
    print("\n%s EXPIRED CERTS\n================" % expired_count)
    print("\n\tLEAF CERTS:\n")
    if len(results['expired']['leaf']) > 0:
        for cert in results['expired']['leaf']:
            print("\t[] Certificate: %s expired on %s!" % (cert.get('Thumbprint'),cert.get('Valid Until')))
            continue
    else:
        print("\tNone")
    print("\n\tROOT CERTS:\n")
    if len(results['expired']['root']) > 0:
        for cert in results['expired']['root']:
            print("\t[] Certificate: %s expired on %s!" % (cert.get('Thumbprint'),cert.get('Valid Until')))
            continue
    else:
        print("\tNone")
    if expired_count > 0:
        print(warningmsg)

if __name__ == '__main__':
    exit(main())
Once you have that, you’ll need to pop it on the vCenter. Ensure SSH is enabled for this; if it’s not, you can enable it from the admin portal on https://fqdn:5480 and click edit here
Ideally, the script wants to go into /tmp with WinSCP. When logging on to the vCenter you may encounter this error
Received too large (1433299822 B) SFTP packet
To fix this SSH into the vCenter shell and run
chsh -s /bin/bash <account>
You are most likely using the root account so the command will be
chsh -s /bin/bash root
This changes the default login shell from the appliance shell to the bash shell
It can be changed back with
chsh -s /bin/appliancesh <account>
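As an aside on where that odd packet size comes from: an SFTP client reads the first four bytes it receives as a big-endian packet length. If the login shell replies with plain text instead of speaking SFTP, those four bytes are really ASCII characters. Here is a minimal sketch of that idea; the function name is my own and this is only an illustration of the likely cause, not something taken from the WinSCP or VMware documentation.

```python
def sftp_length_as_text(length: int) -> str:
    """Reinterpret a 4-byte big-endian SFTP packet length as ASCII text."""
    return length.to_bytes(4, byteorder="big").decode("ascii", errors="replace")


if __name__ == "__main__":
    # The size reported in the WinSCP error above
    print(sftp_length_as_text(1433299822))  # prints "Unkn"
```

The four characters look like the start of a text reply from the appliance shell, which would fit with the fix: once the login shell is bash, nothing gets printed in front of the SFTP session.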
Once it’s in the /tmp directory, run the following to execute it
python ./<filename>
When I downloaded it, it saved as 0685G00000lTiIBQA0_checksts.py so I ran
python /tmp/0685G00000lTiIBQA0_checksts.py
Which gave this output
If everything is listed under valid, you are all good; if any have expired, they will need renewing
Then you’ll want to check the rest of the certificates, the best way to do this is via the CLI, so SSH into the vCenter and run the following
for store in $(/usr/lib/vmware-vmafd/bin/vecs-cli store list | grep -v TRUSTED_ROOT_CRLS); do echo "[*] Store :" $store; /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store $store --text | grep -ie "Alias" -ie "Not After";done;
This will tell you what’s expiring for the rest of your certificates
If you have renewed certificates before, you may see a certificate store at the bottom with expired certificates. This is expected and normal. From here make a note of what, if anything, has expired
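If there are a lot of stores to read through, you could save the output of the loop above to a file and let a short script pick out the expired entries. This is a rough sketch that assumes the output has the shape shown in the sample (a `[*] Store :` line, then `Alias :` and `Not After :` lines with OpenSSL style dates); the function name and the sample data are made up for illustration, so check it against your own output first.

```python
from datetime import datetime, timezone


def find_expired(listing: str, now: datetime) -> list:
    """Return (store, alias, not_after) for every entry that expired before `now`."""
    expired = []
    store = alias = None
    for raw in listing.splitlines():
        line = raw.strip()
        if line.startswith("[*] Store :"):
            store = line.partition(":")[2].strip()
        elif line.lower().startswith("alias"):
            alias = line.partition(":")[2].strip()
        elif line.lower().startswith("not after"):
            # Collapse the double space used in front of single-digit days
            stamp = " ".join(line.partition(":")[2].split())
            not_after = datetime.strptime(stamp, "%b %d %H:%M:%S %Y GMT")
            not_after = not_after.replace(tzinfo=timezone.utc)
            if not_after < now:
                expired.append((store, alias, not_after))
    return expired


if __name__ == "__main__":
    # Made-up sample in the assumed output format
    sample = """\
[*] Store : MACHINE_SSL_CERT
Alias : __MACHINE_CERT
            Not After : Nov  5 09:41:12 2023 GMT
[*] Store : vpxd
Alias : vpxd
            Not After : Jan 20 10:00:00 2031 GMT
"""
    for store, alias, when in find_expired(sample, datetime.now(timezone.utc)):
        print(f"{store}/{alias} expired on {when:%Y-%m-%d}")
```

Passing `now` in as an argument rather than reading the clock inside the function also lets you ask what will have expired by a future date.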
Preparing To Renew Certificates
When renewing the certificates, the first thing you will want before touching the vCenter is a config backup from the vCenter admin portal. You can find more info on setting that up and viewing them here
However, if you are replacing certificates because they have already expired and you don’t have a recent backup (ideally from within 24 hours, though you should be able to get away with 48-72), you can try taking one, but it will likely fail due to key services slipping into the stopped state. If that is the case, the snapshot which I will go through below will have to do, as you won’t be able to get a new config backup
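The backup age rule of thumb above can be written down as a tiny helper if you want to check it in a script. This is only a sketch of my own rule, with a hypothetical function name and labels, not anything official from VMware.

```python
from datetime import datetime, timedelta


def backup_usability(taken_at: datetime, now: datetime) -> str:
    """Classify a config backup by age using the 24 hour / 72 hour rule of thumb."""
    age = now - taken_at
    if age <= timedelta(hours=24):
        return "ideal"
    if age <= timedelta(hours=72):
        return "probably usable"
    return "too old, rely on the snapshot"


if __name__ == "__main__":
    now = datetime(2025, 1, 10, 12, 0)
    print(backup_usability(datetime(2025, 1, 9, 18, 0), now))  # 18 hours old
    print(backup_usability(datetime(2025, 1, 1, 12, 0), now))  # 9 days old
```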
Then ensure you have a powered-off snapshot of the VM. This includes all vCenter VMs in an SSO domain, so if you have 2+ in Enhanced Linked Mode, e.g. you can see all vCenters from any vCenter UI, they all need powering off and snapshotting before you do this
To do this, first find the vCenter VM in the vSphere UI and note the ESXi host it’s running on. Then power the vCenter off with an OS shutdown, and go to the ESXi host client UI on https://fqdn or https://IP for the ESXi host the vCenter VM is running on
As an example, my vCenter is called Borealis, so I will find that VM here and note the host, glacier.leaha.co.uk. From there I will click the red stop icon to initiate a guest OS shutdown
Then I am going to go to the ESXi UI on https://glacier.leaha.co.uk, log in, find Borealis, and take a snapshot once the VM is powered off. You’ll know it’s off when the VM doesn’t have the play symbol and isn’t consuming CPU, showing 0MHz
This gives us our restore point should anything go really wrong, of course remember, for vCenters in enhanced link mode, all vCenters in the SSO domain need this doing, no exceptions, even if I only need to change the certificates on a single vCenter, this is really important
Once that’s all done, power up the vCenter VM and give it 5-10 mins to come up. Not all services will start if you have expired certificates, but you are then ready to renew certificates
Renewing Certificates
Now all the prep work is done, we can look to renew the certificates, you’ll always want to start with the STS signing certificate
One thing to note is that once you have renewed this, you may need to update the certificate and rescan the vCenter in other products that connect to it, for example NSX, Veeam and Horizon, so they work properly
Renewing The STS Signing Certificate
If you can, you want to do this in the UI rather than the CLI, but if you don’t have UI access, then you will need to use the CLI method
vSphere UI
Go into Administration by clicking the three lines
Then certificate management
Then click actions under STS Signing Certificate, and click Refresh with vCenter certificate
This will require a reboot of the vCenter
For vSphere 8U3 and newer you can click the certificate here and renew, you’ll want to do both
vCenter CLI
If the vSphere UI isn’t accessible then you’ll need to do this in the CLI
Broadcom’s recommendation here is to do this part on only one node in a system of enhanced linked vCenters, shown here, under STS Certificate, you will also want to grab the fixsts script from that link as we will need it to replace the STS signing certificate
Upload the fixsts script to the vCenter in /tmp, the same way we checked the STS certificate in the first section
From there, let’s make it executable
chmod +x /tmp/fixsts.sh
Then run the script
/tmp/fixsts.sh
If you get this error
bash: /tmp/fixsts.sh: /bin/bash^M: bad interpreter: No such file or directory
This is caused by DOS carriage returns added to the script when it was copied from a Windows-based text editor. To resolve this run
sed -i -e 's/\r$//' /tmp/fixsts.sh
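If you would rather clean the file on your workstation before uploading it, the same fix can be sketched in Python. This is a minimal equivalent of the sed expression above, assuming the only problem is a carriage return at the end of each line; the function name is my own.

```python
import re


def strip_carriage_returns(data: bytes) -> bytes:
    """Remove a carriage return at the end of each line, like sed 's/\\r$//'."""
    return re.sub(rb"\r$", b"", data, flags=re.MULTILINE)


if __name__ == "__main__":
    windows_style = b"#!/bin/bash\r\necho hello\r\n"
    print(strip_carriage_returns(windows_style))
```

Working on bytes rather than text avoids Python doing its own newline translation when the file is read and written back.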
Once that’s run, you’ll need to restart all the services with
service-control --stop --all && service-control --start --all
This may fail if there are other certificates that have expired. That’s fine, you’ll want to move on to the next section to replace the other expired certificates, which will have you restart the services
Renewing Other Certificates
To renew anything else, you’ll want to be SSH’d into the vCenter. What we need to renew here is what you made a note of when we checked the rest of the certificates. If there is a lot, you are best off renewing everything, but if it’s just one thing, you can renew just that
Open the certificate manager by running
/usr/lib/vmware-vmca/bin/certificate-manager
Now you will want to pick an option relevant to the certificates that have expired. The process is the same either way, so the rule I go by is: if you only need one certificate option, do just that; if you need two or more, just do everything
You can use custom certificates from here by importing them into the vCenter via WinSCP, but for this article I will be going through using self-signed certificates as that’s what most people use
The main options here are
- 3 – For the Machine SSL certificate
- 4 – For the VMCA root certificate
- 6 – For solution user certificates
- 8 – For everything
For this article we will go through option 8, but it’s the same process for the others
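The rule I use for picking an option can be sketched as a small lookup. The menu numbers are the ones listed above; the category names (`machine_ssl`, `vmca_root`, `solution_users`) are labels I have made up for this example and are not anything the certificate manager itself uses.

```python
# Menu numbers as listed above for certificate-manager
OPTIONS = {"machine_ssl": 3, "vmca_root": 4, "solution_users": 6}
EVERYTHING = 8


def pick_option(expired: set) -> int:
    """One expired category: use its own option. Two or more: reset everything."""
    unknown = expired - set(OPTIONS)
    if unknown:
        raise ValueError(f"unrecognised categories: {sorted(unknown)}")
    if not expired:
        raise ValueError("nothing to renew")
    if len(expired) == 1:
        return OPTIONS[next(iter(expired))]
    return EVERYTHING


if __name__ == "__main__":
    print(pick_option({"machine_ssl"}))
    print(pick_option({"machine_ssl", "solution_users"}))
```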
So enter 8 and press enter to load that option
We will say no here, we will manually enter the info
Then we need to fill out the info. What’s in square brackets is the default, so if you enter nothing, that value will be used
You need
- SSO admin credentials, this is normally administrator@vsphere.local; mine is different as I changed the SSO domain when I set the appliance up
The general rule is administrator@<vsphere-domain>
- Password, the field here doesn’t show what you type for security, this is for the vSphere admin account
- Country, Name, Organisation, OrgUnit, State, Locality – These can all be left at default unless you want to change them
- IP address, this needs to match the vCenter IP
- Email, this can also be left at default unless you want to change it
- Hostname, this needs to be the vCenter FQDN
- VMCA Name, this is the hostname without the domain
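Two of those values are easy to get wrong when typing them by hand, so here is a small sketch for sanity checking them beforehand. It derives the VMCA name from the FQDN as described in the list and validates the IP address with the standard library; the function names and the example hostname are hypothetical.

```python
import ipaddress


def vmca_name_from_fqdn(fqdn: str) -> str:
    """Return the short hostname, i.e. the FQDN without its domain."""
    short, _, domain = fqdn.strip().partition(".")
    if not short or not domain:
        raise ValueError(f"{fqdn!r} does not look like an FQDN")
    return short


def check_ip(value: str) -> str:
    """Return the normalised address, or raise ValueError if it is not a valid IP."""
    return str(ipaddress.ip_address(value.strip()))


if __name__ == "__main__":
    # Hypothetical values for illustration only
    print(vmca_name_from_fqdn("vcenter.example.com"))
    print(check_ip("192.168.10.20"))
```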
Here is mine as an example
Then we hit Y to continue, and yes again
You may see something like this for a vCenter with external registrations
This can either go fine or have errors, see Error While Performing Rollback Operation if it errors out here
Error While Performing Rollback Operation
I have used this on vCenter 6.7/7, I haven’t seen it needed on vCenter 8 yet
This part can error out like so
This is due to exceptions being returned by the service registrations
This error can be found in /var/log/vmware/certificate-manager.log
ERROR certificate-manager 'lstool get' failed: 1
Fixing this is a little tricky and a bit involved with the certificate manager script
Go to the directory where this is installed with
cd /usr/lib/vmware/site-packages/cis
Create a backup copy of the certificate manager script, in case it goes wrong or you need to refer back to it, with
cp certificateManagerHelper.py certificateManagerHelper.py.bak
Now you will need to edit the file with
vi certificateManagerHelper.py
Vi can be a little painful to use if you haven’t used it before, but you can use the arrow keys to find the lines you want
You are looking for something like this
if(rc != 0):
    logging.error("'lstool list' failed: {}".format(rc))
    raise Exception("'lstool list' failed: %d" % rc)
It may look different, but as long as you have the first line, the if statement, and a raise Exception line, that’s what you want to edit
Press ‘i’ in Vi to enter insert mode, which lets you type where the cursor is. You want to comment out the raise Exception line with a ‘#’, like so
if(rc != 0):
    logging.error("'lstool list' failed: {}".format(rc))
    # raise Exception("'lstool list' failed: %d" % rc)
There are about 3 instances of this, it can vary, but check you get them all
Once you have commented them all out, press escape to exit insert mode
Then we need to save and quit, press ‘:’, then wq and hit enter to save and exit
You may need to use wq! over wq if you have issues
This should fix the problem
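If you are not comfortable in Vi, the same edit can be sketched as a text transformation. This is a hedged example rather than a tested tool: it assumes every line you want to disable is an uncommented `raise Exception(` line that mentions lstool, as in the snippet above, and the function name is my own. Run it against the text of a copy and compare the result before replacing anything on the appliance.

```python
import re

# Matches an uncommented `raise Exception(...)` line that mentions lstool
RAISE_LINE = re.compile(r"^(\s*)(raise Exception\(.*lstool.*)$")


def comment_out_lstool_raises(source: str) -> tuple:
    """Return (patched_source, number_of_lines_changed)."""
    patched, changed = [], 0
    for line in source.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        match = RAISE_LINE.match(body)
        if match:
            ending = line[len(body):]  # keep the original line ending
            line = f"{match.group(1)}# {match.group(2)}{ending}"
            changed += 1
        patched.append(line)
    return "".join(patched), changed


if __name__ == "__main__":
    snippet = (
        "if(rc != 0):\n"
        "    logging.error(\"'lstool list' failed: {}\".format(rc))\n"
        "    raise Exception(\"'lstool list' failed: %d\" % rc)\n"
    )
    new_text, count = comment_out_lstool_raises(snippet)
    print(count)
    print(new_text)
```

Commenting the line out only leaves valid Python because the logging.error line stays behind in the if block; if a block contained nothing but the raise, it would need a `pass` adding instead.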
For more info check out James Mueller’s article around this as it helped me get a few vCenters past this error here
When you get to this stage, I recommend opening another SSH connection to the vCenter
The renewal terminal doesn’t show any progress, so if you run the following in another window, you can track it
watch service-control --status
You’ll get something like this
It’s worth noting that not all services will start, as these depend on the features you have enabled, but the important ones are
- wcp
- vmware-vpxd
- vmware-certificatemanagement
- vmdird
- vmware-vapi-endpoint
- vstats
- vsphere-ui
If any of the above won’t start, the certificate renewal may fail, but be patient and wait for the renewal terminal to say completed or failed
If you get a failure due to a service not starting, it’s likely the vmware-vapi-endpoint, which is linked to the STS signing certificate. Ensure this is valid and that you have run fixsts in the Renewing The STS Signing Certificate section, as this has caught me out a few times
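Rather than eyeballing the status output for the services listed above, you could paste it into a short script. This sketch assumes the output is laid out as a state heading ending in a colon (such as `Running:` or `Stopped:`) followed by lines of space separated service names; the function names and the sample are my own, so adjust it if your version prints something different.

```python
IMPORTANT = ["wcp", "vmware-vpxd", "vmware-certificatemanagement", "vmdird",
             "vmware-vapi-endpoint", "vstats", "vsphere-ui"]


def parse_status(output: str) -> dict:
    """Map each state heading (e.g. 'Running') to the set of services under it."""
    states, current = {}, None
    for line in output.splitlines():
        text = line.strip()
        if not text:
            continue
        if text.endswith(":"):
            current = text[:-1]
            states.setdefault(current, set())
        elif current is not None:
            states[current].update(text.split())
    return states


def not_running(output: str, wanted=IMPORTANT) -> list:
    """Return the wanted services that are not listed under 'Running'."""
    running = parse_status(output).get("Running", set())
    return [name for name in wanted if name not in running]


if __name__ == "__main__":
    # Made-up sample in the assumed layout
    sample = """\
Running:
 applmgmt lookupsvc vmdird vmware-vpxd vsphere-ui
Stopped:
 vmware-vapi-endpoint wcp vstats vmware-certificatemanagement
"""
    print(not_running(sample))
```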
Renewing ESXi Certificates
If your ESXi host is saying certificates are due to expire, renewing them is dead simple
To do this, navigate to the host, then Configure > System > Certificate, click Manage with VMCA for self-signed certificates, and hit Renew
One thing to note: if the valid from date changes but the valid to date doesn’t, that’s because the valid to date is linked to the vCenter’s certificates. If those are due to expire they will limit it, so check and renew them first if the valid to date isn’t any newer, then renew the host certificate again
Restoring After A Failed Renewal
So what happens if the worst has happened and it’s now much more broken than it was before, or unchanged?
Well, we took all the backup options before so there are a few bits you want to look to do
If you have a config backup from the last 24 hours, worst case 48-72 hours, and you still haven’t got a working vCenter, restore the config backup into a new appliance to fix everything
A different type of backup, e.g. Veeam or Rubrik, won’t help here. They aren’t good backups of a vCenter in general, due to the back-end database and the backup software not being application aware, and they will restore it with the old certificates anyway
If you can’t restore the config, and have made the vCenter worse and want to start over, this is where you power it off, go back to the snapshot manager in ESXi and restore the snapshot to start again
Go to ESXi and shut down the vCenter, then go to snapshot manager
Select the snapshot and click restore snapshot