Troubleshooting

EVC Mode & CPU Compatibility: The “Lazy Admin” FAQ


You’ve just unboxed a shiny new host with the latest Intel or AMD processor, but your current cluster is running hardware from three years ago. You try to vMotion a VM, and vSphere gives you the dreaded “CPU Incompatibility” error.

Enter Enhanced vMotion Compatibility (EVC). Here’s everything you need to know to get your mixed-hardware cluster working without the headache.


What exactly is EVC?

Think of EVC as a “lowest common denominator” filter for your CPUs. It masks the advanced features of newer processors so that every host in the cluster appears to have the exact same instruction set. This allows VMs to live-migrate between old and new hardware because the “view” of the CPU never changes.
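
To make the “lowest common denominator” idea concrete, here is a minimal Python sketch. The host names and feature flags are invented for illustration, and real EVC modes are predefined per CPU generation rather than computed like this, so treat it as a mental model only.

```python
# Hypothetical hosts and feature flags, invented for this illustration.
HOST_FEATURES = {
    "esx-old-01": {"sse4.2", "aes", "avx"},
    "esx-old-02": {"sse4.2", "aes", "avx", "avx2"},
    "esx-new-01": {"sse4.2", "aes", "avx", "avx2", "avx512f"},
}


def common_feature_set(host_features):
    """Return the features that every host in the cluster supports."""
    feature_sets = [set(features) for features in host_features.values()]
    if not feature_sets:
        return set()
    return set.intersection(*feature_sets)


def masked_features(host_features, host):
    """Return the features a given host would have to hide from its VMs."""
    return set(host_features[host]) - common_feature_set(host_features)


if __name__ == "__main__":
    print("Visible to VMs:", sorted(common_feature_set(HOST_FEATURES)))
    print("Hidden on esx-new-01:", sorted(masked_features(HOST_FEATURES, "esx-new-01")))
```

The newest host keeps all its cores and clock speed; only the flags outside the shared set disappear from the VM's point of view.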

Quick FAQ

Q: Can I mix Intel and AMD in the same EVC cluster? A: No. EVC only works within a single vendor family. You can mix different generations of Intel, or different generations of AMD, but you cannot vMotion between the two brands.

Q: Will EVC slow down my new servers? A: Technically, yes, but rarely in a way you’ll notice. It hides new instructions (like specialized encryption or AI math sets), but the raw clock speed and core count of your new CPUs are still fully utilized. Most general-purpose VMs don’t use the high-end instructions being masked.

Q: Do I need to power off VMs to enable EVC? A: It depends:

  • Enabling on an empty cluster: No downtime.
  • Enabling on a cluster where VMs are already running on the oldest host: Usually no downtime.
  • Enabling on a cluster where VMs are running on newer hosts: You must power off those VMs (a guest OS restart is not enough) so they power back on with the masked CPU instructions.

Q: What is “Per-VM EVC”? A: Introduced in vSphere 6.7, this allows you to set the EVC mode on the VM itself rather than the whole cluster, so the setting travels with the VM. It requires VM hardware version 14 or later. This is a lifesaver for migrating VMs across different vCenters or into the Cloud (like AWS/Azure).


How to Find Your Correct EVC Mode

Don’t guess. Use the official tool:

  1. Go to the VMware Compatibility Guide (CPU/EVC Matrix).
  2. Select your ESXi version.
  3. Select the CPU models of your oldest and newest hosts.
  4. The tool will tell you the highest supported “Baseline” you can use.

Step-by-Step: Enabling EVC on an Existing Cluster

  1. Select your Cluster in vCenter.
  2. Go to Configure > VMware EVC.
  3. Click Edit.
  4. Select Enable EVC for Intel/AMD hosts.
  5. Choose the Baseline that matches your oldest host.
  6. Validation: vCenter will check if any running VMs are currently using features above that baseline. If they are, you’ll need to shut them down before you can save the settings.
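
If you want a rough idea of which VMs the validation step will complain about before you click Edit, the logic can be sketched in Python. This is a simplification under stated assumptions: the VM names are made up, the generation list is only a partial Intel ordering, and vCenter actually checks the features each running VM uses, not just the generation of the host it was powered on.

```python
from dataclasses import dataclass

# Partial Intel ordering, oldest to newest, used only for this sketch.
INTEL_GENERATIONS = ["Haswell", "Broadwell", "Skylake", "Cascade Lake", "Ice Lake"]


@dataclass
class RunningVM:
    name: str
    powered_on_generation: str  # generation of the host where the VM was powered on


def vms_needing_power_off(vms, target_baseline, generations=INTEL_GENERATIONS):
    """Return names of VMs powered on above the target baseline."""
    rank = {generation: index for index, generation in enumerate(generations)}
    limit = rank[target_baseline]
    return [vm.name for vm in vms if rank[vm.powered_on_generation] > limit]


if __name__ == "__main__":
    inventory = [
        RunningVM("web01", "Skylake"),
        RunningVM("db01", "Ice Lake"),
        RunningVM("app01", "Cascade Lake"),
    ]
    print(vms_needing_power_off(inventory, "Skylake"))
```

Anything the function returns is a candidate for a maintenance window; everything else can usually stay powered on while you enable EVC.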

Summary Table: EVC Baselines

If your oldest host is… | Use this EVC Mode
Intel Ice Lake | Intel “Ice Lake” Generation
Intel Cascade Lake | Intel “Cascade Lake” Generation
AMD EPYC Rome | AMD EPYC “Rome” Generation

Lost Your VM? How to Find Its ESXi Host from the Guest OS


It’s a classic “Ghost in the Machine” scenario: You can RDP or SSH into a virtual machine, but you can’t find it in vCenter. Maybe it’s a massive environment with thousands of VMs, maybe the naming convention doesn’t match, or maybe you’re dealing with a rogue host that isn’t even in your main cluster.

If VMware Tools is installed and running, the VM actually knows exactly where it lives. You just have to ask it nicely through the Command Prompt.


The Magic Tool: vmtoolsd.exe

On Windows VMs, the VMware Tools service includes a CLI utility called vmtoolsd.exe. This tool can query the hypervisor for specific guestinfo variables that are passed down to the guest.

1. Find the ESXi Hostname

If you need to know which physical server is currently crunching the cycles for your VM, run this command:

"C:\Program Files\VMware\VMware Tools\vmtoolsd.exe" --cmd "info-get guestinfo.hypervisor.hostname"

2. Get the ESXi Build Details

Need to know if the underlying host is patched or running an ancient version of ESXi? Query the build number:

"C:\Program Files\VMware\VMware Tools\vmtoolsd.exe" --cmd "info-get guestinfo.hypervisor.build"

Why is this useful?

  • vCenter Search is failing: Sometimes the inventory search index gets corrupted, and “Name contains” returns nothing.
  • Nested Environments: If you are running VMs inside VMs, this helps you verify which layer of the onion you are currently on.
  • Troubleshooting Performance: If a VM is lagging, you can quickly identify the host to check for hardware alerts or CPU contention without leaving the OS.

What if I’m on Linux?

The same logic applies! Most modern Linux distributions use open-vm-tools. You can run the same query via the terminal:

vmtoolsd --cmd "info-get guestinfo.hypervisor.hostname"
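
If you run these lookups often, the Windows and Linux variants can be wrapped in one small Python helper. This is a sketch, not an official VMware API: it simply shells out to the same vmtoolsd commands shown above, assumes the default Windows install path from this article, and returns None when the tool is missing or the query fails.

```python
import platform
import subprocess

# Default Windows path used in the commands above; adjust if Tools lives elsewhere.
WINDOWS_VMTOOLSD = r"C:\Program Files\VMware\VMware Tools\vmtoolsd.exe"


def build_query(key, system=None):
    """Build the vmtoolsd argument list for an info-get query."""
    system = system or platform.system()
    binary = WINDOWS_VMTOOLSD if system == "Windows" else "vmtoolsd"
    return [binary, "--cmd", f"info-get {key}"]


def query_guestinfo(key):
    """Run the query and return the trimmed value, or None if it fails."""
    try:
        result = subprocess.run(
            build_query(key), capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # vmtoolsd is missing, or the hypervisor refused or lacks the key.
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    for key in ("guestinfo.hypervisor.hostname", "guestinfo.hypervisor.build"):
        print(key, "=", query_guestinfo(key))
```

Passing the arguments as a list avoids the quoting headaches you get when pasting the Windows path into different shells.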

Important Requirement: Guest RPC

For these commands to work, the VM must have VMware Tools installed and running, and the guestinfo variables must be accessible. In some hardened environments, admins might disable these RPC (Remote Procedure Call) queries in the .vmx file for security reasons, but in most standard builds, this will work out of the box.

Troubleshooting: How to Force Cancel a Hung Task in vCenter or ESXi


    We’ve all been there: a vMotion hits 99% and just… stays there. Or a backup job finishes on the proxy side, but vCenter still thinks the VM is “busy.” Usually, the Cancel button is grayed out, leaving you stuck in management limbo.

    When the GUI fails you, it’s time to hop into the CLI. Here is how to manually kill a hung task by targeting the VM’s parent process.


    Step 1: Verify the Task

    Before pulling the trigger, confirm the task is actually stuck and not just slow. Check the Monitor > Tasks and Events tab for the specific VM. If the progress bar hasn’t budged in an hour and the “Cancel” option is disabled, proceed to the host.

    Step 2: Enable and Connect via SSH

    To kill a process, you need to be on the specific ESXi host where the VM is currently registered.

    1. Enable SSH: Go to the ESXi host in vSphere > Configure > System > Services > Start SSH.
    2. Connect: Open your terminal (PuTTY, CMD, or Terminal) and log in as root.

    Step 3: Locate the Parent Process ID (PID)

    We need to find the specific process tied to your VM. Use the ps command combined with grep to filter for your VM’s name.

    Run the following command:

    ps -v | grep "Your_VM_Name"

    (Note: Using the -v flag in ESXi provides a more detailed view of the world ID and parent processes.)

    Look for the line representing the VM’s main process. You are looking for the Leader ID or the first ID listed in the row.
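
If you have saved the ps output to a file or variable, picking out “the first ID listed in the row” can be sketched in Python. The sample text below is invented purely for illustration; the real column layout depends on your ESXi version, so check it against what your host actually prints before trusting the result.

```python
# Invented sample output; real ESXi column layouts differ between versions.
SAMPLE_PS_OUTPUT = """\
859467  859467  vmx          /bin/vmx /vmfs/volumes/ds1/Your_VM_Name/Your_VM_Name.vmx
859470  859467  vmx-vthread  /bin/vmx /vmfs/volumes/ds1/Your_VM_Name/Your_VM_Name.vmx
912001  912001  vmx          /bin/vmx /vmfs/volumes/ds1/Other_VM/Other_VM.vmx
"""


def first_matching_id(ps_output, vm_name):
    """Return the first-column ID of the first line mentioning vm_name, else None."""
    for line in ps_output.splitlines():
        if vm_name not in line:
            continue
        fields = line.split()
        if fields and fields[0].isdigit():
            return int(fields[0])
    return None


if __name__ == "__main__":
    print(first_matching_id(SAMPLE_PS_OUTPUT, "Your_VM_Name"))
```

Like grep, this is a plain substring match, so a VM called “web” would also match “web-test”. Use the full VM name to avoid killing the wrong process.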

    Step 4: Kill the Process

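    Once you have identified the ID (e.g., 859467), send the kill signal. Start with a standard terminate signal, which allows the process to clean up after itself. Keep in mind that ending the VM’s main process also powers the VM off abruptly, which is why Step 5 talks about powering it back on.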

    Run the command:

    kill 859467

    Lazy Admin Tip: If the process is extremely stubborn and won’t die, you can use kill -9 859467 to force an immediate termination. Use this as a last resort!
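
The “polite first, forceful last” pattern is generic, so here it is as a Python sketch. It assumes a Python interpreter and a POSIX-style os.kill are available wherever the process lives; on the ESXi shell itself, the two kill commands above do the same job. The grace period and function names are my own choices for illustration.

```python
import os
import signal
import time


def process_exists(pid):
    """Return True if a process with this PID is still present."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # it exists, but belongs to another user
    return True


def terminate_then_kill(pid, grace_seconds=30.0, poll_interval=1.0,
                        send=os.kill, exists=process_exists, sleep=time.sleep):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL.

    Returns the name of the last signal that was sent.
    """
    send(pid, signal.SIGTERM)
    waited = 0.0
    while waited < grace_seconds:
        if not exists(pid):
            return "SIGTERM"
        sleep(poll_interval)
        waited += poll_interval
    if not exists(pid):
        return "SIGTERM"
    send(pid, signal.SIGKILL)  # last resort, the equivalent of kill -9
    return "SIGKILL"
```

The send, exists, and sleep parameters are only there so the escalation logic can be exercised without touching a real process.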

    Step 5: Verify in vSphere

    Give vCenter a minute to catch up. The hung task should now disappear or show as “Canceled” in the Tasks and Events console. Your VM should return to an “Idle” state, allowing you to power it on, move it, or restart your backup.

    Hyper-V Performance Hack: The Essential Antivirus Exclusions List


    Running antivirus on your Hyper-V host is a security must, but if you don’t configure it correctly, you’re asking for trouble. We’re talking “disappearing” VMs, corrupted virtual disks, and performance so sluggish you’ll think you’re back on physical hardware from 2005.

    The culprit is usually the Real-Time Scanning engine trying to “inspect” a 100GB .vhdx file every time the guest OS writes a single bit. Here is the definitive “Lazy Admin” guide to Hyper-V AV exclusions.


    1. File Extension Exclusions

    Tell your AV to keep its hands off these specific virtual machine file types:

    • Virtual Disks: .vhd, .vhdx
    • Snapshots/Checkpoints: .avhd, .avhdx
    • Saved State: .vsv, .bin, .vmgs
    • Configuration: .xml, .vmcx, .vmrs
    • ISO Images: .iso
    • Tracking: .rct (Resilient Change Tracking)

    2. Directory Exclusions

    If you are using the default paths, exclude these. If you have a dedicated D:\VMs drive (which you should!), exclude that entire custom path as well.

    • Default Configs: C:\ProgramData\Microsoft\Windows\Hyper-V
    • Default VHDs: C:\Users\Public\Documents\Hyper-V\Virtual Hard Disks
    • Default Snapshots: C:\ProgramData\Microsoft\Windows\Hyper-V\Snapshots
    • Cluster Shared Volumes (CSV): C:\ClusterStorage
    • Hyper-V Replica: Any custom replication data folders.
    • SMB 3.0 Shares: If your VMs live on a remote file server, apply these same exclusions to that file server!

    Lazy Admin Pro-Tip: If you’re using a Cluster, don’t just exclude the C:\ClusterStorage folder by path. Use the Volume ID (get it via mountvol) to ensure the exclusion sticks even if drive letters or paths shift.

    3. Process Exclusions

    Sometimes excluding the file isn’t enough; you need to exclude the “person” opening the file. Exclude these core Hyper-V executables:

    • Vmms.exe: The Virtual Machine Management Service.
    • Vmwp.exe: The Virtual Machine Worker Process (one runs for every active VM).
    • Vmcompute.exe: (For Windows Server 2019+) The Host Compute Service.

    Why this matters (The “Error 0x800704C8”)

    If you don’t set these, you’ll eventually see the dreaded Error 0x800704C8 (The requested operation cannot be performed on a file with a user-mapped section open). This happens when your AV locks the VM’s configuration file exactly when Hyper-V tries to start it.

    What about Windows Defender?

    Good news for the truly lazy: if you are using built-in Microsoft Defender on Windows Server, it automatically detects the Hyper-V role and applies most of these exclusions for you. However, it does not always catch your custom storage paths (like E:\MyVMs), so always double-check your work!
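
For Microsoft Defender specifically, one way to keep the lists from this post in one place is a small Python helper that prints Add-MpPreference commands for you to review and paste into an elevated PowerShell session. This is a sketch: the custom path is the E:\MyVMs example from above, the script only prints text and changes nothing itself, and other AV products have their own consoles or CLIs.

```python
EXTENSIONS = [".vhd", ".vhdx", ".avhd", ".avhdx", ".vsv", ".bin", ".vmgs",
              ".xml", ".vmcx", ".vmrs", ".iso", ".rct"]
PATHS = [
    r"C:\ProgramData\Microsoft\Windows\Hyper-V",
    r"C:\Users\Public\Documents\Hyper-V\Virtual Hard Disks",
    r"C:\ProgramData\Microsoft\Windows\Hyper-V\Snapshots",
    r"C:\ClusterStorage",
    r"E:\MyVMs",  # example custom path; replace with your own
]
PROCESSES = ["Vmms.exe", "Vmwp.exe", "Vmcompute.exe"]


def ps_quote(value):
    """Quote a value as a PowerShell single-quoted string."""
    return "'" + value.replace("'", "''") + "'"


def defender_commands(extensions, paths, processes):
    """Render one Add-MpPreference command per non-empty exclusion type."""
    groups = [
        ("-ExclusionExtension", extensions),
        ("-ExclusionPath", paths),
        ("-ExclusionProcess", processes),
    ]
    commands = []
    for flag, values in groups:
        if values:
            joined = ", ".join(ps_quote(value) for value in values)
            commands.append(f"Add-MpPreference {flag} {joined}")
    return commands


if __name__ == "__main__":
    for command in defender_commands(EXTENSIONS, PATHS, PROCESSES):
        print(command)
```

Keeping the lists in a script also gives you something to diff when a new host joins the cluster.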

    Recovery Guide: Fixing Corrupt Image Profiles on ESXi


    We’ve all been there: a patch remediation task in vSphere Update Manager (VUM) or vSphere Lifecycle Manager (vLCM) gets interrupted (shoutout to that one colleague!), and suddenly your ESXi host is in a “zombie” state.

    If you see the dreaded “Unknown – no profile defined” error, your host has lost its identity. It no longer knows which VIBs (VMware Installation Bundles) should be installed. This is usually caused by a corrupt imgdb.tgz file.

    The Symptom: Missing Image Profile

    When an image profile is empty or corrupt, you cannot install patches, remove drivers, or perform upgrades. ESXi relies on the image database to maintain consistency.

    How to Diagnose a Corrupt imgdb.tgz

    Before you resort to a full host rebuild, verify the file size of the database. A healthy imgdb.tgz is typically around 26 KB. If yours is only a few bytes, it’s corrupted.

    1. SSH into the host.

    2. Locate the files:

      cd /vmfs/volumes
      find * | grep imgdb.tgz
       Note: You will usually see two results (one for each bootbank).

    3. Check the size:

      ls -l <path_to_result>/imgdb.tgz

      If the size is tiny (e.g., 0-100 bytes), the database is toast.
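
The find-and-check routine above can also be sketched as a Python helper, for example if you have copied the bootbank contents somewhere for inspection. The 1024-byte threshold is an arbitrary cut-off of mine, chosen to sit between the “few bytes” of a broken file and the roughly 26 KB of a healthy one; tune it to your environment.

```python
import os


def find_imgdb_files(root):
    """Yield (path, size_in_bytes) for every imgdb.tgz found under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if "imgdb.tgz" in filenames:
            path = os.path.join(dirpath, "imgdb.tgz")
            yield path, os.path.getsize(path)


def suspicious(files, min_bytes=1024):
    """Return the paths whose size is below the (assumed) healthy minimum."""
    return [path for path, size in files if size < min_bytes]


if __name__ == "__main__":
    results = list(find_imgdb_files("/vmfs/volumes"))
    for path, size in results:
        print(f"{size:>8} bytes  {path}")
    for path in suspicious(results):
        print("Looks corrupt:", path)
```

A size check only tells you the file is plausibly intact; it says nothing about whether the contents match the host's actual patch level.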


    The Fix: Borrowing a “Known Good” Profile

    Instead of a time-consuming reinstall, you can manually restore the database from a healthy host running the exact same version and patch level.

    Step 1: Export from a Healthy Host

    On a working ESXi host, copy the healthy database to a shared datastore:

    cp /bootbank/imgdb.tgz /vmfs/volumes//

    Step 2: Restore on the Corrupt Host

    On the host with the issue, move the good file to /tmp and extract it to access the internal VIB and Profile metadata:

    cp /vmfs/volumes//imgdb.tgz /tmp
    cd /tmp
    tar -xzf imgdb.tgz
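
Before copying anything into the system directories, it is worth confirming that the borrowed archive really contains profile and VIB metadata. Here is a minimal Python sketch of that sanity check; the two directory prefixes are taken from the copy commands in Step 3, and the function names are my own.

```python
import tarfile

# Directories the copy commands in Step 3 expect to find after extraction.
REQUIRED_PREFIXES = ("var/db/esximg/profiles/", "var/db/esximg/vibs/")


def normalize(name):
    """Strip leading './' and '/' so member names compare consistently."""
    while name.startswith("./"):
        name = name[2:]
    return name.lstrip("/")


def missing_sections(member_names, required=REQUIRED_PREFIXES):
    """Return the required directories that contain no entries at all."""
    names = [normalize(name) for name in member_names]
    return [
        prefix for prefix in required
        if not any(n.startswith(prefix) and n != prefix for n in names)
    ]


def check_archive(path):
    """Open a gzip-compressed tar archive and report missing sections."""
    with tarfile.open(path, "r:gz") as archive:
        return missing_sections(archive.getnames())


if __name__ == "__main__":
    problems = check_archive("/tmp/imgdb.tgz")
    print("Missing:", problems if problems else "nothing, archive looks complete")
```

An empty list means both directories have content; anything else means the “known good” file is not as good as you hoped.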

    Step 3: Rebuild the Database Directories

    Now, manually place the healthy metadata into the system directories:

    1. Copy Profiles: cp /tmp/var/db/esximg/profiles/* /var/db/esximg/profiles/

    2. Copy VIBs: cp /tmp/var/db/esximg/vibs/* /var/db/esximg/vibs/

    3. Replace Bootbank File:

      rm /bootbank/imgdb.tgz
      cp /tmp/imgdb.tgz /bootbank/

    Step 4: Finalize and Persist

    To ensure these changes survive a reboot, run the backup script:

    /sbin/auto-backup.sh

    Summary Table: Resolution Options

    Option | Effort | Risk | When to use
    Rebuild Host | High | Low | If you don’t have a matching “known good” host.
    Manual File Copy | Low | Medium | When you need a fast fix and have a twin host available.