This article describes how to deal with a container stuck in a transitional status such as starting, stopping, migrating, or backing up.
A container can become stuck in any of the transitional statuses described below when a parent process is terminated or the timeout has been misconfigured.
Find the status the container has become stuck in below and follow the steps to unlock it.
Starting - The container is mounted and the container's processes are being launched.
If there are more than 16 processes inside the container, it is most likely that SCM did not report a successful startup completion because some services inside the container were not successfully started or did not return any startup result.
In these cases, the container can be used in starting status just as you would use a container in the running state.
If a container is in the starting state and there are no processes in it yet, it is most likely that the disk of the container is being resized or its format has been changed from "compact" to "plain." There is no separate process for this operation, and it is suggested that you wait until the operation is completed.
If a container is in the starting state and there are between five and ten processes running, it is most likely that SCM is waiting for a service to finish starting. In this case, while the container is stopped, set the startup mode of the services that do not start properly to "disabled" or "manual," then start the container and troubleshoot these services' startup issues separately. There are two ways to unlock a container from this state:
- Disable the container startup on Hardware Node boot and reboot the node.
- Kill SCM processes for the container:
ATTENTION: The following steps require a clear understanding of what you are doing. If you are not sure how to carry out these steps, stick to the first option above. If you are certain of what you are doing:
- Open Task Manager.
- Sort processes by the CTID column.
- Locate the processes "sc.exe," "net.exe," and "net1.exe."
NOTE: Make sure they are related to the container in question by checking the CTID column.
- Terminate these processes.
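The process-count rules of thumb given above for the starting state can be summarized in a small helper. This is only a sketch: the function name `triage_starting` is hypothetical, the range "between five and ten" is assumed to be inclusive, and counts the article does not mention are reported as not covered rather than guessed.

```python
def triage_starting(process_count: int) -> str:
    """Map a container's process count to the article's rule of thumb."""
    if process_count < 0:
        raise ValueError("process count cannot be negative")
    if process_count == 0:
        # No processes yet: a disk resize or format change is likely running.
        return "wait: disk resize or format change is probably in progress"
    if 5 <= process_count <= 10:
        # Assumption: the article's "between five and ten" is inclusive.
        return "SCM is probably waiting for a service to finish starting"
    if process_count > 16:
        return "usable: treat the container as if it were running"
    # The article gives no guidance for the remaining counts.
    return "not covered by the article: investigate manually"


if __name__ == "__main__":
    for count in (0, 7, 20, 13):
        print(count, "->", triage_starting(count))
```

The helper only classifies a count you have already obtained (for example, from Task Manager sorted by CTID); it does not query the node itself.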
A container may also get stuck in the starting state if "vz-poststart.cmd" does not successfully complete or hangs on an operation.
Check whether you see any processes related to the locked container in the output of the command below:
wmic process WHERE "commandline like '%vzctl%' and name <> 'wmic.exe'" get processid,caption,commandline
If you see a command like "vzctl exec2 --skiplock 123456 msiexec /unreg" for the stuck container, you can kill that process to make the container operable.
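When the node runs many containers, the wmic output can be long. The following sketch filters saved output for lines that mention both vzctl and a given CTID. The function name `vzctl_lines_for_container` and the sample text (including its process IDs) are made up for illustration, and the function assumes each process occupies a single output line.

```python
import re
from typing import List


def vzctl_lines_for_container(wmic_output: str, ctid: str) -> List[str]:
    """Return output lines that mention vzctl and the CTID as a whole number."""
    # Lookarounds prevent CTID 123456 from matching inside 1234567.
    ctid_pattern = re.compile(r"(?<!\d)" + re.escape(ctid) + r"(?!\d)")
    matches = []
    for line in wmic_output.splitlines():
        if "vzctl" in line.lower() and ctid_pattern.search(line):
            matches.append(line.strip())
    return matches


if __name__ == "__main__":
    # Hypothetical sample resembling the wmic output; the PIDs are invented.
    sample = (
        "Caption    CommandLine                                   ProcessId\n"
        "vzctl.exe  vzctl exec2 --skiplock 123456 msiexec /unreg  4321\n"
        "vzctl.exe  vzctl start 1234567                           4400\n"
    )
    for hit in vzctl_lines_for_container(sample, "123456"):
        print(hit)
```

Note that a line is also reported if its process ID happens to equal the CTID, so review each match before killing anything.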
Stopping - Container shutdown is initiated, and the processes inside the container are being terminated.
If there are more than three processes, shutdown can be accelerated by killing the container's userspace processes.
A container with a VPN running inside can hang in the "stopping" status with three processes still running. Make sure that KB2983488 is installed.
If there are three processes or fewer (e.g., csrss, smss, lsass), there is no way to get rid of the stopping status except for a node reboot. Prior to the node reboot, create a LiveKD dump and pass it to Virtuozzo Support for further analysis.
NOTE: Never use vzctl stop --skiplock in cases where the container is stuck in starting or stopping states. This may lead to hung kernel threads, which can only be fixed with a node reboot.
Migrating - The container has been moved from another node.
Simply restart VA Agent (PVA Agent) or VZAgent:
On Virtuozzo containers for Windows 4.6:
net stop pvaagent
net start pvaagent
On Virtuozzo containers for Windows 4.5 and earlier:
net stop vzaop
net start vzaop
To release a container from a c2v migration process:
net stop c2vservice
net start c2vservice
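Each restart above is a stop command followed by a start command for the same service. If you script this, a minimal sketch could look like the following. The helper names `restart_commands` and `restart_service` are hypothetical; the service names are the ones given in this article, and `restart_service` must be run on the node from an elevated prompt.

```python
import subprocess
from typing import List


def restart_commands(service: str) -> List[List[str]]:
    """Build the 'net stop' and 'net start' commands for one service."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_service(service: str) -> None:
    """Run stop, then start; stops at the first command that fails."""
    for command in restart_commands(service):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Only print the commands here; call restart_service() deliberately.
    for name in ("pvaagent", "vzaop", "c2vservice"):
        for command in restart_commands(name):
            print(" ".join(command))
```

Because `check=True` aborts on failure, the start command is not attempted if the stop command fails; in that case, check the service state manually.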
Backing up, restoring - A container backup or restore task has been launched and then terminated.
Get the Process ID of "vzlpl.exe," which is responsible for this task, using the VA Agent (PVA Agent) log file. Then, terminate the process:
taskkill /F /T /PID PID
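Before terminating anything, it can help to cross-check the PID from the log against the "vzlpl.exe" processes that are actually running. The sketch below parses the output of `tasklist /FI "IMAGENAME eq vzlpl.exe" /FO CSV /NH`, which you would run separately and paste or pipe in. The function name `parse_tasklist_csv` and the sample row are illustrative only.

```python
import csv
import io
from typing import List, Tuple


def parse_tasklist_csv(text: str) -> List[Tuple[str, int]]:
    """Extract (image name, PID) pairs from 'tasklist /FO CSV /NH' output."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        # Skip blank lines and informational messages such as "INFO: ...".
        if len(fields) >= 2 and fields[1].isdigit():
            rows.append((fields[0], int(fields[1])))
    return rows


if __name__ == "__main__":
    # Hypothetical sample row; the PID and memory figure are invented.
    sample = '"vzlpl.exe","4812","Services","0","10,240 K"\n'
    for name, pid in parse_tasklist_csv(sample):
        print(name, pid)
```

If several "vzlpl.exe" processes are listed, the log file remains the way to tell which one belongs to the stuck task.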
Cloning - Container cloning has been initiated (typically, using the "vzmlocal -C" command) and then terminated.
Kill the fsresizersrv.exe process:
taskkill /IM fsresizersrv.exe /T /F
Updating - A package or application template is being installed in the container.
There are two possible reasons a container may become stuck in this state:
A modal window is spawned and user input is required.
In this case, connect to the container via RDP using a console session and close all pop-up windows and dialog boxes.
If there are no modal windows, it is most likely that the VBS script has hung.
Get the Process ID of "cscript.exe" inside the container and kill the process:
taskkill /F /PID PID /T
NOTE: It is suggested that you use Process Explorer to find the exact process tree, starting with vzpkg, and kill the child, "cscript.exe."
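The process-tree check described in the note can also be sketched in code. Given a list of (PID, parent PID, image name) entries that you collected yourself, the function below returns the "cscript.exe" processes that descend from a vzpkg process. The name `cscript_under_vzpkg`, the image name "vzpkg.exe," and the sample PIDs are assumptions made for illustration.

```python
from typing import List, Tuple

Process = Tuple[int, int, str]  # (pid, parent pid, image name)


def cscript_under_vzpkg(processes: List[Process]) -> List[int]:
    """Return PIDs of cscript.exe processes that have a vzpkg ancestor."""
    by_pid = {pid: (ppid, name.lower()) for pid, ppid, name in processes}
    result = []
    for pid, (ppid, name) in by_pid.items():
        if name != "cscript.exe":
            continue
        seen = {pid}  # guards against loops caused by reused PIDs
        current = ppid
        while current in by_pid and current not in seen:
            seen.add(current)
            parent_ppid, parent_name = by_pid[current]
            # Assumption: the image name of vzpkg starts with "vzpkg".
            if parent_name.startswith("vzpkg"):
                result.append(pid)
                break
            current = parent_ppid
    return sorted(result)


if __name__ == "__main__":
    # Hypothetical snapshot: 300 sits under vzpkg, 400 does not.
    snapshot = [
        (100, 4, "vzpkg.exe"),
        (200, 100, "cmd.exe"),
        (300, 200, "cscript.exe"),
        (400, 4, "cscript.exe"),
    ]
    print(cscript_under_vzpkg(snapshot))
```

Walking up from each "cscript.exe" rather than down from vzpkg keeps the sketch short and ignores unrelated script hosts running in the same container.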