Article ID: 123234, created on Oct 24, 2014, last review on Jul 3, 2015

  • Applies to:
  • Virtuozzo

Symptoms

The output of pstorage top is flooded with error messages that reappear every 30 seconds:

MON ERR MDS# died unexpectedly (122): Can't load MDS id
MON ERR MDS# died unexpectedly (122): Can't load MDS id

or

MON ERR CS# died unexpectedly (122): csd: could not lock repository
MON ERR CS# died unexpectedly (122): csd: could not lock repository

or

MON ERR MDS# died unexpectedly (1)
MON ERR MDS# died unexpectedly (1)

However, all MDS and CS services in pstorage top are marked as online.

Cause

A pstorage monitor process that was watching a metadata server (MDS) or chunk server (CS) has been orphaned: its server was removed, but for an unknown reason the monitor was not shut down. The monitor considers the service failed and keeps trying to restart it. Because the service no longer exists, the monitor cannot open the MDS (or CS) directory, hence the error.

Resolution

To get rid of these errors, find the orphaned monitor and kill it manually.

On each host participating in the pstorage cluster, look for a monitor process that does not have a CS or MDS service as a child, and kill that monitor process. Make sure you are killing the correct monitor process. Examples of each case are shown below.

Example of a correct monitor running for a CS service:

# ps fax | grep /usr/libexec/pstorage/monitor -A1
...
   9625 ?        S      0:00 /bin/sh /usr/libexec/pstorage/monitor
   9628 ?        Sl    15:55  \_ /usr/bin/csd -r /pstorage/pcs-bsh-cs/data -l /pstorage/pcs-bsh-cs/data/logs/cs.log.gz -u pstorage
...

Example of a correct monitor running for an MDS service:

# ps fax | grep /usr/libexec/pstorage/monitor -A1
...
 128717 pts/0    S      0:00 /bin/sh /usr/libexec/pstorage/monitor
 128720 pts/0    Sl     0:02  \_ /usr/bin/mdsd -r /pstorage/ssd1/mds/data -l /pstorage/ssd1/mds/data/logs/mds.log.gz -u pstorage
...

Example of an orphaned monitor, which is most likely causing the flood:

# ps fax | grep /usr/libexec/pstorage/monitor -A1
...
   3979 ?        S      1:09 /bin/sh /usr/libexec/pstorage/monitor
 132860 ?        S      0:00  \_ sleep 5
...
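
The manual check above (a monitor with no csd or mdsd child) can also be sketched as a small shell function. This is an illustrative sketch, not a tool shipped with pstorage: the function name find_orphaned_monitors is hypothetical, and it assumes the monitor command line looks exactly as in the examples above. It only prints candidate PIDs. A monitor whose service has just crashed and is about to be restarted looks the same for a few seconds, so re-check each candidate before acting on it.

```shell
#!/bin/sh
# find_orphaned_monitors (hypothetical helper): reads "PID PPID ARGS" lines on
# stdin and prints the PID of every pstorage monitor that has no csd or mdsd
# child process.
find_orphaned_monitors() {
    awk '
        # Remember every monitor process by its PID.
        $3 == "/bin/sh" && $4 == "/usr/libexec/pstorage/monitor" { monitor[$1] = 1 }
        # Remember every parent PID that owns a csd or mdsd child.
        $3 == "/usr/bin/csd" || $3 == "/usr/bin/mdsd" { has_service[$2] = 1 }
        END {
            # A monitor without a service child is a candidate orphan.
            for (pid in monitor)
                if (!(pid in has_service)) print pid
        }
    '
}

# Usage on a storage node (prints PIDs only, does not kill anything):
#   ps -eo pid=,ppid=,args= | find_orphaned_monitors
```

Matching on the parent PID rather than on the lines printed next to each other avoids depending on the tree layout of ps fax.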

Once the orphaned monitor is found, kill it:

# kill -9 3979

NOTE: Replace 3979 with the PID of the orphaned monitor you found.
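
Since killing the wrong process is the main risk here, a guard along the following lines may help. It is a minimal sketch: the function name is_monitor_pid is hypothetical, and it assumes the monitor's command line starts with "/bin/sh /usr/libexec/pstorage/monitor" as in the ps output above.

```shell
#!/bin/sh
# is_monitor_pid PID (hypothetical helper): succeeds only if the process with
# the given PID exists and its command line is the pstorage monitor script.
is_monitor_pid() {
    # ps prints nothing and fails if the PID does not exist.
    args=$(ps -o args= -p "$1" 2>/dev/null) || return 1
    case "$args" in
        "/bin/sh /usr/libexec/pstorage/monitor"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (3979 is the example PID from this article):
#   is_monitor_pid 3979 && kill -9 3979
```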

Search Words

died unexpectedly (122): csd: could not lock repository

pstorage top flood

MON ERR CS# died unexpectedly (1)

could not lock repository

PCS: problem with MDS

MON ERR MDS# died unexpectedly (122): Can't load MDS id
