
Re: [Xen-devel] domU oom -> xvda1 read-only without any notice?



On Wed, Mar 31, 2010 at 10:52:45AM +0200, Josip Rodin wrote:
> I'm going to restart it now because we need the machine in a
> not-so-useless state.

After the shutdown and create:

[...]
[    0.488419] blkfront: xvda1: barriers enabled
done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... done.
[    0.670172] EXT3-fs: INFO: recovery required on readonly filesystem.
[    0.670186] EXT3-fs: write access will be enabled during recovery.
[    9.996563] kjournald starting.  Commit interval 5 seconds
[    9.996577] EXT3-fs warning (device xvda1): ext3_clear_journal_err: Filesystem error recorded from previous mount: IO failure
[    9.996585] EXT3-fs warning (device xvda1): ext3_clear_journal_err: Marking fs in need of filesystem check.
[    9.996991] EXT3-fs: recovery complete.
[    9.997318] EXT3-fs: mounted filesystem with ordered data mode.
[...]
Will now check root file system:fsck 1.41.3 (12-Oct-2008)
[/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a -C0 /dev/xvda1
lastovo-root contains a file system with errors, check forced.
Deleted inode 1704618 has zero dtime.  FIXED.
lastovo-root: ***** REBOOT LINUX *****
lastovo-root: 1046964/3276800 files (1.1% non-contiguous), 10147312/13107200 blocks
fsck died with exit status 3
 failed!
The file system check corrected errors on the root partition but requested that the system be restarted. failed!
The system will be restarted in 5 seconds. (warning).
Will now restart.

Then it booted fine.
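
For anyone reading along: the "exit status 3" above is not fsck crashing.
Per fsck(8) the exit status is a bit mask, and 3 = 1 + 2, meaning errors
were corrected and a reboot was requested, which matches the messages.
A minimal sketch of decoding such a status follows; decode_fsck_status is
just an illustrative name of mine, not part of any tool.

```python
# Bit meanings as documented in the fsck(8) manual page.
FSCK_EXIT_BITS = {
    1: "filesystem errors corrected",
    2: "system should be rebooted",
    4: "filesystem errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def decode_fsck_status(status):
    """Return the conditions encoded in an fsck exit status, lowest bit first."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in sorted(FSCK_EXIT_BITS.items()) if status & bit]


if __name__ == "__main__":
    # The status reported in the boot log above.
    for condition in decode_fsck_status(3):
        print(condition)
```

So the init script printing "failed!" here only reflects a non-zero status;
the check itself did repair the file system.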

I examined the machine's graphs from the time of the incident. Everything
looked fine until around 2:30, when a large network operation started:
Legato nsrexecd was backing the machine up. Its remote logs say it
transferred 4 GB of data in around 11 minutes and finished successfully.
At that point the graphs recorded a huge spike in both Apache and
PostgreSQL connections, and soon after that the whole machine went AWOL.
If necessary I can attach the entire graph snapshot, which also includes
the approximate state of /proc/interrupts and /proc/meminfo.

-- 
     2. That which causes joy or happiness.



 

