VM crashes (caused by ipfw?)

Hi Pat,

I am working with WPT 2.5.
I have set up several agents on virtual machines, both on the internet and on a LAN. The operating systems are 32-bit Windows 7 and XP. The LAN VMs are virtualized with Citrix XenDesktop 4.
Up to now everything worked fine. Now I have trouble with some new agent VMs. They are located on a LAN and run a newly installed Windows XP. The WPT agent gets its jobs from the server (the browser UI shows “test is running” and the job files in /work/jobs/myAgent disappear), but the whole VM crashes with an error before the browser opens (regardless of which browser is used: IE, Firefox or Chrome). Other WPT agents on the same LAN work correctly with the same WPT server.

Windows logs some information about the crash to a minidump file:

C:\Users\nku>kd -y srv*c:\symbols*http://msdl.microsoft.com/download/symbols -z d:\temp\Mini061212-01.dmp -v

Microsoft (R) Windows Debugger Version 6.2.8400.0 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [d:\temp\Mini061212-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

Symbol search path is: srv*c:\symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:

Loading symbols for 804d7000     ntkrnlmp.exe ->   ntkrnlmp.exe
ModLoad: 804d7000 80701000   ntkrnlmp.exe
Windows XP Kernel Version 2600 (Service Pack 3) MP (2 procs) Free x86 compatible
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 2600.xpsp_sp3_gdr.120411-1615
Machine Name:
Kernel base = 0x804d7000 PsLoadedModuleList = 0x805634c0
Debug session time: Tue Jun 12 09:51:23.406 2012 (UTC + 2:00)
System Uptime: 0 days 17:04:24.621
Loading symbols for 804d7000     ntkrnlmp.exe ->   ntkrnlmp.exe
ModLoad: 804d7000 80701000   ntkrnlmp.exe
Loading Kernel Symbols
.ModLoad: 80701000 80721d00   halmacpi.dll
.ModLoad: f7987000 f7988b80   kdcom.dll
.ModLoad: f7897000 f789a000   BOOTVID.dll
.ModLoad: f75a7000 f75d5180   ACPI.sys
.ModLoad: f7989000 f798a100   WMILIB.SYS
.ModLoad: f7596000 f75a6a80   pci.sys
.ModLoad: f75f7000 f7600300   isapnp.sys
.ModLoad: f798b000 f798c580   intelide.sys
.ModLoad: f7707000 f770d180   PCIIDEX.SYS
.ModLoad: f7607000 f7611580   MountMgr.sys
.ModLoad: f74d7000 f74f5d80   ftdisk.sys
.ModLoad: f798d000 f798e700   dmload.sys
.ModLoad: f74b1000 f74d6a00   dmio.sys
.ModLoad: f770f000 f7713d00   PartMgr.sys
.ModLoad: f749a000 f74b1000   xevtchn.sys
.ModLoad: f7442000 f749a000   XENUTIL.SYS
.ModLoad: f7617000 f7624200   VolSnap.sys
.ModLoad: f742a000 f7441900   atapi.sys
.ModLoad: f7403000 f742a000   xenvbd.sys
.ModLoad: f787f000 f7896880   SCSIPORT.SYS
.ModLoad: f7627000 f7635000   scsifilt.sys
.ModLoad: f7637000 f763fe00   disk.sys
.ModLoad: f7647000 f7653180   CLASSPNP.SYS
.ModLoad: f785f000 f787eb00   fltMgr.sys
.ModLoad: f784d000 f785ef00   sr.sys
.ModLoad: f7b70000 f7bdef40   mfehidk.sys
.ModLoad: f7836000 f784cb00   KSecDD.sys
.ModLoad: f7ae3000 f7b6f600   Ntfs.sys
.ModLoad: f795a000 f7986980   NDIS.sys
.ModLoad: f7a35000 f7a4ec00   Mup.sys
.ModLoad: f7526000 f752fc00   processr.sys
.ModLoad: f7516000 f7522f00   i8042prt.sys
.ModLoad: f7506000 f7511000   picamouf.sys
.ModLoad: f7787000 f778cc00   mouclass.sys
.ModLoad: f74f6000 f7502000   picakbf.sys
.ModLoad: f778f000 f7795280   kbdclass.sys
Image at f7797000 had size 0
.ModLoad: f7797000 f7798000   fdc.sys
.ModLoad: f7657000 f7667000   serial.sys
.ModLoad: ba7f8000 ba7fbd80   serenum.sys
.ModLoad: b8de0000 b8df3a00   parport.sys
.ModLoad: b911e000 b912d600   cdrom.sys
.ModLoad: f779f000 f77a4080   usbuhci.sys
.ModLoad: b8dbc000 b8ddf200   USBPORT.SYS
.ModLoad: b910e000 b9119280   cirrus.sys
.ModLoad: b8da8000 b8dbbf00   VIDEOPRT.SYS
.ModLoad: b90fe000 b910c000   picakbm.SYS
.ModLoad: f77a7000 f77ae000   picacdd.sys
.ModLoad: b8d93000 b8da8000   ctxad.sys
.ModLoad: b8d6f000 b8d92a80   portcls.sys
.ModLoad: b90ee000 b90fcb00   drmk.sys
.ModLoad: b8d4c000 b8d6e700   ks.sys
.ModLoad: f7a63000 f7a63c00   audstub.sys
.ModLoad: b90de000 b90ea880   rasl2tp.sys
.ModLoad: ba7f4000 ba7f6900   ndistapi.sys
.ModLoad: b8d35000 b8d4b580   ndiswan.sys
.ModLoad: b90ce000 b90d8200   raspppoe.sys
.ModLoad: b90be000 b90c9d00   raspptp.sys
.ModLoad: f77af000 f77b3a80   TDI.SYS
.ModLoad: b8c84000 b8c94e00   psched.sys
.ModLoad: b90ae000 b90b6900   msgpc.sys
.ModLoad: b8c59000 b8c831e0   mfeavfk.sys
.ModLoad: f77b7000 f77bb580   ptilink.sys
.ModLoad: f77bf000 f77c3080   raspti.sys
.ModLoad: b8c01000 b8c30e80   rdpdr.sys
.ModLoad: b909e000 b90a7f00   termdd.sys
.ModLoad: f79ab000 f79ac100   swenum.sys
.ModLoad: b8b7b000 b8bd8f00   update.sys
.ModLoad: ba7b8000 ba7bbc80   mssmbios.sys
.ModLoad: b908e000 b909d000   xennet.sys
.ModLoad: b8b5d000 b8b7a100   ipfw.sys
.ModLoad: b8b45000 b8b5d000   ctxusbb.sys
.ModLoad: f7667000 f7674000   WDFLDR.SYS
.ModLoad: b8ac9000 b8b45000   wdf01000.sys
.ModLoad: f7677000 f7681000   NDProxy.SYS
.ModLoad: f7697000 f76a5880   usbhub.sys
.ModLoad: f79b1000 f79b2280   USBD.SYS
.ModLoad: f79b3000 f79b4f00   Fs_Rec.SYS
.ModLoad: f7a74000 f7a74b80   Null.SYS
.ModLoad: f79b5000 f79b6080   mnmdd.SYS
.ModLoad: f79b7000 f79b8080   RDPCDD.sys
.ModLoad: f77df000 f77e3a80   Msfs.SYS
.ModLoad: f77e7000 f77ee880   Npfs.SYS
.ModLoad: f7943000 f7945280   rasacd.sys
.ModLoad: b89fc000 b8a0e600   ipsec.sys
.ModLoad: b89a3000 b89fb480   tcpip.sys
.ModLoad: b898e000 b89a2140   mfetdi2k.sys
.ModLoad: f76a7000 f76af700   wanarp.sys
.ModLoad: b8966000 b898dc00   netbt.sys
.ModLoad: b8944000 b8965d00   afd.sys
.ModLoad: b88e9000 b8944000   picadm.sys
.ModLoad: f77ef000 f77f7000   PICAVC.SYS
.ModLoad: f76b7000 f76bf780   netbios.sys
.ModLoad: b8756000 b8780e80   rdbss.sys
.ModLoad: b86e6000 b8755680   mrxsmb.sys
.ModLoad: f76f7000 f7701e80   Fips.SYS
.ModLoad: b86d2000 b86e6000   ctxusbm.sys
.ModLoad: f7576000 f7581000   cdfdrv.sys
.ModLoad: f793f000 f7941880   hidusb.sys
.ModLoad: f7556000 f755f000   HIDCLASS.SYS
.ModLoad: f77f7000 f77fd180   HIDPARSE.SYS
.ModLoad: b8895000 b8898000   mouhid.sys
.ModLoad: f7546000 f7555900   Cdfs.SYS
.ModLoad: b8885000 b8888780   dump_scsiport.sys
.ModLoad: b8683000 b86aa000   dump_xenvbd.sys
.ModLoad: b866c000 b8683000   dump_xevtchn.sys
.ModLoad: b8614000 b866c000   dump_XENUTIL.SYS
.ModLoad: bf800000 bf9c6b00   win32k.sys
.ModLoad: b886d000 b886f900   Dxapi.sys
.ModLoad: f7817000 f781b500   watchdog.sys
.ModLoad: bf000000 bf011600   dxg.sys
.ModLoad: b9087000 b9087d00   dxgthk.sys
.ModLoad: bf0ba000 bf0c0000   picatwcomms.sys
.ModLoad: bff60000 bff76480   cirrus.dll
.ModLoad: bf012000 bf058e80   ATMFD.DLL
.ModLoad: b8110000 b8113900   ndisuio.sys
.ModLoad: f775f000 f7767000   ctxsmcdrv.sys
.ModLoad: b7d0c000 b7d63600   srv.sys
.ModLoad: f7777000 f777df00   npf.sys
.ModLoad: b7bde000 b7bf4000   picadd.sys
.ModLoad: b7baf000 b7bde000   picapar.sys
.ModLoad: b7b51000 b7b87000   picaser.sys
.ModLoad: b88c9000 b88ce500   TDTCP.SYS
.ModLoad: b74ee000 b7510180   RDPWD.SYS
.ModLoad: b7485000 b74c5e00   HTTP.sys

Loading User Symbols
Loading unloaded module list
........
Loaded dbghelp extension DLL
Loaded ext extension DLL
Loaded exts extension DLL
Loaded kext extension DLL
Loaded kdexts extension DLL
Loading symbols for f795a000         NDIS.sys ->   NDIS.sys
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 100000D1, {0, 2, 0, 0}

Unable to load image ipfw.sys, Win32 error 0n2
Loading symbols for b8b5d000         ipfw.sys ->   ipfw.sys
*** WARNING: Unable to verify timestamp for ipfw.sys
*** ERROR: Module load completed but symbols could not be loaded for ipfw.sys
Loading symbols for b8c84000       psched.sys ->   psched.sys
Loading symbols for b89a3000        tcpip.sys ->   tcpip.sys
Probably caused by : ipfw.sys ( ipfw+1645c )

Followup: MachineOwner
---------

kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 00000000, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: 00000000, address which referenced memory

Debugging Details:
------------------


READ_ADDRESS: GetUlongFromAddress: unable to read from 80567ce8
 00000000

CURRENT_IRQL:  2

FAULTING_IP:
+0
00000000 ??              ???

PROCESS_NAME:  Idle

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  DRIVER_FAULT

BUGCHECK_STR:  0xD1

LAST_CONTROL_TRANSFER:  from f79710ef to 00000000

FAILED_INSTRUCTION_ADDRESS:
+0
00000000 ??              ???

STACK_TEXT:
WARNING: Frame IP not in any known module. Following frames may be wrong.
80556f70 f79710ef 89631d78 805570fc 898a2000 0x0
80556f9c b8b7345c 8968a9b0 00000000 00000014 NDIS!ndisMTransferData+0x109
80556fc8 f79710ef 89631d78 805570fc 8968a328 ipfw+0x1645c
80556ff4 b8c8c0a6 89903f40 00000000 00000014 NDIS!ndisMTransferData+0x109
80557014 f79710ef 89631d78 805570fc 89686648 psched!MpTransferData+0x3c
80557040 b89d1239 8974b508 00000000 00000014 NDIS!ndisMTransferData+0x109
80557060 b89b16c6 896d5e70 00000000 00000000 tcpip!ARPXferData+0x26
8055710c b89a3928 89723ad0 890f87fa 0000001a tcpip!IPRcvPacket+0x32b
8055714c b89a86ef 00000000 00000000 890f87d8 tcpip!ARPRcvIndicationNew+0x149
8055717c f797dad6 896d5e70 00000000 890f87d8 tcpip!ARPRcv+0x42
805571b0 b8c8c5d7 896868a8 00000000 890f87d8 NDIS!EthFilterDprIndicateReceive+0xe0
805571f0 f797dad6 89686648 00000000 890f87d8 psched!ClReceiveIndication+0x21b
80557224 b8b72351 89686b18 00000000 890f87d8 NDIS!EthFilterDprIndicateReceive+0xe0
8055726c b8b6cad8 00000002 890f87a8 ffffffff ipfw+0x15351
80557298 b8b6c544 890f87a8 fffffffd ffffffff ipfw+0xfad8
805572c4 b8b6764f 00000000 00000001 805572e0 ipfw+0xf544
805572d4 b8b727bc 00000000 805573fc 804e2b6e ipfw+0xa64f
805572e0 804e2b6e b8b77658 00000000 298a0ca4 ipfw+0x157bc
805573fc 804e209d 80561f20 ffdff9c0 ffdff000 nt!KiTimerListExpire+0x14b
80557428 804dcd22 80562320 00000000 003c0627 nt!KiTimerExpiration+0xb1
80557440 80561cc0 ffdffc50 00000000 80561cc0 nt!KiRetireDpcList+0x61
80557450 804dcc07 00000000 0000000e 00000000 nt!KiIdleThread0
80557454 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x28


STACK_COMMAND:  kb

FOLLOWUP_IP:
ipfw+1645c
b8b7345c ??              ???

SYMBOL_STACK_INDEX:  2

SYMBOL_NAME:  ipfw+1645c

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: ipfw

IMAGE_NAME:  ipfw.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  4ba75e39

FAILURE_BUCKET_ID:  0xD1_CODE_AV_NULL_IP_ipfw+1645c

BUCKET_ID:  0xD1_CODE_AV_NULL_IP_ipfw+1645c

Followup: MachineOwner
---------
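
The `ipfw+1645c` attribution can be cross-checked by hand against the module list above: the address b8b7345c should fall inside the range that `ModLoad` reports for ipfw.sys, and the offset from the module base should be 0x1645c. A minimal sketch of that arithmetic, assuming a few `ModLoad` lines are pasted in as text (the helper names `parse_modules` and `resolve` are made up for this illustration and are not part of kd or WPT):

```python
import re

# A few ModLoad lines copied from the debugger output above.
MODLOAD_LINES = """
.ModLoad: f795a000 f7986980   NDIS.sys
.ModLoad: b8c84000 b8c94e00   psched.sys
.ModLoad: b908e000 b909d000   xennet.sys
.ModLoad: b8b5d000 b8b7a100   ipfw.sys
.ModLoad: b89a3000 b89fb480   tcpip.sys
"""

MODLOAD_RE = re.compile(
    r"ModLoad:\s+([0-9a-f]{8})\s+([0-9a-f]{8})\s+(\S+)", re.IGNORECASE
)


def parse_modules(text):
    """Return a list of (start, end, name) tuples from kd 'ModLoad' lines."""
    return [
        (int(start, 16), int(end, 16), name)
        for start, end, name in MODLOAD_RE.findall(text)
    ]


def resolve(address, modules):
    """Map an address to (module name, offset), or None if no module contains it."""
    for start, end, name in modules:
        if start <= address < end:
            return name, address - start
    return None


modules = parse_modules(MODLOAD_LINES)
# FOLLOWUP_IP, the NDIS return address, and the faulting IP (0).
for addr in (0xB8B7345C, 0xF79710EF, 0x00000000):
    hit = resolve(addr, modules)
    if hit:
        print(f"{addr:08x} -> {hit[0]}+0x{hit[1]:x}")
    else:
        print(f"{addr:08x} -> not in any listed module")
```

The faulting instruction pointer itself is 0, which belongs to no module; that matches the "Frame IP not in any known module" warning, so ipfw.sys is named only as the nearest third-party frame on the stack.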

So ipfw seems to cause this error.
Have you ever had similar errors with ipfw?
Any ideas about what on my VMs could be incompatible with ipfw?

Regards Nils
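
For readers following the trace, the STACK_TEXT above is easier to read as a chain of modules, oldest frame first. The sketch below is only an illustration under stated assumptions: the stack lines are pasted in verbatim, the last whitespace-separated field of each line is taken as the symbol, and the helper names `frame_module` and `call_chain` are hypothetical.

```python
from itertools import groupby

# STACK_TEXT lines copied from the !analyze -v output above (newest frame first).
STACK_TEXT = """\
80556f70 f79710ef 89631d78 805570fc 898a2000 0x0
80556f9c b8b7345c 8968a9b0 00000000 00000014 NDIS!ndisMTransferData+0x109
80556fc8 f79710ef 89631d78 805570fc 8968a328 ipfw+0x1645c
80556ff4 b8c8c0a6 89903f40 00000000 00000014 NDIS!ndisMTransferData+0x109
80557014 f79710ef 89631d78 805570fc 89686648 psched!MpTransferData+0x3c
80557040 b89d1239 8974b508 00000000 00000014 NDIS!ndisMTransferData+0x109
80557060 b89b16c6 896d5e70 00000000 00000000 tcpip!ARPXferData+0x26
8055710c b89a3928 89723ad0 890f87fa 0000001a tcpip!IPRcvPacket+0x32b
8055714c b89a86ef 00000000 00000000 890f87d8 tcpip!ARPRcvIndicationNew+0x149
8055717c f797dad6 896d5e70 00000000 890f87d8 tcpip!ARPRcv+0x42
805571b0 b8c8c5d7 896868a8 00000000 890f87d8 NDIS!EthFilterDprIndicateReceive+0xe0
805571f0 f797dad6 89686648 00000000 890f87d8 psched!ClReceiveIndication+0x21b
80557224 b8b72351 89686b18 00000000 890f87d8 NDIS!EthFilterDprIndicateReceive+0xe0
8055726c b8b6cad8 00000002 890f87a8 ffffffff ipfw+0x15351
80557298 b8b6c544 890f87a8 fffffffd ffffffff ipfw+0xfad8
805572c4 b8b6764f 00000000 00000001 805572e0 ipfw+0xf544
805572d4 b8b727bc 00000000 805573fc 804e2b6e ipfw+0xa64f
805572e0 804e2b6e b8b77658 00000000 298a0ca4 ipfw+0x157bc
805573fc 804e209d 80561f20 ffdff9c0 ffdff000 nt!KiTimerListExpire+0x14b
80557428 804dcd22 80562320 00000000 003c0627 nt!KiTimerExpiration+0xb1
80557440 80561cc0 ffdffc50 00000000 80561cc0 nt!KiRetireDpcList+0x61
80557450 804dcc07 00000000 0000000e 00000000 nt!KiIdleThread0
80557454 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x28
"""


def frame_module(line):
    """Extract the module name from the symbol column of one stack line."""
    symbol = line.split()[-1]
    if "!" in symbol:          # e.g. NDIS!ndisMTransferData+0x109
        return symbol.split("!")[0]
    if "+" in symbol:          # e.g. ipfw+0x1645c (no symbols loaded)
        return symbol.split("+")[0]
    return symbol              # e.g. 0x0 (no module at all)


def call_chain(stack_text):
    """Return module names oldest-first, with consecutive repeats collapsed."""
    modules = [frame_module(l) for l in stack_text.splitlines() if l.strip()]
    return [name for name, _ in groupby(reversed(modules))]


chain = call_chain(STACK_TEXT)
print(" -> ".join(chain))
```

Read this way, a timer in nt calls into ipfw, which indicates a received packet up through NDIS, psched and tcpip; the transfer-data request then travels back down through the same layers, and the last NDIS frame jumps to address 0.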

I largely just use VMware, so I haven’t had a chance to test in other VM environments, but dummynet has been tested on quite a wide range of hardware and drivers, so I’d tend to suspect the Xen guest drivers.

They also run fine on EC2, which is Xen-based, but it uses the RedHat paravirtual drivers if I remember correctly.

Do you have different options for the type of network adapter the VM emulates?

I have different driver versions for the Citrix Ethernet adapter on the VMs. Is there a newer version of ipfw that supports the newest XenDesktop drivers?
My IT admins don’t really want to stay on an older version of the XenDesktop drivers…

There is no newer version of ipfw. Given that I haven’t seen ipfw crash anything else, I’d be more inclined to think that the issue is with the Citrix drivers. Are the IT admins comfortable with running the default Microsoft drivers? There will be a little more overhead than with the paravirtual drivers, but the web testing isn’t network-intensive, so it shouldn’t be a big issue.