June 14 2018
The network glitch seems to have gone away if I use the "NAT" interface (vmnet8, 192.168.146.X) instead of the bridged network. UPDATE: The glitch occurs less often, but is still present.
Disable maintenance and security warnings:
HKCU\
SOFTWARE\
Microsoft\
Windows\
CurrentVersion\
Notifications\
Settings\
Windows.SystemToast.SecurityAndMaintenance
DWORD Enabled = 0
HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Notifications\Settings\Windows.SystemToast.SecurityAndMaintenance
ServerFault
October 11 2017
I want to see if my driver loads in Win10 without kernel debugging enabled.
Archive the current configuration:
bcdedit /export Win10Target.bcd
This is an opaque binary file.
Copy the current boot configuration to a new name:
bcdedit /copy {current} /d "No debug"
This will copy the current configuration into a new one. Make
note of the guid id.
Turn off debugging for the new configuration:
bcdedit /debug {guid} off
Reboot and choose "No debug".
September 6 2017
Windows10 Update is a VIRUS!
TL;DR: Windows Update was hogging the network. I disabled the following services and the network quieted down:
Update: The unholy triad resurrected itself after a couple days, so disabling the services is only a temporary fix. I will leave Win10Target running overnight with update enabled and hope it fixes itself.
Update: Windows10 is stuck trying to install the update to build 1703. It spends the whole day hogging the network to download a 4GB update, then the install fails, so it downloads the update again! This cycle continues forever, and my only recourse is to disable Windows Update every time I need to use the Win10Target VM. Windows10 Update is horribly, horribly broken to the point where it behaves exactly like a malware virus. This is why over-the-air updates are a BAD IDEA, because they will always be overused and eventually result in disaster.
Something is seriously horked with my VMware network. Whenever I start the Win10Target, and especially when WinDbg is running, my network slows to a crawl and I have frequent stalls in the communication between Win10Target and WinDbg. This causes a lot of problems that are not showstoppers, but does make remote debugging even more painful.
I am using WireShark to see if I can figure anything out. This will open a firehose of data, and I am definitely not a network protocol expert. My network expertise extends to IPv4 and IPv6, and I suspect my problem is somewhere down in ARP land.
During the WireShark installation I selected "No" to "launch WinPCap at boot?". Running WireShark is a rare occurence for me, and I firmly believe that more services running means more problems. The effect of this option is that I need to launch WireShark as admin to see the network interfaces.
The first thing I see is a flood of MBNS queries looking for a printer
on the wrong network:
8727 496.896977 192.168.146.129 192.168.146.255 NBNS 92 Name query NB MFC_OFFICE<00>
8729 496.880964 192.168.114.129 192.168.114.255 NBNS 92 Name query NB MFC_OFFICE<00>
These are happening in flurries of 8 about every 2 seconds. It
appears Windows desperately wants to connect to a pair of
printers that are usually powered off. This would be a classic example
of a programmer assuming unrealistic (stupid) priorities.
I tried disabling every network protocol except IPv4 on the vmnet2 and vmnet8 interfaces. The MBNS queries are still there. (UPDATE: This is an annoyance, not the real problem.)
This is interesting: In the VMnet editor the VMnet8/146 subnet is set as a NAT. NAT Settings > NetBIOS Settings shows NBNS timeout is 2 seconds with 3 retries. This is suspiciously similar to my packet trace. I might be able to effectively disable NBNS on the vmnet by setting a very high timeout value (600 seconds) and low retry count (1). In the DNS settings, I disabled "Auto detect DNS server" and pointed it to my local DNS server (192.168.0.14) with a timeout of 300 seconds and retry of 1.
It appears the real culprit is 72.21.81.200. This appears to be a Verizon router. After Win10Target boots up, it starts streaming to this address non-stop. Other addresses: 8.254.220.78 (Level 3).
Knowing the packet provides very little relief, since it just gives me a relay point inside a Verizon or Level3 datacenter and the contents of the packet are unintelligible. What I really need is the process that owns the connection on Win10Target. I run SysInternal's TCPView, which reveals (no surprise here) SvcHost.exe owns both. SvcHost is receiving about 1MB/s, which is saturating my DSL connection. The good news is that it is sending very little, so this is probably something like a DNS lookup gone haywire. ProcExplorer provides more info, zeroing in on DoSvc. This is (supposedly) a Microsoft service with the friendly name "Delivery Optimization". If it is trying to optimize pain, it is succeeding magnificiently.
I suspect this is a botched Windows Update. Similar complaints found in a Windows XP forum cross-pollenated with Windows 7 users.
I hate Windows Update with a burning passion. It is responsible for more grief than any other part of Windows, and Microsoft's answer is to embed it even deeper into Windows10 and remove any way to turn it off or even control when it happens!
I could not stop the service, so I changed it to "disabled" and restarted.
This seemed to help, everything was quiet for about 5 minutes. Then WireShark showed traffic from 69.16.164.31. TCPView showed this was owned by SvcHost/wuausrv, Windows Update.
I tried stopping the service; it reported "stopped" by the data kept flowing. I changed it to disabled and restarted.
Now it is BITS...
I stopped the "Background Intelligent Transfer Service" (BITS) and disabled it. The network activity went to nearly zero.
Summary: This was Windows Update gone haywire. Since I don't care about updating (nor even want) Win10Target, the best option was to disable the services:
TechGage.com has some useful tips on minimizing the collateral damage caused by Windows Update.
The target machine can be configured to bypass the usual password login since it contains nothing of interest. Windows 10 has removed the advanced user account dialog as part of its ongoing campaign to dumb-down Windows, so the easiest way to open the dialog is to first open a DOS box and run netplwiz Simply uncheck the "Users must enter a password" box. This only works if the machine is using Workgroups, not a domain. The configuration to automatically log into a domain is a bit more complicated. (See LifeWire.com)
August 21 2017
There is something going on with the network when Win10Target is running. My internet access on the system slows to a crawl whenever Win10Target is running, when when WinDbg is not. Then sometimes everything is fine. And every once in a while the connection between WinDbg and Win10Target runs smoothly and I am given a glimpse of how easy kernel debugging could be when everything is working properly. I suspect this is something in the configuration of vmnet subnets. Win10Target has two subnets, 192.168.114.0 (Host-only) and 192.168.146.0 (NAT). When I ping "Win10Target" from VS12, I receive a reply from 192.168.146.145.
From Win10Target:
debugtype NET
hostip 192.168.146.129
port 49999
dhcp Yes
I could disable DHCP on the host-only subnet, set a fixed IP for
Win10Target, and connect WinDbg to Win10Target using the host-only
subnet. Does the debug key contain the IP information? There is no
option to give an IP address to WinDbg in the command line. I have
already saved away a copy of the Win10Target VM, so I can experiment
without losing much.
First step is to change the bcdedit debug configuration on Win10Target.
I change the hostip value from 192.168.146.129 to 192.168.114.129 (VS12
on the host-only subnet).
bcdedit /dbgsettings net hostip:192.168.146.129 port:49999 nodhcp
The encryption key is unchanged.
It failed to connect to WinDbg. I am able to ping from Win10Target to 192.168.114.129. I reverted back to 192.168.146.129, keeping the "nodhcp" option. My first impression is that things seem to be running smoother...
I think "nodhcp" fixed it! There must have been an ARP-storm blasting away inside the vmnets.
Disabling DHCP has helped tremendously, but I still see an occasional delay. I tried disabling every network protocol except File sharing and IPv4.
Sometimes the target fails to boot, dumping the VM just before transitioning to the full-res login window. This can usually be fixed by closing and restarting WinDbg.
I need to figure out how to stop WinDbg from constantly downloading symbols from Redmond. Even though I supposedly have the symbol path configured to use a local cache, WinDbg still reaches out to Redmond every time it is launched. This results in a 2-minute delay every time I set the first breakpoint of a session.
OSR Insider 2015/1 Fix Your (Offline) Symbols describes how to configure symbols for a system without internet access.
When I turned on kernel DbgPrint() output, I saw some suspicious activity. I'm especially curious about the "asimovuploader" service.
August 17 2017
The current Microsoft symbol server:
http://msdl.microsoft.com/download/symbols
I am still dealing with very slow response times when debugging with WinDbg. It takes about 90 seconds to set the first breakpoint.
These seem to help:
August 15 2017
The ideal debug environment is to use WinDbg to step through the kernel mode code and Visual Studio for the user mode, with the dev machine in one VM and the target in another VM on the same physical machine. This mostly works, if only Visual Studio would not time out its network connection while the target is halted in WinDbg.
This problem is exacerbated when WinDbg is constantly freezing the target while it takes forever to download symbols. I am (supposedly) caching the symbols locally, but WinDbg still takes 20-30 seconds after the first breakpoint to present the command prompt and randomly freezes the target VM when a new process starts. This makes following user/kernel transitions very difficult as Visual Studio will drop its debug session and kill the user process if I spend more than 30 seconds with the target halted in WinDbg.
Using the embedded Visual Studio kernel debugger is just another can of worms, which has even bigger roadblocks.
So debugging kernel drivers remains an exercise in controlling frustration. The sequence to debug a driver:
It is not unusual for this entire process to take 15-20 minutes to get a single good run of the client app.
I found this hint:
make a file named symsrv.ini in windbg folder insert a section called exclusions in that file and include all the module names for which you do not want windbg to look up symbols for like av drivers and other stuff windbg will skip looking for the symbols of those modules F:\windbg>type symsrv.ini ;start of symsrv.ini [exclusions] Normaliz.* xpsp2res.* mshtml.* pdm.* iphlpapi.* aswMon2.* aswFsBlk.* aswSP.* ipnat.* vmm.*
August 4 2017
Debugging kernel drivers directly on the development machine is generally a bad idea. Dedicating a complete computer system to act as the debug target is expensive, takes a long time to configure, and takes up too much space. The ideal solution is to run both the development system and the target as virtual machines on the same physical computer.
My target system is running Windows10 64bit under VMware Workstation. The computer name is "Win10Target". Remote debugging has improved from Win8 to Win10 with better support for network debugging over TCP/IP, instead of relying on the old virtual COM ports.
Kernel debugging on the target is enabled using bcdedit.
Open a DOS terminal as administrator and run the following commands:bcdedit /debug on bcdedit /dbgsettings net hostip:192.168.0.204 port:49999 Key=wt5hk1dbnt8d.pej2vfc7ye5s.2zjls110857zn.cqsxm3lie4et bcdedit /set {bootmgr} displaybootmenu yes bcdedit /timeout 10 bcdedit /set {current} recoveryenabled no
hostip is the address of the machine running WinDbg, not the target. The port value is any unused IP port between 49152 and 65535.
The Key value is generated by bcdedit. I need to record this value to use when connecting WinDbg or Visual Studio to the target.
The venerable old WinDbg is still the best tool for Windows kernel debugging. Visual Studio 2015 added support for debugging kernel drivers but it has some problems that make it difficult to use, especially when debugging both user-mode and kernel-mode code. WinDbg has a lot of esoteric commands for examining kernel data structures that are not available in Visual Studio.
Set the symbol server:
S:\Src\HQ\Dev\SB\Chip\DbgOut\1300\Out\Winx64Debug;srv*D:\CL\Symbols\MS*https://msdl.microsoft.com/download/symbols
This downloads the symbols from the Microsoft server and caches them locally.
Create a .BAT file to launch WinDbg:
start "WinDbg" "C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\windbg.exe" -b -k net:port=49999,key=wt5hk1dbnt8d.pej2vfc7ye5s.2zjls110857zn.cqsxm3lie4et
The port must match the value set on the target. The target will accept only a single debug connection, so WinDbg and Visual Studio are mutually exclusive with respect to kernel debugging.
The key must match the value reported by bcdedit on the target. This value is a hash of the target's installation key so it should not change once it has been set. The key is a security measure to prevent hackers from hijacking your target system if it is exposed to the wild internets with kernel debugging enabled.
Using NET for debugging Opened WinSock 2.0 Waiting to reconnect...
WinDbg is now waiting for the target system to accept its connection. There seem to be two opportunities to catch the target during boot. If WinDbg fails to connect, I will need to reboot the target and try again. The kernel debug connection seems to mess up the timing on the target so I can expect some odd behavior with the "please wait" swirlies.
The first connection opportunity is early in the boot, just after the Windows logo appears.
There is sometimes a period during the early boot, after connecting to WinDbg, when the target will freeze for about 30-60 seconds. I'm not sure what is happening during this time, but the target will usually (not always) continue.
The second connection window is during the Windows sign in.
Configure the symbol path to include my project output directory, which should contain the driver binary and PDB files.
Configure the VS project for remote debugging. I will be using Visual Studio to debug the user-mode loader program and WinDbg to step through the kernel-mode driver. I configure the program for remote debugging with the driver listed under "Additional files to deploy". Make sure "Deploy" is enabled for the project in the configuration manager.
Make sure the target is visible on the network from the dev machine. Launch the Remote Debugger on the target as administrator. (Admin is required to install and remove the driver service.) Deploy the project to the target. Build > Deploy solution.
If the deployment is successful, the target should now have a set of files deposited by Visual Studio.
Set a breakpoint in the loader and launch it (start debugging).
Driver breakpoints that are handled by WinDbg will cause the Visual Studio debug connection to time out if WinDbg does not release the target within 30 seconds. This is annoying and I can't find any way to change the time out limit.
VMware will ocassionally crash and leave orphaned file locks. The VM will then be unusable until the locks are cleared. Fortunately, this can be done by simply deleting the files. Open a DOS box, go to the directory containing the .vmx file, and execute:
for /d %%f in (*.lck) do del /q %%f\*