I’ve been in IT for over 20 years now and in my time I’ve seen some crazy stuff like –
- Grass growing in a Unisys Green Screen terminal that was sent in for repair by a Lumber yard
- A Goofy screen saver on an IBM PS/2 running OS/2 that kept bringing down Token Ring till we found it
But this, friend, is one of the weirder issues I’ve come across….
This all started in March 2012. I bought some more RAM and a pair of 2TB Hitachi HDs for my Workstation 8 PC. I needed to expand my system and Newegg had a great deal on them. I imaged my existing Windows 7 OS and pushed it down to the new HD. When the system booted I noticed that it was running very slowly. I figured this was an issue with the imaging process, so I decided to install Windows 7 from scratch, but I ran into various installation issues and slowness problems. I put my old Samsung HD back in my system and it booted fine. When I plugged the new Hitachi HD into the system as a second HD via SATA or USB, the problems started again: decreased performance, programs not loading, and choppy video. I repeated the same steps with the second Hitachi HD that I bought and it had the same issues.
A bit perplexed at this point, I figured I had a pair of bad HDs or a bad HD BIOS. Newegg would not take back the HDs, so I started working with Hitachi. I tried an HD firmware update and RMA’d both HDs, and I still had the same issue. Hitachi then sent me a different, slower model of HD, and it worked fine. So now I knew there was something up with this model of HD.
I then started working with Gigabyte. Same deal as with Hitachi: a BIOS update and an RMA for a new system board revision (now I’m at Rev 1.3), and I still had the same issue. I sent an HD to Gigabyte in California and they could not reproduce the problem. I’ll spare you all the details, but trust me, I tried every combination I could think of. At this point I had been at this for 5 months, I still could not use my new HDs, and then I discovered the following…
I put a PCI (not PCIe) VGA video card into my system and it worked…
and then it hit me – “I wonder if this is some weird HDMI Video HD conflict problem”
I asked Gigabyte if disabling onboard HDMI video might help.
They were unsure, but I tried it anyway and sure enough I found the solution!
It was like the computer gods had finally shone down on me from above – halle-freaking-lujah…..
Here are the overall symptoms….
- The Windows 7 x64 Enterprise or Professional installer fails to load or fails to complete the installation process
- If the installation completes, mouse movements are choppy, and the system locks up or will not boot
- When the Hitachi HD is attached via USB to an already booted system, the system starts to exhibit performance issues
Here is what I found out….
Any Combination of the following products will result in a failure…. Change any one out and it works!
Here is the solution to making them work together….
In the BIOS, under Advanced BIOS Settings, change On Board VGA to ‘Enable if No Ext PEG’
This simple setting disabled the on board HDMI Video and resolved the conflicts with the products not working together.
I got to meet some really talented engineers at Hitachi and Gigabyte. All were friendly and worked with me to solve my issue. One person, Danny from Gigabyte, was the most responsive and talented MoBo engineer I’ve met. Even though in the end I found my own solution, I wouldn’t have made it there without some of their expert guidance!
I hear this topic come up from MANY and I mean MANY VMware folk. When I say VMware folk, I mean just about every person who interfaces with the product – Yes it’s that many
I believe it is a common misconception that Windows 2008 is aligned out of the box.
*The crowd goes silent as a distant ‘Ahh..’ and ‘No’ silently streams through the audience*
I also believe that Windows 2008 has a better chance of being aligned out of the box than most – But Don’t Trust it.
Still don’t believe me? Then read this from the horse’s mouth…
http://msdn.microsoft.com/en-us/library/dd758814(v=sql.100).aspx << Look for the topic “Partition Alignment in Windows Operating Systems”
From the above Microsoft link about alignment –
Partition Alignment in Windows Operating Systems
The way partition alignment works depends on the version of Windows being used and the version in which the partition alignment was created. The following sections describe how partition alignment works in Windows Server 2008, the Windows Vista® operating system, and Windows Server 2003 and earlier.
Windows Server 2008 and Windows Vista: New Partitions
In Windows Vista as well as Windows Server 2008, partition alignment is usually performed by default. The default for disks larger than 4 GB is 1 MB; the setting is configurable and is found in the registry at the following location:
However, if OEM setups are delivered (for example, with recovery partitions), even fresh installations of Windows Server 2008 having partitions with undesirable partition starting offsets have been observed.
Whatever the operating system, confirm that new partitions are properly aligned.
I’m guessing at this point you still have doubt… But wait here’s more proof… I’ve seen misalignment in production environments… *No Way – Yes Way*
Do you believe now?
If so, maybe the best approach to this topic is to start stating “Windows 2008 is a better aligned OS but it needs to be checked just like every Windows OS out there.”
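For anyone who wants to script the check, here is a minimal sketch in Python. It assumes you already have each partition’s starting offset in bytes (for example from `wmic partition get Name, StartingOffset`). The function name `is_aligned` and the 64 KB default boundary are my own illustrative choices, not something taken from the Microsoft article, so substitute whatever boundary your storage vendor recommends.

```python
def is_aligned(starting_offset_bytes: int, boundary_bytes: int = 64 * 1024) -> bool:
    """Return True if the partition starting offset falls on the boundary.

    boundary_bytes is an assumed value (64 KB here); the right number
    depends on the storage behind the disk.
    """
    if starting_offset_bytes < 0:
        raise ValueError("starting offset cannot be negative")
    if boundary_bytes <= 0:
        raise ValueError("boundary must be a positive number of bytes")
    # An aligned partition starts on an exact multiple of the boundary.
    return starting_offset_bytes % boundary_bytes == 0


def misalignment_bytes(starting_offset_bytes: int, boundary_bytes: int = 64 * 1024) -> int:
    """Return how many bytes past the previous boundary the partition starts."""
    if boundary_bytes <= 0:
        raise ValueError("boundary must be a positive number of bytes")
    return starting_offset_bytes % boundary_bytes
```

As a quick sanity check, the classic 63-sector offset (63 x 512 = 32,256 bytes) does not divide evenly by 64 KB, so `is_aligned(32256)` comes back False, while the 1 MB default mentioned above (1,048,576 bytes) comes back True.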
Here is one way you can determine if your server is doing soft or hard Page faults.
Hard vs. Soft
Hard Page faults indicate the server is going to the Hard Disk to retrieve needed data and place it in RAM.
Soft Page faults indicate it is going to RAM or Cache to get the data it needs. This is normal for most programs.
Set up Windows Performance Monitor with the following…
SOFT Page Faults = Cache Faults/sec & Page Faults/sec
Hard Page Faults = Page Reads/sec & Avg. Disk Sec/Read
As you can see from this screen shot this server isn’t doing any hard page faults.
If you notice consistent hard page faults, this could be by design, or you need to add RAM to the server or allocate appropriate RAM to the application. Either way, I’d recommend consulting with the application owner or company who created the application for proper guidance.
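If you log the counters to a file instead of watching them live, “consistent” can be checked with a few lines of Python. This is only a rough sketch: it assumes you have already pulled the Page Reads/sec samples into a list of numbers, and the function name, the threshold of 20 reads per second, and the 80% cutoff are placeholder values I picked for illustration, not official guidance.

```python
def sustained_hard_faults(page_reads_per_sec, threshold=20.0, min_fraction=0.8):
    """Return True if most samples of Page Reads/sec exceed the threshold.

    threshold and min_fraction are assumed values for illustration;
    tune them against a known-good baseline for the server.
    """
    samples = list(page_reads_per_sec)
    if not samples:
        # No data means nothing to flag.
        return False
    over = sum(1 for value in samples if value > threshold)
    # Flag only when the high readings are the norm, not a single spike.
    return over / len(samples) >= min_fraction
```

A single spike such as `[0, 0, 120, 0, 0]` is not flagged, while a run like `[50, 60, 55, 70, 65]` is. The point of the fraction test is to separate the occasional burst of paging from a server that is reading pages from disk all the time.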
This is an ongoing post that I am updating as it progresses… the issue started in early July of 2010 and continues to the present date…
Recently I was working on an MS SQL 2000 server that was having some performance issues. Users were reporting random slowness and disconnects. Three other servers feed this server SQL-based data, and an MS SQL 2008 Reporting server occasionally connects and retrieves data for reporting services. Keep in mind this is a non-clustered production server that the business needs up 24/7, so rebooting it is close to impossible. This drove some of our decisions…
What we know about the server…
- Server is a HP DL380 G5 server, Single Socket Quad Core Xeon 5160, 4GB RAM (4x 1GB Sticks) , 2 x 36GB SAS 10K Drives (C Drive) , 5 x 146GB SAS 10K Drives (D Drive)
- OS is Windows 2003 SP2
- SQL Enterprise Edition 2000 SP 4
- HP Management Tools are installed
- C: Drive is 33GB / 14GB Free and is ~ 75% Fragmented
- D: Drive is 410GB / 172GB Free and is ~100% Fragmented
- SQL is taking 1.7GB of RAM as of 07/28/2010
- SQL is taking 5 to 20% of the CPU
- 980MB of RAM is average Free Space
Items we tried… (Keep in mind the order we could attempt analysis was partially based on the business)
- Basic analysis – No issues found, memory okay, disk okay, etc..
- Checked Network connections (cable, switch), and Error Logs – Found HP NIC was reporting disconnects since 2008
- Reseated and tested cables, okay no issues
- Updated with MS Updates and rebooted
- Updated firmware (HP FW 9.00), Software Drivers (PSP 8.40), and Rebooted
- Noted that the PSP 8.40 NIC driver was dated, so updated the driver manually to the latest version
- After the updates users reported no change, still slow
- Found the TCP Off Load Chimney issue (kb/942861) but we decided to explore other options first
- Monitored the server via Task Manager / Process Explorer, Nothing definitive found
- Vendor Ran the SQL Profiler Program to determine issues, Nothing definitive found
- Vendor believed that Hard Page Faults were the issue based on Task Manager Reports. I used the link below with Performance Monitor & Process Explorer to prove the server was not paging to disk.
- Noted the SQL Data disk and Boot Disk were fragmented
- Noted that SQL Maintenance was never run
- Noted that the /3GB Switch could be implemented & vendor concurred it is being used in other locations without issue
- Implemented the TCP Off Load Chimney and the /3GB, users reported improvements
- Contacted HP about the issue with Windows 2003 SP2, NC373i, and the TCP OffLoad issue
- HP confirms the NIC driver is up to date
- HP would like to run HPS Reports, I ran/emailed them the reports
- HP Responds, Nothing definitive found in the HPS Reports
- HP will escalate to their network team for further analysis
Still to do…
- Database Maintenance & De-fragment hard disks
Summary so far..
It does appear that specific types of NIC controllers are having issues with the TCP Offload feature after the Windows 2003 SP2 update. Even updated drivers and firmware don’t fix this at this time. In fact, we even had one P2V VM that was having the same issue (I still need to look at this one).
Defrag Link –
SysInternals Links –
Basic of Page Faults –
The effect of TCP Chimney off load –
Symantec In-depth explanation of TCP Chimney off load – (a great read)
Memory Management – Demystifying /3GB
Error message when an application connects to SQL Server on a server that is running Windows Server 2003: “General Network error,” “Communication link failure,” or “A transport-level error”
An update to turn off default SNP features is available for Windows Server 2003-based and Small Business Server 2003-based computers