SQL Server Performance Issues

Posted on Updated on

This is an on going post that I am updating as it progresses… the issue start in early July of 2010 – Present date…

The Issue..

Recently I was working on a MS SQL 2000 Server and it was having some performance issues. Users were reporting random slowness and disconnects. Three other servers would feed this server SQL based data and a MS SQL 2008 Reporting server would occasionally connect and retrieve data for reporting services. Keep in mind this is non-clustered production server and the business needs to have up 24/7, and rebooting it is close to impossible.  Hence this drove some of our decisions…

What we know about the server…

  1. Server is a HP DL380 G5 server, Single Socket Quad Core Xeon 5160, 4GB RAM (4x 1GB Sticks) , 2 x 36GB SAS 10K Drives (C Drive) , 5 x 146GB SAS 10K Drives (D Drive)
  2. OS is Windows 2003 SP2
  3. SQL Enterprise Edition 2000 SP 4
  4. HP Management Tools are installed
  5. C: Drive is 33GB / 14GB Free  and is ~ 75% Fragmented
  6. D: Drive is 410GB / 172GB Free and is ~100% Fragmented
  7. SQL is taking 1.7GB of RAM as of 07/28/2010
  8. SQL is taking 5 to 20% of the CPU
  9. 980MB of RAM is average Free Space

 

Items we tried… (Keep in mind the order we could attempt analysis was partially based on the business)

  • Basic analysis – No issues found, memory okay, disk okay, etc..
  • Checked Network connections (cable, switch), and Error Logs – Found HP NIC was reporting disconnects since 2008
    •  Reseated and tested cables, okay no issues
    • Updated with MS Updates and rebooted
    • Updated firmware (HP FW 9.00), Software Drivers (PSP 8.40), and Rebooted
    • Noted that PSP8.40 NIC driver was dated update Driver manually to latest
    • After updates users reported no change still slow
  • Found the TCP Off Load Chimney issue (kb/942861)  but we decided to explore other options first
  • Monitored the server via Task Manager / Process Explorer, Nothing definitive found
  • Vendor Ran the SQL Profiler Program to determine issues, Nothing definitive found
  • Vendor believed that Hard Page Faults were the issue based on Task Manager Reports. I used the link below with Performance Monitor & Process Explorer to prove the server was not paging to disk.
  • Noted the SQL Data disk and Boot Disk were fragmented
  • Noted that SQL Maintenance was never run
  • Noted that the /3GB Switch could be implemented & vendor concurred it is being used in other locations without issue
  • Implemented the TCP Off Load Chimney and the /3GB, users reported improvements
  • Contacted HP about the issue with Windows 2003 SP2, NC373i, and the TCP OffLoad issue
    • HP Confirms NIC driver is up today
    • HP would like to run HPS Reports, I ran/emailed them the reports
    • HP Responds, Nothing definitive found in the HPS Reports
    • HP will escalate to their network team for further analysis

Still to do…

  • Database Maintenance & De-fragment hard disks

Summary so far..

It does appear that specific types of NIC controllers are having issues after the Windows 2003 SP2 update with the TCP Offload feature.  Even updated drivers and firmware at this time don’t fix this. In-fact we even had one P2V VM that was having the same issue (I still need to look at this one)

Helpful Links….

Defrag Link –

http://technet.microsoft.com/en-us/library/cc966523.aspx

SysInternals Links –

http://technet.microsoft.com/en-us/sysinternals/bb963887.aspx

Basic of Page Faults –

http://blogs.technet.com/b/askperf/archive/2008/06/10/the-basics-of-page-faults.aspx

The effect of TCP Chimney off load –

http://blogs.technet.com/b/networking/archive/2008/11/14/the-effect-of-tcp-chimney-offload-on-viewing-network-traffic.aspx

Symantec In-depth explanation of TCP Chimney off load – (a great read)

http://seer.entsupport.symantec.com/docs/290098.htm

SQL Profiler

http://msdn.microsoft.com/en-us/library/aa173918(SQL.80).aspx

Memory Management – Demystifying /3GB

http://blogs.technet.com/b/askperf/archive/2007/03/23/memory-management-demystifying-3gb.aspx

Error message when an application connects to SQL Server on a server that is running Windows Server 2003: “General Network error,” “Communication link failure,” or “A transport-level error”

http://support.microsoft.com/kb/942861

An update to turn off default SNP features is available for Windows Server 2003-based and Small Business Server 2003-based computers

http://support.microsoft.com/kb/948496

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s