American Power Conversion (APC), a global provider of UPS systems, uses SQL Server as the back end for a number of applications, including Siebel call center, sales, and field-service modules; Web infrastructure; and homegrown manufacturing software. Worldwide, the company has more than 60 SQL Servers containing databases exceeding a total of 1TB in disk storage. Each SQL Server runs approximately 10 jobs every day, so the whole system runs close to 600 jobs daily, performing a variety of functions that include database backups, script transfers for Web sites, and auditing. To perform daily routine checks, DBAs had to log into every server. Thus, a DBA would spend an average of an hour every day checking different SQL Servers for failed jobs. APC needed a centralized management tool that would keep track of failed jobs, alert DBAs worldwide, and publish the alerts on the Web. The company also needed to monitor front-end Web servers not only for failed jobs, but also for performance metrics such as CPU utilization, memory utilization, and status of services. In addition, internal and external auditors wanted to see daily reports of jobs running on all SQL Servers globally and their status. To meet all these needs, I decided that I needed to build a monitoring and alerting tool to centrally monitor all tiers of our Web and customer relationship management (CRM) infrastructure, which includes hardware load balancers, Web servers, the OS, and databases. Our solution would also let internal and external auditors see from a centralized location all jobs running on all tiers and their success or failure.
Key Parts
The solution wouldn't be easy, but it was possible. The tools I used to create this monitoring solution were SQL Server, Visual Studio .NET, T-SQL, Data Transformation Services (DTS), and SQL Server 2000 Reporting Services. To bring information from all these parts together, I created a centralized monitoring Web site that had several links that would display the information DBAs needed about SQL Servers, Siebel applications, APC's Web infrastructure, and OS scheduled tasks. Figure 1, shows what the monitoring site's home page looks like.
SQL Server monitoring. To create a comprehensive SQL Server monitoring solution, I used SQL Distributed Management Objects (SQL-DMO) to create a Windows Service that polls all APC SQL Servers worldwide every 10 minutes. The service stores information—such as failed jobs, time of failure, and reason for failure—in a SQL Server table. Then, per auditor requirements, the service pulls status information for all jobs running on all SQL Servers globally and stores the information in a different table. After the information is in the database, the solution groups the servers by region and sends an email message to the DBAs who are on call at that time, notifying them of the jobs' status. Finally, the service publishes the information on the central monitoring Web site.
Siebel monitoring. I created this part of the solution a couple of years ago. My Seibel monitoring tool gathers information about Siebel jobs by using a combination of Windows NT batch files that run commands to send status information to text files. I used DTS T-SQL scripts (because they weren't delimited) to parse the text files and store them in a SQL Server table. Because the data wasn't delimited, I had to take the raw data and parse it to get meaningful information from it.
Another important part of my Siebel-monitoring tool is monitoring replication of Siebel data between geographically dispersed servers in the United States, Ireland, and Singapore. The only way to do this was to monitor specific folders in the Siebel installation that contained .dx files (native Siebel replication files).
If the number of files was greater than 50 at any location, the increase would indicate a backlog in the replication process. To capture this information, I used Visual Basic .NET code that runs in a Windows Service and stores the required information in a SQL Server table, just as I did with my SQL Server monitoring tool.
Because I already had the Siebel monitoring tool, all I needed to do to integrate it with my new monitoring solution was to take the data from the SQL Server table, send alerts, and publish the information on the monitoring Web site. The Siebel page contains information not only about jobs running in Siebel, but also the status of outbound email messages to field service engineers that workflows (Siebel application processes) running within Siebel generate.
Prev. page  
[1]
2
next page