Download the complete service definition (monitoring_support.doc).

1. Description of Service

ITD provides 24x7 Monitoring services for fully managed IT services at ITD’s Data Center. The monitoring service uses software agents to determine the health of all infrastructure hardware and software components of a hosted service.

This service:  

  • Proactively identifies potential service outages so they can be addressed before they result in a service disruption
  • Provides data for analyzing a service outage and restoring service
  • Provides historical utilization and performance data to assist in ongoing service improvements

 

Standard services 

Standard monitoring services include performance statistics and alert notifications if a threshold is reached.

Standard services include: 

  • Server monitoring
    • Up/Down Alerts
    • Alerts for CPU, memory, & disk space at pre-defined threshold levels
    • Performance metrics for CPU, memory, disk, network, I/O & other parameters that can be viewed in real-time
    • Performance data for long term analysis 
  • Web Service Monitoring
    • Monitor URL for availability and response time
    • Monitor simulated transaction for availability and response time
    • Performance statistics are available in real time
    • Historical summarized performance metrics are stored for multiple years for long term analysis
  • Network Devices
    • Up/Down Alerts
    • Utilization Statistics available on request
    • Utilization for network device as well as network throughput
    • Alerts are generated if utilization reaches a pre-defined threshold
  • Database monitoring
    • SQL & Oracle
    • Alerts on failure and performance thresholds
    • Historical summarized performance metrics are stored for multiple years for long term analysis
  • Application Monitoring
    • Process monitoring & Alerts
    • Log file monitoring
    • HTTP Webserver, Apache, PeopleSoft
    • MQ & Message Broker
    • Historical summarized performance metrics are stored for multiple years for long term analysis
  • VMware Monitoring
    • Monitors ESX server H/W (Hosts) and Guests
    • Monitors other components such as Network connectivity and performance
    • Monitors Data Store and its performance
    • Historical summarized performance metrics are stored for multiple years for long term analysis
  • Appliance Monitoring
    • DataPower appliance (XML Gateway)
  • Available performance statistics
    • Server statistics are accessible to customers
    • Provide alert notifications if a threshold is reached
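
As an illustration of threshold-based alerting on server metrics, the following minimal Python sketch compares sampled utilization values against pre-defined thresholds. The threshold values, the metric names, and the function `breached_thresholds` are hypothetical and are not taken from ITD's monitoring configuration.

```python
# Hypothetical thresholds (percent used); actual values are defined by the
# Monitoring Group together with the customer.
THRESHOLDS = {"cpu": 90.0, "memory": 85.0, "disk": 80.0}


def breached_thresholds(sample, thresholds=THRESHOLDS):
    """Return the sorted names of metrics whose value meets or exceeds its threshold."""
    return sorted(
        name
        for name, limit in thresholds.items()
        if sample.get(name, 0.0) >= limit
    )


# Example sample of utilization percentages for one server.
sample = {"cpu": 95.2, "memory": 40.0, "disk": 80.0}
print(breached_thresholds(sample))  # ['cpu', 'disk']
```

A metric missing from the sample is treated as 0 here, which is a simplification; a real monitor would more likely flag missing data separately.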

 

System Alerts

Alerts are generated based on a standard set of criteria established by the Monitoring Group and on requirements defined by the customer. Alerts are not generated during periods of planned downtime such as maintenance windows. By default, alerts have three different levels:

  • Warnings – These are the lowest-level caution messages, which indicate a need to monitor/inspect a given resource. The message appears on the Monitoring browser only.
  • Major – These are action level alerts, which indicate an immediate need to respond by increasing capacity or otherwise addressing resource degradation. An email notification is sent to defined individuals/groups to address these alerts.
  • Critical – These alerts are generated when there is a “Failure of service” and also instruct staff as to next steps (page the on-call person, wait, escalate, call, etc.). An email notification is sent to individuals/groups, and Operations staff are notified of the failure on a 24x7x365 basis.
    • Example: Server is down (i.e., non-pingable at 10-second intervals for a duration of 30 seconds).
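
The three alert levels and the server-down example above can be sketched in Python as follows. This is an illustrative sketch only: the function names and channel names are hypothetical, reading "30 seconds" as three consecutive failed pings is an assumption, and so is the idea that Major and Critical alerts also appear on the Monitoring browser.

```python
from datetime import datetime

PING_INTERVAL_SEC = 10  # interval from the server-down example
DOWN_AFTER_SEC = 30     # duration from the server-down example


def is_down(ping_results):
    """True if the most recent pings all failed for the full window.

    ping_results is a list of booleans (True = reply received), oldest
    first, one entry per 10-second interval. Assumes 30 seconds means
    three consecutive failed pings.
    """
    needed = DOWN_AFTER_SEC // PING_INTERVAL_SEC
    recent = ping_results[-needed:]
    return len(recent) == needed and not any(recent)


def notification_channels(level):
    """Map an alert level to hypothetical notification channel names."""
    channels = {
        "warning": ["monitoring_browser"],
        # Assumption: higher levels are also shown on the browser.
        "major": ["monitoring_browser", "email"],
        "critical": ["monitoring_browser", "email", "operations"],
    }
    return channels[level]


def should_alert(now, maintenance_windows):
    """Suppress alerts while `now` falls inside any (start, end) window."""
    return not any(start <= now < end for start, end in maintenance_windows)


window = (datetime(2014, 8, 14, 1, 0), datetime(2014, 8, 14, 3, 0))
if is_down([True, False, False, False]) and should_alert(datetime(2014, 8, 14, 4, 0), [window]):
    print(notification_channels("critical"))
```

Keeping the maintenance-window check separate from the down check mirrors the statement that alerts are not generated during planned downtime.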

 

Additional Services

Monitoring of WebLogic and WebSphere middleware application services, which requires additional software modules and/or other tools, includes monitoring and alerts for various parameters, with performance metrics available to users.

 

Please note: Certain types of security-related monitoring, for example for Denial of Service (DoS) attacks, virus alerts, etc., are performed under the ITD Security Service separately from the Monitoring Service described here.

 

 

2. Service Targets and Metrics

| Service Requirement | Description |
|---|---|
| Service Availability | Service availability hours are 24x7x52 |
| Incident Management* | ITD Service Management Office has standard processes for managing incidents, requests, and changes. Outages or urgent issues should be reported by phone at 1-866-888-2808 to receive the quickest response. |
| Request Fulfillment | Staff will respond to service requests during the hours of 8:00am to 5:00pm, Monday through Friday, excluding holidays. Off-hours requests must be opened via CommonHelp phone support and reported as an emergency; the on-call person will be paged. |

| What | Description | Availability | Restriction |
|---|---|---|---|
| Alerts – Mail & Notification to OPS | This measures availability of alerts via mail and via console notification to OPS | 98.9% | Perform maintenance during business hours |
| Performance Metrics | TEP view for customers to look at application performance metrics | 98.9% | Perform maintenance during business hours |
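
To put the 98.9% availability figure listed above in concrete terms, the short Python sketch below converts an availability percentage into permitted hours of unavailability. The measurement period is not stated in this document, so the 30-day month and 365-day year used here are illustrative assumptions, and `allowed_downtime_hours` is a hypothetical helper.

```python
def allowed_downtime_hours(availability_pct, period_hours):
    """Hours of unavailability permitted in a period at the given availability target."""
    return period_hours * (1 - availability_pct / 100)


# 98.9% is the target listed above; the period lengths are assumptions.
per_30_day_month = allowed_downtime_hours(98.9, 30 * 24)
per_365_day_year = allowed_downtime_hours(98.9, 365 * 24)
print(f"{per_30_day_month:.2f} hours per 30-day month")   # 7.92
print(f"{per_365_day_year:.2f} hours per 365-day year")   # 96.36
```

Under these assumptions, the target corresponds to roughly 7.9 hours per 30-day month, or about 96.4 hours per year.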

*Incidents, requests, or changes that are outside the scope of the defined service description or normal service hours will be direct charged to the customer.

 

 


 

3. Service Reporting

The following reporting information is provided to customers as part of this service:

| Report | Description | Reporting Interval |
|---|---|---|
| Performance Reports | Performance reports are available to users upon request. Performance reports can also be viewed through Tivoli Enterprise Portal (TEP). | Upon customer request via email to CommonHelp |



4. Service Requests

| COMiT Service Request* | Description | Lead Time (Business Days) |
|---|---|---|
| Request New Monitoring Service | This request is to set up or establish monitoring for a network, software, or hardware device. | 5 Days |
| Modify Alerts | This request is to modify thresholds that are already set. | 1 Day |
| Modify Notification | This request is to modify who is notified. | 1 Day |

*For new service requests only. To manage existing requests, please log in to COMiT.
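
Because lead times are quoted in business days, an expected completion date can be estimated by skipping weekends and holidays. The Python sketch below is one way to do that. It assumes counting starts on the day after submission, and the holiday set is left empty because ITD's holiday calendar is not given here; `add_business_days` is a hypothetical helper.

```python
from datetime import date, timedelta


def add_business_days(start, days, holidays=frozenset()):
    """Return the date `days` business days after `start`, skipping weekends and holidays."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        # weekday() is 0-4 for Monday through Friday.
        if current.weekday() < 5 and current not in holidays:
            remaining -= 1
    return current


# A request submitted on Thursday, August 14, 2014 with a 5-day lead time.
print(add_business_days(date(2014, 8, 14), 5))  # 2014-08-21
```

Passing a set of `date` objects as `holidays` pushes the result out by one business day for each holiday that falls within the lead time.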


 


5. Customer Responsibilities

Customers should work with ITD to define all monitoring requirements. 

For your convenience, you may view a detailed list of customer responsibilities (monitoring_support.doc).
 



 

6. Chargeback Rate Information

There is no charge for this service and all costs are included in ITD overhead. 

For more information on Chargeback, including an overview of the program as well as current and previous fiscal year rates, please visit our Chargeback Services webpage.

 


Updated August 14, 2014
Reviewed August 14, 2014
Published November 01, 2013