1. Introduction
This is a runbook for the HCL UNIX BAU team. It explains the standard policies and procedures used
to support the SSE UNIX-Linux infrastructure from offshore. The primary objective of this document
is to provide details of the SSE UNIX-Linux infrastructure and to explain SSE’s UNIX administration
and operations processes.
2. Background
SSE plc (formerly Scottish and Southern Energy plc) is an energy company headquartered in Perth,
Scotland. It is involved in the generation and supply of electricity and gas, the operation of gas and
telecoms networks and other energy related services such as gas storage, exploration and
production, contracting, connections and metering.
SGN (previously known as Scotia Gas Networks) is a UK gas distribution company which manages
the network that distributes natural and green gas to homes and businesses across Scotland and the
south of England. It is owned by SSE and other shareholders.
SSE UNIX infrastructure is hosted in professionally managed data centres that meet industry
standards and best practices. Most of the servers are hosted in two main data centres at Havant:
Martin Road and Pyramid Park. The two data centres are interconnected with dark fibre, and data
disks are mirrored at host level to provide redundancy. Most of the critical applications are
configured for high availability under Veritas Cluster.
SSE UNIX infrastructure is also hosted in other data centres located at Penner Road, Dublin (Ireland)
and Perth (Scotland).
3. Document Overview
This document is intended to be a single point of reference for UNIX administration and operations.
It is assumed that the person reading this document has good technical understanding of UNIX
administration. The reader should have adequate proficiency in UNIX (AIX, Solaris and RHEL).
4. Scope
1.1 Supported Services Overview
Currently the Midrange Support team (OPMID) supports the following:
Monitoring the following ITSC Incident, Request, Problem and Change queues:
Midrange Support, Midrange Support non-production, Midrange Support Maintenance, Midrange
Support Audit.
Service Name                                                        Description    Business Contact
IBM HMC management
IBM Pseries support
AIX LPAR support (server and service restore in case of failure)
AIX VIO (server and service restore in case of failure)
RHEL Unix support (server and service restore in case of failure)
TRU64 Unix support (server and service restore in case of failure)
Solaris Unix support (server and service restore in case of failure)
UNIX server monitoring
Filemover code support
Unix user access
Unix storage requests
System housekeeping
Disaster recovery failover
System failovers
1.2 Service requests response and resolution
In support of services outlined in this Agreement, the Service Provider will respond to service
requests submitted by the Customer within the following time frames:
Request title                                          Notification Medium     Initial Response Time   Delivery Target
SSVUI300 Account Creation/Amendment/Removal            ITSC Request            1 Day                   3 working days
SLIMS server access Addition/Removal                   ITSC Request            1 Day                   3 working days
Production TWS ISC Account Creation/Amendment/Removal  ITSC Request            1 Day                   3 working days
Non-Prod TWS ISC Account Creation/Amendment/Removal    ITSC Request            3 Days                  5 working days
Unix Filesystem Creation/Amendment/Removal             ITSC Request            1 Day                   5 working days
Unix Filesystem Storage Increase/Decrease              ITSC Incident           1 Day                   5 working days
Log file retrieval from Unix production systems        ITSC Incident           2 Days                  5 working days
Production TSM Data Backup/Restore/Configuration       ITSC Incident           1 Day                   2 working days
Non-Prod TSM Data Backup/Restore/Configuration         ITSC Incident           3 Days                  5 working days
Standard Unix Housekeeping Configuration               ITSC Incident/Request   1 Day                   3 working days
Standard Unix Monitoring Installation/Configuration    ITSC Incident/Request   1 Day                   3 working days
Unix Memory/CPU Increase/Decrease                      ITSC Incident/Request   1 Day                   5 working days
Filemover SSH Key setup                                ITSC Incident/Request   1 Day                   5 working days
Samba User Creation                                    ITSC Request            2 Days                  5 working days
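The delivery targets above are expressed in working days. As a rough illustration of how a target date could be worked out from the date a request is raised, the Python sketch below counts forward over weekdays only. It assumes that a working day is Monday to Friday and that bank holidays are ignored; neither assumption is stated in this document, and the function name add_working_days is hypothetical.

```python
from datetime import date, timedelta


def add_working_days(start: date, working_days: int) -> date:
    """Return the date that falls `working_days` working days after `start`.

    Assumption: working days are Monday to Friday; bank holidays are ignored.
    """
    current = start
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            remaining -= 1
    return current


if __name__ == "__main__":
    # Example: a "Samba User Creation" request (5 working days) raised on Monday 1 January 2024.
    print(add_working_days(date(2024, 1, 1), 5))
```

A request raised on a Friday with a 3 working day target would therefore fall due on the following Wednesday under these assumptions.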
5. Out of Scope
The following services are out of scope for SSE UNIX Infrastructure hosted in the various data centres:
Service Name Description Business Contact
Rack and Stack Servers
IBM Pseries Server Build and Decommissioning
AIX LPAR Server Build and Decommission
AIX VIO Build and Decommission
RHEL Unix Server Build and Decommission
TRU64 Unix Server Build and Decommission
Solaris Server Build and Decommission
Disaster Recovery Planning and Designing
UNIX server monitoring Tool Management
Capacity Tool Management
Printer Hardware support
VCS Server Build
ORACLE RAC CLUSTER Build and Support
System housekeeping
Application Installation and Configuration
Cloud and cloud server support
Projects Work
Hand and Feet Support
6. Infrastructure Details
SSE UNIX environment servers are largely divided into the following three categories:
Production
Pre-Production
Development
Hardware and Operating Systems
Hardware      Model                  OS Instances   Operating System
              Power 720              1889           AIX 5.3
              Power 740                             AIX 6.1
              Power 780 Server                      AIX 7.1
              Power 795 Server                      AIX 7.2
              Power System E880C
              Power System E880
              HMC
RHEL          HP ProLiant Servers    356            RHEL 5
                                                    RHEL 6
                                                    RHEL 7
Sun Oracle    Sun-Fire-V440          66             Solaris 8
              Sun-Fire-V215                         Solaris 10
              Sun-Fire-V210                         Solaris 11
              SPARC T5120
              Sun-Fire-V890
              Sun-Fire-T2000
              SPARC T5220
              Sun-Fire-V445
              SPARC T7-2
              SPARC T7-1
              SPARC T3-2
OSF1 DS 25 4 4 Tru64
7. Storage Infrastructure Design
The SSE SAN environment is hosted in POR (Pyramid Data Park) and HAV (Martin Road Data Hall).
The two DCs are within 3 miles of each other, both in Havant, UK.
Storage
Six storage arrays in each DC
Each DC contains 1 HP StoreServ 9450 and 5 HP StoreServ 7400
Two fabrics, Fabric A and Fabric B, span both DCs
Each fabric contains two core switches
HP SN8000B (DCX-8510-8) and Edge/Blade Chassis switches
Rack servers/blades are connected to edge switches, and edge switches are connected to core switches
The SL8500 tape library is connected to the core switches
SSE Prod and Pre-Production environment disks are mirrored at host level (across HAV and POR); in UAT/Test/DEV only root disks are mirrored
Two MSA 2040 storage boxes (one in each DC), which are iSCSI and connected to the network iSCSI switches in the SSE environment (shown in Fig 2)
Two chassis are connected to the MSA boxes, and one chassis is connected to the MSA 2040 (via iSCSI) and HP StoreServ 9400 (via Fibre)
8. Storage Infrastructure Design - NON-PROD
There is no separate storage infrastructure for the Non-Prod environment.
9. RACK/DC Layout
The Midrange team does not hold rack/DC layout details. These are held by Steve Downing, the Hardware Planner.
10. Device Naming Convention
SSE UNIX servers are currently named as follows: AAABCnnn
AAA is a three-letter abbreviation of the location of the instance. For most servers this will be
HAV for Havant and POR for Portsmouth. VHA is a Virtual Havant device on the HYPER-V farm
in Havant and VPO is a Virtual Portsmouth device on the HYPER-V farm in Portsmouth.
B is a one letter code for the operating system type where
‘U’ is UNIX
‘L’ is Linux
‘W’ is Windows.
C is a one letter code depicting the service status where
‘A’ is a production server.
‘D’ is a development server, which has come to be used for pre-production, UAT or any such non-live system
‘I’ stands for Infrastructure, used for every server whose purpose is to provide infrastructure management services.
nnn is a sequential three-digit number allocated at build time.
A few example hostnames:
HAVUI300 - HAV (HAVANT) U (UNIX) I (INFRA)
PORUI300 - POR (PORTSMOUTH) U (UNIX) I (INFRA)
DUBLD001 - DUB (DUBLIN) L (LINUX) D (DEVELOPMENT)
VPOLA020 - VPO (HYPERV - POR) L (LINUX) A (PRODUCTION)
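For illustration only, the AAABCnnn convention can be decoded with a short script. The Python sketch below is not an SSE tool: the names parse_hostname and HostInfo are hypothetical, the DUB location code is taken from the example hostnames rather than from the location list, and any other unrecognised location code is returned unchanged.

```python
import re
from typing import NamedTuple

# Codes from the naming convention. DUB is taken from the example hostnames.
LOCATIONS = {
    "HAV": "Havant",
    "POR": "Portsmouth",
    "VHA": "Virtual Havant (Hyper-V)",
    "VPO": "Virtual Portsmouth (Hyper-V)",
    "DUB": "Dublin",
}
OS_TYPES = {"U": "UNIX", "L": "Linux", "W": "Windows"}
STATUSES = {"A": "Production", "D": "Development", "I": "Infrastructure"}

# AAA (location) + B (OS type) + C (service status) + nnn (sequence number)
HOSTNAME_RE = re.compile(r"([A-Z]{3})([ULW])([ADI])(\d{3})")


class HostInfo(NamedTuple):
    location: str
    os_type: str
    status: str
    number: int


def parse_hostname(name: str) -> HostInfo:
    """Split an AAABCnnn hostname into its parts (hypothetical helper)."""
    # Hostnames are treated as case insensitive, so normalise to upper case first.
    match = HOSTNAME_RE.fullmatch(name.strip().upper())
    if match is None:
        raise ValueError(f"{name!r} does not follow the AAABCnnn convention")
    location, os_code, status_code, number = match.groups()
    return HostInfo(
        location=LOCATIONS.get(location, location),  # unknown codes pass through
        os_type=OS_TYPES[os_code],
        status=STATUSES[status_code],
        number=int(number),
    )


if __name__ == "__main__":
    for host in ("HAVUI300", "PORUI300", "DUBLD001", "VPOLA020"):
        print(host, parse_hostname(host))
```

Names that do not fit the pattern raise ValueError rather than being guessed at, which is one reasonable choice for a check of this kind.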
Hostnames are not case sensitive. The example hosts below are SSE jump servers.
11. Vendor Service Request
Red Hat Support contract details
Web page for RHN:
https://www.redhat.com/wapps/sso/login.html
Username: SSE_opmid
Password: <not here>
For subscription / license info:
Once logged in, click the "manage subscriptions" link in the mid-left of the site.
There are no comments associated with the licenses - ideally these should be managed by the projects /
Neil Payne.
Contract IDs should ideally be added below for our own info:
CONTRACT COVERAGE:
2 x PROD
3 x PREPROD
4 x VMWARE
Customer number (similar to a user/company ID): 908855
Contract number for this renewal: 11077407
Subscription expiry date: 29th Aug 2019
Subscription Numbers
Held in the subscription part of the above web page.
Phone Numbers:
00800 4673 3428
01252 362 710
sales - 01252 362 795
Matt Hall
Key Account Manager
Red Hat Ltd
Mobile: +44 (0)7827 300421
Remote Office: +44 (0)207 0094449
Ext: 8274449
mhall@redhat.com
http://www.redhat.com/
Software:
http://havui322/repo/ISO/
Contract number:
1557627 - ONZO
Veritas Support contract details
Veritas Infoscale - Logging Support Calls
VERITAS TECHNICAL SUPPORT
Account Name: SSE plc
ERP #: 18260
Support ID (SID): 4683-9643-3743
BCS Support ID : 2684-0178-6670
Product(s) covered: Infoscale Enterprise
Contract Expiry: 12 December 2017
Business Critical Account Manager (BCAM)
Felicity Deacon (Fliss) 0772 0082 835 Felicity.Deacon@veritas.com
Sales Account Manager
Tim Howard 0779 504 7149 Tim.Howard@veritas.com
Technical Presales Consultant
Steve Bowman 0792 1985 021 Steven.bowman@veritas.com
How to Raise a Technical Support Case:
Before logging the case, ensure you have the following information:
•Veritas product name and version
•Details of the Operating System on which the product is running
•Contact your BCAM if you wish to escalate the case - always call on Sev1
All case correspondence with Technical Support Engineers (TSE’s) will be via:
Enterprise_Technical_Support@Veritas.com
To ensure inclusion of your emails in Veritas’s case tracking system, please ensure that you do not delete
the unique reference code that is in the subject/body of the email.
You can log calls in two ways: by phone or online.
Phone:
• Call Veritas Customer Support
• Select option 1 for Technical Support
• Enter your Support ID
• When speaking to the agent, state clearly that you are a BCS customer
Online:
• Not appropriate for Severity 1 cases
• Go to https://my.Veritas.com/
• If you do not have an account, please register and notify your BCAM so that your account can be linked to
the BCS contract
Missed Service Level Goal (SLG)
Contact Technical Support, selecting option 1
Advise the HUB Agent that SLG has been missed
Case handling issues
If you need more assistance or need to escalate, please contact your BCAM (Business Critical Account Manager).
Below you can find the descriptions of Severity Level Goals (SLG) defined by Veritas (Symantec):
Severity 1 [Emergency] - System down/product inoperative condition impacts your business critical
operations (Within 15 minutes)
Severity 2 [Critical] - Severely affects or restricts major functionality (Within 2 hours)
Severity 3 [Major] - Issue with no major effect on business systems (Within 6 business hours)
Severity 4 [Minor] - Minor issue or "how to" question with no major effect on system (Within the next business day)
Oracle Support contract details
Log an Oracle Support Request as follows:
Log in via https://support.oracle.com and sign in.
Create an account if you don’t have one (you have to wait for approval), using:
Support Identifier: 20505920
Customer: Scottish and Southern Energy Power Distribution Limited (use the first 5 letters to find it)
Select "Switch to Cloud Support".
For the DBaaS service, select "Oracle Database as a Service" and Service Name/Environment "DBAAS a514234".
Select "Create Service Request".
For Sev 1, Oracle will call you 24/7 unless you specify "working hours only".
Requests logged as Sev 3/4 do not appear to get a good response time.
12. Alerts and Reporting
Currently, all the UNIX server monitoring that Midrange (UNIX) supports is driven through
MUM (Midrange Unix Monitoring). MUM is an in-house developed monitoring database
that lives on a centralised AIX LPAR. MUM receives messages from all the UNIX instances
in the estate via SSH, processes them and creates incidents in IT Service Centre (ITSC)
when necessary.
There are two servers which run the MUM tool.
Service Name   Description             Remark
HAVUA384       Havant MUM Server
PORUA384       Portsmouth MUM Server
SSVUA384       MUM Server DNS Alias    Always log in to this. It will be connected to the active instance.
SSVUA384       MUM DB Server           MySQL Server
1.3 Stop Raising Incidents
Sandpit is the mechanism that stops incidents being raised for a server; monitoring still takes place.
To put a server in the sandpit:
Usage: sse sandpit_server "<reason> <UserID>" <Servername>
1.4 Start Raising Incidents
Usage: sse unsandpit_server "<reason> <UserID>" <Servername>
More details can be found at the forum link below:
http://ssvla001.uk.ssegroup.net/mybb/showthread.php?tid=983&highlight=sandpit
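Both usage lines pass the reason and the user ID together as one quoted argument, followed by the server name. The Python sketch below only builds and prints such a command line so that the quoting can be checked before it is run by hand. The helper name build_sandpit_command and the sample reason, user ID and server name are hypothetical; the sketch does not execute the sse command and assumes nothing about its behaviour beyond the two usage lines.

```python
import shlex
from typing import List


def build_sandpit_command(action: str, reason: str, user_id: str, server: str) -> List[str]:
    """Build the argument list for `sse sandpit_server` or `sse unsandpit_server`."""
    if action not in ("sandpit_server", "unsandpit_server"):
        raise ValueError(f"unknown action: {action!r}")
    # The usage line passes the reason and the user ID together as ONE argument.
    return ["sse", action, f"{reason} {user_id}", server]


if __name__ == "__main__":
    # Hypothetical reason, user ID and server name, for illustration only.
    cmd = build_sandpit_command("sandpit_server", "Planned patching", "ab12345", "HAVUA001")
    print(shlex.join(cmd))
```

Keeping the command as a list of arguments avoids quoting mistakes when the reason contains spaces; shlex.join is used only to display it in shell form.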
1.5 THIRD PARTY CONTRACT DETAILS
Supplier (in column order): IBM (software), RedHat, Oracle, IBM (hardware)
Supplier name and address
  IBM (software): IBM UK Limited, North Harbour, Portsmouth, Hampshire, PO6 3AU
  RedHat: -
  Oracle: Oracle Corporation UK Ltd., Oracle Parkway, Thames Valley Park (TVP), Reading, Berkshire, RG6 1RA
  IBM (hardware): IBM UK Limited, North Harbour, Portsmouth, Hampshire, PO6 3AU
Contract Scope
  IBM (software): IBM Technical Support
  RedHat: RHEL Technical Support
  Oracle: Solaris Technical Support
  IBM (hardware): IBM Technical Support
SSE Contract Owner
  IBM (software): Wayne Renwick
  RedHat: Wayne Renwick
  Oracle: Doric Tong
  IBM (hardware): Wayne Renwick
Contact Name
  IBM (software): IBM Software Support
  RedHat: -
  Oracle: -
  IBM (hardware): IBM Hardware Support
Contact telephone
  IBM (software): 03700 101 952
  RedHat: -
  Oracle: -
  IBM (hardware): 03705 500 900
Contact email
  -
Escalation contact name
  IBM (software): Richard Lewis
  RedHat: Claire Stephens
  Oracle: Emma Bolger
  IBM (hardware): Tony King
Escalation contact role
  IBM (software): Availability Manager
  RedHat: Customer Success Manager
  Oracle: Support Account Manager
  IBM (hardware): Account SSR
Escalation contact tel. no
  IBM (software): 07789 270438
  RedHat: 07824 140101
  Oracle: As support no.
  IBM (hardware): 07990 795269
Escalation contact email
  IBM (software): lewisrj@uk.ibm.com
  RedHat: cstephen@redhat.com
  Oracle: Emma.Bolger@oracle.com
  IBM (hardware): APKING@uk.ibm.com
Support hours
  IBM (software): 24x7
  RedHat: 24x7
  Oracle: 24x7
  IBM (hardware): 24x7
Incident Severity Classification
  IBM (software): Severity 1) Mission Critical; Severity 2) Medium priority; Severity 3) Low Priority
  RedHat: Severity 1) Urgent; Severity 2) High; Severity 3) Medium; Severity 4) Low
  Oracle: Severity 1) Complete loss of service; Severity 2) Severe loss of service; Severity 3) Minor loss of service; Severity 4) Request - no loss of service
  IBM (hardware): Severity 1) Mission Critical; Severity 2) Medium priority; Severity 3) Low Priority
Response Time
  IBM (software): Severity 1) 1 Hour; Severity 2) 4 Hours; Severity 3) 24 Hours
  RedHat: Severity 1) 1 Hour; Severity 2) 2 Hours; Severity 3) 24 Hours; Severity 4) 48 Hours
  Oracle: Depends on product but typically for Severity 1) 1 Hour; Severity 2, 3 & 4) not documented; see policies for details: www.oracle.com/us/support/policies
  IBM (hardware): Severity 1) 1 Hour; Severity 2) 4 Hours; Severity 3) 24 Hours
Support Method
  IBM (software): Onsite, Telephone, Email, Web URL
  RedHat: Web URL (preferred), Telephone
  Oracle: Onsite, Telephone, Web URL support.oracle.com
  IBM (hardware): Onsite, Telephone, Email, Web URL
Support telephone no
  IBM (software): 03700 101 952
  RedHat: 00800 4673 3428 (Office hours only); +44 1252 362 710 (24x7, severity 1&2)
  Oracle: 0870 4000900
  IBM (hardware): 03705 500 900
Support email Not Available https://support.oracle.com aixhw@uk.ibm.com
Support hours 24x7 24x7 24x7
Service Reporting Monthly Weekly Quarterly Monthly
Service Reviews Monthly Six monthly Quarterly Monthly