Status:

  • The Windows Failover Cluster fails to start any of its roles.
  • All nodes in the Windows Failover Cluster are up.
  • Identified several important event IDs that are common across all SQL nodes: 1069, 1205, 1254.
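
The status above can be confirmed from any cluster node. The following is a minimal sketch, assuming an elevated PowerShell session and that the FailoverClusters module is installed; it only reads state and changes nothing.

```powershell
# Run in an elevated PowerShell session on any cluster node.
Import-Module FailoverClusters

# Node membership: every node is expected to report State = Up.
Get-ClusterNode | Format-Table Name, State -AutoSize

# Roles (cluster groups) that are not online, with their current owner node.
Get-ClusterGroup |
    Where-Object { $_.State -ne 'Online' } |
    Format-Table Name, OwnerNode, State -AutoSize

# Individual resources that are failed or offline, sorted by the role they belong to.
Get-ClusterResource |
    Where-Object { $_.State -ne 'Online' } |
    Sort-Object OwnerGroup |
    Format-Table Name, ResourceType, OwnerGroup, State -AutoSize
```

A failed Network Name or IP Address resource in the last table would be consistent with the name-resolution cause discussed next.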

Probable Causes: 
-The SQL node names and the AG name are not resolving, or could not be registered on the network. Check the IP and DNS configuration.
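
One way to test this cause is to resolve each name and then look at what the cluster's network name resources are trying to register. This is a sketch only: SQLNODE1, SQLNODE2 and AG-LISTENER are hypothetical placeholders, not names from this environment.

```powershell
# Hypothetical names: replace with the real SQL node names and AG (listener) name.
$names = @('SQLNODE1', 'SQLNODE2', 'AG-LISTENER')

foreach ($name in $names) {
    try {
        # -ErrorAction Stop turns a failed lookup into a catchable error.
        $records = Resolve-DnsName -Name $name -Type A -ErrorAction Stop
        $ips = ($records | Where-Object { $_.IPAddress } | ForEach-Object { $_.IPAddress }) -join ', '
        Write-Output "$name resolves to: $ips"
    }
    catch {
        Write-Warning "$name did not resolve: $($_.Exception.Message)"
    }
}

# Network Name resources in the cluster and the DNS settings they use.
Import-Module FailoverClusters
Get-ClusterResource |
    Where-Object { $_.ResourceType.Name -eq 'Network Name' } |
    Get-ClusterParameter -Name DnsName, RegisterAllProvidersIP, HostRecordTTL
```

Running the loop from a node on each side (primary and DR) helps show whether the problem is limited to one site's DNS servers.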

Suggestions:
-Please make sure there are no IP configuration or DNS issues for the AG and the SQL nodes on the DR side.
-Please make sure all the SQL nodes are at the same patch level and up to date.
-Please make sure the replication is healthy.
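
For the patch-level suggestion, a rough comparison of installed Windows updates across nodes could look like the sketch below. It assumes remote management access from the current node to every other node, and it covers Windows hotfixes only; the SQL Server build would have to be compared separately (for example with SELECT @@VERSION on each instance).

```powershell
Import-Module FailoverClusters

# For every cluster node, find the most recently installed Windows hotfix.
$report = foreach ($node in (Get-ClusterNode).Name) {
    $latest = Get-HotFix -ComputerName $node |
        Where-Object { $_.InstalledOn } |
        Sort-Object InstalledOn -Descending |
        Select-Object -First 1

    [pscustomobject]@{
        Node        = $node
        LatestKB    = $latest.HotFixID
        InstalledOn = $latest.InstalledOn
    }
}

# Nodes showing different KB numbers or dates are candidates for a closer look.
$report | Format-Table -AutoSize
```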

  • Event ID 1069:
    Description: This event is logged when a clustered resource fails to come online.
    Common Causes:
    Resource Configuration: Incorrect configuration of the resource (e.g., network name, IP address, disk, etc.).
    Network Issues: DNS-related issues for network name resources or NIC-related events for IP address resources.
    Disk Errors: Disk-related errors or warnings for physical disk resources.
    Troubleshooting Steps:
    Check the system event log for Event ID 1069.
    Review other related errors or warnings.
    Examine the cluster log for further details.
    Use PowerShell to collect cluster logs: Get-ClusterLog -Destination C:\temp.
  • Event ID 1205:
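    Description: This event is logged when the Cluster service fails to bring a clustered role completely online or offline; one or more of its resources may be in a failed state. (It is unrelated to SQL Server error 1205, which reports a deadlock.)
    Common Causes:
    Resource Dependencies: A resource the role depends on has failed, usually visible as a preceding Event ID 1069.
    Network Issues: Network communication problems.
    Troubleshooting Steps:
    Check the cluster log for the resource that failed around the time of the event.
    Verify resource dependencies and network connectivity.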
  • Event ID 1254:
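    Description: This event is logged when a clustered role exceeds its failover threshold: it has used up the configured number of failover attempts within the failover period and is left in a failed state.
    Common Causes:
    Resource Configuration: Incorrect configuration that makes a resource fail on every node.
    Resource Startup Time: Resource taking too long to start, so each attempt times out.
    Troubleshooting Steps:
    Review the cluster log for details.
    Check resource configuration and startup time, and the role's failover threshold and period.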
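
The troubleshooting steps listed for the three event IDs can be sketched in PowerShell as follows. This is an illustration under assumptions, not a definitive procedure: the 24-hour and 60-minute windows are arbitrary choices, and remote event log access to each node is assumed.

```powershell
Import-Module FailoverClusters

# Pull the three event IDs from the System log of every node for the last 24 hours.
$since = (Get-Date).AddHours(-24)
foreach ($node in (Get-ClusterNode).Name) {
    # SilentlyContinue suppresses the error raised when a node has no matching events.
    Get-WinEvent -ComputerName $node -ErrorAction SilentlyContinue -FilterHashtable @{
        LogName      = 'System'
        ProviderName = 'Microsoft-Windows-FailoverClustering'
        Id           = 1069, 1205, 1254
        StartTime    = $since
    } | Format-Table MachineName, TimeCreated, Id, Message -Wrap
}

# Generate cluster logs for the last 60 minutes in local time; the files land in C:\temp.
Get-ClusterLog -Destination C:\temp -TimeSpan 60 -UseLocalTime | Out-Null

# Resource dependencies.
Get-ClusterResource |
    Get-ClusterResourceDependency |
    Format-Table Resource, DependencyExpression -AutoSize -Wrap

# Per-resource timeout and restart settings (PendingTimeout is in milliseconds).
Get-ClusterResource |
    Format-Table Name, State, PendingTimeout, RestartThreshold, RestartPeriod -AutoSize

# Per-role failover limits.
Get-ClusterGroup |
    Format-Table Name, State, FailoverThreshold, FailoverPeriod -AutoSize
```

Sorting the collected events by time usually shows which resource failed first; later events on the same role tend to be consequences of that first failure.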

Fix:

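Force a failover on the SQL Server side. In SQL Server Management Studio, connect to the secondary replica that should become the new primary and run the following. Any data that has not yet been synchronized to that replica can be lost: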

ALTER AVAILABILITY GROUP xxxxxxx FORCE_FAILOVER_ALLOW_DATA_LOSS; 
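
Because a forced failover can lose data, it may be worth looking at the replica state on the target instance first. The sketch below assumes the SqlServer PowerShell module (which provides Invoke-Sqlcmd) is installed and that DRSQLNODE1 is a hypothetical name for the target secondary instance; depending on the module version, extra connection options for encryption may be needed.

```powershell
# Hypothetical instance name: replace with the secondary replica that will become primary.
$instance = 'DRSQLNODE1'

# Role, connection state and synchronization health of each replica as seen from this instance.
$query = @"
SELECT ag.name AS ag_name,
       ar.replica_server_name,
       ars.role_desc,
       ars.connected_state_desc,
       ars.synchronization_health_desc
FROM sys.dm_hadr_availability_replica_states AS ars
JOIN sys.availability_replicas AS ar ON ar.replica_id = ars.replica_id
JOIN sys.availability_groups   AS ag ON ag.group_id   = ars.group_id;
"@

Invoke-Sqlcmd -ServerInstance $instance -Query $query | Format-Table -AutoSize
```

The same query can be run again after the failover to check that the target replica now reports the PRIMARY role.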

Last modified: March 22, 2024
