GHE server version: 2.19.2

References

https://github.community/t5/GitHub-Enterprise-Best-Practices/High-Availability-and-Disaster-Recovery-for-GitHub-Enterprise/ba-p/11725

https://help.github.com/en/enterprise/2.19/admin/installation/configuring-github-enterprise-server-for-high-availability

HA (Normal case - Primary & Replica both in normal status)

Primary Side

Switch the primary to maintenance mode

Run command

  • ghe-maintenance -s

admin@githubtest-testdomain-io:~$ ghe-maintenance -s

 

 

Replica Side

Change the IP address of the AWS Route 53 record

  • primary → replica
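
The Route 53 change can also be made from the AWS CLI instead of the console. The following is only a sketch: the helper name route53_change_batch, the record name, the IP address, the hosted zone ID and the TTL are all placeholders that do not come from this environment.

```shell
# Hypothetical helper: prints a Route 53 change batch that UPSERTs an A record.
# $1 = record name, $2 = new IP address, $3 = TTL in seconds (defaults to 60)
route53_change_batch() {
  cat <<EOF
{
  "Comment": "GHE failover: point the record at the promoted replica",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "$1",
        "Type": "A",
        "TTL": ${3:-60},
        "ResourceRecords": [{ "Value": "$2" }]
      }
    }
  ]
}
EOF
}

# Example invocation (placeholder values, adjust to your own zone):
#   route53_change_batch github.example.com. 10.0.0.20 > /tmp/failover.json
#   aws route53 change-resource-record-sets \
#     --hosted-zone-id ZEXAMPLE123 \
#     --change-batch file:///tmp/failover.json
```

A short TTL on the record keeps the switch-over window small, since clients cache the old IP address until the TTL expires.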

 

Run command (promotes the GHE replica to primary)

  • ghe-repl-promote
admin@ip-172-3?-???-???:~$ ghe-repl-promote
Warning: You are about to promote this Replica node
Promoting this Replica will tear down replication and enable maintenance mode on the current Primary.
All other Replicas need to be re-setup to use this new Primary server.
 
Proceed with promoting this appliance to Primary? [y/N] y
Enabling maintenance mode on the primary to prevent writes ...
Stopping replication ...
  | Stopping Pages replication ...
  | Stopping Git replication ...
  | Stopping Alambic replication ...
  | Stopping git-hooks replication ...
  | Stopping MySQL replication ...
  | Stopping Redis replication ...
  | Stopping Consul replication ...
  | Success: Replication was stopped for all services.
  | To disable replica mode and remove all replica configuration, run 'ghe-repl-teardown'.
Switching out of replica mode ...
  | Dec 04 07:41:45 Preparing storage device...
  | Dec 04 07:41:46 Updating configuration...
  | Dec 04 07:41:46 Reloading system services...
  | Dec 04 07:42:18 Running migrations...
  | Dec 04 07:42:54 Reloading application services...
  | Dec 04 07:43:18 Done!
  | jq: error (at :0): Cannot index number with string "settings"
  | Success: Replication configuration has been removed.
  | Run `ghe-repl-setup' to re-enable replica mode.
Applying configuration and starting services ...
  | ERROR: cannot launch /usr/local/bin/ghe-single-config-apply - run is locked
admin@ip-172-3?-???-???:~$

 

Former Primary Side

Reconfigure the former primary as a replica

Run commands

  • ghe-repl-setup 172.3?.???.???
  • ghe-repl-start
  • ghe-repl-status
  • ghe-maintenance -u (unsets maintenance mode)
admin@githubtest-testdomain.io:~$ ghe-repl-setup 172.3?.???.???
Warning: This appliance is or has been a configured appliance.
Proceeding will overwrite data on this appliance.
 
Proceed with initializing this appliance as a replica? [y/N] y
Verifying ssh connectivity with 172.3?.???.??? ...
Connection check succeeded.
Updating Elasticsearch configuration ...
Copying license and settings from primary appliance ...
 --> Importing SSH host keys...
 --> The SSH host keys on this appliance have been replaced to match the primary.
 --> Please run 'ssh-keygen -R 172.3?.XXX.XXX; ssh-keygen -R "[172.3?.XXX.XXX]:122"' on your client to prevent future ssh warnings.
Copying custom CA certificates from primary appliance ...
Success: Replica mode is configured against 172.3?.???.???.
To disable replica mode and undo these changes, run 'ghe-repl-teardown'.
Run 'ghe-repl-start' to start replicating from the newly configured primary.
 
admin@githubtest-testdomain.io:~$ ghe-repl-start
Verifying ssh connectivity with 172.3?.???.??? ...
Updating configuration...
Validating configuration
Updating configuration for githubtest-testdomain.io-primary (172.3?.???.???)
Configuration Updated
Configuration Phase 1
githubtest-testdomain.io-replica: Dec 04 07:49:43 Preparing storage device...
githubtest-testdomain.io-replica: Dec 04 07:49:45 Updating configuration...
githubtest-testdomain.io-replica: Dec 04 07:49:45 Reloading system services...
githubtest-testdomain.io-replica: Dec 04 07:50:04 Done!
githubtest-testdomain.io-primary: Dec 04 07:49:43 Preparing storage device...
githubtest-testdomain.io-primary: Dec 04 07:49:45 Updating configuration...
githubtest-testdomain.io-primary: Dec 04 07:49:45 Reloading system services...
githubtest-testdomain.io-primary: Dec 04 07:50:10 Done!
Configuration Phase 2
githubtest-testdomain.io-replica: Dec 04 07:50:13 Running migrations...
githubtest-testdomain.io-replica: Dec 04 07:50:13 Done!
githubtest-testdomain.io-primary: Dec 04 07:50:15 Running migrations...
githubtest-testdomain.io-primary: Dec 04 07:50:47 Done!
Configuration Phase 3
githubtest-testdomain.io-primary: Waiting for services to be active...
githubtest-testdomain.io-primary: Dec 04 07:51:06 Reloading application services...
githubtest-testdomain.io-primary: Dec 04 07:51:29 Done!
githubtest-testdomain.io-replica: Dec 04 07:50:48 Reloading application services...
githubtest-testdomain.io-replica: Dec 04 07:51:59 Done!
Finished cluster configuration
Success: replication is running for all services.
Run `ghe-repl-status' to monitor replication health and progress.
 
 
 
admin@githubtest-testdomain.io:~$ ghe-repl-status
OK: mysql replication is in sync
OK: redis replication is in sync
OK: elasticsearch cluster is in sync
OK: git replication is in sync
OK: pages replication is in sync
OK: alambic replication is in sync
OK: git-hooks replication is in sync
OK: consul replication is in sync
 
 
 
admin@githubtest-testdomain.io:~$ ghe-maintenance -u
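
The check of the ghe-repl-status output before unsetting maintenance mode can be scripted. This is a minimal sketch, assuming every healthy service line starts with "OK:" as in the output above. The function name repl_all_ok is made up for illustration and is not part of GHE.

```shell
# Hypothetical helper: reads ghe-repl-status output on stdin and succeeds
# only if at least one line was seen and every non-empty line starts with "OK:".
repl_all_ok() {
  seen=0
  while IFS= read -r line; do
    case "$line" in
      "OK:"*) seen=1 ;;                 # service reported in sync
      "") ;;                            # ignore empty lines
      *) printf 'not in sync: %s\n' "$line" >&2
         return 1 ;;                    # anything else counts as a failure
    esac
  done
  [ "$seen" -eq 1 ]
}

# Example use on the new replica (not executed here):
#   ghe-repl-status | repl_all_ok && ghe-maintenance -u
```

Empty output is treated as a failure on purpose, so that a broken ghe-repl-status call does not look like a healthy replica.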

 

HA (Disaster case - Primary EC2 terminated)

If the primary EC2 instance has been terminated, just run this command on the replica side:

  • ghe-repl-promote

Then change the Route 53 DNS record to the replica's IP address.
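
Before a hard failover it may be worth confirming that the primary really is unreachable on the administrative SSH port (122, the port shown in the ssh errors below). The following is a rough sketch only: the function name primary_reachable is hypothetical, the address in the example is a placeholder, and the default probe assumes an nc build that supports the -z and -w options.

```shell
# Hypothetical helper: succeeds if the primary answers the probe.
# $1 = primary host; any further arguments replace the default probe command.
primary_reachable() {
  host=$1
  shift
  if [ "$#" -eq 0 ]; then
    # Default probe: TCP connect to the admin SSH port with a 5 second timeout.
    set -- nc -z -w 5 "$host" 122
  fi
  "$@" >/dev/null 2>&1
}

# Example use on the replica (placeholder address, not executed here):
#   if primary_reachable 10.0.0.10; then
#     echo "primary still answers on port 122, use the planned procedure"
#   else
#     ghe-repl-promote
#   fi
```

Letting the caller pass its own probe command makes it possible to swap nc for another check without changing the function.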

 

admin@githubtest-testdomain.io-replica:~$ ghe-repl-promote
Warning: You are about to promote this Replica node
Promoting this Replica will tear down replication and enable maintenance mode on the current Primary.
All other Replicas need to be re-setup to use this new Primary server.
 
Proceed with promoting this appliance to Primary? [y/N] y
ssh: connect to host 172.3?.???.??? port 122: Connection timed out
Warning: Primary node is unavailable.
Warning: Performing hard failover without cleaning up on the primary side.
Stopping replication ...
  | Skipping Pages, Alambic, git-hooks and Git replication cleanup on primary ...
  | Stopping MySQL replication ...
  | Stopping Redis replication ...
  | Stopping Consul replication ...
  | Success: Replication was stopped for all services.
  | To disable replica mode and remove all replica configuration, run 'ghe-repl-teardown'.
Switching out of replica mode ...
  | ssh: connect to host 172.3?.???.??? port 122: Connection timed out
  | ssh: connect to host 172.3?.???.??? port 122: Connection timed out
  | ssh: connect to host 172.3?.???.??? port 122: Connection timed out
  | ssh: connect to host 172.3?.???.??? port 122: Connection timed out
  | ssh: connect to host 172.3?.???.??? port 122: No route to host
  | jq: error (at :0): Cannot index number with string "settings"
  | jq: error (at :0): Cannot index number with string "settings"
  | Success: Replication configuration has been removed.
  | Run `ghe-repl-setup' to re-enable replica mode.
Applying configuration and starting services ...
Success: Replica has been promoted to primary and is now accepting requests.

 
