[Upgrade Ubuntu] be_prod #959

Open
1 of 50 tasks
Tracked by #157
dacook opened this issue Nov 4, 2024 · 1 comment

dacook commented Nov 4, 2024

1. Setting up the new server

  • Check old server config for any additional services to be aware of. Document any necessary steps for migration. Eg:
    • ls /etc/nginx/sites-enabled
    • systemctl --state=running
  • Hosting: provision new server with Ubuntu 20
  • DNS: add temporary domain (eg prod2.openfoodnetwork.org)
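Once both servers exist, the service check above can be turned into a comparison. The following is a minimal sketch, assuming you have saved one list per server as a plain text file with one entry per line; the function name `missing_on_new` and the file names are hypothetical.

```shell
# Print entries that appear in the old server's list but not in the new one.
# Both arguments are plain text files with one entry per line.
missing_on_new() {
  old_sorted=$(mktemp) || return 1
  new_sorted=$(mktemp) || return 1
  sort -u "$1" > "$old_sorted"
  sort -u "$2" > "$new_sorted"
  # comm -23 keeps only the lines unique to the first file
  comm -23 "$old_sorted" "$new_sorted"
  rm -f "$old_sorted" "$new_sorted"
}

# One possible way to collect a list on each server (run there, then copy the file):
#   systemctl list-units --type=service --state=running --no-legend --plain \
#     | awk '{print $1}' > services.txt
# Then, locally:
#   missing_on_new old_services.txt new_services.txt
```

Anything printed is a service that runs on the old server only and may need a migration step.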

config

  • Add temporary name to inventory/hosts
  • Review host_vars/x/config.yml, clean up if needed
    • Make a copy for the temp hostname, add temp domain to bottom of certbot_domains
  • Review ofn-secrets:x_prod/secrets.yml, clean up if needed
    • Change to shared bugsnag projects
    • Don't bother making a copy of this one

setup

Enable passthrough on current server to allow new server to generate a certificate:

  • ansible-playbook playbooks/letsencrypt_proxy.yml -l x_prod -e "proxy_target=<new_ip>"

Then set up the new server. Ensure you have the correct secrets (current secrets are usually fine).
ansible-playbook -l x_prod2 -e "@../ofn-secrets/x_prod/secrets.yml" playbooks/

  • setup.yml
  • provision.yml
  • deploy.yml -e git_version=releases/latest (untested, curious to see if releases/latest works)
  • db_integrations (Permit DB access for n8n, Metabase)
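The playbooks above share the same `-l` and `-e` arguments, so they can be run from a small loop that stops at the first failure. This is a sketch only: `run_playbooks` and the `ANSIBLE_PLAYBOOK` override are hypothetical names, not part of the repository, and playbooks that need extra variables (such as `deploy.yml` with `git_version`) are better run by hand.

```shell
# Run the given playbooks in order against one host, stopping at the first failure.
# Set ANSIBLE_PLAYBOOK=echo for a dry run that only prints the arguments.
run_playbooks() {
  limit="$1"
  secrets="$2"
  shift 2
  for playbook in "$@"; do
    "${ANSIBLE_PLAYBOOK:-ansible-playbook}" -l "$limit" -e "@$secrets" "playbooks/$playbook" || return 1
  done
}

# Usage:
#   run_playbooks x_prod2 ../ofn-secrets/x_prod/secrets.yml setup.yml provision.yml
```

Stopping at the first failure avoids provisioning on top of a half-finished setup.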

initial migration

  • Ensure sidekiq is disabled, to avoid creating subscription orders when data is loaded:
    sudo systemctl stop sidekiq && sudo systemctl disable sidekiq
  • Set up direct SSH access for ofn-admin and openfoodnetwork as per guide
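Before loading data, it may be worth confirming that sidekiq really is stopped. A minimal guard, assuming systemd manages the service; `ensure_stopped` and the `SYSTEMCTL` override are hypothetical names introduced for illustration.

```shell
# Return non-zero if any of the named systemd units is still active.
# SYSTEMCTL can be overridden for testing; it defaults to the real systemctl.
ensure_stopped() {
  for unit in "$@"; do
    if "${SYSTEMCTL:-systemctl}" is-active --quiet "$unit"; then
      echo "$unit is still active" >&2
      return 1
    fi
  done
}

# Usage on the new server, before running the transfer playbooks:
#   ensure_stopped sidekiq && echo "safe to load data"
```

The same guard could be reused during switchover with several unit names at once.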

ansible-playbook -l x_prod -e rsync_to=x_prod2 playbooks/

  • db_transfer.yml
  • transfer_assets.yml

Make sure to clear cache so that instance settings are applied:
cd ~/apps/openfoodnetwork/current; bin/rails runner -e production "Rails.cache.clear"

2. Testing

  • test reboot
  • send test mail (/admin/mail_methods/edit).
  • terms of service file: /admin/terms_of_service_files
  • shop catalogue displays correctly, with images; add to cart, begin checkout, login
  • note: check cookies if login won't work
  • Check integrations
    • Payments (check Stripe connect status /admin/stripe_connect_settings/edit)
    • New Relic
    • Bugsnag

3. Migration

preparation

  • new server: RAILS_ENV=production bin/rake db:reset (important: make sure you're on the new server!)
  • deploy.yml -l x_prod2 -e "git_version=vX.Y.Z" matching version with current prod
  • old server: make a tiny data change to verify later (eg add . in meta description /admin/general_settings/edit)

switchover: old server

  • 🚧 maintenance_mode.yml
  • sudo systemctl stop sidekiq redis-jobs puma
  • Transfer /var/lib/redis-jobs/dump.rdb to new server (see guide)
  • db_transfer.yml ~3min
  • sudo systemctl stop postgresql (ensure other integrations no longer touch it)
  • transfer_assets.yml just in case
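To confirm that the Redis dump arrived intact, one option is to compare checksums on both servers. This is a sketch under the assumption that `sha256sum` (GNU coreutils) is available on both machines; `file_digest` is a hypothetical helper, and the guide's own transfer procedure is not reproduced here.

```shell
# Print only the SHA-256 digest of a file, without the file name.
file_digest() {
  sha256sum "$1" | cut -d ' ' -f 1
}

# Usage: run on each server (sudo may be needed to read the file),
# then compare the two digests:
#   file_digest /var/lib/redis-jobs/dump.rdb
```

If the digests differ, repeat the transfer before starting redis-jobs on the new server.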

switchover: new server

  • sudo systemctl restart puma; sudo systemctl start sidekiq redis-jobs
  • Rails.cache.clear (or migrate redis-cache/dump.rdb also)
  • ⏭️ temporary_proxy.yml -e 'proxy_target=<ip>' redirect traffic to new prod
    • Note: this doesn't include webservices, and doesn't handle images. So it's a very short-term fix if at all.
    • Use a hosts file entry to test a direct connection
  • Check there are no alarm bells, eg:
    • ~/apps/openfoodnetwork/current/logs/production.log and sidekiq.log
    • tiny data change is present. undo it.
    • shopfront and checkout look good
    • upload a product image
    • get confirmation from local team
  • Update DNS to point to new server
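As an alternative to editing the hosts file, curl's `--resolve` option can send a request for the production hostname directly to the new server's IP before DNS is updated. A minimal sketch: `http_status_direct`, the `CURL` override, and the hostname and IP in the usage line are placeholders, not real values.

```shell
# Print the HTTP status code returned by a specific IP for a given hostname,
# bypassing DNS. CURL can be overridden for testing; it defaults to curl.
http_status_direct() {
  host="$1"
  ip="$2"
  path="${3:-/}"
  "${CURL:-curl}" -sS -o /dev/null -w '%{http_code}' \
    --resolve "$host:443:$ip" "https://$host$path"
}

# Usage (placeholder hostname and documentation-range IP):
#   http_status_direct shop.example.org 203.0.113.10 /
```

Because the request still uses the real hostname, the TLS certificate and virtual host are exercised the same way they will be after the DNS change.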

4. Cleanup (after 48hrs)

Rollback plan

  • If an error occurs before the temporary proxy is active, and can't be resolved quickly, then restore service back to current server
  • If an error occurs after the proxy is active, users may have interacted with the new server (eg made payments).
    • if serious, consider putting into maintenance mode (and stop sidekiq) to avoid further changes
    • otherwise seek to resolve issue in-place.
@dacook dacook mentioned this issue Nov 4, 2024
9 tasks
@github-project-automation github-project-automation bot moved this to All the things 💤 in OFN Delivery board Nov 4, 2024
@dacook dacook changed the title be_prod [Upgrade Ubuntu] be_prod Nov 4, 2024
@dacook dacook self-assigned this Nov 20, 2024
@dacook dacook moved this from All the things 💤 to In Progress ⚙ in OFN Delivery board Nov 20, 2024

dacook commented Nov 26, 2024

Vincent has left OMdM and I have had this email from his manager - maybe you could contact him:

On Fri, 22 Nov 2024 08:20:45 +0000 Emmanuel Bawin <[email protected]> wrote
Dear Nick,
Thanks for you message.
Indeed Vincent do not work anymore for us.
I will meet our different Belgian partners soon to clarify the repartition of roles.
Until this, I will be the point of contact.
