DC/OS on GCP using the Universal Installer

Guide for DC/OS on GCP using the Universal Installer

This guide takes an operator through all the steps necessary for a successful installation of DC/OS using Terraform. If you are already familiar with the prerequisites, you can jump to Creating a cluster.

Prerequisites

  • Linux, macOS, or Windows
  • command-line shell terminal such as Bash or PowerShell
  • verified Google Cloud Platform (GCP) account with the necessary permissions

Install Terraform

IMPORTANT: Terraform has been updated to 0.12.x. The DC/OS Universal Installer currently supports only Terraform 0.11.x.

  1. Visit the Terraform releases page for bundled installations and support for Linux, macOS and Windows. Choose the latest 0.11 version.

    If you are on macOS with Homebrew installed, run the following commands:

    brew unlink terraform || true
    brew install tfenv
    tfenv install 0.11.14
    

    Windows users who have Chocolatey installed can run:

    choco install terraform --version 0.11.14 -y
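
Because only the 0.11.x series is supported, it can help to confirm which version is on your PATH before continuing. The following is a minimal sketch, not part of the Universal Installer: the function names are hypothetical, and it assumes the first line of `terraform version` has the form "Terraform v0.11.14".

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when the given version string
# belongs to the 0.11.x series.
is_supported_terraform_version() {
  case "$1" in
    0.11.*) return 0 ;;
    *)      return 1 ;;
  esac
}

# Extract the bare version number from the first line of
# `terraform version` (assumed format: "Terraform v0.11.14").
installed_terraform_version() {
  terraform version | head -n 1 | sed 's/^Terraform v//'
}

# Example usage (commented out so the file can be sourced safely):
# if is_supported_terraform_version "$(installed_terraform_version)"; then
#   echo "Terraform version is supported"
# else
#   echo "Please install Terraform 0.11.x" >&2
# fi
```

If the check fails after installing through tfenv, the requested version may have been installed but not selected as the active one.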
    

Get Application Default Credentials for authentication

You will need the Application Default Credentials for Terraform to authenticate against GCP. To understand more about how Terraform authenticates with Google, see the Terraform Google provider reference.

To receive Application Default Credentials:

  1. Run the following command:
$ gcloud auth application-default login
  2. Verify that you have Application Default Credentials by running the following command:
$ gcloud auth application-default print-access-token
EXAMPLE.EXAMPLE-1llO--ZEvh6gQ-qhpL0I3gHcCeDKG_EXAMPLE7WtAepmpp47c0RCv9e0Oq6QnpQ79RZlHKzOw69XMxI87M2Q

Set the GCP default region and project

The GCP provider requires you to export the Region (desired-gcp-region) and Project (desired-gcp-project) identifiers into environment variables even if those values are set in the gcloud-cli. You can set them easily in your terminal.

export GOOGLE_REGION="us-west1"
export GOOGLE_PROJECT="production-123"

Alternatively, you can set them in the configuration file you will create. For security, keep your credentials file out of version control.

provider "google" {
  version     = "~> 1.18.0"
  credentials = "${file("account.json")}"
  project     = "my-project-id"
  region      = "us-central1"
  zone        = "us-central1-c"
}
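
If you rely on the environment variables rather than the provider block, a small pre-flight check can catch a missing value before Terraform runs. This is only a sketch: `missing_gcp_env` is a hypothetical helper name, and it checks nothing beyond the two variables named in this section.

```shell
#!/bin/sh
# Hypothetical pre-flight check: prints the names of the required
# variables that are unset or empty (prints nothing when all are set).
missing_gcp_env() {
  missing=""
  for name in GOOGLE_REGION GOOGLE_PROJECT; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  # Drop the leading space before printing.
  printf '%s\n' "${missing# }"
}

# Example usage (commented out so the file can be sourced safely):
# missing="$(missing_gcp_env)"
# [ -z "$missing" ] || echo "Please export: $missing" >&2
```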

Verify you have a license key for Enterprise Edition

DC/OS Enterprise requires a valid license key provided by Mesosphere, which is passed into the main.tf configuration file as dcos_license_key_contents. If you do not set superuser credentials, the default username and password will be available for login:

Username: bootstrapuser
Password: deleteme

IMPORTANT: You should NOT use the default credentials in a production environment. When you create or identify an administrative account for the production environment, you also need to generate a password hash for the account.

To set superuser credentials for the first login, add the following values to your main.tf along with your license key. The password must be hashed with SHA-512.

dcos_superuser_username      = "superuser-name"
dcos_superuser_password_hash = "${file("./dcos_superuser_password_hash")}"
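
The guide does not show how to produce the hash file. One possible approach is sketched below under two assumptions that are not confirmed here: that the expected format is SHA-512 crypt (a string beginning with `$6$`), and that OpenSSL 1.1.1 or newer is available, since its `passwd` subcommand accepts the `-6` flag. The function name is hypothetical. Check the DC/OS Enterprise documentation for the recommended method before relying on this.

```shell
#!/bin/sh
# Hypothetical helper, assuming OpenSSL 1.1.1+ and that a SHA-512 crypt
# string ("$6$...") is the format DC/OS expects.
hash_superuser_password() {
  # -stdin makes openssl read the password from standard input, so it is
  # not passed as an argument to the openssl process.
  printf '%s\n' "$1" | openssl passwd -6 -stdin
}

# Example usage (commented out so the file can be sourced safely).
# The trailing newline is stripped as a precaution, because file() in
# Terraform returns the file contents verbatim:
# hash_superuser_password 'choose-a-strong-password' | tr -d '\n' > ./dcos_superuser_password_hash
```

Keep the resulting file out of version control, just like the license key and credentials files.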

Creating a cluster

  1. Create a local folder.
mkdir dcos-demo && cd dcos-demo
  2. Copy and paste the example code below into a new file and save it as main.tf in the local folder.

The example code below creates a DC/OS OSS 1.13.0 cluster on GCP with:

  • 1 Master
  • 2 Private Agents
  • 1 Public Agent

The example also specifies that the following output should be printed once cluster creation is complete:

  • masters-ips - Lists the DC/OS master nodes.
  • cluster-address - Specifies the URL you use to access DC/OS UI after the cluster is set up.
  • public-agents-loadbalancer - Specifies the URL of your publicly routable services.

variable "dcos_install_mode" {
  description = "specifies which type of command to execute. Options: install or upgrade"
  default = "install"
}

module "dcos" {
  source = "dcos-terraform/dcos/gcp"
  version = "~> 0.1.0"

  cluster_name        = "my-open-dcos"
  ssh_public_key_file = "~/.ssh/id_rsa.pub"

  num_masters        = "1"
  num_private_agents = "2"
  num_public_agents  = "1"

  dcos_version = "1.13.0"

  # Enterprise users: uncomment this section and comment out the dcos_variant line below
  # dcos_variant              = "ee"
  # dcos_license_key_contents = "${file("./license.txt")}"
  # Make sure to set your credentials if you do not want the defaults
  # dcos_superuser_password_hash = "${file("./dcos_superuser_password_hash.sha512")}"
  # dcos_superuser_username = "admin"

  # Default is DC/OS
  dcos_variant = "open"

  # Reads the install mode set above
  dcos_install_mode = "${var.dcos_install_mode}"
}

output "masters-ips" {
  value       = "${module.dcos.masters-ips}"
}

output "cluster-address" {
  value       = "${module.dcos.masters-loadbalancer}"
}

output "public-agents-loadbalancer" {
  value = "${module.dcos.public-agents-loadbalancer}"
}

For simplicity in this example, the configuration values are hard-coded. If you have a desired cluster name or number of masters/agents, you can adjust the values directly in the main.tf configuration file.

You can find additional input variables and their descriptions in the documentation for the dcos-terraform/dcos/gcp module.

Initialize terraform and create a cluster

  1. Now the action of actually creating your cluster and installing DC/OS begins. First, initialize the project’s local settings and data. Make sure you are still working in the same folder where you created your main.tf file, and run the initialization.

    terraform init
    
    Terraform has been successfully initialized!
    
    You may now begin working with Terraform. Try running "terraform plan" to see
    any changes that are required for your infrastructure. All Terraform commands
    should now work.
    
    If you ever set or change modules or backend configuration for Terraform,
    rerun this command to reinitialize your environment. If you forget, other
    commands will detect it and remind you to do so if necessary.
    

    Note: If Terraform is not able to connect to your provider, ensure that you are logged in and have exported your credentials and the necessary region information for your cloud provider.

  2. After Terraform has been initialized, the next step is to run the execution planner and save the plan to a static file - in this case, plan.out.

    terraform plan -out=plan.out
    

    Writing the execution plan to a file allows us to pass it to the apply command below and helps guarantee that exactly the planned actions are applied. Note that this file is readable ONLY by Terraform.

    Afterwards, we should see a message like the one below, confirming that we have successfully saved to the plan.out file. This file should appear in your dcos-demo folder alongside main.tf.

    Plan: 74 to add, 3 to change, 0 to destroy.
    
    ------------------------------------------------------------------------
    
    This plan was saved to: plan.out
    
    To perform exactly these actions, run the following command to apply:
      terraform apply "plan.out"
    

    Every time you run terraform plan, the output details the resources your plan will add, change, or destroy. Since we are creating our DC/OS cluster for the very first time, the example output above tells us that the plan will add 74 pieces of infrastructure/resources.

  3. The next step is to get Terraform to build/deploy our plan. Run the command below.

    terraform apply plan.out
    

    Sit back and enjoy! The infrastructure of your DC/OS cluster is being created while you watch. This may take a few minutes.

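    Once Terraform has finished applying the plan, you should see output similar to the following. The addresses shown are only illustrative; your outputs will contain the addresses of your own cluster.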

    Apply complete! Resources: 74 added, 0 changed, 0 destroyed.
    
    Outputs:
    
    cluster-address = testing-123-958581895.us-east-1.elb.amazonaws.com
    masters-ips = [
        3.93.239.91
    ]
    public-agents-loadbalancer = ext-testing-123-40f11d1227e88057.elb.us-east-1.amazonaws.com
    

    And congratulations - you’re up and running!
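
The plan summary line printed in step 2 can also be checked mechanically, for example in a script that should refuse to apply a plan that destroys anything. The sketch below is an illustration rather than part of the installer: the function names are hypothetical, and it assumes the summary line has the exact "Plan: N to add, N to change, N to destroy." form shown above, without color codes.

```shell
#!/bin/sh
# Hypothetical helper: reads plan output on stdin and prints the three
# counts from the summary line as "ADD CHANGE DESTROY".
plan_counts() {
  sed -n 's/^[[:space:]]*Plan: \([0-9][0-9]*\) to add, \([0-9][0-9]*\) to change, \([0-9][0-9]*\) to destroy.*/\1 \2 \3/p'
}

# Succeeds only when a summary line was found and nothing is destroyed.
plan_is_non_destructive() {
  counts="$(plan_counts)"
  [ -n "$counts" ] || return 1
  # Split "ADD CHANGE DESTROY" into $1, $2 and $3.
  set -- $counts
  [ "$3" -eq 0 ]
}

# Example usage (commented out so the file can be sourced safely):
# terraform plan -no-color -out=plan.out | tee plan.log
# plan_is_non_destructive < plan.log && terraform apply plan.out
```

The `-no-color` flag keeps terminal escape sequences out of the captured output so that the pattern can match.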

Logging in to DC/OS

  1. To log in and start exploring your cluster, navigate to the cluster-address listed in the output of the CLI. From there, you can either choose your provider to create the superuser account (Open Source) or log in with your specified Enterprise credentials (Enterprise).

Scaling Your Cluster

Terraform makes it easy to scale your cluster by adding agents (public or private) once the initial cluster has been created. Follow the instructions below.

  1. Increase the value for the num_private_agents and/or num_public_agents in your main.tf file. In this example we are going to scale our cluster from 2 private agents to 3, changing just that line, and saving the file.

    num_masters        = "1"
    num_private_agents = "3"
    num_public_agents  = "1"
    
  2. Now that we’ve made changes to our main.tf, we need to re-run our new execution plan.

    terraform plan -out=plan.out
    

    Doing this helps us to ensure that our state is stable and to confirm that we will only be creating the resources necessary to scale our Private Agents to the desired number.

    You should see a plan summary similar to the one shown when the cluster was created. This time, 3 resources will be added as a result of scaling up the cluster's Private Agents (1 instance resource and 2 null resources, which handle the DC/OS installation and prerequisites behind the scenes).

  3. Now that our plan is set, just like before, let’s get Terraform to build/deploy it.

    terraform apply plan.out
    

    Once the apply completes, check your DC/OS cluster to ensure the additional agents have been added.

    You should now see 4 total nodes connected in the DC/OS UI.
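
The edit in step 1 can also be scripted, which is convenient when scaling is part of an automated workflow. The following is a minimal sketch with a hypothetical helper name. It assumes that num_private_agents appears in main.tf on a line of its own with a quoted integer value, as in the example configuration.

```shell
#!/bin/sh
# Hypothetical helper: set_private_agents FILE COUNT
# Rewrites the quoted num_private_agents value in FILE to COUNT.
set_private_agents() {
  file="$1"
  count="$2"
  # Reject anything that is not a non-negative integer.
  case "$count" in
    ''|*[!0-9]*) echo "count must be a non-negative integer" >&2; return 1 ;;
  esac
  tmp="$(mktemp)" || return 1
  # Keep the original indentation and spacing; replace only the value.
  sed "s/^\([[:space:]]*num_private_agents[[:space:]]*=[[:space:]]*\)\"[0-9]*\"/\1\"$count\"/" "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Copy the result back so the file keeps its permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example usage (commented out so the file can be sourced safely):
# set_private_agents main.tf 3
# terraform plan -out=plan.out
```

Editing the file this way still requires the plan and apply steps above; the script changes only the desired count.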

Upgrading Your Cluster

Terraform also makes it easy to upgrade our cluster to a newer version of DC/OS. If you are interested in learning more about the upgrade procedure that Terraform performs, please see the official DC/OS Upgrade documentation.

  1. To perform an upgrade, go back to main.tf and change the DC/OS version (dcos_version) to a release newer than the one the cluster is currently running. The line to change looks like the following; substitute your target version for the value shown.

    dcos_version = "1.13.0"
    
  2. Re-run the execution plan. Terraform will notice the change in version and plan accordingly.

    terraform plan -out=plan.out
    

    The plan output should reflect the version change, with your main.tf now set for normal operations on the new version of DC/OS.

  3. Apply the plan.

    terraform apply plan.out
    

    Once the apply completes, you can verify that the cluster was upgraded via the DC/OS UI.

Deleting Your Cluster

If you want to destroy your cluster, then use the following command and wait for it to complete.

terraform destroy

Important: Running this command will cause your entire cluster and all of its associated resources to be destroyed. Only run this command if you are absolutely sure you no longer need access to your cluster.

You will be required to enter yes to verify.