11.1 Portal upgrade failure #490

Open
PleaseStopAsking opened this issue Aug 9, 2023 · 0 comments

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request

Module Version

  • 4.1.0

Affected Resource(s)

  • Invoke-ArcGISConfiguration
    • ArcGIS_PortalUpgrade

Configuration Files

{
    "Notes": {
        "Updated": "2023-08-08",
        "Version": "0.1.0",
        "ArcGISModule": "4.1.0"
    },
    "AllNodes": [
        {
            "NodeName": "redacted",
            "Role": [
                "Portal"
            ]
        },
        {
            "NodeName": "redacted",
            "Role": [
                "Portal"
            ]
        }
    ],
    "ConfigData": {
        "Version": "11.1",
        "OldVersion": "10.9.1",
        "ServerContext": "arcgis",
        "PortalContext": "arcgis",
        "DownloadPatches": true,
        "Credentials": {
            "ServiceAccount": {
                "UserName": "arcgis",
                "Password": "redacted",
                "IsMSAAccount": false,
                "IsDomainAccount": false
            }
        },
        "Portal": {
            "LicenseFilePath": "C:\\AllUTs_AllAddOnApps.json",
            "PortalLicenseUserTypeId": "creatorUT",
            "EnableAutomaticAccountCreation": false,
            "DisableServiceDirectory": true,
            "DisableAnonymousAccess": true,
            "EnableHSTS": true,
            "Installer": {
                "Path": "C:\\Portal_for_ArcGIS_Windows_111_185219.exe",
                "PatchesDir": "C:\\ArcGISPatches",
                "InstallDir": "C:\\Program Files\\ArcGIS\\Portal",
                "ContentDir": "C:\\arcgisportal"
            },
            "ContentDirectoryLocation": "\\\\redacted\\arcgisportal\\content",
            "PortalAdministrator": {
                "UserName": "portaladmin",
                "Email": "[email protected]",
                "Password": "redacted",
                "SecurityQuestionIndex": 1,
                "SecurityAnswer": "redacted"
            }
        }
    }
}

Expected Behavior

The HA portal deployment is successfully upgraded

Actual Behavior

The upgrade of the HA portal deployment fails on the PortalPostUpgrade step

Description

We attempted to upgrade an HA Enterprise deployment from 10.9.1 to 11.1 and found that the process fails. Our testing so far suggests the failure is caused by the order of the nodes defined in the JSON config. Specifically, the Portal nodes need to be ordered so that the primary Portal instance in the HA site is listed first.

Based on the logic in the Invoke-PortalUpgradeScript function, the primary and secondary machines are determined by the order in which they are listed in the JSON config. This determination is then used to kick off a step on the assumed secondary Portal, which only updates the Portal DataStore host identifier properties file at C:\Program Files\ArcGIS\Portal\framework\runtime\ds\framework\etc\hostidentifier.properties and then restarts the Portal service. The actual post-upgrade step is then carried out on the assumed primary.
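
To illustrate the behavior described above, here is a minimal Python sketch of order-based role assignment. It is not the module's code (the module is PowerShell); the function names `assumed_roles` and `order_matches_site` and the node names `portal-a` and `portal-b` are hypothetical, and the only thing taken from the issue is the idea that the first Portal node listed is treated as primary.

```python
def assumed_roles(all_nodes):
    """Split Portal nodes into (assumed primary, assumed secondaries) by list order."""
    portal_names = [n["NodeName"] for n in all_nodes if "Portal" in n.get("Role", [])]
    if not portal_names:
        raise ValueError("no Portal nodes in AllNodes")
    # Assumption from the issue text: the first Portal node listed is treated as primary.
    return portal_names[0], portal_names[1:]


def order_matches_site(all_nodes, actual_primary):
    """True when the node assumed to be primary is the site's actual primary."""
    assumed_primary, _ = assumed_roles(all_nodes)
    return assumed_primary == actual_primary


# Hypothetical node names; the real ones are redacted in the config above.
all_nodes = [
    {"NodeName": "portal-a", "Role": ["Portal"]},
    {"NodeName": "portal-b", "Role": ["Portal"]},
]

print(assumed_roles(all_nodes))
print(order_matches_site(all_nodes, "portal-b"))
```

If `portal-b` is the actual primary, the check returns `False`, which corresponds to the failing arrangement: the restart step would land on the real primary.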

The documentation for upgrading Portal states "... then start the upgrade process on either machine", which seems to indicate that the failure via DSC may be related to the restart of the assumed secondary Portal when that machine is in fact the primary Portal according to the actual site configuration.

We have reproduced this in two separate HA deployments and found that we can work around it by ensuring the primary Portal machine is listed first in the JSON config.
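
The workaround can be applied to a config before running the upgrade. The following is a rough Python sketch, assuming the config has the `AllNodes` shape shown earlier; the helper name `put_primary_portal_first` and the node names are hypothetical, and the actual primary still has to be looked up manually in the Portal admin site.

```python
import copy


def put_primary_portal_first(config, primary_node_name):
    """Return a copy of config with the primary listed first among the Portal nodes.

    Non-Portal nodes keep their positions; only the slots occupied by
    Portal nodes are reordered.
    """
    updated = copy.deepcopy(config)
    nodes = updated["AllNodes"]
    portal_slots = [i for i, n in enumerate(nodes) if "Portal" in n.get("Role", [])]
    portal_nodes = [nodes[i] for i in portal_slots]
    if not any(n["NodeName"] == primary_node_name for n in portal_nodes):
        raise ValueError(f"{primary_node_name!r} is not a Portal node in AllNodes")
    # sorted() is stable and False sorts before True, so the primary moves to
    # the front while the remaining Portal nodes keep their relative order.
    reordered = sorted(portal_nodes, key=lambda n: n["NodeName"] != primary_node_name)
    for slot, node in zip(portal_slots, reordered):
        nodes[slot] = node
    return updated


# Hypothetical node names; the real ones are redacted in the config above.
config = {
    "AllNodes": [
        {"NodeName": "portal-a", "Role": ["Portal"]},
        {"NodeName": "portal-b", "Role": ["Portal"]},
    ]
}
fixed = put_primary_portal_first(config, "portal-b")
print([n["NodeName"] for n in fixed["AllNodes"]])
```

The function returns a copy rather than editing in place so the original config can be kept for comparison.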

We are not sure whether Portal attempts a half-baked failover when the primary goes down for a restart during an upgrade, but if it does, that could explain what we are seeing in our testing.

Steps to Reproduce

  1. Deploy an HA portal environment at 10.9.1 with DSC
  2. Within the config provided above (used for upgrading to 11.1), set the second node in the array to the primary machine in the HA portal site. Verify which machine is listed as primary via .../portaladmin/machines
  3. Start the upgrade, which should fail on the PortalPostUpgrade step.
  • The PortalPostUpgrade step gets to the final phase (Upgrade standby machine) and then errors out with:
     {"lastUpdated":1691524771647,"name":"Upgrade database","startTime":1691524615102,"state":"completed"},{"lastUpdated":1691524824307,"name":"Migrate configuration settings","startTime":1691524822608,"state":"completed"},{"lastUpdated":1691524919123,"name":"Update configuration settings","startTime":1691524877634,"state":"completed"},{"lastUpdated":1691524877634,"name":"Configure index service","startTime":1691524844022,"state":"completed"},{"lastUpdated":1691525041227,"name":"Reindex","startTime":1691524920167,"state":"completed"},{"lastUpdated":1691525930553,"name":"Upgrade standby machine","startTime":1691525210616,"state":"failed"}],"messages":["Index Service configuration failed."],"recheckAfterSeconds":20}
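
For triage, the stage list in the output above can be scanned for the failed stage. This is a small Python sketch over three of the stage entries copied from that output; the helper names are hypothetical, the name of the enclosing array is not shown in the truncated output so the code works on a plain list, and reading the timestamps as epoch milliseconds is an assumption.

```python
# Stage entries copied from the status output above (last three of six).
STAGES = [
    {"lastUpdated": 1691524877634, "name": "Configure index service",
     "startTime": 1691524844022, "state": "completed"},
    {"lastUpdated": 1691525041227, "name": "Reindex",
     "startTime": 1691524920167, "state": "completed"},
    {"lastUpdated": 1691525930553, "name": "Upgrade standby machine",
     "startTime": 1691525210616, "state": "failed"},
]


def failed_stages(stages):
    """Names of stages whose state is 'failed', in the order reported."""
    return [s["name"] for s in stages if s.get("state") == "failed"]


def stage_duration_ms(stage):
    """Elapsed time between startTime and lastUpdated (assumed epoch milliseconds)."""
    return stage["lastUpdated"] - stage["startTime"]


for stage in STAGES:
    print(f'{stage["name"]}: {stage["state"]} after {stage_duration_ms(stage) / 1000:.0f} s')
print("failed:", failed_stages(STAGES))
```

Under that timestamp assumption, the standby upgrade stage ran for roughly twelve minutes before it was marked as failed.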
    

Important Factoids

References
