Index your Atlassian Confluence Cloud contents using the Amazon Q Confluence Cloud connector for Amazon Q Business
Amazon Q Business is a generative artificial intelligence (AI)-powered assistant designed to enhance enterprise operations. It’s a fully managed service that helps provide accurate answers to users’ questions while honoring the security and access restrictions of the content. It can be tailored to your specific business needs by connecting to your company’s information and enterprise systems using built-in connectors to a variety of enterprise data sources. Amazon Q Business enables users in various roles, such as marketing managers, project managers, and sales representatives, to have tailored conversations, solve business problems, generate content, take action, and more, through a web interface. This service aims to help make employees work smarter, move faster, and drive significant impact by providing immediate and relevant information to help them with their tasks.
One such enterprise data repository you can use to store content is Atlassian Confluence. Confluence is a team workspace that provides a place to create and collaborate on various projects, products, or ideas. Team spaces help your teams structure, organize, and share work, so each user has visibility into the institutional knowledge of the enterprise and access to the information they need or answers to the questions they have.
There are two Confluence offerings:
- Cloud – This is offered as a software as a service (SaaS) product. It’s always on and continuously updated.
- Data Center (self-managed) – Here, you host Confluence on your infrastructure, which may be on premises or the cloud, allowing you to keep data within your chosen environment and manage it yourself.
Your users may need to get answers in Amazon Q Business from the content in your Atlassian Confluence Cloud instance as part of their work. To enable this, you configure an Amazon Q Confluence Cloud connector. One of the configuration steps is to set up authentication so that the connector can authenticate with Confluence (Cloud) and then index the relevant content.
This post covers the steps to configure the Confluence Cloud connector for Amazon Q Business.
Types of documents
When you connect Amazon Q to a data source, what Amazon Q considers—and crawls—as a document varies by connector. The Confluence Cloud connector crawls the following as documents:
- Spaces – Each space is considered a single document.
- Pages – Each page is considered a single document.
- Blogs – Each blog is considered a single document.
- Comments – Each comment is considered a single document.
- Attachments – Each attachment is considered a single document.
Metadata
Every document has structural attributes—or metadata—attached to it. Document attributes can include information such as document title, document author, time created, time updated, and document type.
When you connect Amazon Q Business to a data source, it automatically maps specific data source document attributes to fields within an Amazon Q Business index. If a document attribute in your data source doesn’t have an attribute mapping already available, or if you want to map additional document attributes to index fields, use the custom field mappings to specify how a data source attribute maps to an Amazon Q Business index field. You create field mappings by editing your data source after your application and retriever are created.
To learn more about the supported entities and the associated reserved and custom attributes for the Amazon Q Confluence connector, refer to Amazon Q Business Confluence (Cloud) data source connector field mappings.
Authentication types
An Amazon Q Business application requires you to use AWS IAM Identity Center to manage user access. Although it’s recommended to have an IAM Identity Center instance configured (with users federated and groups added) before you start, you can also choose to create and configure an IAM Identity Center instance for your Amazon Q Business application using the Amazon Q console.
You can also add users to your IAM Identity Center instance from the Amazon Q Business console, if you aren’t federating identity. When you add a new user, make sure that the user is enabled in your IAM Identity Center instance and they have verified their email ID. They need to complete these steps before they can log in to your Amazon Q Business web experience.
Your identity source in IAM Identity Center defines where your users and groups are managed. After you configure your identity source, you can look up users or groups to grant them single sign-on access to AWS accounts, applications, or both.
You can have only one identity source per organization in AWS Organizations. You can choose one of the following as your identity source:
- IAM Identity Center directory – When you enable IAM Identity Center for the first time, it’s automatically configured with an IAM Identity Center directory as your default identity source. This is where you create your users and groups, and assign their level of access to your AWS accounts and applications.
- Active Directory – Choose this option if you want to continue managing users in either your AWS Managed Microsoft AD directory using AWS Directory Service or your self-managed directory in Active Directory (AD).
- External Identity Provider – Choose this option if you want to manage users in other external identity providers (IdPs) through the Security Assertion Markup Language (SAML) 2.0 standard, such as Okta.
Access control lists
Amazon Q Business connectors index access control list (ACL) information that’s attached to a Confluence document along with the document itself. For document ACLs, Amazon Q Business indexes the following:
- User email address
- Group name for the local group
- Group name for the federated group
When you connect a Confluence (Cloud) data source to Amazon Q Business, the connector crawls ACL (user and group) information attached to a document from your Confluence (Cloud) instance. The information is used to determine which content can be used to construct chat responses for a given user, according to the end-user’s document access permissions.
You configure user and group access to Confluence spaces using the space permissions page in Confluence. Similarly, for pages and blogs, you use the restrictions page. For more information about space permissions, see Space Permissions Overview on the Confluence Support website. For more information about page and blog restrictions, see Page Restrictions on the Confluence Support website.
An Amazon Q Business connector updates any changes in ACLs each time that your data source content is crawled. To capture ACL changes to make sure that the right end-users have access to the right content, re-sync your data source regularly.
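A re-sync can also be started programmatically. The following is a minimal sketch, assuming you use the AWS SDK for Python (Boto3) and its qbusiness client, whose start_data_source_sync_job operation takes the application, index, and data source IDs. The helper name start_resync and the ID values are illustrative placeholders, not part of the connector.

```python
def start_resync(client, application_id, index_id, data_source_id):
    """Start a data source sync job so that ACL changes are picked up.

    `client` is expected to be a Boto3 Amazon Q Business client, created
    with boto3.client("qbusiness"). It is passed in so the helper can be
    exercised without AWS credentials.
    """
    response = client.start_data_source_sync_job(
        applicationId=application_id,
        indexId=index_id,
        dataSourceId=data_source_id,
    )
    # The service identifies the started job by an execution ID.
    return response.get("executionId")


# Example usage (requires boto3 and valid AWS credentials):
#   import boto3
#   client = boto3.client("qbusiness")
#   start_resync(client, "your-application-id", "your-index-id", "your-data-source-id")
```

In practice, you may prefer to set a sync schedule on the data source instead of calling the API yourself.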
Identity crawling for Amazon Q Business User Store
As stated earlier, Amazon Q Business crawls ACL information at the document level from supported data sources. In addition, Amazon Q Business crawls and stores principal information within each data source (local user alias, local group, and federated group identity configurations) into the Amazon Q Business User Store. This is useful when your application is connected to multiple data sources with different authorization and authentication systems, but you want to create a unified, access-controlled chat experience for your end-users.
Amazon Q Business internally maps the local user and group IDs attached to the document, to the federated identities of users and groups. Mapping identities streamlines user management and speeds up chat responses by reducing ACL information retrieval time during chat requests. Identity crawling, along with the authorization feature, helps filter and generate web experience content restricted by end-user context. For more information about this process, see Understanding Amazon Q Business User Store.
The group and user IDs are mapped as follows:
- _group_ids – Group names are present on spaces, pages, and blogs where there are restrictions. They’re mapped from the name of the group in Confluence. Group names are always lowercase.
- _user_id – Usernames are present on the space, page, or blog where there are restrictions. They’re mapped depending on the type of Confluence instance that you’re using. For Confluence Cloud, the _user_id is the account ID of the user.
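To make the mapping concrete, the following is an illustrative sketch only. It is not how Amazon Q Business implements authorization. It assumes a hypothetical DocumentAcl structure that holds the _user_id values (Confluence account IDs) and _group_ids values (lowercase group names) attached to a document, and checks whether a given user matches either one.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set


@dataclass
class DocumentAcl:
    """Hypothetical container for the ACL attached to one document."""
    user_ids: Set[str] = field(default_factory=set)   # Confluence account IDs
    group_ids: Set[str] = field(default_factory=set)  # group names, lowercase


def normalize_group(name: str) -> str:
    # Group names are always stored in lowercase, so normalize before comparing.
    return name.strip().lower()


def can_view(acl: DocumentAcl, user_id: str, user_groups: Iterable[str]) -> bool:
    """Return True if the user or one of the user's groups appears in the ACL."""
    if user_id in acl.user_ids:
        return True
    groups = {normalize_group(g) for g in user_groups}
    return bool(groups & acl.group_ids)


acl = DocumentAcl(user_ids={"account-id-1"}, group_ids={"engineering"})
print(can_view(acl, "account-id-2", ["Engineering"]))
```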
Overview of solution
With Amazon Q Business, you can configure multiple data sources to provide a central place to search across your document repository. For our solution, we demonstrate how to index a Confluence repository using the Amazon Q Business connector for Confluence. In this post, we walk through the following steps:
- Configure an Amazon Q Business Application.
- Connect Confluence (Cloud) to Amazon Q Business.
- Index the data in the Confluence repository.
- Run a sample query to test the solution.
Prerequisites
Before you begin using Amazon Q Business for the first time, complete the following tasks:
- Set up your AWS account.
- Optionally, install the AWS Command Line Interface (AWS CLI).
- Optionally, set up the AWS SDKs.
- Consider AWS Regions and endpoints.
- Set up required permissions.
- Enable and configure an IAM Identity Center instance.
For more information, see Setting up for Amazon Q Business.
To set up the Amazon Q Business connector for Confluence, you need to complete additional prerequisites. For more information, see Prerequisites for connecting Amazon Q Business to Confluence (Cloud).
Create an Amazon Q Business application with the Confluence Cloud connector
As the first step towards creating a generative AI assistant, you configure an application. Then you select and create a retriever and connect any data sources. After this, you grant end-users access to the application through the preferred identity provider, IAM Identity Center. Complete the following steps:
- On the Amazon Q Business console, choose Get started.
- On the Applications page, choose Create application.
- Enter a name for your application, select the level of service access, and connect to IAM Identity Center. (Note: The IAM Identity Center instance does not have to be in the same Region as Amazon Q Business.)
- Choose Create.
For additional details on configuring the Amazon Q application and connecting to IAM Identity Center, refer to Creating an Amazon Q Business application environment.
- Select your retriever and index provisioning options.
- Choose Next.
For additional details on creating and selecting a retriever, refer to Creating and selecting a retriever for an Amazon Q Business application.
- Connect to Confluence as your data source.
- Enter a name and description.
- Select Confluence Cloud as the source and enter your Confluence URL.
- There are two options for Authentication: Basic authentication and OAuth 2.0 authentication. Select the best option depending on your use case.
Before you connect Confluence (Cloud) to Amazon Q Business, you need to create and retrieve the Confluence (Cloud) credentials you will use to connect Confluence (Cloud) to Amazon Q Business. You also need to add any permissions needed by Confluence (Cloud) to connect to Amazon Q Business.
The following procedures give you an overview of how to configure Confluence (Cloud) to connect to Amazon Q Business using either basic authentication or OAuth 2.0 authentication.
Configure Confluence (Cloud) basic authentication for Amazon Q Business
Complete the following steps to configure basic authentication:
- Log in to your account from Confluence (Cloud). Note the username you logged in with. You will need this later to connect to Amazon Q Business.
- From your Confluence (Cloud) home page, note your Confluence (Cloud) URL from your Confluence browser URL. For example, https://example.atlassian.net. You will need this later to connect to Amazon Q Business.
- Navigate to the Security page in Confluence (Cloud).
- On the API tokens page, choose Create API token.
- In the Create an API token dialog box, for Label, add a name for your API token.
- Choose Create.
- From the Your new API token dialog box, copy the API token and save it in your preferred text editor. You can’t retrieve the API token after you close the dialog box.
- Choose Close.
You now have the username, Confluence (Cloud) URL, and Confluence (Cloud) API token you need to connect to Amazon Q Business with basic authentication.
For more information, see Manage API tokens for your Atlassian account in Atlassian Support.
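If you want to check the credentials before entering them on the Amazon Q Business console, you can call the Confluence Cloud REST API yourself. The following is a minimal sketch, assuming the standard HTTP basic authentication scheme (username and API token joined by a colon and Base64 encoded) and the /wiki/rest/api/space endpoint of Confluence Cloud. The helper names and sample values are illustrative, and the request is only constructed, not sent.

```python
import base64
import urllib.request


def basic_auth_header(username: str, api_token: str) -> dict:
    """Build the Authorization header for HTTP basic authentication."""
    raw = f"{username}:{api_token}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def build_space_list_request(confluence_url: str, username: str, api_token: str):
    """Prepare (but do not send) a GET request that lists Confluence spaces."""
    headers = basic_auth_header(username, api_token)
    headers["Accept"] = "application/json"
    url = confluence_url.rstrip("/") + "/wiki/rest/api/space"
    return urllib.request.Request(url, headers=headers)


request = build_space_list_request(
    "https://example.atlassian.net", "user@example.com", "your-api-token"
)
print(request.full_url)
# To send it: urllib.request.urlopen(request, timeout=30)
```

A successful response suggests that the username, URL, and API token are a valid combination.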
Configure Confluence (Cloud) OAuth 2.0 authentication for Amazon Q Business
Complete the following steps to configure Confluence (Cloud) OAuth 2.0 authentication:
- Retrieve the username and Confluence (Cloud) URL.
- Configure an OAuth 2.0 app integration.
- Retrieve the Confluence (Cloud) client ID and client secret.
- Generate a Confluence (Cloud) access token.
- Generate a Confluence (Cloud) refresh token.
- Generate a new Confluence (Cloud) access token using a refresh token.
Retrieve the username and Confluence (Cloud) URL
Complete the following steps:
- Log in to your account from Confluence (Cloud). Note the username you logged in with. You will need this later to connect to Amazon Q Business.
- From your Confluence (Cloud) home page, note your Confluence (Cloud) URL from your Confluence browser URL. For example, https://example.atlassian.net. You will need this later to both configure your OAuth 2.0 token and connect to Amazon Q Business.
Configuring an OAuth 2.0 app integration
Complete the following steps:
- Log in to your account from the Atlassian Developer page.
- Choose the profile icon in the top-right corner and on the dropdown menu, choose Developer console.
- On the welcome page, choose Create and choose OAuth 2.0 integration.
- Under Create a new OAuth 2.0 (3LO) integration, for Name, enter a name for the OAuth 2.0 application you’re creating. Then, read the Developer Terms and, if you agree, select the I agree to be bound by Atlassian’s developer terms checkbox.
- Select Create.
The console will display a summary page outlining the details of the OAuth 2.0 app you created.
- Still in the Atlassian Developer console, in the navigation pane, choose Authorization.
- Choose Add to add OAuth 2.0 (3LO) to your app.
- Under OAuth 2.0 authorization code grants (3LO) for apps, for Callback URL, enter the Confluence (Cloud) URL you copied, then choose Save changes.
- Under Authorization URL generator, choose Add APIs to add APIs to your app. This will redirect you to the Permissions page.
- On the Permissions page, for Scopes, navigate to User Identity API. Select Add, then select Configure.
- Under User Identity API, choose Edit Scopes, then add the following read scopes:
- read:me – View active user profile.
- read:account – View user profiles.
- Choose Save and return to the Permissions page.
- On the Permissions page, for Scopes, navigate to Confluence API. Select Add, and then select Configure.
- Under Confluence API, make sure you’re on the Classic scopes tab.
- Choose Edit Scopes and add the following read scopes:
- read:confluence-space.summary – Read Confluence space summary.
- read:confluence-props – Read Confluence content properties.
- read:confluence-content.all – Read Confluence detailed content.
- read:confluence-content.summary – Read Confluence content summary.
- read:confluence-content.permission – Read content permission in Confluence.
- read:confluence-user – Read user.
- read:confluence-groups – Read user groups.
- Choose Save.
- Navigate to the Granular scopes tab.
- Choose Edit Scopes and add the following read scopes:
- read:content:confluence – View detailed contents.
- read:content-details:confluence – View content details.
- read:space-details:confluence – View space details.
- read:audit-log:confluence – View audit records.
- read:page:confluence – View pages.
- read:attachment:confluence – View and download content attachments.
- read:blogpost:confluence – View blog posts.
- read:custom-content:confluence – View custom content.
- read:comment:confluence – View comments.
- read:template:confluence – View content templates.
- read:label:confluence – View labels.
- read:watcher:confluence – View content watchers.
- read:group:confluence – View groups.
- read:relation:confluence – View entity relationships.
- read:user:confluence – View user details.
- read:configuration:confluence – View Confluence settings.
- read:space:confluence – View space details.
- read:space.permission:confluence – View space permissions.
- read:space.property:confluence – View space properties.
- read:user.property:confluence – View user properties.
- read:space.setting:confluence – View space settings.
- read:analytics.content:confluence – View analytics for content.
- read:content.permission:confluence – Check content permissions.
- read:content.property:confluence – View content properties.
- read:content.restriction:confluence – View content restrictions.
- read:content.metadata:confluence – View content summaries.
- read:inlinetask:confluence – View tasks.
- read:task:confluence – View tasks.
- read:permission:confluence – View content restrictions and space permissions.
- read:whiteboard:confluence – View whiteboards.
- read:app-data:confluence – Read app data.
For more information, see Implementing OAuth 2.0 (3LO) and Determining the scopes required for an operation in Atlassian Developer.
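The scopes you select end up in the scope parameter of the authorization URL as a single space-separated value, with each space encoded as %20. The following is a minimal sketch of that encoding. The helper name build_scope_param is illustrative, and only a few of the scopes listed above are shown.

```python
from urllib.parse import quote


def build_scope_param(scopes):
    """Join scopes with spaces and percent-encode the result for a URL."""
    # dict.fromkeys removes duplicates while keeping the original order.
    unique = list(dict.fromkeys(scopes))
    return quote(" ".join(unique), safe="")


selected = [
    "read:me",
    "read:account",
    "read:confluence-content.all",
    "read:page:confluence",
]
print(build_scope_param(selected))
```

Colons are encoded as %3A here. An unencoded colon is equally valid in a query string, so the URL that the developer console generates may look slightly different.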
Retrieve the Confluence (Cloud) client ID and client secret
Complete the following steps:
- In the navigation pane, choose Settings.
- In the Authentication details section, copy and save the following in your preferred text editor:
- Client ID – You enter this as the app key on the Amazon Q Business console.
- Secret – You enter this as the app secret on the Amazon Q Business console.
You need these to generate your Confluence (Cloud) OAuth 2.0 token and also to connect Amazon Q Business to Confluence (Cloud).
For more information, see Implementing OAuth 2.0 (3LO) and Determining the scopes required for an operation in the Atlassian Developer documentation.
Generate a Confluence (Cloud) access token
Complete the following steps:
- Log in to your Confluence account from the Atlassian Developer page.
- Open the OAuth 2.0 app you want to generate an access token for.
- In the navigation pane, choose Authorization.
- For OAuth 2.0 (3LO), choose Configure.
- On the Authorization page, under Authorization URL generator, copy the URL for Granular Confluence API authorization URL and save it in your preferred text editor.
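The URL is in the following format, where the uppercase placeholders stand for the values of your app:
https://auth.atlassian.com/authorize?audience=api.atlassian.com&client_id=YOUR_CLIENT_ID&scope=REQUESTED_SCOPE_ONE%20REQUESTED_SCOPE_TWO&redirect_uri=https://YOUR_APP_CALLBACK_URL&state=${YOUR_USER_BOUND_VALUE}&response_type=code&prompt=consent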
- In the saved authorization URL, update the state=${YOUR_USER_BOUND_VALUE} parameter value to any text of your choice. For example, state=sample_text.
For more information, see What is the state parameter used for? in the Atlassian Support documentation.
- Open your preferred web browser and enter the authorization URL you copied into the browser URL.
- On the page that opens, make sure everything is correct and choose Accept.
You will be returned to your Confluence (Cloud) home page.
- Copy the URL of the Confluence (Cloud) home page and save it in your preferred text editor.
The URL contains the authorization code for your application. You will need this code to generate your Confluence (Cloud) access token. The whole section after code= is the authorization code.
- Navigate to Postman.
If you don’t have Postman installed on your local system, you can also choose to use cURL to generate a Confluence (Cloud) access token. Use the following cURL command to do so:
- If, however, you have Postman installed, on the main Postman window, choose POST as the method, then enter the following URL: https://auth.atlassian.com/oauth/token.
- Choose Body, then choose raw and JSON.
- In the text box, enter the following code extract, replacing the fields with your credential values:
- Choose Send.
If everything is configured correctly, Postman will return an access token.
- Copy the access token and save it in your preferred text editor. You will need it to connect Confluence (Cloud) to Amazon Q Business.
For more information, see Implementing OAuth 2.0 (3LO) in the Atlassian Developer documentation.
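The steps in this section can also be sketched in code. The following is a minimal example, assuming the Atlassian OAuth 2.0 (3LO) authorization endpoint at https://auth.atlassian.com/authorize and the authorization_code grant on the token endpoint used above. The function names and sample values are illustrative. The sketch builds the authorization URL, extracts the authorization code from the URL you are redirected to, and prepares the JSON body to send to https://auth.atlassian.com/oauth/token. It does not send any request.

```python
import json
from urllib.parse import parse_qs, quote, urlencode, urlparse

AUTHORIZE_URL = "https://auth.atlassian.com/authorize"


def build_authorization_url(client_id, scopes, callback_url, state):
    """Build the URL that the user opens in a browser to grant access."""
    params = {
        "audience": "api.atlassian.com",
        "client_id": client_id,
        "scope": " ".join(scopes),
        "redirect_uri": callback_url,
        "state": state,
        "response_type": "code",
        "prompt": "consent",
    }
    # quote_via=quote encodes spaces as %20 rather than "+".
    return AUTHORIZE_URL + "?" + urlencode(params, quote_via=quote)


def extract_authorization_code(redirected_url, expected_state):
    """Return the value after code= and check that state was not altered."""
    query = parse_qs(urlparse(redirected_url).query)
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state parameter does not match")
    if "code" not in query:
        raise ValueError("no authorization code in URL")
    return query["code"][0]


def build_token_request_body(client_id, client_secret, code, callback_url):
    """JSON body for exchanging the authorization code for an access token."""
    return json.dumps({
        "grant_type": "authorization_code",
        "client_id": client_id,
        "client_secret": client_secret,
        "code": code,
        "redirect_uri": callback_url,
    })


url = build_authorization_url(
    "your-client-id", ["read:me", "read:page:confluence"],
    "https://example.atlassian.net", "sample_text",
)
print(url)
```

The body returned by build_token_request_body is the kind of payload you would paste into the raw JSON text box in Postman or pass to cURL with a Content-Type: application/json header.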
Generate a Confluence (Cloud) refresh token
The access token you use to connect Confluence (Cloud) to Amazon Q Business using OAuth 2.0 authentication expires after 1 hour. When it expires, you can either repeat the whole authorization process and generate a new access token, or generate a refresh token.
Refresh tokens are implemented using a rotating refresh token mechanism. Each time a refresh token is used, the mechanism issues a new limited-life refresh token that is valid for 90 days. Each new rotating refresh token resets the inactivity expiry time and allocates another 90 days. This mechanism improves on single persistent refresh tokens by reducing the period in which a refresh token can be compromised and used to obtain a valid access token. For additional details, see OAuth 2.0 (3LO) apps in the Atlassian Developer documentation.
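As a rough illustration of these lifetimes, the following sketch tracks when an access token should be treated as expired and when a refresh token would lapse through inactivity. It assumes the 1-hour and 90-day values stated above. The five-minute safety margin and the function names are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

ACCESS_TOKEN_LIFETIME = timedelta(hours=1)
REFRESH_TOKEN_INACTIVITY = timedelta(days=90)


def access_token_expired(issued_at, now, margin=timedelta(minutes=5)):
    """Treat the token as expired slightly early to avoid mid-request failures."""
    return now >= issued_at + ACCESS_TOKEN_LIFETIME - margin


def refresh_token_expires_at(last_used_at):
    """Each use issues a new refresh token and restarts the 90-day window."""
    return last_used_at + REFRESH_TOKEN_INACTIVITY


issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(access_token_expired(issued, issued + timedelta(minutes=30)))
print(refresh_token_expires_at(issued).date())
```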
To generate a refresh token, you add a %20offline_access parameter to the end of the scope value in the authorization URL you used to generate your access token. Complete the following steps to generate a refresh token:
- Log in to your account from the Atlassian Developer page.
- Open the OAuth 2.0 app you want to generate a refresh token for.
- In the navigation pane, choose Authorization.
- For OAuth 2.0 (3LO), choose Configure.
- On the Authorization page, under Authorization URL generator, copy the URL for Granular Confluence API authorization URL and save it in your preferred text editor.
- In the saved authorization URL, update the state=${YOUR_USER_BOUND_VALUE} parameter value to any text of your choice. For example, state=sample_text.
For more information, see What is the state parameter used for? in the Atlassian Support documentation.
- Add the following text at the end of the scope value in your authorization URL: %20offline_access and copy it. For example:
- Open your preferred web browser and enter the modified authorization URL you copied into the browser URL.
- On the page that opens, make sure everything is correct and then choose Accept.
You will be returned to the Confluence (Cloud) console.
- Copy the URL of the Confluence (Cloud) home page and save it in a text editor of your choice.
The URL contains the authorization code for your application. You will need this code to generate your Confluence (Cloud) refresh token. The whole section after code= is the authorization code.
- Navigate to Postman.
If you don’t have Postman installed on your local system, you can also choose to use cURL to generate a Confluence (Cloud) refresh token. Use the following cURL command to do so:
- If, however, you have Postman installed, on the main Postman window, choose POST as the method, then enter the following URL: https://auth.atlassian.com/oauth/token.
- Choose Body on the menu, then choose raw and JSON.
- In the text box, enter the following code extract, replacing the fields with your credential values:
- Choose Send.
If everything is configured correctly, Postman will return a refresh token.
- Copy the refresh token and save it using your preferred text editor. You will need it to connect Confluence (Cloud) to Amazon Q Business.
For more information, see Implementing a Refresh Token Flow in the Atlassian Developer documentation.
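Appending %20offline_access to the scope value by hand is error-prone, so the edit can also be scripted. The following is a minimal sketch that parses an authorization URL, adds offline_access to its scope parameter if it is missing, and re-encodes the URL. The function name and the sample URL are illustrative.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit


def add_offline_access(authorization_url):
    """Return the URL with offline_access appended to the scope value."""
    parts = urlsplit(authorization_url)
    updated = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == "scope" and "offline_access" not in value.split():
            value = value + " offline_access"
        updated.append((key, value))
    # quote_via=quote keeps spaces encoded as %20 rather than "+".
    new_query = urlencode(updated, quote_via=quote)
    return urlunsplit(parts._replace(query=new_query))


original = (
    "https://auth.atlassian.com/authorize?audience=api.atlassian.com"
    "&client_id=your-client-id&scope=read%3Ame%20read%3Aaccount"
    "&redirect_uri=https%3A%2F%2Fexample.atlassian.net"
    "&state=sample_text&response_type=code&prompt=consent"
)
print(add_offline_access(original))
```

The function leaves the URL unchanged when offline_access is already present, so it is safe to apply more than once.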
Generate a new Confluence (Cloud) access token using a refresh token
You can use the refresh token you generated to create a new access token and refresh token pair when an existing access token expires. Complete the following steps to generate a new access token:
- Copy the refresh token you generated following the steps in the previous section.
- Navigate to Postman.
If you don’t have Postman installed on your local system, you can also choose to use cURL to generate a Confluence (Cloud) access token. Use the following cURL command to do so:
- In the Postman main window, choose POST as the method, then enter the following URL: https://auth.atlassian.com/oauth/token.
- Choose Body from the menu and choose raw and JSON.
- In the text box, enter the following code extract, replacing the fields with your credential values:
- Choose Send.
If everything is configured correctly, Postman will return a new access token and refresh token pair in the following format:
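The exchange can also be scripted. The following is a minimal sketch, assuming the refresh_token grant on the same token endpoint and a JSON response that contains access_token, refresh_token, expires_in, and scope fields. The function and class names are illustrative, and post_json is defined but not called here because it needs real credentials.

```python
import json
import urllib.request
from dataclasses import dataclass

TOKEN_URL = "https://auth.atlassian.com/oauth/token"


@dataclass
class TokenPair:
    access_token: str
    refresh_token: str
    expires_in: int  # lifetime of the access token, in seconds


def build_refresh_request_body(client_id, client_secret, refresh_token):
    """JSON body that trades a refresh token for a new token pair."""
    return json.dumps({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    })


def parse_token_response(payload):
    """Extract the new pair. The old refresh token should then be discarded."""
    return TokenPair(
        access_token=payload["access_token"],
        refresh_token=payload["refresh_token"],
        expires_in=int(payload.get("expires_in", 0)),
    )


def post_json(url, body):
    """Send the body as a JSON POST request and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Example usage with real credentials:
#   body = build_refresh_request_body("client-id", "client-secret", "refresh-token")
#   pair = parse_token_response(post_json(TOKEN_URL, body))
sample = {"access_token": "new-access", "refresh_token": "new-refresh",
          "expires_in": 3600, "scope": "read:me offline_access"}
print(parse_token_response(sample))
```

Because refresh tokens rotate, store the refresh token from each response and use it for the next exchange.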
Author: Tyler Geary