Intelligently search your organization’s Microsoft Teams data source with the Amazon Kendra connector for Microsoft Teams Praveen Edem AWS Machine Learning Blog

Organizations use messaging platforms like Microsoft Teams to bring the right people together to communicate securely and collaborate to get work done. As users collaborate, Microsoft Teams captures invaluable organizational knowledge in the information that flows through it. However, making this knowledge easily and securely available to users can be challenging because conversations are fragmented across groups, channels, and chats within an organization. Additionally, because Microsoft Teams communication is conversational, traditional keyword-based search is ineffective at finding accurate answers to questions in the content. Finding those answers requires intelligent search capabilities that can process natural language queries.

You can now use the Amazon Kendra connector for Microsoft Teams to index Microsoft Teams messages and documents, and search this content using intelligent search in Amazon Kendra, powered by machine learning (ML).

This post shows how to configure the Amazon Kendra connector for Microsoft Teams and take advantage of the service’s intelligent search capabilities. We use an example of an illustrative Microsoft Teams instance where users discuss technical topics related to AWS.

Solution overview

Microsoft Teams content for active organizations is dynamic in nature due to continuous collaboration. Microsoft Teams includes public channels where any user can participate, and private channels where only members of those channels can communicate with each other. Furthermore, individuals can communicate directly with one another in one-on-one chats and ad hoc groups. This communication takes the form of messages and threads of replies, with optional document attachments.

In our solution, we configure Microsoft Teams as a data source for an Amazon Kendra search index using the Amazon Kendra connector for Microsoft Teams. Based on the configuration, when the data source is synchronized, the connector crawls and indexes all the content from Microsoft Teams that was created on or before a specific date. The connector also indexes the access control list (ACL) information for each message and document. When access control or user context filtering is enabled, the search results for a user's query include only those documents that the user is authorized to read.

The Amazon Kendra connector for Microsoft Teams can integrate with AWS IAM Identity Center (successor to AWS Single Sign-On). You must first enable IAM Identity Center and create an organization to sync users and groups from your Active Directory. The connector then uses the user name and group lookup to build the user context of search queries.

With Amazon Kendra Experience Builder, you can build and deploy a low-code, fully functional search application to search your Microsoft Teams data source.

Prerequisites

To try out the Amazon Kendra connector for Microsoft Teams using this post as a reference, you need the following:

An AWS account with privileges to create AWS Identity and Access Management (IAM) roles and policies. For more information, see Overview of access management: Permissions and policies.
Basic knowledge of AWS and working knowledge of Microsoft Teams.

Note that the Microsoft Graph API places throttling limits on the number of concurrent calls to a service to prevent overuse of resources.

Configure Microsoft Teams

The following screenshot shows our example Microsoft Teams instance with sample content and the PDF file AWS_Well-Architect_Framework.pdf that we will use for our Amazon Kendra search queries.

The following steps describe the configuration of a new Amazon Kendra connector application in the Azure portal. This will create a user OAuth token to be used in configuring the Amazon Kendra connector for Microsoft Teams.

Next to Client credentials, choose Add a certificate or secret to add a new client secret.

For Description, enter a description (for example, KendraConnectorSecret).
For Expires, choose an expiry date (for example, 6 months).
Choose Add.

Save the secret ID and secret value to use later when creating an Amazon Kendra data source.

Choose Add a permission.

Choose Microsoft Graph to add all necessary Microsoft Graph permissions.

Choose Application permissions.

The registered application should have the following API permissions to allow crawling all entities supported by the Amazon Kendra connector for Microsoft Teams:

ChannelMessage.Read.All
Chat.Read
Chat.Read.All
Chat.ReadBasic
Chat.ReadBasic.All
ChatMessage.Read.All
Directory.Read.All
Files.Read.All
Group.Read.All
Mail.Read
Mail.ReadBasic
User.Read
User.Read.All
TeamMember.Read.All

However, you can select a lesser scope based on the entities chosen to be crawled. The following lists are the minimum sets of permissions needed for each entity:

Channel Post:

ChannelMessage.Read.All
Group.Read.All
User.Read
User.Read.All
TeamMember.Read.All (user-group mapping for identity crawl)

Channel Attachment:

ChannelMessage.Read.All
Group.Read.All
User.Read
User.Read.All
TeamMember.Read.All (user-group mapping for identity crawl)

Channel Wiki:

Group.Read.All
User.Read
User.Read.All
TeamMember.Read.All (user-group mapping for identity crawl)

Chat Message:

Chat.Read.All
ChatMessage.Read.All
ChatMember.Read.All
User.Read
User.Read.All
Group.Read.All
TeamMember.Read.All (user-group mapping for identity crawl)

Meeting Chat:

Chat.Read.All
ChatMessage.Read.All
ChatMember.Read.All
User.Read
User.Read.All
Group.Read.All
TeamMember.Read.All (user-group mapping for identity crawl)

Chat Attachment:

Chat.Read.All
ChatMessage.Read.All
ChatMember.Read.All
User.Read
User.Read.All
Group.Read.All
Files.Read.All
TeamMember.Read.All (user-group mapping for identity crawl)

Meeting File:

Chat.Read.All
ChatMessage.Read.All
ChatMember.Read.All
User.Read
User.Read.All
Group.Read.All
Files.Read.All
TeamMember.Read.All (user-group mapping for identity crawl)

Calendar Meeting:

Calendars.Read
Group.Read.All
User.Read
User.Read.All
TeamMember.Read.All (user-group mapping for identity crawl)

Meeting Notes:

Group.Read.All
User.Read
User.Read.All
Files.Read.All
TeamMember.Read.All (user-group mapping for identity crawl)
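If you crawl only some entities, the union of their minimum permission sets is what the registered application needs. The following is a minimal sketch that transcribes the lists above into a lookup table and computes that union. The names MIN_PERMISSIONS and required_permissions are illustrative and are not part of any AWS or Microsoft SDK.

```python
# Minimum Microsoft Graph application permissions per entity, transcribed
# from the lists in this post. All names here are illustrative helpers.
IDENTITY = {"Group.Read.All", "User.Read", "User.Read.All", "TeamMember.Read.All"}
CHAT = IDENTITY | {"Chat.Read.All", "ChatMessage.Read.All", "ChatMember.Read.All"}

MIN_PERMISSIONS = {
    "Channel Post": IDENTITY | {"ChannelMessage.Read.All"},
    "Channel Attachment": IDENTITY | {"ChannelMessage.Read.All"},
    "Channel Wiki": IDENTITY,
    "Chat Message": CHAT,
    "Meeting Chat": CHAT,
    "Chat Attachment": CHAT | {"Files.Read.All"},
    "Meeting File": CHAT | {"Files.Read.All"},
    "Calendar Meeting": IDENTITY | {"Calendars.Read"},
    "Meeting Notes": IDENTITY | {"Files.Read.All"},
}


def required_permissions(entities):
    """Return the sorted union of permissions for the chosen entities."""
    needed = set()
    for entity in entities:
        # A KeyError here means the entity name is not one of the nine above.
        needed |= MIN_PERMISSIONS[entity]
    return sorted(needed)


if __name__ == "__main__":
    for permission in required_permissions(["Channel Post", "Chat Attachment"]):
        print(permission)
```

Running the script prints the permissions to grant when crawling only channel posts and chat attachments.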

Select your permissions and choose Add permissions.

Configure the data source using the Amazon Kendra connector for Microsoft Teams

To add a data source to your Amazon Kendra index using the Microsoft Teams connector, you can use an existing Amazon Kendra index, or create a new Amazon Kendra index. Then complete the steps in this section. For more information on this topic, refer to Microsoft Teams.

On the Amazon Kendra console, open the index and choose Data sources in the navigation pane.
Choose Add data source.
Under Microsoft Teams connector, choose Add connector.

In the Specify data source details section, enter the details of your data source and choose Next.
In the Define access and security section, for Tenant ID, enter the Microsoft Teams tenant ID from the Microsoft account dashboard.
Under Authentication, you can either choose Create to add a new secret with the client ID and client secret of the Microsoft Teams tenant, or use an existing AWS Secrets Manager secret that has the client ID and client secret of the Microsoft Teams tenant that you want the connector to access.
Choose Save.
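If you prefer to create the Secrets Manager secret yourself rather than on the console, a sketch along the following lines may help. It assumes the connector reads a JSON secret with the keys clientId and clientSecret; confirm the expected key names in the connector documentation before relying on them. The function names and the secret name are hypothetical, and boto3 is imported lazily so the helper that builds the payload can be used without AWS credentials.

```python
import json


def build_teams_secret(client_id, client_secret):
    """Serialize the credentials as the JSON string stored in the secret.

    The key names "clientId" and "clientSecret" are an assumption; verify
    them against the connector documentation.
    """
    if not client_id or not client_secret:
        raise ValueError("both client_id and client_secret are required")
    return json.dumps({"clientId": client_id, "clientSecret": client_secret})


def create_teams_secret(name, client_id, client_secret, client=None):
    """Create the secret in AWS Secrets Manager and return its ARN."""
    if client is None:
        import boto3  # imported lazily; needs AWS credentials at call time

        client = boto3.client("secretsmanager")
    response = client.create_secret(
        Name=name,
        SecretString=build_teams_secret(client_id, client_secret),
    )
    return response["ARN"]
```

You could then call create_teams_secret with a name of your choosing and select the resulting secret under Authentication.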

Optionally, choose the appropriate payment model:

Model A is restricted to licensing and payment models that require security compliance.
Model B is suitable for licensing and payment models that don’t require security compliance.
Evaluation Mode (default) is for limited usage evaluation purposes.

For IAM role, you can choose Create a new role or choose an existing IAM role configured with appropriate IAM policies to access the Secrets Manager secret, Amazon Kendra index, and data source.
Choose Next.

In the Configure sync settings section, provide information regarding your sync scope.

For Sync mode, choose your sync mode (for this post, select Full sync).

With the Full sync option, every time the sync runs, Amazon Kendra will crawl all documents and ingest each document even if ingested earlier. The full refresh enables you to reset your Amazon Kendra index without the need to delete and create a new data source. If you choose New or modified content sync or New, modified, or deleted content sync, every time the sync job runs, it will process only objects added, modified, or deleted since the last crawl. Incremental crawls can help reduce runtime and cost when used with datasets that append new objects to existing data sources on a regular basis.

For Sync run schedule, choose Run on demand.
Choose Next.

In the Set field mappings section, you can optionally configure the field mappings, wherein Microsoft Teams field names may be mapped to a different Amazon Kendra attribute or facet.
Choose Next.

Review your settings and confirm to add the data source.
After the data source is added, choose Data sources in the navigation pane, select the newly added data source, and choose Sync now to start data source synchronization with the Amazon Kendra index.

The sync process can take upwards of 30 minutes (depending on the amount of data to be crawled).
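The same sync can be started and monitored through the API. The following sketch assumes a boto3 Amazon Kendra client (for example, boto3.client("kendra")) and uses its start_data_source_sync_job and list_data_source_sync_jobs operations. The function name run_sync, the polling interval, and the decision to look only at the first page of job history are illustrative choices, not part of the connector.

```python
import time

# Statuses after which a sync job no longer changes state.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "INCOMPLETE", "ABORTED"}


def run_sync(client, index_id, data_source_id, poll_seconds=60, max_polls=120):
    """Start a sync job and poll until it reaches a terminal status."""
    started = client.start_data_source_sync_job(Id=data_source_id, IndexId=index_id)
    execution_id = started["ExecutionId"]

    for _ in range(max_polls):
        jobs = client.list_data_source_sync_jobs(Id=data_source_id, IndexId=index_id)
        # Find the job we just started in the returned history (first page only).
        status = next(
            (job["Status"] for job in jobs["History"] if job["ExecutionId"] == execution_id),
            None,
        )
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)

    raise TimeoutError(f"sync job {execution_id} did not finish in time")
```

Passing the client in as a parameter keeps the function easy to exercise with a stub before pointing it at a real index.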

Now let’s enable access control for the Amazon Kendra index.

In the navigation pane, choose your index.
On the User access control tab, choose Edit settings and change the settings to look like the following screenshot.
Choose Next, then choose Update.

Perform intelligent search with Amazon Kendra

Before you try searching on the Amazon Kendra console or using the API, make sure that the data source sync is complete. To check, view the data sources and verify that the last sync was successful.

Now we’re ready to search our index.

On the Amazon Kendra console, navigate to the index and choose Search indexed content in the navigation pane.
Let’s use the query “How do you detect security events” and not provide an access token.

Based on our access control settings, a valid access token is needed to access authenticated content; therefore, when we use this search query without setting any user name or group, no results are returned.

Next, choose Apply token and set the user name to a user in the domain (for example, usertest4) that has access to the Microsoft Teams content.

In this example, the search will return a result from the PDF file uploaded in the Microsoft Teams chat message.

Finally, choose Apply token and set the user name to a different user in the domain (for example, usertest) that has access to different Microsoft Teams content.

In this example, the search will return a different Microsoft Teams chat message.

This confirms that the ACLs ingested in Amazon Kendra by the connector for Microsoft Teams are being enforced in the search results based on the user name.
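The console test above can be approximated with the Query API by passing a user context. The following sketch assumes a boto3 Amazon Kendra client and an index whose access control settings accept a user ID in the UserContext parameter; if your index expects a token instead, the request shape differs. The function names and the placeholder index ID are hypothetical.

```python
def build_query(index_id, query_text, user_id=None, groups=None):
    """Build keyword arguments for the Amazon Kendra Query operation."""
    request = {"IndexId": index_id, "QueryText": query_text}
    context = {}
    if user_id:
        context["UserId"] = user_id
    if groups:
        context["Groups"] = list(groups)
    if context:
        # Without a user context, ACL-protected documents are filtered out.
        request["UserContext"] = context
    return request


def search(client, index_id, query_text, user_id=None, groups=None):
    """Run the query and return (result type, document title) pairs."""
    response = client.query(**build_query(index_id, query_text, user_id, groups))
    return [
        (item["Type"], item.get("DocumentTitle", {}).get("Text", ""))
        for item in response["ResultItems"]
    ]
```

For example, calling search(client, "your-index-id", "How do you detect security events", user_id="usertest4") and then repeating the call with user_id="usertest" should mirror the two console searches.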

Clean up

To avoid incurring future costs, clean up the resources you created as part of this solution. If you created a new Amazon Kendra index while testing this solution, delete it. If you only added a new data source using the Amazon Kendra connector for Microsoft Teams, delete that data source.
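The cleanup choice described above can also be scripted. This sketch assumes a boto3 Amazon Kendra client and uses its delete_index and delete_data_source operations; the function name clean_up and its flag are illustrative. Both deletions are irreversible, so double-check the IDs before running anything like this.

```python
def clean_up(client, index_id, data_source_id=None, delete_index=False):
    """Delete the whole index, or only the data source, and report which."""
    if delete_index:
        # Deleting the index also removes the data sources attached to it.
        client.delete_index(Id=index_id)
        return "index"
    if data_source_id:
        client.delete_data_source(Id=data_source_id, IndexId=index_id)
        return "data_source"
    return "nothing"
```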

Conclusion

With the Amazon Kendra connector for Microsoft Teams, organizations can make invaluable information trapped in their Microsoft Teams instances available to their users securely using intelligent search powered by Amazon Kendra. Additionally, the connector provides facets for Microsoft Teams attributes such as channels, authors, and categories for the users to interactively refine the search results based on what they’re looking for.
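As a rough illustration of facet-based refinement, the sketch below builds a Query request that asks for facet counts and flattens the FacetResults section of the response. It assumes the requested attribute is marked as facetable in your index; the attribute key _authors is used only as an example, and the function names are hypothetical.

```python
def build_faceted_query(index_id, query_text, facet_keys):
    """Build Query keyword arguments that request counts for each facet key."""
    return {
        "IndexId": index_id,
        "QueryText": query_text,
        "Facets": [{"DocumentAttributeKey": key} for key in facet_keys],
    }


def facet_counts(response):
    """Map each facet key to a {value: count} dictionary."""
    counts = {}
    for facet in response.get("FacetResults", []):
        key = facet["DocumentAttributeKey"]
        counts[key] = {
            pair["DocumentAttributeValue"].get("StringValue"): pair["Count"]
            for pair in facet.get("DocumentAttributeValueCountPairs", [])
        }
    return counts
```

A search application could display these counts next to the results and add the chosen value as an attribute filter on the next query.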

To learn more about the Amazon Kendra connector for Microsoft Teams, refer to Microsoft Teams.

For more information on how you can create, modify, or delete metadata and content when ingesting your data from Microsoft Teams, refer to Customizing document metadata during the ingestion process and Enrich your content and metadata to enhance your search experience with custom document enrichment in Amazon Kendra.

About the Authors

Praveen Edem is a Senior Solutions Architect at Amazon Web Services. He works with major financial services customers, architecting and modernizing their critical large-scale applications while adopting AWS services. He has over 20 years of IT experience in application development and software architecture.

Gunwant Walbe is a Software Development Engineer at Amazon Web Services. He is an avid learner and keen to adopt new technologies. He develops complex business applications, and Java is his primary language of choice.