
05 April 2016

SharePoint Document Tagger for Azure Text Analytics

Relevant metadata is crucial for an expanding SharePoint farm or tenant. With good metadata, users can find the documents they need even if there are thousands of files in SharePoint libraries. The problem is getting users to tag documents correctly, and the kalmstrom.com team has tried to solve that issue. With the Microsoft Azure web services for text analysis, we now see great opportunities for a tagging tool to really work well.

The importance of metadata
Metadata is often called "data about data". Everything that describes the contents and context of files is included in SharePoint metadata, like file names, location and of course keywords. When you have good metadata, the SharePoint search works well, and you can filter and create views that show files from different aspects. Manually tagging documents with metadata is, however, boring and time-consuming, so people often avoid doing it.

Last week I described how the kalmstrom.com product Templates Manager for SharePoint makes it easier to add metadata to files created from Office templates. Another way to spur users to add metadata is to use a text analysis service that suggests keywords for files, both when they are uploaded to document libraries and when they already exist there.

A couple of years ago the kalmstrom.com team developed a beta version of a SharePoint solution for automatic tagging with the help of an analysis service. We never released the final product on the market, but now I think it is time to update and finish Document Tagger.
Azure text analysis tool
A product that helps SharePoint users analyze and tag documents with metadata stands or falls with the text analysis tool. If the text analysis is not good enough, you will get irrelevant metadata, which leads to frustration and bad search results. Now we have found something that meets our standards.

Microsoft is continually improving their Azure services, and we have been keeping an eye on the Text Analytics API for some time. It is beginning to reach general availability, and we would love to build something on top of it. The starting point would be our beta version of Document Tagger.

Keep the control
Document Tagger is a SharePoint Sandboxed Solution that allows users to tag existing documents much more easily than doing it manually. However, even if the tagging is automatic we don't force users to blindly accept all the metadata that the text analysis suggests. When a document or presentation has been analyzed by the service, the metadata suggestions must be approved before they are added to the file, and users can make any changes needed.

This is how the new Document Tagger will work on existing library files:
  1. Select one or several files and press the Document Tagger button. A dialog is displayed.
  2. Document Tagger sends the documents in a secure way to the analysis service.
  3. The service suggests keywords that are shown in the Document Tagger dialog. Icons show what category the keywords belong to.
  4. Deselect the keywords you don't want to use.
  5. Add any additional metadata.
  6. Click on the Apply button, and the keywords will be added to the document.
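The approval flow in the steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Document Tagger is implemented: `fake_analysis_service` is a hypothetical stand-in for the remote text analysis call, and the class names, category names and lookup table are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Suggestion:
    keyword: str
    category: str  # category names such as "topic" are assumptions


@dataclass
class Document:
    name: str
    text: str
    tags: set = field(default_factory=set)


def fake_analysis_service(doc):
    """Hypothetical stand-in for steps 2 and 3 (the remote analysis call)."""
    # A real service would extract key phrases; this sketch uses a fixed lookup table.
    known = {"invoice": "topic", "contoso": "organization", "budget": "topic"}
    words = {w.strip(".,").lower() for w in doc.text.split()}
    return [Suggestion(k, c) for k, c in known.items() if k in words]


def apply_tags(doc, suggestions, rejected=(), extra=()):
    """Steps 4 to 6: drop deselected keywords, add manual ones, write the tags."""
    approved = {s.keyword for s in suggestions} - set(rejected)
    doc.tags |= approved | set(extra)
    return doc.tags


doc = Document("q1.docx", "Contoso invoice for March.")
suggestions = fake_analysis_service(doc)
# The user deselects "invoice" and adds "finance" by hand before applying.
apply_tags(doc, suggestions, rejected={"invoice"}, extra={"finance"})
print(sorted(doc.tags))
```

The point of the design is that nothing is written to the document until `apply_tags` runs, so the user always has the last word on the suggested metadata.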
Document Tagger can also tag documents in a similar way directly when they are uploaded. Other Document Tagger features help you use the tags to find the files you need more quickly than SharePoint can manage on its own. You can, for example, find related documents by clicking on a tag, and group keywords and metadata by category for a better overview.
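To make the idea of related documents and grouped keywords concrete, here is a small Python sketch built on a tag index. It assumes the library is available as a plain mapping from file name to tags. The function names and sample data are hypothetical and do not reflect Document Tagger's internals.

```python
from collections import defaultdict


def build_tag_index(library):
    """Map each tag to the set of file names that carry it."""
    index = defaultdict(set)
    for filename, tags in library.items():
        for tag in tags:
            index[tag].add(filename)
    return dict(index)


def related_documents(index, library, filename):
    """Files that share at least one tag with `filename`, excluding the file itself."""
    related = set()
    for tag in library[filename]:
        related |= index.get(tag, set())
    related.discard(filename)
    return related


def group_by_category(tag_categories):
    """Group tags under their category, with the tags in alphabetical order."""
    groups = defaultdict(list)
    for tag, category in sorted(tag_categories.items()):
        groups[category].append(tag)
    return dict(groups)


# Sample data, invented for the example.
library = {
    "a.docx": {"budget", "contoso"},
    "b.pptx": {"budget"},
    "c.docx": {"hr"},
}
index = build_tag_index(library)
print(related_documents(index, library, "a.docx"))
print(group_by_category({"contoso": "organization", "budget": "topic", "hr": "topic"}))
```

Clicking a tag corresponds to a single lookup in `index`, which is why an index of this kind stays fast even when a library holds thousands of files.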

See the beta
Does this sound interesting? The beta version of Document Tagger is introduced on the kalmstrom.com website. Even though it was developed for SharePoint 2010 and was never quite finished, it gives a good idea of how the modern version will work.

More about metadata
In the kalmstrom.com Tips section you can find general info about SharePoint categorization and metadata. The most recent tutorials are for Office 365 SharePoint, but the principles are the same for on-premises installations.

By Peter Kalmström
CEO and Systems Designer
kalmstrom.com Business Solutions