AI Classification

Shield AI Classification is available only as part of the Shield Pro add-on.

Overview

Box AI Classification helps to assess and classify your content, applying the appropriate classification label automatically. AI Classification can work alongside existing classification policies. For example, you can keep automated classification policies used to detect specific information types or file extensions, then use AI Classification to label a broader set of content that wasn’t easily identifiable via specific data types or keywords. One AI Classification policy is permitted per enterprise. To classify your content using Box AI, you need to:

Classification text file types

AI Classification scans the text in files for all of the following extensions:


Extensions	Text Extraction Limit
as, as3, bat, boxcanvas, boxnote, cmake, css, diff, doc, docx, gdoc, gslide, gslides, haml, htm, html, key, less, log, make, md, mm, msg, odp, odt, pages, pdf, ppt, pptx, properties, rst, rtf, sass, scm, script, sh, sml, txt, vi, vim, webdoc, wpd, xbd, xdw, xhtml, xml, xsd, xsl	2MB
asm, c, cc, cpp, cs, csv, cxx, erb, groovy, gsheet, h, hh, java, js, json, m, ml, php, pl, py, rb, scala, sql, ods, xls, xlsm, xlsx, yaml	100KB

AI Classification supports different sizes of text extraction depending on the file type. The amount of text in a file is usually much less than the size of the file. For example, a 20MB PowerPoint file (.ppt) may have only 200KB of text that can be extracted for evaluation.

Box scans only up to the text extraction limit. For example, for a PDF file where the extracted text exceeds 2MB, the AI Classification policy is evaluated against only the first 2MB of text.

Note: Automated classification in Box does not support optical character recognition (OCR), so Box cannot extract and consider text in scanned PDFs or in images embedded in text-based files (for example, images in a PPT).
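The effect of the text extraction limit can be pictured with a minimal sketch. This is an illustration only, not Box's implementation: the function name, the dictionary, and the assumption that 1MB means 1,048,576 bytes are all hypothetical, and only a few extensions from each table row are listed.

```python
# Hypothetical illustration of the text extraction limit; not Box's code.
LIMIT_2MB = 2 * 1024 * 1024   # assumption: 1MB = 1,048,576 bytes
LIMIT_100KB = 100 * 1024      # assumption: 1KB = 1,024 bytes

# A small sample of extensions from each row of the table above.
EXTRACTION_LIMITS = {
    "pdf": LIMIT_2MB, "docx": LIMIT_2MB, "pptx": LIMIT_2MB, "txt": LIMIT_2MB,
    "csv": LIMIT_100KB, "xlsx": LIMIT_100KB, "py": LIMIT_100KB, "json": LIMIT_100KB,
}


def text_considered(extension: str, extracted_text: bytes) -> bytes:
    """Return the leading part of the extracted text that fits within the limit."""
    limit = EXTRACTION_LIMITS[extension.lower().lstrip(".")]
    return extracted_text[:limit]


# 3MB of text extracted from a PDF: only the first 2MB would be evaluated.
sample = b"a" * (3 * 1024 * 1024)
print(len(text_considered("pdf", sample)))
```

Under these assumptions, content that would match a label's criteria but appears after the limit does not influence the result.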

Classification image file types

Supported image file types are: ai, bmp, cr2, crw, dng, eps, gif, heic, indd, idml, indt, inx, jpeg, jpg, nef, png, ps, psd, raf, raw, svg, svs, tga, tif, tiff, webp

Unlike traditional OCR, which extracts visible text from an image, AI Classification analyzes the whole image, including text and objects within that image, to determine what the content means, not just what the text says.

Note: AI Classification uses a version of the image that is a maximum of 2048 x 2048 pixels. This means very small or fine details might not be visible if the original image was larger. This may impact the classification result.
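To get a feel for how much detail a large image can lose, the following sketch estimates the dimensions of the analyzed version. It assumes the image is scaled down with its aspect ratio preserved, which this article does not state; the function name is hypothetical.

```python
from typing import Tuple

MAX_SIDE = 2048  # maximum width and height of the analyzed version


def analyzed_size(width: int, height: int) -> Tuple[int, int]:
    """Estimate the analyzed dimensions, assuming an aspect-preserving downscale."""
    # Images already within the maximum are left unchanged (scale of 1.0).
    scale = min(1.0, MAX_SIDE / width, MAX_SIDE / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(analyzed_size(8192, 4096))  # a large scan is reduced to a quarter of its width
print(analyzed_size(1000, 800))   # a small image keeps its original size
```

Under this assumption, text that is 10 pixels tall in an 8192-pixel-wide original would be about 2.5 pixels tall in the analyzed version, which is likely too small to read.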

AI Classification policy limits

There is a total limit of 25,000 bytes for the combined criteria across all labels. Because the number of bytes per character varies by language, the number of characters supported also varies. The following table gives approximate character limits:


Language	Characters
English	25,000
Japanese	8,500
French	23,000
Chinese	8,500
Korean	8,500
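A rough way to check criteria against the byte limit before entering them is sketched below. It assumes the limit is measured in UTF-8 bytes and that only the description text counts toward it; neither is confirmed in this article, and the function names are hypothetical. The assumption is at least consistent with the table: UTF-8 uses one byte for unaccented Latin letters and three bytes for most Japanese, Chinese, and Korean characters.

```python
from typing import Dict

LIMIT_BYTES = 25_000  # total limit for all combined criteria across labels


def criteria_bytes(criteria_by_label: Dict[str, str]) -> int:
    """Sum the UTF-8 byte length of every label's criteria text (assumed encoding)."""
    return sum(len(text.encode("utf-8")) for text in criteria_by_label.values())


def within_limit(criteria_by_label: Dict[str, str]) -> bool:
    """Return True if the combined criteria fit within the byte limit."""
    return criteria_bytes(criteria_by_label) <= LIMIT_BYTES


# Hypothetical label names and criteria, for illustration only.
criteria = {
    "Internal": "payroll slips, resumes, or policy documents",
    "Confidential": "non-public financial results",
}
print(criteria_bytes(criteria), within_limit(criteria))
```

Running the same check on translated criteria shows how quickly non-English text can use up the budget.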

Create an AI Classification policy

Admins, and co-admins with the following permissions, can create, modify, and delete AI Classification policies:

Create and edit metadata templates for your company
View Shield Dashboard for your company
Create, edit, and delete Shield configuration for your company

To create an AI Classification policy:

1. Navigate to the Admin Console.
2. Select Classification.
3. Select Create, then choose AI Classification Policy from the dropdown options.

   Note: This option does not display if you already have an AI Classification policy configured and listed in the Classification policies list.

4. Select a classification label, then enter detailed information about the type of content that should be classified. For example, an internal classification may include content such as payroll slips, resumes, or policy documents.

   You can remove the classification label by selecting the X above the description box.

5. Optionally, test and iterate using up to 10 files.
6. Select the policy setting of Apply to all folders or Only selected folders.
7. Select a conflict handling behavior.
8. Click Next.
9. Click either Save as Draft or Enable. After selecting Enable, the classification policy will be in effect for files that are triggered by classification events.

Notes:

There is a limit of 50 classification policies per enterprise ID (EID). AI Classification policies count towards this limit.
If you have multiple auto-classification policies, the AI Classification policy is set to the last priority by default. You can change this by editing the priority order.
One AI Classification policy is permitted per enterprise.
Content is only scanned prospectively after the policy is enabled.
Please do not perform large scale migrations or use Shuttle when you have AI Classification enabled on your account. If you are interested in scanning large volumes of content, reach out to your Account team.
As LLMs are non-deterministic in nature, it is possible that the Security Classification Agent will not always return the same Classification result.

Test and iterate

By selecting test files, you can confirm that you are seeing the expected classification results and modify the criteria if needed. You can select up to 10 files at a time. Once files are selected, the criteria you entered are used to create a prompt, which is sent to Box AI to evaluate each test file.

Follow the process to create an AI Classification policy up to step 5, then:

Click Select Files in the Test and iterate section.
Select up to 10 files to test.
The test results will display, with a classification applied based on the provided guidance. Reasoning is shown for why the AI chose the label that it did.

If no classification is provided, reasoning will be given to justify the lack of classification. Common reasons for Box AI being unable to classify the file include:

The file may no longer exist.
We are unable to extract text from the file (AI only works on content with extractable text).
The file is empty.

You can rerun a test by selecting the circular arrow that sits above the uploaded files. This is particularly helpful after refining your input in the classification label descriptions. Select Clear at the bottom of the section to remove your test results.

AI Classification policy settings

Folder criteria


Setting	Description
Apply to all folders	The policy will apply to files in all folders in your enterprise.
Only selected folders	The policy will apply to files only in folders you select and in all sub-folders of those folders. To select folders: 1. Click Select Folders. 2. Enter a search term and press Enter. 3. Select one or more folders. 4. Click Save.
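The folder criteria can be sketched as a simple scope check. This is a hypothetical model, not a Box API: it assumes each file is described by the list of folder IDs from the root down to its parent folder, and it represents Apply to all folders as having no selection.

```python
from typing import List, Optional, Set


def policy_applies(folder_path: List[str], selected_folders: Optional[Set[str]]) -> bool:
    """Decide whether a file is in scope for the policy.

    folder_path: hypothetical list of folder IDs from the root to the file's parent.
    selected_folders: None models "Apply to all folders"; a set models
    "Only selected folders".
    """
    if selected_folders is None:
        return True
    # A file is in scope if any folder on its path was selected, which
    # covers files in all sub-folders of a selected folder.
    return any(folder_id in selected_folders for folder_id in folder_path)


print(policy_applies(["0", "100", "200"], {"100"}))  # inside a sub-folder of 100
print(policy_applies(["0", "300"], {"100"}))         # outside the selection
```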

Conflict handling

Determines the behavior of the AI Classification policy for conflicts when content has an existing classification label:

Skip files that already have a classification label (Recommended) - The policy will:
- Overwrite a classification label that was previously applied by another classification policy
- Skip files with classification labels applied by a user, by folder cascade, by workflow, or that were applied via Microsoft Purview Information Protection (MPIP) integration from MPIP sensitivity labels
Overwrite any existing classification label - The policy will overwrite any existing classification label, whether that label was previously applied by a user, by folder cascade, by a workflow, or by a previous policy, except when:
- The auto-classified label was overridden manually by a user for the latest file version
- A classification label was applied from an MPIP sensitivity label and the MPIP Prevent Modification setting is enabled

Note: Overwriting existing classification labels cannot easily be undone. It is recommended you only select Overwrite any existing classification label if you’re confident in the accuracy of your AI Classification guidance.
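The two conflict handling behaviors can be summarized as a decision function. The sketch below is one reading of the rules above, not Box's implementation; every class, field, and function name is hypothetical, and it assumes that a file with no existing label is always classified.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Source(Enum):
    USER = "user"
    FOLDER_CASCADE = "folder cascade"
    WORKFLOW = "workflow"
    MPIP = "MPIP sensitivity label"
    POLICY = "classification policy"


class Mode(Enum):
    SKIP = "Skip files that already have a classification label"
    OVERWRITE = "Overwrite any existing classification label"


@dataclass
class ExistingLabel:
    source: Source
    # A user manually overrode the auto-classified label on the latest version.
    manually_overridden_latest_version: bool = False
    # The MPIP Prevent Modification setting is enabled.
    mpip_prevent_modification: bool = False


def should_apply(mode: Mode, existing: Optional[ExistingLabel]) -> bool:
    """Return True if the AI Classification policy would set a label."""
    if existing is None:
        return True  # assumption: unlabeled files are always classified
    if mode is Mode.SKIP:
        # Only labels from another classification policy are overwritten.
        return existing.source is Source.POLICY
    # Overwrite mode: everything is replaced except the two listed exceptions.
    if existing.manually_overridden_latest_version:
        return False
    if existing.source is Source.MPIP and existing.mpip_prevent_modification:
        return False
    return True


print(should_apply(Mode.SKIP, ExistingLabel(Source.USER)))       # user label is kept
print(should_apply(Mode.OVERWRITE, ExistingLabel(Source.USER)))  # user label is replaced
```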

Enable, disable, or delete an AI Classification policy

To enable, disable, or delete an AI Classification policy:

Navigate to the Admin Console.
Select Classification.
Click the name of your AI Classification policy.
Click either Enable, Disable, or Delete.

An AI Classification policy can be in a disabled state if you saved it as a draft when creating it, or if you disabled it any time after creating it. Disabling a classification policy does not remove any classifications that have already been applied. It just stops application of the policy until the policy is enabled again.

When you delete an AI Classification policy, Box does not remove classifications that this policy applied to content.

Notes:

You cannot duplicate an AI Classification policy, as you can only create one policy.
Once enabled, the classification policy will be in effect for files that are triggered by classification events.

AI Classification user experience

AI Classification details are accessible by:

Selecting a file within Box.
Clicking the Details button in the panel on the right-hand side.

Information shows in the Applied by section, where it states the classification was applied by Box AI on a specific date. A description is shown with the reasoning for why that label was applied. For example: “The file was marked internal only because it contains non-public financial results.”

AI Classification results information

If a classification label is applied to a file by AI Classification, you can view the AI’s reasoning in the side panel, as explained above in AI Classification user experience. If a label was not applied, the reasoning is stored in the AI Classification Results information, which you must first make visible.

To make AI Classification Results information visible:

In the Admin Console, select Content.
Select the Metadata tab.
Select AI Classification Results.
In the Visibility setting, disable Hide template from users to make the template visible.
Click Save.

Viewing AI Classification Results information:

In the Box web application, select a file.
Click the Metadata icon next to the right-hand pane.

The AI Classification Results section shows the Box AI Classification Agent reasoning for the decision it made when evaluating the file.

AI Classification policy best practices

Define effective label criteria

To ensure accurate AI Classification, label definitions should be:

Distinct: Each label should have non-overlapping, clearly differentiated criteria that target a unique set of document characteristics.
Descriptive: Use plain language to specify:
- Document types (e.g., contracts, strategy decks, spreadsheets)
- Topics or intent (e.g., product roadmap, security breach, deal terms)
- Data types (e.g., PII, source code, financials)
- Audience (e.g., internal teams, legal)

Avoid:

Vague descriptors (e.g., “High risk to the company”)
Overlapping labels (e.g., “Confidential” vs. “Highly Confidential”)
Undefined technical jargon

Troubleshooting tips

If AI Classification results are not meeting expectations:

Use fewer, well-defined labels: Add examples and tighten criteria
Check for overlap: Ensure labels are clear and unambiguous without overlap and avoid “catch-all” labels
Ensure the file is a supported file type: View the text and image file types that are supported

Known Limitations

AI Classification returns mixed and sometimes inaccurate results for criteria that include the following conditions or topics:

Calculations, table structures, and numbers
Counting words or phrases
Document metadata such as page number, authors, file size, word count, and collaborators (AI Classification doesn’t take into account any of these document components)
Images, charts, graphs, and similar visuals embedded in text documents (AI Classification can analyze images only when they are standalone image files)
