Foxit Smart Redact
Security Overview

Foxit Smart Redact is an AI-powered tool provided by Foxit that automatically detects and removes sensitive data like names, addresses, and IDs. It helps users efficiently and securely complete redaction workflows, protect sensitive information from unauthorized access, and comply with global data privacy regulations. Foxit prioritizes data security through end-to-end encrypted transmission, encrypted storage of sensitive information, high-tier data centers, and prudent data retention policies. Additionally, its development process follows the Security Development Lifecycle (SDL). This multi-layered approach provides individuals and enterprises with an efficient and secure sensitive data discovery and redaction solution.

About Foxit Smart Redact

Foxit Smart Redact utilizes AI models trained to comply with global data privacy regulations (e.g., GDPR, HIPAA), enabling intelligent identification of sensitive information. It uses optimized workflows to enhance redaction speed and accuracy. Additionally, it offers enterprise-oriented solutions to help businesses and organizations detect and redact sensitive information in documents at scale.
Foxit Smart Redact includes the following components:

  • Smart Redact Plugin of PDF Editor, built into Foxit PDF Editor for seamless and secure redaction.
  • Smart Redact Server, an enterprise-grade solution that scans document repositories in bulk to detect and label sensitive information, enabling users to efficiently review and redact as needed.

What types of data can be detected?

Smart Redact detects Personally Identifiable Information (PII) and Protected Health Information (PHI) as defined by modern data regulations, including the EU GDPR, the California Consumer Privacy Act (CCPA) as amended by the CPRA, and HIPAA. This includes data elements like personal names, Social Security numbers, credit card numbers, driver’s license numbers, medical records, and diagnostic codes. Detection currently supports English-language documents only.
For a detailed list of supported categories, please refer to Appendix A.

How does Smart Redact work?

This section breaks down the workflows and key data flows of the Smart Redact Plugin for PDF Editor and Smart Redact Server, helping users choose the right solution based on their needs.

Workflow of the Smart Redact Plugin

When using the Smart Redact Plugin, users can securely perform redaction without switching tools—saving time, protecting document integrity, and maintaining compliance within a seamless workspace.

Foxit AI Assistant Service Architecture

The diagram above illustrates the key participants and their respective data flow during redaction using the Smart Redact Plugin. The participants in the workflow are as follows:

  • PDF Editor: Runs locally on the user's device. It is responsible for extracting document information and performing redaction. This is the main interface users interact with.
  • SRP (Smart Redact Plugin) Service: Manages access control and business logic processing. It acts as an intermediary between the PDF Editor and AI services.
  • Internal Multi-Model AI System: Hosted on AWS. It analyzes documents to detect sensitive information and returns results to the SRP Service.
  • Azure AI Language Service: An external AI service that identifies sensitive information and returns detection results to the SRP Service.

The following steps outline the main workflow the Smart Redact Plugin of PDF Editor uses to detect and redact sensitive information.

  • Document Preparation: The user opens the document. If the document is a scanned file or contains embedded images, Text Recognition will extract text from the document.
  • Initiate Smart Redact: The user activates Smart Redact, selects options (e.g., regions, sensitive categories), and initiates the scan.
  • Local Text Extraction: The PDF Editor extracts all text content – both native and OCR-generated – and sends it, along with user-defined parameters, to the SRP Service.
  • AI-Driven Sensitive Information Detection:
    • a) AI Analysis: The SRP Service uses multiple models (the Azure AI Language Service and the internal Multi-Model AI System), each of which analyzes the text content independently.
    • b) Model Ensemble & Voting Integration: The SRP Service aggregates outputs from the two AI systems using a Model Ensemble approach, applying majority voting to consolidate results. This consensus-driven method generates a unified sensitive information list, significantly enhancing detection accuracy.
  • User Review & Confirmation: The PDF Editor displays the detected sensitive information to the user. The user reviews and confirms which items to redact.
  • Redaction After Confirmation: The PDF Editor performs redaction only after explicit user confirmation, ensuring compliance with privacy policies.

The Smart Redact Plugin helps users intelligently identify and redact sensitive information, streamlining the redaction process. Discovery and redaction actions are only executed after user confirmation, ensuring that users maintain full control over the process and that all actions are compliant with data privacy requirements.
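
The voting step described above can be illustrated with a minimal sketch. It is not Foxit's implementation: the `Detection` type, the `vote` function, and the `min_votes` threshold are hypothetical names introduced here, and the sketch assumes that two detectors agree only when they report exactly the same span and category.

```python
from collections import Counter
from typing import NamedTuple


class Detection(NamedTuple):
    start: int      # character offset where the sensitive span begins
    end: int        # character offset just past the span
    category: str   # category name, e.g. "Email" or "Person"


def vote(results_per_model: list[set[Detection]], min_votes: int) -> list[Detection]:
    """Keep detections reported by at least `min_votes` models."""
    counts = Counter(d for results in results_per_model for d in results)
    kept = [d for d, n in counts.items() if n >= min_votes]
    return sorted(kept)


# Hypothetical outputs from two detectors for the same text.
model_a = {Detection(0, 8, "Person"), Detection(20, 37, "Email")}
model_b = {Detection(0, 8, "Person"), Detection(40, 51, "USPhoneNumber")}

consensus = vote([model_a, model_b], min_votes=2)  # only items both models report
union = vote([model_a, model_b], min_votes=1)      # every item any model reports
```

A stricter threshold favors precision, a looser one favors recall; either way the user still reviews the consolidated list before anything is redacted.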

Workflow of the Smart Redact Server

The Smart Redact Server (SRS) efficiently automates sensitive information detection and file migration. For instance, it can scan 10,000 customer contracts in an AWS S3 bucket and apply preset policies to handle redaction and migration tasks automatically.

Foxit AI Assistant Service Architecture

The figure illustrates the key participants and their respective data flow during redaction using the Smart Redact Server. The participants in the workflow are as follows:

  • Cloud Storage: The user’s cloud storage platforms (e.g., OneDrive, AWS S3) serve as the document data source.
  • SRS (Smart Redact Server) Service: A standalone web application that manages access control, business logic, user authentication, workflow orchestration, and user interface.
  • Internal Multi-Model AI System
  • Azure AI Language Service
  • Azure AI-Vision OCR Service: It extracts text content from scanned documents or images to provide input for sensitive information detection.

The following steps outline the main workflow the Smart Redact Server (SRS) uses to detect and redact sensitive information.

  • Create a Project
    • a) Select/Create Policy: Define the types of sensitive data to detect.
    • b) Select/Connect Data Source: Link cloud storage (e.g., AWS S3, OneDrive). SRS follows each cloud storage provider's security guidelines for connecting and accessing data, such as using token-based authentication rather than password storage.
    • c) Configure Scanning Scope & Schedule:
      • i. Specify the root folder and document types to scan.
      • ii. Set scan schedule (daily/weekly/monthly or manual trigger).
    • d) Define File Migration Policy: For documents containing sensitive data, choose to copy or move them to designated paths.
  • Batch Document Processing Workflow: SRS processes documents periodically based on the schedule. It determines whether a rescan is needed by comparing filenames, modification times, and historical records.
  • Single Document Processing Steps:
    • a) Download Document: Fetch files from the data source to the SRS server.
    • b) Text Extraction: Use Azure AI-Vision OCR to extract text content from the document.
    • c) AI-Driven Sensitive Information Detection: This follows the ensemble-based model workflow, which is described for the Smart Redact Plugin and is not reiterated here.
    • d) Record Storage: Save scan record, logs, and detected sensitive information to the database. All data is encrypted.
    • e) File Migration: Copy or move sensitive documents according to the user-defined configuration.
    • f) Cleanup: Delete temporary file copies from the SRS server.
  • Post-Processing for Users: Users can review scan records on the SRS platform and perform redaction, move, or copy operations.
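
The rescan decision in the batch processing step can be sketched as follows. This is an illustration under stated assumptions, not the SRS logic itself: `ScanRecord`, `needs_rescan`, and the idea of keying history by filename are hypothetical, and a file is treated as changed only when its modification time is later than the one recorded at the last scan.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ScanRecord:
    filename: str
    modified_at: datetime   # modification time seen at the last scan


def needs_rescan(filename: str, modified_at: datetime,
                 history: dict[str, ScanRecord]) -> bool:
    """Return True when the file is new or has changed since its last scan."""
    previous = history.get(filename)
    if previous is None:
        return True                       # never scanned before
    return modified_at > previous.modified_at


jan = datetime(2024, 1, 1, tzinfo=timezone.utc)
feb = datetime(2024, 2, 1, tzinfo=timezone.utc)
history = {"contracts/a.pdf": ScanRecord("contracts/a.pdf", jan)}

unchanged = needs_rescan("contracts/a.pdf", jan, history)
changed = needs_rescan("contracts/a.pdf", feb, history)
new_file = needs_rescan("contracts/b.pdf", jan, history)
```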

When users use Smart Redact Server to detect and redact sensitive information, SRS processes the documents in the user-designated cloud storage according to the user's policy. SRS does not retain original or intermediate documents. However, detected sensitive information is stored in encrypted form for review and follow-up actions.
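
One way to guarantee that temporary copies never outlive processing is to scope them to a temporary directory. The sketch below shows that pattern only; `process_document` and its three callable parameters are hypothetical stand-ins for the download, text extraction, and detection steps, and nothing here calls a real Foxit or Azure API.

```python
import tempfile
from pathlib import Path
from typing import Callable


def process_document(
    fetch: Callable[[Path], object],        # writes the source file to the given path
    extract_text: Callable[[Path], str],    # stand-in for the OCR step
    detect: Callable[[str], list[str]],     # stand-in for the AI detection step
) -> tuple[list[str], Path]:
    """Run download -> extract -> detect, then remove the temporary copy."""
    # The directory and everything in it is deleted when the block exits,
    # including when one of the steps raises an exception.
    with tempfile.TemporaryDirectory() as workdir:
        local_copy = Path(workdir) / "document.bin"
        fetch(local_copy)
        text = extract_text(local_copy)
        findings = detect(text)
    return findings, local_copy


# Toy stand-ins so the sketch runs without any external service.
findings, temp_path = process_document(
    fetch=lambda p: p.write_text("Contact: jane@example.com"),
    extract_text=lambda p: p.read_text(),
    detect=lambda t: [w for w in t.split() if "@" in w],
)
```

After the call, only the findings remain; the path that held the temporary copy no longer exists.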

How does Smart Redact keep data secure?

Secure data in transit

  • All web APIs are called via HTTPS, including calls to the Foxit Smart Redact Services API and Azure AI Services API. This ensures the secure transmission of documents and user data. Additionally, HTTPS versions and cipher suite selections are regularly reviewed and updated to align with industry best practices.
  • When Smart Redact Server accesses cloud storage services, all requests follow each cloud storage provider's recommended security guidelines.
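
For readers integrating with HTTPS services in their own tooling, the following sketch shows how a client can enforce certificate verification and a minimum protocol version using Python's standard `ssl` module. The TLS 1.2 floor is an assumption chosen for illustration; this document does not state which versions or cipher suites Foxit's services accept.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context with certificate checks and a version floor."""
    context = ssl.create_default_context()             # verifies certificates and hostnames
    context.minimum_version = ssl.TLSVersion.TLSv1_2   # assumed floor, not stated in this document
    return context


context = make_client_context()
```

The resulting context can be passed to standard-library clients, for example through the `context` argument of `urllib.request.urlopen`.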

Secure data at rest

Foxit implements a variety of measures to ensure the security of data at rest, with the key measures as follows.

  • Encryption of sensitive information: Sensitive information is encrypted and stored using the 256-bit Advanced Encryption Standard (AES).
  • Data Center Security: By leveraging AWS's Tier-4 data centers in Virginia, Frankfurt, and Montreal, Foxit ensures robust access controls, environmental safeguards, and physical access restricted to authorized personnel.
  • Data Privacy: Databases are firewall-protected and not publicly accessible, with access restricted to authorized personnel for business or legal purposes only.
  • Off-Grid Operation: For high-security needs, Foxit offers an "off-grid" mode, allowing Smart Redact Server to operate without cloud access.

Prudent data retention

  • When users use the Smart Redact Plugin of PDF Editor to detect and redact sensitive information in documents, Smart Redact follows a zero-retention policy. Once the task is complete, all original and intermediate documents, along with results, are instantly deleted.
  • When using the Smart Redact Server (SRS) for document processing, SRS does not retain original or intermediate documents. Detected sensitive information is stored in encrypted form. Deleting a project also permanently deletes all associated discovery results.
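
The rule that deleting a project also deletes its discovery results can be modeled with a cascading foreign key. The sketch below uses an in-memory SQLite database; the table and column names are hypothetical, and this document does not describe the actual SRS database or schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE discovery_result (
        id INTEGER PRIMARY KEY,
        project_id INTEGER NOT NULL REFERENCES project(id) ON DELETE CASCADE,
        encrypted_value BLOB NOT NULL      -- ciphertext only; encryption is out of scope here
    )
    """
)

conn.execute("INSERT INTO project (id, name) VALUES (1, 'contracts')")
conn.executemany(
    "INSERT INTO discovery_result (project_id, encrypted_value) VALUES (?, ?)",
    [(1, b"\x00\x01"), (1, b"\x02\x03")],
)

before = conn.execute("SELECT COUNT(*) FROM discovery_result").fetchone()[0]
conn.execute("DELETE FROM project WHERE id = 1")   # cascades to discovery_result
after = conn.execute("SELECT COUNT(*) FROM discovery_result").fetchone()[0]
```

Tying deletion to the schema rather than to application code means a project's results cannot be left behind by a partially completed cleanup routine.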

Secure development practices

Foxit follows the industry-standard Security Development Lifecycle (SDL) to ensure the security and reliability of Smart Redact. Key measures include:

  • Secure Design: Threat modeling is conducted early to identify risks, with security controls embedded in the design phase.
  • Secure Coding: Adherence to strict coding standards prevents common vulnerabilities.
  • Code Audits & Vulnerability Testing: Regular audits and testing ensure code integrity.
  • Security Testing: A combination of automated and manual testing is used to validate system security.
  • Secure Release: Rigorous security reviews ensure compliance before deployment.

Additionally, Foxit prioritizes security training to enhance developers' expertise. These practices reinforce Foxit’s commitment to delivering a secure and reliable Smart Redact Solution.

Privacy and Guidelines

Your use of Foxit’s Smart Redact solutions is governed by the Foxit End-User License Agreement (Foxit EULA) and the Foxit General Terms of Service. The Guidelines reflect Foxit’s dedication to complying with applicable laws and regulations, upholding the company’s values, and promoting the ethical use of AI technologies.

Foxit uses some of the Azure AI Service technologies to provide the Smart Redact solution. Each Azure AI service used by Smart Redact adheres to its own security and privacy standards. For details, refer to:

Conclusion

Foxit offers a best-in-class level of security tailored to the diverse needs of users and organizations across industries. We acknowledge the sensitivity of your information and workflows and are committed to safeguarding them with the highest level of protection. With Foxit, you gain a trusted vendor committed to not only delivering uncompromising PDF software but also ensuring its security across all facets in accordance with industry best practices.
For more information on Foxit security, please visit the Foxit Security Center.

Appendix A

A detailed list of categories supported by Smart Redact

Code Name Country Remark
Person All (PII)
Organization All (PII)
PersonType All (PII)
Address All (PII)
ZipCode All (PII) The first three digits of a zip code
Location All (PII) Includes names of cities, countries, regions, states, and manmade structures, as well as geographical locations such as rivers, oceans, and deserts.
Email All (PII)
FaxNumber All (PII)
DateTime All (PII)
Temperature All (PII)
Currency All (PII)
Age All (PII)
Percentage All (PII)
CreditCardNumber All (PII)
InternationalBankingAccountNumber All (PII) IBAN
Gender All (PII) Terms that disclose the gender of the subject, e.g., male, female, woman, gentleman, or lady.
SWIFTCode All
SocialMediaUrl All (PII) Supports detection of social media accounts: Twitter username, Facebook username, YouTube account, Vimeo account, Instagram username, LinkedIn URL, Pinterest username
HumanRace All (PII) Examples: "African", "Asian", "European", "Native American", "Oceanian"
ReligiousView All (PII) Example: "Judaism", "Catholic"
SexualPreference All (PII) Example: "bisexual", "homosexual", "heterosexual"
PoliticalAffiliation All (PII) Example: "Democratic Party (United States)" or "Republican Party (United States)"
CountryCode All (PII) Example: +591, +886
Language All (PII) Example: English, French
Occupation All (PII) Example: Scientist, Doctor
BloodType All (PII) e.g., A, B, AB, O
MaritalStatus All (PII) e.g., Married, Single, Divorced
IP All (PII) Network IPv4 and IPv6 addresses. Example: 168.131.1.1 and 21DA:D3:0:2F3B:2AA:FF:FE28:9C5A
ABARoutingNumber US (PII)
USPhoneNumber US (PII)
USIndividualTaxpayerIdentification US (PII)
USSocialSecurityNumber US (PII)
USDriversLicenseNumber US (PII)
USUKPassportNumber US and UK (PII) Context-aware recognition; requires passport-related text near the number.
USBankAccountNumber US (PII)
ACHRoutingNumber US (PII) Automatic Clearing House number
InsuranceProvider US (PII)
MemberID US (PII) Insurance member ID number
GroupID US Insurance group number
AUDriversLicense Australia (PII) Australian driver's license number
AUPassportNumber Australia (PII) Australian passport number
AUBusinessNumber Australia (PII) Australian Business number
BSBCode Australia (PII) Bank state branch code
AUSTRALIAPhoneNumber Australia (PII)
CustomerReferenceNumber Australia (PII) A CRN is 9 digits followed by a letter, for example: 123 456 789A.
TaxFileNumber Australia (PII) A tax file number (TFN) is free and identifies the user for tax and superannuation purposes.
UKDriversLicenseNumber UK (PII) UK drivers’ license number
CommunityHealthIndex UK (PII) Community Health Index (CHI) number, e.g., 0911640250
UKNationalHealthNumber UK (PII) National Health Service (NHS) number
UKNationalInsuranceNumber UK (PII) National insurance number
UKPhoneNumber UK (PII) U.K. phone number
ExaminationName All Examination (PHI); diagnostic procedures and tests, including vital signs and body measurements
Diagnosis All Diagnosis (PHI); disease, syndrome, poisoning
SymptomOrSign All Symptom (PHI); subjective or objective evidence of disease or other diagnoses
TreatmentName All Treatment (PHI); therapeutic procedures
Allergen All Allergen (PHI); an antigen triggering an allergic reaction
Course All Course (PHI); description of a change in another entity over time, such as condition progression, a course of treatment or medication
MeasurementValue All Measurement value (PHI); the value related to an examination or a medical condition measurement
Variant All Variant (PHI); all mentions of gene variations and mutations
GeneOrProtein All Gene/Protein (PHI); all mentions of names and symbols of human genes as well as chromosomes and parts of chromosomes and proteins
MutationType All Mutation type (PHI); description of the mutation, including its type, effect, and location
Expression All Expression (PHI); gene expression level
AdministrativeEvent All Administrative event (PHI); events that relate to the healthcare system but of an administrative/semi-administrative nature
CareEnvironment All Care environment (PHI); an environment or location where patients are given care
ConditionQualifier All Condition qualifier (PHI); qualitative terms that are used to describe a medical condition
MedicationName All Medication name (PHI); medication mentions, including copyrighted brand names, and non-brand names
Dosage All Dosage (PHI); amount of medication ordered
FamilyRelation All Family relation (PHI); mentions of family relatives of the subject
BodyStructure All Body structure (PHI); body systems, anatomic locations or regions, and body sites
Direction All Direction (PHI); directional terms that may relate to a body structure, medical condition, examination, or treatment
Frequency All Frequency (PHI); describes how often a medical condition, examination, treatment, or medication occurred, occurs, or should occur
Time All Time (PHI); temporal terms relating to the beginning and/or length (duration) of a medical condition, examination, treatment, medication, or administrative event
MeasurementUnit All Measurement unit (PHI); the unit related to an examination or a medical condition measurement
RelationalOperator All Relational operator (PHI); phrases that express the quantitative relation between an entity and some additional information
HealthcareProfession All Healthcare profession (PHI); a healthcare practitioner licensed or non-licensed
ConditionScale All Condition scale (PHI); qualitative terms that characterize the condition by a scale, which is a finite ordered list of values
MedicationClass All Medication class (PHI); a set of medications that have a similar mechanism of action, a related mode of action, a similar chemical structure, and/or are used to treat the same disease
MedicationForm All Medication form (PHI); the form of the medication
MedicationRoute All Medication route (PHI); the administration method of medication
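
Two of the formats in the table above are simple enough to check with pattern rules, which can be handy when spot-checking detection results. The sketch below is not how Smart Redact detects these categories (that is done by the AI models described earlier); `CRN_PATTERN` and `ip_version` are hypothetical names, and the CRN pattern assumes nine digits, optionally grouped in threes, followed by one uppercase letter.

```python
import ipaddress
import re
from typing import Optional

# Assumed pattern: nine digits, optionally grouped in threes, then one letter.
CRN_PATTERN = re.compile(r"\b\d{3} ?\d{3} ?\d{3}[A-Z]\b")


def ip_version(candidate: str) -> Optional[int]:
    """Return 4 or 6 for a valid IP address, otherwise None."""
    try:
        return ipaddress.ip_address(candidate).version
    except ValueError:
        return None


# The sample values are the examples given in the table.
crn_match = CRN_PATTERN.search("CRN: 123 456 789A on file")
v4 = ip_version("168.131.1.1")
v6 = ip_version("21DA:D3:0:2F3B:2AA:FF:FE28:9C5A")
invalid = ip_version("999.1.1.1")
```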