MC256841: Updates to U.S. Social Security Number (SSN) sensitive information type definition for improved accuracy

To improve the accuracy of the “U.S. Social Security Number” (SSN) sensitive information type, we are making the following changes to its definition:

  1. Three discrete confidence levels (High, Medium, and Low) depending on the level of accuracy. The three levels indicate the likelihood of a true positive considering the following:
    • When the SSN was issued. SSNs issued before 2011 have a relatively strong definition due to additional checks.
    • Whether the SSN is formatted (ddd dd dddd or ddd-dd-dddd) or unformatted (ddddddddd).
    • Whether a keyword is found in proximity to the SSN.
  2. An additional pattern that does not require keywords in proximity, to reduce false negatives. The current definition requires keywords like “SSN” or “Social Security Number” in proximity to the actual number, which can sometimes lead to valid numbers not being detected (for example, in an Excel spreadsheet where the supporting keyword is present only in the header row).
  3. Added intelligence to detect high-volume SSNs in tabular data, such as an Excel spreadsheet where the keyword is present only in the header of the table. Use “High confidence” or “Medium confidence” in your policy for this. Please note that this requires at least one instance to be detected with a keyword in proximity.
    See details of current definition vs. new definition below.
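
To make the three factors above concrete, here is a minimal Python sketch of how a candidate number might be graded by format and keyword proximity. It is not Microsoft's actual definition: the regular expressions, the 50-character proximity window, the keyword list (limited to the two keywords the notice names), and the mapping of pattern plus keyword to a level are all assumptions made for illustration, and the issue-date checks are left out entirely.

```python
import re

# Formatted: ddd dd dddd or ddd-dd-dddd, with the same separator used twice.
FORMATTED = re.compile(r"(?<!\d)\d{3}([- ])\d{2}\1\d{4}(?!\d)")
# Unformatted: exactly nine consecutive digits.
UNFORMATTED = re.compile(r"(?<!\d)\d{9}(?!\d)")

# Assumption: only the two keywords quoted in the notice, matched as whole words.
KEYWORD = re.compile(r"\b(ssn|social security number)\b", re.IGNORECASE)
# Assumption: hypothetical proximity window in characters; the notice gives no figure.
WINDOW = 50


def has_keyword_nearby(text, start, end, window=WINDOW):
    """Return True if a keyword appears within `window` characters of the match."""
    context = text[max(0, start - window):end + window]
    return KEYWORD.search(context) is not None


def classify(text):
    """Return (matched_text, level) pairs for every candidate number in `text`.

    Assumed mapping (not taken from the notice):
      formatted + keyword   -> High
      unformatted + keyword -> Medium
      no keyword            -> Low
    """
    results = []
    for pattern, formatted in ((FORMATTED, True), (UNFORMATTED, False)):
        for m in pattern.finditer(text):
            if not has_keyword_nearby(text, m.start(), m.end()):
                level = "Low"
            elif formatted:
                level = "High"
            else:
                level = "Medium"
            results.append((m.group(), level))
    return results
```

Under these assumptions, `classify("SSN: 123-45-6789")` grades the number as High, while a bare nine-digit order number with no keyword nearby is graded Low.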

Updated July 19, 2021: We have made the decision to make additional changes before we proceed with the rollout. To ensure that your policies continue to behave as they do today without impacting the accuracy, we are delaying this release until we work on the necessary changes. We will send out another communication with the next updates. Thank you for your patience.

Affected Workloads

  • Microsoft 365 suite

When this will happen

We will communicate via Message center when we are ready to proceed.

How this will affect your organization

Your existing policies, including data loss prevention policies, do not need to be changed. However, depending on your needs, you may wish to change the confidence level for US SSN within your policies (such as data loss prevention, communication compliance, sensitivity labeling, or records management). For example, if you wish to have minimal false positives, you may set the confidence level to High, and you can set the confidence level to Low if you want minimal false negatives.

  • We recommend that you use High confidence level in your policies for minimal false positives.
  • If you wish to detect unformatted numbers like 123121234 as well, you should use Medium confidence level.
  • Using Low confidence may result in a lot of false positives due to the weak definition of US SSN, where any 9-digit number can be a valid SSN. Please note that using Medium or High confidence will still detect high-volume SSNs without keywords, provided at least one instance has a keyword in proximity.
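
The trade-off between the three levels, and the rule that one keyword match can support keyword-less numbers elsewhere in the same document, can be sketched as follows. This is an illustrative model only: the function names, the tuple layout, and the idea that a single keyword match lifts every other candidate in the document are assumptions, not the published detection logic.

```python
LEVEL_ORDER = {"Low": 0, "Medium": 1, "High": 2}


def score(formatted, keyword_support):
    """Assumed grading: no keyword support -> Low; otherwise High if formatted, else Medium."""
    if not keyword_support:
        return "Low"
    return "High" if formatted else "Medium"


def reported(candidates, min_level):
    """Return the (value, level) pairs a policy set to `min_level` would report.

    `candidates` is a list of (value, formatted, has_keyword) tuples taken from
    one document, for example the rows of a spreadsheet column.
    """
    # Assumption: one instance with a keyword in proximity (such as a header
    # row) lends keyword support to every other candidate in the document.
    any_keyword = any(has_kw for _, _, has_kw in candidates)
    hits = []
    for value, formatted, has_kw in candidates:
        level = score(formatted, has_kw or any_keyword)
        if LEVEL_ORDER[level] >= LEVEL_ORDER[min_level]:
            hits.append((value, level))
    return hits
```

With a sheet where only the first row sits near a keyword, a High policy in this model reports the formatted numbers, a Medium policy also reports the unformatted ones, and a document with no keyword at all is reported only at Low.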

What you need to do to prepare

Review your policies and set the appropriate confidence level for the US SSN sensitive information type based on what you want to detect.

Message ID: MC256841
Published: 17 May 2021
Updated: 19 July 2021

Alex Lim is a certified IT Technical Support Architect with over 15 years of experience in designing, implementing, and troubleshooting complex IT systems and networks. He has worked for leading IT companies, such as Microsoft, IBM, and Cisco, providing technical support and solutions to clients across various industries and sectors. Alex has a bachelor’s degree in computer science from the National University of Singapore and a master’s degree in information security from the Massachusetts Institute of Technology. He is also the author of several best-selling books on IT technical support, such as The IT Technical Support Handbook and Troubleshooting IT Systems and Networks. Alex lives in Bandar, Johore, Malaysia with his wife and two children.
