Cleaning text data is a crucial step in data preparation for Power BI reports and dashboards. Power Query provides powerful text transformation tools that allow you to:

  • Remove extra spaces and unwanted characters
  • Standardize text formatting (uppercase, lowercase, proper case)
  • Split and merge columns based on delimiters
  • Replace patterns and fix inconsistencies

In this guide, you’ll learn advanced text cleaning techniques to improve data quality and make your Power BI models more reliable.


1. Removing Extra Spaces and Unwanted Characters

📌 Trimming Spaces

The Trim function in Power Query removes leading and trailing whitespace from text fields. It does not touch spaces inside the text.

Steps to Trim Spaces:

  1. Select the text column.
  2. Go to the Transform tab.
  3. Click Format → Trim.

👉 This removes spaces at the start and end of each value. Repeated spaces between words are left in place and need a separate step.
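
The same step can be written directly in M. The sketch below is illustrative only: the Customer column and the sample values are hypothetical, and CollapseSpaces is a helper defined here, not a built-in function. It applies Text.Trim, which strips leading and trailing whitespace, and then collapses repeated inner spaces by splitting on a space, dropping the empty pieces, and rejoining.

```powerquery
let
    // Hypothetical sample data with stray spaces
    Source = #table(type table [Customer = text], {{"  Jane   Doe  "}, {"John Doe "}}),
    // Text.Trim strips leading and trailing whitespace only
    Trimmed = Table.TransformColumns(Source, {{"Customer", Text.Trim, type text}}),
    // Hypothetical helper: split on a space, drop empty pieces, rejoin with one space
    CollapseSpaces = (t as text) as text =>
        Text.Combine(List.Select(Text.Split(t, " "), each _ <> ""), " "),
    Collapsed = Table.TransformColumns(Trimmed, {{"Customer", CollapseSpaces, type text}})
in
    Collapsed
```

With this sample, both rows end up as "Jane Doe" and "John Doe" with single spaces between the words.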

📌 Cleaning Non-Printable Characters

If your dataset contains hidden characters (e.g., line breaks, tabs, or other control characters), use the Clean function:

  1. Select the column.
  2. Click Transform → Format → Clean.

🚀 Example: " Customer Name " → "Customer Name" (leading and trailing spaces removed with Trim)
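
In M, the Clean command corresponds to Text.Clean. A minimal sketch, assuming a hypothetical Customer column whose value ends with a carriage return and line feed:

```powerquery
let
    // Hypothetical sample: the value ends with a carriage return and a line feed
    Source = #table(type table [Customer = text], {{"Customer Name#(cr,lf)"}}),
    // Text.Clean removes control characters such as line breaks and tabs
    Cleaned = Table.TransformColumns(Source, {{"Customer", Text.Clean, type text}})
in
    Cleaned
```

Text.Clean targets control characters only. Ordinary spaces are untouched, so it is often paired with Trim.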


2. Standardizing Text Case (Upper, Lower, Proper Case)

Consistent text formatting is essential for data matching and analysis.

Steps to Change Case:

  1. Select the column.
  2. Click Transform → Format, then choose:
    • UPPERCASE – Converts all text to capital letters.
    • lowercase – Converts text to small letters.
    • Capitalize Each Word – Formats text in Proper Case.

🚀 Example:

  • "jane doe" → "Jane Doe" (Proper Case)
  • "CUSTOMER" → "customer" (Lowercase)
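
The three case options map to Text.Upper, Text.Lower, and Text.Proper in M. The sketch below adds one column per option so the results can be compared side by side. The Name column and the new column names are hypothetical.

```powerquery
let
    // Hypothetical sample data
    Source = #table(type table [Name = text], {{"jane doe"}, {"CUSTOMER"}}),
    // Add one column per case style
    WithUpper = Table.AddColumn(Source, "Upper", each Text.Upper([Name]), type text),
    WithLower = Table.AddColumn(WithUpper, "Lower", each Text.Lower([Name]), type text),
    WithProper = Table.AddColumn(WithLower, "Proper", each Text.Proper([Name]), type text)
in
    WithProper
```

To change the column in place instead, as the Format menu does, the same functions can be passed to Table.TransformColumns.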

3. Splitting Text Columns Using Delimiters

If data is stored in a single column but needs separation, use Split Column by Delimiter.

Common Use Cases:

  • Splitting Full Name into First Name and Last Name.
  • Separating City, State, ZIP into individual fields.
  • Extracting usernames and domain names from email addresses.

Steps to Split Text by Delimiter:

  1. Select the column.
  2. Click Transform → Split Column → By Delimiter.
  3. Choose a delimiter (comma, space, hyphen, custom).
  4. Select where to split (at the left-most delimiter, the right-most delimiter, or each occurrence).

🚀 Example:

  • "John Doe" → "John" and "Doe" (Split by Space)
  • "user@example.com" → "user" and "example.com" (Split by @)
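
For reference, a split at the first delimiter can be sketched in M with Table.SplitColumn and a splitter function. The Email column, the sample address, and the output column names Username and Domain are assumptions made for illustration.

```powerquery
let
    // Hypothetical sample data
    Source = #table(type table [Email = text], {{"user@example.com"}}),
    // Split once, at the first "@", into two named columns
    Split = Table.SplitColumn(
        Source,
        "Email",
        Splitter.SplitTextByEachDelimiter({"@"}, QuoteStyle.Csv, false),
        {"Username", "Domain"}
    )
in
    Split
```

To split at every occurrence of the delimiter, Splitter.SplitTextByDelimiter can be used in place of Splitter.SplitTextByEachDelimiter.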

4. Merging Multiple Text Columns

To combine multiple text columns into a single column, use Merge Columns.

Steps to Merge Columns:

  1. Select multiple columns.
  2. Click Transform → Merge Columns.
  3. Choose a Separator (space, comma, dash, or custom).
  4. Enter a new column name.

🚀 Example:

  • "John" + "Doe" → "John Doe" (Merged with Space)
  • "New York" + "NY" → "New York, NY" (Merged with Custom Separator ", ")
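
A merge can also be expressed in M with Table.CombineColumns. In this sketch the City and State columns and the Location output name are hypothetical, and the separator is a comma followed by a space.

```powerquery
let
    // Hypothetical sample data
    Source = #table(type table [City = text, State = text], {{"New York", "NY"}}),
    // Combine the two columns into one, separated by ", "
    Merged = Table.CombineColumns(
        Source,
        {"City", "State"},
        Combiner.CombineTextByDelimiter(", ", QuoteStyle.None),
        "Location"
    )
in
    Merged
```

Table.CombineColumns replaces the source columns with the merged one. To keep the originals, add a custom column that concatenates the values instead.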

5. Replacing Patterns & Fixing Inconsistencies

📌 Find & Replace Values

If your dataset contains inconsistent spellings (e.g., "USA" vs. "U.S.A."), you can replace values easily.

Steps to Replace Values:

  1. Select the column.
  2. Click Transform → Replace Values.
  3. Enter the Value To Find and the Replace With value, then click OK.

🚀 Example:

  • "N.Y." → "New York"
  • "U.S.A." → "USA"
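
In M, a replacement step can be sketched with Table.ReplaceValue. The Country column and sample rows below are hypothetical.

```powerquery
let
    // Hypothetical sample data with inconsistent spellings
    Source = #table(type table [Country = text], {{"U.S.A."}, {"USA"}}),
    // Replacer.ReplaceText replaces the matching text wherever it occurs inside a value
    Replaced = Table.ReplaceValue(Source, "U.S.A.", "USA", Replacer.ReplaceText, {"Country"})
in
    Replaced
```

Replacer.ReplaceText matches substrings. To replace only cells whose entire content equals the search value, Replacer.ReplaceValue can be used instead.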

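📌 Using Advanced Pattern Matching with M Language Functions

For complex replacements (e.g., removing numbers, fixing typos), use M language functions like Text.Replace(), Text.Remove(), or Text.Select(). M has no built-in regular expression functions, so these functions work with literal text and lists of characters rather than regex patterns.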

📌 Example: Removing Numbers from Text

= Table.AddColumn(Source, "Cleaned Text", each Text.Remove([Column1], {"0".."9"}))

🚀 Input: "Product123" → Output: "Product"
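
Text.Select is the counterpart of Text.Remove: it keeps only the characters in the list. A small sketch, assuming the same Column1 name as above and a hypothetical sample value:

```powerquery
let
    // Hypothetical sample data
    Source = #table(type table [Column1 = text], {{"Product123"}}),
    // Keep only the digits: "Product123" becomes "123"
    WithDigits = Table.AddColumn(Source, "Digits", each Text.Select([Column1], {"0".."9"}), type text),
    // Keep only ASCII letters: "Product123" becomes "Product"
    WithLetters = Table.AddColumn(WithDigits, "Letters", each Text.Select([Column1], {"A".."Z", "a".."z"}), type text)
in
    WithLetters
```

The letter ranges cover unaccented ASCII letters only. Accented or non-Latin characters would need to be added to the list.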


6. Extracting Substrings for More Control

You can extract specific parts of a text field based on position.

Steps to Extract Text:

  1. Select the column.
  2. Click Transform → Extract.
  3. Choose from:
    • First Characters (e.g., first 5 letters)
    • Last Characters (e.g., last 3 letters)
    • Text Between Delimiters

🚀 Example:

  • "Invoice_2025_ABC" → Extract "2025" (Text Between "_")
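
These extraction options correspond to Text.Start, Text.End, and Text.BetweenDelimiters in M. The Code column and the new column names in this sketch are hypothetical.

```powerquery
let
    // Hypothetical sample data
    Source = #table(type table [Code = text], {{"Invoice_2025_ABC"}}),
    // First 7 characters: "Invoice"
    WithFirst = Table.AddColumn(Source, "First7", each Text.Start([Code], 7), type text),
    // Last 3 characters: "ABC"
    WithLast = Table.AddColumn(WithFirst, "Last3", each Text.End([Code], 3), type text),
    // Text between the first "_" and the next "_": "2025"
    WithYear = Table.AddColumn(WithLast, "Year", each Text.BetweenDelimiters([Code], "_", "_"), type text)
in
    WithYear
```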

7. Removing Duplicates from Text Data

If your dataset has duplicate text entries, use Remove Duplicates:
✔ Select the column, then click Home → Remove Rows → Remove Duplicates.

🚀 Example:

Original Data | Cleaned Data
John Doe      | John Doe
John Doe      | (Removed)
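
In M, duplicate removal is done with Table.Distinct. Its default comparison is exact, so values that differ only in spacing or letter case count as different. The sketch below, using a hypothetical Name column, trims first so that "John Doe " with a trailing space is treated as a duplicate of "John Doe".

```powerquery
let
    // Hypothetical sample data: the second row has a trailing space
    Source = #table(type table [Name = text], {{"John Doe"}, {"John Doe "}, {"John Doe"}}),
    // Normalize whitespace before comparing
    Trimmed = Table.TransformColumns(Source, {{"Name", Text.Trim, type text}}),
    // Keep the first row for each distinct Name value
    Deduplicated = Table.Distinct(Trimmed, {"Name"})
in
    Deduplicated
```

With this sample, a single "John Doe" row remains.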

Best Practices for Text Cleaning in Power Query

  • Always preview transformations before applying them to large datasets.
  • Use M Language for complex text manipulations when built-in tools aren’t enough.
  • Keep column names clean by renaming them after transformations.
  • Apply transformations in the right order. For example, trim and standardize case before removing duplicates, because values that differ only in spacing or letter case are treated as distinct.


Conclusion

🚀 Power Query offers text cleaning tools that save time and improve data quality. By mastering the following, you’ll improve data accuracy and make your Power BI reports more reliable:

  • Trimming, formatting, and replacing text
  • Splitting and merging columns
  • Extracting and transforming text dynamically