Remove Accents from Strings in MuleSoft

A diacritic (also diacritical mark, diacritical point, diacritical sign, or accent) is a glyph added to a letter or basic glyph.

The main use of diacritical marks in the Latin script is to change the sound-values of the letters to which they are added.

Now let us talk about the use case we are trying to solve.

We want to strip these accents so that each word is reduced to its unaccented base form. Example: José Alberto should give Jose Alberto.

The Approaches

1. Using Custom Java Code

We can create a simple Java class called StringUtils in the utility package under src/main/java.

The code should look like below:

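package utility;

import java.text.Normalizer;

public class StringUtils {

	// Decompose accented letters (NFD) into base letter + combining mark,
	// then remove every character that is not plain ASCII.
	public static String stripAccents(String src) {
		return Normalizer.normalize(src, Normalizer.Form.NFD)
				.replaceAll("[^\\p{ASCII}]", "");
	}
}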
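To make it clearer why this works, here is a small standalone sketch. It is not part of the Mule project; the class name AccentDemo and the stripMarksOnly variant are illustrative assumptions. NFD normalization splits an accented letter into its base letter plus a combining mark, and the regex then removes whatever is not ASCII. One side effect worth knowing: letters that have no decomposition, such as ø, are removed entirely rather than mapped to a base letter.

```java
import java.text.Normalizer;

public class AccentDemo {

    // Same approach as StringUtils.stripAccents above:
    // decompose, then drop every non-ASCII character.
    public static String stripNonAscii(String src) {
        return Normalizer.normalize(src, Normalizer.Form.NFD)
                .replaceAll("[^\\p{ASCII}]", "");
    }

    // Hypothetical variant: drop only the combining marks (Unicode category M)
    // that NFD splits off, and leave all other characters untouched.
    public static String stripMarksOnly(String src) {
        return Normalizer.normalize(src, Normalizer.Form.NFD)
                .replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        // "\u00e9" is é; after NFD it becomes 'e' followed by U+0301.
        String decomposed = Normalizer.normalize("\u00e9", Normalizer.Form.NFD);
        System.out.println(decomposed.length());               // 2

        System.out.println(stripNonAscii("Jos\u00e9 Alberto")); // Jose Alberto

        // "\u00f8" is ø, which has no canonical decomposition.
        System.out.println(stripNonAscii("S\u00f8ren"));        // Sren
        System.out.println(stripMarksOnly("S\u00f8ren"));       // Søren (unchanged)
    }
}
```

If silently dropping such letters is not acceptable, a variant that removes only combining marks, like stripMarksOnly in this sketch, may be a better fit, at the cost of letting other non-ASCII characters through.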

We can then call the StringUtils class from a Java connector or, better, use it directly in DataWeave as below:

%dw 2.0
import java!utility::StringUtils
output application/json
---
{
	name1: StringUtils::stripAccents("Santiago Muñez"),
	name2: StringUtils::stripAccents("Mesut Özil")
}

Here we import the Java class using import java!utility::StringUtils.

The output:

{
  "name1": "Santiago Munez",
  "name2": "Mesut Ozil"
}

2. Using Apache Commons Lang

First, we add the Maven dependency to pom.xml:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.12.0</version>
</dependency>

Next, we import it directly in DataWeave and call the stripAccents function:

%dw 2.0
import java!org::apache::commons::lang3::StringUtils
output application/json
---
{
	name1: StringUtils::stripAccents("Santiago Muñez"),
	name2: StringUtils::stripAccents("Mesut Özil")
}

Here we import the StringUtils class from the Apache Commons Lang package.

The output:

{
  "name1": "Santiago Munez",
  "name2": "Mesut Ozil"
}

I hope you liked it and that it will be useful in some of your integrations.

Cheers.

 
