
Remove A Long Dash From A String In Javascript?

I've come across an error in my web app that I'm not sure how to fix. Text boxes are sending me the long dash as part of their content (you know, the special long dash that MS Word

Solution 1:

This code might help:

text = text.replace(/\u2013|\u2014/g, "-");

It replaces all en dashes (–, U+2013) and em dashes (—, U+2014) with plain hyphens (-).

DEMO: http://jsfiddle.net/F953H/
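As a minimal sketch, the same replacement can be wrapped in a small reusable function. The name `normalizeDashes` is hypothetical and only used for illustration; the character class is equivalent to the alternation shown above.

```javascript
// Hypothetical helper: replace en dashes (U+2013) and em dashes (U+2014)
// with an ASCII hyphen-minus.
function normalizeDashes(text) {
  return text.replace(/[\u2013\u2014]/g, "-");
}

console.log(normalizeDashes("2010\u20132020 \u2014 a decade")); // => 2010-2020 - a decade
```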

Solution 2:

That character is called an em dash. You can remove it like so:

str = str.replace(/\u2014/g, '');

Here is an example Fiddle: http://jsfiddle.net/x67Ph/

The \u2014 is called a Unicode escape sequence. These allow you to specify a Unicode character by its code point. U+2014 happens to be the em dash.
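A short sketch of how the escape sequence behaves, assuming a made-up sample string. It also shows that passing a plain string as the pattern to `replace` only affects the first occurrence, so a regular expression with the `g` flag is needed to remove every em dash.

```javascript
// "\u2014" and a literal em dash are the same one-character string.
const emDash = "\u2014";
console.log(emDash === "—");                     // => true
console.log(emDash.charCodeAt(0).toString(16));  // => 2014

// Sample input (made up for this example) containing two em dashes.
const sample = "a\u2014b\u2014c";

// A string pattern replaces only the first match...
const firstOnly = sample.replace("\u2014", "");
console.log(firstOnly === "ab\u2014c");          // => true

// ...while a regex with the g flag replaces all of them.
const allRemoved = sample.replace(/\u2014/g, "");
console.log(allRemoved);                         // => abc
```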

Solution 3:

There are several long-ish Unicode dashes you need to worry about (the pattern below covers U+2012 through U+2015): http://en.wikipedia.org/wiki/Dash

You can replace Unicode characters directly by using the Unicode escape:

'—my string'.replace( /[\u2012\u2013\u2014\u2015]/g, '' )

Solution 4:

There may be more characters behaving like this, and you may want to reuse them in HTML later. A more generic way to deal with it could be to replace all 'extended characters' with their HTML-encoded equivalents. You could do that like this:

yourString.replace(/[\u0080-\uC350]/g, function (a) {
  // Encode each matched character as a numeric HTML entity, e.g. &#8212;
  return '&#' + a.charCodeAt(0) + ';';
});
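For illustration, here is a sketch of that approach wrapped in a function and applied to a made-up input. The name `encodeExtended` is hypothetical; the character range is the one from the answer above and is not a complete definition of "extended characters".

```javascript
// Hypothetical helper: turn every character in the range U+0080..U+C350
// into a decimal numeric HTML entity.
function encodeExtended(str) {
  return str.replace(/[\u0080-\uC350]/g, function (ch) {
    return "&#" + ch.charCodeAt(0) + ";";
  });
}

// U+00E9 is 233 in decimal, U+2014 is 8212.
console.log(encodeExtended("caf\u00E9 \u2014 menu")); // => caf&#233; &#8212; menu
```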

Solution 5:

Since the ECMAScript 2018 standard, JavaScript regular expressions support Unicode property escapes when the u flag is set. One of them, \p{Dash}, matches any Unicode code point that has the Dash property:

/\p{Dash}/gu

In ES5, the equivalent expression is:

/[-\u058A\u05BE\u1400\u1806\u2010-\u2015\u2053\u207B\u208B\u2212\u2E17\u2E1A\u2E3A\u2E3B\u2E40\u2E5D\u301C\u3030\u30A0\uFE31\uFE32\uFE58\uFE63\uFF0D]|\uD803\uDEAD/g

See the Unicode Utilities reference.

Here are some JavaScript examples:

const text = "Dashes: \uFF0D\uFE63\u058A\u1400\u1806\u2010-\u2013\uFE32\u2014\uFE58\uFE31\u2015\u2E3A\u2E3B\u2053\u2E17\u2E40\u2E5D\u301C\u30A0\u2E1A\u05BE\u2212\u207B\u208B\u3030𐺭";
const es5_dash_regex = /[-\u058A\u05BE\u1400\u1806\u2010-\u2015\u2053\u207B\u208B\u2212\u2E17\u2E1A\u2E3A\u2E3B\u2E40\u2E5D\u301C\u3030\u30A0\uFE31\uFE32\uFE58\uFE63\uFF0D]|\uD803\uDEAD/g;
console.log(text.replace(es5_dash_regex, '-')); // Normalize each dash to ASCII hyphen
// => Dashes: ----------------------------

To match one or more dashes and replace with a single char (or remove in one go):

/\p{Dash}+/gu
/(?:[-\u058A\u05BE\u1400\u1806\u2010-\u2015\u2053\u207B\u208B\u2212\u2E17\u2E1A\u2E3A\u2E3B\u2E40\u2E5D\u301C\u3030\u30A0\uFE31\uFE32\uFE58\uFE63\uFF0D]|\uD803\uDEAD)+/g
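A brief sketch of the `\p{Dash}` patterns in use, assuming an engine with ES2018 support (any current browser or Node.js release). The input string is made up for the example.

```javascript
// Made-up input: an en dash and em dash in a row, plus a minus sign (U+2212).
const input = "one \u2013\u2014 two\u2212three";

// Collapse each run of dash characters into a single ASCII hyphen.
const collapsed = input.replace(/\p{Dash}+/gu, "-");
console.log(collapsed); // => one - two-three

// Remove every dash character outright.
const stripped = input.replace(/\p{Dash}/gu, "");
console.log(JSON.stringify(stripped)); // => "one  twothree"
```

Note that `\p{Dash}` also matches the ordinary ASCII hyphen, so removing all matches strips regular hyphens too.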
