top of page

Excel Arena

This is the place, where we can test the boundaries of Excel functions.

Extract a string of unique characters from a given text

Today we are going to discuss about numerous methods to extract Unique characters string from a given text. Lot to learn, let's dive in.


Expected result shown in below picture.

Method -1:

Formula in C4 cell,

=CONCAT(MID(B4,MODE.MULT(IFERROR(MATCH(ROW(INDIRECT("1:"&LEN(B4))),MATCH(LOWER(MID(B4,ROW(INDIRECT("1:"&LEN(B4))),1)),LOWER(MID(B4,ROW(INDIRECT("1:"&LEN(B4))),1)),0),{0,0}),"")),1))


Let me start this from scratch,


Step-1:

Row, Indirect construction helps us to get the sequence of numbers in dynamic way. =ROW(INDIRECT("1:"&LEN(B4)))

Result of above construction is,

={1;2;3;4;5;6;7;8}


Step-2:

MID function helps to extract one character from each position.

=MID(B4,ROW(INDIRECT("1:"&LEN(B4))),1)

Result of above construction is,

={"a";"d";"e";"q";"u";"a";"t";"e"}


Step-3:

Match function helps us to get the relative position of each string.

=MATCH(MID(B4,ROW(INDIRECT("1:"&LEN(B4))),1),MID(B4,ROW(INDIRECT("1:"&LEN(B4))),1),0)

Result of above construction is,

={1;2;3;4;5;1;7;3}

You can see that 1 and 3 are duplicated, Strings of relative position are "a" and "e".


Step-4:

In this step, our main task is to remove the relative position of duplicate characters.

=MATCH(ROW(INDIRECT("1:"&LEN(B4))),MATCH(MID(B4,ROW(INDIRECT("1:"&LEN(B4))),1),MID(B4,ROW(INDIRECT("1:"&LEN(B4))),1),0),0)

Result of above construction is,

={1;2;3;4;5;#N/A;7;#N/A}


Step-5:

Make a slight adjustment in match function to remove the error part.

=IFERROR(MATCH(ROW(INDIRECT("1:"&LEN(B4))),MATCH(MID(B4,ROW(INDIRECT("1:"&LEN(B4))),1),MID(B4,ROW(INDIRECT("1:"&LEN(B4))),1),0),{0,0}),"")

Result of above construction is,

={1,1;2,2;3,3;4,4;5,5;"","";7,7;"",""}

Pair of relative position of unique characters and zero length text strings.


Step-6:

MODE.MULT function helps us to remove texts and to get one side elements in pair.

=MODE.MULT(IFERROR(MATCH(ROW(INDIRECT("1:"&LEN(B4))),MATCH(MID(B4,ROW(INDIRECT("1:"&LEN(B4))),1),MID(B4,ROW(INDIRECT("1:"&LEN(B4))),1),0),{0,0}),""))

Result of above construction is,

={1;2;3;4;5;7}


Step-3 to Step-6 is to extract relative position of Unique characters, now we can use this in INDEX function.


Step-7:

=INDEX(MID(B4,ROW(INDIRECT("1:"&LEN(B4))),1),MODE.MULT(IFERROR(MATCH(ROW(INDIRECT("1:"&LEN(B4))),MATCH(MID(B4,ROW(INDIRECT("1:"&LEN(B4))),1),MID(B4,ROW(INDIRECT("1:"&LEN(B4))),1),0),{0,0}),"")))

Result of above construction is,

={"a";"d";"e";"q";"u";"t"}


Then, with the help of CONCAT function we can simply concatenate.


Method - 2:

Formula in D4 cell is,

=LET(seq,ROW(INDIRECT("1:"&LEN(B4))),CONCAT(MID(B4,MODE.MULT(IFERROR(MATCH(seq,MATCH(LOWER(MID(B4,seq,1)),LOWER(MID(B4,seq,1)),0),{0,0}),"")),1)))


In method - 1, we used Row and Indirect construction in 3 places to create sequence numbers. Now, we are going to use LET function to improve the efficiency and readability.

In this method, sequence of numbers evaluated and stored in memory for once. This will be used whenever it is asked to give the sequence.

That's it, no other change.


Method - 3:

This is the simplest of all,

Formula in E4 cell is,

=CONCAT(UNIQUE(MID(B4,ROW(INDIRECT("1:"&LEN(B4))),1)))


In method - 1, step-3 to step-7 done to extract unique characters. Here,

these steps were replaced by UNIQUE function. That is the power of Dynamic array function.

20 views3 comments

3 Comments


A couple of points.

  1. Why are you using ROW(INDIRECT("1:"&LEN(B4))) instead of SEQUENCE(LEN(B4))?

  2. If you still want to use your ROW approach, you can eliminate the Volatile INDIRECT function call if you can guess at a number that is guaranteed to be bigger than the number of characters in the text. For example, if we know none of the text will ever be longer than 99 characters, then this formula would work... =CONCAT(UNIQUE(MID(B4,ROW(1:99),1)))

Like
Replying to

Rick, Great to have you here, this means a lot to me.

Like
bottom of page