Today we are going to discuss about numerous methods in text extraction between specified pairs of characters, including parentheses, brackets, and curly braces.
We will explore techniques in Excel and Power Query.
Method-1:
=MAP(B3:B7,LAMBDA(x,TEXTBEFORE(TEXTAFTER(x,{"(","[","{"}),{")","]","}"})))
TEXTAFTER and TEXTBEFORE functions are suitable for our need.
To populate our results for entire range, we can use MAP function, which is one among LAMBDA helper functions.
The variable 'x' in LAMBDA function refers,
B3 cell value in first iteration and B4 cell value in 2nd iteration and so on.
Both TEXTAFTER and TEXTBEFORE functions can handle multiple delimiters wrapped inside curly braces. If any of the delimiters found in the text, then these functions will extract after or before the delimiters otherwise returns #N/A error. Last argument (if_not_found) in both functions can be used as error handling methods.
Let me start this from scratch,
=TEXTAFTER(x,{"(","[","{"})
in first iteration x -> B3 cell value,
This construction will pick text after any of the delimiters provided, if it is found.
Result of this construction is -> "quick brown) fox jumps over the lazy dog."
TEXTBEFORE(TEXTAFTER(x,{"(","[","{"}),{")","]","}"})
This construction will pick text before any of the delimiters provided, if it is found.
Then our result is -> "quick brown".
This customized function will be carried out till B7 range and will deliver the same number of results as in the range B3:B7.
Based on Rick Rothstein comment:
We can use TEXTAFTER and TEXTBEFORE function without LAMBDA helper functions, since they can handle arrays independently. Here is the formula,
=TEXTBEFORE(TEXTAFTER(B3:B7,{"(","[","{"}),{")","]","}"})
Method-2:
=MAP(B3:B7,LAMBDA(x,TEXTSPLIT(TAKE(TEXTSPLIT(x,{"(","{","["}),,-1),{")","}","]"})))
TEXTSPLIT function can handle array of delimiters inside curly braces. This function will split the text based on the delimiters of opening pairs.
=TEXTSPLIT(B3,{"(","{","["})
The result of inner TEXTSPLIT function is,
{"The ","quick brown) fox jumps over the lazy dog."}
Since, we have any one of the delimiters in each text, this function will split the text in 2 parts. The part which we need is 2nd one. So, TAKE function will help us to take the last part.
We can split the text again with the help of TEXTSPLIT function, but this time our delimiters are closing pairs.
Result of this customized function is -> "quick brown".
MAP will do this to entire range of B3:B7.
Method-3:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Result = Table.AddColumn(Source, "New", (f) =>
[
t = Text.SplitAny,
fin = t(t(f[Text],"({["){1},")}]"){0}
][fin])
in
Result
This is same as Method-2, Text.SplitAny is the M-Code replacement of TEXTSPLIT function in excel. In Text.SplitAny function we can use multiple delimiters as text.
Instead of TAKE function, here we can pick 2nd item in list using row index operator within curly braces. Since, power query is zero base, we used 1 inside curly brace to pick 2nd item.
Doing the split on resulting text with closing pairs delimiters will deliver us 2 items as list.
To pick first item in list, use zero inside curly brace.
Below are some M-Code and Excel formula to do this task with the help of Text functions.
Method-4:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Result = Table.AddColumn(Source,"Ans", (f) =>
[
t = Text.PositionOfAny, txt = Text.RemoveRange( f[Text],0, t(f[Text], {"(","{","["}) + 1 ), fin = Text.Start(txt, t(txt, {")","}","]"}))
][fin])
in
Result
Method-5:
=MAP(B3:B7,LAMBDA(x,LET(txt,REPLACE(x,1,MAX(IFERROR(FIND({"(","{","["},x),)),),MID(txt,1,MAX(IFERROR(FIND({")","}","]"},txt),))-1))))
Method-6:
=BYROW(B3:B7,LAMBDA(x,LET(txt,SUBSTITUTE(x,MID(x,MAX(IFERROR(FIND({")","}","]"},x),)),100),),MID(txt,MAX(IFERROR(FIND({"(","{","["},txt),))+1,100))))
I updated Method-1 formula in this post based on Rick Rothstein comment.
Your Method #1 formula does not need to use the MAP function as both TEXTBEFORE and TEXTAFTER work with dynamic arrays... so your Method #1 formula can be reduced to this...
=TEXTBEFORE(TEXTAFTER(B3:B7,{"(","[","{"}),{")","]","}"})