Action that extracts regex match details (multiple matches & capture groups) into a dictionary variable.

sampleuserhere

Active member
For example, suppose we have the following text.
Code:
1,one
2,two
3,three
4,four
And we want to extract data from it with the following regex: (?<number>\d+),(?<text>[a-z]+)


Here's the information that can be retrieved from that text and regex:
  1. Matches: [ "1,one", "2,two", "3,three", "4,four" ]
  2. Capture group "number": [ "1", "2", "3", "4" ]
  3. Capture group "text": [ "one", "two", "three", "four" ]
See the details here: https://regex101.com/r/iCrCFG/1


What I propose is a sub-action of Text Manipulation that returns this information as the following JSON object/dictionary variable.
Code:
{
  "matches": [
    "1,one",
    "2,two",
    "3,three",
    "4,four"
  ],
  "number": [
    "1",
    "2",
    "3",
    "4"
  ],
  "text": [
    "one",
    "two",
    "three",
    "four"
  ]
}
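
For illustration, here is a minimal JavaScript sketch of how such a dictionary could be built from named capture groups. The helper name extractNamedGroups is hypothetical and has nothing to do with MacroDroid's internals; it only assumes a JavaScript engine that supports named groups and String.prototype.matchAll.

```javascript
// Sketch only: `extractNamedGroups` is a hypothetical helper, not a MacroDroid API.
function extractNamedGroups(text, regex) {
  // matchAll requires the global flag, so add it if the caller left it off.
  const flags = regex.flags.includes("g") ? regex.flags : regex.flags + "g";
  const globalRegex = new RegExp(regex.source, flags);

  const result = { matches: [] };
  for (const match of text.matchAll(globalRegex)) {
    result.matches.push(match[0]); // match[0] is the full match
    // match.groups is undefined when the pattern has no named groups.
    for (const [name, value] of Object.entries(match.groups || {})) {
      if (!result[name]) result[name] = [];
      // Groups that did not take part in this match are skipped.
      if (value !== undefined) result[name].push(value);
    }
  }
  return result;
}

const sampleText = "1,one\n2,two\n3,three\n4,four";
const proposed = extractNamedGroups(sampleText, /(?<number>\d+),(?<text>[a-z]+)/);
console.log(JSON.stringify(proposed, null, 2));
```

One caveat of this sketch: a capture group that is itself named "matches" would collide with the key used for the full matches.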


The rules are fairly simple.
  1. Full matches are stored under the matches key.
  2. Each named capture group is stored under its own name. An unnamed capture group is stored under a key that follows the name pattern "group#".
Code:
{
  "matches": [
  ],
  "group0": [
  ],
  "group1": [
  ]
}
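
As a rough sketch of the "group#" rule, the snippet below stores unnamed groups by position. It assumes zero-based numbering (group0, group1) as in the example above, and the helper name extractUnnamedGroups is made up for illustration.

```javascript
// Sketch only: `extractUnnamedGroups` is a hypothetical helper.
// The regex passed in must carry the global flag, otherwise matchAll throws.
function extractUnnamedGroups(text, regex) {
  const result = { matches: [] };
  for (const match of text.matchAll(regex)) {
    result.matches.push(match[0]);
    // match[1], match[2], ... hold the capture groups in pattern order.
    for (let i = 1; i < match.length; i++) {
      const key = `group${i - 1}`; // zero-based, as in the proposal
      if (!result[key]) result[key] = [];
      if (match[i] !== undefined) result[key].push(match[i]);
    }
  }
  return result;
}

const unnamed = extractUnnamedGroups("1,one\n2,two", /(\d+),([a-z]+)/g);
console.log(JSON.stringify(unnamed, null, 2));
```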

With this mechanism, I believe extracting data from text will become as easy as it should be with regex.

TIA.
 

LinerSeven

Active member
Hi, @sampleuserhere.

Your idea is very good.

However, I would ask you to consider calmly whether any other apps offer similar support. It probably isn't realistic given the amount of work it would require from the developer.

MacroDroid's regex support probably depends on a class library to interpret patterns, so if that library returned the kind of results you describe, it might be possible. But I, and probably other developers, do not know of such a library.

I think it would be tremendously difficult to deal with.

Best Regards,
Liner Seven
 

sampleuserhere

Active member
There is one I know of: Tasker (written in Kotlin), with its Simple Match Regex action. I'm not sure, though, how complicated it would be in the language MD is written in.

Anyway, there seems to be a function in JavaScript that can extract named capture groups; I asked ChatGPT about it last night.

I'll try to include the script later if it's needed.
 

sampleuserhere

Active member
It seems that regex.exec() returns the information needed for what I proposed earlier.
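
For reference, a small sketch of what a single regex.exec() call hands back, using the pattern from the first post on a shortened input. The variable names are illustrative only.

```javascript
const pattern = /(?<number>\d+),(?<text>[a-z]+)/g;
const input = "1,one\n2,two";

const first = pattern.exec(input);
console.log(first[0]);            // the full match: "1,one"
console.log(first[1], first[2]);  // capture groups by position: "1" "one"
console.log(first.groups.number); // capture group by name: "1"
console.log(first.index);         // offset of the match in the input: 0

// With the g flag, each call resumes from pattern.lastIndex.
const second = pattern.exec(input);
console.log(second[0], second.index); // "2,two" 6

// When nothing is left to match, exec() returns null.
const third = pattern.exec(input);
console.log(third); // null
```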


Here's the script, generated mostly with ChatGPT.
Code:
const inputText = `1,one
2,two
3,three
4,four as a a
ffive,five`;

const regex = /(\w*),(?<text>\w+)|(a)/gm;
let matches;
let matchInfo = {};

let groupCounter = 0;

while ((matches = regex.exec(inputText)) !== null) {
  for (let i = 1; i < matches.length; i++) {
    let groupName = matches.groups ?
                      (matches.groups[`group${i}`] || Object.keys(matches.groups)[i - 1])
                      : `group${groupCounter++}`;
                      
    if ( groupName === undefined ) {
      groupName = "group" + ( i - 2 )
    }
    
    if (!matchInfo[groupName]) {
      matchInfo[groupName] = [];
    }

    if (matches[i]) {
      matchInfo[groupName].push(matches[i]);
    }
  }
}

matchInfo.matches = inputText.match(regex);
console.log(JSON.stringify(matchInfo, null, 2));

It returns the following when run with Node.js in Visual Studio Code.
Code:
{
  "text": [
    "1",
    "2",
    "3",
    "4",
    "ffive"
  ],
  "group0": [
    "one",
    "two",
    "three",
    "four",
    "five"
  ],
  "group1": [
    "a",
    "a",
    "a"
  ],
  "matches": [
    "1,one",
    "2,two",
    "3,three",
    "4,four",
    "a",
    "a",
    "a",
    "ffive,five"
  ]
}
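
Note that in the output above, the unnamed first group ("1", "2", ...) ends up under "text", while the group that is actually named text ends up under "group0". This happens because Object.keys(matches.groups)[i - 1] indexes the named groups only, not all groups. The sketch below is one possible way to line the keys up: it reads the group names from regex.source in pattern order. Both helper names are hypothetical, and the small parser is an assumption that covers common syntax only (escapes, character classes, lookarounds), not every regex feature.

```javascript
// Sketch only: returns one entry per capture group, in pattern order.
// A named group yields its name, an unnamed group yields null.
function captureGroupNames(source) {
  const names = [];
  let inClass = false;
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (ch === "\\") { i++; continue; } // skip the escaped character
    if (inClass) { if (ch === "]") inClass = false; continue; }
    if (ch === "[") { inClass = true; continue; }
    if (ch !== "(") continue;
    if (source[i + 1] !== "?") { names.push(null); continue; } // unnamed capture
    // (?<name>...) is a named capture; (?:, (?=, (?!, (?<=, (?<! capture nothing.
    const named = /^\(\?<([^=!>][^>]*)>/.exec(source.slice(i));
    if (named) names.push(named[1]);
  }
  return names;
}

// The regex passed in must carry the global flag, otherwise matchAll throws.
function extractMatchInfo(text, regex) {
  let unnamedCount = 0;
  const keys = captureGroupNames(regex.source).map((name) =>
    name !== null ? name : `group${unnamedCount++}`
  );
  const info = { matches: [] };
  for (const key of keys) info[key] = [];
  for (const match of text.matchAll(regex)) {
    info.matches.push(match[0]);
    keys.forEach((key, i) => {
      // Groups that did not take part in this match are skipped.
      if (match[i + 1] !== undefined) info[key].push(match[i + 1]);
    });
  }
  return info;
}

const threadInput = "1,one\n2,two\n3,three\n4,four as a a\nffive,five";
const info = extractMatchInfo(threadInput, /(\w*),(?<text>\w+)|(a)/gm);
console.log(JSON.stringify(info, null, 2));
```

With the same input and regex as the script above, this puts "one" through "five" under "text", the numbers and "ffive" under "group0", and the three "a" values under "group1".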
 