Regex problem

Hi,

We've been using a dictionary for a long time to split a single field into two values.

We receive a string such as "1712CT/123" and use dictionaries to split it into two components. The regex in the dictionary is:
([0-9]{4}[A-Za-z]{0,4})\/([0-9A-Za-z ]+)

Each dictionary matches on this pattern and returns either $1 or $2, so in the above example they would return "1712CT" and "123".

We have started to receive strings in the format "1712CT/123X1", but the dictionary now appears to return "1" from $2 instead of "123X1".

Any advice would be gratefully received.

Hi Ricky,


I just tested the dictionary using the example you gave and got 123X1 as the result. That said, I think the value to match on could be tightened slightly to ([0-9]{4}[A-Za-z]{0,4})/([0-9A-Za-z]+). Also make sure there are no extra spaces before or after the text in the values to match on or the replacement values. If it still gives you trouble, you could simplify it greatly by using (.+)/(.+), with the replacement values still being either $1 or $2, but from what I'm seeing that shouldn't be necessary.
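
For anyone who wants to check the three patterns outside the dictionary, here is a minimal sketch. It assumes Python's `re` module behaves like the dictionary's regex engine for these simple patterns, which may not hold exactly; `split_field` is a hypothetical helper, not part of any product.

```python
import re

# Patterns quoted in this thread. Python's `re` stands in for the
# dictionary's regex engine here (an assumption, not confirmed).
ORIGINAL = re.compile(r"([0-9]{4}[A-Za-z]{0,4})\/([0-9A-Za-z ]+)")
TIGHTENED = re.compile(r"([0-9]{4}[A-Za-z]{0,4})/([0-9A-Za-z]+)")
CATCH_ALL = re.compile(r"(.+)/(.+)")


def split_field(pattern, value):
    """Hypothetical helper: return ($1, $2) for value, or None if no match."""
    match = pattern.search(value)
    return match.groups() if match else None


for name, pattern in [("original", ORIGINAL),
                      ("tightened", TIGHTENED),
                      ("catch-all", CATCH_ALL)]:
    print(name, split_field(pattern, "1712CT/123X1"))
```

Under that assumption all three patterns split "1712CT/123X1" into "1712CT" and "123X1", which agrees with the test result above.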


Thanks,


John

Hi John,

I worked out the problem when you said it worked fine for you. There was a second entry in the dictionary that replaced the match with

--BLANK-- when matching on ([0-9]{4}[A-Za-z]{0,4})

to catch cases where we didn't have a slash. The actual input was "1711AHV/1713X1", so $2 returned 1713X1, which then matched the second pattern and had the leading 1713X stripped off.
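
The interaction between the two entries can be reproduced with a rough sketch like the one below. It assumes the dictionary applies its entries in order as plain search-and-replace, and that Python's `re` matches the same way; `second_component` is a hypothetical name for illustration only.

```python
import re

# Entry 1: split on the slash and return $2.
SPLIT = re.compile(r"([0-9]{4}[A-Za-z]{0,4})\/([0-9A-Za-z ]+)")
# Entry 2: blank out values that have no slash.
NO_SLASH = re.compile(r"([0-9]{4}[A-Za-z]{0,4})")


def second_component(value):
    """Apply both entries in order (assumed order), as the $2 dictionary would."""
    after_split = SPLIT.sub(r"\2", value)   # replace the match with $2
    return NO_SLASH.sub("", after_split)    # replace any match with blank


print(repr(second_component("1711AHV/1713X1")))
print(repr(second_component("1712CT/123X1")))
```

With these assumptions, "1711AHV/1713X1" comes out as "1" because the intermediate value 1713X1 starts with four digits followed by a letter, so the second entry removes "1713X". The simplified example "1712CT/123X1" comes out as "123X1", since 123X1 has no run of four digits for the second entry to match, which is why the test with that example looked fine.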

I changed the letter class in this second entry to [A-WYZa-wyz], as we wouldn't expect X to be a valid character in $1. Of course, now that I've made that change we'll start getting legitimate values with an X in...
