Start a new topic

RegEx to Format and Abbreviate Street Names

Here is a large set of regular expressions designed to replace a street name with its standardized USPS abbreviation. (See the "Street Suffixes" and "Secondary Units" tabs). It takes an address such as "15800 Countrydrive Avenue blvd., apt. 1506" and formats it to "15800 Countrydrive Ave Blvd Apt 1506". This isn't perfect as it mistakes an intended part of the street name as an abbreviation. Hence "Avenue" becomes "Ave" and some of the meaning is lost. However if the street name includes a possible abbreviation (such as "Countrydrive") then the abbreviation is appropriately NOT applied.
Regular Expressions - Street Suffixes & Secondary Unit Designators.xlsx (UPDATED BY A LATER POST)

The list includes 202 Street Suffixes and 24 Secondary Unit Designators. The "Use Regular Expressions" check box must be painstakingly ticked for each item. The form needs a "Check All" button or link.

image


There are two regular expressions that should also be included, BLANK and PO Box. They are:

1. BLANK --> , AND \. (two match-on entries)
2. PO Box --> (?i)\b((P\.O\.)|(P\. O\.)|(PO BOX)|(POBOX)|(BOX)|(POB)|(P O BOX))\b

Hopefully someone can modify the regular expression format to restrict alterations to only the last word before the Secondary Unit Designator. That would be helpful.

1 person has this question

Jenny,


Here are the results from the expression I posted.  They appear to be working in the example you gave.

image

Thanks Nic, appreciate your response, but I'm afraid it doesn't work for me.

To make the issue a little clearer:
The raw data reads: lvl 11, 515 John Street

So in the dictionary I'm using:
Replacement Value: Level
Value to match on: (?i)\b(level\.?|lvl\.?|lev\.?)\b
which gives me: Level 11 515 John Street

Basically I need to change " lvl" to "Level" without losing the comma.

Many thanks
Jenny
Jenny,

This should take care of it for you.

(level\.?(?=,? )|lvl\.?(?=,? )|lev\.?(?=,? ))
Thank you! I tried the full address and it works, so it was the space I was forgetting about.

You may be able to help me with something else. Here is Australia our addresses are, e.g.
Level 1, 18 Bell Road or Unit 6, 22 Smith Drive

I want to be able to convert, say lvl 1 to Level 1 or U 6 to Unit 6 but still keep the comma.
I am using the following:
Replacement Value: Level
Value to match on: (?i)\b(level\.?|lvl\.?|lev\.?)\b
But I end up with Level 1 18 Bell Road. I can live with this but wondered if there was a way to get the comma in there.

Many thanks
Jenny


Wayne is correct. The regular expressions are looking for a space before "rd" and it being at the end of the line or being followed by "," or " #".
Those look ok to me. I would suggest just trying one line, like the Road one, and getting that to work first.

Make sure to try a full address and not just "rd" because it might be looking for only rd that is at the end of an address line.

Thank you for your response Wayne

Yes, the RegExp box is checked for each entry and the TRUE column was copied in.  However, the only one that works is the one for the PO Box, i.e.

Replacement Value:  PO Box

Value to match on:  (?i)\b((P\.O\.)|(P\. O\.)|(PO BOX)|(POBOX)|(BOX)|(POB)|(P O BOX))\b

None of the others work, a couple of examples are:

Replacement Value:  Street

Value to match on:  (?i)(?

Replacement Value: Road

Value to match on:  (?i)(?

I'm stumped.  Any advice you can give would be very helpful.

thanks

Jenny

 

One thing to check is that the "use regular expressions" is checked off for each entry. They way to do it automatically is to have a third column in the excel spreadsheet set to TRUE. It might already be on there but just make sure.

I'm very new to ImportOmatic and RegEx.  I want to import addresses that have been abbreviated, i.e. I need to set up a RegEx to change rd to Road and st to Street, etc. I was excited to see this topic and copied the Address abbreviations (Long) into my dictionary as instructed - but it does not work for me at all.  As this is entry is over a year old I wonder if there has been an update or change at all?

thanks

Jenny

I would guess that first it is replacing " Apt" with ", Apartment" and then it is not recognizing "Ave," as being Avenue.
Hi,

I'm testing out the Address Abbreviations (Long) but am running into a problem with Ave, St., etc when I am entering an apartment abbreviation as well.
I'm entering 234 West Ave Apt. 45 and it's giving me 234 West Ave, Apartment 45. Does anyone know the solution to this? I would like for Ave to be spelled out.

Thanks!




Tammy,

Take a look at this post: http://www.omaticsoftware.com/Forums/tabid/108/aft/1076/Default.aspx

It may not be exactly what you are looking for but it will get you on the right track.

Is there a regular expression for changing expiration dates?  In a CSV file the dates are fine, but if I save it and have to reopen the file, EXCEL changes the dates from Jun-16 to 16-Jun and RE doesn't like that.  I've got an Excel workaround but wondered if Omatic had a regular expression for this? 

Thanks.

Tammy

 

 

Does Blackbaud have an AddressAccelerator product for Canada? If so, I would import the data as is and then run AA against it. It is then matching to exact addresses per your postal system and not using simple find/replace.
Also, when I tested this PO Box code from this list with "P.o. Box 25", it resulted with "P.o. PO Box 25 instead of replacing the P.o.
I didn't copy the whole dictionary in because they don't all apply to us in Canada and they don't all match the abbreviations we use. Would that have affected the way the dictionary understood it?

(?i)\b((P\.O\.)|(P\. O\.)|(PO BOX)|(POBOX)|(BOX)|(POB)|(P O BOX))\b
Login or Signup to post a comment