Regular Expressions? | INFJ Forum

Regular Expressions?

basic

Donor
Aug 3, 2010
Does anyone happen to know anything about regular expressions (regex) in JavaScript? I'm trying to extract the characters/words between two different strings. I could potentially also do it between two HTML tags, such as <h1> and </h1>. The only thing I know is that it'll probably include ".*?"

What I'm parsing is HTML content, currently being converted to plain text. I already have regex filters set up to find instances of a string, but can't figure out how to find characters in between strings.
 
I know they're a PITA to work with if you don't know them well. Every time I use them, it takes me over 30 mins to figure out a suitable one.

Like Deathjam suggested, you should try using one of those generators. I've used this one a few times and it's okay: http://txt2re.com/
 
Could you give us an example input/output of what you are trying to do? I am familiar with regular expressions but do not quite understand what you are asking. Also, could you elaborate on your definition of "string"? I understand a string to be any sequence of contiguous characters such as:

var string1="this is the string, with letters, punctuation, numbers like 35, and so on.";
 
give examples of strings you're trying to match
 
wow i was not expecting this.
i was ready to post a regular expression, such as
keep your stick on the ice
don't take wooden nickels
you can't make a silk purse out of a sow's ear...
etc.
oh well, nevermind.
 
Ok, I re-read what I posted and I admit it's kinda confusing.

So basically, I am using jQuery's .get to load an HTML file, and what I want to do is pull certain things from the file, such as anything between h1 tags, between other HTML tags, or simply between words, and store that in a var.

For example:
var string="<h1>Joe Schmoe</h1>";
I want to take "Joe Schmoe" and store it in a var. I happen to be working with names, so the name between the h1 tags will always be different.

And by the way, I'm a graphic designer, not a programmer, so I would appreciate explanations for a solution. :)
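
For the "between two strings" case described above, here is a minimal sketch, not a definitive solution. The helper names escapeRegExp and textBetween are made up for illustration, and it assumes both delimiters are non-empty. The ([\s\S]*?) part is the lazy "anything, but as little as possible" piece (the ".*?" idea from the first post, extended so it also crosses line breaks), and the parentheses mark the part to keep.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex,
// so the delimiters are treated as plain text.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: collect every piece of text found between `start` and `end`.
// Assumes `start` and `end` are non-empty strings.
function textBetween(source, start, end) {
  var pattern = new RegExp(
    escapeRegExp(start) + "([\\s\\S]*?)" + escapeRegExp(end),
    "g"
  );
  var results = [];
  var match;
  // exec() with the g flag returns one match per call, then null when done.
  while ((match = pattern.exec(source)) !== null) {
    results.push(match[1]); // match[1] is the text captured by the parentheses
  }
  return results;
}

var string = "<h1>Joe Schmoe</h1>";
var names = textBetween(string, "<h1>", "</h1>");
console.log(names[0]); // Joe Schmoe
```

The same function should work for plain words as delimiters, for example textBetween(text, "Name:", "Age:").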
 
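In order to match the string in between <h1> tags you will want to use the regex:
<h1>[a-zA-Z0-9]+<\/h1>
The \/ before the last h1 is an escaped / ( \ / without the space, not a V). The + after the brackets means "one or more of these characters"; without it, the pattern only matches a single character between the tags.
This regex will match any run of letters (lower or upper case) and/or digits in between the <h1> </h1> tags. Note that it does not allow spaces, so for a name like "Joe Schmoe" you would add a space inside the brackets: [a-zA-Z0-9 ]+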

I am more used to C and Java programming and Bash and Perl scripting than JavaScript, but I believe that you can use the string.match(<regex>) function to return the text that matches the pattern you are looking for. For this, you will need a variable that contains the contents of the file you are parsing for the HTML tags. So, have a variable, say var file1, that contains the contents of the file and a second variable to hold the matches. If you are expecting multiple matches you will want an array to store them. So, the code would look like this:
var file1 = "<file contents>"; // placeholder: the HTML text you loaded
var pattern = /<h1>[a-zA-Z0-9]+<\/h1>/g; // g = find every match, not just the first
var arrayOfStrings = file1.match(pattern) || []; // match() returns null when nothing matches
You do not need the string.split(<delimiter>) function here: with the g flag, string.match(<regex>) already returns an array with one element per match. Each element still includes the surrounding <h1> tags; to get only the text inside, wrap that part of the pattern in parentheses and read the captured group.
I believe this will do what you are looking for, but I have not tested it. If anyone else has any other suggestions, please feel free to correct any mistakes I have posted here.
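
A quick way to check what string.match() actually gives back is the sketch below. The sample text in file1 is made up for illustration. With the g flag, match() returns an array of whole matches (tags included); without it, the result also carries the text captured by the parentheses.

```javascript
// Made-up sample input standing in for the loaded file contents.
var file1 = "<h1>Joe</h1><p>text</p><h1>Schmoe42</h1>";

// With the g flag: an array of every whole match, or null if there are none.
var allMatches = file1.match(/<h1>[a-zA-Z0-9]+<\/h1>/g) || [];
console.log(allMatches.length); // 2
console.log(allMatches[0]);     // <h1>Joe</h1>

// Without the g flag: only the first match, but index 1 holds the text
// captured by the parentheses, i.e. the part between the tags.
var first = file1.match(/<h1>([a-zA-Z0-9]+)<\/h1>/);
var firstName = first ? first[1] : null;
console.log(firstName); // Joe
```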
 
OK.

So I figured out that the regex that kinda sorta solves my problem is /<h1>(?:[a-z][a-z0-9_]*)(\s+)(?:[a-z][a-z0-9_]*)<\/h1>/gi which matches <h1>Word Word</h1>. However, it isn't always <h1>Word Word</h1>. The string that needs to be matched may sometimes be <h1>Word Word Word</h1> and include almost any character possible. So, at this point, I'm thinking it's going to be easier if I just find a way to convert the HTML into XML, and simply parse the h1 tag in XML.

So much for regular expressions...
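
For what it's worth, the "any number of words, almost any character" case can still be handled by a regex if the pattern stops describing the words and just captures everything up to the closing tag. This is only a sketch under a few assumptions: the sample html string is invented, h1 tags are not nested, and String.prototype.matchAll is available (it is in current browsers and Node). Regexes remain fragile on messy HTML, so a real HTML or XML parser is still a reasonable choice.

```javascript
// Invented sample input: different word counts, punctuation, and an attribute on the tag.
var html = '<h1>Joe Schmoe</h1>\n<h1 class="name">Mary-Jane O\'Neil Jr.</h1>';

// <h1\b[^>]*>   opening tag, with or without attributes
// ([\s\S]*?)    capture anything, as little as possible (lazy)
// <\/h1>        closing tag
// g = all matches, i = ignore case (so <H1> works too)
var headingPattern = /<h1\b[^>]*>([\s\S]*?)<\/h1>/gi;

// matchAll() yields one result per match; index 1 is the captured text.
var headings = Array.from(html.matchAll(headingPattern), function (m) {
  return m[1].trim();
});

console.log(headings[0]); // Joe Schmoe
console.log(headings[1]); // Mary-Jane O'Neil Jr.
```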