Regular Expressions? | INFJ Forum

Regular Expressions?

Discussion in 'Computer Science' started by basic, Dec 25, 2011.

  1. basic

    Donor

    Joined:
    Aug 3, 2010
    Threads:
    44
    Messages:
    451
    Likes Received:
    89
    Trophy Points:
    572
    MBTI:
    ROFL
    Enneagram:
    5w6 sp/so/sx
    Does anyone happen to know anything about regular expressions (regex) in JavaScript? I'm trying to extract the characters/words that sit between two different strings. I could potentially also match between two HTML tags, such as <h1>. The only thing I know is that it'll probably include ".*?"

    What I'm parsing is HTML content, currently being converted to plain text. I already have regex filters set up to find instances of a string, but can't figure out how to find characters in between strings.
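As a rough sketch of the ".*?" idea: the lazy quantifier stops at the first closing marker instead of the last one, and a capture group keeps only the part in between. The helper names `escapeRegExp` and `textBetween` and the sample text are made up for illustration; `[\s\S]` is used in place of `.` so that line breaks are matched too.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex,
// so the start/end markers are treated as literal text.
function escapeRegExp(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: return every piece of text found between `start` and `end`.
function textBetween(text, start, end) {
  // ([\s\S]*?) captures any characters, as few as possible,
  // so each match ends at the first `end` that follows a `start`.
  const pattern = new RegExp(
    escapeRegExp(start) + "([\\s\\S]*?)" + escapeRegExp(end),
    "g"
  );
  // m[1] is capture group 1, i.e. only the text in between.
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

const sample = "Name: Joe Schmoe; Name: Jane Doe;";
console.log(textBetween(sample, "Name: ", ";")); // ["Joe Schmoe", "Jane Doe"]
```

With a greedy `.*` instead of the lazy `.*?`, the sample above would yield a single match running from the first "Name: " to the last ";".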
     
  2. Deathjam

    Deathjam ooooh
    Staff Member Tech Admin

    Joined:
    Aug 28, 2008
    Threads:
    410
    Messages:
    4,552
    Featured Threads:
    6
    Likes Received:
    1,647
    Trophy Points:
    856
    Gender:
    Male
    Location:
    Yorkshire, UK
    MBTI:
    ENTP
    I didn't really understand the question (yeah, alcohol), but couldn't you use a regex generator to find the correct expression? I know it helped me when I was doing some Unicode stuff.
     
  3. ceri

    Donor

    Joined:
    Oct 14, 2008
    Threads:
    7
    Messages:
    132
    Likes Received:
    16
    Trophy Points:
    0
    MBTI:
    INFJ
    Enneagram:
    6w5 / 5w6
    I know they're a PITA to work with if you don't know them well. Every time I use them, it takes me over 30 mins to figure out a suitable one.

    Like Deathjam suggested, you should try using one of those generators. I've used this one a few times and it's okay: http://txt2re.com/
     
  4. Russ84

    Russ84 Community Member

    Joined:
    Jun 18, 2010
    Threads:
    9
    Messages:
    231
    Likes Received:
    42
    Trophy Points:
    0
    MBTI:
    INFJ
    Enneagram:
    4
    Could you give us an example input/output of what you are trying to do? I am familiar with regular expressions but do not quite understand what you are asking. Also, could you elaborate on your definition of "string"? I understand a string to be any sequence of contiguous characters such as:

    var string1="this is the string, with letters, punctuation, numbers like 35, and so on.";
     
  5. ruji

    ruji Well-known member

    Joined:
    Dec 29, 2009
    Threads:
    29
    Messages:
    17,147
    Featured Threads:
    4
    Likes Received:
    54,122
    Trophy Points:
    3,257
    MBTI:
    .
    give examples of strings you're trying to match
     
  6. JGirl

    JGirl no chocolate flavored gum? wow

    Joined:
    Nov 9, 2011
    Threads:
    47
    Messages:
    4,729
    Likes Received:
    1,213
    Trophy Points:
    370
    MBTI:
    INFJ
    Enneagram:
    5
    wow i was not expecting this.
    i was ready to post a regular expression, such as
    keep your stick on the ice
    don't take wooden nickels
    you can't make a silk purse out of a sow's ear...
    etc.
    oh well, nevermind.
     
  7. OP
    basic

    Ok, I re-read what I posted and I admit it's kinda confusing.

    So basically, I am using jQuery's .get to load an HTML file, and what I want to do is pull certain things from the file, such as anything between h1 tags, other HTML tags, or simply between words, and store that in a var.

    For example:
    var string="<h1>Joe Schmoe</h1>";
    I want to take "Joe Schmoe" and store it in a var. I happen to be working with names, so the name between the h1 tags will always be different.

    And by the way, I'm a graphic designer, not a programmer, so I would appreciate explanations for a solution. :)
     
    #7 basic, Dec 27, 2011
    Last edited: Dec 27, 2011
    invisible likes this.
  8. Russ84

    Russ84 Community Member

    To match the string between <h1> tags, you will want to use the regex:
    <h1>[a-zA-Z0-9 ]+<\/h1>
    The \/ before the last h1 is an escaped / (a backslash followed by a slash, not a V).
    This regex will match any run of lowercase letters, capital letters, digits, and/or spaces between the <h1> </h1> tags. The + after the character class means "one or more"; without it, only a single character would be matched.

    I am more used to C and Java programming and Bash and Perl scripting than JavaScript, but I believe that you can use the string.match(<regex>) function to return the text that matches the pattern you are looking for. For this, you will need a variable that contains the contents of the file you are parsing for the HTML tags. So, have a variable, say var file1, that contains the contents of the file, and a second variable to hold the matches. With the g flag, match returns an array, so multiple matches are handled as well. The code would look like this:
    var file1 = "<file contents>"; // placeholder: the HTML text loaded from the file
    var pattern = /<h1>[a-zA-Z0-9 ]+<\/h1>/g; // g = find every match, not just the first
    var matches = file1.match(pattern); // array of matches, or null if none were found
    You do not need string.split(<delimiter>) here: with the g flag, string.match(<regex>) already returns an array with one match per element. Note that each element is the whole match, including the <h1> and </h1> tags.
    I believe this will do what you are looking for, but I have not tested it. If anyone else has any other suggestions, please feel free to correct any mistakes I have posted here.
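To store only the name rather than the whole <h1>...</h1> text, one option is a capture group read with exec in a loop. This is a sketch under assumptions: the `html` sample string stands in for the downloaded file, and the variable names are illustrative.

```javascript
// Sample input standing in for the downloaded file contents (assumed data).
var html = "<h1>Joe Schmoe</h1><p>Designer</p><h1>Jane Doe 2</h1>";

// With the g flag, match() returns the whole matches, tags included.
var wholeMatches = html.match(/<h1>[a-zA-Z0-9 ]+<\/h1>/g);
console.log(wholeMatches); // ["<h1>Joe Schmoe</h1>", "<h1>Jane Doe 2</h1>"]

// The parentheses form capture group 1, which holds only the text inside the tags.
var pattern = /<h1>([a-zA-Z0-9 ]+)<\/h1>/g;
var names = [];
var m;
// Each exec() call continues from where the previous match ended
// and returns null once there are no more matches.
while ((m = pattern.exec(html)) !== null) {
  names.push(m[1]);
}
console.log(names); // ["Joe Schmoe", "Jane Doe 2"]
```

The loop is used because match() with the g flag discards capture groups; exec() keeps them, at the cost of a few more lines.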
     
    #8 Russ84, Dec 27, 2011
    Last edited: Dec 27, 2011
  9. OP
    basic

    OK.

    So I figured out that the regex that kinda sorta solves my problem is /<h1>(?:[a-z][a-z0-9_]*)(\s+)(?:[a-z][a-z0-9_]*)<\/h1>/gi, which matches <h1>Word Word</h1>. However, it isn't always <h1>Word Word</h1>. The string that needs to be matched may sometimes be <h1>Word Word Word</h1> and include almost any character possible. So, at this point, I'm thinking it's going to be easier if I just find a way to convert the HTML into XML, and simply parse the h1 tag in XML.
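For the "any number of words, almost any character" case, a possible regex-only sketch is to stop describing the words and instead capture everything up to the closing tag. The function name `headingTexts` and the sample page are invented for illustration, and this is not a full HTML parser: any tags nested inside the h1 would come back as part of the text, so a DOM- or XML-based approach like the one considered above is generally more robust.

```javascript
// Hypothetical helper: collect the text of every h1 element in an HTML string.
function headingTexts(html) {
  // <h1\b[^>]*> : opening tag, allowing attributes such as class="name"
  // ([\s\S]*?)  : any characters, including line breaks, as few as possible
  // <\/h1>      : closing tag; the i flag also accepts <H1>
  const pattern = /<h1\b[^>]*>([\s\S]*?)<\/h1>/gi;
  // m[1] is the captured text; trim() drops surrounding whitespace.
  return Array.from(html.matchAll(pattern), (m) => m[1].trim());
}

// Assumed sample data covering two words, punctuation, and a multi-line heading.
const page = [
  "<h1>Joe Schmoe</h1>",
  '<h1 class="name">Mary-Jane O\'Neil Jr.</h1>',
  "<H1>\n  José Luis García\n</H1>",
].join("\n");

console.log(headingTexts(page));
// ["Joe Schmoe", "Mary-Jane O'Neil Jr.", "José Luis García"]
```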

    So much for regular expressions...
     
