We closed this forum 18 June 2010. It has served us well since 2005 as the ALPHA forum did before it from 2002 to 2005. New discussions are ongoing at the new URL http://forum.processing.org. You'll need to sign up and get a new user account. We're sorry about that inconvenience, but we think it's better in the long run. The content on this forum will remain online.
please help. regex syntax? (Read 303 times)
please help. regex syntax?
Mar 7th, 2009, 1:14am
 
hi, i don't think i'm doing this right.

this part works. i am getting the words "grey" and "hoody" from the sentence: "i am wearing a grey hoody and a hat." with this code

colorPattern[i] = Pattern.compile("\\b"+colorNames[i]+"\\s*([\\w'-]+)", Pattern.CASE_INSENSITIVE);

now i am trying to change my code so that i get the word before the color if there is no word following the color word. for example: "the sun is red." has no word following "red", so i will print the word "is" instead.

colorPattern[i] = Pattern.compile("\\b"+"\\s*([\\w'-]+)"+colorNames[i]+"\\s*([\\w'-]+)", Pattern.CASE_INSENSITIVE);

i would like to read the word before and after the color word, and i will just choose which part of the string i want to print:

if (colorMatch[i].find()){
  String[] wordsToSplit = split(colorMatch[i].group(), " "); //get string with color, before and following words
  String drawWordTest = wordsToSplit[0];
  println("Color: " + drawWordTest);

  //split words
  if (wordsToSplit.length > 2){ //if wordsToSplit has at least 3 words, take the 3rd word and place it in drawWord
    drawWord = wordsToSplit[2];
  }
  //else if there is no word that follows the color word, get first word
  else {
    drawWord = wordsToSplit[0];
  }
}


any help is appreciated! thanks!
Re: please help. regex syntax?
Reply #1 - Mar 7th, 2009, 11:27pm
 
Don't split the result; use the regex to do this job too:
Code:
String[] colorNames = { "grey", "red", "blue", "green" };

void ShowMatch(int i, String description)
{
 String pattern = "\\s*([\\w'-]+)\\s+" + colorNames[i] + "(?:\\s*([\\w'-]+)?)";
 Pattern colorPattern = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);  
 Matcher matcher = colorPattern.matcher(description);
 if (matcher.find())
 {
   if (matcher.group(2) != null)
     println(">>" + matcher.group(2));
   else
     println("> " + matcher.group(1));
 }
 else
   println("Not found");
}

void setup()
{
 ShowMatch(0, "i am wearing a grey hoody and a hat.");
 ShowMatch(1, "The sun is red");
 exit();
}
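
A rough sketch of the same idea as a reusable function, written in plain Java rather than as a Processing sketch. The class name NeighborWord and the method find are made up for illustration. Two things differ from the code above, and both are assumptions about what is wanted: a \b after the color name, so that "red" does not match inside "reddish", and Pattern.quote, in case a color name ever contains regex metacharacters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Processing or of the code above.
class NeighborWord {
  // Returns the word after the color, or the word before it when nothing follows.
  // Returns null when the color is missing or has no word before it.
  static String find(String text, String color) {
    Pattern p = Pattern.compile(
      "([\\w'-]+)\\s+" + Pattern.quote(color) + "\\b(?:\\s+([\\w'-]+))?",
      Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(text);
    if (!m.find()) return null;
    // group(2) is null when the optional "word after" part did not match
    return m.group(2) != null ? m.group(2) : m.group(1);
  }

  public static void main(String[] args) {
    System.out.println(find("i am wearing a grey hoody and a hat.", "grey")); // hoody
    System.out.println(find("The sun is red.", "red"));                      // is
  }
}
```

Because the word before the color is a required group here, a sentence that starts with the color word returns null; whether that case matters depends on the feed.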
Re: please help. regex syntax?
Reply #2 - Mar 8th, 2009, 8:31pm
 
thanks! it works great. i still have trouble writing regexes myself because it's hard for me to put the combos together. i just have an additional question: in addition to the found words from the previous code, i would also like to print out the whole sentence that the color word is in.

for example:
"the sun is hot. i am wearing a grey hoody and a hat. i am 2 today."

would return: "i am wearing a grey hoody and a hat."

these sentences are not set, static sentences, but RSS posts that are streaming in. some people don't put periods at the end of their sentences, but just return to a new line, which could be a problem because the entire post will be read instead of just that sentence. if it is too difficult to handle, i might not mind reading the entire post then.

My code is working, except for the areas where it says "//new":

//only putting one entry through at a time
   for (int i=0; i<num_colors; i++){  
   //find colors in array first  
     //colorPattern[i] = Pattern.compile("\\s*([\\w'-]+)\\s+" + colorNames[i] + "(?:\\s*([\\w'-]+)?)", Pattern.CASE_INSENSITIVE); //old
     colorPattern[i] = Pattern.compile("[\\d\\w'-]\\s*([\\w'-]+)\\s+" + colorNames[i] + "(?:\\s*([\\w'-]+)?[\\d\\w'-]\\b)", Pattern.CASE_INSENSITIVE); //incorrect syntax for new?
     colorMatch[i] = colorPattern[i].matcher(title+description); //look in title & description that match any colors

     //only if there is a color that matches, get word & find the hood it belongs to...
     if (colorMatch[i].find()){    
       drawWord = colorMatch[i].group();
       
       //draw the ENTIRE sentence that the color word is in <<<<<<<<<<<<<<<<<<<<<<
       //drawSentence = colorMatch[i].group(); //how to draw entire sentence?
       
       //draw color hex code
       drawColor = colorCodes[i]; //assign color name location to color #code
       String drawColorColor = colorNames[i];
       println("Colorcode: "+ drawColor);  
       println("Color: "+ drawColorColor);  
       
       //draw words surrounding the found color word
       if (colorMatch[i].group(2) != null){
           drawWord = colorMatch[i].group(2);
           println("Word After: "+ drawWord); //print following word
       }else{
           drawWord = colorMatch[i].group(1);
           println("Word BEFORE: "+ drawWord);
       }
       //find hoods
       for (int k = 0; k < num_hoods; k++){
         hoodsPattern[k] = Pattern.compile("\\b"+hoodsNames[k], Pattern.CASE_INSENSITIVE);  //look for any words within findHoods array
         hoodsMatch[k] = hoodsPattern[k].matcher(title); //look in titles to match any neighborhoods
         if(hoodsMatch[k].find()){
           drawHood = k; //take the hood number and put it into integer drawHood
           println("Hood: " + hoodsMatch[k].group());
           //dots.add(new DrawDot(drawHood, drawColor, drawWord));  //old
           dots.add(new DrawDot(drawHood, drawColor, drawWord, drawSentence)); //new
           counter++;    
         } //end if colorMatch find hoods    
       }
     } //end colorMatch[i].find())        
   }  //end for loop for find colors in array  
   println("# found colors counter: " + counter + newline);
Re: please help. regex syntax?
Reply #3 - Mar 8th, 2009, 10:45pm
 
rrrr wrote on Mar 8th, 2009, 8:31pm:
some people don't put periods at the end of their sentences, but just return to a new line, which could be a problem because the entire post will be read instead of just that sentence. if it is too difficult to handle, i might not mind reading the entire post then.

Indeed. I might be wrong, but newlines are often not significant in XML (they certainly are not in HTML!). So using newlines as a delimiter is a bad idea.
Re: please help. regex syntax?
Reply #4 - Mar 8th, 2009, 11:13pm
 
hmmm, maybe i can just read the entire sentence or post that they have, but cut it off at 200 characters?

we already got the color word and the word before (or the word after) the color word. these are all bits of information i want to keep in addition to my new question: how would the regex syntax be written to get the entire sentence that the color word is in? it's ok if they don't put periods. i will just have to limit the number of characters that are displayed.

thanks again!
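
One way this could be approached, as a plain Java sketch rather than a definitive answer: match everything around the color word up to the nearest sentence end, then cap the length. SentenceFinder, find and the maxChars parameter are invented names. Treating ".", "!", "?" and line breaks as sentence ends is an assumption; if the feed does not preserve line breaks, the character cap (200 in the example) is the fallback.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are for illustration only.
class SentenceFinder {
  // Returns the sentence containing the color word, cut to maxChars,
  // or null when the color word does not occur.
  static String find(String text, String color, int maxChars) {
    // Assumption: a sentence is a run of characters without . ! ? or a line break,
    // optionally followed by one closing punctuation mark.
    Pattern p = Pattern.compile(
      "[^.!?\\r\\n]*\\b" + Pattern.quote(color) + "\\b[^.!?\\r\\n]*[.!?]?",
      Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(text);
    if (!m.find()) return null;
    String sentence = m.group().trim();
    return sentence.length() > maxChars ? sentence.substring(0, maxChars) : sentence;
  }

  public static void main(String[] args) {
    String post = "the sun is hot. i am wearing a grey hoody and a hat. i am 2 today.";
    System.out.println(find(post, "grey", 200)); // i am wearing a grey hoody and a hat.
  }
}
```

The cap cuts from the start of the sentence, so in a very long unpunctuated post the color word itself could fall outside the first maxChars characters.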